Quantum simulation of quantum field theory using continuous variables
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marshall, Kevin; Pooser, Raphael C.; Siopsis, George; ...
2015-12-14
Much progress has been made in the field of quantum computing using continuous variables over the last couple of years. This includes the generation of extremely large entangled cluster states (10,000 modes, in fact) as well as a fault-tolerant architecture. This has led to the point that continuous-variable quantum computing can indeed be thought of as a viable alternative for universal quantum computing. With that in mind, we present a new algorithm for continuous-variable quantum computers which gives an exponential speedup over the best known classical methods. Specifically, this relates to efficiently calculating the scattering amplitudes in scalar bosonic quantum field theory, a problem that is known to be hard using a classical computer. Thus, we give an experimental implementation based on cluster states that is feasible with today's technology.
Braschel, Melissa C; Svec, Ivana; Darlington, Gerarda A; Donner, Allan
2016-04-01
Many investigators rely on previously published point estimates of the intraclass correlation coefficient rather than on their associated confidence intervals to determine the required size of a newly planned cluster randomized trial. Although confidence interval methods for the intraclass correlation coefficient that can be applied to community-based trials have been developed for a continuous outcome variable, fewer methods exist for a binary outcome variable. The aim of this study is to evaluate confidence interval methods for the intraclass correlation coefficient applied to binary outcomes in community intervention trials enrolling a small number of large clusters. Existing methods for confidence interval construction are examined and compared to a new ad hoc approach based on dividing clusters into a large number of smaller sub-clusters and subsequently applying existing methods to the resulting data. Monte Carlo simulation is used to assess the width and coverage of confidence intervals for the intraclass correlation coefficient based on Smith's large sample approximation of the standard error of the one-way analysis of variance estimator, an inverted modified Wald test for the Fleiss-Cuzick estimator, and intervals constructed using a bootstrap-t applied to a variance-stabilizing transformation of the intraclass correlation coefficient estimate. In addition, a new approach is applied in which clusters are randomly divided into a large number of smaller sub-clusters with the same methods applied to these data (with the exception of the bootstrap-t interval, which assumes large cluster sizes). These methods are also applied to a cluster randomized trial on adolescent tobacco use for illustration. When applied to a binary outcome variable in a small number of large clusters, existing confidence interval methods for the intraclass correlation coefficient provide poor coverage. 
However, confidence intervals constructed using the new approach combined with Smith's method provide nominal or close to nominal coverage when the intraclass correlation coefficient is small (<0.05), as is the case in most community intervention trials. This study concludes that when a binary outcome variable is measured in a small number of large clusters, confidence intervals for the intraclass correlation coefficient may be constructed by dividing existing clusters into sub-clusters (e.g. groups of 5) and using Smith's method. The resulting confidence intervals provide nominal or close to nominal coverage across a wide range of parameters when the intraclass correlation coefficient is small (<0.05). Application of this method should provide investigators with a better understanding of the uncertainty associated with a point estimator of the intraclass correlation coefficient used for determining the sample size needed for a newly designed community-based trial. © The Author(s) 2015.
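The sub-cluster idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it randomly splits each large cluster into groups of about five and computes the one-way ANOVA point estimate of the intraclass correlation coefficient on the result. The function names are hypothetical, and Smith's standard-error formula (needed for the actual confidence interval) is deliberately left out.

```python
import random

def split_into_subclusters(cluster, size, rng):
    """Randomly partition one cluster's 0/1 outcomes into groups of about `size`."""
    shuffled = cluster[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + size] for i in range(0, len(shuffled), size)]

def anova_icc(clusters):
    """One-way ANOVA estimator of the intraclass correlation coefficient."""
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    total = sum(sizes)
    grand_mean = sum(sum(c) for c in clusters) / total
    means = [sum(c) / len(c) for c in clusters]
    # Between-cluster and within-cluster sums of squares.
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ssw = sum((y - m) ** 2 for c, m in zip(clusters, means) for y in c)
    msb = ssb / (k - 1)
    msw = ssw / (total - k)
    # Adjusted mean cluster size for unequal cluster sizes.
    n0 = (total - sum(n * n for n in sizes) / total) / (k - 1)
    return (msb - msw) / (msb + (n0 - 1) * msw)

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: 4 large clusters of 200 binary outcomes each.
    big = [[1 if rng.random() < p else 0 for _ in range(200)]
           for p in (0.10, 0.15, 0.20, 0.25)]
    subs = [s for c in big for s in split_into_subclusters(c, 5, rng)]
    print("ICC on original clusters:", anova_icc(big))
    print("ICC on sub-clusters:     ", anova_icc(subs))
```

The estimator is undefined when every outcome is identical (the denominator is zero), so real use would need a guard for that case.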
Clustering and variable selection in the presence of mixed variable types and missing data.
Storlie, C B; Myers, S M; Katusic, S K; Weaver, A L; Voigt, R G; Croarkin, P E; Stoeckel, R E; Port, J D
2018-05-17
We consider the problem of model-based clustering in the presence of many correlated, mixed continuous, and discrete variables, some of which may have missing values. Discrete variables are treated with a latent continuous variable approach, and the Dirichlet process is used to construct a mixture model with an unknown number of components. Variable selection is also performed to identify the variables that are most influential for determining cluster membership. The work is motivated by the need to cluster patients thought to potentially have autism spectrum disorder on the basis of many cognitive and/or behavioral test scores. There are a modest number of patients (486) in the data set along with many (55) test score variables (many of which are discrete valued and/or missing). The goal of the work is to (1) cluster these patients into similar groups to help identify those with similar clinical presentation and (2) identify a sparse subset of tests that inform the clusters in order to eliminate unnecessary testing. The proposed approach compares very favorably with other methods via simulation of problems of this type. The results of the autism spectrum disorder analysis suggested 3 clusters to be most likely, while only 4 test scores had high (>0.5) posterior probability of being informative. This will result in much more efficient and informative testing. The need to cluster observations on the basis of many correlated, continuous/discrete variables with missing values is a common problem in the health sciences as well as in many other disciplines. Copyright © 2018 John Wiley & Sons, Ltd.
Gu, Jianwei; Pitz, Mike; Breitner, Susanne; Birmili, Wolfram; von Klot, Stephanie; Schneider, Alexandra; Soentgen, Jens; Reller, Armin; Peters, Annette; Cyrys, Josef
2012-10-01
The success of epidemiological studies depends on the use of appropriate exposure variables. The purpose of this study is to extract a relatively small selection of variables characterizing ambient particulate matter from a large measurement data set. The original data set comprised a total of 96 particulate matter variables that have been continuously measured since 2004 at an urban background aerosol monitoring site in the city of Augsburg, Germany. Many of the original variables were derived from measured particle size distribution (PSD) across the particle diameter range 3 nm to 10 μm, including size-segregated particle number concentration, particle length concentration, particle surface concentration and particle mass concentration. The data set was complemented by integral aerosol variables. These variables were measured by independent instruments, including black carbon, sulfate, particle active surface concentration and particle length concentration. It is obvious that such a large number of measured variables cannot be used in health effect analyses simultaneously. The aim of this study is a pre-screening and a selection of the key variables that will be used as input in forthcoming epidemiological studies. In this study, we present two methods of parameter selection and apply them to data from a two-year period from 2007 to 2008. We used the agglomerative hierarchical cluster method to find groups of similar variables. In total, we selected 15 key variables from 9 clusters which are recommended for epidemiological analyses. We also applied a two-dimensional visualization technique called "heatmap" analysis to the Spearman correlation matrix. 12 key variables were selected using this method. Moreover, the positive matrix factorization (PMF) method was applied to the PSD data to characterize the possible particle sources. Correlations between the variables and PMF factors were used to interpret the meaning of the cluster and the heatmap analyses. 
Copyright © 2012 Elsevier B.V. All rights reserved.
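The Spearman correlation matrix that the heatmap analysis above is built on can be sketched as follows. This is an illustrative pure-Python version with average ranks for ties; the variable names in the usage example are hypothetical placeholders, not values from the Augsburg data set.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def spearman_matrix(columns):
    """Pairwise Spearman correlations for a dict of name -> measurements."""
    names = list(columns)
    return {a: {b: spearman(columns[a], columns[b]) for b in names} for a in names}

if __name__ == "__main__":
    data = {  # hypothetical measurements
        "number_conc": [1.0, 2.0, 3.0, 4.0, 5.0],
        "mass_conc": [2.1, 3.9, 9.2, 15.8, 26.0],
        "black_carbon": [5.0, 3.0, 4.0, 1.0, 2.0],
    }
    for name, row in spearman_matrix(data).items():
        print(name, row)
```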
Fault-tolerant measurement-based quantum computing with continuous-variable cluster states.
Menicucci, Nicolas C
2014-03-28
A long-standing open question about Gaussian continuous-variable cluster states is whether they enable fault-tolerant measurement-based quantum computation. The answer is yes. Initial squeezing in the cluster above a threshold value of 20.5 dB ensures that errors from finite squeezing acting on encoded qubits are below the fault-tolerance threshold of known qubit-based error-correcting codes. By concatenating with one of these codes and using ancilla-based error correction, fault-tolerant measurement-based quantum computation of theoretically indefinite length is possible with finitely squeezed cluster states.
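To give a feel for the 20.5 dB threshold quoted above, the short sketch below converts a squeezing level in decibels into the ratio of squeezed-quadrature variance to vacuum variance. It assumes the usual convention, squeezing in dB = -10 log10(V_squeezed / V_vacuum); the function name is hypothetical.

```python
def squeezing_db_to_variance_ratio(db):
    """Ratio of squeezed-quadrature variance to vacuum variance for `db` dB of squeezing."""
    return 10 ** (-db / 10)

if __name__ == "__main__":
    ratio = squeezing_db_to_variance_ratio(20.5)
    print(f"20.5 dB of squeezing -> variance ratio of about {ratio:.5f}")
```

Under that convention, 20.5 dB corresponds to a noise variance a little under one percent of the vacuum level.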
One-step generation of continuous-variable quadripartite cluster states in a circuit QED system
NASA Astrophysics Data System (ADS)
Yang, Zhi-peng; Li, Zhen; Ma, Sheng-li; Li, Fu-li
2017-07-01
We propose a dissipative scheme for one-step generation of continuous-variable quadripartite cluster states in a circuit QED setup consisting of four superconducting coplanar waveguide resonators and a gap-tunable superconducting flux qubit. With external driving fields to adjust the desired qubit-resonator and resonator-resonator interactions, we show that continuous-variable quadripartite cluster states of the four resonators can be generated with the assistance of energy relaxation of the qubit. By comparison with the previous proposals, the distinct advantage of our scheme is that only one step of quantum operation is needed to realize the quantum state engineering. This makes our scheme simpler and more feasible in experiment. Our result may have useful application for implementing quantum computation in solid-state circuit QED systems.
Gate sequence for continuous variable one-way quantum computation
Su, Xiaolong; Hao, Shuhong; Deng, Xiaowei; Ma, Lingyu; Wang, Meihong; Jia, Xiaojun; Xie, Changde; Peng, Kunchi
2013-01-01
Measurement-based one-way quantum computation using cluster states as resources provides an efficient model to perform computation and information processing of quantum codes. Arbitrary Gaussian quantum computation can be implemented sufficiently by long single-mode and two-mode gate sequences. However, continuous variable gate sequences have not been realized so far due to an absence of cluster states larger than four submodes. Here we present the first continuous variable gate sequence consisting of a single-mode squeezing gate and a two-mode controlled-phase gate based on a six-mode cluster state. The quantum property of this gate sequence is confirmed by the fidelities and the quantum entanglement of two output modes, which depend on both the squeezing and controlled-phase gates. The experiment demonstrates the feasibility of implementing Gaussian quantum computation by means of accessible gate sequences.
NASA Astrophysics Data System (ADS)
Yoshikawa, Jun-ichi; Yokoyama, Shota; Kaji, Toshiyuki; Sornphiphatphong, Chanond; Shiozawa, Yu; Makino, Kenzo; Furusawa, Akira
2016-09-01
In recent quantum optical continuous-variable experiments, the number of fully inseparable light modes has drastically increased by introducing a multiplexing scheme either in the time domain or in the frequency domain. Here, modifying the time-domain multiplexing experiment reported in the work of Yokoyama et al. [Nat. Photonics 7, 982 (2013)], we demonstrate the successive generation of fully inseparable light modes for more than one million modes. The resulting multi-mode state is useful as a dual-rail continuous variable cluster state. We circumvent the previous problem of optical phase drifts, which has limited the number of fully inseparable light modes to around ten thousand, by continuous feedback control of the optical system.
Gay, Emilie; Senoussi, Rachid; Barnouin, Jacques
2007-01-01
Methods for spatial cluster detection dealing with diseases quantified by continuous variables are few, whereas several diseases are better approached by continuous indicators. For example, subclinical mastitis of the dairy cow is evaluated using a continuous marker of udder inflammation, the somatic cell score (SCS). Consequently, this study proposed to analyze spatialized risk and cluster components of herd SCS through a new method based on a spatial hazard model. The dataset included annual SCS for 34 142 French dairy herds for the year 2000, and important SCS risk factors: mean parity, percentage of winter and spring calvings, and herd size. The model allowed the simultaneous estimation of the effects of known risk factors and of potential spatial clusters on SCS, and the mapping of the estimated clusters and their range. Mean parity and winter and spring calvings were significantly associated with subclinical mastitis risk. The model with the presence of 3 clusters was highly significant, and the 3 clusters were attractive, i.e. closeness to cluster center increased the occurrence of high SCS. The three localizations were the following: close to the city of Troyes in the northeast of France; around the city of Limoges in the center-west; and in the southwest close to the city of Tarbes. The semi-parametric method based on spatial hazard modeling applies to continuous variables, and takes account of both risk factors and potential heterogeneity of the background population. This tool allows a quantitative detection but assumes a spatially specified form for clusters.
Shah, Sohil Atul
2017-01-01
Clustering is a fundamental procedure in the analysis of scientific data. It is used ubiquitously across the sciences. Despite decades of research, existing clustering algorithms have limited effectiveness in high dimensions and often require tuning parameters for different domains and datasets. We present a clustering algorithm that achieves high accuracy across multiple domains and scales efficiently to high dimensions and large datasets. The presented algorithm optimizes a smooth continuous objective, which is based on robust statistics and allows heavily mixed clusters to be untangled. The continuous nature of the objective also allows clustering to be integrated as a module in end-to-end feature learning pipelines. We demonstrate this by extending the algorithm to perform joint clustering and dimensionality reduction by efficiently optimizing a continuous global objective. The presented approach is evaluated on large datasets of faces, hand-written digits, objects, newswire articles, sensor readings from the Space Shuttle, and protein expression levels. Our method achieves high accuracy across all datasets, outperforming the best prior algorithm by a factor of 3 in average rank. PMID:28851838
Cluster Analysis to Identify Possible Subgroups in Tinnitus Patients.
van den Berge, Minke J C; Free, Rolien H; Arnold, Rosemarie; de Kleine, Emile; Hofman, Rutger; van Dijk, J Marc C; van Dijk, Pim
2017-01-01
In tinnitus treatment, there is a tendency to shift from a "one size fits all" to a more individual, patient-tailored approach. Insight into the heterogeneity of the tinnitus spectrum might improve the management of tinnitus patients in terms of choice of treatment and identification of patients with severe mental distress. The goal of this study was to identify subgroups in a large group of tinnitus patients. Data were collected from patients with severe tinnitus complaints visiting our tertiary referral tinnitus care group at the University Medical Center Groningen. Patient-reported and physician-reported variables were collected during their visit to our clinic. Cluster analyses were used to characterize subgroups. For the selection of the right variables to enter in the cluster analysis, two approaches were used: (1) variable reduction with principal component analysis and (2) variable selection based on expert opinion. Various variables of 1,783 tinnitus patients were included in the analyses. Cluster analysis (1) included 976 patients and resulted in a four-cluster solution. The effect of external influences was the most discriminative between the groups, or clusters, of patients. The "silhouette measure" of the cluster outcome was low (0.2), indicating a "no substantial" cluster structure. Cluster analysis (2) included 761 patients and resulted in a three-cluster solution, comparable to the first analysis. Again, a "no substantial" cluster structure was found (0.2). Two cluster analyses on a large database of tinnitus patients revealed that clusters of patients are mostly formed by a different response of external influences on their disease. However, both cluster outcomes based on this dataset showed a poor stability, suggesting that our tinnitus population comprises a continuum rather than a number of clearly defined subgroups.
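The "silhouette measure" reported above can be computed as in the following sketch. It is a generic illustration with Euclidean distance on toy data, not the study's software: for each point, a is the mean distance to its own cluster, b is the smallest mean distance to any other cluster, and the score is (b - a) / max(a, b), averaged over all points. Singleton clusters are scored 0 by convention, and at least two clusters are assumed.

```python
def silhouette_score(points, labels):
    """Mean silhouette over all points (Euclidean distance, >= 2 clusters)."""
    def dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5

    members = {}
    for idx, lab in enumerate(labels):
        members.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in members[lab] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: score 0 by convention
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in idxs) / len(idxs)
            for other, idxs in members.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
    print("well separated:", silhouette_score(pts, [0, 0, 1, 1]))
    print("badly assigned:", silhouette_score(pts, [0, 1, 0, 1]))
```

Values near 1 indicate compact, well-separated clusters; values near 0, such as the 0.2 reported in the abstract, indicate heavily overlapping groups.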
Hierarchical clustering using correlation metric and spatial continuity constraint
Stork, Christopher L.; Brewer, Luke N.
2012-10-02
Large data sets are analyzed by hierarchical clustering using correlation as a similarity measure. This provides results that are superior to those obtained using a Euclidean distance similarity measure. A spatial continuity constraint may be applied in hierarchical clustering analysis of images.
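A small sketch of hierarchical clustering with a correlation-based dissimilarity follows. It is an assumption-laden illustration rather than the patented method: it uses 1 minus the Pearson correlation as the distance, average linkage for merging, and omits the spatial continuity constraint entirely. The toy rows show why the metric matters: rows 0 and 1 differ by a factor of ten in magnitude but have identical shape, so correlation groups them together where Euclidean distance would not.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def correlation_clusters(rows, n_clusters):
    """Average-linkage agglomerative clustering with distance 1 - Pearson r."""
    dist = {(i, j): 1.0 - pearson(rows[i], rows[j])
            for i in range(len(rows)) for j in range(i + 1, len(rows))}

    def linkage(a, b):
        pairs = [dist[min(i, j), max(i, j)] for i in a for j in b]
        return sum(pairs) / len(pairs)

    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        a, b = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)

if __name__ == "__main__":
    spectra = [[1, 2, 3], [10, 20, 30], [3, 2, 1], [30, 20, 10]]
    print(correlation_clusters(spectra, 2))
```

A constant row has zero variance and would cause a division by zero in `pearson`, so such rows need to be filtered out beforehand.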
NASA Astrophysics Data System (ADS)
Blume, T.; Hassler, S. K.; Weiler, M.
2017-12-01
Hydrological science still struggles with the fact that while we wish for spatially continuous images or movies of state variables and fluxes at the landscape scale, most of our direct measurements are point measurements. To date regional measurements resolving landscape scale patterns can only be obtained by remote sensing methods, with the common drawback that they remain near the earth surface and that temporal resolution is generally low. However, distributed monitoring networks at the landscape scale provide the opportunity for detailed and time-continuous pattern exploration. Even though measurements are spatially discontinuous, the large number of sampling points and experimental setups specifically designed for the purpose of landscape pattern investigation open up new avenues of regional hydrological analyses. The CAOS hydrological observatory in Luxembourg offers a unique setup to investigate questions of temporal stability, pattern evolution and persistence of certain states. The experimental setup consists of 45 sensor clusters. These sensor clusters cover three different geologies, two land use classes, five different landscape positions, and contrasting aspects. At each of these sensor clusters three soil moisture/soil temperature profiles, basic climate variables, sapflow, shallow groundwater, and stream water levels were measured continuously for the past 4 years. We will focus on characteristic landscape patterns of various hydrological state variables and fluxes, studying their temporal stability on the one hand and the dependence of patterns on hydrological states on the other hand (e.g. wet vs dry). This is extended to time-continuous pattern analysis based on time series of spatial rank correlation coefficients. Analyses focus on the absolute values of soil moisture, soil temperature, groundwater levels and sapflow, but also investigate the spatial pattern of the daily changes of these variables. 
The analysis aims at identifying hydrologic signatures of the processes or landscape characteristics acting as major controls. While groundwater, soil water and transpiration are closely linked by the water cycle, they are controlled by different processes and we expect this to be reflected in interlinked but not necessarily congruent patterns and responses.
[Application of Kohonen Self-Organizing Feature Maps in QSAR of human ADMET and kinase data sets].
Hegymegi-Barakonyi, Bálint; Orfi, László; Kéri, György; Kövesdi, István
2013-01-01
QSAR predictions have proven very useful in a large number of drug design studies, such as the design of kinase inhibitors as targets for cancer therapy; however, the overall predictability often remains unsatisfactory. To improve predictability of ADMET features and kinase inhibitory data, we present a new method using Kohonen's Self-Organizing Feature Map (SOFM) to cluster molecules based on explanatory variables (X) and separate dissimilar ones. We calculated SOFM clusters for a large number of molecules with human ADMET and kinase inhibitory data, and we showed that chemically similar molecules were in the same SOFM cluster, and within such clusters the QSAR models had significantly better predictability. We also used target variables (Y, e.g. ADMET) jointly with X variables to create a novel type of clustering. With our method, cells of loosely coupled XY data could be identified and separated into different model building sets.
Visual analytics of large multidimensional data using variable binned scatter plots
NASA Astrophysics Data System (ADS)
Hao, Ming C.; Dayal, Umeshwar; Sharma, Ratnesh K.; Keim, Daniel A.; Janetzko, Halldór
2010-01-01
The scatter plot is a well-known method of visualizing pairs of two-dimensional continuous variables. Multidimensional data can be depicted in a scatter plot matrix. Scatter plots are intuitive and easy to use, but often have a high degree of overlap which may occlude a significant portion of the data. In this paper, we propose variable binned scatter plots to allow the visualization of large amounts of data without overlapping. The basic idea is to use a non-uniform (variable) binning of the x and y dimensions and to plot all the data points that fall within each bin into corresponding squares. Further, we map a third attribute to color for visualizing clusters. Analysts are able to interact with individual data points for record level information. We have applied these techniques to solve real-world problems on credit card fraud and data center energy consumption to visualize their data distribution and cause-effect among multiple attributes. A comparison of our methods with two recent well-known variants of scatter plots is included.
NASA Astrophysics Data System (ADS)
Welch, D.; Henden, A.; Bell, T.; Suen, C.; Fare, I.; Sills, A.
2015-12-01
(Abstract only) The variable stars of globular clusters have played and continue to play a significant role in our understanding of certain classes of variable stars. Since all stars associated with a cluster have the same age, metallicity, distance and usually very similar (if not identical) reddenings, such variables can produce uniquely powerful constraints on where certain types of pulsation behaviors are excited. Advanced amateur astronomers are increasingly well-positioned to provide long-term CCD monitoring of globular cluster variable stars but are hampered by a long history of poor or inaccessible finder charts and coordinates. Many variable-rich clusters have published photographic finder charts taken in relatively poor seeing with blue-sensitive photographic plates. While useful signal-to-noise ratios are relatively straightforward to achieve for RR Lyrae, Type 2 Cepheids, and red giant variables, correct identification remains a difficult issue, particularly when images are taken at V or longer wavelengths. We describe the project and report its progress using the OC61, TMO61, and SRO telescopes of AAVSOnet after the first year of image acquisition and demonstrate several of the data products being developed for globular cluster variables.
Topic modeling for cluster analysis of large biological and medical datasets.
Zhao, Weizhong; Zou, Wen; Chen, James J
2014-01-01
The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting that topic model-based methods could provide an analytic advancement in the analysis of large biological or medical datasets. PMID:25350106
Mixture modelling for cluster analysis.
McLachlan, G J; Chang, S U
2004-10-01
Cluster analysis via a finite mixture model approach is considered. With this approach to clustering, the data can be partitioned into a specified number of clusters g by first fitting a mixture model with g components. An outright clustering of the data is then obtained by assigning an observation to the component to which it has the highest estimated posterior probability of belonging; that is, the ith cluster consists of those observations assigned to the ith component (i = 1,..., g). The focus is on the use of mixtures of normal components for the cluster analysis of data that can be regarded as being continuous. But attention is also given to the case of mixed data, where the observations consist of both continuous and discrete variables.
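The assignment rule described above (each observation goes to the component with the highest estimated posterior probability) can be sketched for a univariate normal mixture. The sketch assumes the mixture has already been fitted, for example by EM, so the weights, means and standard deviations in the usage example are hypothetical inputs; the function names are illustrative.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def posterior(x, weights, means, sds):
    """Posterior probability that x belongs to each mixture component."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]

def assign(x, weights, means, sds):
    """Index of the component with the highest posterior probability."""
    post = posterior(x, weights, means, sds)
    return max(range(len(post)), key=post.__getitem__)

if __name__ == "__main__":
    # Hypothetical fitted two-component mixture (g = 2).
    weights, means, sds = [0.5, 0.5], [0.0, 10.0], [1.0, 1.0]
    for x in (-1.0, 4.0, 6.0, 12.0):
        print(x, assign(x, weights, means, sds), posterior(x, weights, means, sds))
```

For observations very far from every component, the densities can underflow to zero; a more careful version would work with log densities.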
Data-driven process decomposition and robust online distributed modelling for large-scale processes
NASA Astrophysics Data System (ADS)
Shu, Zhang; Lijuan, Li; Lijuan, Yao; Shipin, Yang; Tao, Zou
2018-02-01
With the increasing attention of networked control, system decomposition and distributed models show significant importance in the implementation of model-based control strategy. In this paper, a data-driven system decomposition and online distributed subsystem modelling algorithm was proposed for large-scale chemical processes. The key controlled variables are first partitioned by affinity propagation clustering algorithm into several clusters. Each cluster can be regarded as a subsystem. Then the inputs of each subsystem are selected by offline canonical correlation analysis between all process variables and its controlled variables. Process decomposition is then realised after the screening of input and output variables. When the system decomposition is finished, the online subsystem modelling can be carried out by recursively block-wise renewing the samples. The proposed algorithm was applied in the Tennessee Eastman process and the validity was verified.
An Empirical Comparison of Variable Standardization Methods in Cluster Analysis.
ERIC Educational Resources Information Center
Schaffer, Catherine M.; Green, Paul E.
1996-01-01
The common marketing research practice of standardizing the columns of a persons-by-variables data matrix prior to clustering the entities corresponding to the rows was evaluated with 10 large-scale data sets. Results indicate that the column standardization practice may be problematic for some kinds of data that marketing researchers used for…
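For readers unfamiliar with the practice being evaluated, the sketch below shows one common form of column standardization, the z-score, applied to a persons-by-variables matrix before clustering. It is only an illustration of the general idea; the abstract does not say which standardization variants were compared, and the data here are made up.

```python
def standardize_columns(matrix):
    """Z-score each column: subtract the column mean, divide by the sample SD."""
    columns = list(zip(*matrix))
    scaled = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / (n - 1)) ** 0.5
        scaled.append([(v - mean) / sd for v in col])
    # Transpose back to rows (persons) by columns (variables).
    return [list(row) for row in zip(*scaled)]

if __name__ == "__main__":
    # Hypothetical persons-by-variables data on very different scales.
    data = [[1, 100], [2, 200], [3, 300]]
    print(standardize_columns(data))
```

After standardization every variable contributes on the same scale, which removes differences in spread between variables; that equalization is the step whose usefulness the study questions for some kinds of data.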
Random variability explains apparent global clustering of large earthquakes
Michael, A.J.
2011-01-01
The occurrence of 5 Mw ≥ 8.5 earthquakes since 2004 has created a debate over whether or not we are in a global cluster of large earthquakes, temporarily raising risks above long-term levels. I use three classes of statistical tests to determine if the record of M ≥ 7 earthquakes since 1900 can reject a null hypothesis of independent random events with a constant rate plus localized aftershock sequences. The data cannot reject this null hypothesis. Thus, the temporal distribution of large global earthquakes is well-described by a random process, plus localized aftershocks, and apparent clustering is due to random variability. Therefore the risk of future events has not increased, except within ongoing aftershock sequences, and should be estimated from the longest possible record of events.
Continuous-variable quantum computing in optical time-frequency modes using quantum memories.
Humphreys, Peter C; Kolthammer, W Steven; Nunn, Joshua; Barbieri, Marco; Datta, Animesh; Walmsley, Ian A
2014-09-26
We develop a scheme for time-frequency encoded continuous-variable cluster-state quantum computing using quantum memories. In particular, we propose a method to produce, manipulate, and measure two-dimensional cluster states in a single spatial mode by exploiting the intrinsic time-frequency selectivity of Raman quantum memories. Time-frequency encoding enables the scheme to be extremely compact, requiring a number of memories that is a linear function of only the number of different frequencies in which the computational state is encoded, independent of its temporal duration. We therefore show that quantum memories can be a powerful component for scalable photonic quantum information processing architectures.
Universal quantum computation with temporal-mode bilayer square lattices
NASA Astrophysics Data System (ADS)
Alexander, Rafael N.; Yokoyama, Shota; Furusawa, Akira; Menicucci, Nicolas C.
2018-03-01
We propose an experimental design for universal continuous-variable quantum computation that incorporates recent innovations in linear-optics-based continuous-variable cluster state generation and cubic-phase gate teleportation. The first ingredient is a protocol for generating the bilayer-square-lattice cluster state (a universal resource state) with temporal modes of light. With this state, measurement-based implementation of Gaussian unitary gates requires only homodyne detection. Second, we describe a measurement device that implements an adaptive cubic-phase gate, up to a random phase-space displacement. It requires a two-step sequence of homodyne measurements and consumes a (non-Gaussian) cubic-phase state.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bauer, Anne H.; Seitz, Stella; Jerke, Jonathan
2011-05-10
We introduce a technique to measure gravitational lensing magnification using the variability of type I quasars. Quasars' variability amplitudes and luminosities are tightly correlated, on average. Magnification due to gravitational lensing increases the quasars' apparent luminosity, while leaving the variability amplitude unchanged. Therefore, the mean magnification of an ensemble of quasars can be measured through the mean shift in the variability-luminosity relation. As a proof of principle, we use this technique to measure the magnification of quasars spectroscopically identified in the Sloan Digital Sky Survey (SDSS), due to gravitational lensing by galaxy clusters in the SDSS MaxBCG catalog. The Palomar-QUEST Variability Survey, reduced using the DeepSky pipeline, provides variability data for the sources. We measure the average quasar magnification as a function of scaled distance (r/R_200) from the nearest cluster; our measurements are consistent with expectations assuming Navarro-Frenk-White cluster profiles, particularly after accounting for the known uncertainty in the clusters' centers. Variability-based lensing measurements are a valuable complement to shape-based techniques because their systematic errors are very different, and also because the variability measurements are amenable to photometric errors of a few percent and to depths seen in current wide-field surveys. Given the volume of data expected from current and upcoming surveys, this new technique has the potential to be competitive with weak lensing shear measurements of large-scale structure.
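The idea in this abstract rests on a standard photometric relation, sketched here in assumed notation (the symbols are not taken from the paper): a magnification \mu multiplies the observed flux F, which shifts the apparent magnitude m by a fixed amount, while the fractional variability \delta F / F is unaffected.

```latex
F_{\mathrm{obs}} = \mu F
\quad\Longrightarrow\quad
m_{\mathrm{obs}} = m - 2.5 \log_{10} \mu ,
\qquad
\frac{\delta F_{\mathrm{obs}}}{F_{\mathrm{obs}}} = \frac{\delta F}{F}
```

An ensemble of lensed quasars would therefore appear offset along the luminosity axis of the variability-luminosity relation, which is the shift the technique measures.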
Continuous Variable Cluster State Generation over the Optical Spatial Mode Comb
Pooser, Raphael C.; Jing, Jietai
2014-10-20
One-way quantum computing uses single-qubit projective measurements performed on a cluster state (a highly entangled state of multiple qubits) in order to enact quantum gates. The model is promising due to its potential scalability; the cluster state may be produced at the beginning of the computation and operated on over time. Continuous variables (CV) offer another potential benefit in the form of deterministic entanglement generation. This determinism can lead to robust cluster states and scalable quantum computation. Recent demonstrations of CV cluster states have made great strides on the path to scalability utilizing either time or frequency multiplexing in optical parametric oscillators (OPO) both above and below threshold. The techniques relied on a combination of entangling operators and beam splitter transformations. Here we show that an analogous transformation exists for amplifiers with Gaussian input states operating on multiple spatial modes. By judicious selection of local oscillators (LOs), the spatial mode distribution is analogous to the optical frequency comb consisting of axial modes in an OPO cavity. We outline an experimental system that generates cluster states across the spatial frequency comb which can also scale the amount of quantum noise reduction to potentially larger than in other systems.
Cluster Analysis of Velocity Field Derived from Dense GNSS Network of Japan
NASA Astrophysics Data System (ADS)
Takahashi, A.; Hashimoto, M.
2015-12-01
Dense GNSS networks have been widely used to observe crustal deformation. Simpson et al. (2012) and Savage and Simpson (2013) conducted cluster analyses of the GNSS velocity field in the San Francisco Bay Area and the Mojave Desert, respectively. They successfully found velocity discontinuities and showed the advantage of cluster analysis for classifying GNSS velocity fields. Because strike-slip events are dominant in the western United States, the geometry there is simple. The Japanese Islands, however, are tectonically complicated owing to the subduction of oceanic plates, and there are many types of crustal deformation, such as slow slip events and large postseismic deformation. We propose a modified clustering method for the GNSS velocity field in Japan that separates time-variant and static crustal deformation. Our modification is to perform the cluster analysis every several months or years and then to qualify cluster member similarity. If a GNSS station moves differently from its neighboring GNSS stations, it will not belong to the cluster that includes its surrounding stations. With this method, time-variant phenomena were distinguished. We applied our method to GNSS data from Japan from 1996 to 2015 and derived the following conclusions. First, the cluster boundaries are consistent with known active faults. For example, the Arima-Takatsuki-Hanaore fault system and the Shimane-Tottori segment proposed by Nishimura (2015) are recognized without using prior information. Second, the method improves the detectability of time-variable phenomena, such as the slow slip event in the northern part of the Hokkaido region detected by Ohzono et al. (2015). Third, the method classifies postseismic deformation caused by large earthquakes. The result suggests velocity discontinuities in the postseismic deformation of the Tohoku-oki earthquake, which implies that postseismic deformation does not decay continuously in proportion to distance from the epicenter.
State estimation and prediction using clustered particle filters.
Lee, Yoonsang; Majda, Andrew J
2016-12-20
Particle filtering is an essential tool to improve uncertain model predictions by incorporating noisy observational data from complex systems including non-Gaussian features. A class of particle filters, clustered particle filters, is introduced for high-dimensional nonlinear systems, which uses relatively few particles compared with the standard particle filter. The clustered particle filter captures non-Gaussian features of the true signal, which are typical in complex nonlinear dynamical systems such as geophysical systems. The method is also robust in the difficult regime of high-quality sparse and infrequent observations. The key features of the clustered particle filtering are coarse-grained localization through the clustering of the state variables and particle adjustment to stabilize the method; each observation affects only neighbor state variables through clustering and particles are adjusted to prevent particle collapse due to high-quality observations. The clustered particle filter is tested for the 40-dimensional Lorenz 96 model with several dynamical regimes including strongly non-Gaussian statistics. The clustered particle filter shows robust skill in both achieving accurate filter results and capturing non-Gaussian statistics of the true signal. It is further extended to multiscale data assimilation, which provides the large-scale estimation by combining a cheap reduced-order forecast model and mixed observations of the large- and small-scale variables. This approach enables the use of a larger number of particles due to the computational savings in the forecast model. The multiscale clustered particle filter is tested for one-dimensional dispersive wave turbulence using a forecast model with model errors.
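For orientation, one cycle of the standard (bootstrap) particle filter that the abstract uses as its baseline can be sketched as follows. This is not the clustered particle filter: the coarse-grained localization and particle adjustment described above are not shown, and the scalar state, the random-walk forecast model, and all names are assumptions made for illustration.

```python
import math
import random

def particle_filter_step(particles, observation, forecast, obs_std, rng):
    """One forecast/update/resample cycle of a standard bootstrap particle filter."""
    # Forecast: push every particle through the (possibly stochastic) model.
    forecasted = [forecast(x, rng) for x in particles]
    # Update: weight each particle by the Gaussian likelihood of the observation.
    log_w = [-0.5 * ((observation - x) / obs_std) ** 2 for x in forecasted]
    shift = max(log_w)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(lw - shift) for lw in log_w]
    total = sum(unnormalized)
    weights = [w / total for w in unnormalized]
    # Effective sample size: values near 1 signal particle collapse.
    ess = 1.0 / sum(w * w for w in weights)
    # Resample: draw particles with probability proportional to weight.
    resampled = rng.choices(forecasted, weights=weights, k=len(forecasted))
    return resampled, weights, ess

def random_walk(x, rng):
    # Hypothetical forecast model: a random walk with unit-variance noise.
    return x + rng.gauss(0.0, 1.0)

rng = random.Random(0)
prior = [rng.gauss(0.0, 5.0) for _ in range(200)]
posterior, weights, ess = particle_filter_step(prior, 3.0, random_walk, 0.5, rng)
posterior_mean = sum(posterior) / len(posterior)
```

With an accurate observation (small `obs_std`) only a few particles receive appreciable weight and the effective sample size drops, which is the collapse in the high-quality-observation regime that the abstract's particle adjustment is meant to prevent.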
State estimation and prediction using clustered particle filters
Lee, Yoonsang; Majda, Andrew J.
2016-01-01
Particle filtering is an essential tool to improve uncertain model predictions by incorporating noisy observational data from complex systems including non-Gaussian features. A class of particle filters, clustered particle filters, is introduced for high-dimensional nonlinear systems, which uses relatively few particles compared with the standard particle filter. The clustered particle filter captures non-Gaussian features of the true signal, which are typical in complex nonlinear dynamical systems such as geophysical systems. The method is also robust in the difficult regime of high-quality sparse and infrequent observations. The key features of the clustered particle filtering are coarse-grained localization through the clustering of the state variables and particle adjustment to stabilize the method; each observation affects only neighbor state variables through clustering and particles are adjusted to prevent particle collapse due to high-quality observations. The clustered particle filter is tested for the 40-dimensional Lorenz 96 model with several dynamical regimes including strongly non-Gaussian statistics. The clustered particle filter shows robust skill in both achieving accurate filter results and capturing non-Gaussian statistics of the true signal. It is further extended to multiscale data assimilation, which provides the large-scale estimation by combining a cheap reduced-order forecast model and mixed observations of the large- and small-scale variables. This approach enables the use of a larger number of particles due to the computational savings in the forecast model. The multiscale clustered particle filter is tested for one-dimensional dispersive wave turbulence using a forecast model with model errors. PMID:27930332
Strong influence of variable treatment on the performance of numerically defined ecological regions.
Snelder, Ton; Lehmann, Anthony; Lamouroux, Nicolas; Leathwick, John; Allenbach, Karin
2009-10-01
Numerical clustering has frequently been used to define hierarchically organized ecological regionalizations, but there has been little robust evaluation of their performance (i.e., the degree to which regions discriminate areas with similar ecological character). In this study we investigated the effect of the weighting and treatment of input variables on the performance of regionalizations defined by agglomerative clustering across a range of hierarchical levels. For this purpose, we developed three ecological regionalizations of Switzerland of increasing complexity using agglomerative clustering. Environmental data for our analysis were drawn from a 400 m grid and consisted of estimates of 11 environmental variables for each grid cell describing climate, topography and lithology. Regionalization 1 was defined from the environmental variables which were given equal weights. We used the same variables in Regionalization 2 but weighted and transformed them on the basis of a dissimilarity model that was fitted to land cover composition data derived for a random sample of cells from interpretation of aerial photographs. Regionalization 3 was a further two-stage development of Regionalization 2 where specific classifications, also weighted and transformed using dissimilarity models, were applied to 25 small scale "sub-domains" defined by Regionalization 2. Performance was assessed in terms of the discrimination of land cover composition for an independent set of sites using classification strength (CS), which measured the similarity of land cover composition within classes and the dissimilarity between classes. Regionalization 2 performed significantly better than Regionalization 1, but the largest gains in performance, compared to Regionalization 1, occurred at coarse hierarchical levels (i.e., CS did not increase significantly beyond the 25-region level). 
Regionalization 3 performed better than Regionalization 2 beyond the 25-region level and CS values continued to increase to the 95-region level. The results show that the performance of regionalizations defined by agglomerative clustering is sensitive to variable weighting and transformation. We conclude that large gains in performance can be achieved by training classifications using dissimilarity models. However, these gains are restricted to a narrow range of hierarchical levels because agglomerative clustering is unable to represent the variation in importance of variables at different spatial scales. We suggest that further advances in the numerical definition of hierarchically organized ecological regionalizations will be possible with techniques developed in the field of statistical modeling of the distribution of community composition.
NASA Technical Reports Server (NTRS)
Goldberg, Leo
1987-01-01
Observational evidence for mass loss from cool stars is reviewed. Spectral line profiles are used for the derivation of mass-loss rates with the aid of the equation of continuity. This equation implies steady mass loss with spherical symmetry. Data from binary stars, Mira variables, and red giants in globular clusters are examined. Silicate emission is discussed as a useful indicator of mass loss in the middle infrared spectra. The use of thermal millimeter-wave radiation, Very Large Array (VLA) measurements of radio emission, and OH/IR masers as tools for mass-loss measurement is discussed. Evidence for nonsteady mass loss is also reviewed.
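For reference, the equation of continuity for a steady, spherically symmetric outflow has the standard form below. The notation is assumed here rather than taken from the review: \dot{M} is the mass-loss rate, and \rho(r) and v(r) are the gas density and outflow velocity at radius r.

```latex
\dot{M} = 4\pi r^{2}\,\rho(r)\,v(r)
```

Given a velocity from a line profile and a density estimate at some radius, this relation yields the mass-loss rate, provided the steady-state and symmetry assumptions hold.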
Variable Stars in Large Magellanic Cloud Globular Clusters. II. NGC 1786
NASA Astrophysics Data System (ADS)
Kuehn, Charles A.; Smith, Horace A.; Catelan, Márcio; Pritzl, Barton J.; De Lee, Nathan; Borissova, Jura
2012-12-01
This is the second in a series of papers studying the variable stars in Large Magellanic Cloud globular clusters. The primary goal of this series is to study how RR Lyrae stars in Oosterhoff-intermediate systems compare to their counterparts in Oosterhoff I/II systems. In this paper, we present the results of our new time-series B-V photometric study of the globular cluster NGC 1786. A total of 65 variable stars were identified in our field of view. These variables include 53 RR Lyraes (27 RRab, 18 RRc, and 8 RRd), 3 classical Cepheids, 1 Type II Cepheid, 1 Anomalous Cepheid, 2 eclipsing binaries, 3 Delta Scuti/SX Phoenicis variables, and 2 variables of undetermined type. Photometric parameters for these variables are presented. We present physical properties for some of the RR Lyrae stars, derived from Fourier analysis of their light curves. We discuss several different indicators of Oosterhoff type which indicate that the Oosterhoff classification of NGC 1786 is not as clear cut as what is seen in most globular clusters. Based on observations taken with the SMARTS 1.3 m telescope operated by the SMARTS Consortium and observations taken at the Southern Astrophysical Research (SOAR) telescope, which is a joint project of the Ministério da Ciência, Tecnologia, e Inovação (MCTI) da República Federativa do Brasil, the U.S. National Optical Astronomy Observatory (NOAO), the University of North Carolina at Chapel Hill (UNC), and Michigan State University (MSU).
Coarse-Grained Clustering Dynamics of Heterogeneously Coupled Neurons.
Moon, Sung Joon; Cook, Katherine A; Rajendran, Karthikeyan; Kevrekidis, Ioannis G; Cisternas, Jaime; Laing, Carlo R
2015-12-01
The formation of oscillating phase clusters in a network of identical Hodgkin-Huxley neurons is studied, along with their dynamic behavior. The neurons are synaptically coupled in an all-to-all manner, yet the synaptic coupling characteristic time is heterogeneous across the connections. In a network of N neurons where this heterogeneity is characterized by a prescribed random variable, the oscillatory single-cluster state can transition, through [Formula: see text] (possibly perturbed) period-doubling and subsequent bifurcations, to a variety of multiple-cluster states. The clustering dynamic behavior is computationally studied both at the detailed and the coarse-grained levels, and a numerical approach that can enable studying the coarse-grained dynamics in a network of arbitrarily large size is suggested. Among a number of cluster states formed, double clusters, composed of nearly equal sub-network sizes, are seen to be stable; interestingly, the heterogeneity parameter in each of the double-cluster components tends to be consistent with the random variable over the entire network: given a double-cluster state, permuting the dynamical variables of the neurons can lead to a combinatorially large number of different, yet similar "fine" states that appear practically identical at the coarse-grained level. For weak heterogeneity we find that correlations rapidly develop, within each cluster, between the neuron's "identity" (its own value of the heterogeneity parameter) and its dynamical state. For single- and double-cluster states we demonstrate an effective coarse-graining approach that uses the Polynomial Chaos expansion to succinctly describe the dynamics by these quickly established "identity-state" correlations. This coarse-graining approach is utilized, within the equation-free framework, to perform efficient computations of the neuron ensemble dynamics.
Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups
Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José
2013-01-01
Introduction: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results: Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the first two dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674
KmL3D: a non-parametric algorithm for clustering joint trajectories.
Genolini, C; Pingault, J B; Driss, T; Côté, S; Tremblay, R E; Vitaro, F; Arnaud, C; Falissard, B
2013-01-01
In cohort studies, variables are measured repeatedly and can be considered as trajectories. A classic way to work with trajectories is to cluster them in order to detect the existence of homogeneous patterns of evolution. Since cohort studies usually measure a large number of variables, it might be interesting to study the joint evolution of several variables (also called joint-variable trajectories). To date, the only way to cluster joint-trajectories is to cluster each trajectory independently, then to cross the partitions obtained. This approach is unsatisfactory because it does not take into account a possible co-evolution of variable-trajectories. KmL3D is an R package that implements a version of k-means dedicated to clustering joint-trajectories. It provides facilities for the management of missing values, offers several quality criteria and its graphic interface helps the user to select the best partition. KmL3D can work with any number of joint-variable trajectories. In the restricted case of two joint trajectories, it proposes 3D tools to visualize the partitioning and then export 3D dynamic rotating-graphs to PDF format. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
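The idea of clustering joint trajectories as single objects, rather than clustering each variable separately and crossing the partitions, can be sketched with a plain k-means. KmL3D itself is an R package; the Python code below is not its API, and the data layout, function names, and fixed starting centers are assumptions made for illustration. Missing-value handling and quality criteria are not shown.

```python
import math

def joint_distance(a, b):
    """Euclidean distance between two joint trajectories.

    A trajectory is a list of time points; each time point is a tuple with
    one value per variable, so all variables contribute to one distance.
    """
    return math.sqrt(sum((x - y) ** 2
                         for pa, pb in zip(a, b)
                         for x, y in zip(pa, pb)))

def mean_trajectory(members):
    """Point-wise mean of a non-empty list of joint trajectories."""
    n = len(members)
    n_times = len(members[0])
    n_vars = len(members[0][0])
    return [tuple(sum(m[t][v] for m in members) / n for v in range(n_vars))
            for t in range(n_times)]

def kmeans_joint(trajs, initial_centers, max_iter=100):
    """Plain k-means on joint trajectories, started from the given centers."""
    centers = list(initial_centers)  # copy so the caller's list is untouched
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda c: joint_distance(tr, centers[c]))
                      for tr in trajs]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for c in range(len(centers)):
            members = [tr for tr, lab in zip(trajs, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = mean_trajectory(members)
    return labels, centers

# Hypothetical data: the first variable rises for everyone; the second
# variable rises for two individuals and falls for the other two.
trajs = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    [(0.0, 0.0), (1.1, 0.9), (2.1, 1.9)],
    [(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)],
    [(0.0, 2.1), (0.9, 1.0), (2.0, 0.1)],
]
labels, centers = kmeans_joint(trajs, [trajs[0], trajs[2]])
```

Because the distance sums over all variables at every time point, individuals are grouped by how their variables evolve together, which is the co-evolution that independent per-variable clustering can miss.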
RR Lyrae stars and the horizontal branch of NGC 5904 (M5)
NASA Astrophysics Data System (ADS)
Arellano Ferro, A.; Luna, A.; Bramich, D. M.; Giridhar, Sunetra; Ahumada, J. A.; Muneer, S.
2016-05-01
We report the distance and [Fe/H] value for the globular cluster NGC 5904 (M5) derived from the Fourier decomposition of the light curves of selected RRab and RRc stars. The aim in doing this was to bring these parameters into the homogeneous scales established by our previous work on numerous other globular clusters, allowing a direct comparison of the horizontal branch luminosity in clusters with a wide range of metallicities. Our CCD photometry of the large variable star population of this cluster is used to discuss light curve peculiarities, like Blazhko modulations, on an individual basis. New Blazhko variables are reported.
Cluster analysis and prediction of treatment outcomes for chronic rhinosinusitis.
Soler, Zachary M; Hyer, J Madison; Rudmik, Luke; Ramakrishnan, Viswanathan; Smith, Timothy L; Schlosser, Rodney J
2016-04-01
Current clinical classifications of chronic rhinosinusitis (CRS) have weak prognostic utility regarding treatment outcomes. Simplified discriminant analysis based on unsupervised clustering has identified novel phenotypic subgroups of CRS, but prognostic utility is unknown. We sought to determine whether discriminant analysis allows prognostication in patients choosing surgery versus continued medical management. A multi-institutional prospective study of patients with CRS in whom initial medical therapy failed who then self-selected continued medical management or surgical treatment was used to separate patients into 5 clusters based on a previously described discriminant analysis using total Sino-Nasal Outcome Test-22 (SNOT-22) score, age, and missed productivity. Patients completed the SNOT-22 at baseline and for 18 months of follow-up. Baseline demographic and objective measures included olfactory testing, computed tomography, and endoscopy scoring. SNOT-22 outcomes for surgical versus continued medical treatment were compared across clusters. Data were available on 690 patients. Baseline differences in demographics, comorbidities, objective disease measures, and patient-reported outcomes were similar to previous clustering reports. Three of 5 clusters identified by means of discriminant analysis had improved SNOT-22 outcomes with surgical intervention when compared with continued medical management (surgery was a mean of 21.2 points better across these 3 clusters at 6 months, P < .05). These differences were sustained at 18 months of follow-up. Two of 5 clusters had similar outcomes when comparing surgery with continued medical management. A simplified discriminant analysis based on 3 common clinical variables is able to cluster patients and provide prognostic information regarding surgical treatment versus continued medical management in patients with CRS. Copyright © 2015 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. 
All rights reserved.
The WIYN Open Cluster Study: A 15-Year Report
NASA Astrophysics Data System (ADS)
Mathieu, Robert D.; WOCS Collaboration
2013-06-01
The WIYN 3.5m telescope combines large aperture, wide field of view and superb image quality. The WIYN consortium includes investigators in numerous areas of open cluster research. The combination spawned the WIYN Open Cluster Study (WOCS) over a decade ago, with the goals of producing 1) comprehensive photometric, astrometric and spectroscopic data for new fundamental open clusters and 2) addressing key astrophysical problems with these data. The set of core WOCS open clusters spans age and metallicity. Low reddening, solar proximity and richness were also desirable features in selecting core open clusters. More than 50 WIYN Open Cluster Study papers have been published in refereed journals. Highlights include: deep and wide-field photometry of NGC 188, NGC 2168 (M35), and NGC 6819 (WOCS I, II, XI and LII); deep and wide-field proper-motion studies of the old open clusters NGC 188, NGC 2682 (M67) and NGC 6791 (WOCS XVII, XXXIII and XLVI); comprehensive radial-velocity surveys of NGC 188, NGC 2168 and NGC 6819 (WOCS XXXII, XXIV, and XXXVIII); metallicity and lithium abundances in NGC 2168 (WOCS V); comprehensive definition of the hard-binary populations of NGC 188 and NGC 2168 (WOCS XXII and XLVIII); rotation period distributions in NGC 1039 (M34) and NGC 2168 (WOCS XXXV, XLIII, and XLV); study of chromospheric activity in NGC 2682 (WOCS XVIII); photometric variability surveys in NGC 188 and NGC 2682 (IX and XV); new Bayesian techniques for determination of cluster parameters (WOCS XXIII); a new infrared age-diagnostic for open clusters (WOCS XL); theoretical studies of stellar rotation (WOCS XIII and XIV); sophisticated N-body simulations of NGC 188 (WOCS LI); and the discovery of a high binary frequency and white dwarf companions among NGC 188 blue stragglers. 
While the WIYN 3.5m telescope remains at its heart, today the WIYN Open Cluster Study collaboration extends beyond both the WIYN observatory and consortium, and continues as a vital and productive exploration into these fundamental stellar systems. Publication list can be found at http://www.astro.ufl.edu ata/wocs/pubs.html. The WIYN Open Cluster Study has been continuously supported by grants from the National Science Foundation.
Edmands, William M B; Barupal, Dinesh K; Scalbert, Augustin
2015-03-01
MetMSLine represents a complete collection of functions in the R programming language as an accessible GUI for biomarker discovery in large-scale liquid-chromatography high-resolution mass spectral datasets from acquisition through to final metabolite identification forming a backend to output from any peak-picking software such as XCMS. MetMSLine automatically creates subdirectories, data tables and relevant figures at the following steps: (i) signal smoothing, normalization, filtration and noise transformation (PreProc.QC.LSC.R); (ii) PCA and automatic outlier removal (Auto.PCA.R); (iii) automatic regression, biomarker selection, hierarchical clustering and cluster ion/artefact identification (Auto.MV.Regress.R); (iv) Biomarker-MS/MS fragmentation spectra matching and fragment/neutral loss annotation (Auto.MS.MS.match.R) and (v) semi-targeted metabolite identification based on a list of theoretical masses obtained from public databases (DBAnnotate.R). All source code and suggested parameters are available in an un-encapsulated layout on http://wmbedmands.github.io/MetMSLine/. Readme files and a synthetic dataset of both X-variables (simulated LC-MS data), Y-variables (simulated continuous variables) and metabolite theoretical masses are also available on our GitHub repository. © The Author 2014. Published by Oxford University Press.
Edmands, William M. B.; Barupal, Dinesh K.; Scalbert, Augustin
2015-01-01
Summary: MetMSLine represents a complete collection of functions in the R programming language as an accessible GUI for biomarker discovery in large-scale liquid-chromatography high-resolution mass spectral datasets from acquisition through to final metabolite identification forming a backend to output from any peak-picking software such as XCMS. MetMSLine automatically creates subdirectories, data tables and relevant figures at the following steps: (i) signal smoothing, normalization, filtration and noise transformation (PreProc.QC.LSC.R); (ii) PCA and automatic outlier removal (Auto.PCA.R); (iii) automatic regression, biomarker selection, hierarchical clustering and cluster ion/artefact identification (Auto.MV.Regress.R); (iv) Biomarker—MS/MS fragmentation spectra matching and fragment/neutral loss annotation (Auto.MS.MS.match.R) and (v) semi-targeted metabolite identification based on a list of theoretical masses obtained from public databases (DBAnnotate.R). Availability and implementation: All source code and suggested parameters are available in an un-encapsulated layout on http://wmbedmands.github.io/MetMSLine/. Readme files and a synthetic dataset of both X-variables (simulated LC–MS data), Y-variables (simulated continuous variables) and metabolite theoretical masses are also available on our GitHub repository. Contact: ScalbertA@iarc.fr PMID:25348215
Henriques, David; González, Patricia; Doallo, Ramón; Saez-Rodriguez, Julio; Banga, Julio R.
2017-01-01
Background: We consider a general class of global optimization problems dealing with nonlinear dynamic models. Although this class is relevant to many areas of science and engineering, here we are interested in applying this framework to the reverse engineering problem in computational systems biology, which yields very large mixed-integer dynamic optimization (MIDO) problems. In particular, we consider the framework of logic-based ordinary differential equations (ODEs). Methods: We present saCeSS2, a parallel method for the solution of this class of problems. This method is based on a parallel cooperative scatter search metaheuristic, with new mechanisms of self-adaptation and specific extensions to handle large mixed-integer problems. We have paid special attention to the avoidance of convergence stagnation using adaptive cooperation strategies tailored to this class of problems. Results: We illustrate its performance with a set of three very challenging case studies from the domain of dynamic modelling of cell signaling. The simplest case study considers a synthetic signaling pathway and has 84 continuous and 34 binary decision variables. A second case study considers the dynamic modeling of signaling in liver cancer using high-throughput data, and has 135 continuous and 109 binary decision variables. The third case study is an extremely difficult problem related to breast cancer, involving 690 continuous and 138 binary decision variables. We report computational results obtained in different infrastructures, including a local cluster, a large supercomputer and a public cloud platform. Interestingly, the results show how the cooperation of individual parallel searches modifies the systemic properties of the sequential algorithm, achieving superlinear speedups compared to an individual search (e.g. speedups of 15 with 10 cores), and significantly improving (by more than 60%) the performance with respect to a non-cooperative parallel scheme. The scalability of the method is also good (tests were performed using up to 300 cores). Conclusions: These results demonstrate that saCeSS2 can be used to successfully reverse engineer large dynamic models of complex biological pathways. Further, these results open up new possibilities for other MIDO-based large-scale applications in the life sciences such as metabolic engineering, synthetic biology, and drug scheduling. PMID:28813442
The first search for variable stars in the open cluster NGC 6253 and its surrounding field
NASA Astrophysics Data System (ADS)
de Marchi, F.; Poretti, E.; Montalto, M.; Desidera, S.; Piotto, G.
2010-01-01
Aims: This work presents the first high-precision variability survey in the field of the intermediate-age, metal-rich open cluster NGC 6253. Clusters of this type are benchmarks for stellar evolution models. Methods: Continuous photometric monitoring of the cluster and its surrounding field was performed over a time span of ten nights using the Wide Field Imager mounted at the ESO-MPI 2.2 m telescope. High-quality timeseries, each composed of about 800 datapoints, were obtained for 250 000 stars using ISIS and DAOPHOT packages. Candidate members were selected by using the colour-magnitude diagrams and period-luminosity-colour relations. Membership probabilities based on the proper motions were also used. The membership of all the variables discovered within a radius of 8′ from the centre is discussed by comparing the incidence of the classes in the cluster direction and in the surrounding field. Results: We discovered 595 variables and we also characterized most of them, providing their variability classes, periods, and amplitudes. The sample is complete for short periods: we classified 20 pulsating variables, 225 contact systems, 99 eclipsing systems (22 β Lyr type, 59 β Per type, 18 RS CVn type), and 77 rotational variables. The time-baseline hampered the precise characterization of 173 variables with periods longer than 4-5 days. Moreover, we found a cataclysmic system undergoing an outburst of about 2.5 mag. We propose a list of 35 variable stars as probable members of NGC 6253.
Compositional variability in Mediterranean archaeofaunas from Upper Paleolithic Southwest Europe
NASA Astrophysics Data System (ADS)
Jones, Emily Lena
2018-03-01
Recent meta-analyses of Upper Paleolithic Southwestern European archaeofaunas (Jones, 2015, 2016) have identified a consistent "Mediterranean" cluster from the Last Glacial Maximum through the early Holocene, suggesting similarities in environment and/or consistency in hunting strategy across this region through time despite radical changes in climate. However, while the archaeofaunas in this cluster all derive from sites located within today's Mediterranean bioclimatic region, many of them are from locations far from the Mediterranean Sea (Atlantic Portugal, the Spanish Meseta) which today differ significantly from each other in biotic composition. In this paper, I explore clustering (through cluster analysis and non-metric multidimensional scaling) within the Mediterranean archaeofaunal group. I test for the influence of sample size as well as the geographic variables of site elevation, latitude, and longitude on variability in the large mammal portions of archaeofaunal assemblages. ANOVA shows no relationship between cluster-defined groups and site elevation or longitude; instead, site latitude appears to be a primary contributor to patterning. However, the overall compositional similarity of the Mediterranean archaeofaunas in this dataset suggests more consistency than variability in Upper Paleolithic hunting strategy in this region.
Yasuda, Akihito; Onuki, Yoshinori; Obata, Yasuko; Takayama, Kozo
2015-01-01
The "quality by design" concept in pharmaceutical formulation development requires the establishment of a science-based rationale and design space. In this article, we integrate thin-plate spline (TPS) interpolation, Kohonen's self-organizing map (SOM) and a Bayesian network (BN) to visualize the latent structure underlying causal factors and pharmaceutical responses. As a model pharmaceutical product, theophylline tablets were prepared using a standard formulation. We measured the tensile strength and disintegration time as response variables and the compressibility, cohesion and dispersibility of the pretableting blend as latent variables. We predicted these variables quantitatively using nonlinear TPS, generated a large amount of data on pretableting blends and tablets and clustered these data into several clusters using a SOM. Our results show that we are able to predict the experimental values of the latent and response variables with a high degree of accuracy and are able to classify the tablet data into several distinct clusters. In addition, to visualize the latent structure between the causal and latent factors and the response variables, we applied a BN method to the SOM clustering results. We found that despite having inserted latent variables between the causal factors and response variables, their relation is equivalent to the results for the SOM clustering, and thus we are able to explain the underlying latent structure. Consequently, this technique provides a better understanding of the relationships between causal factors and pharmaceutical responses in theophylline tablet formulation.
Employment relations and global health: a typological study of world labor markets.
Chung, Haejoo; Muntaner, Carles; Benach, Joan
2010-01-01
In this study, the authors investigate the global labor market and employment relations, which are central building blocks of the welfare state; the aim is to propose a global typology of labor markets to explain global inequalities in population health. Countries are categorized into core (21), semi-peripheral (42), and peripheral (71) countries, based on gross national product per capita (Atlas method). Labor market-related variables and factors are then used to generate clusters of countries with principal components and cluster analysis methods. The authors then examine the relationship between the resulting clusters and health outcomes. The clusters of countries are largely geographically defined, each cluster with similar historical background and developmental strategy. However, there are interesting exceptions, which warrant further elaboration. The relationship between health outcomes and clusters largely follows the authors' expectations (except for communicable diseases): more egalitarian labor institutions have better health outcomes. The world system, then, can be divided according to different types of labor markets that are predictive of population health outcomes at each level of economic development. As is the case for health and social policies, variability in labor market characteristics is likely to reflect, in part, the relative strength of a country's political actors.
Massive Binaries in the R 136 Cluster
NASA Astrophysics Data System (ADS)
Morrell, N. I.; Massey, P.; Degioia-Eastwood, K.; Penny, L. R.; Gies, D. R.; Tsitkin, Y.; Darnell, E.
2008-08-01
As part of a large project aimed at the discovery and follow-up of massive eclipsing systems in young clusters and stellar associations, we have obtained V-band CCD imaging of the R136 cluster in 30 Doradus, and high-resolution spectroscopy of several of the variable stars we found there. Here we summarize our preliminary analysis of light and radial velocity variations for 4 massive multiple systems in the R136 cluster.
Photometry Using Kepler "Superstamps" of Open Clusters NGC 6791 & NGC 6819
NASA Astrophysics Data System (ADS)
Kuehn, Charles A.; Drury, Jason A.; Bellamy, Beau R.; Stello, Dennis; Bedding, Timothy R.; Reed, Mike; Quick, Breanna
2015-09-01
The Kepler space telescope has proven to be a gold mine for the study of variable stars. Usually, Kepler only reads out a handful of pixels around each pre-selected target star, omitting a large number of stars in the Kepler field. Fortunately, for the open clusters NGC 6791 and NGC 6819, Kepler also read out larger "superstamps" which contained complete images of the central region of each cluster. These cluster images can be used to study additional stars in the open clusters that were not originally on Kepler's target list. We discuss our work on using two photometric techniques to analyze these superstamps and present sample results from this project to demonstrate the value of this technique for a wide variety of variable stars.
Procedures to handle inventory cluster plots that straddle two or more conditions
Jerold T. Hahn; Colin D. MacLean; Stanford L. Arner; William A. Bechtold
1995-01-01
We review the relative merits and field procedures for four basic plot designs to handle forest inventory plots that straddle two or more conditions, given that subplots will not be moved. A cluster design is recommended that combines fixed-area subplots and variable-radius plot (VRP) sampling. Each subplot in a cluster consists of a large fixed-area subplot for...
YOUNG STELLAR CLUSTERS CONTAINING MASSIVE YOUNG STELLAR OBJECTS IN THE VVV SURVEY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Borissova, J.; Alegría, S. Ramírez; Kurtev, R.
The purpose of this research is to study the connections of the global properties of eight young stellar clusters projected in the Vista Variables in the Via Lactea (VVV) ESO Large Public Survey disk area and their young stellar object (YSO) populations. The analysis is based on the combination of spectroscopic parallax-based reddening and distance determinations with main-sequence and pre-main-sequence isochrone fitting to determine the basic parameters (reddening, age, distance) of the sample clusters. The lower mass limit estimations show that all clusters are low or intermediate mass (between 110 and 1800 M⊙); the slope Γ of the obtained present-day mass functions of the clusters is close to the Kroupa initial mass function. The YSOs in the clusters' surrounding fields are classified using low resolution spectra, spectral energy distribution fits with theoretical predictions, and variability, taking advantage of multi-epoch VVV observations. All spectroscopically confirmed YSOs (except one) are found to be massive (more than 8 M⊙). Using VVV and GLIMPSE color–color cuts we have selected a large number of new YSO candidates, which are checked for variability, and 57% are found to show at least low-amplitude variations. In a few cases it was possible to distinguish between YSO and AGB classifications on the basis of light curves.
Xu, Xin; Huang, Zhenhua; Graves, Daniel; Pedrycz, Witold
2014-12-01
In order to deal with sequential decision problems with large or continuous state spaces, feature representation and function approximation have been a major research topic in reinforcement learning (RL). In this paper, a clustering-based graph Laplacian framework is presented for feature representation and value function approximation (VFA) in RL. By making use of clustering-based techniques, that is, K-means clustering or fuzzy C-means clustering, a graph Laplacian is constructed by subsampling in Markov decision processes (MDPs) with continuous state spaces. The basis functions for VFA can be automatically generated from spectral analysis of the graph Laplacian. The clustering-based graph Laplacian is integrated with a class of approximate policy iteration algorithms called representation policy iteration (RPI) for RL in MDPs with continuous state spaces. Simulation and experimental results show that, compared with previous RPI methods, the proposed approach needs fewer sample points to compute an efficient set of basis functions and the learning control performance can be improved for a variety of parameter settings.
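As a rough illustration of the graph construction step described in this abstract, the sketch below builds a combinatorial graph Laplacian L = D - W over a handful of cluster centers. The centers are assumed to have been produced already by K-means (not shown); the fully connected graph, the Gaussian edge weights, the `sigma` parameter, and the function names are assumptions made for illustration and are not taken from the paper. The basis functions would then be eigenvectors of this matrix, which would normally be computed with a numerical library.

```python
from math import exp
from typing import List, Sequence


def gaussian_weight(a: Sequence[float], b: Sequence[float], sigma: float) -> float:
    """Edge weight between two centers (assumed Gaussian kernel)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return exp(-d2 / (2.0 * sigma ** 2))


def graph_laplacian(centers: Sequence[Sequence[float]], sigma: float = 1.0) -> List[List[float]]:
    """Return the combinatorial Laplacian L = D - W of a fully connected graph."""
    m = len(centers)
    # Symmetric weight matrix W with a zero diagonal (no self-loops).
    w = [
        [0.0 if i == j else gaussian_weight(centers[i], centers[j], sigma) for j in range(m)]
        for i in range(m)
    ]
    # Off-diagonal entries are -W; the diagonal holds the node degrees.
    lap = [[-w[i][j] for j in range(m)] for i in range(m)]
    for i in range(m):
        lap[i][i] = sum(w[i])
    return lap


# Hypothetical cluster centers in a 2-D state space.
centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
laplacian = graph_laplacian(centers, sigma=1.0)
```

Each row of the resulting matrix sums to zero and the matrix is symmetric, which are the two properties a combinatorial Laplacian of an undirected graph should have.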
Method and system for data clustering for very large databases
NASA Technical Reports Server (NTRS)
Livny, Miron (Inventor); Zhang, Tian (Inventor); Ramakrishnan, Raghu (Inventor)
1998-01-01
Multi-dimensional data contained in very large databases is efficiently and accurately clustered to determine patterns therein and extract useful information from such patterns. Conventional computer processors may be used which have limited memory capacity and conventional operating speed, allowing massive data sets to be processed in a reasonable time and with reasonable computer resources. The clustering process is organized using a clustering feature tree structure wherein each clustering feature comprises the number of data points in the cluster, the linear sum of the data points in the cluster, and the square sum of the data points in the cluster. A dense region of data points is treated collectively as a single cluster, and points in sparsely occupied regions can be treated as outliers and removed from the clustering feature tree. The clustering can be carried out continuously with new data points being received and processed, and with the clustering feature tree being restructured as necessary to accommodate the information from the newly received data points.
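The clustering feature described in this record (point count, linear sum, square sum) can be sketched in a few lines. The version below is a minimal illustration, not the patented implementation: the class and method names are hypothetical, the square sum is assumed to be the scalar sum of squared norms, and the tree structure and outlier handling are omitted.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """Summary of a cluster: point count, linear sum, and (scalar) square sum."""
    n: int
    linear_sum: List[float]
    square_sum: float

    @classmethod
    def empty(cls, dim: int) -> "ClusteringFeature":
        return cls(0, [0.0] * dim, 0.0)

    def add_point(self, x: Sequence[float]) -> None:
        """Absorb one new data point without storing the point itself."""
        self.n += 1
        self.linear_sum = [s + v for s, v in zip(self.linear_sum, x)]
        self.square_sum += sum(v * v for v in x)

    def merge(self, other: "ClusteringFeature") -> "ClusteringFeature":
        """Features are additive, so two clusters merge by summing components."""
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        """Root-mean-square distance of the points from the centroid."""
        c = self.centroid()
        variance = self.square_sum / self.n - sum(v * v for v in c)
        return sqrt(max(variance, 0.0))


cf_a = ClusteringFeature.empty(2)
cf_a.add_point((0.0, 0.0))
cf_a.add_point((2.0, 0.0))
cf_b = ClusteringFeature.empty(2)
cf_b.add_point((1.0, 3.0))
cf_ab = cf_a.merge(cf_b)
```

Because the three components are additive, a dense region can be summarized and updated incrementally as new points arrive, which is what allows the data to be processed with limited memory.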
Image-Subtraction Photometry of Variable Stars in the Field of the Globular Cluster NGC 6934
NASA Astrophysics Data System (ADS)
Kaluzny, J.; Olech, A.; Stanek, K. Z.
2001-03-01
We present CCD BVI photometry of 85 variable stars from the field of the globular cluster NGC 6934. The photometry was obtained with the image subtraction package ISIS. 35 variables are new identifications: 24 RRab stars, five RRc stars, two eclipsing binaries of W UMa-type, one SX Phe star, and three variables of other types. Both detected contact binaries are foreground stars. The SX Phe variable belongs most likely to the group of cluster blue stragglers. The large number of newly found RR Lyr variables in this cluster, as well as in other clusters recently observed by us, indicates that the total RR Lyr population identified to date in nearby galactic globular clusters is significantly (>30%) incomplete. Fourier decomposition of the light curves of RR Lyr variables was used to estimate the basic properties of these stars. From the analysis of RRc variables we obtain a mean mass of M = 0.63 M⊙, luminosity log L/L⊙ = 1.72, effective temperature Teff = 7300 and helium abundance Y = 0.27. The mean values of the absolute magnitude, metallicity (on Zinn's scale) and effective temperature for RRab variables are M_V = 0.81, [Fe/H] = -1.53 and Teff = 6450, respectively. From the B-V color at minimum light of the RRab variables we obtained the color excess to NGC 6934 equal to E(B-V) = 0.09 ± 0.01. Different calibrations of absolute magnitudes of RRab and RRc available in the literature were used to estimate the apparent distance modulus of the cluster: (m-M)_V = 16.09 ± 0.06. We note a likely error in the zero point of the HST-based V-band photometry of NGC 6934 recently presented by Piotto et al. Among the analyzed sample of RR Lyr stars we have detected a short period and low amplitude variable which possibly belongs to the group of second overtone pulsators (RRe subtype variables). The BVI photometry of all variables is available electronically via anonymous ftp. The complete set of the CCD frames is available upon request. Based on observations obtained with the 1.2 m Telescope at the F. L. Whipple Observatory of the Harvard-Smithsonian Center for Astrophysics.
Generic Feature Selection with Short Fat Data
Clarke, B.; Chu, J.-H.
2014-01-01
SUMMARY Consider a regression problem in which there are many more explanatory variables than data points, i.e., p ≫ n. Essentially, without reducing the number of variables inference is impossible. So, we group the p explanatory variables into blocks by clustering, evaluate statistics on the blocks and then regress the response on these statistics under a penalized error criterion to obtain estimates of the regression coefficients. We examine the performance of this approach for a variety of choices of n, p, classes of statistics, clustering algorithms, penalty terms, and data types. When n is not large, the discrimination over number of statistics is weak, but computations suggest regressing on approximately [n/K] statistics where K is the number of blocks formed by a clustering algorithm. Small deviations from this are observed when the blocks of variables are of very different sizes. Larger deviations are observed when the penalty term is an Lq norm with high enough q. PMID:25346546
Clustering Multivariate Time Series Using Hidden Markov Models
Ghassempour, Shima; Girosi, Federico; Maeder, Anthony
2014-01-01
In this paper we describe an algorithm for clustering multivariate time series with variables taking both categorical and continuous values. Time series of this type are frequent in health care, where they represent the health trajectories of individuals. The problem is challenging because categorical variables make it difficult to define a meaningful distance between trajectories. We propose an approach based on Hidden Markov Models (HMMs), where we first map each trajectory into an HMM, then define a suitable distance between HMMs and finally proceed to cluster the HMMs with a method based on a distance matrix. We test our approach on a simulated, but realistic, data set of 1,255 trajectories of individuals of age 45 and over, on a synthetic validation set with known clustering structure, and on a smaller set of 268 trajectories extracted from the longitudinal Health and Retirement Survey. The proposed method can be implemented quite simply using standard packages in R and Matlab and may be a good candidate for solving the difficult problem of clustering multivariate time series with categorical variables using tools that do not require advanced statistical knowledge, and therefore are accessible to a wide range of researchers. PMID:24662996
Demonstration of Monogamy Relations for Einstein-Podolsky-Rosen Steering in Gaussian Cluster States.
Deng, Xiaowei; Xiang, Yu; Tian, Caixing; Adesso, Gerardo; He, Qiongyi; Gong, Qihuang; Su, Xiaolong; Xie, Changde; Peng, Kunchi
2017-06-09
Understanding how quantum resources can be quantified and distributed over many parties has profound applications in quantum communication. As one of the most intriguing features of quantum mechanics, Einstein-Podolsky-Rosen (EPR) steering is a useful resource for secure quantum networks. By reconstructing the covariance matrix of a continuous variable four-mode square Gaussian cluster state subject to asymmetric loss, we quantify the amount of bipartite steering with a variable number of modes per party, and verify recently introduced monogamy relations for Gaussian steerability, which establish quantitative constraints on the security of information shared among different parties. We observe a very rich structure for the steering distribution, and demonstrate one-way EPR steering of the cluster state under Gaussian measurements, as well as one-to-multimode steering. Our experiment paves the way for exploiting EPR steering in Gaussian cluster states as a valuable resource for multiparty quantum information tasks.
Small-Scale Drop-Size Variability: Empirical Models for Drop-Size-Dependent Clustering in Clouds
NASA Technical Reports Server (NTRS)
Marshak, Alexander; Knyazikhin, Yuri; Larsen, Michael L.; Wiscombe, Warren J.
2005-01-01
By analyzing aircraft measurements of individual drop sizes in clouds, it has been shown in a companion paper that the probability of finding a drop of radius r at a linear scale l decreases as l^{D(r)}, where 0 ≤ D(r) ≤ 1. This paper shows striking examples of the spatial distribution of large cloud drops using models that simulate the observed power laws. In contrast to currently used models that assume homogeneity and a Poisson distribution of cloud drops, these models illustrate strong drop clustering, especially with larger drops. The degree of clustering is determined by the observed exponents D(r). The strong clustering of large drops arises naturally from the observed power-law statistics. This clustering has vital consequences for rain physics, including how fast rain can form. For radiative transfer theory, clustering of large drops enhances their impact on the cloud optical path. The clustering phenomenon also helps explain why remotely sensed cloud drop size is generally larger than that measured in situ.
Fogel, Paul; Gaston-Mathé, Yann; Hawkins, Douglas; Fogel, Fajwel; Luta, George; Young, S. Stanley
2016-01-01
Often data can be represented as a matrix, e.g., observations as rows and variables as columns, or as a doubly classified contingency table. Researchers may be interested in clustering the observations, the variables, or both. If the data is non-negative, then Non-negative Matrix Factorization (NMF) can be used to perform the clustering. By its nature, NMF-based clustering is focused on the large values. If the data is normalized by subtracting the row/column means, it becomes of mixed signs and the original NMF cannot be used. Our idea is to split and then concatenate the positive and negative parts of the matrix, after taking the absolute value of the negative elements. NMF applied to the concatenated data, which we call PosNegNMF, offers the advantages of the original NMF approach, while giving equal weight to large and small values. We use two public health datasets to illustrate the new method and compare it with alternative clustering methods, such as K-means and clustering methods based on the Singular Value Decomposition (SVD) or Principal Component Analysis (PCA). With the exception of situations where a reasonably accurate factorization can be achieved using the first SVD component, we recommend that epidemiologists and environmental scientists use the new method to obtain clusters with improved quality and interpretability. PMID:27213413
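The split-and-concatenate preprocessing described in this abstract can be illustrated with a small sketch. It covers only the step that turns a mixed-sign matrix into a non-negative one; the factorization itself would be run with any standard NMF routine and is not shown. Centering by column means and concatenating the two parts side by side are assumptions made here, as are the function names; the abstract does not fix either choice.

```python
from typing import List

Matrix = List[List[float]]


def center_columns(x: Matrix) -> Matrix:
    """Subtract each column's mean, producing a matrix of mixed signs."""
    n_rows = len(x)
    means = [sum(col) / n_rows for col in zip(*x)]
    return [[v - m for v, m in zip(row, means)] for row in x]


def split_pos_neg(x: Matrix) -> Matrix:
    """Return [X+ | X-]: positive parts, then absolute values of negative parts."""
    return [
        [max(v, 0.0) for v in row] + [max(-v, 0.0) for v in row]
        for row in x
    ]


raw = [[1.0, 4.0], [3.0, 0.0]]
centered = center_columns(raw)          # [[-1.0, 2.0], [1.0, -2.0]]
non_negative = split_pos_neg(centered)  # twice as many columns, all entries >= 0
```

The original centered matrix can be recovered as the first half of the columns minus the second half, so no information is lost by the split.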
Multi-Wheat-Model Ensemble Responses to Interannual Climate Variability
NASA Technical Reports Server (NTRS)
Ruane, Alex C.; Hudson, Nicholas I.; Asseng, Senthold; Camarrano, Davide; Ewert, Frank; Martre, Pierre; Boote, Kenneth J.; Thorburn, Peter J.; Aggarwal, Pramod K.; Angulo, Carlos
2016-01-01
We compare 27 wheat models' yield responses to interannual climate variability, analyzed at locations in Argentina, Australia, India, and The Netherlands as part of the Agricultural Model Intercomparison and Improvement Project (AgMIP) Wheat Pilot. Each model simulated 1981–2010 grain yield, and we evaluate results against the interannual variability of growing season temperature, precipitation, and solar radiation. The amount of information used for calibration has only a minor effect on most models' climate response, and even small multi-model ensembles prove beneficial. Wheat model clusters reveal common characteristics of yield response to climate; however, models rarely share the same cluster at all four sites, indicating substantial independence. Only a weak relationship (R2 0.24) was found between the models' sensitivities to interannual temperature variability and their response to long-term warming, suggesting that additional processes differentiate climate change impacts from observed climate variability analogs and motivating continuing analysis and model development efforts.
Stellar Variability at the Main-sequence Turnoff of the Intermediate-age LMC Cluster NGC 1846
NASA Astrophysics Data System (ADS)
Salinas, R.; Pajkos, M. A.; Vivas, A. K.; Strader, J.; Contreras Ramos, R.
2018-04-01
Intermediate-age (IA) star clusters in the Large Magellanic Cloud (LMC) present extended main-sequence turn-offs (MSTO) that have been attributed to either multiple stellar populations or an effect of stellar rotation. Recently it has been proposed that these extended main sequences can also be produced by ill-characterized stellar variability. Here we present Gemini-S/Gemini Multi-Object Spectrometer (GMOS) time series observations of the IA cluster NGC 1846. Using differential image analysis, we identified 73 new variable stars, with 55 of those being of the Delta Scuti type, that is, pulsating variables close to the MSTO for the cluster age. Considering completeness and background contamination effects, we estimate the number of δ Sct belonging to the cluster to be between 40 and 60 members, although this number is based on the detection of a single δ Sct within the cluster half-light radius. This number of variable stars at the MSTO level will not produce significant broadening of the MSTO, although higher-resolution imaging will be needed to rule out variable stars as a major contributor to the extended MSTO phenomenon. Though modest, this number of δ Sct makes NGC 1846 the star cluster with the highest number of these variables ever discovered. Lastly, our results present a cautionary tale about the adequacy of shallow variability surveys in the LMC (like OGLE) to derive properties of its δ Sct population. Based on observations obtained at the Gemini Observatory, which is operated by the Association of Universities for Research in Astronomy, Inc., under a cooperative agreement with the NSF on behalf of the Gemini partnership: the National Science Foundation (United States), the National Research Council (Canada), CONICYT (Chile), Ministerio de Ciencia, Tecnología e Innovación Productiva (Argentina), and Ministério da Ciência, Tecnologia e Inovação (Brazil).
NASA Astrophysics Data System (ADS)
Su, Yung-Chao; Wu, Shin-Tza
2017-09-01
We study theoretically the teleportation of a controlled-phase (cz) gate through measurement-based quantum-information processing for continuous-variable systems. We examine the degree of entanglement in the output modes of the teleported cz gate for two classes of resource states: the canonical cluster states that are constructed via direct implementations of two-mode squeezing operations and the linear-optical version of cluster states which are built from linear-optical networks of beam splitters and phase shifters. In order to reduce the excess noise arising from finite-squeezed resource states, teleportation through resource states with different multirail designs will be considered and the enhancement of entanglement in the teleported cz gates will be analyzed. For multirail clusters with an arbitrary number of rails, we obtain analytical expressions for the entanglement in the output modes and analyze in detail the results for both classes of resource states. At the same time, we also show that for uniformly squeezed clusters the multirail noise reduction can be optimized when the excess noise is allocated uniformly to the rails. To facilitate the analysis, we develop a trick with manipulations of quadrature operators that can reveal rather efficiently the measurement sequence and corrective operations needed for the measurement-based gate teleportation, which will also be explained in detail.
NASA Astrophysics Data System (ADS)
Detzer, J.; Loikith, P. C.; Mechoso, C. R.; Barkhordarian, A.; Lee, H.
2017-12-01
South America's climate varies considerably owing to its large geographic range and diverse topographical features. Spanning the tropics to the mid-latitudes and from high peaks to tropical rainforest, the continent experiences an array of climate and weather patterns. Due to this considerable spatial extent, assessing temperature variability at the continent scale is particularly challenging. It is well documented in the literature that temperatures have been increasing across portions of South America in recent decades, and while there have been many studies that have focused on precipitation variability and change, temperature has received less scientific attention. Therefore, a more thorough understanding of the drivers of temperature variability is critical for interpreting future change. First, k-means cluster analysis is used to identify four primary modes of temperature variability across the continent, stratified by season. Next, composites of large scale meteorological patterns (LSMPs) are calculated for months assigned to each cluster. Initial results suggest that LSMPs, defined using meteorological variables such as sea level pressure (SLP), geopotential height, and wind, are able to identify synoptic scale mechanisms important for driving temperature variability at the monthly scale. Some LSMPs indicate a relationship with known recurrent modes of climate variability. For example, composites of geopotential height suggest that the Southern Annular Mode is an important, but not necessarily dominant, component of temperature variability over southern South America. This work will be extended to assess the drivers of temperature extremes across South America.
Descriptor Fingerprints and Their Application to White Wine Clustering and Discrimination.
NASA Astrophysics Data System (ADS)
Bangov, I. P.; Moskovkina, M.; Stojanov, B. P.
2018-03-01
This study continues the attempt to apply statistical methods to large-scale analytical data. A group of 3898 white wines, each with 11 analytical laboratory benchmarks, was analyzed by a fingerprint similarity search in order to group them into separate clusters. The wine quality in each individual cluster was characterized according to individual laboratory parameters.
Sander, Ulrich; Lubbe, Nils
2018-04-01
Intersection accidents are frequent and harmful. The accident types 'straight crossing path' (SCP), 'left turn across path - oncoming direction' (LTAP/OD), and 'left-turn across path - lateral direction' (LTAP/LD) represent around 95% of all intersection accidents and one-third of all police-reported car-to-car accidents in Germany. The European New Car Assessment Program (Euro NCAP) has announced that intersection scenarios will be included in its rating from 2020; however, how these scenarios are to be tested has not been defined. This study investigates whether clustering methods can be used to identify a small number of test scenarios sufficiently representative of the accident dataset to evaluate Intersection Automated Emergency Braking (AEB). Data from the German In-Depth Accident Study (GIDAS) and the GIDAS-based Pre-Crash Matrix (PCM) from 1999 to 2016, containing 784 SCP and 453 LTAP/OD accidents, were analyzed with principal component methods to identify variables that account for the relevant total variances of the sample. Three different methods for data clustering were applied to each of the accident types: two similarity-based approaches, namely Hierarchical Clustering (HC) and Partitioning Around Medoids (PAM), and the probability-based Latent Class Clustering (LCC). The optimum number of clusters was derived for HC and PAM with the silhouette method. The PAM algorithm was initiated both with random start medoid selection and with medoids from HC. For LCC, the Bayesian Information Criterion (BIC) was used to determine the optimal number of clusters. Test scenarios were defined from optimal cluster medoids weighted by their real-life representation in GIDAS. The set of variables for clustering was further varied to investigate the influence of variable type and character. We quantified how accurately each cluster variation represents real-life AEB performance using pre-crash simulations with PCM data and a generic algorithm for AEB intervention.
The usage of different sets of clustering variables resulted in substantially different numbers of clusters. The stability of the resulting clusters increased with prioritization of categorical over continuous variables. For each different set of cluster variables, a strong in-cluster variance of avoided versus non-avoided accidents for the specified Intersection AEB was present. The medoids did not predict the most common Intersection AEB behavior in each cluster. Despite thorough analysis using various cluster methods and variable sets, it was impossible to reduce the diversity of intersection accidents into a set of test scenarios without compromising the ability to predict real-life performance of Intersection AEB. Although this does not imply that other methods cannot succeed, it was observed that small changes in the definition of a scenario resulted in a different avoidance outcome. Therefore, we suggest using limited physical testing to validate more extensive virtual simulations to evaluate vehicle safety. Copyright © 2018 Elsevier Ltd. All rights reserved.
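The silhouette method named in this abstract scores a candidate clustering by comparing, for each sample, the mean distance to its own cluster with the mean distance to the nearest other cluster. The following is a minimal, generic sketch of that score and is not the study's implementation; the function name `mean_silhouette`, the Euclidean distance, and the toy points are assumptions made for illustration.

```python
def mean_silhouette(points, labels):
    """Average silhouette width for points (tuples) with given cluster labels.

    Assumes at least two distinct labels; Euclidean distance is an
    illustrative choice, not taken from the abstract.
    """
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    # Group sample indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # singleton cluster: silhouette is taken as 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two well-separated pairs of points.
pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = mean_silhouette(pts, [0, 0, 1, 1])   # natural grouping
bad = mean_silhouette(pts, [0, 1, 0, 1])    # deliberately mixed grouping
```

To choose the number of clusters, one would run the clustering for several candidate values and keep the one with the highest mean silhouette.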
Romero-Ortuno, Roman; Cogan, Lisa; Foran, Tim; Kenny, Rose Anne; Fan, Chie Wei
2011-04-01
To identify morphological orthostatic blood pressure (BP) phenotypes in older people and assess their correlation with orthostatic intolerance (OI), falls, and frailty and to compare the discriminatory performance of a morphological classification with two established orthostatic hypotension (OH) definitions: consensus (COH) and initial (IOH). Cross-sectional. Geriatric research clinic. Four hundred forty-two participants (mean age 72, 72% female) without dementia or risk factors for autonomic neuropathy. Active lying-to-standing test monitored using a continuous noninvasive BP monitor. For the morphological classification, four orthostatic systolic BP variables were extracted (delta (baseline - nadir) and maximum percentage of baseline recovered by 30 seconds and 1 and 2 minutes) using the 5-second averages method and entered in K-means cluster analysis (three clusters). Main outcomes were OI, falls (≥1 in past 6 months), and frailty (modified Fried criteria). The morphological clusters were small drop, fast overrecovery (n=112); medium drop, slow recovery (n=238); and large drop, nonrecovery (n=92). Their characterization revealed an increasing OI gradient (17.9%, 27.5%, and 44.6% respectively, P<.001) but no significant gradients in falls or frailty. The COH definition failed to reveal clinical differences between COH+ (n=416) and COH- (n=26) participants. The IOH definition resulted in a clinically meaningful separation between IOH+ (n=85) and IOH- (n=357) subgroups, as assessed according to OI (100% vs 11.5%, P<.001), falls (24.7% vs 10.4%, P<.001), and frailty (14.1% vs 5.4%, P=.005). It is recommended that the IOH definition be applied when taking continuous noninvasive orthostatic BP measurements in older people. © 2011, Copyright the Authors. Journal compilation © 2011, The American Geriatrics Society.
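The four systolic BP variables described in this abstract (the drop from baseline to nadir, and the maximum percentage of baseline recovered by 30 seconds, 1 minute, and 2 minutes) can be sketched as follows. This is an illustrative reading, not the authors' code: the function name, the assumption that the first 5-second average is recorded 5 s after standing, and the choice to measure recovery from the nadir onward are all assumptions.

```python
def orthostatic_variables(baseline, standing_sbp, step_s=5):
    """Return (delta, pct30, pct60, pct120) from 5-second SBP averages.

    baseline     -- supine systolic BP in mmHg (assumed to be given)
    standing_sbp -- list of 5-second averages after standing; the first
                    value is assumed to be recorded at t = step_s
    """
    nadir = min(standing_sbp)
    nadir_idx = standing_sbp.index(nadir)
    delta = baseline - nadir  # size of the orthostatic drop

    def max_pct_recovered(by_seconds):
        # Number of samples recorded up to and including `by_seconds`.
        n = by_seconds // step_s
        # Assumption: recovery is measured from the nadir onward; if the
        # nadir falls after the cutoff, only the nadir itself is used.
        window = standing_sbp[nadir_idx:max(n, nadir_idx + 1)]
        return 100.0 * max(window) / baseline

    return (
        delta,
        max_pct_recovered(30),
        max_pct_recovered(60),
        max_pct_recovered(120),
    )


# Hypothetical trace: 24 five-second averages covering 2 minutes.
trace = [130, 100, 110, 120, 125, 130, 133, 135, 136, 137, 138, 140]
trace += [141] * 11 + [147]
delta, pct30, pct60, pct120 = orthostatic_variables(140, trace)
```

The four values returned for each participant would then form one row of the input to the cluster analysis.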
NASA Astrophysics Data System (ADS)
Kirchner-Bossi, Nicolas; Befort, Daniel J.; Wild, Simon B.; Ulbrich, Uwe; Leckebusch, Gregor C.
2016-04-01
Time-clustered winter storms are responsible for a majority of the wind-induced losses in Europe. In recent years, different atmospheric and oceanic large-scale mechanisms, such as the North Atlantic Oscillation (NAO) or the Meridional Overturning Circulation (MOC), have been shown to drive a significant portion of the windstorm variability over Europe. In this work we systematically investigate the influence of different large-scale natural variability modes: more than 20 indices related to those mechanisms with proven or potential influence on the windstorm frequency variability over Europe - mostly SST- or pressure-based - are derived by means of ECMWF ERA-20C reanalysis during the last century (1902-2009), and compared to the windstorm variability for the European winter (DJF). Windstorms are defined and tracked as in Leckebusch et al. (2008). The derived indices are then employed to develop a statistical procedure including a stepwise Multiple Linear Regression (MLR) and an Artificial Neural Network (ANN), aiming to hindcast the inter-annual (DJF) regional windstorm frequency variability in a case study for the British Isles. This case study reveals 13 indices with a statistically significant coupling with seasonal windstorm counts. The Scandinavian Pattern (SCA) showed the strongest correlation (0.61), followed by the NAO (0.48) and the Polar/Eurasia Pattern (0.46). The obtained indices (standard-normalised) are selected as predictors for a windstorm variability hindcast model applied for the British Isles. First, a stepwise linear regression is performed to identify which mechanisms can explain windstorm variability best. Finally, the indices retained by the stepwise regression are used to develop a multilayer perceptron-based ANN that hindcasts seasonal windstorm frequency and clustering. Eight indices (SCA, NAO, EA, PDO, W.NAtl.SST, AMO (unsmoothed), EA/WR and Trop.N.Atl SST) are retained by the stepwise regression.
Among them, SCA showed the highest linear coefficient, followed by SST in western Atlantic, AMO and NAO. The explanatory regression model (considering all time steps) provided a Coefficient of Determination (R^2) of 0.75. A predictive version of the linear model applying a leave-one-out cross-validation (LOOCV) shows an R^2 of 0.56 and a relative RMSE of 4.67 counts/season. An ANN-based nonlinear hindcast model for the seasonal windstorm frequency is developed with the aim of improving the stepwise hindcast ability and thus better predicting a time-clustered season over the case study. A perceptron with a 7-node hidden layer is set up, and the LOOCV procedure reveals an R^2 of 0.71. In comparison to the stepwise MLR, the RMSE is reduced by 20%. This work shows that for the British Isles case study, most of the interannual variability can be explained by certain large-scale mechanisms, considering also nonlinear effects (ANN). This makes it possible to discern a time-clustered season from a non-clustered one - a key issue for applications e.g., in the (re)insurance industry.
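The gap between the explanatory R^2 (0.75) and the leave-one-out R^2 (0.56) reported here reflects the fact that LOOCV predicts each season from a model fitted without that season. A minimal sketch of the procedure follows, simplified to a single predictor rather than the eight retained indices. The function names and the definition of predictive R^2 as one minus the ratio of residual to total sum of squares are assumptions, not details taken from the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_r2(xs, ys):
    """Predictive R^2: each point is predicted by a fit that excludes it."""
    preds = []
    for i in range(len(xs)):
        # Refit on all samples except the i-th one.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(intercept + slope * xs[i])
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: an exactly linear relation and a noisy one.
exact = loocv_r2([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
noisy = loocv_r2([1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5])
```

For exactly linear data the held-out predictions are perfect; with noise the predictive R^2 drops below 1 and is typically lower than the in-sample value.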
Wang, Xiuquan; Huang, Guohe; Zhao, Shan; Guo, Junhong
2015-09-01
This paper presents an open-source software package, rSCA, which is developed based upon a stepwise cluster analysis method and serves as a statistical tool for modeling the relationships between multiple dependent and independent variables. The rSCA package is efficient in dealing with both continuous and discrete variables, as well as nonlinear relationships between the variables. It divides the sample sets of dependent variables into different subsets (or subclusters) through a series of cutting and merging operations based upon the theory of multivariate analysis of variance (MANOVA). The modeling results are given by a cluster tree, which includes both intermediate and leaf subclusters as well as the flow paths from the root of the tree to each leaf subcluster specified by a series of cutting and merging actions. The rSCA package is a handy and easy-to-use tool and is freely available at http://cran.r-project.org/package=rSCA . By applying the developed package to air quality management in an urban environment, we demonstrate its effectiveness in dealing with the complicated relationships among multiple variables in real-world problems.
Silva, Mauricio Rocha e
2011-01-01
OBJECTIVE: Impact Factors (IF) are widely used surrogates to evaluate single articles, in spite of known shortcomings imposed by cite distribution skewness. We quantify this asymmetry and propose a simple computer-based procedure for evaluating individual articles. METHOD: (a) Analysis of symmetry. Journals clustered around nine Impact Factor points were selected from the medical “Subject Categories” in Journal Citation Reports 2010. Citable items published in 2008 were retrieved and ranked by granted citations over the Jan/2008 - Jun/2011 period. Frequency distribution of cites, normalized cumulative cites and absolute cites/decile were determined for each journal cluster. (b) Positive Predictive Value. Three arbitrarily established evaluation classes were generated: LOW (1.3≤IF<2.6); MID: (2.6≤IF<3.9); HIGH: (IF≥3.9). Positive Predictive Value for journal clusters within each class range was estimated. (c) Continuously Variable Rating. An alternative evaluation procedure is proposed to allow the rating of individually published articles in comparison to all articles published in the same journal within the same year of publication. The general guiding lines for the construction of a totally dedicated software program are delineated. RESULTS AND CONCLUSIONS: Skewness followed the Pareto Distribution for (1
Variable Stars in Large Magellanic Cloud Globular Clusters. III. Reticulum
NASA Astrophysics Data System (ADS)
Kuehn, Charles A.; Dame, Kyra; Smith, Horace A.; Catelan, Márcio; Jeon, Young-Beom; Nemec, James M.; Walker, Alistair R.; Kunder, Andrea; Pritzl, Barton J.; De Lee, Nathan; Borissova, Jura
2013-06-01
This is the third in a series of papers studying the variable stars in old globular clusters in the Large Magellanic Cloud. The primary goal of this series is to look at how the characteristics and behavior of RR Lyrae stars in Oosterhoff-intermediate systems compare to those of their counterparts in Oosterhoff-I/II systems. In this paper we present the results of our new time-series BVI photometric study of the globular cluster Reticulum. We found a total of 32 variable stars (22 RRab, 4 RRc, and 6 RRd stars) in our field of view. We present photometric parameters and light curves for these stars. We also present physical properties, derived from Fourier analysis of light curves, for some of the RR Lyrae stars. We discuss the Oosterhoff classification of Reticulum and use our results to re-derive the distance modulus and age of the cluster. Based on observations taken with the SMARTS 1.3 m telescope operated by the SMARTS Consortium and observations taken at the Southern Astrophysical Research (SOAR) telescope, which is a joint project of the Ministério da Ciência, Tecnologia, e Inovação (MCTI) da República Federativa do Brasil, the U.S. National Optical Astronomy Observatory (NOAO), the University of North Carolina at Chapel Hill (UNC), and Michigan State University (MSU).
NASA Astrophysics Data System (ADS)
Moździerski, D.; Pigulski, A.; Kopacki, G.; Kołaczkowski, Z.; Stęślicki, M.
2014-06-01
We present results of a BVIC variability survey in the young open cluster NGC 457 based on observations obtained during three separate runs spanning almost 20 years. In total, we found 79 variable stars, of which 66 are new. The BVIC photometry was transformed to the standard system and used to derive cluster parameters by means of isochrone fitting. The cluster is about 20 Myr old, and the mean reddening amounts to about 0.48 mag in terms of the color excess E(B-V). Depending on the metallicity, the isochrone fitting yields a distance between 2.3 kpc and 2.9 kpc, which locates the cluster in the Perseus arm of the Galaxy. Using the complementary Hα photometry carried out in two seasons separated by over 10 years, we find that the cluster is very rich in Be stars. In total, 15 stars in the observed field, of which 14 are cluster members, showed Hα in emission either during our observations or in the past. Most of the Be stars vary in brightness on different time scales, including short-period variability related most likely to g-mode pulsations. A single-epoch spectrum of NGC 457-6 shows that this Be star is presently in the shell phase. The inventory of variable stars in the observed field consists of a single β Cep-type star, NGC 457-8, 13 Be stars, 21 slowly pulsating B stars, seven δ Sct stars, one γ Dor star, 16 unclassified periodic stars, 8 eclipsing systems and a dozen stars with irregular variability, of which six are also B-type stars. As many as 45 variable stars are of spectral type B, which is the largest number in all open clusters presented in this series of papers. The most interesting is the discovery of a large group of slowly pulsating B stars which occupy the cluster main sequence in the range between V=11 mag and 14.5 mag, corresponding to spectral types B3 to B8. They all have very low amplitudes and about half show pulsations with frequencies higher than 3 d⁻¹.
We argue that these are most likely fast-rotating slowly pulsating B stars, observed also in other open clusters.
A deep staring campaign in the σ Orionis cluster. Variability in substellar members
NASA Astrophysics Data System (ADS)
Elliott, P.; Scholz, A.; Jayawardhana, R.; Eislöffel, J.; Hébrard, E. M.
2017-12-01
Context. The young star cluster near σ Orionis is one of the primary environments to study the properties of young brown dwarfs down to masses comparable to those of giant planets. Aims: Deep optical imaging is used to study time-domain properties of young brown dwarfs over typical rotational timescales and to search for new substellar and planetary-mass cluster members. Methods: We used the Visible Multi Object Spectrograph (VIMOS) at the Very Large Telescope (VLT) to monitor a 24'× 16' field in the I-band. We stared at the same area over a total integration time of 21 h, spanning three observing nights. Using the individual images from this run we investigated the photometric time series of nine substellar cluster members with masses from 10 to 60 MJup. The deep stacked image shows cluster members down to ≈5 MJup. We searched for new planetary-mass objects by combining our deep I-band photometry with public J-band magnitudes and by examining the nearby environment of known very low mass members for possible companions. Results: We find two brown dwarfs, with significantly variable, aperiodic light curves, both with masses around 50 MJup, one of which was previously unknown to be variable. The physical mechanism responsible for the observed variability is likely to be different for the two objects. The variability of the first object, a single-lined spectroscopic binary, is most likely linked to its accretion disc; the second may be caused by variable extinction by large grains. We find five new candidate members from the colour-magnitude diagram and three from a search for companions within 2000 au. We rule all eight sources out as potential members based on non-stellar shape and/or infrared colours. The I-band photometry is made available as a public dataset. Conclusions: We present two variable brown dwarfs. One is consistent with ongoing accretion, the other exhibits apparent transient variability without the presence of an accretion disc. 
Our analysis confirms the existing census of substellar cluster members down to ≈7 MJup. The zero result from our companion search agrees with the low occurrence rate of wide companions to brown dwarfs found in other works. Based on observations made with ESO Telescopes at the Paranal Observatory under programme ID 078.C-0042. Full Table B.1 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/608/A66
DOE Office of Scientific and Technical Information (OSTI.GOV)
Salinas, R.; Pajkos, M. A.; Strader, J.
Intermediate-age star clusters in the Large Magellanic Cloud show extended main sequence turnoffs (MSTOs) that are not consistent with a canonical single stellar population. These broad turnoffs have been interpreted as evidence for extended star formation and/or stellar rotation. Since most of these studies use single frames per filter to do the photometry, the presence of variable stars near the MSTO in these clusters has remained unnoticed and their impact has been totally ignored. We model the influence of Delta Scuti using synthetic CMDs, adding variable stars following different levels of incidence and amplitude distributions. We show that Delta Scuti observed at a single phase will produce a broadening of the MSTO without affecting other areas of a CMD such as the upper MS or the red clump; furthermore, the amount of spread introduced correlates with cluster age, as observed. This broadening is constrained to ages ∼1–3 Gyr when the MSTO area crosses the instability strip, which is also consistent with observations. Variable stars cannot explain bifurcated MSTOs or the extended MSTOs seen in some young clusters, but they can make an important contribution to the extended MSTOs in intermediate-age clusters.
Cardiovascular reactivity patterns and pathways to hypertension: a multivariate cluster analysis.
Brindle, R C; Ginty, A T; Jones, A; Phillips, A C; Roseboom, T J; Carroll, D; Painter, R C; de Rooij, S R
2016-12-01
Substantial evidence links exaggerated mental stress induced blood pressure reactivity to future hypertension, but the results for heart rate reactivity are less clear. For this reason multivariate cluster analysis was carried out to examine the relationship between heart rate and blood pressure reactivity patterns and hypertension in a large prospective cohort (age range 55-60 years). Four clusters emerged with statistically different systolic and diastolic blood pressure and heart rate reactivity patterns. Cluster 1 was characterised by a relatively exaggerated blood pressure and heart rate response while the blood pressure and heart rate responses of cluster 2 were relatively modest and in line with the sample mean. Cluster 3 was characterised by blunted cardiovascular stress reactivity across all variables and cluster 4, by an exaggerated blood pressure response and modest heart rate response. Membership to cluster 4 conferred an increased risk of hypertension at 5-year follow-up (hazard ratio=2.98 (95% CI: 1.50-5.90), P<0.01) that survived adjustment for a host of potential confounding variables. These results suggest that the cardiac reactivity plays a potentially important role in the link between blood pressure reactivity and hypertension and support the use of multivariate approaches to stress psychophysiology.
Multifocal visual evoked potentials for early glaucoma detection.
Weizer, Jennifer S; Musch, David C; Niziol, Leslie M; Khan, Naheed W
2012-07-01
To compare multifocal visual evoked potentials (mfVEP) with other detection methods in early open-angle glaucoma. Ten patients with suspected glaucoma and 5 with early open-angle glaucoma underwent mfVEP, standard automated perimetry (SAP), short-wave automated perimetry, frequency-doubling technology perimetry, and nerve fiber layer optical coherence tomography. Nineteen healthy control subjects underwent mfVEP and SAP for comparison. Comparisons between groups involving continuous variables were made using independent t tests; for categorical variables, Fisher's exact test was used. Monocular mfVEP cluster defects were associated with an increased SAP pattern standard deviation (P = .0195). Visual fields that showed interocular mfVEP cluster defects were more likely to also show superior quadrant nerve fiber layer thinning by OCT (P = .0152). Multifocal visual evoked potential cluster defects are associated with a functional and an anatomic measure that both relate to glaucomatous optic neuropathy. Copyright 2012, SLACK Incorporated.
Montemagni, Cristiana; Frieri, Tiziana; Villari, Vincenzo; Rocca, Paola
2018-06-01
The purpose of the study was to identify homogeneous subgroups, based upon achievement of two functional milestones (marriage and employment) and Global Assessment of Functioning (GAF) score, in a sample of 848 acute patients admitted to the Psychiatric Emergency Service (PES) of the Città della Salute e della Scienza di Torino during a 24-month period. A two-step cluster analysis, using GAF total score and the achievements in the two milestones as input data, was performed. In order to examine whether the identified subgroups differed in external variables that were not included in the clustering process, and consequently to validate the functional profiles found, chi-square tests for categorical variables and analyses of variance (ANOVA) for continuous variables were performed. Five clusters were found. Employed patients (Clusters 4 and 5) had more years of education, less illness chronicity (shorter duration of illness and lower proportion of previous voluntary hospitalizations), lower use of mental health resources in the last year yet higher treatment adherence, larger network size, and higher ordinary discharge. Married inpatients (Clusters 3 and 5) had lower frequencies of substance abuse. The remarkably high rate of unemployment in this inpatient sample, and the evidence of associations between unemployment and poorer functioning, argue for further research and development of evidence-based supported employment programs that make a diligent effort to help people obtain work quickly and sustain it; they may also help to reduce health care service use among that clientele.
Exposing the Binary Heart of ETA Carinae
NASA Technical Reports Server (NTRS)
Forman, William; Mushotzky, Richard (Technical Monitor)
2005-01-01
Continued progress was made last year on A1367. As noted before, A1367 is a puzzling cluster with a large elongation, suggesting a major merger but with an anti-correlation between the luminosity and temperature of the two components of the cluster (NE and SW). The less luminous subconcentration appears hotter and the more luminous portion of the cluster appears cooler in contradiction to the well-established positive correlation of temperature and luminosity for clusters and groups. With the XMM-Newton observation we have developed a merger model to explain this apparent contradiction.
McParland, D; Phillips, C M; Brennan, L; Roche, H M; Gormley, I C
2017-12-10
The LIPGENE-SU.VI.MAX study, like many others, recorded high-dimensional continuous phenotypic data and categorical genotypic data. LIPGENE-SU.VI.MAX focuses on the need to account for both phenotypic and genetic factors when studying the metabolic syndrome (MetS), a complex disorder that can lead to higher risk of type 2 diabetes and cardiovascular disease. Interest lies in clustering the LIPGENE-SU.VI.MAX participants into homogeneous groups or sub-phenotypes, by jointly considering their phenotypic and genotypic data, and in determining which variables are discriminatory. A novel latent variable model that elegantly accommodates high dimensional, mixed data is developed to cluster LIPGENE-SU.VI.MAX participants using a Bayesian finite mixture model. A computationally efficient variable selection algorithm is incorporated, estimation is via a Gibbs sampling algorithm and an approximate BIC-MCMC criterion is developed to select the optimal model. Two clusters or sub-phenotypes ('healthy' and 'at risk') are uncovered. A small subset of variables is deemed discriminatory, which notably includes phenotypic and genotypic variables, highlighting the need to jointly consider both factors. Further, 7 years after the LIPGENE-SU.VI.MAX data were collected, participants underwent further analysis to diagnose presence or absence of the MetS. The two uncovered sub-phenotypes strongly correspond to the 7-year follow-up disease classification, highlighting the role of phenotypic and genotypic factors in the MetS and emphasising the potential utility of the clustering approach in early screening. Additionally, the ability of the proposed approach to define the uncertainty in sub-phenotype membership at the participant level is synonymous with the concepts of precision medicine and nutrition. Copyright © 2017 John Wiley & Sons, Ltd.
Time-Series Monitoring of Open Star Clusters
NASA Astrophysics Data System (ADS)
Hojaev, A. S.; Semakov, D. G.
2006-08-01
Star clusters, especially compact ones (with diameters of a few to ten arcmin), are suitable targets for searching for light variability in an ensemble of stars by means of an ordinary Cassegrain telescope plus CCD system. Dedicated patrolling with short, fixed exposures and mmag accuracy could also be used to study stellar oscillations for a group of stars simultaneously. The latter can be carried out both separately from one site and within international campaigns. Detection and study of the optical variability of X-ray sources, including X-ray binaries with compact objects, might also result from long-term monitoring of such clusters. We present the program of open star cluster monitoring with the Zeiss 1 meter RCC telescope of Maidanak observatory, which has recently been automated. In combination with the quite good seeing at this observatory (see, e.g., Sarazin, M. 1999, URL http://www.eso.org/gen-fac/pubs/astclim/), the automatic telescope equipped with the available large-format (2K×2K) CCD camera AP-10 will allow us to collect homogeneous time series for analysis. We already started this program in 2001 and obtained a set of patrol observations with the Zeiss 0.6 meter telescope and AP-10 camera in 2003. Seven compact open clusters in the Milky Way (NGC 7801, King 1, King 13, King 18, King 20, Berkeley 55, IC 4996) have been monitored for stellar variability, and some results of the photometry will be presented. A few interesting variables were discovered, and dozens of stars were suspected of variability, for the first time in these clusters. We have taken steps to join the Whole-Earth Telescope effort in its future campaigns.
Five-wave-packet quantum error correction based on continuous-variable cluster entanglement
Hao, Shuhong; Su, Xiaolong; Tian, Caixing; Xie, Changde; Peng, Kunchi
2015-01-01
Quantum error correction protects the quantum state against noise and decoherence in quantum communication and quantum computation, which enables one to perform fault-tolerant quantum information processing. We experimentally demonstrate a quantum error correction scheme with a five-wave-packet code against a single stochastic error, the original theoretical model of which was first proposed by S. L. Braunstein and T. A. Walker. Five submodes of a continuous variable cluster entangled state of light are used for five encoding channels. In particular, in our encoding scheme the information of the input state is distributed on only three of the five channels, and thus any error appearing in the remaining two channels never affects the output state, i.e. the output quantum state is immune to errors in those two channels. The stochastic error on a single channel is corrected for both vacuum and squeezed input states and the achieved fidelities of the output states are beyond the corresponding classical limit. PMID:26498395
Fleury, Marie-Josée; Grenier, Guy; Bamvita, Jean-Marie
2017-11-13
This study developed a typology describing change in the perceived adequacy of help received among 204 individuals with severe mental disorders, 5 years after transfer to the community following a major mental health reform in Quebec (Canada). Participant typologies were constructed using a two-step cluster analysis. There were significant differences between T0 and T2 for perceived adequacy of help received and other independent variables, including seriousness of needs, help from services or relatives, and care continuity. Five classes emerged from the analysis. Perceived adequacy of help received at T2 increased for Class 1, mainly comprised of older women with mood disorders. Overall, greater care continuity and levels of help from services and relatives related to higher perceived adequacy of help received. Changes in perceived adequacy of help received resulting from several combinations of associated variables indicate that mental health service delivery should respond to specific profiles and determinants.
Unsupervised classification of multivariate geostatistical data: Two algorithms
NASA Astrophysics Data System (ADS)
Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques
2015-12-01
With the increasing development of remote sensing platforms and the evolution of sampling facilities in mining and oil industry, spatial datasets are becoming increasingly large, inform a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables in hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
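The first algorithm described above (agglomerative clustering in which two clusters may merge only if they are neighbours in a graph built on the sample coordinates) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the k-nearest-neighbour graph, the centroid linkage, and the names `knn_graph` and `spatial_agglomerative` are assumptions made for the example.

```python
import math


def knn_graph(coords, k):
    """Symmetric k-nearest-neighbour graph over sample locations (assumed graph type)."""
    n = len(coords)
    edges = {i: set() for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )[:k]
        for j in nearest:
            edges[i].add(j)
            edges[j].add(i)
    return edges


def spatial_agglomerative(coords, values, n_clusters, k=2):
    """Merge the closest pair of clusters in variable space, but only if the
    two clusters are connected in the spatial graph (proximity condition)."""
    edges = knn_graph(coords, k)
    members = {i: [i] for i in range(len(coords))}
    mean = {i: list(values[i]) for i in range(len(coords))}
    while len(members) > n_clusters:
        best = None
        for a in members:
            for b in edges[a]:
                if a < b:
                    d = math.dist(mean[a], mean[b])
                    if best is None or d < best[0]:
                        best = (d, a, b)
        if best is None:
            break  # no adjacent clusters left: the graph is disconnected
        _, a, b = best
        na, nb = len(members[a]), len(members[b])
        # Centroid linkage: the merged cluster is summarised by its mean vector.
        mean[a] = [(na * x + nb * y) / (na + nb) for x, y in zip(mean[a], mean[b])]
        del mean[b]
        members[a] += members.pop(b)
        # Cluster a inherits the spatial neighbours of cluster b.
        for c in edges.pop(b):
            edges[c].discard(b)
            if c != a:
                edges[c].add(a)
                edges[a].add(c)
    labels = [0] * len(coords)
    for lab, ids in enumerate(members.values()):
        for i in ids:
            labels[i] = lab
    return labels


# Toy transect: the last sample resembles the first three in value,
# but it is not spatially adjacent to them, so it stays separate.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
values = [[0.0], [0.1], [0.2], [10.0], [10.1], [0.05]]
labels = spatial_agglomerative(coords, values, n_clusters=3, k=2)
```

In the toy transect, an unconstrained clustering would group the last sample with the first three; the graph condition is what keeps the classes spatially coherent.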
Moerbeek, Mirjam; van Schie, Sander
2016-07-11
The number of clusters in a cluster randomized trial is often low. It is therefore likely that random assignment of clusters to treatment conditions results in covariate imbalance. There are no studies that quantify the consequences of covariate imbalance in cluster randomized trials on parameter and standard error bias and on power to detect treatment effects. The consequences of covariate imbalance in unadjusted and adjusted linear mixed models are investigated by means of a simulation study. The factors in this study are the degree of imbalance, the covariate effect size, the cluster size and the intraclass correlation coefficient. The covariate is binary and measured at the cluster level; the outcome is continuous and measured at the individual level. The results show covariate imbalance results in negligible parameter bias and small standard error bias in adjusted linear mixed models. Ignoring the possibility of covariate imbalance while calculating the sample size at the cluster level may result in a loss in power of at most 25 % in the adjusted linear mixed model. The results are more severe for the unadjusted linear mixed model: parameter biases up to 100 % and standard error biases up to 200 % may be observed. Power levels based on the unadjusted linear mixed model are often too low. The consequences are most severe for large clusters and/or small intraclass correlation coefficients since then the required number of clusters to achieve a desired power level is smallest. The possibility of covariate imbalance should be taken into account while calculating the sample size of a cluster randomized trial. Otherwise more sophisticated methods to randomize clusters to treatments should be used, such as stratification or balance algorithms. All relevant covariates should be carefully identified, be actually measured and included in the statistical model to avoid severe levels of parameter and standard error bias and insufficient power levels.
Solute Transport Dynamics in a Large Hyporheic Corridor System
NASA Astrophysics Data System (ADS)
Zachara, J. M.; Chen, X.; Murray, C. J.; Shuai, P.; Rizzo, C.; Song, X.; Dai, H.
2016-12-01
A hyporheic corridor is an extended zone of groundwater-surface water interaction that occurs within permeable aquifer sediments in hydrologic continuity with a river. These systems are dynamic and tightly coupled to river stage variations that may occur over variable time scales. Here we describe the behavior of a persistent uranium (U) contaminant plume that exists within the hyporheic corridor of a large, managed river system - the Columbia River. Temporally dense monitoring data were collected for a two year period from wells located within the plume at varying distances up to 400 m from the river shore. Groundwater U originates from desorption of residual U in the lower vadose zone during periods of high river stage and associated elevated water table. U is weakly adsorbed to aquifer sediments because of coarse texture, and along with specific conductance, serves as a tracer of vadose zone source terms, solute transport pathways, and groundwater-surface water mixing. Complex U concentration and specific conductance trends were observed for all wells that varied with distance from the river shoreline and the river hydrograph, although trends for each well were generally repeatable for each year during the monitoring period. Statistical clustering analysis was used to identify four groups of wells that exhibited common trends in dissolved U and specific conductance. A flow and reactive transport code, PFLOTRAN, was implemented within a hydrogeologic model of the groundwater-surface water interaction zone to provide insights on hydrologic processes controlling monitoring trends and cluster behavior. The hydrogeologic model was informed by extensive subsurface characterization, with the spatially variable topography of a basal aquitard being one of several key parameters. 
Numerical tracer experiments using PFLOTRAN revealed the presence of temporally complex flow trajectories, spatially variable domains of groundwater - river water mixing, and locations of enhanced groundwater - river exchange that helped to explain monitoring trends. Observations and modeling results are integrated into a conceptual model of this highly complex and dynamic system with applicability to hyporheic corridor systems elsewhere.
Scalable clustering algorithms for continuous environmental flow cytometry.
Hyrkas, Jeremy; Clayton, Sophie; Ribalet, Francois; Halperin, Daniel; Armbrust, E Virginia; Howe, Bill
2016-02-01
Recent technological innovations in flow cytometry now allow oceanographers to collect high-frequency flow cytometry data from particles in aquatic environments on a scale far surpassing conventional flow cytometers. The SeaFlow cytometer continuously profiles microbial phytoplankton populations across thousands of kilometers of the surface ocean. The data streams produced by instruments such as SeaFlow challenge the traditional sample-by-sample approach in cytometric analysis and highlight the need for scalable clustering algorithms to extract population information from these large-scale, high-frequency flow cytometers. We explore how available algorithms commonly used for medical applications perform at classification of such large-scale environmental flow cytometry data. We apply large-scale Gaussian mixture models to massive datasets using Hadoop. This approach outperforms current state-of-the-art cytometry classification algorithms in accuracy and can be coupled with manual or automatic partitioning of data into homogeneous sections for further classification gains. We propose the Gaussian mixture model with partitioning approach for classification of large-scale, high-frequency flow cytometry data. Source code available for download at https://github.com/jhyrkas/seaflow_cluster, implemented in Java for use with Hadoop. Contact: hyrkas@cs.washington.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
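To illustrate the Gaussian mixture classification idea on a small scale, here is a single-machine sketch of expectation-maximization for a two-component, one-dimensional mixture. It is not the authors' Hadoop/Java code; the function names `fit_gmm_1d` and `classify`, the initialization, and the toy measurements are assumptions chosen for the example.

```python
import math


def _weighted_pdf(x, w, mu, var):
    """Mixture weight times the Gaussian density at x."""
    return w * math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative only)."""
    lo, hi = min(xs), max(xs)
    mu = [lo, hi]                           # assumed initialization: data extremes
    spread = ((hi - lo) / 4) ** 2 or 1.0    # fall back to 1.0 if all points coincide
    var = [spread, spread]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [_weighted_pdf(x, w[k], mu[k], var[k]) for k in range(2)]
            s = sum(p) or 1e-300
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-300)
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, 1e-6)
    return w, mu, var


def classify(x, w, mu, var):
    """Assign x to the component with the larger weighted density."""
    p = [_weighted_pdf(x, w[k], mu[k], var[k]) for k in range(2)]
    return 0 if p[0] >= p[1] else 1


# Hypothetical scatter measurements from two well-separated populations.
xs = [0.9, 1.0, 1.1, 1.0, 4.9, 5.0, 5.1, 5.0]
w, mu, var = fit_gmm_1d(xs)
labels = [classify(x, w, mu, var) for x in xs]
```

Real cytometry data are multivariate and far larger, which is where the partitioning and distributed execution described in the abstract would come in.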
Cluster Tails for Critical Power-Law Inhomogeneous Random Graphs
NASA Astrophysics Data System (ADS)
van der Hofstad, Remco; Kliem, Sandra; van Leeuwaarden, Johan S. H.
2018-04-01
Recently, the scaling limit of cluster sizes for critical inhomogeneous random graphs of rank-1 type having finite variance but infinite third moment degrees was obtained in Bhamidi et al. (Ann Probab 40:2299-2361, 2012). It was proved that when the degrees obey a power law with exponent $\tau \in (3,4)$, the sequence of clusters ordered in decreasing size and multiplied through by $n^{-(\tau-2)/(\tau-1)}$ converges as $n \to \infty$ to a sequence of decreasing non-degenerate random variables. Here, we study the tails of the limit of the rescaled largest cluster, i.e., the probability that the scaling limit of the largest cluster takes a large value $u$, as a function of $u$. This extends a related result of Pittel (J Combin Theory Ser B 82(2):237-269, 2001) for the Erdős-Rényi random graph to the setting of rank-1 inhomogeneous random graphs with infinite third moment degrees. We make use of delicate large deviations and weak convergence arguments.
Solar Effects on Global Climate Due to Cosmic Rays and Solar Energetic Particles
NASA Technical Reports Server (NTRS)
Turco, R. P.; Raeder, J.; D'Auria, R.
2005-01-01
Although the work reported here does not directly connect solar variability with global climate change, this research establishes a plausible quantitative causative link between observed solar activity and apparently correlated variations in terrestrial climate parameters. Specifically, we have demonstrated that ion-mediated nucleation of atmospheric particles is a likely, and likely widespread, phenomenon that relates solar variability to changes in the microphysical properties of clouds. To investigate this relationship, we have constructed and applied a new model describing the formation and evolution of ionic clusters under a range of atmospheric conditions throughout the lower atmosphere. The activation of large ionic clusters into cloud nuclei is predicted to be favorable in the upper troposphere and mesosphere, and possibly in the lower stratosphere. The model developed under this grant needs to be extended to include additional cluster families, and should be incorporated into microphysical models to further test the cause-and-effect linkages that may ultimately explain key aspects of the connections between solar variability and climate.
Flemish palliative-care nurses' attitudes to palliative sedation: a quantitative study.
Gielen, Joris; Van den Branden, Stef; Van Iersel, Trudie; Broeckaert, Bert
2012-09-01
Palliative sedation is an option of last resort to control refractory suffering. In order to better understand palliative-care nurses' attitudes to palliative sedation, an anonymous questionnaire was sent to all nurses (589) employed in palliative care in Flanders (Belgium). In all, 70.5% of the nurses (n = 415) responded. A large majority did not agree that euthanasia is preferable to palliative sedation, were against non-voluntary euthanasia in the case of a deeply and continuously sedated patient and considered it generally better not to administer artificial food or fluids to such a patient. Two clusters were found: 58.5% belonged to the cluster of advocates of deep and continuous sedation and 41.5% belonged to the cluster of nurses restricting the application of deep and continuous sedation. These differences notwithstanding, overall the attitudes of the nurses are in accordance with the practice and policy of palliative sedation in Flemish palliative-care units.
Pre-Deployment Stress, Mental Health, and Help-Seeking Behaviors Among Marines
2014-01-01
associations between two categorical variables, and Wald tests were conducted to compare mean scores on continuous variables across groups (e.g...Cluster-adjusted Wald tests were conducted to determine whether there were significant differences by rank on the average number of potentially...deployed to Iraq or Afghanistan in 2010 or 2011 of rank O6 or lower. a Omnibus Rao-Scott chi-square test or adjusted Wald test is statistically
Estimating under-five mortality in space and time in a developing world context.
Wakefield, Jon; Fuglstad, Geir-Arne; Riebler, Andrea; Godwin, Jessica; Wilson, Katie; Clark, Samuel J
2018-01-01
Accurate estimates of the under-five mortality rate in a developing world context are a key barometer of the health of a nation. This paper describes a new model to analyze survey data on mortality in this context. We are interested in both spatial and temporal description; that is, we wish to estimate the under-five mortality rate across regions and years and to investigate the association between the under-five mortality rate and spatially varying covariate surfaces. We illustrate the methodology by producing yearly estimates for subnational areas in Kenya over the period 1980-2014 using data from the Demographic and Health Surveys, which use stratified cluster sampling. We use a binomial likelihood with fixed effects for the urban/rural strata and random effects for the clustering to account for the complex survey design. Smoothing is carried out using Bayesian hierarchical models with continuous spatial and temporally discrete components. A key component of the model is an offset to adjust for bias due to the effects of HIV epidemics. Substantively, there has been a sharp decline in Kenya in the under-five mortality rate in the period 1980-2014, but large variability in estimated subnational rates remains. A priority for future research is understanding this variability. In exploratory work, we examine whether a variety of spatial covariate surfaces can explain the variability in under-five mortality rate. Temperature, precipitation, a measure of malaria infection prevalence, and a measure of nearness to cities were candidates for inclusion in the covariate model, but the interplay between space, time, and covariates is complex.
Morata, Jordi; Puigdomènech, Pere
2017-02-08
Cucurbitaceae species contain a significantly lower number of genes coding for proteins with similarity to plant resistance genes belonging to the NBS-LRR family than other plant species of similar genome size. A large proportion of these genes are organized in clusters that appear to be hotspots of variability. The genomes of the Cucurbitaceae species measured until now are intermediate in size (between 350 and 450 Mb) and they apparently have not undergone any genome duplications besides those at the origin of eudicots. The cluster containing the largest number of NBS-LRR genes has previously been analyzed in melon and related species and showed a high degree of interspecific and intraspecific variability. It was of interest to study whether similar behavior occurred in other clusters of the same family of genes. The cluster of NBS-LRR genes located in melon chromosome 9 was analyzed and compared with the syntenic regions in other cucurbit genomes. This is the second cluster in number within this species and it contains nine sequences with a NBS-LRR annotation including two genes, Fom1 and Prv, providing resistance against Fusarium and Papaya ring-spot virus (PRSV). The variability within the melon species appears to consist essentially of single nucleotide polymorphisms. Clusters of similar genes are present in the syntenic regions of the two species of Cucurbitaceae that were sequenced, cucumber and watermelon. Most of the genes in the syntenic clusters can be aligned between species and a hypothesis of generation of the cluster is proposed. The number of genes in the watermelon cluster is similar to that in melon while a higher number of genes (12) is present in cucumber, a species with a smaller genome than melon. After comparing genome resequencing data of 115 cucumber varieties, deletion of a group of genes is observed in a group of varieties of Indian origin. 
Clusters of genes coding for NBS-LRR proteins in cucurbits appear to have specific variability in different regions of the genome and between different species. This observation is in favour of considering that the adaptation of plant species to changing environments is based upon the variability that may occur at any location in the genome and that has been produced by specific mechanisms of sequence variation acting on plant genomes. This information could be useful both to understand the evolution of species and for plant breeding.
A Photometric Survey of the Open Clusters NGC 7789 and M67
NASA Astrophysics Data System (ADS)
Janes, Kenneth
2010-01-01
Although there is strong evidence that stellar activity declines as a star ages, beyond about the age of the Hyades (600 Myr) there is little direct confirmation of this decline in stars of known age. This report is an update of an earlier report (Hayes-Gehrke, et al., 2004, AJ, 128, 2862) of a long-term project to explore stellar activity in old open clusters. I have now accumulated 12 years of photometry of the old clusters NGC 7789 (about 1.8 Gyr) and M 67 (about 4 Gyr). An analysis of these data has revealed a substantial number of low-amplitude variable stars in both clusters, including a number of previously-discovered eclipsing binary stars, and several stars near the main sequence turnoff of both clusters that exhibit apparently erratic variations. Some of the M 67 erratics are known X-ray sources. On the main sequence, the large majority of stars show little or no evidence for variability at the 0.1% - 0.2% level, consistent with a regular systematic decline in activity level with age.
Multimode delta Scuti stars in the open cluster NGC 7062
NASA Astrophysics Data System (ADS)
Freyhammer, L. M.; Arentoft, T.; Sterken, C.
2001-03-01
The central field of NGC 7062 was observed intensively with the main goal of finding delta Scuti stars suitable for use in asteroseismological tests of stellar structure and evolution theory. BV time series photometry was obtained for this northern open cluster, which has a large population of stars inside the delta Scuti instability strip, making it a probable host of several such variables. We report findings of 15 pulsating stars, including at least 13 delta Scuti stars. Ten variables oscillate in two or more frequencies. Only one of these variables was known before, for which we detected 9 frequencies. Five probable variables are mentioned, and period analysis is given for all 20 stars. Based on observations obtained at the Nordic Optical Telescope, operated on the island of La Palma jointly by Denmark, Finland, Iceland, Norway, and Sweden, in the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofísica de Canarias.
Knox, Stephanie A; Chondros, Patty
2004-01-01
Background Cluster sample study designs are cost effective; however, cluster samples violate the simple random sample assumption of independence of observations. Failure to account for the intra-cluster correlation of observations when sampling through clusters may lead to an under-powered study. Researchers therefore need estimates of intra-cluster correlation for a range of outcomes to calculate sample size. We report intra-cluster correlation coefficients observed within a large-scale cross-sectional study of general practice in Australia, where the general practitioner (GP) was the primary sampling unit and the patient encounter was the unit of inference. Methods Each year the Bettering the Evaluation and Care of Health (BEACH) study recruits a random sample of approximately 1,000 GPs across Australia. Each GP completes details of 100 consecutive patient encounters. Intra-cluster correlation coefficients were estimated for patient demographics, morbidity managed and treatments received. Intra-cluster correlation coefficients were estimated for descriptive outcomes and for associations between outcomes and predictors and were compared across two independent samples of GPs drawn three years apart. Results Between April 1999 and March 2000, a random sample of 1,047 Australian general practitioners recorded details of 104,700 patient encounters. Intra-cluster correlation coefficients for patient demographics ranged from 0.055 for patient sex to 0.451 for language spoken at home. Intra-cluster correlations for morbidity variables ranged from 0.005 for the management of eye problems to 0.059 for management of psychological problems. Intra-cluster correlation for the association between two variables was smaller than the descriptive intra-cluster correlation of each variable. When compared with the April 2002 to March 2003 sample (1,008 GPs) the estimated intra-cluster correlation coefficients were found to be consistent across samples. 
Conclusions The demonstrated precision and reliability of the estimated intra-cluster correlations indicate that these coefficients will be useful for calculating sample sizes in future general practice surveys that use the GP as the primary sampling unit. PMID:15613248
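As a worked illustration of how such coefficients feed into a sample size calculation, the sketch below uses the standard design effect for equal cluster sizes, 1 + (m - 1) × ICC. This formula is common survey-sampling practice and is not quoted from the abstract; the function names and the target of 400 independent observations are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from sampling in clusters of equal size m: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def clusters_needed(n_srs, cluster_size, icc):
    """Clusters required to match the precision of a simple random sample of size n_srs."""
    return math.ceil(n_srs * design_effect(cluster_size, icc) / cluster_size)


# ICCs reported in the abstract, with 100 encounters per GP.
deff_sex = design_effect(100, 0.055)        # patient sex
deff_language = design_effect(100, 0.451)   # language spoken at home

# Hypothetical target: precision equivalent to 400 independent observations.
gps_for_sex = clusters_needed(400, 100, 0.055)
gps_for_language = clusters_needed(400, 100, 0.451)
```

The contrast between the two outcomes shows why outcome-specific coefficients matter: an outcome with a high intra-cluster correlation needs many more GPs for the same precision.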
NASA Astrophysics Data System (ADS)
Quinn, Kevin Martin
The total amount of precipitation integrated across a precipitation cluster (contiguous precipitating grid cells exceeding a minimum rain rate) is a useful measure of the aggregate size of the disturbance, expressed as the rate of water mass lost or latent heat released, i.e. the power of the disturbance. Probability distributions of cluster power are examined during boreal summer (May-September) and winter (January-March) using satellite-retrieved rain rates from the Tropical Rainfall Measuring Mission (TRMM) 3B42 and Special Sensor Microwave Imager and Sounder (SSM/I and SSMIS) programs, model output from the High Resolution Atmospheric Model (HIRAM, roughly 0.25-0.5° resolution), seven 1-2° resolution members of the Coupled Model Intercomparison Project Phase 5 (CMIP5) experiment, and National Center for Atmospheric Research Large Ensemble (NCAR LENS). Spatial distributions of precipitation-weighted centroids are also investigated in observations (TRMM-3B42) and climate models during winter as a metric for changes in mid-latitude storm tracks. Observed probability distributions for both seasons are scale-free from the smallest clusters up to a cutoff scale at high cluster power, after which the probability density drops rapidly. When low rain rates are excluded by choosing a minimum rain rate threshold in defining clusters, the models accurately reproduce observed cluster power statistics and winter storm tracks. Changes in behavior in the tail of the distribution, above the cutoff, are important for impacts since these quantify the frequency of the most powerful storms. End-of-century cluster power distributions and storm track locations are investigated in these models under a "business as usual" global warming scenario. The probability of high cluster power events increases by end-of-century across all models, by up to an order of magnitude for the highest-power events for which statistics can be computed. 
For the three models in the suite with continuous time series of high resolution output, there is substantial variability on when these probability increases for the most powerful precipitation clusters become detectable, ranging from detectable within the observational period to statistically significant trends emerging only after 2050. A similar analysis of National Centers for Environmental Prediction (NCEP) Reanalysis 2 and SSM/I-SSMIS rain rate retrievals in the recent observational record does not yield reliable evidence of trends in high-power cluster probabilities at this time. Large impacts to mid-latitude storm tracks are projected over the West Coast and eastern North America, with no less than 8 of the 9 models examined showing large increases by end-of-century in the probability density of the most powerful storms, ranging up to a factor of 6.5 in the highest range bin for which historical statistics are computed. However, within these regional domains, there is considerable variation among models in pinpointing exactly where the largest increases will occur.
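The cluster power defined at the start of this abstract (rain rate integrated over a set of contiguous grid cells above a threshold) can be computed with a simple flood fill. The sketch below is an illustration under stated assumptions: 4-neighbour connectivity, a uniform cell area, and the function name `cluster_powers` are choices made here, not details given in the abstract.

```python
def cluster_powers(rain, threshold, cell_area=1.0):
    """Sum rain rate over each contiguous group of cells exceeding the threshold.

    rain is a 2-D list of rain rates; contiguity is 4-neighbour (assumed).
    Returns one integrated value per cluster, in scan order.
    """
    rows, cols = len(rain), len(rain[0])
    seen = [[False] * cols for _ in range(rows)]
    powers = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or rain[r][c] <= threshold:
                continue
            total, stack = 0.0, [(r, c)]
            seen[r][c] = True
            while stack:  # depth-first flood fill of one cluster
                i, j = stack.pop()
                total += rain[i][j] * cell_area
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj] and rain[ni][nj] > threshold):
                        seen[ni][nj] = True
                        stack.append((ni, nj))
            powers.append(total)
    return powers


# Toy rain-rate field (arbitrary units) containing three separate clusters.
field = [
    [0, 2, 3, 0],
    [0, 0, 4, 0],
    [5, 0, 0, 0],
    [6, 0, 1, 1],
]
powers = sorted(cluster_powers(field, threshold=0.5))
```

Raising `threshold` removes weak cells and can split or eliminate clusters, which is the sensitivity to the minimum rain rate that the abstract mentions.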
Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M Margaret; Butte, Atul J; Manley, Geoffrey T
2010-01-01
Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enables pattern recognition and identification of physiologic patient states. We hypothesized that processing of multivariate data using hierarchical clustering techniques would allow identification of otherwise hidden patient physiologic patterns that would be predictive of outcome. Multivariate physiologic and ventilator data were collected continuously using a multimodal bioinformatics system in the surgical ICU at San Francisco General Hospital. These data were incorporated with non-continuous data and stored on a server in the ICU. A hierarchical clustering algorithm grouped each minute of data into 1 of 10 clusters. Clusters were correlated with outcome measures including incidence of infection, multiple organ failure (MOF), and mortality. We identified 10 clusters, which we defined as distinct patient states. While patients transitioned between states, they spent significant amounts of time in each. Clusters were enriched for our outcome measures: 2 of the 10 states were enriched for infection, 6 of 10 were enriched for MOF, and 3 of 10 were enriched for death. Further analysis of correlations between pairs of variables within each cluster reveals significant differences in physiology between clusters. Here we show for the first time the feasibility of clustering physiological measurements to identify clinically relevant patient states after trauma. 
These results demonstrate that hierarchical clustering techniques can be useful for visualizing complex multivariate data and may provide new insights for the care of critically injured patients.
Chen, Yun; Yang, Hui
2016-01-01
In the era of big data, there is increasing interest in clustering variables for the minimization of data redundancy and the maximization of variable relevancy. Existing clustering methods, however, depend on nontrivial assumptions about the data structure. Note that nonlinear interdependence among variables poses significant challenges to the traditional framework of predictive modeling. In the present work, we reformulate the problem of variable clustering from an information theoretic perspective that does not require the assumption of data structure for the identification of nonlinear interdependence among variables. Specifically, we propose the use of mutual information to characterize and measure nonlinear correlation structures among variables. Further, we develop Dirichlet process (DP) models to cluster variables based on the mutual-information measures among variables. Finally, orthonormalized variables in each cluster are integrated with a group elastic-net model to improve the performance of predictive modeling. Both simulation and real-world case studies showed that the proposed methodology not only effectively reveals the nonlinear interdependence structures among variables but also outperforms traditional variable clustering algorithms such as hierarchical clustering. PMID:27966581
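To show why mutual information can capture dependence that linear correlation misses, here is a small plug-in estimate for discrete variables. It is only a sketch of the similarity measure: the abstract does not say how mutual information is estimated, and the Dirichlet process clustering step is not reproduced. The function name `mutual_information` and the toy data are assumptions.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # I(X;Y) = sum over observed pairs of p(a,b) * log2( p(a,b) / (p(a) p(b)) )
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


# y = x**2 is fully determined by x, yet their linear correlation is zero.
x = [-1, 0, 1]
y = [v * v for v in x]
mi_nonlinear = mutual_information(x, y)

# Two independent binary variables share no information.
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])

# A variable compared with itself carries its full entropy (1 bit here).
mi_identical = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
```

Continuous variables would need binning or a density-based estimator before a measure like this could be applied.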
STAR FORMATION AND SUPERCLUSTER ENVIRONMENT OF 107 NEARBY GALAXY CLUSTERS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cohen, Seth A.; Hickox, Ryan C.; Wegner, Gary A.
We analyze the relationship between star formation (SF), substructure, and supercluster environment in a sample of 107 nearby galaxy clusters using data from the Sloan Digital Sky Survey. Previous works have investigated the relationships between SF and cluster substructure, and cluster substructure and supercluster environment, but definitive conclusions relating all three of these variables have remained elusive. We find an inverse relationship between cluster SF fraction ($f_{\rm SF}$) and supercluster environment density, calculated using the galaxy luminosity density field at a smoothing length of $8\,h^{-1}$ Mpc (D8). The slope of $f_{\rm SF}$ versus D8 is −0.008 ± 0.002. The $f_{\rm SF}$ of clusters located in low-density large-scale environments, 0.244 ± 0.011, is higher than for clusters located in high-density supercluster cores, 0.202 ± 0.014. We also divide superclusters, according to their morphology, into filament- and spider-type systems. The inverse relationship between cluster $f_{\rm SF}$ and large-scale density is dominated by filament- rather than spider-type superclusters. In high-density cores of superclusters, we find a higher $f_{\rm SF}$ in spider-type superclusters, 0.229 ± 0.016, than in filament-type superclusters, 0.166 ± 0.019. Using principal component analysis, we confirm these results and the direct correlation between cluster substructure and SF. These results indicate that cluster SF is affected by the dynamical age of the cluster (younger systems exhibit higher amounts of SF), the large-scale density of the supercluster environment (high-density core regions exhibit lower amounts of SF), and supercluster morphology (spider-type superclusters exhibit higher amounts of SF at high densities).
Clustering of Variables for Mixed Data
NASA Astrophysics Data System (ADS)
Saracco, J.; Chavent, M.
2016-05-01
This chapter presents clustering of variables, whose aim is to group together strongly related variables. The proposed approach works on a mixed data set, i.e. on a data set which contains numerical variables and categorical variables. Two algorithms for clustering variables are described: a hierarchical clustering and a k-means type clustering. A brief description of the PCAmix method (a principal component analysis for mixed data) is provided, since the calculation of the synthetic variables summarizing the obtained clusters of variables is based on this multivariate method. Finally, the R packages ClustOfVar and PCAmixdata are illustrated on real mixed data. The PCAmix and ClustOfVar approaches are first used for dimension reduction (step 1) before applying in step 2 a standard clustering method to obtain groups of individuals.
Race Factors Affecting Performance Times in Elite Long-Track Speed Skating.
Noordhof, Dionne A; Mulder, Roy C; de Koning, Jos J; Hopkins, Will G
2016-05-01
Analysis of sport performance can provide estimates of the effects of environmental and other venue-specific factors, in addition to estimates of within-athlete variability between competitions, which determines the smallest worthwhile effects. The aim of this study was to analyze elite long-track speed-skating events. Log-transformed performance times were analyzed with a mixed linear model that estimated percentage mean effects for altitude, barometric pressure, type of rink, and competition importance. In addition, coefficients of variation representing residual venue-related differences and within-athlete variability between races within clusters spanning ~8 d were determined. Effects and variability were assessed with magnitude-based inference. A 1000-m increase in altitude resulted in very large mean performance improvements of 2.8% in juniors and 2.1% in seniors. An increase in barometric pressure of 100 hPa resulted in a moderate reduction in performance of 1.1% for juniors but an unclear effect for seniors. Only juniors competed at open rinks, resulting in a very large reduction in performance of 3.4%. Juniors and seniors showed small performance improvements (0.4% and 0.3%) at the more important competitions. After accounting for these effects, residual venue-related variability was still moderate to large. The within-athlete within-cluster race-to-race variability was 0.3-1.3%, with a small difference in variability between male (0.8%) and female juniors (1.0%) and no difference between male and female seniors (both 0.6%). The variability in performance times of skaters is similar to that of athletes in other sports in which air or water resistance limits speed. A performance enhancement of 0.1-0.4% by top-10 athletes is necessary to increase medal-winning chances by 10%.
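Because the model works on log-transformed times, an additive effect on the log scale corresponds to a percentage effect on the raw time. The short Python sketch below shows that conversion. It assumes that an "improvement" means a decrease in time, and the baseline time of 70 s is an illustrative value, not a figure from the study.

```python
import math

def percent_effect(log_coefficient):
    """Convert an additive effect on log(time) into a percent change in time."""
    return 100.0 * (math.exp(log_coefficient) - 1.0)

def log_coefficient(percent):
    """Inverse conversion: percent change in time -> effect on log(time)."""
    return math.log(1.0 + percent / 100.0)

# A 2.8% improvement is treated here as a 2.8% reduction in time (assumption).
beta = log_coefficient(-2.8)
baseline_time = 70.0  # seconds; hypothetical value for illustration
predicted_time = baseline_time * math.exp(beta)
print(round(percent_effect(beta), 2), round(predicted_time, 2))
```

On the log scale, independent effects add, so two venue factors can be combined by summing their coefficients before converting back to a percentage.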
Pattern Selection and Super-Patterns in Opinion Dynamics
NASA Astrophysics Data System (ADS)
Ben-Naim, Eli; Scheel, Arnd
We study pattern formation in the bounded confidence model of opinion dynamics. In this random process, opinion is quantified by a single variable. Two agents may interact and reach a fair compromise, but only if their difference of opinion falls below a fixed threshold. Starting from a uniform distribution of opinions with compact support, a traveling wave forms and it propagates from the domain boundary into the unstable uniform state. Consequently, the system reaches a steady state with isolated clusters that are separated by distance larger than the interaction range. These clusters form a quasi-periodic pattern where the sizes of the clusters and the separations between them are nearly constant. We obtain analytically the average separation between clusters L. Interestingly, there are also very small quasi-periodic modulations in the size of the clusters. The spatial periods of these modulations are a series of integers that follow from the continued-fraction representation of the irrational average separation L.
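The interaction rule described above can be sketched as a small Monte Carlo simulation in Python. This is only an illustration under stated assumptions: a finite number of agents, random pairwise encounters, and parameter values (`n_agents`, `half_width`, `threshold`, `n_steps`) chosen arbitrarily; it does not reproduce the analytical treatment of the abstract.

```python
import random

def simulate_bounded_confidence(n_agents=200, half_width=5.0, threshold=1.0,
                                n_steps=50_000, seed=0):
    """Simulate the compromise process; returns (initial, final) opinions."""
    rng = random.Random(seed)
    # Uniform initial opinions on the compact support [-half_width, half_width].
    initial = [rng.uniform(-half_width, half_width) for _ in range(n_agents)]
    opinions = list(initial)
    for _ in range(n_steps):
        i, j = rng.sample(range(n_agents), 2)
        # Agents interact only if their opinions differ by less than the threshold.
        if abs(opinions[i] - opinions[j]) < threshold:
            # Fair compromise: both adopt the average of the two opinions.
            opinions[i] = opinions[j] = 0.5 * (opinions[i] + opinions[j])
    return initial, opinions

def count_clusters(opinions, gap):
    """Count groups of opinions separated by at least `gap`."""
    ordered = sorted(opinions)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a >= gap)

initial, final = simulate_bounded_confidence()
print(count_clusters(final, gap=1.0))
```

Each compromise conserves the sum of the two opinions, so the population mean stays fixed while the distribution collapses into clusters that can no longer interact.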
Vera, José Fernando; de Rooij, Mark; Heiser, Willem J
2014-11-01
In this paper we propose a latent class distance association model for clustering in the predictor space of large contingency tables with a categorical response variable. The rows of such a table are characterized as profiles of a set of explanatory variables, while the columns represent a single outcome variable. In many cases such tables are sparse, with many zero entries, which makes traditional models problematic. By clustering the row profiles into a few specific classes and representing these together with the categories of the response variable in a low-dimensional Euclidean space using a distance association model, a parsimonious prediction model can be obtained. A generalized EM algorithm is proposed to estimate the model parameters and the adjusted Bayesian information criterion statistic is employed to test the number of mixture components and the dimensionality of the representation. An empirical example highlighting the advantages of the new approach and comparing it with traditional approaches is presented. © 2014 The British Psychological Society.
NASA Technical Reports Server (NTRS)
Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara
2000-01-01
We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.
Cross-entropy clustering framework for catchment classification
NASA Astrophysics Data System (ADS)
Tongal, Hakan; Sivakumar, Bellie
2017-09-01
There is an increasing interest in catchment classification and regionalization in hydrology, as they are useful for identification of appropriate model complexity and transfer of information from gauged catchments to ungauged ones, among others. This study introduces a nonlinear cross-entropy clustering (CEC) method for classification of catchments. The method specifically considers embedding dimension (m), sample entropy (SampEn), and coefficient of variation (CV) to represent dimensionality, complexity, and variability of the time series, respectively. The method is applied to daily streamflow time series from 217 gauging stations across Australia. The results suggest that a combination of linear and nonlinear parameters (i.e. m, SampEn, and CV), representing different aspects of the underlying dynamics of streamflows, could be useful for determining distinct patterns of flow generation mechanisms within a nonlinear clustering framework. For the 217 streamflow time series, nine hydrologically homogeneous clusters that have distinct patterns of flow regime characteristics and specific dominant hydrological attributes with different climatic features are obtained. Comparison of the results with those obtained using the widely employed k-means clustering method (which results in five clusters, with the loss of some information about the features of the clusters) suggests the superiority of the cross-entropy clustering method. The outcomes from this study provide a useful guideline for employing the nonlinear dynamic approaches based on hydrologic signatures and for gaining an improved understanding of streamflow variability at a large scale.
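Two of the three features named above, the coefficient of variation (CV) and sample entropy (SampEn), can be computed with a short Python sketch. The conventions used here are assumptions, since the abstract does not specify them: population standard deviation, tolerance r = 0.2 × standard deviation, Chebyshev distance with a "less than or equal" match rule, and a fixed template length m = 2 rather than an estimated embedding dimension.

```python
import math
import random

def _mean_std(x):
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # population std
    return mean, std

def coefficient_of_variation(x):
    """CV = standard deviation divided by the mean."""
    mean, std = _mean_std(x)
    return std / mean

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A / B), with B and A the numbers of matching template
    pairs of length m and m + 1 (self-matches excluded)."""
    n = len(x)
    _, std = _mean_std(x)
    r = r_factor * std
    n_templates = n - m  # same number of templates for both lengths

    def count_matches(length):
        count = 0
        for i in range(n_templates):
            for j in range(i + 1, n_templates):
                # Chebyshev distance between the two templates.
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # undefined for this series length and tolerance
    return -math.log(a / b)

rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
periodic = [0.0, 1.0] * 50
print(sample_entropy(noise), sample_entropy(periodic))
```

A perfectly periodic series is fully predictable, so its sample entropy is zero, whereas irregular series give larger values; each station would contribute a feature vector such as (m, SampEn, CV) to the clustering step.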
Optimizing weak lensing mass estimates for cluster profile uncertainty
Gruen, D.; Bernstein, G. M.; Lam, T. Y.; ...
2011-09-11
Weak lensing measurements of cluster masses are necessary for calibrating mass-observable relations (MORs) to investigate the growth of structure and the properties of dark energy. However, the measured cluster shear signal varies at fixed mass M_200m due to inherent ellipticity of background galaxies, intervening structures along the line of sight, and variations in the cluster structure due to scatter in concentrations, asphericity and substructure. We use N-body simulated halos to derive and evaluate a weak lensing circular aperture mass measurement M_ap that minimizes the mass estimate variance <(M_ap - M_200m)^2> in the presence of all these forms of variability. Depending on halo mass and observational conditions, the resulting mass estimator improves on M_ap filters optimized for circular NFW-profile clusters in the presence of uncorrelated large scale structure (LSS) about as much as the latter improve on an estimator that only minimizes the influence of shape noise. Optimizing for uncorrelated LSS while ignoring the variation of internal cluster structure puts too much weight on the profile near the cores of halos, and under some circumstances can even be worse than not accounting for LSS at all. As a result, we discuss the impact of variability in cluster structure and correlated structures on the design and performance of weak lensing surveys intended to calibrate cluster MORs.
Problem decomposition by mutual information and force-based clustering
NASA Astrophysics Data System (ADS)
Otero, Richard Edward
The scale of engineering problems has sharply increased over the last twenty years. Larger coupled systems, increasing complexity, and limited resources create a need for methods that automatically decompose problems into manageable sub-problems by discovering and leveraging problem structure. The ability to learn the coupling (inter-dependence) structure and reorganize the original problem could lead to large reductions in the time to analyze complex problems. Such decomposition methods could also provide engineering insight on the fundamental physics driving problem solution. This work forwards the current state of the art in engineering decomposition through the application of techniques originally developed within computer science and information theory. The work describes the current state of automatic problem decomposition in engineering and utilizes several promising ideas to advance the state of the practice. Mutual information is a novel metric for data dependence and works on both continuous and discrete data. Mutual information can measure both the linear and non-linear dependence between variables without the limitations of linear dependence measured through covariance. Mutual information is also able to handle data that does not have derivative information, unlike other metrics that require it. The value of mutual information to engineering design work is demonstrated on a planetary entry problem. This study utilizes a novel tool developed in this work for planetary entry system synthesis. A graphical method, force-based clustering, is used to discover related sub-graph structure as a function of problem structure and links ranked by their mutual information. This method does not require the stochastic use of neural networks and could be used with any link ranking method currently utilized in the field. Application of this method is demonstrated on a large, coupled low-thrust trajectory problem. 
Mutual information also serves as the basis for an alternative global optimizer, called MIMIC, which is unrelated to Genetic Algorithms. Advancement to the current practice demonstrates the use of MIMIC as a global method that explicitly models problem structure with mutual information, providing an alternate method for globally searching multi-modal domains. By leveraging discovered problem interdependencies, MIMIC may be appropriate for highly coupled problems or those with large function evaluation cost. This work introduces a useful addition to the MIMIC algorithm that enables its use on continuous input variables. By leveraging automatic decision tree generation methods from Machine Learning and a set of randomly generated test problems, decision trees for which method to apply are also created, quantifying decomposition performance over a large region of the design space.
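The claim that mutual information captures non-linear dependence missed by covariance can be illustrated with a small Python sketch. The estimator used here, an equal-width histogram with 8 bins per variable, is an assumption chosen for simplicity; the abstract does not say which estimator the thesis uses.

```python
import math

def pearson(a, b):
    """Pearson correlation (a normalised covariance)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information, in nats."""
    def make_binner(values):
        lo, hi = min(values), max(values)
        def binner(v):
            if hi == lo:
                return 0
            # Clamp so that the maximum value falls into the last bin.
            return min(int((v - lo) / (hi - lo) * bins), bins - 1)
        return binner

    bin_x, bin_y = make_binner(x), make_binner(y)
    n = len(x)
    joint, px, py = {}, {}, {}
    for xv, yv in zip(x, y):
        i, j = bin_x(xv), bin_y(yv)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    # Sum of p(i,j) * ln(p(i,j) / (p(i) p(j))) over occupied cells.
    return sum(c / n * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

# y depends on x deterministically, but not linearly.
xs = [-1.0 + 2.0 * k / 999 for k in range(1000)]
ys = [v * v for v in xs]
print(pearson(xs, ys), mutual_information(xs, ys))
```

For this symmetric parabola the correlation is essentially zero while the mutual information is clearly positive, which is the kind of coupling a covariance-based link ranking would overlook.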
Analyzing chromatographic data using multilevel modeling.
Wiczling, Paweł
2018-06-01
It is relatively easy to collect chromatographic measurements for a large number of analytes, especially with gradient chromatographic methods coupled with mass spectrometry detection. Such data often have a hierarchical or clustered structure. For example, analytes with similar hydrophobicity and dissociation constant tend to be more alike in their retention than a randomly chosen set of analytes. Multilevel models recognize the existence of such data structures by assigning a model for each parameter, with its parameters also estimated from data. In this work, a multilevel model is proposed to describe retention time data obtained from a series of wide linear organic modifier gradients of different gradient duration and different mobile phase pH for a large set of acids and bases. The multilevel model consists of (1) the same deterministic equation describing the relationship between retention time and analyte-specific and instrument-specific parameters, (2) covariance relationships relating various physicochemical properties of the analyte to chromatographically specific parameters through quantitative structure-retention relationship based equations, and (3) stochastic components of intra-analyte and interanalyte variability. The model was implemented in Stan, which provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods. Graphical abstract: relationships between log k and MeOH content for acidic, basic, and neutral compounds with different log P (CI, credible interval; PSA, polar surface area).
Aguirre von Wobeser, Eneas; Ibelings, Bas W.; Bok, Jasper; Krasikov, Vladimir; Huisman, Jef; Matthijs, Hans C.P.
2011-01-01
Physiological adaptation and genome-wide expression profiles of the cyanobacterium Synechocystis sp. strain PCC 6803 in response to gradual transitions between nitrogen-limited and light-limited growth conditions were measured in continuous cultures. Transitions induced changes in pigment composition, light absorption coefficient, photosynthetic electron transport, and specific growth rate. Physiological changes were accompanied by reproducible changes in the expression of several hundred open reading frames, genes with functions in photosynthesis and respiration, carbon and nitrogen assimilation, protein synthesis, phosphorus metabolism, and overall regulation of cell function and proliferation. Cluster analysis of the nearly 1,600 regulated open reading frames identified eight clusters, each showing a different temporal response during the transitions. Two large clusters mirrored each other. One cluster included genes involved in photosynthesis, which were up-regulated during light-limited growth but down-regulated during nitrogen-limited growth. Conversely, genes in the other cluster were down-regulated during light-limited growth but up-regulated during nitrogen-limited growth; this cluster included several genes involved in nitrogen uptake and assimilation. These results demonstrate complementary regulation of gene expression for two major metabolic activities of cyanobacteria. Comparison with batch-culture experiments revealed interesting differences in gene expression between batch and continuous culture and illustrates that continuous-culture experiments can pick up subtle changes in cell physiology and gene expression. PMID:21205618
NASA Technical Reports Server (NTRS)
Fu, L.-L.; Chelton, D. B.
1985-01-01
A new method is developed for studying large-scale temporal variability of ocean currents from satellite altimetric sea level measurements at intersections (crossovers) of ascending and descending orbit ground tracks. Using this method, sea level time series can be constructed from crossover sea level differences in small sample areas where altimetric crossovers are clustered. The method is applied to Seasat altimeter data to study the temporal evolution of the Antarctic Circumpolar Current (ACC) over the 3-month Seasat mission (July-October 1978). The results reveal a generally eastward acceleration of the ACC around the Southern Ocean with meridional disturbances which appear to be associated with bottom topographic features. This is the first direct observational evidence for large-scale coherence in the temporal variability of the ACC. It demonstrates the great potential of satellite altimetry for synoptic observation of temporal variability of the world ocean circulation.
A model of metastable dynamics during ongoing and evoked cortical activity
NASA Astrophysics Data System (ADS)
La Camera, Giancarlo
The dynamics of simultaneously recorded spike trains in alert animals often evolve through temporal sequences of metastable states. Little is known about the network mechanisms responsible for the genesis of such sequences, or their potential role in neural coding. In the gustatory cortex of alert rats, state sequences can be observed also in the absence of overt sensory stimulation, and thus form the basis of the so-called 'ongoing activity'. This activity is characterized by a partial degree of coordination among neurons, sharp transitions among states, and multi-stability of single neurons' firing rates. A recurrent spiking network model with clustered topology can account for both the spontaneous generation of state sequences and the (network-generated) multi-stability. In the model, each network state results from the activation of specific neural clusters with potentiated intra-cluster connections. A mean field solution of the model shows a large number of stable states, each characterized by a subset of simultaneously active clusters. The firing rate in each cluster during ongoing activity depends on the number of active clusters, so that the same neuron can have different firing rates depending on the state of the network. Because of dense intra-cluster connectivity and recurrent inhibition, in finite networks the stable states lose stability due to finite size effects. Simulations of the dynamics show that the model ensemble activity continuously hops among the different states, reproducing the ongoing dynamics observed in the data. Moreover, when probed with external stimuli, the model correctly predicts the quenching of single neuron multi-stability into bi-stability, the reduction of dimensionality of the population activity, the reduction of trial-to-trial variability, and a potential role for metastable states in the anticipation of expected events. Altogether, these results provide a unified mechanistic model of ongoing and evoked cortical dynamics.
NSF IIS-1161852, NIDCD K25-DC013557, NIDCD R01-DC010389.
NASA Astrophysics Data System (ADS)
Rozyczka, M.; Narloch, W.; Pietrukowicz, P.; Thompson, I. B.; Pych, W.; Poleski, R.
2018-03-01
We adapt the friends-of-friends algorithm to the analysis of light curves, and show that it can be successfully applied to searches for transient phenomena in large photometric databases. As a test case we search OGLE-III light curves for known dwarf novae. A single combination of control parameters allows us to narrow the search to 1% of the data while reaching a ≈90% detection efficiency. A search involving ≈2% of the data and three combinations of control parameters can be significantly more effective: in our case a 100% efficiency is reached. The method can also quite efficiently detect semi-regular variability. In particular, 28 new semi-regular variables have been found in the field of the globular cluster M22, which was examined earlier with the help of periodicity-searching algorithms.
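For readers unfamiliar with the friends-of-friends idea, the following Python sketch shows the generic algorithm on a set of points: two points are "friends" if they lie within a linking length of each other, and groups are the connected components of the friendship relation. How the authors map light curves onto points and which control parameters they use is not stated in the abstract, so the function name, the Euclidean metric, and the example data are assumptions.

```python
import math

def friends_of_friends(points, linking_length):
    """Group point indices into connected components of the 'friend' relation."""
    n = len(points)
    parent = list(range(n))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= linking_length:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical points: a chain of three, a pair, and an isolated point.
pts = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0), (9.0, 9.0)]
print(friends_of_friends(pts, linking_length=0.6))
```

Points 0 and 2 are farther apart than the linking length but still end up in the same group because both are friends of point 1, which is the defining property of the method. The pairwise loop is quadratic in the number of points; a large database would need a spatial index instead.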
NASA Astrophysics Data System (ADS)
Jeon, Young-Beom; Nemec, James M.; Walker, Alistair R.; Kunder, Andrea M.
2014-06-01
Homogeneous B, V photometry is presented for 19,324 stars in and around 5 Magellanic Cloud globular clusters: NGC 1466, NGC 1841, NGC 2210, NGC 2257, and Reticulum. The photometry is derived from eight nights of CCD imaging with the Cerro Tololo Inter-American Observatory 0.9 m SMARTS telescope. Instrumental magnitudes were transformed to the Johnson B, V system using accurate calibration relations based on a large sample of Landolt-Stetson equatorial standard stars, which were observed on the same nights as the cluster stars. Residual analysis of the equatorial standards used for the calibration, and validation of the new photometry using Stetson's sample of secondary standards in the vicinities of the five Large Magellanic Cloud clusters, shows excellent agreement with our values in both magnitudes and colors. Color-magnitude diagrams reaching to the main-sequence turnoffs at V ~ 22 mag, sigma-magnitude diagrams, and various other summaries are presented for each cluster to illustrate the range and quality of the new photometry. The photometry should prove useful for future studies of the Magellanic Cloud globular clusters, particularly studies of their variable stars.
Continuous-variable quantum computation with spatial degrees of freedom of photons
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tasca, D. S.; Gomes, R. M.; Toscano, F.
2011-05-15
We discuss the use of the transverse spatial degrees of freedom of photons propagating in the paraxial approximation for continuous-variable information processing. Given the wide variety of linear optical devices available, a diverse range of operations can be performed on the spatial degrees of freedom of single photons. Here we show how to implement a set of continuous quantum logic gates which allow for universal quantum computation. In contrast with the usual quadratures of the electromagnetic field, the entire set of single-photon gates for spatial degrees of freedom does not require optical nonlinearity and, in principle, can be performed with a single device: the spatial light modulator. Nevertheless, nonlinear optical processes, such as four-wave mixing, are needed in the implementation of two-photon gates. The efficiency of these gates is at present very low; however, small-scale investigations of continuous-variable quantum computation are within the reach of current technology. In this regard, we show how novel cluster states for one-way quantum computing can be produced using spontaneous parametric down-conversion.
Interaction of boron cluster ions with water: Single collision dynamics and sequential etching
NASA Astrophysics Data System (ADS)
Hintz, Paul A.; Ruatta, Stephen A.; Anderson, Scott L.
1990-01-01
Reactions of mass-selected, cooled, boron cluster ions (B_n^+, n = 1-14) with water have been studied for collision energies from 0.1 to 6.0 eV. Most work was done with D2O, however isotope effects were examined for selected reactant cluster ions. For all size clusters there are exoergic product channels, which in most cases have no activation barriers. Cross sections are generally large, however there are fluctuations with cluster size in total reactivity, collision energy dependences, and in product distributions. For small cluster ions, there is a multitude of product channels. For clusters larger than B_6^+, the product distributions are dominated by a single channel: B_{n-1}D^+ + DBO. Under multiple collision conditions, the primary products undergo a remarkable sequence of secondary "etching" reactions. As these occur, boron atoms are continuously replaced by hydrogen, and the intermediate products retain the composition B_{n-m}H_m^+. This highly efficient chemistry appears to continue unchanged as the composition changes from pure boron to mostly hydrogen. Comparison of these results is made with boron cluster ion reactions with O2 and D2, as well as reactions with water of aluminum and silicon cluster ions. Some discussion is given of the thermochemistry for these reactions, and a possible problem with the thermochemical data in the BOD/DBO system is discussed.
The Universe at Moderate Redshift
NASA Technical Reports Server (NTRS)
Cen, Renyue; Ostriker, Jeremiah P.
1997-01-01
The report covers the work done in the past year across a wide range of fields, including: properties of clusters of galaxies; topological properties of galaxy distributions in terms of galaxy types; patterns of the gravitational nonlinear clustering process; development of a ray-tracing algorithm to study gravitational lensing by galaxies, clusters and large-scale structure, one application of which is the effect of weak gravitational lensing by large-scale structure on the determination of q_0; the origin of magnetic fields on galactic and cluster scales; the topological properties of Lyα clouds; the Lyα optical depth distribution; clustering properties of Lyα clouds; and a determination (lower bound) of Omega_b based on the observed Lyα forest flux distribution. In the coming year, we plan to continue the investigation of Lyα clouds using a larger dynamic range (about a factor of two) and better simulations (with more input physics included) than we have now. We will study the properties of galaxies on 1-100 h^-1 Mpc scales using our state-of-the-art large-scale galaxy formation simulations of various cosmological models, which will have a resolution about a factor of 5 (in each dimension) better than our current best simulations. We plan to study the properties of X-ray clusters using unprecedented, very high dynamic range (20,000) simulations, which will enable us to resolve the cores of clusters while keeping the simulation volume sufficiently large to ensure a statistically fair sample of the objects of interest. The details of last year's work are now described.
Identification of chronic rhinosinusitis phenotypes using cluster analysis.
Soler, Zachary M; Hyer, J Madison; Ramakrishnan, Viswanathan; Smith, Timothy L; Mace, Jess; Rudmik, Luke; Schlosser, Rodney J
2015-05-01
Current clinical classifications of chronic rhinosinusitis (CRS) have been largely defined based upon preconceived notions of factors thought to be important, such as polyp or eosinophil status. Unfortunately, these classification systems have little correlation with symptom severity or treatment outcomes. Unsupervised clustering can be used to identify phenotypic subgroups of CRS patients, describe clinical differences in these clusters and define simple algorithms for classification. A multi-institutional, prospective study of 382 patients with CRS who had failed initial medical therapy completed the Sino-Nasal Outcome Test (SNOT-22), Rhinosinusitis Disability Index (RSDI), Medical Outcomes Study Short Form-12 (SF-12), Pittsburgh Sleep Quality Index (PSQI), and Patient Health Questionnaire (PHQ-2). Objective measures of CRS severity included Brief Smell Identification Test (B-SIT), CT, and endoscopy scoring. All variables were reduced and unsupervised hierarchical clustering was performed. After clusters were defined, variations in medication usage were analyzed. Discriminant analysis was performed to develop a simplified, clinically useful algorithm for clustering. Clustering was largely determined by age, severity of patient reported outcome measures, depression, and fibromyalgia. CT and endoscopy varied somewhat among clusters. Traditional clinical measures, including polyp/atopic status, prior surgery, B-SIT and asthma, did not vary among clusters. A simplified algorithm based upon productivity loss, SNOT-22 score, and age predicted clustering with 89% accuracy. Medication usage among clusters did vary significantly. A simplified algorithm based upon hierarchical clustering is able to classify CRS patients and predict medication usage. Further studies are warranted to determine if such clustering predicts treatment outcomes. © 2015 ARS-AAOA, LLC.
NASA Astrophysics Data System (ADS)
Scarth, P.; Phinn, S. R.; Armston, J.; Lucas, R.
2015-12-01
Vertical plant profiles are important descriptors of canopy structure and are used to inform models of biomass, biodiversity and fire risk. In Australia, an approach has been developed to produce large area maps of vertical plant profiles by extrapolating waveform lidar estimates of vertical plant profiles from ICESat/GLAS using large area segmentation of ALOS PALSAR and Landsat satellite image products. The main assumption of this approach is that the vegetation height profiles are consistent across the segments defined from ALOS PALSAR and Landsat image products. More than 1500 field sites were used to develop an index of fractional cover using Landsat data. A time series of the green fraction was used to calculate the persistent green fraction continuously across the landscape. This was fused with ALOS PALSAR L-band Fine Beam Dual polarisation 25 m data and used to segment the Australian landscapes. K-means clustering then grouped the segments with similar cover and backscatter into approximately 1000 clusters. Where ICESat/GLAS footprints intersected these clusters, canopy profiles were extracted and aggregated to produce a mean vertical vegetation profile for each cluster that was used to derive mean canopy and understorey height, depth and density. Due to the large number of returns, these retrievals are near continuous across the landscape, enabling them to be used for inventory and modelling applications. To validate this product, a radiative transfer model was adapted to map directional gap probability from airborne waveform lidar datasets to retrieve vertical plant profiles. Comparison over several test sites shows excellent agreement, and work is underway to extend the analysis to improve national biomass mapping. The integration of the three datasets provides options for future operational monitoring of structure and aboveground biomass (AGB) across large areas for quantifying carbon dynamics, structural change and biodiversity.
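The clustering and aggregation steps described above can be sketched in Python as follows. This is a toy illustration, not the production workflow: the two features per segment (persistent green fraction and backscatter in dB), the deterministic initialisation, the unscaled Euclidean distance, and all numbers are assumptions made for the example.

```python
import math

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm on tuples of floats; returns (labels, centers)."""
    n = len(points)
    # Assumed deterministic initialisation: evenly spaced input points.
    centers = [points[(i * n) // k] for i in range(k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

def mean_profiles(labels, profiles):
    """Average the lidar profiles of the segments in each cluster.

    `profiles` maps a segment index to its vertical profile; only segments
    intersected by a lidar footprint appear in it.
    """
    grouped = {}
    for segment, profile in profiles.items():
        grouped.setdefault(labels[segment], []).append(profile)
    return {c: [sum(col) / len(col) for col in zip(*ps)]
            for c, ps in grouped.items()}

# Hypothetical segments: (persistent green fraction, backscatter in dB).
segments = [(0.10, -15.0), (0.12, -14.5), (0.09, -15.2),
            (0.80, -8.0), (0.82, -7.5), (0.79, -8.3)]
footprints = {0: [1.0, 0.0], 2: [3.0, 2.0], 4: [0.0, 4.0]}
labels, centers = kmeans(segments, k=2)
print(mean_profiles(labels, footprints))
```

Every segment in a cluster would then inherit that cluster's mean profile, which is how sparse lidar footprints could be extrapolated to wall-to-wall coverage. In practice the features would likely need rescaling so that neither one dominates the distance.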
Long-distance continuous-variable quantum key distribution by controlling excess noise
NASA Astrophysics Data System (ADS)
Huang, Duan; Huang, Peng; Lin, Dakai; Zeng, Guihua
2016-01-01
Quantum cryptography founded on the laws of physics could revolutionize the way in which communication information is protected. Significant progress in long-distance quantum key distribution based on discrete variables has made secure quantum communication available in real-world conditions. However, the alternative approach implemented with continuous variables has not yet reached a secure distance beyond 100 km. Here, we overcome the previous range limitation by controlling system excess noise and report such a long-distance continuous-variable quantum key distribution experiment. Our result paves the way to large-scale secure quantum communication with continuous variables and serves as a stepping stone in the quest for a quantum network.
Soil moisture response to snowmelt and rainfall in a Sierra Nevada mixed-conifer forest
Roger C. Bales; Jan W. Hopmans; Anthony T. O’Geen; Matthew Meadows; Peter C. Hartsough; Peter Kirchner; Carolyn T. Hunsaker; Dylan Beaudette
2011-01-01
Using data from a water-balance instrument cluster with spatially distributed sensors we determined the magnitude and within-catchment variability of components of the catchment-scale water balance, focusing on the relationship of seasonal evapotranspiration to changes in snowpack and soil moisture storage. Co-located, continuous snow depth and soil moisture...
Setegn, Tesfaye; Lakew, Yihunie; Deribe, Kebede
2016-01-01
Background Female genital mutilation (FGM) is a common traditional practice in developing nations including Ethiopia. It poses complex and serious long-term health risks for women and girls and can lead to death. In Ethiopia, the geographic distribution and factors associated with FGM practices are poorly understood. Therefore, we assessed the spatial distribution and factors associated with FGM among reproductive age women in the country. Method We used population based national representative surveys. Data from two (2000 and 2005) Ethiopian demographic and health surveys (EDHS) were used in this analysis. Briefly, EDHS used a stratified, two-stage cluster sampling design. A total of 15,367 (from EDHS 2000) and 14,070 (from EDHS 2005) women of reproductive age (15–49 years) were included in the analysis. Three outcome variables were used (prevalence of FGM among women, prevalence of FGM among daughters and support for the continuation of FGM). The data were weighted and descriptive statistics (percentage change), bivariate and multivariable logistic regression analyses were carried out. Multicollinearity of variables was assessed using variance inflation factors (VIF) with a reference value of 10 before interpreting the final output. The geographic variation and clustering of weighted FGM prevalence were analyzed and visualized on maps using ArcGIS. Z-scores were used to assess the statistical difference of geographic clustering of FGM prevalence spots. Result The trend of FGM weighted prevalence has been decreasing. Being wealthy, Muslim and in higher age categories are associated with increased odds of FGM among women. Similarly, daughters from Muslim women have increased odds of experiencing FGM. Women in the higher age categories have increased odds of having daughters who experience FGM. The odds of FGM among daughters decrease with increased maternal education. 
Mass media exposure, being wealthy and higher paternal and maternal education are associated with decreased odds of women’s support of FGM continuation. FGM prevalence and geographic clustering showed variation across regions in Ethiopia. Conclusion Individual, economic, socio-demographic, religious and cultural factors played major roles in the existing practice and continuation of FGM. The significant geographic clustering of FGM was observed across regions in Ethiopia. Therefore, targeted and integrated interventions involving religious leaders in high FGM prevalence spot clusters and addressing the socio-economic and geographic inequalities are recommended to eliminate FGM. PMID:26741488
Setegn, Tesfaye; Lakew, Yihunie; Deribe, Kebede
2016-01-01
Female genital mutilation (FGM) is a common traditional practice in developing nations including Ethiopia. It poses complex and serious long-term health risks for women and girls and can lead to death. In Ethiopia, the geographic distribution and factors associated with FGM practices are poorly understood. Therefore, we assessed the spatial distribution and factors associated with FGM among reproductive age women in the country. We used population based national representative surveys. Data from two (2000 and 2005) Ethiopian demographic and health surveys (EDHS) were used in this analysis. Briefly, EDHS used a stratified, two-stage cluster sampling design. A total of 15,367 (from EDHS 2000) and 14,070 (from EDHS 2005) women of reproductive age (15-49 years) were included in the analysis. Three outcome variables were used (prevalence of FGM among women, prevalence of FGM among daughters and support for the continuation of FGM). The data were weighted and descriptive statistics (percentage change), bivariate and multivariable logistic regression analyses were carried out. Multicollinearity of variables was assessed using variance inflation factors (VIF) with a reference value of 10 before interpreting the final output. The geographic variation and clustering of weighted FGM prevalence were analyzed and visualized on maps using ArcGIS. Z-scores were used to assess the statistical difference of geographic clustering of FGM prevalence spots. The trend of FGM weighted prevalence has been decreasing. Being wealthy, Muslim and in higher age categories are associated with increased odds of FGM among women. Similarly, daughters from Muslim women have increased odds of experiencing FGM. Women in the higher age categories have increased odds of having daughters who experience FGM. The odds of FGM among daughters decrease with increased maternal education. 
Mass media exposure, being wealthy and higher paternal and maternal education are associated with decreased odds of women's support of FGM continuation. FGM prevalence and geographic clustering showed variation across regions in Ethiopia. Individual, economic, socio-demographic, religious and cultural factors played major roles in the existing practice and continuation of FGM. The significant geographic clustering of FGM was observed across regions in Ethiopia. Therefore, targeted and integrated interventions involving religious leaders in high FGM prevalence spot clusters and addressing the socio-economic and geographic inequalities are recommended to eliminate FGM.
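The multicollinearity check described in the abstract above (variance inflation factors with a reference value of 10) can be illustrated with a minimal sketch. This is not the authors' code: the function name `variance_inflation_factors` and the synthetic predictors are assumptions made for illustration, and only the threshold of 10 comes from the abstract.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (n_samples x n_features).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing
    column j on all other columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        vifs[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return vifs

# Synthetic (hypothetical) predictors: `c` is nearly a copy of `a`.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.05 * rng.normal(size=500)

vifs = variance_inflation_factors(np.column_stack([a, b, c]))
flagged = vifs > 10  # reference value of 10, as in the abstract
```

In this toy setup the two nearly collinear predictors exceed the reference value while the independent one stays close to 1, which is the pattern such a check is meant to expose before the regression output is interpreted.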
Grouping of Bulgarian wines according to grape variety by using statistical methods
NASA Astrophysics Data System (ADS)
Milev, M.; Nikolova, Kr.; Ivanova, Ir.; Minkova, St.; Evtimov, T.; Krustev, St.
2017-12-01
68 different types of Bulgarian wines were studied with respect to 9 optical parameters: color parameters in the XYZ and CIE Lab color systems, lightness, hue angle, chroma, fluorescence intensity and emission wavelength. The main objective of this research is to use hierarchical cluster analysis to evaluate the similarity and the distance between the examined types of Bulgarian wines and to group them based on physical parameters. We have found that the wines are grouped in clusters on the basis of the degree of identity between them. There are two main clusters, each with two subclusters. The first one contains white wines and Sira, the second contains red wines and rose. The results of the cluster analysis are presented graphically by a dendrogram. The other statistical technique used is factor analysis performed by the Method of Principal Components (PCA). The aim is to reduce the large number of variables to a few factors by grouping the correlated variables into one factor and subdividing the noncorrelated variables into different factors. Moreover, the factor analysis made it possible to determine the parameters with the greatest influence over the distribution of samples in different clusters. In our study, after rotation of the factors with the Varimax method, the parameters were combined into two factors, which explain about 80% of the total variation. The first one explains 61.49% and correlates with color characteristics; the second one explains 18.34% of the variation and correlates with the parameters connected with fluorescence spectroscopy.
Daub, Carsten O; Steuer, Ralf; Selbig, Joachim; Kloska, Sebastian
2004-01-01
Background The information theoretic concept of mutual information provides a general framework to evaluate dependencies between variables. In the context of the clustering of genes with similar patterns of expression it has been suggested as a general quantity of similarity to extend commonly used linear measures. Since mutual information is defined in terms of discrete variables, its application to continuous data requires the use of binning procedures, which can lead to significant numerical errors for datasets of small or moderate size. Results In this work, we propose a method for the numerical estimation of mutual information from continuous data. We investigate the characteristic properties arising from the application of our algorithm and show that our approach outperforms commonly used algorithms: The significance, as a measure of the power of distinction from random correlation, is significantly increased. This concept is subsequently illustrated on two large-scale gene expression datasets and the results are compared to those obtained using other similarity measures. A C++ source code of our algorithm is available for non-commercial use from kloska@scienion.de upon request. Conclusion The utilisation of mutual information as similarity measure enables the detection of non-linear correlations in gene expression datasets. Frequently applied linear correlation measures, which are often used on an ad-hoc basis without further justification, are thereby extended. PMID:15339346
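The binning problem mentioned in the abstract above can be made concrete with a short sketch. It implements the ordinary histogram ("plug-in") estimator that the abstract criticizes, not the authors' proposed method (whose C++ source is not reproduced here). The function name `binned_mutual_information`, the choice of 10 bins, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def binned_mutual_information(x, y, bins=10):
    """Plug-in mutual information estimate (in nats) from a 2D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint bin probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    mask = pxy > 0                         # skip empty cells (0 * log 0 = 0)
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())

rng = np.random.default_rng(1)

def mean_mi_independent(n, trials=100):
    """Average estimate for independent Gaussian samples (true MI is 0)."""
    return float(np.mean([
        binned_mutual_information(rng.normal(size=n), rng.normal(size=n))
        for _ in range(trials)
    ]))

mi_small = mean_mi_independent(50)     # small dataset: strong upward bias
mi_large = mean_mi_independent(5000)   # large dataset: bias mostly gone

# A purely non-linear dependence that a linear measure largely misses.
x = rng.normal(size=5000)
y = x**2 + 0.1 * rng.normal(size=5000)
pearson = float(np.corrcoef(x, y)[0, 1])
mi_nonlinear = binned_mutual_information(x, y)
```

Comparing `mi_small` with `mi_large` shows the finite-sample bias of naive binning on independent data, and comparing `pearson` with `mi_nonlinear` shows why mutual information is attractive as a similarity measure for non-linear relationships.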
Hundreds of new cluster candidates in the VISTA Variables in the Vía Láctea survey DR1
NASA Astrophysics Data System (ADS)
Barbá, R. H.; Roman-Lopes, A.; Nilo Castellón, J. L.; Firpo, V.; Minniti, D.; Lucas, P.; Emerson, J. P.; Hempel, M.; Soto, M.; Saito, R. K.
2015-09-01
Context. VISTA variables in the Vía Láctea is an ESO Public survey dedicated to scanning the bulge and an adjacent portion of the Galactic disk in the fourth quadrant using the VISTA telescope and its near-infrared camera VIRCAM. One of the leading goals of the VVV survey is to contribute to knowledge of the star cluster population of the Milky Way. Aims: To improve the census of Galactic star clusters, we performed a systematic and careful scan of the JHKs images of the Galactic plane section of the VVV survey. Methods: Our detection procedure is based on a combination of stellar density maps and visual inspection of promising features in the J-, H-, and KS-band images. The material examined consists of VVV JHKS color-composite images corresponding to Data Release 1 of VVV. Results: We report the discovery of 493 new infrared star cluster candidates. The analysis of the spatial distribution shows that the clusters are very concentrated in the Galactic plane, presenting some local maxima around the position of large star-forming complexes, such as G305, RCW 95, and RCW 106. The vast majority of the new star cluster candidates are quite compact and generally surrounded by bright and/or dark nebulosities. IRAS point sources are associated with 59% of the sample, while 88% are associated with MSX point sources. GLIMPSE 8 μm images of the cluster candidates show a variety of morphologies, with 292 clusters dominated by knotty sources, while 361 clusters show some kind of nebulosity in this wavelength regime. Spatial cross-correlation with young stellar objects, masers, and extended green-object catalogs suggests that a large sample of the new cluster candidates are extremely young. In particular, 104 star clusters associated with methanol masers are excellent candidates for ongoing massive star formation. 
Also, there is a special set of sixteen cluster candidates that present clear signposts of star-forming activity, being simultaneously associated with dark nebulae, young stellar objects, extended green objects, and masers. Full Tables 1-3 are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/581/A120
Detecting most influencing courses on students grades using block PCA
NASA Astrophysics Data System (ADS)
Othman, Osama H.; Gebril, Rami Salah
2014-12-01
One of the modern solutions adopted in dealing with the problem of a large number of variables in statistical analyses is Block Principal Component Analysis (Block PCA). This modified technique can be used to reduce the vertical dimension (variables) of the data matrix X (n × p) by selecting a smaller number of variables (say m) containing most of the statistical information. These selected variables can then be employed in further investigations and analyses. Block PCA is an adapted multistage technique of the original PCA. It involves the application of Cluster Analysis (CA) and variable selection throughout sub principal components scores (PC's). The application of Block PCA in this paper is a modified version of the original work of Liu et al (2002). The main objective was to apply PCA to each group of variables (established using cluster analysis), instead of involving the whole large pack of variables, which proved to be unreliable. In this work, Block PCA is used to reduce the size of a huge data matrix ((n = 41) × (p = 251)) consisting of the Grade Point Average (GPA) of the students in 251 courses (variables) in the faculty of science in Benghazi University. In other words, we are constructing a smaller analytical data matrix of the GPA's of the students with fewer variables containing most of the variation (statistical information) in the original database. By applying Block PCA, 12 courses were found to `absorb' most of the variation or influence from the original data matrix, and hence are worth keeping for future statistical exploration and analytical studies. In addition, the course Independent Study (Math.) was found to be the most influential course on students' GPA among the 12 selected courses.
Akar, Servet; Solmaz, Dilek; Kasifoglu, Timucin; Bilge, Sule Yasar; Sari, Ismail; Gumus, Zeynep Zehra; Tunca, Mehmet
2016-02-01
The aim of this study was to evaluate whether there are clinical subgroups that may have different prognoses among FMF patients. The cumulative clinical features of a large group of FMF patients [1168 patients, 593 (50.8%) male, mean age 35.3 years (s.d. 12.4)] were studied. To analyse our data and identify groups of FMF patients with similar clinical characteristics, a two-step cluster analysis using log-likelihood distance measures was performed. For clustering the FMF patients, we evaluated the following variables: gender, current age, age at symptom onset, age at diagnosis, presence of major clinical features, variables related with therapy and family history for FMF, renal failure and carriage of M694V. Three distinct groups of FMF patients were identified. Cluster 1 was characterized by a high prevalence of arthritis, pleuritis, erysipelas-like erythema (ELE) and febrile myalgia. The dosage of colchicine and the frequency of amyloidosis were lower in cluster 1. Patients in cluster 2 had an earlier age of disease onset and diagnosis. M694V carriage and amyloidosis prevalence were the highest in cluster 2. This group of patients was using the highest dose of colchicine. Patients in cluster 3 had the lowest prevalence of arthritis, ELE and febrile myalgia. The frequencies of M694V carriage and amyloidosis were lower in cluster 3 than the overall FMF patients. Non-response to colchicine was also slightly lower in cluster 3. Patients with FMF can be clustered into distinct patterns of clinical and genetic manifestations and these patterns may have different prognostic significance. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology.
Costa, Patrício Soares; Santos, Nadine Correia; Cunha, Pedro; Cotter, Jorge; Sousa, Nuno
2013-01-01
The main focus of this study was to illustrate the applicability of multiple correspondence analysis (MCA) in detecting and representing underlying structures in large datasets used to investigate cognitive ageing. Principal component analysis (PCA) was used to obtain main cognitive dimensions, and MCA was used to detect and explore relationships between cognitive, clinical, physical, and lifestyle variables. Two PCA dimensions were identified (general cognition/executive function and memory), and two MCA dimensions were retained. Poorer cognitive performance was associated with older age, less school years, unhealthier lifestyle indicators, and presence of pathology. The first MCA dimension indicated the clustering of general/executive function and lifestyle indicators and education, while the second association was between memory and clinical parameters and age. The clustering analysis with object scores method was used to identify groups sharing similar characteristics. The weaker cognitive clusters in terms of memory and executive function comprised individuals with characteristics contributing to a higher MCA dimensional mean score (age, less education, and presence of indicators of unhealthier lifestyle habits and/or clinical pathologies). MCA provided a powerful tool to explore complex ageing data, covering multiple and diverse variables, showing if a relationship exists and how variables are related, and offering statistical results that can be seen both analytically and visually.
Dipnall, J F; Pasco, J A; Berk, M; Williams, L J; Dodd, S; Jacka, F N; Meyer, D
2017-01-01
Key lifestyle-environ risk factors are operative for depression, but it is unclear how risk factors cluster. Machine-learning (ML) algorithms exist that learn, extract, identify and map underlying patterns to identify groupings of depressed individuals without constraints. The aim of this research was to use a large epidemiological study to identify and characterise depression clusters through "Graphing lifestyle-environs using machine-learning methods" (GLUMM). Two ML algorithms were implemented: unsupervised Self-organised mapping (SOM) to create GLUMM clusters and a supervised boosted regression algorithm to describe clusters. Ninety-six "lifestyle-environ" variables were used from the National health and nutrition examination study (2009-2010). Multivariate logistic regression validated clusters and controlled for possible sociodemographic confounders. The SOM identified two GLUMM cluster solutions. These solutions contained one dominant depressed cluster (GLUMM5-1, GLUMM7-1). Equal proportions of members in each cluster rated as highly depressed (17%). Alcohol consumption and demographics validated clusters. Boosted regression identified GLUMM5-1 as more informative than GLUMM7-1. Members were more likely to: have problems sleeping; unhealthy eating; ≤2 years in their home; an old home; perceive themselves underweight; exposed to work fumes; experienced sex at ≤14 years; not perform moderate recreational activities. A positive relationship between GLUMM5-1 (OR: 7.50, P<0.001) and GLUMM7-1 (OR: 7.88, P<0.001) with depression was found, with significant interactions with those married/living with partner (P=0.001). Using ML based GLUMM to form ordered depressive clusters from multitudinous lifestyle-environ variables enabled a deeper exploration of the heterogeneous data to uncover better understandings into relationships between the complex mental health factors. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Frank, Stefan; Roberts, Daniel E; Rikvold, Per Arne
2005-02-08
The influence of nearest-neighbor diffusion on the decay of a metastable low-coverage phase (monolayer adsorption) in a square lattice-gas model of electrochemical metal deposition is investigated by kinetic Monte Carlo simulations. The phase-transformation dynamics are compared to the well-established Kolmogorov-Johnson-Mehl-Avrami theory. The phase transformation is accelerated by diffusion, but remains in accord with the theory for continuous nucleation up to moderate diffusion rates. At very high diffusion rates the phase-transformation kinetics show a crossover to instantaneous nucleation. Then, the probability of medium-sized clusters is reduced in favor of large clusters. Upon reversal of the supersaturation, the adsorbate desorbs, but large clusters still tend to grow during the initial stages of desorption. Calculation of the free energy of subcritical clusters by enumeration of lattice animals yields a quasiequilibrium distribution which is in reasonable agreement with the simulation results. This is an improvement relative to classical droplet theory, which fails to describe the distributions, since the macroscopic surface tension is a poor approximation for small clusters.
Friesen, Melissa C; Shortreed, Susan M; Wheeler, David C; Burstyn, Igor; Vermeulen, Roel; Pronk, Anjoeka; Colt, Joanne S; Baris, Dalsu; Karagas, Margaret R; Schwenn, Molly; Johnson, Alison; Armenti, Karla R; Silverman, Debra T; Yu, Kai
2015-05-01
Rule-based expert exposure assessment based on questionnaire response patterns in population-based studies improves the transparency of the decisions. The number of unique response patterns, however, can be nearly equal to the number of jobs. An expert may reduce the number of patterns that need assessment using expert opinion, but each expert may identify different patterns of responses that identify an exposure scenario. Here, hierarchical clustering methods are proposed as a systematic data reduction step to reproducibly identify similar questionnaire response patterns prior to obtaining expert estimates. As a proof-of-concept, we used hierarchical clustering methods to identify groups of jobs (clusters) with similar responses to diesel exhaust-related questions and then evaluated whether the jobs within a cluster had similar (previously assessed) estimates of occupational diesel exhaust exposure. Using the New England Bladder Cancer Study as a case study, we applied hierarchical cluster models to the diesel-related variables extracted from the occupational history and job- and industry-specific questionnaires (modules). Cluster models were separately developed for two subsets: (i) 5395 jobs with ≥1 variable extracted from the occupational history indicating a potential diesel exposure scenario, but without a module with diesel-related questions; and (ii) 5929 jobs with both occupational history and module responses to diesel-relevant questions. For each subset, we varied the numbers of clusters extracted from the cluster tree developed for each model from 100 to 1000 groups of jobs. Using previously made estimates of the probability (ordinal), intensity (µg m(-3) respirable elemental carbon), and frequency (hours per week) of occupational exposure to diesel exhaust, we examined the similarity of the exposure estimates for jobs within the same cluster in two ways. 
First, the clusters' homogeneity (defined as >75% with the same estimate) was examined compared to a dichotomized probability estimate (<5 versus ≥5%; <50 versus ≥50%). Second, for the ordinal probability metric and continuous intensity and frequency metrics, we calculated the intraclass correlation coefficients (ICCs) between each job's estimate and the mean estimate for all jobs within the cluster. Within-cluster homogeneity increased when more clusters were used. For example, ≥80% of the clusters were homogeneous when 500 clusters were used. Similarly, ICCs were generally above 0.7 when ≥200 clusters were used, indicating minimal within-cluster variability. The most within-cluster variability was observed for the frequency metric (ICCs from 0.4 to 0.8). We estimated that using an expert to assign exposure at the cluster-level assignment and then to review each job in non-homogeneous clusters would require ~2000 decisions per expert, in contrast to evaluating 4255 unique questionnaire patterns or 14983 individual jobs. This proof-of-concept shows that using cluster models as a data reduction step to identify jobs with similar response patterns prior to obtaining expert ratings has the potential to aid rule-based assessment by systematically reducing the number of exposure decisions needed. While promising, additional research is needed to quantify the actual reduction in exposure decisions and the resulting homogeneity of exposure estimates within clusters for an exposure assessment effort that obtains cluster-level expert assessments as part of the assessment process. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2014.
Bansal, Ravi; Peterson, Bradley S
2018-06-01
Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control for false positive findings in these multiple hypotheses testing. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis. 
Nonparametric methods, in contrast, estimated distributions from those large clusters, and therefore, by construct rejected the large clusters as false positives at the nominal FWERs. Those rejected clusters were outlying values in the distribution of cluster size but cannot be distinguished from true positive findings without further analyses, including assessing whether fMRI signal in those regions correlates with other clinical, behavioral, or cognitive measures. Rejecting the large clusters, however, significantly reduced the statistical power of nonparametric methods in detecting true findings compared with parametric methods, which would have detected most true findings that are essential for making valid biological inferences in MRI data. Parametric analyses, in contrast, detected most true findings while generating relatively few false positives: on average, less than one of those very large clusters would be deemed a true finding in each brain-wide analysis. We therefore recommend the continued use of parametric methods that model nonstationary smoothness for cluster-level, familywise control of false positives, particularly when using a Cluster Defining Threshold of 2.5 or higher, and subsequently assessing rigorously the biological plausibility of the findings, even for large clusters. Finally, because nonparametric methods yielded a large reduction in statistical power to detect true positive findings, we conclude that the modest reduction in false positive findings that nonparametric analyses afford does not warrant a re-analysis of previously published fMRI studies using nonparametric techniques. Copyright © 2018 Elsevier Inc. All rights reserved.
Clustering methods for the optimization of atomic cluster structure
NASA Astrophysics Data System (ADS)
Bagattini, Francesco; Schoen, Fabio; Tigli, Luca
2018-04-01
In this paper, we propose a revised global optimization method and apply it to large scale cluster conformation problems. In the 1990s, the so-called clustering methods were considered among the most efficient general purpose global optimization techniques; however, their usage has quickly declined in recent years, mainly due to the inherent difficulties of clustering approaches in large dimensional spaces. Inspired from the machine learning literature, we redesigned clustering methods in order to deal with molecular structures in a reduced feature space. Our aim is to show that by suitably choosing a good set of geometrical features coupled with a very efficient descent method, an effective optimization tool is obtained which is capable of finding, with a very high success rate, all known putative optima for medium size clusters without any prior information, both for Lennard-Jones and Morse potentials. The main result is that, beyond being a reliable approach, the proposed method, based on the idea of starting a computationally expensive deep local search only when it seems worth doing so, is capable of saving a huge amount of searches with respect to an analogous algorithm which does not employ a clustering phase. In this paper, we are not claiming the superiority of the proposed method compared to specific, refined, state-of-the-art procedures, but rather indicating a quite straightforward way to save local searches by means of a clustering scheme working in a reduced variable space, which might prove useful when included in many modern methods.
Pattern selection and super-patterns in the bounded confidence model
Ben-Naim, E.; Scheel, A.
2015-10-26
We study pattern formation in the bounded confidence model of opinion dynamics. In this random process, opinion is quantified by a single variable. Two agents may interact and reach a fair compromise, but only if their difference of opinion falls below a fixed threshold. Starting from a uniform distribution of opinions with compact support, a traveling wave forms and it propagates from the domain boundary into the unstable uniform state. Consequently, the system reaches a steady state with isolated clusters that are separated by distance larger than the interaction range. These clusters form a quasi-periodic pattern where the sizes of the clusters and the separations between them are nearly constant. We obtain analytically the average separation between clusters L. Interestingly, there are also very small quasi-periodic modulations in the size of the clusters. Furthermore, the spatial periods of these modulations are a series of integers that follow from the continued-fraction representation of the irrational average separation L.
Pattern selection and super-patterns in the bounded confidence model
NASA Astrophysics Data System (ADS)
Ben-Naim, E.; Scheel, A.
2015-10-01
We study pattern formation in the bounded confidence model of opinion dynamics. In this random process, opinion is quantified by a single variable. Two agents may interact and reach a fair compromise, but only if their difference of opinion falls below a fixed threshold. Starting from a uniform distribution of opinions with compact support, a traveling wave forms and it propagates from the domain boundary into the unstable uniform state. Consequently, the system reaches a steady state with isolated clusters that are separated by distance larger than the interaction range. These clusters form a quasi-periodic pattern where the sizes of the clusters and the separations between them are nearly constant. We obtain analytically the average separation between clusters L. Interestingly, there are also very small quasi-periodic modulations in the size of the clusters. The spatial periods of these modulations are a series of integers that follow from the continued-fraction representation of the irrational average separation L.
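The interaction rule described in the abstract above (two agents compromise only if their opinions differ by less than a fixed threshold) can be sketched as a small agent-based simulation. This is an illustrative sketch, not the authors' analysis, which is analytical. The function names, the number of agents, the opinion range, the number of steps, and the gap used to group opinions into clusters are all assumed values.

```python
import numpy as np

def simulate_bounded_confidence(n_agents=1000, width=10.0, threshold=1.0,
                                n_steps=200_000, seed=0):
    """Random pairwise compromise dynamics; returns (initial, final) opinions."""
    rng = np.random.default_rng(seed)
    x0 = rng.uniform(0.0, width, size=n_agents)   # uniform, compact support
    x = x0.copy()
    pairs = rng.integers(0, n_agents, size=(n_steps, 2))
    for i, j in pairs:
        # Agents interact only if their opinions are close enough.
        if i != j and abs(x[i] - x[j]) < threshold:
            x[i] = x[j] = 0.5 * (x[i] + x[j])     # fair compromise
    return x0, x

def cluster_centers(x, gap=0.5):
    """Group sorted opinions into clusters split wherever neighbours differ by > gap."""
    xs = np.sort(x)
    breaks = np.nonzero(np.diff(xs) > gap)[0] + 1
    return np.array([group.mean() for group in np.split(xs, breaks)])

x0, x = simulate_bounded_confidence()
centers = cluster_centers(x)
separations = np.diff(centers)   # spacing between neighbouring clusters
```

Each compromise conserves the mean opinion and can only reduce the variance, so the population contracts into isolated groups. With a finite number of steps a few stragglers may remain between clusters, so `separations` should be read as a rough indication of the spacing rather than a measurement of L.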
Friesen, Melissa C.; Shortreed, Susan M.; Wheeler, David C.; Burstyn, Igor; Vermeulen, Roel; Pronk, Anjoeka; Colt, Joanne S.; Baris, Dalsu; Karagas, Margaret R.; Schwenn, Molly; Johnson, Alison; Armenti, Karla R.; Silverman, Debra T.; Yu, Kai
2015-01-01
Objectives: Rule-based expert exposure assessment based on questionnaire response patterns in population-based studies improves the transparency of the decisions. The number of unique response patterns, however, can be nearly equal to the number of jobs. An expert may reduce the number of patterns that need assessment using expert opinion, but each expert may identify different patterns of responses that identify an exposure scenario. Here, hierarchical clustering methods are proposed as a systematic data reduction step to reproducibly identify similar questionnaire response patterns prior to obtaining expert estimates. As a proof-of-concept, we used hierarchical clustering methods to identify groups of jobs (clusters) with similar responses to diesel exhaust-related questions and then evaluated whether the jobs within a cluster had similar (previously assessed) estimates of occupational diesel exhaust exposure. Methods: Using the New England Bladder Cancer Study as a case study, we applied hierarchical cluster models to the diesel-related variables extracted from the occupational history and job- and industry-specific questionnaires (modules). Cluster models were separately developed for two subsets: (i) 5395 jobs with ≥1 variable extracted from the occupational history indicating a potential diesel exposure scenario, but without a module with diesel-related questions; and (ii) 5929 jobs with both occupational history and module responses to diesel-relevant questions. For each subset, we varied the numbers of clusters extracted from the cluster tree developed for each model from 100 to 1000 groups of jobs. Using previously made estimates of the probability (ordinal), intensity (µg m−3 respirable elemental carbon), and frequency (hours per week) of occupational exposure to diesel exhaust, we examined the similarity of the exposure estimates for jobs within the same cluster in two ways. 
First, the clusters’ homogeneity (defined as >75% with the same estimate) was examined compared to a dichotomized probability estimate (<5 versus ≥5%; <50 versus ≥50%). Second, for the ordinal probability metric and continuous intensity and frequency metrics, we calculated the intraclass correlation coefficients (ICCs) between each job’s estimate and the mean estimate for all jobs within the cluster. Results: Within-cluster homogeneity increased when more clusters were used. For example, ≥80% of the clusters were homogeneous when 500 clusters were used. Similarly, ICCs were generally above 0.7 when ≥200 clusters were used, indicating minimal within-cluster variability. The most within-cluster variability was observed for the frequency metric (ICCs from 0.4 to 0.8). We estimated that using an expert to assign exposure at the cluster-level assignment and then to review each job in non-homogeneous clusters would require ~2000 decisions per expert, in contrast to evaluating 4255 unique questionnaire patterns or 14983 individual jobs. Conclusions: This proof-of-concept shows that using cluster models as a data reduction step to identify jobs with similar response patterns prior to obtaining expert ratings has the potential to aid rule-based assessment by systematically reducing the number of exposure decisions needed. While promising, additional research is needed to quantify the actual reduction in exposure decisions and the resulting homogeneity of exposure estimates within clusters for an exposure assessment effort that obtains cluster-level expert assessments as part of the assessment process. PMID:25477475
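The homogeneity criterion quoted above (a cluster counts as homogeneous when more than 75% of its jobs share the same estimate) can be expressed as a short check. This is a sketch under that stated definition only; the function names and the example labels are hypothetical, and the study's actual software is not described in the abstract.

```python
from collections import Counter

def is_homogeneous(estimates, cutoff=0.75):
    """True if strictly more than `cutoff` of the jobs share one estimate."""
    counts = Counter(estimates)
    most_common_count = max(counts.values())
    return most_common_count / len(estimates) > cutoff

def fraction_homogeneous(clusters, cutoff=0.75):
    """Share of clusters that satisfy the homogeneity criterion.

    `clusters` maps a cluster id to the list of per-job estimates in it.
    """
    flags = [is_homogeneous(jobs, cutoff) for jobs in clusters.values()]
    return sum(flags) / len(flags)

if __name__ == "__main__":
    # Hypothetical dichotomized probability estimates for two clusters.
    example = {
        "cluster_a": ["<5%", "<5%", "<5%", "<5%", ">=5%"],
        "cluster_b": ["<5%", "<5%", ">=5%", ">=5%"],
    }
    print(fraction_homogeneous(example))
```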
Star cluster formation history along the minor axis of the Large Magellanic Cloud
NASA Astrophysics Data System (ADS)
Piatti, Andrés E.; Cole, Andrew A.; Emptage, Bryn
2018-01-01
We analysed Washington CMT1 photometry of star clusters located along the minor axis of the Large Magellanic Cloud (LMC), from the LMC optical centre up to ∼39° outwards to the North-West. The data base was exploited in order to search for new star cluster candidates, to produce cluster CMDs cleaned from field star contamination and to derive age estimates for a statistically complete cluster sample. We confirmed that 146 star cluster candidates are genuine physical systems, and concluded that an overall ∼30 per cent of catalogued clusters in the surveyed regions are unlikely to be true physical systems. We did not find any new cluster candidates in the outskirts of the LMC (deprojected distance ≳ 8°). The derived ages of the studied clusters are in the range 7.2 < log(t yr⁻¹) ≤ 9.4, with the sole exception of the globular cluster NGC 1786 (log(t yr⁻¹) = 10.10). We also calculated the cluster frequency for each region, from which we confirmed previously proposed outside-in formation scenarios. In addition, we found that the outer LMC fields show a sudden episode of cluster formation (log(t yr⁻¹) ∼7.8-7.9) which continued until log(t yr⁻¹) ∼7.3 only in the outermost LMC region. We link these features to the first pericentre passage of the LMC to the Milky Way (MW), which could have triggered cluster formation due to ram pressure interaction between the LMC and MW halo.
Faster sequence homology searches by clustering subsequences.
Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka
2015-04-15
Sequence homology searches are used in various fields. New sequencing technologies produce huge amounts of sequence data, which continuously increase the size of sequence databases. As a result, homology searches require large amounts of computational time, especially for metagenomic analysis. We developed a fast homology search method based on database subsequence clustering, and implemented it as GHOSTZ. This method clusters similar subsequences from a database to perform an efficient seed search and ungapped extension by reducing alignment candidates based on triangle inequality. The database subsequence clustering technique achieved an ∼2-fold increase in speed without a large decrease in search sensitivity. When we measured with metagenomic data, GHOSTZ is ∼2.2-2.8 times faster than RAPSearch and is ∼185-261 times faster than BLASTX. The source code is freely available for download at http://www.bi.cs.titech.ac.jp/ghostz/. Contact: akiyama@cs.titech.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Tian, Jiting; Zhou, Wei; Feng, Qijie; Zheng, Jian
2018-03-01
An unsolved problem in research on sputtering from metals induced by energetic large cluster ions is that molecular dynamics (MD) simulations often produce sputtering yields much higher than experimental results. Unlike previous simulations, which considered only elastic atomic interactions (nuclear stopping), here we incorporate inelastic electron-atom interactions (electronic stopping, ES) into MD simulations using a friction model. In this way we have simulated continuous 45° impacts of 10-20 keV C60 on a Ag(111) surface, and found that the calculated sputtering yields can be very close to the experimental results when the model parameter is appropriately assigned. Conversely, when we ignore the effect of ES, the yields are much higher, just as in the previous studies. We further extend our research to the sputtering of Au induced by continuous keV C60 or Ar100 bombardments, and obtain quite similar results. Our study indicates that the gap between the experimental and the simulated sputtering yields is probably caused by the neglect of ES in the simulations, and that a careful treatment of this issue is important for simulations of cluster-ion-induced sputtering, especially for those aiming to compare with experiments.
Wetzel, Hermann
2006-01-01
In a large number of mostly retrospective association studies, a statistical relationship between volume and quality of health care has been reported. However, the relevance of these results is frequently limited by methodological shortcomings. In this article, criteria for the evidence and definition of thresholds for volume-outcome relations are proposed, e.g. the specification of relevant outcomes for quality indicators, analysis of volume as a continuous variable with an adequate case-mix and risk adjustment, accounting for cluster effects and considering mathematical models for the derivation of cut-off values. Moreover, volume thresholds are regarded as surrogate parameters for the indirect classification of the quality of care, whose diagnostic validity and effectiveness in improving health care quality need to be evaluated in prospective studies.
Hierarchical clusters of phytoplankton variables in dammed water bodies
NASA Astrophysics Data System (ADS)
Silva, Eliana Costa e.; Lopes, Isabel Cristina; Correia, Aldina; Gonçalves, A. Manuela
2017-06-01
In this paper a dataset containing biological variables of the water column of several Portuguese reservoirs is analyzed. Hierarchical cluster analysis is used to obtain clusters of phytoplankton variables of the phylum Cyanophyta, with the objective of validating the classification of Portuguese reservoirs previously presented in [1], which were divided into three clusters: (1) Interior Tagus and Aguieira; (2) Douro; and (3) Other rivers. Three new clusters of Cyanophyta variables were found. Kruskal-Wallis and Mann-Whitney tests are used to compare the newly obtained Cyanophyta clusters with the previous reservoir clusters, in order to validate the classification of the water quality of reservoirs. The amount of Cyanophyta algae present in the reservoirs from the three clusters is significantly different, which validates the previous classification.
Pennings, Stephanie M; Finn, Joseph; Houtsma, Claire; Green, Bradley A; Anestis, Michael D
2017-10-01
Prior studies examining posttraumatic stress disorder (PTSD) symptom clusters and the components of the interpersonal theory of suicide (ITS) have yielded mixed results, likely stemming in part from the use of divergent samples and measurement techniques. This study aimed to expand on these findings by utilizing a large military sample, gold standard ITS measures, and multiple PTSD factor structures. Utilizing a sample of 935 military personnel, hierarchical multiple regression analyses were used to test the association between PTSD symptom clusters and the ITS variables. Additionally, we tested for indirect effects of PTSD symptom clusters on suicidal ideation through thwarted belongingness, conditional on levels of perceived burdensomeness. Results indicated that numbing symptoms are positively associated with both perceived burdensomeness and thwarted belongingness and hyperarousal symptoms (dysphoric arousal in the 5-factor model) are positively associated with thwarted belongingness. Results also indicated that hyperarousal symptoms (anxious arousal in the 5-factor model) were positively associated with fearlessness about death. The positive association between PTSD symptom clusters and suicidal ideation was inconsistent and modest, with mixed support for the ITS model. Overall, these results provide further clarity regarding the association between specific PTSD symptom clusters and suicide risk factors. © 2016 The American Association of Suicidology.
High-Threshold Fault-Tolerant Quantum Computation with Analog Quantum Error Correction
NASA Astrophysics Data System (ADS)
Fukui, Kosuke; Tomita, Akihisa; Okamoto, Atsushi; Fujii, Keisuke
2018-04-01
To implement fault-tolerant quantum computation with continuous variables, the Gottesman-Kitaev-Preskill (GKP) qubit has been recognized as an important technological element. However, it is still challenging to experimentally generate the GKP qubit with the required squeezing level, 14.8 dB, of the existing fault-tolerant quantum computation. To reduce this requirement, we propose a high-threshold fault-tolerant quantum computation with GKP qubits using topologically protected measurement-based quantum computation with the surface code. By harnessing analog information contained in the GKP qubits, we apply analog quantum error correction to the surface code. Furthermore, we develop a method to prevent the squeezing level from decreasing during the construction of the large-scale cluster states for the topologically protected, measurement-based, quantum computation. We numerically show that the required squeezing level can be relaxed to less than 10 dB, which is within the reach of the current experimental technology. Hence, this work can considerably alleviate this experimental requirement and take a step closer to the realization of large-scale quantum computation.
Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.
Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun
2017-01-01
Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.
NASA Astrophysics Data System (ADS)
Jeřábková, T.; Kroupa, P.; Dabringhausen, J.; Hilker, M.; Bekki, K.
2017-12-01
The stellar initial mass function (IMF) has been described as being invariant, bottom-heavy, or top-heavy in extremely dense star-burst conditions. To provide usable observable diagnostics, we calculate redshift dependent spectral energy distributions of stellar populations in extreme star-burst clusters, which are likely to have been the precursors of present day massive globular clusters (GCs) and of ultra compact dwarf galaxies (UCDs). The retention fraction of stellar remnants is taken into account to assess the mass to light ratios of the ageing star-burst. Their redshift dependent photometric properties are calculated as predictions for James Webb Space Telescope (JWST) observations. While the present day GCs and UCDs are largely degenerate concerning bottom-heavy or top-heavy IMFs, a metallicity- and density-dependent top-heavy IMF implies the most massive UCDs, at ages < 100 Myr, to appear as objects with quasar-like luminosities with a 0.1-10% variability on a monthly timescale due to core collapse supernovae.
Goad, David M; Zhu, Chuanmei; Kellogg, Elizabeth A
2017-10-01
CLV3/ESR (CLE) proteins are important signaling peptides in plants. The short CLE peptide (12-13 amino acids) is cleaved from a larger pre-propeptide and functions as an extracellular ligand. The CLE family is large and has resisted attempts at classification because the CLE domain is too short for reliable phylogenetic analysis and the pre-propeptide is too variable. We used a model-based search for CLE domains from 57 plant genomes and used the entire pre-propeptide for comprehensive clustering analysis. In total, 1628 CLE genes were identified in land plants, with none recognizable from green algae. These CLEs form 12 groups within which CLE domains are largely conserved and pre-propeptides can be aligned. Most clusters contain sequences from monocots, eudicots and Amborella trichopoda, with sequences from Picea abies, Selaginella moellendorffii and Physcomitrella patens scattered in some clusters. We easily identified previously known clusters involved in vascular differentiation and nodulation. In addition, we found a number of discrete groups whose function remains poorly characterized. Available data indicate that CLE proteins within a cluster are likely to share function, whereas those from different clusters play at least partially different roles. Our analysis provides a foundation for future evolutionary and functional studies. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Variability Survey of ω Centauri in the Near-IR: Period-Luminosity Relations
NASA Astrophysics Data System (ADS)
Navarrete, Camila; Catelan, Márcio; Contreras Ramos, Rodrigo; Gran, Felipe; Alonso-García, Javier; Dékány, István
2015-08-01
ω Centauri (NGC 5139) is by far the most massive globular star cluster in the Milky Way, and has even been suggested to be the remnant of a dwarf galaxy. As such, it contains a large number of variable stars of different classes. Here we report on a deep, wide-field, near-infrared variability survey of omega Cen, carried out by our team using ESO's 4.1m VISTA telescope. Our time-series data comprise 42 and 100 epochs in J and Ks, respectively. This unique dataset has allowed us to derive complete light curves for hundreds of variable stars in the cluster, and thereby perform a detailed analysis of the near-infrared period-luminosity (PL) relations for different variability classes, including type II Cepheids, SX Phoenicis, and RR Lyrae stars. In this contribution, in addition to describing our survey and presenting the derived light curves, we present the resulting PL relations for each of these variability classes, including the first calibration of this sort for the SX Phoenicis stars. Based on these relations, we also provide an updated (pulsational) distance modulus for omega Cen, compare with results based on independent techniques, and discuss possible sources of systematic errors.
Gravitational Lensing and Microlensing in Clusters: Clusters as Dark Matter Telescopes
NASA Astrophysics Data System (ADS)
Safonova, Margarita
2018-04-01
Gravitational lensing is brightening of background objects due to deflection of light by foreground sources. Rich clusters of galaxies are very effective lenses because they are centrally concentrated. Such natural Gravitational Telescopes provide us with strongly magnified galaxies at high redshifts otherwise too faint to be detected or analyzed. With a lensing boost, we can study galaxies shining at the end of the “Dark Ages”. We propose to exploit the opportunity provided by the large field of view and depth, to search for sources magnified by foreground clusters in the vicinity of the cluster critical curves, where enhancements can be of several tens in brightness. Another aspect is microlensing (ML), where we would like to continue our survey of a number of Galactic globular clusters over time-scales of weeks to years to search for ML events from planets to hypothesized central intermediate-mass black holes (IMBH).
The origin of low mass particles within and beyond the dust coma envelopes of Comet Halley
NASA Technical Reports Server (NTRS)
Simpson, J. A.; Rabinowitz, D.; Tuzzolino, A. J.; Ksanfomality, L. V.; Sagdeev, R. Z.
1987-01-01
Measurements from the Dust Counter and Mass Analyzer (DUCMA) instruments on VEGA-1 and -2 revealed unexpected fluxes of low mass (up to 10^-13 g) dust particles at very great distances from the nucleus (300,000 to 600,000 km). These particles are detected in clusters (10 sec duration), preceded and followed by relatively long time intervals during which no dust is detected. This cluster phenomenon also occurs inside the envelope boundaries. Clusters of low mass particles are intermixed with the overall dust distribution throughout the coma. The clusters account for many of the short-term small-scale intensity enhancements previously ascribed to microjets in the coma. The origin of these clusters appears to be emission from the nucleus of large conglomerates which disintegrate in the coma to yield clusters of discrete, small particles continuing outward to the distant coma.
Variable Screening for Cluster Analysis.
ERIC Educational Resources Information Center
Donoghue, John R.
Inclusion of irrelevant variables in a cluster analysis adversely affects subgroup recovery. This paper examines using moment-based statistics to screen variables; only variables that pass the screening are then used in clustering. Normal mixtures are analytically shown often to possess negative kurtosis. Two related measures, "m" and…
An opinion-driven behavioral dynamics model for addictive behaviors
NASA Astrophysics Data System (ADS)
Moore, Thomas W.; Finley, Patrick D.; Apelberg, Benjamin J.; Ambrose, Bridget K.; Brodsky, Nancy S.; Brown, Theresa J.; Husten, Corinne; Glass, Robert J.
2015-04-01
We present a model of behavioral dynamics that combines a social network-based opinion dynamics model with behavioral mapping. The behavioral component is discrete and history-dependent to represent situations in which an individual's behavior is initially driven by opinion and later constrained by physiological or psychological conditions that serve to maintain the behavior. Individuals are modeled as nodes in a social network connected by directed edges. Parameter sweeps illustrate model behavior and the effects of individual parameters and parameter interactions on model results. Mapping a continuous opinion variable into a discrete behavioral space induces clustering on directed networks. Clusters provide targets of opportunity for influencing the network state; however, the smaller the network the greater the stochasticity and potential variability in outcomes. This has implications both for behaviors that are influenced by close relationships versus those influenced by societal norms and for the effectiveness of strategies for influencing those behaviors.
Critical behavior of a two-step contagion model with multiple seeds
NASA Astrophysics Data System (ADS)
Choi, Wonjun; Lee, Deokjae; Kahng, B.
2017-06-01
A two-step contagion model with a single seed serves as a cornerstone for understanding the critical behaviors and underlying mechanism of discontinuous percolation transitions induced by cascade dynamics. When the contagion spreads from a single seed, a cluster of infected and recovered nodes grows without any cluster merging process. However, when the contagion starts from multiple seeds of O(N) where N is the system size, a node weakened by a seed can be infected more easily when it is in contact with another node infected by a different pathogen seed. This contagion process can be viewed as a cluster merging process in a percolation model. Here we show analytically and numerically that when the density of infectious seeds is relatively small but O(1), the epidemic transition is hybrid, exhibiting both continuous and discontinuous behavior, whereas when it is sufficiently large and reaches a critical point, the transition becomes continuous. We determine the full set of critical exponents describing the hybrid and the continuous transitions. Their critical behaviors differ from those in the single-seed case.
Hieu, Nguyen Trong; Brochier, Timothée; Tri, Nguyen-Huu; Auger, Pierre; Brehmer, Patrice
2014-09-01
We consider a fishery model with two sites: (1) a marine protected area (MPA) where fishing is prohibited and (2) an area where the fish population is harvested. We assume that fish can migrate from MPA to fishing area at a very fast time scale and fish spatial organisation can change from small to large clusters of school at a fast time scale. The growth of the fish population and the catch are assumed to occur at a slow time scale. The complete model is a system of five ordinary differential equations with three time scales. We take advantage of the time scales using aggregation of variables methods to derive a reduced model governing the total fish density and fishing effort at the slow time scale. We analyze this aggregated model and show that under some conditions, there exists an equilibrium corresponding to a sustainable fishery. Our results suggest that in small pelagic fisheries the yield is maximum for a fish population distributed among both small and large clusters of school.
Sample size calculations for the design of cluster randomized trials: A summary of methodology.
Gao, Fei; Earnest, Arul; Matchar, David B; Campbell, Michael J; Machin, David
2015-05-01
Cluster randomized trial designs are growing in popularity in, for example, cardiovascular medicine research and other clinical areas and parallel statistical developments concerned with the design and analysis of these trials have been stimulated. Nevertheless, reviews suggest that design issues associated with cluster randomized trials are often poorly appreciated and there remain inadequacies in, for example, describing how the trial size is determined and the associated results are presented. In this paper, our aim is to provide pragmatic guidance for researchers on the methods of calculating sample sizes. We focus attention on designs with the primary purpose of comparing two interventions with respect to continuous, binary, ordered categorical, incidence rate and time-to-event outcome variables. Issues of aggregate and non-aggregate cluster trials, adjustment for variation in cluster size and the effect size are detailed. The problem of establishing the anticipated magnitude of between- and within-cluster variation to enable planning values of the intra-cluster correlation coefficient and the coefficient of variation are also described. Illustrative examples of calculations of trial sizes for each endpoint type are included. Copyright © 2015 Elsevier Inc. All rights reserved.
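One widely used approach for a continuous outcome can be sketched as follows. The abstract does not state any formula, so this sketch assumes the standard design effect 1 + (m - 1)ρ for equal cluster size m and intra-cluster correlation ρ, applied to the usual normal-approximation sample size for comparing two means. Function names and the example planning values are hypothetical.

```python
import math
from statistics import NormalDist

def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc

def n_per_arm_individual(delta, sigma, alpha=0.05, power=0.80):
    """Subjects per arm for an individually randomized two-arm comparison of means."""
    z = NormalDist().inv_cdf
    return 2.0 * ((z(1 - alpha / 2) + z(power)) * sigma / delta) ** 2

def clusters_per_arm(delta, sigma, cluster_size, icc, alpha=0.05, power=0.80):
    """Inflate the individual-level size by the design effect, then round up to whole clusters."""
    n = n_per_arm_individual(delta, sigma, alpha, power)
    n_inflated = n * design_effect(cluster_size, icc)
    return math.ceil(n_inflated / cluster_size)

if __name__ == "__main__":
    # Hypothetical planning values: half-SD effect, 20 subjects per cluster, ICC of 0.05.
    print(clusters_per_arm(delta=0.5, sigma=1.0, cluster_size=20, icc=0.05))
```

This sketch ignores variation in cluster size, which the abstract lists as a separate adjustment; handling it would typically inflate the design effect further.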
Intercenter Differences in Bronchopulmonary Dysplasia or Death Among Very Low Birth Weight Infants
Walsh, Michele; Bobashev, Georgiy; Das, Abhik; Levine, Burton; Carlo, Waldemar A.; Higgins, Rosemary D.
2011-01-01
OBJECTIVES: To determine (1) the magnitude of clustering of bronchopulmonary dysplasia (36 weeks) or death (the outcome) across centers of the Eunice Kennedy Shriver National Institute of Child and Human Development National Research Network, (2) the infant-level variables associated with the outcome and estimate their clustering, and (3) the center-specific practices associated with the differences and build predictive models. METHODS: Data on neonates with a birth weight of <1250 g from the cluster-randomized benchmarking trial were used to determine the magnitude of clustering of the outcome according to alternating logistic regression by using pairwise odds ratio and predictive modeling. Clinical variables associated with the outcome were identified by using multivariate analysis. The magnitude of clustering was then evaluated after correction for infant-level variables. Predictive models were developed by using center-specific and infant-level variables for data from 2001–2004 and projected to 2006. RESULTS: In 2001–2004, clustering of bronchopulmonary dysplasia/death was significant (pairwise odds ratio: 1.3; P < .001) and increased in 2006 (pairwise odds ratio: 1.6; overall incidence: 52%; range across centers: 32%–74%); center rates were relatively stable over time. Variables that varied according to center and were associated with increased risk of outcome included lower body temperature at NICU admission, use of prophylactic indomethacin, specific drug therapy on day 1, and lack of endotracheal intubation. Center differences remained significant even after correction for clustered variables. CONCLUSION: Bronchopulmonary dysplasia/death rates demonstrated moderate clustering according to center. Clinical variables associated with the outcome were also clustered. Center differences after correction of clustered variables indicate presence of as-yet unmeasured center variables. PMID:21149431
Determining the Optimal Number of Clusters with the Clustergram
NASA Technical Reports Server (NTRS)
Fluegemann, Joseph K.; Davies, Misty D.; Aguirre, Nathan D.
2011-01-01
Cluster analysis aids research in many different fields, from business to biology to aerospace. It consists of using statistical techniques to group objects in large sets of data into meaningful classes. However, this process of ordering data points presents much uncertainty because it involves several steps, many of which are subject to researcher judgment as well as inconsistencies depending on the specific data type and research goals. These steps include the method used to cluster the data, the variables on which the cluster analysis will be operating, the number of resulting clusters, and parts of the interpretation process. In most cases, the number of clusters must be guessed or estimated before employing the clustering method. Many remedies have been proposed, but none is unassailable and certainly not for all data types. Thus, the aim of current research for better techniques of determining the number of clusters is generally confined to demonstrating that the new technique excels other methods in performance for several disparate data types. Our research makes use of a new cluster-number-determination technique based on the clustergram: a graph that shows how the number of objects in the cluster and the cluster mean (the ordinate) change with the number of clusters (the abscissa). We use the features of the clustergram to make the best determination of the cluster-number.
Wilderjans, Tom F; Ceulemans, Eva; Van Mechelen, Iven; Depril, Dirk
2011-03-01
In many areas of psychology, one is interested in disclosing the underlying structural mechanisms that generated an object by variable data set. Often, based on theoretical or empirical arguments, it may be expected that these underlying mechanisms imply that the objects are grouped into clusters that are allowed to overlap (i.e., an object may belong to more than one cluster). In such cases, analyzing the data with Mirkin's additive profile clustering model may be appropriate. In this model: (1) each object may belong to no, one or several clusters, (2) there is a specific variable profile associated with each cluster, and (3) the scores of the objects on the variables can be reconstructed by adding the cluster-specific variable profiles of the clusters the object in question belongs to. Until now, however, no software program has been publicly available to perform an additive profile clustering analysis. For this purpose, in this article, the ADPROCLUS program, steered by a graphical user interface, is presented. We further illustrate its use by means of the analysis of a patient by symptom data matrix.
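The reconstruction rule in point (3) can be illustrated directly: an object's predicted scores are the sum of the profiles of every cluster it belongs to. The sketch below is not the ADPROCLUS program or its fitting algorithm; it only evaluates a given solution, and the membership matrix, profiles, and function names are hypothetical.

```python
def reconstruct(membership, profiles):
    """Predicted object-by-variable scores under an additive profile model.

    membership[i][k] is 1 if object i belongs to cluster k, else 0
    (rows may contain no, one, or several 1s, so clusters may overlap).
    profiles[k][j] is the score of cluster k's profile on variable j.
    """
    n_vars = len(profiles[0])
    return [
        [sum(a * p[j] for a, p in zip(row, profiles)) for j in range(n_vars)]
        for row in membership
    ]

def squared_error(data, membership, profiles):
    """Sum of squared differences between the data and the model reconstruction."""
    fitted = reconstruct(membership, profiles)
    return sum(
        (x - f) ** 2
        for data_row, fit_row in zip(data, fitted)
        for x, f in zip(data_row, fit_row)
    )

if __name__ == "__main__":
    # Hypothetical example: 3 objects, 2 overlapping clusters, 3 variables.
    membership = [[0, 0], [1, 0], [1, 1]]
    profiles = [[1, 2, 0], [0, 1, 3]]
    print(reconstruct(membership, profiles))
```

In this example the first object belongs to no cluster and is reconstructed as all zeros, while the third belongs to both clusters and receives the sum of the two profiles.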
Gonzalez, Robert; Suppes, Trisha; Zeitzer, Jamie; McClung, Colleen; Tamminga, Carol; Tohen, Mauricio; Forero, Angelica; Dwivedi, Alok; Alvarado, Andres
2018-02-19
Multiple types of chronobiological disturbances have been reported in bipolar disorder, including characteristics associated with general activity levels, sleep, and rhythmicity. Previous studies have focused on examining the individual relationships between affective state and chronobiological characteristics. The aim of this study was to conduct a variable cluster analysis in order to ascertain how mood states are associated with chronobiological traits in bipolar I disorder (BDI). We hypothesized that manic symptomatology would be associated with disturbances of rhythm. Variable cluster analysis identified five chronobiological clusters in 105 BDI subjects. Cluster 1, comprising subjective sleep quality was associated with both mania and depression. Cluster 2, which comprised variables describing the degree of rhythmicity, was associated with mania. Significant associations between mood state and cluster analysis-identified chronobiological variables were noted. Disturbances of mood were associated with subjectively assessed sleep disturbances as opposed to objectively determined, actigraphy-based sleep variables. No associations with general activity variables were noted. Relationships between gender and medication classes in use and cluster analysis-identified chronobiological characteristics were noted. Exploratory analyses noted that medication class had a larger impact on these relationships than the number of psychiatric medications in use. In a BDI sample, variable cluster analysis was able to group related chronobiological variables. The results support our primary hypothesis that mood state, particularly mania, is associated with chronobiological disturbances. Further research is required in order to define these relationships and to determine the directionality of the associations between mood state and chronobiological characteristics.
Near-infrared Variability in the Orion Nebula Cluster
NASA Astrophysics Data System (ADS)
Rice, Thomas S.; Reipurth, Bo; Wolk, Scott J.; Vaz, Luiz Paulo; Cross, N. J. G.
2015-10-01
Using UKIRT on Mauna Kea, we have carried out a new near-infrared J, H, K monitoring survey of almost a square degree of the star-forming Orion Nebula Cluster with observations on 120 nights over three observing seasons, spanning a total of 894 days. We monitored ~15,000 stars down to J ≈ 20 using the WFCAM instrument, and have extracted 1203 significantly variable stars from our data. By studying variability in young stellar objects (YSOs) in the H - K, K color-magnitude diagram, we are able to distinguish between physical mechanisms of variability. Many variables show color behavior indicating either dust-extinction or disk/accretion activity, but we find that when monitored for longer periods of time, a number of stars shift between these two variability mechanisms. Further, we show that the intrinsic timescale of disk/accretion variability in young stars is longer than that of dust-extinction variability. We confirm that variability amplitude is statistically correlated with evolutionary class in all bands and colors. Our investigations of these 1203 variables have revealed 73 periodic AA Tau type variables, many large-amplitude and long-period (P > 15 days) YSOs, including three stars showing widely spaced periodic brightening events consistent with circumbinary disk activity, and four new eclipsing binaries. These phenomena and others indicate the activity of long-term disk/accretion variability processes taking place in young stars. We have made the light curves and associated data for these 1203 variables available online.
Intrinsic scatter of caustic masses and hydrostatic bias: An observational study
NASA Astrophysics Data System (ADS)
Andreon, S.; Trinchieri, G.; Moretti, A.; Wang, J.
2017-10-01
All estimates of cluster mass have some intrinsic scatter, and perhaps some bias, with respect to true mass even in the absence of measurement errors, caused for example by cluster triaxiality and large-scale structure. Knowledge of the bias and scatter values is fundamental for both cluster cosmology and astrophysics. In this paper we show that the intrinsic scatter of a mass proxy can be constrained by measurements of the gas fraction because masses with higher values of intrinsic scatter with true mass produce more scattered gas fractions. Moreover, the relative bias of two mass estimates can be constrained by comparing the mean gas fraction at the same (nominal) cluster mass. Our observational study addresses the scatter between caustic (i.e., dynamically estimated) and true masses, and the relative bias of caustic and hydrostatic masses. For these purposes, we used the X-ray Unbiased Cluster Sample, a cluster sample selected independently from the intracluster medium content with reliable masses: 34 galaxy clusters in the nearby (0.050 < z < 0.135) Universe, mostly with 14 < log M500/M⊙ ≲ 14.5, and with caustic masses. We found a 35% scatter between caustic and true masses. Furthermore, we found that the relative bias between caustic and hydrostatic masses is small, 0.06 ± 0.05 dex, improving upon past measurements. The small scatter found confirms our previous measurements of a highly variable amount of feedback from cluster to cluster, which is the cause of the observed large variety of core-excised X-ray luminosities and gas masses.
Clustering Genes of Common Evolutionary History
Gori, Kevin; Suchan, Tomasz; Alvarez, Nadir; Goldman, Nick; Dessimoz, Christophe
2016-01-01
Phylogenetic inference can potentially result in a more accurate tree using data from multiple loci. However, if the loci are incongruent—due to events such as incomplete lineage sorting or horizontal gene transfer—it can be misleading to infer a single tree. To address this, many previous contributions have taken a mechanistic approach, by modeling specific processes. Alternatively, one can cluster loci without assuming how these incongruencies might arise. Such “process-agnostic” approaches typically infer a tree for each locus and cluster these. There are, however, many possible combinations of tree distance and clustering methods; their comparative performance in the context of tree incongruence is largely unknown. Furthermore, because standard model selection criteria such as AIC cannot be applied to problems with a variable number of topologies, the issue of inferring the optimal number of clusters is poorly understood. Here, we perform a large-scale simulation study of phylogenetic distances and clustering methods to infer loci of common evolutionary history. We observe that the best-performing combinations are distances accounting for branch lengths followed by spectral clustering or Ward’s method. We also introduce two statistical tests to infer the optimal number of clusters and show that they strongly outperform the silhouette criterion, a general-purpose heuristic. We illustrate the usefulness of the approach by 1) identifying errors in a previous phylogenetic analysis of yeast species and 2) identifying topological incongruence among newly sequenced loci of the globeflower fly genus Chiastocheta. We release treeCl, a new program to cluster genes of common evolutionary history (http://git.io/treeCl). PMID:26893301
Two Cepheid variables in the Fornax dwarf galaxy
NASA Technical Reports Server (NTRS)
Light, R. M.; Armandroff, T. E.; Zinn, R.
1986-01-01
Two fields surrounding globular clusters 2 and 3 in the Fornax dwarf spheroidal galaxy have been searched for short-period variable stars that are brighter than the horizontal branch. This survey confirmed as variable the two suspected suprahorizontal-branch variables discovered by Buonanno et al. (1985) in their photometry of the clusters. The observations show that the star in cluster 2 is a W Virginis variable of 14.4 day period. It is the first W Vir variable to be found in a dwarf spheroidal galaxy, and its proximity to the center of cluster 2 suggests that it is a cluster member. The other star appears to be an anomalous Cepheid of 0.78 day period. It lies outside or very near the boundary of cluster 3, and is therefore probably a member of the field population of Fornax. Although no other suprahorizontal-branch variables were discovered in the survey, it did confirm as variable two of the RR Lyrae candidates of Buonanno et al., which appeared at the survey limit. The implications of these observations for the understanding of the stellar content at Fornax are discussed.
The Future of Wind Energy in California: Future Projections in Variable-Resolution CESM
NASA Astrophysics Data System (ADS)
Wang, M.; Ullrich, P. A.; Millstein, D.; Collier, C.
2017-12-01
This study focuses on the wind energy characterization and future projection at five primary wind turbine sites in California. Historical (1980-2000) and mid-century (2030-2050) simulations were produced using the Variable-Resolution Community Earth System Model (VR-CESM) to analyze the trends and variations in wind energy under climate change. Datasets from Det Norske Veritas Germanischer Lloyd (DNV GL), MERRA-2, CFSR, NARR, as well as surface observational data were used for model validation and comparison. Significant seasonal wind speed changes under RCP8.5 were detected from several wind farm sites. Large-scale patterns were then investigated to analyze the synoptic-scale impact on localized wind change. The agglomerative clustering method was applied to analyze and group different wind patterns. The associated meteorological background of each cluster was investigated to analyze the drivers of different wind patterns. This study improves the characterization of uncertainty around the magnitude and variability in space and time of California's wind resources in the near future, and also enhances understanding of the physical mechanisms related to the trends in wind resource variability.
Hsu, David
2015-09-27
Clustering methods are often used to model energy consumption for two reasons. First, clustering is often used to process data and to improve the predictive accuracy of subsequent energy models. Second, stable clusters that are reproducible with respect to non-essential changes can be used to group, target, and interpret observed subjects. However, it is well known that clustering methods are highly sensitive to the choice of algorithms and variables. This can lead to misleading assessments of predictive accuracy and mis-interpretation of clusters in policymaking. This paper therefore introduces two methods to the modeling of energy consumption in buildings: clusterwise regression, also known as latent class regression, which integrates clustering and regression simultaneously; and cluster validation methods to measure stability. Using a large dataset of multifamily buildings in New York City, clusterwise regression is compared to common two-stage algorithms that use K-means and model-based clustering with linear regression. Predictive accuracy is evaluated using 20-fold cross validation, and the stability of the perturbed clusters is measured using the Jaccard coefficient. These results show that there seems to be an inherent tradeoff between prediction accuracy and cluster stability. This paper concludes by discussing which clustering methods may be appropriate for different analytical purposes.
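The abstract above says only that cluster stability is measured with the Jaccard coefficient. As a rough illustration of what such a measurement can look like, the sketch below compares each original cluster with its best-matching cluster after perturbation. The best-match scheme, the function names, and the toy member IDs are assumptions made for illustration, not details taken from the paper.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| for two clusters given as sets of member IDs."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty clusters are treated as identical (an arbitrary convention).
    return len(a & b) / len(union) if union else 1.0


def cluster_stability(original, perturbed):
    """For each original cluster, return its highest Jaccard score among the perturbed clusters."""
    return [max(jaccard(c, p) for p in perturbed) for c in original]


# Hypothetical toy data: member 4 moves to the other cluster after perturbation.
original = [{1, 2, 3, 4}, {5, 6, 7}]
perturbed = [{1, 2, 3}, {4, 5, 6, 7}]
scores = cluster_stability(original, perturbed)
print(scores)
```

A score near 1 means a cluster survives the perturbation almost unchanged; lower scores indicate a cluster that dissolves or merges.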
Trout Fryxell, R. T.; Moore, J. E.; Collins, M. D.; Kwon, Y.; Jean-Philippe, S. R.; Schaeffer, S. M.; Odoi, A.; Kennedy, M.; Houston, A. E.
2015-01-01
Two tick-borne diseases with expanding case and vector distributions are ehrlichiosis (transmitted by Amblyomma americanum) and rickettsiosis (transmitted by A. maculatum and Dermacentor variabilis). There is a critical need to identify the specific habitats where each of these species is likely to be encountered to classify and pinpoint risk areas. Consequently, an in-depth tick prevalence study was conducted on the dominant ticks in the southeast. Vegetation, soil, and remote sensing data were used to test the hypothesis that habitat and vegetation variables can predict tick abundances. No variables were significant predictors of A. americanum adult and nymph tick abundance, and no clustering was evident because this species was found throughout the study area. For A. maculatum, adult tick abundance was predicted by NDVI and by the interaction between habitat type and plant diversity; two significant population clusters were identified in a heterogeneous area suitable for quail habitat. For D. variabilis, no environmental variables were significant predictors of adult abundance; however, D. variabilis collections clustered in three significant areas best described as agriculture areas with defined edges. This study identified few landscape and vegetation variables associated with tick presence. While some variables were significantly associated with tick populations, the amount of explained variation was not useful for predicting reliably where ticks occur; consequently, additional research that includes multiple sampling seasons and locations throughout the southeast is warranted. This low amount of explained variation may also be due to the use of hosts for dispersal, and potentially to other abiotic and biotic variables. Host species play a large role in the establishment, maintenance, and dispersal of a tick species, as well as the maintenance of disease cycles, dispersal to new areas, and identification of risk areas. PMID:26656122
Sex Discrimination in Professional Employment: A Case Study.
ERIC Educational Resources Information Center
Osterman, Paul
1979-01-01
A study analyzed sex discrimination with data on over 700 professional employees in a metropolitan publishing firm. It was found that the sex differential in earnings within clusters of similar jobs was much greater if marriage and children variables were excluded: men received a large "payoff" from being married and having children. (JH)
A population of gamma-ray emitting globular clusters seen with the Fermi Large Area Telescope
Abdo, A. A.
2010-11-24
Context. Globular clusters with their large populations of millisecond pulsars (MSPs) are believed to be potential emitters of high-energy gamma-ray emission. The observation of this emission provides a powerful tool to assess the millisecond pulsar population of a cluster, is essential for understanding the importance of binary systems for the evolution of globular clusters, and provides complementary insights into magnetospheric emission processes. Aims. Our goal is to constrain the millisecond pulsar populations in globular clusters from analysis of gamma-ray observations. Methods. We use 546 days of continuous sky-survey observations obtained with the Large Area Telescope aboard the Fermi Gamma-ray Space Telescope to study the gamma-ray emission towards 13 globular clusters. Results. Steady point-like high-energy gamma-ray emission has been significantly detected towards 8 globular clusters. Five of them (47 Tucanae, Omega Cen, NGC 6388, Terzan 5, and M 28) show hard spectral power indices (0.7 < Γ < 1.4) and clear evidence for an exponential cut-off in the range 1.0 - 2.6 GeV, which is the characteristic signature of magnetospheric emission from MSPs. Three of them (M 62, NGC 6440 and NGC 6652) also show hard spectral indices (1.0 < Γ < 1.7), however the presence of an exponential cut-off can not be unambiguously established. Three of them (Omega Cen, NGC 6388, NGC 6652) have no known radio or X-ray MSPs yet still exhibit MSP spectral properties. From the observed gamma-ray luminosities, we estimate the total number of MSPs that is expected to be present in these globular clusters. We show that our estimates of the MSP population correlate with the stellar encounter rate and we estimate 2600 - 4700 MSPs in Galactic globular clusters, commensurate with previous estimates. Conclusions. The observation of high-energy gamma-ray emission from globular clusters thus provides a reliable independent method to assess their millisecond pulsar populations.
HST Snapshot Study of Variable Stars in Globular Clusters: Inner Region of NGC 6441
NASA Technical Reports Server (NTRS)
Pritzl, Barton J.; Smith, Horace A.; Stetson, Peter B.; Catelan, Marcio; Sweigart, Allen V.; Layden, Andrew C.; Rich, R. Michael
2003-01-01
We present the results of a Hubble Space Telescope snapshot program to survey the inner region of the metal-rich globular cluster NGC 6441 for its variable stars. A total of 57 variable stars was found including 38 RR Lyrae stars, 6 Population II Cepheids, and 12 long period variables. Twenty-four of the RR Lyrae stars and all of the Population II Cepheids were previously undiscovered in ground-based surveys. Of the RR Lyrae stars observed in this survey, 26 are pulsating in the fundamental mode with a mean period of 0.753 d and 12 are first-overtone mode pulsators with a mean period of 0.365 d. These values match up very well with those found in ground-based surveys. Combining all the available data for NGC 6441, we find mean periods of 0.759 d and 0.375 d for the RRab and RRc stars, respectively. We also find that the RR Lyrae in this survey are located in the same regions of a period-amplitude diagram as those found in ground-based surveys. The overall ratio of RRc to total RR Lyrae is 0.33. Although NGC 6441 is a metal-rich globular cluster and would, on that ground, be expected either to have few RR Lyrae stars, or to be an Oosterhoff type I system, its RR Lyrae more closely resemble those in Oosterhoff type II globular clusters. However, even compared to typical Oosterhoff type II systems, the mean period of its RRab stars is unusually long. We also derived I-band period-luminosity relations for the RR Lyrae stars. Of the six Population II Cepheids, five are of W Virginis type and one is a BL Herculis variable star. This makes NGC 6441, along with NGC 6388, the most metal-rich globular cluster known to contain these types of variable stars. Another variable, V118, may also be a Population II Cepheid given its long period and its separation in magnitude from the RR Lyrae stars. We examine the period-luminosity relation for these Population II Cepheids and compare it to those in other globular clusters and in the Large Magellanic Cloud.
We argue that there does not appear to be a change in the period-luminosity relation slope between the BL Herculis and W Virginis stars, but that a change of slope does occur when the RV Tauri stars are added to the period-luminosity relation.
Comprehension priming as rational expectation for repetition: Evidence from syntactic processing.
Myslín, Mark; Levy, Roger
2016-02-01
Why do comprehenders process repeated stimuli more rapidly than novel stimuli? We consider an adaptive explanation for why such facilitation may be beneficial: priming is a consequence of expectation for repetition due to rational adaptation to the environment. If occurrences of a stimulus cluster in time, given one occurrence it is rational to expect a second occurrence closely following. Leveraging such knowledge may be particularly useful in online processing of language, where pervasive clustering may help comprehenders negotiate the considerable challenge of continual expectation update at multiple levels of linguistic structure and environmental variability. We test this account in the domain of structural priming in syntax, making use of the sentential complement-direct object (SC-DO) ambiguity. We first show that sentences containing SC continuations cluster in natural language, motivating an expectation for repetition of this structure. Second, we show that comprehenders are indeed sensitive to the syntactic clustering properties of their current environment. In a series of between-groups self-paced reading studies, we find that participants who are exposed to clusters of SC sentences subsequently process repetitions of SC structure more rapidly than participants who are exposed to the same number of SCs spaced in time, and attribute the difference to the learned degree of expectation for repetition. We model this behavior through Bayesian belief update, showing that (the optimal degree of) sensitivity to clustering properties of syntactic structures is indeed learnable through experience. Comprehension priming effects are thus consistent with rational expectation for repetition based on adaptation to the linguistic environment. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Comprehension priming as rational expectation for repetition: Evidence from syntactic processing
Levy, Roger
2015-01-01
Why do comprehenders process repeated stimuli more rapidly than novel stimuli? We consider an adaptive explanation for why such facilitation may be beneficial: priming is a consequence of expectation for repetition due to rational adaptation to the environment. If occurrences of a stimulus cluster in time, given one occurrence it is rational to expect a second occurrence closely following. Leveraging such knowledge may be particularly useful in online processing of language, where pervasive clustering may help comprehenders negotiate the considerable challenge of continual expectation update at multiple levels of linguistic structure and environmental variability. We test this account in the domain of structural priming in syntax, making use of the sentential complement-direct object (SC-DO) ambiguity. We first show that sentences containing SC continuations cluster in natural language, motivating an expectation for repetition of this structure. Second, we show that comprehenders are indeed sensitive to the syntactic clustering properties of their current environment. In a series of between-groups self-paced reading studies, we find that participants who are exposed to clusters of SC sentences subsequently process repetitions of SC structure more rapidly than participants who are exposed to the same number of SCs spaced in time, and attribute the difference to the learned degree of expectation for repetition. We model this behavior through Bayesian belief update, showing that (the optimal degree of) sensitivity to clustering properties of syntactic structures is indeed learnable through experience. Comprehension priming effects are thus consistent with rational expectation for repetition based on adaptation to the linguistic environment. PMID:26605963
Findlay, S; Sinsabaugh, R L
2006-10-01
We examined bacterial metabolic activity and community similarity in shallow subsurface stream sediments distributed across three regions of the eastern United States to assess whether there were parallel changes in functional and structural attributes at this large scale. Bacterial growth, oxygen consumption, and a suite of extracellular enzyme activities were assayed to describe functional variability. Community similarity was assessed using randomly amplified polymorphic DNA (RAPD) patterns. There were significant differences in streamwater chemistry, metabolic activity, and bacterial growth among regions with, for instance, twofold higher bacterial production in streams near Baltimore, MD, compared to Hubbard Brook, NH. Five of eight extracellular enzymes showed significant differences among regions. Cluster analyses of individual streams by metabolic variables showed clear groups with significant differences in representation of sites from different regions among groups. Clustering of sites based on randomly amplified polymorphic DNA banding resulted in groups with generally less internal similarity although there were still differences in distribution of regional sites. There was a marginally significant (p = 0.09) association between patterns based on functional and structural variables. There were statistically significant but weak (r2 approximately 30%) associations between landcover and measures of both structure and function. These patterns imply a large-scale organization of biofilm communities and this structure may be imposed by factor(s) such as landcover and covariates such as nutrient concentrations, which are known to also cause differences in macrobiota of stream ecosystems.
NASA Astrophysics Data System (ADS)
Fučkar, Neven-Stjepan; Guemas, Virginie; Massonnet, François; Doblas-Reyes, Francisco
2015-04-01
Over the modern observational era, the northern hemisphere sea ice concentration, age and thickness have experienced a sharp long-term decline superimposed with strong internal variability. Hence, there is a crucial need to identify robust patterns of Arctic sea ice variability on interannual timescales and disentangle them from the long-term trend in noisy datasets. The principal component analysis (PCA) is a versatile and broadly used method for the study of climate variability. However, the PCA has several limiting aspects because it assumes that all modes of variability have symmetry between positive and negative phases, and suppresses nonlinearities by using a linear covariance matrix. Clustering methods offer an alternative set of dimension reduction tools that are more robust and capable of taking into account possible nonlinear characteristics of a climate field. Cluster analysis aggregates data into groups or clusters based on their distance, to simultaneously minimize the distance between data points in a given cluster and maximize the distance between the centers of the clusters. We extract modes of Arctic interannual sea-ice variability with nonhierarchical K-means cluster analysis and investigate the mechanisms leading to these modes. Our focus is on the sea ice thickness (SIT) as the base variable for clustering because SIT holds most of the climate memory for variability and predictability on interannual timescales. We primarily use global reconstructions of sea ice fields with a state-of-the-art ocean-sea-ice model, but we also verify the robustness of determined clusters in other Arctic sea ice datasets. Applied cluster analysis over the 1958-2013 period shows that the optimal number of detrended SIT clusters is K=3. Determined SIT cluster patterns and their time series of occurrence are rather similar between different seasons and months. 
Two opposite thermodynamic modes are characterized with prevailing negative or positive SIT anomalies over the Arctic basin. The intermediate mode, with negative anomalies centered on the East Siberian shelf and positive anomalies along the North American side of the basin, has predominately dynamic characteristics. The associated sea ice concentration (SIC) clusters vary more between different seasons and months, but the SIC patterns are physically framed by the SIT cluster patterns.
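The abstract above describes nonhierarchical K-means clustering with K=3: points are assigned to the nearest cluster center, and centers are then moved to the mean of their members. A minimal pure-Python sketch of that loop (Lloyd's algorithm) follows. The two-dimensional toy points and the initial centers are invented placeholders, not sea ice thickness fields, and the function name is an assumption.

```python
def kmeans(points, centers, iterations=20):
    """Plain Lloyd's algorithm: alternate nearest-center assignment and center update."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Squared Euclidean distance from p to every current center.
            dists = [sum((x - c) ** 2 for x, c in zip(p, ctr)) for ctr in centers]
            clusters[dists.index(min(dists))].append(p)
        # Move each center to the mean of its members; keep it in place if the cluster is empty.
        centers = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else ctr
            for members, ctr in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical toy data with three well-separated groups (K=3, as in the abstract).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0), (10.0, 1.0)]
initial_centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
centers, clusters = kmeans(points, initial_centers)
print(centers)
```

In practice the result depends on the initial centers, so the algorithm is usually restarted several times and the solution with the smallest within-cluster distance is kept.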
Hamblion, Esther L; Le Menach, Arnaud; Anderson, Laura F; Lalor, Maeve K; Brown, Tim; Abubakar, Ibrahim; Anderson, Charlotte; Maguire, Helen; Anderson, Sarah R
2016-01-01
Background The incidence of TB has doubled in the last 20 years in London. A better understanding of risk groups for recent transmission is required to effectively target interventions. We investigated the molecular epidemiological characteristics of TB cases to estimate the proportion of cases due to recent transmission, and identify predictors for belonging to a cluster. Methods The study population included all culture-positive TB cases in London residents, notified between January 2010 and December 2012, strain typed using 24-loci multiple interspersed repetitive units-variable number tandem repeats. Multivariable logistic regression analysis was performed to assess the risk factors for clustering using sociodemographic and clinical characteristics of cases and for cluster size based on the characteristics of the first two cases. Results There were 10 147 cases of which 5728 (57%) were culture confirmed and 4790 isolates (84%) were typed. 2194 (46%) were clustered in 570 clusters, and the estimated proportion attributable to recent transmission was 34%. Clustered cases were more likely to be UK born, have pulmonary TB, a previous diagnosis, a history of substance abuse or alcohol abuse and imprisonment, be of white, Indian, black-African or Caribbean ethnicity. The time between notification of the first two cases was more likely to be <90 days in large clusters. Conclusions Up to a third of TB cases in London may be due to recent transmission. Resources should be directed to the timely investigation of clusters involving cases with risk factors, particularly those with a short period between the first two cases, to interrupt onward transmission of TB. PMID:27417280
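The abstract above does not name the estimator behind the 34% figure. One common convention in molecular epidemiology, the "n minus one" method, treats one case per cluster as the source case and counts the remainder as recent transmission. The sketch below applies that convention to the numbers reported in the abstract and is offered as an assumption that happens to reproduce the stated percentages, not as the authors' documented calculation.

```python
def recent_transmission_proportion(clustered_cases, n_clusters, typed_cases):
    """'n minus one' estimate: (clustered cases - one index case per cluster) / typed cases."""
    return (clustered_cases - n_clusters) / typed_cases


# Figures reported in the abstract: 4790 typed isolates, 2194 clustered cases, 570 clusters.
clustered_share = 2194 / 4790
recent_share = recent_transmission_proportion(2194, 570, 4790)
print(f"clustered: {clustered_share:.1%}, recent transmission: {recent_share:.1%}")
```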
Radio active galactic nuclei in galaxy clusters: Feedback, merger signatures, and cluster tracers
NASA Astrophysics Data System (ADS)
Paterno-Mahler, Rachel Beth
Galaxy clusters, the largest gravitationally-bound structures in the universe, are composed of 50-1000s of galaxies, hot X-ray emitting gas, and dark matter. They grow in size over time through cluster and group mergers. The merger history of a cluster can be imprinted on the hot gas, known as the intracluster medium (ICM). Merger signatures include shocks, cold fronts, and sloshing of the ICM, which can form spiral structures. Some clusters host double-lobed radio sources driven by active galactic nuclei (AGN). First, I will present a study of the galaxy cluster Abell 2029, which is very relaxed on large scales and has one of the largest continuous sloshing spirals yet observed in the X-ray, extending outward approximately 400 kpc. The sloshing gas interacts with the southern lobe of the radio galaxy, causing it to bend. Energy injection from the AGN is insufficient to offset cooling. The sloshing spiral may be an important additional mechanism in preventing large amounts of gas from cooling to very low temperatures. Next, I will present a study of Abell 98, a triple system currently undergoing a merger. I will discuss the merger history, and show that it is causing a shock. The central subcluster hosts a double-lobed AGN, which is evacuating a cavity in the ICM. Understanding the physical processes that affect the ICM is important for determining the mass of clusters, which in turn affects our calculations of cosmological parameters. To further constrain these parameters, as well as models of galaxy evolution, it is important to use a large sample of galaxy clusters over a range of masses and redshifts. Bent, double-lobed radio sources can potentially act as tracers of galaxy clusters over wide ranges of these parameters. I examine how efficient bent radio sources are at tracing high-redshift (z>0.7) clusters. 
Out of 646 sources in our high-redshift Clusters Occupied by Bent Radio AGN (COBRA) sample, 282 are candidate new, distant clusters of galaxies based on measurements of excess galaxy counts surrounding the radio sources in Spitzer infrared images.
Benis, Arriel; Harel, Nissim; Barkan, Refael; Sela, Tomer; Feldman, Becca
2017-01-01
HMOs record medical data and their interactions with patients. Using this data we strive to identify sub-populations of healthcare customers based on their communication patterns and characterize these sub-populations by their socio-demographic, medical, treatment effectiveness, and treatment adherence profiles. This work will be used to develop tools and interventions aimed at improving patient care. The process included: (1) Extracting socio-demographic, clinical, laboratory, and communication data of 309,460 patients with diabetes in 2015, aged 32+ years, having 7+ years of the disease treated by Clalit Healthcare Services; (2) Reducing dimensions of continuous variables; (3) Finding the K communication-patterns clusters; (4) Building a hierarchical clustering and its associated heatmap to summarize the discovered clusters; (5) Analyzing the clusters found; (6) Validating results epidemiologically. Such a process supports understanding different communication-channel usage and the implementation of personalized services focusing on patients' needs and preferences.
Congdon, P
1990-08-01
London's average total fertility rate (TFR) stood at 1.75. Using a cluster analysis to compare the 1985-1987 fertility patterns of different boroughs of London, demographers learned that 5 natural groupings occurred. 4 boroughs in a central London cluster have the distinction of having a low TFR (1.38) and late fertility (average age of 29.58 years). The researchers attributed these occurrences to the high levels of employment and career attachment and low rates of marriage among women in this cluster. 2 inner city boroughs constituted the smallest cluster and had the largest TFR (2.37), mainly due to high numbers of births to the ethnic minorities. The largest cluster consisted of 12 boroughs located mainly along the periphery with 2 centrally located boroughs (TFR, 1.79). Some of the upper class outer boroughs characterized another cluster with a TFR of 1.61. Another cluster made up of inner and outer boroughs in east and southeast London had an ample proportion of manual workers (TFR, 2.04). Social class most likely accounted for the contrast in TFRs between the 2 aforementioned clusters. Demographers observed that cyclical fluctuation of fertility occurred as opposed to secular trends. Due to these fluctuations, demographers used autoregressive moving average forecast models to time series of the fertility variables in London since 1952. They also applied structural time series models which included regression variables and the influence of cyclical and/or trend behavior. The results showed that large cohorts and the increase in female economic activity caused a delay in the modal age of births and a reduction in the number of births.
Marston, Louise; Peacock, Janet L; Yu, Keming; Brocklehurst, Peter; Calvert, Sandra A; Greenough, Anne; Marlow, Neil
2009-07-01
Studies of prematurely born infants contain a relatively large percentage of multiple births, so the resulting data have a hierarchical structure with small clusters of size 1, 2 or 3. Ignoring the clustering may lead to incorrect inferences. The aim of this study was to compare statistical methods which can be used to analyse such data: generalised estimating equations, multilevel models, multiple linear regression and logistic regression. Four datasets which differed in total size and in percentage of multiple births (n = 254, multiple 18%; n = 176, multiple 9%; n = 10 098, multiple 3%; n = 1585, multiple 8%) were analysed. With the continuous outcome, two-level models produced similar results in the larger dataset, while generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) produced divergent estimates using the smaller dataset. For the dichotomous outcome, most methods, except generalised least squares multilevel modelling (ML GH 'xtlogit' in Stata) gave similar odds ratios and 95% confidence intervals within datasets. For the continuous outcome, our results suggest using multilevel modelling. We conclude that generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) should be used with caution when the dataset is small. Where the outcome is dichotomous and there is a relatively large percentage of non-independent data, it is recommended that these are accounted for in analyses using logistic regression with adjusted standard errors or multilevel modelling. If, however, the dataset has a small percentage of clusters greater than size 1 (e.g. a population dataset of children where there are few multiples) there appears to be less need to adjust for clustering.
A Variable-Selection Heuristic for K-Means Clustering.
ERIC Educational Resources Information Center
Brusco, Michael J.; Cradit, J. Dennis
2001-01-01
Presents a variable selection heuristic for nonhierarchical (K-means) cluster analysis based on the adjusted Rand index for measuring cluster recovery. Subjected the heuristic to Monte Carlo testing across more than 2,200 datasets. Results indicate that the heuristic is extremely effective at eliminating masking variables. (SLD)
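As an illustration of the recovery measure named in this abstract, the adjusted Rand index between a known partition and a recovered one can be computed from pair counts. This is a minimal sketch of the standard index only, not of the authors' variable-selection heuristic; the function name and the toy labels are assumptions made for the example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same objects."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same objects")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Count pairs of objects placed together in both, in the first,
    # and in the second partition.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both partitions
    return (sum_cells - expected) / (max_index - expected)


# Toy labels (hypothetical): the same grouping under renamed cluster ids.
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels (hypothetical): a grouping that cuts across the true one.
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index is corrected for chance, it does not depend on how cluster ids are named, and it can fall below zero when agreement is worse than expected at random.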
A Hierarchical Framework for State-Space Matrix Inference and Clustering.
Zuo, Chandler; Chen, Kailei; Hewitt, Kyle J; Bresnick, Emery H; Keleş, Sündüz
2016-09-01
In recent years, a large number of genomic and epigenomic studies have been focusing on the integrative analysis of multiple experimental datasets measured over a large number of observational units. The objectives of such studies include not only inferring a hidden state of activity for each unit over individual experiments, but also detecting highly associated clusters of units based on their inferred states. Although there are a number of methods tailored for specific datasets, there is currently no state-of-the-art modeling framework for this general class of problems. In this paper, we develop the MBASIC (Matrix Based Analysis for State-space Inference and Clustering) framework. MBASIC consists of two parts: state-space mapping and state-space clustering. In state-space mapping, it maps observations onto a finite state-space, representing the activation states of units across conditions. In state-space clustering, MBASIC incorporates a finite mixture model to cluster the units based on their inferred state-space profiles across all conditions. Both the state-space mapping and clustering can be simultaneously estimated through an Expectation-Maximization algorithm. MBASIC flexibly adapts to a large number of parametric distributions for the observed data, as well as the heterogeneity in replicate experiments. It allows for imposing structural assumptions on each cluster, and enables model selection using information criteria. In our data-driven simulation studies, MBASIC showed significant accuracy in recovering both the underlying state-space variables and clustering structures. We applied MBASIC to two genome research problems using large numbers of datasets from the ENCODE project. The first application grouped genes based on transcription factor occupancy profiles of their promoter regions in two different cell types.
The second application focused on identifying groups of loci that are similar to a GATA2 binding site that is functional at its endogenous locus by utilizing transcription factor occupancy data and illustrated applicability of MBASIC in a wide variety of problems. In both studies, MBASIC showed higher levels of raw data fidelity than analyzing these data with a two-step approach using ENCODE results on transcription factor occupancy data.
Covassin, Tracey; Moran, Ryan; Wilhelm, Kristyn
2013-12-01
Multiple concussions have been associated with prolonged symptoms, recovery time, and risk for future concussions. However, very few studies have examined the effect of multiple concussions on neurocognitive performance and the recently revised symptom clusters using a large database. To examine concussed athletes with a history of 0, 1, 2, or ≥3 concussions on neurocognitive performance and the recently revised symptom clusters. Cohort study (prognosis); Level of evidence, 2. The independent variables were concussion group (0, 1, 2, and ≥3 concussions) and time (baseline, 3 days, and 8 days). The dependent variables were neurocognitive test scores as measured by the Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) neurocognitive test battery (verbal and visual memory, processing speed, and reaction time) and 4 concussion symptom clusters (migraine-cognitive-fatigue, affective, somatic, and sleep). All concussed athletes (n = 596) were administered the ImPACT test at a mean 2.67 ± 1.98 and 7.95 ± 4.46 days after injury. A series of 4 (concussion group) × 3 (time) repeated-measures analyses of covariance (age = covariate) were performed on ImPACT composite scores and symptom clusters. Concussed athletes with ≥3 concussions were still impaired 8 days after a concussion compared with baseline scores on verbal memory (P < .001), reaction time (P < .001), and migraine-cognitive-fatigue symptoms (P < .001). There were no significant findings on the remaining dependent variables. Concussed athletes with a history of ≥3 concussions take longer to recover than athletes with 1 or no previous concussion. Future research should concentrate on validating the new symptom clusters on multiple concussed athletes, examining longer recovery times (ie, >8 days) among athletes with multiple concussions.
NASA Astrophysics Data System (ADS)
Brenden, T. O.; Clark, R. D.; Wiley, M. J.; Seelbach, P. W.; Wang, L.
2005-05-01
Remote sensing and geographic information systems have made it possible to attribute variables for streams at increasingly detailed resolutions (e.g., individual river reaches). Nevertheless, management decisions still must be made at large scales because land and stream managers typically lack sufficient resources to manage on an individual reach basis. Managers thus require a method for identifying stream management units that are ecologically similar and that can be expected to respond similarly to management decisions. We have developed a spatially-constrained clustering algorithm that can merge neighboring river reaches with similar ecological characteristics into larger management units. The clustering algorithm is based on the Cluster Affinity Search Technique (CAST), which was developed for clustering gene expression data. Inputs to the clustering algorithm are the neighbor relationships of the reaches that comprise the digital river network, the ecological attributes of the reaches, and an affinity value, which identifies the minimum similarity for merging river reaches. In this presentation, we describe the clustering algorithm in greater detail and contrast its use with other methods (expert opinion, classification approach, regular clustering) for identifying management units using several Michigan watersheds as a backdrop.
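The three inputs listed in this abstract (neighbor relationships, ecological attributes, and an affinity value) can be made concrete with a much simplified sketch. It is not the CAST-based algorithm the authors describe: it merges any two neighboring reaches whose similarity meets the affinity value, using a union-find structure. The single attribute per reach, the similarity formula, and all names are assumptions for illustration.

```python
def merge_reaches(attributes, neighbors, affinity):
    """Group neighboring reaches whose similarity is at least `affinity`.

    attributes: dict reach id -> one numeric ecological attribute (assumed)
    neighbors:  iterable of (reach, reach) pairs adjacent in the river network
    affinity:   minimum similarity in (0, 1] required to merge two reaches
    """
    parent = {reach: reach for reach in attributes}

    def find(reach):
        # Follow parent links to the representative, compressing the path.
        while parent[reach] != reach:
            parent[reach] = parent[parent[reach]]
            reach = parent[reach]
        return reach

    for a, b in neighbors:
        # Assumed similarity: 1 for identical attributes, tending to 0 as they differ.
        similarity = 1.0 / (1.0 + abs(attributes[a] - attributes[b]))
        if similarity >= affinity:
            parent[find(a)] = find(b)  # only adjacent reaches are ever merged

    units = {}
    for reach in attributes:
        units.setdefault(find(reach), []).append(reach)
    return sorted(sorted(unit) for unit in units.values())


# Hypothetical four-reach river segment with one attribute per reach.
example_units = merge_reaches(
    {"r1": 10.0, "r2": 10.5, "r3": 20.0, "r4": 20.2},
    [("r1", "r2"), ("r2", "r3"), ("r3", "r4")],
    affinity=0.5,
)
```

Pairwise merging of this kind can chain dissimilar reaches together through intermediate ones. Affinity-based methods such as CAST instead compare a candidate against the whole cluster, which is one reason this sketch should be read only as an illustration of the inputs.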
Simultaneous alignment and clustering of peptide data using a Gibbs sampling approach.
Andreatta, Massimo; Lund, Ole; Nielsen, Morten
2013-01-01
Proteins recognizing short peptide fragments play a central role in cellular signaling. As a result of high-throughput technologies, peptide-binding protein specificities can be studied using large peptide libraries at dramatically lower cost and time. Interpretation of such large peptide datasets, however, is a complex task, especially when the data contain multiple receptor binding motifs, and/or the motifs are found at different locations within distinct peptides. The algorithm presented in this article, based on Gibbs sampling, identifies multiple specificities in peptide data by performing two essential tasks simultaneously: alignment and clustering of peptide data. We apply the method to de-convolute binding motifs in a panel of peptide datasets with different degrees of complexity spanning from the simplest case of pre-aligned fixed-length peptides to cases of unaligned peptide datasets of variable length. Example applications described in this article include mixtures of binders to different MHC class I and class II alleles, distinct classes of ligands for SH3 domains and sub-specificities of the HLA-A*02:01 molecule. The Gibbs clustering method is available online as a web server at http://www.cbs.dtu.dk/services/GibbsCluster.
Fontes, Cristiano Hora; Budman, Hector
2017-11-01
A clustering problem involving multivariate time series (MTS) requires the selection of similarity metrics. This paper shows the limitations of the PCA similarity factor (SPCA) as a single metric in nonlinear problems where there are differences in magnitude of the same process variables due to expected changes in operation conditions. A novel method for clustering MTS based on a combination between SPCA and the average-based Euclidean distance (AED) within a fuzzy clustering approach is proposed. Case studies involving either simulated or real industrial data collected from a large scale gas turbine are used to illustrate that the hybrid approach enhances the ability to recognize normal and fault operating patterns. This paper also proposes an oversampling procedure to create synthetic multivariate time series that can be useful in commonly occurring situations involving unbalanced data sets. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
A survey for variable young stars with small telescopes: First results from HOYS-CAPS
NASA Astrophysics Data System (ADS)
Froebrich, D.; Campbell-White, J.; Scholz, A.; Eislöffel, J.; Zegmott, T.; Billington, S. J.; Donohoe, J.; Makin, S. V.; Hibbert, R.; Newport, R. J.; Pickard, R.; Quinn, N.; Rodda, T.; Piehler, G.; Shelley, M.; Parkinson, S.; Wiersema, K.; Walton, I.
2018-05-01
Variability in Young Stellar Objects (YSOs) is one of their primary characteristics. Long-term, multi-filter, high-cadence monitoring of large YSO samples is the key to understanding the partly unusual light-curves that many of these objects show. Here we introduce and present the first results of the HOYS-CAPS citizen science project which aims to perform such monitoring for nearby (d < 1 kpc) and young (age < 10 Myr) clusters and star forming regions, visible from the northern hemisphere, with small telescopes. We have identified and characterised 466 variable (413 confirmed young) stars in 8 young, nearby clusters. All sources vary by at least 0.2 mag in V, have been observed at least 15 times in V, R and I in the same night over a period of about 2 yrs and have a Stetson index of larger than 1. This is one of the largest samples of variable YSOs observed over such a time-span and cadence in multiple filters. About two thirds of our sample are classical T-Tauri stars, while the rest are objects with depleted or transition disks. Objects characterised as bursters show by far the highest variability. Dippers and objects whose variability is dominated by occultations from normal interstellar dust or dust with larger grains (or opaque material) have smaller amplitudes. We have established a hierarchical clustering algorithm based on the light-curve properties which allows the identification of the YSOs with the most unusual behaviour, and to group sources with similar properties. We discuss in detail the light-curves of the unusual objects V2492 Cyg, V350 Cep and 2MASS J21383981+5708470.
ERIC Educational Resources Information Center
Meulman, Jacqueline J.; Verboon, Peter
1993-01-01
Points of view analysis, as a way to deal with individual differences in multidimensional scaling, was largely supplanted by the weighted Euclidean model. It is argued that the approach deserves new attention, especially as a technique to analyze group differences. A streamlined and integrated process is proposed. (SLD)
Functional brain segmentation using inter-subject correlation in fMRI.
Kauppi, Jukka-Pekka; Pajula, Juha; Niemi, Jari; Hari, Riitta; Tohka, Jussi
2017-05-01
The human brain continuously processes massive amounts of rich sensory information. To better understand such highly complex brain processes, modern neuroimaging studies are increasingly utilizing experimental setups that better mimic daily-life situations. A new exploratory data-analysis approach, functional segmentation inter-subject correlation analysis (FuSeISC), was proposed to facilitate the analysis of functional magnetic resonance (fMRI) data sets collected in these experiments. The method provides a new type of functional segmentation of brain areas, not only characterizing areas that display similar processing across subjects but also areas in which processing across subjects is highly variable. FuSeISC was tested using fMRI data sets collected during traditional block-design stimuli (37 subjects) as well as naturalistic auditory narratives (19 subjects). The method identified spatially local and/or bilaterally symmetric clusters in several cortical areas, many of which are known to be processing the types of stimuli used in the experiments. The method is not only useful for spatial exploration of large fMRI data sets obtained using naturalistic stimuli, but also has other potential applications, such as generation of functional brain atlases including both lower- and higher-order processing areas. Finally, as a part of FuSeISC, a criterion-based sparsification of the shared nearest-neighbor graph was proposed for detecting clusters in noisy data. In the tests with synthetic data, this technique was superior to well-known clustering methods, such as Ward's method, affinity propagation, and K-means ++. Hum Brain Mapp 38:2643-2665, 2017. © 2017 Wiley Periodicals, Inc.
High- and low-level hierarchical classification algorithm based on source separation process
NASA Astrophysics Data System (ADS)
Loghmari, Mohamed Anis; Karray, Emna; Naceur, Mohamed Saber
2016-10-01
High-dimensional data applications have earned great attention in recent years. We focus on remote sensing data analysis on high-dimensional space like hyperspectral data. From a methodological viewpoint, remote sensing data analysis is not a trivial task. Its complexity is caused by many factors, such as large spectral or spatial variability as well as the curse of dimensionality. The latter describes the problem of data sparseness. In this particular ill-posed problem, a reliable classification approach requires appropriate modeling of the classification process. The proposed approach is based on a hierarchical clustering algorithm in order to deal with remote sensing data in high-dimensional space. Indeed, one obvious method to perform dimensionality reduction is to use the independent component analysis process as a preprocessing step. The first particularity of our method is the special structure of its cluster tree. Most of the hierarchical algorithms associate leaves to individual clusters, and start from a large number of individual classes equal to the number of pixels; however, in our approach, leaves are associated with the most relevant sources which are represented according to mutually independent axes to specifically represent some land covers associated with a limited number of clusters. These sources contribute to the refinement of the clustering by providing complementary rather than redundant information. The second particularity of our approach is that at each level of the cluster tree, we combine both a high-level divisive clustering and a low-level agglomerative clustering. This approach reduces the computational cost since the high-level divisive clustering is controlled by a simple Boolean operator, and optimizes the clustering results since the low-level agglomerative clustering is guided by the most relevant independent sources. 
Then at each new step we obtain a new finer partition that will participate in the clustering process to enhance semantic capabilities and give good identification rates.
[Genetic polymorphism of Tulipa gesneriana L. evaluated on the basis of the ISSR marking data].
Kashin, A S; Kritskaya, T A; Schanzer, I A
2016-10-01
Using the method of ISSR analysis, the genetic diversity of 18 natural populations of Tulipa gesneriana L. from the north of the Lower Volga region was examined. The ten ISSR primers used in the study provided identification of 102 PCR fragments, of which 50 were polymorphic (49.0%). According to the proportion of polymorphic markers, two population groups were distinguished: (1) the populations in which the proportion of polymorphic markers ranged from 0.35 to 0.41; (2) the populations in which the proportion of polymorphic markers ranged from 0.64 to 0.85. UPGMA clustering analysis provided subdivision of the sample into two large clusters. The unrooted tree constructed using the Neighbor Joining algorithm had similar topology. The first cluster included slightly variable populations and the second cluster included highly variable populations. The AMOVA analysis showed statistically significant differences (F CT = 0.430; p = 0.000) between the two groups. Local populations are considerably genetically differentiated from each other (F ST = 0.632) and have almost no links via modern gene flow, as evidenced by the results of the Mantel test (r =–0.118; p = 0.819). It is suggested that the degree of genetic similarities and differences between the populations depends on the time and the species dispersal patterns on these territories.
Structures in the Great Attractor region
NASA Astrophysics Data System (ADS)
Radburn-Smith, D. J.; Lucey, J. R.; Woudt, P. A.; Kraan-Korteweg, R. C.; Watson, F. G.
2006-07-01
To further our understanding of the Great Attractor (GA), we have undertaken a redshift survey using the 2-degree Field (2dF) instrument on the Anglo-Australian Telescope (AAT). Clusters and filaments in the GA region were targeted with 25 separate pointings resulting in approximately 2600 new redshifts. Targets included poorly studied X-ray clusters from the Clusters in the Zone of Avoidance (CIZA) Catalogue as well as the Cen-Crux and PKS 1343-601 clusters, both of which lie close to the classic GA centre. For nine clusters in the region, we report velocity distributions as well as virial and projected mass estimates. The virial mass of CIZA J1324.7-5736, now identified as a separate structure from the Cen-Crux cluster, is found to be ~3 × 10^14 M⊙, in good agreement with the X-ray inferred mass. In the PKS 1343-601 field, five redshifts are measured of which four are new. An analysis of redshifts from this survey, in combination with those from the literature, reveals the dominant structure in the GA region to be a large filament, which appears to extend from Abell S0639 (l = 281°, b = +11°) to (l ~ 5°, b ~ -50°), encompassing the Cen-Crux, CIZA J1324.7-5736, Norma and Pavo II clusters. Behind the Norma cluster at cz ~ 15 000 km s^-1, the masses of four rich clusters are calculated. These clusters (Triangulum Australis, Ara, CIZA J1514.6-4558 and CIZA J1410.4-4246) may contribute to a continued large-scale flow beyond the GA. The results of these observations will be incorporated into a subsequent analysis of the GA flow.
QKD Via a Quantum Wavelength Router Using Spatial Soliton
NASA Astrophysics Data System (ADS)
Kouhnavard, M.; Amiri, I. S.; Afroozeh, A.; Jalil, M. A.; Ali, J.; Yupapin, P. P.
2011-05-01
A system for continuous variable quantum key distribution via a wavelength router is proposed. The Kerr-type nonlinearity of light in the nonlinear microring resonator (NMRR) induces chaotic behavior. In the proposed system, chaotic signals are generated by an optical soliton or Gaussian pulse within an NMRR system. Parameters such as input power, MRR radii and coupling coefficients can be varied and play an important role in determining the results, in which continuous signals are generated spreading over the spectrum. Large bandwidth signals of optical soliton are generated by the input pulse propagating within the MRRs, which is allowed to form the continuous wavelength or frequency with large tunable channel capacity. The continuous variable QKD is formed by using the localized spatial soliton pulses via a quantum router and networks. The selected optical spatial pulse can be used to perform the secure communication network. Here the entangled photon generated by chaotic signals has been analyzed. The continuous entangled photon is generated by incorporating a polarization control unit into the MRRs, as required to provide the continuous variable QKD. The results show that such a system for simultaneous continuous variable quantum cryptography can be used in mobile telephone handsets and networks. In this study, frequency bands of 500 MHz and 2.0 GHz and wavelengths of 775 nm, 2,325 nm and 1.55 μm can be obtained for QKD use with input optical soliton and Gaussian beams, respectively.
An opinion-driven behavioral dynamics model for addictive behaviors
Moore, Thomas W.; Finley, Patrick D.; Apelberg, Benjamin J.; ...
2015-04-08
We present a model of behavioral dynamics that combines a social network-based opinion dynamics model with behavioral mapping. The behavioral component is discrete and history-dependent to represent situations in which an individual’s behavior is initially driven by opinion and later constrained by physiological or psychological conditions that serve to maintain the behavior. Additionally, individuals are modeled as nodes in a social network connected by directed edges. Parameter sweeps illustrate model behavior and the effects of individual parameters and parameter interactions on model results. Mapping a continuous opinion variable into a discrete behavioral space induces clustering on directed networks. Clusters provide targets of opportunity for influencing the network state; however, the smaller the network the greater the stochasticity and potential variability in outcomes. Furthermore, this has implications both for behaviors that are influenced by close relationships versus those influenced by societal norms and for the effectiveness of strategies for influencing those behaviors.
Individualization as Driving Force of Clustering Phenomena in Humans
Mäs, Michael; Flache, Andreas; Helbing, Dirk
2010-01-01
One of the most intriguing dynamics in biological systems is the emergence of clustering, in the sense that individuals self-organize into separate agglomerations in physical or behavioral space. Several theories have been developed to explain clustering in, for instance, multi-cellular organisms, ant colonies, bee hives, flocks of birds, schools of fish, and animal herds. A persistent puzzle, however, is the clustering of opinions in human populations, particularly when opinions vary continuously, such as the degree to which citizens are in favor of or against a vaccination program. Existing continuous opinion formation models predict “monoculture” in the long run, unless subsets of the population are perfectly separated from each other. Yet, social diversity is a robust empirical phenomenon, although perfect separation is hardly possible in an increasingly connected world. Considering randomness has not overcome the theoretical shortcomings so far. Small perturbations of individual opinions trigger social influence cascades that inevitably lead to monoculture, while larger noise disrupts opinion clusters and results in rampant individualism without any social structure. Our solution to the puzzle builds on recent empirical research, combining the integrative tendencies of social influence with the disintegrative effects of individualization. A key element of the new computational model is an adaptive kind of noise. We conduct computer simulation experiments demonstrating that with this kind of noise a third phase besides individualism and monoculture becomes possible, characterized by the formation of metastable clusters with diversity between and consensus within clusters. When clusters are small, individualization tendencies are too weak to prohibit a fusion of clusters. When clusters grow too large, however, individualization increases in strength, which promotes their splitting. In summary, the new model can explain cultural clustering in human societies. 
Strikingly, model predictions are not only robust to “noise”—randomness is actually the central mechanism that sustains pluralism and clustering. PMID:20975937
Ambrosini, Roberto; Cuervo, José Javier; du Feu, Chris; Fiedler, Wolfgang; Musitelli, Federica; Rubolini, Diego; Sicurella, Beatrice; Spina, Fernando; Saino, Nicola; Møller, Anders Pape
2016-05-01
Many partially migratory species show phenotypically divergent populations in terms of migratory behaviour, with climate hypothesized to be a major driver of such variability through its differential effects on sedentary and migratory individuals. Based on long-term (1947-2011) bird ringing data, we analysed phenotypic differentiation of migratory behaviour among populations of the European robin Erithacus rubecula across Europe. We showed that clusters of populations sharing breeding and wintering ranges varied from partial (British Isles and Western Europe, NW cluster) to completely migratory (Scandinavia and north-eastern Europe, NE cluster). Distance migrated by birds of the NE (but not of the NW) cluster decreased through time because of a north-eastwards shift in the wintering grounds. Moreover, when winter temperatures in the breeding areas were cold, individuals from the NE cluster also migrated longer distances, while those of the NW cluster moved over shorter distances. Climatic conditions may therefore affect migratory behaviour of robins, although large geographical variation in response to climate seems to exist. © 2016 The Authors. Journal of Animal Ecology © 2016 British Ecological Society.
[Cardiac risk profile in diabetes mellitus and impaired fasting glucose].
Schaan, Beatriz D'Agord; Harzheim, Erno; Gus, Iseu
2004-08-01
Mortality of diabetic patients is higher than that of the population at large, and mainly results from cardiovascular diseases. The purpose of the present study was to identify the prevalence of cardiovascular risk factors in subjects with diabetes mellitus (DM) or abnormal fasting glucose (FG) in order to guide health actions. A population-based cross-sectional study was carried out in a representative random cluster sample of 1,066 adults (≥20 years) from the urban population of the state of Rio Grande do Sul between 1999 and 2000. A structured questionnaire on coronary risk factors was applied and sociodemographic characteristics of all adults older than 20 years living in the same dwelling were collected. Subjects were clinically evaluated and blood samples were obtained for measuring total cholesterol and fasting glycemia. Statistical analysis was performed using Stata 7 and a 5% significance level was set. Categorical variables were compared by Pearson's chi-square and continuous variables were compared using Student's t-test or ANOVA and multivariate analysis, all controlled for the cluster effect. Of 992 subjects, 12.4% were diabetic and 7.4% had impaired fasting glucose. Among the risk factors evaluated, subjects who presented any kind of glucose homeostasis abnormality had a higher prevalence of obesity (17.8, 29.2 and 35.3% in healthy subjects, impaired fasting glucose and DM, respectively, p<0.001), hypertension (30.1, 56.3 and 50.5% in healthy subjects, impaired fasting glucose and DM, respectively, p<0.001), and hypercholesterolemia (23.2, 35.1 and 39.5% in healthy subjects, impaired fasting glucose and DM, respectively, p=0.01). Subjects with any kind of glucose homeostasis abnormality represent a group that preventive individual and population health policies should target, since they have a higher prevalence of coronary artery disease risk factors.
Innovating Big Data Computing Geoprocessing for Analysis of Engineered-Natural Systems
NASA Astrophysics Data System (ADS)
Rose, K.; Baker, V.; Bauer, J. R.; Vasylkivska, V.
2016-12-01
Big data computing and analytical techniques offer opportunities to improve predictions about subsurface systems while quantifying and characterizing associated uncertainties from these analyses. Spatial analysis, big data and otherwise, of subsurface natural and engineered systems are based on variable resolution, discontinuous, and often point-driven data to represent continuous phenomena. We will present examples from two spatio-temporal methods that have been adapted for use with big datasets and big data geo-processing capabilities. The first approach uses regional earthquake data to evaluate spatio-temporal trends associated with natural and induced seismicity. The second algorithm, the Variable Grid Method (VGM), is a flexible approach that presents spatial trends and patterns, such as those resulting from interpolation methods, while simultaneously visualizing and quantifying uncertainty in the underlying spatial datasets. In this presentation we will show how we are utilizing Hadoop to store and perform spatial analyses to efficiently consume and utilize large geospatial data in these custom analytical algorithms through the development of custom Spark and MapReduce applications that incorporate ESRI Hadoop libraries. The team will present custom `Big Data' geospatial applications that run on the Hadoop cluster and integrate with ESRI ArcMap with the team's probabilistic VGM approach. The VGM-Hadoop tool has been specially built as a multi-step MapReduce application running on the Hadoop cluster for the purpose of data reduction. This reduction is accomplished by generating multi-resolution, non-overlapping, attributed topology that is then further processed using ESRI's geostatistical analyst to convey a probabilistic model of a chosen study region. 
Finally, we will share our approach for implementation of data reduction and topology generation via custom multi-step Hadoop applications, performance benchmarking comparisons, and Hadoop-centric opportunities for greater parallelization of geospatial operations.
Sani-Kast, Nicole; Scheringer, Martin; Slomberg, Danielle; Labille, Jérôme; Praetorius, Antonia; Ollivier, Patrick; Hungerbühler, Konrad
2015-12-01
Engineered nanoparticle (ENP) fate models developed to date - aimed at predicting ENP concentration in the aqueous environment - have limited applicability because they employ constant environmental conditions along the modeled system or a highly specific environmental representation; both approaches do not show the effects of spatial and/or temporal variability. To address this conceptual gap, we developed a novel modeling strategy that: 1) incorporates spatial variability in environmental conditions in an existing ENP fate model; and 2) analyzes the effect of a wide range of randomly sampled environmental conditions (representing variations in water chemistry). This approach was employed to investigate the transport of nano-TiO2 in the Lower Rhône River (France) under numerous sets of environmental conditions. The predicted spatial concentration profiles of nano-TiO2 were then grouped according to their similarity by using cluster analysis. The analysis resulted in a small number of clusters representing groups of spatial concentration profiles. All clusters show nano-TiO2 accumulation in the sediment layer, supporting results from previous studies. Analysis of the characteristic features of each cluster demonstrated a strong association between the water conditions in regions close to the ENP emission source and the cluster membership of the corresponding spatial concentration profiles. In particular, water compositions favoring heteroaggregation between the ENPs and suspended particulate matter resulted in clusters of low variability. These conditions are, therefore, reliable predictors of the eventual fate of the modeled ENPs. The conclusions from this study are also valid for ENP fate in other large river systems. Our results, therefore, shift the focus of future modeling and experimental research of ENP environmental fate to the water characteristic in regions near the expected ENP emission sources. 
Under conditions favoring heteroaggregation in these regions, the fate of the ENPs can be readily predicted. Copyright © 2014 Elsevier B.V. All rights reserved.
Evolution of the early-type galaxy fraction in clusters since z = 0.8
NASA Astrophysics Data System (ADS)
Simard, L.; Clowe, D.; Desai, V.; Dalcanton, J. J.; von der Linden, A.; Poggianti, B. M.; White, S. D. M.; Aragón-Salamanca, A.; De Lucia, G.; Halliday, C.; Jablonka, P.; Milvang-Jensen, B.; Saglia, R. P.; Pelló, R.; Rudnick, G. H.; Zaritsky, D.
2009-12-01
We study the morphological content of a large sample of high-redshift clusters to determine its dependence on cluster mass and redshift. Quantitative morphologies are based on PSF-convolved, 2D bulge+disk decompositions of cluster and field galaxies on deep Very Large Telescope FORS2 images of eighteen, optically-selected galaxy clusters at 0.45 < z < 0.80 observed as part of the ESO Distant Cluster Survey (“EDisCS”). Morphological content is characterized by the early-type galaxy fraction f_et, and early-type galaxies are objectively selected based on their bulge fraction and image smoothness. This quantitative selection is equivalent to selecting galaxies visually classified as E or S0. Changes in early-type fractions as a function of cluster velocity dispersion, redshift and star-formation activity are studied. A set of 158 clusters extracted from the Sloan Digital Sky Survey is analyzed exactly as the distant EDisCS sample to provide a robust local comparison. We also compare our results to a set of clusters from the Millennium Simulation. Our main results are: (1) the early-type fractions of the SDSS and EDisCS clusters exhibit no clear trend as a function of cluster velocity dispersion. (2) Mid-z EDisCS clusters around σ = 500 km s⁻¹ have f_et ≃ 0.5 whereas high-z EDisCS clusters have f_et ≃ 0.4. This represents a ~25% increase over a time interval of 2 Gyr. (3) There is a marked difference in the morphological content of EDisCS and SDSS clusters. None of the EDisCS clusters have early-type galaxy fractions greater than 0.6 whereas half of the SDSS clusters lie above this value. This difference is seen in clusters of all velocity dispersions. (4) There is a strong and clear correlation between morphology and star formation activity in SDSS and EDisCS clusters in the sense that decreasing fractions of [OII] emitters are tracked by increasing early-type fractions.
This correlation holds independent of cluster velocity dispersion and redshift even though the fraction of [OII] emitters decreases from z ~ 0.8 to z ~ 0.06 in all environments. Our results pose an interesting challenge to structural transformation and star formation quenching processes that strongly depend on the global cluster environment (e.g., a dense ICM) and suggest that cluster membership may be of lesser importance than other variables in determining galaxy properties. Based on observations obtained in visitor and service modes at the ESO Very Large Telescope (VLT) as part of the Large Programme 166.A-0162 (the ESO Distant Cluster Survey). Also based on observations made with the NASA/ESA Hubble Space Telescope, obtained at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-26555. These observations are associated with proposal 9476. Support for this proposal was provided by NASA through a grant from the Space Telescope Science Institute. Table [see full text] is only available in electronic form at http://www.aanda.org
CAOS: the nested catchment soil-vegetation-atmosphere observation platform
NASA Astrophysics Data System (ADS)
Weiler, Markus; Blume, Theresa
2016-04-01
Most catchment based observations linking hydrometeorology, ecohydrology, soil hydrology and hydrogeology are typically not integrated with each other and lack a consistent and appropriate spatial-temporal resolution. Within the research network CAOS (Catchments As Organized Systems), we have initiated and developed a novel and integrated observation platform in several catchments in Luxembourg. In 20 nested catchments covering three distinct geologies the subscale processes at the bedrock-soil-vegetation-atmosphere interface are being monitored at 46 sensor cluster locations. Each sensor cluster is designed to observe a variety of different fluxes and state variables above and below ground, in the saturated and unsaturated zone. The numbers of sensors are chosen to capture the spatial variability as well as the average dynamics. At each of these sensor clusters three soil moisture profiles with sensors at different depths, four soil temperature profiles as well as matric potential, air temperature, relative humidity, global radiation, rainfall/throughfall, sapflow and shallow groundwater and stream water levels are measured continuously. In addition, most sensors also measure temperature (water, soil, atmosphere) and electrical conductivity. This setup allows us to determine the local water and energy balance at each of these sites. The discharge gauging sites in the nested catchments are also equipped with automatic water samplers to monitor water quality and water stable isotopes continuously. Furthermore, water temperature and electrical conductivity observations are extended to over 120 locations distributed across the entire stream network to capture the energy exchange between the groundwater, stream water and atmosphere. The measurements at the sensor clusters are complemented by hydrometeorological observations (rain radar, network of disdrometers and dense network of precipitation gauges) and linked with high resolution meteorological models.
In this presentation, we will highlight the potential of this integrated observation platform to estimate energy and water exchange between the terrestrial and aquatic systems and the atmosphere, to trace water flow pathways in the unsaturated and saturated zone, and to understand the organization of processes and fluxes and thus runoff generation at different temporal and spatial scales.
Variability in the Milky Way: Contact Binaries as Diagnostic Tools
NASA Astrophysics Data System (ADS)
de Grijs, R.; Chen, X.; Deng, L.
2017-07-01
We used the 50 cm Binocular Network (50BiN) telescope at Delingha Station (Qinghai Province) of Purple Mountain Observatory (Chinese Academy of Sciences) to obtain simultaneous V- and R-band observations of the old open cluster NGC 188. Our aim was a search for populations of variable stars. We derived light-curve solutions for six W Ursae Majoris (W UMa) eclipsing-binary systems and estimated their orbital parameters. The resulting distance to the W UMas is independent of the physical characteristics of the host cluster. We next determined the current best period-luminosity relations for contact binaries (CBs; scatter σ < 0.10 mag). We conclude that CBs can be used as distance tracers with better than 5% uncertainty. We apply our new relations to the 102 CBs in the Large Magellanic Cloud, which yields a distance modulus of (m-M)_{V,0} = 18.41 ± 0.20 mag.
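As a worked illustration of the final step, a distance modulus μ converts to a distance through the standard relation d = 10^((μ + 5)/5) pc. The sketch below applies that relation to the quoted value and propagates the ±0.20 mag uncertainty to first order. The function names are illustrative and do not come from the paper.

```python
import math

def modulus_to_distance_pc(mu):
    """Distance in parsecs for an extinction-corrected distance modulus mu."""
    return 10 ** ((mu + 5.0) / 5.0)

def distance_error_pc(mu, sigma_mu):
    """First-order propagated uncertainty: sigma_d = d * ln(10) / 5 * sigma_mu."""
    return modulus_to_distance_pc(mu) * math.log(10) / 5.0 * sigma_mu

# Values quoted in the abstract for the Large Magellanic Cloud.
d_kpc = modulus_to_distance_pc(18.41) / 1e3
err_kpc = distance_error_pc(18.41, 0.20) / 1e3
print(f"d = {d_kpc:.1f} +/- {err_kpc:.1f} kpc")
```

With these inputs the distance comes out at roughly 48 kpc, and the 0.20 mag uncertainty corresponds to a fractional distance error of about 9%.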
The Structure of the Young Star Cluster NGC 6231. I. Stellar Population
NASA Astrophysics Data System (ADS)
Kuhn, Michael A.; Medina, Nicolás; Getman, Konstantin V.; Feigelson, Eric D.; Gromadzki, Mariusz; Borissova, Jordanka; Kurtev, Radostin
2017-09-01
NGC 6231 is a young cluster (age ~2-7 Myr) dominating the Sco OB1 association (distance ~1.59 kpc) with ~100 O and B stars and a large pre-main-sequence stellar population. We combine a reanalysis of archival Chandra X-ray data with multiepoch near-infrared (NIR) photometry from the VISTA Variables in the Vía Láctea (VVV) survey and published optical catalogs to obtain a catalog of 2148 probable cluster members. This catalog is 70% larger than previous censuses of probable cluster members in NGC 6231. It includes many low-mass stars detected in the NIR but not in the optical and some B stars without previously noted X-ray counterparts. In addition, we identify 295 NIR variables, about half of which are expected to be pre-main-sequence stars. With the more complete sample, we estimate a total population in the Chandra field of 5700-7500 cluster members down to 0.08 M⊙ (assuming a universal initial mass function) with a completeness limit at 0.5 M⊙. A decrease in stellar X-ray luminosities is noted relative to other younger clusters. However, within the cluster, there is little variation in the distribution of X-ray luminosities for ages less than 5 Myr. The X-ray spectral hardness for B stars may be useful for distinguishing between early-B stars with X-rays generated in stellar winds and B-star systems with X-rays from a pre-main-sequence companion (>35% of B stars). A small fraction of catalog members have unusually high X-ray median energies or reddened NIR colors, which might be explained by absorption from thick or edge-on disks or being background field stars.
Spatial event cluster detection using an approximate normal distribution.
Torabi, Mahmoud; Rosychuk, Rhonda J
2008-12-12
In geographic surveillance of disease, areas with large numbers of disease cases are to be identified so that investigations of the causes of high disease rates can be pursued. Areas with high rates are called disease clusters and statistical cluster detection tests are used to identify geographic areas with higher disease rates than expected by chance alone. Typically cluster detection tests are applied to incident or prevalent cases of disease, but surveillance of disease-related events, where an individual may have multiple events, may also be of interest. Previously, a compound Poisson approach that detects clusters of events by testing individual areas that may be combined with their neighbours has been proposed. However, the relevant probabilities from the compound Poisson distribution are obtained from a recursion relation that can be cumbersome if the number of events is large or analyses by strata are performed. We propose a simpler approach that uses an approximate normal distribution. This method is very easy to implement and is applicable to situations where the population sizes are large and the population distribution by important strata may differ by area. We demonstrate the approach on pediatric self-inflicted injury presentations to emergency departments and compare the results for probabilities based on the recursion and the normal approach. We also implement a Monte Carlo simulation to study the performance of the proposed approach. In a self-inflicted injury data example, the normal approach identifies twelve out of thirteen of the same clusters as the compound Poisson approach, noting that the compound Poisson method detects twelve significant clusters in total. Through simulation studies, the normal approach well approximates the compound Poisson approach for a variety of different population sizes and case and event thresholds.
A drawback of the compound Poisson approach is that the relevant probabilities must be determined through a recursion relation and such calculations can be computationally intensive if the cluster size is relatively large or if analyses are conducted with strata variables. On the other hand, the normal approach is very flexible, easily implemented, and hence, more appealing for users. Moreover, the concepts may be more easily conveyed to non-statisticians interested in understanding the methodology associated with cluster detection test results.
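The trade-off described above can be sketched numerically. The block below assumes a generic compound Poisson sum S = X_1 + ... + X_N with N ~ Poisson(lam), computes its tail probability exactly with the standard Panjer recursion, and compares it with a normal approximation that uses mean lam·E[X] and variance lam·E[X²]. Whether this recursion matches the authors' formulation is an assumption, and the event distribution and case rate are hypothetical.

```python
import math

def compound_poisson_pmf(lam, f, s_max):
    """pmf of S = X_1 + ... + X_N, N ~ Poisson(lam), via Panjer recursion.

    f[j] is P(X = j); this sketch assumes f[0] == 0, i.e. every case
    contributes at least one event.
    """
    p = [math.exp(-lam)] + [0.0] * s_max
    for s in range(1, s_max + 1):
        acc = 0.0
        for j in range(1, min(s, len(f) - 1) + 1):
            acc += j * f[j] * p[s - j]
        p[s] = lam / s * acc
    return p

def exact_tail(lam, f, threshold):
    """P(S >= threshold) from the recursion (threshold >= 1)."""
    return 1.0 - sum(compound_poisson_pmf(lam, f, threshold - 1))

def normal_tail(lam, f, threshold):
    """Normal approximation to P(S >= threshold), with continuity correction."""
    mean = lam * sum(j * fj for j, fj in enumerate(f))
    var = lam * sum(j * j * fj for j, fj in enumerate(f))
    z = (threshold - 0.5 - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

f = [0.0, 0.7, 0.2, 0.1]  # hypothetical events-per-case distribution
lam = 50.0                # hypothetical expected number of cases
print(exact_tail(lam, f, 90), normal_tail(lam, f, 90))
```

The recursion costs on the order of threshold × len(f) operations per area and stratum, while the normal tail is a single closed-form evaluation, which is the practical appeal the abstract points to.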
Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures
ERIC Educational Resources Information Center
Steinley, Douglas; Brusco, Michael J.
2008-01-01
Eight different variable selection techniques for model-based and non-model-based clustering are evaluated across a wide range of cluster structures. It is shown that several methods have difficulties when non-informative variables (i.e., random noise) are included in the model. Furthermore, the distribution of the random noise greatly impacts the…
Statistical significance test for transition matrices of atmospheric Markov chains
NASA Technical Reports Server (NTRS)
Vautard, Robert; Mo, Kingtse C.; Ghil, Michael
1990-01-01
Low-frequency variability of large-scale atmospheric dynamics can be represented schematically by a Markov chain of multiple flow regimes. This Markov chain contains useful information for the long-range forecaster, provided that the statistical significance of the associated transition matrix can be reliably tested. Monte Carlo simulation yields a very reliable significance test for the elements of this matrix. The results of this test agree with previously used empirical formulae when each cluster of maps identified as a distinct flow regime is sufficiently large and when they all contain a comparable number of maps. Monte Carlo simulation provides a more reliable way to test the statistical significance of transitions to and from small clusters. It can determine the most likely transitions, as well as the most unlikely ones, with a prescribed level of statistical significance.
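A Monte Carlo test of this kind can be sketched as follows. The abstract does not spell out the resampling scheme, so the null model here (shuffling the regime sequence, which preserves regime sizes but destroys temporal order) is an assumption, and the regime sequence is synthetic.

```python
import random

def transition_counts(seq, n_states):
    """Count transitions i -> j between consecutive entries of seq."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    return counts

def mc_transition_pvalues(seq, n_states, n_perm=1000, seed=0):
    """Upper-tail Monte Carlo p-value for every transition count.

    Null model (an assumption of this sketch): regime labels are
    exchangeable in time.
    """
    rng = random.Random(seed)
    observed = transition_counts(seq, n_states)
    exceed = [[0] * n_states for _ in range(n_states)]
    shuffled = list(seq)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null = transition_counts(shuffled, n_states)
        for i in range(n_states):
            for j in range(n_states):
                if null[i][j] >= observed[i][j]:
                    exceed[i][j] += 1
    # Add-one correction keeps p-values strictly positive.
    return [[(e + 1) / (n_perm + 1) for e in row] for row in exceed]

# Hypothetical regime sequence with a built-in cycle 0 -> 1 -> 2 -> 0.
seq = [0, 1, 2] * 40
pvals = mc_transition_pvalues(seq, n_states=3)
print(pvals[0][1], pvals[1][0])
```

A small upper-tail p-value flags a transition that occurs more often than chance allows; counting `null[i][j] <= observed[i][j]` instead would flag unusually rare transitions, the "most unlikely ones" mentioned in the abstract.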
Quantitative estimation of time-variable earthquake hazard by using fuzzy set theory
NASA Astrophysics Data System (ADS)
Deyi, Feng; Ichikawa, M.
1989-11-01
In this paper, the various methods of fuzzy set theory, called fuzzy mathematics, have been applied to the quantitative estimation of the time-variable earthquake hazard. The results obtained consist of the following. (1) Quantitative estimation of the earthquake hazard on the basis of seismicity data. By using some methods of fuzzy mathematics, seismicity patterns before large earthquakes can be studied more clearly and more quantitatively, highly active periods in a given region and quiet periods of seismic activity before large earthquakes can be recognized, similarities in temporal variation of seismic activity and seismic gaps can be examined and, on the other hand, the time-variable earthquake hazard can be assessed directly on the basis of a series of statistical indices of seismicity. Two methods of fuzzy clustering analysis, the method of fuzzy similarity, and the direct method of fuzzy pattern recognition, have been studied in particular. One method of fuzzy clustering analysis is based on fuzzy netting, and another is based on the fuzzy equivalent relation. (2) Quantitative estimation of the earthquake hazard on the basis of observational data for different precursors. The direct method of fuzzy pattern recognition has been applied to research on earthquake precursors of different kinds. On the basis of the temporal and spatial characteristics of recognized precursors, earthquake hazards in different terms can be estimated. This paper mainly deals with medium-short-term precursors observed in Japan and China.
Eyler, Lauren; Hubbard, Alan; Juillard, Catherine
2016-10-01
Low and middle-income countries (LMICs) and the world's poor bear a disproportionate share of the global burden of injury. Data regarding disparities in injury are vital to inform injury prevention and trauma systems strengthening interventions targeted towards vulnerable populations, but are limited in LMICs. We aim to facilitate injury disparities research by generating a standardized methodology for assessing economic status in resource-limited country trauma registries where complex metrics such as income, expenditures, and wealth index are infeasible to assess. To address this need, we developed a cluster analysis-based algorithm for generating simple population-specific metrics of economic status using nationally representative Demographic and Health Surveys (DHS) household assets data. For a limited number of variables, g, our algorithm performs weighted k-medoids clustering of the population using all combinations of g asset variables and selects the combination of variables and number of clusters that maximize average silhouette width (ASW). In simulated datasets containing both randomly distributed variables and "true" population clusters defined by correlated categorical variables, the algorithm selected the correct variable combination and appropriate cluster numbers unless variable correlation was very weak. When used with 2011 Cameroonian DHS data, our algorithm identified twenty economic clusters with ASW 0.80, indicating well-defined population clusters. This economic model for assessing health disparities will be used in the new Cameroonian six-hospital centralized trauma registry. By describing our standardized methodology and algorithm for generating economic clustering models, we aim to facilitate measurement of health disparities in other trauma registries in resource-limited countries. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
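The selection criterion named above, average silhouette width (ASW), can be sketched in a few lines. This is only the unweighted scoring step applied to given cluster labels; it is not the authors' weighted k-medoids procedure, and the household asset vectors and the Hamming distance are hypothetical choices for illustration.

```python
def average_silhouette_width(dist, labels):
    """Unweighted ASW for a symmetric distance matrix and cluster labels.

    Requires at least two distinct clusters.
    """
    n = len(labels)
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    total = 0.0
    for i in range(n):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # silhouette of a singleton is defined as 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / n

def hamming(u, v):
    """Number of positions at which two asset vectors differ."""
    return sum(x != y for x, y in zip(u, v))

# Hypothetical households described by three binary asset indicators.
households = [(1, 1, 1), (1, 1, 0), (1, 1, 1), (0, 0, 0), (0, 0, 1), (0, 0, 0)]
dist = [[hamming(u, v) for v in households] for u in households]

good = [0, 0, 0, 1, 1, 1]  # split that follows the first two assets
poor = [0, 1, 0, 1, 0, 1]  # arbitrary split
print(average_silhouette_width(dist, good), average_silhouette_width(dist, poor))
```

In a full search, each candidate combination of g variables and each cluster count would be clustered and scored this way, and the highest-scoring pair kept.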
Statistical sampling of the distribution of uranium deposits using geologic/geographic clusters
Finch, W.I.; Grundy, W.D.; Pierson, C.T.
1992-01-01
The concept of geologic/geographic clusters was developed particularly to study grade and tonnage models for sandstone-type uranium deposits. A cluster is a grouping of mined as well as unmined uranium occurrences within an arbitrary area about 8 km across. A cluster is a statistical sample that will reflect accurately the distribution of uranium in large regions relative to various geologic and geographic features. The example of the Colorado Plateau Uranium Province reveals that only 3 percent of the total number of clusters is in the largest tonnage-size category, greater than 10,000 short tons U3O8, and that 80 percent of the clusters are hosted by Triassic and Jurassic rocks. The distributions of grade and tonnage for clusters in the Powder River Basin show a wide variation; the grade distribution is highly variable, reflecting a difference between roll-front deposits and concretionary deposits, and the Basin contains about half the number in the greater-than-10,000 tonnage-size class as does the Colorado Plateau, even though it is much smaller. The grade and tonnage models should prove useful in finding the richest and largest uranium deposits. © 1992 Oxford University Press.
Path integrals and large deviations in stochastic hybrid systems.
Bressloff, Paul C; Newby, Jay M
2014-04-01
We construct a path-integral representation of solutions to a stochastic hybrid system, consisting of one or more continuous variables evolving according to a piecewise-deterministic dynamics. The differential equations for the continuous variables are coupled to a set of discrete variables that satisfy a continuous-time Markov process, which means that the differential equations are only valid between jumps in the discrete variables. Examples of stochastic hybrid systems arise in biophysical models of stochastic ion channels, motor-driven intracellular transport, gene networks, and stochastic neural networks. We use the path-integral representation to derive a large deviation action principle for a stochastic hybrid system. Minimizing the associated action functional with respect to the set of all trajectories emanating from a metastable state (assuming that such a minimization scheme exists) then determines the most probable paths of escape. Moreover, evaluating the action functional along a most probable path generates the so-called quasipotential used in the calculation of mean first passage times. We illustrate the theory by considering the optimal paths of escape from a metastable state in a bistable neural network.
Liu, Ying; Tang, Yuanman; Qin, Xiyun; Yang, Liang; Jiang, Gaofei; Li, Shili; Ding, Wei
2017-01-01
Ralstonia solanacearum, an agent of bacterial wilt, is a highly variable species with a broad host range and wide geographic distribution. As a species complex, it has extensive genetic diversity and occupies varied environments, such as lowland and highland areas, so more genomes are needed for studying population evolution and environmental adaptation. In this paper, we report the genome sequencing of R. solanacearum strain CQPS-1 isolated from wilted tobacco in Pengshui, Chongqing, China, a highland area with severely acidified soil and continuous cropping of tobacco for more than 20 years. A comparative genomic analysis among different R. solanacearum strains was also performed. The complete genome of CQPS-1 was 5.89 Mb in size and comprised the chromosome (3.83 Mb) and the megaplasmid (2.06 Mb). A total of 5229 coding sequences were predicted (the chromosome and megaplasmid encoded 3573 and 1656 genes, respectively). A comparative analysis with eight strains from four phylotypes showed that there was some variation among the species, e.g., a large set of specific genes in CQPS-1. The type III secretion system gene cluster (hrp gene cluster) was conserved in CQPS-1 compared with the reference strain GMI1000. In addition, most genes coding core type III effectors were also conserved with GMI1000, but significant variation was found in the gene ripAA: its identity with strain GMI1000 was 75%, and the upstream hrpII box promoter was significantly mutated. This study provides a potential resource for further understanding of the relationship between variation of pathogenicity factors and adaptation to the host environment. PMID:28620361
Large amplitude change in spot-induced rotational modulation of the Kepler Ap star KIC 2569073
NASA Astrophysics Data System (ADS)
Drury, Jason A.; Murphy, Simon J.; Derekas, Aliz; Sódor, Ádám; Stello, Dennis; Kuehn, Charles A.; Bedding, Timothy R.; Bognár, Zsófia; Szigeti, László; Szakáts, Róbert; Sárneczky, Krisztián; Molnár, László
2017-11-01
An investigation of the 200 × 200 pixel 'superstamp' images of the centres of the open clusters NGC 6791 and NGC 6819 allows for the identification and study of many variable stars that were not included in the Kepler target list. KIC 2569073 (V = 14.22) is a particularly interesting variable Ap star that we discovered in the NGC 6791 superstamp. With a rotational period of 14.67 d and 0.034 mag variability, it has one of the largest peak-to-peak variations of any known Ap star. Colour photometry reveals an antiphase correlation between the B band, and the V, R and I bands. This Ap star is a rotational variable, also known as an α2 CVn star, and is one of only a handful of Ap stars observed by Kepler. While no change in spot period or amplitude is observed within the 4 yr Kepler time series, the amplitude shows a large increase compared to ground-based photometry obtained two decades ago.
NASA Astrophysics Data System (ADS)
Saxe, Samuel; Hogue, Terri S.; Hay, Lauren
2018-02-01
This research investigates the impact of wildfires on watershed flow regimes, specifically focusing on evaluation of fire events within specified hydroclimatic regions in the western United States, and evaluating the impact of climate and geophysical variables on response. Eighty-two watersheds were identified with at least 10 years of continuous pre-fire daily streamflow records and 5 years of continuous post-fire daily flow records. Percent change in annual runoff ratio, low flows, high flows, peak flows, number of zero flow days, baseflow index, and Richards-Baker flashiness index were calculated for each watershed using pre- and post-fire periods. Independent variables were identified for each watershed and fire event, including topographic, vegetation, climate, burn severity, percent area burned, and soils data. Results show that low flows, high flows, and peak flows increase in the first 2 years following a wildfire and decrease over time. Relative response was used to scale response variables with the respective percent area of watershed burned in order to compare regional differences in watershed response. To account for variability in precipitation events, runoff ratio was used to compare runoff directly to PRISM precipitation estimates. To account for regional differences in climate patterns, watersheds were divided into nine regions, or clusters, through k-means clustering using climate data, and regression models were produced for watersheds grouped by total area burned. Watersheds in Cluster 9 (eastern California, western Nevada, Oregon) demonstrate a small negative response to observed flow regimes after fire. Cluster 8 watersheds (coastal California) display the greatest flow responses, typically within the first year following wildfire. Most other watersheds show a positive mean relative response. 
In addition, simple regression models show low correlation between percent watershed burned and streamflow response, implying that other watershed factors strongly influence response. Spearman correlation identified NDVI, aridity index, percent of a watershed's precipitation that falls as rain, and slope as being positively correlated with post-fire streamflow response. This metric also suggested a negative correlation between response and the soil erodibility factor, watershed area, and percent low burn severity. Regression models identified only moderate burn severity and watershed area as being consistently positively/negatively correlated, respectively, with response. The random forest model identified only slope and percent area burned as significant watershed parameters controlling response. Results will help inform post-fire runoff management decisions by helping to identify expected changes to flow regimes, as well as facilitate parameterization for model application in burned watersheds.
Pascual-García, Alberto; Abia, David; Ortiz, Angel R; Bastolla, Ugo
2009-03-01
Structural classifications of proteins assume the existence of the fold, which is an intrinsic equivalence class of protein domains. Here, we test in which conditions such an equivalence class is compatible with objective similarity measures. We base our analysis on the transitive property of the equivalence relationship, requiring that similarity of A with B and B with C implies that A and C are also similar. Divergent gene evolution leads us to expect that the transitive property should approximately hold. However, if protein domains are a combination of recurrent short polypeptide fragments, as proposed by several authors, then similarity of partial fragments may violate the transitive property, favouring the continuous view of the protein structure space. We propose a measure to quantify the violations of the transitive property when a clustering algorithm joins elements into clusters, and we find out that such violations present a well defined and detectable cross-over point, from an approximately transitive regime at high structure similarity to a regime with large transitivity violations and large differences in length at low similarity. We argue that protein structure space is discrete and hierarchic classification is justified up to this cross-over point, whereas at lower similarities the structure space is continuous and it should be represented as a network. We have tested the qualitative behaviour of this measure, varying all the choices involved in the automatic classification procedure, i.e., domain decomposition, alignment algorithm, similarity score, and clustering algorithm, and we have found out that this behaviour is quite robust. The final classification depends on the chosen algorithms. We used the values of the clustering coefficient and the transitivity violations to select the optimal choices among those that we tested. Interestingly, this criterion also favours the agreement between automatic and expert classifications. 
As a domain set, we have selected a consensus set of 2,890 domains decomposed very similarly in SCOP and CATH. As an alignment algorithm, we used a global version of MAMMOTH developed in our group, which is both rapid and accurate. As a similarity measure, we used the size-normalized contact overlap, and as a clustering algorithm, we used average linkage. The resulting automatic classification at the cross-over point was more consistent than expert ones with respect to the structure similarity measure, with 86% of the clusters corresponding to subsets of either SCOP or CATH superfamilies and fewer than 5% containing domains in distinct folds according to both SCOP and CATH. Almost 15% of SCOP superfamilies and 10% of CATH superfamilies were split, consistent with the notion of fold change in protein evolution. These results were qualitatively robust for all choices that we tested, although we did not try to use alignment algorithms developed by other groups. Folds defined in SCOP and CATH would be completely joined in the regime of large transitivity violations where clustering is more arbitrary. Consistently, the agreement between SCOP and CATH at fold level was lower than their agreement with the automatic classification obtained using as a clustering algorithm, respectively, average linkage (for SCOP) or single linkage (for CATH). The networks representing significant evolutionary and structural relationships between clusters beyond the cross-over point may allow us to perform evolutionary, structural, or functional analyses beyond the limits of classification schemes. These networks and the underlying clusters are available at http://ub.cbm.uam.es/research/ProtNet.php.
Roushangar, Kiyoumars; Alizadeh, Farhad; Adamowski, Jan
2018-08-01
Understanding precipitation on a regional basis is an important component of water resources planning and management. The present study outlines a methodology based on continuous wavelet transform (CWT) and multiscale entropy (CWME), combined with self-organizing map (SOM) and k-means clustering techniques, to measure and analyze the complexity of precipitation. Historical monthly precipitation data from 1960 to 2010 at 31 rain gauges across Iran were preprocessed by CWT. The multi-resolution CWT approach segregated the major features of the original precipitation series by unfolding the structure of the time series which was often ambiguous. The entropy concept was then applied to components obtained from CWT to measure dispersion, uncertainty, disorder, and diversification of subcomponents. Based on different validity indices, k-means clustering captured homogenous areas more accurately, and additional analysis was performed based on the outcome of this approach. The 31 rain gauges in this study were clustered into 6 groups, each one having a unique CWME pattern across different time scales. The results of clustering showed that hydrologic similarity (multiscale variation of precipitation) was not based on geographic contiguity. According to the pattern of entropy across the scales, each cluster was assigned an entropy signature that provided an estimation of the entropy pattern of precipitation data in each cluster. Based on the pattern of mean CWME for each cluster, a characteristic signature was assigned, which provided an estimation of the CWME of a cluster across scales of 1-2, 3-8, and 9-13 months relative to other stations. The validity of the homogeneous clusters demonstrated the usefulness of the proposed approach to regionalize precipitation. 
Further analysis based on wavelet coherence (WTC) was performed by selecting central rain gauges in each cluster and analyzing them against temperature, wind, the Multivariate ENSO Index (MEI), and the East Atlantic (EA) and North Atlantic Oscillation (NAO) indices. The results revealed that all climatic features except NAO influenced precipitation in Iran during the 1960-2010 period. Copyright © 2018 Elsevier Inc. All rights reserved.
Farmer, Jocelyn R; Ong, Mei-Sing; Barmettler, Sara; Yonker, Lael M; Fuleihan, Ramsay; Sullivan, Kathleen E; Cunningham-Rundles, Charlotte; Walter, Jolan E
2017-01-01
Common variable immunodeficiency (CVID) is increasingly recognized for its association with autoimmune and inflammatory complications. Despite recent advances in immunophenotypic and genetic discovery, clinical care of CVID remains limited by our inability to accurately model risk for non-infectious disease development. Herein, we demonstrate the utility of unbiased network clustering as a novel method to analyze inter-relationships between non-infectious disease outcomes in CVID using databases at the United States Immunodeficiency Network (USIDNET), the centralized immunodeficiency registry of the United States, and Partners, a tertiary care network in Boston, MA, USA, with a shared electronic medical record amenable to natural language processing. Immunophenotypes were comparable in terms of native antibody deficiencies, low titer response to pneumococcus, and B cell maturation arrest. However, recorded non-infectious disease outcomes were more substantial in the Partners cohort across the spectrum of lymphoproliferation, cytopenias, autoimmunity, atopy, and malignancy. Using unbiased network clustering to analyze 34 non-infectious disease outcomes in the Partners cohort, we further identified unique patterns of lymphoproliferative (two clusters), autoimmune (two clusters), and atopic (one cluster) disease that were defined as CVID non-infectious endotypes according to discrete and non-overlapping immunophenotypes. Markers were both previously described {high serum IgE in the atopic cluster [odds ratio (OR) 6.5] and low class-switched memory B cells in the total lymphoproliferative cluster (OR 9.2)} and novel [low serum C3 in the total lymphoproliferative cluster (OR 5.1)]. Mortality risk in the Partners cohort was significantly associated with individual non-infectious disease outcomes as well as lymphoproliferative cluster 2, specifically (OR 5.9). In contrast, unbiased network clustering failed to associate known comorbidities in the adult USIDNET cohort. 
Together, these data suggest that unbiased network clustering can be used in CVID to redefine non-infectious disease inter-relationships; however, applicability may be limited to datasets well annotated through mechanisms such as natural language processing. The lymphoproliferative, autoimmune, and atopic Partners CVID endotypes herein described can be used moving forward to streamline genetic and biomarker discovery and to facilitate early screening and intervention in CVID patients at highest risk for autoimmune and inflammatory progression.
Delva, Wim; Helleringer, Stéphane
2016-01-01
Introduction Concerns about risk compensation—increased risk behaviours in response to a perception of reduced HIV transmission risk—after the initiation of ART have largely been dispelled in empirical studies, but other changes in sexual networking patterns may still modify the effects of ART on HIV incidence. Methods We developed an exploratory mathematical model of HIV transmission that incorporates the possibility of ART clusters, i.e. subsets of the sexual network in which the density of ART patients is much higher than in the rest of the network. Such clusters may emerge as a result of ART homophily—a tendency for ART patients to preferentially form and maintain relationships with other ART patients. We assessed whether ART clusters may affect the impact of ART on HIV incidence, and how the influence of this effect-modifying variable depends on contextual variables such as HIV prevalence, HIV serosorting, coverage of HIV testing and ART, and adherence to ART. Results ART homophily can modify the impact of ART on HIV incidence in both directions. In concentrated epidemics and generalized epidemics with moderate HIV prevalence (≈ 10%), ART clusters can enhance the impact of ART on HIV incidence, especially when adherence to ART is poor. In hyperendemic settings (≈ 35% HIV prevalence), ART clusters can reduce the impact of ART on HIV incidence when adherence to ART is high but few people living with HIV (PLWH) have been diagnosed. In all contexts, the effects of ART clusters on HIV epidemic dynamics are distinct from those of HIV serosorting. Conclusions Depending on the programmatic and epidemiological context, ART clusters may enhance or reduce the impact of ART on HIV incidence, in contrast to serosorting, which always leads to a lower impact of ART on HIV incidence. ART homophily and the emergence of ART clusters should be measured empirically and incorporated into more refined models used to plan and evaluate ART programmes. PMID:27657492
False Discovery Control in Large-Scale Spatial Multiple Testing
Sun, Wenguang; Reich, Brian J.; Cai, T. Tony; Guindani, Michele; Schwartzman, Armin
2014-01-01
Summary This article develops a unified theoretical and computational framework for false discovery control in multiple testing of spatial signals. We consider both point-wise and cluster-wise spatial analyses, and derive oracle procedures which optimally control the false discovery rate, false discovery exceedance and false cluster rate, respectively. A data-driven finite approximation strategy is developed to mimic the oracle procedures on a continuous spatial domain. Our multiple testing procedures are asymptotically valid and can be effectively implemented using Bayesian computational algorithms for analysis of large spatial data sets. Numerical results show that the proposed procedures lead to more accurate error control and better power performance than conventional methods. We demonstrate our methods for analyzing the time trends in tropospheric ozone in eastern US. PMID:25642138
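For orientation, a conventional baseline for false discovery rate control is the Benjamini-Hochberg step-up rule. The sketch below implements that baseline only; it is not the oracle or data-driven spatial procedure of the article, and the function name and inputs are illustrative assumptions.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Classical Benjamini-Hochberg step-up rule at nominal FDR level q.
    This is a generic baseline, not the spatial procedure of the article.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

The rule treats the tests as exchangeable and ignores where they sit in space, which is the kind of structure the spatial procedures described above are designed to exploit.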
A Detailed Survey of Pulsating Variables in Five Globular Clusters (Abstract)
NASA Astrophysics Data System (ADS)
Murphy, B. W.
2016-12-01
(Abstract only) Globular clusters are ideal laboratories for conducting a stellar census. Of particular interest are pulsating variables, which provide astronomers with a tool to probe the properties of the stars and the cluster. We observed each of five globular clusters hundreds to thousands of times over a time span ranging from 2 to 4 years in B, V, and I filters using the SARA 0.6-meter telescope located at Cerro Tololo Inter-American Observatory and the 0.9-meter telescope located at Kitt Peak, Arizona. The images were analyzed using difference image analysis to identify and produce light curves of all variables found in each cluster. In total we identified 377 variables, 140 of them newly discovered, increasing the number of known variable stars in these clusters by 60%. Of the total we have identified 319 RR Lyrae variables (193 RR0, 18 RR01, 101 RR1, 7 RR2), 9 SX Phe stars, 5 Cepheid variables, 11 eclipsing variables, and 33 long period variables. For IC4499 we identified 64 RR0, 18 RR01, 14 RR1, 4 RR2, 1 SX Phe, 1 eclipsing binary, and 2 long period variables. For NGC4833 we identified 10 RR0, 7 RR1, 3 RR2, 6 SX Phe, 5 eclipsing binaries, and 9 long period variables. For NGC6171 (M107) we identified 14 RR0, 7 RR1, and 1 SX Phe. For NGC6402 (M14) we identified 55 RR0, 57 RR1, 1 RR2, 1 SX Phe, 6 Cepheids, 1 eclipsing binary, and 15 long period variables. For NGC6584 we identified 50 RR0, 16 RR1, 4 eclipsing binaries, and 7 long period variables. From our extensive data set we were able to obtain sufficient temporal and complete phase coverage of the RR Lyrae variables. This has allowed us not only to properly classify each of the RR Lyrae variables but also to use Fourier decomposition of the B, V, and I light curves to further analyze the properties of the variable stars and hence the physical properties of each globular cluster.
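The abstract does not state the parameterization used for the Fourier decomposition. A commonly used form for RR Lyrae light curves is sketched below; the symbols (period P, epoch E, number of harmonics N) are assumptions introduced here for illustration.

```latex
m(t) = A_0 + \sum_{k=1}^{N} A_k \cos\!\left( \frac{2\pi k\,(t - E)}{P} + \phi_k \right),
\qquad
R_{k1} = \frac{A_k}{A_1},
\qquad
\phi_{k1} = \phi_k - k\,\phi_1 .
```

In this convention the amplitude ratios R_{k1} and phase differences \phi_{k1} describe the shape of the light curve independently of its mean magnitude and zero point in time, which is why such parameters are typically the quantities carried into empirical calibrations of stellar properties.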
A Multivariate Model and Analysis of Competitive Strategy in the U.S. Hardwood Lumber Industry
Robert J. Bush; Steven A. Sinclair
1991-01-01
Business-level competitive strategy in the hardwood lumber industry was modeled through the identification of strategic groups among large U.S. hardwood lumber producers. Strategy was operationalized using a measure based on the variables developed by Dess and Davis (1984). Factor and cluster analyses were used to define strategic groups along the dimensions of cost...
Ritenberga, Olga; Sofiev, Mikhail; Siljamo, Pilvi; Saarto, Annika; Dahl, Aslog; Ekebom, Agneta; Sauliene, Ingrida; Shalaboda, Valentina; Severova, Elena; Hoebeke, Lucie; Ramfjord, Hallvard
2018-02-15
The paper suggests a methodology for predicting next-year seasonal pollen index (SPI, a sum of daily-mean pollen concentrations) over large regions and demonstrates its performance for birch in Northern and North-Eastern Europe. A statistical model is constructed using meteorological, geophysical and biological characteristics of the previous year. A cluster analysis of multi-annual data of the European Aeroallergen Network (EAN) revealed several large regions in Europe where the observed SPI exhibits similar patterns of multi-annual variability. We built the model for the northern cluster of stations, which covers Finland, Sweden, the Baltic States, part of Belarus, and, probably, Russia and Norway, where the lack of data did not allow for conclusive analysis. The constructed model predicted the SPI with a correlation coefficient of up to 0.9 for some stations; the odds ratio is infinitely high for 50% of sites inside the region, and the fraction of predictions falling within a factor of 2 of the observations stays within 40-70%. In particular, the model successfully reproduced both the bi-annual cycle of the SPI and the years when this cycle breaks down. Copyright © 2017 Elsevier B.V. All rights reserved.
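The "fraction within a factor of 2" score quoted above can be computed as in the following minimal sketch. The function name and the sample numbers are hypothetical, and the paper may treat edge cases (for example zero observations) differently.

```python
def fraction_within_factor_two(predicted, observed):
    """Fraction of pairs with 0.5 <= predicted / observed <= 2.

    Pairs with a non-positive observation are skipped, which is an
    assumption made here to keep the ratio well defined.
    """
    hits = 0
    total = 0
    for pred, obs in zip(predicted, observed):
        if obs <= 0:
            continue  # ratio undefined; skipped by assumption
        total += 1
        if 0.5 <= pred / obs <= 2.0:
            hits += 1
    if total == 0:
        raise ValueError("no valid prediction/observation pairs")
    return hits / total


if __name__ == "__main__":
    # Hypothetical seasonal pollen index values, not data from the paper.
    obs = [100, 200, 400, 50]
    pred = [150, 90, 400, 120]
    print(fraction_within_factor_two(pred, obs))
```

Unlike a correlation coefficient, this score is insensitive to how badly the misses are wrong, so the two are usually reported together.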
Zhang, Miao; Bommer, Martin; Chatterjee, Ruchira; Hussein, Rana; Yano, Junko; Dau, Holger; Kern, Jan; Dobbek, Holger; Zouni, Athina
2017-07-18
In plants, algae and cyanobacteria, Photosystem II (PSII) catalyzes the light-driven splitting of water at a protein-bound Mn4CaO5-cluster, the water-oxidizing complex (WOC). In the photosynthetic organisms, the light-driven formation of the WOC from dissolved metal ions is a key process because it is essential in both initial activation and continuous repair of PSII. Structural information is required for understanding of this chaperone-free metal-cluster assembly. For the first time, we obtained a structure of PSII from Thermosynechococcus elongatus without the Mn4CaO5-cluster. Surprisingly, cluster-removal leaves the positions of all coordinating amino acid residues and most nearby water molecules largely unaffected, resulting in a pre-organized ligand shell for kinetically competent and error-free photo-assembly of the Mn4CaO5-cluster. First experiments initiating (i) partial disassembly and (ii) partial re-assembly after complete depletion of the Mn4CaO5-cluster agree with a specific bi-manganese cluster, likely a di-µ-oxo bridged pair of Mn(III) ions, as an assembly intermediate.
A non-voxel-based broad-beam (NVBB) framework for IMRT treatment planning.
Lu, Weiguo
2010-12-07
We present a novel framework that enables very large scale intensity-modulated radiation therapy (IMRT) planning with limited computational resources and with improvements in cost, plan quality and planning throughput. Current IMRT optimization uses a voxel-based beamlet superposition (VBS) framework that requires pre-calculation and storage of a large amount of beamlet data, resulting in large temporal and spatial complexity. We developed a non-voxel-based broad-beam (NVBB) framework for IMRT capable of direct treatment parameter optimization (DTPO). In this framework, both objective function and derivative are evaluated based on the continuous viewpoint, abandoning 'voxel' and 'beamlet' representations. Thus pre-calculation and storage of beamlets are no longer needed. The NVBB framework has linear complexities (O(N^3)) in both space and time. The low memory footprint, full computation and data parallelization of the framework make it well suited to efficient implementation on the graphics processing unit (GPU). We implemented the NVBB framework and incorporated it with the TomoTherapy treatment planning system (TPS). The new TPS runs on a single workstation with one GPU card (NVBB-GPU). Extensive verification/validation tests were performed in house and via third parties. Benchmarks on dose accuracy, plan quality and throughput were compared with the commercial TomoTherapy TPS that is based on the VBS framework and uses a computer cluster with 14 nodes (VBS-cluster). For all tests, the dose accuracy of these two TPSs is comparable (within 1%). Plan qualities were comparable with no clinically significant difference for most cases except that superior target uniformity was seen in the NVBB-GPU for some cases. However, the planning time using the NVBB-GPU was reduced manyfold relative to the VBS-cluster. In conclusion, we developed a novel NVBB framework for IMRT optimization. 
The continuous viewpoint and DTPO nature of the algorithm eliminate the need for beamlets and lead to better plan quality. The computation parallelization on a GPU instead of a computer cluster significantly reduces hardware and service costs. Compared with using the current VBS framework on a computer cluster, the planning time is significantly reduced using the NVBB framework on a single workstation with a GPU card.
Coon, Andrew; Carson, Robert; Debes, Paul V.
2016-01-01
The study of population differentiation in the context of ecological speciation is commonly assessed using populations with obvious discreteness. Fewer studies have examined diversifying populations with occasional adaptive variation and minor reproductive isolation, so factors impeding or facilitating the progress of early stage differentiation are less understood. We detected non-random genetic structuring in lake trout (Salvelinus namaycush) inhabiting a large, pristine, postglacial lake (Mistassini Lake, Canada), with up to five discernible genetic clusters having distinctions in body shape, size, colouration and head shape. However, genetic differentiation was low (FST = 0.017) and genetic clustering was largely incongruent between several population- and individual-based clustering approaches. Genotype- and phenotype-environment associations with spatial habitat, depth and fish community structure (competitors and prey) were either inconsistent or weak. Striking morphological variation was often more continuous within than among defined genetic clusters. Low genetic differentiation was a consequence of relatively high contemporary gene flow despite large effective population sizes, not migration-drift disequilibrium. Our results suggest a highly plastic propensity for occupying multiple habitat niches in lake trout and a low cost of morphological plasticity, which may constrain the speed and extent of adaptive divergence. We discuss how factors relating to niche conservatism in this species may also influence how plasticity affects adaptive divergence, even where ample ecological opportunity apparently exists. PMID:27680019
Variable Circumstellar Disks of Classical Be Stars in Clusters
NASA Astrophysics Data System (ADS)
Gerhartz, C.; Bjorkman, K. S.; Bjorkman, J. E.; Wisniewski, J. P.
2016-11-01
Circumstellar disks are common among many stars, at most spectral types, and at different stages of their lifetimes. Among the near-main-sequence classical Be stars, there is growing evidence that these disks form, dissipate, and reform on timescales that differ from star to star. Using data obtained with the Large Monolithic Imager (LMI) at the Lowell Observatory Discovery Channel Telescope (DCT), along with additional complementary data obtained at the University of Toledo Ritter Observatory (RO), we have begun a long-term monitoring project of a well-studied set of galactic star clusters that are known to contain Be stars. Our goal is to develop a statistically significant sample of variable circumstellar disk systems over multiple timescales. With a robust multi-epoch study we can determine the relative fraction of Be stars that exhibit disk-loss or disk-renewal phases, and investigate the range of timescales over which these events occur. A larger sample will improve our understanding of the prevalence and nature of the disk variability, and may provide insight about underlying physical mechanisms.
A comparison of regional flood frequency analysis approaches in a simulation framework
NASA Astrophysics Data System (ADS)
Ganora, D.; Laio, F.
2016-07-01
Regional frequency analysis (RFA) is a well-established methodology to provide an estimate of the flood frequency curve at ungauged (or scarcely gauged) sites. Different RFA approaches exist, depending on the way the information is transferred to the site of interest, but it is not clear in the literature if a specific method systematically outperforms the others. The aim of this study is to provide a framework in which to carry out the intercomparison by building up a virtual environment based on synthetically generated data. The considered regional approaches include: (i) a unique regional curve for the whole region; (ii) a multiple-region model where homogeneous subregions are determined through cluster analysis; (iii) a Region-of-Influence model which defines a homogeneous subregion for each site; (iv) a spatially smooth estimation procedure where the parameters of the regional model vary continuously in space. Virtual environments are generated considering different patterns of heterogeneity, including step change and smooth variations. If the region is heterogeneous, with the parent distribution changing continuously within the region, the spatially smooth regional approach outperforms the others, with overall errors 10-50% lower than the other methods. In the case of a step-change, the spatially smooth and clustering procedures perform similarly if the heterogeneity is moderate, while clustering procedures work better when the step-change is severe. To extend our findings, an extensive sensitivity analysis has been performed to investigate the effect of sample length, number of virtual stations, return period of the predicted quantile, variability of the scale parameter of the parent distribution, number of predictor variables and different parent distributions. 
Overall, the spatially smooth approach appears to be the most robust, as its performance is more stable across different patterns of heterogeneity, especially when short records are considered.
Minati, Ludovico
2014-12-01
In this paper, experimental evidence of multiple synchronization phenomena in a large (n = 30) ring of chaotic oscillators is presented. Each node consists of an elementary circuit, generating spikes of irregular amplitude and comprising one bipolar junction transistor, one capacitor, two inductors, and one biasing resistor. The nodes are mutually coupled to their neighbours via additional variable resistors. As coupling resistance is decreased, phase synchronization followed by complete synchronization is observed, and onset of synchronization is associated with partial synchronization, i.e., emergence of communities (clusters). While component tolerances affect community structure, the general synchronization properties are maintained across three prototypes and in numerical simulations. The clusters are destroyed by adding long distance connections with distant nodes, but are otherwise relatively stable with respect to structural connectivity changes. The study provides evidence that several fundamental synchronization phenomena can be reliably observed in a network of elementary single-transistor oscillators, demonstrating their generative potential and opening the way to potential applications of this undemanding setup in experimental modelling of the relationship between network structure, synchronization, and dynamical properties.
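One generic way to quantify phase synchronization across a set of oscillators is the Kuramoto order parameter, sketched below. The abstract does not say which synchronization measure was used, so this is an illustrative stand-in; the function name is hypothetical and the phases are assumed to have been extracted from the node signals beforehand.

```python
import cmath
import math


def order_parameter(phases):
    """Kuramoto order parameter r = |mean(exp(i * theta))|.

    r is 1 when all phases coincide and close to 0 when the phases are
    spread evenly around the circle.
    """
    if not phases:
        raise ValueError("at least one phase is required")
    mean_vector = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(mean_vector)


if __name__ == "__main__":
    n = 30  # ring size quoted in the abstract
    locked = [0.3] * n                                   # identical phases
    spread = [2 * math.pi * k / n for k in range(n)]     # evenly spread phases
    print(order_parameter(locked), order_parameter(spread))
```

Evaluating the same quantity on subsets of neighbouring nodes rather than on the whole ring is one simple way to expose partially synchronized communities.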
Rennard, Stephen I; Locantore, Nicholas; Delafont, Bruno; Tal-Singer, Ruth; Silverman, Edwin K; Vestbo, Jørgen; Miller, Bruce E; Bakke, Per; Celli, Bartolomé; Calverley, Peter M A; Coxson, Harvey; Crim, Courtney; Edwards, Lisa D; Lomas, David A; MacNee, William; Wouters, Emiel F M; Yates, Julie C; Coca, Ignacio; Agustí, Alvar
2015-03-01
Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease that likely includes clinically relevant subgroups. To identify subgroups of COPD in ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) subjects using cluster analysis and to assess clinically meaningful outcomes of the clusters during 3 years of longitudinal follow-up. Factor analysis was used to reduce 41 variables determined at recruitment in 2,164 patients with COPD to 13 main factors, and the variables with the highest loading were used for cluster analysis. Clusters were evaluated for their relationship with clinically meaningful outcomes during 3 years of follow-up. The relationships among clinical parameters were evaluated within clusters. Five subgroups were distinguished using cross-sectional clinical features. These groups differed regarding outcomes. Cluster A included patients with milder disease and had fewer deaths and hospitalizations. Cluster B had less systemic inflammation at baseline but had notable changes in health status and emphysema extent. Cluster C had many comorbidities, evidence of systemic inflammation, and the highest mortality. Cluster D had low FEV1, severe emphysema, and the highest exacerbation and COPD hospitalization rate. Cluster E was intermediate for most variables and may represent a mixed group that includes further clusters. The relationships among clinical variables within clusters differed from that in the entire COPD population. Cluster analysis using baseline data in ECLIPSE identified five COPD subgroups that differ in outcomes and inflammatory biomarkers and show different relationships between clinical parameters, suggesting the clusters represent clinically and biologically different subtypes of COPD.
Predictions of a population of cataclysmic variables in globular clusters
NASA Technical Reports Server (NTRS)
Di Stefano, R.; Rappaport, S.
1994-01-01
We have studied the number of cataclysmic variables (CVs) that should be active in globular clusters during the present epoch as a result of binary formation via two-body tidal capture. We predict the orbital period and luminosity distributions of CVs in globular clusters. The results are based on Monte Carlo simulations combined with evolution calculations appropriate to each system formed during the lifetime of two specific globular clusters, omega Cen and 47 Tuc. From our study of these two clusters, which represent the range of core densities and states of mass segregation that are likely to be interesting, we extrapolate our results to the Galactic globular cluster system. Although there is at present little direct observational evidence of CVs in globular clusters, we find that there should be a large number of active systems. We predict that there should be more than approximately 100 CVs in both 47 Tuc and omega Cen and several thousand in the Galactic globular cluster system. These numbers are based on two-body processes alone and represent a lower bound on the number of systems that may have been formed as a result of stellar interaction within globular clusters. The relation between these calculations and the paucity of optically detected CVs in globular clusters is discussed. Should future observations fail to find convincing evidence of a substantial population of cluster CVs, then the two-body tidal capture scenario is likely to be seriously constrained. Of the CVs we expect in 47 Tuc and omega Cen, approximately 45 and 20, respectively, should have accretion luminosities above $10^{33}$ ergs/s. If one utilizes a relation for converting accretion luminosity to hard X-ray luminosity that is based on observations of Galactic plane CVs, even these sources will not exhibit X-ray luminosities above $10^{33}$ ergs/s. 
While we cannot account directly for the most luminous subset of the low-luminosity globular cluster X-ray sources without assuming an evolutionary pattern that is different from that of the majority of CVs in the disk, we are able to account for all of the observed lower luminosity subset of these sources, many of which have been recently discovered through ROSAT observations. In order for our predicted integrated cluster X-ray luminosities to be consistent with observational upper limits, the relation between accretion and X-ray luminosities should be something like that inferred from the Galactic plane population of CVs. Our calculations predict a large number of systems with $L_{\mathrm{acc}} < 10^{32}$ ergs/s. Although our calculations imply that globular clusters should have an enhancement of CVs relative to the number thought to be present in the Galactic disk, this enhancement is at most roughly an order of magnitude, not comparable to the factor of approximately 100 for low-mass X-ray binaries (LMXBs).
NASA Technical Reports Server (NTRS)
Fabbiano, G.
1995-01-01
X-ray studies of galaxies by the Smithsonian Astrophysical Observatory (SAO) and MIT are described. Activities at SAO include ROSAT PSPC x-ray data reduction and analysis pipeline; x-ray sources in nearby Sc galaxies; optical, x-ray, and radio study of ongoing galactic merger; a radio, far infrared, optical, and x-ray study of the Sc galaxy NGC247; and a multiparametric analysis of the Einstein sample of early-type galaxies. Activities at MIT included continued analysis of observations with ROSAT and ASCA, and continued development of new approaches to spectral analysis with ASCA and AXAF. Also, a new method for characterizing structure in galactic clusters was developed and applied to ROSAT images of a large sample of clusters. An appendix contains preprints generated by the research.
Yao, Shuai-Lei; Luo, Jing-Jia; Huang, Gang
2016-01-01
Regional climate projections are challenging because of large uncertainty, particularly that stemming from unpredictable internal variability of the climate system. Here, we examine the internal variability-induced uncertainty in precipitation and surface air temperature (SAT) trends during 2005-2055 over East Asia based on 40-member ensemble projections of the Community Climate System Model Version 3 (CCSM3). The model ensembles are generated from a suite of different atmospheric initial conditions using the same SRES A1B greenhouse gas scenario. We find that projected precipitation trends are subject to considerably larger internal uncertainty, and hence have lower confidence, than the projected SAT trends in both the boreal winter and summer. Projected SAT trends in winter have relatively higher uncertainty than those in summer. In addition, the lower-level atmospheric circulation has larger uncertainty than that in the mid-level. Based on k-means cluster analysis, we demonstrate that a substantial portion of internally-induced precipitation and SAT trends arises from internal large-scale atmospheric circulation variability. These results highlight the importance of internal climate variability in affecting regional climate projections on multi-decadal timescales.
The Clusters AgeS Experiment (CASE). Variable stars in the field of the globular cluster NGC 362
NASA Astrophysics Data System (ADS)
Rozyczka, M.; Thompson, I. B.; Narloch, W.; Pych, W.; Schwarzenberg-Czerny, A.
2016-09-01
The field of the globular cluster NGC 362 was monitored between 1997 and 2015 in a search for variable stars. BV light curves were obtained for 151 periodic or likely periodic variable stars, over a hundred of which are new detections. Twelve newly detected variable stars are proper-motion members of the cluster: two SX Phe and two RR Lyr pulsators, one contact binary, three detached or semi-detached eclipsing binaries, and four spotted variable stars. The most interesting objects among these are the binary blue straggler V20 with an asymmetric light curve, and the 8.1 d semidetached binary V24 located on the red giant branch of NGC 362, which is a Chandra X-ray source. We also provide substantial new data for 24 previously known variable stars.
Feder, Stephan; Sundermann, Benedikt; Wersching, Heike; Teuber, Anja; Kugel, Harald; Teismann, Henning; Heindel, Walter; Berger, Klaus; Pfleiderer, Bettina
2017-11-01
Combinations of resting-state fMRI and machine-learning techniques are increasingly employed to develop diagnostic models for mental disorders. However, little is known about the neurobiological heterogeneity of depression, and diagnostic machine learning has mainly been tested in homogeneous samples. Our main objective was to explore the inherent structure of a diverse unipolar depression sample. The secondary objective was to assess whether such information can improve diagnostic classification. We analyzed data from 360 patients with unipolar depression and 360 non-depressed population controls, who were subdivided into two independent subsets. Cluster analyses (unsupervised learning) of functional connectivity were used to generate hypotheses about potential patient subgroups from the first subset. The relationship of clusters with demographical and clinical measures was assessed. Subsequently, diagnostic classifiers (supervised learning), which incorporated information about these putative depression subgroups, were trained. Exploratory cluster analyses revealed two weakly separable subgroups of depressed patients. These subgroups differed in the average duration of depression and in the proportion of patients with concurrently severe depression and anxiety symptoms. The diagnostic classification models performed at chance level. It remains unresolved whether subgroups represent distinct biological subtypes, variability of continuous clinical variables or in part an overfitting of sparsely structured data. Functional connectivity in unipolar depression is associated with general disease effects. Cluster analyses provide hypotheses about potential depression subtypes. Diagnostic models did not benefit from this additional information regarding heterogeneity. Copyright © 2017 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Poppenhaeger, K.; Wolk, S. J.; Hora, J. L.
2015-10-15
We present a time-variability study of young stellar objects (YSOs) in the cluster IRAS 20050+2720, performed at 3.6 and 4.5 μm with the Spitzer Space Telescope; this study is part of the Young Stellar Object VARiability (YSOVAR) project. We have collected light curves for 181 cluster members over 60 days. We find a high variability fraction among embedded cluster members of ca. 70%, whereas young stars without a detectable disk display variability less often (in ca. 50% of the cases) and with lower amplitudes. We detect periodic variability for 33 sources with periods primarily in the range of 2–6 days. Practically all embedded periodic sources display additional variability on top of their periodicity. Furthermore, we analyze the slopes of the tracks that our sources span in the color–magnitude diagram (CMD). We find that sources with long variability timescales tend to display CMD slopes that are at least partially influenced by accretion processes, while sources with short variability timescales tend to display extinction-dominated slopes. We find a tentative trend of X-ray detected cluster members to vary on longer timescales than the X-ray undetected members.
A New Variable Weighting and Selection Procedure for K-Means Cluster Analysis
ERIC Educational Resources Information Center
Steinley, Douglas; Brusco, Michael J.
2008-01-01
A variance-to-range ratio variable weighting procedure is proposed. We show how this weighting method is theoretically grounded in the inherent variability found in data exhibiting cluster structure. In addition, a variable selection procedure is proposed to operate in conjunction with the variable weighting technique. The performances of these…
Classes and continua of hippocampal CA1 inhibitory neurons revealed by single-cell transcriptomics.
Harris, Kenneth D; Hochgerner, Hannah; Skene, Nathan G; Magno, Lorenza; Katona, Linda; Bengtsson Gonzales, Carolina; Somogyi, Peter; Kessaris, Nicoletta; Linnarsson, Sten; Hjerling-Leffler, Jens
2018-06-18
Understanding any brain circuit will require a categorization of its constituent neurons. In hippocampal area CA1, at least 23 classes of GABAergic neuron have been proposed to date. However, this list may be incomplete; additionally, it is unclear whether discrete classes are sufficient to describe the diversity of cortical inhibitory neurons or whether continuous modes of variability are also required. We studied the transcriptomes of 3,663 CA1 inhibitory cells, revealing 10 major GABAergic groups that divided into 49 fine-scale clusters. All previously described and several novel cell classes were identified, with three previously described classes unexpectedly found to be identical. A division into discrete classes, however, was not sufficient to describe the diversity of these cells, as continuous variation also occurred between and within classes. Latent factor analysis revealed that a single continuous variable could predict the expression levels of several genes, which correlated similarly with it across multiple cell types. Analysis of the genes correlating with this variable suggested it reflects a range from metabolically highly active faster-spiking cells that proximally target pyramidal cells to slower-spiking cells targeting distal dendrites or interneurons. These results elucidate the complexity of inhibitory neurons in one of the simplest cortical structures and show that characterizing these cells requires continuous modes of variation as well as discrete cell classes.
Unequal cluster sizes in stepped-wedge cluster randomised trials: a systematic review
Morris, Tom; Gray, Laura
2017-01-01
Objectives: To investigate the extent to which cluster sizes vary in stepped-wedge cluster randomised trials (SW-CRT) and whether any variability is accounted for during the sample size calculation and analysis of these trials. Setting: Any, not limited to healthcare settings. Participants: Any taking part in an SW-CRT published up to March 2016. Primary and secondary outcome measures: The primary outcome is the variability in cluster sizes, measured by the coefficient of variation (CV) in cluster size. Secondary outcomes include the difference between the cluster sizes assumed during the sample size calculation and those observed during the trial, any reported variability in cluster sizes and whether the methods of sample size calculation and methods of analysis accounted for any variability in cluster sizes. Results: Of the 101 included SW-CRTs, 48% mentioned that the included clusters were known to vary in size, yet only 13% of these accounted for this during the calculation of the sample size. However, 69% of the trials did use a method of analysis appropriate for when clusters vary in size. Full trial reports were available for 53 trials. The CV was calculated for 23 of these: the median CV was 0.41 (IQR: 0.22–0.52). Actual cluster sizes could be compared with those assumed during the sample size calculation for 14 (26%) of the trial reports; the cluster sizes were between 29% and 480% of that which had been assumed. Conclusions: Cluster sizes often vary in SW-CRTs. Reporting of SW-CRTs also remains suboptimal. The effect of unequal cluster sizes on the statistical power of SW-CRTs needs further exploration and methods appropriate to studies with unequal cluster sizes need to be employed. PMID:29146637
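As a rough illustration of the primary outcome measure, the coefficient of variation of cluster size is the standard deviation of the cluster sizes divided by their mean. The sketch below assumes the sample standard deviation is used (the abstract does not say which estimator the review applied), and the cluster sizes are hypothetical values chosen only for the example.

```python
import statistics


def coefficient_of_variation(cluster_sizes):
    """Return SD / mean of the cluster sizes (sample SD assumed here)."""
    mean_size = statistics.mean(cluster_sizes)
    sd_size = statistics.stdev(cluster_sizes)  # sample standard deviation (n - 1)
    return sd_size / mean_size


# Hypothetical cluster sizes for one trial, not taken from the review.
example_sizes = [10, 20, 30]
cv = coefficient_of_variation(example_sizes)
print(f"CV in cluster size: {cv:.2f}")
```

A CV of 0 means all clusters are the same size; larger values indicate more unequal clusters.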
A progress report on seismic model studies
Healy, J.H.; Mangan, G.B.
1963-01-01
The value of seismic-model studies as an aid to understanding wave propagation in the Earth's crust was recognized by early investigators (Tatel and Tuve, 1955). Preliminary model results were very promising, but progress in model seismology has been restricted by two problems: (1) difficulties in the development of models with continuously variable velocity-depth functions, and (2) difficulties in the construction of models of adequate size to provide a meaningful wave-length to layer-thickness ratio. The problem of a continuously variable velocity-depth function has been partly solved by a technique using two-dimensional plate models constructed by laminating plastic to aluminum, so that the ratio of plastic to aluminum controls the velocity-depth function (Healy and Press, 1960). These techniques provide a continuously variable velocity-depth function, but it is not possible to construct such models large enough to study short-period wave propagation in the crust. This report describes improvements in our ability to machine large models. Two types of models are being used: one is a cylindrical aluminum tube machined on a lathe, and the other is a large plate machined on a precision planer. Both of these modeling techniques give promising results and are a significant improvement over earlier efforts.
Diversity in migratory patterns among Neotropical fishes in a highly regulated river basin.
Makrakis, M C; Miranda, L E; Makrakis, S; Fontes Júnior, H M; Morlis, W G; Dias, J H P; Garcia, J O
2012-07-01
Migratory behaviour of selected fish species is described in the Paraná River, Brazil-Argentina-Paraguay, to search for patterns relevant to tropical regulated river systems. In a 10 year mark-recapture study, spanning a 1425 km section of the river, 32 867 fishes composed of 18 species were released and 1083 fishes were recaptured. The fishes recaptured were at liberty an average 166 days (maximum 1548 days) and travelled an average 35 km (range 0-625 km). Cluster analysis applied to variables descriptive of movement behaviour identified four general movement patterns. Cluster 1 included species that moved long distances (mean 164 km) upstream (54%) and downstream (40%) the mainstem river and showed high incidence (27%) of passage through dams; cluster 2 also exhibited high rate of movement along the mainstem (49% upstream, 13% downstream), but moved small distances (mean 10 km); cluster 3 included the most fishes moving laterally into tributaries (45%) or not moving at all (25%), but little downstream movement (8%); fishes in cluster 4 exhibited little upstream movement (13%) and farthest downstream movements (mean 41 km). Whereas species could be numerically clustered with statistical models, a species ordination showed ample spread, suggesting that species exhibit diverse movement patterns that cannot be easily classified into just a few classes. The cluster and ordination procedures also showed that adults and juveniles of the same species exhibit similar movement patterns. Conventional concepts about Neotropical migratory fishes portray them as travelling long distances upstream. The present results broaden these concepts suggesting that migratory movements are more diverse, could be long, short or at times absent, upriver, downriver or lateral, and the diversity of movements can vary within and among species. 
The intense lateral migrations exhibited by a diversity of species, especially to and from large tributaries (above reservoirs) and reservoir tributaries, illustrate the importance of these habitats for the fish species life cycle. Considering that the Paraná River is highly impounded, special attention should be given to the few remaining low-impact habitats as they continue to be targets of hydropower development that will probably intensify the effects on migratory fish stocks. © 2012 The Authors. Journal of Fish Biology © 2012 The Fisheries Society of the British Isles.
Large-Scale Circulation and Climate Variability. Chapter 5
NASA Technical Reports Server (NTRS)
Perlwitz, J.; Knutson, T.; Kossin, J. P.; LeGrande, A. N.
2017-01-01
The causes of regional climate trends cannot be understood without considering the impact of variations in large-scale atmospheric circulation and an assessment of the role of internally generated climate variability. There are contributions to regional climate trends from changes in large-scale latitudinal circulation, which is generally organized into three cells in each hemisphere (the Hadley cell, Ferrel cell, and Polar cell) and which determines the location of subtropical dry zones and midlatitude jet streams. These circulation cells are expected to shift poleward during warmer periods, which could result in poleward shifts in precipitation patterns, affecting natural ecosystems, agriculture, and water resources. In addition, regional climate can be strongly affected by non-local responses to recurring patterns (or modes) of variability of the atmospheric circulation or the coupled atmosphere-ocean system. These modes of variability represent preferred spatial patterns and their temporal variation. They account for gross features in variance and for teleconnections which describe climate links between geographically separated regions. Modes of variability are often described as a product of a spatial climate pattern and an associated climate index time series that are identified based on statistical methods like Principal Component Analysis (PC analysis), which is also called Empirical Orthogonal Function Analysis (EOF analysis), and cluster analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Springmeyer, R R; Brugger, E; Cook, R
The Data group provides data analysis and visualization support to its customers. This consists primarily of the development and support of VisIt, a data analysis and visualization tool. Support ranges from answering questions about the tool and providing classes on how to use it to performing data analysis and visualization for customers. The Information Management and Graphics Group supports and develops tools that enhance our ability to access, display, and understand large, complex data sets. Activities include applying visualization software for large scale data exploration; running video production labs on two networks; supporting graphics libraries and tools for end users; maintaining PowerWalls and assorted other displays; and developing software for searching and managing scientific data. Researchers in the Center for Applied Scientific Computing (CASC) work on various projects including the development of visualization techniques for large scale data exploration that are funded by the ASC program, among others. The researchers also have LDRD projects and collaborations with other lab researchers, academia, and industry. The IMG group is located in the Terascale Simulation Facility, home to Dawn, Atlas, BGL, and others, which includes both classified and unclassified visualization theaters, a visualization computer floor and deployment workshop, and video production labs. We continued to provide the traditional graphics group consulting and video production support. We maintained five PowerWalls and many other displays. We deployed a 576-node Opteron/IB cluster with 72 TB of memory providing a visualization production server on our classified network. We continue to support a 128-node Opteron/IB cluster providing a visualization production server for our unclassified systems and an older 256-node Opteron/IB cluster for the classified systems, as well as several smaller clusters to drive the PowerWalls.
The visualization production systems include NFS servers to provide dedicated storage for data analysis and visualization. The ASC projects have delivered new versions of visualization and scientific data management tools to end users and continue to refine them. VisIt had 4 releases during the past year, ending with VisIt 2.0. We released version 2.4 of Hopper, a Java application for managing and transferring files. This release included a graphical disk usage view which works on all types of connections and an aggregated copy feature for transferring massive datasets quickly and efficiently to HPSS. We continue to use and develop Blockbuster and Telepath. Both the VisIt and IMG teams were engaged in a variety of movie production efforts during the past year in addition to the development tasks.
The SWIFT AGN and Cluster Survey. I. Number Counts of AGNs and Galaxy Clusters
NASA Astrophysics Data System (ADS)
Dai, Xinyu; Griffin, Rhiannon D.; Kochanek, Christopher S.; Nugent, Jenna M.; Bregman, Joel N.
2015-05-01
The Swift active galactic nucleus (AGN) and Cluster Survey (SACS) uses 125 deg² of Swift X-ray Telescope serendipitous fields with variable depths surrounding γ-ray bursts to provide a medium depth (4 × 10⁻¹⁵ erg cm⁻² s⁻¹) and area survey filling the gap between deep, narrow Chandra/XMM-Newton surveys and wide, shallow ROSAT surveys. Here, we present a catalog of 22,563 point sources and 442 extended sources and examine the number counts of the AGN and galaxy cluster populations. SACS provides excellent constraints on the AGN number counts at the bright end with negligible uncertainties due to cosmic variance, and these constraints are consistent with previous measurements. We use Wide-field Infrared Survey Explorer mid-infrared (MIR) colors to classify the sources. For AGNs we can roughly separate the point sources into MIR-red and MIR-blue AGNs, finding roughly equal numbers of each type in the soft X-ray band (0.5-2 keV), but fewer MIR-blue sources in the hard X-ray band (2-8 keV). The cluster number counts, with 5% uncertainties from cosmic variance, are also consistent with previous surveys but span a much larger continuous flux range. Deep optical or IR follow-up observations of this cluster sample will significantly increase the number of higher-redshift (z > 0.5) X-ray-selected clusters.
Snell, Deborah L; Surgenor, Lois J; Hay-Smith, E Jean C; Williman, Jonathan; Siegert, Richard J
2015-01-01
Outcomes after mild traumatic brain injury (MTBI) vary, with slow or incomplete recovery for a significant minority. This study examines whether groups of cases with shared psychological factors but with different injury outcomes could be identified using cluster analysis. This is a prospective observational study following 147 adults presenting to a hospital-based emergency department or concussion services in Christchurch, New Zealand. This study examined associations between baseline demographic, clinical, psychological variables (distress, injury beliefs and symptom burden) and outcome 6 months later. A two-step approach to cluster analysis was applied (Ward's method to identify clusters, K-means to refine results). Three meaningful clusters emerged (high-adapters, medium-adapters, low-adapters). Baseline cluster-group membership was significantly associated with outcomes over time. High-adapters appeared recovered by 6-weeks and medium-adapters revealed improvements by 6-months. The low-adapters continued to endorse many symptoms, negative recovery expectations and distress, being significantly at risk for poor outcome more than 6-months after injury (OR (good outcome) = 0.12; CI = 0.03-0.53; p < 0.01). Cluster analysis supported the notion that groups could be identified early post-injury based on psychological factors, with group membership associated with differing outcomes over time. Implications for clinical care providers regarding therapy targets and cases that may benefit from different intensities of intervention are discussed.
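The two-step approach named above (Ward's method to identify clusters, K-means to refine them) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the study's actual analysis: the function names, the toy two-dimensional points, and the choice of two clusters are all hypothetical, and a real analysis would normally use a statistics package.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_clusters(points, k):
    """Step 1: agglomerative clustering with Ward's criterion, stopped at k clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        def merge_cost(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            # Increase in within-cluster sum of squares caused by merging a and b.
            weight = len(a) * len(b) / (len(a) + len(b))
            return weight * sq_dist(centroid(a), centroid(b))

        i, j = min(combinations(range(len(clusters)), 2), key=merge_cost)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


def kmeans_refine(points, centers, max_iter=100):
    """Step 2: K-means started from the Ward centroids."""
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster happens to lose all its members.
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical toy data: two well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = [centroid(c) for c in ward_clusters(data, k=2)]
labels, centers = kmeans_refine(data, initial)
print(labels)
```

Seeding K-means with the Ward solution avoids the dependence on random starting centers, which is the usual motivation for combining the two methods.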
The spatial clustering of obesity: does the built environment matter?
Huang, R; Moudon, A V; Cook, A J; Drewnowski, A
2015-12-01
Obesity rates in the USA show distinct geographical patterns. The present study used spatial cluster detection methods and individual-level data to locate obesity clusters and to analyse them in relation to the neighbourhood built environment. The 2008-2009 Seattle Obesity Study provided data on the self-reported height, weight, and sociodemographic characteristics of 1602 King County adults. Home addresses were geocoded. Clusters of high or low body mass index were identified using Anselin's Local Moran's I and a spatial scan statistic with regression models that searched for unmeasured neighbourhood-level factors from residuals, adjusting for measured individual-level covariates. Spatially continuous values of objectively measured features of the local neighbourhood built environment (SmartMaps) were constructed for seven variables obtained from tax rolls and commercial databases. Both the Local Moran's I and a spatial scan statistic identified similar spatial concentrations of obesity. High and low obesity clusters were attenuated after adjusting for age, gender, race, education and income, and they disappeared once neighbourhood residential property values and residential density were included in the model. Using individual-level data to detect obesity clusters with two cluster detection methods, the present study showed that the spatial concentration of obesity was wholly explained by neighbourhood composition and socioeconomic characteristics. These characteristics may serve to more precisely locate obesity prevention and intervention programmes. © 2014 The British Dietetic Association Ltd.
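For readers unfamiliar with the first of the two detection methods, a local Moran's I statistic compares each observation's deviation from the mean with the weighted average deviation of its neighbours. The sketch below uses one common form, I_i = z_i * (sum_j w_ij z_j) / m2 with m2 = (sum_i z_i^2) / n, and row-standardised weights. The values and the simple chain of neighbours are hypothetical, and the study's own weighting scheme and significance testing are not reproduced here.

```python
def local_morans_i(values, neighbours):
    """Local Moran's I for each observation.

    values: list of numbers (e.g. body mass index per location).
    neighbours: list of lists; neighbours[i] holds the indices adjacent to i.
    Weights are row-standardised, so each neighbour of i gets 1 / len(neighbours[i]).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # second moment; some texts divide by n - 1
    result = []
    for i in range(n):
        if neighbours[i]:
            lag = sum(z[j] for j in neighbours[i]) / len(neighbours[i])
        else:
            lag = 0.0  # an isolated location contributes no spatial lag
        result.append(z[i] * lag / m2)
    return result


# Hypothetical values along a line of six locations: a high run then a low run.
example_values = [10, 10, 10, 0, 0, 0]
example_neighbours = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
local_i = local_morans_i(example_values, example_neighbours)
print(local_i)
```

Positive values flag locations surrounded by similar values (high-high or low-low clusters); values near zero or negative flag boundaries and spatial outliers.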
Population-based 3D genome structure analysis reveals driving forces in spatial genome organization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tjong, Harianto; Li, Wenyuan; Kalhor, Reza; ...
2016-03-07
Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization.
NASA Astrophysics Data System (ADS)
Ku, Wai Lim; Girvan, Michelle; Ott, Edward
2015-12-01
In this paper, we study dynamical systems in which a large number N of identical Landau-Stuart oscillators are globally coupled via a mean-field. Previously, it has been observed that this type of system can exhibit a variety of different dynamical behaviors. These behaviors include time periodic cluster states in which each oscillator is in one of a small number of groups for which all oscillators in each group have the same state which is different from group to group, as well as a behavior in which all oscillators have different states and the macroscopic dynamics of the mean field is chaotic. We argue that this second type of behavior is "extensive" in the sense that the chaotic attractor in the full phase space of the system has a fractal dimension that scales linearly with N and that the number of positive Lyapunov exponents of the attractor also scales linearly with N. An important focus of this paper is the transition between cluster states and extensive chaos as the system is subjected to slow adiabatic parameter change. We observe discontinuous transitions between the cluster states (which correspond to low dimensional dynamics) and the extensively chaotic states. Furthermore, examining the cluster state, as the system approaches the discontinuous transition to extensive chaos, we find that the oscillator population distribution between the clusters continually evolves so that the cluster state is always marginally stable. This behavior is used to reveal the mechanism of the discontinuous transition. We also apply the Kaplan-Yorke formula to study the fractal structure of the extensively chaotic attractors.
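The Kaplan-Yorke formula mentioned at the end of the abstract estimates an attractor's fractal dimension from its Lyapunov spectrum: with exponents sorted in decreasing order and j the largest index for which the sum of the first j exponents is non-negative, D = j + (λ1 + ... + λj) / |λ(j+1)|. The sketch below is a generic illustration; the function name is hypothetical and the example spectrum is an illustrative three-exponent set, not data from the paper.

```python
def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke (Lyapunov) dimension from a list of Lyapunov exponents."""
    ordered = sorted(exponents, reverse=True)
    partial_sum = 0.0
    for j, value in enumerate(ordered):
        if partial_sum + value < 0:
            if j == 0:
                return 0.0  # even the largest exponent is negative
            # j exponents have a non-negative sum; interpolate with the next one.
            return j + partial_sum / abs(value)
        partial_sum += value
    # The full spectrum sums to a non-negative value: dimension of the phase space.
    return float(len(ordered))


# Illustrative spectrum with one positive, one zero and one strongly negative exponent.
example_spectrum = [0.906, 0.0, -14.572]
dimension = kaplan_yorke_dimension(example_spectrum)
print(f"Kaplan-Yorke dimension: {dimension:.3f}")
```

If the number of positive exponents grows linearly with N, as argued for the extensively chaotic states, the index j and hence this dimension estimate grow with N as well.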
The Clusters AgeS Experiment (CASE). Variable Stars in the Field of the Globular Cluster NGC 6362
NASA Astrophysics Data System (ADS)
Kaluzny, J.; Thompson, I. B.; Rozyczka, M.; Pych, W.; Narloch, W.
2014-12-01
The field of the globular cluster NGC 6362 was monitored between 1995 and 2009 in a search for variable stars. BV light curves were obtained for 69 periodic variable stars including 34 known RR Lyr stars, 10 known objects of other types and 25 newly detected variable stars. Among the latter we identified 18 proper-motion members of the cluster: seven detached eclipsing binaries (DEBs), six SX Phe stars, two W UMa binaries, two spotted red giants, and a very interesting eclipsing binary composed of two red giants - the first example of such a system found in a globular cluster. Five of the DEBs are located at the turnoff region, and the remaining two are redward of the lower main sequence. Eighty-four objects from the central 9×9 arcmin² of the cluster were found in the region of cluster blue stragglers. Of these, 70 are proper motion (PM) members of NGC 6362 (including all SX Phe and two W UMa stars), and five are field stars. The remaining nine objects lacking PM information are located at the very core of the cluster, and as such they are likely genuine blue stragglers.
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: Selecting the correct statistical test and data mining method depends highly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are examined using several medical examples. We present two examples of clustering ordinal variables, which are among the more challenging variable types to analyze, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-Interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: by using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance can be achieved. Moreover, descriptive and inferential statistics, in addition to the modeling approach, must be selected based on the scale of the variables. PMID:24672565
Spatial modelling and mapping of female genital mutilation in Kenya.
Achia, Thomas N O
2014-03-25
Female genital mutilation/cutting (FGM/C) is still prevalent in several communities in Kenya and other areas in Africa, as well as being practiced by some migrants from African countries living in other parts of the world. This study aimed at detecting clustering of FGM/C in Kenya, and identifying those areas within the country where women still intend to continue the practice. A broader goal of the study was to identify geographical areas where the practice continues unabated and where broad intervention strategies need to be introduced. The prevalence of FGM/C was investigated using the 2008 Kenya Demographic and Health Survey (KDHS) data. The 2008 KDHS used a multistage stratified random sampling plan to select women of reproductive age (15-49 years) and asked questions concerning their FGM/C status and their support for the continuation of FGM/C. A spatial scan statistical analysis was carried out using SaTScan™ to test for statistically significant clustering of the practice of FGM/C in the country. The risk of FGM/C was also modelled and mapped using a hierarchical spatial model under the Integrated Nested Laplace approximation approach using the INLA library in R. The prevalence of FGM/C stood at 28.2% and an estimated 10.3% of the women interviewed indicated that they supported the continuation of FGM. On the basis of the Deviance Information Criterion (DIC), hierarchical spatial models with spatially structured random effects were found to best fit the data for both response variables considered. Age, region, rural-urban classification, education, marital status, religion, socioeconomic status and media exposure were found to be significantly associated with FGM/C. The current FGM/C status of a woman was also a significant predictor of support for the continuation of FGM/C. Spatial scan statistics confirm FGM clusters in the North-Eastern and South-Western regions of Kenya (p<0.001). This suggests that the fight against FGM/C in Kenya is not yet over. 
There are still deep cultural and religious beliefs to be addressed in a bid to eradicate the practice. Interventions by government and other stakeholders must address these challenges and target the identified clusters.
NASA Astrophysics Data System (ADS)
Sarron, F.; Martinet, N.; Durret, F.; Adami, C.
2018-06-01
Obtaining large samples of galaxy clusters is important for cosmology: cluster counts as a function of redshift and mass can constrain the parameters of our Universe. They are also useful in order to understand the formation and evolution of clusters. We develop an improved version of the Adami & MAzure Cluster FInder (AMACFI), now the Adami, MAzure & Sarron Cluster FInder (AMASCFI), and apply it to the 154 deg² of the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) to obtain a large catalogue of 1371 cluster candidates with mass M200 > 10¹⁴ M⊙ and redshift z ≤ 0.7. We derive the selection function of the algorithm from the Millennium simulation, and cluster masses from a richness-mass scaling relation built from matching our candidates with X-ray detections. We study the evolution of these clusters with mass and redshift by computing the i'-band galaxy luminosity functions (GLFs) for the early-type (ETGs) and late-type galaxies (LTGs). This sample is 90% pure and 70% complete, and therefore our results are representative of a large fraction of the cluster population in these redshift and mass ranges. We find an increase in both the ETG and LTG faint populations with decreasing redshift (with Schechter slopes αETG = -0.65 ± 0.03 and αLTG = -0.95 ± 0.04 at z = 0.6, and αETG = -0.79 ± 0.02 and αLTG = -1.26 ± 0.03 at z = 0.2) and also a decrease in the LTG (but not the ETG) bright end. Our large sample allows us to break the degeneracy between mass and redshift, finding that the redshift evolution is more pronounced in high-mass clusters, but that there is no significant dependence of the faint end on mass for a given redshift. These results show that the cluster red sequence is mainly formed at redshift z > 0.7, and that faint ETGs continue to enrich the red sequence through quenching of brighter LTGs at z ≤ 0.7.
The efficiency of this quenching is higher in large-mass clusters, while the accretion rate of faint LTGs is lower as the more massive clusters have already emptied most of their environment at higher redshifts. Based on observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/IRFU, at the Canada-France-Hawaii Telescope (CFHT) which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de l'Univers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii. This work is based in part on data products produced at Terapix available at the Canadian Astronomy Data Centre as part of the Canada-France-Hawaii Telescope Legacy Survey, a collaborative project of NRC and CNRS. The candidate cluster catalog is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/613/A67
Clustering "N" Objects into "K" Groups under Optimal Scaling of Variables.
ERIC Educational Resources Information Center
van Buuren, Stef; Heiser, Willem J.
1989-01-01
A method based on homogeneity analysis (multiple correspondence analysis or multiple scaling) is proposed to reduce many categorical variables to one variable with "k" categories. The method is a generalization of the sum of squared distances cluster analysis problem to the case of mixed measurement level variables. (SLD)
Ultra-diffuse cluster galaxies as key to the MOND cluster conundrum
NASA Astrophysics Data System (ADS)
Milgrom, Mordehai
2015-12-01
Modified Newtonian Dynamics (MOND) reduces greatly the mass discrepancy in clusters of galaxies, but does leave a global discrepancy of about a factor of 2 (epitomized by the structure of the Bullet Cluster). It has been proposed, within the minimalist and purist MOND, that clusters harbour some indigenous, yet undetected, cluster baryonic (dark) matter (CBDM), whose total amount is comparable with that of the observed hot gas. Koda et al. have recently identified more than a thousand ultra-diffuse, galaxy-like objects (UDGs) in the Coma cluster. These, they argue, require, within Newtonian dynamics, that they are much more massive than their observed stellar component. Here, I propound that some of the CBDM is internal to UDGs, which endows them with robustness. The rest of the CBDM objects formed in now-disrupted kin of the UDGs, and is dispersed in the intracluster medium. The discovery of cluster UDGs is not in itself a resolution of the MOND cluster conundrum, but it lends greater plausibility to CBDM as its resolution. Alternatively, if the UDGs are only now falling into Coma, their large size and very low surface brightness could result from the inflation due to the MOND, variable external-field effect (EFE). I also consider briefly solutions to the conundrum that invoke more elaborate extensions of purist MOND, e.g. that in clusters, the MOND constant takes up larger-than-canonical values. Whatever solves the cluster conundrum within MOND might also naturally account for UDGs.
Percolation on fitness landscapes: effects of correlation, phenotype, and incompatibilities
Gravner, Janko; Pitman, Damien; Gavrilets, Sergey
2009-01-01
We study how correlations in the random fitness assignment may affect the structure of fitness landscapes, in three classes of fitness models. The first is a phenotype space in which individuals are characterized by a large number n of continuously varying traits. In a simple model of random fitness assignment, viable phenotypes are likely to form a giant connected cluster percolating throughout the phenotype space provided the viability probability is larger than 1/2^n. The second model explicitly describes genotype-to-phenotype and phenotype-to-fitness maps, allows for neutrality at both phenotype and fitness levels, and results in a fitness landscape with tunable correlation length. Here, phenotypic neutrality and correlation between fitnesses can reduce the percolation threshold, and correlations at the point of phase transition between local and global are most conducive to the formation of the giant cluster. In the third class of models, particular combinations of alleles or values of phenotypic characters are “incompatible” in the sense that the resulting genotypes or phenotypes have zero fitness. This setting can be viewed as a generalization of the canonical Bateson-Dobzhansky-Muller model of speciation and is related to K-SAT problems, prominent in computer science. We analyze the conditions for the existence of viable genotypes, their number, as well as the structure and the number of connected clusters of viable genotypes. We show that analysis based on expected values can easily lead to wrong conclusions, especially when fitness correlations are strong. We focus on pairwise incompatibilities between diallelic loci, but we also address multiple alleles, complex incompatibilities, and continuous phenotype spaces. In the case of diallelic loci, the number of clusters is stochastically bounded and each cluster contains a very large sub-cube. 
Finally, we demonstrate that the discrete NK model shares some signature properties of models with high correlations. PMID:17692873
Levene, Louis S; Baker, Richard; Walker, Nicola; Williams, Christopher; Wilson, Andrew; Bankart, John
2018-06-01
Increased relationship continuity in primary care is associated with better health outcomes, greater patient satisfaction, and fewer hospital admissions. Greater socioeconomic deprivation is associated with lower levels of continuity, as well as poorer health outcomes. To investigate whether deprivation scores predicted variations in the decline over time of patient-perceived relationship continuity of care, after adjustment for practice organisational and population factors. An observational study in 6243 primary care practices with more than one GP, in England, using a longitudinal multilevel linear model, 2012-2017 inclusive. Patient-perceived relationship continuity was calculated using two questions from the GP Patient Survey. The effect of deprivation on the linear slope of continuity over time was modelled, adjusting for nine confounding variables (practice population and organisational factors). Clustering of measurements within general practices was adjusted for by using a random intercepts and random slopes model. Descriptive statistics and univariable analyses were also undertaken. Relationship continuity declined by 27.5% between 2012 and 2017, and at all deprivation levels. Deprivation scores from 2012 did not predict variations in the decline of relationship continuity at practice level, after accounting for the effects of organisational and population confounding variables, which themselves did not predict, or weakly predicted with very small effect sizes, the decline of continuity. Cross-sectionally, continuity and deprivation were negatively correlated within each year. The decline in relationship continuity of care has been marked and widespread. Measures to maximise continuity will need to be feasible for individual practices with diverse population and organisational characteristics. © British Journal of General Practice 2018.
ERIC Educational Resources Information Center
Scharfenberg, Franz-Josef; Bogner, Franz X.
2013-01-01
This study classified students into different cognitive load (CL) groups by means of cluster analysis based on their experienced CL in a gene technology outreach lab which has instructionally been designed with regard to CL theory. The relationships of the identified student CL clusters to learner characteristics, laboratory variables, and…
NASA Astrophysics Data System (ADS)
Musella, I.; Marconi, M.; Stetson, P. B.; Raimondo, G.; Brocato, E.; Molinaro, R.; Ripepi, V.; Carini, R.; Coppola, G.; Walker, A. R.; Welch, D. L.
2016-04-01
We present the analysis of multiband time series data for a sample of 24 Cepheids in the field of the Large Magellanic Cloud cluster NGC 1866. Very accurate BVI Very Large Telescope photometry is combined with archival UBVI data, covering a large temporal window, to obtain precise mean magnitudes and periods with typical errors of 1-2 per cent and of 1 ppm, respectively. These results represent the first accurate and homogeneous data set for a substantial sample of Cepheid variables belonging to a cluster and hence sharing common distance, age and original chemical composition. Comparisons of the resulting multiband period-luminosity and Wesenheit relations to both empirical and theoretical results for the Large Magellanic Cloud are presented and discussed to derive the distance of the cluster and to constrain the mass-luminosity relation of the Cepheids. The adopted theoretical scenario is also tested by comparison with independent calibrations of the Cepheid Wesenheit zero-point based on trigonometric parallaxes and Baade-Wesselink techniques. Our analysis suggests that a mild overshooting and/or a moderate mass-loss can affect intermediate-mass stellar evolution in this cluster and gives a distance modulus of 18.50 ± 0.01 mag. The obtained V,I colour-magnitude diagram is also analysed and compared with both synthetic models and theoretical isochrones for a range of ages and metallicities and for different efficiencies of core overshooting. As a result, we find that the age of NGC 1866 is about 140 Myr, assuming Z = 0.008 and the mild efficiency of overshooting suggested by the comparison with the pulsation models.
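As a worked illustration of the quoted distance modulus, the sketch below converts 18.50 ± 0.01 mag into a distance using the standard relation d = 10^(μ/5 + 1) pc and propagates the uncertainty to first order. It assumes the quoted value is a true (extinction-corrected) distance modulus; the function names are illustrative.

```python
import math

def distance_pc(mu):
    """Distance in parsecs from a true distance modulus mu = 5 log10(d / 10 pc)."""
    return 10.0 ** (mu / 5.0 + 1.0)

def distance_err_pc(mu, sigma_mu):
    """First-order error propagation: sigma_d = d * (ln 10 / 5) * sigma_mu."""
    return distance_pc(mu) * math.log(10.0) / 5.0 * sigma_mu

if __name__ == "__main__":
    mu, sigma_mu = 18.50, 0.01  # values quoted in the abstract
    d_kpc = distance_pc(mu) / 1e3
    err_kpc = distance_err_pc(mu, sigma_mu) / 1e3
    print(f"d = {d_kpc:.1f} +/- {err_kpc:.1f} kpc")
```

With these inputs the distance comes out at roughly 50 kpc, with a formal uncertainty of about 0.2 kpc from the quoted error alone.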
Maurage, Pierre; Timary, Philippe de; D'Hondt, Fabien
2017-08-01
Emotional and interpersonal impairments have been largely reported in alcohol-dependence, and their role in its development and maintenance is widely established. However, earlier studies have exclusively focused on group comparisons between healthy controls and alcohol-dependent individuals, considering them as a homogeneous population. The variability of socio-emotional profiles in this disorder thus remains totally unexplored. The present study used a cluster analytic approach to explore the heterogeneity of affective and social disorders in alcohol-dependent individuals. 296 recently-detoxified alcohol-dependent patients were first compared with 246 matched healthy controls regarding self-reported emotional (i.e. alexithymia) and social (i.e. interpersonal problems) difficulties. Then, a cluster analysis was performed, focusing on the alcohol-dependent sample, to explore the presence of differential patterns of socio-emotional deficits and their links with demographic, psychopathological and alcohol-related variables. The group comparison between alcohol-dependent individuals and controls clearly confirmed that emotional and interpersonal difficulties constitute a key factor in alcohol-dependence. However, the cluster analysis identified five subgroups of alcohol-dependent individuals, presenting distinct combinations of alexithymia and interpersonal problems ranging from a total absence of reported impairment to generalized socio-emotional difficulties. Alcohol-dependent individuals should no longer be considered as constituting a unitary group regarding their affective and interpersonal difficulties, but rather as a population encompassing a wide variety of socio-emotional profiles. Future experimental studies on emotional and social variables should thus go beyond mere group comparisons to explore this heterogeneity, and prevention programs proposing an individualized evaluation and rehabilitation of these deficits should be promoted. 
Copyright © 2017 Elsevier B.V. All rights reserved.
Genetic divergence of physiological-quality traits of seeds in a population of peppers.
Pessoa, A M S; Barroso, P A; do Rêgo, E R; Medeiros, G D A; Bruno, R L A; do Rêgo, M M
2015-10-16
Brazil has a great diversity of Capsicum peppers that can be used in breeding programs. The objective of this study was to evaluate genetic variation in traits related to the physiological quality of seeds of Capsicum annuum L. in a segregating F2 population and its parents. A total of 250 seeds produced by selfing in the F1 generation resulting from crosses between UFPB 77.3 and UFPB 76 were used, with 100 seeds of both parents used as additional controls, totaling 252 genotypes. The seeds were germinated in gerboxes containing substrate blotting paper moistened with distilled water. Germination and the following vigor tests were evaluated: first count, germination velocity index, and root and shoot lengths. Data were subjected to analysis of variance, and means were compared by Scott and Knott's method at 1% probability. Tocher's clustering based on Mahalanobis distance and canonical variable analysis with graphic dispersion of genotypes were performed, and genetic parameters were estimated. All variables were found to be significant by the F test (P ≤ 0.01) and showed high heritability and a CVg/CVe ratio higher than 1.0, indicating genetic differences among genotypes. Parents (genotypes 1 and 2) formed distinct groups in all clustering methods. Genotypes 3, 104, 153, and 232 were found to be the most divergent according to Tocher's clustering method, and this was mainly due to early germination, which was observed on day 14, and would therefore be selected. Understanding the phenotypic variability among these 252 genotypes will serve as a basis for continuing the breeding program within this family.
New SX Phoenicis Variables in the Globular Cluster NGC 4833
NASA Astrophysics Data System (ADS)
Darragh, A. N.; Murphy, B. W.
2012-07-01
We report the discovery of 6 SX Phoenicis stars in the southern globular cluster NGC 4833. Images were obtained from January through June 2011 with the Southeastern Association for Research in Astronomy 0.6 meter telescope located at Cerro Tololo Interamerican Observatory. The ISIS image subtraction method was used to search for variable stars in the cluster. We confirmed 17 previously cataloged variables and have identified 10 new variables. Of the total number of confirmed variables in our 10×10 arcmin^2 field, we classified 10 RRab variables, with a mean period of 0.69591 days, 7 RRc, with a mean period of 0.39555 days, 2 possible RRe variables with a mean period of 0.30950 days, a W Ursae Majoris contact binary, an Algol-type binary, and the 6 SX Phoenicis stars with a mean period of 0.05847 days. The periods, relative numbers of RRab and RRc variables, and Bailey diagram are indicative of the cluster being of the Oosterhoff type II. We present the phased-light curves, periods of previously known variables and the periods and classifications of the newly discovered variables, and their location on the color-magnitude diagram.
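The phased light curves mentioned in this abstract are obtained by folding the observation times on the pulsation period. A minimal sketch of that folding step follows; the epoch, the sample times and the magnitudes are made up for illustration, and only the 0.05847-day mean SX Phoenicis period is taken from the abstract.

```python
def fold(times, period, epoch=0.0):
    """Return the phase in [0, 1) of each observation time.

    times, period and epoch must share the same unit (e.g. days).
    """
    return [((t - epoch) / period) % 1.0 for t in times]

def phased_light_curve(times, mags, period, epoch=0.0):
    """Pair each magnitude with its phase and sort by phase for plotting."""
    return sorted(zip(fold(times, period, epoch), mags))

if __name__ == "__main__":
    period = 0.05847  # mean SX Phe period quoted in the abstract, in days
    # Hypothetical observation times (days) and magnitudes
    times = [0.010, 0.075, 0.131, 0.204]
    mags = [16.9, 17.1, 17.0, 16.8]
    for phase, mag in phased_light_curve(times, mags, period):
        print(f"phase = {phase:.3f}  mag = {mag}")
```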
Zhang, X; Patel, L A; Beckwith, O; Schneider, R; Weeden, C J; Kindt, J T
2017-11-14
Micelle cluster distributions from molecular dynamics simulations of a solvent-free coarse-grained model of sodium octyl sulfate (SOS) were analyzed using an improved method to extract equilibrium association constants from small-system simulations containing one or two micelle clusters at equilibrium with free surfactants and counterions. The statistical-thermodynamic and mathematical foundations of this partition-enabled analysis of cluster histograms (PEACH) approach are presented. A dramatic reduction in computational time for analysis was achieved through a strategy similar to the selector variable method to circumvent the need for exhaustive enumeration of the possible partitions of surfactants and counterions into clusters. Using statistics from a set of small-system (up to 60 SOS molecules) simulations as input, equilibrium association constants for micelle clusters were obtained as a function of both number of surfactants and number of associated counterions through a global fitting procedure. The resulting free energies were able to accurately predict micelle size and charge distributions in a large (560 molecule) system. The evolution of micelle size and charge with SOS concentration as predicted by the PEACH-derived free energies and by a phenomenological four-parameter model fit, along with the sensitivity of these predictions to variations in cluster definitions, are analyzed and discussed.
GALAXY INFALL BY INTERACTING WITH ITS ENVIRONMENT: A COMPREHENSIVE STUDY OF 340 GALAXY CLUSTERS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gu, Liyi; Wen, Zhonglue; Gandhi, Poshak
To study systematically the evolution of the angular extents of the galaxy, intracluster medium (ICM), and dark matter components in galaxy clusters, we compiled the optical and X-ray properties of a sample of 340 clusters with redshifts <0.5, based on all the available data from the Sloan Digital Sky Survey and Chandra/XMM-Newton. For each cluster, the member galaxies were determined primarily with photometric redshift measurements. The radial ICM mass distribution, as well as the total gravitational mass distribution, was derived from a spatially resolved spectral analysis of the X-ray data. When normalizing the radial profile of galaxy number to that of the ICM mass, the relative curve was found to depend significantly on the cluster redshift; it drops more steeply toward the outside in lower-redshift subsamples. The same evolution is found in the galaxy-to-total mass profile, while the ICM-to-total mass profile varies in an opposite way. The behavior of the galaxy-to-ICM distribution does not depend on the cluster mass, suggesting that the detected redshift dependence is not due to mass-related effects, such as sample selection bias. Also, it cannot be ascribed to various redshift-dependent systematic errors. We interpret that the galaxies, the ICM, and the dark matter components had similar angular distributions when a cluster was formed, while the galaxies traveling in the interior of the cluster have continuously fallen toward the center relative to the other components, and the ICM has slightly expanded relative to the dark matter although it suffers strong radiative loss. This cosmological galaxy infall, accompanied by an ICM expansion, can be explained by considering that the galaxies interact strongly with the ICM while they are moving through it. 
The interaction is considered to create a large energy flow of 10^44-45 erg s^-1 per cluster from the member galaxies to their environment, which is expected to continue over cosmological timescales.
Galaxy Infall by Interacting with Its Environment: A Comprehensive Study of 340 Galaxy Clusters
NASA Astrophysics Data System (ADS)
Gu, Liyi; Wen, Zhonglue; Gandhi, Poshak; Inada, Naohisa; Kawaharada, Madoka; Kodama, Tadayuki; Konami, Saori; Nakazawa, Kazuhiro; Xu, Haiguang; Makishima, Kazuo
2016-07-01
To study systematically the evolution of the angular extents of the galaxy, intracluster medium (ICM), and dark matter components in galaxy clusters, we compiled the optical and X-ray properties of a sample of 340 clusters with redshifts <0.5, based on all the available data from the Sloan Digital Sky Survey and Chandra/XMM-Newton. For each cluster, the member galaxies were determined primarily with photometric redshift measurements. The radial ICM mass distribution, as well as the total gravitational mass distribution, was derived from a spatially resolved spectral analysis of the X-ray data. When normalizing the radial profile of galaxy number to that of the ICM mass, the relative curve was found to depend significantly on the cluster redshift; it drops more steeply toward the outside in lower-redshift subsamples. The same evolution is found in the galaxy-to-total mass profile, while the ICM-to-total mass profile varies in an opposite way. The behavior of the galaxy-to-ICM distribution does not depend on the cluster mass, suggesting that the detected redshift dependence is not due to mass-related effects, such as sample selection bias. Also, it cannot be ascribed to various redshift-dependent systematic errors. We interpret that the galaxies, the ICM, and the dark matter components had similar angular distributions when a cluster was formed, while the galaxies traveling in the interior of the cluster have continuously fallen toward the center relative to the other components, and the ICM has slightly expanded relative to the dark matter although it suffers strong radiative loss. This cosmological galaxy infall, accompanied by an ICM expansion, can be explained by considering that the galaxies interact strongly with the ICM while they are moving through it. The interaction is considered to create a large energy flow of 10^44-45 erg s^-1 per cluster from the member galaxies to their environment, which is expected to continue over cosmological timescales.
NASA Technical Reports Server (NTRS)
Rock, M.; Kunigahalli, V.; Khan, S.; Mcnair, A.
1984-01-01
Sealed nickel cadmium cells having undergone a large number of cycles were discharged using the Hg/HgO reference electrode. The negative electrode exhibited the second plateau. SEM of negative plates of such cells show clusters of large crystals of cadmium hydroxide. These large crystals on the negative plates disappear after continuous overcharging in flooded cells. Atomic Absorption Spectroscopy and standard wet chemical methods are being used to determine the cell materials viz: nickel, cadmium, cobalt, potassium and carbonate. The anodes and cathodes are analyzed after careful examination and the condition of the separator material is evaluated.
NASA Astrophysics Data System (ADS)
Okita, Shin; Verestek, Wolfgang; Sakane, Shinji; Takaki, Tomohiro; Ohno, Munekazu; Shibuta, Yasushi
2017-09-01
Continuous processes of homogeneous nucleation, solidification and grain growth are spontaneously achieved from an undercooled iron melt without any phenomenological parameter in the molecular dynamics (MD) simulation with 12 million atoms. The nucleation rate at the critical temperature is directly estimated from the atomistic configuration by cluster analysis to be of the order of 10^34 m^-3 s^-1. Moreover, time evolution of grain size distribution during grain growth is obtained by the combination of Voronoi and cluster analyses. The grain growth exponent is estimated to be around 0.3 from the geometric average of the grain size distribution. Comprehensive understanding of kinetic properties during continuous processes is achieved in the large-scale MD simulation by utilizing the high parallel efficiency of a graphics processing unit (GPU), which is shedding light on the fundamental aspects of production processes of materials from the atomistic viewpoint.
Lin, Sheng-Hsiang; Liu, Chih-Min; Liu, Yu-Li; Fann, Cathy Shen-Jang; Hsiao, Po-Chang; Wu, Jer-Yuarn; Hung, Shuen-Iu; Chen, Chun-Houh; Wu, Han-Ming; Jou, Yuh-Shan; Liu, Shi K.; Hwang, Tzung J.; Hsieh, Ming H.; Chang, Chien-Ching; Yang, Wei-Chih; Lin, Jin-Jia; Chou, Frank Huang-Chih; Faraone, Stephen V.; Tsuang, Ming T.; Hwu, Hai-Gwo; Chen, Wei J.
2009-01-01
Chromosome 6p is one of the most commonly implicated regions in the genome-wide linkage scans of schizophrenia, whereas further association studies for markers in this region were inconsistent likely due to heterogeneity. This study aimed to identify more homogeneous subgroups of families for fine mapping on regions around markers D6S296 and D6S309 (both in 6p24.3) as well as D6S274 (in 6p22.3) by means of similarity in neurocognitive functioning. A total of 160 families of patients with schizophrenia comprising at least two affected siblings who had data for 8 neurocognitive test variables of the Continuous Performance Test (CPT) and the Wisconsin Card Sorting Test (WCST) were subjected to cluster analysis with data visualization using the test scores of both affected siblings. Family clusters derived were then used separately in family-based association tests for 64 single nucleotide polymorphisms covering the region of 6p24.3 and 6p22.3. Three clusters were derived from the family-based clustering, with deficit cluster 1 representing deficit on the CPT, deficit cluster 2 representing deficit on both the CPT and the WCST, and a third cluster of non-deficit. After adjustment using false discovery rate for multiple testing, SNP rs13873 and haplotype rs1225934-rs13873 on BMP6-TXNDC5 genes were significantly associated with schizophrenia for the deficit cluster 1 but not for the deficit cluster 2 or non-deficit cluster. Our results provide further evidence that the BMP6-TXNDC5 locus on 6p24.3 may play a role in the selective impairments on sustained attention of schizophrenia. PMID:19694819
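The abstract states only that a false discovery rate adjustment was applied across the SNP tests; it does not name the procedure. As an illustration, the sketch below implements the widely used Benjamini-Hochberg step-up rule, which is an assumption here rather than the authors' confirmed method. The p-values in the usage example are invented.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q.

    Step-up rule: find the largest rank k (1-based, p-values sorted ascending)
    with p_(k) <= (k / m) * q, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected

if __name__ == "__main__":
    # Hypothetical p-values for a handful of association tests
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

In a study like this one, the procedure would be run separately within each family cluster, over the 64 SNP tests performed for that cluster.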
Exploring the Dynamics of Exoplanetary Systems in a Young Stellar Cluster
NASA Astrophysics Data System (ADS)
Thornton, Jonathan Daniel; Glaser, Joseph Paul; Wall, Joshua Edward
2018-01-01
I describe a dynamical simulation of planetary systems in a young star cluster. One rather arbitrary aspect of cluster simulations is the choice of initial conditions. These are typically chosen from some standard model, such as Plummer or King, or from a “fractal” distribution to try to model young clumpy systems. Here I adopt the approach of realizing an initial cluster model directly from a detailed magnetohydrodynamical model of cluster formation from a 1000-solar-mass interstellar gas cloud, with magnetic fields and radiative and wind feedback from massive stars included self-consistently. The N-body simulation of the stars and planets starts once star formation is largely over and feedback has cleared much of the gas from the region where the newborn stars reside. It continues until the cluster dissolves in the galactic field. Of particular interest is what would happen to the free-floating planets created in the gas cloud simulation. Are they captured by a star or are they ejected from the cluster? This method of building a dynamical cluster simulation directly from the results of a cluster formation model allows us to better understand the evolution of young star clusters and enriches our understanding of extrasolar planet development in them. These simulations were performed within the AMUSE simulation framework, and combine N-body, multiples and background potential code.
A Massive, Cooling-Flow-Induced Starburst in the Core of a Highly Luminous Galaxy Cluster
NASA Technical Reports Server (NTRS)
McDonald, M.; Bayliss, M.; Benson, B. A.; Foley, R. J.; Ruel, J.; Sullivan, P.; Veilleux, S.; Aird, K. A.; Ashby, M. L. N.; Bautz, M.;
2012-01-01
In the cores of some galaxy clusters the hot intracluster plasma is dense enough that it should cool radiatively in the cluster's lifetime, leading to continuous "cooling flows" of gas sinking towards the cluster center, yet no such cooling flow has been observed. The low observed star formation rates and cool gas masses for these "cool core" clusters suggest that much of the cooling must be offset by astrophysical feedback to prevent the formation of a runaway cooling flow. Here we report X-ray, optical, and infrared observations of the galaxy cluster SPT-CLJ2344-4243 at z = 0.596. These observations reveal an exceptionally luminous (L_2-10 keV = 8.2 × 10^45 erg/s) galaxy cluster which hosts an extremely strong cooling flow (Ṁ_cool = 3820 ± 530 M⊙/yr). Further, the central galaxy in this cluster appears to be experiencing a massive starburst (740 ± 160 M⊙/yr), which suggests that the feedback source responsible for preventing runaway cooling in nearby cool core clusters may not yet be fully established in SPT-CLJ2344-4243. This large star formation rate implies that a significant fraction of the stars in the central galaxy of this cluster may form via accretion of the intracluster medium, rather than the current picture of central galaxies assembling entirely via mergers.
Unsupervised classification of variable stars
NASA Astrophysics Data System (ADS)
Valenzuela, Lucas; Pichara, Karim
2018-03-01
During the past 10 years, a considerable amount of effort has been made to develop algorithms for automatic classification of variable stars. That has been primarily achieved by applying machine learning methods to photometric data sets where objects are represented as light curves. Classifiers require training sets to learn the underlying patterns that allow the separation among classes. Unfortunately, building training sets is an expensive process that demands a lot of human effort. Every time data come from new surveys, the only available training instances are the ones that have a cross-match with previously labelled objects, consequently generating insufficient training sets compared with the large amounts of unlabelled sources. In this work, we present an algorithm that performs unsupervised classification of variable stars, relying only on the similarity among light curves. We tackle the unsupervised classification problem by proposing an untraditional approach. Instead of trying to match classes of stars with clusters found by a clustering algorithm, we propose a query-based method where astronomers can find groups of variable stars ranked by similarity. We also develop a fast similarity function specific for light curves, based on a novel data structure that allows scaling the search over the entire data set of unlabelled objects. Experiments show that our unsupervised model achieves high accuracy in the classification of different types of variable stars and that the proposed algorithm scales up to massive amounts of light curves.
A Locus Encoding Variable Defense Systems against Invading DNA Identified in Streptococcus suis
Okura, Masatoshi; Nozawa, Takashi; Watanabe, Takayasu; Murase, Kazunori; Nakagawa, Ichiro; Takamatsu, Daisuke; Osaki, Makoto; Sekizaki, Tsutomu; Gottschalk, Marcelo; Hamada, Shigeyuki
2017-01-01
Streptococcus suis, an important zoonotic pathogen, is known to have an open pan-genome and to develop a competent state. In S. suis, limited genetic lineages are suggested to be associated with zoonosis. However, little is known about the evolution of diversified lineages and their respective phenotypic or ecological characteristics. In this study, we performed comparative genome analyses of S. suis, with a focus on the competence genes, mobile genetic elements, and genetic elements related to various defense systems against exogenous DNAs (defense elements) that are associated with gene gain/loss/exchange mediated by horizontal DNA movements and their restrictions. Our genome analyses revealed a conserved competence-inducing peptide type (pherotype) of the competence system and large-scale genome rearrangements in certain clusters based on the genome phylogeny of 58 S. suis strains. Moreover, the profiles of the defense elements were similar or identical to each other among the strains belonging to the same genomic clusters. Our findings suggest that these genetic characteristics of each cluster might exert specific effects on the phenotypic or ecological differences between the clusters. We also found certain loci that shift several types of defense elements in S. suis. Of note, one of these loci is a previously unrecognized variable region in bacteria, at which strains of distinct clusters code for different and various defense elements. This locus might represent a novel defense mechanism that has evolved through an arms race between bacteria and invading DNAs, mediated by mobile genetic elements and genetic competence. PMID:28379509
NASA Astrophysics Data System (ADS)
Murphy, Brian W.; Darragh, Andrew; Hettinger, Paul; Hibshman, Adam; Johnson, Elliott W.; Liu, Z. J.; Pajkos, Michael A.; Stephenson, Hunter R.; Vondersaar, John R.; Conroy, Kyle E.; McCombs, Thayne A.; Reinhardt, Erik D.; Toddy, Joseph
2015-08-01
We present the results of an extensive study intended to search for and properly classify the variable stars in five galactic globular clusters. Each of the five clusters was observed hundreds to thousands of times over a time span ranging from 2 to 4 years using the SARA 0.6m located at Cerro Tololo Interamerican Observatory. The images were analyzed using the image subtraction method of Alard (2000) to identify and produce light curves of all variables found in each cluster. In total we identified 373 variables with 140 of these being newly discovered, increasing the number of known variable stars in these clusters by 60%. Of the total we have identified 312 RR Lyrae variables (187 RR0, 18 RR01, 99 RR1, 8 RR2), 9 SX Phe stars, 6 Cepheid variables, 11 eclipsing variables, and 35 long period variables. For IC4499 we identified 64 RR0, 18 RR01, 14 RR1, 4 RR2, 1 SX Phe, 1 eclipsing binary, and 2 long period variables. For NGC4833 we identified 10 RR0, 7 RR1, 2 RR2, 6 SX Phe, 5 eclipsing binaries, and 9 long period variables. For NGC6171 (M107) we identified 13 RR0, 7 RR1, and 1 SX Phe. For NGC6402 (M14) we identified 52 RR0, 56 RR1, 1 RR2, 1 SX Phe, 6 Cepheids, 1 eclipsing binary, and 15 long period variables. For NGC6584 we identified 48 RR0, 15 RR1, 1 RR2, 5 eclipsing binaries, and 9 long period variables. Using the RR Lyrae variables we found the mean V magnitude of the horizontal branch to be V_HB = ⟨V⟩_RR = 17.63, 15.51, 15.72, 17.13, and 16.37 magnitudes for IC4499, NGC4833, NGC6171 (M107), NGC6402 (M14), and NGC6584, respectively. From our extensive data set we were able to obtain sufficient temporal and complete phase coverage of the RR Lyrae variables. This has allowed us not only to properly classify each of the RR Lyrae variables but also to use Fourier decomposition of the light curves to further analyze the properties of the variable stars and hence physical properties of each cluster. 
In this poster we will give the temperature, radius, stellar mass, metallicity, and helium abundance of the set of RR Lyrae variable stars found in each of the five globular clusters.
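To illustrate the Fourier decomposition of light curves mentioned in this abstract, here is a minimal sketch that fits a truncated Fourier series to a phased light curve by linear least squares. The function name `fourier_decompose`, the series order, and the synthetic light curve are assumptions made for illustration; this is not the authors' pipeline, and no calibration from Fourier parameters to physical properties is attempted.

```python
import numpy as np

def fourier_decompose(phase, mag, order=2):
    """Fit mag = A0 + sum_k [a_k cos(2*pi*k*phase) + b_k sin(2*pi*k*phase)].

    Returns (A0, amplitudes), where amplitudes[k-1] = sqrt(a_k**2 + b_k**2).
    """
    phase = np.asarray(phase, dtype=float)
    mag = np.asarray(mag, dtype=float)
    # Design matrix columns: constant term, then a cos/sin pair per harmonic.
    columns = [np.ones_like(phase)]
    for k in range(1, order + 1):
        columns.append(np.cos(2 * np.pi * k * phase))
        columns.append(np.sin(2 * np.pi * k * phase))
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, mag, rcond=None)
    a = coeffs[1::2]  # cosine coefficients a_1, a_2, ...
    b = coeffs[2::2]  # sine coefficients b_1, b_2, ...
    return coeffs[0], np.hypot(a, b)

# Synthetic, noise-free light curve (hypothetical values, not survey data).
phase = np.linspace(0.0, 1.0, 50, endpoint=False)
mag = 15.0 + 0.5 * np.cos(2 * np.pi * phase) + 0.2 * np.sin(4 * np.pi * phase)
mean_mag, amplitudes = fourier_decompose(phase, mag, order=2)
```

With real photometry, the phases would come from folding observation times on the pulsation period, and the series order would be chosen to suit the data quality.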
Zhang, Miao; Bommer, Martin; Chatterjee, Ruchira; Hussein, Rana; Yano, Junko; Dau, Holger; Kern, Jan; Dobbek, Holger; Zouni, Athina
2017-01-01
In plants, algae and cyanobacteria, Photosystem II (PSII) catalyzes the light-driven splitting of water at a protein-bound Mn4CaO5-cluster, the water-oxidizing complex (WOC). In the photosynthetic organisms, the light-driven formation of the WOC from dissolved metal ions is a key process because it is essential in both initial activation and continuous repair of PSII. Structural information is required for understanding of this chaperone-free metal-cluster assembly. For the first time, we obtained a structure of PSII from Thermosynechococcus elongatus without the Mn4CaO5-cluster. Surprisingly, cluster-removal leaves the positions of all coordinating amino acid residues and most nearby water molecules largely unaffected, resulting in a pre-organized ligand shell for kinetically competent and error-free photo-assembly of the Mn4CaO5-cluster. First experiments initiating (i) partial disassembly and (ii) partial re-assembly after complete depletion of the Mn4CaO5-cluster agree with a specific bi-manganese cluster, likely a di-µ-oxo bridged pair of Mn(III) ions, as an assembly intermediate. DOI: http://dx.doi.org/10.7554/eLife.26933.001 PMID:28718766
Diverse hematological phenotypes of β-thalassemia carriers.
Luo, Hong-Yuan; Chui, David H K
2016-03-01
Most β-thalassemia carriers have mild anemia, low mean corpuscular volume and mean corpuscular hemoglobin, and elevated hemoglobin A2 (HbA2). However, there is considerable variability resulting from coinheritance with α- and/or δ-globin gene mutations, dominant inheritance of β-thalassemia mutations, highly unstable variant globin chains, large deletions removing part or all of the β-globin gene cluster, loss of heterozygosity of the β-globin gene cluster during development, or concomitant erythroid enzyme or membrane protein abnormalities. Recognition of the specific abnormality and correct diagnosis can allay anxiety and unnecessary investigation, help formulate treatment programs, and deliver appropriate genetic and family counseling. © 2016 New York Academy of Sciences.
The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies
NASA Astrophysics Data System (ADS)
Grasha, K.; Calzetti, D.; Adamo, A.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Dale, D. A.; Fumagalli, M.; Grebel, E. K.; Johnson, K. E.; Kahre, L.; Kennicutt, R. C.; Messa, M.; Pellerin, A.; Ryon, J. E.; Smith, L. J.; Shabani, F.; Thilker, D.; Ubeda, L.
2017-05-01
We present a study of the hierarchical clustering of the young stellar clusters in six local (3-15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, which become more homogeneously distributed after ~40-60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct from, and more strongly correlated than, that of compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.
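The two-point correlation function used in this abstract can be sketched with the simple "natural" estimator, ω = DD/RR − 1, where DD and RR are normalized pair counts for the data and for a random comparison catalog. The sketch below assumes a flat-sky (small-angle) geometry on a unit square, synthetic clumped positions, and illustrative bin edges; the abstract does not say which estimator or edge corrections the authors used, so this is not their procedure.

```python
import numpy as np

def pair_separations(points):
    """Return the separations of all unique pairs of 2-D points."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    return dist[upper]

def two_point_correlation(data, randoms, bin_edges):
    """Natural estimator omega = DD/RR - 1 using normalized pair counts."""
    dd, _ = np.histogram(pair_separations(data), bins=bin_edges)
    rr, _ = np.histogram(pair_separations(randoms), bins=bin_edges)
    n_d, n_r = len(data), len(randoms)
    dd_norm = dd / (n_d * (n_d - 1) / 2)
    rr_norm = rr / (n_r * (n_r - 1) / 2)
    return dd_norm / rr_norm - 1.0

# Synthetic example: 10 tight clumps of 20 points each (hypothetical data).
rng = np.random.default_rng(0)
centers = rng.uniform(0.1, 0.9, size=(10, 2))
data = np.repeat(centers, 20, axis=0) + rng.normal(0.0, 0.005, size=(200, 2))
randoms = rng.uniform(0.0, 1.0, size=(1000, 2))
bin_edges = np.array([0.0, 0.02, 0.05, 0.1, 0.2])
omega = two_point_correlation(data, randoms, bin_edges)
```

For clumped data like this, ω is strongly positive in the smallest separation bin and falls off at larger separations, which is the qualitative signature of hierarchical clustering described in the abstract.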
Cabezas, Carmen; Advani, Mamta; Puente, Diana; Rodriguez-Blanco, Teresa; Martin, Carlos
2011-09-01
To evaluate the effectiveness in primary care of a stepped smoking cessation intervention based on the transtheoretical model of change. Cluster randomized trial; unit of randomization: basic care unit (family physician and nurse who care for the same group of patients); and intention-to-treat analysis. All interested basic care units (n = 176) that worked in 82 primary care centres belonging to the Spanish Preventive Services and Health Promotion Research Network in 13 regions of Spain. A total of 2,827 smokers (aged 14-85 years) who consulted a primary care centre for any reason, provided written informed consent and had valid interviews. The outcome variable was the 1-year continuous abstinence rate at the 2-year follow-up. The main variable was the study group (intervention/control). Intervention involved 6-month implementation of recommendations from a Clinical Practice Guideline which included brief motivational interviews for smokers at the precontemplation-contemplation stage, brief intervention for smokers in preparation-action who do not want help, intensive intervention with pharmacotherapy for smokers in preparation-action who want help and reinforcing intervention in the maintenance stage. Control group involved usual care. Among others, characteristics of tobacco use and motivation to quit variables were also collected. The 1-year continuous abstinence rate at the 2-year follow-up was 8.1% in the intervention group and 5.8% in the control group (P = 0.014). In the multivariate logistic regression, the odds of quitting of the intervention versus control group was 1.50 (95% confidence interval = 1.05-2.14). A stepped smoking cessation intervention based on the transtheoretical model significantly increased smoking abstinence at a 2-year follow-up among smokers visiting primary care centres. © 2011 The Authors, Addiction © 2011 Society for the Study of Addiction.
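As a rough arithmetic check on the abstract above, the crude (unadjusted) odds ratio can be computed from the two reported abstinence rates. The sketch assumes only the rounded percentages given in the abstract; it does not reproduce the multivariate logistic regression, so its result is expected to differ from the adjusted odds ratio of 1.50 that the authors report.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Rounded 1-year continuous abstinence rates taken from the abstract.
p_intervention = 0.081
p_control = 0.058

crude_or = odds(p_intervention) / odds(p_control)
print(f"crude odds ratio = {crude_or:.2f}")  # prints "crude odds ratio = 1.43"
```

The gap between this crude value and the reported 1.50 is consistent with the published estimate being adjusted for covariates.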
Stellar Variability in the Intermediate Age Cluster NGC 1846
NASA Astrophysics Data System (ADS)
Pajkos, Michael A.; Salinas, Ricardo; Vivas, Anna Katherina; Strader, Jay; Contreras, Rodrigo
2017-01-01
The existence of multiple stellar populations in Galactic globular clusters is considered a widespread phenomenon, with only a few possible exceptions. In the LMC intermediate-age globular clusters, the presence of extended main sequence turn off points (MSTOs), initially interpreted as evidence for multiple stellar populations, is now under scrutiny and stellar rotation has emerged as an alternative explanation. Here we propose yet another ingredient to this puzzle: the fact that the MSTO of these clusters passes through the instability strip making stellar variability a new alternative to explain this phenomenon. We report the first in-depth characterization of the variability, at the MSTO level, in any LMC cluster, and assess the role of variability masquerading as multiple stellar populations. We used the Gemini-S/GMOS to obtain time series photometry of NGC 1846. Using differencing image analysis, we identified 90 variables in the r-band, 68 of which were also found in the g-band. Of these 68, 57 were δ-scuti—with 35 having full phase coverage and 22 without. The average full period (Pfull) was 1.93 ± 0.79 hours. Furthermore, two eclipsing binaries and two RR Lyrae identified by OGLE were recovered. We conclude that not enough variables were found to provide a statistically significant impact on the extended MSTO, nor to explain the bifurcation of MSTO in NGC 1846. But the effect of variable stars could still be a viable explanation on clusters where only a hint of a MS extension is seen.
An AO-assisted Variability Study of Four Globular Clusters
NASA Astrophysics Data System (ADS)
Salinas, R.; Contreras Ramos, R.; Strader, J.; Hakala, P.; Catelan, M.; Peacock, M. B.; Simunovic, M.
2016-09-01
The image-subtraction technique applied to study variable stars in globular clusters represented a leap in the number of new detections, with the drawback that many of these new light curves could not be transformed to magnitudes due to severe crowding. In this paper, we present observations of four Galactic globular clusters, M 2 (NGC 7089), M 10 (NGC 6254), M 80 (NGC 6093), and NGC 1261, taken with the ground-layer adaptive optics module at the SOAR Telescope, SAM. We show that the higher image quality provided by SAM allows for the calibration of the light curves of the great majority of the variables near the cores of these clusters as well as the detection of new variables, even in clusters where image-subtraction searches were already conducted. We report the discovery of 15 new variables in M 2 (12 RR Lyrae stars and 3 SX Phe stars), 12 new variables in M 10 (11 SX Phe and 1 long-period variable), and 1 new W UMa-type variable in NGC 1261. No new detections are found in M 80, but previous uncertain detections are confirmed and the corresponding light curves are calibrated into magnitudes. Additionally, based on the number of detected variables and new Hubble Space Telescope/UVIS photometry, we revisit a previous suggestion that M 80 may be the globular cluster with the richest population of blue stragglers in our Galaxy. Based on observations obtained at the Southern Astrophysical Research (SOAR) telescope, which is a joint project of the Ministério da Ciência, Tecnologia, e Inovação (MCTI) da República Federativa do Brasil, the U.S. National Optical Astronomy Observatory (NOAO), the University of North Carolina at Chapel Hill (UNC), and Michigan State University (MSU).
Anonymous broadcasting of classical information with a continuous-variable topological quantum code
NASA Astrophysics Data System (ADS)
Menicucci, Nicolas C.; Baragiola, Ben Q.; Demarie, Tommaso F.; Brennen, Gavin K.
2018-03-01
Broadcasting information anonymously becomes more difficult as surveillance technology improves, but remarkably, quantum protocols exist that enable provably traceless broadcasting. The difficulty is making scalable entangled resource states that are robust to errors. We propose an anonymous broadcasting protocol that uses a continuous-variable surface-code state that can be produced using current technology. High squeezing enables large transmission bandwidth and strong anonymity, and the topological nature of the state enables local error mitigation.
Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M
2015-05-01
To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.
Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor
2015-01-01
Abstract To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice. PMID:25560745
Unequal cluster sizes in stepped-wedge cluster randomised trials: a systematic review.
Kristunas, Caroline; Morris, Tom; Gray, Laura
2017-11-15
To investigate the extent to which cluster sizes vary in stepped-wedge cluster randomised trials (SW-CRT) and whether any variability is accounted for during the sample size calculation and analysis of these trials. Trials in any setting, not limited to healthcare, and with any participants were eligible, provided they were SW-CRTs published up to March 2016. The primary outcome is the variability in cluster sizes, measured by the coefficient of variation (CV) in cluster size. Secondary outcomes include the difference between the cluster sizes assumed during the sample size calculation and those observed during the trial, any reported variability in cluster sizes and whether the methods of sample size calculation and methods of analysis accounted for any variability in cluster sizes. Of the 101 included SW-CRTs, 48% mentioned that the included clusters were known to vary in size, yet only 13% of these accounted for this during the calculation of the sample size. However, 69% of the trials did use a method of analysis appropriate for when clusters vary in size. Full trial reports were available for 53 trials. The CV was calculated for 23 of these: the median CV was 0.41 (IQR: 0.22-0.52). Actual cluster sizes could be compared with those assumed during the sample size calculation for 14 (26%) of the trial reports; the cluster sizes were between 29% and 480% of that which had been assumed. Cluster sizes often vary in SW-CRTs. Reporting of SW-CRTs also remains suboptimal. The effect of unequal cluster sizes on the statistical power of SW-CRTs needs further exploration and methods appropriate to studies with unequal cluster sizes need to be employed. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
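The primary outcome in this abstract, the coefficient of variation (CV) of cluster size, is the standard deviation of the cluster sizes divided by their mean. A minimal sketch follows; the cluster sizes are made up for illustration, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the abstract does not say which definition the review used.

```python
import statistics

def coefficient_of_variation(cluster_sizes):
    """CV = sample standard deviation / mean of the cluster sizes."""
    mean = statistics.mean(cluster_sizes)
    sd = statistics.stdev(cluster_sizes)  # sample SD (n - 1 denominator)
    return sd / mean

# Hypothetical cluster sizes for a five-cluster trial (not from the review).
sizes = [20, 35, 50, 65, 80]
cv = coefficient_of_variation(sizes)
print(f"CV = {cv:.2f}")
```

For these illustrative sizes the CV is about 0.47, which falls inside the interquartile range (0.22-0.52) reported in the abstract.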
Testing for entanglement with periodic coarse graining
NASA Astrophysics Data System (ADS)
Tasca, D. S.; Rudnicki, Łukasz; Aspden, R. S.; Padgett, M. J.; Souto Ribeiro, P. H.; Walborn, S. P.
2018-04-01
Continuous-variable systems find valuable applications in quantum information processing. To deal with an infinite-dimensional Hilbert space, one in general has to handle large numbers of discretized measurements in tasks such as entanglement detection. Here we employ the continuous transverse spatial variables of photon pairs to experimentally demonstrate entanglement criteria based on a periodic structure of coarse-grained measurements. The periodization of the measurements allows an efficient evaluation of entanglement using spatial masks acting as mode analyzers over the entire transverse field distribution of the photons and without the need to reconstruct the probability densities of the conjugate continuous variables. Our experimental results demonstrate the utility of the derived criteria with a success rate in entanglement detection of ~60% relative to 7344 studied cases.
Composable security proof for continuous-variable quantum key distribution with coherent States.
Leverrier, Anthony
2015-02-20
We give the first composable security proof for continuous-variable quantum key distribution with coherent states against collective attacks. Crucially, in the limit of large blocks the secret key rate converges to the usual value computed from the Holevo bound. Combining our proof with either the de Finetti theorem or the postselection technique then shows the security of the protocol against general attacks, thereby confirming the long-standing conjecture that Gaussian attacks are optimal asymptotically in the composable security framework. We expect that our parameter estimation procedure, which does not rely on any assumption about the quantum state being measured, will find applications elsewhere, for instance, for the reliable quantification of continuous-variable entanglement in finite-size settings.
How large is large? Identifying large corporate ownerships in FIA datasets
Jesse Caputo; Brett Butler; Andy Hartsell
2017-01-01
Forest ownership size is a continuous variable, albeit one with a distinctly nonnormal distribution. Although large corporate forest ownerships are expected to differ in terms of behavior and objectives from smaller corporate ownerships, there is no clear and unambiguous means of defining these two ownership groups. We examined the distribution of the ownership size...
Variable Stars in M13. II. The Red Variables and the Globular Cluster Period-Luminosity Relation
NASA Astrophysics Data System (ADS)
Osborn, W.; Layden, A.; Kopacki, G.; Smith, H.; Anderson, M.; Kelly, A.; McBride, K.; Pritzl, B.
2017-06-01
New CCD observations have been combined with archival data to investigate the nature of the red variables in the globular cluster M13. Mean magnitudes, colors and variation ranges on the UBVIC system have been determined for the 17 cataloged red variables. 15 of the stars are irregular or semi-regular variables that lie at the top of the red giant branch in the color-magnitude diagram. Two stars are not, including one with a well-defined period and a light curve shape indicating it is an ellipsoidal or eclipsing variable. All stars redder than (V-IC)0=1.38 mag vary, with the amplitudes being larger with increased stellar luminosity and with bluer filter passband. Searches of the data for periodicities yielded typical variability cycle times ranging from 30 d up to 92 d for the most luminous star. Several stars have evidence of multiple periods. The stars' period-luminosity diagram compared to those from microlensing survey data shows that most M13 red variables are overtone pulsators. Comparison with the diagrams for other globular clusters shows a correlation between red variable luminosity and cluster metallicity.
NASA Astrophysics Data System (ADS)
Kwok, Ron
2015-09-01
After the summer of 2013, a convergence-induced tail in the thickness distribution of the ice cover is found along the Arctic coasts of Greenland and Canadian Arctic Archipelago. Prompted by this, a normalized ice convergence index (ICI) is introduced to examine the variability and extremes in convergence in a 23 year record (1992-2014) of monthly ice drift. Large-scale composites of circulation patterns, characteristic of regional convergence and divergence, are examined. Indeed, the ICI shows the June 2013 convergence event to be an extreme (i.e., ICI > 2). Furthermore, there is a cluster of 9 months over a 17 month period with positive ICIs (i.e., >1) following the record summer minimum ice extent (SMIE) in 2012; the imprint of ice dynamics from this cluster of positive ICIs likely contributed to higher SMIEs in 2013 and 2014. The impact of convergence on SMIE is discussed, and the increase in Arctic ice volume in 2013 is underscored.
VizieR Online Data Catalog: OGLE RR Lyrae in LMC (Soszynski+, 2003)
NASA Astrophysics Data System (ADS)
Soszynski, I.; Udalski, A.; Szymanski, M.; Kubiak, M.; Pietrzynski, G.; Wozniak, P.; Zebrun, K.; Szewczyk, O.; Wyrzykowski, L.
2003-11-01
We present the catalog of RR Lyr stars discovered in a 4.5 square degrees area in the central parts of the Large Magellanic Cloud (LMC). The presented sample contains 7612 objects, including 5455 fundamental mode pulsators (RRab), 1655 first-overtone (RRc), 272 second-overtone (RRe) and 230 double-mode RR Lyr stars (RRd). Additionally we attach a list of several dozen other short-period pulsating variables. The catalog data include astrometry, periods, BVI photometry, amplitudes, and parameters of the Fourier decomposition of the I-band light curve of each object. We provide a list of six LMC star clusters which contain RR Lyr stars. The richest cluster, NGC 1835, hosts 84 RR Lyr variables. The period distribution of these stars suggests that NGC1835 shares features of Oosterhoff type I and type II groups. All presented data, including individual BVI observations and finding charts are available from the OGLE Internet archive at ftp://sirius.astrouw.edu.pl/ogle/ogle2/var_stars/lmc/rrlyr (6 data files).
A Real-Time PCR with Melting Curve Analysis for Molecular Typing of Vibrio parahaemolyticus.
He, Peiyan; Wang, Henghui; Luo, Jianyong; Yan, Yong; Chen, Zhongwen
2018-05-23
Foodborne disease caused by Vibrio parahaemolyticus is a serious public health problem in many countries. Molecular typing has great scientific significance and application value for epidemiological research on V. parahaemolyticus. In this study, a real-time PCR with melting curve analysis was established for molecular typing of V. parahaemolyticus. Eighteen large variably presented gene clusters (LVPCs) of V. parahaemolyticus, which have different distributions in the genomes of different strains, were selected as targets. Primer pairs for the 18 LVPCs were distributed into three tubes. To validate this newly developed assay, we tested 53 Vibrio parahaemolyticus strains, which were classified into 13 different types. Furthermore, cluster analysis using NTSYS PC 2.02 software could divide the 53 V. parahaemolyticus strains into six clusters at a relative similarity coefficient of 0.85. This method is fast, simple, and convenient for molecular typing of V. parahaemolyticus.
Typology of patients with fibromyalgia: cluster analysis of duloxetine study patients.
Lipkovich, Ilya A; Choy, Ernest H; Van Wambeke, Peter; Deberdt, Walter; Sagman, Doron
2014-12-23
To identify distinct groups of patients with fibromyalgia (FM) with respect to multiple outcome measures. Data from 631 duloxetine-treated women in 4 randomized, placebo-controlled trials were included in a cluster analysis based on outcomes after up to 12 weeks of treatment. Corresponding classification rules were constructed using a classification tree method. Probabilities for transitioning from baseline to Week 12 category were estimated for placebo and duloxetine patients (Ntotal = 1188) using logistic regression. Five clusters were identified, from "worst" (high pain levels and severe mental/physical impairment) to "best" (low pain levels and nearly normal mental/physical function). For patients with moderate overall severity, mental and physical symptoms were less correlated, resulting in 2 distinct clusters based on these 2 symptom domains. Three key variables with threshold values were identified for classification of patients: Brief Pain Inventory (BPI) pain interference overall scores of <3.29 and <7.14, respectively, a Fibromyalgia Impact Questionnaire (FIQ) interference with work score of <2, and an FIQ depression score of ≥5. Patient characteristics and frequencies per baseline category were similar between treatments; >80% of patients were in the 3 worst categories. Duloxetine patients were significantly more likely to improve after 12 weeks than placebo patients. A sustained effect was seen with continued duloxetine treatment. FM patients are heterogeneous and can be classified into distinct subgroups by simple descriptive rules derived from only 3 variables, which may guide individual patient management. Duloxetine showed higher improvement rates than placebo and had a sustained effect beyond 12 weeks.
Large Magellanic Cloud Near-infrared Synoptic Survey. IV. Leavitt Laws for Type II Cepheid Variables
NASA Astrophysics Data System (ADS)
Bhardwaj, Anupam; Macri, Lucas M.; Rejkuba, Marina; Kanbur, Shashi M.; Ngeow, Chow-Choong; Singh, Harinder P.
2017-04-01
We present time-series observations of Population II Cepheids in the Large Magellanic Cloud at near-infrared (JHK_s) wavelengths. Our sample consists of 81 variables with accurate periods and optical (VI) magnitudes from the OGLE survey, covering various subtypes of pulsators (BL Herculis, W Virginis, and RV Tauri). We generate light-curve templates using high-quality I-band data in the LMC from OGLE and K_s-band data in the Galactic bulge from VISTA Variables in Via Láctea survey and use them to obtain robust mean magnitudes. We derive period-luminosity (P-L) relations in the near-infrared and Period-Wesenheit (P-W) relations by combining optical and near-infrared data. Our P-L and P-W relations are consistent with published work when excluding long-period RV Tauris. We find that Pop II Cepheids and RR Lyraes follow the same P-L relations in the LMC. Therefore, we use trigonometric parallax from the Gaia DR1 for VY Pyx and the Hubble Space Telescope parallaxes for κ Pav and 5 RR Lyrae variables to obtain an absolute calibration of the Galactic K_s-band P-L relation, resulting in a distance modulus to the LMC of μ_LMC = 18.54 ± 0.08 mag. We update the mean magnitudes of Pop II Cepheids in Galactic globular clusters using our light-curve templates and obtain distance estimates to those systems, anchored to a precise late-type eclipsing binary distance to the LMC. We find that the distances to these globular clusters based on Pop II Cepheids are consistent (within 2σ) with estimates based on the M_V-[Fe/H] relation for horizontal branch stars.
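For readers who want the distance modulus quoted above in physical units, the standard relation μ = 5 log10(d / 10 pc) can be inverted as in the sketch below. The function names are illustrative, and the first-order error propagation is an assumption of this sketch rather than something stated in the abstract.

```python
import math

def distance_pc(mu):
    """Invert mu = 5 * log10(d / 10 pc) to get the distance in parsecs."""
    return 10 ** (mu / 5.0 + 1.0)

def distance_error_pc(mu, sigma_mu):
    """First-order propagation: sigma_d = d * ln(10) / 5 * sigma_mu."""
    return distance_pc(mu) * math.log(10) / 5.0 * sigma_mu

mu_lmc, sigma_mu = 18.54, 0.08  # values quoted in the abstract
d = distance_pc(mu_lmc)
sigma_d = distance_error_pc(mu_lmc, sigma_mu)
print(f"d = {d / 1000:.1f} kpc +/- {sigma_d / 1000:.1f} kpc")
```

Under these assumptions the quoted modulus corresponds to a distance of roughly 51 kpc, with an uncertainty of about 2 kpc.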
Multiplicative Forests for Continuous-Time Processes
Weiss, Jeremy C.; Natarajan, Sriraam; Page, David
2013-01-01
Learning temporal dependencies between variables over continuous time is an important and challenging task. Continuous-time Bayesian networks effectively model such processes but are limited by the number of conditional intensity matrices, which grows exponentially in the number of parents per variable. We develop a partition-based representation using regression trees and forests whose parameter spaces grow linearly in the number of node splits. Using a multiplicative assumption we show how to update the forest likelihood in closed form, producing efficient model updates. Our results show multiplicative forests can be learned from few temporal trajectories with large gains in performance and scalability. PMID:25284967
Multiplicative Forests for Continuous-Time Processes.
Weiss, Jeremy C; Natarajan, Sriraam; Page, David
2012-01-01
Learning temporal dependencies between variables over continuous time is an important and challenging task. Continuous-time Bayesian networks effectively model such processes but are limited by the number of conditional intensity matrices, which grows exponentially in the number of parents per variable. We develop a partition-based representation using regression trees and forests whose parameter spaces grow linearly in the number of node splits. Using a multiplicative assumption we show how to update the forest likelihood in closed form, producing efficient model updates. Our results show multiplicative forests can be learned from few temporal trajectories with large gains in performance and scalability.
The Optical Gravitational Lensing Experiment. Catalog of RR Lyr Stars in the Large Magellanic Cloud
NASA Astrophysics Data System (ADS)
Soszynski, I.; Udalski, A.; Szymanski, M.; Kubiak, M.; Pietrzynski, G.; Wozniak, P.; Zebrun, K.; Szewczyk, O.; Wyrzykowski, L.
2003-06-01
We present the catalog of RR Lyr stars discovered in a 4.5 square degrees area in the central parts of the Large Magellanic Cloud (LMC). The presented sample contains 7612 objects, including 5455 fundamental mode pulsators (RRab), 1655 first-overtone (RRc), 272 second-overtone (RRe) and 230 double-mode RR Lyr stars (RRd). Additionally we attach a list of several dozen other short-period pulsating variables. The catalog data include astrometry, periods, BVI photometry, amplitudes, and parameters of the Fourier decomposition of the I-band light curve of each object. We present a density map of RR Lyr stars in the observed fields which shows that the variables are strongly concentrated toward the LMC center. The modal values of the period distribution for RRab, RRc and RRe stars are 0.573, 0.339 and 0.276 days, respectively. The period-luminosity diagrams for BVI magnitudes and for the extinction-insensitive index W_I are constructed. We provide the log P-I, log P-V and log P-W_I relations for RRab, RRc and RRe stars. The mean observed V-band magnitudes of RR Lyr stars in the LMC are 19.36 mag and 19.31 mag for ab and c types, respectively, while the extinction-free values are 18.91 mag and 18.89 mag. We found a large number of RR Lyr stars pulsating in two modes closely spaced in the power spectrum. These stars are believed to exhibit non-radial pulsating modes. We discovered three stars which simultaneously reveal RR Lyr-type and eclipsing-type variability. If any of these objects were an eclipsing binary system containing an RR Lyr star, then for the first time the direct determination of the mass of an RR Lyr variable would be possible. We provide a list of six LMC star clusters which contain RR Lyr stars. The richest cluster, NGC 1835, hosts 84 RR Lyr variables. The period distribution of these stars suggests that NGC1835 shares features of Oosterhoff type I and type II groups.
All presented data, including individual BVI observations and finding charts are available from the OGLE Internet archive.
Dahlö, Martin; Scofield, Douglas G; Schaal, Wesley; Spjuth, Ola
2018-05-01
Next-generation sequencing (NGS) has transformed the life sciences, and many research groups are newly dependent upon computer clusters to store and analyze large datasets. This creates challenges for e-infrastructures accustomed to hosting computationally mature research in other sciences. Using data gathered from our own clusters at UPPMAX computing center at Uppsala University, Sweden, where core hour usage of ∼800 NGS and ∼200 non-NGS projects is now similar, we compare and contrast the growth, administrative burden, and cluster usage of NGS projects with projects from other sciences. The number of NGS projects has grown rapidly since 2010, with growth driven by entry of new research groups. Storage used by NGS projects has grown more rapidly since 2013 and is now limited by disk capacity. NGS users submit nearly twice as many support tickets per user, and 11 more tools are installed each month for NGS projects than for non-NGS projects. We developed usage and efficiency metrics and show that computing jobs for NGS projects use more RAM than non-NGS projects, are more variable in core usage, and rarely span multiple nodes. NGS jobs use booked resources less efficiently for a variety of reasons. Active monitoring can improve this somewhat. Hosting NGS projects imposes a large administrative burden at UPPMAX due to large numbers of inexperienced users and diverse and rapidly evolving research areas. We provide a set of recommendations for e-infrastructures that host NGS research projects. We provide anonymized versions of our storage, job, and efficiency databases.
Exploring the origin of a large cavity in Abell 1795 using deep Chandra observations
NASA Astrophysics Data System (ADS)
Walker, S. A.; Fabian, A. C.; Kosec, P.
2014-12-01
We examine deep stacked Chandra observations of the galaxy cluster Abell 1795 (over 700 ks) to study in depth a large (34 kpc radius) cavity in the X-ray emission. Curiously, despite the large energy required to form this cavity (4PV = 4 × 10^60 erg), there is no obvious counterpart to the cavity on the opposite side of the cluster, which would be expected if it has formed due to jets from the central active galactic nucleus (AGN) inflating bubbles. There is also no radio emission associated with the cavity, and no metal enhancement or filaments between it and the brightest cluster galaxy, which are normally found for bubbles inflated by AGN which have risen from the core. One possibility is that this is an old ghost cavity, and that gas sloshing has dominated the distribution of metals around the core. Projection effects, particularly the long X-ray bright filament to the south-east, may prevent us from seeing the companion bubble on the opposite side of the cluster core. We calculate that such a companion bubble would easily have been able to uplift the gas in the southern filament from the core. Interestingly, it has recently been found that inside the cavity is a highly variable X-ray point source coincident with a small dwarf galaxy. Given the remarkable spatial correlation of this point source and the X-ray cavity, we explore the possibility that an outburst from this dwarf galaxy in the past could have led to the formation of the cavity, but find this to be an unlikely scenario.
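As a rough illustration of the energy scale quoted above, the sketch below back-computes the pressure implied by the 4PV figure. It assumes a perfectly spherical cavity of 34 kpc radius and reads the quoted energy as 4 × 10^60 erg; the function name and the spherical geometry are assumptions made here for illustration and are not taken from the paper.

```python
import math

KPC_IN_CM = 3.0857e21  # one kiloparsec in centimetres


def implied_pressure(radius_kpc, enthalpy_erg, factor=4.0):
    """Pressure (erg/cm^3) for which factor * P * V equals enthalpy_erg.

    Assumes a spherical cavity; factor=4 corresponds to the 4PV enthalpy
    commonly used for a cavity filled with relativistic plasma.
    """
    radius_cm = radius_kpc * KPC_IN_CM
    volume_cm3 = 4.0 / 3.0 * math.pi * radius_cm ** 3
    return enthalpy_erg / (factor * volume_cm3)


pressure = implied_pressure(radius_kpc=34.0, enthalpy_erg=4e60)
print(f"implied pressure: {pressure:.2e} erg/cm^3")
```

Under these assumptions the implied pressure is of order 10^-10 erg/cm^3; a non-spherical cavity would change the volume and therefore the estimate.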
2018-01-01
Background: Next-generation sequencing (NGS) has transformed the life sciences, and many research groups are newly dependent upon computer clusters to store and analyze large datasets. This creates challenges for e-infrastructures accustomed to hosting computationally mature research in other sciences. Using data gathered from our own clusters at UPPMAX computing center at Uppsala University, Sweden, where core hour usage of ∼800 NGS and ∼200 non-NGS projects is now similar, we compare and contrast the growth, administrative burden, and cluster usage of NGS projects with projects from other sciences. Results: The number of NGS projects has grown rapidly since 2010, with growth driven by entry of new research groups. Storage used by NGS projects has grown more rapidly since 2013 and is now limited by disk capacity. NGS users submit nearly twice as many support tickets per user, and 11 more tools are installed each month for NGS projects than for non-NGS projects. We developed usage and efficiency metrics and show that computing jobs for NGS projects use more RAM than non-NGS projects, are more variable in core usage, and rarely span multiple nodes. NGS jobs use booked resources less efficiently for a variety of reasons. Active monitoring can improve this somewhat. Conclusions: Hosting NGS projects imposes a large administrative burden at UPPMAX due to large numbers of inexperienced users and diverse and rapidly evolving research areas. We provide a set of recommendations for e-infrastructures that host NGS research projects. We provide anonymized versions of our storage, job, and efficiency databases. PMID:29659792
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ku, Wai Lim; Girvan, Michelle; Ott, Edward
In this paper, we study dynamical systems in which a large number N of identical Landau-Stuart oscillators are globally coupled via a mean-field. Previously, it has been observed that this type of system can exhibit a variety of different dynamical behaviors. These behaviors include time periodic cluster states in which each oscillator is in one of a small number of groups for which all oscillators in each group have the same state which is different from group to group, as well as a behavior in which all oscillators have different states and the macroscopic dynamics of the mean field is chaotic. We argue that this second type of behavior is “extensive” in the sense that the chaotic attractor in the full phase space of the system has a fractal dimension that scales linearly with N and that the number of positive Lyapunov exponents of the attractor also scales linearly with N. An important focus of this paper is the transition between cluster states and extensive chaos as the system is subjected to slow adiabatic parameter change. We observe discontinuous transitions between the cluster states (which correspond to low dimensional dynamics) and the extensively chaotic states. Furthermore, examining the cluster state, as the system approaches the discontinuous transition to extensive chaos, we find that the oscillator population distribution between the clusters continually evolves so that the cluster state is always marginally stable. This behavior is used to reveal the mechanism of the discontinuous transition. We also apply the Kaplan-Yorke formula to study the fractal structure of the extensively chaotic attractors.
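The Kaplan-Yorke formula mentioned above estimates an attractor's fractal dimension from its Lyapunov spectrum. The following is a minimal sketch of the standard formula only; the function name and the example exponents (approximate textbook values for the Lorenz system) are illustrative assumptions and do not come from the paper.

```python
def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke (Lyapunov) dimension of a Lyapunov spectrum.

    With exponents sorted in decreasing order, find the largest j whose
    first j exponents have a non-negative sum, then interpolate linearly
    into exponent j + 1.
    """
    lam = sorted(exponents, reverse=True)
    partial = 0.0
    for j, value in enumerate(lam):
        if partial + value < 0:
            # The first j exponents sum to `partial` >= 0; adding the next
            # one would make the sum negative, so interpolate here.
            return j + partial / abs(value)
        partial += value
    # The sum never becomes negative: the estimate saturates at the
    # number of exponents supplied.
    return float(len(lam))


# Illustrative spectrum (approximate Lorenz-system values, assumed here).
lorenz_dimension = kaplan_yorke_dimension([0.906, 0.0, -14.572])
print(lorenz_dimension)
```

If the dimension grows linearly with N, as the abstract argues for the extensively chaotic states, applying this function to spectra computed at increasing N should give values that scale proportionally.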
Multivariate Statistical Analysis of MSL APXS Bulk Geochemical Data
NASA Astrophysics Data System (ADS)
Hamilton, V. E.; Edwards, C. S.; Thompson, L. M.; Schmidt, M. E.
2014-12-01
We apply cluster and factor analyses to bulk chemical data of 130 soil and rock samples measured by the Alpha Particle X-ray Spectrometer (APXS) on the Mars Science Laboratory (MSL) rover Curiosity through sol 650. Multivariate approaches such as principal components analysis (PCA), cluster analysis, and factor analysis complement more traditional approaches (e.g., Harker diagrams), with the advantage of simultaneously examining the relationships between multiple variables for large numbers of samples. Principal components analysis has been applied with success to APXS, Pancam, and Mössbauer data from the Mars Exploration Rovers. Factor analysis and cluster analysis have been applied with success to thermal infrared (TIR) spectral data of Mars. Cluster analyses group the input data by similarity, where there are a number of different methods for defining similarity (hierarchical, density, distribution, etc.). For example, without any assumptions about the chemical contributions of surface dust, preliminary hierarchical and K-means cluster analyses clearly distinguish the physically adjacent rock targets Windjana and Stephen as being distinctly different from lithologies observed prior to Curiosity's arrival at The Kimberley. In addition, they are separated from each other, consistent with chemical trends observed in variation diagrams but without requiring assumptions about chemical relationships. We will discuss the variation in cluster analysis results as a function of clustering method and pre-processing (e.g., log transformation, correction for dust cover) and implications for interpreting chemical data. Factor analysis shares some similarities with PCA, and examines the variability among observed components of a dataset so as to reveal variations attributable to unobserved components.
Factor analysis has been used to extract the TIR spectra of components that are typically observed in mixtures and only rarely in isolation; there is the potential for similar results with data from APXS. These techniques offer new ways to understand the chemical relationships between the materials interrogated by Curiosity, and potentially their relation to materials observed by APXS instruments on other landed missions.
NASA Astrophysics Data System (ADS)
Piper, David; Kunz, Michael; Ehmele, Florian; Mohr, Susanna; Mühr, Bernhard; Kron, Andreas; Daniell, James
2016-12-01
During a 15-day episode from 26 May to 9 June 2016, Germany was affected by an exceptionally large number of severe thunderstorms. Heavy rainfall, related flash floods and creek flooding, hail, and tornadoes caused substantial losses running into billions of euros (EUR). This paper analyzes the key features of the severe thunderstorm episode using extreme value statistics, an aggregated precipitation severity index, and two different objective weather-type classification schemes. It is shown that the thunderstorm episode was caused by the interaction of high moisture content, low thermal stability, weak wind speed, and large-scale lifting by surface lows, persisting over almost 2 weeks due to atmospheric blocking. For the long-term assessment of the recent thunderstorm episode, we draw comparisons to a 55-year period (1960-2014) regarding clusters of convective days with variable length (2-15 days) based on precipitation severity, convection-favoring weather patterns, and compound events with low stability and weak flow. It is found that clusters with more than 8 consecutive convective days are very rare. For example, a 10-day cluster with convective weather patterns prevailing during the recent thunderstorm episode has a probability of less than 1 %.
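The notion of a cluster of consecutive convective days can be made concrete with a run-length count. The sketch below is only an illustration: it assumes each day has already been classified as convective or not (the paper's classification criteria are not reproduced), and the function names and toy series are hypothetical.

```python
from itertools import groupby


def convective_run_lengths(flags):
    """Lengths of each run of consecutive convective days.

    `flags` is a sequence of booleans, one per day, True for a day
    classified as convective.
    """
    return [sum(1 for _ in run) for is_convective, run in groupby(flags) if is_convective]


def count_clusters(flags, min_length):
    """Number of runs lasting at least `min_length` consecutive days."""
    return sum(1 for length in convective_run_lengths(flags) if length >= min_length)


# Toy daily series (hypothetical, not observational data).
days = [True, True, False, True, False, False, True, True, True]
print(convective_run_lengths(days))
print(count_clusters(days, min_length=2))
```

Dividing such counts by the number of available episodes in a long record gives an empirical frequency for clusters of a given length, which is the kind of quantity the abstract summarizes.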
DOE Office of Scientific and Technical Information (OSTI.GOV)
Minati, Ludovico, E-mail: lminati@ieee.org, E-mail: ludovico.minati@unitn.it
In this paper, experimental evidence of multiple synchronization phenomena in a large (n = 30) ring of chaotic oscillators is presented. Each node consists of an elementary circuit, generating spikes of irregular amplitude and comprising one bipolar junction transistor, one capacitor, two inductors, and one biasing resistor. The nodes are mutually coupled to their neighbours via additional variable resistors. As coupling resistance is decreased, phase synchronization followed by complete synchronization is observed, and onset of synchronization is associated with partial synchronization, i.e., emergence of communities (clusters). While component tolerances affect community structure, the general synchronization properties are maintained across three prototypes and in numerical simulations. The clusters are destroyed by adding long distance connections with distant nodes, but are otherwise relatively stable with respect to structural connectivity changes. The study provides evidence that several fundamental synchronization phenomena can be reliably observed in a network of elementary single-transistor oscillators, demonstrating their generative potential and opening the way to potential applications of this undemanding setup in experimental modelling of the relationship between network structure, synchronization, and dynamical properties.
Subgroups of physically abusive parents based on cluster analysis of parenting behavior and affect.
Haskett, Mary E; Smith Scott, Susan; Sabourin Ward, Caryn
2004-10-01
Cluster analysis of observed parenting and self-reported discipline was used to categorize 83 abusive parents into subgroups. A 2-cluster solution received support for validity. Cluster 1 parents were relatively warm, positive, sensitive, and engaged during interactions with their children, whereas Cluster 2 parents were relatively negative, disengaged or intrusive, and insensitive. Further, clusters differed in emotional health, parenting stress, perceptions of children, and problem solving. Children of parents in the 2 clusters differed on several indexes of social adjustment. Cluster 1 parents were similar to nonabusive parents (n = 66) on parenting and related constructs, but Cluster 2 parents differed from nonabusive parents on all clustering variables and many validation variables. Results highlight clinically relevant diversity in parenting practices and functioning among abusive parents. ((c) 2004 APA, all rights reserved).
Hoogerheide, E S S; Azevedo Filho, J A; Vencovsky, R; Zucchi, M I; Zago, B W; Pinheiro, J B
2017-05-31
The cultivated garlic (Allium sativum L.) displays a wide phenotypic diversity, which is derived from natural mutations and phenotypic plasticity, due to dependence on soil type, moisture, latitude, altitude and cultural practices, leading to a large number of cultivars. This study aimed to evaluate the genetic variability shown by 63 garlic accessions belonging to the Instituto Agronômico de Campinas and the Escola Superior de Agricultura "Luiz de Queiroz" germplasm collections. We evaluated ten quantitative characters in experimental trials conducted at two localities in the State of São Paulo, Monte Alegre do Sul (MAS) and Piracicaba, during the agricultural year of 2007, in a randomized block design with five replications. The Mahalanobis distance was used to measure genetic dissimilarities. The UPGMA method and Tocher's method were used as clustering procedures. Results indicated significant variation among accessions (P < 0.01) for all evaluated characters, except for the percentage of secondary bulb growth in MAS, indicating the existence of genetic variation for bulb production. Germplasm evaluation considering different environments is more reliable for the characterization of the genotypic variability among garlic accessions, since it diminishes the environmental effects in the clustering of genotypes.
NASA Astrophysics Data System (ADS)
Sirait, Kamson; Tulus; Budhiarti Nababan, Erna
2017-12-01
Clustering methods that have high accuracy and time efficiency are necessary for the filtering process. One method that has been known and applied in clustering is K-Means Clustering. In its application, the choice of the initial cluster centers greatly affects the results of the K-Means algorithm. This research compares the results of K-Means Clustering when the starting centroids are chosen at random and by a KD-Tree method. On a data set of 1000 student academic records, used to classify students at risk of dropping out, random initial centroids give an SSE value of 952972 for the quality variable and 232.48 for the GPA variable, whereas initial centroids determined by KD-Tree give an SSE value of 504302 for the quality variable and 214.37 for the GPA variable. The smaller SSE values indicate that K-Means Clustering with KD-Tree initial centroid selection has better accuracy than K-Means Clustering with random initial centroid selection.
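The sensitivity of K-Means to its starting centroids, and the use of SSE to compare runs, can be seen on a very small example. The sketch below is a plain Lloyd's-algorithm implementation written for illustration; it does not implement the KD-Tree initialization studied in the paper, and the four toy points and both sets of starting centroids are assumptions chosen so that the two runs converge to different solutions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's algorithm from explicit starting centroids.

    Returns (centroids, labels, sse), where sse is the sum of squared
    distances from each point to its assigned centroid.
    """
    centroids = [tuple(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(point, centroids[k]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    sse = sum(squared_distance(pt, centroids[lab]) for pt, lab in zip(points, labels))
    return centroids, labels, sse


# Four corners of a wide rectangle (toy data, not the student data set).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Poor start: splits the rectangle into top and bottom rows.
_, _, poor_sse = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])
# Better start: splits it into left and right columns.
_, _, good_sse = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])
print(poor_sse, good_sse)  # 16.0 1.0
```

Both runs converge, but to different local optima, which is why a lower SSE is used as evidence that one initialization strategy is preferable to another.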
Nagwani, Naresh Kumar; Deo, Shirish V.
2014-01-01
Understanding of the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, and proportioning new mixtures and for the quality assurance. Regression techniques are most widely used for prediction tasks where relationship between the independent variables and dependent (prediction) variable is identified. The accuracy of the regression techniques for prediction can be improved if clustering can be used along with regression. Clustering along with regression will ensure the more accurate curve fitting between the dependent and independent variables. In this work cluster regression technique is applied for estimating the compressive strength of the concrete and a novel state of the art is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression ensures less prediction errors for estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group the similar characteristics concrete data and then in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. It is found from experiments that clustering along with regression techniques gives minimum errors for predicting compressive strength of concrete; also fuzzy clustering algorithm C-means performs better than K-means algorithm. PMID:25374939
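The two-stage idea described above (cluster first, then fit a regression inside each cluster) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it uses a one-dimensional K-means and ordinary least squares on a single predictor, and the function names and toy data are assumptions introduced here; no concrete-strength data are reproduced.

```python
def assign(values, centroids):
    """Label each value with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values]


def kmeans_1d(values, init_centroids, iterations=50):
    """Stage 1: one-dimensional K-means from explicit starting centroids."""
    centroids = list(init_centroids)
    for _ in range(iterations):
        labels = assign(values, centroids)
        for k in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return centroids


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y  # no spread in x: fall back to the mean
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def fit_cluster_regression(xs, ys, init_centroids):
    """Stage 2: fit one regression line per cluster found in stage 1."""
    centroids = kmeans_1d(xs, init_centroids)
    labels = assign(xs, centroids)
    models = {}
    for k in range(len(centroids)):
        cx = [x for x, lab in zip(xs, labels) if lab == k]
        cy = [y for y, lab in zip(ys, labels) if lab == k]
        if cx:
            models[k] = fit_line(cx, cy)
    return centroids, models


def predict(x, centroids, models):
    """Predict with the model of the nearest cluster that has a fit."""
    k = min(models, key=lambda idx: abs(x - centroids[idx]))
    slope, intercept = models[k]
    return slope * x + intercept


# Toy data with two regimes (hypothetical values).
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [2.0, 4.0, 6.0, 40.0, 39.0, 38.0]
centroids, models = fit_cluster_regression(xs, ys, init_centroids=[1.0, 12.0])
print(predict(2.5, centroids, models), predict(11.5, centroids, models))
```

A single global line fitted to these six points would describe neither regime well, which is the intuition behind pairing clustering with regression.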
Mass Distribution in Galaxy Cluster Cores
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogan, M. T.; McNamara, B. R.; Pulido, F.
Many processes within galaxy clusters, such as those believed to govern the onset of thermally unstable cooling and active galactic nucleus feedback, are dependent upon local dynamical timescales. However, accurate mapping of the mass distribution within individual clusters is challenging, particularly toward cluster centers where the total mass budget has substantial radially dependent contributions from the stellar (M_*), gas (M_gas), and dark matter (M_DM) components. In this paper we use a small sample of galaxy clusters with deep Chandra observations and good ancillary tracers of their gravitating mass at both large and small radii to develop a method for determining mass profiles that span a wide radial range and extend down into the central galaxy. We also consider potential observational pitfalls in understanding cooling in hot cluster atmospheres, and find tentative evidence for a relationship between the radial extent of cooling X-ray gas and nebular Hα emission in cool-core clusters. At large radii the entropy profiles of our clusters agree with the baseline power law of K ∝ r^1.1 expected from gravity alone. At smaller radii our entropy profiles become shallower but continue with a power law of the form K ∝ r^0.67 down to our resolution limit. Among this small sample of cool-core clusters we therefore find no support for the existence of a central flat “entropy floor.”
Multiscale temporal variability and regional patterns in 555 years of conterminous U.S. streamflow
NASA Astrophysics Data System (ADS)
Ho, Michelle; Lall, Upmanu; Sun, Xun; Cook, Edward R.
2017-04-01
The development of paleoclimate streamflow reconstructions in the conterminous United States (CONUS) has provided water resource managers with improved insights into multidecadal and centennial scale variability that cannot be reliably detected using shorter instrumental records. Paleoclimate streamflow reconstructions have largely focused on individual catchments, limiting the ability to quantify variability across the CONUS. The Living Blended Drought Atlas (LBDA), a spatially and temporally complete 555 year long paleoclimate record of summer drought across the CONUS, provides an opportunity to reconstruct and characterize streamflow variability at a continental scale. We explore the validity of the first paleoreconstructions of streamflow that span the CONUS, informed by the LBDA and targeting a set of U.S. Geological Survey streamflow sites. The reconstructions are skillful under cross validation across most of the country, but the variance explained is generally low. Spatial and temporal structures of streamflow variability are analyzed using hierarchical clustering, principal component analysis, and wavelet analyses. Nine spatially coherent clusters are identified. The reconstructions show signals of contemporary droughts such as the Dust Bowl (1930s) and 1950s droughts. Decadal-scale variability was detected in the late 1900s in the western U.S.; however, similar modes of temporal variability were rarely present prior to the 1950s. The twentieth century featured longer wet spells and shorter dry spells compared with the preceding 450 years. Streamflows in the Pacific Northwest and Northeast are negatively correlated with the central U.S., suggesting the potential to mitigate some drought impacts by balancing economic activities and insurance pools across these regions during major droughts.
Preparation of Gelatin Layer Film with Gold Clusters in Using Photographic Film
NASA Astrophysics Data System (ADS)
Kuge, Ken'ichi; Arisawa, Michiko; Aoki, Naokazu; Hasegawa, Akira
2000-12-01
A gelatin layer film with gold clusters is produced by taking advantage of the photosensitivity of silver halide photography. Through exposure, silver specks, which are called latent-image specks and are composed of several reduced silver atoms, are formed on the surface of silver halide grains in the photographic film. As the latent-image specks act as a catalyst for the redox reaction, reduced gold atoms are deposited on the latent-image specks when the exposed film is immersed in a gold (I) thiocyanate complex solution for 5-20 days. Subsequently, when the silver halide grains are dissolved and removed, the gelatin layer film with gold clusters remains. The film produced by this method is purple and shows an absorption spectrum with a maximum at approximately 560 nm as a result of plasmon absorption. The clusters continued to grow with immersion time, and the growth rate increased as the concentration of the gold complex solution was increased. The cluster diameter changed from 20 nm to 100 nm. By this method, it is possible to produce a gelatin film of a large area with evenly dispersed gold clusters, and since the clusters are produced only on the exposed area, pattern forming is also possible.
Nijman, Henk; Simpson, Alan; Jones, Julia
2010-01-01
Background: Conflict (aggression, substance use, absconding, etc.) and containment (coerced medication, manual restraint, etc.) threaten the safety of patients and staff on psychiatric wards. Previous work has suggested that staff variables may be significant in explaining differences between wards in their rates of these behaviours, and that structure (ward organisation, rules and daily routines) might be the most critical of these. This paper describes the exploration of a large dataset to assess the relationship between structure and other staff variables. Methods: A multivariate cross-sectional design was utilised. Data were collected from staff on 136 acute psychiatric wards in 26 NHS Trusts in England, measuring leadership, teamwork, structure, burnout and attitudes towards difficult patients. Relationships between these variables were explored through principal components analysis (PCA), structural equation modelling and cluster analysis. Results: Principal components analysis resulted in the identification of each questionnaire as a separate factor, indicating that the selected instruments assessed a number of non-overlapping items relevant for ward functioning. Structural equation modelling suggested a linear model in which leadership influenced teamwork, teamwork influenced structure, structure influenced burnout, and burnout influenced feelings about difficult patients. Finally, cluster analysis identified two significantly distinct groups of wards: the larger group had particularly good leadership, teamwork, structure and attitudes towards patients, and low burnout; the smaller group was poor on all variables and high on burnout. The better functioning cluster of wards had significantly lower rates of containment events. Conclusion: The overall performance of staff teams is associated with differing rates of containment on wards.
Interventions to reduce rates of containment on wards may need to address staff issues at every level, from leadership through to staff attitudes. PMID:20082064
2010-01-01
Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index.
Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions: The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data. PMID:20937082
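The adjusted Rand index used above to score clusterings against known classes can be computed directly from the contingency table of the two labelings. The sketch below follows the standard pair-counting definition; the function name and the toy labels are illustrative assumptions, and the handling of degenerate inputs (returning 1.0) is a convention chosen here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items.

    1.0 means identical partitions (up to renaming of labels); values
    near 0 are what random labelings would give on average.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treated as trivially identical

    # Pairs of items placed together in both, in the first, in the second.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (together_both - expected) / (maximum - expected)


known_classes = [0, 0, 1, 1]
print(adjusted_rand_index(known_classes, [1, 1, 0, 0]))  # 1.0
print(adjusted_rand_index(known_classes, [0, 1, 0, 1]))  # -0.5
```

Because the index is corrected for chance, it allows clusterings with different numbers of clusters to be compared against the same reference classes.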
[Health financing conditions in large cities in Brazil].
de Lima, Luciana Dias; de Andrade, Carla Lourenço Tavares
2009-10-01
We evaluated the funding of the Brazilian Unified National Health System (SUS) in municipalities with more than 100,000 inhabitants. The main goal was to evaluate the impact of policies for health resource allocation within the municipal budget. A database was organized with information from revenues reported by municipalities in the Information System on Government Health Budgets (SIOPS) for the year 2005. Reported budgets were compared and correlated to the municipalities' geographic location. We conducted a cluster analysis to create more homogeneous groups according to health-related budget. The study showed a major variability among different regions and States, with varying degrees of municipal dependence on external funds. Although the large variability in sources may indicate multiple strategies for ensuring the necessary budget funds, the study suggests some barriers to public health funding in larger municipalities.
An adaptive two-stage sequential design for sampling rare and clustered populations
Brown, J.A.; Salehi, M.M.; Moradi, M.; Bell, G.; Smith, D.R.
2008-01-01
How to design an efficient large-area survey continues to be an interesting question for ecologists. In sampling large areas, as is common in environmental studies, adaptive sampling can be efficient because it ensures survey effort is targeted to subareas of high interest. In two-stage sampling, higher density primary sample units are usually of more interest than lower density primary units when populations are rare and clustered. Two-stage sequential sampling has been suggested as a method for allocating second stage sample effort among primary units. Here, we suggest a modification: adaptive two-stage sequential sampling. In this method, the adaptive part of the allocation process means the design is more flexible in how much extra effort can be directed to higher-abundance primary units. We discuss how best to design an adaptive two-stage sequential sample. © 2008 The Society of Population Ecology and Springer.
Small and Large Number Processing in Infants and Toddlers with Williams Syndrome
ERIC Educational Resources Information Center
Van Herwegen, Jo; Ansari, Daniel; Xu, Fei; Karmiloff-Smith, Annette
2008-01-01
Previous studies have suggested that typically developing 6-month-old infants are able to discriminate between small and large numerosities. However, discrimination between small numerosities in young infants is only possible when variables continuous with number (e.g. area or circumference) are confounded. In contrast, large number discrimination…
Pearson's chi-square test and rank correlation inferences for clustered data.
Shih, Joanna H; Fay, Michael P
2017-09-01
Pearson's chi-square test has been widely used in testing for association between two categorical responses. Spearman rank correlation and Kendall's tau are often used for measuring and testing association between two continuous or ordered categorical responses. However, the established statistical properties of these tests are only valid when each pair of responses are independent, where each sampling unit has only one pair of responses. When each sampling unit consists of a cluster of paired responses, the assumption of independent pairs is violated. In this article, we apply the within-cluster resampling technique to U-statistics to form new tests and rank-based correlation estimators for possibly tied clustered data. We develop large sample properties of the new proposed tests and estimators and evaluate their performance by simulations. The proposed methods are applied to a data set collected from a PET/CT imaging study for illustration. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
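The within-cluster resampling idea described above can be sketched for Kendall's tau: draw one pair of responses from each cluster so that the drawn pairs are independent, compute the statistic, and average over many resamples. This is a simplified illustration of the point estimate only, using tau-a without tie correction; it is not the authors' estimator or their variance calculation, and the function names and toy clusters are assumptions.

```python
import random


def kendall_tau_a(pairs):
    """Kendall's tau-a for a list of (x, y) pairs (no tie correction)."""
    n = len(pairs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def wcr_kendall_tau(clusters, n_resamples=1000, seed=0):
    """Average tau-a over resamples that take one pair from each cluster.

    `clusters` is a list of clusters, each a non-empty list of (x, y) pairs.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_resamples):
        sample = [rng.choice(cluster) for cluster in clusters]
        total += kendall_tau_a(sample)
    return total / n_resamples


# Toy clustered data (hypothetical): three paired responses per cluster.
toy_clusters = [[(10 * i + k, 10 * i + k) for k in range(3)] for i in range(5)]
print(wcr_kendall_tau(toy_clusters, n_resamples=200))
```

Taking a single pair per cluster restores the independence assumption within each resample, at the cost of using only part of the data each time; averaging over resamples recovers that information.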
A clustering approach to segmenting users of internet-based risk calculators.
Harle, C A; Downs, J S; Padman, R
2011-01-01
Risk calculators are widely available Internet applications that deliver quantitative health risk estimates to consumers. Although these tools are known to have varying effects on risk perceptions, little is known about who will be more likely to accept objective risk estimates. To identify clusters of online health consumers that help explain variation in individual improvement in risk perceptions from web-based quantitative disease risk information. A secondary analysis was performed on data collected in a field experiment that measured people's pre-diabetes risk perceptions before and after visiting a realistic health promotion website that provided quantitative risk information. K-means clustering was performed on numerous candidate variable sets, and the different segmentations were evaluated based on between-cluster variation in risk perception improvement. Variation in responses to risk information was best explained by clustering on pre-intervention absolute pre-diabetes risk perceptions and an objective estimate of personal risk. Members of a high-risk overestimater cluster showed large improvements in their risk perceptions, but clusters of both moderate-risk and high-risk underestimaters were much more muted in improving their optimistically biased perceptions. Cluster analysis provided a unique approach for segmenting health consumers and predicting their acceptance of quantitative disease risk information. These clusters suggest that health consumers were very responsive to good news, but tended not to incorporate bad news into their self-perceptions much. These findings help to quantify variation among online health consumers and may inform the targeted marketing of and improvements to risk communication tools on the Internet.
Suppressed epidemics in multirelational networks
NASA Astrophysics Data System (ADS)
Xu, Elvis H. W.; Wang, Wei; Xu, C.; Tang, Ming; Do, Younghae; Hui, P. M.
2015-08-01
A two-state epidemic model in networks with links mimicking two kinds of relationships between connected nodes is introduced. Links of weights w1 and w0 occur with probabilities p and 1 - p, respectively. The fraction of infected nodes ρ(p) shows nonmonotonic behavior: ρ drops with p for small p and increases for large p. For small to moderate w1/w0 ratios, ρ(p) exhibits a minimum that signifies an optimal suppression. For large w1/w0 ratios, the suppression leads to an absorbing phase consisting only of healthy nodes within a range pL ≤ p ≤ pR, and an active phase with mixed infected and healthy nodes for p
Cluster Analysis of Downscaled and Explicitly Simulated North Atlantic Tropical Cyclone Tracks
Daloz, Anne S.; Camargo, S. J.; Kossin, J. P.; ...
2015-02-11
A realistic representation of the North Atlantic tropical cyclone tracks is crucial as it allows, for example, explaining potential changes in U.S. landfalling systems. Here, the authors present a tentative study that examines the ability of recent climate models to represent North Atlantic tropical cyclone tracks. Tracks from two types of climate models are evaluated: explicit tracks are obtained from tropical cyclones simulated in regional or global climate models with moderate to high horizontal resolution (1°–0.25°), and downscaled tracks are obtained using a downscaling technique with large-scale environmental fields from a subset of these models. Here, for both configurations, tracks are objectively separated into four groups using a cluster technique, leading to a zonal and a meridional separation of the tracks. The meridional separation largely captures the separation between deep tropical and subtropical, hybrid or baroclinic cyclones, while the zonal separation segregates Gulf of Mexico and Cape Verde storms. The properties of the tracks’ seasonality, intensity, and power dissipation index in each cluster are documented for both configurations. The authors’ results show that, except for the seasonality, the downscaled tracks better capture the observed characteristics of the clusters. The authors also use three different idealized scenarios to examine the possible future changes of tropical cyclone tracks under 1) warming sea surface temperature, 2) increasing carbon dioxide, and 3) a combination of the two. The response to each scenario is highly variable depending on the simulation considered. Lastly, the authors examine the role of each cluster in these future changes and find no preponderant contribution of any single cluster over the others.
A Network-Based Algorithm for Clustering Multivariate Repeated Measures Data
NASA Technical Reports Server (NTRS)
Koslovsky, Matthew; Arellano, John; Schaefer, Caroline; Feiveson, Alan; Young, Millennia; Lee, Stuart
2017-01-01
The National Aeronautics and Space Administration (NASA) Astronaut Corps is a unique occupational cohort for which vast amounts of measures data have been collected repeatedly in research or operational studies pre-, in-, and post-flight, as well as during multiple clinical care visits. In exploratory analyses aimed at generating hypotheses regarding physiological changes associated with spaceflight exposure, such as impaired vision, it is of interest to identify anomalies and trends across these expansive datasets. Multivariate clustering algorithms for repeated measures data may help parse the data to identify homogeneous groups of astronauts that have higher risks for a particular physiological change. However, available clustering methods may not be able to accommodate the complex data structures found in NASA data, since the methods often rely on strict model assumptions, require equally-spaced and balanced assessment times, cannot accommodate missing data or differing time scales across variables, and cannot process continuous and discrete data simultaneously. To fill this gap, we propose a network-based, multivariate clustering algorithm for repeated measures data that can be tailored to fit various research settings. Using simulated data, we demonstrate how our method can be used to identify patterns in complex data structures found in practice.
Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F
2017-04-01
Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
Smith, Noelle B; Tsai, Jack; Pietrzak, Robert H; Cook, Joan M; Hoff, Rani; Harpaz-Rotem, Ilan
2017-10-01
Veterans from the recent conflicts in Iraq and Afghanistan are being diagnosed with posttraumatic stress disorder (PTSD) at high rates. This study examined characteristics associated with mental health service utilization, specifically psychotherapy, through the Department of Veterans Affairs (VA), in a large cohort of Iraq and Afghanistan veterans newly diagnosed with PTSD. This study utilized national VA administrative data from Iraq and Afghanistan veterans following an initial diagnosis of PTSD and completed a self-report measure of PTSD symptoms between Fiscal Years 2008-2012 (N=52,456; 91.7% male; 59.7% Caucasian; mean age 30.6, SD=8.3). Regression analyses examined the relation between PTSD symptom cluster severity and treatment-related variables. Accounting for demographic/clinical variables, PTSD symptom clusters were related to psychotherapy initiation (re-experiencing, OR=1.23; numbing, OR=1.15), combination treatment (medication and psychotherapy; re-experiencing, OR=1.13; avoidance, OR=1.07; dysphoric arousal, OR=1.06), number of psychotherapy visits (re-experiencing, IRR= 1.08; numbing, IRR=1.09), and adequate dose of therapy (e.g., 8 visits/14 weeks; re-experiencing: OR= 1.07). When considering treatment approaches for trauma-exposed veterans, it is important to map the severity of unique PTSD symptoms clusters; this may have implications on the selection of treatment that best fits the veterans' needs and preferences (e.g., exposure therapy versus cognitive processing therapy). Published by Elsevier B.V.
Clustering Binary Data in the Presence of Masking Variables
ERIC Educational Resources Information Center
Brusco, Michael J.
2004-01-01
A number of important applications require the clustering of binary data sets. Traditional nonhierarchical cluster analysis techniques, such as the popular K-means algorithm, can often be successfully applied to these data sets. However, the presence of masking variables in a data set can impede the ability of the K-means algorithm to recover the…
Ford, John A; Jones, Andy; Wong, Geoff; Clark, Allan; Porter, Tom; Steel, Nick
2018-06-19
Realist approaches seek to answer questions such as 'how?', 'why?', 'for whom?', 'in what circumstances?' and 'to what extent?' interventions 'work' using context-mechanism-outcome (CMO) configurations. Quantitative methods are not well-established in realist approaches, but structural equation modelling (SEM) may be useful to explore CMO configurations. Our aim was to assess the feasibility and appropriateness of SEM to explore CMO configurations and, if appropriate, make recommendations based on our access to primary care research. Our specific objectives were to map variables from two large population datasets to CMO configurations from our realist review looking at access to primary care, generate latent variables where needed, and use SEM to quantitatively test the CMO configurations. A linked dataset was created by merging individual patient data from the English Longitudinal Study of Ageing and practice data from the GP Patient Survey. Patients registered in rural practices and who were in the highest deprivation tertile were included. Three latent variables were defined using confirmatory factor analysis. SEM was used to explore the nine full CMOs. All models were estimated using robust maximum likelihoods and accounted for clustering at practice level. Ordinal variables were treated as continuous to ensure convergence. We successfully explored our CMO configurations, but analysis was limited because of data availability. Two hundred seventy-six participants were included. We found a statistically significant direct (context to outcome) or indirect effect (context to outcome via mechanism) for two of nine CMOs. The strongest association was between 'ease of getting through to the surgery' and 'being able to get an appointment' with an indirect mediated effect through convenience (proportion of the indirect effect of the total was 21%). 
Healthcare experience was not directly associated with getting an appointment, but there was a statistically significant indirect effect through convenience (53% mediated effect). Model fit indices showed adequate fit. SEM allowed quantification of CMO configurations and could complement other qualitative and quantitative techniques in realist evaluations to support inferences about strengths of relationships. Future research exploring CMO configurations with SEM should aim to collect, preferably continuous, primary data.
Image quality guided approach for adaptive modelling of biometric intra-class variations
NASA Astrophysics Data System (ADS)
Abboud, Ali J.; Jassim, Sabah A.
2010-04-01
The high intra-class variability of acquired biometric data can be attributed to several factors, such as the quality of the acquisition sensor (e.g. thermal), environmental conditions (e.g. lighting), and behavioural changes (e.g. a change in face pose). Such large fuzziness of biometric data can cause a big difference between acquired and stored biometric data, which will eventually lead to reduced performance. Many systems store multiple templates in order to account for such variations in the biometric data during the enrolment stage. The number and typicality of these templates affect system performance more than other factors. In this paper, a novel offline approach is proposed for systematic modelling of intra-class variability and typicality in biometric data by regularly selecting new templates from a set of available biometric images. Our proposed technique is a two-stage algorithm: in the first stage, image samples are clustered in terms of their image quality profile vectors rather than their biometric feature vectors, and in the second stage a per-cluster template is selected from a small number of samples in each cluster to create the final template set. Experiments have been conducted on five face image databases, and their results demonstrate the effectiveness of the proposed quality-guided approach.
Mo, Yun; Zhang, Zhongzhao; Meng, Weixiao; Ma, Lin; Wang, Yao
2014-01-01
Indoor positioning systems based on the fingerprint method are widely used due to the large number of existing devices with a wide range of coverage. However, extensive positioning regions with a massive fingerprint database may cause high computational complexity and error margins, therefore clustering methods are widely applied as a solution. However, traditional clustering methods in positioning systems can only measure the similarity of the Received Signal Strength without being concerned with the continuity of physical coordinates. Besides, outage of access points could result in asymmetric matching problems which severely affect the fine positioning procedure. To solve these issues, in this paper we propose a positioning system based on the Spatial Division Clustering (SDC) method for clustering the fingerprint dataset subject to physical distance constraints. With the Genetic Algorithm and Support Vector Machine techniques, SDC can achieve higher coarse positioning accuracy than traditional clustering algorithms. In terms of fine localization, based on the Kernel Principal Component Analysis method, the proposed positioning system outperforms its counterparts based on other feature extraction methods in low dimensionality. Apart from balancing online matching computational burden, the new positioning system exhibits advantageous performance on radio map clustering, and also shows better robustness and adaptability in the asymmetric matching problem aspect. PMID:24451470
The Evolutionary Status of M3 RR Lyrae Variable Stars: Breakdown of the Canonical Framework?
NASA Astrophysics Data System (ADS)
Catelan, M.
2004-01-01
In order to test the prevailing paradigm of horizontal-branch (HB) stellar evolution, we use the large databases of measured RR Lyrae parameters for the globular cluster M3 (NGC 5272) recently provided by Bakos et al. and Corwin & Carney. We compare the observed distribution of fundamentalized periods against the predictions of synthetic HBs. The observed distribution shows a sharp peak at Pf~0.55 days, which is primarily due to the RRab variables, whereas the model predictions instead indicate that the distribution should be more uniform in Pf, with a buildup of variables with shorter periods (Pf<0.5 days). Detailed statistical tests show, for the first time, that the observed and predicted distributions are incompatible with one another at a high significance level. This indicates either that canonical HB models are inappropriate, or that M3 is a pathological case that cannot be considered representative of the Oosterhoff type I (OoI) class. In this sense, we show that the OoI cluster with the next largest number of RR Lyrae variables, M5 (NGC 5904), presents a similar, although less dramatic, challenge to the models. We show that the sharp peak in the M3 period distribution receives a significant contribution from the Blazhko variables in the cluster. We also show that M15 (NGC 7078) and M68 (NGC 4590) show similar peaks in their Pf distributions, which in spite of being located at a Pf value similar to that of M3, can, however, be primarily ascribed to the RRc variables. Again similar to M3, a demise of RRc variables toward the blue edge of the instability strip is also identified in these two globulars. This is again in sharp contrast to the evolutionary scenario, which also foresees a strong buildup of RRc variables with short periods in OoII globulars. 
We speculate that in OoI systems RRab variables may somehow get "trapped" close to the transition line between RRab and RRc pulsators as they evolve to the blue in the H-R diagram, whereas in OoII systems it is the RRc variables that may get similarly trapped instead, as they evolve to the red, before changing their pulsation mode to RRab. Such a scenario is supported by the available CMDs and Bailey diagrams for M3, M15, and M68.
Cluster-based analysis of multi-model climate ensembles
NASA Astrophysics Data System (ADS)
Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.
2018-06-01
Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) have yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ~20% in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ~62% of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9% of all locations and increases at 29%, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. 
We further demonstrate that clustering can provide a viable and useful framework in which to assess and visualise model spread, offering insight into geographical areas of agreement among models and a measure of diversity across an ensemble. Finally, we discuss caveats of the clustering techniques and note that while we have focused on tropospheric ozone, the principles underlying the cluster-based MMMs are applicable to other prognostic variables from climate models.
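The idea of a cluster-based multi-model mean can be sketched for a single location as follows. This is not the DDC algorithm used in the paper: as a stand-in, the sketch groups sorted model values whenever consecutive values lie within an assumed `radius` of each other, then averages only the most populous group. The function name, the gap-based grouping rule, and the `radius` parameter are all assumptions made for illustration.

```python
def largest_group_mean(values, radius):
    """Mean of the most populous group of values at one location.

    Groups are formed by a simple 1-D gap rule (a stand-in for DDC):
    sorted values stay in the same group while each step is <= radius.
    """
    if not values:
        raise ValueError("need at least one model value")
    ordered = sorted(values)
    groups = [[ordered[0]]]
    for v in ordered[1:]:
        if v - groups[-1][-1] <= radius:
            groups[-1].append(v)
        else:
            groups.append([v])
    biggest = max(groups, key=len)  # ties resolve to the lowest-valued group
    return sum(biggest) / len(biggest)
```

For five hypothetical model values 30, 31, 32, 33 and 60 with a radius of 2, the all-model mean is 37.2, while the mean of the largest group is 31.5, because the outlying value is excluded. As the abstract notes, excluding such a value is not guaranteed to move the mean closer to observations.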
Search for Carbon-Rich Asymptotic Giant Branch Stars in Milky Way Globular Clusters
NASA Astrophysics Data System (ADS)
Indahl, Briana; Pessev, P.
2014-01-01
From our current understanding of stellar evolution, it would not be expected to find carbon rich asymptotic giant branch (AGB) stars in Milky Way globular clusters. Due to the low metallicity of the population II stars making up the globular clusters and their age, stars large enough to fuse carbon should have already evolved off of the asymptotic giant branch. Recently, however, there have been serendipitous discoveries of these types of stars. Matsunaga et al. (2006) discovered a Mira variable in the globular cluster Lynga 7. It was later confirmed by Feast et al. (2012) that the star is a member of the cluster and must be a product of a stellar merger. In the same year, Sharina et al. (2012) discovered a carbon star in the low metallicity globular cluster NGC6426 and reports it to be a CH star. Five more of these types of stars have been made as serendipitous discoveries and have been reported by Harding (1962), Dickens (1972), Cote et al. (1997), and Van Loon (2007). The abundance of these types of carbon stars in Milky Way globular clusters has been unknown because the discovery of these types of objects has only ever been a serendipitous discovery. These stars could have been easily overlooked in the past as they are outside the typical parameter space of galactic globular clusters. Also, advances in near-infrared instruments and observing techniques have made it possible to detect the fainter carbon stars in binary systems. Having an understanding of the abundances of carbon stars in galactic globular clusters will aid in the modeling of globular cluster and galaxy formation leading to a better understanding of these processes. To get an understanding of the abundances of these stars we conducted the first comprehensive search for AGB carbon stars in all Milky Way globular clusters listed in the Harris Catalog (except for Pyxis). 
I have found 128 carbon star candidates using methods of comparing color magnitude diagrams of the clusters with the carbon stars of the Large Magellanic Cloud and picking out very red stars in the red giant branch range. Observations will need to be done of these candidates to further confirm if they are carbon stars and are members of their respective globular cluster.
Suppression of vacancy cluster growth in concentrated solid solution alloys
Zhao, Shijun; Velisa, Gihan; Xue, Haizhou; ...
2016-12-13
Large vacancy clusters, such as stacking-fault tetrahedra, are detrimental vacancy-type defects in ion-irradiated structural alloys. Suppression of vacancy cluster formation and growth is highly desirable to improve the irradiation tolerance of these materials. In this paper, we demonstrate that vacancy cluster growth can be inhibited in concentrated solid solution alloys by modifying cluster migration pathways and diffusion kinetics. The alloying effects of Fe and Cr on the migration of vacancy clusters in Ni concentrated alloys are investigated by molecular dynamics simulations and ion irradiation experiment. While the diffusion coefficients of small vacancy clusters in Ni-based binary and ternary solid solution alloys are higher than in pure Ni, they become lower for large clusters. This observation suggests that large clusters can easily migrate and grow to very large sizes in pure Ni. In contrast, cluster growth is suppressed in solid solution alloys owing to the limited mobility of large vacancy clusters. Finally, the differences in cluster sizes and mobilities in Ni and in solid solution alloys are consistent with the results from ion irradiation experiments.
Periodic, chaotic, and doubled earthquake recurrence intervals on the deep San Andreas Fault
Shelly, David R.
2010-01-01
Earthquake recurrence histories may provide clues to the timing of future events, but long intervals between large events obscure full recurrence variability. In contrast, small earthquakes occur frequently, and recurrence intervals are quantifiable on a much shorter time scale. In this work, I examine an 8.5-year sequence of more than 900 recurring low-frequency earthquake bursts composing tremor beneath the San Andreas fault near Parkfield, California. These events exhibit tightly clustered recurrence intervals that, at times, oscillate between ~3 and ~6 days, but the patterns sometimes change abruptly. Although the environments of large and low-frequency earthquakes are different, these observations suggest that similar complexity might underlie sequences of large earthquakes.
Identification of Hard X-ray Sources in Galactic Globular Clusters: Simbol-X Simulations
NASA Astrophysics Data System (ADS)
Servillat, M.
2009-05-01
Globular clusters harbour an excess of X-ray sources compared to the number of X-ray sources in the Galactic plane. It has been proposed that many of these X-ray sources are cataclysmic variables that have an intermediate magnetic field, i.e. intermediate polars, which remains to be confirmed and understood. We present here several methods to identify intermediate polars in globular clusters from multiwavelength analysis. First, we report on XMM-Newton, Chandra and HST observations of the very dense Galactic globular cluster NGC 2808. By comparing UV and X-ray properties of the cataclysmic variable candidates, the fraction of intermediate polars in this cluster can be estimated. We also present the optical spectra of two cataclysmic variables in the globular cluster M 22. The HeII (4686 Å) emission line in these spectra could be related to the presence of a magnetic field in these objects. Simulations of Simbol-X observations indicate that the angular resolution is sufficient to study X-ray sources in the core of close, less dense globular clusters, such as M 22. The sensitivity of Simbol-X in an extended energy band up to 80 keV will allow us to discriminate between hard X-ray sources (such as magnetic cataclysmic variables) and soft X-ray sources (such as chromospherically active binaries).
Dependence of Halo Bias and Kinematics on Assembly Variables
NASA Astrophysics Data System (ADS)
Xu, Xiaoju; Zheng, Zheng
2018-06-01
Using dark matter haloes identified in a large N-body simulation, we study halo assembly bias, with halo formation time, peak maximum circular velocity, concentration, and spin as the assembly variables. Instead of grouping haloes at fixed mass into different percentiles of each assembly variable, we present the joint dependence of halo bias on the values of halo mass and each assembly variable. In the plane of halo mass and one assembly variable, the joint dependence can be largely described as halo bias increasing outward from a global minimum. We find it unlikely to have a combination of halo variables to absorb all assembly bias effects. We then present the joint dependence of halo bias on two assembly variables at fixed halo mass. The gradient of halo bias does not necessarily follow the correlation direction of the two assembly variables and it varies with halo mass. Therefore in general for two correlated assembly variables one cannot be used as a proxy for the other in predicting halo assembly bias trend. Finally, halo assembly is found to affect the kinematics of haloes. Low-mass haloes formed earlier can have much higher pairwise velocity dispersion than those of massive haloes. In general, halo assembly leads to a correlation between halo bias and halo pairwise velocity distribution, with more strongly clustered haloes having higher pairwise velocity and velocity dispersion. However, the correlation is not tight, and the kinematics of haloes at fixed halo bias still depends on halo mass and assembly variables.
The Hierarchical Distribution of the Young Stellar Clusters in Six Local Star-forming Galaxies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grasha, K.; Calzetti, D.; Adamo, A.
We present a study of the hierarchical clustering of the young stellar clusters in six local (3–15 Mpc) star-forming galaxies using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey). We identified 3685 likely clusters and associations, each visually classified by their morphology, and we use the angular two-point correlation function to study the clustering of these stellar systems. We find that the young clusters and associations are clustered with respect to each other, forming large, unbound hierarchical star-forming complexes that are in general very young. The strength of the clustering decreases with increasing age of the star clusters and stellar associations, becoming more homogeneously distributed after ∼40–60 Myr and on scales larger than a few hundred parsecs. In all galaxies, the associations exhibit a global behavior that is distinct from, and more strongly correlated than, that of compact clusters. Thus, populations of clusters are more evolved than associations in terms of their spatial distribution, traveling significantly from their birth site within a few tens of Myr, whereas associations show evidence of disruption occurring very quickly after their formation. The clustering of the stellar systems resembles that of a turbulent interstellar medium that drives the star formation process, correlating the components in unbound star-forming complexes in a hierarchical manner, dispersing shortly after formation, suggestive of a single, continuous mode of star formation across all galaxies.
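To make the two-point correlation measurement concrete, here is a minimal sketch under stated assumptions: positions are treated as flat 2-D coordinates (a small-angle approximation), and the simple "natural" estimator w = DD/RR - 1 is used, where DD and RR are normalized pair counts per separation bin for the data and for a comparison catalog of random positions. The function names and bin handling are illustrative; the abstract does not say which estimator or edge corrections the study used.

```python
import math
from bisect import bisect_right
from itertools import combinations


def normalized_pair_counts(points, edges):
    """Fraction of all point pairs whose separation falls in each [edge_i, edge_i+1) bin."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    counts = [0] * (len(edges) - 1)
    n_pairs = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        n_pairs += 1
        sep = math.hypot(x1 - x2, y1 - y2)
        idx = bisect_right(edges, sep) - 1
        if 0 <= idx < len(counts):  # pairs outside the binned range are ignored
            counts[idx] += 1
    return [c / n_pairs for c in counts]


def two_point_correlation(data, randoms, edges):
    """Natural estimator w = DD/RR - 1 per bin; NaN where the random catalog has no pairs."""
    dd = normalized_pair_counts(data, edges)
    rr = normalized_pair_counts(randoms, edges)
    return [d / r - 1.0 if r > 0 else float("nan") for d, r in zip(dd, rr)]
```

Positive values in a bin indicate more pairs at that separation than in the random catalog (clustering); values near zero indicate a distribution consistent with random placement at that scale.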
Clustering P-Wave Receiver Functions To Constrain Subsurface Seismic Structure
NASA Astrophysics Data System (ADS)
Chai, C.; Larmat, C. S.; Maceira, M.; Ammon, C. J.; He, R.; Zhang, H.
2017-12-01
The acquisition of high-quality data from permanent and temporary dense seismic networks provides the opportunity to apply statistical and machine learning techniques to a broad range of geophysical observations. Lekic and Romanowicz (2011) used clustering analysis on tomographic velocity models of the western United States to perform tectonic regionalization and the velocity-profile clusters agree well with known geomorphic provinces. A complementary and somewhat less restrictive approach is to apply cluster analysis directly to geophysical observations. In this presentation, we apply clustering analysis to teleseismic P-wave receiver functions (RFs) continuing efforts of Larmat et al. (2015) and Maceira et al. (2015). These earlier studies validated the approach with surface waves and stacked EARS RFs from the USArray stations. In this study, we experiment with both the K-means and hierarchical clustering algorithms. We also test different distance metrics defined in the vector space of RFs following Lekic and Romanowicz (2011). We cluster data from two distinct data sets. The first, corresponding to the western US, was obtained by smoothing/interpolation of the receiver-function wavefield (Chai et al. 2015). Spatial coherence and agreement with geologic region increase with this simpler, spatially smoothed set of observations. The second data set is composed of RFs for more than 800 stations of the China Digital Seismic Network (CSN). Preliminary results show a first-order agreement between clusters and tectonic regions, and each regional cluster includes a distinct Ps arrival, which probably reflects differences in crustal thickness. Regionalization remains an important step to characterize a model prior to application of full waveform and/or stochastic imaging techniques because of the computational expense of these types of studies. 
Machine learning techniques can provide valuable information that can be used to design and characterize formal geophysical inversion, providing information on spatial variability in the subsurface geology.
Mwangi, Benson; Soares, Jair C; Hasan, Khader M
2014-10-30
Neuroimaging machine learning studies have largely utilized supervised algorithms - meaning they require both neuroimaging scan data and corresponding target variables (e.g. healthy vs. diseased) to be successfully 'trained' for a prediction task. Noticeably, this approach may not be optimal or possible when the global structure of the data is not well known and the researcher does not have an a priori model to fit the data. We set out to investigate the utility of an unsupervised machine learning technique, t-distributed stochastic neighbour embedding (t-SNE), in identifying 'unseen' sample population patterns that may exist in high-dimensional neuroimaging data. Multimodal neuroimaging scans from 92 healthy subjects were pre-processed using atlas-based methods, integrated and input into the t-SNE algorithm. Patterns and clusters discovered by the algorithm were visualized using a 2D scatter plot and further analyzed using the K-means clustering algorithm. t-SNE was evaluated against classical principal component analysis. Remarkably, based on unlabelled multimodal scan data, t-SNE separated study subjects into two very distinct clusters which corresponded to subjects' gender labels (cluster silhouette index value=0.79). The resulting clusters were used to develop an unsupervised minimum distance clustering model which identified 93.5% of subjects' gender. Notably, from a neuropsychiatric perspective this method may allow discovery of data-driven disease phenotypes or sub-types of treatment responders. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Thompson, A. M.; Stauffer, R. M.; Young, G. S.
2015-12-01
Ozone (O3) trends analysis is typically performed with monthly or seasonal averages. Although this approach works well for stratospheric or total O3, uncertainties in tropospheric O3 amounts may be large due to rapid meteorological changes near the tropopause and in the lower free troposphere (LFT) where pollution has a days-weeks lifetime. We use self-organizing maps (SOM), a clustering technique, as an alternative for creating tropospheric climatologies from O3 soundings. In a previous study of 900 tropical ozonesondes, clusters representing >40% of profiles deviated > 1-sigma from mean O3. Here SOM are based on 15 years of data from four sites in the contiguous US (CONUS; Boulder, CO; Huntsville, AL; Trinidad Head, CA; Wallops Island, VA). Ozone profiles from 2 - 12 km are used to evaluate the impact of tropopause variability on climatology; 2 - 6 km O3 profile segments are used for the LFT. Near-tropopause O3 is twice the mean O3 mixing ratio in three clusters of 2 - 12 km O3, representing > 15% of profiles at each site. Large mid and lower-tropospheric O3 deviations from monthly means are found in clusters of both 2 - 12 and 2 - 6 km O3. Positive offsets result from pollution and stratosphere-to-troposphere exchange. In the LFT the lowest tropospheric O3 is associated with subtropical air. Some clusters include profiles with common seasonality but other factors, e.g., tropopause height or LFT column amount, characterize other SOM nodes. Thus, as for tropical profiles, CONUS O3 averages can be a poor choice for a climatology.
Kleinman, Ana; Caetano, Sheila Cavalcante; Brentani, Helena; Rocca, Cristiana Castanho de Almeida; dos Santos, Bernardo; Andrade, Enio Roberto; Zeni, Cristian Patrick; Tramontina, Silzá; Rohde, Luis Augusto Paim; Lafer, Beny
2015-03-01
The National Institute of Mental Health has initiated the Research Domain Criteria (RDoC) project. Instead of using disorder categories as the basis for grouping individuals, the RDoC suggests finding relevant dimensions that can cut across traditional disorders. Our aim was to use the RDoC's framework to study patterns of attention deficit based on results of Conners' Continuous Performance Test (CPT II) in youths diagnosed with bipolar disorder (BD), attention-deficit/hyperactivity disorder (ADHD), BD+ADHD and controls. Eighteen healthy controls, 23 patients with ADHD, 10 with BD and 33 with BD+ADHD, aged 12-17 years, were assessed. Pattern recognition was used to partition subjects into clusters based simultaneously on their performance in all CPT II variables. A Fisher's linear discriminant analysis was used to build a classifier. Using cluster analysis, the entire sample set was best clustered into two new groups, A and B, independently of the original diagnoses. ADHD and BD+ADHD patients were split almost evenly between the two subgroups, and there was an agglomeration of controls and BD in group B. Group A presented a greater impairment, with higher means in all CPT II variables and lower Children's Global Assessment Scale scores. We found a high cross-validated classification accuracy for groups A and B: 95.2%. Variability of response time was the strongest CPT II measure in the discriminative pattern between groups A and B. Our classificatory exercise supports the concept behind new approaches, such as the RDoC framework, for child and adolescent psychiatry. Our approach was able to define clinical subgroups that could be used in future pathophysiological and treatment studies.
Zhang, Han; Rokas, Antonis; Slot, Jason C
2012-01-01
Dermatophyte fungi of the family Arthrodermataceae (Eurotiomycetes) colonize keratinized tissue, such as skin, frequently causing superficial mycoses in humans and other mammals, reptiles, and birds. Competition with native microflora likely underlies the propensity of these dermatophytes to produce a diversity of antibiotics and compounds for scavenging iron, which is extremely scarce, as well as the presence of an unusually large number of putative secondary metabolism gene clusters, most of which contain non-ribosomal peptide synthetases (NRPS), in their genomes. To better understand the historical origins and diversification of NRPS-containing gene clusters we examined the evolution of a variable locus (VL) that exists in one of three alternative conformations among the genomes of seven dermatophyte species. The first conformation of the VL (termed VLA) contains only 539 base pairs of sequence and lacks protein-coding genes, whereas the other two conformations (termed VLB and VLC) span 36 Kb and 27 Kb and contain 12 and 10 genes, respectively. Interestingly, both VLB and VLC appear to contain distinct secondary metabolism gene clusters; VLB contains a NRPS gene as well as four porphyrin metabolism genes never found to be physically linked in the genomes of 128 other fungal species, whereas VLC also contains a NRPS gene as well as several others typically found associated with secondary metabolism gene clusters. Phylogenetic evidence suggests that the VL locus was present in the ancestor of all seven species achieving its present distribution through subsequent differential losses or retentions of specific conformations. We propose that the existence of variable loci, similar to the one we studied, in fungal genomes could potentially explain the dramatic differences in secondary metabolic diversity between closely related species of filamentous fungi, and contribute to host adaptation and the generation of metabolic diversity.
Observing Globular Cluster RR Lyrae Variables with the BYU West Mountain Observatory
NASA Astrophysics Data System (ADS)
Jeffery, E. J.; Joner, M. D.
2016-06-01
We have utilized the 0.9-meter telescope of the Brigham Young University West Mountain Observatory to secure data on six northern hemisphere globular clusters. Here we present representative observations of RR Lyrae stars located in these clusters, including light curves. We compare light curves produced using both DAOPHOT and ISIS software packages. Light curve fitting is done with FITLC. We find that for well-separated stars, DAOPHOT and ISIS provide comparable results. However, for stars within the cluster core, ISIS provides superior results. These improved techniques will allow us to better measure the properties of cluster variable stars.
VizieR Online Data Catalog: Updated catalog of variable stars in globular clusters (Clement+ 2017)
NASA Astrophysics Data System (ADS)
Clement, C. M.
2017-02-01
This Catalogue is an update to Helen Sawyer Hogg's Third Catalogue on Variable Stars in Globular Clusters (1973, David Dunlap Observatory Publications, Volume 3, Number 6: 1973PDDO....3....6S; see Cat V/97; see also Clement+, 2001AJ....122.2587C). This catalogue is based on the individual cluster files downloaded on http://www.astro.utoronto.ca/~cclement/cat/listngc.html on the 01-Feb-2017. Later updates are indicated in clusters.dat; column "Update". (7 data files).
A proper motion study of the globular cluster M55
NASA Astrophysics Data System (ADS)
Zloczewski, K.; Kaluzny, J.; Thompson, I. B.
2011-07-01
We have derived the absolute proper motion (PM) of the globular cluster M55 using a large set of CCD images collected with the du Pont telescope between 1997 and 2008. We find (μα cos δ, μδ) = (-3.31 ± 0.10, -9.14 ± 0.15) mas yr⁻¹ relative to background galaxies. Membership status was determined for 16 945 stars with 14 < V < 21 from the central part of the cluster. The PM catalogue includes 52 variables, of which 43 are probable members of M55. This sample is not only dominated by pulsating blue straggler stars, but also includes five eclipsing binaries, three of which are main-sequence objects. The survey also identified several candidate blue, yellow and red straggler stars belonging to the cluster. We detected 15 likely members of the Sgr dSph galaxy located behind M55. The average PM for these stars was measured to be (μα cos δ, μδ) = (-2.23 ± 0.14, -1.83 ± 0.24) mas yr⁻¹.
ERIC Educational Resources Information Center
Rodríguez-Ruiz, Beatriz; Rodrigo, María José; Martínez-González, Raquel-Amaya
2015-01-01
The authors examined how the variability in adult conflict resolution styles in family and school contexts was related to adolescents' positive development. Cluster analysis classified 440 fathers, 440 mothers, and 125 tutors into 4 clusters, based on self-reports of their conflict resolution styles. Adolescents exposed to Cluster 1 (inconsistency…
STAR CLUSTER FORMATION WITH STELLAR FEEDBACK AND LARGE-SCALE INFLOW
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matzner, Christopher D.; Jumper, Peter H., E-mail: matzner@astro.utoronto.ca
2015-12-10
During star cluster formation, ongoing mass accretion is resisted by stellar feedback in the form of protostellar outflows from the low-mass stars and photo-ionization and radiation pressure feedback from the massive stars. We model the evolution of cluster-forming regions during a phase in which both accretion and feedback are present and use these models to investigate how star cluster formation might terminate. Protostellar outflows are the strongest form of feedback in low-mass regions, but these cannot stop cluster formation if matter continues to flow in. In more massive clusters, radiation pressure and photo-ionization rapidly clear the cluster-forming gas when its column density is too small. We assess the rates of dynamical mass ejection and of evaporation, while accounting for the important effect of dust opacity on photo-ionization. Our models are consistent with the census of protostellar outflows in NGC 1333 and Serpens South and with the dust temperatures observed in regions of massive star formation. Comparing observations of massive cluster-forming regions against our model parameter space, and against our expectations for accretion-driven evolution, we infer that massive-star feedback is a likely cause of gas disruption in regions with velocity dispersions less than a few kilometers per second, but that more massive and more turbulent regions are too strongly bound for stellar feedback to be disruptive.
Neurodevelopmental disorders: cluster 2 of the proposed meta-structure for DSM-V and ICD-11.
Andrews, G; Pine, D S; Hobbs, M J; Anderson, T M; Sunderland, M
2009-12-01
DSM-IV and ICD-10 are atheoretical and largely descriptive. Although this achieves good reliability, the validity of diagnoses can be increased by an understanding of risk factors and other clinical features. In an effort to group mental disorders on this basis, five clusters have been proposed. We now consider the second cluster, namely neurodevelopmental disorders. We reviewed the literature in relation to 11 validating criteria proposed by a DSM-V Task Force Study Group. This cluster reflects disorders of neurodevelopment rather than a 'childhood' disorders cluster. It comprises disorders subcategorized in DSM-IV and ICD-10 as Mental Retardation; Learning, Motor, and Communication Disorders; and Pervasive Developmental Disorders. Although these disorders seem to be heterogeneous, they share similarities on some risk and clinical factors. There is evidence of a neurodevelopmental genetic phenotype, the disorders have an early emerging and continuing course, and all have salient cognitive symptoms. Within-cluster co-morbidity also supports grouping these disorders together. Other childhood disorders currently listed in DSM-IV share similarities with the Externalizing and Emotional clusters. These include Conduct Disorder, Attention Deficit Hyperactivity Disorder and Separation Anxiety Disorder. The Tic, Eating/Feeding and Elimination disorders, and Selective Mutisms were allocated to the 'Not Yet Assigned' group. Neurodevelopmental disorders meet some of the salient criteria proposed by the American Psychiatric Association (APA) to suggest a classification cluster.
Body shape analyses of large persons in South Korea.
Park, Woojin; Park, Sungjoon
2013-01-01
Despite the prevalence of obesity and overweight, anthropometric characteristics of large individuals have not been extensively studied. This study investigated body shapes of large persons (Broca index ≥ 20, BMI ≥ 25 or WHR>1.0) using stature-normalised body dimensions data from the latest South Korean anthropometric survey. For each sex, a factor analysis was performed on the anthropometric data set to identify the key factors that explain the shape variability; and then, a cluster analysis was conducted on the factor scores data to determine a set of representative body types. The body types were labelled in terms of their distinct shape characteristics and their relative frequencies were computed for each of the four age groups considered: the 10s, 20s-30s, 40s-50s and 60s. The study findings may facilitate creating artefacts that anthropometrically accommodate large individuals, developing digital human models of large persons and designing future ergonomics studies on largeness. This study investigated body shapes of large persons using anthropometric data from South Korea. For each sex, multivariate statistical analyses were conducted to identify the key factors of the body shape variability and determine the representative body types. The study findings may facilitate designing artefacts that anthropometrically accommodate large persons.
Biostatistics Series Module 10: Brief Overview of Multivariate Methods.
Hazra, Avijit; Gogtay, Nithya
2017-01-01
Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation-intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with the wider availability and increasing sophistication of statistical software, and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.
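For the two regression models named in this overview, the standard textbook forms may help fix ideas. The symbols used here ($y$, $x_j$, $\beta_j$, $\varepsilon$, $\pi$, and $p$ for the number of predictors) are notation chosen for this sketch and are not taken from the module itself.

```latex
% Multiple linear regression: one numerical outcome y, p numerical predictors
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon

% Logistic regression: dichotomous outcome, with \pi = P(Y = 1 \mid x_1, \ldots, x_p)
\log\frac{\pi}{1 - \pi} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p
```

In the first model the coefficients act on the outcome directly; in the second they act on the log-odds of the outcome, which keeps the predicted probability between 0 and 1.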
DeGroote, John P; Sugumaran, Ramanathan; Ecker, Mark
2014-11-01
After several years of low West Nile virus (WNV) occurrence in the United States of America (USA), 2012 witnessed large outbreaks in several parts of the country. In order to understand the outbreak dynamics, spatial clustering and landscape, demographic and climatic associations with WNV occurrence were investigated at a regional level in the USA. Previous research has demonstrated that there are a handful of prominent WNV mosquito vectors with varying ecological requirements responsible for WNV transmission in the USA. Published range maps of these important vectors were georeferenced and used to define eight functional ecological regions in the coterminous USA. The number of human WNV cases and human populations by county were obtained in order to calculate a WNV rate for each county in 2012. Additionally, a binary value (high/low) was calculated for each county based on whether the county WNV rate was above or below the rate for the region it fell in. Global Moran's I and Anselin Local Moran's I statistics of spatial association were used per region to examine and visualize clustering of the WNV rate and the high/low rating. Spatial data on landscape, demographic and climatic variables were compiled and derived from a variety of sources and then investigated in relation to human WNV using both Spearman rho correlation coefficients and Poisson regression models. Findings demonstrated significant spatial clustering of WNV and substantial inter-regional differences in relationships between WNV occurrence and landscape, demographic and climatically related variables. The regional associations were consistent with the ecologies of the dominant vectors for those regions. The large outbreak in the Southeast region was preceded by higher than normal winter and spring precipitation followed by dry and hot conditions in the summer.
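Global Moran's I, one of the statistics used in this study, can be illustrated with a small sketch. It assumes binary adjacency weights (1 for neighbouring units, 0 otherwise), which is one common choice; the abstract does not say which weighting scheme was used. The four-unit "chain of counties" and its rates are invented.

```python
def global_morans_i(values, neighbors):
    """Global Moran's I with binary adjacency weights.

    values    -- list of rates, one per spatial unit
    neighbors -- dict mapping unit index to a list of adjacent unit indices
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("all values are identical; Moran's I is undefined")

    cross_product = 0.0
    total_weight = 0
    for i, adjacent in neighbors.items():
        for j in adjacent:
            cross_product += deviations[i] * deviations[j]
            total_weight += 1  # each listed neighbour pair has weight 1
    return (n / total_weight) * (cross_product / denominator)


# Four hypothetical counties arranged in a line: 0 - 1 - 2 - 3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

clustered_rates = [1.0, 1.0, 5.0, 5.0]    # similar values sit next to each other
alternating_rates = [1.0, 5.0, 1.0, 5.0]  # neighbours always differ

print(global_morans_i(clustered_rates, chain))
print(global_morans_i(alternating_rates, chain))
```

Positive values indicate that similar rates tend to be adjacent (clustering), while negative values indicate a checkerboard-like pattern.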
Clustering of samples and variables with mixed-type data
Edelmann, Dominic; Kopp-Schneider, Annette
2017-01-01
Analysis of data measured on different scales is a relevant challenge. Biomedical studies often focus on high-throughput datasets of, e.g., quantitative measurements. However, the need for integration of other features possibly measured on different scales, e.g. clinical or cytogenetic factors, becomes increasingly important. The analysis results (e.g. a selection of relevant genes) are then visualized, while adding further information, like clinical factors, on top. However, a more integrative approach is desirable, where all available data are analyzed jointly, and where also in the visualization different data sources are combined in a more natural way. Here we specifically target integrative visualization and present a heatmap-style graphic display. To this end, we develop and explore methods for clustering mixed-type data, with special focus on clustering variables. Clustering of variables does not receive as much attention in the literature as does clustering of samples. We extend the variables clustering methodology by two new approaches, one based on the combination of different association measures and the other on distance correlation. With simulation studies we evaluate and compare different clustering strategies. Applying specific methods for mixed-type data proves to be comparable and in many cases beneficial as compared to standard approaches applied to corresponding quantitative or binarized data. Our two novel approaches for mixed-type variables show similar or better performance than the existing methods ClustOfVar and bias-corrected mutual information. Further, in contrast to ClustOfVar, our methods provide dissimilarity matrices, which is an advantage, especially for the purpose of visualization. Real data examples aim to give an impression of various kinds of potential applications for the integrative heatmap and other graphical displays based on dissimilarity matrices. 
We demonstrate that the presented integrative heatmap provides more information than common data displays about the relationship among variables and samples. The described clustering and visualization methods are implemented in our R package CluMix available from https://cran.r-project.org/web/packages/CluMix. PMID:29182671
Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.
Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si
2017-07-01
Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. The cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from coastal water of Bohai Sea and North Yellow Sea of China, and apply clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
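The contrast between Euclidean and Mahalanobis distance for correlated variables can be made concrete with a small two-variable sketch. It is restricted to two variables so that the covariance matrix can be inverted by hand; the function names and the example covariance (correlation 0.9) are assumptions made for illustration, not values from the paper.

```python
import math


def covariance_2d(samples):
    """Sample covariance matrix (n - 1 denominator) of (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def mahalanobis_2d(p, q, cov):
    """Mahalanobis distance between 2D points p and q for covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = p[0] - q[0], p[1] - q[1]
    # Quadratic form with the inverse [[d, -b], [-c, a]] / det.
    squared = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(squared)


# Two hypothetical variables with unit variance and correlation 0.9.
cov = [[1.0, 0.9], [0.9, 1.0]]
origin = (0.0, 0.0)

# Both points are sqrt(2) away from the origin in Euclidean terms.
along = mahalanobis_2d(origin, (1.0, 1.0), cov)    # follows the correlation
across = mahalanobis_2d(origin, (1.0, -1.0), cov)  # cuts against it
print(along, across)
```

A hierarchical clustering built on this distance would therefore treat a departure against the usual correlation as far more unusual than an equally large departure along it, which Euclidean distance cannot distinguish.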
Spatial modelling and mapping of female genital mutilation in Kenya
2014-01-01
Background: Female genital mutilation/cutting (FGM/C) is still prevalent in several communities in Kenya and other areas in Africa, as well as being practiced by some migrants from African countries living in other parts of the world. This study aimed at detecting clustering of FGM/C in Kenya, and identifying those areas within the country where women still intend to continue the practice. A broader goal of the study was to identify geographical areas where the practice continues unabated and where broad intervention strategies need to be introduced. Methods: The prevalence of FGM/C was investigated using the 2008 Kenya Demographic and Health Survey (KDHS) data. The 2008 KDHS used a multistage stratified random sampling plan to select women of reproductive age (15–49 years) and asked questions concerning their FGM/C status and their support for the continuation of FGM/C. A spatial scan statistical analysis was carried out using SaTScan™ to test for statistically significant clustering of the practice of FGM/C in the country. The risk of FGM/C was also modelled and mapped using a hierarchical spatial model under the Integrated Nested Laplace approximation approach using the INLA library in R. Results: The prevalence of FGM/C stood at 28.2% and an estimated 10.3% of the women interviewed indicated that they supported the continuation of FGM. On the basis of the Deviance Information Criterion (DIC), hierarchical spatial models with spatially structured random effects were found to best fit the data for both response variables considered. Age, region, rural–urban classification, education, marital status, religion, socioeconomic status and media exposure were found to be significantly associated with FGM/C. The current FGM/C status of a woman was also a significant predictor of support for the continuation of FGM/C. Spatial scan statistics confirm FGM clusters in the North-Eastern and South-Western regions of Kenya (p < 0.001). 
Conclusion: This suggests that the fight against FGM/C in Kenya is not yet over. There are still deep cultural and religious beliefs to be addressed in a bid to eradicate the practice. Interventions by government and other stakeholders must address these challenges and target the identified clusters. PMID:24661558
High Performance Geostatistical Modeling of Biospheric Resources
NASA Astrophysics Data System (ADS)
Pedelty, J. A.; Morisette, J. T.; Smith, J. A.; Schnase, J. L.; Crosier, C. S.; Stohlgren, T. J.
2004-12-01
We are using parallel geostatistical codes to study spatial relationships among biospheric resources in several study areas. For example, spatial statistical models based on large- and small-scale variability have been used to predict species richness of both native and exotic plants (hot spots of diversity) and patterns of exotic plant invasion. However, broader use of geostatistics in natural resource modeling, especially at regional and national scales, has been limited due to the large computing requirements of these applications. To address this problem, we implemented parallel versions of the kriging spatial interpolation algorithm. The first uses the Message Passing Interface (MPI) in a master/slave paradigm on an open source Linux Beowulf cluster, while the second is implemented with the new proprietary Xgrid distributed processing system on an Xserve G5 cluster from Apple Computer, Inc. These techniques are proving effective and provide the basis for a national decision support capability for invasive species management that is being jointly developed by NASA and the US Geological Survey.
Genetic Divergence and Chemotype Diversity in the Fusarium Head Blight Pathogen Fusarium poae.
Vanheule, Adriaan; De Boevre, Marthe; Moretti, Antonio; Scauflaire, Jonathan; Munaut, Françoise; De Saeger, Sarah; Bekaert, Boris; Haesaert, Geert; Waalwijk, Cees; van der Lee, Theo; Audenaert, Kris
2017-08-23
Fusarium head blight is a disease caused by a complex of Fusarium species. F. poae is omnipresent throughout Europe in spite of its low virulence. In this study, we assessed a geographically diverse collection of F. poae isolates for its genetic diversity using AFLP (Amplified Fragment Length Polymorphism). Furthermore, studying the mating type locus and chromosomal insertions, we identified hallmarks of both sexual recombination and clonal spread of successful genotypes in the population. Despite the large genetic variation found, all F. poae isolates possess the nivalenol chemotype based on Tri7 sequence analysis. Nevertheless, Tri gene clusters showed two layers of genetic variability. Firstly, the Tri1 locus was highly variable with mostly synonymous mutations and mutations in introns pointing to a strong purifying selection pressure. Secondly, in a subset of isolates, the main trichothecene gene cluster was invaded by a transposable element between Tri5 and Tri6. To investigate the impact of these variations on the phenotypic chemotype, mycotoxin production was assessed on artificial medium. Complex blends of type A and type B trichothecenes were produced but neither genetic variability in the Tri genes nor variability in the genome or geography accounted for the divergence in trichothecene production. In view of its complex chemotype, it will be of utmost interest to uncover the role of trichothecenes in virulence, spread and survival of F. poae.
Performance Analysis of Cluster Formation in Wireless Sensor Networks.
Montiel, Edgar Romo; Rivero-Angeles, Mario E; Rubino, Gerardo; Molina-Lozano, Heron; Menchaca-Mendez, Rolando; Menchaca-Mendez, Ricardo
2017-12-13
Cluster-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked, namely the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a cluster-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes; specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.
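Why a fixed transmission probability can be wasteful is easy to see in a simplified contention model. The sketch below assumes a slotted random access channel in which each of k contending nodes transmits independently with probability p and a slot succeeds only when exactly one node transmits; a node leaves contention once it has succeeded. This model, the idealized rule p = 1/k (which presumes every node knows how many contenders remain), and all names are assumptions for illustration; the abstract does not specify the paper's channel model or its exact strategies.

```python
def success_probability(k, p):
    """Probability that exactly one of k nodes transmits in a slot."""
    return k * p * (1.0 - p) ** (k - 1)


def expected_slots(n_nodes, probability_for):
    """Expected number of slots until all nodes have succeeded once.

    probability_for(k) returns the per-node transmission probability
    used while k nodes are still contending. With k contenders the wait
    for the next success is geometric with mean 1 / success_probability.
    """
    return sum(
        1.0 / success_probability(k, probability_for(k))
        for k in range(n_nodes, 0, -1)
    )


n = 20
fixed_slots = expected_slots(n, lambda k: 0.1)         # same p throughout
adaptive_slots = expected_slots(n, lambda k: 1.0 / k)  # p tracks contenders

print(round(fixed_slots, 1), round(adaptive_slots, 1))
```

Under these assumptions p = 1/k maximizes the per-slot success probability for every k, so the fixed rule can only match it when the fixed value happens to equal 1/k; fewer wasted slots translate into less energy spent on collisions and idle listening.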
Semi-supervised clustering methods.
Bair, Eric
2013-01-01
Cluster analysis methods seek to partition a data set into homogeneous subgroups. They are useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning that there is no outcome variable nor is anything known about the relationship between the observations in the data set. In many situations, however, information about the clusters is available in addition to the values of the features. For example, the cluster labels of some observations may be known, or certain observations may be known to belong to the same cluster. In other cases, one may wish to identify clusters that are associated with a particular outcome variable. This review describes several clustering algorithms (known as "semi-supervised clustering" methods) that can be applied in these situations. The majority of these methods are modifications of the popular k-means clustering method, and several of them will be described in detail. A brief description of some other semi-supervised clustering algorithms is also provided.
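One way the "some labels are known" situation can modify k-means is sketched below: the labelled observations supply the initial centroids and keep their labels throughout, while unlabelled observations are assigned to the nearest centroid as usual. This is an illustrative variant in the spirit of seeded or constrained k-means; whether the review describes exactly this procedure is not stated in the abstract, and the data and names are invented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(column) / len(points) for column in zip(*points))


def seeded_kmeans(points, known_labels, max_iter=100):
    """k-means in which some observations have known cluster labels.

    points       -- list of coordinate tuples
    known_labels -- dict mapping point index to its known cluster id;
                    every cluster must have at least one labelled point
    """
    cluster_ids = sorted(set(known_labels.values()))
    # Initial centroids come only from the labelled observations.
    centroids = {
        c: centroid([points[i] for i, lab in known_labels.items() if lab == c])
        for c in cluster_ids
    }
    assignment = None
    for _ in range(max_iter):
        new_assignment = []
        for i, point in enumerate(points):
            if i in known_labels:
                new_assignment.append(known_labels[i])  # label stays fixed
            else:
                new_assignment.append(
                    min(cluster_ids, key=lambda c: squared_distance(point, centroids[c]))
                )
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in cluster_ids:
            members = [points[i] for i, lab in enumerate(assignment) if lab == c]
            centroids[c] = centroid(members)  # never empty: seeds remain
    return assignment, centroids


observations = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
partial_labels = {0: "a", 3: "b"}  # only two observations are labelled
assignment, centroids = seeded_kmeans(observations, partial_labels)
print(assignment)
```

Because the labelled points never change cluster, no cluster can become empty, and the result is anchored to the supplied labels rather than to a random start.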
Kepler Planets Tend to Have Siblings of the Same Size
NASA Astrophysics Data System (ADS)
Kohler, Susanna
2017-11-01
After 8.5 years of observations with the Kepler space observatory, we've discovered a large number of close-in, tightly-spaced, multiple-planet systems orbiting distant stars. In the process, we've learned a lot about the properties of these systems and discovered some unexpected behavior. A new study explores one of the properties that has surprised us: planets of the same size tend to live together.
Figure: Orbital architectures for 25 of the authors' multiplanet systems. The dots are sized according to the planets' relative radii and colored according to mass. Planets of similar sizes and masses tend to live together in the same system. [Millholland et al. 2017]
Ordering of Systems
From Kepler's observations of extrasolar multiplanet systems, we have seen that the sizes of planets in a given system aren't completely random. Systems that contain a large planet, for example, are more likely to contain additional large planets rather than additional planets of random size. So though there is a large spread in the radii we've observed for transiting exoplanets, the spread within any given multiplanet system tends to be much smaller. This odd behavior has led us to ask whether this clustering occurs not just for radius, but also for mass. Since the multiplanet systems discovered by Kepler most often contain super-Earths and mini-Neptunes, which have an extremely large spread in densities, the fact that two such planets have similar radii does not guarantee that they have similar masses. If planets don't cluster in mass within a system, this would raise the question of why planets coordinate only their radii within a given system. If they do cluster in mass, it implies that planets within the same system tend to have similar densities, potentially allowing us to predict the sizes and masses of planets we might find in a given system.
Insight into Masses
Led by NSF graduate research fellow Sarah Millholland, a team of scientists at Yale University used recently determined masses for planets in 37 Kepler multiplanet systems to explore this question of whether exoplanets in a multiplanet system are more likely to have similar masses rather than random ones. Millholland and collaborators find that the masses do show the same clustering trend as radii in multiplanet systems, i.e., sibling planets in the same system tend to have both masses and radii that are more similar than if the system were randomly assembled from the total population of planets we've observed. Furthermore, the masses and radii tend to be ordered within a system when the planets are ranked by their periods.
Figure: The host star's metallicity is correlated with the median planetary radius for a system. [Adapted from Millholland et al. 2017]
The authors note two important implications of these results: (1) the scatter in the relation between mass and radius of observed exoplanets is primarily due to system-to-system variability, rather than the variability within each system; (2) knowing the properties of a star and its primordial protoplanetary disk might allow us to predict the outcome of the planet formation process for the system. Following up on the second point, the authors test whether certain properties of the host star correlate with properties of the planets. They find that the stellar mass and metallicity have a significant effect on the planet properties and the structure of the system. Continuing to explore multiplanet systems like these appears to be an excellent path forward for understanding the hidden order in the broad variety of exoplanets we've observed.
Citation
Sarah Millholland et al 2017 ApJL 849 L33. doi:10.3847/2041-8213/aa9714
DOE Office of Scientific and Technical Information (OSTI.GOV)
Apatin, V. M.; Lokhman, V. N.; Makarov, G. N., E-mail: gmakarov@isan.troitsk.ru
The fragmentation of free homogeneous (CF{sub 3}I){sub n} clusters in a molecular beam (n ≤ 45 is the average number of molecules in the cluster) and (CF{sub 3}I){sub n} clusters inside or on the surface of large (Xe){sub m} clusters (m ≥ 100 is the average number of atoms in the cluster) by ultraviolet and infrared laser radiations has been studied. These three types of (CF{sub 3}I){sub n} clusters are shown to have different stabilities with respect to fragmentation by both ultraviolet and infrared radiations and completely different dependences of the fragmentation probability on the energy of ultraviolet and infrared radiations. When exposed to ultraviolet radiation, the free (CF{sub 3}I){sub n} clusters fragment at comparatively low fluences (Φ{sub UV} ≤ 0.15 J cm{sup −2}) and the weakest energy dependence of the fragmentation probability is observed for them. A stronger energy dependence of the fragmentation probability is observed for the (CF{sub 3}I){sub n} clusters localized inside (Xe){sub m} clusters, and the strongest dependence is observed for the (CF{sub 3}I){sub n} clusters located on the surface of (Xe){sub m} clusters. When the clusters are exposed to infrared radiation, the homogeneous (CF{sub 3}I){sub n} clusters efficiently fragment at low fluences (Φ{sub IR} ≤ 25 mJ cm{sup −2}), higher fluences (Φ{sub IR} ≈ 75 mJ cm{sup −2}) are needed for the fragmentation of the (CF{sub 3}I){sub n} clusters localized inside (Xe){sub m} clusters, and even higher fluences (Φ{sub IR} ≈ 150 mJ cm{sup −2}) are needed for the fragmentation of the (CF{sub 3}I){sub n} clusters located on the surface of (Xe){sub m} clusters. It has been established that small (CF{sub 3}I){sub n} clusters located on the surface of (Xe){sub m} clusters do not fragment up to fluences Φ{sub IR} ≈ 250 mJ cm{sup −2}.
The fragmentation efficiency of (CF{sub 3}I){sub n} clusters is shown to be the same (at the same fluence) when they are excited by both pulsed (τ{sub p} ≈ 150 ns) and continuous-wave infrared laser radiations. Possible causes of such a pattern of ultraviolet and infrared laser-induced fragmentation of these clusters are discussed.
NASA Technical Reports Server (NTRS)
LeMoigne, Jacqueline; Laporte, Nadine; Netanyahuy, Nathan S.; Zukor, Dorothy (Technical Monitor)
2001-01-01
The characterization and the mapping of land cover/land use of forest areas, such as the Central African rainforest, is a very complex task. This complexity is mainly due to the extent of such areas and, as a consequence, to the lack of full and continuous cloud-free coverage of those large regions by one single remote sensing instrument. In order to provide improved vegetation maps of Central Africa and to develop forest monitoring techniques for applications at the local and regional scales, we propose to utilize multi-sensor remote sensing observations coupled with in-situ data. Fusion and clustering of multi-sensor data are the first steps towards the development of such a forest monitoring system. In this paper, we will describe some preliminary experiments involving the fusion of SAR and Landsat image data of the Lope Reserve in Gabon. As in previous fusion studies, our fusion method is wavelet-based. The fusion provides a new image data set which contains more detailed texture features and preserves the large homogeneous regions that are observed by the Thematic Mapper sensor. The fusion step is followed by unsupervised clustering and provides a vegetation map of the area.
GLOBULAR CLUSTERS AS CRADLES OF LIFE AND ADVANCED CIVILIZATIONS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stefano, R. Di; Ray, A., E-mail: rdistefano@cfa.harvard.edu, E-mail: akr@tifr.res.in
2016-08-10
Globular clusters are ancient stellar populations in compact dense ellipsoids. There is no star formation and there are no core-collapse supernovae, but several lines of evidence suggest that globular clusters are rich in planets. If so, and if advanced civilizations can develop there, then the distances between these civilizations and other stars would be far smaller than typical distances between stars in the Galactic disk, facilitating interstellar communication and travel. The potent combination of long-term stability and high stellar densities provides a globular cluster opportunity. Yet the very proximity that promotes interstellar travel also brings danger, as stellar interactions can destroy planetary systems. We find, however, that large portions of many globular clusters are “sweet spots,” where habitable-zone planetary orbits are stable for long times. Globular clusters in our own and other galaxies are, therefore, among the best targets for searches for extraterrestrial intelligence (SETI). We use the Drake equation to compare the likelihood of advanced civilizations in globular clusters to that in the Galactic disk. We also consider free-floating planets, since wide-orbit planets can be ejected to travel through the cluster. Civilizations spawned in globular clusters may be able to establish self-sustaining outposts, reducing the probability that a single catastrophic event will destroy the civilization. Although individual civilizations may follow different evolutionary paths, or even be destroyed, the cluster may continue to host advanced civilizations once a small number have jumped across interstellar space. Civilizations residing in globular clusters could therefore, in a sense, be immortal.
Globular Clusters as Cradles of Life and Advanced Civilizations
NASA Astrophysics Data System (ADS)
Di Stefano, R.; Ray, A.
2016-08-01
Globular clusters are ancient stellar populations in compact dense ellipsoids. There is no star formation and there are no core-collapse supernovae, but several lines of evidence suggest that globular clusters are rich in planets. If so, and if advanced civilizations can develop there, then the distances between these civilizations and other stars would be far smaller than typical distances between stars in the Galactic disk, facilitating interstellar communication and travel. The potent combination of long-term stability and high stellar densities provides a globular cluster opportunity. Yet the very proximity that promotes interstellar travel also brings danger, as stellar interactions can destroy planetary systems. We find, however, that large portions of many globular clusters are “sweet spots,” where habitable-zone planetary orbits are stable for long times. Globular clusters in our own and other galaxies are, therefore, among the best targets for searches for extraterrestrial intelligence (SETI). We use the Drake equation to compare the likelihood of advanced civilizations in globular clusters to that in the Galactic disk. We also consider free-floating planets, since wide-orbit planets can be ejected to travel through the cluster. Civilizations spawned in globular clusters may be able to establish self-sustaining outposts, reducing the probability that a single catastrophic event will destroy the civilization. Although individual civilizations may follow different evolutionary paths, or even be destroyed, the cluster may continue to host advanced civilizations once a small number have jumped across interstellar space. Civilizations residing in globular clusters could therefore, in a sense, be immortal.
Exploring the IMF of star clusters: a joint SLUG and LEGUS effort
NASA Astrophysics Data System (ADS)
Ashworth, G.; Fumagalli, M.; Krumholz, M. R.; Adamo, A.; Calzetti, D.; Chandar, R.; Cignoni, M.; Dale, D.; Elmegreen, B. G.; Gallagher, J. S., III; Gouliermis, D. A.; Grasha, K.; Grebel, E. K.; Johnson, K. E.; Lee, J.; Tosi, M.; Wofford, A.
2017-08-01
We present the implementation of a Bayesian formalism within the Stochastically Lighting Up Galaxies (slug) stellar population synthesis code, which is designed to investigate variations in the initial mass function (IMF) of star clusters. By comparing observed cluster photometry to large libraries of clusters simulated with a continuously varying IMF, our formalism yields the posterior probability distribution function (PDF) of the cluster mass, age and extinction, jointly with the parameters describing the IMF. We apply this formalism to a sample of star clusters from the nearby galaxy NGC 628, for which broad-band photometry in five filters is available as part of the Legacy ExtraGalactic UV Survey (LEGUS). After allowing the upper-end slope of the IMF (α3) to vary, we recover PDFs for the mass, age and extinction that are broadly consistent with what is found when assuming an invariant Kroupa IMF. However, the posterior PDF for α3 is very broad due to a strong degeneracy with the cluster mass, and it is found to be sensitive to the choice of priors, particularly on the cluster mass. We find only a modest improvement in the constraining power of α3 when adding Hα photometry from the companion Hα-LEGUS survey. Conversely, Hα photometry significantly improves the age determination, reducing the frequency of multi-modal PDFs. With the aid of mock clusters, we quantify the degeneracy between physical parameters, showing how constraints on the cluster mass that are independent of photometry can be used to pin down the IMF properties of star clusters.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Langer, S; Rotman, D; Schwegler, E
The Institutional Computing Executive Group (ICEG) review of FY05-06 Multiprogrammatic and Institutional Computing (M and IC) activities is presented in the attached report. In summary, we find that the M and IC staff does an outstanding job of acquiring and supporting a wide range of institutional computing resources to meet the programmatic and scientific goals of LLNL. The responsiveness and high quality of support given to users and the programs investing in M and IC reflects the dedication and skill of the M and IC staff. M and IC has successfully managed serial capacity, parallel capacity, and capability computing resources. Serial capacity computing supports a wide range of scientific projects which require access to a few high performance processors within a shared memory computer. Parallel capacity computing supports scientific projects that require a moderate number of processors (up to roughly 1000) on a parallel computer. Capability computing supports parallel jobs that push the limits of simulation science. M and IC has worked closely with Stockpile Stewardship, and together they have made LLNL a premier institution for computational and simulation science. Such a standing is vital to the continued success of laboratory science programs and to the recruitment and retention of top scientists. This report provides recommendations to build on M and IC's accomplishments and improve simulation capabilities at LLNL.
We recommend that the institution fully fund (1) operation of the atlas cluster purchased in FY06 to support a few large projects; (2) operation of the thunder and zeus clusters to enable 'mid-range' parallel capacity simulations during normal operation and a limited number of large simulations during dedicated application time; (3) operation of the new yana cluster to support a wide range of serial capacity simulations; (4) improvements to the reliability and performance of the Lustre parallel file system; (5) support for the new GDO petabyte-class storage facility on the green network for use in data intensive external collaborations; and (6) continued support for visualization and other methods for analyzing large simulations. We also recommend that M and IC begin planning in FY07 for the next upgrade of its parallel clusters. LLNL investments in M and IC have resulted in a world-class simulation capability leading to innovative science. We thank the LLNL management for its continued support and thank the M and IC staff for its vision and dedicated efforts to make it all happen.
NASA Astrophysics Data System (ADS)
Leckebusch, G. C.; Kirchner-Bossi, N. O.; Befort, D. J.; Ulbrich, U.
2015-12-01
Time-clustered mid-latitude winter storms are responsible for a large portion of the overall windstorm-related damage in Europe. Their study is therefore of high meteorological interest, and its outcome can be of crucial use to the (re)insurance industry. In addition to existing cyclone-based studies, here we use an event identification approach based on near-surface wind speeds only, to investigate windstorm clustering and compare it to cyclone clustering. Specifically, cyclone and windstorm tracks are identified for winter 1979-2013 (Oct-Mar), to perform two sensitivity analyses on event-clustering in the North Atlantic using ERA-Interim Reanalysis. First, the link between clustering and cyclone intensity is analysed and compared to windstorms. Secondly, the sensitivity of clustering on intra-seasonal time scales is investigated, for both cyclones and windstorms. The wind-based approach reveals additional regions of clustering over Western Europe, which could be related to extreme damages, showing the added value of investigating wind field derived tracks in addition to that of cyclone tracks. Previous studies indicate a higher degree of clustering for stronger cyclones. However, our results show that this assumption is not always met. Although a positive relationship is confirmed for the clustering centre located over Iceland, clustering off the coast of the Iberian Peninsula behaves in the opposite way. Even though this region shows the highest clustering, most of its signal is due to cyclones with intensities below the 70th percentile of the Laplacian of MSLP. Results on the sensitivity of clustering to the time of the winter season (Oct-Mar) show a temporal evolution of the clustering patterns, for both windstorms and cyclones. Compared to all cyclones, clustering of windstorms and of the strongest cyclones culminates around February, while clustering of all cyclones peaks in December to January.
NASA Astrophysics Data System (ADS)
Majer, C. L.; Meyer, S.; Konrad, S.; Sarli, E.; Bartelmann, M.
2016-07-01
This paper continues a series in which we intend to show how all observables of galaxy clusters can be combined to recover the two-dimensional, projected gravitational potential of individual clusters. Our goal is to develop a non-parametric algorithm for joint cluster reconstruction taking all cluster observables into account. For this reason we focus on the line-of-sight projected gravitational potential, proportional to the lensing potential, in order to extend existing reconstruction algorithms. In this paper, we begin with the relation between the Compton-y parameter and the Newtonian gravitational potential, assuming hydrostatic equilibrium and a polytropic stratification of the intracluster gas. Extending our first publication we now consider a spheroidal rather than a spherical cluster symmetry. We show how a Richardson-Lucy deconvolution can be used to convert the intensity change of the CMB due to the thermal Sunyaev-Zel'dovich effect into an estimate for the two-dimensional gravitational potential. We apply our reconstruction method to a cluster based on an N-body/hydrodynamical simulation processed with the characteristics (resolution and noise) of the ALMA interferometer for which we achieve a relative error of ≲20 per cent for a large fraction of the virial radius. We further apply our method to an observation of the galaxy cluster RXJ1347 for which we can reconstruct the potential with a relative error of ≲20 per cent for the observable cluster range.
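The Richardson-Lucy deconvolution mentioned above can be illustrated with a generic one-dimensional sketch. This is not the authors' reconstruction pipeline: it only shows the standard multiplicative update for a known, normalised point-spread function on noise-free data, and the function name `richardson_lucy`, the Gaussian kernel and the single-spike test signal are assumptions made for the example.

```python
import numpy as np

def richardson_lucy(observed, psf, n_iter=200):
    """Generic 1-D Richardson-Lucy iteration (illustrative sketch).

    observed : non-negative 1-D array, the blurred signal.
    psf      : 1-D point-spread function of odd length (normalised here).
    """
    observed = np.asarray(observed, dtype=float)
    psf = np.asarray(psf, dtype=float)
    psf = psf / psf.sum()
    psf_mirror = psf[::-1]
    # Start from a flat, strictly positive estimate.
    estimate = np.full_like(observed, observed.mean())
    for _ in range(n_iter):
        blurred = np.convolve(estimate, psf, mode="same")
        # Guard against division by zero where the blurred estimate vanishes.
        ratio = observed / np.maximum(blurred, 1e-12)
        estimate = estimate * np.convolve(ratio, psf_mirror, mode="same")
    return estimate

# Blur a single spike with a Gaussian kernel, then try to sharpen it again.
truth = np.zeros(41)
truth[20] = 1.0
x = np.arange(-6, 7)
psf = np.exp(-0.5 * (x / 2.0) ** 2)
observed = np.convolve(truth, psf / psf.sum(), mode="same")
restored = richardson_lucy(observed, psf)
print(observed.max(), restored.max())
```

The update keeps the estimate non-negative at every step, which is one reason the scheme is popular for intensity-like quantities; with noisy data the number of iterations would normally be limited to avoid amplifying noise.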
Volz, Erik M.; Koopman, James S.; Ward, Melissa J.; Brown, Andrew Leigh; Frost, Simon D. W.
2012-01-01
Phylogenies of highly genetically variable viruses such as HIV-1 are potentially informative of epidemiological dynamics. Several studies have demonstrated the presence of clusters of highly related HIV-1 sequences, particularly among recently HIV-infected individuals, which have been used to argue for a high transmission rate during acute infection. Using a large set of HIV-1 subtype B pol sequences collected from men who have sex with men, we demonstrate that virus from recent infections tends to be phylogenetically clustered at a greater rate than virus from patients with chronic infection (‘excess clustering’) and also tends to cluster with other recent HIV infections rather than chronic, established infections (‘excess co-clustering’), consistent with previous reports. To determine the role that a higher infectivity during acute infection may play in excess clustering and co-clustering, we developed a simple model of HIV infection that incorporates an early period of intensified transmission, and explicitly considers the dynamics of phylogenetic clusters alongside the dynamics of acute and chronic infected cases. We explored the potential for clustering statistics to be used for inference of acute stage transmission rates and found that no single statistic explains very much variance in parameters controlling acute stage transmission rates. We demonstrate that high transmission rates during the acute stage are not the main cause of excess clustering of virus from patients with early/acute infection compared to chronic infection, which may simply reflect the shorter time since transmission in acute infection. Higher transmission during acute infection can result in excess co-clustering of sequences, while the extent of clustering observed is most sensitive to the fraction of infections sampled. PMID:22761556
Vehicle energy conservation indicating device and process for use
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crump, J.M.
A vehicle energy conservation indicating device comprises an integrated instrument cluster functioning basically as a nomographic computing mechanism. The odometer distance traveled indicator computing mechanism is linked with the fuel indicating gauge mechanism such that a three variable equation computing mechanism is obtained. The three variables are distance traveled, quantity of fuel consumed and distance traveled per unit of fuel consumed. Energy conservation is achieved by operating the vehicle under such performance conditions as to produce the highest possible value for distance traveled per unit of fuel consumed. The instrument panel cluster brings the operator's attention to focus upon and continuously stimulated to conserving energy. Furthermore, the vehicle energy conservation indicating device can be adapted for recording these performance variables on tape type print out. The speedometer advises the vehicle operator when he is obeying or breaking the speed laws which are enforced and monitored by the police with specific punishment prescribed for violations of the law. At this time there is no comparable procedure for enforcing vehicle energy conservation. Thus, this direct read out of distance traveled per unit of energy will moderate the operation in an analogous manner similar to subliminal advertising. This device becomes the focal point of the instrument panel along with the speedometer, thereby providing constant motivation to obey both the speed and energy conservation laws.
Heo, Seulkee; Lee, Eunil; Kwon, Bo Yeon; Lee, Suji; Jo, Kyung Hee; Kim, Jinsun
2016-08-03
Several studies identified a heterogeneous impact of heat on mortality in hot and cool regions during a fixed period, whereas less evidence is available for changes in risk over time due to climate change in these regions. We compared changes in risk during periods without (1996-2000) and with (2008-2012) heatwave warning forecasts in regions of South Korea with different climates. Study areas were categorised into 3 clusters based on the spatial clustering of cooling degree days in the period 1993-2012: hottest cluster (cluster H), moderate cluster (cluster M) and cool cluster (cluster C). The risk was estimated according to increases in the daily all-cause, cardiovascular and respiratory mortality per 1°C change in daily temperature above the threshold, using a generalised additive model. The risk of all types of mortality increased in cluster H in 2008-2012, compared with 1996-2000, whereas the risks in all-combined regions and cooler clusters decreased. Temporal increases in mortality risk were larger for some vulnerable subgroups, including younger adults (<75 years), those with a lower education and blue-collar workers, in cluster H as well as all-combined regions. Different patterns of risk change among clusters might be attributable to large increases in heatwave frequency or duration during study periods and the degree of urbanisation in cluster H. People living in hotter regions or with a lower socioeconomic status are at higher risk following an increasing trend of heat-related mortality risks. Continuous efforts are needed to understand factors which affect changes in heat-related mortality risks. Published by the BMJ Publishing Group Limited.
NASA Astrophysics Data System (ADS)
Madonna, E.; Li, C.; Grams, C. M.; Woollings, T.
2017-12-01
Understanding the variability of the North Atlantic eddy-driven jet is key to unravelling the dynamics, predictability and climate change response of extratropical weather in the region. This study aims to 1) reconcile two perspectives on wintertime variability in the North Atlantic-European sector and 2) clarify their link to atmospheric blocking. Two common views of wintertime variability in the North Atlantic are the zonal-mean framework comprising three preferred locations of the eddy-driven jet (southern, central, northern), and the weather regime framework comprising four classical North Atlantic-European regimes (Atlantic ridge AR, zonal ZO, European/Scandinavian blocking BL, Greenland anticyclone GA). We use a k-means clustering algorithm to characterize the two-dimensional variability of the eddy-driven jet stream, defined by the lower tropospheric zonal wind in the ERA-Interim reanalysis. The first three clusters capture the central jet and northern jet, along with a new mixed jet configuration; a fourth cluster is needed to recover the southern jet. The mixed cluster represents a split or strongly tilted jet, neither of which is well described in the zonal-mean framework, and has a persistence of about one week, similar to the other clusters. Connections between the preferred jet locations and weather regimes are corroborated - southern to GA, central to ZO, and northern to AR. In addition, the new mixed cluster is found to be linked to European/Scandinavian blocking, whose relation to the eddy-driven jet was previously unclear. The results highlight the necessity of bridging from weather to climate scales for a deeper understanding of atmospheric circulation variability.
The Search for Bright Variable Stars in Open Cluster NGC 6819.
NASA Astrophysics Data System (ADS)
Talamantes, Antonio; Sandquist, E. L.
2009-01-01
During this research period, data were taken on seven nights with the 1m telescope at Mt. Laguna Observatory for the open cluster NGC 6819. On four of the nights, data were taken using a V-band filter. On the remaining three nights, the data were taken using an R-band filter. Photometry was done using the ISIS image subtraction package. Six new variable stars were located using these techniques: a pulsating variable and five detached eclipsing binaries. Of the detached eclipsing binaries, three are near the cluster turnoff and two are in the blue straggler region (and one of these has total eclipses). Nine previously known variables (six contact binaries, two detached eclipsing binaries and one near-contact binary) were also studied.
Comparing Effects of Cluster-Coupled Patterns on Opinion Dynamics
NASA Astrophysics Data System (ADS)
Liu, Yun; Si, Xia-Meng; Zhang, Yan-Chao
2012-07-01
Community structure is another important feature of complex networks, besides the small-world and scale-free properties. Communities can be coupled through specific fixed links between nodes, or through occasional encounter behavior. We introduce a model for opinion evolution with multiple cluster-coupled patterns, in which the interconnectivity denotes the degree of coupling between communities through fixed links, and the encounter frequency controls the degree of coupling between communities through encounter behaviors. Considering the complicated cognitive system of people, the CODA (continuous opinions and discrete actions) update rules are used to mimic how people update their decisions after interacting with someone. It is shown that large interconnectivity and encounter frequency can both promote consensus, reduce competition between communities and propagate some opinion successfully across the whole population. Encounter frequency is better than interconnectivity at facilitating the consensus of decisions. When the degree of social cohesion is the same, small interconnectivity is more effective at lessening the competition between communities than small encounter frequency, while large encounter frequency can produce a greater degree of agreement across the whole population than large interconnectivity.
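For readers unfamiliar with CODA-style updating, the sketch below shows one commonly described form of the rule: each agent holds a continuous opinion as log-odds, displays only the sign of that opinion as its action, and shifts its own log-odds by a fixed step when it observes a neighbour's action. This is a hedged illustration and not the paper's model; the function name `coda_update`, the parameter `alpha` (the assumed probability that a neighbour's action points to the better option) and its default value are assumptions for the example, and the community coupling described in the abstract is not modelled here.

```python
import math

def coda_update(nu_i, nu_j, alpha=0.7):
    """Return agent i's new log-odds after observing agent j's action.

    nu_i, nu_j : continuous opinions expressed as log-odds.
    alpha      : assumed chance that an observed action is the better choice
                 (hypothetical parameter, must lie strictly between 0.5 and 1).
    """
    step = math.log(alpha / (1.0 - alpha))
    # Only the discrete action (the sign of j's opinion) is visible to i.
    action_j = 1 if nu_j > 0 else -1
    return nu_i + action_j * step

# An undecided agent moves toward whichever action it observes.
print(coda_update(0.0, 2.5))
print(coda_update(0.0, -0.1))
```

Because only the sign of the neighbour's opinion is used, a weakly held and a strongly held opinion exert the same influence, which is what lets local agreement reinforce itself.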
NASA Astrophysics Data System (ADS)
Ahumada, J. A.; Arellano Ferro, A.; Calderón, J. H.; Kains, N.
2015-08-01
We present CCD time-series observations of the central region of the globular cluster NGC 3201, collected from CASLEO in March 2013, with the aim of performing the Fourier decomposition of the light curves of the RR Lyrae variables. This procedure, applied to the RRab-type stars, gave a mean value [Fe/H], for the cluster metallicity, and 5.00 ± 0.22 kpc, for the cluster distance. The values found from two RRc stars are consistent with those derived previously. Because of differential reddening across the cluster field, individual reddenings for the RRab stars were estimated from their curves, resulting in an average value . An investigation of the light curves of stars in the blue straggler region led to the discovery of three new SX Phoenicis variables. The period-luminosity relation of the SX Phoenicis was used for an independent determination of the distance to the cluster and of the individual reddenings of these variables.
Wall, Martin; Casswell, Sally
2017-05-01
The aim was to identify a typology of drinkers in New Zealand based on alcohol consumption, beverage choice, and public versus private drinking locations and investigate the relationship between drinker types, harms experienced, and policy-related variables. Model-based cluster analysis of male and female drinkers including volumes of alcohol consumed in the form of beer, wine, spirits, and ready-to-drinks (RTDs) in off- and on-premise settings. Cluster membership was then related to harm measures: alcohol dependence, self-rated health; and to 3 policy-relevant variables: liking for alcohol adverts, price paid for alcohol, and time of purchase. Males and females were analyzed separately. Men fell into 4 and women into 14 clearly discriminated clusters. The male clusters consumed a relatively high proportion of alcohol in the form of beer. Women had a number of small extreme clusters and some consumed mainly spirits-based RTDs, while others drank mainly wine. Those in the higher consuming clusters were more likely to have signs of alcohol dependency, to report lower satisfaction with their health, to like alcohol ads, and to have purchased late at night. Consumption patterns are sufficiently distinctive to identify typologies of male and female alcohol consumers. Women drinkers are more heterogeneous than men. The clusters relate differently to policy-related variables. Copyright © 2017 by the Research Society on Alcoholism.
A NEW CENSUS OF THE VARIABLE STAR POPULATION IN THE GLOBULAR CLUSTER NGC 2419
DOE Office of Scientific and Technical Information (OSTI.GOV)
Di Criscienzo, M.; Greco, C.; Ripepi, V.
We present B, V, and I CCD light curves for 101 variable stars belonging to the globular cluster NGC 2419, 60 of which are new discoveries, based on data sets obtained at the Telescopio Nazionale Galileo, the Subaru telescope, and the Hubble Space Telescope. The sample includes 75 RR Lyrae stars (38 RRab, 36 RRc, and one RRd), one Population II Cepheid, 12 SX Phoenicis variables, two {delta} Scuti stars, three binary systems, five long-period variables, and three variables of uncertain classification. The pulsation properties of the RR Lyrae variables are close to those of Oosterhoff type II clusters, consistent with the low metal abundance and the cluster horizontal branch morphology, disfavoring (but not totally ruling out) an extragalactic hypothesis for the origin of NGC 2419. The observed properties of RR Lyrae and SX Phoenicis stars are used to estimate the cluster reddening and distance, using a number of different methods. Our final value is {mu}{sub 0} (NGC 2419) = 19.71 {+-} 0.08 mag (D = 87.5 {+-} 3.3 kpc), with E(B - V) = 0.08 {+-} 0.01 mag, [Fe/H] = -2.1 dex on the Zinn and West metallicity scale, and a value of M{sub V} that sets {mu}{sub 0} (LMC) = 18.52 mag. This value is in good agreement with the most recent literature estimates of the distance to NGC 2419.
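The quoted distance follows from the true distance modulus through the standard relation D = 10^(μ0/5 + 1) pc. The short sketch below reproduces that arithmetic; the helper names `distance_kpc` and `distance_error_kpc` are introduced only for this example, and the uncertainty uses simple first-order error propagation, which is an assumption about method and may not be how the authors derived their quoted error.

```python
import math

def distance_kpc(mu0):
    """Distance in kpc from a true (dereddened) distance modulus in mag."""
    return 10 ** (mu0 / 5.0 + 1.0) / 1000.0

def distance_error_kpc(mu0, sigma_mu):
    """First-order propagated uncertainty: dD = D * ln(10) / 5 * sigma_mu."""
    return distance_kpc(mu0) * math.log(10) / 5.0 * sigma_mu

d = distance_kpc(19.71)
err = distance_error_kpc(19.71, 0.08)
print(f"D = {d:.1f} kpc, first-order uncertainty = {err:.1f} kpc")
```

With μ0 = 19.71 mag this gives about 87.5 kpc, matching the abstract, and a first-order uncertainty of roughly 3.2 kpc, close to the quoted 3.3 kpc.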
Occupancy mapping and surface reconstruction using local Gaussian processes with Kinect sensors.
Kim, Soohwan; Kim, Jonghyuk
2013-10-01
Although RGB-D sensors have been successfully applied to visual SLAM and surface reconstruction, most of the applications aim at visualization. In this paper, we propose a novel method of building continuous occupancy maps and reconstructing surfaces in a single framework for both navigation and visualization. In particular, we apply a Bayesian nonparametric approach, Gaussian process classification, to occupancy mapping. However, it suffers from a high computational complexity of O(n^3) + O(n^2 m), where n and m are the numbers of training and test data, respectively, limiting its use for large-scale mapping with huge training data, which is common with high-resolution RGB-D sensors. Therefore, we partition both training and test data with a coarse-to-fine clustering method and apply Gaussian processes to each local cluster. In addition, we consider Gaussian processes as implicit functions, and thus extract iso-surfaces from the scalar fields, continuous occupancy maps, using marching cubes. By doing that, we are able to build two types of map representations within a single framework of Gaussian processes. Experimental results with 2-D simulated data show that the accuracy of our approximated method is comparable to previous work, while the computational time is dramatically reduced. We also demonstrate our method with 3-D real data to show its feasibility in large-scale environments.
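The effect of partitioning on the quoted cost can be illustrated with simple arithmetic. The sketch below is not the authors' implementation: it assumes, as an idealisation, that clustering splits the training and test points evenly into k groups and it ignores the cost of the clustering step itself. The function names and the sizes n, m, k are hypothetical.

```python
def gp_cost(n: int, m: int) -> int:
    """Operation count n^3 + n^2 * m for one Gaussian process
    with n training points and m test points."""
    return n**3 + n**2 * m


def local_gp_cost(n: int, m: int, k: int) -> int:
    """Total cost of k local Gaussian processes, assuming an even
    split of training and test points (an idealisation)."""
    return k * gp_cost(n // k, m // k)


# Hypothetical problem size: 10,000 training and test points, 100 clusters.
n, m, k = 10_000, 10_000, 100
speedup = gp_cost(n, m) / local_gp_cost(n, m, k)
print(f"global: {gp_cost(n, m):.1e}  local: {local_gp_cost(n, m, k):.1e}  "
      f"speedup: {speedup:.0f}x")
```

Under these assumptions the cost falls by a factor of k squared, which is consistent with the abstract's claim of dramatically reduced computation time.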
Neuroimaging paradigms for tonotopic mapping (II): the influence of acquisition protocol.
Langers, Dave R M; Sanchez-Panchuelo, Rosa M; Francis, Susan T; Krumbholz, Katrin; Hall, Deborah A
2014-10-15
Numerous studies on the tonotopic organisation of auditory cortex in humans have employed a wide range of neuroimaging protocols to assess cortical frequency tuning. In the present functional magnetic resonance imaging (fMRI) study, we made a systematic comparison between acquisition protocols with variable levels of interference from acoustic scanner noise. Using sweep stimuli to evoke travelling waves of activation, we measured sound-evoked response signals using sparse, clustered, and continuous imaging protocols that were characterised by inter-scan intervals of 8.8, 2.2, or 0.0 s, respectively. With regard to sensitivity to sound-evoked activation, the sparse and clustered protocols performed similarly, and both detected more activation than the continuous method. Qualitatively, tonotopic maps in activated areas proved highly similar, in the sense that the overall pattern of tonotopic gradients was reproducible across all three protocols. However, quantitatively, we observed substantial reductions in response amplitudes to moderately low stimulus frequencies that coincided with regions of strong energy in the scanner noise spectrum for the clustered and continuous protocols compared to the sparse protocol. At the same time, extreme frequencies became over-represented for these two protocols, and high best frequencies became relatively more abundant. Our results indicate that although all three scanning protocols are suitable to determine the layout of tonotopic fields, an exact quantitative assessment of the representation of various sound frequencies is substantially confounded by the presence of scanner noise. In addition, we noticed anomalous signal dynamics in response to our travelling wave paradigm that suggest that the assessment of frequency-dependent tuning is non-trivially influenced by time-dependent (hemo)dynamics when using sweep stimuli. Copyright © 2014. Published by Elsevier Inc.
The Nature and Origin of UCDs in the Coma Cluster
NASA Astrophysics Data System (ADS)
Chiboucas, Kristin; Tully, R. Brent; Madrid, Juan; Phillipps, Steven; Carter, David; Peng, Eric
2018-01-01
UCDs are super massive star clusters found largely in dense regions but have also been found around individual galaxies and in smaller groups. Their origin is still under debate but currently favored scenarios include formation as giant star clusters, either as the brightest globular clusters or through mergers of super star clusters, themselves formed during major galaxy mergers, or as remnant nuclei from tidal stripping of nucleated dwarf ellipticals. Establishing the nature of these enigmatic objects has important implications for our understanding of star formation, star cluster formation, the missing satellite problem, and galaxy evolution. We are attempting to disentangle these competing formation scenarios with a large survey of UCDs in the Coma cluster. Using ACS two-passband imaging from the HST/ACS Coma Cluster Treasury Survey, we are using colors and sizes to identify the UCD cluster members. With a large size limited sample of the UCD population within the core region of the Coma cluster, we are investigating the population size, properties, and spatial distribution, and comparing that with the Coma globular cluster and nuclear star cluster populations to discriminate between the threshing and globular cluster scenarios. In previous work, we had found a possible correlation of UCD colors with host galaxy and a possible excess of UCDs around a non-central giant galaxy with an unusually large globular cluster population, both suggestive of a globular cluster origin. With a larger sample size and additional imaging fields that encompass the regions around these giant galaxies, we have found that the color correlation with host persists and the giant galaxy with unusually large globular cluster population does appear to host a large UCD population as well. We present the current status of the survey.
Semi-supervised clustering methods
Bair, Eric
2013-01-01
Cluster analysis methods seek to partition a data set into homogeneous subgroups. They are useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning that there is no outcome variable and nothing is known about the relationship between the observations in the data set. In many situations, however, information about the clusters is available in addition to the values of the features. For example, the cluster labels of some observations may be known, or certain observations may be known to belong to the same cluster. In other cases, one may wish to identify clusters that are associated with a particular outcome variable. This review describes several clustering algorithms (known as “semi-supervised clustering” methods) that can be applied in these situations. The majority of these methods are modifications of the popular k-means clustering method, and several of them will be described in detail. A brief description of some other semi-supervised clustering algorithms is also provided. PMID:24729830
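One simple way to modify k-means when some cluster labels are known is to initialise the centroids from the labelled observations and keep those labels fixed during the iterations. The sketch below illustrates that idea only; it is not necessarily one of the algorithms covered in the review, and the function name, arguments, and example data are assumptions made for illustration.

```python
from math import dist


def seeded_kmeans(points, seeds, n_iter=20):
    """k-means with partial labels.

    points: list of coordinate tuples.
    seeds:  {point_index: label}, labels are ints 0..k-1 and every
            label must appear at least once (assumption of this sketch).
    Returns (labels, centroids).
    """
    k = max(seeds.values()) + 1
    labels = [seeds.get(i, -1) for i in range(len(points))]

    def centroids_from(current):
        cents = []
        for c in range(k):
            members = [points[i] for i, lab in enumerate(current) if lab == c]
            cents.append(tuple(sum(x) / len(members) for x in zip(*members)))
        return cents

    cents = centroids_from(labels)  # initial centroids use seed points only
    for _ in range(n_iter):
        new_labels = [
            seeds[i] if i in seeds  # known labels never change
            else min(range(k), key=lambda c: dist(p, cents[c]))
            for i, p in enumerate(points)
        ]
        if new_labels == labels:
            break
        labels = new_labels
        cents = centroids_from(labels)
    return labels, cents


points = [(0, 0), (0.2, 0.1), (0.1, 0.3), (5, 5), (5.1, 4.8), (4.9, 5.2)]
labels, cents = seeded_kmeans(points, seeds={0: 0, 3: 1})
print(labels)
```

Because every cluster keeps at least one fixed seed point, no cluster can become empty, which avoids a common failure mode of plain k-means.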
On-chip continuous-variable quantum entanglement
NASA Astrophysics Data System (ADS)
Masada, Genta; Furusawa, Akira
2016-09-01
Entanglement is an essential feature of quantum theory and the core of the majority of quantum information science and technologies. Quantum computing is one of the most important fruits of quantum entanglement and requires not only a bipartite entangled state but also more complicated multipartite entanglement. In previous experimental works to demonstrate various entanglement-based quantum information processing, light has been extensively used. Experiments utilizing such a complicated state need highly complex optical circuits to propagate optical beams and a high level of spatial interference between different light beams to generate quantum entanglement or to efficiently perform balanced homodyne measurement. Current experiments have been performed in conventional free-space optics with large numbers of optical components and a relatively large-sized optical setup. Therefore, they are limited in stability and scalability. Integrated photonics offer new tools and additional capabilities for manipulating light in quantum information technology. Owing to integrated waveguide circuits, it is possible to stabilize and miniaturize complex optical circuits and achieve high interference of light beams. The integrated circuits have been firstly developed for discrete-variable systems and then applied to continuous-variable systems. In this article, we review the currently developed scheme for generation and verification of continuous-variable quantum entanglement such as Einstein-Podolsky-Rosen beams using a photonic chip where waveguide circuits are integrated. This includes balanced homodyne measurement of a squeezed state of light. As a simple example, we also review an experiment for generating discrete-variable quantum entanglement using integrated waveguide circuits.
NASA Astrophysics Data System (ADS)
Caputo, F.
1987-01-01
It is shown that the pulsational properties of RR Lyrae variables in globular clusters can be used together with the Red Giant Branch location to derive reliable information on the cluster reddening and distance modulus. By demanding full agreement with some key observables, the reddening and distance modulus of the globular clusters M4 and M15 are derived as a function of the mass of the variables and of the adopted cluster metallicity. Thus, from the comparison between observations and theoretical isochrones, the cluster age can be evaluated. A best guess for the age of M4 and M15 can be presented: 16 × 10^9 yr, with a total uncertainty of 2 billion years.
A census of variability in globular cluster M 68 (NGC 4590)
NASA Astrophysics Data System (ADS)
Kains, N.; Arellano Ferro, A.; Figuera Jaimes, R.; Bramich, D. M.; Skottfelt, J.; Jørgensen, U. G.; Tsapras, Y.; Street, R. A.; Browne, P.; Dominik, M.; Horne, K.; Hundertmark, M.; Ipatov, S.; Snodgrass, C.; Steele, I. A.; Lcogt/Robonet Consortium; Alsubai, K. A.; Bozza, V.; Calchi Novati, S.; Ciceri, S.; D'Ago, G.; Galianni, P.; Gu, S.-H.; Harpsøe, K.; Hinse, T. C.; Juncher, D.; Korhonen, H.; Mancini, L.; Popovas, A.; Rabus, M.; Rahvar, S.; Southworth, J.; Surdej, J.; Vilela, C.; Wang, X.-B.; Wertz, O.; Mindstep Consortium
2015-06-01
Aims: We analyse 20 nights of CCD observations in the V and I bands of the globular cluster M 68 (NGC 4590) and use them to detect variable objects. We also obtained electron-multiplying CCD (EMCCD) observations for this cluster in order to explore its core with unprecedented spatial resolution from the ground. Methods: We reduced our data using difference image analysis to achieve the best possible photometry in the crowded field of the cluster. In doing so, we show that when dealing with identical networked telescopes, a reference image from any telescope may be used to reduce data from any other telescope, which facilitates the analysis significantly. We then used our light curves to estimate the properties of the RR Lyrae (RRL) stars in M 68 through Fourier decomposition and empirical relations. The variable star properties then allowed us to derive the cluster's metallicity and distance. Results: M 68 had 45 previously confirmed variables, including 42 RRL and 2 SX Phoenicis (SX Phe) stars. In this paper we determine new periods and search for new variables, especially in the core of the cluster where our method performs particularly well. We detect 4 additional SX Phe stars and confirm the variability of another star, bringing the total number of confirmed variable stars in this cluster to 50. We also used archival data stretching back to 1951 to derive period changes for some of the single-mode RRL stars, and analyse the significant number of double-mode RRL stars in M 68. Furthermore, we find evidence for double-mode pulsation in one of the SX Phe stars in this cluster. 
Using the different classes of variables, we derived values for the metallicity of the cluster of [Fe/H] = -2.07 ± 0.06 on the ZW scale, or -2.20 ± 0.10 on the UVES scale, and found true distance moduli μ0 = 15.00 ± 0.11 mag (using RR0 stars), 15.00 ± 0.05 mag (using RR1 stars), 14.97 ± 0.11 mag (using SX Phe stars), and 15.00 ± 0.07 mag (using the MV -[Fe/H] relation for RRL stars), corresponding to physical distances of 10.00 ± 0.49, 9.99 ± 0.21, 9.84 ± 0.50, and 10.00 ± 0.30 kpc, respectively. Thanks to the first use of difference image analysis on time-series observations of M 68, we are now confident that we have a complete census of the RRL stars in this cluster. The full Table 2 is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/578/A128
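The conversion from a true distance modulus to a physical distance follows the standard relation d = 10^(μ0/5 + 1) pc. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the error bars quoted in the abstract are not reproduced here.

```python
def distance_kpc(mu0: float) -> float:
    """Distance in kpc from a true (extinction-corrected) distance
    modulus, using d [pc] = 10 ** (mu0 / 5 + 1)."""
    return 10 ** (mu0 / 5 + 1) / 1000


# The modulus of 15.00 mag quoted above corresponds to 10.00 kpc.
print(f"{distance_kpc(15.00):.2f} kpc")
```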
Predicting Future Suicide Attempts among Depressed Suicide Ideators: A 10-year Longitudinal Study
May, Alexis M.; Klonsky, E. David; Klein, Daniel N.
2012-01-01
Suicidal ideation and attempts are a major public health problem. Research has identified many risk factors for suicidality; however, most fail to identify which suicide ideators are at greatest risk of progressing to a suicide attempt. Thus, the present study identified predictors of future suicide attempts in a sample of psychiatric patients reporting suicidal ideation. The sample comprised 49 individuals who met full DSM-IV criteria for major depressive disorder and/or dysthymic disorder and reported suicidal ideation at baseline. Participants were followed for 10 years. Demographic, psychological, personality, and psychosocial risk factors were assessed using validated questionnaires and structured interviews. Phi coefficients and point-biserial correlations were used to identify prospective predictors of attempts, and logistic regressions were used to identify which variables predicted future attempts over and above past suicide attempts. Six significant predictors of future suicide attempts were identified – cluster A personality disorder, cluster B personality disorder, lifetime substance abuse, baseline anxiety disorder, poor maternal relationship, and poor social adjustment. Finally, exploratory logistic regressions were used to examine the unique contribution of each significant predictor controlling for the others. Co-morbid cluster B personality disorder emerged as the only robust, unique predictor of future suicide attempts among depressed suicide ideators. Future research should continue to identify variables that predict transition from suicidal thoughts to suicide attempts, as such work will enhance clinical assessment of suicide risk as well as theoretical models of suicide. PMID:22575331
Liu, Yan-Lin; Shih, Cheng-Ting; Chang, Yuan-Jen; Chang, Shu-Jun; Wu, Jay
2014-01-01
The rapid development of picture archiving and communication systems (PACSs) has thoroughly changed how medical information is communicated and managed. However, as the scale of a hospital's operations increases, the large amount of digital images transferred in the network inevitably decreases system efficiency. In this study, a server cluster consisting of two server nodes was constructed. Network load balancing (NLB), distributed file system (DFS), and structured query language (SQL) duplication services were installed. A total of 1 to 16 workstations were used to transfer computed radiography (CR), computed tomography (CT), and magnetic resonance (MR) images simultaneously to simulate the clinical situation. The average transmission rate (ATR) was analyzed between the cluster and noncluster servers. In the download scenario, the ATRs of CR, CT, and MR images increased by 44.3%, 56.6%, and 100.9%, respectively, when using the server cluster, whereas the ATRs increased by 23.0%, 39.2%, and 24.9% in the upload scenario. In the mix scenario, the transmission performance increased by 45.2% when using eight computer units. The fault tolerance mechanisms of the server cluster maintained the system availability and image integrity. The server cluster can improve the transmission efficiency while maintaining high reliability and continuous availability in a healthcare environment. PMID:24701580
Brocher, Thomas M.; Blakely, Richard J.; Sherrod, Brian
2017-01-01
We investigate spatial and temporal relations between an ongoing and prolific seismicity cluster in central Washington, near Entiat, and the 14 December 1872 Entiat earthquake, the largest historic crustal earthquake in Washington. A fault scarp produced by the 1872 earthquake lies within the Entiat cluster; the locations and areas of both the cluster and the estimated 1872 rupture surface are comparable. Seismic intensities and the 1–2 m of coseismic displacement suggest a magnitude range between 6.5 and 7.0 for the 1872 earthquake. Aftershock forecast models for (1) the first several hours following the 1872 earthquake, (2) the largest felt earthquakes from 1900 to 1974, and (3) the seismicity within the Entiat cluster from 1976 through 2016 are also consistent with this magnitude range. Based on this aftershock modeling, most of the current seismicity in the Entiat cluster could represent aftershocks of the 1872 earthquake. Other earthquakes, especially those with long recurrence intervals, have long‐lived aftershock sequences, including the Mw 7.5 1891 Nobi earthquake in Japan, with aftershocks continuing 100 yrs after the mainshock. Although we do not rule out ongoing tectonic deformation in this region, a long‐lived aftershock sequence can account for these observations.
Hu, Yi; Xiong, Chenglong; Zhang, Zhijie; Luo, Can; Cohen, Ted; Gao, Jie; Zhang, Lijuan; Jiang, Qingwu
2014-01-03
We compared changes in the spatial clustering of schistosomiasis in Southwest China at the conclusion of and six years following the end of the World Bank Loan Project (WBLP), the control strategy of which was focused on the large-scale use of chemotherapy. Parasitological data were obtained through standardized surveys conducted in 1999-2001 and again in 2007-2008. Two alternate spatial cluster methods were used to identify spatial clusters of cases: Anselin's Local Moran's I test and Kulldorff's spatial scan statistic. Substantial reductions in the burden of schistosomiasis were found after the end of the WBLP, but the spatial extent of schistosomiasis was not reduced across the study area. Spatial clusters continued to occur in three regions: Chengdu Plain, Yangtze River Valley, and Lancang River Valley during the two periods, and regularly involved five counties. These findings suggest that despite impressive reductions in burden, the hilly and mountainous regions of Southwest China remain at risk of schistosome re-emergence. Our results help to highlight specific locations where integrated control programs can focus to speed the elimination of schistosomiasis in China.
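Anselin's Local Moran's I, one of the two cluster statistics named above, compares each region's deviation from the mean with the average deviation of its neighbours. The sketch below computes the statistic with row-standardised weights; it omits the permutation-based significance test normally used to flag clusters, and the function name and the four-region example are hypothetical rather than taken from the study.

```python
def local_morans_i(values, weights):
    """Local Moran's I for each region.

    values:  list of observations, one per region.
    weights: square matrix (list of lists); weights[i][j] > 0 when
             regions i and j are neighbours. Rows are standardised here.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # second moment of the deviations
    result = []
    for i in range(n):
        row_sum = sum(weights[i])
        lag = (
            sum(w * zj for w, zj in zip(weights[i], z)) / row_sum
            if row_sum
            else 0.0  # isolated region: no neighbours to compare with
        )
        result.append(z[i] * lag / m2)
    return result


# Four regions in a row: two high-prevalence neighbours, two low ones.
prevalence = [10, 9, 1, 0]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
local_i = local_morans_i(prevalence, adjacency)
print([round(v, 3) for v in local_i])
```

Positive values indicate a region surrounded by similar values (high-high or low-low), which is the pattern a spatial cluster produces.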
Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila
2010-07-16
Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. 
A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
The Clusters AgeS Experiment (CASE). Variable Stars in the Field of the Globular Cluster NGC 3201
NASA Astrophysics Data System (ADS)
Kaluzny, J.; Rozyczka, M.; Thompson, I. B.; Narloch, W.; Mazur, B.; Pych, W.; Schwarzenberg-Czerny, A.
2016-01-01
The field of the globular cluster NGC 3201 was monitored between 1998 and 2009 in a search for variable stars. BV light curves were obtained for 152 periodic or likely periodic variables, fifty-seven of which are new detections. Thirty-seven newly detected variables are proper motion members of the cluster. Among them we found seven detached or semi-detached eclipsing binaries, four contact binaries, and eight SX Phe pulsators. Four of the eclipsing binaries are located in the turnoff region, one on the lower main sequence and the remaining two slightly above the subgiant branch. Two contact systems are blue stragglers, and another two reside in the turnoff region. In the blue straggler region a total of 266 objects were found, of which 140 are proper motion (PM) members of NGC 3201, and another nineteen are field stars. Seventy-eight of the remaining objects for which we do not have PM data are located within the half-light radius from the center of the cluster, and most of them are likely genuine blue stragglers. Four variable objects in our field of view were found to coincide with X-ray sources: three chromospherically active stars and a quasar at a redshift z≈0.5.
Scalable NIC-based reduction on large-scale clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moody, A.; Fernández, J. C.; Petrini, F.
2003-01-01
Many parallel algorithms require efficient support for reduction collectives. Over the years, researchers have developed optimal reduction algorithms by taking into account system size, data size, and complexities of reduction operations. However, all of these algorithms have assumed that the reduction processing takes place on the host CPU. Modern Network Interface Cards (NICs) sport programmable processors with substantial memory and thus introduce a fresh variable into the equation. This raises the following interesting challenge: Can we take advantage of modern NICs to implement fast reduction operations? In this paper, we take on this challenge in the context of large-scale clusters. Through experiments on the 960-node, 1920-processor ASCI Linux Cluster (ALC) located at the Lawrence Livermore National Laboratory, we show that NIC-based reductions indeed perform with reduced latency and improved consistency over host-based algorithms for the common case and that these benefits scale as the system grows. In the largest configuration tested (1812 processors), our NIC-based algorithm can sum a single-element vector in 73 µs with 32-bit integers and in 118 µs with 64-bit floating-point numbers. These results represent an improvement, respectively, of 121% and 39% with respect to the production-level MPI library.
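For readers unfamiliar with reduction collectives, the sketch below simulates a generic binomial-tree reduction to rank 0 in a single process. It is an illustration of the communication pattern only, not the NIC-based algorithm of the paper and not an MPI implementation; the function name is hypothetical and the input must be non-empty.

```python
import operator


def tree_reduce(values, op):
    """Simulate a binomial-tree reduction to rank 0.

    values: one value per rank (non-empty list).
    op:     associative binary operation, e.g. operator.add.
    Returns (result held by rank 0, number of communication rounds).
    """
    vals = list(values)
    p = len(vals)
    step, rounds = 1, 0
    while step < p:
        # In each round, rank r + step sends its partial result to rank r.
        for receiver in range(0, p, 2 * step):
            sender = receiver + step
            if sender < p:
                vals[receiver] = op(vals[receiver], vals[sender])
        step *= 2
        rounds += 1
    return vals[0], rounds


# 1812 ranks, matching the largest configuration mentioned above;
# each rank contributes its own rank number.
total, rounds = tree_reduce(range(1812), operator.add)
print(total, rounds)
```

The number of rounds grows with the logarithm of the rank count, which is why per-round latency, the quantity a NIC-resident implementation targets, dominates the cost of small reductions.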
NASA Astrophysics Data System (ADS)
De, Anupam; Bandyopadhyay, Gautam; Chakraborty, B. N.
2010-10-01
Financial ratio analysis is an important and commonly used tool for analyzing the financial health of a firm. A large number of financial ratios, which can be categorized into different groups, are used for this analysis. However, to reduce the number of ratios used for financial analysis and to regroup them on the basis of empirical evidence, the Factor Analysis technique has been used successfully by different researchers during the last three decades. In this study, Factor Analysis has been applied to audited financial data of Indian cement companies for a period of 10 years. The sample companies are listed on Indian stock exchanges (BSE and NSE). Factor Analysis, conducted over 44 variables (financial ratios) grouped in 7 categories, resulted in 11 underlying categories (factors). Each factor is named in an appropriate manner considering the factor loadings and constituent variables (ratios). Representative ratios are identified for each such factor. To validate the results of Factor Analysis and to reach a final conclusion regarding the representative ratios, Cluster Analysis was performed.
NASA Astrophysics Data System (ADS)
Mukherjee, Anamitra; Patel, Niravkumar D.; Bishop, Chris; Dagotto, Elbio
2015-06-01
Lattice spin-fermion models are important to study correlated systems where quantum dynamics allows for a separation between slow and fast degrees of freedom. The fast degrees of freedom are treated quantum mechanically while the slow variables, generically referred to as the "spins," are treated classically. At present, exact diagonalization coupled with classical Monte Carlo (ED + MC) is extensively used to solve numerically a general class of lattice spin-fermion problems. In this common setup, the classical variables (spins) are treated via the standard MC method while the fermion problem is solved by exact diagonalization. The "traveling cluster approximation" (TCA) is a real space variant of the ED + MC method that allows one to solve spin-fermion problems on lattice sizes with up to 10^3 sites. In this publication, we present a novel reorganization of the TCA algorithm in a manner that can be efficiently parallelized. This allows us to solve generic spin-fermion models easily on 10^4 lattice sites and with some effort on 10^5 lattice sites, representing the record lattice sizes studied for this family of models.
Kumar, A; Taneja, N; Sharma, R K; Sharma, H; Ramamurthy, T; Sharma, M
2014-12-01
In a first study from India, a diverse collection of 140 environmental and clinical non-O157 Shiga-toxigenic Escherichia coli strains from a large geographical area in north India was typed by multi-locus variable number tandem repeat analysis (MLVA). The distribution of major virulence genes stx1, stx2 and eae was found to be 78%, 70% and 10%, respectively; 15 isolates were enterohaemorrhagic E. coli (stx1 +/stx2 + and eae +). By MLVA analysis, 44 different alleles were obtained. Dendrogram analysis revealed 104 different genotypes and 19 MLVA-type complexes divided into two main lineages, i.e. mutton and animal stool. Human isolates presented a statistically significant greater odds ratio for clustering with mutton samples compared to animal stool isolates. Five human isolates clustered with animal stool strains suggesting that some of the human infections may be from cattle, perhaps through milk, contact or the environment. Further epidemiological studies are required to explore these sources in context with occurrence of human cases.
Fleming, Brandon J.; LaMotte, Andrew E.; Sekellick, Andrew J.
2013-01-01
Hydrogeologic regions in the fractured rock area of Maryland were classified using geographic information system tools with principal components and cluster analyses. A study area consisting of the 8-digit Hydrologic Unit Code (HUC) watersheds with rivers that flow through the fractured rock area of Maryland and bounded by the Fall Line was further subdivided into 21,431 catchments from the National Hydrography Dataset Plus. The catchments were then used as a common hydrologic unit to compile relevant climatic, topographic, and geologic variables. A principal components analysis was performed on 10 input variables, and 4 principal components that accounted for 83 percent of the variability in the original data were identified. A subsequent cluster analysis grouped the catchments based on four principal component scores into six hydrogeologic regions. Two crystalline rock hydrogeologic regions, including large parts of the Washington, D.C. and Baltimore metropolitan regions that represent over 50 percent of the fractured rock area of Maryland, are distinguished by differences in recharge, Precipitation minus Potential Evapotranspiration, sand content in soils, and groundwater contributions to streams. This classification system will provide a georeferenced digital hydrogeologic framework for future investigations of groundwater availability in the fractured rock area of Maryland.
Modulation of Subseasonal Tropical Cyclone Genesis In The Western North Pacific By Wave Activities
NASA Astrophysics Data System (ADS)
Gao, Jianyun; Cheung, Kevin K. W.
2017-04-01
Tropical cyclone (TC) activity is well known to possess variability on multiple timescales, ranging from inter-decadal to intraseasonal. In this study, the subseasonal variability of TC genesis in the western North Pacific (WNP) is examined during summer (May-October) for the period of 1979-2015. In particular, clustering of TC activity within subseasonal timescale is the focus. First, three phases (active, normal and inactive phases) of TC clustering are defined based on the statistics of genesis frequency. Then the modes of subseasonal modulation of these three phases by intraseasonal (30-60-day) oscillation (ISO), biweekly (10-20-day) oscillation (BWO), and the convectively coupled equatorial waves (CCEW), including Rossby, Kelvin, and mixed Rossby-gravity and tropical depression-type waves are considered. It is found that the embedding large-scale circulation is significantly different between the inactive phase and the other phases. Further, the intensities and propagation phases of the ISO, BWO and CCEW play different roles to modulate TC genesis frequency during the active and normal phase. Considering the lag correlation of these subseasonal modulation modes and TC genesis, it is possible to construct a statistical model for the purpose of extended-range forecasting of subseasonal variability of TC occurrence over the WNP.
Large-Angular-Scale Clustering as a Clue to the Source of UHECRs
NASA Astrophysics Data System (ADS)
Berlind, Andreas A.; Farrar, Glennys R.
We explore what can be learned about the sources of UHECRs from their large-angular-scale clustering (referred to as their "bias" by the cosmology community). Exploiting the clustering on large scales has the advantage over small-scale correlations of being insensitive to uncertainties in source direction from magnetic smearing or measurement error. In a Cold Dark Matter cosmology, the amplitude of large-scale clustering depends on the mass of the system, with more massive systems such as galaxy clusters clustering more strongly than less massive systems such as ordinary galaxies or AGN. Therefore, studying the large-scale clustering of UHECRs can help determine a mass scale for their sources, given the assumption that their redshift depth is as expected from the GZK cutoff. We investigate the constraining power of a given UHECR sample as a function of its cutoff energy and number of events. We show that current and future samples should be able to distinguish between the cases of their sources being galaxy clusters, ordinary galaxies, or sources that are uncorrelated with the large-scale structure of the universe.
Impoinvil, Daniel E.; Solomon, Tom; Schluter, W. William; Rayamajhi, Ajit; Bichha, Ram Padarath; Shakya, Geeta; Caminade, Cyril; Baylis, Matthew
2011-01-01
Background: To identify potential environmental drivers of Japanese Encephalitis virus (JE) transmission in Nepal, we conducted an ecological study to determine the spatial association between 2005 Nepal JE incidence and climate, agricultural, and land-cover variables at district level. Methods: District-level data on JE cases were examined using Local Indicators of Spatial Association (LISA) analysis to identify spatial clusters from 2004 to 2008, and 2005 data were used to fit a spatial lag regression model with climate, agriculture and land-cover variables. Results: Prior to 2006, there was a single large cluster of JE cases located in the Far-West and Mid-West terai regions of Nepal. After 2005, the distribution of JE cases in Nepal shifted, with clusters found in the central hill areas. JE incidence during the 2005 epidemic had a stronger association with May mean monthly temperature and April mean monthly total precipitation than with mean annual temperature and precipitation. A parsimonious spatial lag regression model revealed 1) a significant negative relationship between JE incidence and April precipitation, 2) a significant positive relationship between JE incidence and percentage of irrigated land, 3) a non-significant negative relationship between JE incidence and percentage of grassland cover, and 4) a unimodal non-significant relationship between JE incidence and pig-to-human ratio. Conclusion: JE cases clustered in the terai prior to 2006, after which the clustering appeared to shift to the Kathmandu region. The spatial pattern of JE cases during the 2005 epidemic in Nepal was significantly associated with low precipitation and the percentage of irrigated land. Despite the availability of an effective vaccine, it is still important to understand environmental drivers of JEV transmission since the enzootic cycle of JEV transmission is not likely to be totally interrupted. Understanding the spatial dynamics of JE risk factors may be useful in providing important information to the Nepal immunization program. PMID:21811573
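The LISA step mentioned in the abstract can be illustrated with a minimal sketch of the local Moran's I statistic. The incidence values and the neighbor map below are invented toy data, not figures from the study, and the function name is hypothetical. The sketch assumes row-standardized weights (each district's neighbors are averaged) and omits the permutation test normally used to judge significance.

```python
def local_morans_i(values, neighbors):
    """Local Moran's I for each unit, using row-standardized neighbor weights.

    values    -- list of observations (e.g. incidence per district)
    neighbors -- dict mapping each index to a list of neighboring indices
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    m2 = sum(d * d for d in z) / n          # second moment of the deviations
    scores = []
    for i in range(n):
        nbrs = neighbors[i]
        # Spatial lag: average deviation among the neighbors of unit i.
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        scores.append(z[i] * lag / m2)
    return scores


# Toy example (assumed data): four districts arranged in a chain.
incidence = [10.0, 9.0, 1.0, 2.0]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scores = local_morans_i(incidence, chain)
print(scores)
```

A positive score marks a unit whose value resembles its neighbors (a high-high or low-low cluster), while a negative score marks a spatial outlier.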
Spatiotemporal Analysis of Corn Phenoregions in the Continental United States
NASA Astrophysics Data System (ADS)
Konduri, V. S.; Kumar, J.; Hoffman, F. M.; Ganguly, A. R.; Hargrove, W. W.
2017-12-01
The delineation of regions exhibiting similar crop performance has potential benefits for agricultural planning and management, policymaking and natural resource conservation. Studies of natural ecosystems have used multivariate clustering algorithms based on environmental characteristics to identify ecoregions for species range prediction and habitat conservation. However, few studies have used clustering to delineate regions based on crop phenology. The aim of this study was to perform a spatiotemporal analysis of phenologically self-similar clusters, or phenoregions, for the major corn growing areas in the Continental United States (CONUS) for the period 2008-2016. Annual trajectories of remotely sensed normalized difference vegetation index (NDVI), a useful proxy for land surface phenology, derived from Moderate Resolution Imaging Spectroradiometer (MODIS) instruments at 8-day intervals and 250 m resolution, were used as the phenological metric. Because of the large data volumes involved, the phenoregion delineation was performed using a highly scalable, unsupervised clustering technique with the help of high performance computing. These phenoregions capture the spatial variability in the timing of important crop phenological stages (like emergence and maturity dates) and thus could be used to develop more accurate parameterizations for crop models applied at regional to global scales. Moreover, historical crop performance from phenoregions, in combination with climate and soils data, could be used to improve production forecasts. The temporal variability in NDVI at each location could also be used to develop an early warning system to identify locations where the crop deviates from its expected phenological behavior. Such deviations may indicate a need for irrigation or fertilization or suggest where pest outbreaks or other disturbances have occurred.
ICM: a web server for integrated clustering of multi-dimensional biomedical data.
He, Song; He, Haochen; Xu, Wenjian; Huang, Xin; Jiang, Shuai; Li, Fei; He, Fuchu; Bo, Xiaochen
2016-07-08
Large-scale efforts for parallel acquisition of multi-omics profiling continue to generate extensive amounts of multi-dimensional biomedical data. Thus, integrated clustering of multiple types of omics data is essential for developing individual-based treatments and precision medicine. However, while rapid progress has been made, methods for integrated clustering lack an intuitive web interface that would help biomedical researchers without sufficient programming skills. Here, we present a web tool, named Integrated Clustering of Multi-dimensional biomedical data (ICM), that provides an interface from which to fuse, cluster and visualize multi-dimensional biomedical data and knowledge. With ICM, users can explore the heterogeneity of a disease or a biological process by identifying subgroups of patients. The results obtained can then be interactively modified by using an intuitive user interface. Researchers can also exchange the results from ICM with collaborators via a web link containing a Project ID number that will directly pull up the analysis results being shared. ICM also supports incremental clustering, which allows users to add new sample data to the data of a previous study to obtain a clustering result. Currently, the ICM web server is available with no login requirement and at no cost at http://biotech.bmi.ac.cn/icm/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Image-Subtraction Photometry of Variable Stars in the Globular Clusters NGC 6388 and NGC 6441
NASA Technical Reports Server (NTRS)
Corwin, Michael T.; Sumerel, Andrew N.; Pritzl, Barton J.; Smith, Horace A.; Catelan, M.; Sweigart, Allen V.; Stetson, Peter B.
2006-01-01
We have applied Alard's image subtraction method (ISIS v2.1) to the observations of the globular clusters NGC 6388 and NGC 6441 previously analyzed using standard photometric techniques (DAOPHOT, ALLFRAME). In this reanalysis of observations obtained at CTIO, besides recovering the variables previously detected on the basis of our ground-based images, we have also been able to recover most of the RR Lyrae variables previously detected only in the analysis of Hubble Space Telescope (HST) WFPC2 observations of the inner region of NGC 6441. In addition, we report five possible new variables not found in the analysis of the HST observations of NGC 6441. This dramatically illustrates the capabilities of image subtraction techniques applied to ground-based data to recover variables in extremely crowded fields. We have also detected twelve new variables and six possible variables in NGC 6388 not found in our previous ground-based studies. Revised mean periods for RRab stars in NGC 6388 and NGC 6441 are 0.676 day and 0.756 day, respectively. These values are among the largest known for any galactic globular cluster. Additional probable type II Cepheids were identified in NGC 6388, confirming its status as a metal-rich globular cluster rich in Cepheids.
Bashan, Anat; Yonath, Ada
2009-01-01
Crystallography of ribosomes, the universal cell nucleoprotein assemblies facilitating the translation of the genetic code into proteins, met with severe problems owing to their large size, complex structure, inherent flexibility and high conformational variability. For the small ribosomal subunit, which caused extreme difficulties, post-crystallization treatment with minute amounts of a heteropolytungstate cluster allowed structure determination at atomic resolution. This cluster played a dual role in ribosomal crystallography: providing anomalous phasing power and dramatically increasing the resolution by stabilizing a selected functional conformation. Thus, four out of the fourteen clusters that bind to each of the crystallized small subunits are attached to a specific ribosomal protein in a fashion that may control a significant component of the subunit's internal flexibility, by “gluing” symmetry-related subunits. Here we highlight basic issues in the relationship between metal ions and macromolecules and present common traits controlling the interactions between polymetalates and various macromolecules, which may be extended towards the exploitation of polymetalates for therapeutic treatment. PMID:19915655
VanGelder, L E; Kosswattaarachchi, A M; Forrestel, P L; Cook, T R; Matson, E M
2018-02-14
Non-aqueous redox flow batteries have emerged as promising systems for large-capacity, reversible energy storage, capable of meeting the variable demands of the electrical grid. Here, we investigate the potential for a series of Lindqvist polyoxovanadate-alkoxide (POV-alkoxide) clusters, [V6O7(OR)12] (R = CH3, C2H5), to serve as the electroactive species for a symmetric, non-aqueous redox flow battery. We demonstrate that the physical and electrochemical properties of these POV-alkoxides make them suitable for applications in redox flow batteries, and that ligand modification at the bridging alkoxide moieties can yield significant improvements in cluster stability during charge-discharge cycling. Indeed, the metal-oxide core remains intact upon deep charge-discharge cycling, enabling extremely high coulombic efficiencies (∼97%) with minimal overpotential losses (∼0.3 V). Furthermore, the bulky POV-alkoxide demonstrates significant resistance to deleterious crossover, which will lead to improved lifetime and efficiency in a redox flow battery.
NASA Astrophysics Data System (ADS)
Choi, Jiwoong; Leblanc, Lawrence; Choi, Sanghun; Haghighi, Babak; Hoffman, Eric; Lin, Ching-Long
2017-11-01
The goal of this study is to assess inter-subject variability in delivery of orally inhaled drug products to small airways in asthmatic lungs. A recent multiscale imaging-based cluster analysis (MICA) of computed tomography (CT) lung images in an asthmatic cohort identified four clusters with statistically distinct structural and functional phenotypes associated with unique clinical biomarkers. Thus, we aimed to address inter-subject variability via inter-cluster variability. We selected a representative subject from each of the 4 asthma clusters, as well as 1 male and 1 female healthy control, and performed computational fluid and particle simulations on CT-based airway models of these subjects. The subjects from the two clusters characterized by segmental airway constriction (one severe and one non-severe asthmatic) showed increased particle deposition efficiency compared with the subjects from the other two clusters (one non-severe and one severe asthmatic) without airway constriction. Constriction-induced jets impinging on distal bifurcations led to excessive particle deposition. The results emphasize the impact of airway constriction, rather than disease severity, on regional particle deposition, demonstrating the potential of using cluster membership to tailor drug delivery. NIH Grants U01HL114494 and S10-RR022421, and FDA Grant U01FD005837. XSEDE.
NASA Astrophysics Data System (ADS)
Lee, S.; Maharani, Y. N.; Ki, S. J.
2015-12-01
Applying the Self-Organizing Map (SOM) to analyze social vulnerability and recognize resilience within sites is a challenging task. The aim of this study is to propose a computational method to identify sites according to their similarity and to determine the most relevant variables that characterize social vulnerability in each cluster. For this purpose, SOM is considered an effective platform for the analysis of high-dimensional data. By considering the cluster structure, the social vulnerability characteristics of the identified sites can be fully understood. In this study, social vulnerability is constructed from 17 variables: 12 independent variables that represent socio-economic concepts and 5 dependent variables that represent the damage and losses due to the Merapi eruption in 2010. These variables collectively represent the local situation of the study area, based on fieldwork conducted in September 2013. By using both independent and dependent variables, we can identify whether social vulnerability is reflected in the actual situation, in this case the 2010 Merapi eruption. However, social vulnerability analysis in local communities involves a number of variables that represent their socio-economic condition, and some of the variables employed in this study might be more or less redundant. Therefore, SOM is used to remove redundant variables by selecting representative variables using the component planes and the correlation coefficients between variables, in order to find the effective sample size. The selected dataset was then clustered according to similarity. This approach can produce reliable clustering estimates and identify the most significant variables, and it could be useful for social vulnerability assessment, especially for stakeholders acting as decision makers. 
This research was supported by a grant 'Development of Advanced Volcanic Disaster Response System considering Potential Volcanic Risk around Korea' [MPSS-NH-2015-81] from the Natural Hazard Mitigation Research Group, National Emergency Management Agency of Korea. Keywords: Self-organizing map, Component Planes, Correlation coefficient, Cluster analysis, Sites identification, Social vulnerability, Merapi eruption 2010
Reshef, Noam; Walbaum, Natasha; Agam, Nurit; Fait, Aaron
2017-01-01
Vineyards are characterized by their large spatial variability of solar irradiance (SI) and temperature, known to effectively modulate grape metabolism. To explore the role of sunlight in shaping fruit composition and cluster uniformity, we studied the spatial pattern of incoming irradiance, fruit temperature and metabolic profile within individual grape clusters under three levels of sunlight exposure. The experiment was conducted in a vineyard of Cabernet Sauvignon cv. located in the Negev Highlands, Israel, where excess SI and midday temperatures are known to degrade grape quality. Filtering SI lowered the surface temperature of exposed fruits and increased the uniformity of irradiance and temperature in the cluster zone. SI affected the overall levels and patterns of accumulation of sugars, organic acids, amino acids and phenylpropanoids, across the grape cluster. Increased exposure to sunlight was associated with lower accumulation levels of malate, aspartate, and maleate but with higher levels of valine, leucine, and serine, in addition to the stress-related proline and GABA. Flavan-3-ols metabolites showed a negative response to SI, whereas flavonols were highly induced. The overall levels of anthocyanins decreased with increased sunlight exposure; however, a hierarchical cluster analysis revealed that the members of this family were grouped into three distinct accumulation patterns, with malvidin anthocyanins and cyanidin-glucoside showing contrasting trends. The flavonol-glucosides, quercetin and kaempferol, exhibited a logarithmic response to SI, leading to improved cluster uniformity under high-light conditions. Comparing the within-cluster variability of metabolite accumulation highlighted the stability of sugars, flavan-3-ols, and cinnamic acid metabolites to SI, in contrast to the plasticity of flavonols. 
A correlation-based network analysis revealed that extended exposure to SI modified metabolic coordination, increasing the number of negative correlations between metabolites in both pulp and skin. This integrated study of micrometeorology and metabolomics provided insights into the grape-cluster pattern of accumulation of 70 primary and secondary metabolites as a function of spatial variations in SI. Studying compound-specific responses against an extended gradient of quantified conditions improved our knowledge regarding the modulation of berry metabolism by SI, with the aim of using sunlight regulation to accurately modulate fruit composition in warm and arid/semi-arid regions. PMID:28203242
Methods for sample size determination in cluster randomized trials
Rutterford, Clare; Copas, Andrew; Eldridge, Sandra
2015-01-01
Background: The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. Methods: We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. Results: We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. Conclusions: There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials. PMID:26174515
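The "simplest approach" described in the abstract, an individually randomized sample size inflated by a design effect, can be sketched as follows. This is an illustrative sketch, not the paper's own code: the function names and the example inputs are hypothetical, the individual-level formula assumes a two-arm comparison of means with a normal approximation, and the coefficient-of-variation term is one commonly cited adjustment for unequal cluster sizes, included here as an assumption.

```python
import math
from statistics import NormalDist


def n_per_arm_individual(delta, sd, alpha=0.05, power=0.80):
    """Sample size per arm for comparing two means under individual randomization."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2


def design_effect(mean_cluster_size, icc, cv=0.0):
    """Design effect 1 + ((cv^2 + 1) * m - 1) * icc.

    With cv = 0 (equal cluster sizes) this reduces to 1 + (m - 1) * icc.
    """
    return 1 + ((cv ** 2 + 1) * mean_cluster_size - 1) * icc


def clusters_per_arm(delta, sd, mean_cluster_size, icc, cv=0.0,
                     alpha=0.05, power=0.80):
    """Inflate the individual-level sample size and convert it to whole clusters."""
    n = n_per_arm_individual(delta, sd, alpha, power)
    n_inflated = n * design_effect(mean_cluster_size, icc, cv)
    return math.ceil(n_inflated / mean_cluster_size)


# Hypothetical inputs: effect size 0.5 SD, 20 participants per cluster, ICC 0.05.
print(design_effect(20, 0.05))
print(clusters_per_arm(delta=0.5, sd=1.0, mean_cluster_size=20, icc=0.05))
```

Even a small intracluster correlation nearly doubles the required sample here, which is why the design effect cannot be ignored when randomizing by cluster.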
Internet Gamblers Differ on Social Variables: A Latent Class Analysis.
Khazaal, Yasser; Chatton, Anne; Achab, Sophia; Monney, Gregoire; Thorens, Gabriel; Dufour, Magali; Zullino, Daniele; Rothen, Stephane
2017-09-01
Online gambling has gained popularity in the last decade, leading to an important shift in how consumers engage in gambling and in the factors related to problem gambling and prevention. Indebtedness and loneliness have previously been associated with problem gambling. The current study aimed to characterize online gamblers in relation to indebtedness, loneliness, and several in-game social behaviors. The data set was obtained from 584 Internet gamblers recruited online through gambling websites and forums. Of these gamblers, 372 participants completed all study assessments and were included in the analyses. Questionnaires included those on sociodemographics and social variables (indebtedness, loneliness, in-game social behaviors), as well as the Gambling Motives Questionnaire, Gambling Related Cognitions Scale, Internet Addiction Test, Problem Gambling Severity Index, Short Depression-Happiness Scale, and UPPS-P Impulsive Behavior Scale. Social variables were explored with a latent class model. The clusters obtained were compared for psychological measures and three clusters were found: lonely indebted gamblers (cluster 1: 6.5%), not lonely not indebted gamblers (cluster 2: 75.4%), and not lonely indebted gamblers (cluster 3: 18%). Participants in clusters 1 and 3 (particularly in cluster 1) were at higher risk of problem gambling than were those in cluster 2. The three groups differed on most assessed variables, including the Problem Gambling Severity Index, the Short Depression-Happiness Scale, and the UPPS-P subscales (except the sensation seeking subscore). Results highlight significant between-group differences, suggesting that Internet gamblers are not a homogeneous group. Specific intervention strategies could be implemented for groups at risk.
Variable stars around selected open clusters in the VVV area: Young Stellar Objects
NASA Astrophysics Data System (ADS)
Medina, Nicolas; Borissova, Jura; Bayo, Amelia; Kurtev, Radostin; Lucas, Philip
2017-09-01
Time-varying phenomena are one of the most substantial sources of astrophysical information, and have led to many fundamental discoveries in modern astronomy. We have developed an automated tool to search for and analyze variable sources in the near-infrared Ks band, using the data from the VISTA Variables in the Vía Láctea (VVV) ESO Public Survey ([5, 8]). One of our main goals is to investigate the Young Stellar Objects (YSOs) in the Galactic star forming regions, looking for:
Here we present the newly discovered YSOs within some selected stellar clusters in our Galaxy.
NASA Astrophysics Data System (ADS)
Fukunaga, Naoto; Konishi, Katsuaki
2015-12-01
Poly(ethylene glycol) (PEG) has been widely used for the surface protection of inorganic nanoobjects because of its virtually 'inert' nature, but little attention has been paid to its inherent electronic impacts on inorganic cores. Herein, we definitively show, through studies on optical properties of a series of PEG-modified Cd10Se4(SR)10 clusters, that the surrounding PEG environments can electronically affect the properties of the inorganic core. For the clusters with PEG units directly attached to an inorganic core (R = (CH2CH2O)nOCH3, 1-PEGn, n = 3, ~7, ~17, ~46), the absorption bands, associated with the low-energy transitions, continuously blue-shifted with the increasing PEG chain length. The chain length dependencies were also observed in the photoluminescence properties, particularly in the excitation spectral profiles. By combining the spectral features of several PEG17-modified clusters (2-Cm-PEG17 and 3) whose PEG and core units are separated by various alkyl chain-based spacers, it was demonstrated that sufficiently long PEG units, including PEG17 and PEG46, cause electronic perturbations in the cluster properties when they are arranged near the inorganic core. These unique effects of the long-PEG environments could be correlated with their large dipole moments, suggesting that the polarity of the proximal chemical environment is critical when affecting the electronic properties of the inorganic cluster core. Electronic supplementary information (ESI) available: Details of synthetic procedures and characterisation data of the PEGylated thiols and clusters and additional absorption, photoluminescence emission and excitation spectral data. See DOI: 10.1039/c5nr06307h
On the complexity of some quadratic Euclidean 2-clustering problems
NASA Astrophysics Data System (ADS)
Kel'manov, A. V.; Pyatkin, A. V.
2016-03-01
Some problems of partitioning a finite set of points of Euclidean space into two clusters are considered. In these problems, the following criteria are minimized: (1) the sum over both clusters of the sums of squared pairwise distances between the elements of the cluster and (2) the sum of the (multiplied by the cardinalities of the clusters) sums of squared distances from the elements of the cluster to its geometric center, where the geometric center (or centroid) of a cluster is defined as the mean value of the elements in that cluster. Additionally, another problem close to (2) is considered, where the desired center of one of the clusters is given as input, while the center of the other cluster is unknown (is the variable to be optimized) as in problem (2). Two variants of the problems are analyzed, in which the cardinalities of the clusters are (1) parts of the input or (2) optimization variables. It is proved that all the considered problems are strongly NP-hard and that, in general, there is no fully polynomial-time approximation scheme for them (unless P = NP).
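The two criteria in the abstract can be made concrete with a small brute-force sketch. The code below is an illustration under stated assumptions, not the authors' method: the function names and the six sample points are invented, pairwise distances are summed over unordered pairs, and the exhaustive search over all 2-partitions takes exponential time, which is consistent with the hardness result but only practical for tiny inputs. For unordered pairs, the standard identity that the sum of squared pairwise distances equals the cluster size times the sum of squared distances to the centroid makes criteria (1) and (2) coincide.

```python
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def pairwise_cost(cluster):
    """Criterion (1) for one cluster: sum of squared distances over unordered pairs."""
    return sum(sq_dist(p, q) for p, q in combinations(cluster, 2))


def weighted_centroid_cost(cluster):
    """Criterion (2) for one cluster: |C| times the sum of squared distances to the centroid."""
    c = centroid(cluster)
    return len(cluster) * sum(sq_dist(p, c) for p in cluster)


def best_two_partition(points, cluster_cost):
    """Exhaustively search all splits into two non-empty clusters (exponential time)."""
    n = len(points)
    best = None
    for mask in range(1, 2 ** n - 1):
        a = [points[i] for i in range(n) if mask >> i & 1]
        b = [points[i] for i in range(n) if not mask >> i & 1]
        total = cluster_cost(a) + cluster_cost(b)
        if best is None or total < best[0]:
            best = (total, a, b)
    return best


# Toy input (assumed data): two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
cost1, a1, b1 = best_two_partition(points, pairwise_cost)
cost2, a2, b2 = best_two_partition(points, weighted_centroid_cost)
print(cost1, a1, b1)
print(cost2, a2, b2)
```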
THE MASS-RICHNESS RELATION OF MaxBCG CLUSTERS FROM QUASAR LENSING MAGNIFICATION USING VARIABILITY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bauer, Anne H.; Baltay, Charles; Ellman, Nancy
2012-04-10
Accurate measurement of galaxy cluster masses is an essential component not only in studies of cluster physics but also for probes of cosmology. However, different mass measurement techniques frequently yield discrepant results. The Sloan Digital Sky Survey MaxBCG catalog's mass-richness relation has previously been constrained using weak lensing shear, Sunyaev-Zeldovich (SZ), and X-ray measurements. The mass normalization of the clusters as measured by weak lensing shear is ≳25% higher than that measured using SZ and X-ray methods, a difference much larger than the stated measurement errors in the analyses. We constrain the mass-richness relation of the MaxBCG galaxy cluster catalog by measuring the gravitational lensing magnification of type I quasars in the background of the clusters. The magnification is determined using the quasars' variability and the correlation between quasars' variability amplitude and intrinsic luminosity. The mass-richness relation determined through magnification is in agreement with that measured using shear, confirming that the lensing strength of the clusters implies a high mass normalization and that the discrepancy with other methods is not due to a shear-related systematic measurement error. We study the dependence of the measured mass normalization on the cluster halo orientation. As expected, line-of-sight clusters yield a higher normalization; however, this minority of haloes does not significantly bias the average mass-richness relation of the catalog.
Variable Stars In the Unusual, Metal-Rich Globular Cluster
NASA Technical Reports Server (NTRS)
Pritzl, Barton J.; Smith, Horace A.; Catelan, Marcio; Sweigart, Allen V.; Oegerle, William R. (Technical Monitor)
2002-01-01
We have undertaken a search for variable stars in the metal-rich globular cluster NGC 6388 using time-series BV photometry. Twenty-eight new variables were found in this survey, increasing the total number of variables found near NGC 6388 to approx. 57. A significant number of the variables are RR Lyrae (approx. 14), most of which are probable cluster members. The periods of the fundamental mode RR Lyrae are shown to be unusually long compared to metal-rich field stars. The existence of these long period RRab stars suggests that the horizontal branch of NGC 6388 is unusually bright. This implies that the metallicity-luminosity relationship for RR Lyrae stars is not universal if the RR Lyrae in NGC 6388 are indeed metal-rich. We consider the alternative possibility that the stars in NGC 6388 may span a range in [Fe/H]. Four candidate Population II Cepheids were also found. If they are members of the cluster, NGC 6388 would be the most metal-rich globular cluster to contain Population II Cepheids. The mean V magnitude of the RR Lyrae is found to be 16.85 +/- 0.05, resulting in a distance of 9.0 to 10.3 kpc for a range of assumed values of M_V for RR Lyrae. We determine the reddening of the cluster to be E(B - V) = 0.40 +/- 0.03 mag, with differential reddening across the face of the cluster. We discuss the difficulty in determining the Oosterhoff classification of NGC 6388 and NGC 6441 due to the unusual nature of their RR Lyrae, and address evolutionary constraints on a recent suggestion that they are of Oosterhoff type II.
NASA Astrophysics Data System (ADS)
Arca-Sedda, Manuel; Kocsis, Bence; Brandt, Timothy D.
2018-06-01
The Milky Way centre exhibits an intense flux in the gamma and X-ray bands, whose origin is partly ascribed to the possible presence of a large population of millisecond pulsars (MSPs) and cataclysmic variables (CVs), respectively. However, the number of sources required to generate such an excess is much larger than what is expected from in situ star formation and evolution, opening a series of questions about the formation history of the Galactic nucleus. In this paper we make use of direct N-body simulations to investigate whether these sources could have been brought to the Galactic centre by a population of star clusters that underwent orbital decay and formed the Galactic nuclear star cluster (NSC). Our results suggest that the gamma ray emission is compatible with a population of MSPs that were mass segregated in their parent clusters, while the X-ray emission is consistent with a population of CVs born via dynamical interactions in dense star clusters. Combining observations with our modelling, we explore how the observed γ ray flux can be related to different NSC formation scenarios. Finally, we show that the high-energy emission coming from the galactic central regions can be used to detect black holes heavier than 10^5 M⊙ in nearby dwarf galaxies.
López-Sanz, David; Garcés, Pilar; Álvarez, Blanca; Delgado-Losada, María Luisa; López-Higes, Ramón; Maestú, Fernando
2017-12-01
Subjective Cognitive Decline (SCD) is a largely unknown state thought to represent a preclinical stage of Alzheimer's Disease (AD) that precedes mild cognitive impairment (MCI). However, the course of network disruption in these stages is scarcely characterized. We employed resting state magnetoencephalography in the source space to calculate network smallworldness, clustering, modularity and transitivity. Nodal measures (clustering and node degree) as well as modular partitions were compared between groups. The MCI group exhibited decreased smallworldness, clustering and transitivity and increased modularity in theta and beta bands. SCD showed similar but smaller changes in clustering and transitivity, while exhibiting alterations in the alpha band in the opposite direction to those shown by MCI for modularity and transitivity. At the node level, MCI disrupted both clustering and nodal degree while SCD showed minor changes in the latter. Additionally, we observed an increase in modular partition variability in both SCD and MCI in theta and beta bands. SCD elders exhibit a significant network disruption, showing intermediate values between the healthy control (HC) and MCI groups in multiple parameters. These results highlight the relevance of cognitive concerns in the clinical setting and suggest that network disorganization in AD could start in the preclinical stages before the onset of cognitive symptoms.
Understanding the determinants of volatility clustering in terms of stationary Markovian processes
NASA Astrophysics Data System (ADS)
Miccichè, S.
2016-11-01
Volatility is a key variable in the modeling of financial markets. The most striking feature of volatility is that it is a long-range correlated stochastic variable, i.e. its autocorrelation function decays as a power law τ^(-β) for large time lags τ. In the present work we investigate the determinants of this feature, starting from the empirical observation that the exponent β of a given stock's volatility is a linear function of the average correlation of that stock's volatility with all other volatilities. We propose a simple approach that consists of diagonalizing the cross-correlation matrix of volatilities and investigating whether or not the diagonalized volatilities still keep some of the stylized facts of the original volatilities. We find that the diagonalized volatilities share with the original volatilities both the power-law decay of the probability density function and the power-law decay of the autocorrelation function. This would indicate that volatility clustering is already present in the diagonalized, uncorrelated volatilities. We therefore present a parsimonious univariate model based on a non-linear Langevin equation that reproduces these two stylized facts of volatility well. The model helps us understand that the main source of volatility clustering, once volatilities have been diagonalized, is that the economic forces driving volatility can be modeled in terms of a Smoluchowski potential with logarithmic tails.
NASA Astrophysics Data System (ADS)
Karimi, Hamed; Rosenberg, Gili; Katzgraber, Helmut G.
2017-10-01
We present and apply a general-purpose, multistart algorithm for improving the performance of low-energy samplers used for solving optimization problems. The algorithm iteratively fixes the value of a large portion of the variables to values that have a high probability of being optimal. The resulting problems are smaller and less connected, and samplers tend to give better low-energy samples for these problems. The algorithm is trivially parallelizable since each start in the multistart algorithm is independent, and could be applied to any heuristic solver that can be run multiple times to give a sample. We present results for several classes of hard problems solved using simulated annealing, path-integral quantum Monte Carlo, parallel tempering with isoenergetic cluster moves, and a quantum annealer, and show that the success metrics and the scaling are improved substantially. When combined with this algorithm, the quantum annealer's scaling was substantially improved for native Chimera graph problems. In addition, with this algorithm the scaling of the time to solution of the quantum annealer is comparable to the Hamze-de Freitas-Selby algorithm on the weak-strong cluster problems introduced by Boixo et al. Parallel tempering with isoenergetic cluster moves was able to consistently solve three-dimensional spin glass problems with 8000 variables when combined with our method, whereas without our method it could not solve any.
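The variable-fixing step described above can be illustrated with a minimal sketch. It assumes spin variables taking values +1 or -1 and a simple agreement threshold across low-energy samples. The function name, the threshold value, and the hand-written samples are hypothetical, and the authors' actual selection rule is not given in the abstract.

```python
def fix_persistent_variables(samples, threshold=0.9):
    """Return {index: value} for variables that agree across most samples.

    samples   -- list of equal-length lists of +1/-1 spins (low-energy samples)
    threshold -- minimum absolute mean value required to fix a variable
                 (an assumed rule, chosen only for illustration)
    """
    n_vars = len(samples[0])
    fixed = {}
    for i in range(n_vars):
        mean = sum(sample[i] for sample in samples) / len(samples)
        if abs(mean) >= threshold:
            fixed[i] = 1 if mean > 0 else -1
    return fixed


# Hypothetical samples: variables 0 and 1 are persistent, variable 2 is not.
samples = [
    [1, -1, 1],
    [1, -1, -1],
    [1, -1, 1],
    [1, -1, -1],
]
fixed = fix_persistent_variables(samples)
```

The variables left unfixed define a smaller residual problem that could be handed back to the same sampler, which is the iterative idea the abstract describes.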
Impact of heuristics in clustering large biological networks.
Shafin, Md Kishwar; Kabir, Kazi Lutful; Ridwan, Iffatur; Anannya, Tasmiah Tamzid; Karim, Rashid Saadman; Hoque, Mohammad Mozammel; Rahman, M Sohel
2015-12-01
Traditional clustering algorithms often exhibit poor performance for large networks. On the contrary, greedy algorithms are found to be relatively efficient while uncovering functional modules from large biological networks. The quality of the clusters produced by these greedy techniques largely depends on the underlying heuristics employed. Different heuristics based on different attributes and properties perform differently in terms of the quality of the clusters produced. This motivates us to design new heuristics for clustering large networks. In this paper, we have proposed two new heuristics and analyzed the performance thereof after incorporating those with three different combinations in a recently celebrated greedy clustering algorithm named SPICi. We have extensively analyzed the effectiveness of these new variants. The results are found to be promising. Copyright © 2015 Elsevier Ltd. All rights reserved.
Two dynamic regimes in the human gut microbiome
Smillie, Chris S.; Alm, Eric J.
2017-01-01
The gut microbiome is a dynamic system that changes with host development, health, behavior, diet, and microbe-microbe interactions. Prior work on gut microbial time series has largely focused on autoregressive models (e.g. Lotka-Volterra). However, we show that most of the variance in microbial time series is non-autoregressive. In addition, we show how community state-clustering is flawed when it comes to characterizing within-host dynamics and that more continuous methods are required. Most organisms exhibited stable, mean-reverting behavior suggestive of fixed carrying capacities and abundant taxa were largely shared across individuals. This mean-reverting behavior allowed us to apply sparse vector autoregression (sVAR)—a multivariate method developed for econometrics—to model the autoregressive component of gut community dynamics. We find a strong phylogenetic signal in the non-autoregressive co-variance from our sVAR model residuals, which suggests niche filtering. We show how changes in diet are also non-autoregressive and that Operational Taxonomic Units strongly correlated with dietary variables have much less of an autoregressive component to their variance, which suggests that diet is a major driver of microbial dynamics. Autoregressive variance appears to be driven by multi-day recovery from frequent facultative anaerobe blooms, which may be driven by fluctuations in luminal redox. Overall, we identify two dynamic regimes within the human gut microbiota: one likely driven by external environmental fluctuations, and the other by internal processes. PMID:28222117
Two dynamic regimes in the human gut microbiome.
Gibbons, Sean M; Kearney, Sean M; Smillie, Chris S; Alm, Eric J
2017-02-01
The gut microbiome is a dynamic system that changes with host development, health, behavior, diet, and microbe-microbe interactions. Prior work on gut microbial time series has largely focused on autoregressive models (e.g. Lotka-Volterra). However, we show that most of the variance in microbial time series is non-autoregressive. In addition, we show how community state-clustering is flawed when it comes to characterizing within-host dynamics and that more continuous methods are required. Most organisms exhibited stable, mean-reverting behavior suggestive of fixed carrying capacities and abundant taxa were largely shared across individuals. This mean-reverting behavior allowed us to apply sparse vector autoregression (sVAR)-a multivariate method developed for econometrics-to model the autoregressive component of gut community dynamics. We find a strong phylogenetic signal in the non-autoregressive co-variance from our sVAR model residuals, which suggests niche filtering. We show how changes in diet are also non-autoregressive and that Operational Taxonomic Units strongly correlated with dietary variables have much less of an autoregressive component to their variance, which suggests that diet is a major driver of microbial dynamics. Autoregressive variance appears to be driven by multi-day recovery from frequent facultative anaerobe blooms, which may be driven by fluctuations in luminal redox. Overall, we identify two dynamic regimes within the human gut microbiota: one likely driven by external environmental fluctuations, and the other by internal processes.
Kinetics of copper growth on graphene revealed by time-resolved small-angle x-ray scattering
NASA Astrophysics Data System (ADS)
Hodas, M.; Siffalovic, P.; Jergel, M.; Pelletta, M.; Halahovets, Y.; Vegso, K.; Kotlar, M.; Majkova, E.
2017-01-01
Metal growth on graphene has many applications. Transition metals are known to favor three-dimensional (3D) cluster growth on graphene. Copper is of particular interest for cost-effective surface-supported catalysis applications and as a contact material in electronics. This paper presents an in situ real-time study of Cu growth kinetics on graphene covering all stages preceding formation of a continuous film performed by laboratory-based grazing-incidence small-angle x-ray scattering (GISAXS) technique. In particular, nucleation and 3D cluster growth, coalescence, and percolation stages were identified. The cluster nucleation saturates after reaching a density of 10^12 cm^-2 at ≈1 monolayer thickness. A Kratky plot and a paracrystal model with cumulative structural disorder were necessary to evaluate properly cluster growth and coalescence, respectively. The power law scaling constants 0.27 ± 0.05 and 0.81 ± 0.02 of the temporal evolution of Cu cluster size suggest the growth of isolated clusters and dynamic cluster coalescence keeping the cluster shape, respectively. Coalescence and percolation thresholds occur at Cu thicknesses of 2 ± 0.4 and 8.8 ± 0.7 nm, respectively. This paper demonstrates the potential of laboratory-based in situ GISAXS as a vital diagnostic tool for tailoring a large variety of Cu nanostructures on graphene based on an in situ Cu growth monitoring which is applicable in a broad range of deposition times.
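The scaling constants quoted above are exponents of a power law in which cluster size grows as time raised to a fixed power. Such an exponent can be estimated as the slope of a straight-line fit in log-log space. The sketch below uses synthetic, noise-free data and a hypothetical helper name; it is not the authors' GISAXS analysis.

```python
import math


def fit_power_law_exponent(times, sizes):
    """Least-squares slope of log(size) against log(time)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(s) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Synthetic example: sizes follow 3 * t**0.27 exactly (illustrative values only).
times = [1.0, 2.0, 4.0, 8.0, 16.0]
sizes = [3.0 * t ** 0.27 for t in times]
alpha = fit_power_law_exponent(times, sizes)
```

With measured data the fit would normally be restricted to one growth stage at a time, since the abstract reports different exponents for isolated growth and for coalescence.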
NASA Astrophysics Data System (ADS)
Caputo, F.; Castellani, V.; Quarta, M. L.
1985-02-01
It is shown that pulsational properties of RR Lyrae variables in globular clusters can be used to put theoretical constraints on the values of cluster reddening and distance modulus. By requiring that the HR diagram location of pulsators agrees with the period distribution observed and with the theoretical boundaries of the instability strip, reddening and distance modulus of the globular cluster M4 are derived as a (slow) function of the pulsator masses. Thus, a best guess is presented for the cluster age (t = 12.2 billion years), some evidence for a non-canonical evolutionary having been taken into account.
A PRECISE CLUSTER MASS PROFILE AVERAGED FROM THE HIGHEST-QUALITY LENSING DATA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Umetsu, Keiichi; Broadhurst, Tom; Zitrin, Adi
2011-09-01
We outline our methods for obtaining high-precision mass profiles, combining independent weak-lensing distortion, magnification, and strong-lensing measurements. For massive clusters, the strong- and weak-lensing regimes contribute equal logarithmic coverage of the radial profile. The utility of high-quality data is limited by the cosmic noise from large-scale structure along the line of sight. This noise is overcome when stacking clusters, as too are the effects of cluster asphericity and substructure, permitting a stringent test of theoretical models. We derive a mean radial mass profile of four similar mass clusters of high-quality Hubble Space Telescope and Subaru images, in the range $R = 40$-$2800$ kpc $h^{-1}$, where the inner radial boundary is sufficiently large to avoid smoothing from miscentering effects. The stacked mass profile is detected at $58\sigma$ significance over the entire radial range, with the contribution from the cosmic noise included. We show that the projected mass profile has a continuously steepening gradient out to beyond the virial radius, in remarkably good agreement with the standard Navarro-Frenk-White form predicted for the family of cold dark matter (CDM) dominated halos in gravitational equilibrium. The central slope is constrained to lie in the range $-d\ln\rho/d\ln r = 0.89^{+0.27}_{-0.39}$. The mean concentration is $c_{\rm vir} = 7.68^{+0.42}_{-0.40}$ (at $M_{\rm vir} = 1.54^{+0.11}_{-0.10} \times 10^{15}\,M_\odot\,h^{-1}$), which is high for relaxed, high-mass clusters, but consistent with ΛCDM when a sizable projection bias estimated from N-body simulations is considered. This possible tension will be more definitively explored with new cluster surveys, such as CLASH, LoCuSS, Subaru Hyper Suprime-Cam, and XMM-XXL, to construct the $c_{\rm vir}$-$M_{\rm vir}$ relation over a wider mass range.
NASA Astrophysics Data System (ADS)
Kuo, Yi-Ming; Liu, Wen-Wen
2015-04-01
The Han River basin is one of the most important industrial and grain production bases in central China. Many factories and towns have been established along the river, with large farmlands located nearby. In the last few decades the water quality of the Han River, specifically in the middle and lower reaches, has gradually declined. Agricultural nonpoint pollution and municipal and industrial point pollution significantly degrade the water quality of the Han River. Factor analysis can be applied to reduce the dimensionality of a data set consisting of a large number of inter-related variables. Cluster analysis can classify samples according to their similar characteristics. In this study, factor analysis is used to identify major pollution indicators, and cluster analysis is employed to classify the samples based on the sample locations and hydrochemical variables. Water samples were collected at 12 sampling sites from Xiangyang City (middle Han River) to Wuhan City (lower Han River). Correlations among 25 hydrochemical variables are statistically examined. The important pollutants are determined by factor analysis. A three-factor model is determined and explains over 85% of the total river water quality variation. Factor 1, including SS, Chl-a, TN and TP, can be considered as nonpoint source pollution. Factor 2, including Cl⁻, Br⁻, SO₄²⁻, Ca²⁺, Mg²⁺, K⁺, Fe²⁺ and PO₄³⁻, can be treated as industrial pollution. Factor 3, including F⁻ and NO₃⁻, reflects the influence of the groundwater or the self-purification capability of the river water. The various land uses along the Han River correlate well with the pollution types. In addition, the results showed that the water quality of the Han River deteriorated gradually from the middle to the lower reaches. Some tributaries have been seriously polluted and significantly influence the mainstream water quality of the Han River. Finally, the results showed that nonpoint pollution and point pollution both significantly influence water quality in the middle and lower Han River. This study provides an effective method for watershed management and pollution control in the Han River.
The Cluster AgeS Experiment (CASE). Variable Stars in the Field of the Globular Cluster M22
NASA Astrophysics Data System (ADS)
Rozyczka, M.; Thompson, I. B.; Pych, W.; Narloch, W.; Poleski, R.; Schwarzenberg-Czerny, A.
2017-09-01
The field of the globular cluster M22 (NGC 6656) was monitored between 2000 and 2008 in a search for variable stars. BV light curves were obtained for 359 periodic, likely periodic, and long-term variables, 238 of which are new detections. 39 newly detected variables, and 63 previously known ones are members or likely members of the cluster, including 20 SX Phe, 10 RRab and 16 RRc type pulsators, one BL Her type pulsator, 21 contact binaries, and 9 detached or semi-detached eclipsing binaries. The most interesting among the identified objects are V112 - a bright multimode SX Phe pulsator, V125 - a β Lyr type binary on the blue horizontal branch, V129 - a blue/yellow straggler with a W UMa-like light curve, located halfway between the extreme horizontal branch and red giant branch, and V134 - an extreme horizontal branch object with P=2.33 d and a nearly sinusoidal light curve. All four of them are proper motion members of the cluster. Among nonmembers, a P=2.83 d detached eclipsing binary hosting a δ Sct type pulsator was found, and a peculiar P=0.93 d binary with ellipsoidal modulation and narrow minimum in the middle of one of the descending shoulders of the sinusoid. We also collected substantial new data for previously known variables. In particular we revise the statistics of the occurrence of the Blazhko effect in RR Lyr type variables of M22.
Dark energy and modified gravity in the Effective Field Theory of Large-Scale Structure
NASA Astrophysics Data System (ADS)
Cusin, Giulia; Lewandowski, Matthew; Vernizzi, Filippo
2018-04-01
We develop an approach to compute observables beyond the linear regime of dark matter perturbations for general dark energy and modified gravity models. We do so by combining the Effective Field Theory of Dark Energy and Effective Field Theory of Large-Scale Structure approaches. In particular, we parametrize the linear and nonlinear effects of dark energy on dark matter clustering in terms of the Lagrangian terms introduced in a companion paper [1], focusing on Horndeski theories and assuming the quasi-static approximation. The Euler equation for dark matter is sourced, via the Newtonian potential, by new nonlinear vertices due to modified gravity and, as in the pure dark matter case, by the effects of short-scale physics in the form of the divergence of an effective stress tensor. The effective fluid introduces a counterterm in the solution to the matter continuity and Euler equations, which allows a controlled expansion of clustering statistics on mildly nonlinear scales. We use this setup to compute the one-loop dark-matter power spectrum.
2017-10-01
Facility is a large-scale cascade that allows detailed flow field surveys and blade surface measurements [10–12]. The facility has a continuous run ... structured grids at 2 flow conditions, cruise and takeoff, of the VSPT blade. Computations were run in parallel on a Department of Defense ... RANS/LES) and Unsteady RANS Predictions of Separated Flow for a Variable-Speed Power-Turbine Blade Operating with Low Inlet Turbulence Levels
Li, Nicole; Yan, Lijing L; Niu, Wenyi; Labarthe, Darwin; Feng, Xiangxian; Shi, Jingpu; Zhang, Jianxin; Zhang, Ruijuan; Zhang, Yuhong; Chu, Hongling; Neiman, Andrea; Engelgau, Michael; Elliott, Paul; Wu, Yangfeng; Neal, Bruce
2013-11-01
Cardiovascular diseases are the leading cause of death and disability in China. High blood pressure caused by excess intake of dietary sodium is widespread, and an effective sodium reduction program has the potential to improve cardiovascular health. This study is a large-scale, cluster-randomized trial conducted in five northern Chinese provinces. Two counties have been selected from each province and 12 townships in each county, making a total of 120 clusters. Within each township one village has been selected for participation with 1:1 randomization stratified by county. The sodium reduction intervention comprises community health education and a food supply strategy based upon providing access to salt substitute. Subsidization of the price of salt substitute was done in 30 intervention villages selected at random. Control villages continued usual practices. The primary outcome for the study is dietary sodium intake level estimated from assays of 24-hour urine. The trial recruited and randomized 120 townships in April 2011. The sodium reduction program was commenced in the 60 intervention villages between May and June of that year with outcome surveys scheduled for October to December 2012. Baseline data collection shows that randomization achieved good balance across groups. The establishment of the China Rural Health Initiative has enabled the launch of this large-scale trial designed to identify a novel, scalable strategy for reduction of dietary sodium and control of blood pressure. If proved effective, the intervention could plausibly be implemented at low cost in large parts of China and other countries worldwide. © 2013.
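The allocation scheme described above (1:1 randomization of clusters, stratified by county) can be sketched as follows. This is a minimal illustration under stated assumptions: the county and township labels, the function name, and the random seed are hypothetical, and the abstract does not describe the trial's actual randomization procedure.

```python
import random


def randomize_by_county(townships_by_county, seed=2011):
    """Assign half of each county's townships to each arm (1:1 within strata)."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    allocation = {}
    for county, townships in townships_by_county.items():
        shuffled = list(townships)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for township in shuffled[:half]:
            allocation[township] = "intervention"
        for township in shuffled[half:]:
            allocation[township] = "control"
    return allocation


# Hypothetical labels: 10 counties (2 per province x 5 provinces), 12 townships each.
townships_by_county = {
    f"county_{c}": [f"county_{c}_township_{t}" for t in range(12)]
    for c in range(10)
}
allocation = randomize_by_county(townships_by_county)
```

Stratifying by county guarantees that every county contributes six intervention and six control clusters, so the 60/60 split in the abstract holds by construction rather than by chance.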
Search for Pulsating Stars in the Open Cluster NGC 1502
NASA Astrophysics Data System (ADS)
Stęślicki, M.
2006-04-01
We present results of a variability search in the field of the young open cluster NGC 1502. We confirm that the beta Cephei suspect WEBDA 26 is indeed pulsating with a period of 0.09612 d and a semi-amplitude of about 3 mmag in V. A new VI light curve of the bright eclipsing binary and cluster member SZ Cam was obtained. In addition, we found two new variable stars. One is an interesting eclipsing binary showing total eclipses, which can be used to derive the distance to the cluster once radial velocities of the components are obtained.
Efficiency Improvements to the Displacement Based Multilevel Structural Optimization Algorithm
NASA Technical Reports Server (NTRS)
Plunkett, C. L.; Striz, A. G.; Sobieszczanski-Sobieski, J.
2001-01-01
Multilevel Structural Optimization (MSO) continues to be an area of research interest in engineering optimization. In the present project, the weight optimization of beams and trusses using Displacement based Multilevel Structural Optimization (DMSO), a member of the MSO set of methodologies, is investigated. In the DMSO approach, the optimization task is subdivided into a single system and multiple subsystems level optimizations. The system level optimization minimizes the load unbalance resulting from the use of displacement functions to approximate the structural displacements. The function coefficients are then the design variables. Alternately, the system level optimization can be solved using the displacements themselves as design variables, as was shown in previous research. Both approaches ensure that the calculated loads match the applied loads. In the subsystems level, the weight of the structure is minimized using the element dimensions as design variables. The approach is expected to be very efficient for large structures, since parallel computing can be utilized in the different levels of the problem. In this paper, the method is applied to a one-dimensional beam and a large three-dimensional truss. The beam was tested to study possible simplifications to the system level optimization. In previous research, polynomials were used to approximate the global nodal displacements. The number of coefficients of the polynomials equally matched the number of degrees of freedom of the problem. Here it was desired to see if it is possible to only match a subset of the degrees of freedom in the system level. This would lead to a simplification of the system level, with a resulting increase in overall efficiency. However, the methods tested for this type of system level simplification did not yield positive results. The large truss was utilized to test further improvements in the efficiency of DMSO. 
In previous work, parallel processing was applied to the subsystems level, where the derivative verification feature of the optimizer NPSOL had been utilized in the optimizations. This resulted in large runtimes. In this paper, the optimizations were repeated without using the derivative verification, and the results are compared to those from the previous work. Also, the optimizations were run on both a network of SUN workstations using the MPICH implementation of the Message Passing Interface (MPI) and on the faster Beowulf cluster at ICASE, NASA Langley Research Center, using the LAM implementation of MPI. The results on both systems were consistent and showed that it is not necessary to verify the derivatives and that this gives a large increase in efficiency of the DMSO algorithm.
Analysis on flood generation processes by means of a continuous simulation model
NASA Astrophysics Data System (ADS)
Fiorentino, M.; Gioia, A.; Iacobellis, V.; Manfreda, S.
2006-03-01
In the present research, we exploited a continuous hydrological simulation to investigate the key variables responsible for flood peak formation. For this purpose, a distributed hydrological model (DREAM) is used in cascade with a rainfall generator (IRP-Iterated Random Pulse) to simulate a large number of extreme events, providing insight into the main controls of flood generation mechanisms. The investigated variables are those used in the theoretically derived probability distribution of floods based on the concept of partial contributing area (e.g. Iacobellis and Fiorentino, 2000). The continuous simulation model is used to investigate the hydrological losses occurring during extreme events, the variability of the source area contributing to the flood peak, and its lag-time. Results suggest interesting simplifications of the theoretical probability distribution of floods according to the different climatic and geomorphologic environments. The study is applied to two basins located in Southern Italy with different climatic characteristics.
Yokoyama, Eiji; Uchimura, Masako
2007-11-01
Ninety-five enterohemorrhagic Escherichia coli serovar O157 strains, including 30 strains isolated from 13 intrafamily outbreaks and 14 strains isolated from 3 mass outbreaks, were studied by pulsed-field gel electrophoresis (PFGE) and variable number of tandem repeats (VNTR) typing, and the resulting data were subjected to cluster analysis. Cluster analysis of the VNTR typing data revealed that 57 (60.0%) of 95 strains, including all epidemiologically linked strains, formed clusters with at least 95% similarity. Cluster analysis of the PFGE patterns revealed that 67 (70.5%) of 95 strains, including all but 1 of the epidemiologically linked strains, formed clusters with 90% similarity. The number of epidemiologically unlinked strains forming clusters was significantly less by VNTR cluster analysis than by PFGE cluster analysis. The congruence value between PFGE and VNTR cluster analysis was low and did not show an obvious correlation. With two-step cluster analysis, the number of clustered epidemiologically unlinked strains by PFGE cluster analysis that were divided by subsequent VNTR cluster analysis was significantly higher than the number by VNTR cluster analysis that were divided by subsequent PFGE cluster analysis. These results indicate that VNTR cluster analysis is more efficient than PFGE cluster analysis as an epidemiological tool to trace the transmission of enterohemorrhagic E. coli O157.
A balanced hazard ratio for risk group evaluation from survival data.
Branders, Samuel; Dupont, Pierre
2015-07-30
Common clinical studies assess the quality of prognostic factors, such as gene expression signatures, clinical variables or environmental factors, and cluster patients into various risk groups. Typical examples include cancer clinical trials where patients are clustered into high or low risk groups. Whenever applied to survival data analysis, such groups are intended to represent patients with similar survival odds and to select the most appropriate therapy accordingly. The relevance of such risk groups, and of the related prognostic factors, is typically assessed through the computation of a hazard ratio. We first stress three limitations of assessing risk groups through the hazard ratio: (1) it may promote the definition of arbitrarily unbalanced risk groups; (2) an apparently optimal group hazard ratio can be largely inconsistent with the p-value commonly associated with it; and (3) some marginal changes between risk group proportions may lead to highly different hazard ratio values. Those issues could lead to inappropriate comparisons between various prognostic factors. Next, we propose the balanced hazard ratio to solve those issues. This new performance metric keeps an intuitive interpretation and is as simple to compute. We also show how the balanced hazard ratio leads to a natural cut-off choice to define risk groups from continuous risk scores. The proposed methodology is validated through controlled experiments for which a prescribed cut-off value is defined by design. Further results are also reported on several cancer prognosis studies, and the proposed methodology could be applied more generally to assess the quality of any prognostic markers. Copyright © 2015 John Wiley & Sons, Ltd.
Features and heterogeneities in growing network models
NASA Astrophysics Data System (ADS)
Ferretti, Luca; Cortelezzi, Michele; Yang, Bin; Marmorini, Giacomo; Bianconi, Ginestra
2012-06-01
Many complex networks from the World Wide Web to biological networks grow taking into account the heterogeneous features of the nodes. The feature of a node might be a discrete quantity such as a classification of a URL document such as personal page, thematic website, news, blog, search engine, social network, etc., or the classification of a gene in a functional module. Moreover, the feature of a node can be a continuous variable such as the position of a node in the embedding space. In order to account for these properties, in this paper we provide a generalization of growing network models with preferential attachment that includes the effect of heterogeneous features of the nodes. The main effect of heterogeneity is the emergence of an “effective fitness” for each class of nodes, determining the rate at which nodes acquire new links. The degree distribution exhibits a multiscaling behavior analogous to the fitness model. This property is robust with respect to variations in the model, as long as links are assigned through effective preferential attachment. Beyond the degree distribution, in this paper we give a full characterization of the other relevant properties of the model. We evaluate the clustering coefficient and show that it disappears for large network size, a property shared with the Barabási-Albert model. Negative degree correlations are also present in this class of models, along with nontrivial mixing patterns among features. We therefore conclude that both small clustering coefficients and disassortative mixing are outcomes of the preferential attachment mechanism in general growing networks.
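The growth rule described in this abstract can be illustrated with a minimal sketch. It assumes a deliberately simplified rule that is not the authors' general model: each new node links to existing nodes with probability proportional to (fitness of the target's class) × (target's degree). The function name `grow_network`, the class labels, and the fitness values are hypothetical.

```python
import random

def grow_network(n_nodes, class_fitness, m=1, seed=0):
    """Grow a network by fitness-weighted preferential attachment.

    class_fitness maps a class label to a positive fitness value
    (an illustrative assumption, not a quantity from the paper).
    """
    rng = random.Random(seed)
    classes = list(class_fitness)
    # Start from two connected nodes with randomly assigned classes.
    node_class = [rng.choice(classes), rng.choice(classes)]
    degree = [1, 1]
    edges = [(0, 1)]
    for new in range(2, n_nodes):
        # Attachment weight of each existing node: class fitness times degree.
        weights = [class_fitness[node_class[i]] * degree[i] for i in range(new)]
        targets = set()
        while len(targets) < min(m, new):
            targets.add(rng.choices(range(new), weights=weights)[0])
        node_class.append(rng.choice(classes))
        degree.append(0)
        for t in targets:
            edges.append((new, t))
            degree[t] += 1
            degree[new] += 1
    return node_class, degree, edges
```

Comparing the mean degree per class for, say, `{"blog": 1.0, "news": 3.0}` gives a rough feel for how a higher class fitness translates into a faster rate of link acquisition.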
Subjective well-being among young people in five Eastern European countries.
Lim, M S C; Cappa, C; Patton, G C
2017-01-01
Subjective well-being incorporates elements of life satisfaction, happiness and optimism. It is increasingly relevant in the assessment of population health and economic development. There are strong continuities in well-being from youth into later life. Despite its significance, few global surveys capture subjective well-being. This paper describes patterns of well-being among young people in five Eastern European countries [Belarus, Bosnia and Herzegovina (BiH), the Former Yugoslav Republic of Macedonia, Serbia and Ukraine] and investigates association between demographic factors and well-being. Nationally representative household surveys, including large Roma population samples, were conducted as part of UNICEF's Multiple Indicator Cluster Survey programme. Young people aged 15-24 years ( N = 11 944) indicated their satisfaction with life, happiness and expectations about the future. Multilevel logistic regressions were conducted to determine the impact of individual-level predictors while accounting for country- and cluster-level variability. Around 40% of young people considered themselves very happy or very satisfied with their life overall. Three quarters reported optimism. Yet well-being varied greatly between countries, with youth in BiH and Ukraine reporting lowest levels of well-being. Current marriage, increasing wealth, higher education, rural residence and not having children were associated with greater well-being. Patterns of well-being in youth vary substantially between countries and are only partly accounted for by standard demographic characteristics. Despite higher rates of adolescent marriage and childbearing, and lower levels of educational attainment and employment, Roma youth had similar levels of well-being to the general population.
Adalsteinsson, David; McMillen, David; Elston, Timothy C
2004-03-08
Intrinsic fluctuations due to the stochastic nature of biochemical reactions can have large effects on the response of biochemical networks. This is particularly true for pathways that involve transcriptional regulation, where generally there are two copies of each gene and the number of messenger RNA (mRNA) molecules can be small. Therefore, there is a need for computational tools for developing and investigating stochastic models of biochemical networks. We have developed the software package Biochemical Network Stochastic Simulator (BioNetS) for efficiently and accurately simulating stochastic models of biochemical networks. BioNetS has a graphical user interface that allows models to be entered in a straightforward manner, and allows the user to specify the type of random variable (discrete or continuous) for each chemical species in the network. The discrete variables are simulated using an efficient implementation of the Gillespie algorithm. For the continuous random variables, BioNetS constructs and numerically solves the appropriate chemical Langevin equations. The software package has been developed to scale efficiently with network size, thereby allowing large systems to be studied. BioNetS runs as a BioSpice agent and can be downloaded from http://www.biospice.org. BioNetS also can be run as a stand alone package. All the required files are accessible from http://x.amath.unc.edu/BioNetS. We have developed BioNetS to be a reliable tool for studying the stochastic dynamics of large biochemical networks. Important features of BioNetS are its ability to handle hybrid models that consist of both continuous and discrete random variables and its ability to model cell growth and division. We have verified the accuracy and efficiency of the numerical methods by considering several test systems.
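The discrete simulation approach named in this abstract can be sketched with the Gillespie direct method applied to a hypothetical birth-death model of mRNA (production at constant rate `k_prod`, degradation at rate `k_deg` per molecule). The model, the parameter names, and the function name are assumptions for illustration and are not taken from BioNetS itself.

```python
import random

def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Gillespie direct method for: 0 -> mRNA (k_prod), mRNA -> 0 (k_deg * n)."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod        # propensity of the production reaction
        a_deg = k_deg * n      # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break              # no reaction can fire any more
        # The waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts
```

For this toy model the long-run mean copy number should fluctuate around `k_prod / k_deg`, which gives a quick sanity check on a trajectory.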
VizieR Online Data Catalog: Catalogue of variable stars in open clusters (Zejda+, 2012)
NASA Astrophysics Data System (ADS)
Zejda, M.; Paunzen, E.; Baumann, B.; Mikulasek, Z.; Liska, J.
2012-08-01
The catalogue of variable stars in open clusters was prepared by cross-matching the Variable Stars Index (http://www.aavso.org/vsx), version Apr 29, 2012 (available online, Cat. B/vsx), against version 3.1 of the catalogue of open clusters DAML02 (Dias et al. 2002A&A...389..871D, Cat. B/ocl), available on the website http://www.astro.iag.usp.br/~wilton. The open clusters were divided into two categories according to their size, where the limiting diameter was 60 arcmin. The list of all suspected variables and variable stars located within the fields of open clusters, up to twice the given cluster radius, was generated (Table 1). 8938 and 9127 variable stars are given in 461 "smaller" and 74 "larger" clusters, respectively. All found variable stars were matched against the PPMXL catalog of positions and proper motions within the ICRS (Roeser et al., 2010AJ....139.2440R, Cat. I/317). Proper motion data were included in our catalogue. Unfortunately, a homogeneous data set of mean cluster proper motions has not been available until now. Therefore we used the following sources (sorted alphabetically) to compile a new catalogue: Baumgardt et al. (2000, Cat. J/A+AS/146/251): based on the Hipparcos catalogue; Beshenov & Loktin (2004A&AT...23..103B): based on the Tycho-2 catalogue; Dias et al. (2001, Cat. J/A+A/376/441, 2002A&A...389..871D, Cat. B/ocl): based on the Tycho-2 catalogue; Dias et al. (2006, Cat. J/A+A/446/949): based on the UCAC2 catalog (Zacharias et al., 2004AJ....127.3043Z, Cat. I/289); Frinchaboy & Majewski (2008, Cat. J/AJ/136/118): based on the Tycho-2 catalogue; Kharchenko et al. (2005, J/A+A/438/1163): based on the ASCC2.5 catalogue (Kharchenko, 2001KFNT...17..409K, Cat. I/280); Krone-Martins et al. (2010, Cat. J/A+A/516/A3): based on the Bordeaux PM2000 proper motion catalogue (Ducourant et al., 2006A&A...448.1235D, Cat. I/300); Robichon et al. (1999, Cat. J/A+A/345/471): based on the Hipparcos catalogue; van Leeuwen (2009A&A...497..209V): based on the new Hipparcos catalogue. In total, a catalogue of proper motions for 879 open clusters (Table 2), of which 436 have more than one available measurement, was compiled. (3 data files).
The mental health impact of 9/11 on inner-city high school students 20 miles north of Ground Zero.
Calderoni, Michele E; Alderman, Elizabeth M; Silver, Ellen J; Bauman, Laurie J
2006-07-01
To determine the rate of post-traumatic stress disorder (PTSD) after 9/11 in a sample of New York City high school students and associations among personal exposure, loss of psychosocial resources, prior mental health treatment, and PTSD. A total of 1214 students (grades 9 through 12) attending a large community high school in Bronx County, 20 miles north of "Ground Zero," completed a 45-item questionnaire during gym class on one day eight months after 9/11. Students were primarily Hispanic (62%) and African American (29%) and lived in the surrounding neighborhood. The questionnaire included the PCL-T, a 17-item PTSD checklist supplied by the Office of Behavioral and Social Science Research of the National Institutes of Health (NIH). The PCL-T was scored following the DSM-IV criteria for PTSD requiring endorsement of at least one repeating symptom, two hyperarousal symptoms, and three avoidance symptoms. Bivariate analysis comparing PTSD with personal exposure, loss of psychosocial resources, and mental health variables was done and multiple logistic regression was used to identify significant associations. The PTSD symptom cluster was present in 7.4% of students. Bivariate analysis showed a trend for females to have higher rates of PTSD (males [6%] vs. females [9%], p = .06) with no overall ethnic differences. Five of the six personal exposure variables, both of the loss of psychosocial resources variables, and both of the mental health variables were significantly associated with the PTSD symptom cluster. 
Multiple logistic regression analysis found that one personal exposure variable (having financial difficulties after 9/11, odds ratio [OR] = 5.27; 95% confidence interval [CI] 2.9-9.7); both of the loss of psychosocial resources variables (currently feeling less safe, OR = 3.58; 95% CI 1.9-6.8; and currently feeling less protected by the government, OR = 4.04; 95% CI 2.1-7.7); and one mental health variable (use of psychotropic medication before 9/11, OR = 3.95; 95% CI 1.2-13.0) were significantly associated with the PTSD symptom cluster. We found a rate of PTSD in Bronx students after 9/11 that was much higher than in other large studies of PTSD in adolescents done before 9/11. Adolescents living in inner cities with high poverty and violence rates may be at high risk for PTSD after a terrorist attack. Students who still felt vulnerable and less safe eight months later and those with prior mental health treatment were four times more likely to have PTSD than those without such characteristics, highlighting the influence of personality and mental health on development of PTSD after a traumatic event.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deshpande, Amruta J.; Hughes, John P.; Wittman, David, E-mail: amrejd@physics.rutgers.edu, E-mail: jph@physics.rutgers.edu, E-mail: dwittman@physics.ucdavis.edu
We continue the study of the first sample of shear-selected clusters from the initial 8.6 square degrees of the Deep Lens Survey (DLS); a sample with well-defined selection criteria corresponding to the highest ranked shear peaks in the survey area. We aim to characterize the weak lensing selection by examining the sample’s X-ray properties. There are multiple X-ray clusters associated with nearly all the shear peaks: 14 X-ray clusters corresponding to seven DLS shear peaks. An additional three X-ray clusters cannot be definitively associated with shear peaks, mainly due to large positional offsets between the X-ray centroid and the shear peak. Here we report on the XMM-Newton properties of the 17 X-ray clusters. The X-ray clusters display a wide range of luminosities and temperatures; the L_X − T_X relation we determine for the shear-associated X-ray clusters is consistent with X-ray cluster samples selected without regard to dynamical state, while it is inconsistent with self-similarity. For a subset of the sample, we measure X-ray masses using temperature as a proxy, and compare to weak lensing masses determined by the DLS team. The resulting mass comparison is consistent with equality. The X-ray and weak lensing masses show considerable intrinsic scatter (∼48%), which is consistent with X-ray selected samples when their X-ray and weak lensing masses are independently determined.
Systematic Association of Genes to Phenotypes by Genome and Literature Mining
Jensen, Lars J; Perez-Iratxeta, Carolina; Kaczanowski, Szymon; Hooper, Sean D; Andrade, Miguel A
2005-01-01
One of the major challenges of functional genomics is to unravel the connection between genotype and phenotype. So far no global analysis has attempted to explore those connections in the light of the large phenotypic variability seen in nature. Here, we use an unsupervised, systematic approach for associating genes and phenotypic characteristics that combines literature mining with comparative genome analysis. We first mine the MEDLINE literature database for terms that reflect phenotypic similarities of species. Subsequently we predict the likely genomic determinants: genes specifically present in the respective genomes. In a global analysis involving 92 prokaryotic genomes we retrieve 323 clusters containing a total of 2,700 significant gene–phenotype associations. Some clusters contain mostly known relationships, such as genes involved in motility or plant degradation, often with additional hypothetical proteins associated with those phenotypes. Other clusters comprise unexpected associations; for example, a group of terms related to food and spoilage is linked to genes predicted to be involved in bacterial food poisoning. Among the clusters, we observe an enrichment of pathogenicity-related associations, suggesting that the approach reveals many novel genes likely to play a role in infectious diseases. PMID:15799710
2009-05-01
to result in any extrarenal effects. In the CVVH group the initial prescribed dose was variable based on hemodynamic compromise and perceived ... associated with a decrease in 28-day and hospital mortality when compared with a historical control group, which largely did not receive any form of renal ... BUN: blood urea nitrogen; CVVH: continuous venovenous hemofiltration; CVVHDF: continuous venovenous hemodiafiltration; ESRD: end-stage renal disease
The rotation of very low mass objects
NASA Astrophysics Data System (ADS)
Scholz, Alexander
2004-10-01
This dissertation contains an investigation of the rotation of very low mass objects, i.e. Brown Dwarfs and stars with masses <0.4 MS. Today, it is well-established that there are large populations of such VLM objects in open clusters and in the field, but our knowledge about their physical properties and evolution is still very limited. Contrary to their solar-mass siblings, VLM objects are fully convective throughout their evolution. Thus, they are not able to form a large-scale magnetic field like for example the sun. The magnetic field, in turn, is crucial for the regulation of rotation: Magnetic interaction between star and circumstellar disk ("disk-locking") and angular momentum losses through stellar winds have dominant influence on the rotational evolution. Thus, we can expect major differences in the rotational behaviour of VLM objects and solar-mass stars. The best method to investigate stellar rotation is to measure rotation periods. If a star exhibits surface features which are asymmetrically distributed, its brightness may be modulated with the rotation period. Thus, this dissertation is based on the analysis of photometric time series. Open clusters are an ideal environment for such a project, since they enable one to follow many objects at the same time. Additionally, they allow one to investigate the age and mass dependence of rotation, because distance and age of the clusters are known in good approximation. For this thesis, five open clusters were observed, which span an age range from 3 to 750 Myr. In three of them (SigmaOri, EpsilonOri, IC4665), VLM objects were identified by means of colour magnitude diagrams. The candidate lists for these three regions comprise at least 100 objects, for which photometry in at least three wavelength bands is available. About a fifth to a third of these candidates could be contaminating field stars in the fore- or background of the clusters. 
For the remaining two clusters (Pleiades and Praesepe), objects from the literature were selected as targets for the variability study. Masses for all these candidates were estimated by comparing the photometry with stellar evolutionary tracks. For each of the clusters, at least one photometric monitoring campaign was carried out; three of them were observed twice. Subsequently, the magnitudes of the VLM objects were measured relative to non-variable stars in the same fields. The difference image analysis procedure was used to improve the precision for two time series. That way, a photometric precision between 5 and 20 mmag was reached for the brightest stars. A comparison of several period search techniques showed that periodogram analysis delivers by far the best results for the available time series data. Besides the Scargle and CLEAN periodograms, the period search includes several independent and robust control procedures to assure the reliability of the results. Additionally, a test to identify even non-periodic variability was implemented. For 87 candidates, a photometric rotation period was determined; 80 of these objects have masses <0.4 MS. Thus, this work increases the number of known VLM rotation periods in the age range between 3 and 750 Myr by a factor of 14. Altogether, about 30-50% of the candidates are variable. In the two youngest clusters, several objects show variability with very high amplitudes between 0.2 and 1.1 mag. In most cases, their lightcurves contain a periodic component together with irregular brightness variations. For two VLM stars, a flare event was detected. The origin of the periodic variability is surface features co-rotating with the objects. In most cases, these surface features are cool magnetically induced spots. From the lightcurves, it can be concluded that the spot properties change on timescales of at most two or three weeks. 
In the VLM regime, the amplitudes of the lightcurves are smaller by a factor of 2.4 than for solar-mass stars, indicating a change of the spot properties with mass. The best explanation for this phenomenon is a more symmetric spot distribution on VLM objects. Additionally, it is probable that the contrast between spots and photospheric environment is smaller than for more massive stars. The lightcurves of the highly variable objects in the youngest clusters cannot be understood with cool spots alone. This kind of variability closely resembles the photometric behaviour of classical T Tauri stars, i.e. stars which accrete matter from a circumstellar disk. Thus, it is likely that the highly variable VLM objects possess accretion disks as well. This interpretation is confirmed by near-infrared photometry and optical spectroscopy. For VLM objects in the SigmaOri cluster, a disk frequency of 6-14% was estimated. From this value and the age of SigmaOri it follows that VLM objects lose their disks on shorter timescales than solar-mass stars, which could be an indication of formation through ejection from a multiple system. This result, however, needs confirmation, since the derived disk frequency should only be considered as a lower limit. The majority of the periodically variable objects rotate with periods <2 d. Slow rotators, with periods longer than 2 d, are rare, in contrast to solar-mass stars. For M<0.3 MS, a tendency of faster rotation with decreasing object mass is observed. The origin of this tendency lies very probably in the earliest phases of the rotational evolution. The lower limit of the periods is, within the statistical uncertainties, nearly independent of age and ranges from three to six hours. On the other hand, the upper period limit clearly evolves with time. Between ages of 3 and 100 Myr, it declines from at least ten days to about two days. Afterwards, it increases again up to at least four days. 
To investigate this behaviour in more detail, simple models were constructed which simulate the basic mechanisms of angular momentum regulation. It turns out that the basic aspects of the rotational evolution can be understood if one takes into account the contraction of the objects and exponential rotational braking through stellar winds. In contrast, for solar-mass stars the angular momentum losses through stellar winds can be described with the Skumanich law, which predicts a period increase proportional to the square root of time. This Skumanich law is not applicable in the VLM regime. Moreover, in the considered age range, the influence of "disk-locking" is negligible. Many of these results can be understood by taking into account the fact that VLM objects are fully convective and cannot possess a large-scale magnetic field. This basic physical property could be responsible for the fast rotation, the breakdown of the Skumanich law, the exponential braking of the rotation, and a more symmetric spot distribution. Thus, the main results of this thesis can be ascribed to the internal structure of VLM objects.
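The two wind-braking laws contrasted in this abstract can be written down in a few lines. The sketch below is only an illustration under stated assumptions: contraction is ignored, the Skumanich law is taken as P ∝ t^(1/2), and exponential braking is taken as an exponentially decaying angular velocity, so that the period grows as exp(t/τ). The function names, the reference epoch, and the e-folding time `tau_myr` are hypothetical and not values from the thesis.

```python
import math

def period_skumanich(t_myr, p_ref, t_ref_myr):
    """Skumanich-type braking: period grows with the square root of age."""
    return p_ref * math.sqrt(t_myr / t_ref_myr)

def period_exponential(t_myr, p_ref, t_ref_myr, tau_myr):
    """Exponential braking: angular velocity decays as exp(-t/tau),
    so the rotation period grows as exp(+t/tau)."""
    return p_ref * math.exp((t_myr - t_ref_myr) / tau_myr)
```

Evaluating both functions from a common reference epoch (for example 100 Myr) out to 750 Myr shows how differently the upper period limit evolves under the two prescriptions for a given choice of `tau_myr`.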
Photometric search for variable stars in the young open cluster Berkeley 59
NASA Astrophysics Data System (ADS)
Lata, Sneh; Pandey, A. K.; Maheswar, G.; Mondal, Soumen; Kumar, Brijesh
2011-12-01
We present the time series photometry of stars located in the extremely young open cluster Berkeley 59. Using the 1.04-m telescope at Aryabhatta Research Institute of Observational Sciences (ARIES), Nainital, we have identified 42 variables in a field of ~13 × 13 arcmin² around the cluster. The probable members of the cluster have been identified using a (V, V-I) colour-magnitude diagram and a (J-H, H-K) colour-colour diagram. 31 variables have been found to be pre-main-sequence stars associated with the cluster. The ages and masses of the pre-main-sequence stars have been derived from the colour-magnitude diagram by fitting theoretical models to the observed data points. The ages of the majority of the probable pre-main-sequence variable candidates range from 1 to 5 Myr. The masses of these pre-main-sequence variable stars have been found to be in the range of ~0.3 to ~3.5 M⊙, and these could be T Tauri stars. The present statistics reveal that about 90 per cent of the T Tauri stars have periods <15 d. The classical T Tauri stars are found to have a larger amplitude than the weak-line T Tauri stars. There is an indication that the amplitude decreases with an increase in mass, which could be due to the dispersal of the discs of relatively massive stars.
NASA Technical Reports Server (NTRS)
Stauffer, Ryan M.; Thompson, Anne M.; Young, George S.
2016-01-01
Sonde-based climatologies of tropospheric ozone (O3) are vital for developing satellite retrieval algorithms and evaluating chemical transport model output. Typical O3 climatologies average measurements by latitude or region, and season. A recent analysis using self-organizing maps (SOM) to cluster ozonesondes from two tropical sites found that clusters of O3 mixing ratio profiles are an excellent way to capture O3 variability and link meteorological influences to O3 profiles. Clusters correspond to distinct meteorological conditions, e.g., convection, subsidence, cloud cover, and transported pollution. Here the SOM technique is extended to four long-term U.S. sites (Boulder, CO; Huntsville, AL; Trinidad Head, CA; and Wallops Island, VA) with 4530 total profiles. Sensitivity tests on k-means algorithm and SOM justify use of 3×3 SOM (nine clusters). At each site, SOM clusters together O3 profiles with similar tropopause height, 500 hPa height/temperature, and amount of tropospheric and total column O3. Cluster means are compared to monthly O3 climatologies. For all four sites, near-tropopause O3 is double (over +100 parts per billion by volume; ppbv) the monthly climatological O3 mixing ratio in three clusters that contain 13–16% of profiles, mostly in winter and spring. Large midtropospheric deviations from monthly means (−6 ppbv, +7–10 ppbv O3 at 6 km) are found in two of the most populated clusters (combined 36–39% of profiles). These two clusters contain distinctly polluted (summer) and clean O3 (fall-winter, high tropopause) profiles, respectively. As for tropical profiles previously analyzed with SOM, O3 averages are often poor representations of U.S. O3 profile statistics.
Stauffer, Ryan M.; Thompson, Anne M.; Young, George S.
2018-01-01
Sonde-based climatologies of tropospheric ozone (O3) are vital for developing satellite retrieval algorithms and evaluating chemical transport model output. Typical O3 climatologies average measurements by latitude or region, and season. Recent analysis using self-organizing maps (SOM) to cluster ozonesondes from two tropical sites found clusters of O3 mixing ratio profiles are an excellent way to capture O3 variability and link meteorological influences to O3 profiles. Clusters correspond to distinct meteorological conditions, e.g. convection, subsidence, cloud cover, and transported pollution. Here, the SOM technique is extended to four long-term U.S. sites (Boulder, CO; Huntsville, AL; Trinidad Head, CA; Wallops Island, VA) with 4530 total profiles. Sensitivity tests on k-means algorithm and SOM justify use of 3×3 SOM (nine clusters). At each site, SOM clusters together O3 profiles with similar tropopause height, 500 hPa height/temperature, and amount of tropospheric and total column O3. Cluster means are compared to monthly O3 climatologies. For all four sites, near-tropopause O3 is double (over +100 parts per billion by volume; ppbv) the monthly climatological O3 mixing ratio in three clusters that contain 13 – 16% of profiles, mostly in winter and spring. Large mid-tropospheric deviations from monthly means (−6 ppbv, +7 – 10 ppbv O3 at 6 km) are found in two of the most populated clusters (combined 36 – 39% of profiles). These two clusters contain distinctly polluted (summer) and clean O3 (fall-winter, high tropopause) profiles, respectively. As for tropical profiles previously analyzed with SOM, O3 averages are often poor representations of U.S. O3 profile statistics. PMID:29619288
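The 3×3 self-organizing map used in this study can be illustrated with a minimal sketch. It assumes each profile is a plain list of numbers (for example, O3 mixing ratios on fixed altitude levels) and uses a generic online SOM with a Gaussian neighbourhood and linearly decaying learning rate. The function names, the decay schedule, and the default parameters are assumptions for illustration, not the configuration used by the authors.

```python
import math
import random

def best_matching_unit(nodes, x):
    """Return the grid position of the node whose weights are closest to x."""
    return min(nodes, key=lambda rc: sum((a - b) ** 2 for a, b in zip(nodes[rc], x)))

def train_som(profiles, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM; returns a dict mapping (row, col) to a weight vector."""
    rng = random.Random(seed)
    # Initialise each node's weights from a randomly chosen profile.
    nodes = {(r, c): list(rng.choice(profiles))
             for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(profiles)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(profiles, len(profiles)):   # shuffled pass
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                      # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.1          # shrinking neighbourhood
            bmu = best_matching_unit(nodes, x)
            for (r, c), w in nodes.items():
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))   # neighbourhood weight
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return nodes
```

After training, each profile is assigned to a cluster by calling `best_matching_unit(nodes, profile)`, and a cluster mean is the average of the profiles that share a node.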
Stauffer, Ryan M; Thompson, Anne M; Young, George S
2016-02-16
Sonde-based climatologies of tropospheric ozone (O3) are vital for developing satellite retrieval algorithms and evaluating chemical transport model output. Typical O3 climatologies average measurements by latitude or region, and season. Recent analysis using self-organizing maps (SOM) to cluster ozonesondes from two tropical sites found clusters of O3 mixing ratio profiles are an excellent way to capture O3 variability and link meteorological influences to O3 profiles. Clusters correspond to distinct meteorological conditions, e.g. convection, subsidence, cloud cover, and transported pollution. Here, the SOM technique is extended to four long-term U.S. sites (Boulder, CO; Huntsville, AL; Trinidad Head, CA; Wallops Island, VA) with 4530 total profiles. Sensitivity tests on k-means algorithm and SOM justify use of 3×3 SOM (nine clusters). At each site, SOM clusters together O3 profiles with similar tropopause height, 500 hPa height/temperature, and amount of tropospheric and total column O3. Cluster means are compared to monthly O3 climatologies. For all four sites, near-tropopause O3 is double (over +100 parts per billion by volume; ppbv) the monthly climatological O3 mixing ratio in three clusters that contain 13–16% of profiles, mostly in winter and spring. Large mid-tropospheric deviations from monthly means (−6 ppbv, +7–10 ppbv O3 at 6 km) are found in two of the most populated clusters (combined 36–39% of profiles). These two clusters contain distinctly polluted (summer) and clean O3 (fall-winter, high tropopause) profiles, respectively. As for tropical profiles previously analyzed with SOM, O3 averages are often poor representations of U.S. O3 profile statistics.
Tiotropium might improve survival in subjects with COPD at high risk of mortality
2014-01-01
Background Inhaled therapies reduce risk of chronic obstructive pulmonary disease (COPD) exacerbations, but their effect on mortality is less well established. We hypothesized that heterogeneity in baseline mortality risk influenced the results of drug trials assessing mortality in COPD. Methods The 5706 patients with COPD from the Understanding Potential Long-term Impacts on Function with Tiotropium (UPLIFT®) study that had complete clinical information for variables associated with mortality (age, forced expiratory volume in 1 s, St George’s Respiratory Questionnaire, pack-years and body mass index) were classified by cluster analysis. Baseline risk of mortality between clusters, and impact of tiotropium were evaluated during the 4-yr follow up. Results Four clusters were identified, including low-risk (low mortality rate) patients (n = 2339; 41%; cluster 2), and high-risk patients (n = 1022; 18%; cluster 3), who had a 2.6- and a six-fold increase in all-cause and respiratory mortality compared with cluster 2, respectively. Tiotropium reduced exacerbations in all clusters, and reduced hospitalizations in high-risk patients (p < 0.05). The beneficial effect of tiotropium on all-cause mortality in the overall population (hazard ratio, 0.87; 95% confidence interval, 0.75–1.00, p = 0.054) was explained by a 21% reduction in cluster 3 (p = 0.07), with no effect in other clusters. Conclusions Large variations in baseline risks of mortality existed among patients in the UPLIFT® study. Inclusion of numerous low-risk patients may have reduced the ability to show beneficial effect on mortality. Future clinical trials should consider selective inclusion of high-risk patients. PMID:24913266
Gagneux, Sebastien; Helbling, Peter; Battegay, Manuel; Rieder, Hans L.; Pfyffer, Gaby E.; Zwahlen, Marcel; Furrer, Hansjakob; Siegrist, Hans H.; Fehr, Jan; Dolina, Marisa; Calmy, Alexandra; Stucki, David; Jaton, Katia; Janssens, Jean-Paul; Stalder, Jesica Mazza; Bodmer, Thomas; Ninet, Beatrice; Böttger, Erik C.; Egger, Matthias; Barth, J.; Battegay, M.; Bernasconi, E.; Böni, J.; Bucher, H. C.; Burton-Jeangros, A. Calmy; Cavassini, M.; Cellerai, C.; Egger, M.; Elzi, L.; Fehr, J.; Fellay, J.; Flepp, M.; Francioli, P.; Furrer, H.; Fux, C. A.; Gorgievski, M.; Günthard, H.; Haerry, D.; Hasse, B.; Hirschel, B.; Hirsch, H. H.; Hirschel, B.; Hoffmann, M.; Hösli, I.; Kahlert, C.; Kaiser, L.; Kaiser, O.; Kind, C.; Klimkait, T.; Kovari, H.; Ledergerber, B.; Lugano, A. P.; Martinetti, G.; Martinez de Tejada, B.; Metzner, K.; Müller, N.; Nadal, D.; Pantaleo, G.; Rauch, A.; Regenass, S.; Rickenbach, M.; Rudin, C.; Schmid, P.; Schultze, D.; Schöni-Affolter, F.; Schüpbach, J.; Speck, R.; Taffé, P.; Tarr, P.; Telenti, A.; Trkola, A.; Vernazza, P.; Weber, R.; Yerly, S.
2012-01-01
Immigrants from high-burden countries and HIV-coinfected individuals are risk groups for tuberculosis (TB) in countries with low TB incidence. Therefore, we studied their role in transmission of Mycobacterium tuberculosis in Switzerland. We included all TB patients from the Swiss HIV Cohort and a sample of patients from the national TB registry. We identified molecular clusters by spoligotyping and mycobacterial interspersed repetitive-unit–variable-number tandem-repeat (MIRU-VNTR) analysis and used weighted logistic regression adjusted for age and sex to identify risk factors for clustering, taking sampling proportions into account. In total, we analyzed 520 TB cases diagnosed between 2000 and 2008; 401 were foreign born, and 113 were HIV coinfected. The Euro-American M. tuberculosis lineage dominated throughout the study period (378 strains; 72.7%), with no evidence for another lineage, such as the Beijing genotype, emerging. We identified 35 molecular clusters with 90 patients, indicating recent transmission; 31 clusters involved foreign-born patients, and 15 involved HIV-infected patients. Birth origin was not associated with clustering (adjusted odds ratio [aOR], 1.58; 95% confidence interval [CI], 0.73 to 3.43; P = 0.25, comparing Swiss-born with foreign-born patients), but clustering was reduced in HIV-infected patients (aOR, 0.49; 95% CI, 0.26 to 0.93; P = 0.030). Cavitary disease, male sex, and younger age were all associated with molecular clustering. In conclusion, most TB patients in Switzerland were foreign born, but transmission of M. tuberculosis was not more common among immigrants and was reduced in HIV-infected patients followed up in the national HIV cohort study. Continued access to health services and clinical follow-up will be essential to control TB in this population. PMID:22116153
ERIC Educational Resources Information Center
Rupp, Andre A.
2012-01-01
In the focus article of this issue, von Davier, Naemi, and Roberts essentially coupled: (1) a short methodological review of structural similarities of latent variable models with discrete and continuous latent variables; and (2) 2 short empirical case studies that show how these models can be applied to real, rather than simulated, large-scale…
Revealing The Impact Of Climate Variability On The Wind Resource Using Data Mining Techniques
NASA Astrophysics Data System (ADS)
Clifton, A.; Lundquist, J. K.
2011-12-01
Wind turbines harvest energy from the wind. Winds at heights where industrial-scale turbines operate, up to 200 m above ground, experience a complex interaction between the atmosphere and the Earth's surface. Previous studies for a variety of locations have shown that the wind resource varies over time. In some locations, this variability can be related to large-scale climate oscillations as revealed in climate indices such as the El Niño-Southern Oscillation (ENSO). These indices can be used to quantify climate change in the past, and can also be extracted from models of future climate. Understanding the correlation between climate indices and wind resources therefore allows us to understand how climate change may influence wind energy production. We present a new methodology for assessing relevant climate modes of oscillation at a given site in order to quantify future wind resource variability. We demonstrate the method on a 14-year record of 10-minute averaged wind speed and wind direction data from several levels of an 80 m tower at the National Renewable Energy Laboratory (NREL) National Wind Technology Center near Boulder, Colorado. Data mining techniques (based on k-means clustering) identify 4 major groups of wind speed and direction. After removing annual means, each cluster was compared to a series of climate indices, including the Arctic Oscillation (AO) and Multivariate ENSO Index (MEI). Statistically significant relationships emerge between individual clusters and climate indices. At this location, this result is consistent with the MEI's relationship with other meteorological parameters, such as precipitation, in the Rocky Mountain Region. The presentation will illustrate these relationships between wind resource at this location and other relevant climate indices, and suggest how these relationships can provide a foundation for quantifying the potential future variability of wind energy production at this site and others.
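The clustering step described above can be sketched in a few lines of Python. This is a minimal illustration only: the abstract does not say how wind direction was encoded, so converting each (speed, direction) pair to Cartesian components is an assumption made here so that directions such as 358° and 2° are treated as neighbours. The sample values, the initial centroids, and the use of two groups rather than four are all invented for brevity.

```python
import math

def to_components(speed, direction_deg):
    """Map (speed, direction) to Cartesian components (assumed encoding)."""
    theta = math.radians(direction_deg)
    return (speed * math.sin(theta), speed * math.cos(theta))

def kmeans(points, centroids, iterations=50):
    """Plain Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    centroids = list(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                            + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical 10-minute averages: (speed in m/s, direction in degrees).
records = [(12.0, 268), (11.0, 272), (13.0, 270),   # strong westerly flow
           (3.0, 358), (2.5, 2), (3.5, 0)]          # weak northerly flow
points = [to_components(s, d) for s, d in records]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
```

With this encoding the two northerly records on either side of 0° fall into the same group, which a naive clustering on raw degrees would not guarantee.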
Solid-state NMR/NQR and first-principles study of two niobium halide cluster compounds.
Perić, Berislav; Gautier, Régis; Pickard, Chris J; Bosiočić, Marko; Grbić, Mihael S; Požek, Miroslav
2014-01-01
Two hexanuclear niobium halide cluster compounds with a [Nb6X12](2+) (X=Cl, Br) diamagnetic cluster core have been studied by a combination of experimental solid-state NMR/NQR techniques and PAW/GIPAW calculations. For niobium sites the NMR parameters were determined by using variable B0 field static broadband NMR measurements and additional NQR measurements. It was found that they possess large positive chemical shifts, contrary to the majority of niobium compounds studied so far by solid-state NMR, but in accordance with chemical shifts of (95)Mo nuclei in structurally related compounds containing [Mo6Br8](4+) cluster cores. Experimentally determined δiso((93)Nb) values are in the range from 2,400 to 3,000 ppm. A detailed analysis of geometrical relations between computed electric field gradient (EFG) and chemical shift (CS) tensors with respect to structural features of cluster units was carried out. These tensors on niobium sites are almost axially symmetric with parallel orientation of the largest EFG and the smallest CS principal axes (Vzz and δ33) coinciding with the molecular four-fold axis of the [Nb6X12](2+) unit. Bridging halogen sites are characterized by large asymmetry of EFG and CS tensors: the largest EFG principal axis (Vzz) is perpendicular to the X-Nb bonds, while the intermediate EFG principal axis (Vyy) and the largest CS principal axis (δ11) are oriented in the radial direction with respect to the center of the cluster unit. For the more symmetrical bromide compound the PAW predictions for EFG parameters are in better correspondence with the NMR/NQR measurements than for the less symmetrical chloride compound. Theoretically predicted NMR parameters of bridging halogen sites were checked by (79/81)Br NQR and (35)Cl solid-state NMR measurements. Copyright © 2014 Elsevier Inc. All rights reserved.
Goodsitt, Mitchell M.; Helvie, Mark A.; Zelakiewicz, Scott; Schmitz, Andrea; Noroozian, Mitra; Paramagul, Chintana; Roubidoux, Marilyn A.; Nees, Alexis V.; Neal, Colleen H.; Carson, Paul; Lu, Yao; Hadjiiski, Lubomir; Wei, Jun
2014-01-01
Purpose: To investigate the dependence of microcalcification cluster detectability on tomographic scan angle, angular increment, and number of projection views acquired at digital breast tomosynthesis (DBT). Materials and Methods: A prototype DBT system operated in step-and-shoot mode was used to image breast phantoms. Four 5-cm-thick phantoms embedded with 81 simulated microcalcification clusters of three speck sizes (subtle, medium, and obvious) were imaged by using a rhodium target and rhodium filter with 29 kV, 50 mAs, and seven acquisition protocols. Fixed angular increments were used in four protocols (denoted as scan angle, angular increment, and number of projection views, respectively: 16°, 1°, and 17; 24°, 3°, and nine; 30°, 3°, and 11; and 60°, 3°, and 21), and variable increments were used in three (40°, variable, and 13; 40°, variable, and 15; and 60°, variable, and 21). The reconstructed DBT images were interpreted by six radiologists who located the microcalcification clusters and rated their conspicuity. Results: The mean sensitivity for detection of subtle clusters ranged from 80% (22.5 of 28) to 96% (26.8 of 28) for the seven DBT protocols; the highest sensitivity was achieved with the 16°, 1°, and 17 protocol (96%), but the difference was significant only for the 60°, 3°, and 21 protocol (80%, P < .002) and did not reach significance for the other five protocols (P = .01–.15). The mean sensitivity for detection of medium and obvious clusters ranged from 97% (28.2 of 29) to 100% (24 of 24), but the differences fell short of significance (P = .08 to >.99).
The conspicuity of subtle and medium clusters with the 16°, 1°, and 17 protocol was rated higher than those with other protocols; the differences were significant for subtle clusters with the 24°, 3°, and nine protocol and for medium clusters with 24°, 3°, and nine; 30°, 3°, and 11; 60°, 3° and 21; and 60°, variable, and 21 protocols (P < .002). Conclusion: With imaging that did not include x-ray source motion or patient motion during acquisition of the projection views, narrow-angle DBT provided higher sensitivity and conspicuity than wide-angle DBT for subtle microcalcification clusters. © RSNA, 2014 PMID:25007048
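For the four fixed-increment protocols, the notation in this abstract implies a simple relation: the number of projection views equals the scan angle divided by the angular increment, plus one. The short Python check below reproduces that relation and the rounded sensitivity percentages quoted in the Results. The function names are illustrative and do not come from the study.

```python
def views_for_fixed_increment(scan_angle_deg, increment_deg):
    """Projection views when both ends of the scan arc are sampled."""
    return scan_angle_deg // increment_deg + 1

def sensitivity_percent(mean_detected, total):
    """Mean number of clusters detected, as a rounded percentage of the total."""
    return round(100 * mean_detected / total)

# (scan angle, angular increment) for the fixed-increment protocols.
fixed_protocols = [(16, 1), (24, 3), (30, 3), (60, 3)]
views = [views_for_fixed_increment(a, inc) for a, inc in fixed_protocols]

# Detection counts quoted in the abstract.
subtle_low = sensitivity_percent(22.5, 28)
subtle_high = sensitivity_percent(26.8, 28)
medium_obvious_low = sensitivity_percent(28.2, 29)
medium_obvious_high = sensitivity_percent(24, 24)
```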
Effects of selective attention on continuous opinions and discrete decisions
NASA Astrophysics Data System (ADS)
Si, Xia-Meng; Liu, Yun; Xiong, Fei; Zhang, Yan-Chao; Ding, Fei; Cheng, Hui
2010-09-01
Selective attention describes individuals' preference for information according to their motivation and involvement. Building on findings from social psychology, we propose an opinion interaction model to improve the modeling of individuals’ interacting behaviors. Two parameters govern the probability of agents interacting with opponents: individual relevance and time-openness. We find that large individual relevance combined with large time-openness promotes the appearance of large clusters, whereas large individual relevance combined with small time-openness favors a reduction of extremism. We also apply the new model to identify factors leading to a successful product. Numerical simulations show that selective attention, especially individual relevance, cannot be ignored by firms launching products or by information spreaders seeking the most successful promotion.
NASA Astrophysics Data System (ADS)
González, J. F.; Levato, H.; Grosso, M.
We present preliminary results of a long-term project devoted to the observational study of the binary star population in open clusters and its connection with the dynamical and evolutionary properties of the clusters. We report the discovery of 17 double-lined spectroscopic binaries, 30 radial velocity variables and about 30 suspected variables. In the 17 clusters of our sample the binary frequency ranges between 20 and 40 %, and reaches typically 60 % if all suspected binaries are included. We study the spatial distribution of the binary stars with respect to the cluster center and we discuss the statistical correlation of the mass-ratio distribution with the cluster age.
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
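The difference between the two algorithms named above can be illustrated with a minimal one-dimensional sketch: K-means assigns each point to exactly one cluster, while EM for a Gaussian mixture computes soft responsibilities and also estimates variances and mixing weights. This is not the authors' pipeline (which applies logistic regression to the resulting clusters of wine data); the toy data, starting values, and function names are assumptions for illustration.

```python
import math

def kmeans_1d(xs, centers, iterations=20):
    """Hard-assignment clustering: returns (centers, labels)."""
    centers = list(centers)
    labels = [0] * len(xs)
    for _ in range(iterations):
        labels = [min(range(len(centers)), key=lambda k: abs(x - centers[k]))
                  for x in xs]
        for k in range(len(centers)):
            members = [x for x, lab in zip(xs, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, labels

def em_1d(xs, means, iterations=20, min_var=1e-6):
    """EM for a 1-D Gaussian mixture: returns (means, responsibilities)."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: soft membership of every point in every component.
        resp = []
        for x in xs:
            dens = [weights[j]
                    * math.exp(-(x - means[j]) ** 2 / (2 * variances[j]))
                    / math.sqrt(2 * math.pi * variances[j])
                    for j in range(k)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0:
                continue
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                min_var,
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj)
    return means, resp

data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]   # invented toy values
km_centers, km_labels = kmeans_1d(data, centers=[0.0, 6.0])
em_means, em_resp = em_1d(data, means=[0.0, 6.0])
em_labels = [max(range(2), key=lambda j: r[j]) for r in em_resp]
```

On well-separated data such as this, the hard labels derived from EM coincide with the K-means labels; the two methods diverge mainly when clusters overlap or differ in spread.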
The Large-scale Structure of the Universe: Probes of Cosmology and Structure Formation
NASA Astrophysics Data System (ADS)
Noh, Yookyung
The usefulness of large-scale structure as a probe of cosmology and structure formation is increasing as large deep surveys in multi-wavelength bands are becoming possible. The observational analysis of large-scale structure, guided by large-volume numerical simulations, is beginning to offer us complementary information and crosschecks of cosmological parameters estimated from the anisotropies in Cosmic Microwave Background (CMB) radiation. Understanding structure formation and evolution and even galaxy formation history is also being aided by observations of different redshift snapshots of the Universe, using various tracers of large-scale structure. This dissertation work covers aspects of large-scale structure from the baryon acoustic oscillation scale to that of large-scale filaments and galaxy clusters. First, I discuss the use of large-scale structure for high-precision cosmology. I investigate the reconstruction of the Baryon Acoustic Oscillation (BAO) peak within the context of Lagrangian perturbation theory, testing its validity in a large suite of cosmological volume N-body simulations. Then I consider galaxy clusters and the large-scale filaments surrounding them in a high-resolution N-body simulation. I investigate the geometrical properties of galaxy cluster neighborhoods, focusing on the filaments connected to clusters. Using mock observations of galaxy clusters, I explore the correlations of scatter in galaxy cluster mass estimates from multi-wavelength observations and different measurement techniques. I also examine the sources of the correlated scatter by considering the intrinsic and environmental properties of clusters.
Mothers of young children cluster into 4 groups based on psychographic food decision influencers.
Byrd-Bredbenner, Carol; Abbot, Jaclyn Maurer; Cussler, Ellen
2008-08-01
This study explored how mothers grouped into clusters according to multiple psychographic food decision influencers and how the clusters differed in nutrient intake and nutrient content of their household food supply. Mothers (n = 201) completed a survey assessing basic demographic characteristics, food shopping and meal preparation activities, self and spouse employment, exposure to formal food or nutrition education, education level and occupation, weight status, nutrition and food preparation knowledge and skill, family member health and nutrition status, food decision influencer constructs, and dietary intake. In addition, an in-home inventory of 100 participants' household food supplies was conducted. Four distinct clusters presented when 26 psychographic food choice influencers were evaluated. These clusters appear to be valid and robust classifications of mothers in that they discriminated well on the psychographic variables used to construct the clusters as well as numerous other variables not used in the cluster analysis. In addition, the clusters appear to transcend demographic variables that often segment audiences (eg, race, mother's age, socioeconomic status), thereby adding a new dimension to the way in which this audience can be characterized. Furthermore, psychographically defined clusters predicted dietary quality. This study demonstrates that mothers are not a homogenous group and need to have their unique characteristics taken into consideration when designing strategies to promote health. These results can help health practitioners better understand factors affecting food decisions and tailor interventions to better meet the needs of mothers.
Globular Clusters for Faint Galaxies
NASA Astrophysics Data System (ADS)
Kohler, Susanna
2017-07-01
The origin of ultra-diffuse galaxies (UDGs) has posed a long-standing mystery for astronomers. New observations of several of these faint giants with the Hubble Space Telescope are now lending support to one theory.
Faint-Galaxy Mystery
[Figure: Hubble images of Dragonfly 44 (top) and DFX1 (bottom). The right panels show the data with greater contrast and extended objects masked. van Dokkum et al. 2017]
UDGs (large, extremely faint spheroidal objects) were first discovered in the Virgo galaxy cluster roughly three decades ago. Modern telescope capabilities have resulted in many more discoveries of similar faint galaxies in recent years, suggesting that they are a much more common phenomenon than we originally thought. Despite the many observations, UDGs still pose a number of unanswered questions. Chief among them: what are UDGs? Why are these objects the size of normal galaxies, yet so dim? There are two primary models that explain UDGs: (1) UDGs were originally small galaxies, hence their low luminosity. Tidal interactions then puffed them up to the large size we observe today. (2) UDGs are effectively failed galaxies. They formed the same way as normal galaxies of their large size, but something truncated their star formation early, preventing them from gaining the brightness that we would expect for galaxies of their size. Now a team of scientists led by Pieter van Dokkum (Yale University) has made some intriguing observations with Hubble that lend weight to one of these models.
[Figure: Globulars observed in 16 Coma-cluster UDGs by Hubble. The top right panel shows the galaxy identifications. The top left panel shows the derived number of globular clusters in each galaxy. van Dokkum et al. 2017]
Globulars Galore
Van Dokkum and collaborators imaged two UDGs with Hubble: Dragonfly 44 and DFX1, both located in the Coma galaxy cluster. These faint galaxies are both smooth and elongated, with no obvious irregular features, spiral arms, star-forming regions, or other indications of tidal interactions. The most striking feature of these galaxies, however, is that they are surrounded by a large number of compact objects that appear to be globular clusters. From the observations, Van Dokkum and collaborators estimate that Dragonfly 44 and DFX1 have approximately 74 and 62 globulars, respectively, significantly more than the low numbers expected for galaxies of this luminosity. Armed with this knowledge, the authors went back and looked at archival observations of 14 other UDGs also located in the Coma cluster. They found that these smaller and fainter galaxies don't host quite as many globular clusters as Dragonfly 44 and DFX1, but more than half also show significant overdensities of globulars.
[Figure: Main panel: relation between the number of globular clusters and total absolute magnitude for Coma UDGs (solid symbols) compared to normal galaxies (open symbols). Top panel: relation between effective radius and absolute magnitude. The UDGs are significantly larger and have more globular clusters than normal galaxies of the same luminosity. van Dokkum et al. 2017]
Evidence of Failure
In general, UDGs appear to have more globular clusters than other galaxies of the same total luminosity, by a factor of nearly 7. These results are consistent with the scenario in which UDGs are failed galaxies: they likely have the halo mass to have formed a large number of globular clusters, but they were quenched before they formed a disk and bulge. Because star formation never got going in UDGs, they are now much dimmer than other galaxies of the same size. The authors suggest that the next step is to obtain dynamical measurements of the UDGs to determine whether these faint galaxies really do have the halo mass suggested by their large numbers of globulars. Future observations will continue to help us pin down the origin of these dim giants.
Citation: Pieter van Dokkum et al 2017 ApJL 844 L11. doi:10.3847/2041-8213/aa7ca2
NASA Astrophysics Data System (ADS)
Matos, Catarina; Grigoli, Francesco; Cesca, Simone; Custódio, Susana
2015-04-01
In the last decade a permanent seismic network of 30 broadband stations, complemented by dense temporary deployments, covered Portugal. This extraordinary network coverage now enables the computation of a high-resolution image of the seismicity of Portugal, which in turn will shed light on the seismotectonics of Portugal. The large data volumes available cannot be analyzed by traditional time-consuming manual location procedures. In this presentation we show first results on the automatic detection and location of earthquakes that occurred in a selected region in the south of Portugal. Our main goal is to implement an automatic earthquake detection and location routine in order to have a tool to quickly process large data sets, while at the same time detecting low magnitude earthquakes (i.e., lowering the detection threshold). We present a modified version of the automatic seismic event location by waveform coherency analysis developed by Grigoli et al. (2013, 2014), designed to perform earthquake detections and locations in continuous data. The event detection is performed by continuously computing the short-term-average/long-term-average of two different characteristic functions (CFs). For the P phases we used a CF based on the vertical energy trace, while for S phases we used a CF based on the maximum eigenvalue of the instantaneous covariance matrix (Vidale 1991). Seismic event detection and location is obtained by performing waveform coherence analysis scanning different hypocentral coordinates. We apply this technique to earthquakes in the Alentejo region (South Portugal), taking advantage of a small aperture seismic network installed in the south of Portugal for two years (2010 - 2011) during the DOCTAR experiment. In addition to the good network coverage, the Alentejo region was chosen for its simple tectonic setting and also because the relationship between seismicity, tectonics and local lithospheric structure is intriguing and still poorly understood.
Inside the target area the seismicity clusters mainly within two clouds, oriented SE-NW and SW-NE. Should these clusters be seen as the expression of local active faults? Are they associated with lithological transitions? Or do the locations obtained from the previously sparse permanent network have large errors and generate fake clusters? We present preliminary results from this study, and compare them with manual locations. This work is supported by project QuakeLoc, reference: PTDC/GEO-FIQ/3522/2012
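The short-term-average/long-term-average (STA/LTA) detection step mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses the squared amplitude of a single synthetic trace as the characteristic function, trailing windows for both averages, and invented window lengths and threshold. The waveform coherence scan over hypocentral coordinates and the covariance-based CF for S phases are not covered.

```python
def sta_lta(cf, n_sta, n_lta):
    """Ratio of trailing short-term to long-term averages of a characteristic function.

    Samples before the first full long-term window are left at 0.0.
    """
    ratios = [0.0] * len(cf)
    for i in range(n_lta - 1, len(cf)):
        sta = sum(cf[i - n_sta + 1:i + 1]) / n_sta
        lta = sum(cf[i - n_lta + 1:i + 1]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios

# Synthetic trace: low-amplitude noise, a short burst, then noise again.
trace = [0.1, -0.1] * 10 + [1.0, -1.0] * 3 + [0.1, -0.1] * 5
cf = [a * a for a in trace]            # energy-type characteristic function

ratios = sta_lta(cf, n_sta=2, n_lta=10)
threshold = 3.0                        # hypothetical trigger level
onset = next(i for i, r in enumerate(ratios) if r > threshold)
```

The ratio stays near 1 while only noise is present and jumps as soon as the burst enters the short window, so the first sample above the threshold marks the candidate onset.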
Medem, Anna V; Seidling, Hanna M; Eichler, Hans-Georg; Kaltschmidt, Jens; Metzner, Michael; Hubert, Carina M; Czock, David; Haefeli, Walter E
2017-05-01
Electronic clinical decision support systems (CDSS) require drug information that can be processed by computers. The goal of this project was to determine and evaluate a compilation of variables that comprehensively capture the information contained in the summary of product characteristics (SmPC) and unequivocally describe the drug, its dosage options, and clinical pharmacokinetics. An expert panel defined and structured a set of variables and drafted a guideline to extract and enter information on dosage and clinical pharmacokinetics from textual SmPCs as published by the European Medicines Agency (EMA). The set of variables was iteratively revised and evaluated by data extraction and variable allocation of roughly 7% of all centrally approved drugs. The information contained in the SmPC was allocated to three information clusters consisting of 260 variables. The cluster "drug characterization" specifies the nature of the drug. The cluster "dosage" provides information on approved drug dosages and defines corresponding specific conditions. The cluster "clinical pharmacokinetics" includes pharmacokinetic parameters of relevance for dosing in clinical practice. A first evaluation demonstrated that, despite the complexity of the current free text SmPCs, dosage and pharmacokinetic information can be reliably extracted from the SmPCs and comprehensively described by a limited set of variables. By proposing a compilation of variables that describes drug dosage and clinical pharmacokinetics well, the project represents a step forward towards the development of a comprehensive database system serving as an information source for sophisticated CDSS.
NASA Astrophysics Data System (ADS)
Dalkilic, Turkan Erbay; Apaydin, Aysen
2009-11-01
In a regression analysis, it is assumed that the observations come from a single class in a data cluster and the simple functional relationship between the dependent and independent variables can be expressed using the general model Y = f(X) + ε. However, a data cluster may consist of a combination of observations that have different distributions that are derived from different clusters. When faced with issues of estimating a regression model for fuzzy inputs that have been derived from different distributions, this regression model has been termed the 'switching regression model' and it is expressed with . Here li indicates the class number of each independent variable and p is indicative of the number of independent variables [J.R. Jang, ANFIS: Adaptive-network-based fuzzy inference system, IEEE Transaction on Systems, Man and Cybernetics 23 (3) (1993) 665-685; M. Michel, Fuzzy clustering and switching regression models using ambiguity and distance rejects, Fuzzy Sets and Systems 122 (2001) 363-399; E.Q. Richard, A new approach to estimating switching regressions, Journal of the American Statistical Association 67 (338) (1972) 306-310]. In this study, adaptive networks have been used to construct a model that has been formed by gathering obtained models. There are methods that suggest the class numbers of independent variables heuristically. Alternatively, in defining the optimal class number of independent variables, the use of suggested validity criterion for fuzzy clustering has been aimed. In the case that independent variables have an exponential distribution, an algorithm has been suggested for defining the unknown parameter of the switching regression model and for obtaining the estimated values after obtaining an optimal membership function, which is suitable for exponential distribution.
Schultz, K K; Bennett, T B; Nordlund, K V; Döpfer, D; Cook, N B
2016-09-01
Transition cow management has been tracked via the Transition Cow Index (TCI; AgSource Cooperative Services, Verona, WI) since 2006. Transition Cow Index was developed to measure the difference between actual and predicted milk yield at first test day to evaluate the relative success of the transition period program. This project aimed to assess TCI in relation to all commonly used Dairy Herd Improvement (DHI) metrics available through AgSource Cooperative Services. Regression analysis was used to isolate variables that were relevant to TCI, and then principal components analysis and network analysis were used to determine the relative strength and relatedness among variables. Finally, cluster analysis was used to segregate herds based on similarity of relevant variables. The DHI data were obtained from 2,131 Wisconsin dairy herds with test-day mean ≥30 cows, which were tested ≥10 times throughout the 2014 calendar year. The original list of 940 DHI variables was reduced through expert-driven selection and regression analysis to 23 variables. The K-means cluster analysis produced 5 distinct clusters. Descriptive statistics were calculated for the 23 variables per cluster grouping. Using principal components analysis, cluster analysis, and network analysis, 4 parameters were isolated as most relevant to TCI; these were energy-corrected milk, 3 measures of intramammary infection (dry cow cure rate, linear somatic cell count score in primiparous cows, and new infection rate), peak ratio, and days in milk at peak milk production. These variables together with cow and newborn calf survival measures form a group of metrics that can be used to assist in the evaluation of overall transition period performance. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Mascagni, Flavia; Giordani, Tommaso; Ceccarelli, Marilena; Cavallini, Andrea; Natali, Lucia
2017-08-18
Genome divergence by mobile element activity and recombination is a continuous process that plays a key role in the evolution of species. Nevertheless, knowledge on retrotransposon-related variability among species belonging to the same genus is still limited. Considering the importance of the genus Helianthus, a model system for studying the ecological genetics of speciation and adaptation, we performed a comparative analysis of the repetitive genome fraction across ten species and one subspecies of sunflower, focusing on long terminal repeat retrotransposons at superfamily, lineage and sublineage levels. After determining the relative genome size of each species, genomic DNA was isolated and subjected to Illumina sequencing. Then, different assembling and clustering approaches allowed exploring the repetitive component of all genomes. On average, repetitive DNA in Helianthus species represented more than 75% of the genome and was composed mostly of long terminal repeat retrotransposons. Also, the prevalence of the Gypsy over the Copia superfamily was observed and, among lineages, Chromovirus was by far the most represented. Although nearly all the same sublineages are present in all species, we found considerable variability in the abundance of diverse retrotransposon lineages and sublineages, especially between annual and perennial species. This large variability should indicate that different events of amplification or loss related to these elements occurred following species separation and should have been involved in species differentiation. Our data allowed us to infer the extent of interspecific repetitive DNA variation related to LTR-RE abundance, to investigate the relationship between changes of LTR-RE abundance and the evolution of the genus, and to determine the degree of coevolution of different LTR-RE lineages or sublineages between and within species.
Moreover, the data suggested that LTR-RE abundance in a species was affected by the annual or perennial habit of that species.
Development of an automated energy audit protocol for office buildings
NASA Astrophysics Data System (ADS)
Deb, Chirag
This study aims to enhance the building energy audit process, and bring about reduction in time and cost requirements in the conduction of a full physical audit. For this, a total of 5 Energy Service Companies in Singapore have collaborated and provided energy audit reports for 62 office buildings. Several statistical techniques are adopted to analyse these reports. These techniques comprise cluster analysis and development of prediction models to predict energy savings for buildings. The cluster analysis shows that there are 3 clusters of buildings experiencing different levels of energy savings. To understand the effect of building variables on the change in EUI, a robust iterative process for selecting the appropriate variables is developed. The results show that the 4 variables of GFA, non-air-conditioning energy consumption, average chiller plant efficiency and installed capacity of chillers should be taken for clustering. This analysis is extended to the development of prediction models using linear regression and artificial neural networks (ANN). An exhaustive variable selection algorithm is developed to select the input variables for the two energy saving prediction models. The results show that the ANN prediction model can predict the energy saving potential of a given building with an accuracy of +/-14.8%.
Resche-Rigon, Matthieu; White, Ian R
2018-06-01
In multilevel settings such as individual participant data meta-analysis, a variable is 'systematically missing' if it is wholly missing in some clusters and 'sporadically missing' if it is partly missing in some clusters. Previously proposed methods to impute incomplete multilevel data handle either systematically or sporadically missing data, but frequently both patterns are observed. We describe a new multiple imputation by chained equations (MICE) algorithm for multilevel data with arbitrary patterns of systematically and sporadically missing variables. The algorithm is described for multilevel normal data but can easily be extended for other variable types. We first propose two methods for imputing a single incomplete variable: an extension of an existing method and a new two-stage method which conveniently allows for heteroscedastic data. We then discuss the difficulties of imputing missing values in several variables in multilevel data using MICE, and show that even the simplest joint multilevel model implies conditional models which involve cluster means and heteroscedasticity. However, a simulation study finds that the proposed methods can be successfully combined in a multilevel MICE procedure, even when cluster means are not included in the imputation models.
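The two missingness patterns defined at the start of this abstract can be made concrete with a small Python sketch that labels, for one variable, the clusters in which it is wholly missing (systematic) or only partly missing (sporadic). The data layout, the use of `None` for a missing value, and the function and study names are assumptions for illustration; the imputation algorithm itself is not reproduced here.

```python
def missingness_pattern(values_by_cluster):
    """Classify one variable's missingness per cluster.

    values_by_cluster maps a cluster name to that cluster's observations,
    with None marking a missing value.
    """
    systematic = []
    sporadic = []
    for cluster, values in values_by_cluster.items():
        n_missing = sum(v is None for v in values)
        if values and n_missing == len(values):
            systematic.append(cluster)      # wholly missing in this cluster
        elif n_missing > 0:
            sporadic.append(cluster)        # partly missing in this cluster
    return {"systematically_missing_in": systematic,
            "sporadically_missing_in": sporadic}

# Hypothetical biomarker recorded in three studies of a meta-analysis.
biomarker = {
    "study_A": [1.2, 0.7, 1.9],
    "study_B": [None, None, None],
    "study_C": [0.4, None, 1.1],
}
pattern = missingness_pattern(biomarker)
```

Here the variable is systematically missing in study_B and sporadically missing in study_C, which is the mixed situation the abstract says earlier methods did not handle together.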
Pre-main sequence variables in young cluster Stock 18
NASA Astrophysics Data System (ADS)
Sinha, Tirthendu; Sharma, Saurabh; Pandey, Rakesh; Pandey, Anil Kumar
2018-04-01
We have carried out multi-epoch deep I band photometry of the open cluster Stock 18 to search for variable stars in star forming regions. In the present study, we identified 65 periodic and 217 non-periodic variable stars. The periods of most of the periodic variables are between 2 hours and 15 days, and their brightness varies by 0.05 to 0.6 mag. We have derived spectral energy distributions for 48 probable pre-main sequence variables. Their average age and mass are 2.7 ± 0.3 Myr and 2.7 ± 0.2 M⊙, respectively.
Konno, Satoshi; Taniguchi, Natsuko; Makita, Hironi; Nakamaru, Yuji; Shimizu, Kaoruko; Shijubo, Noriharu; Fuke, Satoshi; Takeyabu, Kimihiro; Oguri, Mitsuru; Kimura, Hirokazu; Maeda, Yukiko; Suzuki, Masaru; Nagai, Katsura; Ito, Yoichi M; Wenzel, Sally E; Nishimura, Masaharu
2015-12-01
Smoking may have multifactorial effects on asthma phenotypes, particularly in severe asthma. Cluster analysis has been applied to explore novel phenotypes, which are not based on any a priori hypotheses. To explore novel severe asthma phenotypes by cluster analysis when including cigarette smokers. We recruited a total of 127 subjects with severe asthma, including 59 current or ex-smokers, from our university hospital and its 29 affiliated hospitals/pulmonary clinics. Twelve clinical variables obtained during a 2-day hospital stay were used for cluster analysis. After clustering using clinical variables, the sputum levels of 14 molecules were measured to biologically characterize the clinical clusters. Five clinical clusters were identified, including two characterized by high pack-year exposure to cigarette smoking and low FEV1/FVC. There were marked differences between the two clusters of cigarette smokers. One had high levels of circulating eosinophils, high IgE levels, and a high sinus disease score. The other was characterized by low levels of the same parameters. Sputum analysis revealed increased levels of IL-5 in the former cluster and increased levels of IL-6 and osteopontin in the latter. The other three clusters were similar to those previously reported: young onset/atopic, nonsmoker/less eosinophilic, and female/obese. Key clinical variables were confirmed to be stable and consistent 1 year later. This study reveals two distinct phenotypes of severe asthma in current and former cigarette smokers with potentially different biological pathways contributing to fixed airflow limitation. Clinical trial registered with www.umin.ac.jp (000003254).
Vandenberghe, V; Goethals, P L M; Van Griensven, A; Meirlaen, J; De Pauw, N; Vanrolleghem, P; Bauwens, W
2005-09-01
During the summer of 1999, two automated water quality measurement stations were installed along the Dender river in Belgium. The variables dissolved oxygen, temperature, conductivity, pH, rain intensity, flow and solar radiation were measured continuously. In this paper, these on-line measurement series are presented and interpreted with the help of additional measurements and ecological expert knowledge. The purpose was to demonstrate the variability in time and space of the aquatic processes and the consequences of conducting and interpreting discrete measurements for river quality assessment and management. The large fluctuations of the data illustrated the importance of continuous measurements for the complete description and modelling of the biological processes in the river.
Using LUCAS topsoil database to estimate soil organic carbon content in local spectral libraries
NASA Astrophysics Data System (ADS)
Castaldi, Fabio; van Wesemael, Bas; Chabrillat, Sabine; Chartin, Caroline
2017-04-01
The quantification of the soil organic carbon (SOC) content over large areas is mandatory to obtain accurate soil characterization and classification, which can improve site specific management at local or regional scale exploiting the strong relationship between SOC and crop growth. The estimation of the SOC is not only important for agricultural purposes: in recent years, the increasing attention towards global warming highlighted the crucial role of the soil in the global carbon cycle. In this context, soil spectroscopy is a well consolidated and widespread method to estimate soil variables exploiting the interaction between chromophores and electromagnetic radiation. The importance of spectroscopy in soil science is reflected by the increasing number of large soil spectral libraries collected in the world. These large libraries contain soil samples derived from a consistent number of pedological regions and thus from different parent material and soil types; this heterogeneity entails, in turn, a large variability in terms of mineralogical and organic composition. In the light of the huge variability of the spectral responses to SOC content and composition, a rigorous classification process is necessary to subset large spectral libraries and to avoid the calibration of global models failing to predict local variation in SOC content. In this regard, this study proposes a method to subset the European LUCAS topsoil database into soil classes using a clustering analysis based on a large number of soil properties. The LUCAS database was chosen to apply a standardized multivariate calibration approach valid for large areas without the need for extensive field and laboratory work for calibration of local models. 
Seven soil classes were detected by the clustering analyses, and the samples belonging to each class were used to calibrate specific partial least squares regression (PLSR) models to estimate the SOC content of three local libraries collected in Belgium (Loam belt and Wallonia) and Luxembourg. The three local libraries consist only of spectral data (199 samples) acquired using the same protocol as the one used for the LUCAS database. SOC was estimated with good accuracy both within each local library (RMSE: 1.2–5.4 g kg⁻¹; RPD: 1.41–2.06) and for the samples of the three libraries together (RMSE: 3.9 g kg⁻¹; RPD: 2.47). The proposed approach could allow SOC to be estimated anywhere in Europe by collecting only spectra, without the need for chemical laboratory analyses, exploiting the potential of the LUCAS database and specific PLSR models.
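The two accuracy figures quoted in this abstract can be illustrated with a short sketch. It assumes the commonly used definitions: RMSE is the square root of the mean squared prediction error, and RPD is the standard deviation of the observed values divided by the RMSE. The abstract does not state which standard deviation (sample or population) the authors used, so the sample form below is an assumption, and the SOC values are hypothetical numbers chosen only for illustration.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD(observed) / RMSE.

    Assumption: the sample standard deviation is used; some authors
    use the population form instead.
    """
    return statistics.stdev(observed) / rmse(observed, predicted)


# Hypothetical SOC values in g/kg, not taken from the study.
observed_soc = [10, 12, 14, 16, 18]
predicted_soc = [11, 11, 15, 15, 19]

print("RMSE:", rmse(observed_soc, predicted_soc))
print("RPD:", rpd(observed_soc, predicted_soc))
```

A larger RPD means the prediction error is small relative to the natural spread of the data, which is why the pooled libraries can show a higher RPD than any single library even when the RMSE is similar.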
Ning, P; Guo, Y F; Sun, T Y; Zhang, H S; Chai, D; Li, X M
2016-09-01
To study the distinct clinical phenotypes of chronic airway diseases by hierarchical cluster analysis and two-step cluster analysis. A population sample of adult patients in Donghuamen community, Dongcheng district and Qinghe community, Haidian district, Beijing from April 2012 to January 2015, who had wheeze within the last 12 months, underwent detailed investigation, including a clinical questionnaire, pulmonary function tests, total serum IgE levels, blood eosinophil level and a peak flow diary. Nine variables were chosen as evaluating parameters, including pre-salbutamol forced expiratory volume in one second (FEV1)/forced vital capacity (FVC) ratio, pre-salbutamol FEV1, percentage of post-salbutamol change in FEV1, residual capacity, diffusing capacity of the lung for carbon monoxide/alveolar volume adjusted for haemoglobin level, peak expiratory flow (PEF) variability, serum IgE level, cumulative tobacco cigarette consumption (pack-years) and respiratory symptoms (cough and expectoration). Subjects' clinical phenotypes were identified by hierarchical cluster analysis and two-step cluster analysis. (1) Four clusters were identified by hierarchical cluster analysis. Cluster 1 was chronic bronchitis in smokers with normal pulmonary function. Cluster 2 was chronic bronchitis or mild chronic obstructive pulmonary disease (COPD) patients with mild airflow limitation. Cluster 3 included COPD patients with heavy smoking, poor quality of life and severe airflow limitation. Cluster 4 recognized atopic patients with mild airflow limitation, elevated serum IgE and clinical features of asthma. 
Significant differences were revealed regarding pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, maximal mid-expiratory flow (MMEF)% pred, carbon monoxide diffusing capacity per liter of alveolar volume (DLCO/VA)% pred, residual volume (RV)% pred, total serum IgE level, smoking history (pack-years), St. George's respiratory questionnaire (SGRQ) score, acute exacerbation in the past one year, PEF variability and allergic dermatitis (P<0.05). (2) Four clusters were also identified by two-step cluster analysis, as follows: cluster 1, COPD patients with moderate to severe airflow limitation; cluster 2, asthma and COPD patients with heavy smoking, airflow limitation and increased airways reversibility; cluster 3, patients with less smoking and normal pulmonary function with wheezing but no chronic cough; cluster 4, chronic bronchitis patients with normal pulmonary function and chronic cough. Significant differences were revealed regarding gender distribution, respiratory symptoms, pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, MMEF% pred, DLCO/VA% pred, RV% pred, PEF variability, total serum IgE level, cumulative tobacco cigarette consumption (pack-years), and SGRQ score (P<0.05). Distinct clinical phenotypes of chronic airway diseases are identified by the different cluster analyses. Thus, doctors may be guided to provide individualized treatments based on the different phenotypes.
Miller, Janis M; Guo, Ying; Rodseth, Sarah Becker
2011-01-01
Background Data that incorporate the full complexity of healthy beverage intake and voiding frequency do not exist; therefore, clinicians reviewing bladder habits or voiding diaries for continence care must rely on expert opinion recommendations. Objective To use data-driven cluster analyses to reduce complex voiding diary variables into discrete patterns or data cluster profiles, descriptively name the clusters, and perform validity testing. Method Participants were 352 community women who filled out a 3-day voiding diary. Six variables (void frequency during daytime hours, void frequency during nighttime hours, modal output, total output, total intake, and body mass index) were entered into cluster analyses. The clusters were analyzed for differences by continence status, age, race (Black women, n = 196; White women, n = 156), and for those who were incontinent, by leakage episode severity. Results Three clusters emerged, labeled descriptively as Conventional, Benchmark, and Superplus. The Conventional cluster (68% of the sample) demonstrated mean daily intake of 45 ± 13 ounces, mean daily output of 37 ± 15 ounces, mean daily voids 5 ± 2 times, mean modal daytime output 10 ± 0.5 ounces, and mean nighttime voids 1 ± 1 times. The Superplus cluster (7% of the sample) showed double or triple these values across the 5 variables, and the Benchmark cluster (25%) showed values consistent with current popular recommendations on intake and output (e.g., meeting or exceeding the 8 × 8 fluid intake rule of thumb). The clusters differed significantly (p < .05) by age, race, amount of irritating beverages consumed, and incontinence status. Discussion Identification of three discrete clusters provides a potentially parsimonious but data-driven means of classifying individuals for additional epidemiological or clinical study. The clinical utility rests with the potential for intervening to move an individual from a high risk to a low risk cluster with regard to incontinence. PMID:21317828
Sun, Keping; Kimball, Rebecca T.; Liu, Tong; Wei, Xuewen; Jin, Longru; Jiang, Tinglei; Lin, Aiqing; Feng, Jiang
2016-01-01
Palaeoclimatic oscillations and different landscapes frequently result in complex population-level structure or the evolution of cryptic species. Elucidating the potential mechanisms is vital to understanding speciation events. However, such complex evolutionary patterns have rarely been reported in bats. In China, the Rhinolophus macrotis complex contains a large form and a small form, suggesting the existence of a cryptic bat species. Our field surveys found these two sibling species have a continuous and widespread distribution with partial sympatry. However, their evolutionary history has received little attention. Here, we used extensive sampling, morphological and acoustic data, as well as different genetic markers to investigate their evolutionary history. Genetic analyses revealed discordance between the mitochondrial and nuclear data. Mitochondrial data identified three reciprocally monophyletic lineages: one representing all small forms from Southwest China, and the other two containing all large forms from Central and Southeast China, respectively. The large form showed paraphyly with respect to the small form. However, clustering analyses of microsatellite and Chd1 gene sequences support two divergent clusters separating the large form and the small form. Moreover, morphological and acoustic analyses were consistent with nuclear data. This unusual pattern in the R. macrotis complex might be accounted for by palaeoclimatic oscillations, shared ancestral polymorphism and/or interspecific hybridization. PMID:27748429
Cluster-enriched Yang-Baxter equation from SUSY gauge theories
NASA Astrophysics Data System (ADS)
Yamazaki, Masahito
2018-04-01
We propose a new generalization of the Yang-Baxter equation, where the R-matrix depends on cluster y-variables in addition to the spectral parameters. We point out that we can construct solutions to this new equation from the recently found correspondence between Yang-Baxter equations and supersymmetric gauge theories. The S^2 partition function of a certain 2d N=(2,2) quiver gauge theory gives an R-matrix, whereas its FI parameters can be identified with the cluster y-variables.
Phenotypes determined by cluster analysis in severe or difficult-to-treat asthma.
Schatz, Michael; Hsu, Jin-Wen Y; Zeiger, Robert S; Chen, Wansu; Dorenbaum, Alejandro; Chipps, Bradley E; Haselkorn, Tmirah
2014-06-01
Asthma phenotyping can facilitate understanding of disease pathogenesis and potential targeted therapies. To further characterize the distinguishing features of phenotypic groups in difficult-to-treat asthma. Children ages 6-11 years (n = 518) and adolescents and adults ages ≥12 years (n = 3612) with severe or difficult-to-treat asthma from The Epidemiology and Natural History of Asthma: Outcomes and Treatment Regimens (TENOR) study were evaluated in this post hoc cluster analysis. Analyzed variables included sex, race, atopy, age of asthma onset, smoking (adolescents and adults), passive smoke exposure (children), obesity, and aspirin sensitivity. Cluster analysis used the hierarchical clustering algorithm with the Ward minimum variance method. The results were compared among clusters by χ² analysis; variables with significant (P < .05) differences among clusters were considered as distinguishing feature candidates. Associations among clusters and asthma-related health outcomes were assessed in multivariable analyses by adjusting for socioeconomic status, environmental exposures, and intensity of therapy. Five clusters were identified in each age stratum. Sex, atopic status, and nonwhite race were distinguishing variables in both strata; passive smoke exposure was distinguishing in children and aspirin sensitivity in adolescents and adults. Clusters were not related to outcomes in children, but 2 adult and adolescent clusters distinguished by nonwhite race and aspirin sensitivity manifested poorer quality of life (P < .0001), and the aspirin-sensitive cluster experienced more frequent asthma exacerbations (P < .0001). Distinct phenotypes appear to exist in patients with severe or difficult-to-treat asthma, which is related to outcomes in adolescents and adults but not in children. The study of the therapeutic implications of these phenotypes is warranted. Copyright © 2013 American Academy of Allergy, Asthma & Immunology. Published by Mosby, Inc. 
All rights reserved.
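The Ward minimum variance method named in this abstract can be sketched in a few lines. At each step it merges the two clusters whose union gives the smallest increase in within-cluster sum of squares, which for clusters A and B equals |A||B|/(|A|+|B|) times the squared distance between their centroids. This is a naive illustration on made-up numeric points, not the study's pipeline: the TENOR variables are largely categorical and would need encoding first, and the function names are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance


def ward_clusters(points, k):
    """Agglomerate points into k clusters using Ward's criterion (naive)."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        # Pick the pair (i < j) whose merge is cheapest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical two-feature data with two obvious groups.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for cluster in ward_clusters(data, 2):
    print(sorted(cluster))
```

The pairwise search recomputes every merge cost at each step, so it is only suitable for small inputs; library implementations typically update the costs incrementally instead.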
Isolation of NotI sites from chromosome 22q11
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ten Hoeve, J.; Groffen, J.; Heisterkamp, N.
1993-12-01
Chromosome 22q11 contains a large number of interesting loci, including genes associated with cancer and developmental defects. The region is also the site of the lambda immunoglobulin variable and constant regions and the BCR, γ-glutamyl transpeptidase, and GGT-like activity multigene families. Because of the complexities associated with mapping highly related gene families, the authors have examined the utility of mapping large areas of DNA using a defined approach. A total of 21 complete NotI sites from band q11 were cloned and ordered into six noncontiguous clusters of sites using a combination of somatic cell hybrid panels, NotI jumping and linking libraries, and fluorescence in situ hybridization. The largest cluster spanned an estimated 2 Mb of NotI fragments, the smallest 115 kb. Approximately 3.5 Mb of band q11 could be examined for rearrangements in NotI restriction enzyme fragments. A number of conserved sequences, two genes, and a minimum of two families of related sequences were identified adjacent to NotI sites. 51 refs., 5 figs., 4 tabs.
NASA Astrophysics Data System (ADS)
Leon, Stéphane; Bergond, Gilles; Vallenari, Antonella
1999-04-01
We present the tidal tail distributions of a sample of candidate binary clusters located in the bar of the Large Magellanic Cloud (LMC). One isolated cluster, SL 268, is presented in order to study the effect of the LMC tidal field. All the candidate binary clusters show tidal tails, confirming that the pairs are formed by physically linked objects. The stellar mass in the tails covers a large range, from 1.8 × 10³ to 3 × 10⁴ M☉. We derive a total mass estimate for SL 268 and SL 356. At large radii, the projected density profiles of SL 268 and SL 356 fall off as r^(-γ), with γ = 2.27 and γ = 3.44, respectively. Out of 4 pairs or multiple systems, 2 are older than the theoretical survival time of binary clusters (ranging from a few 10⁶ years to 10⁸ years). One pair shows too large an age difference between the components to be consistent with classical theoretical models of binary cluster formation (Fujimoto & Kumai 1997). We refer to this as the "overmerging" problem. A different scenario is proposed: the formation proceeds in large molecular complexes giving birth to groups of clusters over a few 10⁷ years. In these groups the expected cluster encounter rate is larger, and tidal capture has a higher probability. Cluster pairs are not born together through the splitting of the parent cloud, but are formed later by tidal capture. For 3 pairs, we tentatively identify the star cluster group (SCG) memberships. The SCG formation, through the recent cluster starburst triggered by the LMC-SMC encounter, in contrast with the quiescent open cluster formation in the Milky Way, may explain the paucity of binary clusters observed in our Galaxy. Based on observations collected at the European Southern Observatory, La Silla, Chile.
Does tidal capture produce cataclysmic variables?
NASA Technical Reports Server (NTRS)
Bailyn, Charles D.; Grindlay, Jonathan E.; Garcia, Michael R.
1990-01-01
It is shown that earlier estimates of the number of cataclysmic variables (CVs) to be expected from tidal capture in globular clusters may have been considerably too high, since many such binaries will result in unstable mass transfer, and thus not become CVs after all. In particular, CVs with white dwarf masses less than or about 1.0 solar mass will be suppressed. Such unstable mass transfer events may produce some of the cluster mass loss required to stabilize the cluster core. The smaller number of stable CVs predicted may suggest a reconsideration of the nature of some of the low-luminosity cluster X-ray sources.
The Clusters AgeS Experiment (CASE). Variable Stars in the Field of the Globular Cluster M12
NASA Astrophysics Data System (ADS)
Kaluzny, J.; Thompson, I. B.; Narloch, W.; Pych, W.; Rozyczka, M.
2015-09-01
The field of the globular cluster M12 (NGC 6218) was monitored between 1995 and 2009 in a search for variable stars. BV light curves were obtained for thirty-six periodic or likely periodic variable stars. Thirty-four of these are new detections. Among the latter we identified twenty proper-motion members of the cluster: six detached or semi-detached eclipsing binaries, five contact binaries, five SX Phe pulsators, and three yellow stragglers. Two of the eclipsing binaries are located in the turnoff region, one on the lower main sequence and the remaining three among the blue stragglers. Two contact systems are blue stragglers, and the remaining three reside in the turnoff region. In the blue straggler region a total of 103 objects were found, of which 42 are proper motion members of M12, and another four are field stars. 55 of the remaining objects are located within two core radii from the center of the cluster, and as such they are likely genuine blue stragglers. We also report the discoveries of a radial color gradient of M12, and the shortest period among contact systems in globular clusters in general.
Feng, Tao; Wang, Chao; Wang, Peifang; Qian, Jin; Wang, Xun
2018-09-01
Cyanobacterial blooms have emerged as one of the most severe ecological problems affecting large and shallow freshwater lakes. To improve our understanding of the factors that influence, and could be used to predict, surface blooms, this study developed a novel Euler-Lagrangian coupled approach combining the Eulerian model with agent-based modelling (ABM). The approach was subsequently verified based on monitoring datasets and MODIS data in a large shallow lake (Lake Taihu, China). The Eulerian model solves the Eulerian variables and physiological parameters, whereas ABM generates the complete life cycle and transport processes of cyanobacterial colonies. This model ensemble performed well in fitting historical data and predicting the dynamics of cyanobacterial biomass, bloom distribution, and area. Based on the calculated physical and physiological characteristics of surface blooms, principal component analysis (PCA) captured the major processes influencing surface bloom formation at different stages (two bloom clusters). Early bloom outbreaks were influenced by physical processes (horizontal transport and vertical turbulence-induced mixing), whereas buoyancy-controlling strategies were essential for mature bloom outbreaks. Canonical correlation analysis (CCA) revealed the combined actions of multiple environment variables on different bloom clusters. The effects of buoyancy-controlling strategies (ISP), vertical turbulence-induced mixing velocity of colony (VMT) and horizontal drift velocity of colony (HDT) were quantitatively compared using scenario simulations in the coupled model. VMT accounted for 52.9% of bloom formations and maintained blooms over long periods, thus demonstrating the importance of wind-induced turbulence in shallow lakes. In comparison, HDT and buoyancy controlling strategies influenced blooms at different stages. 
In conclusion, the approach developed here presents a promising tool for understanding the processes of onshore/offshore algal bloom formation and for their subsequent prediction. Copyright © 2018 Elsevier Ltd. All rights reserved.
Distance correlation methods for discovering associations in large astrophysical databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martínez-Gómez, Elizabeth; Richards, Mercedes T.; Richards, Donald St. P., E-mail: elizabeth.martinez@itam.mx, E-mail: mrichards@astro.psu.edu, E-mail: richards@stat.psu.edu
2014-01-20
High-dimensional, large-sample astrophysical databases of galaxy clusters, such as the Chandra Deep Field South COMBO-17 database, provide measurements on many variables for thousands of galaxies and a range of redshifts. Current understanding of galaxy formation and evolution rests sensitively on relationships between different astrophysical variables; hence an ability to detect and verify associations or correlations between variables is important in astrophysical research. In this paper, we apply a recently defined statistical measure called the distance correlation coefficient, which can be used to identify new associations and correlations between astrophysical variables. The distance correlation coefficient applies to variables of any dimension, can be used to determine smaller sets of variables that provide equivalent astrophysical information, is zero only when variables are independent, and is capable of detecting nonlinear associations that are undetectable by the classical Pearson correlation coefficient. Hence, the distance correlation coefficient provides more information than the Pearson coefficient. We analyze numerous pairs of variables in the COMBO-17 database with the distance correlation method and with the maximal information coefficient. We show that the Pearson coefficient can be estimated with higher accuracy from the corresponding distance correlation coefficient than from the maximal information coefficient. For given values of the Pearson coefficient, the distance correlation method has a greater ability than the maximal information coefficient to resolve astrophysical data into highly concentrated horseshoe- or V-shapes, which enhances classification and pattern identification. These results are observed over a range of redshifts beyond the local universe and for galaxies from elliptical to spiral.
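A small sketch may help show why the distance correlation coefficient can detect nonlinear associations that the Pearson coefficient misses. The code below implements the standard sample estimator (double-centered pairwise distance matrices) for one-dimensional variables only; it is not the authors' code, the function names are hypothetical, and the data are a toy parabola rather than COMBO-17 measurements.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    # The matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D sequences."""
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(a[i][j] * b[i][j] for i, j in pairs) / n ** 2
    dvar2_x = sum(a[i][j] ** 2 for i, j in pairs) / n ** 2
    dvar2_y = sum(b[i][j] ** 2 for i, j in pairs) / n ** 2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0  # at least one variable is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


# Toy data: y depends on x exactly, yet the Pearson coefficient is zero
# because the relationship is symmetric about x = 0.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
print("distance correlation:", distance_correlation(x, y))
```

The estimator needs all pairwise distances, so its cost grows quadratically with sample size; for thousands of galaxies this is still manageable, but the pure-Python loops here are meant for clarity rather than speed.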
Pineda, David A.; Lopera, Francisco; Puerta, Isabel C.; Trujillo-Orrego, Natalia; Aguirre-Acevedo, Daniel C.; Hincapié-Henao, Liliana; Arango, Clara P.; Acosta, Maria T.; Holzinger, Sandra I.; Palacio, Juan David; Pineda-Alvarez, Daniel E.; Velez, Jorge I.; Martinez, Ariel F.; Lewis, John E.
2014-01-01
Endophenotypes are neurobiological markers cosegregating and associated with illness. These biomarkers represent a promising strategy to dissect ADHD biological causes. This study was aimed at contrasting the genetics of neuropsychological tasks for intelligence, attention, memory, visual-motor skills, and executive function in children from multigenerational and extended pedigrees that cluster ADHD in a genetic isolate. In a sample of 288 children and adolescents, 194 (67.4%) ADHD affected and 94 (32.6%) unaffected, a battery of neuropsychological tests was utilized to assess the association between genetic transmission and the ADHD phenotype. We found significant differences between affected and unaffected children in the WISC block design, PIQ and FSIQ, continuous vigilance, and visual-motor skills, and these variables exhibited a significant heritability. Given the association between these neuropsychological variables and ADHD, and also the high genetic component underlying their transmission in the studied pedigrees, we suggest that these variables be considered as potential cognitive endophenotypes suitable as quantitative trait loci (QTLs) in future studies of linkage and association. PMID:21779842
Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V
2003-01-01
In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly used idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments, which has been successfully applied to large-size search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running time with fully comparable alignment quality. A successful biological application showing high amino acid conservation during evolution of Xenopus laevis SOD2 is also cited.
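The contrast between the exact quadratic-time 1-median and a randomized-tournament approximation can be sketched as follows. This is only an illustration of the general idea described in the abstract: the tournament shown here (shuffle, split into small groups, keep each group's exact 1-median, repeat) is an assumed simplification, not the AntiClusAl implementation, and Hamming distance on equal-length strings stands in for whatever sequence distance the authors use.

```python
import random


def hamming(s, t):
    """Number of mismatching positions between two equal-length strings."""
    return sum(a != b for a, b in zip(s, t))


def exact_one_median(items, dist):
    """Element minimizing the summed distance to all others (quadratic time)."""
    return min(items, key=lambda s: sum(dist(s, t) for t in items))


def tournament_one_median(items, dist, group_size=3, rng=None):
    """Approximate 1-median via rounds of small local tournaments.

    Each round costs roughly len(pool) * group_size distance evaluations
    and shrinks the pool by a factor of group_size, so the total work is
    about linear in the input size for a fixed group_size.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    pool = list(items)
    while len(pool) > group_size:
        rng.shuffle(pool)
        pool = [
            exact_one_median(pool[i:i + group_size], dist)
            for i in range(0, len(pool), group_size)
        ]
    return exact_one_median(pool, dist)


# Hypothetical toy sequences.
sequences = ["AAAA", "AAAT", "AATT", "TTTT"]
print("exact:", exact_one_median(sequences, hamming))
print("tournament:", tournament_one_median(sequences * 5, hamming))
```

The tournament winner is not guaranteed to equal the exact 1-median; the trade-off is a possibly slightly worse center in exchange for avoiding the full pairwise distance computation.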
Embedded cluster metal-polymeric micro interface and process for producing the same
Menezes, Marlon E.; Birnbaum, Howard K.; Robertson, Ian M.
2002-01-29
A micro interface between a polymeric layer and a metal layer includes isolated clusters of metal partially embedded in the polymeric layer. The exposed portion of the clusters is smaller than embedded portions, so that a cross section, taken parallel to the interface, of an exposed portion of an individual cluster is smaller than a cross section, taken parallel to the interface, of an embedded portion of the individual cluster. At least half, but not all of the height of a preferred spherical cluster is embedded. The metal layer is completed by a continuous layer of metal bonded to the exposed portions of the discontinuous clusters. The micro interface is formed by heating a polymeric layer to a temperature, near its glass transition temperature, sufficient to allow penetration of the layer by metal clusters, after isolated clusters have been deposited on the layer at lower temperatures. The layer is recooled after embedding, and a continuous metal layer is deposited upon the polymeric layer to bond with the discontinuous metal clusters.
Dauster, Ingo; Suhm, Martin A; Buck, Udo; Zeuch, Thomas
2008-01-07
Methanol clusters are generated in a continuous He-seeded supersonic expansion and doped with sodium atoms in a pick-up cell. By this method, clusters of the type Na(CH3OH)n are formed and subsequently photoionized by applying a tunable dye-laser system. The microsolvation process of the Na 3s electron is studied by determining the ionization potentials (IPs) of these clusters size-selectively for n = 2-40. A decrease is found from n = 2 to 6 and a constant value of 3.19 ± 0.07 eV for n = 6-40. The experimentally determined ionization potentials are compared with ionization potentials derived from quantum-chemical calculations, assuming limiting vertical and adiabatic processes. In the first case, energy differences are calculated between the neutral and the ionized cationic clusters of the same geometry. In the second case, the ionized clusters are used in their optimized relaxed geometry. These energy differences and relative stabilities of isomeric clusters vary significantly with the applied quantum-chemical method (B3LYP or MP2). The comparison with the experiment for n = 2-7 reveals strong variations of the ionization potential with the cluster structure indicating that structural diversity and non-vertical pathways give significant signal contributions at the threshold. Based on these findings, a possible explanation for the remarkable difference in IP evolutions of methanol or water and ammonia is presented: for methanol and water a rather localized surface or semi-internal Na 3s electron is excited to either high Rydberg or more localized states below the vertical ionization threshold. This excitation is followed by a local structural relaxation that couples to an autoionization process. For small clusters with n < 6 for methanol and n < 4 for water the addition of solvent molecules leads to larger solvent-metal-ion interaction energies, which consequently lead to lower ionization thresholds. 
For n = 6 (methanol) and n = 4 (water) this effect comes to a halt, which may be connected with the completion of the first cationic solvation shell limiting the release of local relaxation energy. For Na(NH3)n, a largely delocalized and internal electron is excited to autoionizing electronic states, a process that is no longer local and consequently may depend on cluster size up to very large n.
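The two limiting cases compared in this abstract can be written compactly. The notation is an assumption introduced here for illustration: E_0 and E_+ are the electronic energies of the neutral and cationic cluster, and R_0 and R_+ are their respective optimized geometries; zero-point corrections are ignored.

```latex
\begin{align}
  \mathrm{IP}_{\mathrm{vert}} &= E_{+}(R_{0}) - E_{0}(R_{0}), \\
  \mathrm{IP}_{\mathrm{ad}}   &= E_{+}(R_{+}) - E_{0}(R_{0}).
\end{align}
```

Because R_+ minimizes E_+, it follows that E_+(R_+) ≤ E_+(R_0) and therefore IP_ad ≤ IP_vert, which is why signal at the ionization threshold can point to non-vertical pathways.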
Dewhirst, Oliver P; Roskilly, Kyle; Hubel, Tatjana Y; Jordan, Neil R; Golabek, Krystyna A; McNutt, J Weldon; Wilson, Alan M
2017-02-01
Changes in stride frequency and length with speed are key parameters in animal locomotion research. They are commonly measured in a laboratory on a treadmill or by filming trained captive animals. Here, we show that a clustering approach can be used to extract these variables from data collected by a tracking collar containing a GPS module and tri-axis accelerometers and gyroscopes. The method enables stride parameters to be measured during free-ranging locomotion in natural habitats. As it does not require labelled data, it is particularly suitable for use with difficult to observe animals. The method was tested on large data sets collected from collars on free-ranging lions and African wild dogs and validated using a domestic dog. © 2017. Published by The Company of Biologists Ltd.
Spatial and kinematic structure of Monoceros star-forming region
NASA Astrophysics Data System (ADS)
Costado, M. T.; Alfaro, E. J.
2018-05-01
The principal aim of this work is to study the velocity field in the Monoceros star-forming region using the radial velocity data available in the literature, as well as astrometric data from the Gaia first release. This region is a large star-forming complex formed by two associations named Monoceros OB1 and OB2. We have collected radial velocity data for more than 400 stars in an area of 8 × 12 deg² and distances for more than 200 objects. We apply a clustering analysis in the subspace of the phase space formed by angular coordinates and radial velocity or distance data using the Spectrum of Kinematic Grouping methodology. We found four spatial groupings in radial velocity and three in distance, corresponding to the Local arm, the central clusters forming the associations, and the Perseus arm.
Optimal Cluster Mill Pass Scheduling With an Accurate and Rapid New Strip Crown Model
NASA Astrophysics Data System (ADS)
Malik, Arif S.; Grandhi, Ramana V.; Zipf, Mark E.
2007-05-01
Besides the requirement to roll coiled sheet at high levels of productivity, the optimal pass scheduling of cluster-type reversing cold mills presents the added challenge of assigning mill parameters that facilitate the best possible strip flatness. The pressures of intense global competition, and the requirements for increasingly thinner, higher quality specialty sheet products that are more difficult to roll, continue to force metal producers to commission innovative flatness-control technologies. This means that during the on-line computerized set-up of rolling mills, the mathematical model should not only determine the minimum total number of passes and maximum rolling speed, it should simultaneously optimize the pass-schedule so that desired flatness is assured, either by manual or automated means. In many cases today, however, on-line prediction of strip crown and corresponding flatness for the complex cluster-type rolling mills is typically addressed either by trial and error, by approximate deflection models for equivalent vertical roll-stacks, or by non-physical pattern recognition style models. The abundance of the aforementioned methods is largely due to the complexity of cluster-type mill configurations and the lack of deflection models with sufficient accuracy and speed for on-line use. Without adequate assignment of the pass-schedule set-up parameters, it may be difficult or impossible to achieve the required strip flatness. In this paper, we demonstrate optimization of cluster mill pass-schedules using a new accurate and rapid strip crown model. This pass-schedule optimization includes computations of the predicted strip thickness profile to validate mathematical constraints. In contrast to many of the existing methods for on-line prediction of strip crown and flatness on cluster mills, the demonstrated method requires minimal prior tuning and no extensive training with collected mill data. 
To rapidly and accurately solve the multi-contact problem and predict the strip crown, a new customized semi-analytical modeling technique that couples the Finite Element Method (FEM) with classical solid mechanics was developed to model the deflection of the rolls and strip while under load. The technique employed offers several important advantages over traditional methods to calculate strip crown, including continuity of elastic foundations, non-iterative solution when using predetermined foundation moduli, continuous third-order displacement fields, simple stress-field determination, and a comparatively faster solution time.
Zhang, Han; Rokas, Antonis; Slot, Jason C.
2012-01-01
Background: Dermatophyte fungi of the family Arthrodermataceae (Eurotiomycetes) colonize keratinized tissue, such as skin, frequently causing superficial mycoses in humans and other mammals, reptiles, and birds. Competition with native microflora likely underlies the propensity of these dermatophytes to produce a diversity of antibiotics and compounds for scavenging iron, which is extremely scarce, as well as the presence of an unusually large number of putative secondary metabolism gene clusters, most of which contain non-ribosomal peptide synthetases (NRPS), in their genomes. To better understand the historical origins and diversification of NRPS-containing gene clusters we examined the evolution of a variable locus (VL) that exists in one of three alternative conformations among the genomes of seven dermatophyte species. Results: The first conformation of the VL (termed VLA) contains only 539 base pairs of sequence and lacks protein-coding genes, whereas the other two conformations (termed VLB and VLC) span 36 Kb and 27 Kb and contain 12 and 10 genes, respectively. Interestingly, both VLB and VLC appear to contain distinct secondary metabolism gene clusters; VLB contains an NRPS gene as well as four porphyrin metabolism genes never found to be physically linked in the genomes of 128 other fungal species, whereas VLC also contains an NRPS gene as well as several others typically found associated with secondary metabolism gene clusters. Phylogenetic evidence suggests that the VL was present in the ancestor of all seven species, achieving its present distribution through subsequent differential losses or retentions of specific conformations. Conclusions: We propose that the existence of variable loci, similar to the one we studied, in fungal genomes could potentially explain the dramatic differences in secondary metabolic diversity between closely related species of filamentous fungi, and contribute to host adaptation and the generation of metabolic diversity.
PMID:22860027
An exploratory survey of eating behaviour patterns in adolescent students.
Arata, A; Battini, V; Chiorri, C; Masini, B
2010-12-01
Empirical research has always treated adolescents' eating habits from a variable-centered perspective, but this approach may miss the configurations of eating behaviours that uniquely describe discrete groups of individuals. The aim of this study was to investigate prototypical patterns of eating habits in a large sample of Italian adolescents and their behavioural and psychological correlates. Data were gathered from 1388 students (F=60%, mean age 14.90±1.34 yrs), who were asked to fill in an original questionnaire surveying dietary habits, body weight attitudes, body image, sport activities and sources of information about food. Perfectionism, self-esteem, self-efficacy and care for food were also assessed as well-known psychological risk factors for Eating Disorders. Five prototypical eating behaviour patterns were identified through cluster analysis. Cluster membership was associated (p<0.05) with gender, age, age- and gender-corrected BMI percentile, perceived relevance of physical appearance in achieving success in life, evaluation of one's weight and body image, dieting, physical activity, self-efficacy, self-esteem and care for food. Clusters did not differ in perfectionism score or in frequency of consulting different sources of information about food and weight, except in the case of dieticians. The identification of prototypical eating habit patterns revealed a large range of wrong eating attitudes and behaviours among Italian adolescents. Such data suggest the need to develop and implement adequate prevention programs.
Fraiman, Daniel; Chialvo, Dante R.
2012-01-01
The study of spontaneous fluctuations of brain activity, often referred to as brain noise, is receiving increasing attention in functional magnetic resonance imaging (fMRI) studies. Despite important efforts, many of the statistical properties of such fluctuations remain largely unknown. This work scrutinizes these fluctuations, looking at specific statistical properties that are relevant to clarifying their dynamical origins. Here, three statistical features that clearly differentiate brain data from naive expectations for random processes are uncovered: First, the variance of the fMRI mean signal as a function of the number of averaged voxels remains constant across a wide range of observed cluster sizes. Second, the anomalous behavior of the variance originates from bursts of synchronized activity across regions, regardless of their widely different sizes. Finally, the correlation length (i.e., the length at which the correlation strength between two regions vanishes) as well as the mutual information diverges with the size of the cluster considered, such that arbitrarily large clusters exhibit the same collective dynamics as smaller ones. These three properties are known to be exclusive to complex systems exhibiting critical dynamics, where the spatio-temporal dynamics show this peculiar type of fluctuation. Thus, these findings are fully consistent with previous reports of brain critical dynamics, and are relevant for the interpretation of the role of fluctuations and variability in brain function in health and disease. PMID:22934058
DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data.
Sun, Zhe; Wang, Ting; Deng, Ke; Wang, Xiao-Feng; Lafyatis, Robert; Ding, Ying; Hu, Ming; Chen, Wei
2018-01-01
Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifiers (UMIs). Despite these technological advances, statistical methods and computational tools are still lacking for analyzing droplet-based scRNA-Seq data. In particular, model-based approaches for clustering large-scale single cell transcriptomic data are still under-explored. We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that, overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods. DIMM-SC has been implemented in a user-friendly R package with a detailed tutorial available on www.pitt.edu/~wec47/singlecell.html. Contact: wei.chen@chp.edu or hum@ccf.org. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
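The idea of quantifying clustering uncertainty for each cell can be illustrated with a minimal sketch. It assumes each cluster is described by a Dirichlet-multinomial likelihood over a cell's UMI counts, which is one common way to realize a Dirichlet prior over count data. The function names, the toy concentration vectors and the mixing weights below are hypothetical and are not taken from the DIMM-SC R package.

```python
import math


def dirichlet_multinomial_loglik(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial model."""
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient: log n! - sum_j log x_j!
    ll = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalizers after integrating out the proportions
    ll += math.lgamma(a) - math.lgamma(n + a)
    ll += sum(math.lgamma(x + al) - math.lgamma(al) for x, al in zip(counts, alpha))
    return ll


def cluster_posteriors(counts, alphas, weights):
    """Posterior probability that one cell belongs to each cluster (soft assignment)."""
    logs = [
        math.log(w) + dirichlet_multinomial_loglik(counts, al)
        for al, w in zip(alphas, weights)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical example: three genes, two clusters with different expression profiles.
example_alphas = [[10.0, 1.0, 1.0], [1.0, 1.0, 10.0]]
example_weights = [0.5, 0.5]
example_cell = [30, 2, 1]
posterior = cluster_posteriors(example_cell, example_alphas, example_weights)
```

A posterior close to (1, 0) indicates a confident assignment, while values near (0.5, 0.5) flag an ambiguous cell; a hard clustering method such as K-means does not provide this kind of per-cell uncertainty.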
Linear Modeling and Evaluation of Controls on Flow Response in Western Post-Fire Watersheds
NASA Astrophysics Data System (ADS)
Saxe, S.; Hogue, T. S.; Hay, L.
2015-12-01
This research investigates the impact of wildfires on watershed flow regimes throughout the western United States, specifically focusing on evaluation of fire events within specified subregions and determination of the impact of climate and geophysical variables on post-fire flow response. Fire events were collected through federal and state-level databases and streamflow data were collected from U.S. Geological Survey stream gages. A total of 263 watersheds were identified with at least 10 years of continuous pre-fire daily streamflow records and 5 years of continuous post-fire daily flow records. For each watershed, percent changes in runoff ratio (RO), annual seven-day low flows (7Q2) and annual seven-day high flows (7Q10) were calculated from pre- to post-fire. Numerous independent variables were identified for each watershed and fire event, including topographic, land cover, climate, burn severity, and soils data. The national watersheds were divided into five regions through K-clustering, and a lasso linear regression model, applying the Leave-One-Out calibration method, was calculated for each region. Nash-Sutcliffe Efficiency (NSE) was used to determine the accuracy of the resulting models. The regions encompassing the United States along and west of the Rocky Mountains, excluding the coastal watersheds, produced the most accurate linear models. The Pacific coast region models produced poor and inconsistent results, indicating that the regions need to be further subdivided. Presently, the runoff-ratio (RO) and high-flow (HF) response variables appear to be more easily modeled than the low-flow (LF) variable. Results of linear regression modeling showed varying importance of watershed and fire event variables, with conflicting correlation between land cover types and soil types by region. The addition of further independent variables and constriction of current variables based on correlation indicators is ongoing and should allow for more accurate linear regression modeling.
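The two quantities named in this abstract, the pre- to post-fire percent change and the Nash-Sutcliffe Efficiency, follow standard definitions and can be sketched as below. NSE is one minus the ratio of the squared model error to the variance of the observations about their mean, so 1 indicates a perfect fit and 0 indicates a model no better than predicting the observed mean. The function names and the sample numbers are illustrative assumptions, not values from the study.

```python
def percent_change(pre_fire, post_fire):
    """Percent change of a flow metric from its pre-fire to its post-fire value."""
    if pre_fire == 0:
        raise ValueError("pre-fire value must be non-zero")
    return 100.0 * (post_fire - pre_fire) / pre_fire


def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observations have zero variance; NSE is undefined")
    return 1.0 - error / spread


# Hypothetical observed and modeled percent changes for four watersheds.
obs_change = [12.0, 35.0, -4.0, 20.0]
model_change = [10.0, 30.0, 0.0, 22.0]
nse = nash_sutcliffe_efficiency(obs_change, model_change)
```

In a leave-one-out setting, each entry of `model_change` would be the prediction for a watershed that was withheld when the regression was fitted.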
Reproductive pair correlations and the clustering of organisms.
Young, W R; Roberts, A J; Stuhne, G
2001-07-19
Clustering of organisms can be a consequence of social behaviour, or of the response of individuals to chemical and physical cues. Environmental variability can also cause clustering: for example, marine turbulence transports plankton and produces chlorophyll concentration patterns in the upper ocean. Even in a homogeneous environment, nonlinear interactions between species can result in spontaneous pattern formation. Here we show that a population of independent, random-walking organisms ('brownian bugs'), reproducing by binary division and dying at constant rates, spontaneously aggregates. Using an individual-based model, we show that clusters form out of spatially homogeneous initial conditions without environmental variability, predator-prey interactions, kinesis or taxis. The clustering mechanism is reproductively driven: birth must always be adjacent to a living organism. This clustering can overwhelm diffusion and create non-poissonian correlations between pairs (parent and offspring) of organisms, leading to the emergence of patterns.
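The individual-based model described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: time is discretized, each organism dies or divides with fixed per-step probabilities, offspring start at the parent's position, and every survivor takes a Gaussian random-walk step on a periodic unit square. All names and parameter values are hypothetical.

```python
import random


def step(bugs, rng, birth_prob, death_prob, sigma):
    """Advance the population one time step: death, binary division, then diffusion."""
    next_generation = []
    for x, y in bugs:
        u = rng.random()
        if u < death_prob:
            continue  # organism dies
        # Binary division places the offspring at the parent's position.
        copies = 2 if u < death_prob + birth_prob else 1
        for _ in range(copies):
            # Independent random-walk step, wrapped onto the periodic unit square.
            nx = (x + rng.gauss(0.0, sigma)) % 1.0
            ny = (y + rng.gauss(0.0, sigma)) % 1.0
            next_generation.append((nx, ny))
    return next_generation


def simulate(n_bugs, n_steps, birth_prob, death_prob, sigma, seed=0):
    """Start from a spatially homogeneous population and iterate the step rule."""
    rng = random.Random(seed)
    bugs = [(rng.random(), rng.random()) for _ in range(n_bugs)]
    for _ in range(n_steps):
        bugs = step(bugs, rng, birth_prob, death_prob, sigma)
    return bugs


# Equal birth and death probabilities keep the expected population constant.
final_bugs = simulate(n_bugs=500, n_steps=50, birth_prob=0.1, death_prob=0.1, sigma=0.005)
```

Plotting `final_bugs` for small `sigma` is one way to look for the patchiness the abstract describes, since new organisms can only appear next to existing ones while deaths occur anywhere.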
Aboriginal hunting buffers climate-driven fire-size variability in Australia's spinifex grasslands.
Bliege Bird, Rebecca; Codding, Brian F; Kauhanen, Peter G; Bird, Douglas W
2012-06-26
Across diverse ecosystems, greater climatic variability tends to increase wildfire size, particularly in Australia, where alternating wet-dry cycles increase vegetation growth, only to leave a dry overgrown landscape highly susceptible to fire spread. Aboriginal Australian hunting fires have been hypothesized to buffer such variability, mitigating mortality on small-mammal populations, which have suffered declines and extinctions in the arid zone coincident with Aboriginal depopulation. We test the hypothesis that the relationship between climate and fire size is buffered through the maintenance of an anthropogenic, fine-grained fire regime by comparing the effect of climatic variability on landscapes dominated by Martu Aboriginal hunting fires with those dominated by lightning fires. We show that Aboriginal fires are smaller, more tightly clustered, and remain small even when climate variation causes huge fires in the lightning region. As these effects likely benefit threatened small-mammal species, Aboriginal hunters should be considered trophic facilitators, and policies aimed at reducing the risk of large fires should promote land-management strategies consistent with Aboriginal burning regimes.
Fontes, Amanda N. B.; Lima, Luana N. G. C.; Mota, Rosa M. S.; Almeida, Rosa L. F.; Pontes, Maria A.; Gonçalves, Heitor de S.; Frota, Cristiane C.; Vissa, Varalakshmi D.; Brennan, Patrick J.; Guimaraes, Ricardo J. P. S.; Kendall, Carl; Kerr, Ligia R. F. S.; Suffys, Philip N.
2017-01-01
Leprosy is endemic in a large part of Brazil, with 28,761 new patients in 2015, the second largest number worldwide; incidence reaches 9/10,000 in highly endemic regions and 2.7/10,000 in the city of Fortaleza, Ceará, Northeast Brazil. For better understanding of risk factors for leprosy transmission, we conducted an epidemiologic study supplemented by 17 locus VNTR and SNP 1–4 typing of Mycobacterium leprae in skin biopsy samples from new multibacillary (MB) patients diagnosed at a reference center in 2009 and 2010. Among the 1,519 new patients detected during the study period, 998 (65.7%) were MB and we performed DNA extraction and genotyping on 160 skin biopsy samples, resulting in 159 (16%) good multilocus VNTR types. Thirty-eight of these patients also provided VNTR types from M. leprae in nasal swabs. The SNP-Type was obtained for 157 patients and 87% were of type 4. Upon consideration of all VNTR markers, 156 different genotypes and three pairs with identical genotypes were observed; no epidemiologic relation could be observed between individuals in these pairs. Considerable variability in differentiating index (DI) was observed between the different markers and the four with highest DI [(AT)15, (TA)18, (AT)17 and (GAA)21] frequently demonstrated differences in copy number when comparing genotypes from both types of samples. Excluding these markers from analysis resulted in 83 genotypes, 20 of which included 96 of the patients (60.3%). These clusters were composed of two (n = 8), three (n = 6), four (n = 1), five (n = 2), six (n = 1), 19 (n = 1) and 23 (n = 1) individuals and suggest that recent transmission is contributing to the maintenance of leprosy in Fortaleza. When comparing epidemiological and clinical variables among patients within clustered or with unique M. leprae genotypes, a positive bacterial index in skin biopsies and knowledge of working with someone with the disease were significantly associated with clustering. 
A tendency to belong to a cluster was observed with later notification of disease (mean value of 3.4 months) and having disability grade 2. A tendency for lack of clustering was observed for patients who reported having lived with another leprosy case, but this might be due to lack of inclusion of household contacts in the study. Although clusters were spread over the city, kernel analysis revealed that some of the patients belonging to the two major clusters were spatially related to some neighborhoods that report poverty and high disease incidence in children. Finally, inclusion of genotypes from nasal swabs might be warranted. A major limitation of the study is that the sample size of 160 patients from a two-year period represents only 15% of the new patients, and this could have weakened statistical outcomes. This is the first molecular epidemiology study of leprosy in Brazil, and although the high clustering level suggests that recent transmission is the major cause of disease in Fortaleza, the existence of two large clusters needs further investigation. PMID:29244821
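The cluster-size breakdown in this abstract can be checked against its stated totals with a few lines of arithmetic. The sketch below assumes that the largest cluster, of 23 individuals, occurs once; under that reading the breakdown reproduces both reported totals of 20 clustered genotypes and 96 clustered patients. The variable names are illustrative.

```python
# Cluster size -> number of clusters of that size, as listed in the abstract,
# assuming the cluster of 23 individuals occurs once.
cluster_size_counts = {2: 8, 3: 6, 4: 1, 5: 2, 6: 1, 19: 1, 23: 1}

# Number of distinct clustered genotypes.
n_clusters = sum(cluster_size_counts.values())

# Number of patients who fall into any cluster.
n_clustered_patients = sum(size * count for size, count in cluster_size_counts.items())

# Patients in the two largest clusters combined.
n_in_two_largest = 19 + 23
```

On these numbers the two largest clusters account for 42 of the 96 clustered patients, which is consistent with the abstract singling them out for further investigation.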
Stephens, D.W.; Wangsgard, J.B.
1988-01-01
A computer program, Numerical Taxonomy System of Multivariate Statistical Programs (NTSYS), was used with interfacing software to perform cluster analyses of phytoplankton data stored in the biological files of the U.S. Geological Survey. The NTSYS software performs various types of statistical analyses and is capable of handling a large matrix of data. Cluster analyses were done on phytoplankton data collected from 1974 to 1981 at four National Stream Quality Accounting Network stations in the Tennessee River basin. Analysis of the changes in clusters of phytoplankton genera indicated possible changes in the water quality of the French Broad River near Knoxville, Tennessee. At this station, the most common diatom groups indicated a shift in dominant forms, with some of the less common diatoms being replaced by green and blue-green algae. There was a reduction in genera variability between the 1974-77 and 1979-81 sampling periods. Statistical analysis of chloride and dissolved solids confirmed that concentrations of these substances were smaller in 1974-77 than in 1979-81. At Pickwick Landing Dam, the furthest downstream station used in the study, there was an increase in the number of genera of 'rare' organisms with time. The appearance of two groups of green and blue-green algae indicated that an increase in temperature or nutrient concentrations occurred from 1974 to 1981, but this could not be confirmed using available water quality data. Associations of genera forming the phytoplankton communities at three stations on the Tennessee River were found to be seasonal. Nodal analysis of combined data from all four stations used in the study did not identify any seasonal or temporal patterns during 1974-81. Cluster analysis using the NTSYS programs was effective in reducing the large phytoplankton data set to a manageable size and provided considerable insight into the structure of phytoplankton communities in the Tennessee River basin. 
Problems encountered using cluster analysis were the subjectivity introduced in the definition of meaningful clusters, and the lack of taxonomic identification to the species level. (Author's abstract)
Design and Modeling of a Variable Heat Rejection Radiator
NASA Technical Reports Server (NTRS)
Miller, Jennifer R.; Birur, Gajanana C.; Ganapathi, Gani B.; Sunada, Eric T.; Berisford, Daniel F.; Stephan, Ryan
2011-01-01
Variable Heat Rejection Radiator technology is needed for future NASA human-rated and robotic missions. The primary objective is to enable a single-loop architecture for human-rated missions: (1) radiators are typically sized for the maximum heat load in the warmest continuous environment, resulting in a large panel area; (2) the large radiator area leaves the fluid susceptible to freezing at low load in a cold environment and typically results in a two-loop system; (3) a dual-loop architecture is approximately 18% heavier than a single-loop architecture (based on Orion thermal control system mass); (4) a single-loop architecture requires adaptability to varying environments and heat loads.