An Improved Cluster Richness Estimator
Rozo, Eduardo; Rykoff, Eli S.; Koester, Benjamin P.; McKay, Timothy; Hao, Jiangang; Evrard, August; Wechsler, Risa H.; Hansen, Sarah; Sheldon, Erin; Johnston, David; Becker, Matthew R.; Annis, James T.; Bleem, Lindsey; Scranton, Ryan
2009-08-03
Minimizing the scatter between cluster mass and accessible observables is an important goal for cluster cosmology. In this work, we introduce a new matched-filter richness estimator and test its performance using the maxBCG cluster catalog. Our new estimator significantly reduces the variance in the L_X-richness relation, from sigma^2_lnLx = (0.86 ± 0.02)^2 to sigma^2_lnLx = (0.69 ± 0.02)^2. Relative to the maxBCG richness estimate, it also removes the strong redshift dependence of the richness scaling relations and is significantly more robust to photometric and redshift errors. These improvements are largely due to our more sophisticated treatment of galaxy color data. We also demonstrate that the scatter in the L_X-richness relation depends on the aperture used to estimate cluster richness, and introduce a novel approach for optimizing this aperture which can be easily generalized to other mass tracers.
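The quoted improvement is a drop in the log-normal scatter of X-ray luminosity at fixed richness. A minimal sketch of what measuring that scatter involves, using synthetic data and an invented power-law scaling rather than the maxBCG catalog:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cluster sample: a made-up power-law L_X-richness relation with
# log-normal scatter (numbers chosen near the abstract's, purely illustrative).
n_clusters = 20000
richness = rng.uniform(10.0, 100.0, n_clusters)
alpha_true, lnA_true, sigma_true = 1.6, 1.0, 0.69
ln_Lx = (lnA_true + alpha_true * np.log(richness)
         + rng.normal(0.0, sigma_true, n_clusters))

# Fit the scaling relation in log-log space; the residual standard deviation
# is the scatter sigma_lnLx that a better richness estimator would shrink.
slope, intercept = np.polyfit(np.log(richness), ln_Lx, 1)
residuals = ln_Lx - (intercept + slope * np.log(richness))
sigma_hat = residuals.std(ddof=2)
print(f"alpha = {slope:.2f}, sigma_lnLx = {sigma_hat:.2f}")
```

Comparing sigma_hat between two candidate richness definitions on the same sample is the comparison the abstract reports.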
Improving lensing cluster mass estimate with flexion
NASA Astrophysics Data System (ADS)
Cardone, V. F.; Vicinanza, M.; Er, X.; Maoli, R.; Scaramella, R.
2016-11-01
Gravitational lensing has long been considered a valuable tool to determine the total mass of galaxy clusters. The shear profile, as inferred from the statistics of the ellipticity of background galaxies, allows us to probe the cluster's intermediate and outer regions, thus determining the virial mass estimate. However, the mass sheet degeneracy and the need for a large number of background galaxies motivate the search for alternative tracers which can break the degeneracy among model parameters and hence improve the accuracy of the mass estimate. Lensing flexion, i.e. the third derivative of the lensing potential, has been suggested as a good answer to the above quest since it probes the details of the mass profile. We investigate here whether this is indeed the case, jointly considering weak lensing, magnification, and flexion. We use a Fisher matrix analysis to forecast the relative improvement in the mass accuracy for different assumptions on the shear and flexion signal-to-noise (S/N) ratio, also varying the cluster mass, redshift, and ellipticity. It turns out that the error on the cluster mass may be reduced by up to a factor of ~2 for reasonable values of the flexion S/N ratio. As a general result, we find that the improvement in mass accuracy is larger for more flattened haloes, but extracting general trends is difficult because of the many parameters at play. We nevertheless find that flexion is as efficient as magnification at increasing the accuracy of both mass and concentration determination.
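A one-parameter toy version of the Fisher forecast illustrates why adding an independent probe such as flexion tightens the mass constraint: the Fisher information of independent probes adds. The profile shapes and per-bin noise values below are illustrative placeholders, not real lensing models:

```python
import numpy as np

# Toy one-parameter Fisher forecast for a cluster mass constraint.
r = np.linspace(0.1, 1.0, 30)   # projected radii, arbitrary units
M_fid = 1.0                     # fiducial cluster mass

shear = lambda M: M / r**2      # placeholder shear profile
flexion = lambda M: M / r**3    # placeholder flexion profile
sigma_shear, sigma_flexion = 0.3, 0.5   # assumed per-bin noise levels

def fisher_1p(model, sigma, M=M_fid, h=1e-6):
    """Fisher information for one parameter with independent Gaussian bins."""
    dsdM = (model(M + h) - model(M - h)) / (2.0 * h)  # numerical derivative
    return np.sum(dsdM**2) / sigma**2

F_shear = fisher_1p(shear, sigma_shear)
F_joint = F_shear + fisher_1p(flexion, sigma_flexion)  # independent probes add
print(f"sigma_M shear-only: {F_shear**-0.5:.3f}, "
      f"shear+flexion: {F_joint**-0.5:.3f}")
```

The forecast error is F^(-1/2), so the joint constraint is always at least as tight as the shear-only one; how much tighter depends on the assumed flexion S/N, which is the trade-off the paper explores.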
The cluster graphical lasso for improved estimation of Gaussian graphical models
Tan, Kean Ming; Witten, Daniela; Shojaie, Ali
2015-01-01
The task of estimating a Gaussian graphical model in the high-dimensional setting is considered. The graphical lasso, which involves maximizing the Gaussian log likelihood subject to a lasso penalty, is a well-studied approach for this task. A surprising connection between the graphical lasso and hierarchical clustering is introduced: the graphical lasso in effect performs a two-step procedure, in which (1) single linkage hierarchical clustering is performed on the variables in order to identify connected components, and then (2) a penalized log likelihood is maximized on the subset of variables within each connected component. Thus, the graphical lasso determines the connected components of the estimated network via single linkage clustering. Single linkage clustering is known to perform poorly in certain finite-sample settings. Therefore, the cluster graphical lasso, which involves clustering the features using an alternative to single linkage clustering and then performing the graphical lasso on the subset of variables within each cluster, is proposed. Model selection consistency for this technique is established, and its improved performance relative to the graphical lasso is demonstrated in a simulation study, as well as in applications to university webpage and gene expression data sets. PMID:25642008
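A sketch of the two-stage idea, assuming scikit-learn's GraphicalLasso and SciPy hierarchical clustering, with average linkage standing in for the alternative-linkage choice the paper leaves open (synthetic two-block data):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(1)

# Two independent blocks of correlated variables (synthetic stand-in data).
n = 500
z1, z2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [z1 + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [z2 + 0.3 * rng.normal(size=n) for _ in range(3)])

# Step 1: cluster the features (average linkage on a correlation distance).
D = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
np.fill_diagonal(D, 0.0)
labels = fcluster(linkage(squareform(D, checks=False), method="average"),
                  t=2, criterion="maxclust")

# Step 2: graphical lasso within each cluster; assemble a block-diagonal
# precision-matrix estimate for the full variable set.
p = X.shape[1]
Theta = np.zeros((p, p))
for c in np.unique(labels):
    idx = np.where(labels == c)[0]
    gl = GraphicalLasso(alpha=0.05).fit(X[:, idx])
    Theta[np.ix_(idx, idx)] = gl.precision_

print(np.round(Theta, 2))
```

By construction, no edges are estimated between clusters; the clustering step therefore decides the connected components, exactly the role the paper argues single linkage implicitly plays in the plain graphical lasso.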
Cluster-based analysis improves predictive validity of spike-triggered receptive field estimates.
Bigelow, James; Malone, Brian J
2017-01-01
Spectrotemporal receptive field (STRF) characterization is a central goal of auditory physiology. STRFs are often approximated by the spike-triggered average (STA), which reflects the average stimulus preceding a spike. In many cases, the raw STA is subjected to a threshold defined by gain values expected by chance. However, such correction methods have not been universally adopted, and the consequences of specific gain-thresholding approaches have not been investigated systematically. Here, we evaluate two classes of statistical correction techniques, using the resulting STRF estimates to predict responses to a novel validation stimulus. The first, more traditional technique eliminated STRF pixels (time-frequency bins) with gain values expected by chance. This correction method yielded significant increases in prediction accuracy, including when the threshold setting was optimized for each unit. The second technique was a two-step thresholding procedure wherein clusters of contiguous pixels surviving an initial gain threshold were then subjected to a cluster mass threshold based on summed pixel values. This approach significantly improved upon even the best gain-thresholding techniques. Additional analyses suggested that allowing threshold settings to vary independently for excitatory and inhibitory subfields of the STRF resulted in only marginal additional gains, at best. In summary, augmenting reverse correlation techniques with principled statistical correction choices increased prediction accuracy by over 80% for multi-unit STRFs and by over 40% for single-unit STRFs, furthering the interpretational relevance of the recovered spectrotemporal filters for auditory systems analysis.
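The two-step thresholding procedure can be sketched with SciPy's connected-component labeling; the STA map, pixel threshold, and cluster-mass threshold below are all synthetic placeholders rather than values from the study:

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(2)

# Synthetic STA "gain" map: weak noise everywhere plus one strong
# time-frequency blob (hypothetical data; thresholds are illustrative).
sta = rng.normal(0.0, 0.2, size=(32, 32))
sta[10:14, 10:14] += 2.0

# Step 1: pixel-level gain threshold.
pixel_thresh = 1.0
mask = sta > pixel_thresh

# Step 2: cluster-mass threshold -- keep only contiguous clusters of
# supra-threshold pixels whose summed gain is large enough.
labels, n_clusters = ndimage.label(mask)
cluster_mass = ndimage.sum_labels(sta, labels, index=range(1, n_clusters + 1))
mass_thresh = 5.0
keep = {i + 1 for i, m in enumerate(cluster_mass) if m > mass_thresh}
cleaned = np.where(np.isin(labels, list(keep)), sta, 0.0)

print(f"{n_clusters} raw clusters, {len(keep)} survive the mass threshold")
```

Isolated noise pixels that happen to clear the gain threshold form small clusters with low summed mass and are removed, while coherent subfields survive, which is the intuition behind the reported prediction-accuracy gains.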
NASA Astrophysics Data System (ADS)
Samui, Saumyadip; Samui Pal, Shanoli
2017-02-01
We present an improved photometric redshift estimator code, CuBANz, that is publicly available at https://goo.gl/fpk90V. It uses a back propagation neural network along with clustering of the training set, which makes it more efficient than existing neural network codes. In CuBANz, the training set is divided into several self-learning clusters with galaxies having similar photometric properties and spectroscopic redshifts within a given span. The clustering algorithm uses the color information (i.e., u - g, g - r, etc.) rather than the apparent magnitudes at various photometric bands, as the photometric redshift is more sensitive to the flux differences between different bands than to the actual values. Separate neural networks are trained for each cluster using all possible colors, magnitudes, and uncertainties in the measurements. For a galaxy with unknown redshift, we identify the closest possible clusters having similar photometric properties and use those clusters to get the photometric redshifts using the particular networks that were trained using those cluster members. For galaxies that do not match any training cluster, the photometric redshifts are obtained from a separate network that uses the entire training set. This clustering method enables us to determine the redshifts more accurately. The SDSS Stripe 82 catalog has been used here for the demonstration of the code. For the clustered sources with redshift range z_spec < 0.7, the residual error (<(z_spec - z_phot)^2>^(1/2)) in the training/testing phase is as low as 0.03, compared to the existing ANNz code that provides a residual error of 0.05 on the same test data set. Further, we provide a much better estimate of the uncertainty of the derived photometric redshift.
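The clustering-plus-per-cluster-model strategy can be sketched without the neural networks, e.g. with k-means in color space and one linear model per cluster (synthetic colors and redshifts; CuBANz itself trains a back propagation network per cluster):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

# Synthetic "galaxies": two colors and a nonlinearly related redshift
# (a toy stand-in for real photometry).
n = 3000
u_g = rng.uniform(0.0, 3.0, n)
g_r = 0.5 * u_g + rng.normal(0.0, 0.1, n)
colors = np.column_stack([u_g, g_r])
z_spec = 0.1 * u_g**2 + rng.normal(0.0, 0.01, n)
train, test = np.arange(2000), np.arange(2000, n)

# Baseline: a single global linear model.
glob = LinearRegression().fit(colors[train], z_spec[train])
rmse_global = np.sqrt(np.mean((glob.predict(colors[test]) - z_spec[test]) ** 2))

# Clustered: k-means in color space, one model per cluster; each test galaxy
# is routed to the model of its nearest cluster.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(colors[train])
models = {c: LinearRegression().fit(colors[train][km.labels_ == c],
                                    z_spec[train][km.labels_ == c])
          for c in range(5)}
test_labels = km.predict(colors[test])
z_phot = np.empty(len(test))
for c in range(5):
    m = test_labels == c
    if m.any():
        z_phot[m] = models[c].predict(colors[test][m])
rmse_clustered = np.sqrt(np.mean((z_phot - z_spec[test]) ** 2))
print(f"RMSE global: {rmse_global:.4f}, clustered: {rmse_clustered:.4f}")
```

A single global model must average over the nonlinear color-redshift relation, while per-cluster models fit it piecewise, which is the intuition behind the residual-error improvement the abstract reports.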
2013-01-01
...optimization problem (2)-(3) is convex and can ... [footnotes: we adopt the convention that y_ii = 1 for any node i that belongs to a cluster; we assume a_ii = 1 for all i] ... relaxations: the formulation (2)-(3) is not the only way to relax the non-convex ML estimator. Instead of the nuclear norm regularizer, a hard constraint ... presented a convex optimization formulation, essentially a convexification of the maximum likelihood estimator. Our theoretic analysis shows that this...
A nonparametric clustering technique which estimates the number of clusters
NASA Technical Reports Server (NTRS)
Ramey, D. B.
1983-01-01
In applications of cluster analysis, one usually needs to determine the number of clusters, K, and the assignment of observations to each cluster. A clustering technique based on recursive application of a multivariate test of bimodality which automatically estimates both K and the cluster assignments is presented.
Geist, David R.; Jones, Julia; Murray, Christopher J.; Dauble, Dennis D.
1999-12-01
We improved our predictions of fall chinook salmon (Oncorhynchus tshawytscha) habitat use by analyzing spawning habitat at the spatial scale of redd clusters. Spatial point pattern analyses indicated that redd clusters in the Hanford Reach, Columbia River, were consistent in their location from 1994 to 1995. Redd densities were 16.1 and 8.9 redds ha^-1 in 1994 and 1995, respectively, and individual redds within clusters were usually less than 30 m apart. Pattern analysis also showed strong evidence that redds were uniformly distributed within the clusters, where inter-redd distances ranged from 2 to 5 m. Redd clusters were found to occur predominantly where water velocity was between 1.4 and 2 m s^-1, water depth was 2 to 4 m, and the lateral slope of the riverbed was less than 4%. This habitat use represented a narrower range than previously reported for adult fall chinook salmon. Logistic regression analysis determined that water velocity and lateral slope were the most significant predictors of redd cluster location over a range of river discharges. Overestimates of available spawning habitat lead to non-achievable goals for protecting and restoring critical salmonid habitat. Better predictions of spawning habitat may be possible if cluster-specific characteristics are used.
NASA Astrophysics Data System (ADS)
Mokdad, Fatiha; Haddad, Boualem
2017-06-01
In this paper, two new infrared precipitation estimation approaches based on the concept of k-means clustering are first proposed, named the NAW-Kmeans and the GPI-Kmeans methods. Then, they are adapted to the southern Mediterranean basin, where the subtropical climate prevails. The infrared data (10.8 μm channel) acquired by the MSG-SEVIRI sensor in winter and spring 2012 are used. Tests are carried out in eight areas distributed over northern Algeria: Sebra, El Bordj, Chlef, Blida, Bordj Menael, Sidi Aich, Beni Ourthilane, and Beni Aziz. The validation is performed by a comparison of the estimated rainfalls to rain gauge observations collected by the National Office of Meteorology in Dar El Beida (Algeria). Despite the complexity of the subtropical climate, the obtained results indicate that the NAW-Kmeans and the GPI-Kmeans approaches gave satisfactory results for the considered rain rates. Also, the proposed schemes lead to improvement in precipitation estimation performance when compared to the original algorithms NAW (Negri, Adler, and Wetzel) and GPI (GOES Precipitation Index).
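The abstract does not spell out the algorithmic details, but the k-means ingredient can be sketched generically: cluster the 10.8 μm brightness temperatures and assign higher rain rates to colder (higher-topped) clusters. The temperatures and rates below are invented placeholders, not the NAW or GPI calibrations:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)

# Toy brightness-temperature sample (K): three synthetic cloud regimes.
tb = np.concatenate([rng.normal(210.0, 5.0, 300),   # deep convective tops
                     rng.normal(250.0, 5.0, 300),   # mid-level cloud
                     rng.normal(290.0, 5.0, 300)])  # warm / mostly clear
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(tb.reshape(-1, 1))

# Rank clusters coldest to warmest and map ranks to illustrative rates (mm/h).
order = np.argsort(km.cluster_centers_.ravel())
rate_by_cluster = {order[0]: 8.0, order[1]: 1.5, order[2]: 0.0}
rain = np.array([rate_by_cluster[c] for c in km.labels_])
print(f"mean estimated rain rate: {rain.mean():.2f} mm/h")
```

In the actual methods, the per-cluster rates would come from the NAW/GPI calibrations against gauge data rather than from a fixed lookup table.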
Horney, Jennifer; Zotti, Marianne E.; Williams, Amy; Hsia, Jason
2015-01-01
Introduction and Background Women of reproductive age, in particular women who are pregnant or fewer than 6 months postpartum, are uniquely vulnerable to the effects of natural disasters, which may create stressors for caregivers, limit access to prenatal/postpartum care, or interrupt contraception. Traditional approaches (e.g., newborn records, community surveys) to survey women of reproductive age about unmet needs may not be practical after disasters. Finding pregnant or postpartum women is especially challenging because fewer than 5% of women of reproductive age are pregnant or postpartum at any time. Methods From 2009 to 2011, we conducted three pilots of a sampling strategy that aimed to increase the proportion of pregnant and postpartum women of reproductive age who were included in postdisaster reproductive health assessments in Johnston County, North Carolina, after tornadoes, Cobb/Douglas Counties, Georgia, after flooding, and Bertie County, North Carolina, after hurricane-related flooding. Results Using this method, the percentage of pregnant and postpartum women interviewed in each pilot increased from 0.06% to 21%, 8% to 19%, and 9% to 17%, respectively. Conclusion and Discussion Two-stage cluster sampling with referral can be used to increase the proportion of pregnant and postpartum women included in a postdisaster assessment. This strategy may be a promising way to assess unmet needs of pregnant and postpartum women in disaster-affected communities. PMID:22365134
Attitude Estimation in Fractionated Spacecraft Cluster Systems
NASA Technical Reports Server (NTRS)
Hadaegh, Fred Y.; Blackmore, James C.
2011-01-01
Attitude estimation was examined for fractionated free-flying spacecraft. Instead of a single, monolithic spacecraft, a fractionated free-flying spacecraft uses multiple spacecraft modules. These modules are connected only through wireless communication links and, potentially, wireless power links. The key advantage of this concept is the ability to respond to uncertainty. For example, if a single spacecraft module in the cluster fails, a new one can be launched at a lower cost and risk than would be incurred with on-orbit servicing or replacement of the monolithic spacecraft. In order to create such a system, however, it is essential to know what the navigation capabilities of the fractionated system are as a function of the capabilities of the individual modules, and to have an algorithm that can perform estimation of the attitudes and relative positions of the modules with fractionated sensing capabilities. Looking specifically at fractionated attitude estimation with star trackers and optical relative attitude sensors, a set of mathematical tools has been developed that specifies the set of sensors necessary to ensure that the attitude of the entire cluster ("cluster attitude") can be observed. Also developed was a navigation filter that can estimate the cluster attitude if these conditions are satisfied. Each module in the cluster may have either a star tracker, a relative attitude sensor, or both. An extended Kalman filter can be used to estimate the attitude of all modules. A range of estimation performances can be achieved depending on the sensors used and the topology of the sensing network.
Tidal radius estimates for three open clusters
NASA Astrophysics Data System (ADS)
Danilov, V. M.; Loktin, A. V.
2015-10-01
A new method is developed for estimating tidal radii and masses of open star clusters (OCL) based on the sky-plane coordinates and proper motions and/or radial velocities of cluster member stars. To this end, we perform the correlation and spectral analysis of oscillations of absolute values of stellar velocity components relative to the cluster mass center along three coordinate planes and along each coordinate axis in five OCL models. Mutual correlation functions for fluctuations of absolute values of velocity field components are computed. The spatial Fourier transform of the mutual correlation functions in the case of zero time offset is used to compute wavenumber spectra of oscillations of absolute values of stellar velocity components. The oscillation spectra of these quantities contain series of local maxima at equidistant wavenumber k values. The ratio of the tidal radius of the cluster to the wavenumber difference Δ k of adjacent local maxima in the oscillation spectra of absolute values of velocity field components is found to be the same for all five OCL models. This ratio is used to estimate the tidal radii and masses of the Pleiades, Praesepe, and M67 based on the proper motions and sky-plane coordinates of the member stars of these clusters. The radial dependences of the absolute values of the tangential and radial projections of cluster star velocities computed using the proper motions relative to the cluster center are determined, along with the corresponding autocorrelation functions and wavenumber spectra of oscillations of absolute values of velocity field components. The Pleiades virial mass is estimated assuming that the cluster is either isolated or non-isolated. Also derived are the estimates of the Pleiades dynamical mass assuming that it is non-stationary and non-isolated. The inferred Pleiades tidal radii corresponding to these masses are reported.
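The core measurement, equidistant local maxima in a wavenumber spectrum, can be sketched as peak detection plus mean peak spacing. The spectrum and the calibration ratio below are synthetic placeholders for the quantities the paper derives from its five OCL models:

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic wavenumber spectrum with equidistant local maxima: a cosine
# comb of known spacing dk_true under a decaying envelope.
k = np.linspace(0.0, 10.0, 2001)
dk_true = 1.4
spectrum = (1.0 + np.cos(2.0 * np.pi * k / dk_true)) * np.exp(-0.1 * k)

peaks, _ = find_peaks(spectrum)
dk_hat = np.mean(np.diff(k[peaks]))   # mean spacing of adjacent local maxima

# The paper finds r_tidal / dk to be the same ratio for all five OCL models;
# the calibration value here is a placeholder for that model-derived constant.
calib = 5.0
r_tidal = calib * dk_hat
print(f"recovered spacing: {dk_hat:.2f}, implied tidal radius: {r_tidal:.1f}")
```

The decaying envelope shifts every peak by the same small amount, so the spacing of adjacent maxima, which is the quantity the method actually uses, is preserved.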
Optimizing weak lensing mass estimates for cluster profile uncertainty
Gruen, D.; Bernstein, G. M.; Lam, T. Y.; Seitz, S.
2011-09-11
Weak lensing measurements of cluster masses are necessary for calibrating mass-observable relations (MORs) to investigate the growth of structure and the properties of dark energy. However, the measured cluster shear signal varies at fixed mass M_{200m} due to inherent ellipticity of background galaxies, intervening structures along the line of sight, and variations in the cluster structure due to scatter in concentrations, asphericity and substructure. We use N-body simulated halos to derive and evaluate a weak lensing circular aperture mass measurement M_{ap} that minimizes the mass estimate variance <(M_{ap} - M_{200m})^{2}> in the presence of all these forms of variability. Depending on halo mass and observational conditions, the resulting mass estimator improves on M_{ap} filters optimized for circular NFW-profile clusters in the presence of uncorrelated large scale structure (LSS) about as much as the latter improve on an estimator that only minimizes the influence of shape noise. Optimizing for uncorrelated LSS while ignoring the variation of internal cluster structure puts too much weight on the profile near the cores of halos, and under some circumstances can even be worse than not accounting for LSS at all. As a result, we discuss the impact of variability in cluster structure and correlated structures on the design and performance of weak lensing surveys intended to calibrate cluster MORs.
Improvements in Ionized Cluster-Beam Deposition
NASA Technical Reports Server (NTRS)
Fitzgerald, D. J.; Compton, L. E.; Pawlik, E. V.
1986-01-01
Lower temperatures result in higher purity and fewer equipment problems. In cluster-beam deposition, clusters of atoms are formed by an adiabatic-expansion nozzle: with proper nozzle design, the expanding vapor cools sufficiently to become supersaturated and form clusters of the material to be deposited. Clusters are ionized, accelerated in an electric field, and then impacted on a substrate where films form. Improved cluster-beam technique useful for deposition of refractory metals.
Nagwani, Naresh Kumar; Deo, Shirish V
2014-01-01
Understanding the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, proportioning new mixtures, and quality assurance. Regression techniques are most widely used for prediction tasks, where the relationship between the independent variables and the dependent (prediction) variable is identified. The accuracy of regression techniques for prediction can be improved if clustering is used along with regression, since clustering ensures a more accurate curve fit between the dependent and independent variables. In this work the cluster regression technique is applied for estimating the compressive strength of concrete, and a novel state-of-the-art approach is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression ensures lower prediction errors for estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group concrete data with similar characteristics, and then in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. It is found from experiments that clustering along with regression gives minimum errors for predicting the compressive strength of concrete; also, the fuzzy clustering algorithm C-means performs better than the K-means algorithm.
Clustering method for estimating principal diffusion directions
Nazem-Zadeh, Mohammad-Reza; Jafari-Khouzani, Kourosh; Davoodi-Bojd, Esmaeil; Jiang, Quan; Soltanian-Zadeh, Hamid
2012-01-01
Diffusion tensor magnetic resonance imaging (DTMRI) is a non-invasive tool for the investigation of white matter structure within the brain. However, the traditional tensor model is unable to characterize anisotropies of orders higher than two in heterogeneous areas containing more than one fiber population. To resolve this issue, high angular resolution diffusion imaging (HARDI) with a large number of diffusion encoding gradients is used along with reconstruction methods such as Q-ball. Using HARDI data, the fiber orientation distribution function (ODF) on the unit sphere is calculated and used to extract the principal diffusion directions (PDDs). Fast and accurate estimation of PDDs is a prerequisite for tracking algorithms that deal with fiber crossings. In this paper, the PDDs are defined as the directions around which the ODF data is concentrated. Estimates of the PDDs based on this definition are less sensitive to noise than those of previous approaches. A clustering approach to estimate the PDDs is proposed, which is an extension of fuzzy c-means clustering developed for orientations of points on a sphere. The minimum description length (MDL) principle is used to estimate the number of PDDs. Using both simulated and real diffusion data, the proposed method has been evaluated and compared with some previous protocols. Experimental results show that the proposed clustering algorithm is more accurate, more resistant to noise, and faster than some of the techniques currently in use. PMID:21642005
Estimation of rank correlation for clustered data.
Rosner, Bernard; Glynn, Robert J
2017-06-30
It is well known that the sample correlation coefficient (R_xy) is the maximum likelihood estimator of the Pearson correlation (ρ_xy) for independent and identically distributed (i.i.d.) bivariate normal data. However, this is not true for ophthalmologic data, where X (e.g., visual acuity) and Y (e.g., visual field) are available for each eye and there is positive intraclass correlation for both X and Y in fellow eyes. In this paper, we provide a regression-based approach for obtaining the maximum likelihood estimator of ρ_xy for clustered data, which can be implemented using standard mixed effects model software. This method is also extended to allow for estimation of partial correlation by controlling both X and Y for a vector U of other covariates. In addition, these methods can be extended to allow for estimation of rank correlation for clustered data by (i) converting the ranks of both X and Y to the probit scale, (ii) estimating the Pearson correlation between probit scores for X and Y, and (iii) using the relationship between the Pearson and rank correlation for bivariate normally distributed data. The validity of the methods in finite-sized samples is supported by simulation studies. Finally, two examples from ophthalmology and analgesic abuse are used to illustrate the methods. Copyright © 2017 John Wiley & Sons, Ltd.
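Steps (i)-(iii) can be sketched for the i.i.d. case; the paper's actual contribution, carrying out the Pearson step with a mixed effects model to handle clustered (fellow-eye) data, is omitted here for brevity:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Simulated i.i.d. bivariate normal data with known Pearson correlation.
n, rho_true = 5000, 0.6
x, y = rng.multivariate_normal([0.0, 0.0],
                               [[1.0, rho_true], [rho_true, 1.0]], size=n).T

# (i) convert ranks to the probit scale
to_probit = lambda v: stats.norm.ppf((stats.rankdata(v) - 0.5) / len(v))
px, py = to_probit(x), to_probit(y)

# (ii) Pearson correlation of the probit scores
r_probit = np.corrcoef(px, py)[0, 1]

# (iii) Pearson-to-Spearman relation for bivariate normal data:
#       rho_s = (6/pi) * arcsin(rho / 2)
rho_rank = (6.0 / np.pi) * np.arcsin(r_probit / 2.0)

rho_s_direct = stats.spearmanr(x, y)[0]   # sanity check against direct Spearman
print(f"probit-based rank correlation: {rho_rank:.3f}, "
      f"direct Spearman: {rho_s_direct:.3f}")
```

For i.i.d. data the probit-based estimate closely matches the direct Spearman coefficient; the payoff of the construction is that step (ii) generalizes to clustered data, where a naive Spearman coefficient does not.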
The School Improvement Cluster: A Concept Paper.
ERIC Educational Resources Information Center
Colorado State Dept. of Education, Denver. Office of Field Services.
This paper describes school improvement through clusters (or leagues) of the Colorado Department of Education. School improvement clusters are defined as associations of schools and cooperating organizations dedicated to improving the quality of education. Participants work together with a common goal or unifying concept. The paper describes 14…
Auplish, Aashima; Clarke, Alison S; Van Zanten, Trent; Abel, Kate; Tham, Charmaine; Bhutia, Thinlay N; Wilks, Colin R; Stevenson, Mark A; Firestone, Simon M
2017-05-01
Educational initiatives targeting at-risk populations have long been recognized as a mainstay of ongoing rabies control efforts. Cluster-based studies are often utilized to assess levels of knowledge, attitudes, and practices of a population in response to education campaigns. The design of cluster-based studies requires estimates of intra-cluster correlation coefficients obtained from previous studies. This study estimates the school-level intra-cluster correlation coefficient (ICC) for rabies knowledge change following an educational intervention program. A cross-sectional survey was conducted with 226 students from 7 schools in Sikkim, India, using cluster sampling. In order to assess knowledge uptake, rabies education sessions with pre- and post-session questionnaires were administered. Paired differences of proportions were estimated for questions answered correctly. A mixed effects logistic regression model was developed to estimate school-level and student-level ICCs and to test for associations between gender, age, school location and educational level. The school- and student-level ICCs for rabies knowledge and awareness were 0.04 (95% CI: 0.01, 0.19) and 0.05 (95% CI: 0.02, 0.09), respectively. These ICCs suggest design effect multipliers of 5.45 (schools) and 1.05 (students per school) will be required when estimating sample sizes and designing future cluster randomized trials. There was a good baseline level of rabies knowledge (mean pre-session score 71%); however, key knowledge gaps were identified in understanding appropriate behavior around scared dogs, potential sources of rabies, and how to correctly order post-rabies-exposure precaution steps. After adjusting for the effect of gender, age, school location and education level, school and individual post-session test scores improved by 19%, with similar performance amongst boys and girls attending schools in urban and rural regions. The proportion of participants that were able to correctly order post
Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters
Tellaroli, Paola; Bazzi, Marco; Donato, Michele; Brazzale, Alessandra R.; Drăghici, Sorin
2016-01-01
Four of the most common limitations of the many available clustering methods are: i) the lack of a proper strategy to deal with outliers; ii) the need for a good a priori estimate of the number of clusters to obtain reasonable results; iii) the lack of a method able to detect when partitioning of a specific data set is not appropriate; and iv) the dependence of the result on the initialization. Here we propose Cross-clustering (CC), a partial clustering algorithm that overcomes these four limitations by combining the principles of two well established hierarchical clustering algorithms: Ward’s minimum variance and Complete-linkage. We validated CC by comparing it with a number of existing clustering methods, including Ward’s and Complete-linkage. We show on both simulated and real datasets, that CC performs better than the other methods in terms of: the identification of the correct number of clusters, the identification of outliers, and the determination of real cluster memberships. We used CC to cluster samples in order to identify disease subtypes, and on gene profiles, in order to determine groups of genes with the same behavior. Results obtained on a non-biological dataset show that the method is general enough to be successfully used in such diverse applications. The algorithm has been implemented in the statistical language R and is freely available from the CRAN contributed packages repository. PMID:27015427
Improved dynamical modelling of the Arches cluster
NASA Astrophysics Data System (ADS)
Lee, Joowon; Kim, Sungsoo S.
2014-05-01
Recently, Clarkson et al. (2012) measured the intrinsic velocity dispersion of the Arches cluster, a young and massive star cluster in the Galactic center. Using the observed velocity dispersion profile and the surface brightness profile of Espinoza et al. (2009), they estimated the cluster's present-day mass to be ~1.5 × 10^4 M⊙ by fitting an isothermal King model. In this study, we trace the best-fit initial mass for the Arches cluster using the same observed data set together with anisotropic Fokker-Planck calculations of the cluster's dynamical evolution.
Improved Estimates of Thermodynamic Parameters
NASA Technical Reports Server (NTRS)
Lawson, D. D.
1982-01-01
Techniques refined for estimating heat of vaporization and other parameters from molecular structure. Using parabolic equation with three adjustable parameters, heat of vaporization can be used to estimate boiling point, and vice versa. Boiling points and vapor pressures for some nonpolar liquids were estimated by improved method and compared with previously reported values. Technique for estimating thermodynamic parameters should make it easier for engineers to choose among candidate heat-exchange fluids for thermochemical cycles.
State estimation and prediction using clustered particle filters
Lee, Yoonsang; Majda, Andrew J.
2016-01-01
Particle filtering is an essential tool to improve uncertain model predictions by incorporating noisy observational data from complex systems including non-Gaussian features. A class of particle filters, clustered particle filters, is introduced for high-dimensional nonlinear systems, which uses relatively few particles compared with the standard particle filter. The clustered particle filter captures non-Gaussian features of the true signal, which are typical in complex nonlinear dynamical systems such as geophysical systems. The method is also robust in the difficult regime of high-quality sparse and infrequent observations. The key features of the clustered particle filtering are coarse-grained localization through the clustering of the state variables and particle adjustment to stabilize the method; each observation affects only neighbor state variables through clustering and particles are adjusted to prevent particle collapse due to high-quality observations. The clustered particle filter is tested for the 40-dimensional Lorenz 96 model with several dynamical regimes including strongly non-Gaussian statistics. The clustered particle filter shows robust skill in both achieving accurate filter results and capturing non-Gaussian statistics of the true signal. It is further extended to multiscale data assimilation, which provides the large-scale estimation by combining a cheap reduced-order forecast model and mixed observations of the large- and small-scale variables. This approach enables the use of a larger number of particles due to the computational savings in the forecast model. The multiscale clustered particle filter is tested for one-dimensional dispersive wave turbulence using a forecast model with model errors. PMID:27930332
van Breukelen, Gerard Jp; Candel, Math Jjm; Berger, Martijn Pf
2008-08-01
Cluster randomized and multicentre trials evaluate the effect of a treatment on persons nested within clusters, for instance patients within clinics or pupils within schools. Although equal sample sizes per cluster are generally optimal for parameter estimation, they are rarely feasible. This paper addresses the relative efficiency (RE) of unequal versus equal cluster sizes for estimating variance components in cluster randomized trials and in multicentre trials with person randomization within centres, assuming a quantitative outcome. Starting from maximum likelihood estimation, the RE is investigated numerically for a range of cluster size distributions. An approximate formula is presented for computing the RE as a function of the mean and variance of cluster sizes and the intraclass correlation. The accuracy of this approximation is checked and found to be good. It is concluded that the loss of efficiency for variance component estimation due to variation of cluster sizes rarely exceeds 20% and can be compensated by sampling 25% more clusters.
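The paper's approximate RE formula is not reproduced in the abstract; the following is a related, widely used coefficient-of-variation adjustment to the design effect for unequal cluster sizes, shown only to illustrate how cluster-size variability enters such calculations (the function name and example sizes are illustrative, not the authors'):

```python
# A common CV-based adjustment of the design effect for unequal cluster sizes:
#   DEFF ≈ 1 + ((CV^2 + 1) * m_bar - 1) * rho,
# where m_bar is the mean cluster size, CV the coefficient of variation of
# cluster sizes, and rho the intraclass correlation. With CV = 0 this
# reduces to the usual 1 + (m_bar - 1) * rho.
from statistics import mean, pstdev

def adjusted_design_effect(cluster_sizes, rho):
    m_bar = mean(cluster_sizes)
    cv = pstdev(cluster_sizes) / m_bar
    return 1.0 + ((cv**2 + 1.0) * m_bar - 1.0) * rho

equal = adjusted_design_effect([20] * 10, 0.05)          # CV = 0
unequal = adjusted_design_effect([5, 35, 10, 30, 20] * 2, 0.05)
print(equal, unequal)  # unequal sizes inflate the design effect
```

This mirrors the paper's qualitative conclusion: variation in cluster sizes costs efficiency, and the loss can be offset by sampling somewhat more clusters.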
Estimating potential evapotranspiration with improved radiation estimation
USDA-ARS?s Scientific Manuscript database
Potential evapotranspiration (PET) is of great importance to estimation of surface energy budget and water balance calculation. The accurate estimation of PET will facilitate efficient irrigation scheduling, drainage design, and other agricultural and meteorological applications. However, accuracy o...
Entropy-based cluster validation and estimation of the number of clusters in gene expression data.
Novoselova, Natalia; Tom, Igor
2012-10-01
Many external and internal validity measures have been proposed in order to estimate the number of clusters in gene expression data, but as a rule they do not consider the stability of the groupings produced by a clustering algorithm. Building on the approach of assessing the predictive power, or stability, of a partitioning, we propose a new measure of cluster validation and a selection procedure to determine the suitable number of clusters. The validity measure is based on estimating the "clearness" of the consensus matrix, which is the result of a resampling clustering scheme (consensus clustering). According to the proposed selection procedure, the stable clustering result is determined with reference to the validity measure for the null hypothesis encoding the absence of clusters. The final number of clusters is selected by analyzing the distance between the validity plots for the initial and permuted data sets. We applied the selection procedure to estimate the clustering results on several datasets. The proposed procedure produced an accurate and robust estimate of the number of clusters, in agreement with biological knowledge and gold standards of cluster quality.
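A toy illustration of the consensus-matrix idea (the paper's actual "clearness" measure and null-hypothesis procedure are more elaborate; the 1-D data, mini k-means, and scoring below are simplifications):

```python
# Sketch of consensus clustering: repeatedly subsample the data, cluster
# each subsample, accumulate a co-clustering (consensus) matrix, and score
# its "clearness" as how close its entries are to 0 or 1. A stable
# partition yields a clearness near 1.
import random

def kmeans_1d(xs, k, iters=20):
    centers = random.sample(xs, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in xs]
        for j in range(k):
            members = [x for x, l in zip(xs, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels

def consensus_clearness(xs, k, runs=30, frac=0.8):
    n = len(xs)
    together = [[0] * n for _ in range(n)]
    counted = [[0] * n for _ in range(n)]
    for _ in range(runs):
        idx = random.sample(range(n), int(frac * n))
        labels = kmeans_1d([xs[i] for i in idx], k)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                counted[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    # clearness: mean |2 p_ij - 1| over observed pairs (1 = perfectly stable)
    vals = [abs(2 * together[i][j] / counted[i][j] - 1)
            for i in range(n) for j in range(i + 1, n) if counted[i][j]]
    return sum(vals) / len(vals)

random.seed(0)
data = [random.gauss(0, 0.1) for _ in range(20)] + \
       [random.gauss(5, 0.1) for _ in range(20)]
clearness = consensus_clearness(data, 2)
print(clearness)  # near 1 for two well-separated clusters
```

In the paper's procedure the analogous score is compared against its distribution under permuted (cluster-free) data to pick the number of clusters.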
Cluster Stability Estimation Based on a Minimal Spanning Trees Approach
NASA Astrophysics Data System (ADS)
Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora
2009-08-01
Among the areas of data and text mining employed today in science, economy and technology, clustering theory serves as a preprocessing step in data analysis. However, many open questions still await theoretical and practical treatment; e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges, in the clusters' minimal spanning trees, connecting points from different samples; in fact, we use the Friedman and Rafsky two-sample test statistic. The homogeneity hypothesis, of well-mingled samples within the clusters, leads to an asymptotic normal distribution of the considered statistic. Resting upon this fact, the standard score of the mentioned edge quantity is set, and the partition quality is represented by the worst cluster, corresponding to the minimal standard score value. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left-asymmetry. Numerical experiments presented in the paper demonstrate the ability of the approach to detect the true number of clusters.
Learning Markov Random Walks for robust subspace clustering and estimation.
Liu, Risheng; Lin, Zhouchen; Su, Zhixun
2014-11-01
Markov Random Walks (MRW) have proven to be an effective way to understand spectral clustering and embedding. However, lacking a global structural measure, conventional MRW (e.g., the Gaussian kernel MRW) cannot handle data points drawn from a mixture of subspaces. In this paper, we introduce a regularized MRW learning model, using a low-rank penalty to constrain the global subspace structure, for subspace clustering and estimation. In our framework, both the local pairwise similarity and the global subspace structure can be learnt from the transition probabilities of the MRW. We prove that under suitable conditions, our proposed local/global criteria can exactly capture the multiple subspace structure and learn a low-dimensional embedding for the data which gives the true segmentation of the subspaces. To improve robustness in real situations, we also propose an extension of the MRW learning model that integrates transition matrix learning and error correction in a unified framework. Experimental results on both synthetic data and real applications demonstrate that our proposed MRW learning model and its robust extension outperform state-of-the-art subspace clustering methods.
Proportion estimation using prior cluster purities
NASA Technical Reports Server (NTRS)
Terrell, G. R. (Principal Investigator)
1980-01-01
The prior distribution of CLASSY component purities is studied, and this information incorporated into maximum likelihood crop proportion estimators. The method is tested on Transition Year spring small grain segments.
NASA Astrophysics Data System (ADS)
Gifford, Daniel William
2016-08-01
Galaxy clusters are large virialized structures that exist at the intersections of filaments of matter that make up the cosmic web. Owing to their hierarchical growth history, they are excellent probes of the cosmology that governs our universe. Here, we aim to use clusters to better constrain cosmological parameters by systematically studying the uncertainties on galaxy cluster mass estimation for use in a halo mass function analysis. We find that the caustic technique is capable, on average, of recovering unbiased cluster masses to within 30% for well-sampled systems. We also quantify potential statistical and systematic biases due to observational challenges. To address statistical biases in the caustic technique, we developed a new stacking algorithm to measure the average cluster mass for a single stack of projected cluster phase-spaces. By varying the number of galaxies and the number of clusters we stack, we find that the limiting quantity is the total number of galaxies in the stack, opening up the possibility for self-calibrated mass estimates of low-mass or poorly sampled clusters in large surveys. We then utilize the SDSS-C4 catalog of galaxy clusters to place some of the tightest galaxy-cluster-based constraints on the matter density and power spectrum normalization for matter in our universe.
the-wizz: clustering redshift estimation for everyone
NASA Astrophysics Data System (ADS)
Morrison, C. B.; Hildebrandt, H.; Schmidt, S. J.; Baldry, I. K.; Bilicki, M.; Choi, A.; Erben, T.; Schneider, P.
2017-05-01
We present the-wizz, an open source and user-friendly software for estimating the redshift distributions of photometric galaxies with unknown redshifts by spatially cross-correlating them against a reference sample with known redshifts. The main benefit of the-wizz is in separating the angular pair finding and correlation estimation from the computation of the output clustering redshifts allowing anyone to create a clustering redshift for their sample without the intervention of an 'expert'. It allows the end user of a given survey to select any subsample of photometric galaxies with unknown redshifts, match this sample's catalogue indices into a value-added data file and produce a clustering redshift estimation for this sample in a fraction of the time it would take to run all the angular correlations needed to produce a clustering redshift. We show results with this software using photometric data from the Kilo-Degree Survey (KiDS) and spectroscopic redshifts from the Galaxy and Mass Assembly survey and the Sloan Digital Sky Survey. The results we present for KiDS are consistent with the redshift distributions used in a recent cosmic shear analysis from the survey. We also present results using a hybrid machine learning-clustering redshift analysis that enables the estimation of clustering redshifts for individual galaxies. the-wizz can be downloaded at http://github.com/morriscb/The-wiZZ/.
NASA Astrophysics Data System (ADS)
Bo, Yizhou; Shifa, Naima
2013-09-01
An estimator for finding the abundance of a rare, clustered and mobile population has been introduced. This model is based on adaptive cluster sampling (ACS) to identify the location of the population and negative binomial distribution to estimate the total in each site. To identify the location of the population we consider both sampling with replacement (WR) and sampling without replacement (WOR). Some mathematical properties of the model are also developed.
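As background to the negative binomial component, a hedged sketch of a method-of-moments NB fit to hypothetical per-site counts (this is not the authors' ACS estimator; the counts and helper names are purely illustrative):

```python
# Method-of-moments fit of a negative binomial to per-site counts, a common
# first step when modeling clustered, over-dispersed abundances. With mean m
# and variance v (v > m), the NB size parameter is r = m^2 / (v - m), and a
# naive total is the fitted mean scaled to the number of sites.
from statistics import mean, variance

def nb_moments(counts):
    m = mean(counts)
    v = variance(counts)
    if v <= m:
        raise ValueError("counts not over-dispersed; NB fit inappropriate")
    r = m * m / (v - m)   # NB "size" (dispersion) parameter
    p = r / (r + m)       # success-probability parameterization
    return r, p

def estimated_total(counts, n_sites):
    return mean(counts) * n_sites

counts = [0, 2, 9, 1, 14, 0, 3, 7]   # hypothetical per-site counts
r, p = nb_moments(counts)
print(r, p, estimated_total(counts, 50))
```

The paper's estimator additionally uses adaptive cluster sampling (with or without replacement) to decide which sites are surveyed; that design stage is not modeled here.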
Estimation and testing problems in auditory neuroscience via clustering.
Hwang, Youngdeok; Wright, Samantha; Hanlon, Bret M
2017-09-01
The processing of auditory information in neurons is an important area in neuroscience. We consider statistical analysis for an electrophysiological experiment related to this area. The recorded synaptic current responses from the experiment are observed as clusters, where the number of clusters is related to an important characteristic of the auditory system. This number is difficult to estimate visually because the clusters are blurred by biological variability. Using singular value decomposition and a Gaussian mixture model, we develop an estimator for the number of clusters. Additionally, we provide a method for hypothesis testing and sample size determination in the two-sample problem. We illustrate our approach with both simulated and experimental data. © 2017, The International Biometric Society.
A hierarchical clustering methodology for the estimation of toxicity.
Martin, Todd M; Harten, Paul; Venkatapathy, Raghuraman; Das, Shashikala; Young, Douglas M
2008-01-01
A quantitative structure-activity relationship (QSAR) methodology based on hierarchical clustering was developed to predict toxicological endpoints. This methodology utilizes Ward's method to divide a training set into a series of structurally similar clusters. The structural similarity is defined in terms of 2-D physicochemical descriptors (such as connectivity and E-state indices). A genetic algorithm-based technique is used to generate statistically valid QSAR models for each cluster (using the pool of descriptors described above). The toxicity for a given query compound is estimated using the weighted average of the predictions from the closest cluster from each step in the hierarchical clustering assuming that the compound is within the domain of applicability of the cluster. The hierarchical clustering methodology was tested using a Tetrahymena pyriformis acute toxicity data set containing 644 chemicals in the training set and with two prediction sets containing 339 and 110 chemicals. The results from the hierarchical clustering methodology were compared to the results from several different QSAR methodologies.
The impact of baryons on massive galaxy clusters: halo structure and cluster mass estimates
NASA Astrophysics Data System (ADS)
Henson, Monique A.; Barnes, David J.; Kay, Scott T.; McCarthy, Ian G.; Schaye, Joop
2017-03-01
We use the BAHAMAS (BAryons and HAloes of MAssive Systems) and MACSIS (MAssive ClusterS and Intercluster Structures) hydrodynamic simulations to quantify the impact of baryons on the mass distribution and dynamics of massive galaxy clusters, as well as the bias in X-ray and weak lensing mass estimates. These simulations use the subgrid physics models calibrated in the BAHAMAS project, which include feedback from both supernovae and active galactic nuclei. They form a cluster population covering almost two orders of magnitude in mass, with more than 3500 clusters with masses greater than 1014 M⊙ at z = 0. We start by characterizing the clusters in terms of their spin, shape and density profile, before considering the bias in both weak lensing and hydrostatic mass estimates. Whilst including baryonic effects leads to more spherical, centrally concentrated clusters, the median weak lensing mass bias is unaffected by the presence of baryons. In both the dark matter only and hydrodynamic simulations, the weak lensing measurements underestimate cluster masses by ≈10 per cent for clusters with M200 ≤ 1015 M⊙ and this bias tends to zero at higher masses. We also consider the hydrostatic bias when using both the true density and temperature profiles, and those derived from X-ray spectroscopy. When using spectroscopic temperatures and densities, the hydrostatic bias decreases as a function of mass, leading to a bias of ≈40 per cent for clusters with M500 ≥ 1015 M⊙. This is due to the presence of cooler gas in the cluster outskirts. Using mass weighted temperatures and the true density profile reduces this bias to 5-15 per cent.
Abell 315: reconciling cluster mass estimates from kinematics, X-ray, and lensing
NASA Astrophysics Data System (ADS)
Biviano, A.; Popesso, P.; Dietrich, J. P.; Zhang, Y.-Y.; Erfanianfar, G.; Romaniello, M.; Sartoris, B.
2017-06-01
Context. Determination of cluster masses is a fundamental tool for cosmology. Comparing mass estimates obtained by different probes allows us to understand possible systematic uncertainties. Aims: The cluster Abell 315 is an interesting test case, since it has been claimed to be underluminous in X-rays for its mass (determined via kinematics and weak lensing). We have undertaken new spectroscopic observations with the aim of improving the cluster mass estimate, using the distribution of galaxies in projected phase space. Methods: We identified cluster members in our new spectroscopic sample. We estimated the cluster mass from the projected phase-space distribution of cluster members using the MAMPOSSt method. In doing this estimate we took into account the presence of substructures that we were able to identify. Results: We identify several cluster substructures. The main two have an overlapping spatial distribution, suggesting a (past or ongoing) collision along the line of sight. After accounting for the presence of substructures, the mass estimate of Abell 315 from kinematics is reduced by a factor of 4, down to M200 = 0.8+0.6-0.4 × 10^14 M⊙. We also find evidence that the cluster mass concentration is unusually low, c200 ≡ r200/r-2 ≲ 1. Using our new estimate of c200 we revise the weak lensing mass estimate down to M200 = 1.8+1.7-0.9 × 10^14 M⊙. Our new mass estimates are in agreement with that derived from the cluster X-ray luminosity via a scaling relation, M200 = 0.9 ± 0.2 × 10^14 M⊙. Conclusions: Abell 315 no longer belongs to the class of X-ray underluminous clusters. Its mass estimate was inflated by the presence of an undetected subcluster in collision with the main cluster. Whether the presence of undetected line-of-sight structures can be a general explanation for all X-ray underluminous clusters remains to be explored using a statistically significant sample. Based in large part on data collected at the ESO VLT (prog. ID 083.A-0930).
Estimation Methods for Mixed Logistic Models with Few Clusters.
McNeish, Daniel
2016-01-01
For mixed models generally, it is well known that modeling data with few clusters will result in biased estimates, particularly of the variance components and fixed effect standard errors. In linear mixed models, small sample bias is typically addressed through restricted maximum likelihood estimation (REML) and a Kenward-Roger correction. Yet with binary outcomes, there is no direct analog of either procedure. With a larger number of clusters, estimation methods for binary outcomes that approximate the likelihood to circumvent the lack of a closed form solution such as adaptive Gaussian quadrature and the Laplace approximation have been shown to yield less-biased estimates than linearization estimation methods that instead linearly approximate the model. However, adaptive Gaussian quadrature and the Laplace approximation are approximating the full likelihood rather than the restricted likelihood; the full likelihood is known to yield biased estimates with few clusters. On the other hand, linearization methods linearly approximate the model, which allows for restricted maximum likelihood and the Kenward-Roger correction to be applied. Thus, the following question arises: Which is preferable, a better approximation of a biased function or a worse approximation of an unbiased function? We address this question with a simulation and an illustrative empirical analysis.
IMPROVING BIOGENIC EMISSION ESTIMATES WITH SATELLITE IMAGERY
This presentation will review how existing and future applications of satellite imagery can improve the accuracy of biogenic emission estimates. Existing applications of satellite imagery to biogenic emission estimates have focused on characterizing land cover. Vegetation dat...
A data-driven approach to estimating the number of clusters in hierarchical clustering
Zambelli, Antoine E.
2016-01-01
DNA microarray and gene expression problems often require a researcher to perform clustering on their data in a bid to better understand its structure. In cases where the number of clusters is not known, one can resort to hierarchical clustering methods. However, there currently exist very few automated algorithms for determining the true number of clusters in the data. We propose two new methods (mode and maximum difference) for estimating the number of clusters in a hierarchical clustering framework to create a fully automated process with no human intervention. These methods are compared to the established elbow and gap statistic algorithms using simulated datasets and the Biobase Gene ExpressionSet. We also explore a data mixing procedure inspired by cross validation techniques. We find that the overall performance of the maximum difference method is comparable or greater to that of the gap statistic in multi-cluster scenarios, and achieves that performance at a fraction of the computational cost. This method also responds well to our mixing procedure, which opens the door to future research. We conclude that both the mode and maximum difference methods warrant further study related to their mixing and cross-validation potential. We particularly recommend the use of the maximum difference method in multi-cluster scenarios given its accuracy and execution times, and present it as an alternative to existing algorithms. PMID:28408972
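The "maximum difference" rule is not spelled out in the abstract; one plausible reading, sketched below on toy 1-D data, cuts the dendrogram where the jump between successive merge heights is largest (the tiny complete-linkage implementation and the data are illustrative, not the authors' code):

```python
# Agglomerate 1-D points with complete linkage, record the height (distance)
# of every merge, then estimate the number of clusters as the count left
# just before the largest jump in merge height.

def complete_linkage_heights(points):
    clusters = [[p] for p in sorted(points)]
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        heights.append(d)
    return heights  # heights[t] = distance at merge t; len = n - 1

def max_difference_k(points):
    h = complete_linkage_heights(points)
    gaps = [h[t + 1] - h[t] for t in range(len(h) - 1)]
    t_star = max(range(len(gaps)), key=lambda t: gaps[t])
    # stopping after merge t_star leaves n - (t_star + 1) clusters
    return len(points) - (t_star + 1)

data = [0.0, 0.1, 0.2, 0.3, 8.0, 8.1, 8.2]  # two well-separated groups
print(max_difference_k(data))  # → 2
```

Like the elbow and gap statistic, this family of rules turns the "how many clusters" question into locating a discontinuity in a one-dimensional diagnostic curve.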
Biomedical ontology improves biomedical literature clustering performance: a comparison study.
Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol
2007-01-01
Document clustering has been used for better document retrieval and text mining. In this paper, we investigate whether a biomedical ontology improves biomedical literature clustering performance in terms of effectiveness and scalability. For this investigation, we perform a comprehensive comparison study of various document clustering approaches, such as hierarchical clustering methods, Bisecting K-means, K-means and Suffix Tree Clustering (STC). According to our experimental results, a biomedical ontology significantly enhances clustering quality on biomedical documents. In addition, our results show that decent document clustering approaches, such as Bisecting K-means, K-means and STC, gain some benefit from the ontology, while hierarchical algorithms, which show the poorest clustering quality, do not reap the benefit of the biomedical ontology.
Improved Ant Colony Clustering Algorithm and Its Performance Study.
Gao, Wei
2016-01-01
Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering.
Clustering-based redshift estimation: application to VIPERS/CFHTLS
NASA Astrophysics Data System (ADS)
Scottez, V.; Mellier, Y.; Granett, B. R.; Moutard, T.; Kilbinger, M.; Scodeggio, M.; Garilli, B.; Bolzonella, M.; de la Torre, S.; Guzzo, L.; Abbas, U.; Adami, C.; Arnouts, S.; Bottini, D.; Branchini, E.; Cappi, A.; Cucciati, O.; Davidzon, I.; Fritz, A.; Franzetti, P.; Iovino, A.; Krywult, J.; Le Brun, V.; Le Fèvre, O.; Maccagni, D.; Małek, K.; Marulli, F.; Polletta, M.; Pollo, A.; Tasca, L. A. M.; Tojeiro, R.; Vergani, D.; Zanichelli, A.; Bel, J.; Coupon, J.; De Lucia, G.; Ilbert, O.; McCracken, H. J.; Moscardini, L.
2016-10-01
We explore the accuracy of the clustering-based redshift estimation proposed by Ménard et al. when applied to VIMOS Public Extragalactic Redshift Survey (VIPERS) and Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) real data. This method enables us to reconstruct redshift distributions from measurement of the angular clustering of objects using a set of secure spectroscopic redshifts. We use state-of-the-art spectroscopic measurements with iAB < 22.5 from the VIPERS as reference population to infer the redshift distribution of galaxies from the CFHTLS T0007 release. VIPERS provides a nearly representative sample to a flux limit of iAB < 22.5 at a redshift of >0.5 which allows us to test the accuracy of the clustering-based redshift distributions. We show that this method enables us to reproduce the true mean colour-redshift relation when both populations have the same magnitude limit. We also show that this technique allows the inference of redshift distributions for a population fainter than the reference and we give an estimate of the colour-redshift mapping in this case. This last point is of great interest for future large-redshift surveys which require a complete faint spectroscopic sample.
Improving clustering by imposing network information
Gerber, Susanne; Horenko, Illia
2015-01-01
Cluster analysis is one of the most popular data analysis tools in a wide range of applied disciplines. We propose and justify a computationally efficient and straightforward-to-implement way of imposing the available information from networks/graphs (a priori available in many application areas) on a broad family of clustering methods. The introduced approach is illustrated on the problem of a noninvasive unsupervised brain signal classification. This task is faced with several challenging difficulties such as nonstationary noisy signals and a small sample size, combined with a high-dimensional feature space and huge noise-to-signal ratios. Applying this approach results in an exact unsupervised classification of very short signals, opening new possibilities for clustering methods in the area of a noninvasive brain-computer interface. PMID:26601225
Improved preprocessing and data clustering for land mine discrimination
NASA Astrophysics Data System (ADS)
Mereddy, Pramodh; Agarwal, Sanjeev; Rao, Vittal S.
2000-08-01
In this paper we discuss an improved algorithm for sensor-specific data processing and unsupervised data clustering for landmine discrimination. Pre-processor and data-clustering modules form a central part of a modular sensor fusion architecture for landmine detection and discrimination. The dynamic unsupervised clustering algorithm is based on Dignet clustering. The self-organizing capability of Dignet rests on the idea of competitive generation and elimination of attraction wells, each characterized by a center, a width and a depth. The Dignet architecture assumes prior knowledge of the data characteristics in the form of a predefined well width. In this paper some modifications to the Dignet architecture are presented in order to make Dignet a truly self-organizing and data-independent clustering algorithm. Information-theoretic pre-processing is used to capture underlying statistical properties of the sensor data, which in turn are used to define important parameters for Dignet clustering, such as the similarity metric and the initial cluster width. The width of each cluster is also adapted online so that a fixed width is not enforced. A suitable procedure for online merge and clean operations is defined to reorganize the cluster development. A concept of dual width is employed to satisfy the competing requirements of compact clusters and high coverage of the data space. The performance of the improved clustering algorithm is compared with the baseline Dignet algorithm using simulated data.
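A hedged sketch of the attraction-well idea described above (the real Dignet algorithm is more involved; the class below uses a single fixed width per clusterer, which is precisely the simplification the paper sets out to remove):

```python
# Online "attraction well" clustering: each well has a center, a width and a
# depth (hit count). A sample deepens and re-centers the nearest well if it
# falls within that well's width; otherwise it spawns a new well.

class WellClusterer:
    def __init__(self, width):
        self.width = width
        self.wells = []  # list of [center, depth]

    def observe(self, x):
        for well in self.wells:
            if abs(x - well[0]) <= self.width:
                well[1] += 1
                # move the center toward x by a 1/depth step (running mean)
                well[0] += (x - well[0]) / well[1]
                return
        self.wells.append([x, 1])

clusterer = WellClusterer(width=1.0)
for x in [0.1, 0.2, 5.0, 0.15, 5.2, 4.9, 10.0]:
    clusterer.observe(x)
print(len(clusterer.wells))  # → 3 wells, near 0.15, 5.0 and 10.0
```

The paper's adaptive-width, merge and clean operations would replace the fixed `width` here, letting the wells reorganize as the data distribution reveals itself.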
Galaxy cluster mass estimation from stacked spectroscopic analysis
NASA Astrophysics Data System (ADS)
Farahi, Arya; Evrard, August E.; Rozo, Eduardo; Rykoff, Eli S.; Wechsler, Risa H.
2016-08-01
We use simulated galaxy surveys to study: (i) how galaxy membership in redMaPPer clusters maps to the underlying halo population, and (ii) the accuracy of a mean dynamical cluster mass, Mσ(λ), derived from stacked pairwise spectroscopy of clusters with richness λ. Using ˜130 000 galaxy pairs patterned after the Sloan Digital Sky Survey (SDSS) redMaPPer cluster sample study of Rozo et al., we show that the pairwise velocity probability density function of central-satellite pairs with mi < 19 in the simulation matches the form seen in Rozo et al. Through joint membership matching, we deconstruct the main Gaussian velocity component into its halo contributions, finding that the top-ranked halo contributes ˜60 per cent of the stacked signal. The halo mass scale inferred by applying the virial scaling of Evrard et al. to the velocity normalization matches, to within a few per cent, the log-mean halo mass derived through galaxy membership matching. We apply this approach, along with miscentring and galaxy velocity bias corrections, to estimate the log-mean matched halo mass at z = 0.2 of SDSS redMaPPer clusters. Employing the velocity bias constraints of Guo et al., we find
Clustering of Casablanca stock market based on hurst exponent estimates
NASA Astrophysics Data System (ADS)
Lahmiri, Salim
2016-08-01
This paper deals with the problem of modeling the Casablanca Stock Exchange (CSE) topology as a complex network during three different market regimes: a general trend characterized by ups and downs, an increasing trend, and a decreasing trend. In particular, a set of seven different Hurst exponent estimators is used to characterize long-range dependence in the generating process of each industrial sector. They are employed in conjunction with a hierarchical clustering approach to examine the co-movements of the Casablanca Stock Exchange industrial sectors. The purpose is to investigate whether cluster structures are similar across the variable, increasing, and decreasing regimes. It is observed that the general structure of the CSE topology changed considerably over the 2009 (variable regime), 2010 (increasing regime), and 2011 (decreasing regime) time periods. The most important findings follow. First, in general a high value of the Hurst exponent is associated with the variable regime and a small one with the decreasing regime; in addition, Hurst estimates during the increasing regime are higher than those of the decreasing regime. Second, correlations between the estimated Hurst exponent vectors of industrial sectors increase when the Casablanca Stock Exchange follows an upward regime, whilst they decrease when the overall market follows a downward regime.
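The rescaled-range (R/S) method is one of the classical Hurst exponent estimators of the kind compared in this study. A minimal sketch, assuming a simple power-of-two chunking scheme (the function name and details are illustrative, not the paper's implementation); the resulting per-sector Hurst vectors could then be fed to any standard hierarchical clustering routine:

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """Estimate the Hurst exponent of a 1-D series via rescaled-range (R/S) analysis:
    fit the slope of log(R/S) against log(chunk size)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    sizes, rs_vals = [], []
    size = min_chunk
    while size <= n // 2:
        rs_per_chunk = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())   # cumulative deviation profile
            r = dev.max() - dev.min()               # range of the profile
            s = chunk.std()                         # chunk standard deviation
            if s > 0:
                rs_per_chunk.append(r / s)
        if rs_per_chunk:
            sizes.append(size)
            rs_vals.append(np.mean(rs_per_chunk))
        size *= 2
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return slope
```

Uncorrelated noise yields estimates near 0.5, while strongly persistent (trending) series push the estimate toward 1, which is the contrast the paper exploits across market regimes.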
Improved correction of VIPERS angular selection effects in clustering measurements
NASA Astrophysics Data System (ADS)
Pezzotta, A.; Granett, B. R.; Bel, J.; Guzzo, L.; de la Torre, S.
2016-10-01
Clustering estimates in galaxy redshift surveys need to account for and correct the way targets are selected from the general population, so as to avoid biasing the measured values of cosmological parameters. The VIMOS Public Extragalactic Redshift Survey (VIPERS) is no exception, involving slit collisions and masking effects. Pushed by the increasing precision of the measurements, e.g. of the growth rate f, we have been re-assessing these effects in detail. We present here an improved correction for the two-point correlation function, capable of recovering the amplitude of the monopole of the two-point correlation function ξ(r) above 1 h-1 Mpc to better than 2.
The improvement and simulation for LEACH clustering routing protocol
NASA Astrophysics Data System (ADS)
Ji, Ai-guo; Zhao, Jun-xiang
2017-01-01
An energy-balanced unequal multi-hop clustering routing protocol, LEACH-EUMC, is proposed in this paper. Candidate cluster head nodes are elected first; they then compete to become formal cluster head nodes with the addition of energy and distance factors; finally, the data are transferred to the sink through multi-hop routing. Simulation results show that the improved algorithm outperforms LEACH in network lifetime, energy consumption, and the amount of data transmitted.
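For reference, baseline LEACH elects cluster heads with the classic stochastic threshold T(n); the energy and distance factors of LEACH-EUMC are improvements beyond this sketch. A minimal illustration (function names are ours):

```python
import random

def leach_threshold(p, round_no, was_head_this_epoch):
    """Classic LEACH threshold T(n): a node becomes cluster head this round if a
    uniform random draw falls below T(n); nodes that already served as head in
    the current epoch (1/p rounds) are excluded until the epoch restarts."""
    if was_head_this_epoch:
        return 0.0
    return p / (1.0 - p * (round_no % round(1.0 / p)))

def elect_heads(n_nodes, p, round_no, heads_history, rng):
    """Return the nodes elected cluster head in this round, updating the
    set of nodes that have already served in the current epoch."""
    heads = []
    for node in range(n_nodes):
        if rng.random() < leach_threshold(p, round_no, node in heads_history):
            heads.append(node)
            heads_history.add(node)
    return heads
```

With p = 0.05 the epoch is 20 rounds; by the last round of the epoch the threshold reaches 1, guaranteeing every remaining node serves as head exactly once per epoch.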
Motion estimation using point cluster method and Kalman filter.
Senesh, M; Wolf, A
2009-05-01
The most frequently used method in three-dimensional human gait analysis involves placing markers on the skin of the analyzed segment. This introduces a significant artifact, which strongly influences estimates of bone position and orientation and of joint kinematics. In this study, we tested and evaluated the effect of adding a Kalman filter procedure to the previously reported point cluster technique (PCT) for the estimation of rigid body motion. We demonstrated the procedures by motion analysis of a compound planar pendulum from indirect opto-electronic measurements of markers attached to an elastic appendage that is restrained to slide along the rigid body's long axis. The elastic frequency is close to the pendulum frequency, as in the biomechanical problem, where the soft tissue frequency content is similar to the actual movement of the bones. Comparison of the real pendulum angle to that obtained by several estimation procedures (PCT, Kalman filter followed by PCT, and low pass filter followed by PCT) enables evaluation of the accuracy of the procedures. When comparing the maximal amplitude, no effect was noted from adding the Kalman filter; however, a closer look at the signal revealed that the estimated angle based only on the PCT method was very noisy with fluctuations, while the estimated angle based on the Kalman filter followed by the PCT was a smooth signal. It was also noted that the instantaneous frequencies obtained from the estimated angle based on the PCT method are more dispersed than those obtained from the estimated angle based on the Kalman filter followed by the PCT method. Adding a Kalman filter to the PCT method in the estimation procedure of rigid body motion results in a smoother signal that better represents the real motion, with less signal distortion than when using a digital low pass filter. Furthermore, it can be concluded that adding a Kalman filter to the PCT procedure substantially reduces the dispersion of the maximal and minimal
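As an illustration of the smoothing step described above (a sketch, not the authors' exact implementation), a one-dimensional constant-velocity Kalman filter over noisy angle measurements looks like this; the process and measurement noise levels q and r are hypothetical tuning values:

```python
import numpy as np

def kalman_smooth_angle(measurements, dt, q=1e-3, r=0.04):
    """Constant-velocity Kalman filter over a scalar angle signal.
    State = (angle, angular rate); only the angle is observed."""
    F = np.array([[1.0, dt], [0.0, 1.0]])                       # state transition
    H = np.array([[1.0, 0.0]])                                  # observation model
    Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]]) # process noise
    R = np.array([[r]])                                         # measurement noise
    x = np.array([[measurements[0]], [0.0]])
    P = np.eye(2)
    out = []
    for z in measurements:
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update with the new measurement
        y = z - (H @ x)[0, 0]
        S = H @ P @ H.T + R
        K = P @ H.T / S[0, 0]
        x = x + K * y
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0, 0])
    return np.array(out)
```

The filtered angle varies far less from sample to sample than the raw measurements, mirroring the smooth-versus-fluctuating contrast the abstract reports between the Kalman+PCT and PCT-only estimates.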
Improving performance through concept formation and conceptual clustering
NASA Technical Reports Server (NTRS)
Fisher, Douglas H.
1992-01-01
Research from June 1989 through October 1992 focused on concept formation, clustering, and supervised learning for purposes of improving the efficiency of problem solving, planning, and diagnosis. These projects resulted in two dissertations on clustering, explanation-based learning, and means-ends planning, as well as publications in conferences and workshops, several book chapters, and journals; a complete bibliography of NASA Ames supported publications is included. The following topics are studied: clustering of explanations and problem-solving experiences; clustering and means-ends planning; and diagnosis of space shuttle and space station operating modes.
Nanostar clustering improves the sensitivity of plasmonic assays
Park, Yong; Im, Hyungsoon; Weissleder, Ralph; Lee, Hakho
2017-01-01
Star-shaped Au nanoparticles (Au nanostars, AuNS) have been developed to improve plasmonic sensitivity, but their application has largely been limited to single-particle probes. We herein describe an AuNS clustering assay, based on nanoscale self-assembly of multiple AuNS, which further increases detection sensitivity. We show that each cluster contains multiple nanogaps to concentrate electric fields, thereby amplifying the signal via plasmon coupling. Numerical simulation indicated that AuNS clusters assume up to 460-fold higher field density than Au nanosphere clusters of similar mass. The results were validated in model assays of protein biomarker detection, in which the AuNS clustering assay showed higher sensitivity than the Au nanosphere counterpart. Minimizing the size of the affinity ligand was found to be important for tightly confining electric fields and improving the sensitivity. The resulting assay is simple and fast, and can be readily applied to point-of-care molecular detection schemes. PMID:26102604
Xing, Jian; Burkom, Howard; Moniz, Linda; Edgerton, James; Leuze, Michael; Tokars, Jerome
2009-01-01
Background The Centers for Disease Control and Prevention's (CDC's) BioSense system provides near-real time situational awareness for public health monitoring through analysis of electronic health data. Determination of anomalous spatial and temporal disease clusters is a crucial part of the daily disease monitoring task. Our study focused on finding useful anomalies at manageable alert rates according to available BioSense data history. Methods The study dataset included more than 3 years of daily counts of military outpatient clinic visits for respiratory and rash syndrome groupings. We applied four spatial estimation methods in implementations of space-time scan statistics cross-checked in Matlab and C. We compared the utility of these methods according to the resultant background cluster rate (a false alarm surrogate) and sensitivity to injected cluster signals. The comparison runs used a spatial resolution based on the facility zip code in the patient record and a finer resolution based on the residence zip code. Results Simple estimation methods that account for day-of-week (DOW) data patterns yielded a clear advantage both in background cluster rate and in signal sensitivity. A 28-day baseline gave the most robust results for this estimation; the preferred baseline is long enough to remove daily fluctuations but short enough to reflect recent disease trends and data representation. Background cluster rates were lower for the rash syndrome counts than for the respiratory counts, likely because of seasonality and the large scale of the respiratory counts. Conclusion The spatial estimation method should be chosen according to characteristics of the selected data streams. In this dataset with strong day-of-week effects, the overall best detection performance was achieved using subregion averages over a 28-day baseline stratified by weekday or weekend/holiday behavior. Changing the estimation method for particular scenarios involving different spatial resolution
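The winning estimator above, a subregion average over a 28-day baseline stratified by weekday versus weekend/holiday, can be sketched as follows (variable names are ours, not BioSense's):

```python
import numpy as np

def dow_baseline_expectation(counts, weekend_mask, t, window=28):
    """Expected count for day t: the average over days in the prior `window`
    days that fall in the same stratum (weekday vs weekend/holiday) as day t."""
    lo = max(0, t - window)
    past = np.arange(lo, t)
    same_stratum = past[weekend_mask[past] == weekend_mask[t]]
    return counts[same_stratum].mean() if same_stratum.size else float("nan")
```

The 28-day window matches the paper's finding: long enough to wash out daily fluctuations, short enough to track recent disease trends.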
Clustering Information of Non-Sampled Area in Small Area Estimation of Poverty Indicators
NASA Astrophysics Data System (ADS)
Sundara, V. Y.; Kurnia, A.; Sadik, K.
2017-03-01
Empirical Bayes (EB) is an indirect estimation method used to estimate parameters in small areas. Molina and Rao used this method to estimate nonlinear small area parameters based on a nested error model. Problems occur when this method is used to estimate parameters of non-sampled areas, since the estimate is then solely based on a synthetic model which ignores the area effects. This paper proposes an approach that clusters area effects of auxiliary variables by assuming that there are similarities among particular areas. A simulation study is presented to demonstrate the proposed approach, with all estimates evaluated by their relative bias and relative root mean squared error. The simulation results show that the proposed approach can improve the ability of the model to estimate non-sampled areas. The proposed model was applied to estimate poverty indicators at the sub-district level in the regency and city of Bogor, West Java, Indonesia. In this case study, the relative root mean squared error of prediction for empirical Bayes with cluster information is smaller than that of the synthetic model.
Unsupervised, Robust Estimation-based Clustering for Multispectral Images
NASA Technical Reports Server (NTRS)
Netanyahu, Nathan S.
1997-01-01
To prepare for the challenge of handling the archiving and querying of terabyte-sized scientific spatial databases, the NASA Goddard Space Flight Center's Applied Information Sciences Branch (AISB, Code 935) developed a number of characterization algorithms that rely on supervised clustering techniques. The research reported upon here has been aimed at continuing the evolution of some of these supervised techniques, namely the neural network and decision tree-based classifiers, plus extending the approach to incorporate unsupervised clustering algorithms, such as those based on robust estimation (RE) techniques. The algorithms developed under this task should be suited for use by the Intelligent Information Fusion System (IIFS) metadata extraction modules, and as such these algorithms must be fast, robust, and anytime in nature. Finally, so that the planner/scheduler module of the IIFS can oversee the use and execution of these algorithms, all information required by the planner/scheduler must be provided to the IIFS development team to ensure the timely integration of these algorithms into the overall system.
Scott, JoAnna M; deCamp, Allan; Juraska, Michal; Fay, Michael P; Gilbert, Peter B
2017-04-01
Stepped wedge designs are increasingly commonplace and advantageous for cluster randomized trials when it is both unethical to assign placebo and logistically difficult to allocate an intervention simultaneously to many clusters. We study marginal mean models fit with generalized estimating equations for assessing treatment effectiveness in stepped wedge cluster randomized trials. This approach has advantages over the more commonly used mixed models in that (1) the population-average parameters have an important interpretation for public health applications and (2) it avoids untestable assumptions on latent variable distributions and parametric assumptions about error distributions, therefore providing more robust evidence on treatment effects. However, cluster randomized trials typically have a small number of clusters, rendering the standard generalized estimating equation sandwich variance estimator biased and highly variable and hence yielding incorrect inferences. We study the usual asymptotic generalized estimating equation inferences (i.e., using sandwich variance estimators and asymptotic normality) and four small-sample corrections to generalized estimating equations for stepped wedge cluster randomized trials, with parallel cluster randomized trials as a comparison. We show by simulation that the small-sample corrections provide improvement, with one correction appearing to provide at least nominal coverage even with only 10 clusters per group. These results demonstrate the viability of the marginal mean approach for both stepped wedge and parallel cluster randomized trials. We also study the comparative performance of the corrected methods for stepped wedge and parallel designs, and describe how the methods can accommodate interval censoring of individual failure times and incorporate semiparametric efficient estimators.
Estimating cougar predation rates from GPS location clusters
Anderson, C.R.; Lindzey, F.G.
2003-01-01
We examined cougar (Puma concolor) predation from Global Positioning System (GPS) location clusters (≥2 locations within 200 m on the same or consecutive nights) of 11 cougars during September-May, 1999-2001. Location success of GPS averaged 2.4-5.0 of 6 location attempts/night/cougar. We surveyed potential predation sites during summer-fall 2000 and summer 2001 to identify prey composition (n = 74; 3-388 days post predation) and record predation-site variables (n = 97; 3-270 days post predation). We developed a model to estimate the probability that a cougar killed a large mammal from data collected at GPS location clusters, where the probability of predation increased with the number of nights (defined as locations at 2200, 0200, or 0500 hr) of cougar presence within a 200-m radius (P < 0.001). Mean estimated cougar predation rates for large mammals were 7.3 days/kill for subadult females (1-2.5 yr; n = 3, 90% CI: 6.3 to 9.9), 7.0 days/kill for adult females (n = 2, 90% CI: 5.8 to 10.8), 5.4 days/kill for family groups (females with young; n = 3, 90% CI: 4.5 to 8.4), 9.5 days/kill for a subadult male (1-2.5 yr; n = 1, 90% CI: 6.9 to 16.4), and 7.8 days/kill for adult males (n = 2, 90% CI: 6.8 to 10.7). We may have slightly overestimated cougar predation rates due to our inability to separate scavenging from predation. We detected 45 deer (Odocoileus spp.), 15 elk (Cervus elaphus), 6 pronghorn (Antilocapra americana), 2 livestock, 1 moose (Alces alces), and 6 small mammals at cougar predation sites. Comparisons between cougar sexes suggested that females selected mule deer and males selected elk (P < 0.001). Cougars averaged 3.0 nights on pronghorn carcasses, 3.4 nights on deer carcasses, and 6.0 nights on elk carcasses. Most cougar predation (81.7%) occurred between 1901-0500 hr and peaked from 2201-0200 hr (31.7%). Applying GPS technology to identify predation rates and prey selection will allow managers to efficiently estimate the ability of an area's prey base to
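The clustering rule itself (≥2 fixes within 200 m on the same or consecutive nights) can be sketched with a simple greedy pass over nightly GPS fixes; this illustrates the definition, not the authors' field protocol, and assumes coordinates are already in metres (e.g., UTM):

```python
from math import hypot

def find_clusters(points, radius=200.0, night_gap=1):
    """Greedy grouping of GPS fixes given as (night, x, y) tuples: a fix joins a
    cluster if it lies within `radius` metres of the cluster's first fix and within
    `night_gap` nights of the cluster's latest fix. Clusters with >= 2 fixes are
    candidate predation sites."""
    clusters = []
    for night, x, y in sorted(points):
        placed = False
        for c in clusters:
            n0, x0, y0 = c[0]
            if hypot(x - x0, y - y0) <= radius and night - c[-1][0] <= night_gap:
                c.append((night, x, y))
                placed = True
                break
        if not placed:
            clusters.append([(night, x, y)])
    return [c for c in clusters if len(c) >= 2]
```

Longer clusters (more nights of presence) map to higher predation probability in the paper's model, so cluster size is the key output of this step.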
Desai, Manisha; Bryson, Susan W; Robinson, Thomas
2013-03-01
This paper examines the implications of using robust estimators (REs) of standard errors in the presence of clustering when cluster membership is unclear as may commonly occur in clustered randomized trials. For example, in such trials, cluster membership may not be recorded for one or more treatment arms and/or cluster membership may be dynamic. When clusters are well defined, REs have properties that are robust to misspecification of the correlation structure. To examine whether results were sensitive to assumptions about the clustering membership, we conducted simulation studies for a two-arm clinical trial, where the number of clusters, the intracluster correlation (ICC), and the sample size varied. REs of standard errors that incorrectly assumed clustering of data that were truly independent yielded type I error rates of up to 40%. Partial and complete misspecifications of membership (where some and no knowledge of true membership were incorporated into assumptions) for data generated from a large number of clusters (50) with a moderate ICC (0.20) yielded type I error rates that ranged from 7.2% to 9.1% and 10.5% to 45.6%, respectively; incorrectly assuming independence gave a type I error rate of 10.5%. REs of standard errors can be useful when the ICC and knowledge of cluster membership are high. When the ICC is weak, a number of factors must be considered. Our findings suggest guidelines for making sensible analytic choices in the presence of clustering.
Improved entropy rate estimation in physiological data.
Lake, D E
2011-01-01
Calculating entropy rate in physiologic signals has proven very useful in many settings. Common entropy estimates for this purpose are sample entropy (SampEn) and its less robust elder cousin, approximate entropy (ApEn). Both approaches count matches within a tolerance r for templates of length m consecutive observations. When physiologic data records are long and well-behaved, both approaches work very well for a wide range of m and r. However, more attention to the details of the estimation algorithm is needed for short records and signals with anomalies. In addition, interpretation of the magnitude of these estimates is highly dependent on how r is chosen and precludes comparison across studies with even slightly different methodologies. In this paper, we summarize recent novel approaches to improve the accuracy of entropy estimation. An important (but not necessarily new) alternative to current approaches is to develop estimates that convert probabilities to densities by normalizing by the matching region volume. This approach leads to a novel concept introduced here of reporting entropy rate in equivalent Gaussian white noise units. Another approach is to allow r to vary so that a pre-specified number of matches are found, called the minimum numerator count, to ensure confident probability estimation. The approaches are illustrated using a simple example of detecting abnormal cardiac rhythms in heart rate records.
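A compact sample entropy (SampEn) implementation shows the quantities involved; the density normalization and minimum-numerator-count refinements the paper proposes are not reproduced here, and the template bookkeeping is simplified:

```python
import numpy as np

def sampen(x, m=2, r=0.2):
    """Sample entropy: -ln(A/B), where B counts template matches of length m and
    A counts matches of length m+1, with tolerance r given as a fraction of the
    series standard deviation. Self-matches are excluded."""
    x = np.asarray(x, dtype=float)
    tol = r * x.std()
    n = len(x)

    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(n - mm)])
        count = 0
        for i in range(len(templates)):
            # Chebyshev distance of template i to all later templates
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(d <= tol))
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else float("inf")
```

A regular signal produces many repeated templates and hence low SampEn, while uncorrelated noise produces few length-(m+1) matches and high SampEn, which is the contrast exploited when flagging abnormal cardiac rhythms.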
A Hierarchical Clustering Methodology for the Estimation of Toxicity
A Quantitative Structure Activity Relationship (QSAR) methodology based on hierarchical clustering was developed to predict toxicological endpoints. This methodology utilizes Ward's method to divide a training set into a series of structurally similar clusters. The structural sim...
Improved Estimation Model of Lunar Surface Temperature
NASA Astrophysics Data System (ADS)
Zheng, Y.
2015-12-01
Lunar surface temperature (LST) is of great scientific interest, both for uncovering the thermal properties of the Moon and for designing lunar robotic or manned landing missions. In this paper, we propose an improved LST estimation model based on a one-dimensional partial differential equation (PDE), with shadowing and surface tilt effects incorporated into the model. Using the Chang'E-1 (CE-1) DEM data from the Laser Altimeter (LA), the topographic effect can be estimated with an improved effective solar irradiance (ESI) model. In Fig. 1, the highest LST of the global Moon has been estimated at a spatial resolution of 1 degree/pixel, applying the solar albedo data derived from Clementine UV-750nm in solving the PDE. The topographic effect is significant in the LST map: the maria, highlands, and craters can be identified clearly. The maximum daytime LST occurs in regions with low albedo, e.g. Mare Procellarum, Mare Serenitatis, and Mare Imbrium. The results are consistent with the Diviner measurements of the LRO mission. Fig. 2 shows the temperature variations at the center of the disk over one year, assuming the Moon to be a standard sphere. The seasonal variation of LST at the equator is about 10 K, and the highest LST occurs in early May. Fig. 1. Estimated maximum surface temperatures of the global Moon at a spatial resolution of 1 degree/pixel.
Radial Velocities, Metallicities, and Improved Fundamental Parameters of Outer Disk Open Clusters
NASA Astrophysics Data System (ADS)
Zasowski, Gail; Hamm, K.; Beaton, R.; Damke, G.; Carlberg, J. K.; Majewski, S. R.; Frinchaboy, P. M.
2014-01-01
Open stellar clusters have proven to be powerful tools for understanding the structure and stellar evolution of our Galaxy. Using photometry from 2MASS and the new Spitzer-IRAC GLIMPSE-360 surveys, Zasowski et al. (2013) identified and characterized more than a dozen new or poorly studied, heavily reddened open clusters in the outer Galactic disk. Here, we present follow-up spectroscopy for 11 of the clusters. Low resolution optical spectra were obtained with the DIS spectrograph on the Apache Point Observatory 3.5-meter telescope (R˜1200) for candidate members of seven clusters (GLM-CYGX 16, GLM-G360 18, GLM-G360 105, SAI 24, Berkeley 14, Berkeley 14a, and Czernik 20), and with the B&C spectrograph on the Las Campanas Observatory duPont telescope (R˜5400) for three clusters (GLM-G360 50, GLM-G360 75, and GLM-G360 79). High resolution (R˜22,500) infrared (H-band) spectra were also obtained for one cluster (GLM-G360 90) as part of an ancillary program for the SDSS-III/APOGEE survey. We use the mean chemical abundances and radial velocities (RVs) to identify likely cluster members and then revisit our previous isochrone fits. With reddening constrained by the Rayleigh-Jeans Color Excess method and mean metallicities by spectroscopy, the cluster distances and ages are estimated from improved isochrone fits to the stellar overdensity, weighted by confirmed RV and/or abundance members.
Accounting for One-Group Clustering in Effect-Size Estimation
ERIC Educational Resources Information Center
Citkowicz, Martyna; Hedges, Larry V.
2013-01-01
In some instances, intentionally or not, study designs are such that there is clustering in one group but not in the other. This paper describes methods for computing effect size estimates and their variances when there is clustering in only one group and the analysis has not taken that clustering into account. The authors provide the effect size…
Improved Gravitation Field Algorithm and Its Application in Hierarchical Clustering
Zheng, Ming; Sun, Ying; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang
2012-01-01
Background Gravitation field algorithm (GFA) is a new optimization algorithm based on an imitation of natural phenomena. GFA performs well both in searching for a global minimum and for multiple minima in computational biology, but it needs to be improved for efficiency and modified to apply to discrete data problems in systems biology. Method An improved GFA, called IGFA, is proposed in this paper. Two parts were improved in IGFA: the rule of random division, a reasonable strategy that shortens running time, and the rotation factor, which improves accuracy. To apply IGFA to hierarchical clustering, the initialization step and the movement operator were also modified. Results Two kinds of experiments were used to test IGFA, which was then applied to hierarchical clustering. The global minimum experiment compared IGFA, GFA, GA (genetic algorithm), and SA (simulated annealing); the multi-minima experiment compared IGFA and GFA. The results of the two experiments prove the efficiency of IGFA: it is better than GFA in both accuracy and running time. For hierarchical clustering, IGFA is used to optimize the smallest distance of gene pairs, and the results were compared with GA, SA, single-linkage clustering, and UPGMA. The efficiency of IGFA is proved. PMID:23173043
NASA Astrophysics Data System (ADS)
Cui, Jia; Hong, Bei; Jiang, Xuepeng; Chen, Qinghua
2017-05-01
With the purpose of reinforcing correlation analysis of risk assessment threat factors, a dynamic safety risk assessment method based on particle filtering is proposed, with threat analysis at its core. Based on risk assessment standards, the method selects threat indicators, applies a particle filtering algorithm to calculate the influence weight of each threat indicator, and determines information system risk levels by combining this with state estimation theory. In order to improve the computational efficiency of the particle filtering algorithm, the k-means clustering algorithm is introduced: by clustering all particles and taking each centroid as the representative of its cluster, the amount of computation is reduced. Empirical results indicate that the method captures the mutual dependence and influence among risk elements reasonably well. Under circumstances of limited information, it provides a scientific basis for formulating a risk management control strategy.
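The particle-compression idea, replacing each cluster of particles by its centroid while carrying the cluster's summed weight, can be sketched for a one-dimensional state; the quantile initialization and function name are our choices, not the paper's:

```python
import numpy as np

def reduce_particles(particles, weights, k, iters=25):
    """Compress a weighted 1-D particle set to k representatives with k-means:
    each cluster collapses to its mean position, and member weights are summed
    so the compressed set carries the same total probability mass."""
    # deterministic quantile initialization spreads centers across the support
    centers = np.quantile(particles, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.abs(particles[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            members = particles[labels == j]
            if members.size:
                centers[j] = members.mean()
    new_weights = np.array([weights[labels == j].sum() for j in range(k)])
    return centers, new_weights
```

Downstream filter updates then operate on k representatives instead of the full particle set, which is the source of the claimed reduction in computation.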
Improvement of propeller static thrust estimation
NASA Technical Reports Server (NTRS)
Brusse, J.; Kettleborough, C. F.
1975-01-01
The problem of improving the performance estimation of propellers operating in the heavily loaded static thrust condition was studied. The Goldstein theory was assessed as it applies to propellers operating at static thrust. A review of theoretical considerations is presented, along with a summary of the attempts made to obtain a numerical solution. The chordwise pressure distribution was determined during operation at a tip speed of 500 ft/sec. Chordwise integration of the pressures leads to the spanwise load distribution, and further integration would give the axial thrust.
Improvement of Targeting Efficiency in Chaos Control Using Clustering
NASA Astrophysics Data System (ADS)
Sutcu, Y.; Iplikci, S.; Denizhan, Y.
2002-09-01
In this paper an improved version of the previously presented ECR (Extended Control Regions) targeting method is proposed, where the system data is first pre-processed and subdivided into clusters, and then one artificial neural network is assigned to each such cluster. Furthermore, an analytical criterion for determining the region of the current system state during targeting is introduced, whereas in the original ECR method the region information was hidden in the neural networks. Simulation results on several chaotic systems show that this modified version of the ECR method reduces the average reaching time and in general also the training time of the neural networks.
Improving Lidar Turbulence Estimates for Wind Energy
Newman, Jennifer F.; Clifton, Andrew; Churchfield, Matthew J.; Klein, Petra
2016-10-06
Remote sensing devices (e.g., lidars) are quickly becoming a cost-effective and reliable alternative to meteorological towers for wind energy applications. Although lidars can measure mean wind speeds accurately, these devices measure different values of turbulence intensity (TI) than an instrument on a tower. In response to these issues, a lidar TI error reduction model (L-TERRA) was recently developed for commercially available lidars. The model first applies physics-based corrections to the lidar measurements, then uses machine-learning techniques to further reduce errors in lidar TI estimates. The model was tested at two sites in the Southern Plains where vertically profiling lidars were collocated with meteorological towers. This presentation primarily focuses on the physics-based corrections, which include corrections for instrument noise, volume averaging, and variance contamination. As different factors affect TI under different stability conditions, the combination of physical corrections applied in L-TERRA changes depending on the atmospheric stability during each 10-minute time period. This stability-dependent version of L-TERRA performed well at both sites, reducing TI error and bringing lidar TI estimates closer to estimates from instruments on towers. However, there is still scatter evident in the lidar TI estimates, indicating that there are physics not captured in the current version of L-TERRA. Two options are discussed for modeling the remainder of the TI error physics in L-TERRA: machine learning and lidar simulations. Lidar simulations appear to be the better approach, as they can help improve understanding of atmospheric effects on TI error and do not require a large training data set.
An improved Chebyshev distance metric for clustering medical images
NASA Astrophysics Data System (ADS)
Mousa, Aseel; Yusof, Yuhanis
2015-12-01
A metric or distance function is a function which defines a distance between elements of a set. In clustering, measuring the similarity between objects has become an important issue. In practice, various similarity measures are used, including the Euclidean, Manhattan and Minkowski distances. In this paper, an improved Chebyshev similarity measure is introduced to replace existing metrics (such as Euclidean and standard Chebyshev) in clustering analysis. The proposed measure is then applied to the analysis of blood cancer images. Results demonstrate that the proposed measure produces the smallest objective function value and converges in the fewest iterations. Hence, it can be concluded that the proposed distance metric contributes to producing better clusters.
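To illustrate how a Chebyshev (L-infinity) metric can serve as a drop-in replacement for Euclidean distance in a clustering assignment step, here is a minimal sketch; the function names and toy data are illustrative, not from the paper:

```python
def chebyshev(a, b):
    """Chebyshev distance: the maximum coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def assign_to_centroids(points, centroids, metric=chebyshev):
    """Assign each point to the index of its nearest centroid under `metric`."""
    return [min(range(len(centroids)), key=lambda i: metric(p, centroids[i]))
            for p in points]

points = [(0.0, 0.0), (0.1, 0.3), (5.0, 5.2), (4.8, 5.0)]
centroids = [(0.0, 0.0), (5.0, 5.0)]
print(assign_to_centroids(points, centroids))  # → [0, 0, 1, 1]
```

Any k-means-style iteration that accepts a pluggable metric can use `chebyshev` in place of the Euclidean norm in this way.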
Liu, Bo; Lu, Wenbin; Zhang, Jiajia
2013-01-01
Clustered survival data frequently arise in biomedical applications, where event times of interest are clustered into groups such as families. In this article we consider an accelerated failure time frailty model for clustered survival data and develop nonparametric maximum likelihood estimation for it via a kernel-smoother-aided EM algorithm. We show that the proposed estimator for the regression coefficients is consistent, asymptotically normal and semiparametric efficient when the kernel bandwidth is properly chosen. An EM-aided numerical differentiation method is derived for estimating its variance. Simulation studies evaluate the finite sample performance of the estimator, and it is applied to the Diabetic Retinopathy data set. PMID:24443587
Estimation of Carcinogenicity using Hierarchical Clustering and Nearest Neighbor Methodologies
Previously a hierarchical clustering (HC) approach and a nearest neighbor (NN) approach were developed to model acute aquatic toxicity end points. These approaches were developed to correlate the toxicity for large, noncongeneric data sets. In this study these approaches applie...
A Multicriteria Decision Making Approach for Estimating the Number of Clusters in a Data Set
Peng, Yi; Zhang, Yong; Kou, Gang; Shi, Yong
2012-01-01
Determining the number of clusters in a data set is an essential yet difficult step in cluster analysis. Since this task involves more than one criterion, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper proposes an MCDM-based approach to estimate the number of clusters for a given data set. In this approach, MCDM methods consider different numbers of clusters as alternatives and the outputs of any clustering algorithm on validity measures as criteria. The proposed method is examined in an experimental study using three MCDM methods, the well-known k-means clustering algorithm, ten relative measures, and fifteen public-domain UCI machine learning data sets. The results show that MCDM methods work fairly well in estimating the number of clusters in the data and outperform the ten relative measures considered in the study. PMID:22870181
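The core idea — candidate cluster counts as alternatives, validity measures as criteria — can be sketched with a simple weighted-sum MCDM ranking. The paper uses richer MCDM methods; the validity scores below are invented for illustration:

```python
def normalize(col, benefit=True):
    """Min-max normalize a criterion column; invert it for cost criteria."""
    lo, hi = min(col), max(col)
    if hi == lo:
        return [1.0] * len(col)
    return [(v - lo) / (hi - lo) if benefit else (hi - v) / (hi - lo) for v in col]

def rank_alternatives(matrix, benefit_flags, weights):
    """matrix[i][j]: validity measure j for candidate number of clusters i."""
    cols = list(zip(*matrix))
    norm = [normalize(c, b) for c, b in zip(cols, benefit_flags)]
    return [sum(w * norm[j][i] for j, w in enumerate(weights))
            for i in range(len(matrix))]

# Rows: k = 2, 3, 4; columns: silhouette (benefit), Davies-Bouldin (cost).
matrix = [[0.55, 0.90], [0.71, 0.60], [0.48, 0.95]]
scores = rank_alternatives(matrix, [True, False], [0.5, 0.5])
best_k = [2, 3, 4][scores.index(max(scores))]
print(best_k)  # → 3
```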
NASA Astrophysics Data System (ADS)
Raghunathan, Srinivasan; Patil, Sanjaykumar; Baxter, Eric J.; Bianchini, Federico; Bleem, Lindsey E.; Crawford, Thomas M.; Holder, Gilbert P.; Manzotti, Alessandro; Reichardt, Christian L.
2017-08-01
We develop a Maximum Likelihood estimator (MLE) to measure the masses of galaxy clusters through the impact of gravitational lensing on the temperature and polarization anisotropies of the cosmic microwave background (CMB). We show that, at low noise levels in temperature, this optimal estimator outperforms the standard quadratic estimator by a factor of two. For polarization, we show that the Stokes Q/U maps can be used instead of the traditional E- and B-mode maps without losing information. We test and quantify the bias in the recovered lensing mass for a comprehensive list of potential systematic errors. Using realistic simulations, we examine the cluster mass uncertainties from CMB-cluster lensing as a function of an experiment's beam size and noise level. We predict the cluster mass uncertainties will be 3 - 6% for SPT-3G, AdvACT, and Simons Array experiments with 10,000 clusters and less than 1% for the CMB-S4 experiment with a sample containing 100,000 clusters. The mass constraints from CMB polarization are very sensitive to the experimental beam size and map noise level: for a factor of three reduction in either the beam size or noise level, the lensing signal-to-noise improves by roughly a factor of two.
VizieR Online Data Catalog: Metallicity estimates of M31 globular clusters (Galleti+, 2009)
NASA Astrophysics Data System (ADS)
Galleti, S.; Bellazzini, M.; Buzzoni, A.; Federici, L.; Fusi Pecci, F.
2010-04-01
New empirical relations of [Fe/H] as a function of [MgFe] and Mg2 indices are based on the well-studied galactic globular clusters, complemented with theoretical model predictions for -0.2<=[Fe/H]<=+0.5. Lick indices for M31 clusters from various literature sources (225 clusters) and from new observations by our team (71 clusters) have been transformed into the Trager et al. (2000AJ....119.1645T) system, yielding new metallicity estimates for 245 globular clusters of M31. (3 data files).
Xiao, Yongling; Abrahamowicz, Michal
2010-03-30
We propose two bootstrap-based methods to correct the standard errors (SEs) from Cox's model for within-cluster correlation of right-censored event times. The cluster-bootstrap method resamples, with replacement, only the clusters, whereas the two-step bootstrap method resamples (i) the clusters, and (ii) individuals within each selected cluster, with replacement. In simulations, we evaluate both methods and compare them with the existing robust variance estimator and the shared gamma frailty model, which are available in statistical software packages. We simulate clustered event time data, with latent cluster-level random effects, which are ignored in the conventional Cox's model. For cluster-level covariates, both proposed bootstrap methods yield accurate SEs, valid type I error rates, and acceptable coverage rates, regardless of the true random-effects distribution, and avoid the serious variance underestimation of conventional Cox-based standard errors. However, the two-step bootstrap method overestimates the variance for individual-level covariates. We also apply the proposed bootstrap methods to obtain confidence bands around flexible estimates of time-dependent effects in a real-life analysis of clustered event times.
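The cluster-bootstrap resampling scheme can be sketched as follows. Whole clusters are drawn with replacement and the statistic is recomputed on each replicate; a pooled mean stands in here for the Cox regression coefficients, and all data are illustrative:

```python
import random
import statistics

def cluster_bootstrap_se(clusters, stat, n_boot=2000, seed=1):
    """SE of `stat` under resampling of whole clusters with replacement."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        resampled = [rng.choice(clusters) for _ in clusters]  # whole clusters
        pooled = [x for c in resampled for x in c]
        reps.append(stat(pooled))
    return statistics.stdev(reps)

clusters = [[1.2, 1.4], [3.1, 2.9, 3.0], [0.5], [2.2, 2.4]]
se = cluster_bootstrap_se(clusters, statistics.mean)
print(round(se, 3))
```

The two-step variant described above would additionally resample individuals within each selected cluster before pooling.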
An improved unsupervised clustering-based intrusion detection method
NASA Astrophysics Data System (ADS)
Hai, Yong J.; Wu, Yu; Wang, Guo Y.
2005-03-01
Practical Intrusion Detection Systems (IDSs) based on data mining face two key problems: discovering intrusion knowledge from real-time network data, and automatically updating that knowledge when new intrusions appear. Most data mining algorithms work on labeled data. In order to set up a basic data set for mining, huge volumes of network data need to be collected and labeled manually. In fact, labeling intrusions is difficult and impractical, which has been a major limitation of current IDSs and has restricted their ability to identify all intrusion types. An improved unsupervised clustering-based intrusion model working on unlabeled training data is introduced. In this model, the center of a cluster is defined and used as a substitute for that cluster. All cluster centers are then used to detect intrusions. Testing on the KDDCUP'99 data sets, experimental results demonstrate that our method achieves a good detection rate. Furthermore, an incremental-learning method is adopted to detect unknown intrusion types, decreasing the false positive rate.
Young, Siobhan K; Lyles, Robert H; Kupper, Lawrence L; Keys, Jessica R; Martin, Sandra L; Costenbader, Elizabeth C
2014-06-01
Population sexual mixing patterns can be quantified using Newman's assortativity coefficient (r). Suggested methods for estimating the SE for r may lead to inappropriate statistical conclusions in situations where intracluster correlation is ignored and/or when cluster size is predictive of the response. We describe a computer-intensive, but highly accessible, within-cluster resampling approach for providing a valid large-sample estimated SE for r and an associated 95% CI. We introduce needed statistical notation and describe the within-cluster resampling approach. Sexual network data and a simulation study were employed to compare within-cluster resampling with standard methods when cluster size is informative. For the analysis of network data when cluster size is informative, the simulation study demonstrates that within-cluster resampling produces valid statistical inferences about Newman's assortativity coefficient, a popular statistic used to quantify the strength of mixing patterns. In contrast, commonly used methods are biased with attendant extremely poor CI coverage. Within-cluster resampling is recommended for providing valid statistical inferences when applying Newman's assortativity coefficient r to network data, particularly when cluster size is informative and/or there is within-cluster response correlation.
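The within-cluster resampling procedure draws one observation per cluster, recomputes the statistic on the resulting independent sample, and averages over many resamples. A minimal sketch, with a simple mean standing in for Newman's r and invented data whose large cluster has systematically higher values (i.e., informative cluster size):

```python
import random
import statistics

def within_cluster_resample(clusters, stat, n_resamples=1000, seed=7):
    """Average of `stat` over samples drawing one observation per cluster."""
    rng = random.Random(seed)
    draws = [stat([rng.choice(c) for c in clusters]) for _ in range(n_resamples)]
    return statistics.mean(draws), statistics.stdev(draws)

clusters = [[5.0, 5.1, 5.2, 4.9, 5.0], [1.0], [2.0]]
wcr_mean, _ = within_cluster_resample(clusters, statistics.mean)
naive_mean = statistics.mean(x for c in clusters for x in c)
print(round(wcr_mean, 2), round(naive_mean, 2))  # WCR weights clusters equally
```

Because each resample takes exactly one observation per cluster, the large cluster no longer dominates, which is the property that makes the approach valid when cluster size is informative.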
IMPROVED RISK ESTIMATES FOR CARBON TETRACHLORIDE
Benson, Janet M.; Springer, David L.
1999-12-31
Carbon tetrachloride has been used extensively within the DOE nuclear weapons facilities. Rocky Flats was formerly the largest volume consumer of CCl4 in the United States, using 5000 gallons in 1977 alone (Ripple, 1992). At the Hanford site, several hundred thousand gallons of CCl4 were discharged between 1955 and 1973 into underground cribs for storage. Levels of CCl4 in groundwater at highly contaminated sites at the Hanford facility have exceeded the drinking water standard of 5 ppb by several orders of magnitude (Illman, 1993). High levels of CCl4 at these facilities represent a potential health hazard for workers conducting cleanup operations and for surrounding communities. The level of CCl4 cleanup required at these sites and the associated costs are driven by current human health risk estimates, which assume that CCl4 is a genotoxic carcinogen. The overall purpose of these studies was to improve the scientific basis for assessing the health risk associated with human exposure to CCl4. Specific research objectives of this project were to: (1) compare the rates of CCl4 metabolism by rats, mice and hamsters in vivo and extrapolate those rates to man based on parallel studies on the metabolism of CCl4 by rat, mouse, hamster and human hepatic microsomes in vitro; (2) using hepatic microsome preparations, determine the role of specific cytochrome P450 isoforms in CCl4-mediated toxicity and the effects of repeated inhalation and ingestion of CCl4 on these isoforms; and (3) evaluate the toxicokinetics of inhaled CCl4 in rats, mice and hamsters. This information has been used to improve the physiologically based pharmacokinetic (PBPK) model for CCl4 originally developed by Paustenbach et al. (1988) and more recently revised by Thrall and Kenny (1996). Another major objective of the project was to provide scientific evidence that CCl4, like chloroform, is a hepatocarcinogen only when exposure results in cell damage, cell killing and regenerative proliferation.
Research opportunities to improve DSM impact estimates
Misuriello, H.; Hopkins, M.E.F.
1992-03-01
This report was commissioned by the California Institute for Energy Efficiency (CIEE) as part of its research mission to advance the energy efficiency and productivity of all end-use sectors in California. Our specific goal in this effort has been to identify viable research and development (R&D) opportunities that can improve capabilities to determine the energy-use and demand reductions achieved through demand-side management (DSM) programs and measures. We surveyed numerous practitioners in California and elsewhere to identify the major obstacles to effective impact evaluation, drawing on their collective experience. As a separate effort, we have also profiled the status of regulatory practices in leading states with respect to DSM impact evaluation. We have synthesized this information, adding our own perspective and experience to those of our survey-respondent colleagues, to characterize today's state of the art in impact-evaluation practices. This scoping study takes a comprehensive look at the problems and issues involved in DSM impact estimates at the customer-facility or site level. The major portion of our study investigates three broad topic areas of interest to CIEE: data analysis issues, field-monitoring issues, and issues in evaluating DSM measures. Across these three topic areas, we have identified 22 potential R&D opportunities, to which we have assigned priority levels. These R&D opportunities are listed by topic area and priority.
NASA Astrophysics Data System (ADS)
Strauss, Cesar; Rosa, Marcelo Barbio; Stephany, Stephan
2013-12-01
Convective cells are cloud formations whose growth, maturation and dissipation are of great interest among meteorologists since they are associated with severe storms with large precipitation structures. Some works suggest a strong correlation between lightning occurrence and convective cells. The current work proposes a new approach to analyze the correlation between precipitation and lightning, and to identify electrically active cells. Such cells may be employed for tracking convective events in the absence of weather radar coverage. This approach employs a new spatio-temporal clustering technique based on a temporal sliding-window and a standard kernel density estimation to process lightning data. Clustering allows the identification of the cells from lightning data and density estimation bounds the contours of the cells. The proposed approach was evaluated for two convective events in Southeast Brazil. Image segmentation of radar data was performed to identify convective precipitation structures using the Steiner criteria. These structures were then compared and correlated to the electrically active cells in particular instants of time for both events. It was observed that most precipitation structures have associated cells, by comparing the ground tracks of their centroids. In addition, for one particular cell of each event, its temporal evolution was compared to that of the associated precipitation structure. Results show that the proposed approach may improve the use of lightning data for tracking convective events in countries that lack weather radar coverage.
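The sliding-window grouping of lightning strokes into active cells can be sketched as follows; the time window, distance threshold, and data are illustrative, and the kernel-density step that bounds the cell contours is omitted:

```python
def cluster_strokes(strokes, window=600.0, radius=10.0):
    """strokes: (t, x, y) tuples sorted by time; returns a cluster label per stroke.

    A stroke joins the first active cell seen within `window` seconds and
    `radius` km of the cell's last stroke; otherwise it starts a new cell.
    """
    labels, clusters = [], []  # clusters: [t_last, x_last, y_last] per cell
    for t, x, y in strokes:
        best = None
        for i, (t0, cx, cy) in enumerate(clusters):
            if t - t0 <= window and ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 <= radius:
                best = i
                break
        if best is None:
            clusters.append([t, x, y])
            labels.append(len(clusters) - 1)
        else:
            clusters[best] = [t, x, y]
            labels.append(best)
    return labels

strokes = [(0, 0, 0), (60, 1, 1), (120, 50, 50), (600, 0.5, 0.5)]
print(cluster_strokes(strokes))  # → [0, 0, 1, 0]
```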
An improved clustering algorithm based on reverse learning in intelligent transportation
NASA Astrophysics Data System (ADS)
Qiu, Guoqing; Kou, Qianqian; Niu, Ting
2017-05-01
With the development of artificial intelligence and data mining technology, big data has come into widespread focus. Clustering is an important method for processing large data sets. We introduce a reverse learning step into the clustering process of the PAM algorithm to address the limitations of single-pass clustering in unsupervised learning and to increase the diversity of the resulting clusters, thereby improving clustering quality. Algorithm analysis and experimental results show that the algorithm is feasible.
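Reverse learning (often called opposition-based learning) can be illustrated with the standard opposite-point construction used to diversify candidate solutions; this is a generic sketch, not the paper's exact procedure:

```python
def opposite_point(p, lower, upper):
    """Opposition-based learning: reflect a candidate point within its bounds.

    For each coordinate x in [lo, hi], the opposite value is lo + hi - x.
    """
    return tuple(lo + hi - x for x, lo, hi in zip(p, lower, upper))

p = (2.0, 8.0)
print(opposite_point(p, lower=(0.0, 0.0), upper=(10.0, 10.0)))  # → (8.0, 2.0)
```

In a PAM-style search, evaluating both a candidate medoid and its opposite doubles the diversity of the explored configurations at little cost.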
Open-Source Sequence Clustering Methods Improve the State Of the Art
Navas-Molina, Jose A.; Mercier, Céline; Xu, Zhenjiang Zech; Mahé, Frédéric; He, Yan; Zhou, Hong-Wei; Rognes, Torbjørn; Caporaso, J. Gregory; Knight, Rob
2016-01-01
Sequence clustering is a common early step in amplicon-based microbial community analysis, when raw sequencing reads are clustered into operational taxonomic units (OTUs) to reduce the run time of subsequent analysis steps. Here, we evaluated the performance of recently released state-of-the-art open-source clustering software products, namely, OTUCLUST, Swarm, SUMACLUST, and SortMeRNA, against current principal options (UCLUST and USEARCH) in QIIME, hierarchical clustering methods in mothur, and USEARCH's most recent clustering algorithm, UPARSE. All the latest open-source tools showed promising results, reporting up to 60% fewer spurious OTUs than UCLUST, indicating that the underlying clustering algorithm can vastly reduce the number of these derived OTUs. Furthermore, we observed that stringent quality filtering, such as is done in UPARSE, can cause a significant underestimation of species abundance and diversity, leading to incorrect biological results. Swarm, SUMACLUST, and SortMeRNA have been included in the QIIME 1.9.0 release. IMPORTANCE Massive collections of next-generation sequencing data call for fast, accurate, and easily accessible bioinformatics algorithms to perform sequence clustering. A comprehensive benchmark is presented, including open-source tools and the popular USEARCH suite. Simulated, mock, and environmental communities were used to analyze sensitivity, selectivity, species diversity (alpha and beta), and taxonomic composition. The results demonstrate that recent clustering algorithms can significantly improve accuracy and preserve estimated diversity without the application of aggressive filtering. Moreover, these tools are all open source, apply multiple levels of multithreading, and scale to the demands of modern next-generation sequencing data, which is essential for the analysis of massive multidisciplinary studies such as the Earth Microbiome Project (EMP) (J. A. Gilbert, J. K. Jansson, and R. Knight, BMC Biol 12:69, 2014).
Salmaso, S.; Rota, M. C.; Ciofi Degli Atti, M. L.; Tozzi, A. E.; Kreidl, P.
1999-01-01
In 1998, a series of regional cluster surveys (the ICONA Study) was conducted simultaneously in 19 out of the 20 regions in Italy to estimate the mandatory immunization coverage of children aged 12-24 months with oral poliovirus (OPV), diphtheria-tetanus (DT) and viral hepatitis B (HBV) vaccines, as well as optional immunization coverage with pertussis, measles and Haemophilus influenzae b (Hib) vaccines. The study children were born in 1996 and selected from birth registries using the Expanded Programme of Immunization (EPI) cluster sampling technique. Interviews with parents were conducted to determine each child's immunization status and the reasons for any missed or delayed vaccinations. The study population comprised 4310 children aged 12-24 months. Coverage for both mandatory and optional vaccinations differed by region. The overall coverage for mandatory vaccines (OPV, DT and HBV) exceeded 94%, but only 79% had been vaccinated in accord with the recommended schedule (i.e. during the first year of life). Immunization coverage for pertussis increased from 40% (1993 survey) to 88%, but measles coverage (56%) remained inadequate for controlling the disease; Hib coverage was 20%. These results confirm that in Italy the coverage of only mandatory immunizations is satisfactory. Pertussis immunization coverage has improved dramatically since the introduction of acellular vaccines. A greater effort to educate parents and physicians is still needed to improve the coverage of optional vaccinations in all regions. PMID:10593033
Bayesian Estimation of Conditional Independence Graphs Improves Functional Connectivity Estimates
Hinne, Max; Janssen, Ronald J.; Heskes, Tom; van Gerven, Marcel A.J.
2015-01-01
Functional connectivity concerns the correlated activity between neuronal populations in spatially segregated regions of the brain, which may be studied using functional magnetic resonance imaging (fMRI). This coupled activity is conveniently expressed using covariance, but this measure fails to distinguish between direct and indirect effects. A popular alternative that addresses this issue is partial correlation, which regresses out the signal of potentially confounding variables, resulting in a measure that reveals only direct connections. Importantly, provided the data are normally distributed, if two variables are conditionally independent given all other variables, their respective partial correlation is zero. In this paper, we propose a probabilistic generative model that allows us to estimate functional connectivity in terms of both partial correlations and a graph representing conditional independencies. Simulation results show that this methodology is able to outperform the graphical LASSO, which is the de facto standard for estimating partial correlations. Furthermore, we apply the model to estimate functional connectivity for twenty subjects using resting-state fMRI data. Results show that our model provides a richer representation of functional connectivity as compared to considering partial correlations alone. Finally, we demonstrate how our approach can be extended in several ways, for instance to achieve data fusion by informing the conditional independence graph with data from probabilistic tractography. As our Bayesian formulation of functional connectivity provides access to the posterior distribution instead of only to point estimates, we are able to quantify the uncertainty associated with our results. This reveals that while we are able to infer a clear backbone of connectivity in our empirical results, the data are not accurately described by simply looking at the mode of the distribution over connectivity. The implication of this is that
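The partial correlations discussed above follow directly from the precision (inverse covariance) matrix. A self-contained sketch, using a toy covariance in which x and z are correlated only through y, so their partial correlation vanishes:

```python
def invert(m):
    """Gauss-Jordan inverse of a small square matrix (lists of lists)."""
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]

def partial_corr(cov):
    """Partial correlations: rho_ij = -Omega_ij / sqrt(Omega_ii * Omega_jj)."""
    prec = invert(cov)
    n = len(cov)
    return [[-prec[i][j] / (prec[i][i] * prec[j][j]) ** 0.5 if i != j else 1.0
             for j in range(n)] for i in range(n)]

# corr(x, z) = corr(x, y) * corr(y, z): x and z are independent given y.
cov = [[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]]
pc = partial_corr(cov)
print(round(pc[0][1], 3), round(abs(pc[0][2]), 3))  # → 0.447 0.0
```

A conditional-independence graph then simply connects the variable pairs whose partial correlation is nonzero; the paper's Bayesian approach estimates that graph and the partial correlations jointly rather than thresholding a point estimate.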
Towards Improved Estimates of Ocean Heat Flux
NASA Astrophysics Data System (ADS)
Bentamy, Abderrahim; Hollman, Rainer; Kent, Elisabeth; Haines, Keith
2014-05-01
Recommendations and priorities for ocean heat flux research are outlined, for instance, in recent CLIVAR and WCRP reports, e.g., Yu et al. (2013). Among these is the need to improve the accuracy, consistency, and spatial and temporal resolution of air-sea fluxes at global as well as regional scales. To meet the main air-sea flux requirements, this study is aimed at obtaining and analyzing all the heat flux components (latent, sensible and radiative) at the ocean surface over the global oceans using multiple satellite sensor observations in combination with in-situ measurements and numerical model analyses. The fluxes will be generated daily and monthly for the 20-year (1992-2011) period, between 80N and 80S and at 0.25deg resolution. Simultaneous estimates of all surface heat flux terms have not yet been calculated at such a large scale and over such a long time period. Such an effort requires a wide range of expertise and data sources that only recently have become available. Needed are methods for integrating many data sources to calculate energy fluxes (short-wave, long-wave, sensible and latent heat) across the air-sea interface. We have access to all the relevant, recently available satellite data to perform such computations. Yu, L., K. Haines, M. Bourassa, M. Cronin, S. Gulev, S. Josey, S. Kato, A. Kumar, T. Lee, D. Roemmich: Towards achieving global closure of ocean heat and freshwater budgets: Recommendations for advancing research in air-sea fluxes through collaborative activities. INTERNATIONAL CLIVAR PROJECT OFFICE, 2013: International CLIVAR Publication Series No 189. http://www.clivar.org/sites/default/files/ICPO189_WHOI_fluxes_workshop.pdf
Improved diagnostic model for estimating wind energy
Endlich, R.M.; Lee, J.D.
1983-03-01
Because wind data are available only at scattered locations, a quantitative method is needed to estimate the wind resource at specific sites where wind energy generation may be economically feasible. This report describes a computer model that makes such estimates. The model uses standard weather reports and terrain heights in deriving wind estimates; the method of computation has been changed from what has been used previously. The performance of the current model is compared with that of the earlier version at three sites; estimates of wind energy at four new sites are also presented.
Westgate, Philip M; Braun, Thomas M
2012-09-10
Generalized estimating equations (GEE) are commonly used for the analysis of correlated data. However, use of quadratic inference functions (QIFs) is becoming popular because it increases efficiency relative to GEE when the working covariance structure is misspecified. Although shown to be advantageous in the literature, the impacts of covariates and imbalanced cluster sizes on the estimation performance of the QIF method in finite samples have not been studied. This cluster size variation causes QIF's estimating equations and GEE to be in separate classes when an exchangeable correlation structure is implemented, causing QIF and GEE to be incomparable in terms of efficiency. When utilizing this structure and the number of clusters is not large, we discuss how covariates and cluster size imbalance can cause QIF, rather than GEE, to produce estimates with the larger variability. This occurrence is mainly due to the empirical nature of weighting QIF employs, rather than differences in estimating equations classes. We demonstrate QIF's lost estimation precision through simulation studies covering a variety of general cluster randomized trial scenarios and compare QIF and GEE in the analysis of data from a cluster randomized trial. Copyright © 2012 John Wiley & Sons, Ltd.
Communication: Improved pair approximations in local coupled-cluster methods
NASA Astrophysics Data System (ADS)
Schwilk, Max; Usvyat, Denis; Werner, Hans-Joachim
2015-03-01
In local coupled cluster treatments the electron pairs can be classified according to the magnitude of their energy contributions or distances into strong, close, weak, and distant pairs. Different approximations are introduced for the latter three classes. In this communication, an improved simplified treatment of close and weak pairs is proposed, which is based on long-range cancellations of individually slowly decaying contributions in the amplitude equations. Benchmark calculations for correlation, reaction, and activation energies demonstrate that these approximations work extremely well, while pair approximations based on local second-order Møller-Plesset theory can lead to errors that are 1-2 orders of magnitude larger.
Improved Yield Estimation by Trellis Tension Monitoring
USDA-ARS?s Scientific Manuscript database
Most yield estimation practices for commercial vineyards rely on hand-sampling fruit on one or a small number of dates during the growing season. Limitations associated with the static yield estimates may be overcome with Trellis Tension Monitors (TTMs), systems that measure dynamically changes in t...
Improving Density Estimation by Incorporating Spatial Information
NASA Astrophysics Data System (ADS)
Smith, Laura M.; Keegan, Matthew S.; Wittman, Todd; Mohler, George O.; Bertozzi, Andrea L.
2010-12-01
Given discrete event data, we wish to produce a probability density that can model the relative probability of events occurring in a spatial region. Common methods of density estimation, such as Kernel Density Estimation, do not incorporate geographical information. Using these methods could result in nonnegligible portions of the support of the density in unrealistic geographic locations. For example, crime density estimation models that do not take geographic information into account may predict events in unlikely places such as oceans, mountains, and so forth. We propose a set of Maximum Penalized Likelihood Estimation methods based on Total Variation and Sobolev norm regularizers in conjunction with a priori high resolution spatial data to obtain more geographically accurate density estimates. We apply this method to a residential burglary data set of the San Fernando Valley using geographic features obtained from satellite images of the region and housing density information.
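The paper's penalized-likelihood estimators are considerably more sophisticated, but the core idea of forbidding density in invalid geography can be sketched by masking and renormalizing an ordinary kernel density estimate. This is a simplification under stated assumptions, not the authors' method.

```python
import numpy as np

def masked_kde(events, mask, grid_x, grid_y, bw=1.0):
    """Gaussian KDE on a grid, zeroed outside a validity mask (e.g. land
    vs. ocean) and renormalized over the valid region."""
    gx, gy = np.meshgrid(grid_x, grid_y, indexing="ij")
    dens = np.zeros_like(gx, dtype=float)
    for ex, ey in events:
        dens += np.exp(-((gx - ex) ** 2 + (gy - ey) ** 2) / (2 * bw ** 2))
    dens *= mask                      # forbid unrealistic locations
    cell = (grid_x[1] - grid_x[0]) * (grid_y[1] - grid_y[0])
    return dens / (dens.sum() * cell)
```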
Neuronal spike train entropy estimation by history clustering.
Watters, Nicholas; Reeke, George N
2014-09-01
Neurons send signals to each other by means of sequences of action potentials (spikes). Ignoring variations in spike amplitude and shape that are probably not meaningful to a receiving cell, the information content, or entropy, of the signal depends only on the timing of action potentials; and because there is no external clock, only the interspike intervals, not the absolute spike times, are significant. Estimating spike train entropy is a difficult task, particularly with small data sets, and many methods of entropy estimation have been proposed. Here we present two related model-based methods for estimating the entropy of neural signals and compare them to existing methods. One of the methods is fast and reasonably accurate, and it converges well with short spike time records; the other is impractically time-consuming but apparently very accurate, relying on generating artificial data that are a statistical match to the experimental data. Using the slow, accurate method to generate a best-estimate entropy value, we find that the faster estimator converges to this value more closely and with smaller data sets than many existing entropy estimators.
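As context for the quantity being estimated, a naive plug-in estimate of interspike-interval entropy from a histogram. The paper's model-based estimators are designed to outperform exactly this kind of estimator on short records; this sketch only defines the target quantity.

```python
import numpy as np

def isi_entropy_bits(spike_times, n_bins=16):
    """Plug-in entropy (bits per interval) of the interspike-interval
    distribution, using a simple histogram. Illustrative baseline only."""
    isi = np.diff(np.sort(np.asarray(spike_times, float)))
    counts, _ = np.histogram(isi, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())
```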
Dynamical mass estimates of young massive clusters in NGC1140 and M83
NASA Astrophysics Data System (ADS)
Moll, Sarah L.; de Grijs, Richard; Mengel, Sabine; Smith, Linda J.; Crowther, Paul A.
2009-12-01
We present virial mass estimates of young massive clusters (YMCs) in the starburst galaxies NGC1140 and M83, determined from high spectral resolution VLT echelle spectroscopy and high spatial resolution Hubble Space Telescope imaging. The survivability of such clusters is important in testing the scenario that YMCs are potentially proto-globular clusters. As young clusters, they lie in the domain in which dynamical masses appear to overestimate true cluster masses, most likely due to the clusters not being virialised. We find that the dynamical mass of NGC1140-1 is approximately ten times greater than its photometric mass. We propose that the most likely explanation for this disparity is the crowded environment of NGC1140-1, rather than this being solely due to a lack of virial equilibrium.
Yelland, Lisa N; Salter, Amy B; Ryan, Philip
2011-10-15
Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
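The method combines two standard pieces: a log-link Poisson fit applied to binary outcomes, and a sandwich variance with scores summed within clusters (an independence working correlation). A numpy sketch under those assumptions; a real analysis should use a statistics package with full GEE support.

```python
import numpy as np

def modified_poisson(y, X, cluster, n_iter=50):
    """Log-link Poisson regression for binary y, with a cluster-robust
    (sandwich) variance. Returns log-relative-risk estimates and SEs."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):                       # Newton/IRLS iterations
        mu = np.exp(X @ beta)
        beta = beta + np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    meat = np.zeros_like(bread)
    for g in np.unique(cluster):                  # sum scores within clusters
        sel = cluster == g
        sg = (X[sel] * (y - mu)[sel, None]).sum(axis=0)
        meat += np.outer(sg, sg)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))
```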
SAR image segmentation using MSER and improved spectral clustering
NASA Astrophysics Data System (ADS)
Gui, Yang; Zhang, Xiaohu; Shang, Yang
2012-12-01
A novel approach is presented for synthetic aperture radar (SAR) image segmentation. By combining the advantages of the maximally stable extremal regions (MSER) algorithm and the spectral clustering (SC) method, the proposed approach provides effective and robust segmentation. First, the input image is transformed from a pixel-based to a region-based model by using the MSER algorithm. The input image after the MSER procedure is composed of disjoint regions. Then the regions are treated as nodes in the image plane, and a graph structure is applied to represent them. Finally, the improved SC is used to perform globally optimal clustering, from which the image segmentation result is generated. To avoid incorrect partitioning when each region is considered as one graph node, we assign different numbers of nodes to represent the regions according to the area ratios among the regions. In addition, K-harmonic means is applied instead of K-means in the improved SC procedure in order to raise its stability and performance. Experimental results show that the proposed approach is effective for SAR image segmentation and is computationally fast.
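A minimal sketch of the normalized spectral clustering step on which such pipelines are built, operating directly on points rather than MSER regions and using a plain k-means step instead of the K-harmonic means variant the authors adopt. All simplifications are this sketch's assumptions.

```python
import numpy as np

def spectral_clustering(points, k, sigma=1.0, n_iter=50):
    """Ng-Jordan-Weiss-style spectral clustering: Gaussian affinity,
    normalized Laplacian eigenvectors, then k-means on the embedding."""
    pts = np.asarray(points, float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    Dm = 1.0 / np.sqrt(W.sum(1))
    L = Dm[:, None] * W * Dm[None, :]          # normalized affinity matrix
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, -k:]                           # top-k eigenvectors
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    centers = [U[0]]                           # farthest-point initialization
    for _ in range(k - 1):
        d = ((U[:, None, :] - np.array(centers)[None]) ** 2).sum(-1).min(1)
        centers.append(U[int(d.argmax())])
    centers = np.array(centers)
    for _ in range(n_iter):                    # k-means on the embedding
        lab = ((U[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        centers = np.array([U[lab == j].mean(0) if (lab == j).any()
                            else centers[j] for j in range(k)])
    return lab
```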
Formation of Education Clusters as a Way to Improve Education
ERIC Educational Resources Information Center
Aitbayeva, Gul'zamira D.; Zhubanova, Mariyash K.; Kulgildinova, Tulebike A.; Tusupbekova, Gulsum M.; Uaisova, Gulnar I.
2016-01-01
The purpose of this research is to analyze basic prerequisites formation and development factors of educational clusters of the world's leading nations for studying the possibility of cluster policy introduction and creating educational clusters in the Republic of Kazakhstan. The authors of this study concluded that educational cluster could be…
Estimated Satellite Cluster Elements in Near Circular Orbit
1988-12-01
values of the covariance matrix P to see if the filter performs as well as it believes it is performing [4: page 339]. 1.1. Truth Model. The truth...between satellites will be affected. Since the measurements contain no information on absolute downrange position, it is impossible to estimate
Improved estimation of frequency importance functions.
Kates, James M
2013-11-01
The Speech Intelligibility Index (SII) estimates speech intelligibility based on the audibility of speech cues across frequency. The frequency importance function gives the relative contribution to the SII of the speech audibility at different frequencies. The frequency importance function is usually estimated from the intelligibility data using a complicated multi-step procedure. This paper presents a new procedure for computing the frequency importance function directly from the intelligibility data based on nonlinear joint optimization of the frequency importance function and the SII curve-fitting parameters. An example of using the new approach is presented for previously published W-22 word list intelligibility data.
Richardson, R Tyler; Nicholson, Kristen F; Rapp, Elizabeth A; Johnston, Therese E; Richards, James G
2016-05-03
Accurate measurement of joint kinematics is required to understand the musculoskeletal effects of a therapeutic intervention such as upper extremity (UE) ergometry. Traditional surface-based motion capture is effective for quantifying humerothoracic motion, but scapular kinematics are challenging to obtain. Methods for estimating scapular kinematics include the widely-reported acromion marker cluster (AMC) which utilizes a static calibration between the scapula and the AMC to estimate the orientation of the scapula during motion. Previous literature demonstrates that including additional calibration positions throughout the motion improves AMC accuracy for single plane motions; however this approach has not been assessed for the non-planar shoulder complex motion occurring during UE ergometry. The purpose of this study was to evaluate the accuracy of single, dual, and multiple AMC calibration methods during UE ergometry. The orientations of the UE segments of 13 healthy subjects were recorded with motion capture. Scapular landmarks were palpated at eight evenly-spaced static positions around the 360° cycle. The single AMC method utilized one static calibration position to estimate scapular kinematics for the entire cycle, while the dual and multiple AMC methods used two and four static calibration positions, respectively. Scapulothoracic angles estimated by the three AMC methods were compared with scapulothoracic angles determined by palpation. The multiple AMC method produced the smallest RMS errors and was not significantly different from palpation about any axis. We recommend the multiple AMC method as a practical and accurate way to estimate scapular kinematics during UE ergometry. Copyright © 2016 Elsevier Ltd. All rights reserved.
Clustering-based urbanisation to improve enterprise information systems agility
NASA Astrophysics Data System (ADS)
Imache, Rabah; Izza, Said; Ahmed-Nacer, Mohamed
2015-11-01
Enterprises face daily pressures to demonstrate their ability to adapt quickly to the unpredictable changes of their dynamic environment in terms of technology, society, legislation, competitiveness and globalisation. Thus, to secure its place in this hard context, an enterprise must always be agile and must ensure its sustainability by continuous improvement of its information system (IS). Therefore, the agility of enterprise information systems (EISs) can be considered today as a primary objective of any enterprise. One way of achieving this objective is the urbanisation of the EIS in the context of continuous improvement, to make it a real asset serving enterprise strategy. This paper investigates the benefits of EIS urbanisation based on clustering techniques as a driver for producing and/or improving agility, to help managers and IT departments continuously improve the performance of the enterprise and make appropriate decisions within the scope of the enterprise objectives and strategy. This approach is applied to the urbanisation of a tour operator's EIS.
Improving Reliability of Subject-Level Resting-State fMRI Parcellation with Shrinkage Estimators
Mejia, Amanda F.; Nebel, Mary Beth; Shou, Haochang; Crainiceanu, Ciprian M.; Pekar, James J.; Mostofsky, Stewart; Caffo, Brian; Lindquist, Martin A.
2015-01-01
A recent interest in resting state functional magnetic resonance imaging (rsfMRI) lies in subdividing the human brain into anatomically and functionally distinct regions of interest. For example, brain parcellation is often a necessary step for defining the network nodes used in connectivity studies. While inference has traditionally been performed on group-level data, there is a growing interest in parcellating single subject data. However, this is difficult due to the inherent low signal-to-noise ratio of rsfMRI data, combined with typically short scan lengths. A large number of brain parcellation approaches employ clustering, which begins with a measure of similarity or distance between voxels. The goal of this work is to improve the reproducibility of single-subject parcellation using shrinkage-based estimators of such measures, allowing the noisy subject-specific estimator to “borrow strength” in a principled manner from a larger population of subjects. We present several empirical Bayes shrinkage estimators and outline methods for shrinkage when multiple scans are not available for each subject. We perform shrinkage on raw inter-voxel correlation estimates and use both raw and shrinkage estimates to produce parcellations by performing clustering on the voxels. While we employ a standard spectral clustering approach, our proposed method is agnostic to the choice of clustering method and can be used as a pre-processing step for any clustering algorithm. Using two datasets – a simulated dataset where the true parcellation is known and is subject-specific and a test-retest dataset consisting of two 7-minute resting-state fMRI scans from 20 subjects – we show that parcellations produced from shrinkage correlation estimates have higher reliability and validity than those produced from raw correlation estimates. Application to test-retest data shows that using shrinkage estimators increases the reproducibility of subject-specific parcellations of the motor
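The "borrowing strength" idea can be sketched as linear shrinkage of each subject's noisy correlation matrix toward the group mean, with a crude method-of-moments weight. The paper develops proper empirical Bayes estimators; this is only the underlying intuition.

```python
import numpy as np

def shrink_correlations(subject_corrs):
    """Linear shrinkage of subject-level correlation matrices toward the
    group mean. lambda is a rough moment-based weight, for illustration."""
    C = np.asarray(subject_corrs, float)        # (n_subjects, p, p)
    group = C.mean(axis=0)
    within = C.var(axis=0, ddof=1).mean()       # noise variance (approx.)
    total = group.var()                         # signal spread across entries
    lam = within / (within + total + 1e-12)     # fraction shrunk to the mean
    return (1 - lam) * C + lam * group, lam
```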
Improving Lidar Turbulence Estimates for Wind Energy
Newman, Jennifer F.; Clifton, Andrew; Churchfield, Matthew J.; ...
2016-10-03
Remote sensing devices (e.g., lidars) are quickly becoming a cost-effective and reliable alternative to meteorological towers for wind energy applications. Although lidars can measure mean wind speeds accurately, these devices measure different values of turbulence intensity (TI) than an instrument on a tower. In response to these issues, a lidar TI error reduction model was recently developed for commercially available lidars. The TI error model first applies physics-based corrections to the lidar measurements, then uses machine-learning techniques to further reduce errors in lidar TI estimates. The model was tested at two sites in the Southern Plains where vertically profiling lidars were collocated with meteorological towers. Results indicate that the model works well under stable conditions but cannot fully mitigate the effects of variance contamination under unstable conditions. To understand how variance contamination affects lidar TI estimates, a new set of equations was derived in previous work to characterize the actual variance measured by a lidar. Terms in these equations were quantified using a lidar simulator and modeled wind field, and the new equations were then implemented into the TI error model.
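For reference, turbulence intensity is simply the ratio of the wind speed standard deviation to the mean over an averaging window. A sketch assuming a 1 Hz series and 10-minute windows (both assumptions of this example, not requirements of the error model):

```python
import numpy as np

def turbulence_intensity(ws, window=600):
    """TI = sigma_u / mean(u) per non-overlapping window of a wind speed
    series (default 600 samples = 10 min at 1 Hz)."""
    ws = np.asarray(ws, float)
    n = len(ws) // window
    segs = ws[:n * window].reshape(n, window)
    return segs.std(axis=1, ddof=1) / segs.mean(axis=1)
```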
Astrophysical properties of star clusters in the Magellanic Clouds homogeneously estimated by ASteCA
NASA Astrophysics Data System (ADS)
Perren, G. I.; Piatti, A. E.; Vázquez, R. A.
2017-06-01
Aims: We seek to produce a homogeneous catalog of astrophysical parameters of 239 resolved star clusters, located in the Small and Large Magellanic Clouds, observed in the Washington photometric system. Methods: The cluster sample was processed with the recently introduced Automated Stellar Cluster Analysis (ASteCA) package, which ensures both an automatized and a fully reproducible treatment, together with a statistically based analysis of their fundamental parameters and associated uncertainties. The fundamental parameters determined for each cluster with this tool, via a color-magnitude diagram (CMD) analysis, are metallicity, age, reddening, distance modulus, and total mass. Results: We generated a homogeneous catalog of structural and fundamental parameters for the studied cluster sample and performed a detailed internal error analysis along with a thorough comparison with values taken from 26 published articles. We studied the distribution of cluster fundamental parameters in both Clouds and obtained their age-metallicity relationships. Conclusions: The ASteCA package can be applied to an unsupervised determination of fundamental cluster parameters, which is a task of increasing relevance as more data becomes available through upcoming surveys. A table with the estimated fundamental parameters for the 239 clusters analyzed is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/602/A89
Effect of Random Clustering on Surface Damage Density Estimates
Matthews, M J; Feit, M D
2007-10-29
Identification and spatial registration of laser-induced damage relative to incident fluence profiles is often required to characterize the damage properties of laser optics near damage threshold. Of particular interest in inertial confinement laser systems are large aperture beam damage tests (>1 cm²) where the number of initiated damage sites for φ > 14 J/cm² can approach 10⁵-10⁶, requiring automatic microscopy counting to locate and register individual damage sites. However, as was shown for the case of bacteria counting in biology decades ago, random overlapping or 'clumping' prevents accurate counting of Poisson-distributed objects at high densities, and must be accounted for if the underlying statistics are to be understood. In this work we analyze the effect of random clumping on damage initiation density estimates at fluences above damage threshold. The parameter ψ = aρ = ρ/ρ₀, where a = 1/ρ₀ is the mean damage site area and ρ is the mean number density, is used to characterize the onset of clumping, and approximations based on a simple model are used to derive an expression for clumped damage density vs. fluence and damage site size. The influence of the uncorrected ρ vs. φ curve on damage initiation probability predictions is also discussed.
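The undercounting effect is easy to reproduce in a toy model: sites closer than roughly one site diameter merge into a single observed clump, so a union-find count of merged sites falls below the true number as density rises. This Monte Carlo sketch illustrates the phenomenon only; it is not the authors' analytic correction.

```python
import numpy as np

def clump_count(xy, diameter):
    """Count visually distinct clumps: sites closer than one site diameter
    merge into a single observed clump (union-find over all pairs)."""
    n = len(xy)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if np.hypot(*(xy[i] - xy[j])) < diameter:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})
```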
Fretheim, Atle; Soumerai, Stephen B; Zhang, Fang; Oxman, Andrew D; Ross-Degnan, Dennis
2013-08-01
We reanalyzed the data from a cluster-randomized controlled trial (C-RCT) of a quality improvement intervention for prescribing antihypertensive medication. Our objective was to estimate the effectiveness of the intervention using both interrupted time-series (ITS) and RCT methods, and to compare the findings. We first conducted an ITS analysis using data only from the intervention arm of the trial because our main objective was to compare the findings from an ITS analysis with the findings from the C-RCT. We used segmented regression methods to estimate changes in level or slope coincident with the intervention, controlling for baseline trend. We analyzed the C-RCT data using generalized estimating equations. Last, we estimated the intervention effect by including data from both study groups and by conducting a controlled ITS analysis of the difference between the slope and level changes in the intervention and control groups. The estimates of absolute change resulting from the intervention were ITS analysis, 11.5% (95% confidence interval [CI]: 9.5, 13.5); C-RCT, 9.0% (95% CI: 4.9, 13.1); and the controlled ITS analysis, 14.0% (95% CI: 8.6, 19.4). ITS analysis can provide an effect estimate that is concordant with the results of a cluster-randomized trial. A broader range of comparisons from other RCTs would help to determine whether these are generalizable results. Copyright © 2013 Elsevier Inc. All rights reserved.
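Segmented regression as described above fits a baseline level and trend plus a level change and a slope change at the intervention time. A minimal OLS sketch, ignoring the autocorrelation corrections a real ITS analysis would include:

```python
import numpy as np

def segmented_regression(y, t, t0):
    """Interrupted-time-series OLS fit. Returns [intercept, baseline slope,
    level change at t0, slope change after t0]."""
    y = np.asarray(y, float)
    t = np.asarray(t, float)
    post = (t >= t0).astype(float)
    X = np.column_stack([np.ones_like(t), t, post, post * (t - t0)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```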
A Clustering Classification of Spare Parts for Improving Inventory Policies
NASA Astrophysics Data System (ADS)
Meri Lumban Raja, Anton; Ai, The Jin; Diar Astanti, Ririn
2016-02-01
Inventory policies in a company may consist of storage, control, and replenishment policy. Since the result of common ABC inventory classification can only affect the replenishment policy, we are proposing a clustering based classification technique as a basis for developing inventory policy especially for storage and control policy. Hierarchical clustering procedure is used after clustering variables are defined. Since hierarchical clustering procedure requires metric variables only, therefore a step to convert non-metric variables to metric variables is performed. The clusters resulted from the clustering techniques are analyzed in order to define each cluster characteristics. Then, the inventory policies are determined for each group according to its characteristics. A real data, which consists of 612 items from a local manufacturer's spare part warehouse, are used in the research of this paper to show the applicability of the proposed methodology.
Kalia, Sumeet; Klar, Neil; Donner, Allan
2016-12-30
Cluster randomized trials (CRTs) involve the random assignment of intact social units rather than independent subjects to intervention groups. Time-to-event outcomes often are endpoints in CRTs. Analyses of such data need to account for the correlation among cluster members. The intracluster correlation coefficient (ICC) is used to assess the similarity among binary and continuous outcomes that belong to the same cluster. However, estimating the ICC in CRTs with time-to-event outcomes is a challenge because of the presence of censored observations. The literature suggests that the ICC may be estimated using either censoring indicators or observed event times. A simulation study explores the effect of administrative censoring on estimating the ICC. Results show that ICC estimators derived from censoring indicators or observed event times are negatively biased. Analytic work further supports these results. Observed event times are preferred to estimate the ICC under minimum frequency of administrative censoring. To our knowledge, the existing literature provides no practical guidance on the estimation of ICC when substantial amount of administrative censoring is present. The results from this study corroborate the need for further methodological research on estimating the ICC for correlated time-to-event outcomes. Copyright © 2016 John Wiley & Sons, Ltd.
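For continuous, uncensored outcomes the ICC is commonly estimated from one-way ANOVA mean squares; the censoring problem studied above is about extending this to time-to-event data. A sketch of the standard ANOVA baseline estimator:

```python
import numpy as np

def anova_icc(y, cluster):
    """One-way ANOVA estimator of the intracluster correlation coefficient,
    valid for balanced or unbalanced clusters (uncensored outcomes)."""
    y = np.asarray(y, float)
    ids, counts = np.unique(cluster, return_counts=True)
    k, N = len(ids), len(y)
    grand = y.mean()
    ssb = sum(c * (y[cluster == g].mean() - grand) ** 2
              for g, c in zip(ids, counts))
    ssw = sum(((y[cluster == g] - y[cluster == g].mean()) ** 2).sum()
              for g in ids)
    msb, msw = ssb / (k - 1), ssw / (N - k)
    n0 = (N - (counts ** 2).sum() / N) / (k - 1)   # adjusted mean cluster size
    return (msb - msw) / (msb + (n0 - 1) * msw)
```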
Kasim, Shahreen; Deris, Safaai; Othman, Razib M
2013-09-01
A drastic improvement in the analysis of gene expression has lead to new discoveries in bioinformatics research. In order to analyse the gene expression data, fuzzy clustering algorithms are widely used. However, the resulting analyses from these specific types of algorithms may lead to confusion in hypotheses with regard to the suggestion of dominant function for genes of interest. Besides that, the current fuzzy clustering algorithms do not conduct a thorough analysis of genes with low membership values. Therefore, we present a novel computational framework called the "multi-stage filtering-Clustering Functional Annotation" (msf-CluFA) for clustering gene expression data. The framework consists of four components: fuzzy c-means clustering (msf-CluFA-0), achieving dominant cluster (msf-CluFA-1), improving confidence level (msf-CluFA-2) and combination of msf-CluFA-0, msf-CluFA-1 and msf-CluFA-2 (msf-CluFA-3). By employing double filtering in msf-CluFA-1 and apriori algorithms in msf-CluFA-2, our new framework is capable of determining the dominant clusters and improving the confidence level of genes with lower membership values by means of which the unknown genes can be predicted. Copyright © 2013 Elsevier Ltd. All rights reserved.
Return period estimates for European windstorm clusters: a multi-model perspective
NASA Astrophysics Data System (ADS)
Renggli, Dominik; Zimmerli, Peter
2017-04-01
Clusters of storms over Europe can lead to very large aggregated losses. Realistic return period estimates for such clusters are therefore of vital interest to the (re)insurance industry. Such return period estimates are usually derived from historical storm activity statistics of the last 30 to 40 years. However, climate models provide an alternative source, potentially representing thousands of simulated storm seasons. In this study, we made use of decadal hindcast data from eight different climate models in the CMIP5 archive. We used an objective tracking algorithm to identify individual windstorms in the climate model data. The algorithm also computes a (population density weighted) Storm Severity Index (SSI) for each of the identified storms (both on a continental and more regional basis). We derived return period estimates for the cluster seasons 1990, 1999, 2013/2014 and 1884 in the following way: For each climate model, we extracted two different exceedance frequency curves. The first describes the exceedance frequency (or the return period as the inverse of it) of a given SSI level due to an individual storm occurrence. The second describes the exceedance frequency of the seasonally aggregated SSI level (i.e. the sum of the SSI values of all storms in a given season). Starting from appropriate return period assumptions for each individual storm of a historical cluster (e.g. Anatol, Lothar and Martin in 1999) and using the first curve, we extracted the SSI levels at the corresponding return periods. Summing these SSI values results in the seasonally aggregated SSI value. Combining this with the second (aggregated) exceedance frequency curve results in a return period estimate of the historical cluster season. Since we do this for each model separately, we obtain eight different return period estimates for each historical cluster. In this way, we obtained the following return period estimates: 50 to 80 years for the 1990 season, 20 to 45 years for the 1999
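The return period used here is simply the inverse of the exceedance frequency. A minimal empirical version applied to simulated seasonal aggregate SSI values:

```python
import numpy as np

def return_period(seasonal_ssi, level):
    """Empirical return period (in seasons/years) of a seasonally aggregated
    severity index: inverse of the fraction of seasons at or above `level`."""
    seasonal_ssi = np.asarray(seasonal_ssi, float)
    exceed = (seasonal_ssi >= level).mean()     # exceedances per season
    return np.inf if exceed == 0 else 1.0 / exceed
```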
Recent improvements in ocean heat content estimation
NASA Astrophysics Data System (ADS)
Abraham, J. P.
2015-12-01
Increase of ocean heat content is an outcome of a persistent and ongoing energy imbalance to the Earth's climate system. This imbalance, largely caused by human emissions of greenhouse gases, has engendered a multi-decade increase in stored thermal energy within the Earth system, manifest principally as an increase in ocean heat content. Consequently, in order to quantify the rate of global warming, it is necessary to measure the rate of increase of ocean heat content. The historical record of ocean heat content is extracted from a history of various devices and spatial/temporal coverage across the globe. One of the most important historical devices is the eXpendable BathyThermograph (XBT) which has been used for decades to measure ocean temperatures to depths of 700m and deeper. Here, recent progress in improving the XBT record of upper ocean heat content is described including corrections to systematic biases, filling in spatial gaps where data does not exist, and the selection of a proper climatology. In addition, comparisons of the revised historical record and CMIP5 climate models are made. It is seen that there is very good agreement between the models and measurements, with the models slightly under-predicting the increase of ocean heat content in the upper water layers over the past 45 years.
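Ocean heat content over a layer is the vertical integral of ρ·c_p·T. A sketch using a hand-rolled trapezoid rule; the seawater density and heat capacity values are typical illustrative constants, not the values of any particular analysis.

```python
import numpy as np

def ocean_heat_content(temp_c, depth_m, rho=1025.0, cp=3990.0):
    """Column heat content anomaly (J per m^2) from a temperature-anomaly
    profile: OHC = integral of rho * c_p * T dz (trapezoid rule)."""
    t = np.asarray(temp_c, float)
    z = np.asarray(depth_m, float)
    integral = float(((t[1:] + t[:-1]) / 2.0 * np.diff(z)).sum())
    return rho * cp * integral
```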
Estimators for Clustered Education RCTs Using the Neyman Model for Causal Inference
ERIC Educational Resources Information Center
Schochet, Peter Z.
2013-01-01
This article examines the estimation of two-stage clustered designs for education randomized control trials (RCTs) using the nonparametric Neyman causal inference framework that underlies experiments. The key distinction between the considered causal models is whether potential treatment and control group outcomes are considered to be fixed for…
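Under cluster-level random assignment, a natural design-based estimator is the difference between the average treated and control cluster means, so the cluster rather than the student is the unit of analysis. A sketch of that estimator (not the article's full framework):

```python
import numpy as np

def cluster_diff_in_means(y, cluster, treated):
    """Treatment effect estimate for a clustered RCT: difference between
    average treated and average control cluster means."""
    means, arms = [], []
    for g in np.unique(cluster):
        sel = cluster == g
        means.append(y[sel].mean())
        arms.append(treated[sel][0])            # assignment is by cluster
    means, arms = np.array(means), np.array(arms, bool)
    return means[arms].mean() - means[~arms].mean()
```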
Improving Collective Estimations Using Resistance to Social Influence.
Madirolas, Gabriel; de Polavieja, Gonzalo G
2015-11-01
Groups can make precise collective estimations in cases like the weight of an object or the number of items in a volume. However, in other tasks, for example those requiring memory or mental calculation, subjects often give estimations with large deviations from factual values. Allowing members of the group to communicate their estimations has the additional perverse effect of shifting individual estimations even closer to the biased collective estimation. Here we show that this negative effect of social interactions can be turned into a method to improve collective estimations. We first obtained a statistical model of how humans change their estimation when receiving the estimates made by other individuals. We confirmed using existing experimental data its prediction that individuals use the weighted geometric mean of private and social estimations. We then used this result and the fact that each individual uses a different value of the social weight to devise a method that extracts the subgroups resisting social influence. We found that these subgroups of individuals resisting social influence can make very large improvements in group estimations. This is in contrast to methods using the confidence that each individual declares, for which we find no improvement in group estimations. Also, our proposed method does not need to use historical data to weight individuals by performance. These results show the benefits of using the individual characteristics of the members in a group to better extract collective wisdom.
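The fitted update rule is a weighted geometric mean of the private and social estimates; inverting it recovers each individual's social weight, from which resistant subgroups can be flagged. The weight threshold below is an illustrative assumption, not the paper's calibrated value.

```python
import numpy as np

def social_update(private, social, w):
    """Weighted geometric mean update: the paper's fitted model of how an
    individual revises an estimate after seeing a social estimate."""
    return private ** (1 - w) * social ** w

def resisters(private, social, updated, w_max=0.2):
    """Flag individuals whose implied social weight w is small (resistant).
    Assumes social != private so the implied w is well defined."""
    w = (np.log(updated) - np.log(private)) / (np.log(social) - np.log(private))
    return w < w_max
```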
An improved algorithm for clustering gene expression data.
Bandyopadhyay, Sanghamitra; Mukhopadhyay, Anirban; Maulik, Ujjwal
2007-11-01
Recent advancements in microarray technology allow simultaneous monitoring of the expression levels of a large number of genes over different time points. Clustering is an important tool for analyzing such microarray data, typical properties of which are its inherent uncertainty, noise and imprecision. In this article, a two-stage clustering algorithm, which employs a recently proposed variable string length genetic scheme and a multiobjective genetic clustering algorithm, is proposed. It is based on the novel concept of points having significant membership to multiple classes. An iterated version of the well-known Fuzzy C-Means is also utilized for clustering. The significant superiority of the proposed two-stage clustering algorithm as compared to the average linkage method, Self Organizing Map (SOM) and a recently developed weighted Chinese restaurant-based clustering method (CRC), widely used methods for clustering gene expression data, is established on a variety of artificial and publicly available real life data sets. The biological relevance of the clustering solutions is also analyzed.
Improving Performance for Gifted Students in a Cluster Grouping Model
ERIC Educational Resources Information Center
Brulles, Dina; Saunders, Rachel; Cohn, Sanford J.
2010-01-01
Although experts in gifted education widely promote cluster grouping of gifted students, little empirical evidence is available to attest to its effectiveness. This study is an example of comparative action research in the form of a quantitative case study that focused on the mandated cluster grouping practices for gifted students in an urban…
NASA Technical Reports Server (NTRS)
Kalton, G.
1983-01-01
A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratios of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, the optimum allocation of the sample across the stages of the sample design is also determined for the estimation of a regression coefficient.
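The report's exact variance formulae are not reproduced in the abstract; as a hedged illustration of why clustering affects precision, the standard first-order design effect can be sketched as follows (function names and example numbers are mine, not from the survey):

```python
def design_effect(avg_cluster_size, icc):
    # Standard first-order design effect for a clustered sample:
    # variance inflation relative to simple random sampling.
    return 1.0 + (avg_cluster_size - 1.0) * icc

def clustered_se(srs_se, avg_cluster_size, icc):
    # Inflate a simple-random-sampling standard error by sqrt(deff).
    return srs_se * design_effect(avg_cluster_size, icc) ** 0.5

# 20 respondents per cluster with intra-cluster correlation 0.05:
print(design_effect(20, 0.05))       # ≈ 1.95
print(clustered_se(0.10, 20, 0.05))  # ≈ 0.140
```

Even a modest intra-cluster correlation nearly doubles the variance here, which is why the optimum allocation across sampling stages matters.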
Improving the Discipline of Cost Estimation and Analysis
NASA Technical Reports Server (NTRS)
Piland, William M.; Pine, David J.; Wilson, Delano M.
2000-01-01
The need to improve the quality and accuracy of cost estimates of proposed new aerospace systems has been widely recognized. The industry has done the best job of maintaining related capability with improvements in estimation methods and giving appropriate priority to the hiring and training of qualified analysts. Some parts of Government, and National Aeronautics and Space Administration (NASA) in particular, continue to need major improvements in this area. Recently, NASA recognized that its cost estimation and analysis capabilities had eroded to the point that the ability to provide timely, reliable estimates was impacting the confidence in planning many program activities. As a result, this year the Agency established a lead role for cost estimation and analysis. The Independent Program Assessment Office located at the Langley Research Center was given this responsibility.
How to Estimate the Value of Service Reliability Improvements
Sullivan, Michael J.; Mercurio, Matthew G.; Schellenberg, Josh A.; Eto, Joseph H.
2010-06-08
A robust methodology for estimating the value of service reliability improvements is presented. Although econometric models for estimating value of service (interruption costs) have been established and widely accepted, analysts often resort to applying relatively crude interruption cost estimation techniques in assessing the economic impacts of transmission and distribution investments. This paper first shows how the use of these techniques can substantially impact the estimated value of service improvements. A simple yet robust methodology that does not rely heavily on simplifying assumptions is presented. When a smart grid investment is proposed, reliability improvement is one of the most frequently cited benefits. Using the best methodology for estimating the value of this benefit is imperative. By providing directions on how to implement this methodology, this paper sends a practical, usable message to the industry.
NASA Astrophysics Data System (ADS)
Mathews, William G.; Guo, Fulai
2011-09-01
The total feedback energy injected into hot gas in galaxy clusters by central black holes can be estimated by comparing the potential energy of observed cluster gas profiles with the potential energy of non-radiating, feedback-free hot gas atmospheres resulting from gravitational collapse in clusters of the same total mass. Feedback energy from cluster-centered black holes expands the cluster gas, lowering the gas-to-dark-matter mass ratio below the cosmic value. Feedback energy is unnecessarily delivered by radio-emitting jets to distant gas far beyond the cooling radius where the cooling time equals the cluster lifetime. For clusters of mass (4-11) × 10^14 M_sun, estimates of the total feedback energy, (1-3) × 10^63 erg, far exceed feedback energies estimated from observations of X-ray cavities and shocks in the cluster gas, energies gained from supernovae, and energies lost from cluster gas by radiation. The time-averaged mean feedback luminosity is comparable to those of powerful quasars, implying that some significant fraction of this energy may arise from the spin of the black hole. The universal entropy profile in feedback-free gaseous atmospheres in Navarro-Frenk-White cluster halos can be recovered by multiplying the observed gas entropy profile of any relaxed cluster by a factor involving the gas fraction profile. While the feedback energy and associated mass outflow in the clusters we consider far exceed that necessary to stop cooling inflow, the time-averaged mass outflow at the cooling radius almost exactly balances the mass that cools within this radius, an essential condition to shut down cluster cooling flows.
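The quasar-like mean feedback luminosity quoted above can be checked with back-of-envelope arithmetic. A sketch assuming a total energy of 2 × 10^63 erg (within the quoted range) and a 7 Gyr cluster lifetime (an assumed value, not stated in the abstract):

```python
# Back-of-envelope check that the time-averaged feedback luminosity is
# quasar-like. Energy and lifetime are assumed values for illustration.
E_feedback_erg = 2e63        # total feedback energy, within the (1-3)e63 range
lifetime_gyr = 7.0           # assumed cluster lifetime
seconds_per_gyr = 3.156e16   # seconds in one gigayear

L_mean = E_feedback_erg / (lifetime_gyr * seconds_per_gyr)
print(f"{L_mean:.1e} erg/s")  # ~9e45 erg/s, comparable to powerful quasars
```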
A cluster-randomised quality improvement study to improve two inpatient stroke quality indicators.
Williams, Linda; Daggett, Virginia; Slaven, James E; Yu, Zhangsheng; Sager, Danielle; Myers, Jennifer; Plue, Laurie; Woodward-Hagg, Heather; Damush, Teresa M
2016-04-01
Quality indicator collection and feedback improves stroke care. We sought to determine whether quality improvement training plus indicator feedback was more effective than indicator feedback alone in improving inpatient stroke indicators. We conducted a cluster-randomised quality improvement trial, randomising hospitals to quality improvement training plus indicator feedback versus indicator feedback alone to improve deep vein thrombosis (DVT) prophylaxis and dysphagia screening. Intervention sites received collaborative-based quality improvement training, external facilitation and indicator feedback. Control sites received only indicator feedback. We compared indicators pre-implementation (pre-I) to active implementation (active-I) and post-implementation (post-I) periods. We constructed mixed-effect logistic models of the two indicators with a random intercept for hospital effect, adjusting for patient, time, intervention and hospital variables. Patients at intervention sites (1147 admissions), had similar race, gender and National Institutes of Health Stroke Scale scores to control sites (1017 admissions). DVT prophylaxis improved more in intervention sites during active-I period (ratio of ORs 4.90, p<0.001), but did not differ in post-I period. Dysphagia screening improved similarly in both groups during active-I, but control sites improved more in post-I period (ratio of ORs 0.67, p=0.04). In logistic models, the intervention was independently positively associated with DVT performance during active-I period, and negatively associated with dysphagia performance post-I period. Quality improvement training was associated with early DVT improvement, but the effect was not sustained over time and was not seen with dysphagia screening. External quality improvement programmes may quickly boost performance but their effect may vary by indicator and may not sustain over time.
SYSTEM IMPROVEMENT USING SIGNAL-TO-NOISE RATIO ESTIMATION.
…systems by using signal-to-noise ratio (SNR) estimation of the received signal. Such SNR estimates can be used to adaptively control important system parameters whose design explicitly depends on SNR. The results of this investigation show, for certain types of systems, performance can indeed be substantially improved by SNR estimation. The analysis of the report is basically in two parts. In the first part consideration is given to the design…
Improving estimates of wilderness use from mandatory travel permits.
David W. Lime; Grace A. Lorence
1974-01-01
Mandatory permits provide recreation managers with better use estimates. Because some visitors do not obtain permits, use estimates based on permit data need to be corrected. In the Boundary Waters Canoe Area, a method was devised for distinguishing noncomplying groups and finding correction factors that reflect the impact of these groups. Suggestions for improving...
Potential Improvements for HEC-HMS Automated Parameter Estimation
2006-08-01
…(e.g., flow, baseflow, quickflow, volume aggregations) comprise separate components of a composite global objective function. … Marquardt-Levenberg (GML) method of computer-based parameter estimation are described and demonstrated as potential improvements to existing HEC-HMS automatic calibration capabilities. In contrast to existing HEC-HMS automated parameter estimation capabilities, these methods support global…
Improving the precision of dynamic forest parameter estimates using Landsat
Evan B. Brooks; John W. Coulston; Randolph H. Wynne; Valerie A. Thomas
2016-01-01
The use of satellite-derived classification maps to improve post-stratified forest parameter estimates is well established. When reducing the variance of post-stratification estimates for forest change parameters such as forest growth, it is logical to use a change-related strata map. At the stand level, a time series of Landsat images is…
Improving visual estimates of cervical spine range of motion.
Hirsch, Brandon P; Webb, Matthew L; Bohl, Daniel D; Fu, Michael; Buerba, Rafael A; Gruskay, Jordan A; Grauer, Jonathan N
2014-11-01
Cervical spine range of motion (ROM) is a common measure of cervical conditions, surgical outcomes, and functional impairment. Although ROM is routinely assessed by visual estimation in clinical practice, visual estimates have been shown to be unreliable and inaccurate. Reliable goniometers can be used for assessments, but the associated costs and logistics generally limit their clinical acceptance. To investigate whether training can improve visual estimates of cervical spine ROM, we asked attending surgeons, residents, and medical students at our institution to visually estimate the cervical spine ROM of healthy subjects before and after a training session. This training session included review of normal cervical spine ROM in 3 planes and demonstration of partial and full motion in 3 planes by multiple subjects. Estimates before, immediately after, and 1 month after this training session were compared to assess reliability and accuracy. Immediately after training, errors decreased by 11.9° (flexion-extension), 3.8° (lateral bending), and 2.9° (axial rotation). These improvements were statistically significant. One month after training, visual estimates remained improved, by 9.5°, 1.6°, and 3.1°, respectively, but were statistically significant only in flexion-extension. Although the accuracy of visual estimates can be improved, clinicians should be aware of the limitations of visual estimates of cervical spine ROM. Our study results support scrutiny of visual assessment of ROM as a criterion for diagnosing permanent impairment or disability.
Eisenberg, E.; Baram, A.
2007-01-01
For a large class of repulsive interaction models, the Mayer cluster integrals can be transformed into a tridiagonal real symmetric matrix R_mn, whose elements converge to two constants with a 1/n^2 correction. We find exact expressions in terms of these correction terms for the two critical exponents describing the density near the two singular termination points of the fluid phase. We apply the method to the hard-spheres model and find that the metastable fluid phase terminates at ρ_t = 0.751[5]. The density near the transition is given by ρ_t − ρ ∼ (z_t − z)^σ′, where the critical exponent is predicted to be σ′ = 0.0877[25]. Interestingly, the termination density is close to the observed glass transition; thus, the above critical behavior is expected to be associated with the onset of glassy behavior in hard spheres. PMID:17389362
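The scaling form near the termination point can be illustrated numerically: generate densities obeying ρ_t − ρ ∼ (z_t − z)^σ′ and recover the exponent from a log-log fit. The amplitude A and the z-range below are arbitrary illustrative choices; only ρ_t and σ′ come from the abstract:

```python
import numpy as np

# Synthetic check of the scaling form rho_t - rho ~ (z_t - z)^sigma'
# near the termination point, using the abstract's hard-sphere values.
rho_t, sigma, z_t, A = 0.751, 0.0877, 1.0, 0.3
z = np.linspace(0.90, 0.999, 50)
rho = rho_t - A * (z_t - z) ** sigma

# The critical exponent is the slope of a log-log fit.
slope, _ = np.polyfit(np.log(z_t - z), np.log(rho_t - rho), 1)
print(round(slope, 4))  # 0.0877
```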
NASA Astrophysics Data System (ADS)
Mahmoud, E.; Takey, A.; Shoukry, A.
2016-07-01
We develop a galaxy cluster finding algorithm based on spectral clustering technique to identify optical counterparts and estimate optical redshifts for X-ray selected cluster candidates. As an application, we run our algorithm on a sample of X-ray cluster candidates selected from the third XMM-Newton serendipitous source catalog (3XMM-DR5) that are located in Stripe 82 of the Sloan Digital Sky Survey (SDSS). Our method works on galaxies described in the color-magnitude feature space. We begin by examining 45 galaxy clusters with published spectroscopic redshifts in the range of 0.1-0.8 with a median of 0.36. As a result, we are able to identify their optical counterparts and estimate their photometric redshifts, which have a typical accuracy of 0.025 and agree with the published ones. Then, we investigate another 40 X-ray cluster candidates (from the same cluster survey) with no redshift information in the literature and found that 12 candidates are considered as galaxy clusters in the redshift range from 0.29 to 0.76 with a median of 0.57. These systems are newly discovered clusters in X-rays and optical data. Among them, 7 clusters have spectroscopic redshifts for at least one member galaxy.
High-Resolution Spatial Distribution and Estimation of Access to Improved Sanitation in Kenya
Jia, Peng; Anderson, John D.; Leitner, Michael; Rheingans, Richard
2016-01-01
Background: Access to sanitation facilities is imperative in reducing the risk of multiple adverse health outcomes. A distinct disparity in sanitation exists among different wealth levels in many low-income countries, which may hinder the progress across each of the Millennium Development Goals. Methods: The surveyed households in 397 clusters from 2008-2009 Kenya Demographic and Health Surveys were divided into five wealth quintiles based on their national asset scores. A series of spatial analysis methods including excess risk, local spatial autocorrelation, and spatial interpolation were applied to observe disparities in coverage of improved sanitation among different wealth categories. The total number of the population with improved sanitation was estimated by interpolating, time-adjusting, and multiplying the surveyed coverage rates by high-resolution population grids. A comparison was then made with the annual estimates from United Nations Population Division and World Health Organization/United Nations Children's Fund Joint Monitoring Program for Water Supply and Sanitation. Results: The Empirical Bayesian Kriging interpolation produced minimal root mean squared error for all clusters and five quintiles while predicting the raw and spatial coverage rates of improved sanitation. The coverage in southern regions was generally higher than in the north and east, and the coverage in the south decreased from Nairobi in all directions, while Nyanza and North Eastern Province had relatively poor coverage. The general clustering trend of high and low sanitation improvement among surveyed clusters was confirmed after spatial smoothing. Conclusions: There exists an apparent disparity in sanitation among different wealth categories across Kenya and spatially smoothed coverage rates resulted in a closer estimation of the available statistics than raw coverage rates. Future intervention activities need to be tailored for both different wealth categories and nationally…
Improved Density Functional Tight Binding Potentials for Metalloid Aluminum Clusters
2016-06-01
…approach, using solution chemistry methods to grow small metal clusters with a single monolayer of organic ligand on their surface. These may offer the… studies. In this section we discuss efforts to use this newly developed potential to more efficiently simulate chemistry in Al/Cp clusters. …
Comparative assessment of bone pose estimation using Point Cluster Technique and OpenSim.
Lathrop, Rebecca L; Chaudhari, Ajit M W; Siston, Robert A
2011-11-01
Estimating the position of the bones from optical motion capture data is a challenge associated with human movement analysis. Bone pose estimation techniques such as the Point Cluster Technique (PCT) and simulations of movement through software packages such as OpenSim are used to minimize soft tissue artifact and estimate skeletal position; however, using different methods for analysis may produce differing kinematic results which could lead to differences in clinical interpretation such as a misclassification of normal or pathological gait. This study evaluated the differences present in knee joint kinematics as a result of calculating joint angles using various techniques. We calculated knee joint kinematics from experimental gait data using the standard PCT, the least squares approach in OpenSim applied to experimental marker data, and the least squares approach in OpenSim applied to the results of the PCT algorithm. Maximum and resultant RMS differences in knee angles were calculated between all techniques. We observed differences in flexion/extension, varus/valgus, and internal/external rotation angles between all approaches. The largest differences were between the PCT results and all results calculated using OpenSim. The RMS differences averaged nearly 5° for flexion/extension angles with maximum differences exceeding 15°. Average RMS differences were relatively small (< 1.08°) between results calculated within OpenSim, suggesting that the choice of marker weighting is not critical to the results of the least squares inverse kinematics calculations. The largest difference between techniques appeared to be a constant offset between the PCT and all OpenSim results, which may be due to differences in the definition of anatomical reference frames, scaling of musculoskeletal models, and/or placement of virtual markers within OpenSim. Different methods for data analysis can produce largely different kinematic results, which could lead to the misclassification
An improved fuzzy c-means clustering algorithm based on shadowed sets and PSO.
Zhang, Jian; Shen, Ling
2014-01-01
To organize the wide variety of data sets automatically and acquire accurate classification, this paper presents a modified fuzzy c-means algorithm (SP-FCM) based on particle swarm optimization (PSO) and shadowed sets to perform feature clustering. SP-FCM introduces the global search property of PSO to deal with the problem of premature convergence of conventional fuzzy clustering, utilizes vagueness balance property of shadowed sets to handle overlapping among clusters, and models uncertainty in class boundaries. This new method uses Xie-Beni index as cluster validity and automatically finds the optimal cluster number within a specific range with cluster partitions that provide compact and well-separated clusters. Experiments show that the proposed approach significantly improves the clustering effect.
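For orientation, the base fuzzy c-means loop that SP-FCM builds on can be sketched as follows. This is only the conventional FCM update; the paper's PSO search, shadowed sets, and Xie-Beni validity index are not included, and the data below are invented:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100, seed=0):
    """Conventional fuzzy c-means core loop (no PSO, no shadowed sets)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # random fuzzy memberships
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / d ** (2.0 / (m - 1.0))         # standard membership update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# Two well-separated 2-D blobs around (0, 0) and (3, 3):
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.2, (30, 2)), rng.normal(3.0, 0.2, (30, 2))])
centers, U = fuzzy_c_means(X, c=2)
print(np.round(np.sort(centers[:, 0]), 1))       # centers near 0 and 3
```

The fuzzifier m controls how much clusters overlap, which is exactly the boundary uncertainty that SP-FCM's shadowed sets are designed to model.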
Distributing Power Grid State Estimation on HPC Clusters A System Architecture Prototype
Liu, Yan; Jiang, Wei; Jin, Shuangshuang; Rice, Mark J.; Chen, Yousu
2012-08-20
The future power grid is expected to further expand with highly distributed energy sources and smart loads. The increased size and complexity lead to an increased burden on existing computational resources in energy control centers. Thus the need to perform real-time assessment of such systems entails efficient means to distribute centralized functions such as state estimation in the power system. In this paper, we present our early prototype of a system architecture that connects distributed state estimators, each running parallel programs to solve the non-linear estimation procedure. The prototype consists of a middleware and data processing toolkits that allow data exchange in the distributed state estimation. We build a test case based on the IEEE 118 bus system and partition the state estimation of the whole system model across the available HPC clusters. Measurements from the testbed demonstrate the low overhead of our solution.
Space-time stick-breaking processes for small area disease cluster estimation.
Hossain, Md Monir; Lawson, Andrew B; Cai, Bo; Choi, Jungsoon; Liu, Jihong; Kirby, Russell S
2013-03-01
We propose a space-time stick-breaking process for disease cluster estimation. The dependencies for spatial and temporal effects are introduced by using space-time covariate dependent kernel stick-breaking processes. We compared this model with the space-time standard random effect model by checking each model's ability in terms of cluster detection of various shapes and sizes. This comparison was made for simulated data where the true risks were known. For the simulated data, we have observed that the space-time stick-breaking process performs better in detecting medium- and high-risk clusters. For the real data, county-specific low birth weight incidences for the state of South Carolina for the years 1997-2007, we have illustrated how the proposed model can be used to find grouping of counties of higher incidence rate.
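The generic stick-breaking construction underlying such models is compact to write down. A minimal sketch of Dirichlet-process-style weights (the paper's space-time covariate-dependent kernels are omitted; K and alpha are illustrative):

```python
import numpy as np

# Generic Dirichlet-process stick-breaking weights: break a unit stick
# with Beta(1, alpha) fractions; w_k = v_k * prod_{j<k} (1 - v_j).
rng = np.random.default_rng(0)
K, alpha = 10, 1.0
v = rng.beta(1.0, alpha, size=K)
w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))

# The K weights plus the unbroken remainder of the stick sum to 1.
print(round(float(w.sum() + np.prod(1.0 - v)), 6))  # 1.0
```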
Maximum Pseudolikelihood Estimation for Model-Based Clustering of Time Series Data.
Nguyen, Hien D; McLachlan, Geoffrey J; Orban, Pierre; Bellec, Pierre; Janke, Andrew L
2017-04-01
Mixture of autoregressions (MoAR) models provide a model-based approach to the clustering of time series data. The maximum likelihood (ML) estimation of MoAR models requires evaluating products of large numbers of densities of normal random variables. In practical scenarios, these products converge to zero as the length of the time series increases, and thus the ML estimation of MoAR models becomes infeasible without the use of numerical tricks. We propose a maximum pseudolikelihood (MPL) estimation approach as an alternative to the use of numerical tricks. The MPL estimator is proved to be consistent and can be computed with an EM (expectation-maximization) algorithm. Simulations are used to assess the performance of the MPL estimator against that of the ML estimator in cases where the latter was able to be calculated. An application to the clustering of time series data arising from a resting state fMRI experiment is presented as a demonstration of the methodology.
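The numerical motivation for MPL is easy to demonstrate: a product of many normal densities underflows double precision, while the summed log-density stays finite. A toy illustration (this shows only the underflow problem the abstract describes, not the MPL estimator itself):

```python
import math

# A product of many normal densities underflows double precision even
# though each factor is an ordinary number; the log-likelihood is fine.
T = 1000
x = [0.0] * T                                  # toy residual series
pdf = lambda v: math.exp(-v * v / 2.0) / math.sqrt(2.0 * math.pi)

prod = 1.0
for v in x:
    prod *= pdf(v)                             # collapses to exactly 0.0
loglik = sum(math.log(pdf(v)) for v in x)      # stays finite

print(prod)               # 0.0
print(round(loglik, 1))   # -918.9
```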
Improvements in estimating proportions of objects from multispectral data
NASA Technical Reports Server (NTRS)
Horwitz, H. M.; Hyde, P. D.; Richardson, W.
1974-01-01
Methods for estimating proportions of objects and materials imaged within the instantaneous field of view of a multispectral sensor were developed further. Improvements in the basic proportion estimation algorithm were devised as well as improved alien object detection procedures. Also, a simplified signature set analysis scheme was introduced for determining the adequacy of signature set geometry for satisfactory proportion estimation. Averaging procedures used in conjunction with the mixtures algorithm were examined theoretically and applied to artificially generated multispectral data. A computationally simpler estimator was considered and found unsatisfactory. Experiments conducted to find a suitable procedure for setting the alien object threshold yielded little definitive result. Mixtures procedures were used on a limited amount of ERTS data to estimate wheat proportion in selected areas. Results were unsatisfactory, partly because of the ill-conditioned nature of the pure signature set.
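The basic idea of proportion estimation from pure signatures can be sketched as linear unmixing by least squares; the signatures and band values below are invented for illustration, and the paper's actual algorithm (with alien-object detection and averaging procedures) is more elaborate:

```python
import numpy as np

# Linear-unmixing sketch: a mixed pixel modelled as proportions times
# pure spectral signatures; solve for the proportions by least squares.
S = np.array([[0.1, 0.8],    # band 1: wheat, soil (made-up signatures)
              [0.6, 0.2],    # band 2
              [0.3, 0.5]])   # band 3
true_p = np.array([0.7, 0.3])
pixel = S @ true_p           # noiseless mixed-pixel observation

p_hat, *_ = np.linalg.lstsq(S, pixel, rcond=None)
print(np.round(p_hat, 3))    # [0.7 0.3]
```

When the signature set is nearly collinear, this inversion becomes ill-conditioned, which matches the abstract's explanation for the unsatisfactory wheat-proportion results.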
Scanning Linear Estimation: Improvements over Region of Interest (ROI) Methods
Kupinski, Meredith K.; Clarkson, Eric W.; Barrett, Harrison H.
2013-01-01
In tomographic medical imaging, signal activity is typically estimated by summing voxels from a reconstructed image. We introduce an alternative estimation scheme that operates on the raw projection data and offers a substantial improvement, as measured by the ensemble mean-square error (EMSE), when compared to using voxel values from a maximum-likelihood expectation-maximization (MLEM) reconstruction. The scanning-linear (SL) estimator operates on the raw projection data and is derived as a special case of maximum-likelihood (ML) estimation with a series of approximations to make the calculation tractable. The approximated likelihood accounts for background randomness, measurement noise, and variability in the parameters to be estimated. When signal size and location are known, the SL estimate of signal activity is an unbiased estimator, i.e., the average estimate equals the true value. By contrast, standard algorithms that operate on reconstructed data are subject to unpredictable bias arising from the null functions of the imaging system. The SL method is demonstrated for two different tasks: 1) simultaneously estimating a signal's size, location, and activity; 2) for a fixed signal size and location, estimating activity. Noisy projection data are realistically simulated using measured calibration data from the multi-module multi-resolution (M3R) small-animal SPECT imaging system. For both tasks the same set of images is reconstructed using the MLEM algorithm (80 iterations), and the average and the maximum value within the ROI are calculated for comparison. This comparison shows dramatic improvements in EMSE for the SL estimates. To show that the bias in ROI estimates affects not only absolute values but also relative differences, such as those used to monitor response to therapy, the activity estimation task is repeated for three different signal sizes. PMID:23384998
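A toy illustration of why ROI voxel sums can be biased, assuming only a generic blur as a stand-in for the imaging and reconstruction chain (this is not the M3R system model): activity that the blur pushes outside the ROI is lost to the sum, even though total activity is conserved.

```python
import numpy as np

# True 1D "activity": a compact signal of total activity 10 on a 101-pixel grid.
true_img = np.zeros(101)
true_img[48:53] = 2.0          # total activity = 10

# Generic normalized blur standing in for the system/reconstruction response.
kernel = np.array([0.05, 0.25, 0.40, 0.25, 0.05])
blurred = np.convolve(true_img, kernel, mode="same")

# ROI estimate: sum voxels over the true signal support only.
roi_sum = blurred[48:53].sum()
print(roi_sum)   # < 10: activity leaked outside the ROI biases the sum low
```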
Evolving Improvements to TRMM Ground Validation Rainfall Estimates
NASA Technical Reports Server (NTRS)
Robinson, M.; Kulie, M. S.; Marks, D. A.; Wolff, D. B.; Ferrier, B. S.; Amitai, E.; Silberstein, D. S.; Fisher, B. L.; Wang, J.; Einaudi, Franco (Technical Monitor)
2000-01-01
The primary function of the TRMM Ground Validation (GV) Program is to create GV rainfall products that provide basic validation of satellite-derived precipitation measurements for select primary sites. Since the successful 1997 launch of the TRMM satellite, GV rainfall estimates have demonstrated systematic improvements directly related to improved radar and rain gauge data, modified science techniques, and software revisions. Improved rainfall estimates have resulted in higher quality GV rainfall products and subsequently, much improved evaluation products for the satellite-based precipitation estimates from TRMM. This presentation will demonstrate how TRMM GV rainfall products created in a semi-automated, operational environment have evolved and improved through successive generations. Monthly rainfall maps and rainfall accumulation statistics for each primary site will be presented for each stage of GV product development. Contributions from individual product modifications involving radar reflectivity (Ze)-rain rate (R) relationship refinements, improvements in rain gauge bulk-adjustment and data quality control processes, and improved radar and gauge data will be discussed. Finally, it will be demonstrated that as GV rainfall products have improved, rainfall estimation comparisons between GV and satellite have converged, lending confidence to the satellite-derived precipitation measurements from TRMM.
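The Ze-R refinements mentioned above revolve around power-law relations of the form Z = aR^b. A minimal sketch using the Marshall-Palmer coefficients as a hypothetical default (GV products tune such coefficients per site and season, and the operational values may differ):

```python
# Convert radar reflectivity (dBZ) to rain rate (mm/h) via Z = a * R**b.
def rain_rate_from_dbz(dbz, a=200.0, b=1.6):
    z_linear = 10.0 ** (dbz / 10.0)   # dBZ -> linear units (mm^6 m^-3)
    return (z_linear / a) ** (1.0 / b)

print(rain_rate_from_dbz(30.0))  # ≈ 2.73 mm/h with these coefficients
```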
Unsupervised feature relevance analysis applied to improve ECG heartbeat clustering.
Rodríguez-Sotelo, J L; Peluffo-Ordoñez, D; Cuesta-Frau, D; Castellanos-Domínguez, G
2012-10-01
The computer-assisted analysis of biomedical records has become an essential tool in clinical settings. However, current devices provide a growing amount of data that often exceeds the processing capacity of normal computers. As this amount of information rises, new demands for more efficient data extracting methods appear. This paper addresses the task of data mining in physiological records using a feature selection scheme. An unsupervised method based on relevance analysis is described. This scheme uses a least-squares optimization of the input feature matrix in a single iteration. The output of the algorithm is a feature weighting vector. The performance of the method was assessed using a heartbeat clustering test on real ECG records. The quantitative cluster validity measures yielded a correctly classified heartbeat rate of 98.69% (specificity), 85.88% (sensitivity) and 95.04% (general clustering performance), which is even higher than the performance achieved by other similar ECG clustering studies. The number of features was reduced on average from 100 to 18, and the computational cost was 43% lower than in previous ECG clustering schemes.
NASA Astrophysics Data System (ADS)
Ameglio, S.; Borgani, S.; Diaferio, A.; Dolag, K.
2006-07-01
The angular-diameter distance DA of a galaxy cluster can be measured by combining its X-ray emission with the cosmic microwave background fluctuation due to the Sunyaev-Zeldovich (SZ) effect. The application of this distance indicator usually assumes that the cluster is spherically symmetric, the gas is distributed according to the isothermal β-model, and the X-ray temperature is an unbiased measure of the electron temperature. We test these assumptions with galaxy clusters extracted from an extended set of cosmological N-body/hydrodynamical simulations of a Λ cold dark matter concordance cosmology, which include the effect of radiative cooling, star formation and energy feedback from supernovae. We find that, due to the temperature gradients which are present in the central regions of simulated clusters, the assumption of isothermal gas leads to a significant underestimate of DA. This bias is efficiently corrected by using the polytropic version of the β-model to account for the presence of temperature gradients. In this case, once irregular clusters are removed, the correct value of DA is recovered with a ~5 per cent accuracy on average, with a ~20 per cent intrinsic scatter due to cluster asphericity. This result is valid when using either the electron temperature or a spectroscopic-like temperature. Instead, when using the emission-weighted definition for the temperature of the simulated clusters, DA is biased low by ~20 per cent. We discuss the implications of our results for an accurate determination of the Hubble constant H0 and of the density parameter Ωm. We find that, at least in the case of ideal (i.e. noiseless) X-ray and SZ observations extended out to r500, H0 can be potentially recovered with exquisite precision, while the resulting estimate of Ωm, which is unbiased, has typical errors ΔΩm ~= 0.05.
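Schematically, the distance estimate described above works because the SZ decrement and the X-ray surface brightness scale differently with electron density. The following is the standard textbook sketch, not this paper's specific model:

```latex
% SZ decrement scales with n_e, X-ray brightness with n_e^2:
\Delta T_{\rm SZ} \propto \int n_e T_e \, dl , \qquad
S_X \propto \frac{1}{4\pi (1+z)^4} \int n_e^2 \Lambda_e \, dl .
% With the line-of-sight length l \sim D_A \theta, eliminating the central
% density n_{e0} leaves a distance built purely from observables:
D_A \propto \frac{(\Delta T_0)^2}{S_{X0}\, T_{e0}^2} \cdot
            \frac{\Lambda_{e0}}{\theta_c} .
```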
A cluster-based decision support system for estimating earthquake damage and casualties.
Aleskerov, Fuad; Say, Arzu Iseri; Toker, Aysegül; Akin, H Levent; Altay, Gülay
2005-09-01
This paper describes a Decision Support System for Disaster Management (DSS-DM) to aid operational and strategic planning and policy-making for disaster mitigation and preparedness in a less-developed infrastructural context. Such contexts require a more flexible and robust system for fast prediction of damage and losses. The proposed system is specifically designed for earthquake scenarios, estimating the extent of human losses and injuries, as well as the need for temporary shelters. The DSS-DM uses a scenario approach to calculate the aforementioned parameters at the district and sub-district level at different earthquake intensities. The following system modules have been created: clusters (buildings) with respect to use; buildings with respect to construction typology; and estimations of damage to clusters, human losses and injuries, and the need for shelters. The paper not only examines the components of the DSS-DM, but also looks at its application in Besiktas municipality in the city of Istanbul, Turkey.
Kasaie, Parastu; Mathema, Barun; Kelton, W David; Azman, Andrew S; Pennington, Jeff; Dowdy, David W
2015-01-01
In any setting, a proportion of incident active tuberculosis (TB) reflects recent transmission ("recent transmission proportion"), whereas the remainder represents reactivation. Appropriately estimating the recent transmission proportion has important implications for local TB control, but existing approaches have known biases, especially where data are incomplete. We constructed a stochastic individual-based model of a TB epidemic and designed a set of simulations (derivation set) to develop two regression-based tools for estimating the recent transmission proportion from five inputs: underlying TB incidence, sampling coverage, study duration, clustered proportion of observed cases, and proportion of observed clusters in the sample. We tested these tools on a set of unrelated simulations (validation set), and compared their performance against that of the traditional 'n-1' approach. In the validation set, the regression tools reduced the absolute estimation bias (difference between estimated and true recent transmission proportion) in the 'n-1' technique by a median [interquartile range] of 60% [9%, 82%] and 69% [30%, 87%]. The bias in the 'n-1' model was highly sensitive to underlying levels of study coverage and duration, and substantially underestimated the recent transmission proportion in settings of incomplete data coverage. By contrast, the regression models' performance was more consistent across different epidemiological settings and study characteristics. We provide one of these regression models as a user-friendly, web-based tool. Novel tools can improve our ability to estimate the recent TB transmission proportion from data that are observable (or estimable) by public health practitioners with limited available molecular data.
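The traditional 'n-1' approach that the regression tools are benchmarked against has a simple closed form: each cluster of size k is assumed to contain k-1 recently transmitted cases, with the index case attributed to reactivation. A sketch with hypothetical study numbers:

```python
def recent_transmission_n1(n_clustered_cases, n_clusters, n_total_cases):
    """Traditional 'n-1' estimate of the recent transmission proportion:
    every cluster contributes (cluster size - 1) recently transmitted cases."""
    return (n_clustered_cases - n_clusters) / n_total_cases

# Hypothetical genotyping study: 100 cases, 40 of them falling in 15 clusters.
print(recent_transmission_n1(40, 15, 100))   # 0.25
```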
Clustering with an Improved Self-Organizing Tree
NASA Astrophysics Data System (ADS)
Suzuki, Yukinori; Sasaki, Yasue
A self-organizing tree (S-TREE) has a self-organizing capability and better performance than previously reported tree-structured clustering methods. In the S-TREE algorithm, since the tree grows in greedy fashion, a pruning mechanism is necessary to reduce the effect of bad leaf nodes. Extra nodes are pruned when the tree reaches a predetermined maximum size (U). U is problem-dependent and is therefore difficult to specify beforehand. Furthermore, since U limits tree growth and thereby prevents self-organization of the tree, it may produce unnatural clustering. We present a new pruning algorithm that does not require U. In this paper, we report results showing the performance of the new pruning algorithm on samples generated from normal distributions. The computational experiments showed that the new pruning algorithm works well for clustering of those samples.
Galili, Tal; Meilijson, Isaac
2016-01-01
The Rao–Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a “better” one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao–Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao–Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated. [Received December 2014. Revised September 2015.] PMID:27499547
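For context, the Rao-Blackwell mechanism the abstract builds on, in its textbook form (the U(0,θ) illustration below is the standard complete-statistic example, not necessarily the paper's counterexample, which concerns a minimal sufficient statistic that is not complete):

```latex
% Conditioning an unbiased estimator \delta on a sufficient statistic T
% never increases variance:
\operatorname{Var}\!\left( \mathbb{E}[\delta \mid T] \right)
    \le \operatorname{Var}(\delta).
% Classic illustration: X_1,\dots,X_n \sim U(0,\theta) i.i.d.;
% \delta = 2X_1 is unbiased for \theta, and conditioning on T = X_{(n)} gives
\mathbb{E}\!\left[ 2X_1 \mid X_{(n)} \right] = \frac{n+1}{n}\, X_{(n)},
% the UMVU estimator, since here X_{(n)} is complete.
```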
Distributed Noise Generation for Density Estimation Based Clustering without Trusted Third Party
NASA Astrophysics Data System (ADS)
Su, Chunhua; Bao, Feng; Zhou, Jianying; Takagi, Tsuyoshi; Sakurai, Kouichi
The rapid growth of the Internet provides people with tremendous opportunities for data collection, knowledge discovery and cooperative computation. However, it also brings the problem of sensitive information leakage. Both individuals and enterprises may suffer from massive data collection and information retrieval by distrusted parties. In this paper, we propose a privacy-preserving protocol for distributed kernel density estimation-based clustering. Our scheme applies the random data perturbation (RDP) technique and verifiable secret sharing to solve the security problem of the distributed kernel density estimation of [4], which assumed an intermediary party to assist in the computation.
Improving The Discipline of Cost Estimation and Analysis
NASA Technical Reports Server (NTRS)
Piland, William M.; Pine, David J.; Wilson, Delano M.
2000-01-01
The need to improve the quality and accuracy of cost estimates of proposed new aerospace systems has been widely recognized. The industry has done the best job of maintaining related capability with improvements in estimation methods and giving appropriate priority to the hiring and training of qualified analysts. Some parts of Government, and National Aeronautics and Space Administration (NASA) in particular, continue to need major improvements in this area. Recently, NASA recognized that its cost estimation and analysis capabilities had eroded to the point that the ability to provide timely, reliable estimates was impacting the confidence in planning many program activities. As a result, this year the Agency established a lead role for cost estimation and analysis. The Independent Program Assessment Office located at the Langley Research Center was given this responsibility. This paper presents the plans for the newly established role. Described is how the Independent Program Assessment Office, working with all NASA Centers, NASA Headquarters, other Government agencies, and industry, is focused on creating cost estimation and analysis as a professional discipline that will be recognized equally with the technical disciplines needed to design new space and aeronautics activities. Investments in selected, new analysis tools, creating advanced training opportunities for analysts, and developing career paths for future analysts engaged in the discipline are all elements of the plan. Plans also include increasing the human resources available to conduct independent cost analysis of Agency programs during their formulation, to improve near-term capability to conduct economic cost-benefit assessments, to support NASA management's decision process, and to provide cost analysis results emphasizing "full-cost" and "full-life cycle" considerations. The Agency cost analysis improvement plan has been approved for implementation starting this calendar year. Adequate financial
NASA Astrophysics Data System (ADS)
Powalka, Mathieu; Lançon, Ariane; Puzia, Thomas H.; Peng, Eric W.; Liu, Chengze; Muñoz, Roberto P.; Blakeslee, John P.; Côté, Patrick; Ferrarese, Laura; Roediger, Joel; Sánchez-Janssen, Rúben; Zhang, Hongxin; Durrell, Patrick R.; Cuillandre, Jean-Charles; Duc, Pierre-Alain; Guhathakurta, Puragra; Gwyn, S. D. J.; Hudelot, Patrick; Mei, Simona; Toloba, Elisa
2017-08-01
Large samples of globular clusters (GCs) with precise multi-wavelength photometry are becoming increasingly available and can be used to constrain the formation history of galaxies. We present the results of an analysis of Milky Way (MW) and Virgo core GCs based on 5 optical-near-infrared colors and 10 synthetic stellar population models. For the MW GCs, the models tend to agree on photometric ages and metallicities, with values similar to those obtained with previous studies. When used with Virgo core GCs, for which photometry is provided by the Next Generation Virgo cluster Survey (NGVS), the same models generically return younger ages. This is a consequence of the systematic differences observed between the locus occupied by Virgo core GCs and models in panchromatic color space. Only extreme fine-tuning of the adjustable parameters available to us can make the majority of the best-fit ages old. Although we cannot exclude that the formation history of the Virgo core may lead to more conspicuous populations of relatively young GCs than in other environments, we emphasize that the intrinsic properties of the Virgo GCs are likely to differ systematically from those assumed in the models. Thus, the large wavelength coverage and photometric quality of modern GC samples, such as those used here, is not by itself sufficient to better constrain the GC formation histories. Models matching the environment-dependent characteristics of GCs in multi-dimensional color space are needed to improve the situation.
Improving the realism of hydrologic model through multivariate parameter estimation
NASA Astrophysics Data System (ADS)
Rakovec, Oldrich; Kumar, Rohini; Attinger, Sabine; Samaniego, Luis
2017-04-01
Increased availability and quality of near real-time observations should improve understanding of the predictive skill of hydrological models. Recent studies have shown the limited capability of river discharge data alone to adequately constrain different components of distributed model parameterizations. In this study, the GRACE satellite-based total water storage (TWS) anomaly is used to complement the discharge data with an aim to improve the fidelity of the mesoscale hydrologic model (mHM) through multivariate parameter estimation. The study is conducted in 83 European basins covering a wide range of hydro-climatic regimes. The model parameterization complemented with the TWS anomalies leads to statistically significant improvements in (1) discharge simulations during low-flow periods, and (2) evapotranspiration estimates, which are evaluated against independent (FLUXNET) data. Overall, there is no significant deterioration in model performance for the discharge simulations when complemented by information from the TWS anomalies. However, considerable changes in the partitioning of precipitation into runoff components are noticed with the inclusion or exclusion of TWS during the parameter estimation. A cross-validation test carried out to assess the transferability and robustness of the calibrated parameters to other locations further confirms the benefit of complementary TWS data. In particular, the evapotranspiration estimates show more robust performance when TWS data are incorporated during the parameter estimation, in comparison with the benchmark model constrained against discharge only. This study highlights the value of incorporating multiple data sources during parameter estimation to improve the overall realism of the hydrologic model and its applications over large domains. Rakovec, O., Kumar, R., Attinger, S. and Samaniego, L. (2016): Improving the realism of hydrologic model functioning through multivariate parameter estimation. Water Resour. Res., 52, http://dx.doi.org/10
Improved Versions of Common Estimators of the Recombination Rate.
Gärtner, Kerstin; Futschik, Andreas
2016-09-01
The scaled recombination parameter ρ is one of the key parameters, turning up frequently in population genetic models. Accurate estimates of ρ are difficult to obtain, as recombination events do not always leave traces in the data. One of the most widely used approaches is composite likelihood. Here, we show that popular implementations of composite likelihood estimators can often be uniformly improved by optimizing the trade-off between bias and variance. The amount of possible improvement depends on parameters such as the sequence length, the sample size, and the mutation rate, and it can be considerable in some cases. It turns out that approximate Bayesian computation, with composite likelihood as a summary statistic, also leads to improved estimates, but now in terms of the posterior risk. Finally, we demonstrate a practical application on real data from Drosophila.
Source-lens clustering and intrinsic-alignment bias of weak-lensing estimators
NASA Astrophysics Data System (ADS)
Valageas, Patrick
2014-01-01
Aims: We estimate the amplitude of the source-lens clustering bias and of the intrinsic-alignment bias of weak-lensing estimators of the two-point and three-point convergence and cosmic-shear correlation functions. Methods: We use a linear galaxy bias model for the galaxy-density correlations, as well as a linear intrinsic-alignment model. For the three-point and four-point density correlations, we use analytical or semi-analytical models, based on a hierarchical ansatz or a combination of one-loop perturbation theory with a halo model. Results: For two-point statistics, we find that the source-lens clustering bias is typically several orders of magnitude below the weak-lensing signal, except when we correlate a very low-redshift galaxy (z2 ≲ 0.05) with a higher redshift galaxy (z1 ≳ 0.5), where it can reach 10% of the signal for the shear. For three-point statistics, the source-lens clustering bias is typically on the order of 10% of the signal, as soon as the three galaxy source redshifts are not identical. The intrinsic-alignment bias is typically about 10% of the signal for both two-point and three-point statistics. Thus, both source-lens clustering bias and intrinsic-alignment bias must be taken into account for three-point estimators aiming at a better than 10% accuracy. Appendices are available in electronic form at http://www.aanda.org
Improved harmonic mean estimator for phylogenetic model evidence.
Arima, Serena; Tardella, Luca
2012-04-01
Bayesian phylogenetic methods are generating noticeable enthusiasm in the field of molecular systematics. Many phylogenetic models are often at stake, and different approaches are used to compare them within a Bayesian framework. The Bayes factor, defined as the ratio of the marginal likelihoods of two competing models, plays a key role in Bayesian model selection. We focus on an alternative estimator of the marginal likelihood whose computation is still a challenging problem. Several computational solutions have been proposed, none of which can be considered outperforming the others simultaneously in terms of simplicity of implementation, computational burden and precision of the estimates. Practitioners and researchers, often led by available software, have privileged so far the simplicity of the harmonic mean (HM) estimator. However, it is known that the resulting estimates of the Bayesian evidence in favor of one model are biased and often inaccurate, up to having an infinite variance so that the reliability of the corresponding conclusions is doubtful. We consider possible improvements of the generalized harmonic mean (GHM) idea that recycle Markov Chain Monte Carlo (MCMC) simulations from the posterior, share the computational simplicity of the original HM estimator, but, unlike it, overcome the infinite variance issue. We show reliability and comparative performance of the improved harmonic mean estimators comparing them to approximation techniques relying on improved variants of the thermodynamic integration.
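A small sketch of the harmonic-mean idea in a conjugate toy model where the exact marginal likelihood is known in closed form; the model, prior, and sample sizes are illustrative, and the phylogenetic setting is of course far more complex:

```python
import math, random

random.seed(0)

# Conjugate toy model: x_i ~ N(theta, 1), prior theta ~ N(0, 1).
n = 20
data = [random.gauss(1.0, 1.0) for _ in range(n)]
sx, sx2 = sum(data), sum(x * x for x in data)

def log_likelihood(theta):
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * (sx2 - 2 * theta * sx + n * theta * theta)

# Exact log marginal likelihood (data jointly normal, covariance I + 11^T).
exact = (-0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(n + 1.0)
         - 0.5 * (sx2 - sx * sx / (n + 1.0)))

# Harmonic-mean estimate from posterior draws: 1/m ≈ mean of 1/L(theta_s),
# computed in log space for stability.
post_mean, post_sd = sx / (n + 1.0), 1.0 / math.sqrt(n + 1.0)
draws = [random.gauss(post_mean, post_sd) for _ in range(5000)]
neg_ll = [-log_likelihood(t) for t in draws]
m = max(neg_ll)
log_hm = -(m + math.log(sum(math.exp(v - m) for v in neg_ll) / len(neg_ll)))
print(exact, log_hm)
```

In this toy case the estimator returns a finite answer, but its variance under the posterior is infinite, which is precisely the pathology the GHM variants discussed above are designed to remove.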
Improved Estimation and Interpretation of Correlations in Neural Circuits
Yatsenko, Dimitri; Josić, Krešimir; Ecker, Alexander S.; Froudarakis, Emmanouil; Cotton, R. James; Tolias, Andreas S.
2015-01-01
Ambitious projects aim to record the activity of ever larger and denser neuronal populations in vivo. Correlations in neural activity measured in such recordings can reveal important aspects of neural circuit organization. However, estimating and interpreting large correlation matrices is statistically challenging. Estimation can be improved by regularization, i.e. by imposing a structure on the estimate. The amount of improvement depends on how closely the assumed structure represents dependencies in the data. Therefore, the selection of the most efficient correlation matrix estimator for a given neural circuit must be determined empirically. Importantly, the identity and structure of the most efficient estimator informs about the types of dominant dependencies governing the system. We sought statistically efficient estimators of neural correlation matrices in recordings from large, dense groups of cortical neurons. Using fast 3D random-access laser scanning microscopy of calcium signals, we recorded the activity of nearly every neuron in volumes 200 μm wide and 100 μm deep (150–350 cells) in mouse visual cortex. We hypothesized that in these densely sampled recordings, the correlation matrix should be best modeled as the combination of a sparse graph of pairwise partial correlations representing local interactions and a low-rank component representing common fluctuations and external inputs. Indeed, in cross-validation tests, the covariance matrix estimator with this structure consistently outperformed other regularized estimators. The sparse component of the estimate defined a graph of interactions. These interactions reflected the physical distances and orientation tuning properties of cells: The density of positive ‘excitatory’ interactions decreased rapidly with geometric distances and with differences in orientation preference whereas negative ‘inhibitory’ interactions were less selective. Because of its superior performance, this
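The regularization-selection idea above (impose a structure, pick the estimator by cross-validated likelihood) can be sketched with plain shrinkage toward a diagonal target; the paper's sparse-plus-low-rank estimator is more elaborate, and the data here are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "population activity": correlated units, few samples relative to dimension.
dim = 10
true_cov = 0.3 * np.ones((dim, dim)) + 0.7 * np.eye(dim)
train = rng.multivariate_normal(np.zeros(dim), true_cov, size=40)
held_out = rng.multivariate_normal(np.zeros(dim), true_cov, size=40)

def gaussian_score(cov, samples):
    # Mean held-out Gaussian log-likelihood (up to an additive constant).
    _, logdet = np.linalg.slogdet(cov)
    inv = np.linalg.inv(cov)
    return float(np.mean([-0.5 * (logdet + v @ inv @ v) for v in samples]))

sample_cov = np.cov(train, rowvar=False)
# Candidate estimators: shrink the sample covariance toward its diagonal.
scores = {a: gaussian_score((1 - a) * sample_cov + a * np.diag(np.diag(sample_cov)),
                            held_out)
          for a in (0.0, 0.25, 0.5, 0.75)}
best_alpha = max(scores, key=scores.get)
print("cross-validation picks shrinkage:", best_alpha)
```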
Robust normal estimation of point cloud with sharp features via subspace clustering
NASA Astrophysics Data System (ADS)
Luo, Pei; Wu, Zhuangzhi; Xia, Chunhe; Feng, Lu; Jia, Bo
2014-01-01
Normal estimation is an essential step in point cloud based geometric processing, such as high quality point based rendering and surface reconstruction. In this paper, we present a clustering based method for normal estimation which preserves sharp features. For a piecewise smooth point cloud, the k-nearest neighbors of one point lie on a union of multiple subspaces. Given the PCA normals as input, we perform a subspace clustering algorithm to segment these subspaces. Normals are estimated by the points lying in the same subspace as the center point. In contrast to previous methods, we exploit the low-rankness of the input data by seeking the lowest rank representation among all the candidates that can represent one normal as linear combinations of the others. Integration of Low-Rank Representation (LRR) makes our method robust to noise. Moreover, our method can simultaneously produce the estimated normals and the local structures, which are especially useful for denoising and segmentation applications. The experimental results show that our approach successfully recovers sharp features and generates more reliable results compared with the state-of-the-art.
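The PCA baseline that the method takes as input can be sketched directly; the paper's subspace-clustering refinement of the neighborhood (via LRR) is not reproduced here:

```python
import numpy as np

def pca_normal(neighbors):
    """Estimate a surface normal as the eigenvector of the neighborhood
    covariance with the smallest eigenvalue (plain PCA baseline)."""
    pts = neighbors - neighbors.mean(axis=0)
    cov = pts.T @ pts
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, 0]          # eigh sorts eigenvalues ascending

# Noise-free neighbors sampled on the plane z = 0: normal should be +-z.
grid = np.array([[x, y, 0.0] for x in (-1, 0, 1) for y in (-1, 0, 1)])
normal = pca_normal(grid)
print(normal)
```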
A simple recipe for estimating masses of elliptical galaxies and clusters of galaxies
NASA Astrophysics Data System (ADS)
Lyskova, N.
2013-04-01
We discuss a simple and robust procedure to evaluate the mass/circular velocity of massive elliptical galaxies and clusters of galaxies. It relies only on the surface density and the projected velocity dispersion profiles of tracer particles and therefore can be applied even in the case of poor or noisy observational data. Stars, globular clusters or planetary nebulae can be used as tracers for mass determination of elliptical galaxies. For clusters the galaxies themselves can be used as tracer particles. The key element of the proposed procedure is the selection of a "sweet" radius R_sweet, where the sensitivity to the unknown anisotropy of the tracers' orbits is minimal. At this radius the surface density of tracers declines approximately as I(R) ∝ R^(-2), thus placing R_sweet not far from the half-light radius of the tracers R_eff. The procedure was tested on a sample of cosmological simulations of individual galaxies and galaxy clusters and then applied to real observational data. Independently, the total mass profile was derived from the hydrostatic equilibrium equation for the gaseous atmosphere. The mismatch in mass profiles obtained from optical and X-ray data is used to estimate the non-thermal contribution to the gas pressure and/or to constrain the distribution of tracers' orbits.
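Once the projected dispersion at the sweet radius is measured, the recipe reduces to a one-line mass estimate. A minimal sketch; the coefficient V_c^2 ≈ 3σ_p^2 and the numerical value of G in astronomer's units are our illustrative assumptions patterned on simple isotropic dispersion estimators, not the paper's calibrated relation:

```python
G_KPC = 4.30091e-6  # gravitational constant in kpc * (km/s)^2 / Msun

def sweet_radius_mass(sigma_p_kms, r_sweet_kpc):
    """Enclosed mass from the projected velocity dispersion at the sweet
    radius, using the illustrative relation V_c^2 ~ 3 * sigma_p^2 and
    M(<R) = V_c^2 * R / G."""
    vc2 = 3.0 * sigma_p_kms ** 2
    return vc2 * r_sweet_kpc / G_KPC

# e.g. a massive elliptical: sigma_p = 250 km/s at R_sweet = 10 kpc
m_enclosed = sweet_radius_mass(250.0, 10.0)
```

The appeal of the recipe is visible here: only two observables enter, so noisy data degrade the estimate gracefully rather than catastrophically.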
Distance Estimates for High Redshift Clusters SZ and X-Ray Measurements
NASA Technical Reports Server (NTRS)
Joy, Marshall K.
1999-01-01
I present interferometric images of the Sunyaev-Zel'dovich effect for the high redshift (z > 0.5) galaxy clusters in the Einstein Medium Sensitivity Survey: MS0451.5-0305 (z = 0.54), MS0015.9+1609 (z = 0.55), MS2053.7-0449 (z = 0.58), MS1137.5+6625 (z = 0.78), and MS1054.5-0321 (z = 0.83). Isothermal β models are applied to the data to determine the magnitude of the Sunyaev-Zel'dovich (S-Z) decrement in each cluster. Complementary ROSAT PSPC and HRI X-ray data are also analyzed, and are combined with the S-Z data to generate an independent estimate of the cluster distance. Since the Sunyaev-Zel'dovich effect is invariant with redshift, sensitive S-Z imaging can provide an independent determination of the size, shape, density, and distance of high redshift galaxy clusters; we will discuss current systematic uncertainties with this approach, as well as future observations which will yield stronger constraints.
Thompson, William L.
2001-07-01
Monitoring population numbers is important for assessing trends and meeting various legislative mandates. However, sampling across time introduces a temporal aspect to survey design in addition to the spatial one. For instance, a sample that is initially representative may lose this attribute if there is a shift in numbers and/or spatial distribution in the underlying population that is not reflected in later sampled plots. Plot selection methods that account for this temporal variability will produce the best trend estimates. Consequently, I used simulation to compare bias and relative precision of estimates of population change among stratified and unstratified sampling designs based on permanent, temporary, and partial replacement plots under varying levels of spatial clustering, density, and temporal shifting of populations. Permanent plots produced more precise estimates of change than temporary plots across all factors. Further, permanent plots performed better than partial replacement plots except for high density (5 and 10 individuals per plot) and 25% - 50% shifts in the population. Stratified designs always produced less precise estimates of population change for all three plot selection methods, and often produced biased change estimates and greatly inflated variance estimates under sampling with partial replacement. Hence, stratification that remains fixed across time should be avoided when monitoring populations that are likely to exhibit large changes in numbers and/or spatial distribution during the study period. Key words: bias; change estimation; monitoring; permanent plots; relative precision; sampling with partial replacement; temporary plots.
Ji, Ze-Xuan; Sun, Quan-Sen; Xia, De-Shen
2011-07-01
A modified possibilistic fuzzy c-means clustering algorithm is presented for fuzzy segmentation of magnetic resonance (MR) images that have been corrupted by intensity inhomogeneities and noise. By introducing a novel adaptive method to compute the weights of the local spatial term in the objective function, the new adaptive fuzzy clustering algorithm is capable of utilizing local contextual information to impose local spatial continuity, thus allowing the suppression of noise and helping to resolve classification ambiguity. To estimate the intensity inhomogeneity, the global intensity is introduced into the coherent local intensity clustering algorithm, taking both local and global intensity information into account. The segmentation is therefore driven by two forces, which smooth the derived optimal bias field and improve the accuracy of the segmentation task. The proposed method has been successfully applied to 3 T, 7 T, synthetic and real MR images with desirable results. Comparisons with other approaches demonstrate the superior performance of the proposed algorithm. Moreover, the proposed algorithm is robust to initialization, thereby allowing fully automatic applications.
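The starting point of such algorithms is plain fuzzy c-means; the modifications described above add adaptive local-spatial weights and a bias-field model on top of this baseline. A minimal FCM sketch on synthetic 1-D "tissue intensities" (no spatial or possibilistic terms; names and data are illustrative):

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=50, seed=0):
    """Plain fuzzy c-means: alternate weighted-mean center updates with the
    standard inverse-distance membership update."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))          # fuzzy memberships
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]    # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)        # standard FCM update
    return centers, U

# two well-separated intensity groups mimicking two tissue classes
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0.0, 0.1, 100),
                    rng.normal(5.0, 0.1, 100)])[:, None]
centers, U = fuzzy_cmeans(X, c=2)
```

The soft memberships U are what the spatial-continuity term reweights in the modified algorithm: a noisy voxel surrounded by one class gets pulled toward that class even when its own intensity is ambiguous.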
Yuwono, Mitchell; Su, Steven W; Moulton, Bruce D; Nguyen, Hung T
2013-01-01
When undertaking gait analysis, one of the most important factors to consider is heel-strike (HS). Signals from a waist-worn Inertial Measurement Unit (IMU) provide sufficient accelerometric and gyroscopic information for estimating gait parameters and identifying HS events. In this paper we propose a novel adaptive, unsupervised, and parameter-free identification method for the detection of HS events during gait episodes. Our proposed method allows the device to learn and adapt to the profile of the user without the need for supervision. The algorithm is completely parameter-free and requires no prior fine-tuning. Autocorrelation features (ACF) of both the antero-posterior acceleration (aAP) and the medio-lateral acceleration (aML) are used to determine cadence episodes. The Discrete Wavelet Transform (DWT) features of signal peaks during cadence are extracted and clustered using Swarm Rapid Centroid Estimation (Swarm RCE). Left HS (LHS), right HS (RHS), and movement artifacts are clustered based on intra-cluster correlation. Initial pilot testing of the system on 8 subjects shows promising results: up to 84.3%±9.2% and 86.7%±6.9% average accuracy, with 86.8%±9.2% and 88.9%±7.1% average precision, for the segmentation of LHS and RHS respectively.
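The cadence-detection step can be sketched simply: the dominant gait period shows up as the first strong autocorrelation peak beyond some minimum lag. A toy numpy version on a synthetic 1 Hz signal (the 0.4 s minimum lag is our illustrative assumption, not a parameter of the paper's method, which is parameter-free):

```python
import numpy as np

def dominant_period(x, fs, min_lag_s=0.4):
    """Estimate the dominant (cadence) period of a signal from the largest
    autocorrelation peak past a minimum physiological lag."""
    x = x - x.mean()
    acf = np.correlate(x, x, mode='full')[len(x) - 1:]  # non-negative lags
    acf = acf / acf[0]                                   # normalize to lag 0
    min_lag = int(min_lag_s * fs)
    peak = min_lag + int(np.argmax(acf[min_lag:]))
    return peak / fs

fs = 100.0
t = np.arange(0.0, 10.0, 1.0 / fs)
acc = np.sin(2.0 * np.pi * 1.0 * t)   # synthetic antero-posterior accel., 1 Hz
period = dominant_period(acc, fs)
```

On real aAP/aML traces the same idea flags cadence episodes; the DWT-plus-clustering stages then separate left and right heel strikes within each episode.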
ERIC Educational Resources Information Center
Rhoads, Christopher
2014-01-01
Recent publications have drawn attention to the idea of utilizing prior information about the correlation structure to improve statistical power in cluster randomized experiments. Because power in cluster randomized designs is a function of many different parameters, it has been difficult for applied researchers to discern a simple rule explaining…
Wu, Sheng; Crespi, Catherine M.; Wong, Weng Kee
2012-01-01
The intraclass correlation coefficient (ICC) is a fundamental parameter of interest in cluster randomized trials as it can greatly affect statistical power. We compare common methods of estimating the ICC in cluster randomized trials with binary outcomes, with a specific focus on their application to community-based cancer prevention trials with primary outcome of self-reported cancer screening. Using three real data sets from cancer screening intervention trials with different numbers and types of clusters and cluster sizes, we obtained point estimates and 95% confidence intervals for the ICC using five methods: the analysis of variance estimator, the Fleiss-Cuzick estimator, the Pearson estimator, an estimator based on generalized estimating equations and an estimator from a random intercept logistic regression model. We compared estimates of the ICC for the overall sample and by study condition. Our results show that ICC estimates from different methods can be quite different, although confidence intervals generally overlap. The ICC varied substantially by study condition in two studies, suggesting that the common practice of assuming a common ICC across all clusters in the trial is questionable. A simulation study confirmed pitfalls of erroneously assuming a common ICC. Investigators should consider using sample size and analysis methods that allow the ICC to vary by study condition. PMID:22627076
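Of the five estimators compared, the analysis-of-variance estimator is the most familiar and the easiest to sketch: it contrasts between-cluster and within-cluster mean squares from a one-way ANOVA. An illustrative implementation (not the authors' code):

```python
import numpy as np

def anova_icc(values, clusters):
    """One-way ANOVA estimator of the intraclass correlation coefficient;
    also applicable to binary 0/1 outcomes as in the screening trials."""
    values = np.asarray(values, float)
    clusters = np.asarray(clusters)
    labels = np.unique(clusters)
    k, n = len(labels), len(values)
    sizes = np.array([(clusters == g).sum() for g in labels])
    means = np.array([values[clusters == g].mean() for g in labels])
    grand = values.mean()
    ssb = float(np.sum(sizes * (means - grand) ** 2))
    ssw = float(sum(((values[clusters == g] - m) ** 2).sum()
                    for g, m in zip(labels, means)))
    msb, msw = ssb / (k - 1), ssw / (n - k)
    n0 = (n - (sizes ** 2).sum() / n) / (k - 1)   # adjusted mean cluster size
    return (msb - msw) / (msb + (n0 - 1) * msw)

# perfectly clustered binary outcomes give an ICC of 1
icc = anova_icc([1, 1, 1, 0, 0, 0], ['a', 'a', 'a', 'b', 'b', 'b'])
```

Running the same function separately on each study condition is one simple way to see the condition-specific ICC variation the authors warn about.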
Motion estimation in the frequency domain using fuzzy c-planes clustering.
Erdem, C E; Karabulut, G Z; Yanmaz, E; Anarim, E
2001-01-01
A recent work explicitly models the discontinuous motion estimation problem in the frequency domain, where the motion parameters are estimated using a harmonic retrieval approach. The vertical and horizontal components of the motion are independently estimated from the locations of the peaks of the respective periodogram analyses, and they are paired to obtain the motion vectors using a previously proposed procedure. In this paper, we present a more efficient method that replaces the motion component pairing task and hence eliminates the problems of that pairing method. The method described in this paper uses the fuzzy c-planes (FCP) clustering approach to fit planes to three-dimensional (3-D) frequency domain data obtained from the peaks of the periodograms. Experimental results are provided to demonstrate the effectiveness of the proposed method.
Leon-Perez, Jose M; Notelaers, Guy; Arenas, Alicia; Munduate, Lourdes; Medina, Francisco J
2014-05-01
Research findings underline the negative effects of exposure to bullying behaviors and document the detrimental health effects of being a victim of workplace bullying. While no one disputes its negative consequences, debate continues about the magnitude of this phenomenon since very different prevalence rates of workplace bullying have been reported. Methodological aspects may explain these findings. Our contribution to this debate integrates behavioral and self-labeling estimation methods of workplace bullying into a measurement model that constitutes a bullying typology. Results in the present sample (n = 1,619) revealed that six different groups can be distinguished according to the nature and intensity of reported bullying behaviors. These clusters portray different paths for the workplace bullying process, where negative work-related and person-degrading behaviors are strongly intertwined. The analysis of the external validity showed that integrating previous estimation methods into a single measurement latent class model provides a reliable estimation method of workplace bullying, which may overcome previous flaws.
Unsupervised Texture Flow Estimation Using Appearance-Space Clustering and Correspondence.
Choi, Sunghwan; Min, Dongbo; Ham, Bumsub; Sohn, Kwanghoon
2015-11-01
This paper presents a texture flow estimation method that uses an appearance-space clustering and a correspondence search in the space of deformed exemplars. To estimate the underlying texture flow, such as scale, orientation, and texture label, most existing approaches require a certain amount of user interactions. Strict assumptions on a geometric model further limit the flow estimation to such a near-regular texture as a gradient-like pattern. We address these problems by extracting distinct texture exemplars in an unsupervised way and using an efficient search strategy on a deformation parameter space. This enables estimating a coherent flow in a fully automatic manner, even when an input image contains multiple textures of different categories. A set of texture exemplars that describes the input texture image is first extracted via a medoid-based clustering in appearance space. The texture exemplars are then matched with the input image to infer deformation parameters. In particular, we define a distance function for measuring a similarity between the texture exemplar and a deformed target patch centered at each pixel from the input image, and then propose to use a randomized search strategy to estimate these parameters efficiently. The deformation flow field is further refined by adaptively smoothing the flow field under guidance of a matching confidence score. We show that a local visual similarity, directly measured from appearance space, explains local behaviors of the flow very well, and the flow field can be estimated very efficiently when the matching criterion meets the randomized search strategy. Experimental results on synthetic and natural images show that the proposed method outperforms existing methods.
2015-03-26
Estimating Single and Multiple Target Locations Using K-Means Clustering with Radio Tomographic Imaging in Wireless Sensor Networks (thesis)
Nishida, Jeffrey K.
Age estimates of globular clusters in the Milky Way: constraints on cosmology.
Krauss, Lawrence M; Chaboyer, Brian
2003-01-03
Recent observations of stellar globular clusters in the Milky Way Galaxy, combined with revised ranges of parameters in stellar evolution codes and new estimates of the earliest epoch of globular cluster formation, result in a 95% confidence level lower limit on the age of the Universe of 11.2 billion years. This age is inconsistent with the expansion age for a flat Universe for the currently allowed range of the Hubble constant, unless the cosmic equation of state is dominated by a component that violates the strong energy condition. This means that the three fundamental observables in cosmology-the age of the Universe, the distance-redshift relation, and the geometry of the Universe-now independently support the case for a dark energy-dominated Universe.
ROC Estimation from Clustered Data with an Application to Liver Cancer Data
Kim, Joungyoun; Yun, Sung-Cheol; Lim, Johan; Lee, Moo-Song; Son, Won; Park, DoHwan
2016-01-01
In this article, we propose a regression model to compare the performances of different diagnostic methods having clustered ordinal test outcomes. The proposed model treats ordinal test outcomes (an ordinal categorical variable) as grouped-survival time data and uses random effects to explain correlation among outcomes from the same cluster. To compare different diagnostic methods, we introduce a set of covariates indicating diagnostic methods and compare their coefficients. We find that the proposed model defines a Lehmann family and can also introduce a location-scale family of a receiver operating characteristic (ROC) curve. The proposed model can easily be estimated using standard statistical software such as SAS and SPSS. We illustrate its practical usefulness by applying it to testing different magnetic resonance imaging (MRI) methods to detect abnormal lesions in a liver. PMID:28050126
Pu, Xiangke; Gao, Ge; Fan, Yubo; Wang, Mian
2016-01-01
Randomized response is a research method for obtaining accurate answers to sensitive questions in structured sample surveys. Simple random sampling is widely used in surveys of sensitive questions but is hard to apply to large targeted populations. On the other hand, more sophisticated sampling regimes and corresponding formulas are seldom employed in sensitive question surveys. In this work, we developed a series of formulas for parameter estimation in cluster sampling and stratified cluster sampling under two kinds of randomized response models, using classic sampling theories and total probability formulas. The performance of the sampling methods and formulas was evaluated in a survey of premarital sex and cheating on exams at Soochow University. The reliability of the survey methods and formulas for sensitive question surveys was found to be high.
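Under the classic Warner randomized response model (simple random sampling; the paper's contribution is extending such formulas to cluster and stratified cluster sampling), each respondent answers the sensitive statement with probability p and its complement with probability 1-p, so the sensitive prevalence can be back-calculated from the observed "yes" rate. A sketch of the standard estimator:

```python
def warner_estimate(n_yes, n, p):
    """Warner randomized-response estimator of a sensitive prevalence pi.
    The observed 'yes' proportion is lam = p*pi + (1-p)*(1-pi), which is
    inverted for pi; the variance follows from binomial sampling of lam."""
    lam = n_yes / n
    pi_hat = (lam - (1.0 - p)) / (2.0 * p - 1.0)
    var_hat = lam * (1.0 - lam) / (n * (2.0 * p - 1.0) ** 2)
    return pi_hat, var_hat

# if true pi = 0.2 and p = 0.7, the expected 'yes' rate is 0.38
pi_hat, var_hat = warner_estimate(380, 1000, 0.7)
```

The cluster-sampling versions in the paper apply the same inversion within clusters and then combine the cluster-level estimates with design-based weights.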
Improving and Evaluating Nested Sampling Algorithm for Marginal Likelihood Estimation
NASA Astrophysics Data System (ADS)
Ye, M.; Zeng, X.; Wu, J.; Wang, D.; Liu, J.
2016-12-01
With the growing impacts of climate change and human activities on the cycle of water resources, an increasing number of studies focus on the quantification of modeling uncertainty. Bayesian model averaging (BMA) provides a popular framework for quantifying conceptual model and parameter uncertainty. The ensemble prediction is generated by combining each plausible model's prediction, and each model carries a weight determined by its prior weight and marginal likelihood. Thus, the estimation of a model's marginal likelihood is crucial for reliable and accurate BMA prediction. The nested sampling estimator (NSE) is a newly proposed method for marginal likelihood estimation. NSE works by gradually searching the parameter space from low-likelihood to high-likelihood regions, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm is often used for local sampling. However, M-H is not an efficient sampling algorithm for high-dimensional or complicated parameter spaces. To improve the efficiency of NSE, it is natural to incorporate a more robust and efficient sampling algorithm - DREAMzs - into the local sampling of NSE. The comparison results demonstrated that the improved NSE could improve the efficiency of marginal likelihood estimation significantly. However, both the improved and original NSEs suffer from considerable instability. In addition, the heavy computational cost of the huge number of model executions is overcome by using adaptive sparse grid surrogates.
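A toy version of nested sampling makes the role of the local sampling step concrete. Below, replacement live points are drawn by naive rejection from the prior, which is exactly the step that M-H or DREAMzs performs more efficiently in realistic, high-dimensional problems (all settings are illustrative, not the paper's configuration):

```python
import numpy as np

def nested_sampling_logz(loglike, prior_sample, n_live=100, n_iter=600, seed=0):
    """Minimal nested-sampling estimate of the log marginal likelihood.
    Prior volume shrinks as X_i = exp(-i/n_live); the evidence is the sum
    of likelihood values weighted by the corresponding shell widths."""
    rng = np.random.default_rng(seed)
    live = np.array([prior_sample(rng) for _ in range(n_live)])
    logL = np.array([loglike(x) for x in live])
    logZ = -np.inf
    for i in range(n_iter):
        worst = int(np.argmin(logL))
        logw = -i / n_live + np.log1p(-np.exp(-1.0 / n_live))  # shell width
        logZ = np.logaddexp(logZ, logw + logL[worst])
        while True:   # rejection step: draw from the prior above the floor
            x = prior_sample(rng)
            if loglike(x) > logL[worst]:
                break
        live[worst], logL[worst] = x, loglike(x)
    # assign the remaining prior volume to the surviving live points
    logZ = np.logaddexp(logZ, -n_iter / n_live + np.log(np.mean(np.exp(logL))))
    return float(logZ)

# toy check: prior U(0,1), likelihood L(x) = 2x, so the true evidence is 1
logz = nested_sampling_logz(lambda x: np.log(2.0 * x),
                            lambda r: r.uniform(0.0, 1.0))
```

The rejection loop's acceptance rate decays exponentially with iteration number, which is why a smarter constrained sampler is the decisive ingredient for NSE efficiency.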
Improving quantum state estimation with mutually unbiased bases.
Adamson, R B A; Steinberg, A M
2010-07-16
When used in quantum state estimation, projections onto mutually unbiased bases have the ability to maximize information extraction per measurement and to minimize redundancy. We present the first experimental demonstration of quantum state tomography of two-qubit polarization states to take advantage of mutually unbiased bases. We demonstrate improved state estimation as compared to standard measurement strategies and discuss how this can be understood from the structure of the measurements we use. We experimentally compared our method to the standard state estimation method for three different states and observed that the infidelity was up to 1.84 ± 0.06 times lower using our technique than using standard state estimation methods.
NASA Astrophysics Data System (ADS)
Dogulu, Nilay; Solomatine, Dimitri; Lal Shrestha, Durga
2014-05-01
Within the context of flood forecasting, assessment of predictive uncertainty has become a necessity for most modelling studies in operational hydrology. Several uncertainty analysis and/or prediction methods are available in the literature; however, most of them rely on normality and homoscedasticity assumptions for the model residuals that occur in reproducing the observed data. This study focuses on a statistical method that analyzes model residuals without any such assumptions, based on a clustering approach: Uncertainty Estimation based on local Errors and Clustering (UNEEC). The aim of this work is to provide a comprehensive evaluation of the UNEEC method's performance with respect to the clustering approach employed within its methodology. This is done by analyzing the normality of model residuals and comparing uncertainty analysis results (for 50% and 90% confidence levels) with those obtained from uniform interval and quantile regression methods. An important part of the basis on which the methods are compared is the analysis of data clusters representing different hydrometeorological conditions. The validation measures used are PICP, MPI, ARIL and NUE where necessary. A new validation measure linking the prediction interval to the (hydrological) model quality - the weighted mean prediction interval (WMPI) - is also proposed for comparing the methods more effectively. The case study is the Brue catchment, located in the South West of England. A different parametrization of the method than in its previous application in Shrestha and Solomatine (2008) is used, i.e. past error values are considered in addition to discharge and effective rainfall. The results show that UNEEC's notable methodological characteristic, i.e. applying clustering to predictor data in which catchment behaviour information is encapsulated, contributes to the increased accuracy of the method's results for varying flow conditions. Besides, classifying data so that extreme flow events are individually
NASA Astrophysics Data System (ADS)
Eadie, Gwendolyn; Harris, William E.; Springford, Aaron; Widrow, Larry
2017-01-01
The mass and cumulative mass profile of the Milky Way's dark matter halo is a fundamental property of the Galaxy, and yet these quantities remain poorly constrained and span almost two orders of magnitude in the literature. There are a variety of methods to measure the mass of the Milky Way, and a common way to constrain the mass uses kinematic information of satellite objects (e.g. globular clusters) orbiting the Galaxy. One reason precise estimates of the mass and mass profile remain elusive is that the kinematic data of the globular clusters are incomplete; for some both line-of-sight and proper motion measurements are available (i.e. complete data), and for others there are only line-of-sight velocities (i.e. incomplete data). Furthermore, some proper motion measurements suffer from large measurement uncertainties, and these uncertainties can be difficult to take into account because they propagate in complicated ways. Past methods have dealt with incomplete data by using either only the line-of-sight measurements (and throwing away the proper motions), or only using the complete data. In either case, valuable information is not included in the analysis. During my PhD research, I have been developing a coherent hierarchical Bayesian method to estimate the mass and mass profile of the Galaxy that 1) includes both complete and incomplete kinematic data simultaneously in the analysis, and 2) includes measurement uncertainties in a meaningful way. In this presentation, I will introduce our approach in a way that is accessible and clear, and will also present our estimates of the Milky Way's total mass and mass profile using all available kinematic data from the globular cluster population of the Galaxy.
Aloisio, Kathryn M; Swanson, Sonja A; Micali, Nadia; Field, Alison; Horton, Nicholas J
2014-10-01
Clustered data arise in many settings, particularly within the social and biomedical sciences. As an example, multiple-source reports are commonly collected in child and adolescent psychiatric epidemiologic studies, where researchers use various informants (e.g. parent and adolescent) to provide a holistic view of a subject's symptomatology. Fitzmaurice et al. (1995) have described estimation of multiple-source models using a standard generalized estimating equation (GEE) framework. However, these studies often have missing data due to the additional stages of consent and assent required. The usual GEE is unbiased when missingness is Missing Completely at Random (MCAR) in the sense of Little and Rubin (2002). This is a strong assumption that may not be tenable. Other options such as weighted generalized estimating equations (WGEE) are computationally challenging when missingness is non-monotone. Multiple imputation is an attractive method for fitting incomplete data models while only requiring the less restrictive Missing at Random (MAR) assumption. Previously, estimation of partially observed clustered data was computationally challenging; however, recent developments in Stata have facilitated its use in practice. We demonstrate how to utilize multiple imputation in conjunction with a GEE to investigate the prevalence of disordered eating symptoms in adolescents as reported by parents and adolescents, as well as factors associated with concordance and prevalence. The methods are motivated by the Avon Longitudinal Study of Parents and their Children (ALSPAC), a cohort study that enrolled more than 14,000 pregnant mothers in 1991-92 and has followed the health and development of their children at regular intervals. While point estimates were fairly similar to those of the GEE under MCAR, the MAR model had smaller standard errors, while requiring less stringent assumptions regarding missingness.
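The pooling step of multiple imputation follows Rubin's rules: fit the GEE to each completed data set, then combine the per-imputation coefficients and variances. A sketch of the combining rules (the numbers below are illustrative, not study results):

```python
import math

def pool_rubin(estimates, variances):
    """Rubin's rules for pooling a parameter across m imputed data sets:
    the pooled estimate is the mean, and the total variance adds the
    within-imputation variance to an inflated between-imputation variance."""
    m = len(estimates)
    qbar = sum(estimates) / m
    ubar = sum(variances) / m                              # within-imputation
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)  # between-imputation
    total = ubar + (1.0 + 1.0 / m) * b
    return qbar, math.sqrt(total)

# hypothetical GEE log-odds estimates and variances from m = 3 imputations
est, se = pool_rubin([0.30, 0.34, 0.32], [0.004, 0.004, 0.004])
```

The (1 + 1/m) inflation of the between-imputation variance is what keeps the pooled standard error honest about the extra uncertainty introduced by imputing.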
A new estimate of the Hubble constant using the Virgo cluster distance
NASA Astrophysics Data System (ADS)
Visvanathan, N.
The Hubble constant, which defines the size and age of the universe, remains substantially uncertain. Attention is presently given to an improved distance to the Virgo Cluster obtained by means of the 1.05-micron luminosity-H I width relation of spirals. In order to improve the absolute calibration of the relation, accurate distances to the nearby SMC, LMC, N6822, SEX A and N300 galaxies have also been obtained, on the basis of the near-IR P-L relation of the Cepheids. A value for the global Hubble constant of 67 ± 4 km/s per Mpc is obtained.
mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models
Scrucca, Luca; Fop, Michael; Murphy, T. Brendan; Raftery, Adrian E.
2016-01-01
Finite mixture models are being used increasingly to model a wide variety of random phenomena for clustering, classification and density estimation. mclust is a powerful and popular package which allows modelling of data as a Gaussian finite mixture with different covariance structures and different numbers of mixture components, for a variety of purposes of analysis. Recently, version 5 of the package has been made available on CRAN. This updated version adds new covariance structures, dimension reduction capabilities for visualisation, model selection criteria, initialisation strategies for the EM algorithm, and bootstrap-based inference, making it a full-featured R package for data analysis via finite mixture modelling. PMID:27818791
Improving Emission Estimates With The Community Emissions Data System (CEDS
NASA Astrophysics Data System (ADS)
Smith, S.; Hoesly, R. M.
2016-12-01
Inventory data are a key component of scientific and regulatory efforts focused on air pollution, climate, and global change, and are also a critical complement to observational emission efforts. The Community Emissions Data System (CEDS) project aims to provide consistent estimates of historical anthropogenic emissions using an open-source data system. The first product from this system was anthropogenic emissions over 1750-2014 of reactive gases, aerosols, and carbon dioxide, for use in CMIP6. These data are annually resolved, have monthly seasonality, were estimated at a moderately detailed level of 50+ sectors and 8 fuel types, and were mapped to spatial grids. CEDS combines bottom-up default emissions estimates that are calibrated to country-level inventories where these are deemed reliable. Outside of years where inventories are available, driver data and emission factors are extended using user-defined rules. The system is designed to facilitate annual updates (so that the most recent inventory data are available). The software and most input data are being released as open source software in order to provide access to assumptions, improve emission estimates, and allow access to fundamental emissions data for research purposes. We report on our efforts to expand the spatial resolution by estimating emission trends by state/province for large countries. This will allow spatial shifts in emissions over time to be better represented and make the data more useful for research such as that discussed in this session. As part of these improvements we will add support for the use of regionally-specific emission proxies and point sources. A key focus of ongoing research is better quantification of emissions uncertainty. Our goal is consistent estimation of uncertainty over time, sector, and country. We will also report on results estimating the additional uncertainty associated with extending emissions data over recent years. http://www.globalchange.umd.edu/CEDS/
An improved method of monopulse estimation in PD radar
NASA Astrophysics Data System (ADS)
Wang, Jun; Guo, Peng; Lei, Peng; Wei, Shaoming
2011-10-01
Monopulse estimation is an angle measurement method with high data rate, measurement precision, and anti-jamming ability, since the angle information of the target is obtained by comparing echoes received in two or more simultaneous antenna beams. However, the data rate of this method decreases due to coherent integration when applied in pulse Doppler (PD) radar. This paper presents an improved method of monopulse estimation in PD radar. In this method, the received echoes are selected by shift before coherent integration, detection, and angle measurement. It can increase the data rate while maintaining angle measurement precision. The validity of this method is verified by theoretical analysis and simulation results.
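The core angle-measurement step being improved here is the monopulse ratio: near boresight, the real part of the difference-over-sum channel ratio is proportional to the off-axis angle. A toy sketch of just that step (the slope constant k_m and the linear signal model are our assumptions; the paper's shift-based echo selection before coherent integration is not modeled):

```python
import numpy as np

def monopulse_angle(sum_ch, diff_ch, k_m):
    """Off-boresight angle from the real part of the monopulse ratio.
    k_m is the difference-pattern slope, assumed known from calibration."""
    return np.real(diff_ch / sum_ch) / k_m

# toy signal model: diff/sum = k_m * theta for a target at angle theta
k_m, theta = 1.6, 0.01            # slope and true angle (radians)
sum_ch = 1.0 + 0.0j               # normalized sum-channel echo
diff_ch = k_m * theta * sum_ch    # corresponding difference-channel echo
theta_hat = monopulse_angle(sum_ch, diff_ch, k_m)
```

In a PD radar, both channels would first be coherently integrated (or, per this paper, shift-selected before integration) so that the ratio is formed on a high-SNR sample.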
Forastiere, Laura; Mealli, Fabrizia; VanderWeele, Tyler J.
2016-01-01
Exploration of causal mechanisms is often important for researchers and policymakers to understand how an intervention works and how it can be improved. This task can be crucial in clustered encouragement designs (CEDs). Encouragement design studies arise frequently when the treatment cannot be enforced because of ethical or practical constraints, and an encouragement intervention (information campaigns, incentives, etc.) is conceived with the purpose of increasing the uptake of the treatment of interest. By design, encouragements always entail the complication of non-compliance. Encouragements can also give rise to a variety of mechanisms, particularly when encouragement is assigned at the cluster level. Social interactions among units within the same cluster can result in spillover effects. Disentangling the effect of encouragement through spillover effects from that through the enhancement of the treatment would give better insight into the intervention and could be compelling for planning the scaling-up phase of the program. Building on previous work on CEDs and non-compliance, we use the principal stratification framework to define stratum-specific causal effects, that is, effects for specific latent subpopulations, defined by the joint potential compliance statuses under both encouragement conditions. We show how these stratum-specific causal effects are related to the decomposition commonly used in the literature and provide flexible homogeneity assumptions under which an extrapolation across principal strata allows one to disentangle the effects. Estimation of causal estimands can be performed with Bayesian inferential methods using hierarchical models to account for clustering. We illustrate the proposed methodology by analyzing a cluster randomized experiment implemented in Zambia and designed to evaluate the impact on malaria prevalence of an agricultural loan program intended to increase bed net coverage. Farmer households assigned to the program
Evaluation of the IMPROVE Equation for estimating aerosol light extinction.
Lowenthal, Douglas H; Kumar, Naresh
2016-07-01
The [revised] IMPROVE Equation for estimating light extinction from aerosol chemical composition was evaluated considering new measurements at U.S. national parks. Compared with light scattering (Bsp) measured at seven IMPROVE sites with nephelometer data from 2003-2012, the [revised] IMPROVE Equation over- and underestimated Bsp in the lower and upper quintiles, respectively, of measured Bsp. Underestimation of the worst visibility cases (upper quintile) was reduced by assuming an organic mass (OM)/organic carbon (OC) ratio of 2.1 and hygroscopic growth of OM, based on results from previous field studies. This assumption, however, tended to overestimate low Bsp even more. Assuming that sulfate was present as ammonium bisulfate rather than as ammonium sulfate uniformly reduced estimated Bsp. The split-mode model of concentration- and size-dependent dry mass scattering efficiencies in the [revised] IMPROVE Equation does not eliminate systematic biases in estimated Bsp. While the new measurements of OM/OC and OM hygroscopicity should be incorporated into future iterations of the IMPROVE Equation, the problem is not well constrained due to a lack of routine measurements of sulfate neutralization and the water-soluble fraction of OM in the IMPROVE network. Studies in U.S. national parks showed that aerosol organics contain more mass and absorb more water as a function of relative humidity than is currently assumed by the IMPROVE Equation for calculating chemical light extinction. Consideration of these results could significantly shift the apportionment of light extinction to water-soluble organic aerosols and therefore better inform pollution control strategies under the U.S. Environmental Protection Agency Regional Haze Rule.
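The OM/OC adjustment discussed above is simple arithmetic. In this hedged sketch the growth-factor form, the dry mass scattering efficiency, and the kappa value are illustrative assumptions, not the IMPROVE Equation's actual split-mode terms:

```python
# Hedged sketch: raise the OM/OC ratio to 2.1 and let organic mass take up
# water via a simple growth factor f(RH). Efficiency in m2/g and kappa are
# invented for illustration.
def om_scattering(oc, rh, om_oc=2.1, efficiency=4.0, kappa=0.1):
    om = om_oc * oc                         # organic mass from organic carbon
    f_rh = 1.0 + kappa * rh/(100.0 - rh)    # assumed hygroscopic growth form
    return efficiency * om * f_rh           # Mm^-1 if oc is in ug/m3

b_dry = om_scattering(oc=2.0, rh=0.0)
b_wet = om_scattering(oc=2.0, rh=85.0)
```

The point of the sketch is qualitative: hygroscopic OM raises estimated scattering at the high-RH (worst-visibility) end, exactly where the equation currently underestimates.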
Estimating accuracy of land-cover composition from two-stage cluster sampling
Stehman, S.V.; Wickham, J.D.; Fattorini, L.; Wade, T.D.; Baffetta, F.; Smith, J.H.
2009-01-01
Land-cover maps are often used to compute land-cover composition (i.e., the proportion or percent of area covered by each class), for each unit in a spatial partition of the region mapped. We derive design-based estimators of mean deviation (MD), mean absolute deviation (MAD), root mean square error (RMSE), and correlation (CORR) to quantify accuracy of land-cover composition for a general two-stage cluster sampling design, and for the special case of simple random sampling without replacement (SRSWOR) at each stage. The bias of the estimators for the two-stage SRSWOR design is evaluated via a simulation study. The estimators of RMSE and CORR have small bias except when sample size is small and the land-cover class is rare. The estimator of MAD is biased for both rare and common land-cover classes except when sample size is large. A general recommendation is that rare land-cover classes require large sample sizes to ensure that the accuracy estimators have small bias. © 2009 Elsevier Inc.
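The four accuracy summaries can be sketched for the simplest case. This hedged example computes MD, MAD, RMSE, and CORR for a simple random sample of map/reference composition pairs; the paper's design-based weighting for two-stage cluster sampling is omitted:

```python
import math

# MD, MAD, RMSE and CORR between mapped and reference class proportions,
# for an unweighted (simple random sample) set of units.
def composition_accuracy(map_p, ref_p):
    n = len(map_p)
    d = [m - r for m, r in zip(map_p, ref_p)]
    md = sum(d) / n
    mad = sum(abs(x) for x in d) / n
    rmse = math.sqrt(sum(x*x for x in d) / n)
    mm, mr = sum(map_p)/n, sum(ref_p)/n
    cov = sum((m-mm)*(r-mr) for m, r in zip(map_p, ref_p)) / n
    sm = math.sqrt(sum((m-mm)**2 for m in map_p)/n)
    sr = math.sqrt(sum((r-mr)**2 for r in ref_p)/n)
    corr = cov/(sm*sr) if sm*sr > 0 else float('nan')
    return md, mad, rmse, corr

md, mad, rmse, corr = composition_accuracy([0.2, 0.5, 0.8], [0.25, 0.45, 0.75])
```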
Estimating carnivoran diets using a combination of carcass observations and scats from GPS clusters
Tambling, C.J.; Laurence, S.D.; Bellan, S.E.; Cameron, E.Z.; du Toit, J.T.; Getz, W.M.
2011-01-01
Scat analysis is one of the most frequently used methods to assess carnivoran diets and Global Positioning System (GPS) cluster methods are increasingly being used to locate feeding sites for large carnivorans. However, both methods have inherent biases that limit their use. GPS methods to locate kill sites are biased towards large carcasses, while scat analysis over-estimates the biomass consumed from smaller prey. We combined carcass observations and scats collected along known movement routes, assessed using GPS data from four African lion (Panthera leo) prides in the Kruger National Park, South Africa, to determine how a combination of these two datasets changes diet estimates. As expected, using carcasses alone under-estimated the number of feeding events on small species, primarily impala (Aepyceros melampus) and warthog (Phacochoerus africanus), in our case by more than 50%, and thus significantly under-estimated the biomass consumed per pride per day relative to estimates based on the combined datasets. We show that an approach that supplements carcass observations with scats, enabling the identification of potentially missed feeding events, increases the estimates of food intake rates for large carnivorans, with possible ramifications for predator-prey interaction studies dealing with biomass intake rate. PMID:22408290
Performance Analysis of an Improved MUSIC DoA Estimator
NASA Astrophysics Data System (ADS)
Vallet, Pascal; Mestre, Xavier; Loubaton, Philippe
2015-12-01
This paper addresses the statistical performance of subspace DoA estimation using a sensor array, in the asymptotic regime where the number of samples and the number of sensors both converge to infinity at the same rate. Improved subspace DoA estimators (termed G-MUSIC) were derived in previous works and were shown to be consistent and asymptotically Gaussian distributed in the case where the number of sources and their DoAs remain fixed. In this case, which models widely spaced DoA scenarios, the present paper proves that the traditional MUSIC method also provides consistent DoA estimates, with the same asymptotic variances as the G-MUSIC estimates. The case of DoAs spaced on the order of a beamwidth, which models closely spaced sources, is also considered. It is shown that G-MUSIC estimates are still able to consistently separate the sources, while this is no longer the case for MUSIC. The asymptotic variances of the G-MUSIC estimates are also evaluated.
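For concreteness, a minimal classical MUSIC estimator (the baseline the paper analyzes, not G-MUSIC) can be sketched for a half-wavelength uniform linear array; the scenario below (10 sensors, 2 sources, 200 snapshots, noise level) is invented:

```python
import numpy as np

# Classical MUSIC: project a steering-vector grid onto the noise subspace of
# the sample covariance and pick the peaks of the pseudospectrum.
rng = np.random.default_rng(0)
M, N, K = 10, 200, 2
doas = np.deg2rad([-10.0, 20.0])

def steering(theta, M):
    return np.exp(1j*np.pi*np.arange(M)[:, None]*np.sin(theta))

A = steering(doas, M)                                          # M x K
S = (rng.standard_normal((K, N)) + 1j*rng.standard_normal((K, N)))/np.sqrt(2)
noise = 0.1*(rng.standard_normal((M, N)) + 1j*rng.standard_normal((M, N)))/np.sqrt(2)
X = A @ S + noise

R = X @ X.conj().T / N                                         # sample covariance
w, V = np.linalg.eigh(R)                                       # ascending eigenvalues
En = V[:, :M-K]                                                # noise subspace

grid = np.deg2rad(np.linspace(-90, 90, 3601))
P = 1.0/np.sum(np.abs(En.conj().T @ steering(grid, M))**2, axis=0)
peaks = np.where((P[1:-1] > P[:-2]) & (P[1:-1] > P[2:]))[0] + 1  # local maxima
est = np.sort(np.rad2deg(grid[peaks[np.argsort(P[peaks])[-K:]]]))
```

At this SNR and snapshot count the two peaks land within a fraction of a degree of the true DoAs; the regime studied in the paper is the harder one where M and N grow together.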
NASA Astrophysics Data System (ADS)
Schyska, Bruno U.; Couto, António; von Bremen, Lueder; Estanqueiro, Ana; Heinemann, Detlev
2017-05-01
Europe is facing the challenge of increasing shares of energy from variable renewable sources. Furthermore, it is heading towards a fully integrated electricity market, i.e. a Europe-wide electricity system. The stable operation of this large-scale renewable power system requires detailed information on the amount of electricity being transmitted now and in the future. To estimate the actual amount of electricity, upscaling algorithms are applied. Until now, however, such algorithms exist only for smaller regions (e.g. transmission zones and single wind farms). The aim of this study is to introduce a new approach to estimating Europe-wide wind power generation based on spatio-temporal clustering. We furthermore show that training the upscaling model for different prevailing weather situations makes it possible to further reduce the number of reference sites without losing accuracy.
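The basic upscaling idea can be sketched with a capacity-factor rescaling; the spatio-temporal clustering and weather-regime training that the study actually introduces are simplified away, and all numbers are illustrative:

```python
# Upscale generation measured at a few reference wind farms to a whole region
# by the ratio of total installed capacity to reference capacity.
def upscale(ref_gen_mw, ref_cap_mw, total_cap_mw):
    capacity_factor = sum(ref_gen_mw) / sum(ref_cap_mw)
    return capacity_factor * total_cap_mw

regional = upscale(ref_gen_mw=[12.0, 30.0, 18.0],
                   ref_cap_mw=[40.0, 80.0, 60.0],
                   total_cap_mw=5000.0)
```

In practice this rescaling would be applied per cluster of sites with similar wind regimes, which is where the clustering step enters.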
The ALHAMBRA survey: Estimation of the clustering signal encoded in the cosmic variance
NASA Astrophysics Data System (ADS)
López-Sanjuan, C.; Cenarro, A. J.; Hernández-Monteagudo, C.; Arnalte-Mur, P.; Varela, J.; Viironen, K.; Fernández-Soto, A.; Martínez, V. J.; Alfaro, E.; Ascaso, B.; del Olmo, A.; Díaz-García, L. A.; Hurtado-Gil, Ll.; Moles, M.; Molino, A.; Perea, J.; Pović, M.; Aguerri, J. A. L.; Aparicio-Villegas, T.; Benítez, N.; Broadhurst, T.; Cabrera-Caño, J.; Castander, F. J.; Cepa, J.; Cerviño, M.; Cristóbal-Hornillos, D.; González Delgado, R. M.; Husillos, C.; Infante, L.; Márquez, I.; Masegosa, J.; Prada, F.; Quintana, J. M.
2015-10-01
Aims: The relative cosmic variance (σv) is a fundamental source of uncertainty in pencil-beam surveys and, as a particular case of count-in-cell statistics, can be used to estimate the bias between galaxies and their underlying dark-matter distribution. Our goal is to test the significance of the clustering information encoded in the σv measured in the ALHAMBRA survey. Methods: We measure the cosmic variance of several galaxy populations selected with B-band luminosity at 0.35 ≤ z< 1.05 as the intrinsic dispersion in the number density distribution derived from the 48 ALHAMBRA subfields. We compare the observational σv with the cosmic variance of the dark matter expected from the theory, σv,dm. This provides an estimation of the galaxy bias b. Results: The galaxy bias from the cosmic variance is in excellent agreement with the bias estimated by two-point correlation function analysis in ALHAMBRA. This holds for different redshift bins, for red and blue subsamples, and for several B-band luminosity selections. We find that b increases with the B-band luminosity and the redshift, as expected from previous work. Moreover, red galaxies have a larger bias than blue galaxies, with a relative bias of brel = 1.4 ± 0.2. Conclusions: Our results demonstrate that the cosmic variance measured in ALHAMBRA is due to the clustering of galaxies and can be used to characterise the σv affecting pencil-beam surveys. In addition, it can also be used to estimate the galaxy bias b from a method independent of correlation functions. Based on observations collected at the German-Spanish Astronomical Center, Calar Alto, jointly operated by the Max-Planck-Institut für Astronomie (MPIA) at Heidelberg and the Instituto de Astrofísica de Andalucía (CSIC).
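The bias estimate from count-in-cell dispersion can be illustrated with toy numbers. In this hedged sketch the subfield counts, the shot-noise subtraction, and the theoretical dark-matter value σv,dm are all assumptions, not ALHAMBRA measurements:

```python
import math

# Relative cosmic variance sigma_v as the intrinsic (shot-noise-subtracted)
# dispersion of counts across subfields; the bias then follows as
# b = sigma_v(galaxies) / sigma_v(dark matter).
counts = [118, 95, 132, 104, 87, 141, 99, 126]      # galaxies per subfield (invented)
mean_n = sum(counts)/len(counts)
var_tot = sum((c - mean_n)**2 for c in counts)/(len(counts) - 1)
sigma_v_gal = math.sqrt(max(var_tot - mean_n, 0.0))/mean_n   # subtract Poisson term
sigma_v_dm = 0.05                                   # theory value, assumed here
bias = sigma_v_gal/sigma_v_dm
```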
An improved approximate-Bayesian model-choice method for estimating shared evolutionary history
2014-01-01
Background: To understand biological diversification, it is important to account for large-scale processes that affect the evolutionary history of groups of co-distributed populations of organisms. Such events predict temporally clustered divergence times, a pattern that can be estimated using genetic data from co-distributed species. I introduce a new approximate-Bayesian method for comparative phylogeographical model-choice that estimates the temporal distribution of divergences across taxa from multi-locus DNA sequence data. The model is an extension of that implemented in msBayes. Results: By reparameterizing the model, introducing more flexible priors on demographic and divergence-time parameters, and implementing a non-parametric Dirichlet-process prior over divergence models, I improved the robustness, accuracy, and power of the method for estimating shared evolutionary history across taxa. Conclusions: The results demonstrate that the improved performance of the new method is due to (1) more appropriate priors on divergence-time and demographic parameters that avoid prohibitively small marginal likelihoods for models with more divergence events, and (2) the Dirichlet process providing a flexible prior on divergence histories that does not strongly disfavor models with intermediate numbers of divergence events. The new method yields more robust estimates of posterior uncertainty, and thus greatly reduces the tendency to incorrectly estimate models of shared evolutionary history with strong support. PMID:24992937
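The Dirichlet-process prior over divergence models can be illustrated by its equivalent Chinese-restaurant-process draw, in which taxa join existing divergence-time clusters or open new ones; the taxon count and concentration parameter below are invented:

```python
import random

# Chinese restaurant process: taxon i joins an existing divergence-time
# cluster with probability proportional to its size, or opens a new one
# with probability proportional to alpha.
def crp_partition(n_taxa, alpha, rng):
    tables = []                       # sizes of divergence-time clusters
    for i in range(n_taxa):
        r = rng.random() * (i + alpha)
        acc = 0.0
        for k, size in enumerate(tables):
            acc += size
            if r < acc:
                tables[k] += 1
                break
        else:
            tables.append(1)          # new divergence event
    return tables

rng = random.Random(4)
n_events = [len(crp_partition(8, alpha=1.5, rng=rng)) for _ in range(1000)]
avg = sum(n_events)/len(n_events)
```

Unlike a uniform prior over the number of events, this prior spreads mass across intermediate event counts, which is the behavior the paper exploits.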
An Accurate Link Correlation Estimator for Improving Wireless Protocol Performance
Zhao, Zhiwei; Xu, Xianghua; Dong, Wei; Bu, Jiajun
2015-01-01
Wireless link correlation has shown significant impact on the performance of various sensor network protocols. Many works have been devoted to exploiting link correlation for protocol improvements. However, the effectiveness of these designs heavily relies on the accuracy of link correlation measurement. In this paper, we investigate state-of-the-art link correlation measurement and analyze the limitations of existing works. We then propose a novel lightweight and accurate link correlation estimation (LACE) approach based on the reasoning of link correlation formation. LACE combines both long-term and short-term link behaviors for link correlation estimation. We implement LACE as a stand-alone interface in TinyOS and incorporate it into both routing and flooding protocols. Simulation and testbed results show that LACE: (1) achieves more accurate and lightweight link correlation measurements than the state-of-the-art work; and (2) greatly improves the performance of protocols exploiting link correlation. PMID:25686314
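A basic link-correlation measure (not LACE itself) can be sketched as the gap between conditional and unconditional reception ratios over synchronized packet traces; the toy reception bitmaps are invented:

```python
# Conditional probability that link B receives a packet given link A did,
# minus B's unconditional reception ratio; > 0 means positively correlated.
def link_correlation(a, b):
    both = sum(1 for x, y in zip(a, b) if x and y)
    a_ok = sum(a)
    p_b_given_a = both / a_ok if a_ok else 0.0
    p_b = sum(b) / len(b)
    return p_b_given_a - p_b

a = [1, 1, 0, 1, 0, 1, 1, 0]   # link A reception bitmap (toy)
b = [1, 1, 0, 1, 0, 0, 1, 0]   # link B reception bitmap (toy)
corr = link_correlation(a, b)
```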
Improving Cluster Analysis with Automatic Variable Selection Based on Trees
2014-12-01
Estimating Ω from galaxy redshifts: Linear flow distortions and nonlinear clustering
Bromley, B.C.; Warren, M.S.; Zurek, W.H.
1997-02-01
We propose a method to determine the cosmic mass density Ω from redshift-space distortions induced by large-scale flows in the presence of nonlinear clustering. Nonlinear structures in redshift space, such as fingers of God, can contaminate distortions from linear flows on scales as large as several times the small-scale pairwise velocity dispersion σ_v. Following Peacock & Dodds, we work in the Fourier domain and propose a model to describe the anisotropy in the redshift-space power spectrum; tests with high-resolution numerical data demonstrate that the model is robust for both mass and biased galaxy halos on translinear scales and above. On the basis of this model, we propose an estimator of the linear growth parameter β = Ω^0.6/b, where b measures bias, derived from sampling functions that are tuned to eliminate distortions from nonlinear clustering. The measure is tested on the numerical data and found to recover the true value of β to within ~10%. An analysis of IRAS 1.2 Jy galaxies yields β = 0.8 (+0.4, −0.3) at a scale of 1000 km s^−1, which is close to optimal given the shot noise and finite size of the survey. This measurement is consistent with dynamical estimates of β derived from both real-space and redshift-space information. The importance of the method presented here is that nonlinear clustering effects are removed to enable linear correlation anisotropy measurements on scales approaching the translinear regime. We discuss implications for analyses of forthcoming optical redshift surveys in which the dispersion is more than a factor of 2 greater than in the IRAS data. © 1997 The American Astronomical Society
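The linear (Kaiser) limit underlying such β estimators can be sketched: Ps(k, μ) = (1 + βμ²)² Pr(k), so β can be recovered from a measured quadrupole-to-monopole ratio. This hedged example omits the paper's treatment of nonlinear finger-of-God contamination:

```python
# In the Kaiser limit the quadrupole-to-monopole ratio of the redshift-space
# power spectrum depends only on beta, so beta can be inverted numerically.
def quad_to_mono(beta):
    return (4*beta/3 + 4*beta**2/7) / (1 + 2*beta/3 + beta**2/5)

def solve_beta(ratio, lo=0.0, hi=3.0, tol=1e-10):
    # the ratio is monotone in beta on [0, 3], so bisection suffices
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if quad_to_mono(mid) < ratio:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

beta = solve_beta(quad_to_mono(0.8))    # round-trips to 0.8
omega = (beta*1.0)**(1/0.6)             # Omega = (beta*b)^(1/0.6), bias b=1 assumed
```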
Improving Empirical Approaches to Estimating Local Greenhouse Gas Emissions
NASA Astrophysics Data System (ADS)
Blackhurst, M.; Azevedo, I. L.; Lattanzi, A.
2016-12-01
Evidence increasingly indicates our changing climate will have significant global impacts on public health, economies, and ecosystems. As a result, local governments have become increasingly interested in climate change mitigation. In the U.S., cities and counties representing nearly 15% of the domestic population plan to reduce 300 million metric tons of greenhouse gases over the next 40 years (or approximately 1 ton per capita). Local governments estimate greenhouse gas emissions to establish greenhouse gas mitigation goals and select supporting mitigation measures. However, current practices produce greenhouse gas estimates - also known as a "greenhouse gas inventory" - of empirical quality often insufficient for robust mitigation decision making. Namely, current mitigation planning uses sporadic, annual, and deterministic estimates disaggregated by broad end-use sector, obscuring sources of emissions uncertainty, variability, and exogeneity that influence mitigation opportunities. As part of AGU's Thriving Earth Exchange, Ari Lattanzi of the City of Pittsburgh, PA, recently partnered with Dr. Inez Lima Azevedo (Carnegie Mellon University) and Dr. Michael Blackhurst (University of Pittsburgh) to improve the empirical approach to characterizing Pittsburgh's greenhouse gas emissions. The project will produce first-order estimates of the underlying sources of uncertainty, variability, and exogeneity influencing Pittsburgh's greenhouse gases and discuss implications for mitigation decision making. The results of the project will enable local governments to collect more robust greenhouse gas inventories to better support their mitigation goals and improve measurement and verification efforts.
Increasing fMRI sampling rate improves Granger causality estimates.
Lin, Fa-Hsuan; Ahveninen, Jyrki; Raij, Tommi; Witzel, Thomas; Chu, Ying-Hua; Jääskeläinen, Iiro P; Tsai, Kevin Wen-Kai; Kuo, Wen-Jui; Belliveau, John W
2014-01-01
Estimation of causal interactions between brain areas is necessary for elucidating large-scale functional brain networks underlying behavior and cognition. Granger causality analysis of time series data can quantitatively estimate directional information flow between brain regions. Here, we show that such estimates are significantly improved when the temporal sampling rate of functional magnetic resonance imaging (fMRI) is increased 20-fold. Specifically, healthy volunteers performed a simple visuomotor task during blood oxygenation level dependent (BOLD) contrast based whole-head inverse imaging (InI). Granger causality analysis based on raw InI BOLD data sampled at 100-ms resolution detected the expected causal relations, whereas when the data were downsampled to the temporal resolution of 2 s typically used in echo-planar fMRI, the causality could not be detected. An additional control analysis, in which we SINC interpolated additional data points to the downsampled time series at 0.1-s intervals, confirmed that the improvements achieved with the real InI data were not explainable by the increased time-series length alone. We therefore conclude that the high-temporal resolution of InI improves the Granger causality connectivity analysis of the human brain.
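A bivariate Granger test of the kind used here can be sketched with ordinary least squares: compare the residuals of an autoregression of y with and without x's past. This toy example (lag order 1, synthetic data) is not the InI analysis itself:

```python
import numpy as np

# Synthetic system in which x drives y with a one-sample lag; the F-statistic
# compares the restricted model y_t ~ y_{t-1} against the full model that
# also includes x_{t-1}.
rng = np.random.default_rng(1)
n = 2000
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5*y[t-1] + 0.4*x[t-1] + 0.1*rng.standard_normal()

Y = y[1:]
Xr = np.column_stack([np.ones(n-1), y[:-1]])           # restricted
Xf = np.column_stack([np.ones(n-1), y[:-1], x[:-1]])   # full
rss_r = np.sum((Y - Xr @ np.linalg.lstsq(Xr, Y, rcond=None)[0])**2)
rss_f = np.sum((Y - Xf @ np.linalg.lstsq(Xf, Y, rcond=None)[0])**2)
F = (rss_r - rss_f) / (rss_f / (n - 1 - 3))   # one extra parameter in the full model
granger_xy = F > 4.0                           # crude threshold: x Granger-causes y
```

Coarser temporal sampling blurs the one-sample lag across many samples, which is the intuition for why the 100 ms InI data recover causality that 2 s EPI data miss.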
Improvement of Source Number Estimation Method for Single Channel Signal
Du, Bolun; He, Yunze
2016-01-01
Source number estimation methods for single channel signals have been investigated and improvements for each method are suggested in this work. Firstly, the single channel data is converted to multi-channel form by a delay process. Then, algorithms used in array signal processing, such as Gerschgorin’s disk estimation (GDE) and minimum description length (MDL), are introduced to estimate the source number of the received signal. Previous results have shown that MDL, based on information theoretic criteria (ITC), outperforms GDE at low SNR; however, it cannot handle signals containing colored noise. On the contrary, the GDE method can eliminate the influence of colored noise, but its performance at low SNR is not satisfactory. To resolve this trade-off, this work makes substantial improvements to both methods. A diagonal loading technique is employed to ameliorate the MDL method, and a jackknife technique is applied to optimize the data covariance matrix in order to improve the performance of the GDE method. Simulation results illustrate that the performance of both original methods is improved considerably. PMID:27736959
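The baseline MDL criterion operates on the eigenvalues of the data covariance matrix; a hedged sketch follows (the paper's diagonal-loading and jackknife refinements are not reproduced, and the array scenario is invented):

```python
import numpy as np

# Wax-Kailath MDL: for each candidate source count k, score the spread of the
# smallest M-k eigenvalues (geometric vs arithmetic mean) plus a complexity
# penalty, and pick the minimizer.
def mdl_source_number(eigvals, N):
    lam = np.sort(eigvals)[::-1]
    M = len(lam)
    scores = []
    for k in range(M):
        tail = lam[k:]
        geo = np.exp(np.mean(np.log(tail)))
        ari = np.mean(tail)
        scores.append(-N*(M - k)*np.log(geo/ari) + 0.5*k*(2*M - k)*np.log(N))
    return int(np.argmin(scores))

rng = np.random.default_rng(2)
M, N, K = 8, 500, 2
A = np.exp(1j*np.pi*np.outer(np.arange(M), np.sin(np.deg2rad([-5.0, 15.0]))))
X = A @ (rng.standard_normal((K, N)) + 1j*rng.standard_normal((K, N)))
X += 0.3*(rng.standard_normal((M, N)) + 1j*rng.standard_normal((M, N)))
R = X @ X.conj().T / N
k_hat = mdl_source_number(np.linalg.eigvalsh(R).real, N)
```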
Improving the estimation of the tuberculosis burden in India
Cowling, Krycia; Dandona, Rakhi
2014-01-01
Although India is considered to be the country with the greatest tuberculosis burden, estimates of the disease’s incidence, prevalence and mortality in India rely on sparse data with substantial uncertainty. The relevant available data are less reliable than those from countries that have recently improved systems for case reporting or recently invested in national surveys of tuberculosis prevalence. We explored ways to improve the estimation of the tuberculosis burden in India. We focused on case notification data – among the most reliable data available – and ways to investigate the associated level of underreporting, as well as the need for a national tuberculosis prevalence survey. We discuss several recent developments – i.e. changes in national policies relating to tuberculosis, World Health Organization guidelines for the investigation of the disease, and a rapid diagnostic test – that should improve data collection for the estimation of the tuberculosis burden in India and elsewhere. We recommend the implementation of an inventory study in India to assess the underreporting of tuberculosis cases, as well as a national survey of tuberculosis prevalence. A national assessment of drug resistance in Indian strains of Mycobacterium tuberculosis should also be considered. The results of such studies will be vital for the accurate monitoring of tuberculosis control efforts in India and globally. PMID:25378743
An improved K-means clustering algorithm in agricultural image segmentation
NASA Astrophysics Data System (ADS)
Cheng, Huifeng; Peng, Hui; Liu, Shanmei
Image segmentation is the first important step in image analysis and image processing. In this paper, based on the characteristics of color crop images, we first transform the image color space from RGB to HSI, and then select proper initial cluster centers and the cluster number using a mean-variance approach and rough set theory, followed by clustering calculation, so as to automatically and rapidly segment color components and accurately extract target objects from the background. This provides a reliable basis for the identification, analysis, follow-up calculation, and processing of crop images. Experimental results demonstrate that the improved k-means clustering algorithm reduces computation and enhances the precision and accuracy of clustering.
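The two ingredients, an RGB-to-HSI conversion and a k-means pass, can be sketched as follows; the mean-variance/rough-set initialization is replaced by fixed seed centers, and the toy pixels are invented:

```python
import numpy as np

# Standard geometric RGB -> HSI conversion (hue in radians) followed by a
# plain 1-D k-means on the hue channel.
def rgb_to_hsi(rgb):
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    i = (r + g + b) / 3.0
    s = 1.0 - np.minimum(np.minimum(r, g), b) / np.maximum(i, 1e-12)
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g)**2 + (r - b)*(g - b)) + 1e-12
    h = np.arccos(np.clip(num/den, -1, 1))
    h = np.where(b > g, 2*np.pi - h, h)
    return h, s, i

def kmeans1d(x, centers, iters=20):
    c = np.asarray(centers, float)
    for _ in range(iters):
        lab = np.argmin(np.abs(x[:, None] - c[None, :]), axis=1)
        c = np.array([x[lab == k].mean() if np.any(lab == k) else c[k]
                      for k in range(len(c))])
    return lab, c

pixels = np.array([[0.9, 0.1, 0.1], [0.8, 0.2, 0.1],   # reddish (background-like)
                   [0.1, 0.8, 0.2], [0.2, 0.9, 0.1]])  # greenish (crop-like)
h, s, i = rgb_to_hsi(pixels)
labels, centers = kmeans1d(h, [0.5, 2.0])
```

Clustering on hue rather than raw RGB is what makes crop/background separation insensitive to illumination, which is the motivation for the HSI transform.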
Improving estimates of tree mortality probability using potential growth rate
Das, Adrian J.; Stephenson, Nathan L.
2015-01-01
Tree growth rate is frequently used to estimate mortality probability. Yet, growth metrics can vary in form, and the justification for using one over another is rarely clear. We tested whether a growth index (GI) that scales the realized diameter growth rate against the potential diameter growth rate (PDGR) would give better estimates of mortality probability than other measures. We also tested whether PDGR, being a function of tree size, might better correlate with the baseline mortality probability than direct measurements of size such as diameter or basal area. Using a long-term dataset from the Sierra Nevada, California, U.S.A., as well as existing species-specific estimates of PDGR, we developed growth–mortality models for four common species. For three of the four species, models that included GI, PDGR, or a combination of GI and PDGR were substantially better than models without them. For the fourth species, the models including GI and PDGR performed roughly as well as a model that included only the diameter growth rate. Our results suggest that using PDGR can improve our ability to estimate tree survival probability. However, in the absence of PDGR estimates, the diameter growth rate was the best empirical predictor of mortality, in contrast to assumptions often made in the literature.
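The growth index and a logistic growth-mortality model of the kind compared in the paper can be sketched; the PDGR value and regression coefficients below are invented, not the paper's species-specific estimates:

```python
import math

# GI scales realized diameter growth against the potential rate for a tree
# of that size; a logistic model then maps GI to mortality probability.
def growth_index(dgr, pdgr):
    return dgr / pdgr

def mortality_prob(gi, b0=-1.0, b1=-3.0):
    return 1.0 / (1.0 + math.exp(-(b0 + b1*gi)))

p_slow = mortality_prob(growth_index(0.05, 0.50))   # growing at 10% of potential
p_fast = mortality_prob(growth_index(0.45, 0.50))   # growing near potential
```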
Can modeling improve estimation of desert tortoise population densities?
Nussear, K.E.; Tracy, C.R.
2007-01-01
The federally listed desert tortoise (Gopherus agassizii) is currently monitored using distance sampling to estimate population densities. Distance sampling, as with many other techniques for estimating population density, assumes that it is possible to quantify the proportion of animals available to be counted in any census. Because desert tortoises spend much of their life in burrows, and the proportion of tortoises in burrows at any time can be extremely variable, this assumption is difficult to meet. This proportion of animals available to be counted is used as a correction factor (g0) in distance sampling and has been estimated from daily censuses of small populations of tortoises (6-12 individuals). These censuses are costly and produce imprecise estimates of g0 due to small sample sizes. We used data on tortoise activity from a large (N = 150) experimental population to model activity as a function of the biophysical attributes of the environment, but these models did not improve the precision of estimates from the focal populations. Thus, to evaluate how much of the variance in tortoise activity is apparently not predictable, we assessed whether activity on any particular day can predict activity on subsequent days with essentially identical environmental conditions. Tortoise activity was only weakly correlated on consecutive days, indicating that behavior was not repeatable or consistent among days with similar physical environments. © 2007 by the Ecological Society of America.
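How the availability correction g0 enters a density estimate can be shown with toy arithmetic; the survey numbers are invented and the full distance-sampling detection function is omitted:

```python
# Animals counted are divided by the fraction g0 that was available
# (above ground) to be counted at all; an imprecise g0 propagates
# directly into the density estimate.
def corrected_density(n_detected, surveyed_area_km2, g0):
    return n_detected / (surveyed_area_km2 * g0)

d = corrected_density(n_detected=36, surveyed_area_km2=12.0, g0=0.6)  # tortoises/km2
```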
Improving variance estimation in Monte Carlo eigenvalue simulations
Jin, Lei; Banerjee, Kaushik; Hamilton, Steven P.; ...
2017-07-27
Monte Carlo (MC) methods have been widely used to solve eigenvalue problems in complex nuclear systems. Once a stationary fission source is obtained in MC simulations, the sample mean of many stationary cycles is calculated. Variance or standard deviation of the sample mean is needed to indicate the level of statistical uncertainty of the simulation and to understand the convergence of the sample mean. Current MC codes typically use sample variance to estimate the statistical uncertainty of the simulation and assume that the MC stationary cycles are independent. But, there is a correlation between these cycles, and estimators of the variance that ignore these correlations will systematically underestimate the variance. Our paper discusses some statistical properties of the sample mean and the asymptotic variance and introduces two novel estimators based on (a) covariance-adjusted methods and (b) bootstrap methods to reduce the variance underestimation. For three test problems, it has been observed that both new methods can improve the estimation of the standard deviation of the sample mean by more than an order of magnitude. Additionally, there are some interesting patterns revealed for these estimates over the spatial regions, providing additional insights into MC simulations for nuclear systems. These new methodologies are based on post-processing the tally results and are therefore easy to implement and code agnostic.
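The underestimation can be demonstrated directly: autocorrelated cycle tallies (an AR(1) series here as a stand-in for real tallies) make the naive s²/n estimate of the sample-mean variance too small, while a moving-block bootstrap (one possible correction, related to but not identical to the paper's estimators) recovers a more honest value:

```python
import numpy as np

# AR(1) "cycle tallies" with unit marginal variance and strong correlation.
rng = np.random.default_rng(3)
n, rho = 4000, 0.8
e = rng.standard_normal(n)
x = np.empty(n)
x[0] = e[0]
for t in range(1, n):
    x[t] = rho*x[t-1] + np.sqrt(1 - rho**2)*e[t]

naive_var = x.var(ddof=1)/n          # assumes independent cycles: too small

def block_bootstrap_var(x, block=50, reps=500, rng=rng):
    # resample contiguous blocks to preserve within-block correlation
    n = len(x)
    means = []
    for _ in range(reps):
        starts = rng.integers(0, n - block, size=n//block)
        sample = np.concatenate([x[s:s+block] for s in starts])
        means.append(sample.mean())
    return np.var(means, ddof=1)

boot_var = block_bootstrap_var(x)
# for AR(1) the true variance of the mean is about (1+rho)/(1-rho) = 9x naive
```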
IPEG- IMPROVED PRICE ESTIMATION GUIDELINES (IBM 370 VERSION)
NASA Technical Reports Server (NTRS)
Chamberlain, R. G.
1994-01-01
The Improved Price Estimation Guidelines, IPEG, program provides a simple yet accurate estimate of the price of a manufactured product. IPEG facilitates sensitivity studies of price estimates at considerably less expense than would be incurred by using the Standard Assembly-line Manufacturing Industry Simulation, SAMIS, program (COSMIC program NPO-16032). A difference of less than one percent between the IPEG and SAMIS price estimates has been observed with realistic test cases. However, the IPEG simplification of SAMIS allows the analyst with limited time and computing resources to perform a greater number of sensitivity studies than with SAMIS. Although IPEG was developed for the photovoltaics industry, it is readily adaptable to any standard assembly line type of manufacturing industry. IPEG estimates the annual production price per unit. The input data includes cost of equipment, space, labor, materials, supplies, and utilities. Production on an industry wide basis or a process wide basis can be simulated. Once the IPEG input file is prepared, the original price is estimated and sensitivity studies may be performed. The IPEG user selects a sensitivity variable and a set of values. IPEG will compute a price estimate and a variety of other cost parameters for every specified value of the sensitivity variable. IPEG is designed as an interactive system and prompts the user for all required information and offers a variety of output options. The IPEG/PC program is written in TURBO PASCAL for interactive execution on an IBM PC computer under DOS 2.0 or above with at least 64K of memory. The IBM PC color display and color graphics adapter are needed to use the plotting capabilities in IPEG/PC. IPEG/PC was developed in 1984. The original IPEG program is written in SIMSCRIPT II.5 for interactive execution and has been implemented on an IBM 370 series computer with a central memory requirement of approximately 300K of 8 bit bytes. The original IPEG was developed in 1980.
IPEG- IMPROVED PRICE ESTIMATION GUIDELINES (IBM PC VERSION)
NASA Technical Reports Server (NTRS)
Aster, R. W.
1994-01-01
The Improved Price Estimation Guidelines, IPEG, program provides a simple yet accurate estimate of the price of a manufactured product. IPEG facilitates sensitivity studies of price estimates at considerably less expense than would be incurred by using the Standard Assembly-line Manufacturing Industry Simulation, SAMIS, program (COSMIC program NPO-16032). A difference of less than one percent between the IPEG and SAMIS price estimates has been observed with realistic test cases. However, the IPEG simplification of SAMIS allows the analyst with limited time and computing resources to perform a greater number of sensitivity studies than with SAMIS. Although IPEG was developed for the photovoltaics industry, it is readily adaptable to any standard assembly line type of manufacturing industry. IPEG estimates the annual production price per unit. The input data includes cost of equipment, space, labor, materials, supplies, and utilities. Production on an industry wide basis or a process wide basis can be simulated. Once the IPEG input file is prepared, the original price is estimated and sensitivity studies may be performed. The IPEG user selects a sensitivity variable and a set of values. IPEG will compute a price estimate and a variety of other cost parameters for every specified value of the sensitivity variable. IPEG is designed as an interactive system and prompts the user for all required information and offers a variety of output options. The IPEG/PC program is written in TURBO PASCAL for interactive execution on an IBM PC computer under DOS 2.0 or above with at least 64K of memory. The IBM PC color display and color graphics adapter are needed to use the plotting capabilities in IPEG/PC. IPEG/PC was developed in 1984. The original IPEG program is written in SIMSCRIPT II.5 for interactive execution and has been implemented on an IBM 370 series computer with a central memory requirement of approximately 300K of 8 bit bytes. The original IPEG was developed in 1980.
Improving Estimated Optical Constants With MSTM and DDSCAT Modeling
NASA Astrophysics Data System (ADS)
Pitman, K. M.; Wolff, M. J.
2015-12-01
We present numerical experiments to determine quantitatively the effects of mineral particle clustering on Mars spacecraft spectral signatures and to improve upon the values of refractive indices (optical constants n, k) derived from Mars dust laboratory analog spectra such as those from RELAB and MRO CRISM libraries. Whereas spectral properties for Mars analog minerals and actual Mars soil are dominated by aggregates of particles smaller than the size of martian atmospheric dust, the analytic radiative transfer (RT) solutions used to interpret planetary surfaces assume that individual, well-separated particles dominate the spectral signature. Both in RT models and in the refractive index derivation methods that include analytic RT approximations, spheres are also over-used to represent nonspherical particles. Part of the motivation is that the integrated effect over randomly oriented particles on quantities such as single scattering albedo and phase function are relatively less than for single particles. However, we have seen in previous numerical experiments that when varying the shape and size of individual grains within a cluster, the phase function changes in both magnitude and slope, thus the "relatively less" effect is more significant than one might think. Here we examine the wavelength dependence of the forward scattering parameter with multisphere T-matrix (MSTM) and discrete dipole approximation (DDSCAT) codes that compute light scattering by layers of particles on planetary surfaces to see how albedo is affected and integrate our model results into refractive index calculations to remove uncertainties in approximations and parameters that can lower the accuracy of optical constants. By correcting the single scattering albedo and phase function terms in the refractive index determinations, our data will help to improve the understanding of Mars in identifying, mapping the distributions, and quantifying abundances for these minerals and will address long
Crespi, Catherine M; Wong, Weng Kee; Wu, Sheng
2011-12-01
Power and sample size calculations for cluster randomized trials require prediction of the degree of correlation that will be realized among outcomes of participants in the same cluster. This correlation is typically quantified as the intraclass correlation coefficient (ICC), defined as the Pearson correlation between two members of the same cluster or as the proportion of the total variance attributable to variance between clusters. It is widely known but perhaps not fully appreciated that for binary outcomes, the ICC is a function of outcome prevalence. Hence, the ICC and the outcome prevalence are intrinsically related, making the ICC poorly generalizable across study conditions and between studies with different outcome prevalences. We use a simple parametrization of the ICC that aims to isolate that part of the ICC that measures dependence among responses within a cluster from the outcome prevalence. We incorporate this parametrization into sample size calculations for cluster randomized trials and compare our method to the traditional approach using the ICC. Our dependence parameter, R, may be less influenced by outcome prevalence and has an intuitive meaning that facilitates interpretation. Estimates of R from previous studies can be obtained using simple statistics. Comparison of methods showed that the traditional ICC approach to sample size determination tends to overpower studies under many scenarios, calling for more clusters than truly required. The methods are developed for equal-sized clusters, whereas cluster size may vary in practice. The dependence parameter R is an alternative measure of dependence among binary outcomes in cluster randomized trials that has a number of advantages over the ICC.
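The design-effect arithmetic behind the traditional ICC approach can be sketched in a few lines (this is the textbook formula, not the authors' R-based method; numbers are illustrative):

```python
import math

def clusters_needed(n_individual, cluster_size, icc):
    """Inflate an individually randomized sample size by the design effect
    DE = 1 + (m - 1) * ICC, then convert to clusters per arm."""
    design_effect = 1 + (cluster_size - 1) * icc
    n_clustered = n_individual * design_effect
    return math.ceil(n_clustered / cluster_size)

# 200 participants per arm needed under individual randomization;
# clusters of 20 with ICC = 0.05:
k = clusters_needed(200, 20, 0.05)
```

Since the design effect is linear in the ICC, any prevalence-driven inflation of the ICC feeds straight into the required number of clusters, which is the overpowering the abstract describes.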
Improving the quality of parameter estimates obtained from slug tests
Butler, J.J.; McElwee, C.D.; Liu, W.
1996-01-01
The slug test is one of the most commonly used field methods for obtaining in situ estimates of hydraulic conductivity. Despite its prevalence, this method has received criticism from many quarters in the ground-water community. This criticism emphasizes the poor quality of the estimated parameters, a condition that is primarily a product of the somewhat casual approach that is often employed in slug tests. Recently, the Kansas Geological Survey (KGS) has pursued research directed at improving methods for the performance and analysis of slug tests. Based on extensive theoretical and field research, a series of guidelines have been proposed that should enable the quality of parameter estimates to be improved. The most significant of these guidelines are: (1) three or more slug tests should be performed at each well during a given test period; (2) two or more different initial displacements (H0) should be used at each well during a test period; (3) the method used to initiate a test should enable the slug to be introduced in a near-instantaneous manner and should allow a good estimate of H0 to be obtained; (4) data-acquisition equipment that enables a large quantity of high quality data to be collected should be employed; (5) if an estimate of the storage parameter is needed, an observation well other than the test well should be employed; (6) the method chosen for analysis of the slug-test data should be appropriate for site conditions; (7) use of pre- and post-analysis plots should be an integral component of the analysis procedure; and (8) appropriate well construction parameters should be employed. Data from slug tests performed at a number of KGS field sites demonstrate the importance of these guidelines.
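As one example of guideline (6), choosing an analysis method appropriate for site conditions, the classic Hvorslev solution can be coded directly (a standard formula for one common well geometry; the geometry and timing values below are illustrative, not from the paper):

```python
import math

def hvorslev_K(r_casing, r_screen, screen_length, t37):
    """Hvorslev (1951) estimate for a well with L_e / R > 8:
    K = r_c^2 ln(L_e / R) / (2 L_e t_37),
    where t_37 is the time for the head to fall to 37% of H0."""
    return (r_casing ** 2 * math.log(screen_length / r_screen)
            / (2 * screen_length * t37))

# 5 cm casing radius, 10 cm screen radius, 3 m screen, t_37 = 120 s:
K = hvorslev_K(0.05, 0.10, 3.0, 120.0)  # hydraulic conductivity in m/s
```

Repeating the test with different initial displacements, per guidelines (1) and (2), lets one check that the estimated K is independent of H0, as the theory assumes.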
Bottom Ocean Topography Estimation Improvement Using Altimetry and Depth Soundings
NASA Astrophysics Data System (ADS)
Vergos, G. S.; Sideris, M. G.
The possibility of improving the estimation of bottom ocean topography using altimetry-derived gravity data and shipborne depth soundings is investigated in two extended test regions. The integrated inversion method proposed by Knudsen and based on Parker's formula, for the relationship between gravity and bathymetry, is used to estimate new local bathymetry models. Initially, only gravity data are used and then gravity and shipborne depth soundings are combined in an iterative least-squares collocation procedure to produce the new depth models. The estimated models are validated in terms of the smoothing they provide to gravity field related quantities, used for geoid and gravity field approximation. Global models and shipborne soundings are also implemented in the validation procedure to investigate the improvement that the new models offer not only to gravity field modeling but also to the estimation of the bathymetry itself. For the depth estimation, the global multi-satellite altimetry-derived KMS99 gravity field and available depth soundings from the GEODAS database are used. The validation is carried out with the global JGP95E and Sandwell and Smith bathymetry models as well as with shipborne gravity data from BGI and altimetry sea surface heights from the ERS1 and GEOSAT Geodetic Mission altimetry. It is shown that, for both test areas, the new local bathymetry models manage to smooth the gravity field data by 15-25%, which is about 50% better compared to the global ones. Additionally, the implementation of both gravity and depth data provides a more realistic representation of the real bathymetry, and reduces the differences with the depth soundings and the global digital depth models (DDMs) to only 100-200 m, compared to 600-800 m for the gravity-only solutions.
NASA Astrophysics Data System (ADS)
Cruz, S. M. A.; Marques, J. M. C.; Pereira, F. B.
2016-10-01
We propose improvements to our evolutionary algorithm (EA) [J. M. C. Marques and F. B. Pereira, J. Mol. Liq. 210, 51 (2015)] in order to avoid dissociative solutions in the global optimization of clusters with competing attractive and repulsive interactions. The improved EA outperforms the original version of the method for charged colloidal clusters in the size range 3 ≤ N ≤ 25, which is a very stringent test for global optimization algorithms. While the Bernal spiral is the global minimum for clusters in the interval 13 ≤ N ≤ 18, the lowest-energy structure is a peculiar, so-called beaded-necklace, motif for 19 ≤ N ≤ 25. We have also applied the method for larger sizes and unusual quasi-linear and branched clusters arise as low-energy structures.
An improved K-means clustering method for cDNA microarray image segmentation.
Wang, T N; Li, T J; Shao, G F; Wu, S X
2015-07-14
Microarray technology is a powerful tool for human genetic research and other biomedical applications. Numerous improvements to the standard K-means algorithm have been carried out to complete the image segmentation step. However, most of the previous studies classify the image into two clusters. In this paper, we propose a novel K-means algorithm, which first classifies the image into three clusters; one of the three clusters is then assigned as the background region and the other two as the foreground region. The proposed method was evaluated on six different data sets. The analyses of accuracy, efficiency, expression values, special gene spots, and noise images demonstrate the effectiveness of our method in improving the segmentation quality.
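A minimal sketch of the three-cluster idea on scalar pixel intensities (hand-rolled Lloyd iterations on synthetic data, standing in for the full K-means pipeline of the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans_1d(values, k=3, iters=50):
    """Minimal Lloyd's algorithm on scalar pixel intensities."""
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels, centers

# Synthetic spot-pixel intensities: dark background plus two brighter classes.
pixels = np.concatenate([rng.normal(20, 3, 600),    # background
                         rng.normal(120, 8, 250),   # weak foreground
                         rng.normal(220, 10, 150)]) # strong foreground
labels, centers = kmeans_1d(pixels, k=3)
background = labels == centers.argmin()  # darkest cluster -> background
foreground = ~background                 # remaining two clusters -> foreground
```

The point of three clusters rather than two is visible here: weak spots near the background level get their own cluster instead of being forced into a single bright/dark split.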
Improving the computational efficiency of recursive cluster elimination for gene selection.
Luo, Lin-Kai; Huang, Deng-Feng; Ye, Ling-Jun; Zhou, Qi-Feng; Shao, Gui-Fang; Peng, Hong
2011-01-01
The gene expression data are usually provided with a large number of genes and a relatively small number of samples, which brings a lot of new challenges. Selecting those informative genes becomes the main issue in microarray data analysis. Recursive cluster elimination based on support vector machine (SVM-RCE) has shown better classification accuracy on some microarray data sets than recursive feature elimination based on support vector machine (SVM-RFE). However, SVM-RCE is extremely time-consuming. In this paper, we propose an improved method of SVM-RCE called ISVM-RCE. ISVM-RCE first trains an SVM model with all clusters, then applies the infinity norm of the weight coefficient vector in each cluster to score the cluster, and finally eliminates the gene clusters with the lowest scores. In addition, ISVM-RCE eliminates genes within the clusters instead of removing a cluster of genes when the number of clusters is small. We have tested ISVM-RCE on six gene expression data sets and compared their performances with SVM-RCE and linear-discriminant-analysis-based RFE (LDA-RFE). The experiment results on these data sets show that ISVM-RCE greatly reduces the time cost of SVM-RCE while obtaining classification performance comparable to SVM-RCE, whereas LDA-RFE is not stable.
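The cluster-scoring step can be sketched as follows, with a ridge least-squares classifier standing in for the SVM (the paper scores clusters with the SVM weight vector; all data and names here are synthetic):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy expression matrix: 40 samples x 9 genes in 3 clusters of 3 genes.
# Only the genes of cluster 0 carry class signal.
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 9))
X[:, 0:3] += 3.0 * y[:, None]            # informative cluster
clusters = {0: [0, 1, 2], 1: [3, 4, 5], 2: [6, 7, 8]}

# Stand-in linear classifier: ridge-regularized least squares on +/-1 labels.
A = np.c_[X, np.ones(len(y))]
w = np.linalg.solve(A.T @ A + 1e-2 * np.eye(A.shape[1]),
                    A.T @ (2 * y - 1))[:-1]   # drop the bias term

# Score each cluster by the infinity norm of its genes' weights;
# the lowest-scoring cluster is the first candidate for elimination.
scores = {c: np.abs(w[idx]).max() for c, idx in clusters.items()}
eliminated = min(scores, key=scores.get)
kept = max(scores, key=scores.get)
```

Scoring whole clusters from a single trained model, rather than retraining per cluster, is what removes most of the cost of the original SVM-RCE loop.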
Bressington, Daniel; Chien, Wai Tong; Mui, Jolene; Lam, Kar Kei Claire; Mahfoud, Ziyad; White, Jacquie; Gray, Richard
2017-08-07
The aim of the present study was to establish the feasibility of conducting a full-scale trial and to estimate the preliminary effect of a Chinese Health Improvement Profile (CHIP) intervention on self-reported physical well-being of people with severe mental illness (SMI). The study used a parallel-group, open-label, cluster-randomized, controlled trial (RCT) design. Twelve community psychiatric nurses (CPN) and their corresponding 137 patients with SMI were randomized into the CHIP or treatment-as-usual (TAU) groups. After training, the CPN completed the CHIP at baseline and 12 months, and the findings were used to devise an individualized care plan to promote health behaviour change. Patients were assessed at baseline and 6 and 12 months after starting the intervention. There was an observed positive trend of improvement on the physical component subscale of SF12v2 in the CHIP group compared to the TAU group after 12 months, but the difference did not reach statistical significance (P = 0.138). The mental component subscale showed a similar positive trend (P = 0.077). CHIP participants were more satisfied with their physical health care than TAU patients (P = 0.009), and the CPN were positive about the usefulness/acceptability of the intervention. There were significant within-group improvements in the total numbers of physical health risks, as indicated by the CHIP items (P = 0.005). The findings suggest that it is feasible to conduct a full-scale RCT of the CHIP in future. The CHIP is an intervention that can be used within routine CPN practice, and could result in small-modest improvements in the physical well-being of people with SMI. © 2017 Australian College of Mental Health Nurses Inc.
Improved Goldstein Interferogram Filter Based on Local Fringe Frequency Estimation
Feng, Qingqing; Xu, Huaping; Wu, Zhefeng; You, Yanan; Liu, Wei; Ge, Shiqi
2016-01-01
The quality of an interferogram, which is degraded by various sources of phase noise, greatly affects the further processing steps of InSAR, such as phase unwrapping. For Interferometric SAR (InSAR) geophysical measurements, such as height or displacement, phase filtering is therefore an essential step. In this work, an improved Goldstein interferogram filter is proposed to suppress the phase noise while preserving the fringe edges. First, the proposed adaptive filter step, performed before frequency estimation, is employed to improve the estimation accuracy. Subsequently, to preserve the fringe characteristics, the estimated fringe frequency in each fixed filtering patch is removed from the original noisy phase. Then, the residual phase is smoothed based on the modified Goldstein filter with its parameter alpha dependent on both the coherence map and the residual phase frequency. Finally, the filtered residual phase and the removed fringe frequency are combined to generate the filtered interferogram, with the loss of signal minimized while reducing the noise level. The effectiveness of the proposed method is verified by experimental results based on both simulated and real data. PMID:27886081
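A single-patch sketch of the Goldstein-style spectral weighting that the proposed filter builds on (the baseline filter only, without the adaptive step, fringe-frequency removal, or coherence-dependent alpha; synthetic data):

```python
import numpy as np

def goldstein_patch(phase_patch, alpha=0.5):
    """Goldstein-style filtering of one interferogram patch:
    weight the spectrum of the complex phase by its magnitude
    raised to alpha, boosting the dominant fringe over the noise."""
    S = np.fft.fft2(np.exp(1j * phase_patch))
    weight = np.abs(S) ** alpha
    return np.angle(np.fft.ifft2(S * weight))

rng = np.random.default_rng(3)
n = 64
yy, xx = np.mgrid[0:n, 0:n]
clean = 2 * np.pi * 3 * xx / n                 # a simple fringe ramp
noisy = clean + rng.normal(0, 0.8, (n, n))     # heavy phase noise

filt = goldstein_patch(np.angle(np.exp(1j * noisy)))
# Compare wrapped-phase residuals against the clean ramp.
err_noisy = np.angle(np.exp(1j * (noisy - clean))).std()
err_filt = np.angle(np.exp(1j * (filt - clean))).std()
```

Because the weighting amplifies strong spectral peaks relative to the noise floor, the single-fringe patch comes out markedly cleaner; the paper's contribution is making this weighting adaptive and fringe-preserving.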
An Improved Method for NPP Estimation in Wuhan, China
NASA Astrophysics Data System (ADS)
Wang, L.; Feng, L.
2016-12-01
Dynamic monitoring of vegetation net primary productivity (NPP) is of great importance for better understanding the carbon cycle of terrestrial ecosystems. By analyzing the dependence of photosynthetically active radiation (PAR) on cosine of solar zenith angle and clearness index (Kt), an efficient all-sky model was introduced for estimating PAR under various sky conditions. As a key variable in NPP estimation, light use efficiencies were also improved by considering the stress factors of temperature/humidity for different types of vegetation. Seasonal and interannual variations of NPP in Wuhan, China from 2001 to 2010 were then investigated using MODIS products and ground meteorological data. The results showed that NPP increased slightly from 2001 to 2005 and decreased from 2005 to 2010; annual mean NPP was about 502 gC m⁻² a⁻¹. Significant differences in NPP values for different vegetation types were also found: evergreen broadleaf vegetation produced the highest annual NPP value of 1016.7 gC m⁻² a⁻¹, and annual grass vegetation had the lowest mean value of 448 gC m⁻² a⁻¹. This study will improve our basic understanding of carbon cycling process in the study area and the proposed model will be useful for other regional NPP estimations in the world.
Improving stochastic estimates with inference methods: Calculating matrix diagonals
NASA Astrophysics Data System (ADS)
Selig, Marco; Oppermann, Niels; Enßlin, Torsten A.
2012-02-01
Estimating the diagonal entries of a matrix that is not directly accessible but is only available as a linear operator in the form of a computer routine is a common necessity in many computational applications, especially in image reconstruction and statistical inference. Here, methods of statistical inference are used to improve the accuracy or the computational costs of matrix probing methods to estimate matrix diagonals. In particular, the generalized Wiener filter methodology, as developed within information field theory, is shown to significantly improve estimates based on only a few sampling probes, in cases in which some form of continuity of the solution can be assumed. The strength, length scale, and precise functional form of the exploited autocorrelation function of the matrix diagonal are determined from the probes themselves. The developed algorithm is successfully applied to mock and real world problems. These performance tests show that, in situations where a matrix diagonal has to be calculated from only a small number of computationally expensive probes, a speedup by a factor of 2 to 10 is possible with the proposed method.
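The matrix-probing baseline that such inference methods improve on can be sketched with Rademacher probes (a standard Bekas/Hutchinson-type estimator, not the Wiener-filter refinement of the paper):

```python
import numpy as np

rng = np.random.default_rng(4)

def estimate_diagonal(apply_A, n, n_probes=200, rng=rng):
    """Probing estimator: diag(A) ~ E[z * (A z)] for Rademacher z,
    touching A only through matrix-vector products."""
    acc = np.zeros(n)
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        acc += z * apply_A(z)
    return acc / n_probes

n = 100
A = rng.normal(size=(n, n))
A = A @ A.T / n                           # a symmetric test matrix
est = estimate_diagonal(lambda v: A @ v, n)
rel_err = np.linalg.norm(est - np.diag(A)) / np.linalg.norm(np.diag(A))
```

The error of this baseline shrinks only as one over the square root of the number of probes; exploiting continuity of the diagonal, as the paper does, is what allows comparable accuracy from far fewer expensive probes.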
Speed Profiles for Improvement of Maritime Emission Estimation.
Yau, Pui Shan; Lee, Shun-Cheng; Ho, Kin Fai
2012-12-01
Maritime emissions play an important role in anthropogenic emissions, particularly for cities with busy ports such as Hong Kong. Ship emissions are strongly dependent on vessel speed, and thus accurate vessel speed is essential for maritime emission studies. In this study, we determined minute-by-minute high-resolution speed profiles of container ships on four major routes in Hong Kong waters using the Automatic Identification System (AIS). The activity-based ship emissions of NOx, CO, HC, CO2, SO2, and PM10 were estimated using derived vessel speed profiles, and results were compared with those using the speed limits of control zones. Estimation using speed limits resulted in up to twofold overestimation of ship emissions. Compared with emissions estimated using the speed limits of control zones, emissions estimated using vessel speed profiles could provide results with up to 88% higher accuracy. Uncertainty analysis and sensitivity analysis of the model demonstrated the significance of improvement of vessel speed resolution. From spatial analysis, it is revealed that SO2 and PM10 emissions during maneuvering within 1 nautical mile from port were the highest. They contributed 7%-22% of SO2 emissions and 8%-17% of PM10 emissions of the entire voyage in Hong Kong.
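The sensitivity of activity-based emissions to vessel speed follows from the cubic propeller-law load approximation commonly used in such inventories (the MCR, emission factor, and speeds below are placeholders, not the study's values):

```python
def main_engine_emissions_g(speeds_kn, service_speed_kn, mcr_kw,
                            emission_factor_g_per_kwh, dt_h=1 / 60.0):
    """Activity-based estimate: engine load follows the propeller law
    (v / v_service)^3, summed over minute-by-minute AIS speed samples."""
    total = 0.0
    for v in speeds_kn:
        load = min((v / service_speed_kn) ** 3, 1.0)
        total += mcr_kw * load * emission_factor_g_per_kwh * dt_h
    return total

# One hour cruising at 12 kn versus assuming the 25 kn speed limit:
profile = [12.0] * 60
e_profile = main_engine_emissions_g(profile, 25.0, 20000.0, 10.0)
e_limit = main_engine_emissions_g([25.0] * 60, 25.0, 20000.0, 10.0)
```

Because load scales with the cube of speed, assuming the zone speed limit instead of the observed profile inflates the estimate dramatically, which is the overestimation the study quantifies.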
NASA Astrophysics Data System (ADS)
Brewick, Patrick T.; Smyth, Andrew W.
2016-12-01
The authors have previously shown that many traditional approaches to operational modal analysis (OMA) struggle to properly identify the modal damping ratios for bridges under traffic loading due to the interference caused by the driving frequencies of the traffic loads. This paper presents a novel methodology for modal parameter estimation in OMA that overcomes the problems presented by driving frequencies and significantly improves the damping estimates. This methodology is based on finding the power spectral density (PSD) of a given modal coordinate, and then dividing the modal PSD into separate regions, left- and right-side spectra. The modal coordinates were found using a blind source separation (BSS) algorithm and a curve-fitting technique was developed that uses optimization to find the modal parameters that best fit each side spectra of the PSD. Specifically, a pattern-search optimization method was combined with a clustering analysis algorithm and together they were employed in a series of stages in order to improve the estimates of the modal damping ratios. This method was used to estimate the damping ratios from a simulated bridge model subjected to moving traffic loads. The results of this method were compared to other established OMA methods, such as Frequency Domain Decomposition (FDD) and BSS methods, and they were found to be more accurate and more reliable, even for modes that had their PSDs distorted or altered by driving frequencies.
Improvement of the noradrenergic symptom cluster following treatment with milnacipran.
Kasper, Siegfried; Meshkat, Diana; Kutzelnigg, Alexandra
2011-01-01
Depression has a major impact on social functioning. Decreased concentration, mental and physical slowing, loss of energy, lassitude, tiredness, and reduced self-care are all symptoms related to reduced noradrenergic activity. Depressed mood; loss of interest or pleasure; sleep disturbances; and feelings of worthlessness, pessimism, and anxiety are related to reduced activity of both serotonergic and noradrenergic neurotransmission. The importance of noradrenergic neurotransmission in social functioning is supported by studies with the specific norepinephrine reuptake inhibitor reboxetine. In healthy volunteers, reboxetine increases cooperative social behavior and social drive. A placebo-controlled study in depressed patients comparing reboxetine with the selective serotonin reuptake inhibitor (SSRI) fluoxetine showed significantly greater improvement in social adaptation with reboxetine. Two recent studies have examined the effect of the serotonin and norepinephrine reuptake inhibitor milnacipran on social adaptation. A study in depressed patients found that at the end of 8 weeks of treatment with milnacipran, 42.2% patients were in remission on the Social Adaptation Self-evaluation Scale (SASS). Another study in depressed workers or homemakers found that mean depression scores were significantly reduced after 2 weeks, whereas the SASS scores were significantly improved after 4 weeks. A preliminary study comparing depressed patients treated with milnacipran or the SSRI paroxetine showed that milnacipran treatment resulted in a greater number of patients in social remission. The available data thus suggest that milnacipran may improve social functioning, with a possibly greater effect than the SSRI paroxetine. These preliminary data suggest further evaluation of social dysfunction and its treatment outcome in future trials of milnacipran.
Improved phase arrival estimate and location for local earthquakes in South Korea
NASA Astrophysics Data System (ADS)
Morton, E. A.; Rowe, C. A.; Begnaud, M. L.
2012-12-01
The Korean Institute of Geoscience and Mineral Resources (KIGAM) and the Korean Meteorological Agency (KMA) regularly report local (distance < ~1200 km) seismicity recorded with their networks; we obtain preliminary event location estimates as well as waveform data, but no phase arrivals are reported, so the data are not immediately useful for earthquake location. Our goal is to identify seismic events that are sufficiently well-located to provide accurate seismic travel-time information for events within the KIGAM and KMA networks, and also recorded by some regional stations. Toward that end, we are using a combination of manual phase identification and arrival-time picking, with waveform cross-correlation, to cluster events that have occurred in close proximity to one another, which allows for improved phase identification by comparing the highly correlating waveforms. We cross-correlate the known events with one another on 5 seismic stations and cluster events that correlate above a correlation coefficient threshold of 0.7, which reveals few clusters containing few events each. The small number of repeating events suggests that the online catalogs have had mining and quarry blasts removed before publication, as these can contribute significantly to repeating seismic sources in relatively aseismic regions such as South Korea. The dispersed source locations in our catalog, however, are ideal for seismic velocity modeling by providing superior sampling through the dense seismic station arrangement, which produces favorable event-to-station ray path coverage. Following careful manual phase picking on 104 events chosen to provide adequate ray coverage, we re-locate the events to obtain improved source coordinates. The re-located events are used with Thurber's Simul2000 pseudo-bending local tomography code to estimate the crustal structure on the Korean Peninsula, which is an important contribution to ongoing calibration for events of interest in the region.
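The correlate-then-cluster step can be sketched as follows (synthetic single-channel waveforms and the 0.7 peak-correlation threshold from the abstract; not the authors' multi-station pipeline):

```python
import numpy as np

rng = np.random.default_rng(5)

def max_norm_xcorr(a, b):
    """Maximum normalized cross-correlation over all lags."""
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    return np.correlate(a, b, mode="full").max()

# Three synthetic "events": two share a source wavelet, one is unrelated.
t = np.linspace(0, 1, 200)
wavelet = np.sin(40 * t) * np.exp(-5 * t)
ev1 = wavelet + 0.05 * rng.normal(size=t.size)
ev2 = np.roll(wavelet, 7) + 0.05 * rng.normal(size=t.size)  # shifted repeat
ev3 = rng.normal(size=t.size)                               # unrelated event

def cluster_pairs(events, threshold=0.7):
    """Link event pairs whose peak correlation exceeds the threshold."""
    n = len(events)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if max_norm_xcorr(events[i], events[j]) > threshold}

links = cluster_pairs([ev1, ev2, ev3])
```

Searching over lags is what makes the comparison insensitive to small origin-time or pick errors, so repeating sources cluster together even when their arrivals are misaligned.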
Improved estimates of coordinate error for molecular replacement
Oeffner, Robert D.; Bunkóczi, Gábor; McCoy, Airlie J.; Read, Randy J.
2013-11-01
A function for estimating the effective root-mean-square deviation in coordinates between two proteins has been developed that depends on both the sequence identity and the size of the protein and is optimized for use with molecular replacement in Phaser. A top peak translation-function Z-score of over 8 is found to be a reliable metric of when molecular replacement has succeeded. The estimate of the root-mean-square deviation (r.m.s.d.) in coordinates between the model and the target is an essential parameter for calibrating likelihood functions for molecular replacement (MR). Good estimates of the r.m.s.d. lead to good estimates of the variance term in the likelihood functions, which increases signal to noise and hence success rates in the MR search. Phaser has hitherto used an estimate of the r.m.s.d. that only depends on the sequence identity between the model and target and which was not optimized for the MR likelihood functions. Variance-refinement functionality was added to Phaser to enable determination of the effective r.m.s.d. that optimized the log-likelihood gain (LLG) for a correct MR solution. Variance refinement was subsequently performed on a database of over 21 000 MR problems that sampled a range of sequence identities, protein sizes and protein fold classes. Success was monitored using the translation-function Z-score (TFZ), where a TFZ of 8 or over for the top peak was found to be a reliable indicator that MR had succeeded for these cases with one molecule in the asymmetric unit. Good estimates of the r.m.s.d. are correlated with the sequence identity and the protein size. A new estimate of the r.m.s.d. that uses these two parameters in a function optimized to fit the mean of the refined variance is implemented in Phaser and improves MR outcomes. Perturbing the initial estimate of the r.m.s.d. from the mean of the distribution in steps of standard deviations of the distribution further increases MR success rates.
An analytical solution for improved HIFU SAR estimation
Dillon, C R; Vyas, U; Payne, A; Christensen, D A; Roemer, R B
2012-01-01
Accurate determination of the specific absorption rates (SARs) present during high intensity focused ultrasound (HIFU) experiments and treatments provides a solid physical basis for scientific comparison of results among HIFU studies and is necessary to validate and improve SAR predictive software, which will improve patient treatment planning, control and evaluation. This study develops and tests an analytical solution that significantly improves the accuracy of SAR values obtained from HIFU temperature data. SAR estimates are obtained by fitting the analytical temperature solution for a one-dimensional radial Gaussian heating pattern to the temperature versus time data following a step in applied power and evaluating the initial slope of the analytical solution. The analytical method is evaluated in multiple parametric simulations for which it consistently (except at high perfusions) yields maximum errors of less than 10% at the center of the focal zone compared with errors up to 90% and 55% for the commonly used linear method and an exponential method, respectively. For high perfusion, an extension of the analytical method estimates SAR with less than 10% error. The analytical method is validated experimentally by showing that the temperature elevations predicted using the analytical method’s SAR values determined for the entire 3-D focal region agree well with the experimental temperature elevations in a HIFU-heated tissue-mimicking phantom. PMID:22722656
A stochastic movement simulator improves estimates of landscape connectivity.
Coulon, A; Aben, J; Palmer, S C F; Stevens, V M; Callens, T; Strubbe, D; Lens, L; Matthysen, E; Baguette, M; Travis, J M J
2015-08-01
Conservation actions often focus on restoration or creation of natural areas designed to facilitate the movements of organisms among populations. To be efficient, these actions need to be based on reliable estimates or predictions of landscape connectivity. While circuit theory and least-cost paths (LCPs) are increasingly being used to estimate connectivity, these methods also have proven limitations. We compared their performance in predicting genetic connectivity with that of an alternative approach based on a simple, individual-based "stochastic movement simulator" (SMS). SMS predicts dispersal of organisms using the same landscape representation as LCPs and circuit theory-based estimates (i.e., a cost surface), while relaxing key LCP assumptions, namely individual omniscience of the landscape (by incorporating perceptual range) and the optimality of individual movements (by including stochasticity in simulated movements). The performance of the three estimators was assessed by the degree to which they correlated with genetic estimates of connectivity in two species with contrasting movement abilities (Cabanis's Greenbul, an Afrotropical forest bird species, and natterjack toad, an amphibian restricted to European sandy and heathland areas). For both species, the correlation between dispersal model and genetic data was substantially higher when SMS was used. Importantly, the results also demonstrate that the improvement gained by using SMS is robust both to variation in spatial resolution of the landscape and to uncertainty in the perceptual range model parameter. Integration of this individual-based approach with other developing methods in the field of connectivity research, such as graph theory, can yield rapid progress towards more robust connectivity indices and more effective recommendations for land management.
Improvements on foF1 estimation at polar regions
NASA Astrophysics Data System (ADS)
Sabbagh, Dario; Scotto, Carlo; Sgrigna, Vittorio
2016-04-01
The analysis of a sample of polar ionograms reveals that the DuCharme and Petrie empirical formula often fails to estimate foF1 at polar regions. A study of the discrepancies between modeled and observed foF1 values is presented, using a data set of Antarctic ionograms from different stations. These discrepancies have been quantitatively evaluated, and based on this study a correction to the DuCharme and Petrie formula is proposed. The correction has been implemented in an improved version of the Autoscala software for a particular ionospheric station, within the framework of the AUSPICIO (Automatic Interpretation of Polar Ionograms and Cooperative Ionospheric Observations) project.
Raichoor, A.; Mei, S.; Huertas-Company, M.; Licitra, R.; Erben, T.; Hildebrandt, H.; Ilbert, O.; Boissier, S.; Boselli, A.; Ball, N. M.; Côté, P.; Ferrarese, L.; Gwyn, S. D. J.; Kavelaars, J. J.; Chen, Y.-T.; Cuillandre, J.-C.; Duc, P. A.; Guhathakurta, P.; and others
2014-12-20
The Next Generation Virgo Cluster Survey (NGVS) is an optical imaging survey covering 104 deg^2 centered on the Virgo cluster. Currently, the complete survey area has been observed in the u*giz bands and one third in the r band. We present the photometric redshift estimation for the NGVS background sources. After a dedicated data reduction, we perform accurate photometry, with special attention to precise color measurements through point-spread function homogenization. We then estimate the photometric redshifts with the Le Phare and BPZ codes. We add a new prior that extends to i_AB = 12.5 mag. When using the u*griz bands, our photometric redshifts for 15.5 mag ≤ i ≲ 23 mag or z_phot ≲ 1 galaxies have a bias |Δz| < 0.02, less than 5% outliers, and a scatter σ_outl.rej. and an individual error on z_phot that increase with magnitude (from 0.02 to 0.05 and from 0.03 to 0.10, respectively). When using the u*giz bands over the same magnitude and redshift range, the lack of the r band increases the uncertainties in the 0.3 ≲ z_phot ≲ 0.8 range (-0.05 < Δz < -0.02, σ_outl.rej ∼ 0.06, 10%-15% outliers, and z_phot,err. ∼ 0.15). We also present a joint analysis of the photometric redshift accuracy as a function of redshift and magnitude. We assess the quality of our photometric redshifts by comparison to spectroscopic samples and by verifying that the angular auto- and cross-correlation function w(θ) of the entire NGVS photometric redshift sample across redshift bins is in agreement with the expectations.
NASA Astrophysics Data System (ADS)
Sundara, Vinny Yuliani; Sadik, Kusman; Kurnia, Anang
2017-03-01
A survey is a data collection method that samples individual units from a population. However, a national survey provides only limited information, which results in low precision at the small-area level; moreover, when an area is not selected as a sampling unit, no direct estimate can be made at all. Small area estimation methods are therefore required. One model-based estimation method is empirical Bayes, which has been widely used to estimate parameters in small areas, even in non-sampled areas. Problems occur, however, when this method is applied to non-sampled areas using a purely synthetic model that ignores area effects. This paper proposes an approach that clusters area effects of the auxiliary variables, assuming that particular areas are similar to one another. Direct estimates in several sub-districts of the regency and city of Bogor are zero because no household below the poverty line appears in the samples selected from these sub-districts; the empirical Bayes method is used to obtain non-zero estimates. Empirical Bayes estimates of FGT poverty measures, using both the Molina & Rao approach and cluster information, coincide in the sub-districts selected as samples but differ in non-sampled sub-districts. The empirical Bayes method with cluster information has a smaller coefficient of variation, and is thus better than the empirical Bayes method without cluster information for non-sampled sub-districts in the regency and city of Bogor.
NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel
2017-08-01
Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
Zhong, Wei; Altun, Gulsah; Harrison, Robert; Tai, Phang C; Pan, Yi
2005-09-01
Information about local protein sequence motifs is very important to the analysis of biologically significant conserved regions of protein sequences. These conserved regions can potentially determine the diverse conformation and activities of proteins. In this work, recurring sequence motifs of proteins are explored with an improved K-means clustering algorithm on a new dataset. The structural similarity of these recurring sequence clusters to produce sequence motifs is studied in order to evaluate the relationship between sequence motifs and their structures. To the best of our knowledge, the dataset used by our research is the most updated dataset among similar studies for sequence motifs. A new greedy initialization method for the K-means algorithm is proposed to improve traditional K-means clustering techniques. The new initialization method tries to choose suitable initial points, which are well separated and have the potential to form high-quality clusters. Our experiments indicate that the improved K-means algorithm satisfactorily increases the percentage of sequence segments belonging to clusters with high structural similarity. Careful comparison of sequence motifs obtained by the improved and traditional algorithms also suggests that the improved K-means clustering algorithm may discover some relatively weak and subtle sequence motifs, which are undetectable by the traditional K-means algorithms. Many biochemical tests reported in the literature show that these sequence motifs are biologically meaningful. Experimental results also indicate that the improved K-means algorithm generates more detailed sequence motifs representing common structures than previous research. Furthermore, these motifs are universally conserved sequence patterns across protein families, overcoming some weak points of other popular sequence motifs. The satisfactory result of the experiment suggests that this new K-means algorithm may be applied to other areas of bioinformatics
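The greedy initialization idea above — initial points that are well separated and likely to form good clusters — can be illustrated with a maximin seeding heuristic followed by plain Lloyd iterations. This is an illustrative stand-in, not the paper's exact initialization rule:

```python
import numpy as np

def greedy_init(X, k, seed=0):
    """Pick k well-separated seeds: start from one random point, then
    repeatedly take the point farthest from its nearest chosen seed
    (a maximin heuristic in the spirit of the greedy initialization)."""
    rng = np.random.default_rng(seed)
    seeds = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # distance of every point to its nearest already-chosen seed
        dist = np.min([np.linalg.norm(X - s, axis=1) for s in seeds], axis=0)
        seeds.append(X[np.argmax(dist)])
    return np.array(seeds)

def kmeans(X, k, iters=50, seed=0):
    """Standard Lloyd iterations on top of the greedy initialization."""
    centers = greedy_init(X, k, seed)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers
```

Because the seeds start far apart, well-separated groups of sequence segments tend to receive their own clusters instead of being merged under an unlucky random start.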
Improved risk estimates for carbon tetrachloride. 1998 annual progress report
Benson, J.M.; Springer, D.L.; Thrall, K.D.
1998-06-01
The overall purpose of these studies is to improve the scientific basis for assessing the cancer risk associated with human exposure to carbon tetrachloride. Specifically, the toxicokinetics of inhaled carbon tetrachloride is being determined in rats, mice and hamsters. Species differences in the metabolism of carbon tetrachloride by rats, mice and hamsters are being determined in vivo and in vitro using tissues and microsomes from these rodent species and man. Dose-response relationships will be determined in all studies. The information will be used to improve the current physiologically based pharmacokinetic model for carbon tetrachloride. The authors will also determine whether carbon tetrachloride is a hepatocarcinogen only when exposure results in cell damage, cell killing, and regenerative cell proliferation. In combination, the results of these studies will provide the types of information needed to enable a refined risk estimate for carbon tetrachloride under EPA's new guidelines for cancer risk assessment.
Improving Distribution Resiliency with Microgrids and State and Parameter Estimation
Tuffner, Francis K.; Williams, Tess L.; Schneider, Kevin P.; Elizondo, Marcelo A.; Sun, Yannan; Liu, Chen-Ching; Xu, Yin; Gourisetti, Sri Nikhil Gup
2015-09-30
Modern society relies on low-cost, reliable electrical power, both to maintain industry and to provide basic social services to the populace. When major disturbances occur, such as Hurricane Katrina or Hurricane Sandy, the nation's electrical infrastructure can experience significant outages. To help prevent the spread of these outages, as well as to facilitate faster restoration afterward, various approaches to improving the resiliency of the power system are needed. Two such approaches are breaking the system into smaller microgrid sections, and gaining improved insight into operations to detect failures or mis-operations before they become critical. By breaking the system into smaller microgrid islands, power can be maintained in areas where distributed generation and energy storage resources are still available but bulk power generation is no longer connected. Additionally, microgrid systems can maintain service to local pockets of customers when there has been extensive damage to the local distribution system. However, microgrids are grid-connected the majority of the time, and implementing and operating a grid-connected microgrid differs substantially from islanded operation. This report discusses work conducted by the Pacific Northwest National Laboratory that developed improvements to simulation tools to capture the characteristics of microgrids and how they can be used to develop new operational strategies. These operational strategies reduce the cost of microgrid operation and increase the reliability and resilience of the nation's electricity infrastructure. In addition to the ability to break the system into microgrids, improved observability into the state of the distribution grid can make the power system more resilient. State estimation on the transmission system already provides great insight into grid operations and the detection of abnormal conditions by leveraging existing measurements. These transmission-level approaches are expanded to using
Reducing measurement scale mismatch to improve surface energy flux estimation
NASA Astrophysics Data System (ADS)
Iwema, Joost; Rosolem, Rafael; Rahman, Mostaquimur; Blyth, Eleanor; Wagener, Thorsten
2016-04-01
Soil moisture is an important control on land surface processes such as energy and water partitioning. A good understanding of these controls is needed especially when recognizing the challenges in providing accurate hyper-resolution hydrometeorological simulations at sub-kilometre scales. Soil moisture controlling factors can, however, differ at distinct scales. In addition, some parameters in land surface models are still often prescribed based on observations obtained at a scale other than the one employed by such models (e.g., soil properties obtained from lab samples used in regional simulations). To minimize such effects, parameters can be constrained with local data from Eddy-Covariance (EC) towers (i.e., latent and sensible heat fluxes) and Point Scale (PS) soil moisture observations (e.g., TDR). However, the measurement scales represented by EC and PS still differ substantially. Here we use the fact that Cosmic-Ray Neutron Sensors (CRNS) estimate soil moisture over a horizontal footprint similar to that of EC fluxes to help answer the following question: does reduced observation-scale mismatch yield a better soil moisture - surface flux representation in land surface models? To answer this question we analysed soil moisture and surface flux measurements from twelve COSMOS-Ameriflux sites in the USA characterized by distinct climate, soils and vegetation types. We calibrated model parameters of the Joint UK Land Environment Simulator (JULES) against PS and CRNS soil moisture data, respectively. We analysed the improvement in soil moisture estimation compared to uncalibrated model simulations and then evaluated the degree of improvement in surface fluxes before and after the calibration experiments. Preliminary results suggest that a more accurate representation of soil moisture dynamics is achieved when calibrating against observed soil moisture, with further improvement obtained with CRNS relative to PS. However, our results also suggest that a more accurate
An improved clustering algorithm of tunnel monitoring data for cloud computing.
Zhong, Luo; Tang, KunHao; Li, Lin; Yang, Guang; Ye, JingJing
2014-01-01
With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce become more and more complex. As a result, traditional clustering algorithms cannot handle the mass data produced by the tunnels. To solve this problem, an improved parallel clustering algorithm based on k-means is proposed. It is a clustering algorithm that uses the MapReduce framework of cloud computing to process the data. It not only handles mass data but is also more efficient. Moreover, it computes the average dissimilarity degree of each cluster in order to clean the abnormal data.
An improved local immunization strategy for scale-free networks with a high degree of clustering
NASA Astrophysics Data System (ADS)
Xia, Lingling; Jiang, Guoping; Song, Yurong; Song, Bo
2017-01-01
The design of immunization strategies is an extremely important issue for disease and computer virus control and prevention. In this paper, we propose an improved local immunization strategy based on a node's clustering, which has seldom been considered in existing immunization strategies. The main aim of the proposed strategy is to iteratively immunize the node that has high connectivity and a low clustering coefficient. To validate the effectiveness of our strategy, we compare it with two typical local immunization strategies on both real and artificial networks with a high degree of clustering. Simulations on these networks demonstrate that the performance of our strategy is superior to that of the two typical strategies. The proposed strategy can be regarded as a compromise between computational complexity and immune effect, and can be widely applied in scale-free networks with high clustering, such as social and technological networks. In addition, this study provides useful hints for designing optimal immunization strategies for specific networks.
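The strategy's core loop — repeatedly immunize a highly connected, weakly clustered node, updating the network after each removal — can be sketched on a plain adjacency-set graph. The score degree / (clustering coefficient + ε) is an illustrative assumption standing in for the paper's exact ranking rule:

```python
def clustering_coeff(adj, v):
    """Local clustering coefficient of v in an adjacency-set graph."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for w in nbrs if u < w and w in adj[u])
    return 2.0 * links / (k * (k - 1))

def immunize(adj, n_immune):
    """Iteratively remove (immunize) the node maximizing
    degree / (clustering + eps), recomputed after every removal --
    i.e. 'high connectivity, low clustering first' (scoring function
    is an illustrative assumption, not the paper's exact rule)."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    immunized = []
    for _ in range(n_immune):
        target = max(adj, key=lambda v: len(adj[v]) / (clustering_coeff(adj, v) + 1e-9))
        immunized.append(target)
        for u in adj[target]:
            adj[u].discard(target)
        del adj[target]
    return immunized
```

On a graph containing a hub of degree-one leaves (clustering 0) and a tight triangle (clustering 1), the hub is immunized first, as the strategy intends.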
Improved metal cluster deposition on a genetically engineered tobacco mosaic virus template
NASA Astrophysics Data System (ADS)
Lee, Sang-Yup; Royston, Elizabeth; Culver, James N.; Harris, Michael T.
2005-07-01
Improved depositions of various metal clusters onto a biomolecular template were achieved using a genetically engineered tobacco mosaic virus (TMV). Wild-type TMV was genetically altered to display multiple solid metal binding sites through the insertion of two cysteine residues within the amino-terminus of the virus coat protein. Gold, silver, and palladium clusters synthesized through in situ chemical reductions could be readily deposited onto the genetically modified template via the exposed cysteine-derived thiol groups. Metal cluster coatings on the cysteine-modified template were more densely deposited and stable than similar coatings on the unmodified wild-type template. Combined, these results confirm that the introduction of cysteine residues onto the outer surface of the TMV coat protein enhances the usefulness of this virus as a biotemplate for the deposition of metal clusters.
Improving Estimation of Ground Casualty Risk From Reentering Space Objects
NASA Technical Reports Server (NTRS)
Ostrom, Chris L.
2017-01-01
A recent improvement to the long-term estimation of ground casualties from reentering space debris is the further refinement and update to the human population distribution. Previous human population distributions were based on global totals with simple scaling factors for future years, or a coarse grid of population counts in a subset of the world's countries, each cell having its own projected growth rate. The newest population model includes a 5-fold refinement in both latitude and longitude resolution. All areas along a single latitude are combined to form a global population distribution as a function of latitude, creating a more accurate population estimation based on non-uniform growth at the country and area levels. Previous risk probability calculations used simplifying assumptions that did not account for the ellipsoidal nature of the Earth. The new method uses first, a simple analytical method to estimate the amount of time spent above each latitude band for a debris object with a given orbit inclination and second, a more complex numerical method that incorporates the effects of a non-spherical Earth. These new results are compared with the prior models to assess the magnitude of the effects on reentry casualty risk.
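The "simple analytical method" for the time spent above each latitude band has a standard closed form for a circular orbit around a spherical Earth; the paper's exact formulation and its non-spherical numerical correction are not reproduced here, so treat this as a sketch under those assumptions:

```python
import math

def frac_time_above(lat_deg, incl_deg):
    """Fraction of a circular orbit spent poleward of |latitude| = lat_deg
    for inclination incl_deg (spherical-Earth result, i.e. the simple
    analytical first step described in the abstract)."""
    phi, inc = math.radians(lat_deg), math.radians(incl_deg)
    if phi >= inc:
        return 0.0                     # the orbit never reaches this latitude
    # sin(latitude) = sin(inclination) * sin(u), with u uniform in time
    return 1.0 - (2.0 / math.pi) * math.asin(math.sin(phi) / math.sin(inc))
```

Weighting each latitude band's dwell fraction by the banded population distribution then gives the exposure term in the casualty-risk estimate.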
Improving Estimates of Cloud Radiative Forcing over Greenland
NASA Astrophysics Data System (ADS)
Wang, W.; Zender, C. S.
2014-12-01
Multiple driving mechanisms conspire to increase melt extent and the frequency of extreme melt events in the Arctic: changing heat transport, shortwave radiation (SW), and longwave radiation (LW). Cloud Radiative Forcing (CRF) of Greenland's surface is amplified by a dry atmosphere and by albedo feedback, making its contribution to surface melt even more variable in time and space. Unfortunately, accurate cloud observations, and thus CRF estimates, are hindered by Greenland's remoteness, harsh conditions, and the low contrast between surface and cloud reflectance. In this study, cloud observations from satellites and reanalyses are ingested into and evaluated within a column radiative transfer model. An improved CRF dataset is obtained by correcting systematic discrepancies derived from sensitivity experiments. First, we compare the surface radiation budgets from the Column Radiation Model (CRM) driven by different cloud datasets with surface observations from the Greenland Climate Network (GC-Net). In clear skies, CRM-estimated surface radiation driven by water vapor profiles from both AIRS and MODIS during May-Sept 2010-2012 is similar, stable, and reliable. For example, although the AIRS water vapor path exceeds MODIS by 1.4 kg/m2 on a daily average, the overall absolute difference in downwelling SW is < 4 W/m2. CRM estimates are within 20 W/m2 of GC-Net downwelling SW. After calibrating the CRM in clear skies, the remaining differences between CRM and observed surface radiation are primarily attributable to differences in cloud observations. We estimate CRF using cloud products from MODIS and from MERRA. The SW radiative forcing of thin clouds is mainly controlled by cloud water path (CWP). As CWP increases from near 0 to 200 g/m2, the net surface SW drops almost linearly from over 100 W/m2 to 30 W/m2, beyond which it becomes relatively insensitive to CWP. The LW is dominated by cloud height. For clouds at all altitudes, the lower the clouds, the greater the LW forcing. By
Aging and memory improvement through semantic clustering: The role of list-presentation format.
Kuhlmann, Beatrice G; Touron, Dayna R
2016-11-01
The present study examined how the presentation format of the study list influences younger and older adults' semantic clustering. Spontaneous clustering did not differ between age groups or between an individual-words (presentation of individual study words in succession) and a whole-list (presentation of the whole study list at once for the same total duration) presentation format in 132 younger (18-30 years, M = 19.7) and 120 older (60-84 years, M = 69.5) adults. However, after instructions to use semantic clustering (second list), age-related differences in recall magnified, indicating a utilization deficiency, and both age groups achieved higher recall in the whole-list than in the individual-words format. While this whole-list benefit was comparable across age groups, it is notable that older adults were only able to improve their average recall performance after clustering instructions in the whole-list but not in the individual-words format. In both formats, instructed clustering was correlated with processing resources (processing speed and, especially, working memory capacity), particularly in older adults. Spontaneous clustering, however, was not related to processing resources but to metacognitive beliefs about the efficacy and difficulty of semantic clustering, neither of which indicated awareness of the benefits of the whole-list presentation format in either age group. Taken together, the findings demonstrate that presentation format has a nontrivial influence on the utilization of semantic clustering in adults. The analyses further highlight important differences between output-based and list-based clustering measures.
Which Elements of Improvement Collaboratives Are Most Effective? A Cluster-Randomized Trial
Gustafson, D. H.; Quanbeck, A. R.; Robinson, J. M.; Ford, J. H.; Pulvermacher, A.; French, M. T.; McConnell, K. J.; Batalden, P. B.; Hoffman, K. A.; McCarty, D.
2013-01-01
Aims Improvement collaboratives consisting of various components are used throughout healthcare to improve quality, but no study has identified which components work best. This study tested the effectiveness of different components in addiction treatment services, hypothesizing that a combination of all components would be most effective. Design An unblinded cluster-randomized trial assigned clinics to one of four groups: interest circle calls (group teleconferences), clinic-level coaching, learning sessions (large face-to-face meetings), and a combination of all three. Interest circle calls functioned as a minimal intervention comparison group. Setting Outpatient addiction treatment clinics in the U.S. Participants 201 clinics in 5 states. Measurements Clinic data managers submitted data on three primary outcomes: waiting time (mean days between first contact and first treatment), retention (percent of patients retained from first to fourth treatment session), and annual number of new patients. State and group costs were collected for a cost-effectiveness analysis. Findings Waiting time declined significantly for 3 groups: coaching (an average of −4.6 days/clinic, P=0.001), learning sessions (−3.5 days/clinic, P=0.012), and the combination (−4.7 days/clinic, P=0.001). The coaching and combination groups significantly increased the number of new patients (19.5%, P=0.028, and 8.9%, P=0.029, respectively). Interest circle calls showed no significant effects on outcomes. None of the groups significantly improved retention. The estimated cost/clinic was $2,878 for coaching versus $7,930 for the combination. Coaching and the combination of collaborative components were about equally effective in achieving study aims, but coaching was substantially more cost-effective. Conclusions When trying to improve the effectiveness of addiction treatment services, clinic-level coaching appears to help improve waiting time and number of new patients while other components of
An improved global wind resource estimate for integrated assessment models
Eurek, Kelly; Sullivan, Patrick; Gleason, Michael; ...
2017-11-25
This study summarizes initial steps to improving the robustness and accuracy of global renewable resource and techno-economic assessments for use in integrated assessment models. We outline a method to construct country-level wind resource supply curves, delineated by resource quality and other parameters. Using mesoscale reanalysis data, we generate estimates for wind quality, both terrestrial and offshore, across the globe. Because not all land or water area is suitable for development, appropriate database layers provide exclusions to reduce the total resource to its technical potential. We expand upon estimates from related studies by: using a globally consistent data source of uniquely detailed wind speed characterizations; assuming a non-constant coefficient of performance for adjusting power curves for altitude; categorizing the distance from resource sites to the electric power grid; and characterizing offshore exclusions on the basis of sea ice concentrations. The product, then, is technical potential by country, classified by resource quality as determined by net capacity factor. Additional classification dimensions are available, including distance to transmission networks for terrestrial wind and distance to shore and water depth for offshore. We estimate a total global wind generation potential of 560 PWh for terrestrial wind with 90% of resource classified as low-to-mid quality, and 315 PWh for offshore wind with 67% classified as mid-to-high quality. These estimates are based on 3.5 MW composite wind turbines with 90 m hub heights, 0.95 availability, 90% array efficiency, and 5 MW/km2 deployment density in non-excluded areas. We compare the underlying technical assumptions and results with other global assessments.
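As a sketch of the bookkeeping behind such an estimate, the snippet below converts a non-excluded area and a net capacity factor into annual generation under the deployment assumptions quoted above (5 MW/km2, 0.95 availability, 90% array efficiency). Whether the study folds availability and array losses into the net capacity factor or applies them separately is not stated, so treating them as separate multipliers here is an assumption, and the function name is ours.

```python
def technical_potential_twh(area_km2, capacity_factor,
                            density_mw_per_km2=5.0,
                            availability=0.95,
                            array_efficiency=0.90):
    """Annual wind generation potential (TWh) of a non-excluded area,
    assuming the deployment parameters quoted in the abstract."""
    capacity_mw = area_km2 * density_mw_per_km2
    annual_mwh = (capacity_mw * 8760.0 * capacity_factor
                  * availability * array_efficiency)
    return annual_mwh / 1e6  # MWh -> TWh
```

For example, 1,000 km2 of land at a 0.35 net capacity factor works out to roughly 13.1 TWh per year under these assumptions.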
Laser photogrammetry improves size and demographic estimates for whale sharks.
Rohner, Christoph A; Richardson, Anthony J; Prebble, Clare E M; Marshall, Andrea D; Bennett, Michael B; Weeks, Scarla J; Cliff, Geremy; Wintner, Sabine P; Pierce, Simon J
2015-01-01
Whale sharks Rhincodon typus are globally threatened, but a lack of biological and demographic information hampers an accurate assessment of their vulnerability to further decline or capacity to recover. We used laser photogrammetry at two aggregation sites to obtain more accurate size estimates of free-swimming whale sharks compared to visual estimates, allowing improved estimates of biological parameters. Individual whale sharks ranged from 432-917 cm total length (TL) (mean ± SD = 673 ± 118.8 cm, N = 122) in southern Mozambique and from 420-990 cm TL (mean ± SD = 641 ± 133 cm, N = 46) in Tanzania. By combining measurements of stranded individuals with photogrammetry measurements of free-swimming sharks, we calculated length at 50% maturity for males in Mozambique at 916 cm TL. Repeat measurements of individual whale sharks measured over periods from 347-1,068 days yielded implausible growth rates, suggesting that the growth increment over this period was not large enough to be detected using laser photogrammetry, and that the method is best applied to estimating growth rates over longer (decadal) time periods. The sex ratio of both populations was biased towards males (74% in Mozambique, 89% in Tanzania), the majority of which were immature (98% in Mozambique, 94% in Tanzania). The population structure for these two aggregations was similar to most other documented whale shark aggregations around the world. Information on small (<400 cm) whale sharks, mature individuals, and females in this region is lacking, but necessary to inform conservation initiatives for this globally threatened species.
Towards Improved Snow Water Equivalent Estimation via GRACE Assimilation
NASA Technical Reports Server (NTRS)
Forman, Bart; Reichle, Rolf; Rodell, Matt
2011-01-01
Passive microwave (e.g. AMSR-E) and visible spectrum (e.g. MODIS) measurements of snow states have been used in conjunction with land surface models to better characterize snow pack states, most notably snow water equivalent (SWE). However, both types of measurements have limitations. AMSR-E, for example, suffers a loss of information in deep/wet snow packs. Similarly, MODIS suffers a loss of temporal correlation information beyond the initial accumulation and final ablation phases of the snow season. Gravimetric measurements, on the other hand, do not suffer from these limitations. In this study, gravimetric measurements from the Gravity Recovery and Climate Experiment (GRACE) mission are used in a land surface model data assimilation (DA) framework to better characterize SWE in the Mackenzie River basin located in northern Canada. Comparisons are made against independent, ground-based SWE observations, state-of-the-art modeled SWE estimates, and independent, ground-based river discharge observations. Preliminary results suggest improved SWE estimates, including improved timing of the subsequent ablation and runoff of the snow pack. Additionally, use of the DA procedure can add vertical and horizontal resolution to the coarse-scale GRACE measurements as well as effectively downscale the measurements in time. Such findings offer the potential for better understanding of the hydrologic cycle in snow-dominated basins located in remote regions of the globe where ground-based observation collection is difficult, if not impossible. This information could ultimately lead to improved freshwater resource management in communities dependent on snow melt as well as a reduction in the uncertainty of river discharge into the Arctic Ocean.
Improved PPP ambiguity resolution by COES FCB estimation
NASA Astrophysics Data System (ADS)
Li, Yihe; Gao, Yang; Shi, Junbo
2016-05-01
Precise point positioning (PPP) integer ambiguity resolution is able to significantly improve the positioning accuracy with the correction of fractional cycle biases (FCBs), by shortening the time to first fix (TTFF) of ambiguities. When satellite orbit products are adopted to estimate the satellite FCB corrections, the narrow-lane (NL) FCB corrections will be contaminated by the orbit's line-of-sight (LOS) errors, which subsequently affect ambiguity resolution (AR) performance as well as positioning accuracy. To effectively separate orbit errors from satellite FCBs, we propose a cascaded orbit error separation (COES) method for the PPP implementation. Instead of using only one direction-independent component as in previous studies, the improved satellite NL FCB corrections are modeled by one direction-independent component and three direction-dependent components per satellite in this study. More specifically, the direction-independent component assimilates actual FCBs, whereas the direction-dependent components are used to assimilate the orbit errors. To evaluate the performance of the proposed method, GPS measurements from a regional and a global network are processed with IGS Real-time service (RTS) products, IGS rapid (IGR) products and predicted orbits with >10 cm 3D root mean square (RMS) error. The improvements by the proposed FCB estimation method are validated in terms of ambiguity fractions after applying FCB corrections and positioning accuracy. The numerical results confirm that the FCBs obtained using the proposed method outperform those from the conventional method. The RMS of ambiguity fractions after applying FCB corrections is reduced by 13.2%. The position RMSs in the north, east and up directions are reduced by 30.0%, 32.0% and 22.0% on average.
Ip, Edward H; Wasserman, Richard; Barkin, Shari
2011-03-01
Designing cluster randomized trials in clinical studies often requires accurate estimates of the intraclass correlation coefficient (ICC), which quantifies the strength of correlation between units, such as participants, within a cluster, such as a practice. Published ICC estimates, even when available, often suffer from the problem of wide confidence intervals. Using data from a national, randomized, controlled study concerning violence prevention for children--the Safety Check--we compare the ICC values derived from two approaches: using only baseline data and using both baseline and follow-up data. Using a variance component decomposition approach, the latter method allows flexibility in handling complex data sets. For example, it allows for shifts in the outcome variable over time and for an unbalanced cluster design. Furthermore, we evaluate the large-sample formula for ICC estimates and standard errors using the bootstrap method. Our findings suggest that ICC estimates range from 0.012 to 0.11 for providers within practices and from 0.018 to 0.11 for families within providers. The estimates derived from the baseline-only and repeated-measurements approaches agree quite well, except in cases in which variation over repeated measurements is large. The reductions in the widths of ICC confidence limits from using repeated measurements over baseline only are, respectively, 62% and 42% at the practice and provider levels. The contribution of this paper therefore includes two elements: a methodology for improving the accuracy of ICC estimates, and the reporting of such quantities for pediatric and other researchers who are interested in designing clustered randomized trials similar to the current study.
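For readers designing similar trials, the balanced one-way ANOVA estimator is the simplest variance-component route to an ICC. The sketch below illustrates that textbook estimator only, not the paper's more flexible repeated-measurements decomposition; the function name and data layout are ours.

```python
def icc_oneway(groups):
    """One-way ANOVA (variance components) estimate of the intraclass
    correlation for a balanced design: `groups` is a list of equal-size
    lists, one per cluster (e.g. families within a provider)."""
    k = len(groups)                       # number of clusters
    n = len(groups[0])                    # observations per cluster
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    # classical ANOVA estimator; can be slightly negative by chance
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)
```

When all within-cluster variation vanishes the estimator returns 1; with strong within-cluster variation it can dip below 0, which is a known property of the ANOVA estimator rather than a bug.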
Ironing out the wrinkles in the rare biosphere through improved OTU clustering.
Huse, Susan M; Welch, David Mark; Morrison, Hilary G; Sogin, Mitchell L
2010-07-01
Deep sequencing of PCR amplicon libraries facilitates the detection of low-abundance populations in environmental DNA surveys of complex microbial communities. At the same time, deep sequencing can lead to overestimates of microbial diversity through the generation of low-frequency, error-prone reads. Even with sequencing error rates below 0.005 per nucleotide position, the common method of generating operational taxonomic units (OTUs) by multiple sequence alignment and complete-linkage clustering significantly increases the number of predicted OTUs and inflates richness estimates. We show that a 2% single-linkage preclustering methodology followed by an average-linkage clustering based on pairwise alignments more accurately predicts expected OTUs in both single and pooled template preparations of known taxonomic composition. This new clustering method can reduce the OTU richness in environmental samples by as much as 30-60% but does not reduce the fraction of OTUs in long-tailed rank abundance curves that defines the rare biosphere.
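To make the two-stage idea concrete, here is a toy version of the second stage: average-linkage agglomeration over a precomputed pairwise-distance table. The 2% single-linkage preclustering step is omitted, the data layout is our own, and production pipelines implement this far more efficiently at scale.

```python
from itertools import combinations

def average_linkage(labels, dist, cutoff):
    """Greedy agglomerative clustering with average linkage: repeatedly
    merge the closest pair of clusters whose mean pairwise distance is
    within `cutoff`. `dist` maps frozenset({a, b}) to the distance
    between sequences a and b (toy illustration only)."""
    clusters = [[x] for x in labels]

    def avg(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = avg(clusters[i], clusters[j])
            if d <= cutoff and (best is None or d < best[0]):
                best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
```

With a 3% cutoff, two sequences 1% apart merge into one OTU while a third sequence 5% from both stays separate, which is exactly the behavior that keeps error-prone near-duplicate reads from inflating richness estimates.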
Li, Peng; Redden, David T.
2014-01-01
The sandwich estimator in the generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small-sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC) correction can keep the test size to nominal levels even when the number of clusters is as low as 10, and is robust to moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG) correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC correction for CRTs with binary outcomes. The power levels as predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that, with appropriate control of type I error rates under small sample sizes, the GEE approach is recommended in CRTs with binary outcomes due to its fewer assumptions and robustness to misspecification of the covariance structure. PMID:25345738
Kilgore, Meredith L.; Outman, Ryan; Locher, Julie L.; Allison, Jeroan J.; Mudano, Amy; Kitchin, Beth; Saag, Kenneth G.; Curtis, Jeffrey R.
2014-01-01
Purpose To test an evidence-implementation intervention to improve the quality of care in the home health care setting for patients at high risk for fractures. Methods We conducted a cluster randomized trial of a multimodal intervention targeted at home care for high-risk patients (prior fracture or physician-diagnosed osteoporosis) receiving care in a statewide home health agency in Alabama. Offices throughout the state were randomized to receive the intervention or to usual care. The primary outcome was the proportion of high-risk home health patients treated with osteoporosis medications. A t-test of difference in proportions was conducted between intervention and control arms and constituted the primary analysis. Secondary analyses included logistic regression estimating the effect of individual patients being treated in an intervention-arm office on the likelihood of a patient receiving osteoporosis medications. A follow-on analysis examined the effect of an automated alert built into the electronic medical record that prompted the home health care nurses to deploy the intervention for high-risk patients, using a pre-post design. Results Among the offices in the intervention arm, the average proportion of eligible patients receiving osteoporosis medications post-intervention was 19.1%, compared with 15.7% in the usual care arm (difference in proportions 3.4%, 95% CI: −2.6% to 9.5%). The overall rates of osteoporosis medication use increased from 14.8% prior to activation of the automated alert to 17.6% afterward, a non-significant difference. Conclusions The home health intervention did not result in a significant improvement in use of osteoporosis medications in high-risk patients. PMID:23536256
Dynamic Clustering-Based Estimation of Missing Values in Mixed Type Data
NASA Astrophysics Data System (ADS)
Ayuyev, Vadim V.; Jupin, Joseph; Harris, Philip W.; Obradovic, Zoran
The appropriate choice of a method for imputation of missing data becomes especially important when the fraction of missing values is large and the data are of mixed type. The proposed dynamic clustering imputation (DCI) algorithm relies on similarity information from shared neighbors, where mixed-type variables are considered together. When evaluated on a public social science dataset of 46,043 mixed-type instances with up to 33% missing values, DCI resulted in more than 20% improved imputation accuracy over Multiple Imputation, Predictive Mean Matching, Linear and Multilevel Regression, and Mean Mode Replacement methods. Data imputed by the 6 methods were used for prediction tests by NB-Tree, Random Subset Selection and Neural Network-based classification models. In our experiments, classification accuracy obtained using DCI-preprocessed data was much better than when relying on alternative imputation methods for data preprocessing.
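DCI itself builds on shared-neighbor similarity with dynamic clustering; as a much simpler baseline in the same spirit, the sketch below fills a missing value from the nearest neighbors under a Gower-style mixed-type distance. All names, the record layout, and the use of None as the missing marker are our own illustration, not the paper's implementation.

```python
def mixed_distance(a, b):
    """Gower-style distance over two equal-length records: absolute
    difference for numeric fields (assumed pre-scaled to [0, 1]),
    0/1 mismatch for categorical fields; None marks a missing field."""
    total, used = 0.0, 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += abs(x - y)
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else float("inf")

def knn_impute(records, row, col, k=3):
    """Fill records[row][col] from the k most similar records that have
    the field: mean for numeric values, majority vote otherwise."""
    donors = [r for i, r in enumerate(records)
              if i != row and r[col] is not None]
    donors.sort(key=lambda r: mixed_distance(records[row], r))
    vals = [r[col] for r in donors[:k]]
    if all(isinstance(v, (int, float)) for v in vals):
        return sum(vals) / len(vals)
    return max(set(vals), key=vals.count)
```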
Evaluation of Incremental Improvement in the NWS MPE Precipitation Estimates
NASA Astrophysics Data System (ADS)
Qin, L.; Habib, E. H.
2009-12-01
This study focuses on assessment of incremental improvement in the multi-sensor precipitation estimates (MPE) developed by the National Weather Service (NWS) River Forecast Centers (RFC). The MPE product is based upon merging of data from WSR-88D radar, surface rain gauges, and occasionally geo-stationary satellite data. The MPE algorithm produces 5 intermediate sets of products known as: RMOSAIC, BMOSAIC, MMOSAIC, LMOSAIC, and MLMOSAIC. These products have different bias-removal and optimal gauge-merging mechanisms. The final product used in operational applications is selected by the RFC forecasters. All the MPE products are provided at hourly temporal resolution and over a national Hydrologic Rainfall Analysis Project (HRAP) grid of a nominal size of 4 square kilometers. To help determine the incremental improvement of MPE estimates, an evaluation analysis was performed over a two-year period (2005-2006) using 13 independently operated rain gauges located within an area of ~30 km2 in south Louisiana. The close proximity of gauge sites to each other allows for multiple gauges to be located within the same HRAP pixel and thus provides reliable estimates of true surface rainfall to be used as a reference dataset. The evaluation analysis is performed over two temporal scales: hourly and event duration. Besides graphical comparisons using scatter and histogram plots, several statistical measures are also applied, such as multiplicative bias, additive bias, correlation, and error standard deviation. The results indicated a mixed performance of the different products over the study site depending on which statistical metric is used. The products based on local bias adjustment have the lowest error standard deviation but the worst multiplicative bias. The opposite is true for products that are based on mean-field bias adjustment. Optimal merging with gauge fields leads to a reduction in the error quantiles of the products. The results of the current study will provide insight into
Improving Hurricane Heat Content Estimates From Satellite Altimeter Data
NASA Astrophysics Data System (ADS)
de Matthaeis, P.; Jacob, S.; Roubert, L. M.; Shay, N.; Black, P.
2007-12-01
Hurricanes are amongst the most destructive natural disasters known to mankind. The primary energy source driving these storms is the latent heat release due to the condensation of water vapor, which ultimately comes from the ocean. While the Sea Surface Temperature (SST) has a direct correlation with wind speeds, the oceanic heat content is dependent on the upper ocean vertical structure. Understanding the impact of these factors on the mutual hurricane-ocean interaction is critical to more accurate forecasting of intensity change in land-falling hurricanes. Use of hurricane heat content derived from satellite radar altimeter measurements of sea surface height has been shown to improve intensity prediction. The general approach of estimating ocean heat content uses a two-layer model representing the ocean, with its anomalies derived from altimeter data. Although these estimates compare reasonably well with in-situ measurements, they are generally about 10% under-biased. Additionally, recent studies show that the comparisons are less than satisfactory in the Western North Pacific. Therefore, our objective is to develop a methodology to more accurately represent the upper ocean structure using in-situ data. As part of a NOAA/USWRP sponsored research effort, upper ocean observations were acquired in the Gulf of Mexico during the summers of 1999 and 2000. Overall, 260 expendable profilers (XCTD, XBT and XCP) acquired vertical temperature structure in the high heat content regions corresponding to the Loop Current and Warm Core Eddies. Using the temperature and salinity data from the XCTDs, the Temperature-Salinity relationships in the Loop Current Water and Gulf Common Water are first derived based on the depth of the 26 °C isotherm. These derived T-S relationships compare well with those inferred from climatology. By means of these relationships, estimated salinity values corresponding to the XBT and XCP temperature measurements are calculated, and used to derive
Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure
Zhang, Wen; Xiao, Fan; Li, Bin; Zhang, Siguang
2016-01-01
Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) was proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is often criticized for low discriminative power in representing documents, although its representative quality has been validated. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. Firstly, we survey existing linear algebra methods for LSI, including both SVD-based and non-SVD-based methods. Secondly, we propose SVD on clusters for LSI and theoretically explain that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold new documents and terms into a matrix decomposed by SVD on clusters. Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performances of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of the interdocument similarity measure in comparison with other SVD-based LSI methods. PMID:27579031
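In outline, "SVD on clusters" replaces one global truncated SVD with one per document cluster. The numpy sketch below captures only that outline; the paper's dimension-expansion step and fold-in updates are omitted, the cluster assignment is assumed to come from a prior document clustering, and the function names are ours.

```python
import numpy as np

def lsi_project(term_doc, k):
    """Rank-k LSI: SVD of a term-by-document matrix, returning one
    k-dimensional latent-space row per document."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return (np.diag(s[:k]) @ vt[:k]).T

def svd_on_clusters(term_doc, clusters, k):
    """Run a separate rank-k LSI on each column (document) cluster
    instead of a single global decomposition. `clusters` is a list of
    column-index lists produced by some prior document clustering."""
    return [lsi_project(term_doc[:, cols], k) for cols in clusters]
```

Interdocument similarity is then measured within each cluster's latent space, which is what gives the method its extra discriminative power relative to a single global projection.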
Leão, Erico; Montez, Carlos; Moraes, Ricardo; Portugal, Paulo; Vasques, Francisco
2017-01-01
The use of Wireless Sensor Network (WSN) technologies is an attractive option to support wide-scale monitoring applications, such as the ones that can be found in precision agriculture, environmental monitoring and industrial automation. The IEEE 802.15.4/ZigBee cluster-tree topology is a suitable topology to build wide-scale WSNs. Despite some of its known advantages, including timing synchronisation and duty-cycle operation, cluster-tree networks may suffer from severe network congestion problems due to the convergecast pattern of its communication traffic. Therefore, the careful adjustment of transmission opportunities (superframe durations) allocated to the cluster-heads is an important research issue. This paper proposes a set of proportional Superframe Duration Allocation (SDA) schemes, based on well-defined protocol and timing models, and on the message load imposed by child nodes (Load-SDA scheme), or by number of descendant nodes (Nodes-SDA scheme) of each cluster-head. The underlying reasoning is to adequately allocate transmission opportunities (superframe durations) and parametrize buffer sizes, in order to improve the network throughput and avoid typical problems, such as: network congestion, high end-to-end communication delays and discarded messages due to buffer overflows. Simulation assessments show how proposed allocation schemes may clearly improve the operation of wide-scale cluster-tree networks. PMID:28134822
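A minimal version of the proportional idea: give each cluster-head a share of the schedulable period proportional to its children's message load (Load-SDA) or to its descendant count (Nodes-SDA); only the weights differ between the two schemes. Function and parameter names are ours, and real IEEE 802.15.4 superframe durations would additionally be quantized to superframe-order exponents, which this sketch ignores.

```python
def proportional_sda(weights, schedulable_duration):
    """Split a schedulable period among cluster-heads proportionally to
    `weights` (child message loads for Load-SDA, descendant counts for
    Nodes-SDA). Returns one superframe duration per cluster-head."""
    total = sum(weights)
    return [schedulable_duration * w / total for w in weights]
```

For example, two cluster-heads with loads 1 and 3 sharing an 8-unit period receive durations of 2 and 6 units respectively.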
Improved methods of estimating critical indices via fractional calculus
NASA Astrophysics Data System (ADS)
Bandyopadhyay, S. K.; Bhattacharyya, K.
2002-05-01
Efficiencies of certain methods for the determination of critical indices from power-series expansions are shown to be considerably improved by a suitable implementation of fractional differentiation. In the context of the ratio method (RM), the kinship of the modified strategy with the ad hoc `shifted' RM is established and the advantages are demonstrated. Further, in the course of the estimation of critical points, significant improvement of the convergence properties of diagonal Padé approximants is observed on several occasions by invoking this concept. Test calculations are performed on (i) various Ising spin-1/2 lattice models for susceptibility series associated with a ferromagnetic phase transition, (ii) complex model situations involving confluent and antiferromagnetic singularities and (iii) the chain-generating functions for self-avoiding walks on triangular, square and simple cubic lattices.
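For a series whose coefficients follow (1 - x/x_c)^(-gamma), the plain ratio method uses r_n = a_n/a_{n-1} ~ (1/x_c)(1 + (gamma - 1)/n). The sketch below implements that textbook estimator, not the paper's fractional-calculus modification, eliminating the 1/n correction by combining consecutive ratios; the function name is ours.

```python
def ratio_method(coeffs):
    """Per-order estimates of the critical point x_c and exponent gamma
    from series coefficients a_0, a_1, ..., using the classical ratio
    method: r_n = a_n/a_{n-1} ~ (1/x_c) * (1 + (gamma - 1)/n)."""
    r = [coeffs[n] / coeffs[n - 1] for n in range(1, len(coeffs))]
    xc_est, gamma_est = [], []
    for n in range(2, len(r) + 1):
        rn, rn_1 = r[n - 1], r[n - 2]          # r_n and r_{n-1}
        inv_xc = n * rn - (n - 1) * rn_1       # cancels the 1/n term
        xc_est.append(1.0 / inv_xc)
        gamma_est.append(1.0 + n * (rn / inv_xc - 1.0))
    return xc_est, gamma_est
```

For the exactly solvable test series (1 - x)^(-2), whose coefficients are 1, 2, 3, ..., every order reproduces x_c = 1 and gamma = 2 exactly; for realistic susceptibility series the per-order estimates instead form a sequence to be extrapolated.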
Improved Estimates of Air Pollutant Emissions from Biorefinery
Tan, Eric C. D.
2015-11-13
We have attempted to use a detailed kinetic modeling approach for improved estimation of combustion air pollutant emissions from a biorefinery. We have developed a preliminary detailed reaction mechanism for biomass combustion. Lignin is the only biomass component included in the current mechanism, and methane is used as the biogas surrogate. The model is capable of predicting the combustion emissions of greenhouse gases (CO2, N2O, CH4) and criteria air pollutants (NO, NO2, CO). The results are yet to be compared with experimental data. The current model is still in its early stages of development. Given the acknowledged complexity of biomass oxidation, as well as of the components in the feed to the combustor, the modeling approach and the chemistry set discussed here may undergo revision, extension, and further validation in the future.
Improved estimate for the muon g-2 using VMD constraints
NASA Astrophysics Data System (ADS)
Benayoun, M.
2012-04-01
The muon anomalous magnetic moment aμ and the hadronic vacuum polarization (HVP) are examined using data analyzed within the framework of a suitably broken HLS model. The analysis relies on all available scan data samples and leaves aside the existing ISR data. The framework provided by our broken HLS model allows for improved estimates of the contributions to aμ from the e+e- annihilation cross sections into π+π-, π0γ, ηγ, π+π-π0, K+K-, and K0K̄0 up to slightly above the ϕ meson mass. Within this framework, the τ±→π±π0ν decay and the radiative decays (VPγ and Pγγ) of light flavor mesons provide strong constraints on the model parameters. The discrepancy between the theoretical estimate of the muon anomalous magnetic moment g-2 and its direct BNL measurement is shown to reach conservatively 4.1σ, while standard methods used under the same conditions yield 3.5σ.
Improved Glomerular Filtration Rate Estimation by an Artificial Neural Network
Zhang, Yunong; Zhang, Xiang; Chen, Jinxia; Lv, Linsheng; Ma, Huijuan; Wu, Xiaoming; Zhao, Weihong; Lou, Tanqi
2013-01-01
Background Accurate evaluation of glomerular filtration rates (GFRs) is of critical importance in clinical practice. A previous study showed that models based on artificial neural networks (ANNs) could achieve a better performance than traditional equations. However, large-sample cross-sectional surveys have not resolved questions about ANN performance. Methods A total of 1,180 patients with chronic kidney disease (CKD) were enrolled in the development data set, the internal validation data set and the external validation data set. An additional 222 patients admitted to two independent institutions were used for further external validation. Several ANNs were constructed, and finally a Back Propagation network optimized by a genetic algorithm (GABP network) was chosen as the superior model. It included six input variables (serum creatinine, serum urea nitrogen, age, height, weight and gender) and estimated GFR as the single output variable. Performance was then compared with the Cockcroft-Gault equation, the MDRD equations and the CKD-EPI equation. Results In the external validation data set, Bland-Altman analysis demonstrated that the precision of the six-variable GABP network was the highest among all of the estimation models (46.7 ml/min/1.73 m2 vs. a range from 71.3 to 101.7 ml/min/1.73 m2), allowing improvement in accuracy (15% accuracy, 49.0%; 30% accuracy, 75.1%; 50% accuracy, 90.5% [P<0.001 for all]) and CKD stage classification (misclassification rate of CKD stage, 32.4% vs. a range from 47.3% to 53.3% [P<0.001 for all]). Furthermore, in the additional external validation data set, precision and accuracy were improved by the six-variable GABP network. Conclusions A new ANN model (the six-variable GABP network) for CKD patients was developed that could provide a simpler, more accurate and more reliable means of estimating GFR and CKD stage than traditional equations. Further validations are needed to assess the ability of the ANN model in diverse
The Improved Estimation of Ratio of Two Population Proportions
ERIC Educational Resources Information Center
Solanki, Ramkrishna S.; Singh, Housila P.
2016-01-01
In this article, first we obtained the correct mean square error expression of Gupta and Shabbir's linear weighted estimator of the ratio of two population proportions. Later we suggested the general class of ratio estimators of two population proportions. The usual ratio estimator, Wynn-type estimator, Singh, Singh, and Kaur difference-type…
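For reference, the baseline quantity the article builds on (not the improved class of estimators it proposes) is the usual ratio estimator of two population proportions, which is simply the ratio of the sample proportions:

```python
def ratio_of_proportions(successes_a, successes_b, n):
    """Usual ratio estimator R-hat = p_a / p_b from one sample of size n."""
    p_a = successes_a / n
    p_b = successes_b / n
    if p_b == 0:
        raise ValueError("second sample proportion is zero; ratio undefined")
    return p_a / p_b

# Example: in a sample of 400 units, 100 possess attribute A and 200 attribute B.
print(ratio_of_proportions(100, 200, 400))  # 0.5
```

The improved estimators in the article reduce the mean square error of this simple ratio by weighting in auxiliary information; their exact forms are given there.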
Hemodialysis catheter care strategies: a cluster-randomized quality improvement initiative.
Rosenblum, Alex; Wang, Weiling; Ball, Lynda K; Latham, Carolyn; Maddux, Franklin W; Lacson, Eduardo
2014-02-01
The prevalence of central venous catheters (CVCs) for hemodialysis remains high and, despite infection-control protocols, predisposes to bloodstream infections (BSIs). Stratified, cluster-randomized, quality improvement initiative. All in-center patients with a CVC within 211 facility pairs matched by region, facility size, and rate of positive blood cultures (January to March 2011) at Fresenius Medical Care, North America. Incorporate the use of 2% chlorhexidine with 70% alcohol swab sticks for exit-site care and 70% alcohol pads to perform "scrub the hubs" in dialysis-related CVC care procedures compared to usual care. The primary outcome was positive blood cultures for estimating BSI rates. Comparison of 3-month baseline period from April 1 to June 30 and follow-up period from August 1 to October 30, 2011. Baseline BSI rates were similar (0.85 vs 0.86/1,000 CVC-days), but follow-up rates differed at 0.81/1,000 CVC-days in intervention facilities versus 1.04/1,000 CVC-days in controls (P = 0.02). Intravenous antibiotic starts during the follow-up period also were lower, at 2.53/1,000 CVC-days versus 3.15/1,000 CVC-days in controls (P < 0.001). Cluster-adjusted Poisson regression confirmed 21%-22% reductions in both (P < 0.001). Extended follow-up for 3 successive quarters demonstrated a sustained reduction of bacteremia rates for patients in intervention facilities, at 0.50/1,000 CVC-days (41% reduction; P < 0.001). Hospitalizations due to sepsis during 1-year extended follow-up were 0.19/1,000 CVC-days (0.069/CVC-year) versus 0.26/1,000 CVC-days (0.095/CVC-year) in controls (∼27% difference; P < 0.05). Inability to capture results from blood cultures sent to external laboratories, underestimation of sepsis-specific hospitalizations, and potential crossover adoption of the intervention protocol in control facilities. Adoption of the new catheter care procedure (consistent with Centers for Disease Control and Prevention recommendations
ERIC Educational Resources Information Center
Hunt, Charles R.
A study developed a model to assist school administrators to estimate costs associated with the delivery of a metals cluster program at Norfolk State College, Virginia. It sought to construct the model so that costs could be explained as a function of enrollment levels. Data were collected through a literature review, computer searches of the…
ERIC Educational Resources Information Center
Schochet, Peter Z.
2009-01-01
This paper examines the estimation of two-stage clustered RCT designs in education research using the Neyman causal inference framework that underlies experiments. The key distinction between the considered causal models is whether potential treatment and control group outcomes are considered to be fixed for the study population (the…
Improved fuzzy clustering algorithms in segmentation of DC-enhanced breast MRI.
Kannan, S R; Ramathilagam, S; Devi, Pandiyarajan; Sathya, A
2012-02-01
Segmentation of medical images is a difficult and challenging problem due to poor image contrast and artifacts that result in missing or diffuse organ/tissue boundaries. Many researchers have applied various techniques; however, fuzzy c-means (FCM)-based algorithms are more effective than other methods. The objective of this work is to develop robust fuzzy clustering segmentation systems for effective segmentation of DCE breast MRI. This paper obtains robust fuzzy clustering algorithms by incorporating kernel methods, penalty terms, tolerance of the neighborhood attraction, an additional entropy term and fuzzy parameters. The initial centers are obtained using an initialization algorithm to reduce the computational complexity and running time of the proposed algorithms. Experimental work on breast images shows that the proposed algorithms effectively improve the similarity measurement, handle large amounts of noise, and give better results on data corrupted by noise and other artifacts. The clustering results of the proposed methods are validated using the Silhouette method.
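A bare-bones fuzzy c-means loop, without the kernel, penalty, and neighborhood-attraction terms the paper adds, can be sketched as follows; the quantile-based initialization here merely stands in for the paper's initialization algorithm:

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100):
    """Plain FCM: alternate membership and center updates (fuzzifier m > 1)."""
    # Deterministic initialization: spread the c centers over the data quantiles.
    centers = np.quantile(X, np.linspace(0.0, 1.0, c), axis=0)
    for _ in range(n_iter):
        # Distances of every point to every center (small epsilon avoids 0/0).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # memberships, rows sum to 1
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U

# Two well-separated 1-D blobs, near 0 and near 10.
X = np.concatenate([np.linspace(-0.5, 0.5, 20),
                    np.linspace(9.5, 10.5, 20)]).reshape(-1, 1)
centers, U = fuzzy_c_means(X, c=2)
print(np.sort(centers.ravel()))   # one center near 0, one near 10
```

The robust variants in the paper modify the distance term and the objective; the alternating update structure stays the same.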
Guo, Yang; Li, Wei; Li, Shuhua
2014-10-02
An improved cluster-in-molecule (CIM) local correlation approach is developed to make electron correlation calculations of large systems more accurate and faster. We have proposed a refined strategy for constructing virtual LMOs of various clusters, which is suitable for basis sets of various types. To recover medium-range electron correlation, which is important for quantitative descriptions of large systems, we find that a larger distance threshold (ξ) is necessary for highly accurate results. Our illustrative calculations show that the present CIM-MP2 (second-order Møller-Plesset perturbation theory, MP2) or CIM-CCSD (coupled cluster singles and doubles, CCSD) scheme with a suitable ξ value is capable of recovering more than 99.8% of the correlation energy for a wide range of systems with different basis sets. Furthermore, the present CIM-MP2 scheme can provide relative energy differences as reliable as the conventional MP2 method for secondary structures of polypeptides.
Improved Soundings and Error Estimates using AIRS/AMSU Data
NASA Technical Reports Server (NTRS)
Susskind, Joel
2006-01-01
AIRS was launched on EOS Aqua on May 4, 2002, together with AMSU-A and HSB, to form a next generation polar orbiting infrared and microwave atmospheric sounding system. The primary products of AIRS/AMSU are twice daily global fields of atmospheric temperature-humidity profiles, ozone profiles, sea/land surface skin temperature, and cloud related parameters including OLR. The sounding goals of AIRS are to produce 1 km tropospheric layer mean temperatures with an rms error of 1 K, and layer precipitable water with an rms error of 20 percent, in cases with up to 80 percent effective cloud cover. The basic theory used to analyze AIRS/AMSU/HSB data in the presence of clouds, called the at-launch algorithm, and a post-launch algorithm which differed only in minor details from the at-launch algorithm, have been described previously. The post-launch algorithm, referred to as AIRS Version 4.0, has been used by the Goddard DAAC to analyze and distribute AIRS retrieval products. In this paper we show progress made toward the AIRS Version 5.0 algorithm which will be used by the Goddard DAAC starting late in 2006. A new methodology has been developed to provide accurate case by case error estimates for retrieved geophysical parameters and for the channel by channel cloud cleared radiances used to derive the geophysical parameters from the AIRS/AMSU observations. These error estimates are in turn used for quality control of the derived geophysical parameters and clear column radiances. Improvements made to the retrieval algorithm since Version 4.0 are described as well as results comparing Version 5.0 retrieval accuracy and spatial coverage with those obtained using Version 4.0.
Improvement of gougerotin and nikkomycin production by engineering their biosynthetic gene clusters.
Du, Deyao; Zhu, Yu; Wei, Junhong; Tian, Yuqing; Niu, Guoqing; Tan, Huarong
2013-07-01
Nikkomycins and gougerotin are peptidyl nucleoside antibiotics with broad biological activities. The nikkomycin biosynthetic gene cluster comprises one pathway-specific regulatory gene (sanG) and 21 structural genes, whereas the gene cluster for gougerotin biosynthesis includes one putative regulatory gene, one major facilitator superfamily transporter gene, and 13 structural genes. In the present study, we introduced sanG driven by six different promoters into Streptomyces ansochromogenes TH322. Nikkomycin production increased significantly, with the highest increase in the engineered strain harboring hrdB promoter-driven sanG. In parallel, we replaced the native promoters of key structural genes in the gougerotin (gou) gene cluster with the hrdB promoter. The heterologous producer Streptomyces coelicolor M1146 harboring the modified gene cluster produced up to 10-fold more gougerotin than strains carrying the unmodified cluster. Therefore, genetic manipulation of antibiotic biosynthesis genes with the constitutive hrdB promoter presents a robust, easy-to-use system generally useful for improving antibiotic production in Streptomyces.
Improving lidar-derived turbulence estimates for wind energy
Newman, Jennifer F.; Clifton, Andrew
2016-07-08
Remote sensing devices such as lidars are currently being investigated as alternatives to cup anemometers on meteorological towers. Although lidars can measure mean wind speeds at heights spanning an entire turbine rotor disk and can be easily moved from one location to another, they measure different values of turbulence than an instrument on a tower. Current methods for improving lidar turbulence estimates include the use of analytical turbulence models and expensive scanning lidars. While these methods provide accurate results in a research setting, they cannot be easily applied to smaller, commercially available lidars in locations where high-resolution sonic anemometer data are not available. Thus, there is clearly a need for a turbulence error reduction model that is simpler and more easily applicable to lidars that are used in the wind energy industry.
In this work, a new turbulence error reduction algorithm for lidars is described. The algorithm, L-TERRA, can be applied using only data from a stand-alone commercially available lidar and requires minimal training with meteorological tower data. The basis of L-TERRA is a series of corrections that are applied to the lidar data to mitigate errors from instrument noise, volume averaging, and variance contamination. These corrections are applied in conjunction with a trained machine-learning model to improve turbulence estimates from a vertically profiling WINDCUBE v2 lidar.
L-TERRA was tested on data from three sites – two in flat terrain and one in semicomplex terrain. L-TERRA significantly reduced errors in lidar turbulence at all three sites, even when the machine-learning portion of the model was trained on one site and applied to a different site. Errors in turbulence were then related to errors in power through the use of a power prediction model for a simulated 1.5 MW turbine. L-TERRA also reduced errors in power significantly at all three sites, although moderate power errors
Disseminating quality improvement: study protocol for a large cluster-randomized trial
2011-01-01
Background Dissemination is a critical facet of implementing quality improvement in organizations. As a field, addiction treatment has produced effective interventions but disseminated them slowly and reached only a fraction of people needing treatment. This study investigates four methods of disseminating quality improvement (QI) to addiction treatment programs in the U.S. It is, to our knowledge, the largest study of organizational change ever conducted in healthcare. The trial seeks to determine the most cost-effective method of disseminating quality improvement in addiction treatment. Methods The study is evaluating the costs and effectiveness of different QI approaches by randomizing 201 addiction-treatment programs to four interventions. Each intervention used a web-based learning kit plus monthly phone calls, coaching, face-to-face meetings, or the combination of all three. Effectiveness is defined as reducing waiting time (days between first contact and treatment), increasing program admissions, and increasing continuation in treatment. Opportunity costs will be estimated for the resources associated with providing the services. Outcomes The study has three primary outcomes: waiting time, annual program admissions, and continuation in treatment. Secondary outcomes include: voluntary employee turnover, treatment completion, and operating margin. We are also seeking to understand the role of mediators, moderators, and other factors related to an organization's success in making changes. Analysis We are fitting a mixed-effect regression model to each program's average monthly waiting time and continuation rates (based on aggregated client records), including terms to isolate state and intervention effects. Admissions to treatment are aggregated to a yearly level to compensate for seasonality. We will order the interventions by cost to compare them pair-wise to the lowest cost intervention (monthly phone calls). All randomized sites with outcome data will be
Yeeles, Ksenija; Bremner, Stephen; Lauber, Christoph; Eldridge, Sandra; Ashby, Deborah; David, Anthony S; O’Connell, Nicola; Forrest, Alexandra; Burns, Tom
2013-01-01
Objective To test whether offering financial incentives to patients with psychotic disorders is effective in improving adherence to maintenance treatment with antipsychotics. Design Cluster randomised controlled trial. Setting Community mental health teams in secondary psychiatric care in the United Kingdom. Participants Patients with a diagnosis of schizophrenia, schizoaffective disorder, or bipolar disorder, who were prescribed long acting antipsychotic (depot) injections but had received 75% or less of the prescribed injections. We randomly allocated 73 teams with a total of 141 patients. Primary outcome data were available for 35 intervention teams with 75 patients (96% of randomised) and for 31 control teams with 56 patients (89% of randomised). Interventions Participants in the intervention group were offered £15 (€17; $22) for each depot injection over a 12 month period. Participants in the control condition received treatment as usual. Main outcome measure The primary outcome was the percentage of prescribed depot injections given during the 12 month intervention period. Results 73 teams with 141 consenting patients were randomised, and outcomes were assessed for 131 patients (93%). Average baseline adherence was 69% in the intervention group and 67% in the control group. During the 12 month trial period adherence was 85% in the intervention group and 71% in the control group. The adjusted effect estimate was 11.5% (95% confidence interval 3.9% to 19.0%, P=0.003). A secondary outcome was an adherence of ≥95%, which was achieved in 28% of the intervention group and 5% of the control group (adjusted odds ratio 8.21, 95% confidence interval 2.00 to 33.67, P=0.003). Although differences in clinician rated clinical improvement between the groups failed to reach statistical significance, patients in the intervention group had more favourable subjective quality of life ratings (β=0.71, 95% confidence interval 0.26 to 1.15, P=0.002). The number of admissions
Adaptive arrival cost update for improving Moving Horizon Estimation performance.
Sánchez, G; Murillo, M; Giovanini, L
2017-03-01
Moving horizon estimation is an efficient technique for estimating the states and parameters of constrained dynamical systems. It relies on the solution of a finite horizon optimization problem to compute the estimates, providing a natural framework to handle bounds and constraints on estimates, noises and parameters. However, the approximation of the arrival cost and its updating mechanism are an active research topic. The arrival cost is very important because it provides a means to incorporate information from previous measurements into the current estimates, and its true value is difficult to estimate. In this work, we exploit the features of adaptive estimation methods to update the parameters of the arrival cost. We show that, with a better approximation of the arrival cost, the size of the optimization problem can be significantly reduced while guaranteeing the stability and convergence of the estimates. These properties are illustrated through simulation studies.
Multi-RTM-based Radiance Assimilation to Improve Snow Estimates
NASA Astrophysics Data System (ADS)
Kwon, Y.; Zhao, L.; Hoar, T. J.; Yang, Z. L.; Toure, A. M.
2015-12-01
Data assimilation of microwave brightness temperature (TB) observations (i.e., radiance assimilation (RA)) has been proven to improve snowpack characterization at relatively small scales. However, large-scale applications of RA require considerable further effort. Our objective in this study is to explore global-scale snow RA. In a RA scheme, a radiative transfer model (RTM) is an observational operator predicting TB; therefore, the quality of the assimilation results may strongly depend upon the RTM used as well as the land surface model (LSM). Several existing RTMs show different sensitivities to snowpack properties and thus they simulate significantly different TB. At the global scale, snow physical properties vary widely with local climate conditions. No single RTM has been shown to be able to accurately reproduce the observed TB for such a wide range of snow conditions. In this study, therefore, we hypothesize that snow estimates using a microwave RA scheme can be improved through the use of multiple RTMs (i.e., multi-RTM-based approaches). As a first step, here we use two snowpack RTMs, i.e., the Dense Media Radiative Transfer-Multi Layers model (DMRT-ML) and the Microwave Emission Model for Layered Snowpacks (MEMLS). The Community Land Model version 4 (CLM4) is used to simulate snow dynamics. The assimilation process is conducted by the Data Assimilation Research Testbed (DART), which is a community facility developed by the National Center for Atmospheric Research (NCAR) for ensemble-based data assimilation studies. In the RA experiments, the Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) TB at 18.7 and 36.5 GHz vertical polarization channels are assimilated into the RA system using the ensemble adjustment Kalman filter. The results are evaluated using the Canadian Meteorological Centre (CMC) daily snow depth, the Moderate Resolution Imaging Spectroradiometer (MODIS) snow cover fraction, and in-situ snowpack and river
Using Satellite Rainfall Estimates to Improve Climate Services in Africa
NASA Astrophysics Data System (ADS)
Dinku, T.
2012-12-01
Climate variability and change pose serious challenges to sustainable development in Africa. The recent famine crisis in the Horn of Africa is yet more evidence of how fluctuations in the climate can destroy lives and livelihoods. Building resilience against the negative impacts of climate and maximizing the benefits from favorable conditions will require mainstreaming climate issues into development policy, planning and practice at different levels. The availability of decision-relevant climate information at different levels is critical. The number and quality of weather stations in many parts of Africa, however, have been declining. The available stations are unevenly distributed, with most of the stations located along the main roads. This imposes severe limitations on the availability of climate information and services to rural communities where these services are needed most. Where observations are taken, they suffer from gaps and poor quality and are often unavailable beyond the respective national meteorological services. Combining available local observations with satellite products, making data and products available through the Internet, and training the user community to understand and use climate information will help to alleviate these problems. Improving data availability involves organizing and cleaning all available national station observations and combining them with satellite rainfall estimates. The main advantage of the satellite products is the excellent spatial coverage at increasingly improved spatial and temporal resolutions. This approach has been implemented in Ethiopia and Tanzania, and it is in the process of being implemented in West Africa. The main outputs include: 1. Thirty-year time series of combined satellite-gauge rainfall at a 10-day time scale and 10-km spatial resolution; 2. An array of user-specific products for climate analysis and monitoring; 3. An online facility providing user-friendly tools for
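A drastically simplified sketch of the gauge-satellite combination step follows; real merging schemes interpolate gauge-satellite anomalies geostatistically rather than removing a single mean bias, so this only illustrates the idea:

```python
import numpy as np

def merge_rainfall(sat, gauge):
    """Merge a satellite rainfall field with sparse gauge observations.
    sat: array of satellite estimates; gauge: same-shape array with observed
    values where a station reported and NaN elsewhere."""
    have = ~np.isnan(gauge)
    bias = (sat[have] - gauge[have]).mean()  # mean satellite bias at stations
    out = sat - bias                         # bias-corrected satellite field
    out[have] = gauge[have]                  # keep gauge values where available
    return out

# Four grid cells, two of which have a reporting station (invented numbers).
sat   = np.array([12.0, 8.0, 5.0, 20.0])
gauge = np.array([10.0, np.nan, 4.0, np.nan])
print(merge_rainfall(sat, gauge))
```

Here the satellite overestimates by 1.5 mm on average at the two stations, so ungauged cells are corrected downward by that amount while gauged cells keep the station values.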
de Gunst, M C; Luebeck, E G
1998-03-01
The problem of finding the number and size distribution of cell clusters that grow in an organ or tissue from observations of the number and sizes of transections of such cell clusters in a planar section is considered. This problem is closely related to the well-known corpuscle or Wicksell problem in stereology, which deals with transections of spherical objects. However, for most biological applications, it is unrealistic to assume that cell clusters have spherical shapes since they may grow in various ways. We therefore propose a method that allows for more general spatial configurations of the clusters. Under the assumption that a parametric growth model is available for the number and sizes of the cell clusters, expressions are obtained for the probability distributions of the number and sizes of transections of the clusters in a section plane for each point in time. These expressions contain coefficients that are independent of the parametric growth model and time but depend on which model is chosen for the configuration of the cell clusters in space. These results enable us to perform estimation of the parameters of the growth model by maximum likelihood directly on the data instead of having to deal with the inverse problem of estimation of three-dimensional quantities based on two-dimensional data. For realistic choices of the configuration model, it will not be possible to obtain the exact values of the coefficients, but they can easily be approximated by means of computer simulations of the spatial configuration. Monte Carlo simulations were performed to approximate the coefficients for two particular spatial configuration models. For these two configuration models, the proposed method is applied to data on preneoplastic minifoci in rat liver under the assumption of a two-event model of carcinogenesis as the parametric growth model.
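The spherical special case that this method generalizes away from is easy to simulate, which is exactly the spirit of the coefficient-by-simulation approach: for spheres of radius R cut by a random plane, the mean transection radius is πR/4. A sketch:

```python
import numpy as np

rng = np.random.default_rng(42)

def transection_radii(R, n):
    """Radii of circular transections of spheres of radius R, for planes whose
    offset z from the sphere center is uniform on (-R, R) -- i.e., conditioned
    on the plane actually hitting the sphere."""
    z = rng.uniform(-R, R, size=n)
    return np.sqrt(R**2 - z**2)

r = transection_radii(1.0, 200_000)
print(r.mean())   # analytic value: pi/4 ~ 0.785
```

For non-spherical cluster configurations no such closed form exists, and the coefficients relating 3-D quantities to planar transection counts must be approximated by Monte Carlo, as the abstract describes.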
Improving estimates of air pollution exposure through ubiquitous sensing technologies
de Nazelle, Audrey; Seto, Edmund; Donaire-Gonzalez, David; Mendez, Michelle; Matamala, Jaume; Nieuwenhuijsen, Mark J; Jerrett, Michael
2013-01-01
Traditional methods of exposure assessment in epidemiological studies often fail to integrate important information on activity patterns, which may lead to bias, loss of statistical power, or both in health effects estimates. Novel sensing technologies integrated with mobile phones offer the potential to reduce exposure measurement error. We sought to demonstrate the usability and relevance of the CalFit smartphone technology to track person-level time, geographic location, and physical activity patterns for improved air pollution exposure assessment. We deployed CalFit-equipped smartphones in a free-living population of 36 subjects in Barcelona, Spain. Information obtained on physical activity and geographic location was linked to space-time air pollution mapping. For instance, we found that, on average, travel activities accounted for 6% of people's time and 24% of their daily inhaled NO2. Given the large number of mobile phone users, this technology potentially provides an unobtrusive means of collecting epidemiologic exposure data at low cost. PMID:23416743
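The reason short travel episodes can dominate the daily dose is that inhaled dose sums concentration times ventilation rate times duration over episodes, and travel combines high concentrations with high ventilation. A toy version of this bookkeeping, with all numbers invented for illustration only:

```python
# Each episode: (activity, NO2 concentration ug/m3, ventilation m3/h, hours).
EPISODES = [
    ("home",    25.0, 0.5, 16.0),
    ("work",    35.0, 0.6,  6.5),
    ("cycling", 60.0, 2.0,  1.5),   # travel: short, but high ventilation
]

def inhaled_dose(episodes):
    """Total inhaled mass (ug) over the listed episodes."""
    return sum(conc * vent * hours for _, conc, vent, hours in episodes)

total = inhaled_dose(EPISODES)
travel = inhaled_dose([e for e in EPISODES if e[0] == "cycling"])
print(round(100.0 * travel / total, 1))  # travel share of the daily dose, in %
```

With these made-up numbers, 1.5 hours of cycling (about 6% of the day) contributes roughly a third of the inhaled NO2, illustrating why activity tracking changes exposure estimates.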
Improved Estimate of Phobos Secular Acceleration from MOLA Observations
NASA Technical Reports Server (NTRS)
Bills, Bruce; Neumann, Gregory; Smith, David; Zuber, Maria
2004-01-01
We report on new observations of the orbital position of Phobos, and use them to obtain a new and improved estimate of the rate of secular acceleration in longitude due to tidal dissipation within Mars. Phobos is the innermost natural satellite of Mars, and one of the few natural satellites in the solar system with an orbital period shorter than the rotation period of its primary. As a result, any departure from a perfectly elastic response by Mars to the tides raised on it by Phobos will cause a transfer of angular momentum from the orbit of Phobos to the spin of Mars. Since its discovery in 1877, Phobos has completed over 145,500 orbits, and has one of the best studied orbits in the solar system, with over 6000 earth-based astrometric observations and over 300 spacecraft observations. As early as 1945, Sharpless noted that there is a secular acceleration in mean longitude, with rate (1.88 ± 0.25) × 10^-3 degrees per square year. In preparation for the 1989 Russian spacecraft mission to Phobos, considerable work was done compiling past observations and refining the orbital model. All of the published estimates from that era are in good agreement. A typical solution (Jacobson et al., 1989) yields (1.249 ± 0.018) × 10^-3 degrees per square year. The MOLA instrument on MGS is a laser altimeter, and was designed to measure the topography of Mars. However, it has also been used to make observations of the position of Phobos. In 1998, a direct range measurement was made, which indicated that Phobos was slightly ahead of the predicted position. The MOLA detector views the surface of Mars in a narrow field of view, at 1064 nanometer wavelength, and can detect shadows cast by Phobos on the surface of Mars. We have found 15 such serendipitous shadow transit events over the interval from xx to xx, and all of them show Phobos to be ahead of schedule, and getting progressively farther ahead of the predicted position. In contrast, the cross-track positions are quite close
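A secular acceleration enters the mean longitude quadratically in time, so the rates quoted above translate directly into accumulated longitude offsets. A sketch, assuming the quoted rate s is the acceleration itself so the offset is s·t²/2 (sign and factor-of-two conventions differ between authors):

```python
def longitude_advance(s, t):
    """Extra mean longitude (deg) accumulated after t years by a secular
    acceleration s in deg/yr^2: delta_lambda = s * t**2 / 2."""
    return 0.5 * s * t**2

# The two published rates quoted above, accumulated over one century:
for s in (1.88e-3, 1.249e-3):
    print(round(longitude_advance(s, 100.0), 3))
```

The quadratic growth is why even a millidegree-per-square-year acceleration becomes easily measurable over Phobos's long observational baseline.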
Arpino, Bruno; Cannas, Massimo
2016-05-30
This article focuses on the implementation of propensity score matching for clustered data. Different approaches to reduce bias due to cluster-level confounders are considered and compared using Monte Carlo simulations. We investigated methods that exploit the clustered structure of the data in two ways: in the estimation of the propensity score model (through the inclusion of fixed or random effects) or in the implementation of the matching algorithm. In addition to a pure within-cluster matching, we also assessed the performance of a new approach, 'preferential' within-cluster matching. This approach first searches for control units to be matched to treated units within the same cluster. If matching is not possible within-cluster, then the algorithm searches in other clusters. All considered approaches successfully reduced the bias due to the omission of a cluster-level confounder. The preferential within-cluster matching approach, combining the advantages of within-cluster and between-cluster matching, showed a relatively good performance both in the presence of big and small clusters, and it was often the best method. An important advantage of this approach is that it reduces the number of unmatched units as compared with a pure within-cluster matching. We applied these methods to the estimation of the effect of caesarean section on the Apgar score using birth register data. Copyright © 2016 John Wiley & Sons, Ltd.
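A toy version of the 'preferential' within-cluster rule described above, written as greedy 1:1 nearest-neighbor matching on the propensity score with an illustrative caliper (the paper's implementation details may differ):

```python
def preferential_within_cluster_match(units, caliper=0.1):
    """Greedy 1:1 propensity-score matching: each treated unit first looks for
    the nearest unused control in its own cluster; only if none is available
    within the caliper does the search widen to all clusters.
    units: dicts with keys 'id', 'cluster', 'treated', 'ps'."""
    controls = [u for u in units if not u['treated']]
    used, pairs = set(), []
    for t in [u for u in units if u['treated']]:
        def best(pool):
            cands = [c for c in pool
                     if c['id'] not in used and abs(c['ps'] - t['ps']) <= caliper]
            return min(cands, key=lambda c: abs(c['ps'] - t['ps']), default=None)
        m = best([c for c in controls if c['cluster'] == t['cluster']])
        if m is None:
            m = best(controls)        # fall back to between-cluster matching
        if m is not None:
            used.add(m['id'])
            pairs.append((t['id'], m['id']))
    return pairs

units = [
    {'id': 1, 'cluster': 'A', 'treated': True,  'ps': 0.50},
    {'id': 2, 'cluster': 'A', 'treated': False, 'ps': 0.48},
    {'id': 3, 'cluster': 'B', 'treated': True,  'ps': 0.52},
    {'id': 4, 'cluster': 'C', 'treated': False, 'ps': 0.51},
]
print(preferential_within_cluster_match(units))  # [(1, 2), (3, 4)]
```

Unit 1 matches inside its own cluster A; unit 3 has no control in cluster B and falls back to cluster C, which is exactly how the preferential rule reduces the number of unmatched treated units relative to pure within-cluster matching.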
NASA Technical Reports Server (NTRS)
Lange, A. E.
1997-01-01
Measurements of the peculiar velocities of galaxy clusters with respect to the Hubble flow allow the determination of the gravitational field from all matter in the universe, not just the visible component. The Sunyaev-Zel'dovich (SZ) effect (the inverse-Compton scattering of cosmic microwave background photons by the hot gas in clusters of galaxies) allows these velocities to be measured without the use of empirical distance indicators. Additionally, because the magnitude of the SZ effect is independent of redshift, the technique can be used to measure velocities out to the epoch of cluster formation. The SZ technique requires a determination of the temperature of the hot cluster gas from X-ray observations, and measurements of the SZ effect at millimeter wavelengths to separate the contribution from the thermal motions within the gas from that of the cluster peculiar velocity. We have constructed a bolometric receiver, the Sunyaev-Zel'dovich Infrared Experiment, specifically to make measurements of the SZ effect at millimeter wavelengths in order to apply the SZ technique to peculiar velocity measurements. This receiver has already been used to set limits on the peculiar velocities of two galaxy clusters at z approx. 0.2. As a test of the SZ technique, the double cluster pair Abell 222 and 223 was selected for observation. Measurements of the redshifts of the two components suggest that, if the clusters are gravitationally bound, they should exhibit a relative velocity of 1000 km/s, well above the precision of 200 km/s (set by astrophysical confusion) expected from the SZ method. The temperature can be measured from ASCA data which we obtained for this cluster pair. However, in order to ensure that the temperature estimate from the ASCA data was not dominated by cooling flows within the cluster, we requested ROSAT HRI observations of this cluster pair. Analysis of the X-ray properties of the cluster pair is continuing by combining the ROSAT
Estimating Treatment Effects via Multilevel Matching within Homogenous Groups of Clusters
ERIC Educational Resources Information Center
Steiner, Peter M.; Kim, Jee-Seon
2015-01-01
Despite the popularity of propensity score (PS) techniques, they are not yet well studied for matching multilevel data where selection into treatment takes place among level-one units within clusters. This paper suggests a PS matching strategy that tries to avoid the disadvantages of within- and across-cluster matching. The idea is to first…
The Amish furniture cluster in Ohio: competitive factors and wood use estimates
Matthew Bumgardner; Robert Romig; William Luppold
2008-01-01
This paper is an assessment of wood use by the Amish furniture cluster located in northeastern Ohio. The paper also highlights the competitive and demographic factors that have enabled cluster growth and new business formation in a time of declining market share for the overall U.S. furniture industry. Several secondary information sources and discussions with local...
Distance estimates to five open clusters based on 2MASS data of red clump giants
NASA Astrophysics Data System (ADS)
Chen, Li; Gao, Xinhua
2013-02-01
Red clump (RC) giants are excellent standard candles in the Milky Way and the Large Magellanic Cloud. The near-infrared K-band intrinsic luminosity of RC giants exhibits only a small variance and a weak dependence on chemical composition and age. In addition, RCs are often easily recognizable in the color-magnitude diagrams of open clusters, which renders them extremely useful distance indicators for some intermediate-age or old open clusters. Here we determine the distance moduli of five Galactic open clusters covering a range of metallicities and ages, based on RC giants in the cluster regions using 2MASS photometric data. We compare our results with those from main-sequence fitting and also briefly discuss the advantages and disadvantages of RC-based cluster distance determination.
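The distance determination behind this method reduces to the K-band distance modulus. A minimal sketch follows; the RC absolute magnitude M_K and the extinction value are illustrative assumptions, not the paper's calibration:

```python
# Sketch: distance from red clump (RC) giants via the K-band distance modulus.
# M_K = -1.61 is an illustrative RC absolute magnitude; the actual calibration
# (and the extinction correction A_K) must come from the literature.

def rc_distance_pc(mean_apparent_k, m_k=-1.61, a_k=0.0):
    """Distance in parsecs from the dereddened K-band distance modulus."""
    mu = (mean_apparent_k - a_k) - m_k        # distance modulus m - M
    return 10 ** ((mu + 5.0) / 5.0)           # d/pc = 10^((mu + 5)/5)

# A cluster whose RC stars average K = 8.39 with negligible extinction
# has mu = 10.0 and therefore lies at 10^3 pc.
print(round(rc_distance_pc(8.39), 1))  # -> 1000.0
```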
An improved global dynamic routing strategy for scale-free network with tunable clustering
NASA Astrophysics Data System (ADS)
Sun, Lina; Huang, Ning; Zhang, Yue; Bai, Yannan
2016-08-01
An efficient routing strategy can deliver packets quickly and thereby improve network capacity. Node congestion and transmission path length are unavoidable real-time factors for a good routing strategy. Existing dynamic global routing strategies consider only the congestion of neighbor nodes and the shortest path, ignoring the congestion of other key nodes along the path. With the development of detection methods and techniques, global traffic information is readily available and important for the routing choice; reasonable use of this information can effectively improve network routing. We therefore propose an improved global dynamic routing strategy that considers the congestion of all nodes on the shortest path and incorporates the waiting time of the most congested node into the path cost. We investigate the effectiveness of the proposed routing for scale-free networks with different clustering coefficients, comparing it against the shortest-path routing strategy and a traffic-awareness routing strategy that considers only the waiting time of neighbor nodes. Simulation results show that network capacity is greatly enhanced compared with shortest-path routing, and that congestion builds up relatively slowly compared with the traffic-awareness routing strategy. Increasing the clustering coefficient not only reduces network throughput but also increases the average transmission path length in scale-free networks with tunable clustering. The proposed routing helps ease network congestion and informs network routing strategy design.
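The path cost described in the abstract (path length plus the waiting time of the most congested node) can be sketched as follows; the cost function shape, the toy graph, the queue lengths, and the weight beta are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch: choose the path minimizing  (hops) + beta * (queue of the most
# congested node on the path).  Graph and queue values are toy data.

def simple_paths(adj, s, t, path=None):
    """Enumerate all simple (loop-free) paths from s to t."""
    path = (path or []) + [s]
    if s == t:
        yield path
        return
    for nxt in adj[s]:
        if nxt not in path:
            yield from simple_paths(adj, nxt, t, path)

def best_path(adj, queues, s, t, beta=1.0):
    """Cost = path length + beta * waiting time at the most congested node."""
    return min(simple_paths(adj, s, t),
               key=lambda p: (len(p) - 1) + beta * max(queues[n] for n in p))

adj = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
queues = {'A': 0, 'B': 9, 'C': 1, 'D': 0}
# The detour via C avoids the heavily congested node B:
print(best_path(adj, queues, 'A', 'D'))  # -> ['A', 'C', 'D']
```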
Bonci, T; Camomilla, V; Dumas, R; Chèze, L; Cappozzo, A
2015-11-26
When stereophotogrammetry and skin-markers are used, bone-pose estimation is jeopardised by the soft tissue artefact (STA). At marker-cluster level, this can be represented using a modal series of rigid (RT; translation and rotation) and non-rigid (NRT; homothety and scaling) geometrical transformations. The NRT has been found to be smaller than the RT and claimed to have a limited impact on bone-pose estimation. This study aims to investigate this matter and to comparatively assess the propagation of both STA components to the bone-pose estimate, using different numbers of markers. Twelve skin-markers distributed over the anterior aspect of a thigh were considered and STA time functions were generated for each of them, as plausibly occurs during walking, using an ad hoc model and represented through the geometrical transformations. Using marker-clusters made of four to 12 markers affected by these STAs, and a Procrustes superimposition approach, the bone pose and the relevant accuracy were estimated. This was also done for a selected four-marker cluster affected by STAs randomly simulated by modifying the original STA NRT component, so that its energy fell in the range 30-90% of total STA energy. The pose error, which decreased slightly as the number of markers in the marker-cluster increased, was independent of the NRT amplitude, and was always null when the RT component was removed. It was thus demonstrated that only the RT component impacts pose estimation accuracy and should thus be accounted for when designing algorithms aimed at compensating for STA.
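The Procrustes superimposition step can be sketched with the standard rigid least-squares (Kabsch) solution; the synthetic marker coordinates and pose below are illustrative, not data from the study:

```python
import numpy as np

def procrustes_rigid(X, Y):
    """Rigid Procrustes (Kabsch) fit: rotation R and translation t minimizing
    sum ||(R @ X_i + t) - Y_i||^2 over the marker cluster (rows are markers)."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    U, _, Vt = np.linalg.svd(Xc.T @ Yc)          # H = Xc^T Yc
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = Y.mean(0) - R @ X.mean(0)
    return R, t

# Recover a known pose from four synthetic skin markers (illustrative data).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
th = 0.3
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
t_true = np.array([1.0, 2.0, 3.0])
Y = X @ R_true.T + t_true                        # Y_i = R_true X_i + t_true
R, t = procrustes_rigid(X, Y)
print(np.allclose(R, R_true), np.allclose(t, t_true))  # True True
```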
Improved Rosetta Pedotransfer Estimation of Hydraulic Properties and Their Covariance
NASA Astrophysics Data System (ADS)
Zhang, Y.; Schaap, M. G.
2014-12-01
Quantitative knowledge of the soil hydraulic properties is necessary for most studies involving water flow and solute transport in the vadose zone. However, it is expensive, difficult, and time consuming to measure hydraulic properties directly, so pedotransfer functions (PTFs) have been widely used to forecast soil hydraulic parameters. Rosetta is one of many PTFs; it is based on artificial neural network analysis coupled with the bootstrap sampling method, and it provides hierarchical PTFs for different levels of input data (H1-H5 models, with higher-order models requiring more input variables). The original Rosetta model consists of separate PTFs for the four "van Genuchten" (VG) water retention parameters and saturated hydraulic conductivity (Ks) because different numbers of samples were available for these characteristics. In this study, we present an improved Rosetta pedotransfer function that uses a single model for all five parameters combined; these parameters are weighted for each sample individually using the covariance matrix obtained from the curve-fit of the VG parameters to the primary data. The optimal number of hidden nodes, the weights for saturated hydraulic conductivity and water retention parameters in the neural network, and the number of bootstrap realizations were selected. Results show that the root mean square error (RMSE) for water retention decreased from 0.076 to 0.072 cm3/cm3 for the H2 model and from 0.044 to 0.039 cm3/cm3 for the H5 model. Mean errors, which indicate matric potential-dependent bias, were also reduced significantly in the new model. The RMSE for Ks increased slightly (H2: 0.717 to 0.722; H5: 0.581 to 0.594); this increase is minimal and a result of using a single model for water retention and Ks. Despite this small increase, the new model is recommended because of its improved estimation of water retention, and because it is now possible to calculate the full covariance matrix of soil water retention
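The van Genuchten retention curve whose parameters Rosetta predicts can be evaluated directly; the parameter values below are illustrative loam-like numbers, not Rosetta output:

```python
import numpy as np

def van_genuchten_theta(h, theta_r, theta_s, alpha, n):
    """van Genuchten water retention curve: volumetric water content at
    suction head h (h > 0, in the same length units as 1/alpha)."""
    m = 1.0 - 1.0 / n
    se = (1.0 + (alpha * np.abs(h)) ** n) ** (-m)   # effective saturation
    return theta_r + (theta_s - theta_r) * se

# Illustrative loam-like parameters (assumed, not fitted values):
# theta_r, theta_s in cm3/cm3, alpha in 1/cm, h in cm.
theta = van_genuchten_theta(100.0, theta_r=0.078, theta_s=0.43,
                            alpha=0.036, n=1.56)
print(round(float(theta), 3))
```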
Paireau, Juliette; Girond, Florian; Collard, Jean-Marc; Maïnassara, Halima B.; Jusot, Jean-François
2012-01-01
Background: Meningococcal meningitis is a major health problem in the “African Meningitis Belt” where recurrent epidemics occur during the hot, dry season. In Niger, a central country belonging to the Meningitis Belt, reported meningitis cases varied between 1,000 and 13,000 from 2003 to 2009, with a case-fatality rate of 5–15%. Methodology/Principal Findings: In order to gain insight into the epidemiology of meningococcal meningitis in Niger and to improve control strategies, the emergence of the epidemics and their diffusion patterns at a fine spatial scale have been investigated. A statistical analysis of the spatio-temporal distribution of confirmed meningococcal meningitis cases was performed between 2002 and 2009, based on health centre catchment areas (HCCAs) as spatial units. Anselin's local Moran's I test for spatial autocorrelation and Kulldorff's spatial scan statistic were used to identify spatial and spatio-temporal clusters of cases. Spatial clusters were detected every year and most frequently occurred within nine southern districts. Clusters most often encompassed a few HCCAs within a district, without expanding to the entire district. In addition, strong intra-district heterogeneity and inter-annual variability in the spatio-temporal epidemic patterns were observed. To further investigate the benefit of using a finer spatial scale for surveillance and disease control, we compared the timeliness of epidemic detection at the HCCA level versus the district level and showed that a decision based on a threshold estimated at the HCCA level may lead to earlier detection of outbreaks. Conclusions/Significance: Our findings provide an evidence-based approach to improve control of meningitis in sub-Saharan Africa. First, they can assist public health authorities in Niger to better adjust allocation of resources (antibiotics, rapid diagnostic tests and medical staff). Then, this spatio-temporal analysis showed that surveillance at a finer spatial scale (HCCA) would be more
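The local spatial-autocorrelation step (Anselin's local Moran's I) can be sketched as follows; the toy incidence values and the adjacent-neighbour weight matrix are illustrative, not the Niger data:

```python
import numpy as np

def local_morans_i(x, W):
    """Anselin's local Moran's I for values x over spatial units with
    row-standardized weights W; a large positive I_i where x_i is high
    flags membership in a high-incidence cluster."""
    z = x - x.mean()
    m2 = (z ** 2).sum() / len(x)
    return z * (W @ z) / m2

# Five areas on a line with adjacent-neighbour weights (toy incidence data):
x = np.array([1.0, 1.0, 1.0, 8.0, 9.0])
W = np.zeros((5, 5))
for i in range(5):
    for j in (i - 1, i + 1):
        if 0 <= j < 5:
            W[i, j] = 1.0
W /= W.sum(axis=1, keepdims=True)    # row-standardize
I = local_morans_i(x, W)
print(bool(I[3] > 0 and I[4] > 0))   # True: the two high areas form a cluster
```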
Gorfine, Malka; Bordo, Nadia; Hsu, Li
2017-01-01
Summary: Consider a popular case-control family study where individuals with a disease under study (case probands) and individuals who do not have the disease (control probands) are randomly sampled from a well-defined population. Possibly right-censored age at onset and disease status are observed for both probands and their relatives. For example, case probands are men diagnosed with prostate cancer, control probands are men free of prostate cancer, and the prostate cancer history of the fathers of the probands is also collected. Inherited genetic susceptibility, shared environment, and common behavior lead to correlation among the outcomes within a family. In this article, a novel nonparametric estimator of the marginal survival function is provided. The estimator is defined in the presence of intra-cluster dependence, and is based on consistent smoothed kernel estimators of conditional survival functions. By simulation, it is shown that the proposed estimator performs very well in terms of bias. The utility of the estimator is illustrated by the analysis of case-control family data of early onset prostate cancer. To our knowledge, this is the first article that provides a fully nonparametric marginal survival estimator based on case-control clustered age-at-onset data.
Improving statistical keyword detection in short texts: Entropic and clustering approaches
NASA Astrophysics Data System (ADS)
Carretero-Campos, C.; Bernaola-Galván, P.; Coronado, A. V.; Carpena, P.
2013-03-01
In recent years, two successful approaches have been introduced to tackle the problem of statistical keyword detection in a text without the use of external information: (i) the entropic approach, where Shannon's entropy of information is used to quantify the information content of the sequence of occurrences of each word in the text; and (ii) the clustering approach, which links the heterogeneity of the spatial distribution of a word in the text (clustering) with its relevance. In this paper, we first present some modifications to both techniques which improve their results. Then, we propose new metrics to evaluate the performance of keyword detectors based specifically on the needs of a typical user, and we employ them to find out which approach performs better. Although both approaches work well in long texts, we find that, in general, measures based on word clustering perform at least as well as the entropic measure, which requires a convenient partition of the text (such as the chapters of a book) to be applied. For the entropic approach we also show that the chosen partition of the text strongly affects the results. Finally, we focus on short texts, a case of high practical importance, such as short reports, web pages, scientific articles, etc. We show that the performance of word-clustering measures is also good in generic short texts since these measures are able to discriminate better the degree of relevance of low frequency words than the entropic approach.
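The entropic approach described above can be sketched as a normalized Shannon entropy over the occurrence counts of a word across a partition of the text; the counts are illustrative:

```python
from math import log

def occurrence_entropy(counts):
    """Normalized Shannon entropy (base 2) of a word's occurrence counts
    across the P parts of a text: near 1 for evenly spread (generic) words,
    near 0 for words clustered in few parts (keyword candidates)."""
    total, P = sum(counts), len(counts)
    probs = [c / total for c in counts if c > 0]
    H = sum(-p * log(p, 2) for p in probs)
    return H / log(P, 2)

# Illustrative counts over four chapters of a text:
print(round(occurrence_entropy([5, 5, 5, 5]), 6))   # -> 1.0 (evenly spread)
print(round(occurrence_entropy([20, 0, 0, 0]), 6))  # -> 0.0 (fully clustered)
```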
NASA Astrophysics Data System (ADS)
Xi, Yakun; Zhang, Cheng
2017-03-01
We show that one can obtain improved L^4 geodesic restriction estimates for eigenfunctions on compact Riemannian surfaces with nonpositive curvature. We achieve this by adapting Sogge's strategy in (Improved critical eigenfunction estimates on manifolds of nonpositive curvature, Preprint). We first combine the improved L^2 restriction estimate of Blair and Sogge (Concerning Toponogov's Theorem and logarithmic improvement of estimates of eigenfunctions, Preprint) and the classical improved L^∞ estimate of Bérard to obtain an improved weak-type L^4 restriction estimate. We then upgrade this weak estimate to a strong one by using the improved Lorentz space estimate of Bak and Seeger (Math Res Lett 18(4):767-781, 2011). This estimate improves the L^4 restriction estimate of Burq et al. (Duke Math J 138:445-486, 2007) and Hu (Forum Math 6:1021-1052, 2009) by a power of (log log λ)^{-1}. Moreover, in the case of compact hyperbolic surfaces, we obtain further improvements in terms of (log λ)^{-1} by applying the ideas from (Chen and Sogge, Commun Math Phys 329(3):435-459, 2014) and (Blair and Sogge, Concerning Toponogov's Theorem and logarithmic improvement of estimates of eigenfunctions, Preprint). We are able to compute various constants that appeared in (Chen and Sogge, Commun Math Phys 329(3):435-459, 2014) explicitly, by proving detailed oscillatory integral estimates and lifting calculations to the universal cover H^2.
Adjusting for radiotelemetry error to improve estimates of habitat use.
Scott L. Findholt; Bruce K. Johnson; Lyman L. McDonald; John W. Kern; Alan Ager; Rosemary J. Stussy; Larry D. Bryant
2002-01-01
Animal locations estimated from radiotelemetry have traditionally been treated as error-free when analyzed in relation to habitat variables. Location error lowers the power of statistical tests of habitat selection. We describe a method that incorporates the error surrounding point estimates into measures of environmental variables determined from a geographic...
First estimates of the fundamental parameters of the very small open cluster Ruprecht 1
NASA Astrophysics Data System (ADS)
Piatti, A. E.; Clariá, J. J.; Parisi, M. C.; Ahumada, A. V.
New CCD observations with the Washington C and T1 filters in the field of the open cluster Ruprecht 1 are presented. The cluster turned out to be very small, its linear radius being 2.6 +/- 0.2 pc. Ruprecht 1 is moderately reddened [E(B-V) = 0.25] and moderately young (~ 230 Myr). Heliocentric distances of 1.9 +/- 0.4 kpc and 1.5 +/- 0.3 kpc are determined for Z (metallicity) = 0.02 and 0.08, respectively, although evidence is presented favouring a solar metal content rather than a subsolar one. We compare the cluster properties with those of known open clusters located within 1 kpc of it.
NASA Astrophysics Data System (ADS)
Priyatikanto, R.; Arifyanto, M. I.
2015-01-01
Stellar membership determination of an open cluster is an important process to carry out before further analysis. Basically, there are two classes of membership determination methods: parametric and non-parametric. In this study, an alternative non-parametric method based on Binned Kernel Density Estimation that accounts for measurement errors (called BKDE-e) is proposed. This method is applied to proper motion data to determine cluster membership kinematically and to estimate the average proper motion of the cluster. Monte Carlo simulations show that the average proper motion determination using this proposed method is statistically more accurate than the ordinary Kernel Density Estimator (KDE). By including measurement errors in the calculation, the mode location of the resulting density estimate is less sensitive to non-physical or stochastic fluctuations compared to ordinary KDE, which excludes measurement errors. For a typical mean measurement error of 7 mas/yr, BKDE-e suppresses the potential for miscalculation by a factor of two compared to KDE. With a median accuracy of about 93%, the BKDE-e method has comparable accuracy with respect to the parametric method (modified Sanders algorithm). Application to real data from The Fourth USNO CCD Astrograph Catalog (UCAC4), in particular to NGC 2682, is also performed. The mode of the member-star distribution on the Vector Point Diagram is located at μ_α cos δ = -9.94 ± 0.85 mas/yr and μ_δ = -4.92 ± 0.88 mas/yr. Although the performance of BKDE-e does not overtake the parametric approach, it offers a new way of performing membership analysis, expandable to astrometric and photometric data or even to binary cluster searches.
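The error-aware density-estimation idea can be sketched as a Gaussian KDE whose kernels are broadened by each star's measurement error; note this sketch omits the binning step of the published BKDE-e, and the proper-motion values are illustrative:

```python
import numpy as np

def kde_with_errors(grid, data, errors, h=1.0):
    """Gaussian KDE in which each point's kernel is broadened by its own
    measurement error, s_i^2 = h^2 + e_i^2; a simplified sketch of the
    error-aware idea (the published BKDE-e additionally bins the data)."""
    data, errors = np.asarray(data), np.asarray(errors)
    s = np.sqrt(h ** 2 + errors ** 2)               # per-star kernel widths
    z = (grid[:, None] - data[None, :]) / s
    return (np.exp(-0.5 * z ** 2) / (np.sqrt(2 * np.pi) * s)).mean(axis=1)

grid = np.linspace(-20.0, 20.0, 401)
pm = [-9.9, -10.2, -9.5, -10.0, 5.0]   # proper motions, mas/yr; one field star
err = [1.0, 1.0, 1.0, 1.0, 7.0]        # the field star is poorly measured
dens = kde_with_errors(grid, pm, err, h=2.0)
mode = float(grid[np.argmax(dens)])    # estimate of the cluster's mean motion
print(round(mode, 1))
```

Because the poorly measured field star gets a wide, low kernel, it barely perturbs the mode, which stays near the cluster's true motion.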
Improving Estimation Accuracy of Aggregate Queries on Data Cubes
Pourabbas, Elaheh; Shoshani, Arie
2008-08-15
In this paper, we investigate the problem of estimating a target database from summary databases derived from a base data cube. We show that such estimates can be derived by choosing a primary database which uses a proxy database to estimate the results. This technique is common in statistics, but an important issue we address is the accuracy of these estimates. Specifically, given multiple primary and multiple proxy databases that share the same summary measure, the problem is how to select the primary and proxy databases that will generate the most accurate target database estimation possible. We propose an algorithmic approach for determining the steps to select or compute the source databases from multiple summary databases, which makes use of the principles of information entropy. We show that the source databases with the largest number of cells in common provide the most accurate estimates, and we prove that this is consistent with maximizing the entropy. We provide some experimental results on the accuracy of the target database estimation in order to verify our results.
Using Smartphone Sensors for Improving Energy Expenditure Estimation
Zhu, Jindan; Das, Aveek K.; Zeng, Yunze; Mohapatra, Prasant; Han, Jay J.
2015-01-01
Energy expenditure (EE) estimation is an important factor in tracking personal activity and preventing chronic diseases, such as obesity and diabetes. Accurate and real-time EE estimation utilizing small wearable sensors is a difficult task, primarily because most existing schemes work offline or use heuristics. In this paper, we focus on accurate EE estimation for tracking ambulatory activities (walking, standing, climbing upstairs, or downstairs) of a typical smartphone user. We used built-in smartphone sensors (accelerometer and barometer sensor), sampled at low frequency, to accurately estimate EE. Using a barometer sensor, in addition to an accelerometer sensor, greatly increases the accuracy of EE estimation. Using bagged regression trees, a machine learning technique, we developed a generic regression model for EE estimation that yields up to 96% correlation with actual EE. We compare our results against the state-of-the-art calorimetry equations and consumer electronics devices (Fitbit and Nike+ FuelBand). The newly developed EE estimation algorithm demonstrated superior accuracy compared with currently available methods. The results were calibrated against COSMED K4b2 calorimeter readings. PMID:27170901
Using Smartphone Sensors for Improving Energy Expenditure Estimation.
Pande, Amit; Zhu, Jindan; Das, Aveek K; Zeng, Yunze; Mohapatra, Prasant; Han, Jay J
2015-01-01
Energy expenditure (EE) estimation is an important factor in tracking personal activity and preventing chronic diseases, such as obesity and diabetes. Accurate and real-time EE estimation utilizing small wearable sensors is a difficult task, primarily because most existing schemes work offline or use heuristics. In this paper, we focus on accurate EE estimation for tracking ambulatory activities (walking, standing, climbing upstairs, or downstairs) of a typical smartphone user. We used built-in smartphone sensors (accelerometer and barometer sensor), sampled at low frequency, to accurately estimate EE. Using a barometer sensor, in addition to an accelerometer sensor, greatly increases the accuracy of EE estimation. Using bagged regression trees, a machine learning technique, we developed a generic regression model for EE estimation that yields up to 96% correlation with actual EE. We compare our results against the state-of-the-art calorimetry equations and consumer electronics devices (Fitbit and Nike+ FuelBand). The newly developed EE estimation algorithm demonstrated superior accuracy compared with currently available methods. The results were calibrated against COSMED K4b2 calorimeter readings.
NASA Astrophysics Data System (ADS)
Wahyudi; Notodiputro, Khairil Anwar; Kurnia, Anang; Anisa, Rahma
2016-02-01
Empirical Best Linear Unbiased Prediction (EBLUP) is an indirect estimation method used to estimate parameters of small areas. EBLUP works by using area-level auxiliary variables while adding area random effects. In estimating non-sampled areas, the standard EBLUP can no longer be used because there is no information on the area random effects. To obtain a more appropriate estimation method for non-sampled areas, the standard EBLUP model has to be modified by adding cluster information. The aim of this research was to study, by means of simulation, whether clustering methods using factor analysis provide better cluster information. The criterion used to evaluate the goodness of fit of the methods in the simulation study was the mean percentage of clustering accuracy. The results of the simulation study showed that the use of factor analysis in clustering increased the average percentage of accuracy, particularly when using Ward's method. The method was then used to estimate per capita expenditures based on Small Area Estimation (SAE) techniques, applied to SUSENAS data, with the quality of the estimates measured by RMSE. This research has shown that the modified EBLUP model with factor analysis provided better estimates than the standard EBLUP model and the modified EBLUP without factor analysis. Moreover, it was also shown that cluster information is important in estimating non-sampled areas.
Li, Chunming; Xu, Chenyang; Anderson, Adam W; Gore, John C
2009-01-01
This paper presents a new energy minimization method for simultaneous tissue classification and bias field estimation of magnetic resonance (MR) images. We first derive an important characteristic of local image intensities--the intensities of different tissues within a neighborhood form separable clusters, and the center of each cluster can be well approximated by the product of the bias within the neighborhood and a tissue-dependent constant. We then introduce a coherent local intensity clustering (CLIC) criterion function as a metric to evaluate tissue classification and bias field estimation. An integration of this metric defines an energy on a bias field, membership functions of the tissues, and the parameters that approximate the true signal from the corresponding tissues. Thus, tissue classification and bias field estimation are simultaneously achieved by minimizing this energy. The smoothness of the derived optimal bias field is ensured by the spatially coherent nature of the CLIC criterion function. As a result, no extra effort is needed to smooth the bias field in our method. Moreover, the proposed algorithm is robust to the choice of initial conditions, thereby allowing fully automatic applications. Our algorithm has been applied to high field and ultra high field MR images with promising results.
NASA Astrophysics Data System (ADS)
Dorling, S. R.; Davies, T. D.; Pierce, C. E.
Combining the pattern recognition capabilities of cluster analysis with isobaric air trajectory data is a useful way of quantifying the influence of synoptic meteorology on the pollution climatology at a site. A non-hierarchical clustering of 1000 mb isobaric trajectories, using squared Euclidean distance as a similarity measure, leads to the identification of a finite number of distinct synoptic patterns. Typical airborne and aqueous pollutant concentrations associated with each of these patterns may then be established. By considering 3-day air trajectories in this study, the "history" of an air parcel is captured in an improved manner, when compared with attempts to use individual day weather "types" to characterize meteorological situations.
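The non-hierarchical clustering step can be sketched as plain k-means on flattened trajectory coordinates, using squared Euclidean distance as the similarity measure; the toy trajectories, the two flow regimes, and the seeding scheme are illustrative assumptions:

```python
import numpy as np

def kmeans(X, centers, iters=20):
    """Plain (non-hierarchical) k-means with squared Euclidean distance,
    the similarity measure used to group trajectories into synoptic types."""
    centers = np.asarray(centers, dtype=float)
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        centers = np.array([X[labels == j].mean(0)
                            for j in range(len(centers))])
    return labels, centers

# Toy "3-day trajectories" flattened to (lon, lat) waypoints; the westerly
# and northerly regimes and all coordinates are synthetic.
rng = np.random.default_rng(1)
west = rng.normal([-30, 50, -20, 51, -10, 52], 1.0, size=(20, 6))
north = rng.normal([5, 70, 4, 62, 3, 55], 1.0, size=(20, 6))
X = np.vstack([west, north])
labels, _ = kmeans(X, [X[0], X[-1]])   # seed one centre per regime
# Each synthetic regime ends up in its own cluster:
print(sorted({int(l) for l in labels[:20]}), sorted({int(l) for l in labels[20:]}))
```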
Zheng, Jian-Ting; Wang, Sheng-Lan; Yang, Ke-Qian
2007-09-01
Streptomyces venezuelae ISP5230 produces a group of jadomycin congeners with cytotoxic activities. To improve the jadomycin fermentation process, a genetic engineering strategy was designed to replace a 3.4-kb regulatory region of the jad gene cluster that contains four regulatory genes (the 3' end 272 bp of jadW2, jadW3, jadR2, and jadR1) and the native promoter upstream of jadJ (P(J)) with the ermEp* promoter sequence, so that ermEp* drives the expression of the jadomycin biosynthetic genes from jadJ in the engineered strain. As expected, the mutant strain produced jadomycin B without ethanol treatment, and the yield increased to about twofold that of the stressed wild-type. These results indicated that manipulation of the regulation of a biosynthetic gene cluster is an effective strategy to increase product yield.
Improved estimation of random vibration loads in launch vehicles
NASA Technical Reports Server (NTRS)
Mehta, R.; Erwin, E.; Suryanarayan, S.; Krishna, Murali M. R.
1993-01-01
Random vibration induced load is an important component of the total design load environment for payload and launch vehicle components and their support structures. The current approach to random vibration load estimation is based, particularly at the preliminary design stage, on the use of Miles' equation, which assumes a single degree-of-freedom (DOF) system and white noise excitation. This paper examines the implications of the use of multi-DOF system models and response calculation based on numerical integration using the actual excitation spectra for random vibration load estimation. The analytical study presented considers a two-DOF system and brings out the effects of modal mass, damping and frequency ratios on the random vibration load factor. The results indicate that load estimates based on Miles' equation can be significantly different from the more accurate estimates based on multi-DOF models.
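Miles' equation itself is compact enough to sketch; the component frequency, amplification Q, and input spectral density below are illustrative preliminary-design numbers, not values from the paper:

```python
import math

def miles_grms(fn_hz, q, asd_g2_per_hz):
    """Miles' equation: RMS acceleration response of a single-DOF oscillator
    with natural frequency fn and amplification Q, driven by a flat (white)
    acceleration spectral density evaluated at fn, in g^2/Hz."""
    return math.sqrt(math.pi / 2.0 * fn_hz * q * asd_g2_per_hz)

# Illustrative numbers: a 100 Hz component with Q = 10 on a 0.04 g^2/Hz
# random vibration environment.
print(round(miles_grms(100.0, 10.0, 0.04), 2))  # -> 7.93 (g RMS)
```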
Bayesian fusion algorithm for improved oscillometric blood pressure estimation.
Forouzanfar, Mohamad; Dajani, Hilmi R; Groza, Voicu Z; Bolic, Miodrag; Rajan, Sreeraman; Batkin, Izmail
2016-11-01
A variety of oscillometric algorithms have recently been proposed in the literature for estimation of blood pressure (BP). However, these algorithms possess specific strengths and weaknesses that should be taken into account before selecting the most appropriate one. In this paper, we propose a fusion method to exploit the advantages of the oscillometric algorithms and circumvent their limitations. The proposed fusion method is based on the computation of the weighted arithmetic mean of the oscillometric algorithms' estimates, with the weights obtained using a Bayesian approach by minimizing the mean square error. The proposed approach is used to fuse four different oscillometric blood pressure estimation algorithms. The performance of the proposed method is evaluated on a pilot dataset of 150 oscillometric recordings from 10 subjects. It is found that, relative to the individual estimation algorithms, the mean error and standard deviation of error are reduced by up to 7 mmHg and 3 mmHg in estimation of systolic pressure, respectively, and by up to 2 mmHg and 3 mmHg in estimation of diastolic pressure, respectively.
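The weighted-mean fusion idea can be sketched with inverse-MSE weights, a minimal stand-in for the paper's Bayesian weight derivation; the readings and MSE values are hypothetical:

```python
def fuse_estimates(estimates, mse):
    """Weighted arithmetic mean with weights inversely proportional to each
    algorithm's mean square error; a simplified stand-in for the paper's
    Bayesian minimum-MSE weighting."""
    w = [1.0 / m for m in mse]
    return sum(wi * ei for wi, ei in zip(w, estimates)) / sum(w)

# Four hypothetical systolic estimates (mmHg) with per-algorithm MSEs:
# the low-MSE algorithms dominate the fused value.
print(round(fuse_estimates([118.0, 122.0, 120.0, 130.0],
                           [4.0, 4.0, 2.0, 16.0]), 1))  # -> 120.6
```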
NASA Astrophysics Data System (ADS)
Schaan, Emmanuel; Takada, Masahiro; Spergel, David N.
2014-12-01
Naive estimates of the statistics of large-scale structure and weak lensing power spectrum measurements that include only Gaussian errors exaggerate their scientific impact. Nonlinear evolution and finite-volume effects are both significant sources of non-Gaussian covariance that reduce the ability of power spectrum measurements to constrain cosmological parameters. Using a halo model formalism, we derive an intuitive understanding of the various contributions to the covariance and show that our analytical treatment agrees with simulations. This approach enables an approximate derivation of a joint likelihood for the cluster number counts, the weak lensing power spectrum and the bispectrum. We show that this likelihood is a good description of the ray-tracing simulation. Since all of these observables are sensitive to the same finite-volume effects and contain information about the nonlinear evolution, a combined analysis recovers much of the "lost" information. For upcoming weak lensing surveys, we estimate that a joint analysis of power spectrum, number counts and bispectrum will produce an improvement of about 30-40% in determinations of the matter density and the scalar amplitude. This improvement is equivalent to doubling the survey area.
Guan, Guijian; Zhang, Shuang-Yuan; Cai, Yongqing; Liu, Shuhua; Bharathi, M S; Low, Michelle; Yu, Yong; Xie, Jianping; Zheng, Yuangang; Zhang, Yong-Wei; Han, Ming-Yong
2014-06-01
An effective separation process is developed to remove free protein from the protein-protected gold clusters via co-precipitation with zinc hydroxide on their surface. After dialysis, the purified clusters exhibit an enhanced fluorescence for improved sensitive detection and selective visualization.
Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics
2016-01-01
Interpreting the potentially vast number of hypotheses generated by a shotgun proteomics experiment requires a valid and accurate procedure for assigning statistical confidence estimates to identified tandem mass spectra. Despite the crucial role such procedures play in most high-throughput proteomics experiments, the scientific literature has not reached a consensus about the best confidence estimation methodology. In this work, we evaluate, using theoretical and empirical analysis, four previously proposed protocols for estimating the false discovery rate (FDR) associated with a set of identified tandem mass spectra: two variants of the target-decoy competition protocol (TDC) of Elias and Gygi and two variants of the separate target-decoy search protocol of Käll et al. Our analysis reveals significant biases in the two separate target-decoy search protocols. Moreover, the one TDC protocol that provides an unbiased FDR estimate among the target PSMs does so at the cost of forfeiting a random subset of high-scoring spectrum identifications. We therefore propose the mix-max procedure to provide unbiased, accurate FDR estimates in the presence of well-calibrated scores. The method avoids biases associated with the two separate target-decoy search protocols and also avoids the propensity for target-decoy competition to discard a random subset of high-scoring target identifications. PMID:26152888
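The TDC estimate discussed above has a compact form: count decoys and targets above a score threshold, with a +1 in the numerator that keeps the estimate conservative; the PSM scores below are illustrative:

```python
def tdc_fdr(target_scores, decoy_scores, threshold):
    """Target-decoy competition FDR estimate at a score threshold:
    (# decoys above threshold + 1) / (# targets above threshold).
    Scores are assumed 'higher is better'."""
    t = sum(s >= threshold for s in target_scores)
    d = sum(s >= threshold for s in decoy_scores)
    return (d + 1) / t if t else 0.0

# 100 accepted target PSMs competing against 4 accepted decoys (toy scores):
targets = [1.0] * 100
decoys = [1.0] * 4 + [0.0] * 96
print(tdc_fdr(targets, decoys, threshold=0.5))  # -> 0.05
```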
NASA Astrophysics Data System (ADS)
Muda, Nora; Othman, Abdul Rahman
2015-10-01
The process of grouping a set of objects into classes of similar objects is called clustering. It divides a large set of observations into smaller groups so that observations within each group are relatively similar and observations in different groups are relatively dissimilar. In this study, an agglomerative method in hierarchical cluster analysis is chosen and clusters are constructed using the average linkage technique. Average linkage requires a distance between clusters, calculated as the average distance between all pairs of points, one from each group. This average distance is not robust when there is an outlier. Therefore, the average distance in average linkage needs to be modified to overcome the outlier problem: outlier detection based on the MADn criterion is applied and the average distance is recalculated without the outliers. The distance in average linkage is then calculated based on a modified one-step M-estimator (MOM). The resulting clusters are presented in a dendrogram. To evaluate the goodness of the modified distance in average linkage clustering, bootstrap analysis is conducted on the dendrogram and bootstrap values (BP) are assessed for each branch that forms a group, to ensure the reliability of the branches constructed. This study found that the average linkage technique with the modified distance is significantly superior to the usual average linkage technique when an outlier is present; the two techniques perform similarly when there is no outlier.
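The trimming step described above can be sketched as follows. This is an illustrative reading for 1-D data, assuming the MADn criterion with the conventional 1.4826 consistency constant and a 2.24 cutoff; the paper's exact MOM formulation may differ:

```python
import statistics

def madn(values):
    """Normalized median absolute deviation (consistent for a normal)."""
    med = statistics.median(values)
    return 1.4826 * statistics.median([abs(v - med) for v in values])

def trimmed_average_distance(group_a, group_b, cutoff=2.24):
    """Average-linkage distance with MADn-flagged outlier pairs removed."""
    d = [abs(a - b) for a in group_a for b in group_b]  # all pairs (1-D data)
    med, s = statistics.median(d), madn(d)
    if s == 0:                       # degenerate spread: fall back to median
        return med
    kept = [x for x in d if abs(x - med) <= cutoff * s]
    return sum(kept) / len(kept)
```

With the pair distances [1, 2, 3, 100], the 100 is flagged and the linkage distance becomes 2.0 instead of 26.5.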
Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios Gruson, Martha; Baechler, Sébastien
2015-09-01
According to estimates, around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). The automated classification groups lithological units well in terms of their IRC characteristics. Especially the IRC differences in metamorphic rocks like gneiss are well revealed by this method. The maps produced by random forests soundly represent the regional differences of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variation in IRC data with random forests. Additionally, variable importance evaluated with random forests shows that building characteristics are less important predictors for IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables like soil gas radon measurements as
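The clustering step described above can be sketched as follows. `ks_distance` and the naive PAM-style `k_medoids` loop are illustrative stand-ins for the authors' exact implementation, applied here to toy "units", each represented by a sample of measurements:

```python
import numpy as np

def ks_distance(x, y):
    """Two-sample Kolmogorov distance: max gap between empirical CDFs."""
    grid = np.sort(np.concatenate([x, y]))
    cdf_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return float(np.max(np.abs(cdf_x - cdf_y)))

def k_medoids(dist, k, n_iter=50, seed=0):
    """Naive k-medoids on a precomputed distance matrix."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(dist.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new = []
        for c in range(k):
            members = np.where(labels == c)[0]
            within = dist[np.ix_(members, members)].sum(axis=1)
            new.append(members[np.argmin(within)])   # most central member
        new = np.array(new)
        if np.array_equal(np.sort(new), np.sort(medoids)):
            break
        medoids = new
    return medoids, np.argmin(dist[:, medoids], axis=1)

# usage: four toy units, two drawn from each of two distinct distributions
units = [np.array([0.0, 1.0, 2.0]), np.array([0.5, 1.5, 2.5]),
         np.array([10.0, 11.0, 12.0]), np.array([10.5, 11.5, 12.5])]
D = np.array([[ks_distance(a, b) for b in units] for a in units])
medoids, labels = k_medoids(D, k=2)
```

Because the distance is between whole distributions rather than summary statistics, units with similar IRC shapes group together regardless of sample size.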
RSQRT: AN HEURISTIC FOR ESTIMATING THE NUMBER OF CLUSTERS TO REPORT
Bruso, Kelsey
2012-01-01
Clustering can be a valuable tool for analyzing large datasets, such as in e-commerce applications. Anyone who clusters must choose how many item clusters, K, to report. Unfortunately, one must guess at K or some related parameter. Elsewhere we introduced a strongly-supported heuristic, RSQRT, which predicts K as a function of the attribute or item count, depending on attribute scales. We conducted a second analysis where we sought confirmation of the heuristic, analyzing data sets from the UCI machine learning benchmark repository. For the 25 studies where sufficient detail was available, we again found strong support. Also, in a side-by-side comparison of 28 studies, the RSQRT-predicted K and the Bayesian information criterion (BIC)-predicted K are the same. RSQRT has a lower cost of O(log log n) versus O(n²) for BIC, and is more widely applicable. Using RSQRT prospectively could be much better than merely guessing. PMID:22773923
Veatch, Sarah L.; Machta, Benjamin B.; Shelby, Sarah A.; Chiang, Ethan N.; Holowka, David A.; Baird, Barbara A.
2012-01-01
We present an analytical method using correlation functions to quantify clustering in super-resolution fluorescence localization images and electron microscopy images of static surfaces in two dimensions. We use this method to quantify how over-counting of labeled molecules contributes to apparent self-clustering and to calculate the effective lateral resolution of an image. This treatment applies to distributions of proteins and lipids in cell membranes, where there is significant interest in using electron microscopy and super-resolution fluorescence localization techniques to probe membrane heterogeneity. When images are quantified using pair auto-correlation functions, the magnitude of apparent clustering arising from over-counting varies inversely with the surface density of labeled molecules and does not depend on the number of times an average molecule is counted. In contrast, we demonstrate that over-counting does not give rise to apparent co-clustering in double label experiments when pair cross-correlation functions are measured. We apply our analytical method to quantify the distribution of the IgE receptor (FcεRI) on the plasma membranes of chemically fixed RBL-2H3 mast cells from images acquired using stochastic optical reconstruction microscopy (STORM/dSTORM) and scanning electron microscopy (SEM). We find that apparent clustering of FcεRI-bound IgE is dominated by over-counting labels on individual complexes when IgE is directly conjugated to organic fluorophores. We verify this observation by measuring pair cross-correlation functions between two distinguishably labeled pools of IgE-FcεRI on the cell surface using both imaging methods. After correcting for over-counting, we observe weak but significant self-clustering of IgE-FcεRI in fluorescence localization measurements, and no residual self-clustering as detected with SEM. We also apply this method to quantify IgE-FcεRI redistribution after deliberate clustering by crosslinking with two
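A radially averaged pair auto-correlation estimator of the kind described can be sketched as follows. Edge corrections are omitted for brevity, so it is only reliable for r small relative to the field of view; for an ideal random (uncorrelated) distribution, g(r) ≈ 1 at all radii:

```python
import numpy as np

def pair_autocorrelation(points, box, r_edges):
    """Radially averaged pair auto-correlation g(r) for 2-D localizations.

    points : (N, 2) positions inside a box-by-box field of view.
    Edge corrections are omitted, so keep max(r_edges) << box.
    """
    n = len(points)
    density = n / box ** 2
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d = d[np.triu_indices(n, k=1)]                   # unique pairs only
    counts, _ = np.histogram(d, bins=r_edges)
    shell_area = np.pi * (r_edges[1:] ** 2 - r_edges[:-1] ** 2)
    expected = 0.5 * n * density * shell_area        # pairs if uncorrelated
    return counts / expected

# usage: a completely random field should give g(r) close to 1 at all r
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 100.0, size=(2000, 2))
g = pair_autocorrelation(pts, box=100.0, r_edges=np.linspace(0.5, 5.0, 10))
```

Over-counting of labels would appear here as an excess in g(r) at radii comparable to the localization precision, with amplitude scaling inversely with the surface density, as the abstract describes.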
The report discusses an EPA investigation of techniques to improve methods for estimating volatile organic compound (VOC) emissions from area sources. Using the automobile refinishing industry for a detailed area source case study, an emission estimation method is being developed...
Improving care of patients with diabetes and CKD: a pilot study for a cluster-randomized trial.
Cortés-Sanabria, Laura; Cabrera-Pivaral, Carlos E; Cueto-Manzano, Alfonso M; Rojas-Campos, Enrique; Barragán, Graciela; Hernández-Anaya, Moisés; Martínez-Ramírez, Héctor R
2008-05-01
Family physicians may have the main role in managing patients with type 2 diabetes mellitus with early nephropathy. It is therefore important to determine the clinical competence of family physicians in preserving renal function of patients. The aim of this study is to evaluate the effect of an educational intervention on family physicians' clinical competence and subsequently determine the impact on kidney function of their patients with type 2 diabetes mellitus. Pilot study for a cluster-randomized trial. Primary health care units of the Mexican Institute of Social Security, Guadalajara, Mexico. The study group was composed of 21 family physicians from 1 unit and a control group of 19 family physicians from another unit. 46 patients treated by study physicians and 48 treated by control physicians also were evaluated. An educational strategy based on a participative model was used for 6 months in the study group. Allocation of units to receive or not receive the educational intervention was randomly established. Outcomes were the clinical competence of family physicians and the kidney function of their patients. To evaluate clinical competence, a validated questionnaire measuring family physicians' ability to identify risk factors, integrate diagnoses, and correctly use laboratory tests and therapeutic resources was applied to all physicians at the beginning and end of the educational intervention (0 and 6 months). In patients, serum creatinine level, estimated glomerular filtration rate, and albuminuria were evaluated at 0, 6, and 12 months. At the end of the intervention, more family physicians from the study group improved clinical competence (91%) compared with controls (37%; P = 0.001). Family physicians in the study group who increased their competence improved renal function significantly better than physicians in the same group who did not increase competence and physicians in the control group (with or without increase in competence): change in estimated glomerular filtration rate, 0
Distance-Learning, ADHD Quality Improvement in Primary Care: A Cluster-Randomized Trial.
Fiks, Alexander G; Mayne, Stephanie L; Michel, Jeremy J; Miller, Jeffrey; Abraham, Manju; Suh, Andrew; Jawad, Abbas F; Guevara, James P; Grundmeier, Robert W; Blum, Nathan J; Power, Thomas J
2017-10-01
To evaluate a distance-learning, quality improvement intervention to improve pediatric primary care provider use of attention-deficit/hyperactivity disorder (ADHD) rating scales. Primary care practices were cluster randomized to a 3-part distance-learning, quality improvement intervention (web-based education, collaborative consultation with ADHD experts, and performance feedback reports/calls), qualifying for Maintenance of Certification (MOC) Part IV credit, or wait-list control. We compared changes relative to a baseline period in rating scale use by study arm using logistic regression clustered by practice (primary analysis) and examined effect modification by level of clinician participation. An electronic health record-linked system for gathering ADHD rating scales from parents and teachers was implemented before the intervention period at all sites. Rating scale use was ascertained by manual chart review. One hundred five clinicians at 19 sites participated. Differences between arms were not significant. From the baseline to intervention period and after implementation of the electronic system, clinicians in both study arms were significantly more likely to administer and receive parent and teacher rating scales. Among intervention clinicians, those who participated in at least 1 feedback call or qualified for MOC credit were more likely to give parents rating scales with differences of 14.2 (95% confidence interval [CI], 0.6-27.7) and 18.8 (95% CI, 1.9-35.7) percentage points, respectively. A 3-part clinician-focused distance-learning, quality improvement intervention did not improve rating scale use. Complementary strategies that support workflows and more fully engage clinicians may be needed to bolster care. Electronic systems that gather rating scales may help achieve this goal. Index terms: ADHD, primary care, quality improvement, clinical decision support.
An improved scheduling algorithm for 3D cluster rendering with platform LSF
NASA Astrophysics Data System (ADS)
Xu, Wenli; Zhu, Yi; Zhang, Liping
2013-10-01
High-quality photorealistic rendering of 3D models requires powerful computing systems, and highly efficient management of cluster resources is developing rapidly to meet this demand. This paper addresses how to improve the efficiency of 3D rendering tasks in a cluster. It focuses on a dynamic feedback load balance (DFLB) algorithm, the working principles of the Load Sharing Facility (LSF), and optimization of the external scheduler plug-in. The algorithm applies to the match and allocation phases of a scheduling cycle: candidate hosts are prepared in sequence in the match phase, and the scheduler makes allocation decisions for each job in the allocation phase. With the dynamic feedback mechanism, a new weight is assigned to each candidate host for re-ranking, and the most suitable host is dispatched for rendering. A new plug-in module implementing this algorithm has been designed and integrated into the internal scheduler. Simulation experiments demonstrate that the improved plug-in module is superior to the default one for rendering tasks: it helps avoid load imbalance among servers, increases system throughput, and improves system utilization.
Separating duff and litter for improved mass and carbon estimates
David Chojnacky; Michael Amacher; Michael Gavazzi
2009-01-01
Mass and carbon load estimates, such as those from forest soil organic matter (duff and litter), inform forestry decisions. The US Forest Inventory and Analysis (FIA) Program systematically collects data nationwide: a down woody material protocol specifies discrete duff and litter depth measurements, and a soils protocol specifies mass and carbon of duff and litter...
Improved alternatives for estimating in-use material stocks.
Chen, Wei-Qiang; Graedel, T E
2015-03-03
Determinations of in-use material stocks are useful for exploring past patterns and future scenarios of materials use, for estimating end-of-life flows of materials, and thereby for guiding policies on recycling and sustainable management of materials. This is especially true when those determinations are conducted for individual products or product groups such as "automobiles" rather than general (and sometimes nebulous) sectors such as "transportation". We propose four alternatives to the existing top-down and bottom-up methods for estimating in-use material stocks, with the choice depending on the focus of the study and on the available data. We illustrate with aluminum use in automobiles the robustness of and consistencies and differences among these four alternatives and demonstrate that a suitable combination of the four methods permits estimation of the in-use stock of a material contained in all products employing that material, or in-use stocks of different materials contained in a particular product. Therefore, we anticipate the estimation in the future of in-use stocks for many materials in many products or product groups, for many regions, and for longer time periods, by taking advantage of methodologies that fully employ the detailed data sets now becoming available.
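The contrast between top-down (inflow-driven) and bottom-up (product-count) stock estimation can be sketched as follows. The cohort/survival formulation is a generic illustration of the two families of methods, not the authors' four specific alternatives:

```python
def top_down_stock(inflows, survival):
    """In-use stock in the final year from inflows and a survival curve.

    inflows[i]  : material entering use in year i (oldest cohort first).
    survival[a] : fraction of a cohort still in use at age a.
    Stock = sum over cohorts of inflow * survival(cohort age).
    """
    n = len(inflows)
    return sum(inflows[i] * survival[min(n - 1 - i, len(survival) - 1)]
               for i in range(n))

def bottom_up_stock(units_in_use, material_per_unit):
    """Bottom-up alternative: count products, multiply by material content."""
    return units_in_use * material_per_unit
```

Agreement between the two routes, as the abstract demonstrates for aluminum in automobiles, is a useful consistency check on both the inflow history and the assumed lifetimes.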
A novel ULA-based geometry for improving AOA estimation
NASA Astrophysics Data System (ADS)
Shirvani-Moghaddam, Shahriar; Akbari, Farida
2011-12-01
Due to relatively simple implementation, Uniform Linear Array (ULA) is a popular geometry for array signal processing. Despite this advantage, it does not have a uniform performance in all directions and Angle of Arrival (AOA) estimation performance degrades considerably in the angles close to endfire. In this article, a new configuration is proposed which can solve this problem. Proposed Array (PA) configuration adds two elements to the ULA in top and bottom of the array axis. By extending signal model of the ULA to the new proposed ULA-based array, AOA estimation performance has been compared in terms of angular accuracy and resolution threshold through two well-known AOA estimation algorithms, MUSIC and MVDR. In both algorithms, Root Mean Square Error (RMSE) of the detected angles descends as the input Signal to Noise Ratio (SNR) increases. Simulation results show that the proposed array geometry introduces uniform accurate performance and higher resolution in middle angles as well as border ones. The PA also presents less RMSE than the ULA in endfire directions. Therefore, the proposed array offers better performance for the border angles with almost the same array size and simplicity in both MUSIC and MVDR algorithms with respect to the conventional ULA. In addition, AOA estimation performance of the PA geometry is compared with two well-known 2D-array geometries: L-shape and V-shape, and acceptable results are obtained with equivalent or lower complexity.
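The MUSIC algorithm referenced above can be sketched for a conventional ULA as follows. This is a standard textbook formulation used as a baseline, not the proposed PA geometry; the simulation parameters are illustrative:

```python
import numpy as np

def music_spectrum(snapshots, n_sources, d=0.5,
                   angles=np.linspace(-90.0, 90.0, 361)):
    """MUSIC pseudospectrum for a ULA with element spacing d (wavelengths).

    snapshots : (n_elements, n_snapshots) complex array of received data.
    """
    m = snapshots.shape[0]
    # sample covariance and its eigendecomposition (ascending eigenvalues)
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]
    _, eigvec = np.linalg.eigh(R)
    En = eigvec[:, : m - n_sources]                  # noise subspace
    k = np.arange(m)[:, None]
    theta = np.deg2rad(angles)[None, :]
    A = np.exp(2j * np.pi * d * k * np.sin(theta))   # steering matrix
    # pseudospectrum: reciprocal of projection onto the noise subspace
    power = 1.0 / np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
    return angles, power

# usage: one narrowband source at 20 degrees, 8-element half-wavelength ULA
rng = np.random.default_rng(1)
m, n_snap = 8, 200
steer = np.exp(2j * np.pi * 0.5 * np.arange(m) * np.sin(np.deg2rad(20.0)))
sig = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
noise = 0.05 * (rng.standard_normal((m, n_snap))
                + 1j * rng.standard_normal((m, n_snap)))
angles, power = music_spectrum(np.outer(steer, sig) + noise, n_sources=1)
estimated_aoa = angles[np.argmax(power)]
```

Repeating this simulation with the source moved toward ±90° shows the endfire degradation the article sets out to fix: the steering vectors for neighboring angles become nearly parallel there, broadening the peak.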
USING COLORS TO IMPROVE PHOTOMETRIC METALLICITY ESTIMATES FOR GALAXIES
Sanders, N. E.; Soderberg, A. M.; Levesque, E. M.
2013-10-01
There is a well known correlation between the mass and metallicity of star-forming galaxies. Because mass is correlated with luminosity, this relation is often exploited, when spectroscopy is not available, to estimate galaxy metallicities based on single band photometry. However, we show that galaxy color is typically more effective than luminosity as a predictor of metallicity. This is a consequence of the correlation between color and the galaxy mass-to-light ratio and the recently discovered correlation between star formation rate (SFR) and residuals from the mass-metallicity relation. Using Sloan Digital Sky Survey spectroscopy of ∼180,000 nearby galaxies, we derive 'LZC relations', empirical relations between metallicity (in seven common strong line diagnostics), luminosity, and color (in 10 filter pairs and four methods of photometry). We show that these relations allow photometric metallicity estimates, based on luminosity and a single optical color, that are ∼50% more precise than those made based on luminosity alone; galaxy metallicity can be estimated to within ∼0.05-0.1 dex of the spectroscopically derived value depending on the diagnostic used. Including color information in photometric metallicity estimates also reduces systematic biases for populations skewed toward high or low SFR environments, as we illustrate using the host galaxy of the supernova SN 2010ay. This new tool will lend more statistical power to studies of galaxy populations, such as supernova and gamma-ray burst host environments, in ongoing and future wide-field imaging surveys.
Trellis Tension Monitoring Improves Yield Estimation in Vineyards
USDA-ARS?s Scientific Manuscript database
The preponderance of yield estimation practices for commercial vineyards is based on longstanding but individually variable industry protocols that rely on hand sampling fruit on one or a small number of dates during the growing season. Limitations associated with the static nature of yield estimati...
Improved bit error rate estimation over experimental optical wireless channels
NASA Astrophysics Data System (ADS)
El Tabach, Mamdouh; Saoudi, Samir; Tortelier, Patrick; Bouchet, Olivier; Pyndiah, Ramesh
2009-02-01
As a part of the EU-FP7 R&D programme, the OMEGA project (hOME Gigabit Access) aims at bridging the gap between wireless terminals and the wired backbone network in homes, providing high bit rate connectivity to users. Besides radio frequencies, the wireless links will use Optical Wireless (OW) communications. To guarantee high performance and quality of service in real-time, our system needs techniques to approximate the Bit Error Probability (BEP) with a reasonable training sequence. Traditionally, the BEP is approximated by the Bit Error Rate (BER) measured by counting the number of errors within a given sequence of bits. For small BERs, the required sequences are huge and may prevent real-time estimation. In this paper, methods to estimate the BER using Probability Density Function (PDF) estimation are presented. Two a posteriori techniques based on the Parzen estimator or a constrained Gram-Charlier series expansion are adapted and applied to OW communications. Aided by simulations, a comparison is done over experimental optical channels. We show that, for different scenarios, such as optical multipath distortion or a well designed Code Division Multiple Access (CDMA) system, this approach outperforms the counting method and yields better results with a relatively small training sequence.
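The Parzen-based idea can be sketched as follows: instead of counting sign errors, integrate a Gaussian kernel density estimate of the decision statistic over the error region, which converges with far fewer samples when errors are rare. The interface and bandwidth rule here are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np
from math import erf, sqrt

def ber_parzen(soft_values, h=None):
    """Parzen (Gaussian-kernel) estimate of the bit error probability.

    soft_values : decision statistics for bits known to be sent as +1,
    so an error corresponds to a statistic below 0.  Integrating each
    kernel over (-inf, 0] gives
    P(err) ~= mean over samples of Phi((0 - y_i) / h).
    """
    y = np.asarray(soft_values, dtype=float)
    if h is None:
        h = 1.06 * y.std() * len(y) ** (-0.2)   # Silverman's rule of thumb
    return float(np.mean([0.5 * (1.0 + erf(-v / (h * sqrt(2.0)))) for v in y]))

# usage: Gaussian channel with mean 2, unit noise -> true BEP = Q(2) ~ 2.3e-2
rng = np.random.default_rng(2)
est = ber_parzen(2.0 + rng.standard_normal(2000))
```

With 2000 samples a plain error count would average only ~45 errors, while the smoothed estimate uses every sample's distance from the decision boundary.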
Yoshida, Kenta; Maeda, Kazuya; Kusuhara, Hiroyuki; Konagaya, Akihiko
2013-10-16
To facilitate new drug development, physiologically-based pharmacokinetic (PBPK) modeling methods receive growing attention as a tool to fully understand and predict complex pharmacokinetic phenomena. As the number of parameters to reproduce physiological functions tends to be large in PBPK models, efficient parameter estimation methods are essential. We have successfully applied a recently developed algorithm to estimate a feasible solution space, called the Cluster Newton Method (CNM), to reveal the cause of irinotecan pharmacokinetic alterations in two cancer patient groups. After improvements in the original CNM algorithm to maintain parameter diversity, a feasible solution space was successfully estimated for 55 or 56 parameters in the irinotecan PBPK model, within ten iterations, 3000 virtual samples, and 15 minutes (Intel Xeon E5-1620 3.60GHz × 1 or Intel Core i7-870 2.93GHz × 1). Control parameters and parameter correlations were clarified after the parameter estimation processes. Possible causes of the irinotecan pharmacokinetic alterations were suggested, but they were not conclusive. Application of CNM achieved a feasible solution space by solving inverse problems of a system containing ordinary differential equations (ODEs). This method may give us reliable insights into other complicated phenomena, which have a large number of parameters to estimate, under limited information. It is also helpful to design prospective studies for further investigation of phenomena of interest.
Improved Heritability Estimation from Genome-wide SNPs
Speed, Doug; Hemani, Gibran; Johnson, Michael R.; Balding, David J.
2012-01-01
Estimation of narrow-sense heritability, h2, from genome-wide SNPs genotyped in unrelated individuals has recently attracted interest and offers several advantages over traditional pedigree-based methods. With the use of this approach, it has been estimated that over half the heritability of human height can be attributed to the ∼300,000 SNPs on a genome-wide genotyping array. In comparison, only 5%–10% can be explained by SNPs reaching genome-wide significance. We investigated via simulation the validity of several key assumptions underpinning the mixed-model analysis used in SNP-based h2 estimation. Although we found that the method is reasonably robust to violations of four key assumptions, it can be highly sensitive to uneven linkage disequilibrium (LD) between SNPs: contributions to h2 are overestimated from causal variants in regions of high LD and are underestimated in regions of low LD. The overall direction of the bias can be up or down depending on the genetic architecture of the trait, but it can be substantial in realistic scenarios. We propose a modified kinship matrix in which SNPs are weighted according to local LD. We show that this correction greatly reduces the bias and increases the precision of h2 estimates. We demonstrate the impact of our method on the first seven diseases studied by the Wellcome Trust Case Control Consortium. Our LD adjustment revises downward the h2 estimate for immune-related diseases, as expected because of high LD in the major-histocompatibility region, but increases it for some nonimmune diseases. To calculate our revised kinship matrix, we developed LDAK, software for computing LD-adjusted kinships. PMID:23217325
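The proposed correction amounts to computing a SNP-weighted kinship matrix. A minimal sketch, assuming standardized genotype columns and externally supplied per-SNP weights; the actual LDAK weighting scheme, which derives the weights from local LD, is more involved:

```python
import numpy as np

def weighted_kinship(genotypes, weights):
    """SNP-weighted kinship (GRM) matrix.

    genotypes : (n_individuals, n_snps) minor-allele counts in {0, 1, 2}.
    weights   : per-SNP weights, e.g. inversely related to local LD;
                uniform weights recover the standard unweighted GRM.
    """
    X = np.asarray(genotypes, dtype=float)
    p = X.mean(axis=0) / 2.0                          # allele frequencies
    Z = (X - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))  # standardize each SNP
    w = np.asarray(weights, dtype=float)
    return (Z * w) @ Z.T / w.sum()

# usage on a toy genotype matrix (4 individuals, 3 SNPs)
G = np.array([[0, 1, 2], [1, 2, 0], [2, 0, 1], [1, 1, 1]])
K = weighted_kinship(G, np.array([1.0, 2.0, 0.5]))
```

Down-weighting SNPs in high-LD regions shrinks their (otherwise duplicated) contribution to the kinship estimate, which is the mechanism behind the bias reduction reported above.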
Improving estimates of riverine fresh water into the Mediterranean sea
NASA Astrophysics Data System (ADS)
Wang, Fuxing; Polcher, Jan
2017-04-01
Estimating the freshwater input from the continents into the Mediterranean sea is a difficult endeavor due to uncertainties from un-gauged rivers, human activities, and measurement of water flow at river outlets. One approach to estimating the freshwater inflow into the Mediterranean sea is based on the observed flux (about 63% available) and a simple annual water balance for rivers without observations (ignoring human usage and other processes). This method is the basis of most water balance studies of the Mediterranean sea and oceanic modelling activities, but it only provides annual mean values under a very strong assumption. Another approach forces a state-of-the-art land surface model (LSM) with bias-corrected atmospheric conditions. This method can estimate the total fresh water flowing into the Mediterranean at daily scale, but with all the caveats associated with models. We use data assimilation techniques, merging the model output (the ORCHIDEE LSM developed at Institut Pierre Simon Laplace) with the observed river discharge from the Global Runoff Data Center (GRDC), to correct the modelled fluxes with observations over the entire basin. Over each sub-watershed, the GRDC data (if available) are applied to correct the model-simulated river discharge. This allows us to compensate for systematic model errors or missing processes and to provide estimates of the riverine input into the sea at high temporal and spatial resolution. We will analyze the freshwater inflow into the Mediterranean obtained here against the different approaches reported in previous papers. The new estimates will serve for ocean modelling and water balance studies of the region.
Improved Uncertainty Quantification in Groundwater Flux Estimation Using GRACE
NASA Astrophysics Data System (ADS)
Reager, J. T., II; Rao, P.; Famiglietti, J. S.; Turmon, M.
2015-12-01
Groundwater change is difficult to monitor over large scales. One of the most successful approaches is in the remote sensing of time-variable gravity using NASA Gravity Recovery and Climate Experiment (GRACE) mission data, and successful case studies have created the opportunity to move towards a global groundwater monitoring framework for the world's largest aquifers. To achieve these estimates, several approximations are applied, including those in GRACE processing corrections, the formulation of the formal GRACE errors, destriping and signal recovery, and the numerical model estimation of snow water, surface water and soil moisture storage states used to isolate a groundwater component. A major weakness in these approaches is inconsistency: different studies have used different sources of primary and ancillary data, and may achieve different results based on alternative choices in these approximations. In this study, we present two cases of groundwater change estimation in California and the Colorado River basin, selected for their good data availability and varied climates. We achieve a robust numerical estimate of post-processing uncertainties resulting from land-surface model structural shortcomings and model resolution errors. Groundwater variations should demonstrate less variability than the overlying soil moisture state does, as groundwater has a longer memory of past events due to buffering by infiltration and drainage rate limits. We apply a model ensemble approach in a Bayesian framework constrained by the assumption of decreasing signal variability with depth in the soil column. We also discuss time-variable vs. time-constant errors, across-scale vs. across-model errors, and error spectral content (across scales and across models). More robust uncertainty quantification for GRACE-based groundwater estimates would take all of these issues into account, allowing for fairer use in management applications and for better integration of GRACE
Sample Size Estimation in Cluster Randomized Educational Trials: An Empirical Bayes Approach
ERIC Educational Resources Information Center
Rotondi, Michael A.; Donner, Allan
2009-01-01
The educational field has now accumulated an extensive literature reporting on values of the intraclass correlation coefficient, a parameter essential to determining the required size of a planned cluster randomized trial. We propose here a simple simulation-based approach including all relevant information that can facilitate this task. An…
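The standard design-effect calculation that the intraclass correlation coefficient feeds into can be sketched as follows. The helper name and the two-arm split are illustrative, and the paper's empirical Bayes approach refines the choice of the ICC value rather than this textbook formula:

```python
from math import ceil

def cluster_trial_size(n_individual, icc, cluster_size):
    """Inflate an individually randomized sample size for clustering.

    Design effect: deff = 1 + (m - 1) * rho, where m is the average
    cluster size and rho the intraclass correlation coefficient (ICC).
    """
    deff = 1.0 + (cluster_size - 1) * icc
    # round before ceil to guard against floating-point noise
    total = ceil(round(n_individual * deff, 6))
    clusters_per_arm = ceil(total / (2 * cluster_size))  # two-arm trial
    return total, clusters_per_arm
```

For example, a trial needing 400 individually randomized students, run in classes of 25 with an ICC of 0.02, inflates to 592 students, i.e. 12 classes per arm; even small ICC misestimates shift this noticeably, which is why pooling prior ICC evidence matters.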
Improved Battery State Estimation Using Novel Sensing Techniques
NASA Astrophysics Data System (ADS)
Abdul Samad, Nassim
Lithium-ion batteries have been considered a great complement or substitute for gasoline engines due to their high energy and power density capabilities among other advantages. However, these types of energy storage devices are still yet not widespread, mainly because of their relatively high cost and safety issues, especially at elevated temperatures. This thesis extends existing methods of estimating critical battery states using model-based techniques augmented by real-time measurements from novel temperature and force sensors. Typically, temperature sensors are located near the edge of the battery, and away from the hottest core cell regions, which leads to slower response times and increased errors in the prediction of core temperatures. New sensor technology allows for flexible sensor placement at the cell surface between cells in a pack. This raises questions about the optimal locations of these sensors for best observability and temperature estimation. Using a validated model, which is developed and verified using experiments in laboratory fixtures that replicate vehicle pack conditions, it is shown that optimal sensor placement can lead to better and faster temperature estimation. Another equally important state is the state of health or the capacity fading of the cell. This thesis introduces a novel method of using force measurements for capacity fade estimation. Monitoring capacity is important for defining the range of electric vehicles (EVs) and plug-in hybrid electric vehicles (PHEVs). Current capacity estimation techniques require a full discharge to monitor capacity. The proposed method can complement or replace current methods because it only requires a shallow discharge, which is especially useful in EVs and PHEVs. Using the accurate state estimation accomplished earlier, a method for downsizing a battery pack is shown to effectively reduce the number of cells in a pack without compromising safety. The influence on the battery performance (e
Improving Word Similarity by Augmenting PMI with Estimates of Word Polysemy
2011-12-29
Lushan Han, Tim Finin, Paul McNamee, Anupam Joshi and Yelena Yesha. IEEE Transactions on Knowledge and Data Engineering, IEEE Computer Society.
Estimating the Power Characteristics of Clusters of Large Offshore Wind Farms
NASA Astrophysics Data System (ADS)
Drew, D.; Barlow, J. F.; Coceal, O.; Coker, P.; Brayshaw, D.; Lenaghan, D.
2014-12-01
The next phase of offshore wind projects in the UK focuses on the development of very large wind farms clustered within several allocated zones. However, this change in the distribution of wind capacity brings uncertainty for the operational planning of the power system. Firstly, there are concerns that concentrating large amounts of capacity in one area could reduce some of the benefits seen by spatially dispersing the turbines, such as the smoothing of the power generation variability. Secondly, wind farms of the scale planned are likely to influence the boundary layer sufficiently to impact the performance of adjacent farms, therefore the power generation characteristics of the clusters are largely unknown. The aim of this study is to use the Weather Research and Forecasting (WRF) model to investigate the power output of a cluster of offshore wind farms for a range of extreme events, taking into account the wake effects of the individual turbines and the neighbouring farms. Each wind farm in the cluster is represented as an elevated momentum sink and a source of turbulent kinetic energy using the WRF Wind Farm Parameterization. The research focuses on the Dogger Bank zone (located in the North Sea approximately 125 km off the East coast of the UK), which could have 7.2 GW of installed capacity across six separate wind farms. For this site, a 33 year reanalysis data set (MERRA, from NASA-GMAO) has been used to identify a series of extreme event case studies. These are characterised by either periods of persistent low (or high) wind speeds, or by rapid changes in power output. The latter could be caused by small changes in the wind speed inducing large changes in power output, very high winds prompting turbine shut down, or a change in the wind direction which shifts the wake effects of the neighbouring farms in the cluster and therefore changes the wind resource available.
Hens, Niel; Beutels, Philippe; Leirs, Herwig; Reijniers, Jonas
2016-01-01
Diseases of humans and wildlife are typically tracked and studied through incidence, the number of new infections per time unit. Estimating incidence is not without difficulties, as asymptomatic infections, long intervals between sampling sessions and small sample sizes can introduce large estimation errors. After infection, biomarkers such as antibodies or pathogens often change predictably over time, and this temporal pattern can contain information about the time since infection that could improve incidence estimation. Antibody level and avidity have been used to estimate time since infection and to reconstruct incidence, but the errors on these estimates using currently existing methods are generally large. Using a semi-parametric model in a Bayesian framework, we introduce a method that allows the use of multiple sources of information (such as antibody level, pathogen presence in different organs, individual age, season) for estimating individual time since infection. When sufficient background data are available, this method can greatly improve incidence estimation, which we show using arenavirus infection in multimammate mice as a test case. The method performs well, especially compared to the situation in which seroconversion events between sampling sessions are the main data source. The possibility to implement several sources of information allows the use of data that are in many cases already available, which means that existing incidence data can be improved without the need for additional sampling efforts or laboratory assays. PMID:27177244
Improved source term estimation using blind outlier detection
NASA Astrophysics Data System (ADS)
Martinez-Camara, Marta; Bejar Haro, Benjamin; Vetterli, Martin; Stohl, Andreas
2014-05-01
Emissions of substances into the atmosphere are produced in situations such as volcano eruptions, nuclear accidents or pollutant releases. It is necessary to know the source term - how the magnitude of these emissions changes with time - in order to predict the consequences of the emissions, such as high radioactivity levels in a populated area or high concentration of volcanic ash in an aircraft flight corridor. However, in general, we know neither how much material was released in total, nor the relative variation of emission strength with time. Hence, estimating the source term is a crucial task. Estimating the source term generally involves solving an ill-posed linear inverse problem using datasets of sensor measurements. Several so-called inversion methods have been developed for this task. Unfortunately, objective quantitative evaluation of the performance of inversion methods is difficult due to the fact that the ground truth is unknown for practically all the available measurement datasets. In this work we use the European Tracer Experiment (ETEX) - a rare example of an experiment where the ground truth is available - to develop and to test new source estimation algorithms. Knowledge of the ground truth grants us access to the additive error term. We show that the distribution of this error is heavy-tailed, which means that some measurements are outliers. We also show that precisely these outliers severely degrade the performance of traditional inversion methods. Therefore, we develop blind outlier detection algorithms specifically suited to the source estimation problem. Then, we propose new inversion methods that combine traditional regularization techniques with blind outlier detection. Such hybrid methods reduce the error of reconstruction of the source term up to 45% with respect to previously proposed methods.
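The combination of regularization with residual-based outlier screening described above can be illustrated with a minimal sketch. The function name, trimming rule, and all parameter values below are illustrative assumptions, not the authors' algorithm:

```python
import numpy as np

def estimate_source(A, y, lam=1.0, n_iter=5, trim_frac=0.1):
    """Tikhonov-regularized least squares with iterative residual-based
    outlier trimming (an illustrative sketch, not the ETEX algorithm)."""
    m, n = A.shape
    keep = np.ones(m, dtype=bool)
    x = np.zeros(n)
    for _ in range(n_iter):
        Ak, yk = A[keep], y[keep]
        # Regularized normal equations: (A^T A + lam I) x = A^T y
        x = np.linalg.solve(Ak.T @ Ak + lam * np.eye(n), Ak.T @ yk)
        r = np.abs(A @ x - y)
        # Flag the largest residuals as suspected outliers
        cutoff = np.quantile(r[keep], 1.0 - trim_frac)
        keep = r <= cutoff
    return x, keep

# Synthetic linear inverse problem with heavy-tailed contamination
rng = np.random.default_rng(0)
A = rng.normal(size=(60, 8))
x_true = np.abs(rng.normal(size=8))
y = A @ x_true + 0.01 * rng.normal(size=60)
y[::15] += 10.0   # a few gross outliers
x_hat, inliers = estimate_source(A, y, lam=0.1)
```

Because the gross outliers dominate the residuals of the first fit, they are excluded from subsequent iterations, and the final estimate is driven by the well-behaved measurements.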
Uncertainty Estimation Improves Energy Measurement and Verification Procedures
Walter, Travis; Price, Phillip N.; Sohn, Michael D.
2014-05-14
Implementing energy conservation measures in buildings can reduce energy costs and environmental impacts, but such measures cost money to implement. Intelligent investment strategies therefore require the ability to quantify the energy savings by comparing actual energy use to how much energy would have been used in the absence of the conservation measures (known as the baseline energy use). Methods exist for predicting baseline energy use, but a limitation of most statistical methods reported in the literature is inadequate quantification of the uncertainty in baseline energy use predictions. However, estimation of uncertainty is essential for weighing the risks of investing in retrofits. Most commercial buildings have, or soon will have, electricity meters capable of providing data at short time intervals. These data provide new opportunities to quantify uncertainty in baseline predictions, and to do so after shorter measurement durations than are traditionally used. In this paper, we show that uncertainty estimation provides greater measurement and verification (M&V) information and helps to overcome some of the difficulties with deciding how much data are needed to develop baseline models and to confirm energy savings. We also show that cross-validation is an effective method for computing uncertainty. In so doing, we extend a simple regression-based method of predicting energy use from short-interval meter data. We demonstrate the methods by predicting energy use in 17 real commercial buildings. We discuss the benefits of uncertainty estimates, which can provide actionable decision-making information for investing in energy conservation measures.
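The cross-validation idea can be sketched as follows: fit a simple linear baseline model on k-1 folds, collect held-out residuals, and use their spread as an uncertainty estimate. Everything here (the model form, the synthetic temperature-driven load) is an illustrative assumption, not the paper's actual M&V model:

```python
import numpy as np

def cv_uncertainty(X, y, k=5):
    """K-fold cross-validation of a linear baseline model; the spread of
    held-out residuals estimates prediction uncertainty (illustrative)."""
    n = len(y)
    idx = np.arange(n)
    Xb = np.column_stack([np.ones(n), X])   # add an intercept column
    resid = np.empty(n)
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)     # fit on the other k-1 folds
        beta, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
        resid[fold] = y[fold] - Xb[fold] @ beta
    return resid.std(ddof=1)

# Synthetic building: load driven by outdoor temperature plus noise
rng = np.random.default_rng(1)
temp = rng.uniform(10, 35, size=400)
load = 50 + 2.0 * temp + rng.normal(0, 3.0, size=400)
sigma_hat = cv_uncertainty(temp.reshape(-1, 1), load)
```

Because every residual comes from a model that never saw that observation, the resulting spread reflects out-of-sample prediction uncertainty rather than in-sample fit quality.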
Recent Improvements in Estimating Convective and Stratiform Rainfall in Amazonia
NASA Technical Reports Server (NTRS)
Negri, Andrew J.
1999-01-01
In this paper we present results from the application of a satellite infrared (IR) technique for estimating rainfall over northern South America. Our main objectives are to examine the diurnal variability of rainfall and to investigate the relative contributions from the convective and stratiform components. We apply the technique of Anagnostou et al (1999). In simple functional form, the estimated rain area A(sub rain) may be expressed as: A(sub rain) = f(A(sub mode), T(sub mode)), where T(sub mode) is the mode temperature of a cloud defined by 253 K, and A(sub mode) is the area encompassed by T(sub mode). The technique was trained by a regression between coincident microwave estimates from the Goddard Profiling (GPROF) algorithm (Kummerow et al, 1996) applied to SSM/I data and GOES IR (11 microns) observations. The apportionment of the rainfall into convective and stratiform components is based on the microwave technique described by Anagnostou and Kummerow (1997). The convective area from this technique was regressed against an IR structure parameter (the Convective Index) defined by Anagnostou et al (1999). Finally, rain rates are assigned within A(sub mode) proportional to (253 - temperature), with different rates for the convective and stratiform components.
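The final rain-rate assignment step can be sketched numerically. The coefficient values and function name below are placeholders for illustration, not the calibrated rates from the GPROF regression:

```python
import numpy as np

def assign_rain_rates(T, conv_mask, T_mode=253.0, k_conv=0.05, k_strat=0.02):
    """Assign rain rates proportional to (T_mode - T) inside the cloud area
    colder than T_mode, with separate convective/stratiform coefficients.
    The coefficients are placeholder values, not the calibrated ones."""
    rain = np.zeros_like(T, dtype=float)
    in_cloud = T < T_mode                       # A(sub mode): pixels colder than 253 K
    k = np.where(conv_mask, k_conv, k_strat)    # per-pixel rate coefficient
    rain[in_cloud] = (k * (T_mode - T))[in_cloud]
    return rain

T = np.array([[260.0, 250.0], [240.0, 230.0]])    # IR brightness temperatures (K)
conv = np.array([[False, False], [True, True]])   # convective-pixel mask
rates = assign_rain_rates(T, conv)
```

Pixels warmer than 253 K get no rain, and colder convective pixels receive proportionally higher rates than stratiform pixels at the same temperature.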
NASA Astrophysics Data System (ADS)
Milone, Eugene F.; Schiller, Stephen J.
2014-06-01
A paradigm method to calibrate a range of standard candles by means of well-calibrated photometry of eclipsing binaries in star clusters is the Direct Distance Estimation (DDE) procedure, contained in the 2010 and 2013 versions of the Wilson-Devinney light-curve modeling program. In particular, we are re-examining systems previously studied in our Binaries-in-Clusters program and analyzed with earlier versions of the Wilson-Devinney program. Earlier we reported on our use of the 2010 version of this program, which incorporates the DDE procedure to estimate the distance to an eclipsing system directly, as a system parameter, and is thus dependent on the data and analysis model alone. As such, the derived distance is accorded a standard error, independent of any additional assumptions or approximations that such analyses conventionally require. Additionally we have now made use of the 2013 version, which introduces temporal evolution of spots, an important improvement for systems containing variable active regions, as is the case for the systems we are studying currently, namely HD 27130 in the Hyades and DS And in NGC 752. Our work provides some constraints on the effects of spot treatment on distance determination of active systems.
Philips, Adam; Marchenko, Alex; Truflandier, Lionel A; Autschbach, Jochen
2017-09-12
Quadrupolar NMR relaxation rates are computed for (17)O and (2)H nuclei of liquid water, and of (23)Na(+), and (35)Cl(-) in aqueous solution via Kohn-Sham (KS) density functional theory ab initio molecular dynamics (aiMD) and subsequent KS electric field gradient (EFG) calculations along the trajectories. The calculated relaxation rates are within about a factor of 2 of experimental results and improved over previous aiMD simulations. The relaxation rates are assessed with regard to the lengths of the simulations as well as configurational sampling. The latter is found to be the more limiting factor in obtaining good statistical sampling and is improved by averaging over many equivalent nuclei of a system or over several independent trajectories. Further, full periodic plane-wave basis calculations of the EFGs are compared with molecular-cluster atomic-orbital basis calculations. The two methods deliver comparable results with nonhybrid functionals. With the molecular-cluster approach, a larger variety of electronic structure methods is available. For chloride, the EFG computations benefit from using a hybrid KS functional.
Prague, Melanie; Wang, Rui; Stephens, Alisa; Tchetgen Tchetgen, Eric; DeGruttola, Victor
2016-01-01
Summary: Semi-parametric methods are often used for the estimation of intervention effects on correlated outcomes in cluster-randomized trials (CRTs). When outcomes are missing at random (MAR), Inverse Probability Weighted (IPW) methods incorporating baseline covariates can be used to deal with informative missingness. Augmented generalized estimating equations (AUG) correct for imbalance in baseline covariates but need to be extended for MAR outcomes. However, in the presence of interactions between treatment and baseline covariates, neither method alone produces consistent estimates for the marginal treatment effect if the model for the interaction is not correctly specified. We propose an AUG-IPW estimator that weights by the inverse of the probability of being a complete case and allows different outcome models in each intervention arm. This estimator is doubly robust (DR): it gives consistent estimates if either the missing-data process or the outcome model is correctly specified. We consider the problem of covariate interference, which arises when the outcome of an individual may depend on covariates of other individuals. When interfering covariates are not modeled, the DR property prevents bias as long as covariate interference is not present simultaneously for the outcome and the missingness. An R package is developed implementing the proposed method. An extensive simulation study and an application to a CRT of an HIV risk-reduction intervention in South Africa illustrate the method. PMID:27060877
The CluTim algorithm: an improvement on the impact parameter estimates
NASA Astrophysics Data System (ADS)
Chiarello, G.; Chiri, C.; Cocciolo, G.; Corvaglia, A.; Grancagnolo, F.; Miccoli, A.; Panareo, M.; Pinto, C.; Pepino, A.; Spedicato, M.; Tassielli, G. F.
2017-03-01
A drift chamber is a detector used in high energy physics experiments for determining charged particle trajectories. The signal pulses from all the wires are collected, and the particle trajectory is tracked assuming that the distances of closest approach (the impact parameters) between the particle trajectory and the wires coincide with the distances between the ionization clusters generated by the particle and the wire closest to them. The widespread use of helium-based gas mixtures, which produce a low ionization cluster density (12 clusters/cm in a 90/10 helium/iso-butane mixture), introduces a noticeable bias in the impact parameter assumption, particularly for short impact parameters and small-cell drift chambers. Recently, an alternative track reconstruction (Cluster Counting/Timing) technique has been proposed, which consists in measuring the arrival times on the wires of each individual ionization cluster and combining these times to get a bias-free estimate of the impact parameter. However, in order to exploit the cluster timing technique efficiently, it is necessary to have read-out interfaces capable of processing a large quantity of high-speed signals. We describe the design of a read-out board capable of acquiring the information coming from a fast digitization of the signals generated in a drift chamber, and the algorithm for identifying the individual ionization pulse peaks and recording their time and amplitude.
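The peak-identification step (find individual ionization pulses in the digitized waveform and record their time and amplitude) can be sketched with a generic threshold-plus-local-maximum finder. This is an illustrative stand-in, not the CluTim algorithm itself, and all parameter values are invented:

```python
import numpy as np

def find_cluster_peaks(v, dt, threshold, min_sep=5):
    """Identify ionization pulse peaks in a digitized drift-chamber signal:
    local maxima above a threshold, separated by at least min_sep samples
    (a generic peak finder, not the CluTim algorithm itself)."""
    peaks = []
    last = -min_sep
    for i in range(1, len(v) - 1):
        if v[i] >= threshold and v[i] >= v[i - 1] and v[i] > v[i + 1]:
            if i - last >= min_sep:
                peaks.append((i * dt, v[i]))   # (arrival time, amplitude)
                last = i
    return peaks

# Synthetic waveform: three Gaussian pulses on a quiet noisy baseline
t = np.arange(200)
v = 0.02 * np.random.default_rng(4).normal(size=200)
for t0, amp in [(40, 1.0), (90, 0.7), (150, 0.9)]:
    v += amp * np.exp(-0.5 * ((t - t0) / 2.0) ** 2)
peaks = find_cluster_peaks(v, dt=1.0, threshold=0.3)
```

The recorded arrival times are the raw input the cluster timing technique combines into an impact parameter estimate.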
Global Water Resources Under Future Changes: Toward an Improved Estimation
NASA Astrophysics Data System (ADS)
Islam, M.; Agata, Y.; Hanasaki, N.; Kanae, S.; Oki, T.
2005-05-01
Global water resources availability in the 21st century is going to be an important concern. Despite this international recognition, there are until now very limited global estimates of water resources that consider the geographical linkage between water supply and demand, defined by runoff and its passage through the river network. The available studies are also insufficient for reasons such as differing approaches to defining water scarcity, reliance on annual average figures without considering inter-annual or seasonal variability, and the absence of virtual water trading. In this study, global water resources under future climate change, associated with several socio-economic factors, were estimated varying over both temporal and spatial scales. Global runoff data were derived from several land surface models under the GSWP2 (Global Soil Wetness Project) project, and further processed through the TRIP (Total Runoff Integrated Pathways) river routing model to produce figures on a 0.5x0.5 degree grid. Water abstraction was estimated at the same spatial resolution for three sectors: domestic, industrial and agricultural. GCM outputs from CCSR and MRI were collected to predict the runoff changes. Socio-economic factors like population and GDP growth affected mostly the demand side. Instead of simply looking at annual figures, monthly figures for both supply and demand were considered. For an average year, such seasonal variability can affect crop yield significantly; in other cases, inter-annual variability of runoff can cause absolute drought conditions. To account for the vulnerability of a region to future changes, both inter-annual and seasonal effects were thus considered. At present, the study assumed future agricultural water use to be unchanged under climatic changes. In this connection, the EPIC model is being applied to estimate future agricultural water demand under climatic changes on a monthly basis. From
Sun, Jinwei; Wu, Jiabing; Guan, Dexin; Yao, Fuqi; Yuan, Fenghui; Wang, Anzhi; Jin, Changjie
2014-01-01
Leaf respiration is an important component of carbon exchange in terrestrial ecosystems, and estimates of leaf respiration directly affect the accuracy of ecosystem carbon budgets. Leaf respiration is inhibited by light; therefore, gross primary production (GPP) will be overestimated if the reduction in leaf respiration by light is ignored. However, few studies have quantified GPP overestimation with respect to the degree of light inhibition in forest ecosystems. To determine the effect of light inhibition of leaf respiration on GPP estimation, we assessed the variation in leaf respiration of seedlings of the dominant tree species in an old mixed temperate forest with different photosynthetically active radiation levels using the Laisk method. Canopy respiration was estimated by combining the effect of light inhibition on leaf respiration of these species with within-canopy radiation. Leaf respiration decreased exponentially with an increase in light intensity. Canopy respiration and GPP were overestimated by approximately 20.4% and 4.6%, respectively, when leaf respiration reduction in light was ignored compared with the values obtained when light inhibition of leaf respiration was considered. This study indicates that accurate estimates of daytime ecosystem respiration are needed for the accurate evaluation of carbon budgets in temperate forests. In addition, this study provides a valuable approach to accurately estimate GPP by considering leaf respiration reduction in light in other ecosystems. PMID:25419844
Improved fire radiative energy estimation in high latitude ecosystems
NASA Astrophysics Data System (ADS)
Melchiorre, A.; Boschetti, L.
2014-12-01
Scientists, land managers, and policy makers are facing new challenges as fire regimes are evolving as a result of climate change (Westerling et al. 2006). In high latitudes fires are increasing in number and size as temperatures increase and precipitation decreases (Kasischke and Turetsky 2006). Peatlands, like the large complexes in the Alaskan tundra, are burning more frequently and severely as a result of these changes, releasing large amounts of greenhouse gases. Remotely sensed data are routinely used to monitor the location of active fires and the extent of burned areas, but they are not sensitive to the depth of the organic soil layer combusted, resulting in underestimation of peatland greenhouse gas emissions when employing the conventional 'bottom up' approach (Seiler and Crutzen 1980). An alternative approach would be the direct estimation of the biomass burned from the energy released by the fire (Fire Radiative Energy, FRE) (Wooster et al. 2003). Previous works (Boschetti and Roy 2009; Kumar et al. 2011) showed that the sampling interval of polar orbiting satellite systems severely limits the accuracy of the FRE in tropical ecosystems (up to four overpasses a day with MODIS), but because of the convergence of the orbits, more observations are available at higher latitudes. In this work, we used a combination of MODIS thermal data and Landsat optical data for the estimation of biomass burned in peatland ecosystems. First, the global MODIS active fire detection algorithm (Giglio et al. 2003) was modified, adapting the temperature thresholds to maximize the number of detections in boreal regions. Then, following the approach proposed by Boschetti and Roy (2009), the FRP point estimations were interpolated in time and space to cover the full temporal and spatial extent of the burned area, mapped with Landsat5 TM data. The methodology was tested on a large burned area in Alaska, and the results compared to published field measurements (Turetsky et al. 2011).
Improved Speech Coding Based on Open-Loop Parameter Estimation
NASA Technical Reports Server (NTRS)
Juang, Jer-Nan; Chen, Ya-Chin; Longman, Richard W.
2000-01-01
A nonlinear optimization algorithm for linear predictive speech coding was developed earlier that not only optimizes the linear model coefficients for the open-loop predictor, but does the optimization including the effects of quantization of the transmitted residual. It also simultaneously optimizes the quantization levels used for each speech segment. In this paper, we present an improved method for initialization of this nonlinear algorithm, and demonstrate substantial improvements in performance. In addition, the new procedure produces monotonically improving speech quality with increasing numbers of bits used in the transmitted error residual. Examples of speech encoding and decoding are given for 8 speech segments, and signal-to-noise levels as high as 47 dB are produced. As in typical linear predictive coding, the optimization is done on the open-loop speech analysis model. Here we demonstrate that minimizing the error of the closed-loop speech reconstruction, instead of the simpler open-loop optimization, is likely to produce negligible improvement in speech quality. The examples suggest that the algorithm here is close to giving the best performance obtainable from a linear model, for the chosen order with the chosen number of bits for the codebook.
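The open-loop predictor at the core of linear predictive coding can be sketched as a least-squares fit of the prediction coefficients over a segment. This generic autocovariance-style formulation is an illustration of open-loop parameter estimation only; it does not include the paper's joint optimization over quantization levels:

```python
import numpy as np

def lpc_coeffs(x, order):
    """Open-loop linear predictor: fit a[0..p-1] minimizing the residual
    x[n] - sum_k a[k] * x[n-1-k] by least squares (illustrative sketch)."""
    n = len(x)
    # Column k holds the lag-(k+1) samples aligned with the targets x[order:]
    X = np.column_stack([x[order - k - 1:n - k - 1] for k in range(order)])
    target = x[order:]
    a, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ a
    return a, resid

# Synthetic AR(2) "speech" segment: x[n] = 1.3 x[n-1] - 0.4 x[n-2] + e[n]
rng = np.random.default_rng(3)
e = rng.normal(0, 0.1, size=2000)
x = np.zeros(2000)
for i in range(2, 2000):
    x[i] = 1.3 * x[i - 1] - 0.4 * x[i - 2] + e[i]
a, resid = lpc_coeffs(x, order=2)
```

On this synthetic segment the fitted coefficients recover the generating values, and the residual shrinks to roughly the excitation noise level, which is the quantity a quantized-residual coder would then encode.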
Improving Mantel-Haenszel DIF Estimation through Bayesian Updating
ERIC Educational Resources Information Center
Zwick, Rebecca; Ye, Lei; Isham, Steven
2012-01-01
This study demonstrates how the stability of Mantel-Haenszel (MH) DIF (differential item functioning) methods can be improved by integrating information across multiple test administrations using Bayesian updating (BU). The authors conducted a simulation that showed that this approach, which is based on earlier work by Zwick, Thayer, and Lewis,…
Improving Lidar Turbulence Estimates for Wind Energy: Preprint
Newman, Jennifer; Clifton, Andrew; Churchfield, Matthew; Klein, Petra
2016-10-01
Remote sensing devices (e.g., lidars) are quickly becoming a cost-effective and reliable alternative to meteorological towers for wind energy applications. Although lidars can measure mean wind speeds accurately, these devices measure different values of turbulence intensity (TI) than an instrument on a tower. In response to these issues, a lidar TI error reduction model was recently developed for commercially available lidars. The TI error model first applies physics-based corrections to the lidar measurements, then uses machine-learning techniques to further reduce errors in lidar TI estimates. The model was tested at two sites in the Southern Plains where vertically profiling lidars were collocated with meteorological towers. Results indicate that the model works well under stable conditions but cannot fully mitigate the effects of variance contamination under unstable conditions. To understand how variance contamination affects lidar TI estimates, a new set of equations was derived in previous work to characterize the actual variance measured by a lidar. Terms in these equations were quantified using a lidar simulator and modeled wind field, and the new equations were then implemented into the TI error model.
Covariance specification and estimation to improve top-down Green House Gas emission estimates
NASA Astrophysics Data System (ADS)
Ghosh, S.; Lopez-Coto, I.; Prasad, K.; Whetstone, J. R.
2015-12-01
The National Institute of Standards and Technology (NIST) operates the North-East Corridor (NEC) project and the Indianapolis Flux Experiment (INFLUX) in order to develop measurement methods to quantify sources of greenhouse gas (GHG) emissions, as well as their uncertainties, in urban domains using a top-down inversion method. Top-down inversion updates prior knowledge using observations in a Bayesian way. One primary consideration in a Bayesian inversion framework is the covariance structure of (1) the emission prior residuals and (2) the observation residuals (i.e. the difference between observations and model-predicted observations). These covariance matrices are respectively referred to as the prior covariance matrix and the model-data mismatch covariance matrix. It is known that the choice of these covariances can have a large effect on estimates. The main objective of this work is to determine the impact of different covariance models on inversion estimates and their associated uncertainties in urban domains. We use a pseudo-data Bayesian inversion framework using footprints (i.e. sensitivities of tower measurements of GHGs to surface emissions) and emission priors (based on the Hestia project to quantify fossil-fuel emissions) to estimate posterior emissions using different covariance schemes. The posterior emission estimates and uncertainties are compared to the hypothetical truth. We find that, if we correctly specify spatial variability and spatio-temporal variability in the prior and model-data mismatch covariances respectively, then we can compute more accurate posterior estimates. We discuss a few covariance models to introduce space-time interacting mismatches, along with estimation of the involved parameters. We then compare several candidate prior spatial covariance models from the Matern covariance class and estimate their parameters with specified mismatches. We find that best-fitted prior covariances are not always best in recovering the truth. To achieve
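The role of the prior covariance in such a pseudo-data framework can be illustrated with a linear Gaussian inversion using a Matern-1/2 (exponential) spatial prior. The grid size, footprint matrix, and parameter values below are invented for illustration, not the NEC/INFLUX configuration:

```python
import numpy as np

def exp_cov(dists, sigma2, ell):
    """Matern-1/2 (exponential) spatial covariance: sigma2 * exp(-d / ell)."""
    return sigma2 * np.exp(-dists / ell)

def bayesian_inversion(H, z, x_prior, S_prior, R):
    """Posterior mean and covariance for a linear Gaussian inversion
    z = H x + noise, prior N(x_prior, S_prior), mismatch covariance R."""
    K = S_prior @ H.T @ np.linalg.inv(H @ S_prior @ H.T + R)  # gain matrix
    x_post = x_prior + K @ (z - H @ x_prior)
    S_post = S_prior - K @ H @ S_prior
    return x_post, S_post

# Toy 1-D emission grid observed through a random footprint matrix H
n = 20
coords = np.arange(n, dtype=float)
D = np.abs(coords[:, None] - coords[None, :])      # pairwise grid distances
S_prior = exp_cov(D, sigma2=1.0, ell=3.0)
rng = np.random.default_rng(2)
H = rng.uniform(0, 1, size=(8, n))                 # tower sensitivities
x_true = rng.multivariate_normal(np.zeros(n), S_prior)
R = 0.01 * np.eye(8)                               # model-data mismatch
z = H @ x_true + rng.multivariate_normal(np.zeros(8), R)
x_post, S_post = bayesian_inversion(H, z, np.zeros(n), S_prior, R)
```

Changing `ell` or the mismatch matrix `R` in this sketch changes both the posterior mean and its uncertainty, which is exactly the sensitivity to covariance specification the abstract investigates.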
NASA Astrophysics Data System (ADS)
Li, Mengmeng; Wang, Tijian; Xie, Min; Zhuang, Bingliang; Li, Shu; Han, Yong; Song, Yu; Cheng, Nianliang
2017-03-01
Land surface parameters play an important role in the land-atmosphere coupling and thus are critical to the weather and dispersion of pollutants in the atmosphere. This work aims at improving the meteorology and air quality simulations for a high-ozone (O3) event in the Yangtze River Delta urban cluster of China, through incorporation of satellite-derived land surface parameters. Using Moderate Resolution Imaging Spectroradiometer (MODIS) input to specify the land cover type, green vegetation fraction, leaf area index, albedo, emissivity, and deep soil temperature provides a more realistic representation of surface characteristics. Preliminary evaluations reveal clearly improved meteorological simulation with MODIS input compared with that using default parameters, particularly for temperature (from -2.5 to -1.7°C for mean bias) and humidity (from 9.7% to 4.3% for mean bias). The improved meteorology propagates through the air quality system, which results in better estimates for surface NO2 (from 11.5 to 8.0 ppb for mean bias) and nocturnal O3 low-end concentration values (from -18.8 to -13.6 ppb for mean bias). Modifications of the urban land surface parameters are the main reason for model improvement. The deeper urban boundary layer and intense updraft induced by the urban heat island are favorable for pollutant dilution, thus contributing to lower NO2 and elevated nocturnal O3. Furthermore, the intensified sea-land breeze circulation may exacerbate O3 pollution at coastal cities through pollutant recirculation. Improvement of mesoscale meteorology and air quality simulations with satellite-derived land surface parameters will be useful for air pollution monitoring and forecasting in urban areas.
Ong, Michael K; Jones, Loretta; Aoki, Wayne; Belin, Thomas R; Bromley, Elizabeth; Chung, Bowen; Dixon, Elizabeth; Johnson, Megan Dwight; Jones, Felica; Koegel, Paul; Khodyakov, Dmitry; Landry, Craig M; Lizaola, Elizabeth; Mtume, Norma; Ngo, Victoria K; Perlman, Judith; Pulido, Esmeralda; Sauer, Vivian; Sherbourne, Cathy D; Tang, Lingqi; Vidaurri, Ed; Whittington, Yolanda; Williams, Pluscedia; Lucas-Wright, Aziza; Zhang, Lily; Southard, Marvin; Miranda, Jeanne; Wells, Kenneth
2017-07-17
Community Partners in Care, a community-partnered, cluster-randomized trial with depressed clients from 93 Los Angeles health and community programs, examined the added value of a community coalition approach (Community Engagement and Planning [CEP]) versus individual program technical assistance (Resources for Services [RS]) for implementing depression quality improvement in underserved communities. CEP was more effective than RS in improving mental health-related quality of life, reducing behavioral health hospitalizations, and shifting services toward community-based programs at six months. At 12 months, continued evidence of improvement was found. This study examined three-year outcomes. Among 1,004 participants with depression who were eligible for three-year follow-up, 600 participants from 89 programs completed surveys. Multiple regression analyses estimated intervention effects on poor mental health-related quality of life and depression, physical health-related quality of life, behavioral health hospital nights, and use of services. At three years, no differences were found in the effects of CEP versus RS on depression or mental health-related quality of life, but CEP had modest effects in improving physical health-related quality of life and reducing behavioral health hospital nights, and CEP participants had more social- and community-sector depression visits and greater use of mood stabilizers. Sensitivity analyses with longitudinal modeling reproduced these findings but found no significant differences between groups in change from baseline to three years. At three years, CEP and RS did not have differential effects on primary mental health outcomes, but CEP participants had modest improvements in physical health and fewer behavioral health hospital nights.
A cluster-randomized trial to improve stroke care in hospitals
Lakshminarayan, K.; Borbas, C.; McLaughlin, B.; Morris, N.E.; Vazquez, G.; Luepker, R.V.; Anderson, D.C.
2010-01-01
Objective: We evaluated the effect of performance feedback on acute ischemic stroke care quality in Minnesota hospitals. Methods: A cluster-randomized controlled trial design with hospital as the unit of randomization was used. Care quality was defined as adherence to 10 performance measures grouped into acute, in-hospital, and discharge care. Following preintervention data collection, all hospitals received a report on baseline care quality. Additionally, in experimental hospitals, clinical opinion leaders delivered customized feedback to care providers and study personnel worked with hospital administrators to implement changes targeting identified barriers to stroke care. Multilevel models examined experimental vs control, preintervention and postintervention performance changes and secular trends in performance. Results: Nineteen hospitals were randomized with a total of 1,211 acute ischemic stroke cases preintervention and 1,094 cases postintervention. Secular trends were significant with improvement in both experimental and control hospitals for acute (odds ratio = 2.7, p = 0.007) and in-hospital (odds ratio = 1.5, p < 0.0001) care but not discharge care. There was no significant intervention effect for acute, in-hospital, or discharge care. Conclusion: There was no definite intervention effect: both experimental and control hospitals showed significant secular trends with performance improvement. Our results illustrate the potential fallacy of using historical controls for evaluating quality improvement interventions. Classification of evidence: This study provides Class II evidence that informing hospital leaders of compliance with ischemic stroke quality indicators followed by a structured quality improvement intervention did not significantly improve compliance more than informing hospital leaders of compliance with stroke quality indicators without a quality improvement intervention.
Stochastic FDTD accuracy improvement through correlation coefficient estimation
NASA Astrophysics Data System (ADS)
Masumnia Bisheh, Khadijeh; Zakeri Gatabi, Bijan; Andargoli, Seyed Mehdi Hosseini
2015-04-01
This paper introduces a new scheme to improve the accuracy of the stochastic finite difference time domain (S-FDTD) method. S-FDTD, reported recently by Smith and Furse, calculates the variations in the electromagnetic fields caused by variability or uncertainty in the electrical properties of the materials in the model. The accuracy of the S-FDTD method is controlled by the approximations for correlation coefficients between the electrical properties of the materials in the model and the fields propagating in them. In this paper, new approximations for these correlation coefficients are obtained using the Monte Carlo method with a small number of runs; we term them Monte Carlo correlation coefficients (MC-CC). Numerical results for two bioelectromagnetic simulation examples demonstrate that MC-CC can improve the accuracy of the S-FDTD method and yield more accurate results than previous approximations.
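The idea of estimating a correlation coefficient from a handful of Monte Carlo runs can be sketched generically. The snippet below is a minimal illustration, not the S-FDTD implementation: `field_model` is a hypothetical stand-in for a full solver, and the sample Pearson coefficient over a few dozen random material draws plays the role of an MC-CC.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def mc_correlation(field_model, sigma_mean, sigma_std, runs=50, seed=1):
    """Estimate the correlation between a random material parameter and
    the field it produces, from a small number of Monte Carlo runs."""
    rng = random.Random(seed)
    sigmas, fields = [], []
    for _ in range(runs):
        sigma = rng.gauss(sigma_mean, sigma_std)
        sigmas.append(sigma)
        fields.append(field_model(sigma))
    return pearson(sigmas, fields)

# Toy "solver": field amplitude decays with conductivity.
rho = mc_correlation(lambda s: math.exp(-0.5 * s), sigma_mean=1.0, sigma_std=0.1)
```

For this toy model the field decreases nearly linearly with the parameter over the sampled range, so the estimated coefficient is close to -1.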
Estimating Missing Features to Improve Multimedia Information Retrieval
Bagherjeiran, A; Love, N S; Kamath, C
2006-09-28
Retrieval in a multimedia database usually involves combining information from different modalities of data, such as text and images. However, all modalities of the data may not be available to form the query. The retrieval results from such a partial query are often less than satisfactory. In this paper, we present an approach to complete a partial query by estimating the missing features in the query. Our experiments with a database of images and their associated captions show that, with an initial text-only query, our completion method has similar performance to a full query with both image and text features. In addition, when we use relevance feedback, our approach outperforms the results obtained using a full query.
Estimating effects of improved drinking water and sanitation on cholera.
Leidner, Andrew J; Adusumilli, Naveen C
2013-12-01
Demand for adequate provision of drinking-water and sanitation facilities to promote public health and economic growth is increasing in the rapidly urbanizing countries of the developing world. With a panel of data on Asia and Africa from 1990 to 2008, associations are estimated between the occurrence of cholera outbreaks, the case rates in given outbreaks, the mortality rates associated with cholera and two disease control mechanisms, drinking-water and sanitation services. A statistically significant and negative effect is found between drinking-water services and both cholera case rates as well as cholera-related mortality rates. A relatively weak statistical relationship is found between the occurrence of cholera outbreaks and sanitation services.
Improved estimates of ocean heat content from 1960 to 2015.
Cheng, Lijing; Trenberth, Kevin E; Fasullo, John; Boyer, Tim; Abraham, John; Zhu, Jiang
2017-03-01
Earth's energy imbalance (EEI) drives the ongoing global warming and can best be assessed across the historical record (that is, since 1960) from ocean heat content (OHC) changes. An accurate assessment of OHC is a challenge, mainly because of insufficient and irregular data coverage. We provide updated OHC estimates with the goal of minimizing associated sampling error. We performed a subsample test, in which subsets of data during the data-rich Argo era are colocated with locations of earlier ocean observations, to quantify this error. Our results provide a new OHC estimate with an unbiased mean sampling error and with variability on decadal and multidecadal time scales (signal) that can be reliably distinguished from sampling error (noise) with signal-to-noise ratios higher than 3. The inferred integrated EEI is greater than that reported in previous assessments and is consistent with a reconstruction of the radiative imbalance at the top of atmosphere starting in 1985. We found that changes in OHC are relatively small before about 1980; since then, OHC has increased fairly steadily and, since 1990, has increasingly involved deeper layers of the ocean. In addition, OHC changes in six major oceans are reliable on decadal time scales. All ocean basins examined have experienced significant warming since 1998, with the greatest warming in the southern oceans, the tropical/subtropical Pacific Ocean, and the tropical/subtropical Atlantic Ocean. This new look at OHC and EEI changes over time provides greater confidence than previously possible, and the data sets produced are a valuable resource for further study.
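The subsample test described above can be pictured with a toy example: evaluate the mean of a fully sampled "truth" field using only the sparse locations available in an earlier era, and the difference is the sampling error. This is a schematic sketch with made-up numbers, not the authors' gridded ocean analysis.

```python
def subsample_error(field, obs_points):
    """Difference between the full-field mean and the mean over a sparse
    set of observed grid points (the 'subsample test')."""
    full_mean = sum(sum(row) for row in field) / (len(field) * len(field[0]))
    sub_mean = sum(field[i][j] for i, j in obs_points) / len(obs_points)
    return sub_mean - full_mean

# Toy "Argo-era" field with a smooth gradient; historical coverage only
# samples the top-left quadrant, biasing the subsampled mean low.
n = 10
field = [[i + j for j in range(n)] for i in range(n)]
sparse = [(i, j) for i in range(5) for j in range(5)]
bias = subsample_error(field, sparse)
```

With complete coverage the error vanishes; with quadrant-only coverage the gradient produces a systematic bias, which is exactly what co-locating Argo-era data with historical observation sites quantifies.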
Donald B.K. English; Susan M. Kocis; J. Ross Arnold; Stanley J. Zarnoch; Larry Warren
2003-01-01
To estimate recreation visitation at the National Forest level in the US, annual counts of several types of visitation proxy measures were used. The intent was to improve the overall precision of the visitation estimate by employing the proxy counts. The precision of visitation estimates at sites that had proxy information is compared with that at sites that did not.
NASA Astrophysics Data System (ADS)
Gennemark, Peter; Wedelin, Dag
We consider parameter estimation in ordinary differential equations (ODEs) from completely observed systems, and describe an improved version of our previously reported heuristic algorithm (IET Syst. Biol., 2007). In that method, an initial estimate obtained by decomposing the problem into simulations of one ODE at a time is followed by a refinement step based on simulation of the full ODE system.
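The two-stage idea (fit each equation cheaply without full system simulation, then refine against a full simulation) can be sketched for a single decay ODE. This is a minimal illustration under simplifying assumptions (forward-Euler integration, a coarse 1-D refinement search), not the authors' algorithm.

```python
import math

def simulate(k, x0, ts):
    """Forward-Euler simulation of dx/dt = -k*x on the time grid ts."""
    xs, x, dt = [x0], x0, ts[1] - ts[0]
    for _ in ts[1:]:
        x += dt * (-k * x)
        xs.append(x)
    return xs

def slope_estimate(ts, xs):
    """Decomposed step: fit k from finite-difference slopes, no simulation."""
    ks = [-(xs[i + 1] - xs[i]) / ((ts[i + 1] - ts[i]) * xs[i])
          for i in range(len(xs) - 1)]
    return sum(ks) / len(ks)

def refine(ts, xs, k0, step=0.01, iters=200):
    """Refinement step: coarse 1-D search minimising simulation error."""
    def sse(k):
        sim = simulate(k, xs[0], ts)
        return sum((a - b) ** 2 for a, b in zip(sim, xs))
    candidates = [k0 + step * (i - iters // 2) for i in range(iters + 1)]
    return min(candidates, key=sse)

ts = [0.05 * i for i in range(40)]
data = [math.exp(-0.7 * t) for t in ts]   # noise-free data, true k = 0.7
k_init = slope_estimate(ts, data)
k_hat = refine(ts, data, k_init)
```

Both stages land near the true rate; the small residual offset comes from the Euler discretisation, which a real implementation would reduce with a better integrator.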
Janssen, Ronald J; Jylänki, Pasi; van Gerven, Marcel A J
2016-01-01
We have proposed a Bayesian approach for functional parcellation of whole-brain FMRI measurements which we call Clustered Activity Estimation with Spatial Adjacency Restrictions (CAESAR). We use distance-dependent Chinese restaurant processes (dd-CRPs) to define a flexible prior which partitions the voxel measurements into clusters whose number and shapes are unknown a priori. With dd-CRPs we can conveniently implement spatial constraints to ensure that our parcellations remain spatially contiguous and thereby physiologically meaningful. In the present work, we extend CAESAR by using Gaussian process (GP) priors to model the temporally smooth haemodynamic signals that give rise to the measured FMRI data. A challenge for GP inference in our setting is the cubic scaling with respect to the number of time points, which can become computationally prohibitive with FMRI measurements, potentially consisting of long time series. As a solution we describe an efficient implementation that is practically as fast as the corresponding time-independent non-GP model with typically-sized FMRI data sets. We also employ a population Monte-Carlo algorithm that can significantly speed up convergence compared to traditional single-chain methods. First we illustrate the benefits of CAESAR and the GP priors with simulated experiments. Next, we demonstrate our approach by parcellating resting state FMRI data measured from twenty participants as taken from the Human Connectome Project data repository. Results show that CAESAR affords highly robust and scalable whole-brain clustering of FMRI timecourses.
Strategies for Improved CALIPSO Aerosol Optical Depth Estimates
NASA Technical Reports Server (NTRS)
Vaughan, Mark A.; Kuehn, Ralph E.; Tackett, Jason L.; Rogers, Raymond R.; Liu, Zhaoyan; Omar, A.; Getzewich, Brian J.; Powell, Kathleen A.; Hu, Yongxiang; Young, Stuart A.; Avery, Melody A.; Winker, David M.; Trepte, Charles R.
2010-01-01
In the spring of 2010, the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) project will be releasing version 3 of its level 2 data products. In this paper we describe several changes to the algorithms and code that yield substantial improvements in CALIPSO's retrieval of aerosol optical depths (AOD). Among these are a retooled cloud-clearing procedure and a new approach to determining the base altitudes of aerosol layers in the planetary boundary layer (PBL). The results derived from these modifications are illustrated using case studies prepared using a late beta version of the level 2 version 3 processing code.
Monte Carlo Estimate to Improve Photon Energy Spectrum Reconstruction
NASA Astrophysics Data System (ADS)
Sawchuk, S.
Improvements to planning radiation treatment for cancer patients and quality control of medical linear accelerators (linacs) can be achieved with explicit knowledge of the photon energy spectrum. Monte Carlo (MC) simulations of linac treatment heads and experimental attenuation analysis are among the most popular ways of obtaining these spectra. Attenuation methods, which combine measurements under narrow-beam geometry with calculation techniques to reconstruct the spectrum from the acquired data, are very practical in a clinical setting and can also serve to validate MC simulations. A novel reconstruction method [1], since modified [2], utilizes Simpson's rule (SR) to approximate and discretize the underlying attenuation integral.
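For reference, composite Simpson's rule approximates an integral from function samples over an even number of subintervals. The sketch below applies it to a simple exponential attenuation kernel; it is a generic illustration of the quadrature rule, not the spectrum-reconstruction code of [1, 2].

```python
import math

def simpson(f, a, b, n=100):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return total * h / 3

# Integrate a simple attenuation kernel exp(-x) over [0, 2];
# the exact value is 1 - exp(-2).
approx = simpson(lambda x: math.exp(-x), 0.0, 2.0)
```

With 100 subintervals the quadrature error for this smooth integrand is far below measurement noise in any attenuation experiment.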
Improving a regional model using reduced complexity and parameter estimation
Kelson, Victor A.; Hunt, Randall J.; Haitjema, Henk M.
2002-01-01
The availability of powerful desktop computers and graphical user interfaces for ground water flow models makes possible the construction of ever more complex models. A proposed copper-zinc sulfide mine in northern Wisconsin offers a unique case in which the same hydrologic system has been modeled using a variety of techniques covering a wide range of sophistication and complexity. Early in the permitting process, simple numerical models were used to evaluate the necessary amount of water to be pumped from the mine, reductions in streamflow, and the drawdowns in the regional aquifer. More complex models have subsequently been used in an attempt to refine the predictions. Even after so much modeling effort, questions regarding the accuracy and reliability of the predictions remain. We have performed a new analysis of the proposed mine using the two-dimensional analytic element code GFLOW coupled with the nonlinear parameter estimation code UCODE. The new model is parsimonious, containing fewer than 10 parameters, and covers a region several times larger in areal extent than any of the previous models. The model demonstrates the suitability of analytic element codes for use with parameter estimation codes. The simplified model results are similar to the more complex models; predicted mine inflows and UCODE-derived 95% confidence intervals are consistent with the previous predictions. More important, the large areal extent of the model allowed us to examine hydrological features not included in the previous models, resulting in new insights about the effects that far-field boundary conditions can have on near-field model calibration and parameterization. In this case, the addition of surface water runoff into a lake in the headwaters of a stream while holding recharge constant moved a regional ground watershed divide and resulted in some of the added water being captured by the adjoining basin. Finally, a simple analytical solution was used to clarify the GFLOW model
Estimating the difference limen in 2AFC tasks: pitfalls and improved estimators.
Ulrich, Rolf; Vorberg, Dirk
2009-08-01
Discrimination performance is often assessed by measuring the difference limen (DL; or just noticeable difference) in a two-alternative forced choice (2AFC) task. Here, we show that the DL estimated from 2AFC percentage-correct data is likely to systematically under- or overestimate true discrimination performance if order effects are present. We show how pitfalls with the 2AFC task may be avoided and suggest a novel approach for analyzing 2AFC data.
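One way to picture the order-effect pitfall: estimate the DL from the 75%-correct point of the psychometric function, but average the percentage-correct data over the two presentation orders first, so an order-dependent shift cancels instead of biasing the estimate. The sketch below uses linear interpolation and invented numbers; the paper's improved estimators are more sophisticated.

```python
def dl_from_pc(levels, pc, target=0.75):
    """Difference limen: stimulus increment at the target percent correct,
    by linear interpolation between bracketing levels."""
    for (x0, p0), (x1, p1) in zip(zip(levels, pc), zip(levels[1:], pc[1:])):
        if p0 <= target <= p1:
            return x0 + (target - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("target not bracketed by the data")

# Percent correct measured separately for the two presentation orders;
# an order effect shifts one curve up and the other down.
levels  = [1, 2, 3, 4, 5]
pc_ord1 = [0.55, 0.65, 0.80, 0.90, 0.97]
pc_ord2 = [0.45, 0.55, 0.70, 0.84, 0.93]
pc_mean = [(a + b) / 2 for a, b in zip(pc_ord1, pc_ord2)]
dl = dl_from_pc(levels, pc_mean)
```

Estimating the DL from either order alone under- or overestimates it, while the order-averaged curve recovers the unbiased value.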
Clustering methods for removing outliers from vision-based range estimates
NASA Technical Reports Server (NTRS)
Hussien, B.; Suorsa, R.
1992-01-01
The present approach to the automation of helicopter low-altitude flight uses one or more passive imaging sensors to extract environmental obstacle information; this is then processed via computer-vision techniques to yield a time-varying map of range to obstacles in the sensor's field of view along the vehicle's flight path. Attention is given to two related techniques that can eliminate outliers from a sparse range map by clustering sparse range-map information into different spatial classes; both rely on a segmented and labeled image to aid spatial classification within the image plane.
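The clustering idea can be sketched simply: group range estimates by the spatial class assigned from the segmented image, then discard points that deviate too far from their class's median range. This is a minimal sketch of the general approach, not the paper's specific techniques; the class labels and threshold are invented for illustration.

```python
from statistics import median

def remove_outliers(points, max_dev=2.0):
    """points: list of (spatial_class, range_estimate) pairs.
    Drop estimates deviating from their class median by more than max_dev."""
    by_class = {}
    for label, r in points:
        by_class.setdefault(label, []).append(r)
    med = {label: median(rs) for label, rs in by_class.items()}
    return [(label, r) for label, r in points if abs(r - med[label]) <= max_dev]

pts = [("tree", 10.1), ("tree", 10.4), ("tree", 55.0),   # 55.0 is an outlier
       ("ridge", 80.2), ("ridge", 79.8)]
clean = remove_outliers(pts)
```

Because the median is robust, a single wild range estimate inside a class is rejected without disturbing the remaining points.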
An Improved Source-Scanning Algorithm for Locating Earthquake Clusters or Aftershock Sequences
NASA Astrophysics Data System (ADS)
Liao, Y.; Kao, H.; Hsu, S.
2010-12-01
The Source-scanning Algorithm (SSA) was originally introduced in 2004 to locate non-volcanic tremors. Its application was later expanded to the identification of earthquake rupture planes and the near-real-time detection and monitoring of landslides and mud/debris flows. In this study, we further improve SSA for the purpose of locating earthquake clusters or aftershock sequences when only a limited number of waveform observations are available. The main improvements include the application of a ground motion analyzer to separate P and S waves, the automatic determination of resolution based on the grid size and time step of the scanning process, and a modified brightness function to utilize constraints from multiple phases. Specifically, the improved SSA (named ISSA) addresses two major issues related to locating earthquake clusters/aftershocks. The first is the massive amount of time and labour required to locate a large number of seismic events manually. The second is to efficiently and correctly identify the same phase across the entire recording array when multiple events occur closely in time and space. To test the robustness of ISSA, we generate synthetic waveforms consisting of three separated events such that individual P and S phases arrive at different stations in different order, making correct phase picking nearly impossible. Using these very complicated waveforms as the input, ISSA scans the entire model space for possible combinations of time and location for the existence of seismic sources. The scanning results successfully associate the various phases from each event at all stations and correctly recover the input. To further demonstrate the advantage of ISSA, we apply it to waveform data collected by a temporary OBS array for the aftershock sequence of an offshore earthquake southwest of Taiwan. The overall signal-to-noise ratio is inadequate for locating small events, and the precise arrival times of P and S phases are difficult to determine.
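The core of a brightness-function scan can be sketched in one dimension: for each candidate origin time and location, sum the waveform amplitudes observed at the predicted arrival times, and keep the candidate that maximizes the sum. This toy version (a single phase, unit velocity, synthetic spike waveforms) only illustrates the scanning idea, not ISSA's multi-phase brightness function.

```python
def brightness(waveforms, stations, x, t0, v=1.0):
    """Sum of |amplitude| at the predicted arrival time t0 + |x - xs| / v
    over all stations (a minimal 1-D brightness function)."""
    total = 0.0
    for xs, (times, amps) in zip(stations, waveforms):
        t_pred = t0 + abs(x - xs) / v
        i = min(range(len(times)), key=lambda k: abs(times[k] - t_pred))
        total += abs(amps[i])
    return total

def scan(waveforms, stations, x_grid, t0_grid):
    """Grid search: return the (x, t0) pair maximising brightness."""
    return max(((x, t0) for x in x_grid for t0 in t0_grid),
               key=lambda p: brightness(waveforms, stations, p[0], p[1]))

# Synthetic event at x = 3, t0 = 1, unit velocity: each station records a
# single spike at its predicted arrival time.
stations = [0.0, 2.0, 6.0]
times = [0.1 * k for k in range(100)]
def spike(t_arr):
    return (times, [1.0 if abs(t - t_arr) < 0.05 else 0.0 for t in times])
waveforms = [spike(1 + abs(3 - s)) for s in stations]
x_hat, t0_hat = scan(waveforms, stations, [1, 2, 3, 4, 5], [0.5, 1.0, 1.5])
```

No phase picking is needed: only the source hypothesis consistent with all three arrivals stacks constructively, which is why the approach tolerates waveforms where individual picks would be ambiguous.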
Improving hyperspectral band selection by constructing an estimated reference map
NASA Astrophysics Data System (ADS)
Guo, Baofeng; Damper, Robert I.; Gunn, Steve R.; Nelson, James D. B.
2014-01-01
We investigate band selection for hyperspectral image classification. Mutual information (MI) measures the statistical dependence between two random variables. By modeling the reference map as one of the two random variables, MI can, therefore, be used to select the bands that are more useful for image classification. A new method is proposed to estimate the MI using an optimally constructed reference map, reducing reliance on ground-truth information. To reduce the interferences from noise and clutters, the reference map is constructed by averaging a subset of spectral bands that are chosen with the best capability to approximate the ground truth. To automatically find these bands, we develop a searching strategy consisting of differentiable MI, gradient ascending algorithm, and random-start optimization. Experiments on AVIRIS 92AV3C dataset and Pavia University scene dataset show that the proposed method outperformed the benchmark methods. In AVIRIS 92AV3C dataset, up to 55% of bands can be removed without significant loss of classification accuracy, compared to the 40% from that using the reference map accompanied with the dataset. Meanwhile, its performance is much more robust to accuracy degradation when bands are cut off beyond 60%, revealing a better agreement in the MI calculation. In Pavia University scene dataset, using 45 bands achieved 86.18% classification accuracy, which is only 1.5% lower than that using all the 103 bands.
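The MI criterion itself is easy to sketch: discretize a band and the reference map, form the joint histogram, and compute the standard plug-in mutual information. The snippet below uses toy data and a generic estimator; the paper's differentiable MI and optimally constructed reference map are not reproduced here.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # MI = sum over (x, y) of p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum((c / n) * math.log((c / n) * n * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

# Rank bands by MI with a (possibly estimated) reference map.
ref   = [0, 0, 1, 1, 0, 1, 0, 1]
band1 = [0, 0, 1, 1, 0, 1, 0, 1]      # perfectly informative band
band2 = [1, 0, 1, 0, 1, 0, 1, 0]      # band unrelated to the classes
bands = {"band1": band1, "band2": band2}
ranked = sorted(bands, key=lambda b: -mutual_information(bands[b], ref))
```

Bands that track the reference map score high and are kept; bands carrying no class information score near zero (up to finite-sample bias) and are candidates for removal.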
Ionospheric perturbation degree estimates for improving GNSS applications
NASA Astrophysics Data System (ADS)
Jakowski, Norbert; Mainul Hoque, M.; Wilken, Volker; Berdermann, Jens; Hlubek, Nikolai
Ionosphere can adversely affect accuracy, continuity, availability, and integrity of modern Global Navigation Satellite Systems (GNSS) in different ways. Hence, reliable information on key parameters describing the perturbation degree of the ionosphere is helpful for estimating the potential degradation of the performance of these systems. So, to guarantee the required safety level in aviation, Ground Based Augmentation Systems (GBAS) and Satellite Based Augmentation Systems (SBAS) have been established for detecting and mitigating ionospheric threats in particular due to ionospheric gradients. The paper reviews various attempts and capabilities to characterize the perturbation degree of the ionosphere currently being used in precise positioning and safety of life applications. Continuity and availability of signals are mainly impacted by amplitude and phase scintillations characterized by indices such as S4 or phase noise. To characterize medium and large scale ionospheric perturbations that may seriously affect accuracy and integrity of GNSS, the use of an internationally standardized Disturbance Ionosphere Index (DIX) is recommended. The definition of such a DIX must take into account the practical needs, should be an objective measure of ionospheric conditions and easy and reproducible to compute. A preliminary DIX approach is presented and discussed. Such a robust and easy adaptable index should have a great potential for being used in operational ionospheric weather services and GNSS augmentation systems.
Improving Multiyear Ice Concentration Estimates with Reanalysis Air Temperatures
NASA Astrophysics Data System (ADS)
Ye, Y.; Shokr, M.; Heygster, G.; Spreen, G.
2015-12-01
Multiyear ice (MYI) characteristics can be retrieved from passive or active microwave remote sensing observations. One of the algorithms that combines both types of observation to identify partial concentrations of ice types (including MYI) is Environment Canada's Ice Concentration Extractor (ECICE). However, cycles of warm/cold air temperature trigger wet-refreeze cycles of the snow cover on the MYI surface. Under wet snow conditions, anomalous brightness temperature and backscatter, similar to those of first-year ice (FYI), are observed. This leads to misidentification of MYI as FYI and hence to sudden drops in the estimated MYI concentration. The purpose of this study is to introduce a correction scheme to restore the MYI concentration under this condition. The correction is based on air temperature records and utilizes the fact that a warm spell in autumn lasts for a short period of time (a few days). The correction is applied to MYI concentration results from ECICE using an input of combined QuikSCAT and AMSR-E data acquired over the Arctic region in a series of autumn seasons from 2003 to 2008. The correction works well, replacing anomalous MYI concentrations with interpolated ones. For September of the six years, it introduces over 0.1×10⁶ km² of MYI area except for 2005. Given the regional effect of the warm air spells, the correction could be important in operational applications where small- and meso-scale ice concentrations are crucial.
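The interpolation step can be sketched on a made-up daily series: flag a short-lived anomalous dip and bridge it linearly from the surrounding good days. For simplicity this sketch detects the dip from the concentration series itself, whereas the paper flags warm spells from air temperature records.

```python
def correct_myi(conc, drop=0.2):
    """Replace short anomalous dips in a daily MYI-concentration series
    with linear interpolation between the surrounding good days."""
    out = list(conc)
    i = 1
    while i < len(out) - 1:
        if out[i] < out[i - 1] - drop:          # sudden drop: flag as anomalous
            j = i
            while j < len(out) - 1 and out[j] < out[i - 1] - drop:
                j += 1                           # find the end of the dip
            for k in range(i, j):                # interpolate across the dip
                f = (k - (i - 1)) / (j - (i - 1))
                out[k] = out[i - 1] + f * (out[j] - out[i - 1])
            i = j
        i += 1
    return out

series = [0.80, 0.78, 0.30, 0.25, 0.76, 0.75]   # two-day warm-spell dip
fixed = correct_myi(series)
```

The dip days are replaced by values between the bracketing good days, while the rest of the series is left untouched.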
An adaptive displacement estimation algorithm for improved reconstruction of thermal strain.
Ding, Xuan; Dutta, Debaditya; Mahmoud, Ahmed M; Tillman, Bryan; Leers, Steven A; Kim, Kang
2015-01-01
Thermal strain imaging (TSI) can be used to differentiate between lipid and water-based tissues in atherosclerotic arteries. However, detecting small lipid pools in vivo requires accurate and robust displacement estimation over a wide range of displacement magnitudes. Phase-shift estimators such as Loupas' estimator and time-shift estimators such as normalized cross-correlation (NXcorr) are commonly used to track tissue displacements. However, Loupas' estimator is limited by phase-wrapping and NXcorr performs poorly when the SNR is low. In this paper, we present an adaptive displacement estimation algorithm that combines both Loupas' estimator and NXcorr. We evaluated this algorithm using computer simulations and an ex vivo human tissue sample. Using 1-D simulation studies, we showed that when the displacement magnitude induced by thermal strain was >λ/8 and the electronic system SNR was >25.5 dB, the NXcorr displacement estimate was less biased than the estimate found using Loupas' estimator. On the other hand, when the displacement magnitude was ≤λ/4 and the electronic system SNR was ≤25.5 dB, Loupas' estimator had less variance than NXcorr. We used these findings to design an adaptive displacement estimation algorithm. Computer simulations of TSI showed that the adaptive displacement estimator was less biased than either Loupas' estimator or NXcorr. Strain reconstructed from the adaptive displacement estimates improved the strain SNR by 43.7 to 350% and the spatial accuracy by 1.2 to 23.0% (P < 0.001). An ex vivo human tissue study provided results that were comparable to computer simulations. The results of this study showed that a novel displacement estimation algorithm, which combines two different displacement estimators, yielded improved displacement estimation and resulted in improved strain reconstruction.
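The regime-switching logic reported above can be caricatured as a threshold rule on displacement magnitude and SNR. The sketch below is a simplified decision rule with hypothetical inputs (`phase_est`, `xcorr_est`, and `disp_guess` are assumed to come from upstream estimators), not the published algorithm.

```python
def adaptive_estimate(phase_est, xcorr_est, disp_guess, snr_db,
                      wavelength=1.0):
    """Pick between a phase-shift and a time-shift displacement estimate,
    following the regimes reported above: prefer the phase-shift estimator
    for small displacements at low SNR, the cross-correlation estimator
    for large displacements at high SNR."""
    if disp_guess <= wavelength / 4 and snr_db <= 25.5:
        return phase_est            # lower-variance regime (Loupas-type)
    if disp_guess > wavelength / 8 and snr_db > 25.5:
        return xcorr_est            # lower-bias regime (NXcorr-type)
    # intermediate regime: average the two estimates (an arbitrary choice
    # for this sketch)
    return 0.5 * (phase_est + xcorr_est)

# A large displacement at high SNR favours the time-shift estimate.
est = adaptive_estimate(phase_est=0.30, xcorr_est=0.33,
                        disp_guess=0.30, snr_db=30.0)
```

In practice the switch would be evaluated per pixel, so each region of the strain image uses whichever estimator is more reliable there.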
2012-01-01
Background The localization of proteins to specific subcellular structures in eukaryotic cells provides important information with respect to their function. Fluorescence microscopy approaches to determine localization distribution have proved to be an essential tool in the characterization of unknown proteins, and are now particularly pertinent as a result of the wide availability of fluorescently-tagged constructs and antibodies. However, there are currently very few image analysis options able to effectively discriminate proteins with apparently similar distributions in cells, despite this information being important for protein characterization. Findings We have developed a novel method for combining two existing image analysis approaches, which results in highly efficient and accurate discrimination of proteins with seemingly similar distributions. We have combined image texture-based analysis with quantitative co-localization coefficients, a method that has traditionally only been used to study the spatial overlap between two populations of molecules. Here we describe and present a novel application for quantitative co-localization, as applied to the study of Rab family small GTP binding proteins localizing to the endomembrane system of cultured cells. Conclusions We show how quantitative co-localization can be used alongside texture feature analysis, resulting in improved clustering of microscopy images. The use of co-localization as an additional clustering parameter is non-biased and highly applicable to high-throughput image data sets.
Böcker, Alexander
2008-11-01
A new clustering algorithm was developed that is able to group large data sets with more than 100,000 molecules according to their chemotypes. The algorithm preclusters a data set using a fingerprint version of the hierarchical k-means algorithm. Chemotypes are extracted from the terminal clusters via a maximum common substructure approach. Molecules forming a chemotype have to share a predefined number of rings, atoms, and non-carbon heavy atoms. In an iterative procedure, similar chemotypes and singletons are fused to larger chemotypes. Singletons that cannot be assigned to any chemotype are then grouped based on the proportion of overlap between the molecules. Representatives from each chemotype and the singletons are used in a second round of the hierarchical k-means algorithm to provide a final hierarchical grouping. Results are reported to an interactive graphical user interface which allows initial insights about the structure activity relationship (SAR) of the molecules. Example applications are shown for two chemotypes of reverse transcriptase inhibitors in the MDDR database and for the evaluation of descriptor-based similarity searching routines. A special focus was laid on the chemotype hopping potential of each individual routine. The algorithm will allow the analysis of high-throughput and virtual screening results with improved quality.
Rejani, R; Rao, K V; Osman, M; Srinivasa Rao, Ch; Reddy, K Sammi; Chary, G R; Pushpanjali; Samuel, Josily
2016-03-01
The ungauged wet semi-arid watershed cluster, Seethagondi, lies in the Adilabad district of Telangana in India and is prone to severe erosion and water scarcity. The runoff and soil loss data at watershed, catchment, and field level are necessary for planning soil and water conservation interventions. In this study, an attempt was made to develop a spatial soil loss estimation model for the Seethagondi cluster using RUSLE coupled with ArcGIS, and it was used to estimate the soil loss spatially and temporally. The daily rainfall data of APHRODITE for the period from 1951 to 2007 were used, and the annual rainfall varied from 508 to 1351 mm with a mean annual rainfall of 950 mm and a mean erosivity of 6789 MJ mm ha⁻¹ h⁻¹ year⁻¹. Considerable variation in land use and land cover, especially in crop land and fallow land, was observed during normal and drought years, with corresponding variation in erosivity, C factor, and soil loss. The mean value of the C factor derived from NDVI for crop land was 0.42 and 0.22 in normal and drought years, respectively. The topography is undulating, the major portion of the cluster has slope less than 10°, and 85.3% of the cluster has soil loss below 20 t ha⁻¹ year⁻¹. The soil loss from crop land varied from 2.9 to 3.6 t ha⁻¹ year⁻¹ in low rainfall years to 31.8 to 34.7 t ha⁻¹ year⁻¹ in high rainfall years, with a mean annual soil loss of 12.2 t ha⁻¹ year⁻¹. The soil loss from crop land was highest in the month of August, with an annual soil loss of 13.1 and 2.9 t ha⁻¹ year⁻¹ in normal and drought years, respectively. Based on the soil loss in a normal year, the interventions recommended for 85.3% of the watershed area include agronomic measures such as contour cultivation, graded bunds, strip cropping, mixed cropping, crop rotations, mulching, summer plowing, vegetative bunds, and agri-horticultural systems, and management practices such as broad bed furrows, raised and sunken beds, and harvesting of available water.
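RUSLE itself is a simple multiplicative model, A = R · K · LS · C · P. The sketch below evaluates it for one illustrative crop-land cell: the R value (6789 MJ mm ha⁻¹ h⁻¹ year⁻¹) and the C values (0.42 normal year, 0.22 drought year) are taken from the abstract, while K, LS, and P are invented for the example.

```python
def rusle_soil_loss(R, K, LS, C, P):
    """Annual soil loss A (t ha^-1 year^-1) from the RUSLE factors:
    A = R * K * LS * C * P, where R is rainfall erosivity, K soil
    erodibility, LS slope length/steepness, C cover, and P practice."""
    return R * K * LS * C * P

# K, LS, and P below are hypothetical values chosen for illustration only.
A_normal  = rusle_soil_loss(R=6789, K=0.01, LS=1.2, C=0.42, P=0.5)
A_drought = rusle_soil_loss(R=2500, K=0.01, LS=1.2, C=0.22, P=0.5)
```

Because the model is purely multiplicative, the lower erosivity and sparser-cover C factor of a drought year combine directly into a proportionally lower soil loss, mirroring the normal-versus-drought contrast reported above.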
NASA Astrophysics Data System (ADS)
Milone, Eugene F.; Schiller, Stephen Joseph
2015-08-01
Eclipsing binaries (EBs) with well-calibrated photometry and precisely measured double-lined radial velocities are candidate standard candles when analyzed with a version of the Wilson-Devinney (WD) light curve modeling program that includes the direct distance estimation (DDE) algorithm. In the DDE procedure, distance is determined as a system parameter, thus avoiding the assumption of stellar sphericity and yielding a well-determined standard error for the distance. The method therefore provides a powerful way to calibrate the distances of other objects in any aggregate that contains suitable EBs. DDE has been successfully applied to nearby systems and to a small number of EBs in open clusters. Previously we reported on one of the systems in our Binaries-in-Clusters program, HD 27130 = V818 Tau, which had been analyzed with earlier versions of the WD program (see 1987 AJ 93, 1471; 1988 AJ 95, 1466; and 1995 AJ 109, 359 for examples). Results from those early solutions were entered as starting parameters in the current work with the WD 2013 version. Here we report several series of ongoing modeling experiments on DS And, a 1.01-d period, early-type EB in the intermediate-age cluster NGC 752. In one series, ranges of interstellar extinction and hotter-star temperature were assumed, and in another series both component temperatures were adjusted. Consistent parameter sets, including distance, confirm DDE's advantages, essentially limited only by knowledge of interstellar extinction, which is small for DS And. Uncertainties in the bandpass calibration constants (flux in standard units from a zero-magnitude star) are much less important because the derived distance scales (inversely) only with the calibration's square root. This work was enabled by the unstinting help of Bob Wilson. We acknowledge earlier support for the Binaries-in-Clusters program from NSERC of Canada, and the Research Grants Committee and Department of Physics & Astronomy of the University of Calgary.
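The closing remark about calibration can be checked with simple error propagation: if the derived distance scales as the inverse square root of the calibration constant, d ∝ C^(-1/2), then a fractional calibration error produces only about half that fractional error in distance. A back-of-the-envelope sketch:

```python
# Error-propagation check of the scaling d proportional to C**-0.5:
# a relative shift in the calibration constant C shifts the derived
# distance by roughly half as much (in relative terms).

def distance_error_fraction(calibration_error_fraction):
    """Relative distance shift implied by d ~ C**-0.5 for a relative shift in C."""
    return abs((1.0 + calibration_error_fraction) ** -0.5 - 1.0)

err = distance_error_fraction(0.04)  # a 4% calibration error -> ~2% in distance
```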
Clustering approaches to improve the performance of low cost air pollution sensors.
Smith, Katie R; Edwards, Peter M; Evans, Mathew J; Lee, James D; Shaw, Marvin D; Squires, Freya; Wilde, Shona; Lewis, Alastair C
2017-08-24
frequent calibration. Using a cluster median value eliminates unpredictable medium-term response changes and other longer-term outlier behaviours, extending the likely period needed between calibrations and making linear interpolation between calibrations more appropriate. Through the use of sensor clusters rather than individual sensors, existing low-cost technologies could deliver significantly improved quality of observations.
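The cluster-median idea above reduces to taking the median across co-located sensors at each time step, so a single sensor's drift or outlier behaviour does not contaminate the reported value. A minimal sketch with hypothetical readings:

```python
# Minimal sketch of the cluster-median approach: report the per-time-step
# median across a cluster of co-located low-cost sensors, suppressing any
# single sensor's drift or outliers.
from statistics import median

def cluster_median(readings_by_sensor):
    """readings_by_sensor: equal-length time series, one list per sensor.
    Returns the median across the cluster at each time step."""
    return [median(vals) for vals in zip(*readings_by_sensor)]

# Three hypothetical sensors; sensor s2 develops a drifting positive bias.
s1 = [40.0, 41.0, 39.5, 40.5]
s2 = [40.2, 43.0, 45.5, 48.0]   # drifting sensor
s3 = [39.8, 40.6, 39.9, 40.8]
print(cluster_median([s1, s2, s3]))  # drift is suppressed by the median
```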
An improved method for nonlinear parameter estimation: a case study of the Rössler model
NASA Astrophysics Data System (ADS)
He, Wen-Ping; Wang, Liu; Jiang, Yun-Di; Wan, Shi-Quan
2016-08-01
Parameter estimation is an important research topic in nonlinear dynamics. Based on the evolutionary algorithm (EA), Wang et al. (2014) presented a new scheme for nonlinear parameter estimation, and numerical tests indicate that the estimation precision is satisfactory. However, the convergence rate of the EA is relatively slow when multiple unknown parameters in a multidimensional dynamical system are estimated simultaneously. To solve this problem, an improved method for parameter estimation of nonlinear dynamical equations is provided in the present paper. The main idea of the improved scheme is to use the known time series for all of the components of the dynamical equations to estimate the parameters in a single component one by one, instead of estimating all of the parameters in all of the components simultaneously. Thus, we can estimate all of the parameters stage by stage. The performance of the improved method was tested using a classic chaotic system, the Rössler model. The numerical tests show that the amended parameter estimation scheme can greatly improve the searching efficiency and that there is a significant increase in the convergence rate of the EA, particularly for multiparameter estimation in multidimensional dynamical equations. Moreover, the results indicate that neither the accuracy of parameter estimation nor the CPU time consumed by the presented method has an obvious dependence on the sample size.
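The component-wise idea can be illustrated on the Rössler system itself: with the x, y, z time series known, the y-equation dy/dt = x + a·y involves only the single parameter a, which can then be recovered on its own. The sketch below uses a plain least-squares fit for that one-parameter subproblem rather than the paper's evolutionary algorithm, purely to show the decomposition.

```python
# Component-wise parameter recovery on the Rossler system
#   dx/dt = -y - z,  dy/dt = x + a*y,  dz/dt = b + z*(x - c).
# Given full time series, a appears alone in the y-equation and can be
# fitted there by itself (least squares here, in place of the paper's EA).

def rossler_series(a=0.2, b=0.2, c=5.7, dt=0.01, n=5000):
    """Integrate the Rossler system with a simple RK4 scheme."""
    def f(s):
        x, y, z = s
        return (-y - z, x + a * y, b + z * (x - c))
    s = (1.0, 1.0, 1.0)
    out = [s]
    for _ in range(n):
        k1 = f(s)
        k2 = f(tuple(s[i] + 0.5 * dt * k1[i] for i in range(3)))
        k3 = f(tuple(s[i] + 0.5 * dt * k2[i] for i in range(3)))
        k4 = f(tuple(s[i] + dt * k3[i] for i in range(3)))
        s = tuple(s[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                  for i in range(3))
        out.append(s)
    return out

def estimate_a(series, dt=0.01):
    """Least-squares fit of a in dy/dt = x + a*y, using central differences."""
    num = den = 0.0
    for i in range(1, len(series) - 1):
        x, y, _ = series[i]
        ydot = (series[i + 1][1] - series[i - 1][1]) / (2 * dt)
        num += y * (ydot - x)
        den += y * y
    return num / den

a_hat = estimate_a(rossler_series())  # true value is a = 0.2
```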
Improving modeled snow albedo estimates during the spring melt season
NASA Astrophysics Data System (ADS)
Malik, M. Jahanzeb; Velde, Rogier; Vekerdy, Zoltan; Su, Zhongbo
2014-06-01
Snow albedo influences the energy and water budgets of snow-covered land and is thus an important variable for energy and water flux calculations. Here, we quantify the performance of three existing snow albedo parameterizations under alpine, tundra, and prairie snow conditions when implemented in the Noah land surface model (LSM): Noah's default and those from the Biosphere-Atmosphere Transfer Scheme (BATS) and the Canadian Land Surface Scheme (CLASS) LSMs. The Noah LSM is forced with, and its output is evaluated against, in situ measurements from seven sites in the U.S. and France. Comparison of the snow albedo simulations with the in situ measurements reveals that all three parameterizations overestimate snow albedo during springtime. An alternative snow albedo parameterization is introduced that adopts the shape of the variogram for optically thick snowpacks and decreases the albedo further for optically thin conditions by mixing the snow albedo with the land surface (background) albedo as a function of snow depth. In comparison with the in situ measurements, the new parameterization improves albedo simulation of the alpine and tundra snowpacks and positively impacts the simulation of snow depth, snowmelt rate, and upward shortwave radiation. An improved model performance with the variogram-shaped parameterization cannot, however, be unambiguously detected for prairie snowpacks, which may be attributed to uncertainties associated with the simulation of snow density. An assessment of the model performance for the Upper Colorado River Basin highlights that with the variogram-shaped parameterization Noah simulates more evapotranspiration and larger runoff peaks in spring, whereas the summer runoff is lower.
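The thin-snow mixing step can be sketched as a depth-dependent blend of the deep-snow albedo with the background albedo. The exponential weighting and e-folding depth below are illustrative assumptions, not the paper's fitted variogram-shaped parameterization.

```python
# Hedged sketch of the thin-snow mixing idea: blend deep-snow albedo with
# the background (land surface) albedo as a function of snow depth. The
# exponential weight and the e-folding depth d_star are assumptions made
# for illustration only.
import math

def effective_albedo(alpha_snow, alpha_background, snow_depth_m, d_star=0.05):
    """Mix snow and background albedo; d_star is a hypothetical e-folding depth."""
    f_snow = 1.0 - math.exp(-snow_depth_m / d_star)  # snow influence fraction
    return f_snow * alpha_snow + (1.0 - f_snow) * alpha_background
```

At zero depth the cell reverts to the background albedo; for an optically thick pack the snow albedo dominates, which is the qualitative behaviour the abstract describes.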
Improved estimates of ocean heat content from 1960-2015
NASA Astrophysics Data System (ADS)
Cheng, L.; Trenberth, K. E.; Fasullo, J.; Boyer, T.; Abraham, J. P.; Zhu, J.
2016-12-01
Earth's energy imbalance (EEI) drives the ongoing global warming and can best be assessed across the historical record by ocean heat content (OHC) changes. An accurate assessment of OHC is a challenge, mainly due to insufficient and irregular data coverage. Here we provide updated OHC estimates since 1960 with the goal of minimizing the associated sampling error. A subsample test, in which subsets of data in the data-rich Argo era are co-located with historical ocean observation locations, is performed to quantify this error. Results imply that OHC variations on decadal and multi-decadal scales can reliably be represented by our advanced mapping method (the ratio of signal to error is 3-25). The inferred integrated EEI based on the updated OHC is greater than reported in previous assessments, but is consistent with a reconstruction of the radiative imbalance at the top of the atmosphere since 1985. In addition, OHC changes in six major ocean basins can also be reliably reconstructed on decadal scales (ratio of signal to error of 2-20). All examined ocean basins have experienced significant warming since 1998, although the heat was mainly stored in the Southern Ocean, tropical Pacific, and tropical Atlantic during 1960-1998. A new look at OHC and EEI changes over time has provided greater confidence than previously possible. Changes in OHC were small prior to about 1980, but have increased fairly steadily since then, increasingly involving deeper layers of the ocean after 1990. These datasets provide a valuable resource for further studies.
Improved automatic optic nerve radius estimation from high resolution MRI
NASA Astrophysics Data System (ADS)
Harrigan, Robert L.; Smith, Alex K.; Mawn, Louise A.; Smith, Seth A.; Landman, Bennett A.
2017-02-01
The optic nerve (ON) is a vital structure in the human visual system and transports all visual information from the retina to the cortex for higher order processing. Due to the lack of redundancy in the visual pathway, measures of ON damage have been shown to correlate well with visual deficits. These measures are typically taken at an arbitrary anatomically defined point along the nerve and do not characterize changes along the length of the ON. We propose a fully automated, three-dimensionally consistent technique building upon a previous independent slice-wise technique to estimate the radius of the ON and surrounding cerebrospinal fluid (CSF) on high-resolution heavily T2-weighted isotropic MRI. We show that by constraining results to be three-dimensionally consistent this technique produces more anatomically viable results. We compare this technique with the previously published slice-wise technique using a short-term reproducibility data set, 10 subjects, follow-up <1 month, and show that the new method is more reproducible in the center of the ON. The center of the ON contains the most accurate imaging because it lacks confounders such as motion and frontal lobe interference. Long-term reproducibility, 5 subjects, follow-up of approximately 11 months, is also investigated with this new technique and shown to be similar to short-term reproducibility, indicating that the ON does not change substantially within 11 months. The increased accuracy of this new technique provides increased power when searching for anatomical changes in ON size amongst patient populations.
Improving the text classification using clustering and a novel HMM to reduce the dimensionality.
Seara Vieira, A; Borrajo, L; Iglesias, E L
2016-11-01
In text classification problems, the representation of a document has a strong impact on the performance of learning systems. The high dimensionality of classical structured representations can lead to burdensome computations due to the great size of real-world data. Consequently, there is a need to reduce the quantity of handled information to improve the classification process. In this paper, we propose a method to reduce the dimensionality of a classical text representation, based on a clustering technique to group documents and a previously developed Hidden Markov Model to represent them. We have run tests with the k-NN and SVM classifiers on the OHSUMED and TREC benchmark text corpora using the proposed dimensionality reduction technique. The experimental results obtained are very satisfactory compared with commonly used techniques like InfoGain, and the statistical tests performed demonstrate the suitability of the proposed technique for the preprocessing step in a text classification task.
A space-time look at two-phase estimation for improved annual inventory estimates
Jay Breidt; Jean Opsomer; Xiyue Liao; Gretchen. Moisen
2015-01-01
Over the past several years, three sets of new temporal remote sensing data have become available, improving FIA's ability to detect, characterize, and forecast land cover changes. First, historic Landsat data have been processed for the conterminous US to provide disturbance history, agents of change, and fitted spectral trajectories annually over the last 30+ years at...
An Improved Estimation of COMS-based Sea Surface Temperature
NASA Astrophysics Data System (ADS)
Huh, M.; Seo, M.; Han, K. S.; Shin, J.; Shin, I.
2016-12-01
The objective of this paper is to retrieve sea surface temperature (SST) using Korea's geostationary satellite, the Communication, Ocean and Meteorological Satellite/Meteorological Imager (COMS/MI). In this study, the thermal infrared (IR) channels of COMS are corrected using the Global Space-based Inter-Calibration System (GSICS), which produces consistent accuracy across satellite IR measurements. The new retrieval method adopts the Multi-Channel Sea Surface Temperature (MCSST) 'split-window' algorithm with a first guess, and quality-controlled in situ buoy data are used as the reference. The new MCSST_FG results show an RMSE of 0.85 °C in daytime (0.747 °C at night), compared with 0.92 °C (0.827 °C at night) for MCSST, the current operational retrieval method. We found that regional biases are reduced with the MCSST_FG algorithm, although skewness and outliers remain in the analysis of differences between retrieved and in situ SST. These reprocessing efforts and improvements to the COMS SST are significant, and we expect the COMS SST to contribute to thematic climate data records such as global essential climate variables.
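A multi-channel split-window retrieval of the kind named above is a linear combination of the ~11 µm brightness temperature, the 11-12 µm channel difference, and a satellite-zenith-angle term. The sketch below shows that generic structure only; the coefficients are placeholders for illustration, not the operational COMS/MI regression, and the first-guess variant additionally folds a background SST field into the fit.

```python
# Schematic split-window SST retrieval (generic MCSST-style form). The
# coefficients a, b, c, d are illustrative placeholders, NOT the COMS/MI
# operational values.
import math

def mcsst(t11, t12, sat_zenith_deg, a=1.02, b=2.1, c=0.7, d=-278.0):
    """Brightness temperatures t11, t12 in kelvin; returns SST in deg C."""
    secant_term = 1.0 / math.cos(math.radians(sat_zenith_deg)) - 1.0
    return a * t11 + b * (t11 - t12) + c * (t11 - t12) * secant_term + d

sst = mcsst(290.0, 288.5, 30.0)  # a physically plausible mid-latitude scene
```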
Improving the quality of care for infants: a cluster randomized controlled trial
Lee, Shoo K.; Aziz, Khalid; Singhal, Nalini; Cronin, Catherine M.; James, Andrew; Lee, David S.C.; Matthew, Derek; Ohlsson, Arne; Sankaran, Koravangattu; Seshia, Mary; Synnes, Anne; Walker, Robin; Whyte, Robin; Langley, Joanne; MacNab, Ying C.; Stevens, Bonnie; von Dadelszen, Peter
2009-01-01
Background We developed and tested a new method, called the Evidence-based Practice for Improving Quality method, for continuous quality improvement. Methods We used cluster randomization to assign 6 neonatal intensive care units (ICUs) to reduce nosocomial infection (infection group) and 6 ICUs to reduce bronchopulmonary dysplasia (pulmonary group). We included all infants born at 32 or fewer weeks gestation. We collected baseline data for 1 year. Practice change interventions were implemented using rapid-change cycles for 2 years. Results The difference in incidence trends (slopes of trend lines) between the ICUs in the infection and pulmonary groups was − 0.0020 (95% confidence interval [CI] − 0.0007 to 0.0004) for nosocomial infection and − 0.0006 (95% CI − 0.0011 to − 0.0001) for bronchopulmonary dysplasia. Interpretation The results suggest that the Evidence-based Practice for Improving Quality method reduced bronchopulmonary dysplasia in the neonatal ICU and that it may reduce nosocomial infection. PMID:19667033
Novel angle estimation for bistatic MIMO radar using an improved MUSIC
NASA Astrophysics Data System (ADS)
Li, Jianfeng; Zhang, Xiaofei; Chen, Han
2014-09-01
In this article, we study the problem of angle estimation for bistatic multiple-input multiple-output (MIMO) radar and propose an improved multiple signal classification (MUSIC) algorithm for joint direction of departure (DOD) and direction of arrival (DOA) estimation. The proposed algorithm obtains initial angle estimates from the signal subspace and uses local one-dimensional peak searches to achieve the joint estimation of DOD and DOA. The angle estimation performance of the proposed algorithm is better than that of the estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm, and is almost the same as that of two-dimensional MUSIC. Furthermore, the proposed algorithm is suitable for irregular array geometries, obtains automatically paired DOD and DOA estimates, and avoids two-dimensional peak searching. The simulation results verify the effectiveness and improvement of the algorithm.
Estimating Accuracy of Land-Cover Composition From Two-Stage Clustering Sampling
Land-cover maps are often used to compute land-cover composition (i.e., the proportion or percent of area covered by each class), for each unit in a spatial partition of the region mapped. We derive design-based estimators of mean deviation (MD), mean absolute deviation (MAD), ...
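The MD and MAD measures named above compare map-based and reference class proportions across spatial units. The sketch below shows the plain (unweighted) versions of those deviations with hypothetical proportions; the paper's estimators additionally account for the two-stage cluster sampling design.

```python
# Minimal sketch of mean deviation (MD) and mean absolute deviation (MAD)
# between map-based and reference land-cover proportions across units.
# Unweighted version for illustration; the design-based estimators in the
# paper weight by the two-stage sampling design.

def md_mad(map_props, ref_props):
    """map_props, ref_props: per-unit proportions of one class (same length)."""
    devs = [m - r for m, r in zip(map_props, ref_props)]
    md = sum(devs) / len(devs)
    mad = sum(abs(d) for d in devs) / len(devs)
    return md, mad

# Hypothetical proportions of one class in four map units vs reference data.
md, mad = md_mad([0.30, 0.55, 0.10, 0.40], [0.25, 0.60, 0.15, 0.40])
```

MD keeps the sign and so measures systematic over- or under-mapping of the class; MAD measures the typical magnitude of disagreement regardless of direction.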
Wu, Hao-Yi; Rozo, Eduardo; Wechsler, Risa H.; /KIPAC, Menlo Park /SLAC /CCAPP, Columbus /KICP, Chicago /KIPAC, Menlo Park /SLAC
2010-06-02
The precision of cosmological parameters derived from galaxy cluster surveys is limited by uncertainty in relating observable signals to cluster mass. We demonstrate that a small mass-calibration follow-up program can significantly reduce this uncertainty and improve parameter constraints, particularly when the follow-up targets are judiciously chosen. To this end, we apply a simulated annealing algorithm to maximize the dark energy information at fixed observational cost, and find that optimal follow-up strategies can reduce the observational cost required to achieve a specified precision by up to an order of magnitude. Considering clusters selected from optical imaging in the Dark Energy Survey, we find that approximately 200 low-redshift X-ray clusters or massive Sunyaev-Zel'dovich clusters can improve the dark energy figure of merit by 50%, provided that the follow-up mass measurements involve no systematic error. In practice, the actual improvement depends on (1) the uncertainty in the systematic error in follow-up mass measurements, which needs to be controlled at the 5% level to avoid severe degradation of the results; and (2) the scatter in the optical richness-mass distribution, which needs to be made as tight as possible to improve the efficacy of follow-up observations.
High, F. W.; Stalder, B.; Song, J.; Ade, P. A. R.; Aird, K. A.; Allam, S. S.; Buckley-Geer, E. J.; Armstrong, R.; Barkhouse, W. A.; Benson, B. A.; Bertin, E.; Bhattacharya, S.; Bleem, L. E.; Carlstrom, J. E.; Chang, C. L.; Crawford, T. M.; Crites, A. T.; Brodwin, M.; Challis, P.; De Haan, T.
2010-11-10
We present redshifts and optical richness properties of 21 galaxy clusters uniformly selected by their Sunyaev-Zel'dovich (SZ) signature. These clusters, plus an additional, unconfirmed candidate, were detected in a 178 deg{sup 2} area surveyed by the South Pole Telescope (SPT) in 2008. Using griz imaging from the Blanco Cosmology Survey and from pointed Magellan telescope observations, as well as spectroscopy using Magellan facilities, we confirm the existence of clustered red-sequence galaxies, report red-sequence photometric redshifts, present spectroscopic redshifts for a subsample, and derive R{sub 200} radii and M{sub 200} masses from optical richness. The clusters span redshifts from 0.15 to greater than 1, with a median redshift of 0.74; three clusters are estimated to be at z>1. Redshifts inferred from mean red-sequence colors exhibit 2% rms scatter in {sigma}{sub z}/(1 + z) with respect to the spectroscopic subsample for z < 1. We show that the M{sub 200} cluster masses derived from optical richness correlate with masses derived from SPT data and agree with previously derived scaling relations to within the uncertainties. Optical and infrared imaging is an efficient means of cluster identification and redshift estimation in large SZ surveys, and exploiting the same data for richness measurements, as we have done, will be useful for constraining cluster masses and radii for large samples in cosmological analysis.
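The 2% rms scatter quoted above is the root mean square of (z_photo - z_spec)/(1 + z_spec) over the spectroscopic subsample. A minimal sketch of that statistic, with hypothetical redshift values:

```python
# Sketch of the photometric-redshift scatter statistic: the rms of
# (z_photo - z_spec) / (1 + z_spec) over a spectroscopic subsample.

def photoz_rms_scatter(z_photo, z_spec):
    resid = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_photo, z_spec)]
    return (sum(r * r for r in resid) / len(resid)) ** 0.5

# Hypothetical red-sequence photo-zs paired with spectroscopic redshifts.
scatter = photoz_rms_scatter([0.16, 0.42, 0.73, 0.91],
                             [0.15, 0.44, 0.74, 0.88])
```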
Measuring slope to improve energy expenditure estimates during field-based activities.
Duncan, Glen E; Lester, Jonathan; Migotsky, Sean; Higgins, Lisa; Borriello, Gaetano
2013-03-01
This technical note describes methods to improve activity energy expenditure estimates by using a multi-sensor board (MSB) to measure slope. Ten adults walked over a 4-km (2.5-mile) course wearing an MSB and mobile calorimeter. Energy expenditure was estimated using accelerometry alone (base) and 4 methods to measure slope. The barometer and global positioning system methods improved accuracy by 11% from the base (p < 0.05) to 86% overall. Measuring slope using the MSB improves energy expenditure estimates during field-based activities.
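The slope measurement underlying the correction reduces to a grade computed from the change in (barometric or GPS) altitude over the distance travelled. Only that computation is sketched below; the note's actual energy-expenditure adjustment model is not specified in the abstract.

```python
# Sketch of the slope computation behind the correction: grade from the
# change in altitude over a travelled segment. The energy-expenditure
# adjustment itself is not specified in the abstract and is not shown.

def grade_percent(altitude_start_m, altitude_end_m, distance_m):
    """Grade (%) = 100 * rise / run over one segment of the course."""
    return 100.0 * (altitude_end_m - altitude_start_m) / distance_m

# 12 m of climb over a 400 m segment corresponds to a 3% grade.
g = grade_percent(100.0, 112.0, 400.0)
```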
Estimates of the national benefits and costs of improving ambient air quality
Brady, G.L.; Bower, B.T.; Lakhani, H.A.
1983-04-01
This paper examines estimates of the national benefits and national costs of ambient air quality improvement in the United States for the period 1970 to 1978. Analysis must be at the micro-level for both receptors of pollution and dischargers of residuals. Section 2 discusses techniques for estimating the national benefits from improving ambient air quality. The literature on national benefits to health (mortality and morbidity) and non-health benefits (avoiding damage to materials, plants, crops, etc.) is critically reviewed in this section. For the period 1970 to 1978, the value of these benefits ranged from about $5 billion to $51 billion, with a point estimate of about $22 billion. The national cost estimates by the Council on Environmental Quality, Bureau of Economic Analysis, and McGraw-Hill are provided in section 3. Cost estimates must include not only end-of-pipe treatment measures, but also the alternatives: changes in product specification, product mix, processes, etc. These types of responses are not generally considered in estimates of national costs. For the period 1970 to 1978, the estimates of the national costs of improving ambient air quality provided in section 3 ranged from $8 to $9 billion in 1978 dollars. Section 4 concludes that the national benefits of improving ambient air quality exceed the national costs for the average and high values of benefits, but not for the low estimates. Section 5 discusses the requirements for establishing a national regional computational framework for estimating national benefits and national costs. 49 references, 2 tables
Liu, Xiaoqiu; Lewis, James J.; Zhang, Hui; Lu, Wei; Zhang, Shun; Zheng, Guilan; Bai, Liqiong; Li, Jun; Li, Xue; Chen, Hongguang; Liu, Mingming; Chen, Rong; Chi, Junying; Lu, Jian; Huan, Shitong; Cheng, Shiming; Wang, Lixia; Jiang, Shiwen; Chin, Daniel P.; Fielding, Katherine L.
2015-01-01
Background Mobile text messaging and medication monitors (medication monitor boxes) have the potential to improve adherence to tuberculosis (TB) treatment and reduce the need for directly observed treatment (DOT), but to our knowledge they have not been properly evaluated in TB patients. We assessed the effectiveness of text messaging and medication monitors to improve medication adherence in TB patients. Methods and Findings In a pragmatic cluster-randomised trial, 36 districts/counties (each with at least 300 active pulmonary TB patients registered in 2009) within the provinces of Heilongjiang, Jiangsu, Hunan, and Chongqing, China, were randomised using stratification and restriction to one of four case-management approaches in which patients received reminders via text messages, a medication monitor, combined, or neither (control). Patients in the intervention arms received reminders to take their drugs and reminders for monthly follow-up visits, and the managing doctor was recommended to switch patients with adherence problems to more intensive management or DOT. In all arms, patients took medications out of a medication monitor box, which recorded when the box was opened, but the box gave reminders only in the medication monitor and combined arms. Patients were followed up for 6 mo. The primary endpoint was the percentage of patient-months on TB treatment where at least 20% of doses were missed as measured by pill count and failure to open the medication monitor box. Secondary endpoints included additional adherence and standard treatment outcome measures. Interventions were not masked to study staff and patients. From 1 June 2011 to 7 March 2012, 4,292 new pulmonary TB patients were enrolled across the 36 clusters. A total of 119 patients (by arm: 33 control, 33 text messaging, 23 medication monitor, 30 combined) withdrew from the study in the first month because they were reassessed as not having TB by their managing doctor (61 patients) or were switched to
Peer Coaches to Improve Diabetes Outcomes in Rural Alabama: A Cluster Randomized Trial.
Safford, Monika M; Andreae, Susan; Cherrington, Andrea L; Martin, Michelle Y; Halanych, Jewell; Lewis, Marquita; Patel, Ashruta; Johnson, Ethel; Clark, Debra; Gamboa, Christopher; Richman, Joshua S
2015-08-01
It is unclear whether peer coaching is effective in minority populations living with diabetes in hard-to-reach, under-resourced areas such as the rural South. We examined the effect of an innovative peer-coaching intervention plus brief education vs brief education alone on diabetes outcomes. This was a community-engaged, cluster-randomized, controlled trial with primary care practices and their surrounding communities serving as clusters. The trial enrolled 424 participants, with 360 completing baseline and follow-up data collection (84.9% retention). The primary outcomes were change in glycated hemoglobin (HbA1c), systolic blood pressure (BP), low density lipoprotein cholesterol (LDL-C), body mass index (BMI), and quality of life, with diabetes distress and patient activation as secondary outcomes. Peer coaches were trained for 2 days in community settings; the training emphasized motivational interviewing skills, diabetes basics, and goal setting. All participants received a 1-hour diabetes education class and a personalized diabetes report card at baseline. Intervention arm participants were also paired with peer coaches; the protocol called for telephone interactions weekly for the first 8 weeks, then monthly for a total of 10 months. Due to real-world constraints, follow-up was protracted, and intervention effects varied over time. The analysis that included the 68% of participants followed up by 15 months showed only a significant increase in patient activation in the intervention group. The analysis that included all participants who eventually completed follow-up revealed that intervention arm participants had significant differences in changes in systolic BP (P = .047), BMI (P = .02), quality of life (P = .003), diabetes distress (P = .004), and patient activation (P = .03), but not in HbA1c (P = .14) or LDL-C (P = .97). Telephone-delivered peer coaching holds promise to improve health for individuals with diabetes living in under-resourced areas. © 2015
Zarchi, Kian; Haugaard, Vibeke B; Dufour, Deirdre N; Jemec, Gregor B E
2015-03-01
Telemedicine is widely considered as an efficient approach to manage the growing problem of chronic wounds. However, to date, there is no convincing evidence to support the clinical efficacy of telemedicine in wound management. In this prospective cluster controlled study, we tested the hypothesis that advice on wound management provided by a team of wound-care specialists through telemedicine would significantly improve the likelihood of wound healing compared with the best available conventional practice. A total of 90 chronic wound patients in home care met all study criteria and were included: 50 in the telemedicine group and 40 in the conventional group. Patients with pressure ulcers, surgical wounds, and cancer wounds were excluded. During the 1-year follow-up, complete wound healing was achieved in 35 patients (70%) in the telemedicine group compared with 18 patients (45%) in the conventional group. After adjusting for important covariates, offering advice on wound management through telemedicine was associated with significantly increased healing compared with the best available conventional practice (telemedicine vs. conventional practice: adjusted hazard ratio 2.19; 95% confidence interval: 1.15-4.17; P=0.017). This study strongly supports the use of telemedicine to connect home-care nurses to a team of wound experts in order to improve the management of chronic wounds.
ASTM clustering for improving coal analysis by near-infrared spectroscopy.
Andrés, J M; Bona, M T
2006-11-15
Multivariate analysis techniques have been applied to near-infrared (NIR) spectra of coals to investigate the relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding predictor variables. In this work, a whole set of coal samples was grouped into six more homogeneous clusters following the ASTM reference method for classification prior to the application of calibration methods to each coal set. The results obtained showed a considerable improvement in the determination error compared with the calibration for the whole sample set. For some groups, the established calibrations approached the quality required by the ASTM/ISO norms for laboratory analysis. To predict property values for a new coal sample, it is necessary to assign that sample to its respective group. Thus, the ability to discriminate and classify coal samples by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) in the NIR range was also studied by applying Soft Independent Modelling of Class Analogy (SIMCA) and Linear Discriminant Analysis (LDA) techniques. Modelling of the groups by SIMCA led to overlapping models that cannot discriminate for unique classification. On the other hand, the application of Linear Discriminant Analysis improved the classification of the samples, but not enough to be satisfactory for every group considered.
Jonsen, Ian
2016-01-01
State-space models provide a powerful way to scale up inference of movement behaviours from individuals to populations when the inference is made across multiple individuals. Here, I show how a joint estimation approach that assumes individuals share identical movement parameters can lead to improved inference of behavioural states associated with different movement processes. I use simulated movement paths with known behavioural states to compare estimation error between nonhierarchical and joint estimation formulations of an otherwise identical state-space model. Behavioural state estimation error was strongly affected by the degree of similarity between movement patterns characterising the behavioural states, with less error when movements were strongly dissimilar between states. The joint estimation model improved behavioural state estimation relative to the nonhierarchical model for simulated data with heavy-tailed Argos location errors. When applied to Argos telemetry datasets from 10 Weddell seals, the nonhierarchical model estimated highly uncertain behavioural state switching probabilities for most individuals whereas the joint estimation model yielded substantially less uncertainty. The joint estimation model better resolved the behavioural state sequences across all seals. Hierarchical or joint estimation models should be the preferred choice for estimating behavioural states from animal movement data, especially when location data are error-prone. PMID:26853261
Improved estimator for non-Gaussianity in cosmic microwave background observations
NASA Astrophysics Data System (ADS)
Smith, Tristan L.; Grin, Daniel; Kamionkowski, Marc
2013-03-01
An improved estimator for the amplitude fNL of local-type non-Gaussianity from the cosmic microwave background (CMB) bispectrum is discussed. The standard estimator is constructed to be optimal in the zero-signal (i.e., Gaussian) limit. When applied to CMB maps which have a detectable level of non-Gaussianity the standard estimator is no longer optimal, possibly limiting the sensitivity of future observations to a non-Gaussian signal. Previous studies have proposed an improved estimator by using a realization-dependent normalization. Under the approximations of a flat sky and a vanishingly thin last-scattering surface, these studies showed that the variance of this improved estimator can be significantly smaller than the variance of the standard estimator when applied to non-Gaussian CMB maps. Here this technique is generalized to the full sky and to include the full radiation transfer function, yielding expressions for the improved estimator that can be directly applied to CMB maps. The ability of this estimator to reduce the variance as compared to the standard estimator in the face of a significant non-Gaussian signal is re-assessed using the full CMB transfer function. As a result of the late time integrated Sachs-Wolfe (ISW) effect, the performance of the improved estimator is degraded. If CMB maps are first cleaned of the late-time ISW effect using a tracer of foreground structure, such as a galaxy survey or a measurement of CMB weak lensing, the new estimator does remove a majority of the excess variance, allowing a higher significance detection of fNL.
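Schematically, the construction described above can be written as follows. This is a simplified form for orientation only; the full-sky expressions in the paper carry the complete radiation transfer function and inverse-covariance filtering.

```latex
% Standard estimator: a cubic (bispectrum-based) statistic E[a] of the
% filtered CMB map a, divided by a fixed normalization N chosen so that
% <f_NL_hat> = f_NL in the Gaussian (zero-signal) limit:
\hat{f}_{\rm NL}^{\rm std} = \frac{\mathcal{E}[a]}{N},
\qquad
N = \left\langle \mathcal{E}[a] \right\rangle_{f_{\rm NL}=1}.
% Improved estimator: the same cubic statistic, but normalized by a
% realization-dependent quantity N[a] evaluated on the observed map,
% which reduces the variance when a non-Gaussian signal is present:
\hat{f}_{\rm NL}^{\rm imp} = \frac{\mathcal{E}[a]}{N[a]}.
```

Here \(\mathcal{E}[a]\) denotes the cubic statistic built from the inverse-variance-filtered map; the realization-dependent normalization \(N[a]\) is what distinguishes the improved estimator.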
Evaluation of an intervention to improve blood culture practices: a cluster randomised trial.
Pavese, P; Maillet, M; Vitrat-Hincky, V; Recule, C; Vittoz, J-P; Guyomard, A; Seigneurin, A; François, P
2014-12-01
This study aimed to evaluate an intervention to improve blood culture practices. A cluster randomised trial in two parallel groups was performed at the Grenoble University Hospital, France. In October 2009, the results of a practices audit and the guidelines for the optimal use of blood cultures were disseminated to clinical departments. We compared two types of information dissemination: simple presentation or presentation associated with an infectious diseases (ID) specialist intervention. The principal endpoint was blood culture performance measured by the rate of patients having one positive blood culture and the rate of positive blood cultures. The cases of 130 patients in the "ID" group and 119 patients in the "simple presentation" group were audited during the second audit in April 2010. The rate of patients with one positive blood culture increased in both groups (13.62 % vs 9.89 % for the ID group, p = 0.002, 15.90 % vs 13.47 % for the simple presentation group, p = 0.009). The rate of positive blood cultures improved in both groups (6.68 % vs 5.96 % for the ID group, p = 0.003, 6.52 % vs 6.21 % for the simple presentation group, p = 0.017). The blood culture indication was significantly less often specified in the request form in the simple presentation group, while it remained stable in the ID group (p = 0.04). The rate of positive blood cultures and the rate of patients having one positive blood culture improved in both groups. The ID specialist intervention did not have more of an impact on practices than a simple presentation of audit feedback and guidelines.
Vellinga, Akke; Galvin, Sandra; Duane, Sinead; Callan, Aoife; Bennett, Kathleen; Cormican, Martin; Domegan, Christine; Murphy, Andrew W
2016-02-02
Overuse of antimicrobial therapy in the community adds to the global spread of antimicrobial resistance, which is jeopardizing the treatment of common infections. We designed a cluster randomized complex intervention to improve antimicrobial prescribing for urinary tract infection in Irish general practice. During a 3-month baseline period, all practices received a workshop to promote consultation coding for urinary tract infections. Practices in intervention arms A and B received a second workshop with information on antimicrobial prescribing guidelines and a practice audit report (baseline data). Practices in intervention arm B received additional evidence on delayed prescribing of antimicrobials for suspected urinary tract infection. A reminder integrated into the patient management software suggested first-line treatment and, for practices in arm B, delayed prescribing. Over the 6-month intervention, practices in arms A and B received monthly audit reports of antimicrobial prescribing. The proportion of antimicrobial prescribing according to guidelines for urinary tract infection increased in arms A and B relative to control (adjusted overall odds ratio [OR] 2.3, 95% confidence interval [CI] 1.7 to 3.2; arm A adjusted OR 2.7, 95% CI 1.8 to 4.1; arm B adjusted OR 2.0, 95% CI 1.3 to 3.0). An unintended increase in antimicrobial prescribing was observed in the intervention arms relative to control (arm A adjusted OR 2.2, 95% CI 1.2 to 4.0; arm B adjusted OR 1.4, 95% CI 0.9 to 2.1). Improvements in guideline-based prescribing were sustained at 5 months after the intervention. A complex intervention, including audit reports and reminders, improved the quality of prescribing for urinary tract infection in Irish general practice. ClinicalTrials.gov, no. NCT01913860. © 2016 Canadian Medical Association or its licensors.
Using Local Matching to Improve Estimates of Program Impact: Evidence from Project STAR
ERIC Educational Resources Information Center
Jones, Nathan; Steiner, Peter; Cook, Tom
2011-01-01
In this study the authors test whether matching using intact local groups improves causal estimates over those produced using propensity score matching at the student level. Like the recent analysis of Wilde and Hollister (2007), they draw on data from Project STAR to estimate the effect of small class sizes on student achievement. They propose a…
2017-03-01
VA CONSTRUCTION: Improved Processes Needed to Monitor Contract Modifications, Develop Schedules, and Estimate Costs. What GAO Found: The Department of Veterans Affairs (VA) has taken steps to
Nguyen, Vinh Hao; Suh, Young Soo
2007-01-01
This paper is concerned with improving the performance of a state estimation problem over a network in which a send-on-delta (SOD) transmission method is used. The SOD method requires that a sensor node transmit data to the estimator node only if its measurement changes by more than a specified value δ. This method has been explored and applied by researchers because of its efficiency in saving network bandwidth. However, when it is used, the estimator node is not guaranteed to receive data from the sensor nodes at every estimation period. We therefore propose a method to reduce estimation error when no sensor data are received. When the estimator node does not receive data from sensor node i, the sensor value is known to lie within an interval (−δi, +δi) around the last transmitted sensor value. This implicit information has been used to improve estimation performance in previous studies. The main contribution of this paper is an algorithm in which the sensor value interval is reduced to (−δi/2, +δi/2) in certain situations. The proposed algorithm thus improves overall estimation performance without any changes to the send-on-delta algorithms of the sensor nodes. Through numerical simulations, we demonstrate the feasibility and usefulness of the proposed method.
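The transmission rule itself is simple to sketch. Below is a minimal, self-contained illustration (not the paper's Kalman-filter formulation; the ramp signal and the δ value are arbitrary choices): a sensor transmits only when the measurement moves more than δ from the last sent value, and a zero-order-hold estimator relies on the implicit guarantee that, between transmissions, the true value stays within ±δ of the last received sample.

```python
import numpy as np

def send_on_delta(signal, delta):
    """Transmit a sample only when it deviates from the last sent value
    by more than delta. Returns a list of (index, value) transmissions."""
    sent = [(0, signal[0])]
    last = signal[0]
    for k in range(1, len(signal)):
        if abs(signal[k] - last) > delta:
            last = signal[k]
            sent.append((k, last))
    return sent

def reconstruct(sent, n):
    """Estimator side: hold the last received value. Between transmissions
    the true value is known to lie within +/- delta of this hold value."""
    est = np.empty(n)
    idx = 0
    for k in range(n):
        if idx + 1 < len(sent) and sent[idx + 1][0] <= k:
            idx += 1
        est[k] = sent[idx][1]
    return est

signal = np.arange(100, dtype=float)   # a steadily drifting measurement
delta = 5.5
sent = send_on_delta(signal, delta)
est = reconstruct(sent, len(signal))
```

With this drifting signal only 17 of the 100 samples are transmitted, and the reconstruction error never exceeds δ, which is the bound the implicit-information argument starts from.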
"Battleship Numberline": A Digital Game for Improving Estimation Accuracy on Fraction Number Lines
ERIC Educational Resources Information Center
Lomas, Derek; Ching, Dixie; Stampfer, Eliane; Sandoval, Melanie; Koedinger, Ken
2011-01-01
Given the strong relationship between number line estimation accuracy and math achievement, might a computer-based number line game help improve math achievement? In one study by Rittle-Johnson, Siegler and Alibali (2001), a simple digital game called "Catch the Monster" provided practice in estimating the location of decimals on a…
Müller, Dominik; Technow, Frank; Melchinger, Albrecht E
2015-04-01
We evaluated several methods for computing shrinkage estimates of the genomic relationship matrix and demonstrated their potential to enhance the reliability of genomic estimated breeding values of training set individuals. In genomic prediction in plant breeding, the training set constitutes a large fraction of the total number of genotypes assayed and is itself subject to selection. The objective of our study was to investigate whether genomic estimated breeding values (GEBVs) of individuals in the training set can be enhanced by shrinkage estimation of the genomic relationship matrix. We simulated two different population types: a diversity panel of unrelated individuals and a biparental family of doubled haploid lines. For different training set sizes (50, 100, 200), number of markers (50, 100, 200, 500, 2,500) and heritabilities (0.25, 0.5, 0.75), shrinkage coefficients were computed by four different methods. Two of these methods are novel and based on measures of LD, the other two were previously described in the literature, one of which was extended by us. Our results showed that shrinkage estimation of the genomic relationship matrix can significantly improve the reliability of the GEBVs of training set individuals, especially for a low number of markers. We demonstrate that the number of markers is the primary determinant of the optimum shrinkage coefficient maximizing the reliability and we recommend methods eligible for routine usage in practical applications.
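As a rough illustration of why shrinking the relationship matrix stabilises the computation, here is a sketch. It is hypothetical: it builds a VanRaden-style G and applies a single linear shrinkage toward the identity with an arbitrary coefficient, which is not one of the four shrinkage methods evaluated in the study.

```python
import numpy as np

def vanraden_grm(M):
    """VanRaden-style genomic relationship matrix from an
    (individuals x markers) matrix of 0/1/2 genotype codes."""
    p = M.mean(axis=0) / 2.0              # marker allele frequencies
    Z = M - 2.0 * p                       # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def shrink_grm(G, lam):
    """Linear shrinkage toward the identity: G* = (1 - lam) G + lam I."""
    return (1.0 - lam) * G + lam * np.eye(G.shape[0])

rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(20, 100)).astype(float)  # 20 lines, 100 markers
G = vanraden_grm(M)
Gs = shrink_grm(G, lam=0.3)               # lam = 0.3 is illustrative only
```

Because column-centring puts a zero eigenvalue into G, the unshrunk matrix is singular; after shrinkage every eigenvalue is at least lam, so mixed-model equations built on the shrunk matrix are far better conditioned, which is the practical effect behind the improved GEBV reliabilities.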
NASA Astrophysics Data System (ADS)
Tang, Ronglin; Li, Zhao-Liang
2017-03-01
Evapotranspiration (ET) is a primary mechanism for water and heat transfer between the land and the atmosphere. One approach to estimating ET is from instantaneous remotely sensed data. The constant evaporative fraction (EF) method is then usually used to estimate integrated daily fluxes, which are typically underestimated. Here we present a theoretical improvement to the conventional EF. The improved EF is shown to be robust and superior to the conventional approach, and it significantly reduces the underestimation bias.
2014-03-27
Direct Emissivity Measurements of Painted Metals for Improved Temperature Estimation During Laser Damage Testing. Thesis, Sean M. Baumann, Civilian. ...radiance measurement, and fitted spectral radiance results, of one pixel on the back surface of a painted metal sample, far from laser burn-through hole...
NASA Astrophysics Data System (ADS)
Troiani, Francesco; Piacentini, Daniela; Della Seta, Marta
2016-04-01
analysis conducted on 52 clusters of high and very high Gi* values indicates that mass movement of slope material represents the dominant process producing over-steepened long-profiles along connected streams, whereas litho-structure accounts for the main anomalies along disconnected streams. Tectonic structures generally correspond to the largest clusters. Our results demonstrate that SL-HCA maps have the same potential as lithologically-filtered SL maps for detecting knickzones due to hillslope processes and/or tectonic structures. The reduced-complexity model derived from the SL-HCA approach greatly improves the readability of the morphometric outcomes, and thus the interpretation at a regional scale of the geological-geomorphological meaning of over-steepened segments of long-profiles. SL-HCA maps are useful for investigating and better interpreting knickzones within regions poorly covered by geological data and where field surveys are difficult to perform.
Frühwirth, Rudolf; Mani, D R; Pyne, Saumyadipta
2011-08-31
Clustering is a widely applicable pattern recognition method for discovering groups of similar observations in data. While there is a large variety of clustering algorithms, very few can enforce constraints on the variation of attributes for data points included in a given cluster. In particular, a clustering algorithm that can limit variation within a cluster according to that cluster's position (centroid location) can produce effective and optimal results in many important applications, ranging from clustering of silicon pixels or calorimeter cells in high-energy physics to label-free liquid chromatography based mass spectrometry (LC-MS) data analysis in proteomics and metabolomics. We present MEDEA (M-Estimator with DEterministic Annealing), a new M-estimator-based unsupervised algorithm designed to enforce position-specific constraints on variance during the clustering process. The utility of MEDEA is demonstrated by applying it to the problem of "peak matching", identifying the common LC-MS peaks across multiple samples, in proteomic biomarker discovery. Using real-life datasets, we show that MEDEA not only outperforms current state-of-the-art model-based clustering methods, but also results in an implementation that is significantly more efficient, and hence applicable to much larger LC-MS data sets. MEDEA is an effective and efficient solution to the problem of peak matching in label-free LC-MS data. The program implementing the MEDEA algorithm, including datasets, clustering results, and supplementary information, is available from the author website at http://www.hephy.at/user/fru/medea/.
NASA Astrophysics Data System (ADS)
Yeck, William L.; Block, Lisa V.; Wood, Christopher K.; King, Vanessa M.
2015-01-01
The Paradox Valley Unit (PVU), a salinity control project in southwest Colorado, disposes of brine in a single deep injection well. Since the initiation of injection at the PVU in 1991, earthquakes have been repeatedly induced. PVU closely monitors all seismicity in the Paradox Valley region with a dense surface seismic network. A key factor for understanding the seismic hazard from PVU injection is the maximum magnitude earthquake that can be induced. The estimate of maximum magnitude of induced earthquakes is difficult to constrain as, unlike naturally occurring earthquakes, the maximum magnitude of induced earthquakes changes over time and is affected by injection parameters. We investigate temporal variations in maximum magnitudes of induced earthquakes at the PVU using two methods. First, we consider the relationship between the total cumulative injected volume and the history of observed largest earthquakes at the PVU. Second, we explore the relationship between maximum magnitude and the geometry of individual seismicity clusters. Under the assumptions that: (i) elevated pore pressures must be distributed over an entire fault surface to initiate rupture and (ii) the location of induced events delineates volumes of sufficiently high pore-pressure to induce rupture, we calculate the largest allowable vertical penny-shaped faults, and investigate the potential earthquake magnitudes represented by their rupture. Results from both the injection volume and geometrical methods suggest that the PVU has the potential to induce events up to roughly MW 5 in the region directly surrounding the well; however, the largest observed earthquake to date has been about a magnitude unit smaller than this predicted maximum. In the seismicity cluster surrounding the injection well, the maximum potential earthquake size estimated by these methods and the observed maximum magnitudes have remained steady since the mid-2000s. These observations suggest that either these methods
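The geometrical method maps a cluster dimension to a magnitude through two standard relations: Eshelby's seismic moment for a circular (penny-shaped) rupture, M0 = (16/7) Δσ r³, and the Hanks-Kanamori moment magnitude. The sketch below is illustrative only; the 2 km radius and 3 MPa stress drop are assumed values, not figures taken from the study.

```python
import math

def moment_penny_shaped(radius_m, stress_drop_pa=3.0e6):
    """Seismic moment (N m) of a circular rupture of radius r and stress
    drop delta-sigma: M0 = (16/7) * delta_sigma * r**3 (Eshelby's solution)."""
    return (16.0 / 7.0) * stress_drop_pa * radius_m ** 3

def moment_magnitude(m0_nm):
    """Hanks-Kanamori moment magnitude from seismic moment in N m."""
    return (2.0 / 3.0) * (math.log10(m0_nm) - 9.1)

# Assumed numbers: a 2 km radius fault patch with a 3 MPa stress drop.
mw = moment_magnitude(moment_penny_shaped(2000.0))
print(round(mw, 2))   # roughly Mw 5
```

A kilometre-scale pressurised patch with a typical stress drop lands near Mw 5, the same order as the maximum the abstract reports for the region around the well.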
Systems analysis and improvement to optimize pMTCT (SAIA): a cluster randomized trial
2014-01-01
Background Despite significant increases in global health investment and the availability of low-cost, efficacious interventions to prevent mother-to-child HIV transmission (pMTCT) in low- and middle-income countries with high HIV burden, the translation of scientific advances into effective delivery strategies has been slow, uneven and incomplete. As a result, pediatric HIV infection remains largely uncontrolled. A five-step, facility-level systems analysis and improvement intervention (SAIA) was designed to maximize effectiveness of pMTCT service provision by improving understanding of inefficiencies (step one: cascade analysis), guiding identification and prioritization of low-cost workflow modifications (step two: value stream mapping), and iteratively testing and redesigning these modifications (steps three through five). This protocol describes the SAIA intervention and methods to evaluate the intervention’s impact on reducing drop-offs along the pMTCT cascade. Methods This study employs a two-arm, longitudinal cluster randomized trial design. The unit of randomization is the health facility. A total of 90 facilities were identified in Côte d’Ivoire, Kenya and Mozambique (30 per country). A subset was randomly selected and assigned to intervention and comparison arms, stratified by country and service volume, resulting in 18 intervention and 18 comparison facilities across all three countries, with six intervention and six comparison facilities per country. The SAIA intervention will be implemented for six months in the 18 intervention facilities. Primary trial outcomes are designed to assess improvements in the pMTCT service cascade, and include the percentage of pregnant women being tested for HIV at the first antenatal care visit, the percentage of HIV-infected pregnant women receiving adequate prophylaxis or combination antiretroviral therapy in pregnancy, and the percentage of newborns exposed to HIV in pregnancy receiving an HIV diagnosis eight
Improved SISO MMSE Detection for Joint Coded-Precoded OFDM under Imperfect Channel Estimation
NASA Astrophysics Data System (ADS)
Zhang, Guomei; Zhu, Shihua; Li, Feng; Ren, Pinyi
An improved soft-input soft-output (SISO) minimum mean-squared error (MMSE) detection method is proposed for joint coded-precoded OFDM systems under imperfect channel estimation. Compared with the traditional mismatched detection, which treats the channel estimate as its exact value, the signal model of the proposed detector is more accurate and the influence of channel estimation error (CEE) can be effectively mitigated. Simulations indicate that the proposed scheme can improve the bit error rate (BER) performance with fewer pilot symbols.
ERIC Educational Resources Information Center
Maskiewicz, April Cordero; Griscom, Heather Peckham; Welch, Nicole Turrill
2012-01-01
In this study, we used targeted active-learning activities to help students improve their ways of reasoning about carbon flow in ecosystems. The results of a validated ecology conceptual inventory (diagnostic question clusters [DQCs]) provided us with information about students' understanding of and reasoning about transformation of inorganic and…
Tipton, Elizabeth
2013-04-01
An important question in the design of experiments is how to ensure that the findings from the experiment are generalizable to a larger population. This concern with generalizability is particularly important when treatment effects are heterogeneous and when selecting units into the experiment by random sampling is not possible, two conditions commonly met in large-scale educational experiments. This article introduces a model-based balanced-sampling framework for improving generalizations, with a focus on developing methods that are robust to model misspecification. Additionally, the article provides a new method for sample selection within this framework: first, units in an inference population are divided into relatively homogeneous strata using cluster analysis, and then the sample is selected using distance rankings. In order to demonstrate and evaluate the method, a reanalysis of a completed experiment is conducted. This example compares samples selected using the new method with the actual sample used in the experiment. Results indicate that even under high nonresponse, balance is better on most covariates and fewer coverage errors result. The article concludes with a discussion of additional benefits and limitations of the method.
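A minimal version of the two-step selection (cluster units into strata, then rank by distance to the stratum centroid) can be sketched as follows. The tiny k-means routine, its farthest-point initialisation, and the toy three-group population are illustrative assumptions, not the article's exact procedure.

```python
import numpy as np

def farthest_point_centers(X, k):
    """Deterministic farthest-point initialisation for the strata centroids."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = ((X[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1).min(1)
        centers.append(X[int(np.argmax(d))])
    return np.array(centers)

def stratified_balanced_sample(X, k, per_stratum=2, iters=25):
    """Divide units into k strata with Lloyd's k-means, then within each
    stratum rank units by distance to the centroid and take the closest."""
    centers = farthest_point_centers(X, k)
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    chosen = []
    for j in range(k):
        idx = np.where(labels == j)[0]
        d = ((X[idx] - centers[j]) ** 2).sum(-1)
        chosen.extend(idx[np.argsort(d)[:per_stratum]].tolist())
    return sorted(chosen)

# Toy population: three well-separated groups of 20 units each.
rng = np.random.default_rng(0)
blobs = [np.array([0.0, 0.0]), np.array([10.0, 0.0]), np.array([0.0, 10.0])]
X = np.vstack([c + 0.5 * rng.standard_normal((20, 2)) for c in blobs])
sample = stratified_balanced_sample(X, k=3)
```

With per_stratum=2 the selected sample contains two units from each stratum, so its covariate profile mirrors the population's, which is the sense in which the sample is "balanced".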
Evaluating and improving the cluster variation method entropy functional for Ising alloys
NASA Astrophysics Data System (ADS)
Ferreira, Luiz G.; Wolverton, C.; Zunger, Alex
1998-02-01
The success of the "cluster variation method" (CVM) in reproducing quite accurately the free energies of Monte Carlo (MC) calculations on Ising models is explained by identifying a cancellation of errors: we show that the CVM produces correlation functions that are too close to zero, which leads to an overestimation of the exact energy E and, at the same time, to an underestimation of -TS, so the free energy F = E - TS is more accurate than either of its parts. This insight explains a problem with "hybrid methods" that use MC correlation functions in the CVM entropy expression: they give exact energies E but do not give significantly improved -TS relative to CVM, so they do not benefit from the above-noted cancellation of errors. Additionally, hybrid methods suffer from the difficulty of adequately accounting for both ordered and disordered phases in a consistent way. A different technique, the "entropic Monte Carlo" (EMC), is shown here to provide a means for critically evaluating the CVM entropy. Inspired by EMC results, we find a universal and simple correction to the CVM entropy which produces individual components of the free energy with MC accuracy, but is computationally much less expensive than either MC thermodynamic integration or EMC.
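The cancellation argument can be stated compactly (the δ notation here is assumed for illustration, not taken from the paper): writing δE > 0 for the CVM's overestimate of the energy and δ(-TS) < 0 for its underestimate of -TS,

```latex
\delta F \;=\; \underbrace{\delta E}_{>\,0} \;+\; \underbrace{\delta(-TS)}_{<\,0},
\qquad
|\delta F| \;<\; \max\bigl(\delta E,\; |\delta(-TS)|\bigr),
```

so the free-energy error is smaller than the larger of its two parts. A hybrid method that forces δE = 0 while leaving δ(-TS) essentially unchanged forfeits exactly this cancellation, which is why its free energies are no better than the CVM's.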
Ferri, Marica; De Luca, Assunta; Rossi, Paolo Giorgi; Lori, Giuliano; Guasticchi, Gabriella
2005-01-01
Background Early interventions have proved able to improve prognosis in acute stroke patients. Prompt identification of symptoms and organised, timely and efficient transportation towards appropriate facilities become an essential part of effective treatment. The implementation of an evidence-based pre-hospital stroke care pathway may be a method for achieving the organizational standards required to provide appropriate care. We performed a systematic search for studies evaluating the effect of pre-hospital and emergency interventions for suspected stroke patients and found only a few studies in the emergency field and none about the implementation of clinical pathways. We will test the hypothesis that the adoption of an emergency clinical pathway improves early diagnosis and referral in suspected stroke patients. We designed a cluster randomised controlled trial (C-RCT), the most powerful study design to assess the impact of complex interventions. The study was registered in the Current Controlled Trials Register: ISRCTN41456865 – Implementation of pre-hospital emergency pathway for stroke – a cluster randomised trial. Methods/design Two-arm cluster-randomised trial (C-RCT). 16 emergency services and 14 emergency rooms were randomised either to arm 1 (comprising a training module and administration of the guideline) or to arm 2 (no intervention, current practice). Arm 1 participants (152 physicians, 280 nurses, 50 drivers) attended an interactive two-session course, with continuous medical education (CME) credits, on the contents of the clinical pathway. We estimated that around 750 patients will be seen by the services in the 6 months of observation. This duration allows recruiting a sample of patients sufficient to observe a 30% improvement in the proportion of appropriate diagnoses. Data collection will be performed using current information systems. Process outcomes will be measured at the cluster level six months after the intervention. We will
Estimates of the national benefits and costs of improving ambient air quality
Brady, G.L.; Bower, B.T.; Lakhani, H.A.
1983-04-01
This paper examines estimates of the national benefits and national costs of ambient air quality improvement in the US for the period 1970 to 1978. Analysis must be at the micro-level for both the receptors of pollution and the dischargers of residuals. Section 2 discusses techniques for estimating the national benefits from improving ambient air quality. The literature on national benefits to health (mortality and morbidity) and non-health (avoiding damages to materials, plants, crops, etc.) is critically reviewed in this section. For the period 1970 to 1978, the value of these benefits ranged from about $5 billion to $51 billion, with a point estimate of about $22 billion. The national cost estimates by the Council on Environmental Quality, the Bureau of Economic Analysis, and McGraw-Hill are provided in section 3. Cost estimates must include not only end-of-pipe treatment measures but also the alternatives: changes in product specification, product mix, processes, etc. These types of responses are not generally considered in estimates of national costs, which ranged from $8 to $9 billion in 1978 dollars. Section 4 concludes that the national benefits of improving ambient air quality exceed the national costs for the average and high values of benefits, but not for the low estimates.
NASA Astrophysics Data System (ADS)
Gholizadeh, Hamed
2013-09-01
Hyperspectral remote sensing is capable of providing large numbers of spectral bands. The vast data volume presents challenging problems for information processing, such as a heavy computational burden. In this paper, the impact of dimension reduction on hyperspectral data clustering is investigated from two viewpoints: 1) computational complexity; and 2) clustering performance. Clustering is one of the most useful tasks in the data mining process, so investigating the impact of dimension reduction on hyperspectral data clustering is justifiable. The proposed approach is based on thresholding the band correlation matrix and selecting the least correlated bands. The selected bands are then used to cluster the hyperspectral image. Experimental results on real-world hyperspectral remote sensing data show that the proposed approach decreases computational complexity and leads to better clustering results. For evaluating clustering performance, the Calinski-Harabasz, Davies-Bouldin and Krzanowski-Lai indices are used. These indices evaluate the clustering results using quantities and features inherent in the dataset; in other words, they do not need any external information.
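The band-selection step described above reduces, in essence, to one greedy pass over the band-correlation matrix. Here is a minimal sketch; the threshold value and the synthetic two-group cube are assumptions made for illustration, not the paper's data or tuning.

```python
import numpy as np

def select_bands(cube, threshold=0.95):
    """Greedily keep a band only if its absolute correlation with every
    previously selected band is below `threshold`.
    cube: (rows, cols, bands) hyperspectral image."""
    n_bands = cube.shape[2]
    X = cube.reshape(-1, n_bands)                  # pixels x bands
    C = np.abs(np.corrcoef(X, rowvar=False))       # band-by-band correlations
    selected = [0]
    for b in range(1, n_bands):
        if all(C[b, s] < threshold for s in selected):
            selected.append(b)
    return selected

# Synthetic cube: bands 0-4 are noisy copies of one signal, bands 5-9 of another.
rng = np.random.default_rng(1)
a, b = rng.standard_normal(2500), rng.standard_normal(2500)
bands = [a + 0.01 * rng.standard_normal(2500) for _ in range(5)]
bands += [b + 0.01 * rng.standard_normal(2500) for _ in range(5)]
cube = np.stack(bands, axis=-1).reshape(50, 50, 10)
print(select_bands(cube))   # one representative band per correlated group
```

Clustering then runs on the few retained bands instead of the full set, which is the source of the computational saving the paper quantifies on real imagery.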
Ebrahimian, Hossein; Jalayer, Fatemeh
2017-08-29
In the immediate aftermath of a strong earthquake, in the presence of an ongoing aftershock sequence, scientific advisories in the form of seismicity forecasts play a crucial role in emergency decision-making and risk mitigation. Epidemic Type Aftershock Sequence (ETAS) models are frequently used for forecasting the spatio-temporal evolution of seismicity in the short term. We propose robust forecasting of seismicity based on the ETAS model, exploiting the link between Bayesian inference and Markov chain Monte Carlo simulation. The methodology considers the uncertainty not only in the model parameters, conditioned on the catalogue of events that occurred before the forecasting interval, but also in the sequence of events that are going to happen during the forecasting interval. We demonstrate the methodology by retrospective early forecasting of seismicity associated with the 2016 Amatrice seismic sequence in central Italy. We provide robust spatio-temporal short-term seismicity forecasts over various time intervals in the first few days elapsed after each of the three main events within the sequence, which predict the seismicity to within plus or minus two standard deviations of the mean estimate within the few hours elapsed after the main event.
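For reference, the temporal part of the ETAS conditional intensity on which such forecasts are built can be written in a few lines. The parameter values below are generic illustrations, not the posterior estimates for the Amatrice sequence.

```python
import numpy as np

def etas_intensity(t, event_times, event_mags, mu=0.2, K=0.05,
                   alpha=1.5, c=0.01, p=1.1, m0=3.0):
    """Temporal ETAS conditional intensity lambda(t), in events per day:
    a background rate mu plus Omori-Utsu aftershock triggering from each
    past event, scaled by its magnitude above the reference m0."""
    past = event_times < t
    trig = (K * np.exp(alpha * (event_mags[past] - m0))
            / (t - event_times[past] + c) ** p)
    return mu + trig.sum()

times = np.array([0.0])    # one mainshock at t = 0 days
mags = np.array([6.0])
lam_early = etas_intensity(0.05, times, mags)   # shortly after the mainshock
lam_late = etas_intensity(5.0, times, mags)     # five days later
```

In the Bayesian scheme described above, this intensity would be evaluated under many posterior draws of (mu, K, alpha, c, p) and many simulated continuations of the catalogue, rather than at a single parameter vector, which is what makes the forecast "robust".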
2010-01-01
Background Improving nutrition knowledge among children may help them to make healthier food choices. The aim of this study was to assess the effectiveness and acceptability of a novel educational intervention to increase nutrition knowledge among primary school children. Methods We developed a card game 'Top Grub' and a 'healthy eating' curriculum for use in primary schools. Thirty-eight state primary schools comprising 2519 children in years 5 and 6 (aged 9-11 years) were recruited in a pragmatic cluster randomised controlled trial. The main outcome measures were change in nutrition knowledge scores, attitudes to healthy eating and acceptability of the intervention by children and teachers. Results Twelve intervention and 13 control schools (comprising 1133 children) completed the trial. The main reason for non-completion was time pressure of the school curriculum. Mean total nutrition knowledge score increased by 1.1 in intervention (baseline to follow-up: 28.3 to 29.2) and 0.3 in control schools (27.3 to 27.6). Total nutrition knowledge score at follow-up, adjusted for baseline score, deprivation, and school size, was higher in intervention than in control schools (mean difference = 1.1; 95% CI: 0.05 to 2.16; p = 0.042). At follow-up, more children in the intervention schools said they 'are currently eating a healthy diet' (39.6%) or 'would try to eat a healthy diet' (35.7%) than in control schools (34.4% and 31.7% respectively; chi-square test p < 0.001). Most children (75.5%) enjoyed playing the game and teachers considered it a useful resource. Conclusions The 'Top Grub' card game facilitated the enjoyable delivery of nutrition education in a sample of UK primary school age children. Further studies should determine whether improvements in nutrition knowledge are sustained and lead to changes in dietary behaviour. PMID:20219104
Lakshman, Rajalakshmi R; Sharp, Stephen J; Ong, Ken K; Forouhi, Nita G
2010-03-10
Improving nutrition knowledge among children may help them to make healthier food choices. The aim of this study was to assess the effectiveness and acceptability of a novel educational intervention to increase nutrition knowledge among primary school children. We developed a card game 'Top Grub' and a 'healthy eating' curriculum for use in primary schools. Thirty-eight state primary schools comprising 2519 children in years 5 and 6 (aged 9-11 years) were recruited in a pragmatic cluster randomised controlled trial. The main outcome measures were change in nutrition knowledge scores, attitudes to healthy eating and acceptability of the intervention by children and teachers. Twelve intervention and 13 control schools (comprising 1133 children) completed the trial. The main reason for non-completion was time pressure of the school curriculum. Mean total nutrition knowledge score increased by 1.1 in intervention (baseline to follow-up: 28.3 to 29.2) and 0.3 in control schools (27.3 to 27.6). Total nutrition knowledge score at follow-up, adjusted for baseline score, deprivation, and school size, was higher in intervention than in control schools (mean difference = 1.1; 95% CI: 0.05 to 2.16; p = 0.042). At follow-up, more children in the intervention schools said they 'are currently eating a healthy diet' (39.6%) or 'would try to eat a healthy diet' (35.7%) than in control schools (34.4% and 31.7% respectively; chi-square test p < 0.001). Most children (75.5%) enjoyed playing the game and teachers considered it a useful resource. The 'Top Grub' card game facilitated the enjoyable delivery of nutrition education in a sample of UK primary school age children. Further studies should determine whether improvements in nutrition knowledge are sustained and lead to changes in dietary behaviour.
Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness
2015-01-01
Background Estimating the number of different species (richness) in a mixed microbial population has been a main focus in metagenomic research. Existing methods of species richness estimation ride on the assumption that the reads in each assembled contig correspond to only one of the microbial genomes in the population. This assumption and the underlying probabilistic formulations of existing methods are not useful for quasispecies populations where the strains are highly genetically related. The lack of knowledge on the number of different strains in a quasispecies population is observed to hinder the precision of existing Viral Quasispecies Spectrum Reconstruction (QSR) methods due to the uncontrolled reconstruction of a large number of in silico false positives. In this work, we formulated a novel probabilistic method for strain richness estimation specifically targeting viral quasispecies. By using this approach we improved our recently proposed spectrum reconstruction pipeline ViQuaS to achieve higher levels of precision in reconstructed quasispecies spectra without compromising the recall rates. We also discuss how one other existing popular QSR method named ShoRAH can be improved using this new approach. Results On benchmark data sets, our estimation method provided accurate richness estimates (< 0.2 median estimation error) and improved the precision of ViQuaS by 2%-13% and F-score by 1%-9% without compromising the recall rates. We also demonstrate that our estimation method can be used to improve the precision and F-score of ShoRAH by 0%-7% and 0%-5% respectively. Conclusions The proposed probabilistic estimation method can be used to estimate the richness of viral populations with a quasispecies behavior and to improve the accuracy of the quasispecies spectra reconstructed by the existing methods ViQuaS and ShoRAH in the presence of a moderate level of technical sequencing errors. Availability http://sourceforge.net/projects/viquas/ PMID:26678073
NASA Astrophysics Data System (ADS)
Wu, Tin-Yu; Chang, Tse; Chu, Teng-Hao
2017-02-01
Many data mining applications adopt Artificial Neural Networks (ANNs), but training an ANN raises many issues, such as the number of labelled samples, training time and performance, the number of hidden layers, and the choice of transfer function. If the compared results do not match expectations, it is not clear which dimension caused the deviation. The main reason is that an ANN adjusts its results by modifying weights; the training does not improve the original image feature extraction algorithm but tends to obtain the correct value by re-weighting the result. To address these problems, this paper proposes a method to assist ANN-based image data analysis. Normally, a parameter is set to control the extraction of feature vectors from an image; we regard this value as a weight. The experiment uses values extracted at Speeded Up Robust Features (SURF) keypoints as the training basis, since SURF can extract different feature points depending on the extraction value. We first perform semi-supervised clustering on these values and use Modified Fuzzy K-Nearest Neighbors (MFKNN) for training and classification. Unknown images are not matched by exhaustive one-to-one comparison but only against group centroids, mainly to save effort and speed up retrieval; the retrieved results are then observed and analyzed. The core of the method is to cluster and classify using the nature of image feature points, assign values to groups with high error rates to produce new feature points, feed them into the input layer of the ANN for training, and finally carry out a comparative analysis with a Back-Propagation Neural Network (BPN) of a Genetic Algorithm-Artificial Neural Network.
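The centroid-matching idea described above can be sketched as follows. The 64-dimensional descriptor vectors, cluster count, and random data below are hypothetical stand-ins (real SURF descriptors would come from an image library); the sketch only illustrates comparing a query against the k group centroids instead of one-to-one against every training sample.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 64-d descriptor vectors standing in for SURF keypoint
# features extracted from the training images.
rng = np.random.default_rng(3)
train_desc = rng.normal(size=(300, 64))

# Initial clustering of the training descriptors: unknown images are
# later matched only against the k group centroids, not against all
# 300 training samples.
km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(train_desc)

def match_to_centroids(desc):
    """Assign each query descriptor to its nearest group centroid
    (k distance computations per descriptor instead of 300)."""
    d = np.linalg.norm(desc[:, None, :] - km.cluster_centers_[None, :, :],
                       axis=2)
    return d.argmin(axis=1)

query_desc = rng.normal(size=(10, 64))
groups = match_to_centroids(query_desc)
```

Groups with high error rates could then be re-seeded with new feature points and fed to the ANN input layer, as the abstract describes.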
Improved method for estimating tree crown diameter using high-resolution airborne data
NASA Astrophysics Data System (ADS)
Brovkina, Olga; Latypov, Iscander Sh.; Cienciala, Emil; Fabianek, Tomas
2016-04-01
Automatic mapping of tree crown size (radius, diameter, or width) from remote sensing can provide a major benefit for practical and scientific purposes, but requires the development of accurate methods. This study presents an improved method for average tree crown diameter estimation at the forest plot level from high-resolution airborne data. The improved method combines a window binarization procedure with a granulometric algorithm, and avoids the complicated crown delineation procedure currently used to estimate crown size. The systematic error in average crown diameter estimates is corrected with the improved method. The improved method is tested with coniferous, beech, and mixed-species forest plots based on airborne images of various spatial resolutions. The absolute (quantitative) accuracy of the improved crown diameter estimates is comparable to or higher than that of current methods for both monospecies and mixed-species plots. The ability of the improved method to produce good estimates of average crown diameter for monocultures and mixed species, to use remote sensing data of various spatial resolutions, and to operate in automatic mode suggests its applicability to a wide range of forest systems.
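The granulometric step can be sketched on a toy binary image. The disc radii, pixel size, and synthetic "crowns" below are illustrative assumptions, not the authors' implementation: a granulometric spectrum is built by opening the binarized image with discs of increasing radius, and the radius at which the most foreground area disappears marks the characteristic crown size.

```python
import numpy as np
from scipy import ndimage

def granulometry(binary, max_radius=10):
    """Granulometric spectrum: surviving foreground area after
    morphological opening with discs of increasing radius."""
    areas = []
    for r in range(max_radius + 1):
        y, x = np.ogrid[-r:r + 1, -r:r + 1]
        disc = x * x + y * y <= r * r
        opened = ndimage.binary_opening(binary, structure=disc)
        areas.append(int(opened.sum()))
    return np.array(areas)

def dominant_diameter(binary, pixel_size=1.0, max_radius=10):
    """The radius step that removes the most area marks the
    characteristic object size; diameter = 2 * radius * pixel size."""
    areas = granulometry(binary, max_radius)
    losses = -np.diff(areas)          # area removed at each radius step
    r_star = int(np.argmax(losses)) + 1
    return 2 * r_star * pixel_size

# Toy binarized plot: two "crowns" (discs of radius 4 px) on background
img = np.zeros((40, 40), dtype=bool)
yy, xx = np.mgrid[:40, :40]
img |= (yy - 10) ** 2 + (xx - 10) ** 2 <= 16
img |= (yy - 28) ** 2 + (xx - 28) ** 2 <= 16
print(dominant_diameter(img, pixel_size=0.5))  # → 5.0 for these toy discs
```

In practice the window binarization step would supply `binary`, and `max_radius` must exceed the largest expected crown radius in pixels.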
Fialkov, Alexander B; Amirav, Aviv
2003-01-01
Upon the supersonic expansion of helium mixed with vapor from an organic solvent (e.g. methanol), various clusters of the solvent with the sample molecules can be formed. As a result of 70 eV electron ionization of these clusters, cluster chemical ionization (cluster CI) mass spectra are obtained. These spectra are characterized by the combination of EI mass spectra of vibrationally cold molecules in the supersonic molecular beam (cold EI) with CI-like appearance of abundant protonated molecules, together with satellite peaks corresponding to protonated or non-protonated clusters of sample compounds with 1-3 solvent molecules. Like CI, cluster CI preferably occurs for polar compounds with high proton affinity. However, in contrast to conventional CI, for non-polar compounds or those with reduced proton affinity the cluster CI mass spectrum converges to that of cold EI. The appearance of a protonated molecule and its solvent cluster peaks, plus the lack of protonation and cluster satellites for prominent EI fragments, enable the unambiguous identification of the molecular ion. In turn, the insertion of the proper molecular ion into the NIST library search of the cold EI mass spectra eliminates those candidates with incorrect molecular mass and thus significantly increases the confidence level in sample identification. Furthermore, molecular mass identification is of prime importance for the analysis of unknown compounds that are absent in the library. Examples are given with emphasis on the cluster CI analysis of carbamate pesticides, high explosives and unknown samples, to demonstrate the usefulness of Supersonic GC/MS (GC/MS with supersonic molecular beam) in the analysis of these thermally labile compounds. Cluster CI is shown to be a practical ionization method, due to its ease-of-use and fast instrumental conversion between EI and cluster CI, which involves the opening of only one valve located at the make-up gas path. The ease-of-use of cluster CI is analogous
NASA Astrophysics Data System (ADS)
Zhang, Lu; Wu, Zhiyong; Zhang, Yaoyu; Detian, Huang
2013-01-01
To mitigate the impact of the error between the estimated channel fading coefficient and the perfect fading coefficient on the bit error rate (BER), an a priori conditional probability density function that averages over the estimation error is proposed. Then, an improved maximum-likelihood (ML) symbol-by-symbol detection is derived for free-space optical communication systems that implement pilot-symbol-assisted modulation. To reduce complexity, a closed-form suboptimal improved ML detection is deduced using a distribution approximation. Numerical results confirm that the improved ML detection achieves a BER performance improvement, and that its suboptimal version performs as well as it does. Both therefore outperform classical ML detection, which does not consider channel estimation error.
NASA Astrophysics Data System (ADS)
Chen, Y.; Ho, C.; Chang, L.
2011-12-01
In recent decades, climate change driven by global warming has increased the occurrence frequency of extreme hydrological events. Water supply shortages caused by extreme events create great challenges for water resource management. To evaluate future climate variations, general circulation models (GCMs) are the most widely used tools; they show possible weather conditions under pre-defined CO2 emission scenarios announced by the IPCC. Because GCMs model the entire earth, their grid sizes are much larger than the basin scale. To bridge this gap, statistical downscaling techniques transform regional-scale weather factors into basin-scale precipitation. Statistical downscaling techniques fall into three categories: transfer functions, weather generators, and weather typing. The first two describe the relationships between weather factors and precipitation using, respectively, deterministic algorithms, such as linear or nonlinear regression and ANNs, and stochastic approaches, such as Markov chain theory and statistical distributions. Weather typing clusters the weather factors, which are high-dimensional continuous variables, into a limited number of discrete weather types. In this study, the proposed downscaling model integrates weather typing, using the K-means clustering algorithm, with a weather generator based on kernel density estimation. The study area is the Shihmen basin in northern Taiwan. The research process contains two steps, a calibration step and a synthesis step. The calibration step has three sub-steps. First, weather factors such as pressure, humidity and wind speed obtained from NCEP, together with precipitation observed at rainfall stations, were collected for downscaling. Second, K-means clustering grouped the weather factors into four weather types. Third, the Markov chain transition matrices and the
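The K-means-plus-KDE pipeline described above can be sketched as follows. The predictor variables, cluster count, bandwidth, and synthetic data are illustrative assumptions standing in for the NCEP weather factors and Shihmen rainfall records: large-scale predictors are grouped into weather types, a kernel density estimate of precipitation is fitted within each type, and synthesis samples rainfall from the KDE of a new day's type.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# Synthetic calibration data: daily large-scale predictors (standing in
# for pressure, humidity, wind speed) and co-located basin precipitation.
X = rng.normal(size=(1000, 3))
precip = np.exp(rng.normal(size=1000))   # positive, skewed like rainfall

# Step 1 (weather typing): group the predictors into four weather types.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
types = km.labels_

# Step 2 (weather generator): fit a kernel density estimate of
# log-precipitation within each weather type.
kdes = {t: KernelDensity(bandwidth=0.3).fit(
            np.log(precip[types == t])[:, None]) for t in range(4)}

def synthesize(x_new, n=1):
    """Classify a new day's predictors, then sample rainfall from the
    KDE of its weather type."""
    t = int(km.predict(x_new[None, :])[0])
    return np.exp(kdes[t].sample(n).ravel())

draws = synthesize(X[0], n=5)
```

A real application would add the Markov chain transition matrices between weather types mentioned in the abstract to preserve day-to-day persistence.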
Roumie, Christianne L; Elasy, Tom A; Greevy, Robert; Griffin, Marie R; Liu, Xulei; Stone, William J; Wallston, Kenneth A; Dittus, Robert S; Alvarez, Vincent; Cobb, Janice; Speroff, Theodore
2006-08-01
Inadequate blood pressure control is a persistent gap in quality care. To evaluate provider and patient interventions to improve blood pressure control. Cluster randomized, controlled trial. 2 hospital-based and 8 community-based clinics in the Veterans Affairs Tennessee Valley Healthcare System. 1341 veterans with essential hypertension cared for by 182 providers. Eligible patients had 2 or more blood pressure measurements greater than 140/90 mm Hg in a 6-month period and were taking a single antihypertensive agent. Providers who cared for eligible patients were randomly assigned to receive an e-mail with a Web-based link to the Seventh Report of the Joint National Committee on the Prevention, Detection, Evaluation and Treatment of High Blood Pressure (JNC 7) guidelines (provider education); provider education and a patient-specific hypertension computerized alert (provider education and alert); or provider education, hypertension alert, and patient education, in which patients were sent a letter advocating drug adherence, lifestyle modification, and conversations with providers (patient education). Proportion of patients with a systolic blood pressure less than 140 mm Hg at 6 months; intensification of antihypertensive medication. Mean baseline blood pressure was 157/83 mm Hg with no differences between groups (P = 0.105). Six-month follow-up data were available for 975 patients (73%). Patients of providers who were randomly assigned to the patient education group had better blood pressure control (138/75 mm Hg) than those in the provider education and alert or provider education alone groups (146/76 mm Hg and 145/78 mm Hg, respectively). More patients in the patient education group had a systolic blood pressure of 140 mm Hg or less compared with those in the provider education or provider education and alert groups (adjusted relative risk for the patient education group compared with the provider education alone group, 1.31 [95% CI, 1.06 to 1.62]; P = 0
Backman, Ruth; Foy, Robbie; Diggle, Peter J; Kneen, Rachel; Defres, Sylviane; Michael, Benedict Daniel; Medina-Lara, Antonieta; Solomon, Tom
2015-01-27
Viral encephalitis is a devastating condition for which delayed treatment is associated with increased morbidity and mortality. Clinical audits indicate substantial scope for improved detection and treatment. Improvement strategies should ideally be tailored according to identified needs and barriers to change. The aim of the study is to evaluate the effectiveness and cost-effectiveness of a tailored intervention to improve the secondary care management of suspected encephalitis. The study is a two-arm cluster randomised controlled trial with allocation by postgraduate deanery. Participants were identified from 24 hospitals nested within 12 postgraduate deaneries in the United Kingdom (UK). We developed a multifaceted intervention package including core and flexible components with embedded behaviour change techniques selected on the basis of identified needs and barriers to change. The primary outcome will be a composite of the proportion of patients with suspected encephalitis receiving timely and appropriate diagnostic lumbar puncture within 12 h of hospital admission and aciclovir treatment within 6 h. We will gather outcome data pre-intervention and up to 12 months post-intervention from patient records. Statistical analysis at the cluster level will be blind to allocation. An economic evaluation will estimate intervention cost-effectiveness from the health service perspective. Controlled Trials: ISRCTN06886935.
Improving PERSIANN-CCS rain estimation using probabilistic approach and multi-sensors information
NASA Astrophysics Data System (ADS)
Karbalaee, N.; Hsu, K. L.; Sorooshian, S.; Kirstetter, P.; Hong, Y.
2016-12-01
This presentation discusses recently implemented approaches to improve rainfall estimation from Precipitation Estimation from Remotely Sensed Information using Artificial Neural Network-Cloud Classification System (PERSIANN-CCS). PERSIANN-CCS is an infrared (IR) based algorithm integrated into IMERG (Integrated Multi-satellitE Retrievals for GPM, the Global Precipitation Measurement mission) to create a precipitation product at 0.1 x 0.1 degree resolution over the domain 50N to 50S every 30 minutes. Although PERSIANN-CCS has high spatial and temporal resolution, it overestimates or underestimates rainfall because of some limitations. PERSIANN-CCS estimates rainfall from information extracted from IR channels at three temperature threshold levels (220, 235, and 253 K). Because the algorithm relies only on infrared data and estimates rainfall indirectly from this channel, it misses rainfall from warm clouds and produces false estimates for non-precipitating cold clouds. This research investigates the effectiveness of using other channels of the GOES satellites, such as visible and water vapor. With multiple sensors, precipitation can be estimated from the information extracted from multiple channels. In addition, instead of using an exponential function to estimate rainfall from cloud-top temperature, a probabilistic method is used. Using probability distributions of precipitation rates instead of deterministic values has improved rainfall estimation for different cloud types.
Nair, Nirmala; Tripathy, Prasanta; Sachdev, Harshpal S; Bhattacharyya, Sanghita; Gope, Rajkumar; Gagrai, Sumitra; Rath, Shibanand; Rath, Suchitra; Sinha, Rajesh; Roy, Swati Sarbani; Shewale, Suhas; Singh, Vijay; Srivastava, Aradhana; Pradhan, Hemanta; Costello, Anthony; Copas, Andrew; Skordis-Worrall, Jolene; Haghparast-Bidgoli, Hassan; Saville, Naomi; Prost, Audrey
2015-04-15
Child stunting (low height-for-age) is a marker of chronic undernutrition and predicts children's subsequent physical and cognitive development. Around one third of the world's stunted children live in India. Our study aims to assess the impact, cost-effectiveness, and scalability of a community intervention with a government-proposed community-based worker to improve growth in children under two in rural India. The study is a cluster randomised controlled trial in two rural districts of Jharkhand and Odisha (eastern India). The intervention tested involves a community-based worker carrying out two activities: (a) one home visit to all pregnant women in the third trimester, followed by subsequent monthly home visits to all infants aged 0-24 months to support appropriate feeding, infection control, and care-giving; (b) a monthly women's group meeting using participatory learning and action to catalyse individual and community action for maternal and child health and nutrition. Both intervention and control clusters also receive an intervention to strengthen Village Health Sanitation and Nutrition Committees. The unit of randomisation is a purposively selected cluster of approximately 1000 population. A total of 120 geographical clusters covering an estimated population of 121,531 were randomised to two trial arms: 60 clusters in the intervention arm receive home visits, group meetings, and support to Village Health Sanitation and Nutrition Committees; 60 clusters in the control arm receive support to Committees only. The study participants are pregnant women identified in the third trimester of pregnancy and their children (n = 2520). Mothers and their children are followed up at seven time points: during pregnancy, within 72 hours of delivery, and at 3, 6, 9, 12 and 18 months after birth. The trial's primary outcome is children's mean length-for-age Z scores at 18 months. Secondary outcomes include wasting and underweight at all time points, birth weight, growth
Ettl, Florian; Testori, Christoph; Weiser, Christoph; Fleischhackl, Sabine; Mayer-Stickler, Monika; Herkner, Harald; Schreiber, Wolfgang; Fleischhackl, Roman
2011-06-01
The first-aid training required for obtaining a driver's license in Austria has a regulated, predefined curriculum but has been targeted for a new course structure with less theoretical input, repetitive training in cardiopulmonary resuscitation (CPR), and structured presentations using innovative media. The standard and the new course design were compared in a prospective, participant- and observer-blinded, cluster-randomized controlled study. Six months after the initial training, we evaluated the confidence of the 66 participants in their skills, CPR effectiveness parameters, and the correctness of their actions. Median self-confidence was significantly higher in the interventional group [IG, visual analogue scale (VAS: "0" = not confident at all, "100" = highly confident): 57] than in the control group (CG, VAS: 41). The mean chest compression rate in the IG (98/min) was closer to the recommended 100/min than in the CG (110/min). The time to the first chest compression (IG: 25 s, CG: 36 s) and the time to the first defibrillator shock (IG: 86 s, CG: 92 s) were significantly shorter in the IG. Furthermore, the IG participants were safer in their handling of the defibrillator and more often started countermeasures against developing shock. The management of an unconscious person and of heavy bleeding did not differ between the two groups even after shortening the lecture time. Motivation and self-confidence, as well as skill retention after six months, were shown to depend on the teaching methods and the time devoted to practical training. Courses may be reorganized and content rescheduled, even within predefined curricula, to improve course outcomes. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Mullender-Wijnsma, Marijke J; Hartman, Esther; de Greeff, Johannes W; Doolaard, Simone; Bosker, Roel J; Visscher, Chris
2016-03-01
Using physical activity in the teaching of academic lessons is a new way of learning. The aim of this study was to investigate the effects of an innovative physically active academic intervention ("Fit & Vaardig op School" [F&V]) on academic achievement of children. Using physical activity to teach math and spelling lessons was studied in a cluster-randomized controlled trial. Participants were 499 children (mean age 8.1 years) from second- and third-grade classes of 12 elementary schools. At each school, a second- and third-grade class were randomly assigned to the intervention or control group. The intervention group participated in F&V lessons for 2 years, 22 weeks per year, 3 times a week. The control group participated in regular classroom lessons. Children's academic achievement was measured before the intervention started and after the first and second intervention years. Academic achievement was measured by 2 mathematics tests (speed and general math skills) and 2 language tests (reading and spelling). After 2 years, multilevel analysis showed that children in the intervention group had significantly greater gains in mathematics speed test (P < .001; effect size [ES] 0.51), general mathematics (P < .001; ES 0.42), and spelling (P < .001; ES 0.45) scores. This equates to 4 months more learning gains in comparison with the control group. No differences were found on the reading test. Physically active academic lessons significantly improved mathematics and spelling performance of elementary school children and are therefore a promising new way of teaching. Copyright © 2016 by the American Academy of Pediatrics.
Improved initialisation of model-based clustering using Gaussian hierarchical partitions
Scrucca, Luca; Raftery, Adrian E.
2015-01-01
Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and thus to different clustering partitions. Among the several approaches available in the literature, model-based agglomerative hierarchical clustering is used to provide initial partitions in the popular mclust R package. This choice is computationally convenient and often yields good clustering partitions. However, in certain circumstances, poor initial partitions may cause the EM algorithm to converge to a local maximum of the likelihood function. We propose several simple and fast refinements based on data transformations and illustrate them through data examples. PMID:26949421
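In the same spirit (though the abstract concerns the mclust R package), a hierarchical-partition initialisation of EM can be sketched in Python. The sphering transform, Ward linkage, and synthetic data below are stand-ins for mclust's Gaussian model-based agglomeration, not the authors' code; the sketch shows transforming the data, cutting a hierarchical tree for an initial partition, and seeding EM from that partition's cluster means.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Two synthetic, well-separated Gaussian groups
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(6.0, 1.0, size=(100, 2))])

# Refinement in the spirit of the abstract: transform the data (here a
# simple SVD-based sphering) before the agglomerative pass that seeds EM.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = (Xc @ Vt.T) / s * np.sqrt(len(X))   # decorrelated, unit-variance scores

# Agglomerative hierarchical partition (Ward linkage as a stand-in for
# Gaussian model-based agglomeration), cut at two clusters ...
labels0 = fcluster(linkage(Z, method="ward"), t=2, criterion="maxclust")

# ... whose cluster means initialise EM instead of a random start.
means0 = np.array([X[labels0 == k].mean(axis=0) for k in (1, 2)])
gm = GaussianMixture(n_components=2, means_init=means0,
                     random_state=0).fit(X)
```

With a good initial partition, EM typically converges in few iterations to the dominant likelihood maximum rather than a spurious local one.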
NASA Astrophysics Data System (ADS)
Giannantonio, Tommaso; Ross, Ashley J.; Percival, Will J.; Crittenden, Robert; Bacher, David; Kilbinger, Martin; Nichol, Robert; Weller, Jochen
2014-01-01
We present the strongest robust constraints on primordial non-Gaussianity (PNG) from currently available galaxy surveys, combining large-scale clustering measurements and their cross correlations with the cosmic microwave background. We update the data sets used by Giannantonio et al. (2012), and broaden that analysis to include the full set of two-point correlation functions between all surveys. In order to obtain the most reliable constraints on PNG, we advocate the use of the cross correlations between the catalogs as a robust estimator and we perform an extended analysis of the possible systematics to reduce their impact on the results. To minimize the impact of stellar contamination in our luminous red galaxy sample, we use the recent Baryon Oscillation Spectroscopic Survey catalog of Ross et al. (2011). We also find evidence for a new systematic in the NVSS radio galaxy survey similar to, but smaller than, the known declination-dependent issue; this is difficult to remove without affecting the inferred PNG signal, and thus we do not include the NVSS autocorrelation function in our analyses. We find no evidence of primordial non-Gaussianity; for the local-type configuration we obtain for the skewness parameter -36
Comerford, Julia M.; Moustakas, Leonidas A.; Natarajan, Priyamvada
2010-05-20
Scaling relations of observed galaxy cluster properties are useful tools for constraining cosmological parameters as well as cluster formation histories. One of the key cosmological parameters, {sigma}{sub 8}, is constrained using observed clusters of galaxies, although current estimates of {sigma}{sub 8} from the scaling relations of dynamically relaxed galaxy clusters are limited by the large scatter in the observed cluster mass-temperature (M-T) relation. With a sample of eight strong lensing clusters at 0.3 < z < 0.8, we find that the observed cluster concentration-mass relation can be used to reduce the M-T scatter by a factor of 6. Typically only relaxed clusters are used to estimate {sigma}{sub 8}, but combining the cluster concentration-mass relation with the M-T relation enables the inclusion of unrelaxed clusters as well. Thus, the resultant gains in the accuracy of {sigma}{sub 8} measurements from clusters are twofold: the errors on {sigma}{sub 8} are reduced and the cluster sample size is increased. Therefore, the statistics on {sigma}{sub 8} determination from clusters are greatly improved by the inclusion of unrelaxed clusters. Exploring cluster scaling relations further, we find that the correlation between brightest cluster galaxy (BCG) luminosity and cluster mass offers insight into the assembly histories of clusters. We find preliminary evidence for a steeper BCG luminosity-cluster mass relation for strong lensing clusters than the general cluster population, hinting that strong lensing clusters may have had more active merging histories.
Walters, William A.; Lennon, Niall J.; Bochicchio, James; Krohn, Andrew; Pennanen, Taina
2016-01-01
ABSTRACT While high-throughput sequencing methods are revolutionizing fungal ecology, recovering accurate estimates of species richness and abundance has proven elusive. We sought to design internal transcribed spacer (ITS) primers and an Illumina protocol that would maximize coverage of the kingdom Fungi while minimizing nontarget eukaryotes. We inspected alignments of the 5.8S and large subunit (LSU) ribosomal genes and evaluated potential primers using PrimerProspector. We tested the resulting primers using tiered-abundance mock communities and five previously characterized soil samples. We recovered operational taxonomic units (OTUs) belonging to all 8 members in both mock communities, despite DNA abundances spanning 3 orders of magnitude. The expected and observed read counts were strongly correlated (r = 0.94 to 0.97). However, several taxa were consistently over- or underrepresented, likely due to variation in rRNA gene copy numbers. The Illumina data resulted in clustering of soil samples identical to that obtained with Sanger sequence clone library data using different primers. Furthermore, the two methods produced distance matrices with a Mantel correlation of 0.92. Nonfungal sequences comprised less than 0.5% of the soil data set, with most attributable to vascular plants. Our results suggest that high-throughput methods can produce fairly accurate estimates of fungal abundances in complex communities. Further improvements might be achieved through corrections for rRNA copy number and utilization of standardized mock communities. IMPORTANCE Fungi play numerous important roles in the environment. Improvements in sequencing methods are providing revolutionary insights into fungal biodiversity, yet accurate estimates of the number of fungal species (i.e., richness) and their relative abundances in an environmental sample (e.g., soil, roots, water, etc.) remain difficult to obtain. We present improved methods for high-throughput Illumina sequencing of the
Improving hot region prediction by parameter optimization of density clustering in PPI.
Hu, Jing; Zhang, Xiaolong
2016-11-01
This paper proposes an optimized algorithm that combines density clustering with parameter selection and feature-based classification for hot region prediction. First, all residues are classified by SVM to remove non-hot-spot residues; then density clustering with parameter selection is used to find hot regions. For the density clustering, this paper studies how to select the input parameters. Density-based incremental clustering has two parameters, radius and density. We first fix the density and enumerate the radius to find a pair of parameters that yields the maximum number of clusters, and then fix the radius and enumerate the density to find another such pair. Experimental results show that the proposed method using both pairs of parameters provides better prediction performance than the alternative; comparing the two predictive results, fixing the radius and enumerating the density gives slightly higher prediction accuracy than fixing the density and enumerating the radius. Copyright © 2016. Published by Elsevier Inc.
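The two-pass parameter search described above can be sketched with DBSCAN as a stand-in for the paper's density-based incremental clustering. The synthetic 2-D "residue" coordinates, parameter grids, and group layout are illustrative assumptions; the sketch only shows fixing one parameter while enumerating the other to maximize the cluster count, then swapping roles.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)
# Synthetic "residues": three tight groups plus scattered noise points
X = np.vstack([rng.normal(c, 0.3, size=(30, 2))
               for c in ((0, 0), (4, 0), (2, 3))]
              + [rng.uniform(-2, 6, size=(20, 2))])

def n_clusters(labels):
    """Number of clusters found, excluding the noise label -1."""
    return len(set(labels)) - (1 if -1 in labels else 0)

# Pass 1: fix the density (min_samples) and enumerate the radius (eps),
# keeping the radius that yields the maximum number of clusters.
density = 5
best_eps = max(np.arange(0.1, 2.0, 0.1),
               key=lambda e: n_clusters(
                   DBSCAN(eps=e, min_samples=density).fit(X).labels_))

# Pass 2: fix the radius just found and enumerate the density.
best_density = max(range(3, 15),
                   key=lambda m: n_clusters(
                       DBSCAN(eps=best_eps, min_samples=m).fit(X).labels_))

labels = DBSCAN(eps=best_eps, min_samples=best_density).fit(X).labels_
```

The paper uses both resulting parameter pairs; the sketch stops after one pass of each enumeration for brevity.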
Dupont, Corinne; Winer, Norbert; Rabilloud, Muriel; Touzet, Sandrine; Branger, Bernard; Lansac, Jacques; Gaucher, Laurent; Duclos, Antoine; Huissoud, Cyril; Boutitie, Florent; Rudigoz, René-Charles; Colin, Cyrille
2017-08-01
Suboptimal care contributes to perinatal morbidity and mortality. We investigated the effects of a multifaceted program designed to improve obstetric practices and outcomes. A cluster-randomized trial was conducted from October 2008 to November 2010 in 95 French maternity units, randomized either to receive an information intervention about published guidelines or to be left to apply them freely. The intervention combined an outreach visit with a morbidity/mortality conference (MMC) to review perinatal morbidity/mortality cases. Within the intervention group, the units were randomized to have MMCs with or without clinical psychologists. The primary outcome was the rate of suboptimal care among perinatal morbidity/mortality cases. The secondary outcomes included the rate of suboptimal care among cases of morbidity, the rate of suboptimal care among cases of mortality, the rate of avoidable morbidity and/or mortality cases, and the incidence of morbidity and/or mortality. A mixed logistic regression model with random intercept was used to quantify the effect of the intervention on the main outcome. The study reviewed 2459 cases of morbidity or mortality among 165,353 births. The rate of suboptimal care among morbidity plus mortality cases was not significantly lower in the intervention group than in the control group (8.1% vs. 10.6%, OR [95% CI]: 0.75 [0.50-1.12], p=0.15). However, the rate of suboptimal care among morbidity cases was significantly lower in the intervention group (7.6% vs. 11.5%, 0.62 [0.40-0.94], p=0.02), and the incidence of perinatal morbidity was also lower (7.0 vs. 8.1‰, p=0.01). No differences were found between the psychologist-backed units and the others. The intervention reduced the rate of suboptimal care mainly in morbidity cases and the incidence of morbidity, but did not succeed in improving morbidity plus mortality combined. More clear-cut results regarding mortality require a longer study period and the inclusion of structures that intervene before and
Pollutant discharges to coastal areas: Improving upstream source estimates. Final report
Rohmann, S.O.
1989-10-01
The report describes a project NOAA's Strategic Environmental Assessments Division began to improve the estimates of pollutant discharges carried into coastal areas by rivers and streams. These estimates, termed discharges from upstream sources, take into account all pollution discharged by industries, sewage treatment plants, farms, cities, and other pollution-generating operations, as well as natural phenomena such as erosion and weathering which occur inland or upstream of the coastal US.
An improved Combes-Thomas estimate of magnetic Schrödinger operators
NASA Astrophysics Data System (ADS)
Shen, Zhongwei
2014-10-01
In the present paper, we prove an improved Combes-Thomas estimate, viz. the Combes-Thomas estimate in trace-class norms, for magnetic Schrödinger operators under general assumptions. In particular, we allow for unbounded potentials. We also show that for any function in the Schwartz space on the reals the operator kernel decays, in trace-class norms, faster than any polynomial.
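For context, the classical (operator-norm) Combes-Thomas bound can be stated schematically; the constants and exact decay rate below are generic placeholders, not the paper's refined trace-class version.

```latex
% Schematic form of the classical Combes-Thomas estimate: if z lies
% outside the spectrum of the Schroedinger operator H, with
% d = dist(z, sigma(H)) > 0, then the resolvent decays exponentially
% between the multiplication operators chi_x, chi_y by indicators of
% unit cubes centred at x and y:
\[
  \left\| \chi_x \,(H - z)^{-1}\, \chi_y \right\|
  \;\le\; \frac{C}{d}\, e^{-c_d\, |x - y|},
  \qquad c_d > 0 \text{ depending on } d .
\]
% The paper's improvement upgrades the operator norm on the left-hand
% side to trace-class norms, for magnetic Schroedinger operators with
% possibly unbounded potentials.
```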
Stover, J; Johnson, P; Zaba, B; Zwahlen, M; Dabis, F; Ekpini, R E
2008-08-01
The approach to national and global estimates of HIV/AIDS used by UNAIDS starts with estimates of adult HIV prevalence prepared from surveillance data using either the Estimation and Projection Package (EPP) or the Workbook. Time trends of prevalence are transferred to Spectrum to estimate the consequences of the HIV/AIDS epidemic, including the number of people living with HIV, new infections, AIDS deaths, AIDS orphans, treatment needs and the impact of treatment on survival. The UNAIDS Reference Group on Estimates, Modelling and Projections regularly reviews new data and information needs and recommends updates to the methodology and assumptions used in Spectrum. The latest update to Spectrum was used in the 2007 round of global estimates. Several new features have been added to Spectrum in the past two years. The structure of the population was reorganised to track populations by HIV status and treatment status. Mortality estimates were improved by the adoption of new approaches to estimating non-AIDS mortality by single age, and the use of new information on survival with HIV in non-treated cohorts and on the survival of patients on antiretroviral treatment (ART). A more detailed treatment of mother-to-child transmission of HIV now provides more prophylaxis and infant feeding options. New procedures were implemented to estimate the uncertainty around each of the key outputs. The latest update to the Spectrum program is intended to incorporate the latest research findings and provide new outputs needed by national and