Forensic identification of resampling operators: A semi non-intrusive approach.
Cao, Gang; Zhao, Yao; Ni, Rongrong
2012-03-10
Recently, several new resampling operators have been proposed that successfully invalidate the existing resampling detectors. However, the reliability of such anti-forensic techniques is unknown and needs to be investigated. In this paper, we focus on the forensic identification of digital image resampling operators, including both the traditional type and the anti-forensic type which hides the traces of traditional resampling. Various resampling algorithms, including geometric distortion (GD)-based, dual-path-based and postprocessing-based ones, are investigated. The identification is achieved in a semi non-intrusive manner, supposing that the resampling software can be accessed. Given a monotone signal as the input test pattern, the polarity aberration of the GD-based resampled signal's first derivative is analyzed theoretically and measured by an effective feature metric. Dual-path-based and postprocessing-based resampling can also be identified by feeding proper test patterns. Experimental results on various parameter settings demonstrate the effectiveness of the proposed approach. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Assessment of Person Fit Using Resampling-Based Approaches
ERIC Educational Resources Information Center
Sinharay, Sandip
2016-01-01
De la Torre and Deng suggested a resampling-based approach for person-fit assessment (PFA). The approach involves the use of the [math equation unavailable] statistic, a corrected expected a posteriori estimate of the examinee ability, and the Monte Carlo (MC) resampling method. The Type I error rate of the approach was closer to the nominal level…
Li, Dongmei; Le Pape, Marc A; Parikh, Nisha I; Chen, Will X; Dye, Timothy D
2013-01-01
Microarrays are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. Multiple testing methods in microarray data analysis aim at controlling both Type I and Type II error rates; however, real microarray data do not always fit their distribution assumptions. Smyth's ubiquitous parametric method, for example, inadequately accommodates violations of normality assumptions, resulting in inflated Type I error rates. The Significance Analysis of Microarrays, another widely used microarray data analysis method, is based on a permutation test and is robust to non-normally distributed data; however, its fold change criteria are problematic and can critically alter the conclusion of a study as a result of compositional changes of the control data set in the analysis. We propose a novel approach combining resampling with empirical Bayes methods: the Resampling-based empirical Bayes Methods. This approach not only reduces false discovery rates for non-normally distributed microarray data, but is also impervious to the fold change threshold since no control data set selection is needed. Through simulation studies, sensitivities, specificities, total rejections, and false discovery rates are compared across Smyth's parametric method, the Significance Analysis of Microarrays, and the Resampling-based empirical Bayes Methods. Differences in false discovery rate control between the approaches are illustrated through a preterm delivery methylation study. The results show that the Resampling-based empirical Bayes Methods offer significantly higher specificity and lower false discovery rates compared to Smyth's parametric method when data are not normally distributed. The Resampling-based empirical Bayes Methods also offer higher statistical power than the Significance Analysis of Microarrays method when the proportion of significantly differentially expressed genes is large, for both normally and non-normally distributed data. Finally, the Resampling-based empirical Bayes Methods are generalizable to next generation sequencing RNA-seq data analysis.
Yang, Yang; DeGruttola, Victor
2016-01-01
Traditional resampling-based tests for homogeneity in covariance matrices across multiple groups resample residuals, that is, data centered by group means. These residuals do not share the same second moments when the null hypothesis is false, which makes them difficult to use in the setting of multiple testing. An alternative approach is to resample standardized residuals, data centered by group sample means and standardized by group sample covariance matrices. This approach, however, has been observed to inflate type I error when sample size is small or data are generated from heavy-tailed distributions. We propose to improve this approach by using robust estimation for the first and second moments. We discuss two statistics: the Bartlett statistic and a statistic based on eigen-decomposition of sample covariance matrices. Both statistics can be expressed in terms of standardized errors under the null hypothesis. These methods are extended to test homogeneity in correlation matrices. Using simulation studies, we demonstrate that the robust resampling approach provides comparable or superior performance, relative to traditional approaches, for single testing and reasonable performance for multiple testing. The proposed methods are applied to data collected in an HIV vaccine trial to investigate possible determinants, including vaccine status, vaccine-induced immune response level and viral genotype, of unusual correlation pattern between HIV viral load and CD4 count in newly infected patients. PMID:22740584
Cellular neural network-based hybrid approach toward automatic image registration
NASA Astrophysics Data System (ADS)
Arun, Pattathal VijayaKumar; Katiyar, Sunil Kumar
2013-01-01
Image registration is a key component of various image processing operations that involve the analysis of different image data sets. Automatic image registration domains have witnessed the application of many intelligent methodologies over the past decade; however, inability to properly model object shape as well as contextual information has limited the attainable accuracy. A framework for accurate feature shape modeling and adaptive resampling using advanced techniques such as vector machines, cellular neural network (CNN), scale invariant feature transform (SIFT), coreset, and cellular automata is proposed. CNN has been found to be effective in improving the feature matching as well as resampling stages of registration, and the complexity of the approach has been considerably reduced using coreset optimization. The salient features of this work are cellular neural network approach-based SIFT feature point optimization, adaptive resampling, and intelligent object modelling. The developed methodology has been compared with contemporary methods using different statistical measures. Investigations over various satellite images revealed that considerable success was achieved with the approach. The system dynamically uses spectral and spatial information for representing contextual knowledge using a CNN-prolog approach. The methodology is also shown to be effective in providing intelligent interpretation and adaptive resampling.
Cui, Ming; Xu, Lili; Wang, Huimin; Ju, Shaoqing; Xu, Shuizhu; Jing, Rongrong
2017-12-01
Measurement uncertainty (MU) is a metrological concept, which can be used for objectively estimating the quality of test results in medical laboratories. The Nordtest guide recommends an approach that uses both internal quality control (IQC) and external quality assessment (EQA) data to evaluate the MU. Bootstrap resampling is employed to simulate the unknown distribution based on the mathematical statistics method using an existing small sample of data, where the aim is to transform the small sample into a large sample. However, there have been no reports of the utilization of this method in medical laboratories. Thus, this study applied the Nordtest guide approach based on bootstrap resampling for estimating the MU. We estimated the MU for the white blood cell (WBC) count, red blood cell (RBC) count, hemoglobin (Hb), and platelets (Plt). First, we used 6 months of IQC data and 12 months of EQA data to calculate the MU according to the Nordtest method. Second, we combined the Nordtest method and bootstrap resampling with the quality control data and calculated the MU using MATLAB software. We then compared the MU results obtained using the two approaches. The expanded uncertainty results determined for WBC, RBC, Hb, and Plt using the bootstrap resampling method were 4.39%, 2.43%, 3.04%, and 5.92%, respectively, and 4.38%, 2.42%, 3.02%, and 6.00% with the existing quality control data (U [k=2]). For WBC, RBC, Hb, and Plt, the differences between the results obtained using the two methods were lower than 1.33%. The expanded uncertainty values were all less than the target uncertainties. The bootstrap resampling method allows the statistical analysis of the MU. Combining the Nordtest method and bootstrap resampling is considered a suitable alternative method for estimating the MU. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
Anomalous change detection in imagery
Theiler, James P [Los Alamos, NM; Perkins, Simon J [Santa Fe, NM
2011-05-31
A distribution-based anomaly detection platform is described that identifies a non-flat background that is specified in terms of the distribution of the data. A resampling approach is also disclosed employing scrambled resampling of the original data with one class specified by the data and the other by the explicit distribution, and solving using binary classification.
ERIC Educational Resources Information Center
Nevitt, Jonathan; Hancock, Gregory R.
2001-01-01
Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…
System health monitoring using multiple-model adaptive estimation techniques
NASA Astrophysics Data System (ADS)
Sifford, Stanley Ryan
Monitoring system health for fault detection and diagnosis by tracking system parameters concurrently with state estimates is approached using a new multiple-model adaptive estimation (MMAE) method. This novel method is called GRid-based Adaptive Parameter Estimation (GRAPE). GRAPE expands on existing MMAE methods by using new techniques to sample the parameter space, based on the hypothesis that sample models can be applied and resampled without relying on a predefined set of models. GRAPE is initially implemented in a linear framework using Kalman filter models. A more generalized GRAPE formulation is presented using extended Kalman filter (EKF) models to represent nonlinear systems. GRAPE can handle both time-invariant and time-varying systems as it is designed to track parameter changes. Two techniques are presented to generate parameter samples for the parallel filter models. The first approach is called selected grid-based stratification (SGBS). SGBS divides the parameter space into equally spaced strata. The second approach uses Latin Hypercube Sampling (LHS) to determine the parameter locations and minimize the total number of required models. LHS is particularly useful when the parameter dimensions grow: adding more parameters does not require the model count to increase for LHS. Each resample is independent of the prior sample set other than the location of the parameter estimate. SGBS and LHS can be used for both the initial sample and subsequent resamples; furthermore, resamples are not required to use the same technique. Both techniques are demonstrated for both linear and nonlinear frameworks. The GRAPE framework further formalizes the parameter tracking process through a general approach for nonlinear systems. These additional methods allow GRAPE to either narrow the focus to converged values within a parameter range or expand the range in the appropriate direction to track the parameters outside the current parameter range boundary. Customizable rules define the specific resample behavior when the GRAPE parameter estimates converge. Convergence itself is determined from the derivatives of the parameter estimates, using a simple moving average window to filter out noise. The system can be tuned to match the desired performance goals by making adjustments to parameters such as the sample size, convergence criteria, resample criteria, initial sampling method, resampling method, confidence in prior sample covariances, sample delay, and others.
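The LHS rationale in the abstract can be made concrete with a few lines of code. The sketch below is illustrative only, with invented parameter names and bounds, and uses SciPy's qmc module rather than whatever sampler the dissertation actually employs:

    import numpy as np
    from scipy.stats import qmc

    # Three hypothetical model parameters to track (names and bounds invented).
    l_bounds = [0.1, 0.0, 50.0]
    u_bounds = [1.0, 0.5, 150.0]

    # Latin hypercube: one stratum per sample along EACH dimension, so the
    # number of models stays at n no matter how many dimensions are added.
    sampler = qmc.LatinHypercube(d=3, seed=0)
    samples = qmc.scale(sampler.random(n=10), l_bounds, u_bounds)

    # A regular grid, by contrast, needs n_per_dim**d models.
    grid = np.stack(np.meshgrid(*[np.linspace(lo, hi, 10)
                                  for lo, hi in zip(l_bounds, u_bounds)]), axis=-1)
    print(samples.shape, grid.reshape(-1, 3).shape)   # (10, 3) versus (1000, 3)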
Gaussian Process Interpolation for Uncertainty Estimation in Image Registration
Wachinger, Christian; Golland, Polina; Reuter, Martin; Wells, William
2014-01-01
Intensity-based image registration requires resampling images on a common grid to evaluate the similarity function. The uncertainty of interpolation varies across the image, depending on the location of resampled points relative to the base grid. We propose to perform Bayesian inference with Gaussian processes, where the covariance matrix of the Gaussian process posterior distribution estimates the uncertainty in interpolation. The Gaussian process replaces a single image with a distribution over images that we integrate into a generative model for registration. Marginalization over resampled images leads to a new similarity measure that includes the uncertainty of the interpolation. We demonstrate that our approach increases the registration accuracy and propose an efficient approximation scheme that enables seamless integration with existing registration methods. PMID:25333127
Assessing Uncertainties in Surface Water Security: A Probabilistic Multi-model Resampling approach
NASA Astrophysics Data System (ADS)
Rodrigues, D. B. B.
2015-12-01
Various uncertainties are involved in the representation of processes that characterize interactions between societal needs, ecosystem functioning, and hydrological conditions. Here, we develop an empirical uncertainty assessment of water security indicators that characterize scarcity and vulnerability, based on a multi-model and resampling framework. We consider several uncertainty sources including those related to: i) observed streamflow data; ii) hydrological model structure; iii) residual analysis; iv) the definition of the Environmental Flow Requirement method; v) the definition of critical conditions for water provision; and vi) the critical demand imposed by human activities. We estimate the overall uncertainty coming from the hydrological model by means of a residual bootstrap resampling approach, and by uncertainty propagation through different methodological arrangements applied to a 291 km² agricultural basin within the Cantareira water supply system in Brazil. Together, the two-component hydrograph residual analysis and the block bootstrap resampling approach result in a more accurate and precise estimate of the uncertainty (95% confidence intervals) in the simulated time series. We then compare the uncertainty estimates associated with water security indicators using a multi-model framework with those provided by each model uncertainty estimation approach. The method is general and can be easily extended, forming the basis for meaningful support to end-users facing water resource challenges by enabling them to incorporate a viable uncertainty analysis into a robust decision-making process.
An optical systems analysis approach to image resampling
NASA Technical Reports Server (NTRS)
Lyon, Richard G.
1997-01-01
All types of image registration require some type of resampling, either during the registration or as a final step in the registration process. Thus the image(s) must be regridded into a spatially uniform, or angularly uniform, coordinate system with some pre-defined resolution. Frequently the final resolution is not the resolution at which the data were observed. The registration algorithm designer and end product user are presented with a multitude of possible resampling methods, each of which modifies the spatial frequency content of the data in some way. The purpose of this paper is threefold: (1) to show how an imaging system modifies the scene from an end-to-end optical systems analysis approach, (2) to develop a generalized resampling model, and (3) to empirically apply the model to simulated radiometric scene data and tabulate the results. A Hanning windowed sinc interpolator method will be developed based upon the optical characterization of the system. It will be discussed in terms of the effects and limitations of sampling, aliasing, spectral leakage, and computational complexity. Simulated radiometric scene data will be used to demonstrate each of the algorithms. A high resolution scene will be "grown" using a fractal growth algorithm based on mid-point recursion techniques. The resulting scene data will be convolved with a point spread function representing the optical response. The resultant scene will be convolved with the detection system's response and subsampled to the desired resolution. The resultant data product will be subsequently resampled to the correct grid using the Hanning windowed sinc interpolator, and the results and errors tabulated and discussed.
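The windowed-sinc interpolator described above is straightforward to sketch. The following is a minimal 1-D illustration, not the paper's implementation; the tap count, window length and edge handling are assumptions:

    import numpy as np

    def hann_sinc_resample(samples, x_new, half_width=8):
        """Resample a uniformly sampled 1-D signal (unit spacing) at arbitrary
        positions with a Hanning-windowed sinc interpolator."""
        samples = np.asarray(samples, dtype=float)
        out = np.empty(len(x_new))
        for i, x in enumerate(x_new):
            k = np.arange(int(np.floor(x)) - half_width + 1,
                          int(np.floor(x)) + half_width + 1)   # contributing taps
            t = x - k                                          # distance to each tap
            w = np.sinc(t) * (0.5 + 0.5 * np.cos(np.pi * t / half_width))
            k = np.clip(k, 0, len(samples) - 1)                # crude edge handling
            out[i] = np.dot(w, samples[k]) / w.sum()           # normalize DC gain
        return out

    # Regrid a test signal onto a grid shifted by half a sample.
    x = np.arange(64)
    sig = np.sin(2 * np.pi * x / 16)
    shifted = hann_sinc_resample(sig, x[8:-8] + 0.5)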
Comparison of bootstrap approaches for estimation of uncertainties of DTI parameters.
Chung, SungWon; Lu, Ying; Henry, Roland G
2006-11-01
Bootstrap is an empirical non-parametric statistical technique based on data resampling that has been used to quantify uncertainties of diffusion tensor MRI (DTI) parameters, useful in tractography and in assessing DTI methods. The current bootstrap method (repetition bootstrap) used for DTI analysis performs resampling within the data sharing common diffusion gradients, requiring multiple acquisitions for each diffusion gradient. Recently, wild bootstrap was proposed that can be applied without multiple acquisitions. In this paper, two new approaches are introduced, called residual bootstrap and repetition bootknife. We show that repetition bootknife corrects for the large bias present in the repetition bootstrap method and, therefore, better estimates the standard errors. Like wild bootstrap, residual bootstrap is applicable to a single acquisition scheme, and both are based on regression residuals (called model-based resampling). Residual bootstrap is based on the assumption that the non-constant variance of measured diffusion-attenuated signals can be modeled, which is actually the assumption behind the widely used weighted least squares solution of the diffusion tensor. The performances of these bootstrap approaches were compared in terms of bias, variance, and overall error of the bootstrap-estimated standard error by Monte Carlo simulation. We demonstrate that residual bootstrap has smaller biases and overall errors, which enables estimation of uncertainties with higher accuracy. Understanding the properties of these bootstrap procedures will help us to choose the optimal approach for estimating uncertainties that can benefit hypothesis testing based on DTI parameters, probabilistic fiber tracking, and optimizing DTI methods.
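A minimal sketch of the model-based (residual) resampling idea in an ordinary least-squares setting may help. It is not the paper's tensor estimation, and it assumes constant variance, whereas the paper models the non-constant variance of diffusion signals:

    import numpy as np

    rng = np.random.default_rng(0)

    def residual_bootstrap_se(X, y, n_boot=2000):
        """Standard errors of least-squares coefficients by resampling the
        regression residuals (model-based resampling)."""
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        fitted = X @ beta
        resid = y - fitted
        resid = resid - resid.mean()                     # center residuals
        boot = np.empty((n_boot, X.shape[1]))
        for b in range(n_boot):
            y_star = fitted + rng.choice(resid, size=y.size, replace=True)
            boot[b], *_ = np.linalg.lstsq(X, y_star, rcond=None)
        return boot.std(axis=0, ddof=1)

    # Toy stand-in for a linearized signal model: intercept and slope.
    X = np.column_stack([np.ones(50), np.linspace(0, 1, 50)])
    y = X @ np.array([1.0, 2.0]) + rng.normal(0, 0.1, 50)
    print(residual_bootstrap_se(X, y))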
Zhang, Bo; Liu, Wei; Zhang, Zhiwei; Qu, Yanping; Chen, Zhen; Albert, Paul S
2017-08-01
Joint modeling and within-cluster resampling are two approaches that are used for analyzing correlated data with informative cluster sizes. Motivated by a developmental toxicity study, we examined the performance and validity of these two approaches in testing covariate effects in generalized linear mixed-effects models. We show that the joint modeling approach is robust to the misspecification of cluster size models in terms of Type I and Type II errors when the corresponding covariates are not included in the random effects structure; otherwise, statistical tests may be affected. We also evaluate the performance of the within-cluster resampling procedure and thoroughly investigate its validity in modeling correlated data with informative cluster sizes. We show that within-cluster resampling is a valid alternative to joint modeling for cluster-specific covariates, but it is invalid for time-dependent covariates. The two methods are applied to a developmental toxicity study that investigated the effect of exposure to diethylene glycol dimethyl ether.
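The within-cluster resampling idea is easy to sketch: draw one observation per cluster so the resampled data are free of cluster-size information, fit a standard model, and average over many draws. Below, ordinary least squares stands in for the paper's generalized linear (mixed) models, and the data are invented:

    import numpy as np

    rng = np.random.default_rng(1)

    def within_cluster_resample(clusters, n_resamples=500):
        """Within-cluster resampling: draw ONE observation per cluster, fit a
        standard model to each draw, and average the resulting estimates."""
        estimates = []
        for _ in range(n_resamples):
            rows = np.array([c[rng.integers(len(c))] for c in clusters])
            X, y = rows[:, :-1], rows[:, -1]
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # stand-in for a GLM fit
            estimates.append(beta)
        return np.mean(estimates, axis=0)

    # Toy clusters (litters) whose size is informative; columns are
    # [intercept, dose, response].
    clusters = []
    for _ in range(40):
        size = rng.integers(2, 12)                         # cluster size varies
        dose = rng.uniform()
        y = 1.0 + 2.0 * dose - 0.05 * size + rng.normal(0, 0.1, size)
        clusters.append(np.column_stack([np.ones(size), np.full(size, dose), y]))
    print(within_cluster_resample(clusters))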
A CNN based Hybrid approach towards automatic image registration
NASA Astrophysics Data System (ADS)
Arun, Pattathal V.; Katiyar, Sunil K.
2013-06-01
Image registration is a key component of various image processing operations which involve the analysis of different image data sets. Automatic image registration domains have witnessed the application of many intelligent methodologies over the past decade; however, inability to properly model object shape as well as contextual information has limited the attainable accuracy. In this paper, we propose a framework for accurate feature shape modeling and adaptive resampling using advanced techniques such as Vector Machines, Cellular Neural Network (CNN), SIFT, coreset, and Cellular Automata. CNN has been found to be effective in improving the feature matching as well as resampling stages of registration, and the complexity of the approach has been considerably reduced using coreset optimization. The salient features of this work are cellular neural network approach based SIFT feature point optimization, adaptive resampling and intelligent object modelling. The developed methodology has been compared with contemporary methods using different statistical measures. Investigations over various satellite images revealed that considerable success was achieved with the approach. The system dynamically uses spectral and spatial information for representing contextual knowledge using a CNN-prolog approach. The methodology is also shown to be effective in providing intelligent interpretation and adaptive resampling.
permGPU: Using graphics processing units in RNA microarray association studies.
Shterev, Ivo D; Jung, Sin-Ho; George, Stephen L; Owzar, Kouros
2010-06-16
Many analyses of microarray association studies involve permutation, bootstrap resampling and cross-validation, which are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. We have developed a CUDA-based implementation, permGPU, that employs graphics processing units in microarray association studies. We illustrate the performance and applicability of permGPU within the context of permutation resampling for a number of test statistics. An extensive simulation study demonstrates a dramatic increase in performance when using permGPU on an NVIDIA GTX 280 card compared to an optimized C/C++ solution running on a conventional Linux server. permGPU is available as an open-source stand-alone application and as an extension package for the R statistical environment. It provides a dramatic increase in performance for permutation resampling analysis in the context of microarray association studies. The current version offers six test statistics for carrying out permutation resampling analyses for binary, quantitative and censored time-to-event traits.
The Bootstrap, the Jackknife, and the Randomization Test: A Sampling Taxonomy.
Rodgers, J L
1999-10-01
A simple sampling taxonomy is defined that shows the differences between and relationships among the bootstrap, the jackknife, and the randomization test. Each method has as its goal the creation of an empirical sampling distribution that can be used to test statistical hypotheses, estimate standard errors, and/or create confidence intervals. Distinctions between the methods can be made based on the sampling approach (with replacement versus without replacement) and the sample size (replacing the whole original sample versus replacing a subset of the original sample). The taxonomy is useful for teaching the goals and purposes of resampling schemes. An extension of the taxonomy implies other possible resampling approaches that have not previously been considered. Univariate and multivariate examples are presented.
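The taxonomy can be illustrated directly in code. Below is a minimal sketch of the three schemes applied to sample means; the data are invented:

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.normal(0.5, 1.0, size=25)      # group 1
    y = rng.normal(0.0, 1.0, size=25)      # group 2

    # Bootstrap: n draws WITH replacement, replacing the whole original sample.
    boot = np.array([rng.choice(x, size=x.size, replace=True).mean()
                     for _ in range(2000)])
    se_boot = boot.std(ddof=1)

    # Jackknife: n-1 draws WITHOUT replacement (leave-one-out subsets).
    jack = np.array([np.delete(x, i).mean() for i in range(x.size)])
    se_jack = np.sqrt((x.size - 1) * np.var(jack))

    # Randomization test: relabel the pooled sample WITHOUT replacement to
    # build the null distribution of the mean difference.
    pooled = np.concatenate([x, y])
    obs = x.mean() - y.mean()
    perm = []
    for _ in range(2000):
        p = rng.permutation(pooled)
        perm.append(p[:x.size].mean() - p[x.size:].mean())
    p_value = np.mean(np.abs(perm) >= abs(obs))
    print(se_boot, se_jack, p_value)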
Surface Fitting for Quasi Scattered Data from Coordinate Measuring Systems.
Mao, Qing; Liu, Shugui; Wang, Sen; Ma, Xinhui
2018-01-13
Non-uniform rational B-spline (NURBS) surface fitting from data points is widely used in the fields of computer aided design (CAD), medical imaging, cultural relic representation and object-shape detection. Usually, the measured data acquired from coordinate measuring systems are neither gridded nor completely scattered. The distribution of this kind of data is scattered in physical space, but the data points are stored in a way consistent with the order of measurement, so they are named quasi scattered data in this paper. Therefore they can be organized into rows easily, but the number of points in each row is random. In order to overcome the difficulty of surface fitting from this kind of data, a new method based on resampling is proposed. It consists of three major steps: (1) NURBS curve fitting for each row, (2) resampling on the fitted curve and (3) surface fitting from the resampled data. An iterative projection optimization scheme is applied in the first and third steps to yield an advisable parameterization and reduce the time cost of projection. A resampling approach based on parameters, local peaks and contour curvature is proposed to overcome the problems of node redundancy and high time consumption in the fitting of this kind of scattered data. Numerical experiments are conducted with both simulation and practical data, and the results show that the proposed method is fast, effective and robust. Moreover, by analyzing the fitting results acquired from data with different degrees of scatter, it is demonstrated that the error introduced by resampling is negligible and that the method is therefore feasible.
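A rough sketch of steps (1) and (2) follows, with SciPy's smoothing B-splines standing in for NURBS fitting and plain uniform-in-parameter resampling standing in for the paper's parameter/peak/curvature-based scheme:

    import numpy as np
    from scipy.interpolate import splprep, splev

    rng = np.random.default_rng(3)

    # One measured "row": ordered by acquisition but unevenly spaced and noisy.
    t = np.sort(rng.uniform(0, 2 * np.pi, 40))
    row = np.vstack([np.cos(t), np.sin(t), 0.1 * t]) + rng.normal(0, 0.005, (3, 40))

    # Step (1): fit a smoothing spline curve to the row.
    tck, u = splprep(row, s=0.01)

    # Step (2): resample the fitted curve so that every row contributes the
    # same number of points to the subsequent surface fit.
    resampled = np.array(splev(np.linspace(0, 1, 25), tck))   # shape (3, 25)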
NASA Astrophysics Data System (ADS)
Kim, Young-Rok; Park, Eunseo; Choi, Eun-Jung; Park, Sang-Young; Park, Chandeok; Lim, Hyung-Chul
2014-09-01
In this study, a genetic resampling (GRS) approach is utilized for precise orbit determination (POD) using a batch filter based on particle filtering (PF). Two genetic operations, arithmetic crossover and residual mutation, are used for GRS of the batch filter based on PF (PF batch filter). For POD, the Laser-ranging Precise Orbit Determination System (LPODS) and satellite laser ranging (SLR) observations of the CHAMP satellite are used. Monte Carlo trials for POD are performed one hundred times. The characteristics of the POD results by the PF batch filter with GRS are compared with those of a PF batch filter with minimum residual resampling (MRRS). The post-fit residual, 3D error by external orbit comparison, and POD repeatability are analyzed for orbit quality assessments. The POD results are externally checked against NASA JPL's orbits, produced using totally different software, measurements, and techniques. For post-fit residuals and 3D errors, both MRRS and GRS give accurate estimation results whose mean root mean square (RMS) values are at a level of 5 cm and 10-13 cm, respectively. The mean radial orbit errors of both methods are at a level of 5 cm. For POD repeatability, represented as the standard deviations of post-fit residuals and 3D errors over repeated PODs, GRS yields 25% and 13% more robust estimation results than MRRS for post-fit residual and 3D error, respectively. This study shows that the PF batch filter with the GRS approach using genetic operations is superior to the PF batch filter with MRRS in terms of robustness in POD with SLR observations.
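The abstract does not spell out the genetic operators, so the following is only a generic sketch of a genetic-style particle resampling step in that spirit: parents selected by weight, arithmetic (convex) crossover, and a fixed-scale Gaussian perturbation standing in for residual mutation:

    import numpy as np

    rng = np.random.default_rng(4)

    def genetic_resample(particles, weights, mutation_scale=0.01):
        """Weight-proportional parent selection, arithmetic crossover and a
        Gaussian mutation (a stand-in for the paper's residual mutation)."""
        n = len(particles)
        pa = rng.choice(n, size=n, p=weights)            # parent set A
        pb = rng.choice(n, size=n, p=weights)            # parent set B
        alpha = rng.uniform(size=(n, 1))                 # per-child blend factor
        children = alpha * particles[pa] + (1 - alpha) * particles[pb]
        return children + mutation_scale * rng.normal(size=children.shape)

    # 100 six-dimensional state particles (e.g. position and velocity).
    states = rng.normal(size=(100, 6))
    w = rng.uniform(size=100)
    w /= w.sum()
    new_states = genetic_resample(states, w)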
Approximate median regression for complex survey data with skewed response.
Fraser, Raphael André; Lipsitz, Stuart R; Sinha, Debajyoti; Fitzmaurice, Garrett M; Pan, Yi
2016-12-01
The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics using regression models. Complex surveys can be used to identify risk factors for important diseases such as cancer. Existing statistical methods based on estimating equations and/or utilizing resampling methods are often not valid with survey data due to complex survey design features, that is, stratification, multistage sampling, and weighting. In this article, we accommodate these design features in the analysis of highly skewed response variables arising from large complex surveys. Specifically, we propose a double-transform-both-sides (DTBS)-based estimating equations approach to estimate the median regression parameters of the highly skewed response; the DTBS approach applies the same Box-Cox type transformation twice to both the outcome and regression function. The usual sandwich variance estimate can be used in our approach, whereas a resampling approach would be needed for a pseudo-likelihood based on minimizing absolute deviations (MAD). Furthermore, the approach is relatively robust to the true underlying distribution, and has much smaller mean square error than a MAD approach. The method is motivated by an analysis of laboratory data on urinary iodine (UI) concentration from the National Health and Nutrition Examination Survey. © 2016, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Khaki, M.; Hoteit, I.; Kuhn, M.; Awange, J.; Forootan, E.; van Dijk, A. I. J. M.; Schumacher, M.; Pattiaratchi, C.
2017-09-01
The time-variable terrestrial water storage (TWS) products from the Gravity Recovery And Climate Experiment (GRACE) have been increasingly used in recent years to improve the simulation of hydrological models by applying data assimilation techniques. In this study, for the first time, we assess the performance of the most popular sequential data assimilation techniques for integrating GRACE TWS into the World-Wide Water Resources Assessment (W3RA) model. We implement and test stochastic and deterministic ensemble-based Kalman filters (EnKF), as well as particle filters (PF) using two different resampling approaches, multinomial resampling and systematic resampling. These choices provide various opportunities for weighting observations and model simulations during the assimilation and also for accounting for error distributions. In particular, the deterministic EnKF is tested to avoid perturbing observations before assimilation (as is the case in an ordinary EnKF). Gaussian-based random updates in the EnKF approaches likely do not fully represent the statistical properties of the model simulations and TWS observations. Therefore, the fully non-Gaussian PF is also applied to estimate more realistic updates. Monthly GRACE TWS are assimilated into W3RA covering the entire Australia. To evaluate the filters' performances and analyze their impact on model simulations, their estimates are validated against independent in-situ measurements. Our results indicate that all implemented filters improve the estimation of water storage simulations of W3RA. The best results are obtained using two versions of the deterministic EnKF, the Square Root Analysis (SQRA) scheme and the Ensemble Square Root Filter (EnSRF), which reduce the model's groundwater estimation errors by 34% and 31%, respectively, compared to a model run without assimilation. Applying the PF along with systematic resampling successfully decreases the model estimation error by 23%.
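The two PF resampling schemes named above are standard and compact enough to sketch; this is a generic textbook version, not the study's code:

    import numpy as np

    rng = np.random.default_rng(5)

    def multinomial_resample(weights):
        """Draw each particle index independently, proportional to weight."""
        return rng.choice(len(weights), size=len(weights), p=weights)

    def systematic_resample(weights):
        """One random offset, then evenly spaced pointers through the weight
        CDF; lower variance than multinomial resampling."""
        n = len(weights)
        positions = (rng.uniform() + np.arange(n)) / n
        return np.searchsorted(np.cumsum(weights), positions)

    w = np.array([0.5, 0.3, 0.1, 0.05, 0.05])
    print(multinomial_resample(w), systematic_resample(w))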
Bishara, Anthony J; Hittner, James B
2012-09-01
It is well known that when data are nonnormally distributed, a test of the significance of Pearson's r may inflate Type I error rates and reduce power. Statistics textbooks and the simulation literature provide several alternatives to Pearson's correlation. However, the relative performance of these alternatives has been unclear. Two simulation studies were conducted to compare 12 methods, including Pearson, Spearman's rank-order, transformation, and resampling approaches. With most sample sizes (n ≥ 20), Type I and Type II error rates were minimized by transforming the data to a normal shape prior to assessing the Pearson correlation. Among transformation approaches, a general purpose rank-based inverse normal transformation (i.e., transformation to rankit scores) was most beneficial. However, when samples were both small (n ≤ 10) and extremely nonnormal, the permutation test often outperformed other alternatives, including various bootstrap tests.
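A rank-based inverse normal transformation is only a few lines. The sketch below uses the rankit variant named in the abstract, with mid-ranks for ties; the data are invented:

    import numpy as np
    from scipy.stats import norm, rankdata

    def rankit(x):
        """Rank-based inverse normal transformation: map mid-ranks to normal
        quantiles via (r - 0.5) / n (rankit scores)."""
        r = rankdata(x)
        return norm.ppf((r - 0.5) / len(x))

    # Skewed data become approximately normal scores; then use Pearson's r.
    rng = np.random.default_rng(6)
    x, y = rng.lognormal(size=50), rng.lognormal(size=50)
    print(np.corrcoef(rankit(x), rankit(y))[0, 1])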
Image re-sampling detection through a novel interpolation kernel.
Hilal, Alaa
2018-06-01
Image re-sampling involved in re-size and rotation transformations is an essential building block in typical digital image alteration. Fortunately, traces left by such processes are detectable, proving that the image has undergone a re-sampling transformation. Within this context, we present in this paper two original contributions. First, we propose a new re-sampling interpolation kernel. It depends on five independent parameters that control its amplitude, angular frequency, standard deviation, and duration. We then demonstrate its capacity to imitate the behavior of the most frequent interpolation kernels used in digital image re-sampling applications. Second, the proposed model is used to characterize and detect the correlation coefficients involved in re-sampling transformations. The process involves minimization of an error function using the gradient method. The proposed method is assessed over a large database of 11,000 re-sampled images. Additionally, it is implemented within an algorithm in order to assess images that have undergone complex transformations. The obtained results demonstrate better performance and reduced processing time when compared to a reference method, validating the suitability of the proposed approaches. Copyright © 2018 Elsevier B.V. All rights reserved.
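The paper's exact kernel is not given in the abstract, so the sketch below uses a hypothetical Gaussian-damped cosine with amplitude, angular frequency, phase and spread parameters (support held fixed) to illustrate the stated idea of imitating a common kernel by minimizing an error function:

    import numpy as np
    from scipy.optimize import least_squares

    def keys_cubic(t, a=-0.5):
        """Keys' cubic convolution kernel, a common re-sampling interpolator."""
        t = np.abs(t)
        return np.where(t <= 1, (a + 2) * t**3 - (a + 3) * t**2 + 1,
                        np.where(t < 2, a * (t**3 - 5 * t**2 + 8 * t - 4), 0.0))

    def damped_cos_kernel(p, t, half_support=2.0):
        """Hypothetical kernel: amplitude a, angular frequency w, phase c,
        spread s; the support (duration) is held fixed in this sketch."""
        a, w, c, s = p
        envelope = np.exp(-t**2 / (2 * s**2)) * (np.abs(t) <= half_support)
        return a * np.cos(w * t + c) * envelope

    # Fit the parametric kernel to the cubic kernel by error minimization.
    t = np.linspace(-2, 2, 201)
    fit = least_squares(lambda p: damped_cos_kernel(p, t) - keys_cubic(t),
                        x0=[1.0, 2.0, 0.0, 1.0])
    print(fit.x, np.abs(fit.fun).max())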
ERIC Educational Resources Information Center
Hsieh, Chueh-an; Xu, Xueli; von Davier, Matthias
2010-01-01
This paper presents an application of a jackknifing approach to variance estimation of ability inferences for groups of students, using a multidimensional discrete model for item response data. The data utilized to demonstrate the approach come from the National Assessment of Educational Progress (NAEP). In contrast to the operational approach…
NASA Astrophysics Data System (ADS)
Jannati, Mojtaba; Valadan Zoej, Mohammad Javad; Mokhtarzade, Mehdi
2018-03-01
This paper presents a novel approach to epipolar resampling of cross-track linear pushbroom imagery using the orbital parameters model (OPM). The backbone of the proposed method relies on modifying the attitude parameters of linear array stereo imagery in such a way as to parallelize the approximate conjugate epipolar lines (ACELs) with the instantaneous baseline (IBL) of the conjugate image points (CIPs). Afterward, a complementary rotation is applied in order to parallelize all the ACELs throughout the stereo imagery. The new estimated attitude parameters are evaluated based on the direction of the IBL and the ACELs. Due to the spatial and temporal variability of the IBL (respectively, changes in the column and row numbers of the CIPs) and the nonparallel nature of the epipolar lines in stereo linear images, polynomials in both the column and row numbers of the CIPs are used to model the new attitude parameters. As the instantaneous positions of the sensors remain fixed, the digital elevation model (DEM) of the area of interest is not required in the resampling process. According to the experimental results obtained from two pairs of SPOT and RapidEye stereo imagery with high elevation relief, the average absolute values of the remaining vertical parallaxes of CIPs in the normalized images were 0.19 and 0.28 pixels, respectively, which confirms the high accuracy and applicability of the proposed method.
A scale-invariant change detection method for land use/cover change research
NASA Astrophysics Data System (ADS)
Xing, Jin; Sieber, Renee; Caelli, Terrence
2018-07-01
Land Use/Cover Change (LUCC) detection relies increasingly on comparing remote sensing images with different spatial and spectral scales. Based on scale-invariant image analysis algorithms in computer vision, we propose a scale-invariant LUCC detection method to identify changes from scale-heterogeneous images. This method is composed of an entropy-based spatial decomposition; two scale-invariant feature extraction methods, the Maximally Stable Extremal Region (MSER) and Scale-Invariant Feature Transformation (SIFT) algorithms; a spatial regression voting method to integrate MSER and SIFT results; a Markov Random Field-based smoothing method; and a support vector machine classification method to assign LUCC labels. We test the scale invariance of our new method with a LUCC case study in Montreal, Canada, 2005-2012. We found that the scale-invariant LUCC detection method provides accuracy similar to the resampling-based approach while avoiding the LUCC distortion incurred by resampling.
Assessing uncertainties in surface water security: An empirical multimodel approach
NASA Astrophysics Data System (ADS)
Rodrigues, Dulce B. B.; Gupta, Hoshin V.; Mendiondo, Eduardo M.; Oliveira, Paulo Tarso S.
2015-11-01
Various uncertainties are involved in the representation of processes that characterize interactions among societal needs, ecosystem functioning, and hydrological conditions. Here we develop an empirical uncertainty assessment of water security indicators that characterize scarcity and vulnerability, based on a multimodel and resampling framework. We consider several uncertainty sources including those related to (i) observed streamflow data; (ii) hydrological model structure; (iii) residual analysis; (iv) the method for defining the Environmental Flow Requirement; (v) the definition of critical conditions for water provision; and (vi) the critical demand imposed by human activities. We estimate the overall hydrological model uncertainty by means of a residual bootstrap resampling approach, and by uncertainty propagation through different methodological arrangements applied to a 291 km2 agricultural basin within the Cantareira water supply system in Brazil. Together, the two-component hydrograph residual analysis and the block bootstrap resampling approach result in a more accurate and precise estimate of the uncertainty (95% confidence intervals) in the simulated time series. We then compare the uncertainty estimates associated with water security indicators using a multimodel framework with the uncertainty estimates provided by each model uncertainty estimation approach. The range of values obtained for the water security indicators suggests that the models/methods are robust and perform well in a range of plausible situations. The method is general and can be easily extended, thereby forming the basis for meaningful support to end-users facing water resource challenges by enabling them to incorporate a viable uncertainty analysis into a robust decision-making process.
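The residual block-bootstrap ingredient can be sketched compactly. Block length, the additive recombination and the toy data below are assumptions, not the authors' exact two-component scheme:

    import numpy as np

    rng = np.random.default_rng(7)

    def block_bootstrap_ensemble(simulated, residuals, block_len=30, n_series=200):
        """Resample model residuals in contiguous blocks (moving-block
        bootstrap) to preserve their autocorrelation, and add them back onto
        the simulated flows to build an ensemble of plausible series."""
        n = len(residuals)
        n_blocks = int(np.ceil(n / block_len))
        ensemble = np.empty((n_series, n))
        for i in range(n_series):
            starts = rng.integers(0, n - block_len + 1, size=n_blocks)
            boot = np.concatenate([residuals[s:s + block_len] for s in starts])
            ensemble[i] = simulated + boot[:n]
        return ensemble

    # Toy daily flows with autocorrelated residuals; 95% uncertainty band.
    sim = 10 + np.sin(np.linspace(0, 20, 730))
    resid = np.convolve(rng.normal(0, 0.5, 759), np.ones(30) / 30, mode="valid")
    band = np.percentile(block_bootstrap_ensemble(sim, resid), [2.5, 97.5], axis=0)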
NASA Astrophysics Data System (ADS)
Ruggeri, Paolo; Irving, James; Holliger, Klaus
2015-08-01
We critically examine the performance of sequential geostatistical resampling (SGR) as a model proposal mechanism for Bayesian Markov-chain-Monte-Carlo (MCMC) solutions to near-surface geophysical inverse problems. Focusing on a series of simple yet realistic synthetic crosshole georadar tomographic examples characterized by different numbers of data, levels of data error and degrees of model parameter spatial correlation, we investigate the efficiency of three different resampling strategies with regard to their ability to generate statistically independent realizations from the Bayesian posterior distribution. Quite importantly, our results show that, no matter what resampling strategy is employed, many of the examined test cases require an unreasonably high number of forward model runs to produce independent posterior samples, meaning that the SGR approach as currently implemented will not be computationally feasible for a wide range of problems. Although use of a novel gradual-deformation-based proposal method can help to alleviate these issues, it does not offer a full solution. Further, the nature of the SGR is found to strongly influence MCMC performance; however, no clear rule exists as to what set of inversion parameters and/or overall proposal acceptance rate will allow for the most efficient implementation. We conclude that although the SGR methodology is highly attractive, as it allows for the consideration of complex geostatistical priors as well as conditioning to hard and soft data, further developments are necessary in the context of novel or hybrid MCMC approaches for it to be considered generally suitable for near-surface geophysical inversions.
Adaptive topographic mass correction for satellite gravity and gravity gradient data
NASA Astrophysics Data System (ADS)
Holzrichter, Nils; Szwillus, Wolfgang; Götze, Hans-Jürgen
2014-05-01
Subsurface modelling with gravity data requires a reliable topographic mass correction. For decades, this mandatory step has been a standard procedure. However, the original methods were developed for local terrestrial surveys. Therefore, these methods often include defaults such as a limited correction area of 167 km around an observation point, resampling topography depending on the distance to the station, or disregarding the curvature of the earth. New satellite gravity data (e.g. GOCE) can be used for large-scale lithospheric modelling with gravity data. The investigation areas can span thousands of kilometres. In addition, measurements are located at the flight height of the satellite (e.g. ~250 km for GOCE). The standard definition of the correction area and the specific grid spacing around an observation point was not developed for stations located at these heights and for areas of these dimensions. This calls for a re-evaluation of the defaults used for topographic correction. We developed an algorithm which resamples the topography based on an adaptive approach. Instead of resampling topography depending on the distance to the station, the grids are resampled depending on their influence at the station. Therefore, the only value the user has to define is the desired accuracy of the topographic correction. It is not necessary to define the grid spacing or a limited correction area. Furthermore, the algorithm calculates the topographic mass response with a spherically shaped polyhedral body. We show examples for local and global gravity datasets and compare the results of the topographic mass correction to existing approaches. We provide suggestions for how satellite gravity and gradient data should be corrected.
Dudoit, Sandrine; Gilbert, Houston N.; van der Laan, Mark J.
2014-01-01
This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP(q, g) = Pr(g(Vn, Sn) > q), and generalized expected value (gEV) error rates, gEV(g) = E[g(Vn, Sn)], for arbitrary functions g(Vn, Sn) of the numbers of false positives Vn and true positives Sn. Of particular interest are error rates based on the proportion g(Vn, Sn) = Vn/(Vn + Sn) of Type I errors among the rejected hypotheses, such as the false discovery rate (FDR), FDR = E[Vn/(Vn + Sn)]. The proposed procedures offer several advantages over existing methods. They provide Type I error control for general data generating distributions, with arbitrary dependence structures among variables. Gains in power are achieved by deriving rejection regions based on guessed sets of true null hypotheses and null test statistics randomly sampled from joint distributions that account for the dependence structure of the data. The Type I error and power properties of an FDR-controlling version of the resampling-based empirical Bayes approach are investigated and compared to those of widely-used FDR-controlling linear step-up procedures in a simulation study. The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure. PMID:18932138
Lawrence, Gregory B.; Fernandez, Ivan J.; Richter, Daniel D.; Ross, Donald S.; Hazlett, Paul W.; Bailey, Scott W.; Ouimet, Rock; Warby, Richard A.F.; Johnson, Arthur H.; Lin, Henry; Kaste, James M.; Lapenis, Andrew G.; Sullivan, Timothy J.
2013-01-01
Environmental change is monitored in North America through repeated measurements of weather, stream and river flow, air and water quality, and most recently, soil properties. Some skepticism remains, however, about whether repeated soil sampling can effectively distinguish between temporal and spatial variability, and efforts to document soil change in forest ecosystems through repeated measurements are largely nascent and uncoordinated. In eastern North America, repeated soil sampling has begun to provide valuable information on environmental problems such as air pollution. This review synthesizes the current state of the science to further the development and use of soil resampling as an integral method for recording and understanding environmental change in forested settings. The origins of soil resampling reach back to the 19th century in England and Russia. The concepts and methodologies involved in forest soil resampling are reviewed and evaluated through a discussion of how temporal and spatial variability can be addressed with a variety of sampling approaches. Key resampling studies demonstrate the type of results that can be obtained through differing approaches. Ongoing, large-scale issues such as recovery from acidification, long-term N deposition, C sequestration, effects of climate change, impacts from invasive species, and the increasing intensification of soil management all warrant the use of soil resampling as an essential tool for environmental monitoring and assessment. Furthermore, with better awareness of the value of soil resampling, studies can be designed with a long-term perspective so that information can be efficiently obtained well into the future to address problems that have not yet surfaced.
Estimator banks: a new tool for direction-of-arrival estimation
NASA Astrophysics Data System (ADS)
Gershman, Alex B.; Boehme, Johann F.
1997-10-01
A new powerful tool for improving the threshold performance of direction-of-arrival (DOA) estimation is considered. The essence of our approach is to reduce the number of outliers in the threshold domain using a so-called estimator bank containing multiple 'parallel' underlying DOA estimators which are based on pseudorandom resampling of the MUSIC spatial spectrum for a given data batch or sample covariance matrix. To improve the threshold performance relative to conventional MUSIC, evolutionary principles are used, i.e., only 'successful' underlying estimators (having no failure in the preliminarily estimated source localization sectors) are exploited in the final estimate. An efficient beamspace root implementation of the estimator bank approach is developed, combined with the array interpolation technique which enables the application to arbitrary arrays. A higher-order extension of our approach is also presented, where the cumulant-based MUSIC estimator is exploited as a basic technique for spatial spectrum resampling. Simulations and experimental data processing show that our algorithm performs well below the MUSIC threshold, namely, it has threshold performance similar to that of the stochastic ML method. At the same time, the computational cost of our algorithm is much lower than that of stochastic ML because no multidimensional optimization is involved.
Katayama, R; Sakai, S; Sakaguchi, T; Maeda, T; Takada, K; Hayabuchi, N; Morishita, J
2008-07-20
PURPOSE/AIM OF THE EXHIBIT: The purpose of this exhibit is: 1. To explain "resampling", an image data processing step performed by digital radiographic systems based on flat panel detectors (FPD). 2. To show the influence of "resampling" on the basic imaging properties. 3. To present accurate measurement methods for the basic imaging properties of the FPD system. 1. The relationship between the matrix sizes of the output image and the image data acquired on the FPD, which automatically changes depending on the selected image size (FOV). 2. The explanation of the image data processing of "resampling". 3. The evaluation results of the basic imaging properties of the FPD system using two types of DICOM images to which "resampling" was applied: characteristic curves, presampled MTFs, noise power spectra, and detective quantum efficiencies. CONCLUSION/SUMMARY: The major points of the exhibit are as follows: 1. The influence of "resampling" should not be disregarded in the evaluation of the basic imaging properties of the flat panel detector system. 2. It is necessary for the basic imaging properties to be measured using DICOM images to which no "resampling" has been applied.
Comment on: 'A Poisson resampling method for simulating reduced counts in nuclear medicine images'.
de Nijs, Robin
2015-07-21
In order to be able to calculate half-count images from already acquired data, White and Lawson published their method based on Poisson resampling. They verified their method experimentally by measurements with a Co-57 flood source. In this comment their results are reproduced and confirmed by a direct numerical simulation in Matlab. Not only Poisson resampling, but also two direct redrawing methods were investigated. The redrawing methods were based on a Poisson and a Gaussian distribution. Mean, standard deviation, skewness and excess kurtosis half-count/full-count ratios were determined for all methods and compared to the theoretical values for a Poisson distribution. The statistical parameters showed the same behavior as in the original note and showed the superiority of the Poisson resampling method. Rounding off before saving the half-count image had a severe impact on counting statistics for counts below 100. Only Poisson resampling was not affected by this, while Gaussian redrawing was less affected by it than Poisson redrawing. Poisson resampling is the method of choice when simulating half-count (or less) images from full-count images. It correctly simulates the statistical properties, even in the case of rounding off of the images.
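White and Lawson's Poisson resampling amounts, as far as I can tell, to binomial thinning of each pixel's counts, which preserves the Poisson distribution exactly; the sketch below contrasts it with the two redrawing schemes examined in the comment:

    import numpy as np

    rng = np.random.default_rng(8)
    full = rng.poisson(lam=50.0, size=1_000_000)    # "full-count" flood image

    # Poisson resampling as binomial thinning: keep each recorded count with
    # probability f, so a Poisson(lam) pixel stays exactly Poisson(f * lam).
    f = 0.5
    half_thin = rng.binomial(full, f)

    # Direct redrawing alternatives examined in the comment.
    half_pois = rng.poisson(f * full)                               # Poisson redraw
    half_gaus = np.round(rng.normal(f * full, np.sqrt(f * full)))   # Gaussian redraw

    for name, img in [("thinning", half_thin), ("poisson", half_pois),
                      ("gaussian", half_gaus)]:
        print(name, img.mean(), img.var())   # only thinning keeps mean == variance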
NASA Astrophysics Data System (ADS)
Müller, H.; Haberlandt, U.
2018-01-01
Rainfall time series of high temporal resolution and spatial density are crucial for urban hydrology. The multiplicative random cascade model can be used for temporal rainfall disaggregation of daily data to generate such time series. Here, the uniform splitting approach with a branching number of 3 in the first disaggregation step is applied. To achieve a final resolution of 5 min, subsequent steps after disaggregation are necessary. Three modifications at different disaggregation levels are tested in this investigation (uniform splitting at Δt = 15 min, linear interpolation at Δt = 7.5 min and Δt = 3.75 min). Results are compared both with observations and with an often used approach based on the assumption that time steps of Δt = 5.625 min, which result if a branching number of 2 is applied throughout, can be treated as Δt = 5 min (called the 1280 min approach). Spatial consistence is implemented in the disaggregated time series using a resampling algorithm. In total, 24 recording stations in Lower Saxony, Northern Germany, with a 5 min resolution have been used for the validation of the disaggregation procedure. The urban-hydrological suitability is tested with an artificial combined sewer system of about 170 hectares. The results show that all three variations outperform the 1280 min approach regarding the reproduction of wet spell duration, average intensity, fraction of dry intervals and lag-1 autocorrelation. Extreme values with durations of 5 min are also better represented. For durations of 1 h, all approaches show only slight deviations from the observed extremes. The applied resampling algorithm is capable of achieving sufficient spatial consistence. The effects on the urban hydrological simulations are significant. Without spatial consistence, flood volumes of manholes and combined sewer overflow are strongly underestimated. After resampling, results using disaggregated time series as input are in the range of those using observed time series. The best overall performance regarding rainfall statistics is obtained by the method in which the disaggregation process ends at time steps of 7.5 min duration, deriving the 5 min time steps by linear interpolation. With subsequent resampling, this method leads to a good representation of manhole flooding and combined sewer overflow volume in the hydrological simulations and outperforms the 1280 min approach.
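One cascade pass for the first variant (uniform splitting at Δt = 15 min) can be sketched as follows; symmetric Dirichlet weights are only a placeholder for the calibrated, level-dependent cascade weights of the actual model:

    import numpy as np

    rng = np.random.default_rng(9)

    def cascade_step(series, b):
        """One disaggregation step: split every interval into b sub-intervals
        with random weights summing to one, so rainfall mass is conserved."""
        w = rng.dirichlet(np.ones(b), size=series.size)   # placeholder weights
        return (series[:, None] * w).ravel()

    daily = np.array([12.0, 0.0, 3.5])        # mm/day, i.e. 1440 min per value
    x = cascade_step(daily, b=3)              # 1440 min -> 480 min
    for _ in range(5):                        # 480 -> 240 -> 120 -> 60 -> 30 -> 15
        x = cascade_step(x, b=2)
    five_min = np.repeat(x, 3) / 3.0          # uniform split: 15 min -> 3 x 5 min
    print(five_min.size, np.isclose(five_min.sum(), daily.sum()))  # 288 values/day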
A multistate dynamic site occupancy model for spatially aggregated sessile communities
Fukaya, Keiichi; Royle, J. Andrew; Okuda, Takehiro; Nakaoka, Masahiro; Noda, Takashi
2017-01-01
Estimation of transition probabilities of sessile communities seems easy in principle but may still be difficult in practice because resampling error (i.e. a failure to resample exactly the same location at fixed points) may cause significant estimation bias. Previous studies have developed novel analytical methods to correct for this estimation bias. However, they did not consider the local structure of community composition induced by the aggregated distribution of organisms that is typically observed in sessile assemblages and is very likely to affect observations. We developed a multistate dynamic site occupancy model to estimate transition probabilities that accounts for resampling errors associated with local community structure. The model applies a nonparametric multivariate kernel smoothing methodology to the latent occupancy component to estimate the local state composition near each observation point, which is assumed to determine the probability distribution of data conditional on the occurrence of resampling error. By using computer simulations, we confirmed that an observation process that depends on local community structure may bias inferences about transition probabilities. By applying the proposed model to a real data set of intertidal sessile communities, we also showed that estimates of transition probabilities and of the properties of community dynamics may differ considerably when spatial dependence is taken into account. Results suggest the importance of accounting for resampling error and local community structure for developing management plans that are based on Markovian models. Our approach provides a solution to this problem that is applicable to broad sessile communities. It can even accommodate an anisotropic spatial correlation of species composition, and may also serve as a basis for inferring complex nonlinear ecological dynamics.
NASA Astrophysics Data System (ADS)
Baisden, W. T.; Prior, C.; Lambie, S.; Tate, K.; Bruhn, F.; Parfitt, R.; Schipper, L.; Wilde, R. H.; Ross, C.
2006-12-01
Soil organic matter contains more C than terrestrial biomass and atmospheric CO2 combined, and reacts to climate and land-use change on timescales requiring long-term experiments or monitoring. The direction and uncertainty of soil C stock changes has been difficult to predict and incorporate in decision support tools for climate change policies. Moreover, standardization of approaches has been difficult because historic methods of soil sampling have varied regionally, nationally and temporally. The most common and uniform type of historic sampling is soil profiles, which have commonly been collected, described and archived in the course of both soil survey studies and research. Resampling soil profiles has considerable utility in carbon monitoring and in parameterizing models to understand the ecosystem responses to global change. Recent work spanning seven soil orders in New Zealand's grazed pastures has shown that, averaged over approximately 20 years, 31 soil profiles lost 106 g C m-2 y-1 (p=0.01) and 9.1 g N m-2 y-1 (p=0.002). These losses are unexpected and appear to extend well below the upper 30 cm of soil. Following on these recent results, additional advantages of resampling soil profiles can be emphasized. One of the most powerful applications afforded by resampling archived soils is the use of the pulse label of radiocarbon injected into the atmosphere by thermonuclear weapons testing circa 1963 as a tracer of soil carbon dynamics. This approach allows estimation of the proportion of soil C that is `passive' or `inert' and therefore unlikely to respond to global change. Evaluation of resampled soil horizons in a New Zealand soil chronosequence confirms that the approach yields consistent values for the proportion of `passive' soil C, reaching 25% of surface horizon soil C over 12,000 years. Across whole profiles, radiocarbon data suggest that the proportion of `passive' C in New Zealand grassland soil can be less than 40% of total soil C. Below 30 cm, 1 kg C m-2 or more may be reactive on decadal timescales, supporting evidence of soil C losses from throughout the soil profiles. Information from resampled soil profiles can be combined with additional contemporary measurements to test hypotheses about mechanisms for soil C changes. For example, Δ14C in excess of 200‰ in water extractable dissolved organic C (DOC) from surface soil horizons supports the hypothesis that decadal movement of DOC represents an important translocation of soil C. These preliminary results demonstrate that resampling whole soil profiles can support substantial progress in C cycle science, ranging from updating operational C accounting systems to the frontiers of research. Resampling can be complementary or superior to fixed-depth interval sampling of surface soil layers. Resampling must however be undertaken with relative urgency to maximize the potential interpretive power of bomb-derived radiocarbon.
Adaptive Resampling Particle Filters for GPS Carrier-Phase Navigation and Collision Avoidance System
NASA Astrophysics Data System (ADS)
Hwang, Soon Sik
This dissertation addresses three problems: 1) an adaptive resampling technique (ART) for particle filters, 2) precise relative positioning using Global Positioning System (GPS) carrier-phase (CP) measurements, applied to the nonlinear integer resolution problem in GPS CP navigation using particle filters, and 3) a collision detection system based on GPS CP broadcasts. First, Monte Carlo filters, called particle filters (PF), are widely used where the system is non-linear and non-Gaussian. In real-time applications, their estimation accuracies and efficiencies are significantly affected by the number of particles and by the scheduling of relocating weights and samples, the so-called resampling step. In this dissertation, the appropriate number of particles is estimated adaptively such that the errors of the sample mean and variance stay within bounds. These bounds are given by the confidence interval of a normal probability distribution for a multivariate state. The two sample sizes required to keep the mean error and the variance error within the bounds are derived. Resampling is triggered when the required sample number for the variance error crosses the required sample number for the mean error. Second, the PF using GPS CP measurements with adaptive resampling is applied to precise relative navigation between two GPS antennas. In order to make use of CP measurements for navigation, the unknown number of cycles between GPS antennas, the so-called integer ambiguity, must be resolved. The PF is applied to this integer ambiguity resolution problem, where the relative navigation state estimation involves nonlinear observations and nonlinear dynamics equations. Using the PF, the probability density function of the states is estimated by sampling from the position and velocity space, and the integer ambiguities are resolved without the usual hypothesis tests used to search for the integer ambiguity. The ART manages the number of position samples and the frequency of the resampling step for real-time kinematic GPS navigation. The experimental results demonstrate the performance of the ART and the insensitivity of the proposed approach to GPS CP cycle-slips. Third, GPS has great potential for the development of new collision avoidance systems and is being considered for the next-generation Traffic alert and Collision Avoidance System (TCAS). Current TCAS equipment is capable of broadcasting GPS code information to nearby airplanes, and collision avoidance systems using navigation information based on GPS code have been studied by researchers. In this dissertation, an aircraft collision detection system using GPS CP information is addressed. The PF with position samples is employed for the CP-based relative position estimation problem, and the same algorithm can be used to determine the vehicle attitude if multiple GPS antennas are used. For a reliable and enhanced collision avoidance system, three-dimensional trajectories are projected using the estimates of the relative position, velocity, and attitude. It is shown that the performance of the GPS CP-based collision detection algorithm meets the accuracy requirements for a precision approach with auto-landing, with significantly fewer unnecessary collision false alarms and no missed alarms.
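The ART above triggers resampling from confidence bounds on the sample mean and variance; the sketch below instead uses the more common effective-sample-size criterion as a generic stand-in, simply to show where such an adaptive trigger sits in a particle filter loop:

    import numpy as np

    rng = np.random.default_rng(1)

    def ess(w):
        """Effective sample size of normalized weights."""
        return 1.0 / np.sum(w ** 2)

    # Toy 1D random-walk state with noisy observations.
    n, T = 500, 50
    x = rng.normal(0.0, 1.0, n)                 # particles
    w = np.full(n, 1.0 / n)                     # weights
    truth, est = 0.0, []
    for t in range(T):
        truth += rng.normal(0, 0.3)
        y = truth + rng.normal(0, 0.5)          # observation
        x += rng.normal(0, 0.3, n)              # propagate particles
        w *= np.exp(-0.5 * ((y - x) / 0.5) ** 2)  # Gaussian likelihood
        w /= w.sum()
        if ess(w) < n / 2:                      # resample only when needed
            idx = rng.choice(n, size=n, p=w)    # multinomial resampling
            x, w = x[idx], np.full(n, 1.0 / n)
        est.append(np.sum(w * x))
    print(est[-1], truth)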
Propagating probability distributions of stand variables using sequential Monte Carlo methods
Jeffrey H. Gove
2009-01-01
A general probabilistic approach to stand yield estimation is developed based on sequential Monte Carlo filters, also known as particle filters. The essential steps in the development of the sampling importance resampling (SIR) particle filter are presented. The SIR filter is then applied to simulated and observed data showing how the 'predictor - corrector'...
Introduction to Permutation and Resampling-Based Hypothesis Tests
ERIC Educational Resources Information Center
LaFleur, Bonnie J.; Greevy, Robert A.
2009-01-01
A resampling-based method of inference--permutation tests--is often used when distributional assumptions are questionable or unmet. Not only are these methods useful for obvious departures from parametric assumptions (e.g., normality) and small sample sizes, but they are also more robust than their parametric counterparts in the presence of…
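A minimal example of the idea: a two-sample permutation test for a difference in means, requiring no distributional assumptions:

    import numpy as np

    rng = np.random.default_rng(7)

    def perm_test_mean_diff(a, b, n_perm=10000):
        """Two-sample permutation test for a difference in means."""
        observed = a.mean() - b.mean()
        pooled = np.concatenate([a, b])
        count = 0
        for _ in range(n_perm):
            perm = rng.permutation(pooled)
            diff = perm[:a.size].mean() - perm[a.size:].mean()
            if abs(diff) >= abs(observed):
                count += 1
        return observed, (count + 1) / (n_perm + 1)

    a = rng.normal(0.5, 1.0, 12)   # small samples, normality not needed
    b = rng.normal(0.0, 1.0, 15)
    print(perm_test_mean_diff(a, b))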
Ter Braak, Cajo J F; Peres-Neto, Pedro; Dray, Stéphane
2017-01-01
Statistical testing of trait-environment association from data is a challenge as there is no common unit of observation: the trait is observed on species, the environment on sites and the mediating abundance on species-site combinations. A number of correlation-based methods, such as the community weighted trait means method (CWM), the fourth-corner correlation method and the multivariate method RLQ, have been proposed to estimate such trait-environment associations. In these methods, valid statistical testing proceeds by performing two separate resampling tests, one site-based and the other species-based, and by assessing significance by the largest of the two p-values (the pmax test). Recently, regression-based methods using generalized linear models (GLM) have been proposed as a promising alternative with statistical inference via site-based resampling. We investigated the performance of this new approach along with approaches that mimicked the pmax test using GLM instead of the fourth-corner. By simulation using models with additional random variation in the species response to the environment, the site-based resampling tests using GLM are shown to have severely inflated type I error, of up to 90%, when the nominal level is set at 5%. In addition, predictive modelling of such data using site-based cross-validation very often identified trait-environment interactions that had no predictive value. The problem that we identify is not an "omitted variable bias" problem as it occurs even when the additional random variation is independent of the observed trait and environment data. Instead, it is a problem of ignoring a random effect. In the same simulations, the GLM-based pmax test controlled the type I error in all models proposed so far in this context, but still gave slightly inflated error in more complex models that included both missing (but important) traits and missing (but important) environmental variables. For screening the importance of single trait-environment combinations, the fourth-corner test is shown to give almost the same results as the GLM-based tests in far less computing time.
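A sketch of the pmax idea under simplifying assumptions: a fourth-corner-style statistic stands in for CWM/RLQ, and plain row (site) and column (species) permutations stand in for the resampling schemes discussed above:

    import numpy as np

    rng = np.random.default_rng(3)

    def fourth_corner_stat(L, env, trait):
        """Fourth-corner-style statistic: trait-environment correlation
        mediated by the (sites x species) abundance matrix L."""
        q = L / L.sum()
        r = np.sum(q * np.outer(env - env.mean(), trait - trait.mean()))
        return r / (env.std() * trait.std())

    def pmax_test(L, env, trait, n_perm=999):
        obs = fourth_corner_stat(L, env, trait)
        null_site = [fourth_corner_stat(L[rng.permutation(L.shape[0]), :],
                                        env, trait) for _ in range(n_perm)]
        null_spec = [fourth_corner_stat(L[:, rng.permutation(L.shape[1])],
                                        env, trait) for _ in range(n_perm)]
        p_site = (1 + np.sum(np.abs(null_site) >= abs(obs))) / (n_perm + 1)
        p_spec = (1 + np.sum(np.abs(null_spec) >= abs(obs))) / (n_perm + 1)
        return obs, max(p_site, p_spec)  # significant only if BOTH reject

    L = rng.poisson(2.0, size=(30, 20))   # sites x species abundances
    env = rng.normal(size=30)             # site-level predictor
    trait = rng.normal(size=20)           # species-level trait
    print(pmax_test(L, env, trait))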
NASA Astrophysics Data System (ADS)
Fan, Y. R.; Huang, G. H.; Baetz, B. W.; Li, Y. P.; Huang, K.
2017-06-01
In this study, a copula-based particle filter (CopPF) approach was developed for sequential hydrological data assimilation by considering parameter correlation structures. In CopPF, multivariate copulas are proposed to reflect parameter interdependence before the resampling procedure, with new particles then being sampled from the obtained copulas. Such a process can overcome both particle degeneration and sample impoverishment. The applicability of CopPF is illustrated with three case studies using a two-parameter simplified model and two conceptual hydrologic models. The results for the simplified model indicate that model parameters are highly correlated in the data assimilation process, suggesting a demand for a full description of their dependence structure. Synthetic experiments on hydrologic data assimilation indicate that CopPF can rejuvenate particle evolution in large spaces and thus achieve good performance in low sample size scenarios. The applicability of CopPF is further illustrated through two real-case studies. It is shown that, compared with traditional particle filter (PF) and particle Markov chain Monte Carlo (PMCMC) approaches, the proposed method can provide more accurate results for both deterministic and probabilistic prediction with a sample size of 100. Furthermore, sample size does not significantly influence the performance of CopPF. Also, the copula resampling approach dominates parameter evolution in CopPF, with more than 50% of particles sampled from copulas in most sample size scenarios.
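A sketch of copula-based particle regeneration under simplifying assumptions: an unweighted Gaussian copula is fitted to the particle cloud, whereas CopPF fits copulas inside the filtering loop and may use other copula families:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)

    def copula_resample(particles, n_new):
        """Draw new particles (n x d array in, n_new x d out) from a
        Gaussian copula fitted to the cloud, preserving the empirical
        marginals and the rank correlations between parameters."""
        n, d = particles.shape
        # Normal scores of the empirical ranks of each parameter.
        u = (np.argsort(np.argsort(particles, axis=0), axis=0) + 1) / (n + 1)
        z = stats.norm.ppf(u)
        corr = np.corrcoef(z, rowvar=False)
        # Draw correlated normals, map back through empirical quantiles.
        z_new = rng.multivariate_normal(np.zeros(d), corr, size=n_new)
        u_new = stats.norm.cdf(z_new)
        out = np.empty((n_new, d))
        for j in range(d):
            out[:, j] = np.quantile(particles[:, j], u_new[:, j])
        return out

    theta = rng.multivariate_normal([1.0, 2.0],
                                    [[1.0, 0.8], [0.8, 1.0]], size=200)
    new = copula_resample(theta, 500)
    print(np.corrcoef(new, rowvar=False)[0, 1])  # correlation is preserved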
Immersive volume rendering of blood vessels
NASA Astrophysics Data System (ADS)
Long, Gregory; Kim, Han Suk; Marsden, Alison; Bazilevs, Yuri; Schulze, Jürgen P.
2012-03-01
In this paper, we present a novel method of visualizing flow in blood vessels. Our approach reads unstructured tetrahedral data, resamples it, and uses slice-based 3D texture volume rendering. Due to the sparse structure of blood vessels, we utilize an octree to efficiently store the resampled data by discarding empty regions of the volume. We use animation to convey time series data, a wireframe surface to give structure, and utilize the StarCAVE, a 3D virtual reality environment, to add a fully immersive element to the visualization. Our tool has great value in interdisciplinary work, helping scientists collaborate with clinicians by improving the understanding of blood flow simulations. Full immersion in the flow field allows for a more intuitive understanding of the flow phenomena, and can be a great help to medical experts for treatment planning.
Comparison of parametric and bootstrap method in bioequivalence test.
Ahn, Byung-Jin; Yim, Dong-Seok
2009-10-01
The estimation of 90% parametric confidence intervals (CIs) of mean AUC and Cmax ratios in bioequivalence (BE) tests is based upon the assumption that formulation effects in log-transformed data are normally distributed. To compare the parametric CIs with those obtained from nonparametric methods, we performed repeated estimations on bootstrap-resampled datasets. The AUC and Cmax values from 3 archived datasets were used. BE tests on 1,000 resampled datasets from each archived dataset were performed using SAS (Enterprise Guide Ver. 3). Bootstrap nonparametric 90% CIs of formulation effects were then compared with the parametric 90% CIs of the original datasets. The 90% CIs of formulation effects estimated from the 3 archived datasets were slightly different from the nonparametric 90% CIs obtained from BE tests on resampled datasets. Histograms and density curves of formulation effects obtained from resampled datasets were similar to those of a normal distribution. However, in 2 of 3 resampled log(AUC) datasets, the estimates of formulation effects did not follow the Gaussian distribution. Bias-corrected and accelerated (BCa) CIs, one of the nonparametric CIs of formulation effects, shifted outside the parametric 90% CIs of the archived datasets in these 2 non-normally distributed resampled log(AUC) datasets. Currently, the 80-125% rule based upon the parametric 90% CIs is widely accepted under the assumption of normally distributed formulation effects in log-transformed data. However, nonparametric CIs may be a better choice when data do not follow this assumption.
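A sketch of the comparison on hypothetical crossover data (not the archived datasets above), using scipy's BCa bootstrap:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)

    # Hypothetical within-subject log(AUC) differences (test minus
    # reference) for 24 subjects of a crossover BE study.
    d = rng.normal(np.log(0.95), 0.18, size=24)

    # Parametric 90% CI for the geometric mean ratio.
    lo, hi = stats.t.interval(0.90, df=d.size - 1, loc=d.mean(),
                              scale=stats.sem(d))
    print("parametric:", np.exp(lo), np.exp(hi))

    # Nonparametric 90% CI via BCa bootstrap of the same statistic.
    res = stats.bootstrap((d,), np.mean, confidence_level=0.90,
                          n_resamples=10000, method="BCa", random_state=rng)
    print("BCa bootstrap:", np.exp(res.confidence_interval.low),
          np.exp(res.confidence_interval.high))
    # BE by the 80-125% rule: both limits must lie within [0.80, 1.25].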
Bersanelli, Matteo; Mosca, Ettore; Remondini, Daniel; Castellani, Gastone; Milanesi, Luciano
2016-01-01
A relation exists between the network proximity of molecular entities in interaction networks, their functional similarity and their association with diseases. The identification of network regions associated with biological functions and pathologies is a major goal in systems biology. We describe a network diffusion-based pipeline for the interpretation of different types of omics data in the context of molecular interaction networks. We introduce the network smoothing index, a network-based quantity that allows the joint quantification of the amount of omics information in genes and in their network neighbourhood, using network diffusion to define network proximity. The approach is applicable to both descriptive and inferential statistics calculated on omics data. We also show that network resampling, applied to gene lists ranked by quantities derived from the network smoothing index, indicates the presence of significantly connected genes. As a proof of principle, we identified gene modules enriched in somatic mutations and transcriptional variations observed in samples of prostate adenocarcinoma (PRAD). In line with the local hypothesis, the network smoothing index and network resampling underlined the existence of a connected component of genes harbouring molecular alterations in PRAD. PMID:27731320
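A minimal sketch of one standard network diffusion formulation (random walk with restart); the network smoothing index described above additionally combines the diffused quantity with each gene's own score, which is omitted here:

    import numpy as np

    def network_smooth(adj, x0, alpha=0.7, n_iter=100):
        """Random-walk-with-restart diffusion of gene scores x0 over a
        network with adjacency matrix adj:
        x <- alpha * W x + (1 - alpha) * x0, W column-normalized."""
        W = adj / np.maximum(adj.sum(axis=0, keepdims=True), 1e-12)
        x = x0.copy()
        for _ in range(n_iter):
            x = alpha * (W @ x) + (1 - alpha) * x0
        return x

    # Toy 5-gene network: a triangle (genes 0-1-2) plus an edge (3-4).
    adj = np.array([[0, 1, 1, 0, 0],
                    [1, 0, 1, 0, 0],
                    [1, 1, 0, 0, 0],
                    [0, 0, 0, 0, 1],
                    [0, 0, 0, 1, 0]], dtype=float)
    x0 = np.array([1.0, 0.0, 1.0, 1.0, 0.0])  # e.g. per-gene mutation scores
    print(network_smooth(adj, x0))  # gene 1 gains score from its neighbours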
Resampling procedures to identify important SNPs using a consensus approach.
Pardy, Christopher; Motyer, Allan; Wilson, Susan
2011-11-29
Our goal is to identify common single-nucleotide polymorphisms (SNPs) (minor allele frequency > 1%) that add predictive accuracy above that gained by knowledge of easily measured clinical variables. We take an algorithmic approach to predict each phenotypic variable using a combination of phenotypic and genotypic predictors. We perform our procedure on the first simulated replicate and then validate against the others. Our procedure performs well when predicting Q1 but is less successful for the other outcomes. We use resampling procedures where possible to guard against false positives and to improve generalizability. The approach is based on finding a consensus regarding important SNPs by applying random forests and the least absolute shrinkage and selection operator (LASSO) on multiple subsamples. Random forests are used first to discard unimportant predictors, narrowing our focus to roughly 100 important SNPs. A cross-validation LASSO is then used to further select variables. We combine these procedures to guarantee that cross-validation can be used to choose a shrinkage parameter for the LASSO. If the clinical variables were unavailable, this prefiltering step would be essential. We perform the SNP-based analyses simultaneously rather than one at a time to estimate SNP effects in the presence of other causal variants. We analyzed the first simulated replicate of Genetic Analysis Workshop 17 without knowledge of the true model. Post-conference knowledge of the simulation parameters allowed us to investigate the limitations of our approach. We found that many of the false positives we identified were substantially correlated with genuine causal SNPs.
NASA Astrophysics Data System (ADS)
Lu, Siliang; Wang, Xiaoxian; He, Qingbo; Liu, Fang; Liu, Yongbin
2016-12-01
Transient signal analysis (TSA) has been proven an effective tool for motor bearing fault diagnosis, but has yet to be applied in processing bearing fault signals with variable rotating speed. In this study, a new TSA-based angular resampling (TSAAR) method is proposed for fault diagnosis under speed fluctuation condition via sound signal analysis. By applying the TSAAR method, the frequency smearing phenomenon is eliminated and the fault characteristic frequency is exposed in the envelope spectrum for bearing fault recognition. The TSAAR method can accurately estimate the phase information of the fault-induced impulses using neither complicated time-frequency analysis techniques nor external speed sensors, and hence it provides a simple, flexible, and data-driven approach that realizes variable-speed motor bearing fault diagnosis. The effectiveness and efficiency of the proposed TSAAR method are verified through a series of simulated and experimental case studies.
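The following numpy sketch shows plain angular resampling given an already-estimated instantaneous shaft frequency; TSAAR's contribution, estimating that phase from the fault-induced transients themselves, is not reproduced here:

    import numpy as np

    def angular_resample(signal, fs, inst_freq, samples_per_rev=64):
        """Resample a time-domain signal to the shaft-angle domain.
        inst_freq: instantaneous shaft frequency (Hz) per sample."""
        phase = 2 * np.pi * np.cumsum(inst_freq) / fs   # shaft angle (rad)
        n_rev = phase[-1] / (2 * np.pi)
        angles = np.linspace(0, phase[-1], int(n_rev * samples_per_rev))
        # Interpolate at uniform angle increments (constant delta-angle,
        # varying delta-t), making shaft-periodic components stationary.
        return np.interp(angles, phase, signal)

    # Toy example: speed ramp with a fault component at 3.5x shaft order.
    fs, T = 10000, 4.0
    t = np.arange(0, T, 1 / fs)
    f_shaft = 10 + 5 * t                    # speed ramps 10 -> 30 Hz
    phase = 2 * np.pi * np.cumsum(f_shaft) / fs
    sig = np.sin(3.5 * phase) + 0.1 * np.random.randn(t.size)
    resampled = angular_resample(sig, fs, f_shaft)
    # An FFT of `resampled` now shows a sharp line at order 3.5 instead
    # of a smeared band in the ordinary frequency spectrum.
    print(resampled.size)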
Efficient high-quality volume rendering of SPH data.
Fraedrich, Roland; Auer, Stefan; Westermann, Rüdiger
2010-01-01
High quality volume rendering of SPH data requires a complex order-dependent resampling of particle quantities along the view rays. In this paper we present an efficient approach to perform this task using a novel view-space discretization of the simulation domain. Our method draws upon recent work on GPU-based particle voxelization for the efficient resampling of particles into uniform grids. We propose a new technique that leverages a perspective grid to adaptively discretize the view-volume, giving rise to a continuous level-of-detail sampling structure and reducing memory requirements compared to a uniform grid. In combination with a level-of-detail representation of the particle set, the perspective grid allows effectively reducing the amount of primitives to be processed at run-time. We demonstrate the quality and performance of our method for the rendering of fluid and gas dynamics SPH simulations consisting of many millions of particles.
Xu, Yaomin; Guo, Xingyi; Sun, Jiayang; Zhao, Zhongming
2015-01-01
Motivation: Large-scale cancer genomic studies, such as The Cancer Genome Atlas (TCGA), have profiled multidimensional genomic data, including mutation and expression profiles on a variety of cancer cell types, to uncover the molecular mechanisms of carcinogenesis. More than a hundred driver mutations have been characterized that confer an advantage for cell growth. However, how driver mutations regulate the transcriptome to affect cellular functions remains largely unexplored. Differential analysis of gene expression relative to a driver mutation on patient samples could provide new insights into driver mutation dysregulation in the tumor genome and help develop personalized treatment strategies. Results: Here, we introduce the Snowball approach as a highly sensitive statistical analysis method to identify transcriptional signatures that are affected by a recurrent driver mutation. Snowball utilizes a resampling-based approach combined with a distance-based regression framework to assign a robust ranking index to genes based on their aggregated association with the presence of the mutation, and further selects the top significant genes for downstream data analyses or experiments. In our application of the Snowball approach to both synthesized and TCGA data, we demonstrated that it outperforms the standard methods and provides more accurate inferences on the functional effects and transcriptional dysregulation of driver mutations. Availability and implementation: R package and source code are available from CRAN at http://cran.r-project.org/web/packages/DESnowball, and also available at http://bioinfo.mc.vanderbilt.edu/DESnowball/. Contact: zhongming.zhao@vanderbilt.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25192743
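A loose sketch of the resampling-and-aggregation idea on synthetic data, with an ordinary t-statistic standing in for Snowball's distance-based regression statistic:

    import numpy as np

    rng = np.random.default_rng(23)

    # Hypothetical expression matrix (genes x samples), mutation labels.
    n_genes, n_samp = 1000, 60
    expr = rng.normal(size=(n_genes, n_samp))
    mut = rng.integers(0, 2, n_samp).astype(bool)
    expr[:25, mut] += 1.0                 # 25 genes shift with the mutation

    def t_stats(e, labels):
        a, b = e[:, labels], e[:, ~labels]
        se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] +
                     b.var(axis=1, ddof=1) / b.shape[1])
        return (a.mean(axis=1) - b.mean(axis=1)) / se

    # Aggregate per-gene ranks over bootstrap resamples of the samples.
    ranks = np.zeros(n_genes)
    for _ in range(200):
        idx = rng.choice(n_samp, n_samp, replace=True)
        if mut[idx].sum() in (0, n_samp):   # keep both groups represented
            continue
        stat = -np.abs(t_stats(expr[:, idx], mut[idx]))
        ranks += np.argsort(np.argsort(stat))   # rank 0 = strongest gene
    top = np.argsort(ranks)[:30]                # smallest total rank wins
    print(np.sum(top < 25), "of the 30 top-ranked genes are true signals")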
Assessment of resampling methods for causality testing: A note on the US inflation behavior.
Papana, Angeliki; Kyrtsou, Catherine; Kugiumtzis, Dimitris; Diks, Cees
2017-01-01
Different resampling methods for the null hypothesis of no Granger causality are assessed in the setting of multivariate time series, taking into account that the driving-response coupling is conditioned on the other observed variables. As appropriate test statistic for this setting, the partial transfer entropy (PTE), an information and model-free measure, is used. Two resampling techniques, time-shifted surrogates and the stationary bootstrap, are combined with three independence settings (giving a total of six resampling methods), all approximating the null hypothesis of no Granger causality. In these three settings, the level of dependence is changed, while the conditioning variables remain intact. The empirical null distribution of the PTE, as the surrogate and bootstrapped time series become more independent, is examined along with the size and power of the respective tests. Additionally, we consider a seventh resampling method by contemporaneously resampling the driving and the response time series using the stationary bootstrap. Although this case does not comply with the no causality hypothesis, one can obtain an accurate sampling distribution for the mean of the test statistic since its value is zero under H0. Results indicate that as the resampling setting gets more independent, the test becomes more conservative. Finally, we conclude with a real application. More specifically, we investigate the causal links among the growth rates for the US CPI, money supply and crude oil. Based on the PTE and the seven resampling methods, we consistently find that changes in crude oil cause inflation conditioning on money supply in the post-1986 period. However, this relationship cannot be explained on the basis of traditional cost-push mechanisms.
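A minimal sketch of the time-shifted-surrogate scheme, with a lagged cross-correlation standing in for the PTE (which would additionally condition on the remaining variables):

    import numpy as np

    rng = np.random.default_rng(29)

    def time_shift_surrogate(x, min_shift=20):
        """Circularly shift a series by a random offset, destroying its
        timing relative to other variables while keeping its own
        dynamics intact."""
        s = rng.integers(min_shift, x.size - min_shift)
        return np.roll(x, s)

    def lagged_coupling(x, y, lag=1):
        """Stand-in statistic for directed coupling x -> y."""
        return abs(np.corrcoef(x[:-lag], y[lag:])[0, 1])

    x = rng.normal(size=2000)
    y = np.zeros(2000)
    for t in range(1, 2000):               # y is driven by lagged x
        y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + 0.1 * rng.normal()

    obs = lagged_coupling(x, y)
    null = [lagged_coupling(time_shift_surrogate(x), y) for _ in range(999)]
    p = (1 + np.sum(np.array(null) >= obs)) / 1000
    print(obs, p)                          # small p: reject "no causality"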
Testing non-inferiority of a new treatment in three-arm clinical trials with binary endpoints.
Tang, Nian-Sheng; Yu, Bin; Tang, Man-Lai
2014-12-18
A two-arm non-inferiority trial without a placebo is usually adopted to demonstrate that an experimental treatment is not worse than a reference treatment by a small pre-specified non-inferiority margin due to ethical concerns. Selection of the non-inferiority margin and establishment of assay sensitivity are two major issues in the design, analysis and interpretation for two-arm non-inferiority trials. Alternatively, a three-arm non-inferiority clinical trial including a placebo is usually conducted to assess the assay sensitivity and internal validity of a trial. Recently, some large-sample approaches have been developed to assess the non-inferiority of a new treatment based on the three-arm trial design. However, these methods behave badly with small sample sizes in the three arms. This manuscript aims to develop some reliable small-sample methods to test three-arm non-inferiority. Saddlepoint approximation, exact and approximate unconditional, and bootstrap-resampling methods are developed to calculate p-values of the Wald-type, score and likelihood ratio tests. Simulation studies are conducted to evaluate their performance in terms of type I error rate and power. Our empirical results show that the saddlepoint approximation method generally behaves better than the asymptotic method based on the Wald-type test statistic. For small sample sizes, approximate unconditional and bootstrap-resampling methods based on the score test statistic perform better in the sense that their corresponding type I error rates are generally closer to the prespecified nominal level than those of other test procedures. Both approximate unconditional and bootstrap-resampling test procedures based on the score test statistic are generally recommended for three-arm non-inferiority trials with binary outcomes.
Wavelet analysis in ecology and epidemiology: impact of statistical tests.
Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario
2014-02-06
Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied with different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise that are nevertheless the methods currently used in the wide majority of wavelets applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods should be used such as the hidden Markov model algorithm and the 'beta-surrogate' method.
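For reference, the conventional red-noise null model criticized above can be generated as AR(1) surrogates matching the lag-1 autocorrelation and variance of the series:

    import numpy as np

    rng = np.random.default_rng(31)

    def rednoise_surrogates(x, n_surr=1000):
        """AR(1) ('red noise') surrogates matching the series' lag-1
        autocorrelation and variance, the conventional null model that
        the study above finds inadequate for many ecological series."""
        x = x - x.mean()
        r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
        sigma = np.sqrt(x.var() * (1 - r1 ** 2))
        surr = np.empty((n_surr, x.size))
        surr[:, 0] = rng.normal(0, x.std(), n_surr)
        for t in range(1, x.size):
            surr[:, t] = r1 * surr[:, t - 1] + rng.normal(0, sigma, n_surr)
        return surr

    x = np.sin(2 * np.pi * np.arange(200) / 24) + rng.normal(0, 1, 200)
    surr = rednoise_surrogates(x)
    # Wavelet (or spectral) statistics of x are then compared against the
    # distribution of the same statistics over `surr`.
    print(surr.shape)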
A hybrid approach to fault diagnosis of roller bearings under variable speed conditions
NASA Astrophysics Data System (ADS)
Wang, Yanxue; Yang, Lin; Xiang, Jiawei; Yang, Jianwei; He, Shuilong
2017-12-01
Rolling element bearings are one of the main elements in rotating machines, whose failure may lead to a fatal breakdown and significant economic losses. Conventional vibration-based diagnostic methods rely on a stationarity assumption, so they are not applicable to the diagnosis of bearings working under varying speeds. This constraint significantly limits the industrial application of such methods. A hybrid approach to fault diagnosis of roller bearings under variable speed conditions is proposed in this work, based on computed order tracking (COT) and a variational mode decomposition (VMD)-based time-frequency representation (VTFR). COT is utilized to resample the non-stationary vibration signal in the angular domain, while VMD is used to decompose the resampled signal into a number of band-limited intrinsic mode functions (BLIMFs). A VTFR is then constructed based on the estimated instantaneous frequency and instantaneous amplitude of each BLIMF. Moreover, the Gini index and time-frequency kurtosis are proposed to quantitatively measure the sparsity and concentration of the time-frequency representation, respectively. The effectiveness of the VTFR for extracting nonlinear components has been verified on a bat echolocation signal. Results of this numerical simulation also show that the sparsity and concentration of the VTFR are better than those of short-time Fourier transform, continuous wavelet transform, Hilbert-Huang transform and Wigner-Ville distribution techniques. Several experimental results have further demonstrated that the proposed method can reliably detect bearing faults under variable speed conditions.
Goldstein, Darlene R
2006-10-01
Studies of gene expression using high-density short oligonucleotide arrays have become a standard in a variety of biological contexts. Of the expression measures that have been proposed to quantify expression in these arrays, multi-chip-based measures have been shown to perform well. As gene expression studies increase in size, however, utilizing multi-chip expression measures is more challenging in terms of computing memory requirements and time. A strategic alternative to exact multi-chip quantification on a full large chip set is to approximate expression values based on subsets of chips. This paper introduces an extrapolation method, Extrapolation Averaging (EA), and a resampling method, Partition Resampling (PR), to approximate expression in large studies. An examination of properties indicates that subset-based methods can perform well compared with exact expression quantification. The focus is on short oligonucleotide chips, but the same ideas apply equally well to any array type for which expression is quantified using an entire set of arrays, rather than for only a single array at a time. Software implementing Partition Resampling and Extrapolation Averaging is under development as an R package for the BioConductor project.
Feder, Paul I; Ma, Zhenxu J; Bull, Richard J; Teuschler, Linda K; Rice, Glenn
2009-01-01
In chemical mixtures risk assessment, the use of dose-response data developed for one mixture to estimate risk posed by a second mixture depends on whether the two mixtures are sufficiently similar. While evaluations of similarity may be made using qualitative judgments, this article uses nonparametric statistical methods based on the "bootstrap" resampling technique to address the question of similarity among mixtures of chemical disinfectant by-products (DBP) in drinking water. The bootstrap resampling technique is a general-purpose, computer-intensive approach to statistical inference that substitutes empirical sampling for theoretically based parametric mathematical modeling. Nonparametric, bootstrap-based inference involves fewer assumptions than parametric normal theory based inference. The bootstrap procedure is appropriate, at least in an asymptotic sense, whether or not the parametric, distributional assumptions hold, even approximately. The statistical analysis procedures in this article are initially illustrated with data from 5 water treatment plants (Schenck et al., 2009), and then extended using data developed from a study of 35 drinking-water utilities (U.S. EPA/AMWA, 1989), which permits inclusion of a greater number of water constituents and increased structure in the statistical models.
NASA Astrophysics Data System (ADS)
Adjorlolo, Clement; Mutanga, Onisimo; Cho, Moses A.; Ismail, Riyad
2013-04-01
In this paper, a user-defined inter-band correlation filter function was used to resample hyperspectral data and thereby mitigate the problem of multicollinearity in classification analysis. The proposed resampling technique convolves the spectral dependence information between a chosen band-centre and its shorter- and longer-wavelength neighbours. A weighting threshold of inter-band correlation (WTC, Pearson's r) was calculated, with r = 1 at the band-centre. Various WTC values (r = 0.99, r = 0.95 and r = 0.90) were assessed, and bands with coefficients below a chosen threshold were assigned r = 0. The resultant data were used in a random forest analysis to classify in situ C3 and C4 grass canopy reflectance. The respective WTC datasets yielded improved classification accuracies (kappa = 0.82, 0.79 and 0.76), with less correlated wavebands, when compared to resampled Hyperion bands (kappa = 0.76). Overall, the results obtained from this study suggest that resampling of hyperspectral data should account for the spectral dependence information, to improve overall classification accuracy as well as to reduce the problem of multicollinearity.
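One possible reading of the filter function, sketched on synthetic spectra: each retained band-centre becomes a correlation-weighted average of its neighbours, with weights zeroed below the chosen threshold (details of the published weighting may differ):

    import numpy as np

    def correlation_filter_resample(spectra, centres, r_min=0.95):
        """Resample (samples x bands) spectra around chosen band-centres:
        each output band is a correlation-weighted average of the centre
        and its neighbours, with weights set to zero below r_min."""
        R = np.corrcoef(spectra, rowvar=False)     # inter-band Pearson r
        out = np.empty((spectra.shape[0], len(centres)))
        for k, c in enumerate(centres):
            w = np.where(R[c] >= r_min, R[c], 0.0)  # r = 1 at the centre
            out[:, k] = spectra @ w / w.sum()
        return out

    rng = np.random.default_rng(37)
    # Toy hyperspectral data: 100 canopy spectra x 200 contiguous bands
    # with strong neighbour correlation.
    base = np.cumsum(rng.normal(size=(100, 200)), axis=1)
    resampled = correlation_filter_resample(base, centres=[30, 90, 150])
    print(resampled.shape)   # shape (100, 3): fewer, less collinear bands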
The conditional resampling model STARS: weaknesses of the modeling concept and development
NASA Astrophysics Data System (ADS)
Menz, Christoph
2016-04-01
The Statistical Analogue Resampling Scheme (STARS) is based on a modeling concept of Werner and Gerstengarbe (1997). The model uses a conditional resampling technique to create a simulation time series from daily observations. Unlike other time series generators (such as stochastic weather generators), STARS only needs a linear regression specification of a single variable as the target condition for the resampling. Since its first implementation, the algorithm has been extended to allow for a spatially distributed trend signal and to preserve the seasonal cycle and the autocorrelation of the observation time series (Orlovsky, 2007; Orlovsky et al., 2008). This evolved version was successfully used in several climate impact studies. However, a detailed evaluation of the simulations revealed two fundamental weaknesses of the utilized resampling technique. 1. Restricting the resampling condition to a single variable can lead to a misinterpretation of the change signal of other variables when the model is applied to a multivariate time series (F. Wechsung and M. Wechsung, 2014). For example, the short-term correlation between precipitation and temperature (cooling of the near-surface air layer after a rainfall event) can be misinterpreted as a climatic change signal in the simulation series. 2. The model restricts the linear regression specification to the annual mean time series, precluding the specification of seasonally varying trends. To overcome these fundamental weaknesses, the whole algorithm was redeveloped. The poster discusses the main weaknesses of the earlier model implementation and the methods applied to overcome them in the new version. Idealized simulations with the new model were conducted to illustrate the improvements.
Porto, Paolo; Walling, Des E; Alewell, Christine; Callegari, Giovanni; Mabit, Lionel; Mallimo, Nicola; Meusburger, Katrin; Zehringer, Markus
2014-12-01
Soil erosion and both its on-site and off-site impacts are increasingly seen as a serious environmental problem across the world. The need for an improved evidence base on soil loss and soil redistribution rates has directed attention to the use of fallout radionuclides, and particularly (137)Cs, for documenting soil redistribution rates. This approach possesses important advantages over more traditional means of documenting soil erosion and soil redistribution. However, one key limitation of the approach is the time-averaged or lumped nature of the estimated erosion rates. In nearly all cases, these will relate to the period extending from the main period of bomb fallout to the time of sampling. Increasing concern for the impact of global change, particularly that related to changing land use and climate change, has frequently directed attention to the need to document changes in soil redistribution rates within this period. Re-sampling techniques, which should be distinguished from repeat-sampling techniques, have the potential to meet this requirement. As an example, the use of a re-sampling technique to derive estimates of the mean annual net soil loss from a small (1.38 ha) forested catchment in southern Italy is reported. The catchment was originally sampled in 1998 and samples were collected from points very close to the original sampling points again in 2013. This made it possible to compare the estimate of mean annual erosion for the period 1954-1998 with that for the period 1999-2013. The availability of measurements of sediment yield from the catchment for parts of the overall period made it possible to compare the results provided by the (137)Cs re-sampling study with the estimates of sediment yield for the same periods. In order to compare the estimates of soil loss and sediment yield for the two different periods, it was necessary to establish the uncertainty associated with the individual estimates. In the absence of a generally accepted procedure for such calculations, key factors influencing the uncertainty of the estimates were identified and a procedure developed. The results of the study demonstrated that there had been no significant change in mean annual soil loss in recent years and this was consistent with the information provided by the estimates of sediment yield from the catchment for the same periods. The study demonstrates the potential for using a re-sampling technique to document recent changes in soil redistribution rates. Copyright © 2014. Published by Elsevier Ltd.
Resampling and Distribution of the Product Methods for Testing Indirect Effects in Complex Models
ERIC Educational Resources Information Center
Williams, Jason; MacKinnon, David P.
2008-01-01
Recent advances in testing mediation have found that certain resampling methods and tests based on the mathematical distribution of the product of 2 normal random variables substantially outperform the traditional "z" test. However, these studies have primarily focused only on models with a single mediator and 2 component paths. To address this limitation, a…
Estimation of Rainfall Sampling Uncertainty: A Comparison of Two Diverse Approaches
NASA Technical Reports Server (NTRS)
Steiner, Matthias; Zhang, Yu; Baeck, Mary Lynn; Wood, Eric F.; Smith, James A.; Bell, Thomas L.; Lau, William K. M. (Technical Monitor)
2002-01-01
The spatial and temporal intermittence of rainfall causes the averages of satellite observations of rain rate to differ from the "true" average rain rate over any given area and time period, even if the satellite observations are perfectly accurate. The difference between averages based on occasional observation by satellite systems and the continuous-time average of rain rate is referred to as sampling error. In this study, rms sampling error estimates are obtained for average rain rates over boxes 100 km, 200 km, and 500 km on a side, for averaging periods of 1 day, 5 days, and 30 days. The study uses a multi-year, merged radar data product provided by Weather Services International Corp. at a resolution of 2 km in space and 15 min in time, over an area of the central U.S. extending from 35N to 45N in latitude and 100W to 80W in longitude. The intervals between satellite observations are assumed to be equal, and similar in size to what present and future satellite systems are able to provide (from 1 h to 12 h). The sampling error estimates are obtained using a resampling method called "resampling by shifts," and are compared to sampling error estimates proposed by Bell based on earlier work by Laughlin. The resampling estimates are found to scale with areal size and time period as the theory predicts. The dependence on average rain rate and time interval between observations is also similar to what the simple theory suggests.
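The "resampling by shifts" estimate itself is straightforward to reproduce on any gridded rain series: subsample at every possible phase shift and take the rms deviation of the subsampled means from the true mean. A sketch on a toy intermittent series:

    import numpy as np

    rng = np.random.default_rng(41)

    def sampling_error_by_shifts(rain, step):
        """RMS sampling error when a continuous rain-rate series (one
        value per 15-min interval) is observed only every `step`
        intervals: average the subsampled series for every possible
        shift and compare with the true mean."""
        true_mean = rain.mean()
        sub_means = np.array([rain[s::step].mean() for s in range(step)])
        return np.sqrt(np.mean((sub_means - true_mean) ** 2))

    # Toy series: 30 days of 15-min rain rates, mostly zero.
    n = 30 * 96
    rain = np.where(rng.random(n) < 0.1, rng.exponential(2.0, n), 0.0)
    for hours in (1, 3, 6, 12):
        step = hours * 4                # 15-min intervals per visit gap
        print(hours, "h sampling:", sampling_error_by_shifts(rain, step))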
One-shot estimate of MRMC variance: AUC.
Gallas, Brandon D
2006-03-01
One popular study design for estimating the area under the receiver operating characteristic curve (AUC) is the one in which a set of readers reads a set of cases: a fully crossed design in which every reader reads every case. The variability of the subsequent reader-averaged AUC has two sources: the multiple readers and the multiple cases (MRMC). In this article, we present a nonparametric estimate for the variance of the reader-averaged AUC that is unbiased and does not use resampling tools. The one-shot estimate is based on the MRMC variance derived by the mechanistic approach of Barrett et al. (2005), as well as the nonparametric variance of a single-reader AUC derived in the literature on U statistics. We investigate the bias and variance properties of the one-shot estimate through a set of Monte Carlo simulations with simulated model observers and images. The different simulation configurations vary numbers of readers and cases, amounts of image noise and internal noise, as well as how the readers are constructed. We compare the one-shot estimate to a method that uses the jackknife resampling technique with an analysis of variance model at its foundation (Dorfman et al. 1992). The name one-shot highlights that resampling is not used. The one-shot and jackknife estimators behave similarly, with the one-shot being marginally more efficient when the number of cases is small. We have derived a one-shot estimate of the MRMC variance of AUC that is based on a probabilistic foundation with limited assumptions, is unbiased, and compares favorably to an established estimate.
Motion vector field phase-to-amplitude resampling for 4D motion-compensated cone-beam CT
NASA Astrophysics Data System (ADS)
Sauppe, Sebastian; Kuhm, Julian; Brehm, Marcus; Paysan, Pascal; Seghers, Dieter; Kachelrieß, Marc
2018-02-01
We propose a phase-to-amplitude resampling (PTAR) method to reduce motion blurring in motion-compensated (MoCo) 4D cone-beam CT (CBCT) image reconstruction, without increasing the computational complexity of the motion vector field (MVF) estimation approach. PTAR is able to improve the image quality in reconstructed 4D volumes, including both regular and irregular respiration patterns. The PTAR approach starts with a robust phase-gating procedure for the initial MVF estimation and then switches to a phase-adapted amplitude gating method. The switch implies an MVF-resampling, which makes the MVFs amplitude-specific. PTAR ensures that the MVFs, which have been estimated on phase-gated reconstructions, are still valid for all amplitude-gated reconstructions. To validate the method, we use an artificially deformed clinical CT scan with a realistic breathing pattern and several patient data sets acquired with a TrueBeam™ integrated imaging system (Varian Medical Systems, Palo Alto, CA, USA). Motion blurring, which still occurs around the area of the diaphragm or at small vessels above the diaphragm in artifact-specific cyclic motion compensation (acMoCo) images based on phase-gating, is significantly reduced by PTAR. Also, small lung structures appear sharper in the images. This is demonstrated both for simulated and real patient data. A quantification of the sharpness of the diaphragm confirms these findings. PTAR improves the image quality of 4D MoCo reconstructions compared to conventional phase-gated MoCo images, in particular for irregular breathing patterns. Thus, PTAR increases the robustness of MoCo reconstructions for CBCT. Because PTAR does not require any additional steps for the MVF estimation, it is computationally efficient. Our method is not restricted to CBCT but could rather be applied to other image modalities.
NASA Astrophysics Data System (ADS)
Bou-Fakhreddine, Bassam; Mougharbel, Imad; Faye, Alain; Abou Chakra, Sara; Pollet, Yann
2018-03-01
Accurate daily river flow forecasts are essential in many applications of water resources such as hydropower operation, agricultural planning and flood control. This paper presents a forecasting approach for a newly addressed situation in which hydrological data exist for a period longer than that of meteorological data (measurement asymmetry). One potential solution to the measurement asymmetry issue is data re-sampling: either only the hydrological data, or only the balanced part of the hydro-meteorological data set, is considered during the forecasting process. The main disadvantage, however, is that potentially relevant information in the left-out data may be lost. In this research, the key output is a Two-Phase Constructive Fuzzy inference hybrid model that is implemented over the non-re-sampled data. The introduced modeling approach must be capable of exploiting the available data efficiently, with higher prediction efficiency relative to a Constructive Fuzzy model trained on a re-sampled data set. The study was applied to the Litani River in the Bekaa Valley, Lebanon, using 4 years of rainfall and 24 years of river flow daily measurements. A Constructive Fuzzy System Model (C-FSM) and a Two-Phase Constructive Fuzzy System Model (TPC-FSM) are trained. Upon validation, the second model showed competitive performance and accuracy, with the ability to preserve a higher day-to-day variability for 1, 3 and 6 days ahead. In fact, for the longest lead time, the C-FSM and TPC-FSM were able to explain 84.6% and 86.5% of the actual river flow variation, respectively. Overall, the results indicate that the TPC-FSM provides a better tool to capture extreme flows in the process of streamflow prediction.
NASA Astrophysics Data System (ADS)
Adjorlolo, Clement; Cho, Moses A.; Mutanga, Onisimo; Ismail, Riyad
2012-01-01
Hyperspectral remote-sensing approaches are suitable for detecting differences in the phenology and composition of three-carbon (C3) and four-carbon (C4) grass species. However, the application of hyperspectral sensors to vegetation has been hampered by high-dimensionality, spectral redundancy and multicollinearity problems. In this experiment, resampling of hyperspectral data to wider wavelength intervals, around a few band-centers sensitive to the biophysical and biochemical properties of C3 or C4 grass species, is proposed. The approach accounts for an inherent property of vegetation spectral response: the asymmetrical nature of the inter-band correlations between a waveband and its shorter- and longer-wavelength neighbors. It involves constructing a curve of the weighting threshold of correlation (Pearson's r) between a chosen band-center and its neighbors, as a function of wavelength. For comparison with the proposed method, the data were also resampled to the band configurations of several multispectral sensors: the ASTER, GeoEye-1, IKONOS, QuickBird, RapidEye, SPOT 5 and WorldView-2 satellites. The resulting datasets were analyzed using the random forest algorithm. The proposed resampling method achieved improved classification accuracy (κ = 0.82) compared to the resampled multispectral datasets (κ = 0.78, 0.65, 0.62, 0.59, 0.65, 0.62 and 0.76, respectively). Overall, results from this study demonstrate that spectral resolutions for C3 and C4 grasses can be optimized and controlled for high-dimensionality and multicollinearity problems while still yielding high classification accuracies. The findings also provide a sound basis for programming wavebands for future sensors.
Exact and Monte Carlo resampling procedures for the Wilcoxon-Mann-Whitney and Kruskal-Wallis tests.
Berry, K J; Mielke, P W
2000-12-01
Exact and Monte Carlo resampling FORTRAN programs are described for the Wilcoxon-Mann-Whitney rank sum test and the Kruskal-Wallis one-way analysis of variance for ranks test. The program algorithms compensate for tied values and do not depend on asymptotic approximations for probability values, unlike most algorithms contained in PC-based statistical software packages.
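A Monte Carlo resampling version of the rank-sum test with midrank handling of ties is only a few lines in Python (the programs described above are FORTRAN; this is an illustrative re-implementation, not their code):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(43)

    def mc_rank_sum(a, b, n_resamples=10000):
        """Monte Carlo resampling p-value for the Wilcoxon-Mann-Whitney
        test. Ranking with scipy assigns midranks to tied values, so no
        asymptotic tie correction is needed."""
        pooled = np.concatenate([a, b])
        ranks = stats.rankdata(pooled)          # midranks for tied values
        observed = ranks[:a.size].sum()
        null = np.array([rng.permutation(ranks)[:a.size].sum()
                         for _ in range(n_resamples)])
        expected = ranks.sum() * a.size / pooled.size
        p = (1 + np.sum(np.abs(null - expected)
                        >= abs(observed - expected))) / (n_resamples + 1)
        return observed, p

    a = np.array([3, 5, 5, 7, 9, 9, 9])         # heavily tied data
    b = np.array([1, 2, 2, 5, 6, 8])
    print(mc_rank_sum(a, b))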
Clausen, J L; Georgian, T; Gardner, K H; Douglas, T A
2018-01-01
Research shows that grab sampling is inadequate for evaluating military ranges contaminated with energetics because of their highly heterogeneous distribution. Similar studies assessing the heterogeneous distribution of metals at small-arms ranges (SARs) are lacking. To address this, we evaluated whether grab sampling provides appropriate data for performing risk analysis at metal-contaminated SARs characterized with 30-48 grab samples. We evaluated the extractable metal content of Cu, Pb, Sb and Zn in the field data using Monte Carlo random resampling with replacement (bootstrapping). Results indicate the 95% confidence interval of the mean for Pb (432 mg/kg) at one site was 200-700 mg/kg, with a data range of 5-4500 mg/kg. Considering that the U.S. Environmental Protection Agency screening level for lead is 400 mg/kg, the necessity of cleanup at this site is unclear. Resampling based on populations of 7 and 15 samples, sample sizes more realistic for an area of this size, yielded high false-negative rates.
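The bootstrap itself is simple to reproduce; the sketch below uses hypothetical lognormal Pb concentrations, not the study's field data:

    import numpy as np

    rng = np.random.default_rng(47)

    # Hypothetical Pb concentrations (mg/kg) from 40 grab samples at one
    # small-arms range; highly skewed, as is typical for such field data.
    pb = rng.lognormal(mean=5.5, sigma=1.2, size=40)

    def bootstrap_ci(x, n_boot=10000, level=0.95):
        """Percentile CI of the mean from resampling with replacement."""
        means = np.array([rng.choice(x, x.size, replace=True).mean()
                          for _ in range(n_boot)])
        lo, hi = np.percentile(means, [(1 - level) / 2 * 100,
                                       (1 + level) / 2 * 100])
        return x.mean(), lo, hi

    print(bootstrap_ci(pb))       # wide CI -> cleanup decision is unclear
    # Repeating with subsamples of 7 or 15 grabs shows how often a mean
    # above a screening level (e.g. 400 mg/kg) would be missed.
    print(bootstrap_ci(rng.choice(pb, 7, replace=False)))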
Anisotropic scene geometry resampling with occlusion filling for 3DTV applications
NASA Astrophysics Data System (ADS)
Kim, Jangheon; Sikora, Thomas
2006-02-01
Image and video-based rendering technologies are receiving growing attention due to their photo-realistic rendering capability in free-viewpoint settings. However, two major limitations are ghosting and blurring, due to their sampling-based mechanism. Scene geometry, which supports the selection of accurate sampling positions, can be estimated using a global method (i.e. an approximate depth plane) or a local method (i.e. disparity estimation). This paper focuses on the local method, since it can yield more accurate rendering quality without a large number of cameras. Local scene geometry presents two difficulties: limited geometrical density and uncovered areas containing hidden information. Both are serious obstacles to reconstructing an arbitrary viewpoint without aliasing artifacts. To solve these problems, we propose an anisotropic diffusive resampling method based on tensor theory. Isotropic low-pass filtering accomplishes anti-aliasing in the scene geometry, while anisotropic diffusion prevents the filtering from blurring visual structures. Apertures in coarse samples are estimated following diffusion on the pre-filtered space, and nonlinear weighting of gradient directions suppresses the amount of diffusion. Aliasing artifacts arising from low sample density are efficiently removed by isotropic filtering, and edge blurring is resolved by the anisotropic method in the same process. Because sampling gaps differ in size, the resampling condition is defined by considering the causality between filter scale and edges. Using a partial differential equation (PDE) employing Gaussian scale-space, we iteratively achieve coarse-to-fine resampling. At a large scale, apertures and uncovered holes can be overcome because only strong and meaningful boundaries are selected at that resolution. The coarse-level resampling at a large scale is then iteratively refined to recover detailed scene structure. Simulation results show marked improvements in rendering quality.
NASA Astrophysics Data System (ADS)
Huang, Huan; Baddour, Natalie; Liang, Ming
2018-02-01
In practical operation, bearings often run under time-varying rotational speed conditions. Under such circumstances, the bearing vibration signal is non-stationary, which renders ineffective the techniques used for bearing fault diagnosis under constant running conditions. One of the conventional methods of bearing fault diagnosis under time-varying speed conditions is resampling the non-stationary signal to a stationary signal via order tracking with the measured variable speed. With the resampled signal, the methods available for constant-condition cases become applicable. However, the accuracy of order tracking is often inadequate, and the time-varying speed is sometimes not measurable. Thus, resampling-free methods that require no tachometer are of interest for bearing fault diagnosis under time-varying rotational speed. With the development of time-frequency analysis, the time-varying fault character manifests as curves in the time-frequency domain. By extracting the Instantaneous Fault Characteristic Frequency (IFCF) from the Time-Frequency Representation (TFR) and converting the IFCF, its harmonics, and the Instantaneous Shaft Rotational Frequency (ISRF) into straight lines, the bearing fault can be detected and diagnosed without resampling. However, so far, the extraction of the IFCF for bearing fault diagnosis has mostly been based on the assumption that at each moment the IFCF has the highest amplitude in the TFR, which is not always true. Hence, a more reliable T-F curve extraction approach should be investigated. Moreover, if the T-F curves including the IFCF, its harmonics, and the ISRF can all be extracted from the TFR directly, no extra processing is needed for fault diagnosis. Therefore, this paper proposes an algorithm for multiple T-F curve extraction from the TFR based on fast path optimization, which is more reliable for T-F curve extraction. Then, a new procedure for bearing fault diagnosis under unknown time-varying speed conditions is developed based on the proposed algorithm and a new fault diagnosis strategy. The average curve-to-curve ratios are used to describe the relationship between the extracted curves, and fault diagnosis is then achieved by comparing the ratios to the fault characteristic coefficients. The effectiveness of the proposed method is validated on simulated and experimental signals.
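The paper's fast path optimization is not spelled out here, but the core idea of extracting a smooth time-frequency curve, rather than simply taking the per-column maximum, can be illustrated with a small dynamic program that rewards magnitude while penalizing frequency jumps. Everything below (the penalty form, the toy TFR) is illustrative, not the authors' algorithm.

```python
import numpy as np

def extract_tf_curve(tfr, penalty=1.0):
    """Extract one time-frequency ridge by dynamic programming.

    tfr: 2-D array (n_freq x n_time) of magnitudes. At each time step the
    path gains the local magnitude and pays a cost proportional to the jump
    in frequency bin, so the extracted curve is smooth rather than the
    per-column maximum.
    """
    n_freq, n_time = tfr.shape
    score = np.full((n_freq, n_time), -np.inf)
    back = np.zeros((n_freq, n_time), dtype=int)
    score[:, 0] = tfr[:, 0]
    bins = np.arange(n_freq)
    for t in range(1, n_time):
        # trans[i, j]: score of arriving at bin i from previous bin j.
        trans = score[:, t - 1][None, :] - penalty * np.abs(bins[:, None] - bins[None, :])
        back[:, t] = np.argmax(trans, axis=1)
        score[:, t] = tfr[:, t] + trans[np.arange(n_freq), back[:, t]]
    # Backtrack the best path.
    path = np.zeros(n_time, dtype=int)
    path[-1] = np.argmax(score[:, -1])
    for t in range(n_time - 1, 0, -1):
        path[t - 1] = back[path[t], t]
    return path

# Toy TFR: a slowly rising ridge buried in noise.
rng = np.random.default_rng(0)
tfr = rng.random((60, 200))
true = (10 + 0.15 * np.arange(200)).astype(int)
tfr[true, np.arange(200)] += 3.0
print(np.abs(extract_tf_curve(tfr) - true).mean())  # close to 0
```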
Warton, David I; Thibaut, Loïc; Wang, Yi Alice
2017-01-01
Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping); common examples include logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
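The central move of the PIT-trap can be sketched compactly: compute randomized PIT residuals under the fitted model, bootstrap them, and invert the fitted CDF. For brevity the fragment below uses an intercept-only Poisson model as a stand-in for a fitted regression, so it illustrates the mechanism rather than the full method.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)

# "Fit" a Poisson model; here the per-observation means mu are just the
# sample mean (intercept-only), a toy stand-in for a fitted regression.
y = rng.poisson(3.0, size=100)
mu = np.full(y.shape, y.mean())

# Randomized PIT residuals: uniform on (0, 1) under a correct model.
v = rng.uniform(size=y.size)
u = poisson.cdf(y - 1, mu) + v * poisson.pmf(y, mu)

# One PIT-trap resample: bootstrap the (approximately pivotal) u's,
# then invert the fitted marginal CDF at each observation's own mu,
# preserving the marginal distribution of the data.
u_star = rng.choice(u, size=u.size, replace=True)
y_star = poisson.ppf(u_star, mu).astype(int)
print(y.mean(), y_star.mean())
```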
Burkness, Eric C; Hutchison, W D
2009-10-01
Populations of cabbage looper, Trichoplusia ni (Lepidoptera: Noctuidae), were sampled in experimental plots and commercial fields of cabbage (Brassica spp.) in Minnesota during 1998-1999 as part of a larger effort to implement an integrated pest management program. Using a resampling approach and Wald's sequential probability ratio test, sampling plans with different sampling parameters were evaluated using independent presence/absence and enumerative data. Evaluations and comparisons of the different sampling plans were made based on the operating characteristic and average sample number functions generated for each plan and through the use of a decision probability matrix. Values for upper and lower decision boundaries, sequential error rates (alpha, beta), and tally threshold were modified to determine parameter influence on the operating characteristic and average sample number functions. The following parameters resulted in the most desirable operating characteristic and average sample number functions: action threshold of 0.1 proportion of plants infested, tally threshold of 1, alpha = beta = 0.1, upper boundary of 0.15, lower boundary of 0.05, and resampling with replacement. We found that sampling parameters can be modified and evaluated using resampling software to achieve desirable operating characteristic and average sample number functions. Moreover, management of T. ni by binomial sequential sampling should provide a good balance between cost and reliability by minimizing sample size and maintaining a high level of correct decisions (>95%) to treat or not treat.
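Wald's sequential probability ratio test used in this evaluation reduces to comparing a log-likelihood ratio against two boundaries. The sketch below plugs in the abstract's parameter values (lower boundary 0.05, upper boundary 0.15, alpha = beta = 0.1) for binomial presence/absence counts; it is a generic SPRT, not the authors' resampling software.

```python
import math

def sprt_decision(infested, n, p0=0.05, p1=0.15, alpha=0.1, beta=0.1):
    """Wald's SPRT for presence/absence (binomial) sequential sampling.

    Returns 'treat', 'no treat', or 'continue sampling'.
    p0/p1 are the lower/upper decision boundaries on proportion infested.
    """
    a = math.log((1 - beta) / alpha)   # upper log-boundary
    b = math.log(beta / (1 - alpha))   # lower log-boundary
    llr = (infested * math.log(p1 / p0)
           + (n - infested) * math.log((1 - p1) / (1 - p0)))
    if llr >= a:
        return "treat"
    if llr <= b:
        return "no treat"
    return "continue sampling"

# Example: 4 of 20 sampled plants infested.
print(sprt_decision(4, 20))
```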
Han, Guanghui; Liu, Xiabi; Zheng, Guangyuan; Wang, Murong; Huang, Shan
2018-06-06
Ground-glass opacity (GGO) is a common CT imaging sign on high-resolution CT, which means the lesion is more likely to be malignant compared to common solid lung nodules. The automatic recognition of GGO CT imaging signs is of great importance for early diagnosis and possible cure of lung cancers. Present GGO recognition methods employ traditional low-level features, and system performance improves slowly. Considering the high performance of CNN models in the computer vision field, we propose an automatic recognition method for 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNN models. Our hybrid resampling is performed on multiple views and multiple receptive fields, which reduces the risk of missing small or large GGOs by adopting representative sampling panels and processing GGOs at multiple scales simultaneously. The layer-wise fine-tuning strategy has the ability to obtain the optimal fine-tuning model. The multi-CNN model fusion strategy obtains better performance than any single trained model. We evaluated our method on the GGO nodule samples in the publicly available LIDC-IDRI dataset of chest CT scans. The experimental results show that our method yields excellent results with 96.64% sensitivity, 71.43% specificity, and 0.83 F1 score. Our method is a promising approach for applying deep learning to computer-aided analysis of specific CT imaging signs with insufficient labeled images.
Shen, Chung-Wei; Chen, Yi-Hau
2018-03-13
We propose a model selection criterion for semiparametric marginal mean regression based on generalized estimating equations. The work is motivated by a longitudinal study on the physical frailty outcome in the elderly, where the cluster size, that is, the number of the observed outcomes in each subject, is "informative" in the sense that it is related to the frailty outcome itself. The new proposal, called Resampling Cluster Information Criterion (RCIC), is based on the resampling idea utilized in the within-cluster resampling method (Hoffman, Sen, and Weinberg, 2001, Biometrika 88, 1121-1134) and accommodates informative cluster size. The implementation of RCIC, however, is free of performing actual resampling of the data and hence is computationally convenient. Compared with the existing model selection methods for marginal mean regression, the RCIC method incorporates an additional component accounting for variability of the model over within-cluster subsampling, and leads to remarkable improvements in selecting the correct model, regardless of whether the cluster size is informative or not. Applying the RCIC method to the longitudinal frailty study, we identify being female, old age, low income and life satisfaction, and chronic health conditions as significant risk factors for physical frailty in the elderly. © 2018, The International Biometric Society.
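RCIC itself avoids actual resampling, but the within-cluster resampling idea it builds on (Hoffman, Sen, and Weinberg, 2001) is simple to demonstrate: draw one observation per cluster, so informative cluster sizes cannot dominate the estimate, and average over many draws. A toy Python version for a mean estimate:

```python
import numpy as np

def wcr_mean(clusters, n_resamples=2000, seed=0):
    """Within-cluster resampling, toy version for a mean estimate.

    clusters: list of 1-D arrays, one per subject; cluster sizes may be
    informative. Each resample draws ONE observation per cluster, so large
    clusters do not dominate; the per-resample estimates are averaged.
    """
    rng = np.random.default_rng(seed)
    est = np.empty(n_resamples)
    for r in range(n_resamples):
        picks = [c[rng.integers(len(c))] for c in clusters]
        est[r] = np.mean(picks)
    return est.mean(), est.var(ddof=1)

# Informative cluster size: subjects with higher outcomes have more observations.
rng = np.random.default_rng(1)
clusters = [rng.normal(m, 1.0, size=max(1, int(2 * m)))
            for m in rng.uniform(1, 5, 50)]
print(wcr_mean(clusters))
```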
Final Data Usability Summary and Resampling Proposal for Fort Sheridan
1996-03-22
performed. The basic approach discussed here was determined in discussions between Fort Sheridan, the EPA, Illinois EPA, the Army Environmental Center, and its RI consultant, Environmental Science and Engineering, Inc.
Stereo reconstruction from multiperspective panoramas.
Li, Yin; Shum, Heung-Yeung; Tang, Chi-Keung; Szeliski, Richard
2004-01-01
A new approach to computing a panoramic (360 degrees) depth map is presented in this paper. Our approach uses a large collection of images taken by a camera whose motion has been constrained to planar concentric circles. We resample regular perspective images to produce a set of multiperspective panoramas and then compute depth maps directly from these resampled panoramas. Our panoramas sample uniformly in three dimensions: rotation angle, inverse radial distance, and vertical elevation. The use of multiperspective panoramas eliminates the limited overlap present in the original input images and, thus, problems such as those in conventional multibaseline stereo can be avoided. Our approach differs from stereo matching of single-perspective panoramic images taken from different locations, where the epipolar constraints are sine curves. For our multiperspective panoramas, the epipolar geometry, to a first-order approximation, consists of horizontal lines. Therefore, any traditional stereo algorithm can be applied to multiperspective panoramas with little modification. In this paper, we describe two reconstruction algorithms. The first is a cylinder sweep algorithm that uses a small number of resampled multiperspective panoramas to obtain dense 3D reconstruction. The second algorithm, in contrast, uses a large number of multiperspective panoramas and takes advantage of the approximate horizontal epipolar geometry inherent in multiperspective panoramas. It comprises a novel and efficient 1D multibaseline matching technique, followed by tensor voting to extract the depth surface. Experiments show that our algorithms are capable of producing comparably high-quality depth maps, which can be used for applications such as view interpolation.
A PRIM approach to predictive-signature development for patient stratification
Chen, Gong; Zhong, Hua; Belousov, Anton; Devanarayan, Viswanath
2015-01-01
Patients often respond differently to a treatment because of individual heterogeneity. Failures of clinical trials can be substantially reduced if, prior to an investigational treatment, patients are stratified into responders and nonresponders based on biological or demographic characteristics. These characteristics are captured by a predictive signature. In this paper, we propose a procedure to search for predictive signatures based on the approach of patient rule induction method. Specifically, we discuss selection of a proper objective function for the search, present its algorithm, and describe a resampling scheme that can enhance search performance. Through simulations, we characterize conditions under which the procedure works well. To demonstrate practical uses of the procedure, we apply it to two real-world data sets. We also compare the results with those obtained from a recent regression-based approach, Adaptive Index Models, and discuss their respective advantages. In this study, we focus on oncology applications with survival responses. PMID:25345685
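For readers unfamiliar with the patient rule induction method, its top-down peeling step can be sketched in a few lines. The one-dimensional toy below peels a fixed fraction of observations from whichever end of a covariate range yields the higher mean response; the paper's procedure adds an objective suited to survival responses and a resampling scheme, neither of which is reproduced here.

```python
import numpy as np

def prim_peel_1d(x, y, peel_frac=0.1, min_support=0.1):
    """One-dimensional PRIM-style top-down peeling, a toy sketch.

    Repeatedly peels peel_frac of the remaining observations from the end
    of the x-range whose removal leaves the higher mean response, stopping
    when support falls below min_support.
    """
    lo, hi = x.min(), x.max()
    n = len(x)
    while True:
        inside = (x >= lo) & (x <= hi)
        if inside.sum() <= max(1, min_support * n):
            break
        q_lo, q_hi = np.quantile(x[inside], [peel_frac, 1 - peel_frac])
        keep_lo = (x >= q_lo) & (x <= hi)   # candidate: peel from the left
        keep_hi = (x >= lo) & (x <= q_hi)   # candidate: peel from the right
        if y[keep_lo].mean() >= y[keep_hi].mean():
            lo = q_lo
        else:
            hi = q_hi
    return lo, hi

# Responders concentrated at high biomarker values.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 500)
y = (x > 0.7).astype(float) + 0.3 * rng.standard_normal(500)
print(prim_peel_1d(x, y))  # the box should approach (0.7, 1.0)
```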
Realistic generation of natural phenomena based on video synthesis
NASA Astrophysics Data System (ADS)
Wang, Changbo; Quan, Hongyan; Li, Chenhui; Xiao, Zhao; Chen, Xiao; Li, Peng; Shen, Liuwei
2009-10-01
Research on the generation of natural phenomena has many applications in movie special effects, battlefield simulation, virtual reality, etc. Based on video synthesis techniques, a new approach is proposed for the synthesis of natural phenomena, including flowing water and fire flame. From fire and flow videos, seamless video of arbitrary length is generated. Then, the interaction between wind and fire flame is achieved through the skeleton of the flame. Later, the flow is also synthesized by extending the video textures using an edge resample method. Finally, we can integrate the synthesized natural phenomena into a virtual scene.
Zhou, Shuntai; Jones, Corbin; Mieczkowski, Piotr
2015-01-01
ABSTRACT Validating the sampling depth and reducing sequencing errors are critical for studies of viral populations using next-generation sequencing (NGS). We previously described the use of Primer ID to tag each viral RNA template with a block of degenerate nucleotides in the cDNA primer. We now show that low-abundance Primer IDs (offspring Primer IDs) are generated due to PCR/sequencing errors. These artifactual Primer IDs can be removed using a cutoff model for the number of reads required to make a template consensus sequence. We have modeled the fraction of sequences lost due to Primer ID resampling. For a typical sequencing run, less than 10% of the raw reads are lost to offspring Primer ID filtering and resampling. The remaining raw reads are used to correct for PCR resampling and sequencing errors. We also demonstrate that Primer ID reveals bias intrinsic to PCR, especially at low template input or utilization. cDNA synthesis and PCR convert ca. 20% of RNA templates into recoverable sequences, and 30-fold sequence coverage recovers most of these template sequences. We have directly measured the residual error rate to be around 1 in 10,000 nucleotides. We use this error rate and the Poisson distribution to define the cutoff to identify preexisting drug resistance mutations at low abundance in an HIV-infected subject. Collectively, these studies show that >90% of the raw sequence reads can be used to validate template sampling depth and to dramatically reduce the error rate in assessing a genetically diverse viral population using NGS. IMPORTANCE Although next-generation sequencing (NGS) has revolutionized sequencing strategies, it suffers from serious limitations in defining sequence heterogeneity in a genetically diverse population, such as HIV-1 due to PCR resampling and PCR/sequencing errors. The Primer ID approach reveals the true sampling depth and greatly reduces errors. Knowing the sampling depth allows the construction of a model of how to maximize the recovery of sequences from input templates and to reduce resampling of the Primer ID so that appropriate multiplexing can be included in the experimental design. With the defined sampling depth and measured error rate, we are able to assign cutoffs for the accurate detection of minority variants in viral populations. This approach allows the power of NGS to be realized without having to guess about sampling depth or to ignore the problem of PCR resampling, while also being able to correct most of the errors in the data set. PMID:26041299
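The cutoff logic for calling low-abundance variants can be sketched from the numbers in the abstract. Assuming the erroneous calls at a position are approximately Poisson with mean equal to the number of template consensus sequences times the measured residual error rate (about 1 in 10,000 nucleotides), a minimal calculation of the smallest callable variant count is:

```python
from scipy.stats import poisson

def detection_cutoff(n_templates, err_rate=1e-4, alpha=0.001):
    """Smallest variant count not explainable by residual error alone.

    Models the number of erroneous calls at one position among n_templates
    consensus sequences as Poisson with mean n_templates * err_rate, and
    returns the smallest k with P(X >= k) < alpha.
    """
    lam = n_templates * err_rate
    k = 0
    while poisson.sf(k - 1, lam) >= alpha:   # sf(k - 1) = P(X >= k)
        k += 1
    return k

# e.g. 5000 template consensus sequences at the measured 1/10,000 error rate:
print(detection_cutoff(5000))  # variants at or above this count are callable
```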
NASA Astrophysics Data System (ADS)
Singh, A. K.; Toshniwal, D.
2017-12-01
The MODIS Joint Atmosphere products, MODATML2 and MYDATML2 (L2/3), provided by the LAADS DAAC (Level-1 and Atmosphere Archive & Distribution System Distributed Active Archive Center) and re-sampled from medium-resolution MODIS Terra/Aqua satellite data at 5 km scale, contain cloud reflectance, cloud top temperature, water vapor, aerosol optical depth/thickness, and humidity data. These re-sampled data, when used for deriving the climatic effects of aerosols (particularly the cooling effect), still exhibit limitations in the presence of uncertainty in atmospheric artifacts such as aerosol, cloud, and cirrus cloud. The uncertainty in these artifacts poses an important challenge for the estimation of aerosol effects, notably affecting precise regional weather modeling and prediction: forecasting and recommendation applications depend largely on these short-term local conditions (e.g., city/locality-based recommendations to citizens/farmers based on local weather models). Our approach applies artificial intelligence techniques to heterogeneous data (satellite data along with air-quality data from local weather stations, i.e., in situ data) to learn, correct, and predict aerosol effects in the presence of cloud and other atmospheric artifacts, using spatio-temporal correlations and regressions. The Big Data processing pipeline, consisting of correlation and regression techniques developed on the Apache Spark platform, can easily scale to large data sets including many tiles (scenes) and a widened time-scale. Keywords: Climatic Effects of Aerosols, Situation-Aware, Big Data, Apache Spark, MODIS Terra /Aqua, Time Series
NASA Astrophysics Data System (ADS)
Poppick, A. N.; McKinnon, K. A.; Dunn-Sigouin, E.; Deser, C.
2017-12-01
Initial condition climate model ensembles suggest that regional temperature trends can be highly variable on decadal timescales due to characteristics of internal climate variability. Accounting for trend uncertainty due to internal variability is therefore necessary to contextualize recent observed temperature changes. However, while the variability of trends in a climate model ensemble can be evaluated directly (as the spread across ensemble members), internal variability simulated by a climate model may be inconsistent with observations. Observation-based methods for assessing the role of internal variability in trend uncertainty are therefore required. Here, we use a statistical resampling approach to assess trend uncertainty due to internal variability in historical 50-year (1966-2015) winter near-surface air temperature trends over North America. We compare this estimate of trend uncertainty to simulated trend variability in the NCAR CESM1 Large Ensemble (LENS), finding that uncertainty in wintertime temperature trends over North America due to internal variability is largely overestimated by CESM1, by 32% on average. Our observation-based resampling approach is combined with the forced signal from LENS to produce an 'Observational Large Ensemble' (OLENS). The members of OLENS indicate a range of spatially coherent fields of temperature trends resulting from different sequences of internal variability consistent with observations. The smaller trend variability in OLENS suggests that uncertainty in the historical climate change signal in observations due to internal variability is less than suggested by LENS.
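The paper's observational ensemble construction is richer than can be shown here, but one common observation-based recipe for trend uncertainty due to internal variability is a moving-block bootstrap of detrended residuals. The sketch below, on invented AR(1)-like data, shows the general shape of such a calculation; the block length and resampling details are assumptions, not the authors' exact scheme.

```python
import numpy as np

def trend_spread_block_bootstrap(series, block=5, n_boot=1000, seed=0):
    """Trend uncertainty from internal variability via moving-block bootstrap.

    Removes the least-squares trend, resamples residuals in blocks
    (preserving short-term autocorrelation), re-adds the trend, and returns
    the fitted trend plus the standard deviation of refitted trends.
    """
    rng = np.random.default_rng(seed)
    n = len(series)
    t = np.arange(n)
    slope, intercept = np.polyfit(t, series, 1)
    resid = series - (slope * t + intercept)
    n_blocks = int(np.ceil(n / block))
    trends = np.empty(n_boot)
    for i in range(n_boot):
        starts = rng.integers(0, n - block + 1, size=n_blocks)
        boot_resid = np.concatenate([resid[s:s + block] for s in starts])[:n]
        trends[i] = np.polyfit(t, slope * t + intercept + boot_resid, 1)[0]
    return slope, trends.std()

# Synthetic 50-year winter-temperature series with AR(1)-like noise.
rng = np.random.default_rng(1)
noise = np.zeros(50)
for i in range(1, 50):
    noise[i] = 0.5 * noise[i - 1] + rng.standard_normal()
series = 0.02 * np.arange(50) + noise
print(trend_spread_block_bootstrap(series))
```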
Huang, Zhengxing; Chan, Tak-Ming; Dong, Wei
2017-02-01
Major adverse cardiac events (MACE) of acute coronary syndrome (ACS) often occur suddenly, resulting in high mortality and morbidity. Recently, the rapid development of electronic medical records (EMR) provides an opportunity to exploit EMR data to improve the performance of MACE prediction. In this study, we present a novel data-mining-based approach specialized for MACE prediction from a large volume of EMR data. The proposed approach introduces a new classification algorithm that applies over-sampling to minority-class samples and under-sampling to majority-class samples, and integrates the resampling strategy into a boosting framework so that it can effectively handle the imbalance of MACE among ACS patients analogous to domain practice. In each iteration, the method learns a new and stronger MACE prediction model from a more difficult subset of EMR data whose MACE outcomes were wrongly predicted by the previous weak model. We verify the effectiveness of the proposed approach on a clinical dataset containing 2930 ACS patient samples with 268 feature types. While the imbalanced ratio does not seem extreme (25.7%), MACE prediction targets pose a great challenge to traditional methods. As these methods degenerate dramatically with increasing imbalanced ratios, the performance of our approach for predicting MACE remains robust and reaches 0.672 in terms of AUC. On average, the proposed approach improves the performance of MACE prediction by 4.8%, 4.5%, 8.6% and 4.8% over the standard SVM, Adaboost, SMOTE, and the conventional GRACE risk scoring system for MACE prediction, respectively. We consider that the proposed iterative boosting approach has demonstrated great potential to meet the challenge of MACE prediction for ACS patients using a large volume of EMR. Copyright © 2017 Elsevier Inc. All rights reserved.
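The abstract's combination of per-round resampling with boosting follows the general shape of schemes such as RUSBoost. The sketch below under-samples the majority class inside an AdaBoost-style loop; it is a generic illustration of resampling-in-boosting, not the authors' algorithm, and the toy data only roughly mirrors the reported 25.7% imbalance.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def resampling_boost(X, y, n_rounds=10, seed=0):
    """AdaBoost-style loop with per-round majority under-sampling (a sketch)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1 / n)
    models, alphas = [], []
    min_idx = np.flatnonzero(y == 1)
    maj_idx = np.flatnonzero(y == 0)
    for _ in range(n_rounds):
        # Under-sample the majority class to the minority count each round.
        keep_maj = rng.choice(maj_idx, size=len(min_idx), replace=False)
        idx = np.concatenate([min_idx, keep_maj])
        clf = DecisionTreeClassifier(max_depth=2, random_state=0)
        clf.fit(X[idx], y[idx], sample_weight=w[idx])
        pred = clf.predict(X)
        err = np.clip(np.sum(w * (pred != y)) / w.sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        # Boosting weights emphasize wrongly predicted patients next round.
        w *= np.exp(-alpha * np.where(pred == y, 1, -1))
        w /= w.sum()
        models.append(clf)
        alphas.append(alpha)
    def predict(Xq):
        votes = sum(a * np.where(m.predict(Xq) == 1, 1, -1)
                    for a, m in zip(alphas, models))
        return (votes > 0).astype(int)
    return predict

# Imbalanced toy data (roughly a quarter positives).
rng = np.random.default_rng(2)
X = rng.standard_normal((1000, 5))
y = (X[:, 0] + 0.5 * rng.standard_normal(1000) > 0.7).astype(int)
print(resampling_boost(X, y)(X).mean())
```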
NASA Astrophysics Data System (ADS)
Yuan, Shenfang; Chen, Jian; Yang, Weibo; Qiu, Lei
2017-08-01
Fatigue crack growth prognosis is important for prolonging service time, improving safety, and reducing maintenance cost in many safety-critical systems, such as aircraft, wind turbines, bridges, and nuclear plants. Combining fatigue crack growth models with the particle filter (PF) method has proved promising for dealing with the uncertainties during fatigue crack growth and reaching a more accurate prognosis. However, research on prognosis methods integrating on-line crack monitoring with the PF method is still lacking, as is experimental verification. Besides, the PF methods adopted so far are almost all sequential importance resampling-based PFs, which usually encounter sample impoverishment problems and hence perform poorly. To solve these problems, in this paper the piezoelectric transducer (PZT)-based active Lamb wave method is adopted for on-line crack monitoring. The deterministic resampling PF (DRPF), which can overcome the sample impoverishment problem, is proposed for fatigue crack growth prognosis. The proposed method is verified through fatigue tests of attachment lugs, which are an important kind of joint component in aerospace systems.
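For context on the resampling step the DRPF is designed to improve, the standard low-variance (systematic) resampling used in sequential importance resampling PFs looks like the sketch below; the deterministic variant proposed in the paper is not reproduced here.

```python
import numpy as np

def systematic_resample(weights, rng=None):
    """Systematic resampling for a particle filter.

    Draws one uniform offset and takes evenly spaced pointers through the
    cumulative weights. This has lower variance than multinomial resampling,
    though sample impoverishment is reduced rather than eliminated, which
    motivates deterministic variants such as the paper's DRPF.
    """
    rng = rng or np.random.default_rng()
    n = len(weights)
    positions = (rng.uniform() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0            # guard against floating-point round-off
    return np.searchsorted(cumulative, positions)

w = np.array([0.05, 0.05, 0.6, 0.2, 0.1])
print(systematic_resample(w, np.random.default_rng(0)))  # surviving indices
```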
Porto, Paolo; Walling, Desmond E; Cogliandro, Vanessa; Callegari, Giovanni
2016-11-01
In recent years, the fallout radionuclides caesium-137 (137Cs) and unsupported lead-210 (210Pbex) have been successfully used to document rates of soil erosion in many areas of the world, as an alternative to conventional measurements. By virtue of their different half-lives, these two radionuclides are capable of providing information related to different time windows. 137Cs measurements are commonly used to generate information on mean annual erosion rates over the past ca. 50-60 years, whereas 210Pbex measurements are able to provide information relating to a longer period of up to ca. 100 years. However, the time-integrated nature of the estimates of soil redistribution provided by 137Cs and 210Pbex measurements can be seen as a limitation, particularly when viewed in the context of global change and interest in the response of soil redistribution rates to contemporary climate change and land use change. Re-sampling techniques used with these two fallout radionuclides potentially provide a basis for providing information on recent changes in soil redistribution rates. By virtue of the effectively continuous fallout input of 210Pb, the response of the 210Pbex inventory of a soil profile to changing soil redistribution rates, and thus its potential for use with the re-sampling approach, differs from that of 137Cs. Its greater sensitivity to recent changes in soil redistribution rates suggests that 210Pbex may have advantages over 137Cs for use in the re-sampling approach. The potential for using 210Pbex measurements in re-sampling studies is explored further in this contribution. Attention focuses on a small (1.38 ha) forested catchment in southern Italy. The catchment was originally sampled for 210Pbex measurements in 2001 and equivalent samples were collected from points very close to the original sampling points again in 2013. This made it possible to compare the estimates of mean annual erosion related to two different time windows. This comparison suggests that mean annual rates of net soil loss had increased during the period between the two sampling campaigns and that this increase was associated with a shift to an increased sediment delivery ratio. This change was consistent with independent information on likely changes in the sediment response of the study catchment provided by the available records of annual sediment yield and changes in the annual rainfall documented for the local area. Copyright © 2016 Elsevier Ltd. All rights reserved.
From climate-change spaghetti to climate-change distributions for 21st Century California
Dettinger, M.D.
2005-01-01
The uncertainties associated with climate-change projections for California are unlikely to disappear any time soon, and yet important long-term decisions will be needed to accommodate those potential changes. Projection uncertainties have typically been addressed by analysis of a few scenarios, chosen based on availability or to capture the extreme cases among available projections. However, by focusing on more common projections rather than the most extreme projections (using a new resampling method), new insights into current projections emerge: (1) uncertainties associated with future greenhouse-gas emissions are comparable with the differences among climate models, so that neither source of uncertainties should be neglected or underrepresented; (2) twenty-first century temperature projections spread more, overall, than do precipitation scenarios; (3) projections of extremely wet futures for California are true outliers among current projections; and (4) current projections that are warmest tend, overall, to yield a moderately drier California, while the cooler projections yield a somewhat wetter future. The resampling approach applied in this paper also provides a natural opportunity to objectively incorporate measures of model skill and the likelihoods of various emission scenarios into future assessments.
McRoy, Susan; Jones, Sean; Kurmally, Adam
2016-09-01
This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions, and typical refinement approaches of feature selection and merging categories achieved only small improvements in classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an F1 score of 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or for assisting clinical professionals in answering questions. © The Author(s) 2015.
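Both remedies the study compares, weighted classification and resampling, are easy to illustrate with standard tooling. The toy corpus and labels below are invented; only the two class-balancing strategies are the point.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

# Toy corpus with a skewed class distribution (labels are invented, not
# the paper's taxonomy).
questions = ["what are symptoms of lung cancer", "how long is chemo",
             "is this mole cancer", "what causes breast cancer",
             "how to pay for treatment", "side effects of radiation",
             "what stage is my cancer", "diet during chemotherapy"]
labels = np.array(["diagnosis", "treatment", "diagnosis", "diagnosis",
                   "support", "treatment", "diagnosis", "treatment"])

X = TfidfVectorizer().fit_transform(questions)

# Option 1: weighted classification; rare classes get larger weights.
clf_w = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, labels)

# Option 2: resampling; oversample minority classes to smooth the distribution.
idx = np.arange(len(labels))
counts = np.bincount(np.unique(labels, return_inverse=True)[1])
target = int(counts.max())
boosted = np.concatenate([
    resample(idx[labels == c], replace=True, n_samples=target, random_state=0)
    for c in np.unique(labels)])
clf_r = LogisticRegression(max_iter=1000).fit(X[boosted], labels[boosted])
print(clf_w.predict(X[:2]), clf_r.predict(X[:2]))
```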
NASA Technical Reports Server (NTRS)
Benner, R.; Young, W.
1977-01-01
The results of an experimental study conducted to determine the geometric and radiometric effects of double resampling (bi-resampling) of image data during map projection transformations are reported.
Arctic Acoustic Workshop Proceedings, 14-15 February 1989.
1989-06-01
measurements. The measurements reported by Levine et al. (1987) were taken from current and temperature sensors moored in two triangular grids. The internal...requires a resampling of the data series on a uniform depth-time grid. Statistics calculated from the resampled series will be used to test numerical...from an isolated keel. Figure 2: 2-D Modeling Geometry - The model is based on a 2-D Cartesian grid with an axis of symmetry on the left. A pulsed
Yin, Weiwei; Garimalla, Swetha; Moreno, Alberto; Galinski, Mary R; Styczynski, Mark P
2015-08-28
There are increasing efforts to bring high-throughput systems biology techniques to bear on complex animal model systems, often with a goal of learning about underlying regulatory network structures (e.g., gene regulatory networks). However, complex animal model systems typically have significant limitations on cohort sizes, number of samples, and the ability to perform follow-up and validation experiments. These constraints are particularly problematic for many current network learning approaches, which require large numbers of samples and may predict many more regulatory relationships than actually exist. Here, we test the idea that by leveraging the accuracy and efficiency of classifiers, we can construct high-quality networks that capture important interactions between variables in datasets with few samples. We start from a previously-developed tree-like Bayesian classifier and generalize its network learning approach to allow for arbitrary depth and complexity of tree-like networks. Using four diverse sample networks, we demonstrate that this approach performs consistently better at low sample sizes than the Sparse Candidate Algorithm, a representative approach for comparison because it is known to generate Bayesian networks with high positive predictive value. We develop and demonstrate a resampling-based approach to enable the identification of a viable root for the learned tree-like network, important for cases where the root of a network is not known a priori. We also develop and demonstrate an integrated resampling-based approach to the reduction of variable space for the learning of the network. Finally, we demonstrate the utility of this approach via the analysis of a transcriptional dataset of a malaria challenge in a non-human primate model system, Macaca mulatta, suggesting the potential to capture indicators of the earliest stages of cellular differentiation during leukopoiesis. We demonstrate that by starting from effective and efficient approaches for creating classifiers, we can identify interesting tree-like network structures with significant ability to capture the relationships in the training data. This approach represents a promising strategy for inferring networks with high positive predictive value under the constraint of small numbers of samples, meeting a need that will only continue to grow as more high-throughput studies are applied to complex model systems.
Ensemble-based prediction of RNA secondary structures.
Aghaeepour, Nima; Hoos, Holger H
2013-04-24
Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between false negative and false positive base pair predictions. Finally, AveRNA can make use of arbitrary sets of secondary structure prediction procedures and can therefore be used to leverage improvements in prediction accuracy offered by algorithms and energy models developed in the future. Our data, MATLAB software and a web-based version of AveRNA are publicly available at http://www.cs.ubc.ca/labs/beta/Software/AveRNA.
Bradshaw, Corey J. A.; Brook, Barry W.
2016-01-01
There are now many methods available to assess the relative citation performance of peer-reviewed journals. Regardless of their individual faults and advantages, citation-based metrics are used by researchers to maximize the citation potential of their articles, and by employers to rank academic track records. The absolute value of any particular index is arguably meaningless unless compared to other journals, and different metrics result in divergent rankings. To provide a simple yet more objective way to rank journals within and among disciplines, we developed a κ-resampled composite journal rank incorporating five popular citation indices: Impact Factor, Immediacy Index, Source-Normalized Impact Per Paper, SCImago Journal Rank and Google 5-year h-index; this approach provides an index of relative rank uncertainty. We applied the approach to six sample sets of scientific journals from Ecology (n = 100 journals), Medicine (n = 100), Multidisciplinary (n = 50); Ecology + Multidisciplinary (n = 25), Obstetrics & Gynaecology (n = 25) and Marine Biology & Fisheries (n = 25). We then cross-compared the κ-resampled ranking for the Ecology + Multidisciplinary journal set to the results of a survey of 188 publishing ecologists who were asked to rank the same journals, and found a 0.68–0.84 Spearman’s ρ correlation between the two rankings datasets. Our composite index approach therefore approximates relative journal reputation, at least for that discipline. Agglomerative and divisive clustering and multi-dimensional scaling techniques applied to the Ecology + Multidisciplinary journal set identified specific clusters of similarly ranked journals, with only Nature & Science separating out from the others. When comparing a selection of journals within or among disciplines, we recommend collecting multiple citation-based metrics for a sample of relevant and realistic journals to calculate the composite rankings and their relative uncertainty windows. PMID:26930052
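The resampled-composite idea can be sketched numerically: standardize each citation index, repeatedly resample the set of indices, and collect the distribution of each journal's composite rank. The κ details of the authors' scheme are not reproduced, and the journal scores below are invented.

```python
import numpy as np

def composite_rank_uncertainty(scores, n_resamples=5000, seed=0):
    """Bootstrap a composite journal rank from several citation indices.

    scores: (n_journals x n_indices) array. Each resample draws indices
    with replacement, averages the journals' z-scored values, and ranks.
    Returns the median rank and a 90% rank-uncertainty window per journal.
    """
    rng = np.random.default_rng(seed)
    z = (scores - scores.mean(0)) / scores.std(0)      # standardize each index
    n_j, n_i = z.shape
    ranks = np.empty((n_resamples, n_j))
    for r in range(n_resamples):
        pick = rng.integers(0, n_i, size=n_i)          # resample the indices
        composite = z[:, pick].mean(axis=1)
        ranks[r] = (-composite).argsort().argsort() + 1  # rank 1 = best
    return np.median(ranks, 0), np.quantile(ranks, [0.05, 0.95], axis=0)

# Five invented journals x five indices (IF, Immediacy, SNIP, SJR, h5).
scores = np.array([[12.0, 2.5, 3.1, 6.0, 90],
                   [ 4.0, 0.9, 1.4, 1.8, 45],
                   [ 3.5, 1.0, 1.2, 1.5, 40],
                   [ 9.0, 2.0, 2.5, 4.0, 70],
                   [ 1.2, 0.3, 0.6, 0.5, 15]])
med, window = composite_rank_uncertainty(scores)
print(med)
print(window)
```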
Rainy Day: A Remote Sensing-Driven Extreme Rainfall Simulation Approach for Hazard Assessment
NASA Astrophysics Data System (ADS)
Wright, Daniel; Yatheendradas, Soni; Peters-Lidard, Christa; Kirschbaum, Dalia; Ayalew, Tibebu; Mantilla, Ricardo; Krajewski, Witold
2015-04-01
Progress on the assessment of rainfall-driven hazards such as floods and landslides has been hampered by the challenge of characterizing the frequency, intensity, and structure of extreme rainfall at the watershed or hillslope scale. Conventional approaches rely on simplifying assumptions and are strongly dependent on the location, the availability of long-term rain gage measurements, and the subjectivity of the analyst. Regional and global-scale rainfall remote sensing products provide an alternative, but are limited by relatively short (~15-year) observational records. To overcome this, we have coupled these remote sensing products with a space-time resampling framework known as stochastic storm transposition (SST). SST "lengthens" the rainfall record by resampling from a catalog of observed storms from a user-defined region, effectively recreating the regional extreme rainfall hydroclimate. This coupling has been codified in Rainy Day, a Python-based platform for quickly generating large numbers of probabilistic extreme rainfall "scenarios" at any point on the globe. Rainy Day is readily compatible with any gridded rainfall dataset. The user can optionally incorporate regional rain gage or weather radar measurements for bias correction using the Precipitation Uncertainties for Satellite Hydrology (PUSH) framework. Results from Rainy Day using the CMORPH satellite precipitation product are compared with local observations in two examples. The first example is peak discharge estimation in a medium-sized (~4000 square km) watershed in the central United States performed using CUENCAS, a parsimonious physically-based distributed hydrologic model. The second example is rainfall frequency analysis for Saint Lucia, a small volcanic island in the eastern Caribbean that is prone to landslides and flash floods. The distinct rainfall hydroclimates of the two example sites illustrate the flexibility of the approach and its usefulness for hazard analysis in data-poor regions.
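Rainy Day's stochastic storm transposition engine is far richer than can be shown here, but its resample-and-transpose core can be caricatured in a few lines: draw a Poisson number of storms per synthetic year from a catalog, place each at a random location in the transposition domain, and record the basin's annual maximum. Every number and the simple distance-decay rainfall model below are invented for illustration.

```python
import numpy as np

def sst_annual_maxima(storm_catalog, domain, watershed, n_years=1000,
                      storms_per_year=4.0, seed=0):
    """Toy stochastic storm transposition (SST), heavily simplified.

    storm_catalog: list of (peak_depth, footprint_radius) storms observed
    anywhere in the region. Real SST transposes full space-time rainfall
    fields; this sketch keeps only the resample-and-transpose core.
    """
    rng = np.random.default_rng(seed)
    (x0, y0, x1, y1), (wx, wy) = domain, watershed
    maxima = np.zeros(n_years)
    for yr in range(n_years):
        best = 0.0
        for _ in range(rng.poisson(storms_per_year)):
            depth, radius = storm_catalog[rng.integers(len(storm_catalog))]
            cx, cy = rng.uniform(x0, x1), rng.uniform(y0, y1)
            d = np.hypot(cx - wx, cy - wy)
            # Basin rainfall decays with distance from the storm center.
            best = max(best, depth * max(0.0, 1 - d / radius))
        maxima[yr] = best
    return maxima

catalog = [(80, 40), (120, 25), (60, 60), (150, 20), (95, 35)]  # mm, km
maxima = sst_annual_maxima(catalog, (0, 0, 200, 200), (100, 100))
print(np.quantile(maxima, 0.99))  # rough 100-year basin rainfall estimate
```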
Classifier performance prediction for computer-aided diagnosis using a limited dataset.
Sahiner, Berkman; Chan, Heang-Ping; Hadjiiski, Lubomir
2008-04-01
In a practical classifier design problem, the true population is generally unknown and the available sample is finite-sized. A common approach is to use a resampling technique to estimate the performance of the classifier that will be trained with the available sample. We conducted a Monte Carlo simulation study to compare the ability of the different resampling techniques in training the classifier and predicting its performance under the constraint of a finite-sized sample. The true population for the two classes was assumed to be multivariate normal distributions with known covariance matrices. Finite sets of sample vectors were drawn from the population. The true performance of the classifier is defined as the area under the receiver operating characteristic curve (AUC) when the classifier designed with the specific sample is applied to the true population. We investigated methods based on the Fukunaga-Hayes and the leave-one-out techniques, as well as three different types of bootstrap methods, namely, the ordinary, 0.632, and 0.632+ bootstrap. The Fisher's linear discriminant analysis was used as the classifier. The dimensionality of the feature space was varied from 3 to 15. The sample size n2 from the positive class was varied between 25 and 60, while the number of cases from the negative class was either equal to n2 or 3n2. Each experiment was performed with an independent dataset randomly drawn from the true population. Using a total of 1000 experiments for each simulation condition, we compared the bias, the variance, and the root-mean-squared error (RMSE) of the AUC estimated using the different resampling techniques relative to the true AUC (obtained from training on a finite dataset and testing on the population). Our results indicated that, under the study conditions, there can be a large difference in the RMSE obtained using different resampling methods, especially when the feature space dimensionality is relatively large and the sample size is small. Under this type of conditions, the 0.632 and 0.632+ bootstrap methods have the lowest RMSE, indicating that the difference between the estimated and the true performances obtained using the 0.632 and 0.632+ bootstrap will be statistically smaller than those obtained using the other three resampling methods. Of the three bootstrap methods, the 0.632+ bootstrap provides the lowest bias. Although this investigation is performed under some specific conditions, it reveals important trends for the problem of classifier performance prediction under the constraint of a limited dataset.
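Of the estimators compared, the 0.632 bootstrap is simple enough to sketch. The fragment below estimates the AUC of Fisher's linear discriminant as 0.368 times the apparent (resubstitution) AUC plus 0.632 times the out-of-bag AUC; the 0.632+ correction for overfitting is omitted, and the two-class Gaussian toy data echoes the simulation design only loosely.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

def auc_632(X, y, n_boot=200, seed=0):
    """0.632 bootstrap estimate of classifier AUC (Fisher's LDA here).

    Combines the optimistic apparent AUC (train = test) with the
    pessimistic out-of-bag AUC: 0.368 * apparent + 0.632 * oob.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    apparent = roc_auc_score(
        y, LinearDiscriminantAnalysis().fit(X, y).decision_function(X))
    oob_aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # bootstrap sample
        oob = np.setdiff1d(np.arange(n), idx)      # left-out cases
        if len(np.unique(y[idx])) < 2 or len(np.unique(y[oob])) < 2:
            continue
        m = LinearDiscriminantAnalysis().fit(X[idx], y[idx])
        oob_aucs.append(roc_auc_score(y[oob], m.decision_function(X[oob])))
    return 0.368 * apparent + 0.632 * np.mean(oob_aucs)

# Two multivariate-normal classes, small sample, moderate dimensionality.
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((30, 8)),
               rng.standard_normal((30, 8)) + 0.7])
y = np.repeat([0, 1], 30)
print(auc_632(X, y))
```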
Significance of the impact of motion compensation on the variability of PET image features
NASA Astrophysics Data System (ADS)
Carles, M.; Bach, T.; Torres-Espallardo, I.; Baltas, D.; Nestle, U.; Martí-Bonmatí, L.
2018-03-01
In lung cancer, quantification by positron emission tomography/computed tomography (PET/CT) imaging presents challenges due to respiratory movement. Our primary aim was to study the impact of motion compensation implied by retrospectively gated (4D)-PET/CT on the variability of PET quantitative parameters. Its significance was evaluated by comparison with the variability due to (i) the voxel size in image reconstruction and (ii) the voxel size in image post-resampling. The method employed for feature extraction was chosen based on the analysis of (i) the effect of discretization of the standardized uptake value (SUV) on complementarity between texture features (TF) and conventional indices, (ii) the impact of the segmentation method on the variability of image features, and (iii) the variability of image features across the time-frame of 4D-PET. Thirty-one PET-features were involved. Three SUV discretization methods were applied: a constant width (SUV resolution) of the resampling bin (method RW), a constant number of bins (method RN) and RN on the image obtained after histogram equalization (method EqRN). The segmentation approaches evaluated were 40% of SUVmax and the contrast-oriented algorithm (COA). Parameters derived from 4D-PET images were compared with values derived from the PET image obtained for (i) the static protocol used in our clinical routine (3D) and (ii) the 3D image post-resampled to the voxel size of the 4D image and PET image derived after modifying the reconstruction of the 3D image to comprise the voxel size of the 4D image. Results showed that TF complementarity with conventional indices was sensitive to the SUV discretization method. In the comparison of COA and 40% contours, despite the values not being interchangeable, all image features showed strong linear correlations (r > 0.91, p ≪ 0.001). Across the time-frames of 4D-PET, all image features followed a normal distribution in most patients. For our patient cohort, the compensation of tumor motion did not have a significant impact on the quantitative PET parameters. The variability of PET parameters due to voxel size in image reconstruction was more significant than variability due to voxel size in image post-resampling. In conclusion, most of the parameters (apart from the contrast of neighborhood matrix) were robust to the motion compensation implied by 4D-PET/CT. The impact on parameter variability due to the voxel size in image reconstruction and in image post-resampling could not be assumed to be equivalent.
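One plausible reading of the three SUV discretization schemes named above (RW, RN, EqRN) is sketched below; the bin width, bin count, and rank-based equalization are assumptions about the paper's exact definitions, and the SUV map is synthetic.

```python
import numpy as np

def discretize_rw(suv, bin_width=0.5):
    """Method RW: constant bin width (fixed SUV resolution)."""
    return np.floor(suv / bin_width).astype(int)

def discretize_rn(suv, n_bins=64):
    """Method RN: constant number of bins over the lesion's SUV range."""
    lo, hi = suv.min(), suv.max()
    b = np.floor(n_bins * (suv - lo) / (hi - lo + 1e-12)).astype(int)
    return np.minimum(b, n_bins - 1)

def discretize_eqrn(suv, n_bins=64):
    """Method EqRN: RN applied after histogram equalization (rank-based)."""
    ranks = suv.ravel().argsort().argsort().reshape(suv.shape)
    return (n_bins * ranks / ranks.size).astype(int)

# Toy lesion SUV map; texture features would be computed on these grids.
rng = np.random.default_rng(0)
suv = np.clip(rng.gamma(4.0, 1.0, size=(16, 16)), 0.5, None)
print(len(np.unique(discretize_rw(suv))),
      len(np.unique(discretize_rn(suv))),
      len(np.unique(discretize_eqrn(suv))))
```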
Kück, Patrick; Meusemann, Karen; Dambach, Johannes; Thormann, Birthe; von Reumont, Björn M; Wägele, Johann W; Misof, Bernhard
2010-03-31
Methods of alignment masking, which refers to the technique of excluding alignment blocks prior to tree reconstruction, have been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well defined methods to identify randomness in sequence alignments has prevented a routine application of alignment masking. In this study, we compared the effects on tree reconstructions of the most commonly used profiling method (GBLOCKS), which uses a predefined set of rules in combination with alignment masking, with a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window, using different data sets and alignment methods. While the GBLOCKS approach excludes variable sections above a certain threshold, the choice of which is arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and is therefore more objective. ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling leads to an even distribution of scores of randomly similar sequences, allowing the randomness of the observed sequence similarity to be assessed. Testing performance on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. Concurrently, ALISCORE is capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the greatest decrease in conflict. Alignment masking improves the signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction. Given the robust performance of alignment profiling, alignment masking should routinely be used to improve tree reconstructions. Parametric methods of alignment profiling can be easily extended to more complex likelihood-based models of sequence evolution, which opens the possibility of further improvements.
NASA Astrophysics Data System (ADS)
Beckers, J.; Weerts, A.; Tijdeman, E.; Welles, E.; McManamon, A.
2013-12-01
To provide reliable and accurate seasonal streamflow forecasts for water resources management, several operational hydrologic agencies and hydropower companies around the world use the Extended Streamflow Prediction (ESP) procedure. The ESP in its original implementation does not accommodate any additional information that the forecaster may have about expected deviations from climatology in the near future. Several attempts have been made to improve the skill of the ESP forecast, especially for areas which are affected by teleconnections (e.g. ENSO, PDO), via selection (Hamlet and Lettenmaier, 1999) or weighting schemes (Werner et al., 2004; Wood and Lettenmaier, 2006; Najafi et al., 2012). A disadvantage of such schemes is that they lead to a reduction of the signal-to-noise ratio of the probabilistic forecast. To overcome this, we propose a resampling method conditional on climate indices to generate the meteorological time series used in the ESP. The method can be used to generate a large number of meteorological ensemble members in order to improve the statistical properties of the ensemble. The effectiveness of the method was demonstrated in a real-time operational hydrologic seasonal forecast system for the Columbia River basin operated by the Bonneville Power Administration. The forecast skill of the k-nn resampler was tested against the original ESP for three basins at the long-range seasonal time scale. The BSS and CRPSS were used to compare the results to those of the original ESP method. Positive forecast skill scores were found for the resampler method conditioned on different indices for the prediction of spring peak flows in the Dworshak and Hungry Horse basins. For the Libby Dam basin, however, no improvement in skill was found. The proposed resampling method is a promising practical approach that can add skill to ESP forecasts at the seasonal time scale. Further improvement is possible by fine-tuning the method and selecting the most informative climate indices for the region of interest.
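A k-nn resampler of this kind can be sketched directly: rank historical years by the distance of their climate index from the current value, then resample the nearest years' meteorological traces with a rank-based kernel. The 1/rank kernel and the synthetic index values below are assumptions, not the authors' configuration.

```python
import numpy as np

def knn_conditional_resample(index_history, trace_years, current_index,
                             k=10, n_members=500, seed=0):
    """k-nn resampling of historical met years conditioned on a climate index.

    index_history: climate-index value (e.g. an ENSO index) per historical
    year; trace_years: the year labels of the available meteorological
    traces. The k years whose index is closest to the current value are
    resampled with rank-based weights, yielding an enlarged, climate-
    conditioned ensemble for ESP.
    """
    rng = np.random.default_rng(seed)
    dist = np.abs(index_history - current_index)
    nearest = np.argsort(dist)[:k]
    w = 1.0 / (np.arange(k) + 1)       # classic 1/rank k-nn kernel
    w /= w.sum()
    return rng.choice(trace_years[nearest], size=n_members, p=w)

years = np.arange(1950, 2013)
enso = np.random.default_rng(1).normal(0, 1, size=years.size)
members = knn_conditional_resample(enso, years, current_index=1.5)
print(np.unique(members))  # historical years feeding the ESP ensemble
```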
System for monitoring non-coincident, nonstationary process signals
Gross, Kenneth C.; Wegerich, Stephan W.
2005-01-04
An improved system for monitoring non-coincident, non-stationary process signals. The mean, variance, and length of a reference signal are defined by an automated system, followed by the identification of the leading and falling edges of a monitored signal and the length of the monitored signal. The monitored signal is compared to the reference signal, and the monitored signal is resampled in accordance with the reference signal. The reference signal is then correlated with the resampled monitored signal such that the reference signal and the resampled monitored signal are coincident in time with each other. The resampled monitored signal is then compared to the reference signal to determine whether the resampled monitored signal is within a set of predesignated operating conditions.
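The monitoring loop described in this patent record can be caricatured as resample, align, compare. The Python sketch below uses linear-interpolation resampling and circular cross-correlation for alignment; the edge identification and the actual acceptance statistics of the patented system are not reproduced.

```python
import numpy as np

def compare_to_reference(reference, monitored, threshold=3.0):
    """Resample a monitored signal to a reference's length, align the two
    by circular cross-correlation, and flag large deviations (a sketch)."""
    n = len(reference)
    # Resample the monitored signal onto the reference's time base.
    resampled = np.interp(np.linspace(0, 1, n),
                          np.linspace(0, 1, len(monitored)), monitored)
    # Correlate to find the shift making both signals coincident in time.
    ref0 = reference - reference.mean()
    mon0 = resampled - resampled.mean()
    corr = [np.dot(ref0, np.roll(mon0, s)) for s in range(n)]
    aligned = np.roll(resampled, int(np.argmax(corr)))
    # Compare against a simple amplitude criterion.
    resid = aligned - reference
    within_limits = np.abs(resid).max() <= threshold * reference.std()
    return within_limits, resid

t_ref = np.linspace(0, 1, 200, endpoint=False)
t_mon = np.linspace(0, 1, 260, endpoint=False)
reference = np.sin(2 * np.pi * 3 * t_ref)
monitored = np.sin(2 * np.pi * 3 * (t_mon - 0.1))   # longer and time-shifted
print(compare_to_reference(reference, monitored)[0])  # True: within limits
```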
A Sequential Ensemble Prediction System at Convection Permitting Scales
NASA Astrophysics Data System (ADS)
Milan, M.; Simmer, C.
2012-04-01
A Sequential Assimilation Method (SAM) following some aspects of particle filtering with resampling, also called SIR (Sequential Importance Resampling), is introduced and applied in the framework of an Ensemble Prediction System (EPS) for weather forecasting on convection-permitting scales, with a focus on precipitation forecasting. At this scale and beyond, the atmosphere increasingly exhibits chaotic behaviour and nonlinear state-space evolution due to convectively driven processes. One way to take full account of nonlinear state developments is particle filter methods; their basic idea is the representation of the model probability density function by a number of ensemble members weighted by their likelihood given the observations. In particular, particle filtering with resampling abandons ensemble members (particles) with low weights and restores the original number of particles by adding multiple copies of the members with high weights. In our SIR-like implementation we replace the likelihood-based definition of weights with a metric which quantifies the "distance" between the observed atmospheric state and the states simulated by the ensemble members. We also introduce a methodology to counteract filter degeneracy, i.e. the collapse of the simulated state space. To this end we propose a combination of resampling that takes account of simulated state-space clustering, and nudging. By keeping cluster representatives during resampling and filtering, the method maintains the potential for nonlinear system-state development. We assume that a particle cluster with initially low likelihood may evolve into a state space with higher likelihood at a subsequent filter time, thus mimicking nonlinear system-state developments (e.g. sudden convection initiation) and remedying timing errors for convection due to model errors and/or imperfect initial conditions. We apply a simplified version of the resampling: the particles with the highest weights in each cluster are duplicated; during the model evolution, one particle of each pair evolves using the forward model, while the second is nudged towards the radar and satellite observations during its evolution based on the forward model.
Communication Optimizations for a Wireless Distributed Prognostic Framework
NASA Technical Reports Server (NTRS)
Saha, Sankalita; Saha, Bhaskar; Goebel, Kai
2009-01-01
A distributed architecture for prognostics is an essential step in prognostic research in order to enable feasible real-time system health management. Communication overhead is an important design problem for such systems. In this paper we focus on communication issues faced in the distributed implementation of an important class of algorithms for prognostics: particle filters. In spite of being computation- and memory-intensive, particle filters lend themselves well to distributed implementation except for one significant step: resampling. We propose a new resampling scheme, called parameterized resampling, that attempts to reduce communication between collaborating nodes in a distributed wireless sensor network. Analysis and comparison with relevant resampling schemes are also presented, in the context of minimizing communication overhead. A battery health management system is used as a target application. Our proposed scheme performs significantly better than existing schemes by reducing both the length and the total number of communication messages exchanged, while not compromising prediction accuracy and precision. Future work will explore the effects of the new resampling scheme on the overall computational performance of the whole system, as well as full implementation of the new scheme on Sun SPOT devices. Exploring different network architectures for efficient communication is an important future research direction as well.
Mir, Taskia; Dirks, Peter; Mason, Warren P; Bernstein, Mark
2014-10-01
This is a qualitative study designed to examine patient acceptability of re-sampling surgery for glioblastoma multiforme (GBM) electively post-therapy or at asymptomatic relapse. Thirty patients were selected using the convenience sampling method and interviewed. Patients were presented with hypothetical scenarios including a scenario in which the surgery was offered to them routinely and a scenario in which the surgery was in a clinical trial. The results of the study suggest that about two thirds of the patients offered the surgery on a routine basis would be interested, and half of the patients would agree to the surgery as part of a clinical trial. Several overarching themes emerged, some of which include: patients expressed ethical concerns about offering financial incentives or compensation to the patients or surgeons involved in the study; patients were concerned about appropriate communication and full disclosure about the procedures involved, the legalities of tumor ownership and the use of the tumor post-surgery; patients may feel alone or vulnerable when they are approached about the surgery; patients and their families expressed immense trust in their surgeon and indicated that this trust is a major determinant of their agreeing to surgery. The overall positive response to re-sampling surgery suggests that this procedure, if designed with all the ethical concerns attended to, would be welcomed by most patients. This approach of asking patients beforehand if a treatment innovation is acceptable would appear to be more practical and ethically desirable than previous practice.
NASA Astrophysics Data System (ADS)
Collins, Jarrod A.; Heiselman, Jon S.; Weis, Jared A.; Clements, Logan W.; Simpson, Amber L.; Jarnagin, William R.; Miga, Michael I.
2017-03-01
In image-guided liver surgery (IGLS), sparse representations of the anterior organ surface may be collected intraoperatively to drive image-to-physical space registration. Soft tissue deformation represents a significant source of error for IGLS techniques. This work investigates the impact of surface data quality on current surface-based IGLS registration methods. We characterize the robustness of our IGLS registration methods to noise in organ surface digitization, studying this within a novel human-to-phantom data framework that allows rapid evaluation of clinically realistic data and noise patterns on a fully characterized hepatic deformation phantom. Additionally, we implement a surface data resampling strategy designed to decrease the impact of differences in surface acquisition. For this analysis, n=5 cases of clinical intraoperative data, consisting of organ surface and salient feature digitizations from open liver resection, were collected and analyzed within our human-to-phantom validation framework. As expected, results indicate that increasing levels of noise in surface acquisition cause registration fidelity to deteriorate. With respect to rigid registration using the raw and resampled data at clinically realistic levels of noise (i.e., a magnitude of 1.5 mm), resampling improved target registration error (TRE) by 21%. In terms of nonrigid registration, registrations using resampled data outperformed the raw data result by 14% at clinically realistic noise levels and were less susceptible to noise across the range investigated. These results demonstrate the types of analyses our novel human-to-phantom validation framework can provide and indicate the considerable benefits of resampling strategies.
Reconstruction of dynamical systems from resampled point processes produced by neuron models
NASA Astrophysics Data System (ADS)
Pavlova, Olga N.; Pavlov, Alexey N.
2018-04-01
Characterization of dynamical features of chaotic oscillations from point processes is based on embedding theorems for non-uniformly sampled signals, such as sequences of interspike intervals (ISIs). This theoretical background confirms the ability to reconstruct attractors from ISIs generated by chaotically driven neuron models. The quality of such reconstruction depends on the available length of the analyzed dataset. We discuss how data resampling improves the reconstruction for short datasets and show that this effect is observed for different types of spike-generation mechanisms.
Datamining approaches for modeling tumor control probability.
Naqa, Issam El; Deasy, Joseph O; Mu, Yi; Huang, Ellen; Hope, Andrew J; Lindsay, Patricia E; Apte, Aditya; Alaly, James; Bradley, Jeffrey D
2010-11-01
Tumor control probability (TCP) in radiotherapy is determined by complex interactions between tumor biology, tumor microenvironment, radiation dosimetry, and patient-related variables. The complexity of these heterogeneous variable interactions constitutes a challenge for building predictive models for routine clinical practice. We describe a datamining framework that can unravel the higher-order relationships among dosimetric dose-volume prognostic variables, interrogate various radiobiological processes, and generalize to unseen data when applied prospectively. Several datamining approaches are discussed, including dose-volume metrics, equivalent uniform dose, a mechanistic Poisson model, and model-building methods using statistical regression and machine learning techniques. Institutional datasets of non-small cell lung cancer (NSCLC) patients are used to demonstrate these methods. The performance of the different methods was evaluated using bivariate Spearman rank correlations (rs). Over-fitting was controlled via resampling methods. Using a dataset of 56 patients with primary NSCLC tumors and 23 candidate variables, we estimated GTV volume and V75 to be the best model parameters for predicting TCP using statistical resampling and a logistic model. Using these variables, the support vector machine (SVM) kernel method provided superior performance for TCP prediction, with rs=0.68 on leave-one-out testing, compared to logistic regression (rs=0.4), Poisson-based TCP (rs=0.33), and the cell-kill equivalent uniform dose model (rs=0.17). The prediction of treatment response can be improved by utilizing datamining approaches, which are able to unravel important nonlinear complex interactions among model variables and have the capacity to predict on unseen data for prospective clinical applications.
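A minimal sketch of the kind of leave-one-out evaluation with Spearman rank correlation described above, using a generic SVM regressor on synthetic stand-ins for the dosimetric variables; the data, model settings, and variable names are illustrative assumptions, not the study's.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(56, 2))   # stand-ins for two predictors, e.g. GTV volume and V75
y = (X @ [0.8, 0.5] + rng.normal(scale=0.5, size=56) > 0).astype(float)  # toy TCP outcome

# leave-one-out: each patient is predicted by a model trained on the others
preds = np.empty_like(y)
for train, test in LeaveOneOut().split(X):
    model = SVR(kernel="rbf").fit(X[train], y[train])
    preds[test] = model.predict(X[test])

rs, _ = spearmanr(preds, y)
print(f"leave-one-out Spearman rs = {rs:.2f}")
```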
Resampling: A Marriage of Computers and Statistics. ERIC/TM Digest.
ERIC Educational Resources Information Center
Rudner, Lawrence M.; Shafer, Mary Morello
Advances in computer technology are making it possible for educational researchers to use simpler statistical methods to address a wide range of questions with smaller data sets and fewer, and less restrictive, assumptions. This digest introduces computationally intensive statistics, collectively called resampling techniques. Resampling is a…
Spatial Quality Evaluation of Resampled Unmanned Aerial Vehicle-Imagery for Weed Mapping.
Borra-Serrano, Irene; Peña, José Manuel; Torres-Sánchez, Jorge; Mesas-Carrascosa, Francisco Javier; López-Granados, Francisca
2015-08-12
Unmanned aerial vehicles (UAVs) combined with different spectral range sensors are an emerging technology for providing early weed maps for optimizing herbicide applications. Considering that weeds, at very early phenological stages, are similar spectrally and in appearance, three major components are relevant: spatial resolution, type of sensor, and classification algorithm. Resampling is a technique to create a new version of an image with a different width and/or height in pixels, and it has been used in satellite imagery with different spatial and temporal resolutions. In this paper, the efficiency of resampled images (RS-images) created from real UAV-images (the UAVs were equipped with two types of sensors, i.e., visible and visible-plus-near-infrared spectra) captured at different altitudes is examined to test the quality of the RS-image output. The performance of the object-based image analysis (OBIA) implemented for early weed mapping using different weed thresholds was also evaluated. Our results showed that resampling accurately extracted the spectral values from high-spatial-resolution UAV-images at an altitude of 30 m, and that the RS-image data at altitudes of 60 and 100 m were able to provide accurate weed cover and herbicide application maps comparable with those from UAV-images from real flights.
Bias-Corrected Estimation of Noncentrality Parameters of Covariance Structure Models
ERIC Educational Resources Information Center
Raykov, Tenko
2005-01-01
A bias-corrected estimator of noncentrality parameters of covariance structure models is discussed. The approach represents an application of the bootstrap methodology for purposes of bias correction, and utilizes the relation between the average of resampled conventional noncentrality parameter estimates and their sample counterpart. The…
NASA Astrophysics Data System (ADS)
Granade, Christopher; Wiebe, Nathan
2017-08-01
A major challenge facing existing sequential Monte Carlo methods for parameter estimation in physics stems from the inability of existing approaches to robustly deal with experiments that have different mechanisms that yield the results with equivalent probability. We address this problem here by proposing a form of particle filtering that clusters the particles that comprise the sequential Monte Carlo approximation to the posterior before applying a resampler. Through a new graphical approach to thinking about such models, we are able to devise an artificial-intelligence-based strategy that automatically learns the shape and number of the clusters in the support of the posterior. We demonstrate the power of our approach by applying it to randomized gap estimation and a form of low circuit-depth phase estimation where existing methods from the physics literature either exhibit much worse performance or even fail completely.
An add-in implementation of the RESAMPLING syntax under Microsoft EXCEL.
Meineke, I
2000-10-01
The RESAMPLING syntax defines a set of powerful commands that allow the programming of probabilistic statistical models with few, easily memorized statements. This paper presents an implementation of the RESAMPLING syntax using Microsoft EXCEL with Microsoft WINDOWS(R) as a platform. Two examples are given to demonstrate typical applications of RESAMPLING in biomedicine. Details of the implementation, with special emphasis on the programming environment, are discussed at length. The add-in is available electronically to interested readers upon request. The add-in facilitates numerical statistical analyses of data from within EXCEL in a convenient way.
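The EXCEL add-in itself is not reproduced in the abstract; as a rough Python analogue of a typical RESAMPLING-style program, the sketch below estimates a permutation p-value for a difference in group means. The comments mirror the flavor of RESAMPLING-style commands, not their exact syntax, and the data are invented.

```python
import numpy as np

# Estimate the chance that the observed mean difference between a treated
# and a control group could arise by random shuffling alone.
rng = np.random.default_rng(42)
treated = np.array([12.1, 13.4, 11.8, 14.0, 12.9])
control = np.array([11.0, 12.2, 10.9, 11.7, 12.0])
observed = treated.mean() - control.mean()

pool = np.concatenate([treated, control])
count = 0
for _ in range(10_000):                       # REPEAT 10000
    rng.shuffle(pool)                         # SHUFFLE pool
    diff = pool[:5].mean() - pool[5:].mean()  # TAKE / MEAN / SUBTRACT
    count += diff >= observed                 # SCORE
print("p =", count / 10_000)
```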
NASA Astrophysics Data System (ADS)
Bureick, Johannes; Alkhatib, Hamza; Neumann, Ingo
2016-03-01
In many geodetic engineering applications it is necessary to describe a measured data point cloud, acquired, e.g., by laser scanner, by means of free-form curves or surfaces, e.g., with B-splines as basis functions. The state-of-the-art approaches to determining B-splines yield results that are seriously affected by the occurrence of data gaps and outliers. Optimal and robust B-spline fitting depends, however, on optimal selection of the knot vector. Hence, our approach combines Monte Carlo methods with the location and curvature of the measured data in order to determine the knot vector of the B-spline in such a way that no oscillating effects occur at the edges of data gaps. We introduce an optimized approach based on weights computed by means of resampling techniques. In order to minimize the effect of outliers, we apply robust M-estimators for the estimation of control points. The approach is applied to a multi-sensor system based on kinematic terrestrial laser scanning in the field of rail track inspection.
Walter C. Shortle; Kevin T. Smith; Andrei G. Lapenis
2017-01-01
A soil resampling approach has detected an early stage of recovery in the cation chemistry of spruce forest soil due to reductions in acid deposition. That approach is limited by the lack of soil data and archived soil samples prior to major increases in acid deposition during the latter half of the 20th century. An alternative approach is the dendrochemical analysis...
ERIC Educational Resources Information Center
Hand, Michael L.
1990-01-01
Use of the bootstrap resampling technique (BRT) is assessed in its application to resampling analysis associated with measurement of payment allocation errors by federally funded Family Assistance Programs. The BRT is applied to a food stamp quality control database in Oregon. This analysis highlights the outlier-sensitivity of the…
Application of a New Resampling Method to SEM: A Comparison of S-SMART with the Bootstrap
ERIC Educational Resources Information Center
Bai, Haiyan; Sivo, Stephen A.; Pan, Wei; Fan, Xitao
2016-01-01
Among the commonly used resampling methods of dealing with small-sample problems, the bootstrap enjoys the widest applications because it often outperforms its counterparts. However, the bootstrap still has limitations when its operations are contemplated. Therefore, the purpose of this study is to examine an alternative, new resampling method…
A Visual mining based framework for classification accuracy estimation
NASA Astrophysics Data System (ADS)
Arun, Pattathal Vijayakumar
2013-12-01
Classification techniques have been widely used in different remote sensing applications, and correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters but do not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual-mining-based framework for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. These tools in integration can provide an efficient approach for obtaining information about improvements in classification accuracy and can help in refining the training data set. We illustrate the framework by investigating the effects of various resampling methods on classification accuracy, finding that bilinear (BL) resampling is best suited for preserving radiometric characteristics. We also investigate the optimal number of folds required for effective analysis of LISS-IV images.
Tracking the Gender Pay Gap: A Case Study
ERIC Educational Resources Information Center
Travis, Cheryl B.; Gross, Louis J.; Johnson, Bruce A.
2009-01-01
This article provides a short introduction to standard considerations in the formal study of wages and illustrates the use of multiple regression and resampling simulation approaches in a case study of faculty salaries at one university. Multiple regression is especially beneficial where it provides information on strength of association, specific…
Modified Polar-Format Software for Processing SAR Data
NASA Technical Reports Server (NTRS)
Chen, Curtis
2003-01-01
HMPF is a computer program that implements a modified polar-format algorithm for processing data from spaceborne synthetic-aperture radar (SAR) systems. Unlike prior polar-format processing algorithms, this algorithm is based on the assumption that the radar signal wavefronts are spherical rather than planar. The algorithm provides for resampling of SAR pulse data from slant range to radial distance from the center of a reference sphere that is nominally the local Earth surface. Then, invoking the projection-slice theorem, the resampled pulse data are Fourier-transformed over radial distance, arranged in the wavenumber domain according to the acquisition geometry, resampled to a Cartesian grid, and inverse-Fourier-transformed. The result of this process is the focused SAR image. HMPF, and perhaps other programs that implement variants of the algorithm, may give better accuracy than do prior algorithms for processing strip-map SAR data from high altitudes and may give better phase preservation relative to prior polar-format algorithms for processing spotlight-mode SAR data.
A quasi-dense matching approach and its calibration application with Internet photos.
Wan, Yanli; Miao, Zhenjiang; Wu, Q M Jonathan; Wang, Xifu; Tang, Zhen; Wang, Zhifei
2015-03-01
This paper proposes a quasi-dense matching approach to the automatic acquisition of camera parameters, which is required for recovering 3-D information from 2-D images. An affine transformation-based optimization model and a new matching cost function are used to acquire quasi-dense correspondences with high accuracy in each pair of views. These correspondences can be effectively detected and tracked at the sub-pixel level in multiviews with our neighboring view selection strategy. A two-layer iteration algorithm is proposed to optimize 3-D quasi-dense points and camera parameters. In the inner layer, different optimization strategies based on local photometric consistency and a global objective function are employed to optimize the 3-D quasi-dense points and camera parameters, respectively. In the outer layer, quasi-dense correspondences are resampled to guide a new estimation and optimization process of the camera parameters. We demonstrate the effectiveness of our algorithm with several experiments.
Wittenberg, Philipp; Gan, Fah Fatt; Knoth, Sven
2018-04-17
The variable life-adjusted display (VLAD) is the first risk-adjusted graphical procedure proposed in the literature for monitoring the performance of a surgeon. It displays the cumulative sum of expected minus observed deaths. It has since become highly popular because the statistic plotted is easy to understand. But it is also easy to misinterpret a surgeon's performance using the VLAD, potentially leading to grave consequences. The problem of misinterpretation is essentially caused by the variance of the VLAD's statistic, which increases with sample size. In order for the VLAD to be truly useful, a simple signaling rule is desperately needed. Various forms of signaling rules have been developed, but they are usually quite complicated; without signaling rules, making inferences using the VLAD alone is difficult if not misleading. In this paper, we establish an equivalence between a VLAD with V-mask and a risk-adjusted cumulative sum (RA-CUSUM) chart based on the difference between the estimated probability of death and the surgical outcome. Average run length analysis based on simulation shows that this particular RA-CUSUM chart performs similarly to the established RA-CUSUM chart based on the log-likelihood ratio statistic obtained by testing the odds ratio of death. We provide a simple design procedure for determining the V-mask parameters based on a resampling approach; resampling from a real data set ensures that these parameters can be estimated appropriately. Finally, we illustrate the monitoring of a real surgeon's performance using VLAD with V-mask. Copyright © 2018 John Wiley & Sons, Ltd.
Simulating ensembles of source water quality using a K-nearest neighbor resampling approach.
Towler, Erin; Rajagopalan, Balaji; Seidel, Chad; Summers, R Scott
2009-03-01
Climatological, geological, and water management factors can cause significant variability in surface water quality. As drinking water quality standards become more stringent, the ability to quantify the variability of source water quality becomes more important for decision-making and planning in water treatment for regulatory compliance. However, the paucity of long-term water quality data makes it challenging to apply traditional simulation techniques. To overcome this limitation, we have developed and applied a robust nonparametric K-nearest neighbor (K-nn) bootstrap approach utilizing the United States Environmental Protection Agency's Information Collection Rule (ICR) data. In this technique, an appropriate "feature vector" is first formed from the best available explanatory variables. The nearest neighbors to the feature vector are identified from the ICR data and are resampled using a weight function. Repeating this process yields water quality ensembles, and consequently the distribution and quantification of the variability. The main strengths of the approach are its flexibility, simplicity, and the ability to use a large amount of spatial data with limited temporal extent to provide water quality ensembles for any given location. We demonstrate this approach by applying it to simulate monthly ensembles of total organic carbon for two utilities in the U.S. with very different watersheds, and to alkalinity and bromide at two other U.S. utilities.
Maximum a posteriori resampling of noisy, spatially correlated data
NASA Astrophysics Data System (ADS)
Goff, John A.; Jenkins, Chris; Calder, Brian
2006-08-01
In any geologic application, noisy data are sources of consternation for researchers, inhibiting interpretability and marring images with unsightly and unrealistic artifacts. Filtering is the typical solution to dealing with noisy data. However, filtering commonly suffers from ad hoc (i.e., uncalibrated, ungoverned) application. We present here an alternative to filtering: a newly developed method for correcting noise in data by finding the "best" value given available information. The motivating rationale is that data points that are close to each other in space cannot differ by "too much," where "too much" is governed by the field covariance. Data with large uncertainties will frequently violate this condition and therefore ought to be corrected, or "resampled." Our solution for resampling is determined by the maximum of the a posteriori density function defined by the intersection of (1) the data error probability density function (pdf) and (2) the conditional pdf, determined by the geostatistical kriging algorithm applied to proximal data values. A maximum a posteriori solution can be computed sequentially going through all the data, but the solution depends on the order in which the data are examined. We approximate the global a posteriori solution by randomizing this order and taking the average. A test with a synthetic data set sampled from a known field demonstrates quantitatively and qualitatively the improvement provided by the maximum a posteriori resampling algorithm. The method is also applied to three marine geology/geophysics data examples, demonstrating the viability of the method for diverse applications: (1) three generations of bathymetric data on the New Jersey shelf with disparate data uncertainties; (2) mean grain size data from the Adriatic Sea, which is a combination of both analytic (low uncertainty) and word-based (higher uncertainty) sources; and (3) side-scan backscatter data from the Martha's Vineyard Coastal Observatory which are, as is typical for such data, affected by speckle noise. Compared to filtering, maximum a posteriori resampling provides an objective and optimal method for reducing noise, and better preservation of the statistical properties of the sampled field. The primary disadvantage is that maximum a posteriori resampling is a computationally expensive procedure.
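In the special case where both the data-error pdf and the kriging conditional pdf are Gaussian, the maximum of their product has a closed form: the precision-weighted mean. This one-function sketch, with hypothetical values, illustrates that core step only; the full method described above also randomizes the processing order and averages over orderings, which is omitted here.

```python
def map_resample(z_obs, var_obs, z_krig, var_krig):
    """MAP value when both pdfs are Gaussian: precision-weighted mean of
    the noisy observation and the kriged estimate from proximal data."""
    w_obs, w_krig = 1.0 / var_obs, 1.0 / var_krig
    return (w_obs * z_obs + w_krig * z_krig) / (w_obs + w_krig)

# noisy sounding of -42.0 m (variance 4.0) vs kriged estimate -40.5 m (variance 1.0)
print(map_resample(-42.0, 4.0, -40.5, 1.0))   # pulled strongly toward the kriged value
```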
Measures of precision for dissimilarity-based multivariate analysis of ecological communities
Anderson, Marti J; Santana-Garcon, Julia
2015-01-01
Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a PERMANOVA model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided.
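A minimal sketch of computing MultSE from a dissimilarity matrix, under one common formulation based on sums of squared dissimilarities; the exact estimator and the double resampling procedure live in the paper's R functions, and the Euclidean example here is purely illustrative.

```python
import numpy as np

def mult_se(D):
    """MultSE = sqrt(V/n) from a symmetric n-by-n dissimilarity matrix D,
    with V a pseudo variance built from sums of squared dissimilarities."""
    n = D.shape[0]
    iu = np.triu_indices(n, k=1)            # each unordered pair once
    V = (D[iu] ** 2).sum() / n / (n - 1)
    return np.sqrt(V / n)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))                            # 20 sampling units, 6 species
D = np.sqrt(((X[:, None] - X[None]) ** 2).sum(-1))      # Euclidean, for illustration
print(mult_se(D))
```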
Approaches to Evaluating Probability of Collision Uncertainty
NASA Technical Reports Server (NTRS)
Hejduk, Matthew D.; Johnson, Lauren C.
2016-01-01
While the two-dimensional probability of collision (Pc) calculation has served as the main input to conjunction analysis risk assessment for over a decade, it has done this mostly as a point estimate, with relatively little effort made to produce confidence intervals on the Pc value based on the uncertainties in the inputs. The present effort seeks to carry these uncertainties through the calculation in order to generate a probability density of Pc results rather than a single average value. Methods for assessing uncertainty in the primary and secondary objects' physical sizes and state estimate covariances, as well as a resampling approach to reveal the natural variability in the calculation, are presented; and an initial proposal for operationally useful display and interpretation of these data for a particular conjunction is given.
Computationally Efficient Resampling of Nonuniform Oversampled SAR Data
2010-05-01
The resampled data are calculated using both a simple average and a weighted average of the demodulated data, over trials with randomly varying accelerations; the reported results compare the noncoherent and coherent power differences between SAR imagery generated with uniform sampling and imagery generated with nonuniform sampling that was resampled.
Resampling methods in Microsoft Excel® for estimating reference intervals
Theodorsson, Elvar
2015-01-01
Computer-intensive resampling/bootstrap methods are feasible when calculating reference intervals from non-Gaussian or small reference samples. Microsoft Excel® in version 2010 or later includes native functions which lend themselves well to this purpose, including recommended interpolation procedures for estimating the 2.5 and 97.5 percentiles. The purpose of this paper is to introduce the reader to resampling estimation techniques in general, and to the use of Microsoft Excel® 2010 for estimating reference intervals in particular. Parametric methods are preferable to resampling methods when the distributions of observations in the reference samples are Gaussian or can be transformed to that distribution, even when the number of reference samples is less than 120. Resampling methods are appropriate when the distribution of data from the reference samples is non-Gaussian and the number of reference individuals and corresponding samples is on the order of 40. At least 500-1000 random samples with replacement should be taken from the results of measurement of the reference samples.
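The paper implements this with Excel worksheet functions; a minimal NumPy analogue of the percentile-bootstrap estimation of the 2.5 and 97.5 reference limits might look like the following, where the sample data, sample size of 40, and function name are illustrative.

```python
import numpy as np

def bootstrap_reference_limits(x, n_boot=1000, seed=None):
    """Percentile-bootstrap estimates of the 2.5 and 97.5 percentile
    reference limits from a (possibly non-Gaussian) reference sample."""
    rng = np.random.default_rng(seed)
    # resample the reference sample with replacement, n_boot times
    boots = rng.choice(x, size=(n_boot, len(x)), replace=True)
    lo = np.percentile(boots, 2.5, axis=1)    # one lower limit per resample
    hi = np.percentile(boots, 97.5, axis=1)   # one upper limit per resample
    return lo.mean(), hi.mean()

x = np.random.default_rng(1).lognormal(mean=1.0, sigma=0.4, size=40)
print(bootstrap_reference_limits(x, seed=2))
```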
Testing for Granger Causality in the Frequency Domain: A Phase Resampling Method.
Liu, Siwei; Molenaar, Peter
2016-01-01
This article introduces phase resampling, an existing but rarely used surrogate data method for making statistical inferences of Granger causality in frequency domain time series analysis. Granger causality testing is essential for establishing causal relations among variables in multivariate dynamic processes. However, testing for Granger causality in the frequency domain is challenging due to the nonlinear relation between frequency domain measures (e.g., partial directed coherence, generalized partial directed coherence) and time domain data. Through a simulation study, we demonstrate that phase resampling is a general and robust method for making statistical inferences even with short time series. With Gaussian data, phase resampling yields satisfactory type I and type II error rates in all but one condition we examine: when a small effect size is combined with an insufficient number of data points. Violations of normality lead to slightly higher error rates but are mostly within acceptable ranges. We illustrate the utility of phase resampling with two empirical examples involving multivariate electroencephalography (EEG) and skin conductance data.
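A minimal sketch of the core phase-randomization step for a single series, generating one surrogate that preserves the amplitude spectrum while scrambling the Fourier phases; applying this to Granger-causality testing across multiple channels involves further steps described in the article, and the names here are illustrative.

```python
import numpy as np

def phase_surrogate(x, seed=None):
    """One phase-randomized surrogate: keep the amplitude spectrum,
    scramble the phases, transform back to the time domain."""
    rng = np.random.default_rng(seed)
    X = np.fft.rfft(x - x.mean())
    phases = rng.uniform(0.0, 2.0 * np.pi, size=X.size)
    phases[0] = 0.0                      # keep the DC term real
    if len(x) % 2 == 0:
        phases[-1] = 0.0                 # keep the Nyquist term real too
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=len(x)) + x.mean()

x = np.sin(np.linspace(0.0, 20.0 * np.pi, 512))
print(phase_surrogate(x, seed=3)[:5])
```

Repeating this many times yields a null distribution of the frequency-domain statistic under "no causal structure beyond the spectrum", against which the observed value can be compared.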
Kent, Robert; Belitz, Kenneth; Fram, Miranda S.
2014-01-01
The Priority Basin Project (PBP) of the Groundwater Ambient Monitoring and Assessment (GAMA) Program was developed in response to the Groundwater Quality Monitoring Act of 2001 and is being conducted by the U.S. Geological Survey (USGS) in cooperation with the California State Water Resources Control Board (SWRCB). The GAMA-PBP began sampling, primarily of public-supply wells, in May 2004. By the end of February 2006, seven (of what would eventually be 35) study units had been sampled over a wide area of the State. Selected wells in these first seven study units were resampled for water quality from August 2007 to November 2008 as part of an assessment of temporal trends in water quality by the GAMA-PBP. The initial sampling was designed to provide a spatially unbiased assessment of the quality of raw groundwater used for public water supplies within the seven study units. In the seven study units, 462 wells were selected by using a spatially distributed, randomized grid-based method to provide statistical representation of the study area. Wells selected this way are referred to as grid wells or status wells. Approximately 3 years after the initial sampling, 55 of these previously sampled status wells (approximately 10 percent in each study unit) were randomly selected for resampling. The seven resampled study units, the total number of status wells sampled for each study unit, and the number of these wells resampled for trends are as follows, in chronological order of sampling: San Diego Drainages (53 status wells, 7 trend wells), North San Francisco Bay (84, 10), Northern San Joaquin Basin (51, 5), Southern Sacramento Valley (67, 7), San Fernando–San Gabriel (35, 6), Monterey Bay and Salinas Valley Basins (91, 11), and Southeast San Joaquin Valley (83, 9). The groundwater samples were analyzed for a large number of synthetic organic constituents (volatile organic compounds [VOCs], pesticides, and pesticide degradates), constituents of special interest (perchlorate, N-nitrosodimethylamine [NDMA], and 1,2,3-trichloropropane [1,2,3-TCP]), and naturally occurring inorganic constituents (nutrients, major and minor ions, and trace elements). Naturally occurring isotopes (tritium, carbon-14, and stable isotopes of hydrogen and oxygen in water) also were measured to help identify processes affecting groundwater quality and the sources and ages of the sampled groundwater. Nearly 300 constituents and water-quality indicators were investigated. Quality-control samples (blanks, replicates, and samples for matrix spikes) were collected at 24 percent of the 55 status wells resampled for trends, and the results for these samples were used to evaluate the quality of the data for the groundwater samples. Field blanks rarely contained detectable concentrations of any constituent, suggesting that contamination was not a noticeable source of bias in the data for the groundwater samples. Differences between replicate samples were mostly within acceptable ranges, indicating acceptably low variability in analytical results. Matrix-spike recoveries were within the acceptable range (70 to 130 percent) for 75 percent of the compounds for which matrix spikes were collected. This study did not attempt to evaluate the quality of water delivered to consumers. After withdrawal, groundwater typically is treated, disinfected, and blended with other waters to maintain acceptable water quality. The benchmarks used in this report apply to treated water that is served to the consumer, not to untreated groundwater.
To provide some context for the results, however, concentrations of constituents measured in these groundwater samples were compared with benchmarks established by the U.S. Environmental Protection Agency (USEPA) and California Department of Public Health (CDPH). Comparisons between data collected for this study and benchmarks for drinking water are for illustrative purposes only and are not indicative of compliance or non-compliance with those benchmarks. Most constituents that were detected in groundwater samples from the trend wells were found at concentrations less than drinking-water benchmarks. Four VOCs—trichloroethene, tetrachloroethene, 1,2-dibromo-3-chloropropane, and methyl tert-butyl ether—were detected in one or more wells at concentrations greater than their health-based benchmarks, and six VOCs were detected in at least 10 percent of the samples during initial sampling or resampling of the trend wells. No pesticides were detected at concentrations near or greater than their health-based benchmarks. Three pesticide constituents—atrazine, deethylatrazine, and simazine—were detected in more than 10 percent of the trend-well samples during both sampling periods. Perchlorate, a constituent of special interest, was detected more frequently, and at greater concentrations during resampling than during initial sampling, but this may be due to a change in analytical method between the sampling periods, rather than to a change in groundwater quality. Another constituent of special interest, 1,2,3-TCP, was also detected more frequently during resampling than during initial sampling, but this pattern also may not reflect a change in groundwater quality. Samples from several of the wells where 1,2,3-TCP was detected by low-concentration-level analysis during resampling were not analyzed for 1,2,3-TCP using a low-level method during initial sampling. Most detections of nutrients and trace elements in samples from trend wells were less than health-based benchmarks during both sampling periods. Exceptions include nitrate, arsenic, boron, and vanadium, all detected at concentrations greater than their health-based benchmarks in at least one well during both sampling periods, and molybdenum, detected at concentrations greater than its health-based benchmark during resampling only. The isotopic ratios of oxygen and hydrogen in water and tritium and carbon-14 activities generally changed little between sampling periods, suggesting that the predominant sources and ages of groundwater in most trend wells were consistent between the sampling periods.
Assessing uncertainties in superficial water provision by different bootstrap-based techniques
NASA Astrophysics Data System (ADS)
Rodrigues, Dulce B. B.; Gupta, Hoshin V.; Mendiondo, Eduardo Mario
2014-05-01
An assessment of water security can incorporate several water-related concepts, characterizing the interactions between societal needs, ecosystem functioning, and hydro-climatic conditions. The superficial freshwater provision level depends on the methods chosen for 'Environmental Flow Requirement' (EFR) estimation, which integrate the sources of uncertainty in the understanding of how water-related threats to aquatic ecosystem security arise. Here, we develop an uncertainty assessment of superficial freshwater provision based on different bootstrap techniques (non-parametric resampling with replacement). To illustrate this approach, we use an agricultural basin (291 km2) within the Cantareira water supply system in Brazil, monitored by one daily streamflow gage (24-year period). The original streamflow time series was randomly resampled for different numbers of draws or sample sizes (N = 500; ...; 1000) using the conventional bootstrap approach and variations of it, such as the 'nearest neighbor bootstrap' and the 'moving blocks bootstrap'. We analyzed the impact of sampling uncertainty on five EFR methods, based on: flow duration curves or probability of exceedance (Q90%, Q75% and Q50%); the 7-day, 10-year low-flow statistic (Q7,10); and the presumptive standard (80% of the natural monthly mean flow). The bootstrap technique was also used to compare the EFR methods among themselves, considering the difference between the bootstrap estimates and the "true" EFR characteristic, computed by averaging the EFR values of the five methods using the entire streamflow record at the monitoring station. This study evaluates the bootstrapping strategies and the representativeness of streamflow series for EFR estimates and their confidence intervals, in addition to providing an overview of the performance differences between the EFR methods. The uncertainties arising during the EFR assessment will be propagated through water security indicators referring to water scarcity and vulnerability, seeking to provide meaningful support to end-users and water managers facing the incorporation of uncertainties in the decision-making process.
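As an illustration of one of the techniques named above, here is a minimal moving-blocks bootstrap sketch applied to a toy daily flow series, with a Q90-style statistic computed per replicate; the block length, sample sizes, and names are illustrative assumptions, not the study's settings.

```python
import numpy as np

def moving_blocks_bootstrap(x, block_len, n_reps, seed=None):
    """Resample a streamflow series in overlapping blocks so short-range
    temporal dependence inside each block is preserved."""
    rng = np.random.default_rng(seed)
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    # random block start positions for every replicate
    starts = rng.integers(0, n - block_len + 1, size=(n_reps, n_blocks))
    return np.stack([
        np.concatenate([x[s:s + block_len] for s in row])[:n]
        for row in starts
    ])

q = np.random.default_rng(0).gamma(2.0, 5.0, size=24 * 365)   # toy 24-yr daily flows
reps = moving_blocks_bootstrap(q, block_len=30, n_reps=500, seed=1)
q90 = np.percentile(reps, 10, axis=1)   # Q90: flow exceeded 90% of the time
print(q90.mean(), q90.std())            # spread reflects sampling uncertainty
```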
NASA Astrophysics Data System (ADS)
Meadors, Grant David; Krishnan, Badri; Papa, Maria Alessandra; Whelan, John T.; Zhang, Yuanhao
2018-02-01
Continuous-wave (CW) gravitational waves (GWs) call for computationally intensive methods. Low signal-to-noise ratio signals need templated searches with long coherent integration times, and thus fine parameter-space resolution. Longer integration increases sensitivity. Low-mass X-ray binaries (LMXBs) such as Scorpius X-1 (Sco X-1) may emit accretion-driven CWs at strains reachable by current ground-based observatories. Binary orbital parameters induce phase modulation. This paper describes how resampling corrects binary and detector motion, yielding source-frame time series used for cross-correlation. Compared to the previous, detector-frame, templated cross-correlation method, used for Sco X-1 on data from the first Advanced LIGO observing run (O1), resampling is about 20× faster in the costliest, most sensitive frequency bands. Speed-up factors depend on integration time and search setup. The speed could be reinvested into longer integration, with a forecast sensitivity gain of approximately 51% (median over 20 to 125 Hz), or 11% (over 20 to 250 Hz), given the same per-band cost and setup. This paper's timing model enables future setup optimization. Resampling scales well with longer integration and, at 10× unoptimized cost, could reach 2.83× and 2.75× median sensitivities respectively, limited by spin wandering. An O1 search could then yield a marginalized-polarization upper limit reaching the torque-balance limit at 100 Hz. Frequencies from 40 to 140 Hz might be probed in equal observing time with 2× improved detectors.
Recommended GIS Analysis Methods for Global Gridded Population Data
NASA Astrophysics Data System (ADS)
Frye, C. E.; Sorichetta, A.; Rose, A.
2017-12-01
When using geographic information systems (GIS) to analyze gridded, i.e., raster, population data, analysts need a detailed understanding of several factors that affect raster data processing and, thus, the accuracy of the results. Global raster data are most often provided in an unprojected state, usually in the WGS 1984 geographic coordinate system. Most GIS functions and tools evaluate data based on overlay relationships (area) or proximity (distance). Area and distance for global raster data can be calculated either directly on the various earth ellipsoids or after transforming the data to equal-area/equidistant projected coordinate systems, so that all locations are analyzed equally. However, unlike when projecting vector data, not all projected coordinate systems can support such analyses equally, and the process of transforming raster data from one coordinate space to another often results in unmanaged loss of data through a process called resampling. Resampling determines which values to use in the result dataset given an imperfect locational match in the input dataset(s). Cell size or resolution, registration, resampling method, statistical type, and whether the raster represents continuous or discrete information all potentially influence the quality of the result. Gridded population data represent estimates of population in each raster cell, and this presentation will provide guidelines for accurately transforming population rasters for analysis in GIS. Resampling impacts the display of high-resolution global gridded population data, and we will discuss how to properly handle pyramid creation using the Aggregate tool with the sum option to create overviews for mosaic datasets.
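A NumPy analogue of the sum-rule aggregation recommended above (the actual workflow uses the GIS Aggregate tool): coarsening a population grid by an integer factor with a SUM statistic preserves total population, which interpolating resamplers generally do not. Array shapes and the factor are illustrative.

```python
import numpy as np

def aggregate_sum(pop, factor):
    """Coarsen a population grid by an integer factor using the SUM rule,
    so total population counts are preserved."""
    r, c = pop.shape
    assert r % factor == 0 and c % factor == 0, "grid must divide evenly"
    return pop.reshape(r // factor, factor, c // factor, factor).sum(axis=(1, 3))

pop = np.random.default_rng(0).poisson(3.0, size=(600, 600)).astype(float)
coarse = aggregate_sum(pop, 10)
assert np.isclose(pop.sum(), coarse.sum())   # counts survive the resampling
```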
A comparison of resampling schemes for estimating model observer performance with small ensembles
NASA Astrophysics Data System (ADS)
Elshahaby, Fatma E. A.; Jha, Abhinav K.; Ghaly, Michael; Frey, Eric C.
2017-09-01
In objective assessment of image quality, an ensemble of images is used to compute the 1st and 2nd order statistics of the data. Often, only a finite number of images is available, leading to the issue of statistical variability in numerical observer performance. Resampling-based strategies can help overcome this issue. In this paper, we compared different combinations of resampling schemes (the leave-one-out (LOO) and the half-train/half-test (HT/HT)) and model observers (the conventional channelized Hotelling observer (CHO), channelized linear discriminant (CLD) and channelized quadratic discriminant). Observer performance was quantified by the area under the ROC curve (AUC). For a binary classification task and for each observer, the AUC value for an ensemble size of 2000 samples per class served as a gold standard for that observer. Results indicated that each observer yielded a different performance depending on the ensemble size and the resampling scheme. For a small ensemble size, the combination [CHO, HT/HT] had more accurate rankings than the combination [CHO, LOO]. Using the LOO scheme, the CLD and CHO had similar performance for large ensembles. However, the CLD outperformed the CHO and gave more accurate rankings for smaller ensembles. As the ensemble size decreased, the performance of the [CHO, LOO] combination seriously deteriorated as opposed to the [CLD, LOO] combination. Thus, it might be desirable to use the CLD with the LOO scheme when smaller ensemble size is available.
The influence of sampling interval on the accuracy of trail impact assessment
Leung, Y.-F.; Marion, J.L.
1999-01-01
Trail impact assessment and monitoring (IA&M) programs have been growing in importance and application in recreation resource management at protected areas. Census-based and sampling-based approaches have been developed in such programs, with systematic point sampling being the most common survey design. This paper examines the influence of sampling interval on the accuracy of estimates for selected trail impact problems. A complete census of four impact types on 70 trails in Great Smoky Mountains National Park was utilized as the base data set for the analyses. The census data were resampled at increasing intervals to create a series of simulated point data sets. Estimates of frequency of occurrence and lineal extent for the four impact types were compared with the census data set. The responses of accuracy loss on lineal extent estimates to increasing sampling intervals varied across different impact types, while the responses on frequency of occurrence estimates were consistent, approximating an inverse asymptotic curve. These findings suggest that systematic point sampling may be an appropriate method for estimating the lineal extent but not the frequency of trail impacts. Sample intervals of less than 100 m appear to yield an excellent level of accuracy for the four impact types evaluated. Multiple regression analysis results suggest that appropriate sampling intervals are more likely to be determined by the type of impact in question rather than the length of trail. The census-based trail survey and the resampling-simulation method developed in this study can be a valuable first step in establishing long-term trail IA&M programs, in which an optimal sampling interval range with acceptable accuracy is determined before investing efforts in data collection.
An Efficient Image Compressor for Charge Coupled Devices Camera
Li, Jin; Xing, Fei; You, Zheng
2014-01-01
Recently, discrete wavelet transform (DWT)-based compressors, such as JPEG2000 and CCSDS-IDC, have been widely seen as the state-of-the-art compression schemes for charge coupled device (CCD) cameras. However, CCD images projected onto the DWT basis produce a large number of large-amplitude high-frequency coefficients, because these images contain a great deal of complex texture and contour information, which is a disadvantage for the subsequent coding. In this paper, we propose a low-complexity posttransform coupled with compressed sensing (PT-CS) compression approach for remote sensing images. First, the DWT is applied to the remote sensing image. Then, a posttransform with a pair of bases is applied to the DWT coefficients; the pair comprises the DCT basis and the Hadamard basis, used at high and low bit rates, respectively. The best posttransform is selected by an lp-norm-based approach. The posttransform is considered as the sparse representation stage of CS, and the posttransform coefficients are resampled by the sensing measurement matrix. Experimental results on on-board CCD camera images show that the proposed approach significantly outperforms the CCSDS-IDC-based coder, and its performance is comparable to that of JPEG2000 at low bit rates without the excessive implementation complexity of JPEG2000.
A Data Augmentation Approach to Short Text Classification
ERIC Educational Resources Information Center
Rosario, Ryan Robert
2017-01-01
Text classification typically performs best with large training sets, but short texts are very common on the World Wide Web. Can we use resampling and data augmentation to construct larger texts using similar terms? Several current methods exist for working with short text that rely on using external data and contexts, or workarounds. Our focus is…
ERIC Educational Resources Information Center
Nevitt, Johnathan; Hancock, Gregory R.
Though common structural equation modeling (SEM) methods are predicated upon the assumption of multivariate normality, applied researchers often find themselves with data clearly violating this assumption and without sufficient sample size to use distribution-free estimation methods. Fortunately, promising alternatives are being integrated into…
On long-only information-based portfolio diversification framework
NASA Astrophysics Data System (ADS)
Santos, Raphael A.; Takada, Hellinton H.
2014-12-01
Using the concepts from information theory, it is possible to improve the traditional frameworks for long-only asset allocation. In modern portfolio theory, the investor has two basic procedures: the choice of a portfolio that maximizes its risk-adjusted excess return or the mixed allocation between the maximum Sharpe portfolio and the risk-free asset. In the literature, the first procedure was already addressed using information theory. One contribution of this paper is the consideration of the second procedure in the information theory context. The performance of these approaches was compared with three traditional asset allocation methodologies: the Markowitz's mean-variance, the resampled mean-variance and the equally weighted portfolio. Using simulated and real data, the information theory-based methodologies were verified to be more robust when dealing with the estimation errors.
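For concreteness, a minimal sketch of the resampled mean-variance idea used above as a comparison methodology: re-estimate the inputs from simulated return histories, solve each time, and average the resulting weights. The solver and the crude long-only projection here are illustrative choices, not the paper's procedure; all names are hypothetical.

```python
import numpy as np

def resampled_weights(mu, cov, n_obs, n_resamples=200, seed=None):
    """Michaud-style resampled allocation sketch: simulate return
    histories, re-estimate inputs, solve, and average the weights."""
    rng = np.random.default_rng(seed)
    ws = []
    for _ in range(n_resamples):
        sample = rng.multivariate_normal(mu, cov, size=n_obs)
        m, c = sample.mean(axis=0), np.cov(sample.T)
        w = np.linalg.solve(c, m)        # unconstrained max-Sharpe direction
        w = np.clip(w, 0.0, None)        # crude long-only projection
        ws.append(w / w.sum())
    return np.mean(ws, axis=0)

mu = np.array([0.06, 0.08, 0.05])
cov = np.diag([0.04, 0.09, 0.02])
print(resampled_weights(mu, cov, n_obs=120, seed=0))
```

Averaging over the resampled solutions damps the estimation-error sensitivity that plagues the plain Markowitz weights.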
Restoration and reconstruction from overlapping images
NASA Technical Reports Server (NTRS)
Reichenbach, Stephen E.; Kaiser, Daniel J.; Hanson, Andrew L.; Li, Jing
1997-01-01
This paper describes a technique for restoring and reconstructing a scene from overlapping images. In situations where there are multiple, overlapping images of the same scene, it may be desirable to create a single image that most closely approximates the scene, based on all of the data in the available images. For example, successive swaths acquired by NASA's planned Moderate Imaging Spectrometer (MODIS) will overlap, particularly at wide scan angles, creating a severe visual artifact in the output image. Resampling the overlapping swaths to produce a more accurate image on a uniform grid requires restoration and reconstruction. The one-pass restoration and reconstruction technique developed in this paper yields mean-square-optimal resampling, based on a comprehensive end-to-end system model that accounts for image overlap, and subject to user-defined and data-availability constraints on the spatial support of the filter.
Measures of precision for dissimilarity-based multivariate analysis of ecological communities.
Anderson, Marti J; Santana-Garcon, Julia
2015-01-01
Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. © 2014 The Authors. Ecology Letters published by John Wiley & Sons Ltd and CNRS.
Kozak, M; Karaman, M
2001-07-01
Digital beamforming based on oversampled delta-sigma (ΔΣ) analog-to-digital (A/D) conversion can reduce the overall cost, size, and power consumption of phased array front-end processing. The signal resampling involved in dynamic ΔΣ beamforming, however, disrupts synchronization between the modulators and demodulator, causing significant degradation in the signal-to-noise ratio. As a solution to this, we have explored a new digital beamforming approach based on non-uniform oversampling ΔΣ A/D conversion. Using this approach, the echo signals received by the transducer array are sampled at time instants determined by the beamforming timing and then digitized by single-bit ΔΣ A/D conversion prior to the coherent beam summation. The timing information involves a non-uniform sampling scheme employing different clocks at each array channel. The ΔΣ-coded beamsums, obtained by adding the delayed 1-bit coded RF echo signals, are then processed through a decimation filter to produce the final beamforming outputs. The performance and validity of the proposed beamforming approach are assessed by means of emulations using experimental raw RF data.
White, H; Racine, J
2001-01-01
We propose tests for individual and joint irrelevance of network inputs. Such tests can be used to determine whether an input or group of inputs "belong" in a particular model, thus permitting valid statistical inference based on estimated feedforward neural-network models. The approaches employ well-known statistical resampling techniques. We conduct a small Monte Carlo experiment showing that our tests have reasonable level and power behavior, and we apply our methods to examine whether there are predictable regularities in foreign exchange rates. We find that exchange rates do appear to contain information that is exploitable for enhanced point prediction, but the nature of the predictive relations evolves through time.
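The authors' test statistics are not reproduced in the abstract; the sketch below shows a related permutation-style resampling check of input irrelevance for a feedforward network (permute one input column and measure the loss increase), with synthetic data and illustrative settings rather than the paper's procedure.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
# input 2 is irrelevant by construction
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)

net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                   random_state=0).fit(X, y)
base_mse = np.mean((net.predict(X) - y) ** 2)

for j in range(X.shape[1]):
    increases = []
    for _ in range(200):                  # resample: permute one input column
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        increases.append(np.mean((net.predict(Xp) - y) ** 2) - base_mse)
    # an irrelevant input shows essentially no loss increase when permuted
    print(f"input {j}: mean MSE increase {np.mean(increases):.4f}")
```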
Method for Pre-Conditioning a Measured Surface Height Map for Model Validation
NASA Technical Reports Server (NTRS)
Sidick, Erkin
2012-01-01
This software allows one to up-sample or down-sample a measured surface map for model validation without introducing any re-sampling errors, while also eliminating existing measurement noise and measurement errors. Because the re-sampling of a surface map is accomplished based on the analytical expressions of Zernike polynomials and a power spectral density (PSD) model, such re-sampling does not introduce the aliasing and interpolation errors produced by conventional interpolation and FFT-based (fast-Fourier-transform-based) spatial-filtering methods. The new method also automatically eliminates measurement noise and other measurement errors such as artificial discontinuity. The developmental cycle of an optical system, such as a space telescope, includes, but is not limited to, the following two steps: (1) deriving requirements or specs on the optical quality of individual optics before they are fabricated through optical modeling and simulations, and (2) validating the optical model using the measured surface height maps after all optics are fabricated. There are a number of computational issues related to model validation, one of which is the "pre-conditioning" or pre-processing of the measured surface maps before using them in a model validation software tool. This software addresses the following issues: (1) up- or down-sampling a measured surface map to match it with the gridded data format of a model validation tool, and (2) eliminating the surface measurement noise or measurement errors such that the resulting surface height map is continuous or smoothly varying. So far, the preferred method used for re-sampling a surface map has been two-dimensional interpolation. The main problem with this method is that the same pixel can take different values depending on which interpolation method is used, such as the "nearest," "linear," "cubic," or "spline" fitting in Matlab. The conventional, FFT-based spatial filtering method used to eliminate the surface measurement noise or measurement errors can also suffer from aliasing effects. During re-sampling of a surface map, this software preserves the low spatial-frequency characteristics of a given surface map through the use of Zernike-polynomial fit coefficients, and maintains mid- and high-spatial-frequency characteristics of the given surface map by the use of a PSD model derived from the two-dimensional PSD data of the mid- and high-spatial-frequency components of the original surface map. Because this new method creates the new surface map in the desired sampling format from analytical expressions only, it does not encounter any aliasing effects and does not cause any discontinuity in the resultant surface map.
Memory and Trend of Precipitation in China during 1966-2013
NASA Astrophysics Data System (ADS)
Du, M.; Sun, F.; Liu, W.
2017-12-01
Because climate change has a significant impact on the water cycle, the characteristics and variability of precipitation under climate change have become a major topic in hydrology. This study aims to analyze the trend and memory (both short-term and long-term) of precipitation in China. To do so, we apply statistical tests (including the Mann-Kendall test, the Ljung-Box test and the Hurst exponent) to annual precipitation (P), frequency of rainy days (λ) and mean daily rainfall on days when precipitation occurs (α) in China (1966-2013). We also use a resampling approach to determine field significance. From there, we evaluate the spatial distribution and the percentages of stations with significant memory or trend. We find that the percentages of significant downtrends for λ and significant uptrends for α are significantly larger than the critical values at the 95% field significance level, probably caused by global warming. From these results, we conclude that extra care is necessary when significant results are obtained using statistical tests: the null hypothesis can be rejected by chance, and this is more likely to occur if spatial correlation is ignored, according to the results of the resampling approach.
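The field-significance step lends itself to a compact illustration. The sketch below, with invented toy data and our own function names, permutes years jointly across all stations (preserving spatial correlation while destroying any trend) to obtain a null distribution for the percentage of stations with a significant Mann-Kendall trend; it is a schematic reading of the abstract, not the authors' code.

```python
import numpy as np

def mk_z(x):
    """Mann-Kendall trend statistic Z (normal approximation, no tie correction)."""
    d = np.subtract.outer(x, x)            # d[j, i] = x[j] - x[i]
    s = np.sign(np.tril(d, -1)).sum()      # sum of signs over pairs j > i
    var = len(x) * (len(x) - 1) * (2 * len(x) + 5) / 18.0
    return (s - np.sign(s)) / np.sqrt(var)

def pct_significant(data, zcrit=1.96):
    """Percentage of stations (columns) with a significant trend."""
    return np.mean([abs(mk_z(col)) > zcrit for col in data.T])

rng = np.random.default_rng(0)
years, stations = 48, 60
shared = rng.normal(size=(years, 1))                   # induces spatial correlation
P = 800 + 50 * shared + 30 * rng.normal(size=(years, stations))

obs = pct_significant(P)
null = [pct_significant(P[rng.permutation(years)]) for _ in range(500)]
print("observed:", obs, " 95% field-significance threshold:", np.quantile(null, 0.95))
```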
Image restoration techniques as applied to Landsat MSS and TM data
Meyer, David
1987-01-01
Two factors are primarily responsible for the loss of image sharpness in processing digital Landsat images. The first factor is inherent in the data because the sensor's optics and electronics, along with other sensor elements, blur and smear the data. Digital image restoration can be used to reduce this degradation. The second factor, which further degrades the image by blurring or aliasing, is the resampling performed during geometric correction. An image restoration procedure, when used in place of typical resampling techniques, reduces sensor degradation without introducing the artifacts associated with resampling. The EROS Data Center (EDC) has implemented the restoration procedure for Landsat multispectral scanner (MSS) and thematic mapper (TM) data. This capability, developed at the University of Arizona by Dr. Robert Schowengerdt and Lynette Wood, combines restoration and resampling in a single step to produce geometrically corrected MSS and TM imagery. As with resampling, restoration demands that a tradeoff be made between aliasing, which occurs when attempting to extract maximum sharpness from an image, and blurring, which reduces the aliasing problem but sacrifices image sharpness. The restoration procedure used at EDC minimizes these artifacts by being adaptive, tailoring the tradeoff to be optimal for individual images.
A Sequential Monte Carlo Approach for Streamflow Forecasting
NASA Astrophysics Data System (ADS)
Hsu, K.; Sorooshian, S.
2008-12-01
As alternatives to traditional physically-based models, Artificial Neural Network (ANN) models offer flexibility: they do not require a precise quantitative description of the process mechanism and can be trained directly from data. In this study, an ANN model was used to generate one-day-ahead streamflow forecasts from precipitation input over a catchment. The ANN model parameters were trained using a Sequential Monte Carlo (SMC) approach, namely the Regularized Particle Filter (RPF). SMC approaches are known for their ability to track the states and parameters of a nonlinear dynamic process based on Bayes' rule and effective sampling and resampling strategies. In this study, five years of daily rainfall and streamflow measurements were used for model training. Variable RPF sample sizes, from 200 to 2000, were tested. The results show that, beyond 1000 RPF samples, the simulation statistics, in terms of correlation coefficient, root mean square error, and bias, were stable. It is also shown that the forecasted daily flows fit the observations very well, with correlation coefficients higher than 0.95. The results of the RPF simulations were also compared with those from the popular back-propagation ANN training approach. The pros and cons of using the SMC approach versus the traditional back-propagation approach will be discussed.
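The resampling step at the heart of such filters is easy to sketch. The toy Python example below runs a generic particle filter with systematic resampling and a small jitter after resampling (a crude stand-in for the kernel regularization that gives the RPF its name) on an invented AR(1) state; none of it is the authors' model or code.

```python
import numpy as np

rng = np.random.default_rng(42)

def systematic_resample(weights):
    """Low-variance systematic resampling: returns particle indices."""
    n = len(weights)
    cs = np.cumsum(weights)
    cs[-1] = 1.0                                  # guard against float round-off
    return np.searchsorted(cs, (rng.random() + np.arange(n)) / n)

n_particles, T = 1000, 50
x_true = 1.0
x = rng.normal(1.0, 1.0, n_particles)             # initial particle cloud
for _ in range(T):
    x_true = 0.8 * x_true + rng.normal(0, 0.5)    # hidden state
    y = x_true + rng.normal(0, 0.2)               # noisy observation
    x = 0.8 * x + rng.normal(0, 0.5, n_particles) # propagate particles
    w = np.exp(-0.5 * ((y - x) / 0.2) ** 2)       # likelihood weights
    w /= w.sum()
    x = x[systematic_resample(w)]                 # resample by weight
    x += rng.normal(0, 0.05, n_particles)         # jitter, in the spirit of the RPF
print("truth:", round(x_true, 2), " filter mean:", round(float(x.mean()), 2))
```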
Ground and Space Radar Volume Matching and Comparison Software
NASA Technical Reports Server (NTRS)
Morris, Kenneth; Schwaller, Mathew
2010-01-01
This software enables easy comparison of ground- and space-based radar observations. The software was initially designed to compare ground radar reflectivity from operational, ground-based S- and C-band meteorological radars with comparable measurements from the Tropical Rainfall Measuring Mission (TRMM) satellite's Precipitation Radar (PR) instrument. The software is also applicable to other ground-based and space-based radars. The ground and space radar volume matching and comparison software was developed in response to requirements defined by the Ground Validation System (GVS) of Goddard's Global Precipitation Measurement (GPM) project. This software innovation is specifically concerned with simplifying the comparison of ground- and space-based radar measurements for the purpose of GPM algorithm and data product validation. This software is unique in that it provides an operational environment to routinely create comparison products, and uses a direct geometric approach to derive common volumes of space- and ground-based radar data. In this approach, spatially coincident volumes are defined by the intersection of individual space-based Precipitation Radar rays with each of the conical elevation sweeps of the ground radar. Thus, the resampled volume elements of the space and ground radar reflectivity can be directly compared to one another.
Assessing paleo-biodiversity using low proxy influx.
Blarquez, Olivier; Finsinger, Walter; Carcaillet, Christopher
2013-01-01
We developed an algorithm to improve richness assessment based on paleoecological series, taking into account sample features such as temporal resolution and volume. The new method can be applied to both high- and low-count-size proxies, i.e. pollen and plant macroremain records, respectively. While pollen generally abounds in sediments, plant macroremains are generally rare, which makes it difficult to compute paleo-biodiversity indices. Our approach uses resampled macroremain influxes that enable computation of the rarefaction index for low-influx records. The raw counts are resampled to a constant resolution and sample volume by interpolating initial sample ages at a constant time interval using the age-depth model. Then, the contribution of the initial counts and volume to each interpolated sample is determined by calculating a proportion matrix, which is in turn used to obtain regularly spaced time series of pollen and macroremain influx. We applied this algorithm to sedimentary data from a subalpine lake in the European Alps. The reconstructed total floristic richness at the study site increased gradually as both pollen and macroremain records indicated a decrease in the relative abundance of shrubs and an increase in trees from 11,000 to 7,000 cal BP. This points to an ecosystem change that favored trees over shrubs, whereas herb abundance remained stable. Since 6,000 cal BP, local richness based on plant macroremains decreased, while pollen-based richness was stable. The reconstructed richness and evenness are interrelated, confirming the difficulty of distinguishing these two aspects in paleo-biodiversity studies. The present study shows that low-influx bio-proxy records (here, macroremains) can be used to reconstruct stand diversity and address ecological issues. These developments on macroremain and pollen records may help bridge the gap between paleoecology and biodiversity studies.
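The rarefaction index mentioned here is a standard quantity, and is the part of the pipeline that is simplest to illustrate. The sketch below computes the classical expected richness for a subsample of n individuals using log-gamma terms for numerical stability; the counts are invented, and the paper's interpolation of counts to constant resolution and volume via a proportion matrix is not shown.

```python
import numpy as np
from scipy.special import gammaln

def rarefied_richness(counts, n):
    """Expected taxon richness in a random subsample of n individuals
    (classical rarefaction), using log-gamma for numerical stability."""
    c = np.asarray([v for v in counts if v > 0])
    N = c.sum()
    if n > N:
        raise ValueError("subsample larger than total count")
    ok = (N - c) >= n                    # taxa that can be missed entirely
    log_ratio = (gammaln(N - c[ok] + 1) + gammaln(N - n + 1)
                 - gammaln(N + 1) - gammaln(N - c[ok] - n + 1))
    return (~ok).sum() + (1.0 - np.exp(log_ratio)).sum()

macro = [12, 5, 3, 1, 1, 1]                    # hypothetical macroremain counts
print(round(rarefied_richness(macro, 10), 2))  # richness standardized to 10 items
```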
Methods to achieve accurate projection of regional and global raster databases
Usery, E.L.; Seong, J.C.; Steinwand, D.R.; Finn, M.P.
2002-01-01
This research aims at building a decision support system (DSS) for selecting an optimum projection considering various factors, such as pixel size, areal extent, number of categories, spatial pattern of categories, resampling methods, and error correction methods. Specifically, this research will investigate three goals theoretically and empirically and, using the already developed empirical base of knowledge with these results, develop an expert system for map projection of raster data for regional and global database modeling. The three theoretical goals are as follows: (1) The development of a dynamic projection that adjusts projection formulas for latitude on the basis of raster cell size to maintain equal-sized cells. (2) The investigation of the relationships between the raster representation and the distortion of features, number of categories, and spatial pattern. (3) The development of an error correction and resampling procedure that is based on error analysis of raster projection.
Mattfeldt, Torsten
2011-04-01
Computer-intensive methods may be defined as data analytical procedures involving a huge number of highly repetitive computations. We mention resampling methods with replacement (bootstrap methods), resampling methods without replacement (randomization tests) and simulation methods. The resampling methods are based on simple and robust principles and are largely free from distributional assumptions. Bootstrap methods may be used to compute confidence intervals for a scalar model parameter and for summary statistics from replicated planar point patterns, and for significance tests. For some simple models of planar point processes, point patterns can be simulated by elementary Monte Carlo methods. The simulation of models with more complex interaction properties usually requires more advanced computing methods. In this context, we mention simulation of Gibbs processes with Markov chain Monte Carlo methods using the Metropolis-Hastings algorithm. An alternative to simulations on the basis of a parametric model consists of stochastic reconstruction methods. The basic ideas behind the methods are briefly reviewed and illustrated by simple worked examples in order to encourage novices in the field to use computer-intensive methods. © 2010 The Authors Journal of Microscopy © 2010 Royal Microscopical Society.
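As a concrete example of the first family mentioned above, the following sketch computes a percentile bootstrap confidence interval for a summary statistic taken over replicated planar point patterns; the patterns, sample sizes and function names are ours.

```python
import numpy as np

rng = np.random.default_rng(3)

def mean_nnd(pts):
    """Mean nearest-neighbour distance of one planar point pattern."""
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)
    return d.min(axis=1).mean()

# One summary statistic per replicated pattern (8 hypothetical replicates)
summaries = np.array([mean_nnd(rng.random((40, 2))) for _ in range(8)])

def bootstrap_ci(x, stat=np.mean, n_boot=10000, alpha=0.05):
    """Percentile bootstrap CI: resample with replacement, take quantiles."""
    reps = [stat(rng.choice(x, size=len(x), replace=True)) for _ in range(n_boot)]
    return np.quantile(reps, [alpha / 2, 1 - alpha / 2])

print(np.round(bootstrap_ci(summaries), 4))   # 95% CI for the mean summary
```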
Robust 3D face landmark localization based on local coordinate coding.
Song, Mingli; Tao, Dacheng; Sun, Shengpeng; Chen, Chun; Maybank, Stephen J
2014-12-01
In the 3D facial animation and synthesis community, input faces are usually required to be labeled with a set of landmarks for parameterization. Because of variations in pose, expression and resolution, automatic 3D face landmark localization remains a challenge. In this paper, a novel landmark localization approach is presented. The approach is based on local coordinate coding (LCC) and consists of two stages. In the first stage, we perform nose detection, relying on the fact that the nose shape is usually invariant under variations in pose, expression, and resolution. Then, we use the iterative closest points algorithm to find a 3D affine transformation that aligns the input face to a reference face. In the second stage, we perform resampling to build correspondences between the input 3D face and the training faces. Then, an LCC-based localization algorithm is proposed to obtain the positions of the landmarks in the input face. Experimental results show that the proposed method is comparable to state-of-the-art methods in terms of robustness, flexibility, and accuracy.
Local variance for multi-scale analysis in geomorphometry.
Drăguţ, Lucian; Eisank, Clemens; Strasser, Thomas
2011-07-15
Increasing availability of high resolution Digital Elevation Models (DEMs) is leading to a paradigm shift regarding scale issues in geomorphometry, prompting new solutions to cope with multi-scale analysis and detection of characteristic scales. We tested the suitability of the local variance (LV) method, originally developed for image analysis, for multi-scale analysis in geomorphometry. The method consists of: 1) up-scaling land-surface parameters derived from a DEM; 2) calculating LV as the average standard deviation (SD) within a 3 × 3 moving window for each scale level; 3) calculating the rate of change of LV (ROC-LV) from one level to another, and 4) plotting values so obtained against scale levels. We interpreted peaks in the ROC-LV graphs as markers of scale levels where cells or segments match types of pattern elements characterized by (relatively) equal degrees of homogeneity. The proposed method has been applied to LiDAR DEMs in two test areas different in terms of roughness: low relief and mountainous, respectively. For each test area, scale levels for slope gradient, plan, and profile curvatures were produced at constant increments with either resampling (cell-based) or image segmentation (object-based). Visual assessment revealed homogeneous areas that convincingly associate into patterns of land-surface parameters well differentiated across scales. We found that the LV method performed better on scale levels generated through segmentation as compared to up-scaling through resampling. The results indicate that coupling multi-scale pattern analysis with delineation of morphometric primitives is possible. This approach could be further used for developing hierarchical classifications of landform elements.
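The three computational steps of the LV method translate almost directly into code. The sketch below reproduces them on a synthetic surface (block-average up-scaling, mean 3 x 3 local SD, then ROC-LV between levels); the toy DEM and function names are ours, and the object-based segmentation variant is not shown.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_variance(z):
    """LV: mean standard deviation within a 3 x 3 moving window."""
    m = uniform_filter(z, 3)
    m2 = uniform_filter(z * z, 3)
    return np.sqrt(np.clip(m2 - m * m, 0, None)).mean()

def upscale(z, f):
    """Cell-based up-scaling: block-average f x f cells."""
    n = (z.shape[0] // f) * f
    return z[:n, :n].reshape(n // f, f, n // f, f).mean(axis=(1, 3))

rng = np.random.default_rng(7)
dem = np.cumsum(np.cumsum(rng.normal(size=(256, 256)), 0), 1)  # toy rough surface
slope = np.hypot(*np.gradient(dem))                            # land-surface parameter

lv = np.array([local_variance(upscale(slope, f)) for f in (1, 2, 4, 8, 16, 32)])
roc = 100 * np.diff(lv) / lv[:-1]     # ROC-LV; peaks mark characteristic scales
print("LV:", np.round(lv, 3))
print("ROC-LV (%):", np.round(roc, 1))
```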
Building test data from real outbreaks for evaluating detection algorithms.
Texier, Gaetan; Jackson, Michael L; Siwe, Leonel; Meynard, Jean-Baptiste; Deparis, Xavier; Chaudet, Herve
2017-01-01
Benchmarking surveillance systems requires realistic simulations of disease outbreaks. However, obtaining these data in sufficient quantity, with a realistic shape and covering a sufficient range of agents, sizes and durations, is known to be very difficult. The dataset of outbreak signals generated should reflect the likely distribution of authentic situations faced by the surveillance system, including very unlikely outbreak signals. We propose and evaluate a new approach based on the use of historical outbreak data to simulate tailored outbreak signals. The method relies on a homothetic transformation of the historical distribution followed by resampling processes (Binomial, Inverse Transform Sampling Method-ITSM, Metropolis-Hastings Random Walk, Metropolis-Hastings Independent, Gibbs Sampler, Hybrid Gibbs Sampler). We carried out an analysis to identify the most important input parameters for simulation quality and to evaluate performance for each of the resampling algorithms. Our analysis confirms the influence of the type of algorithm used and simulation parameters (i.e. days, number of cases, outbreak shape, overall scale factor) on the results. We show that, regardless of the outbreaks, algorithms and metrics chosen for the evaluation, simulation quality decreased with the increase in the number of days simulated and increased with the number of cases simulated. Simulating outbreaks with fewer cases than days of duration (i.e. overall scale factor less than 1) resulted in an important loss of information during the simulation. We found that Gibbs sampling with a shrinkage procedure provides a good balance between accuracy and data dependency. If dependency is of little importance, binomial and ITSM methods are accurate. Given the constraint of keeping the simulation within a range of plausible epidemiological curves faced by the surveillance system, our study confirms that our approach can be used to generate a large spectrum of outbreak signals.
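Of the six resampling processes listed, the binomial-style scheme is the easiest to sketch. The toy example below applies a homothetic rescaling of a historical epidemic curve to a target duration and case count, then redistributes the cases with a multinomial draw (equivalent to day-wise binomial thinning); the curve is invented, and this is our schematic reading, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate_outbreak(hist_counts, n_days, n_cases):
    """Homothetic rescaling of a historical epidemic curve, followed by a
    simple resampling of daily case counts."""
    hist = np.asarray(hist_counts, float)
    x_new = np.linspace(0, len(hist) - 1, n_days)      # stretch/compress time axis
    shape = np.interp(x_new, np.arange(len(hist)), hist)
    p = shape / shape.sum()                            # daily probabilities
    return rng.multinomial(n_cases, p)                 # resampled daily counts

historical = [1, 3, 8, 15, 22, 18, 10, 5, 2, 1]        # hypothetical outbreak
print(simulate_outbreak(historical, n_days=14, n_cases=120))
```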
Zimmermann, Johannes; Wright, Aidan G C
2017-01-01
The interpersonal circumplex is a well-established structural model that organizes interpersonal functioning within the two-dimensional space marked by dominance and affiliation. The structural summary method (SSM) was developed to evaluate the interpersonal nature of other constructs and measures outside the interpersonal circumplex. To date, this method has been primarily descriptive, providing no way to draw inferences when comparing SSM parameters across constructs or groups. We describe a newly developed resampling-based method for deriving confidence intervals, which allows for SSM parameter comparisons. In a series of five studies, we evaluated the accuracy of the approach across a wide range of possible sample sizes and parameter values, and demonstrated its utility for posing theoretical questions on the interpersonal nature of relevant constructs (e.g., personality disorders) using real-world data. As a result, the SSM is strengthened for its intended purpose of construct evaluation and theory building. © The Author(s) 2015.
Sweep excitation with order tracking: A new tactic for beam crack analysis
NASA Astrophysics Data System (ADS)
Wei, Dongdong; Wang, KeSheng; Zhang, Mian; Zuo, Ming J.
2018-04-01
Crack detection in beams and beam-like structures is an important issue in industry and has attracted numerous investigations. A local crack changes the global system dynamics and produces non-linear vibration responses. Many researchers have studied these non-linearities for beam crack diagnosis. However, most reported methods are based on impact excitation and constant-frequency excitation; few studies have focused on crack detection through external sweep excitation, which reveals abundant dynamic characteristics of the system. Together with a signal resampling technique inspired by computed order tracking, this paper utilizes vibration responses under sweep excitation to diagnose the crack status of beams. A data-driven method for crack depth evaluation is proposed, and window-based harmonic extraction approaches are studied. The effectiveness of sweep excitation and the proposed method is experimentally validated.
NASA Astrophysics Data System (ADS)
Zhang, Yu-Ying; Reiprich, Thomas H.; Schneider, Peter; Clerc, Nicolas; Merloni, Andrea; Schwope, Axel; Borm, Katharina; Andernach, Heinz; Caretta, César A.; Wu, Xiang-Ping
2017-03-01
We present the relation of X-ray luminosity versus dynamical mass for 63 nearby clusters of galaxies in a flux-limited sample, the HIghest X-ray FLUx Galaxy Cluster Sample (HIFLUGCS, consisting of 64 clusters). The luminosity measurements are obtained based on 1.3 Ms of clean XMM-Newton data and ROSAT pointed observations. The masses are estimated using optical spectroscopic redshifts of 13,647 cluster galaxies in total. We classify clusters as disturbed or undisturbed based on a combination of the X-ray luminosity concentration and the offset between the brightest cluster galaxy and the X-ray flux-weighted center. Given sufficient numbers (i.e., ≥45) of member galaxies when the dynamical masses are computed, the luminosity versus mass relations agree between the disturbed and undisturbed clusters. The cool-core clusters still dominate the scatter in the luminosity versus mass relation even when a core-corrected X-ray luminosity is used, which indicates that the scatter of this scaling relation mainly reflects the structure formation history of the clusters. As shown by the clusters with only a few spectroscopically confirmed members, the dynamical masses can be underestimated and thus lead to a biased scaling relation. To investigate the potential of spectroscopic surveys to follow up high-redshift galaxy clusters or groups observed in X-ray surveys for identification and mass calibration, we carried out Monte Carlo resampling of the cluster galaxy redshifts and calibrated the uncertainties of the redshift and dynamical mass estimates when only reduced numbers of galaxy redshifts per cluster are available. The resampling considers the SPIDERS and 4MOST configurations, designed for the follow-up of the eROSITA clusters, and was carried out for each cluster in the sample at the actual cluster redshift as well as at assigned input cluster redshifts of 0.2, 0.4, 0.6, and 0.8. To follow up very distant clusters or groups, we also carried out the mass calibration based on the resampling with only ten redshifts per cluster, and redshift calibration based on the resampling with only five and ten redshifts per cluster, respectively. Our results demonstrate the power of combining upcoming X-ray and optical spectroscopic surveys for mass calibration of clusters. The scatter in the dynamical mass estimates for the clusters with at least ten members is within 50%.
Zhang, Yeqing; Wang, Meiling; Li, Yafeng
2018-01-01
For the objective of essentially decreasing computational complexity and time consumption of signal acquisition, this paper explores a resampling strategy and variable circular correlation time strategy specific to broadband multi-frequency GNSS receivers. In broadband GNSS receivers, the resampling strategy is established to work on conventional acquisition algorithms by resampling the main lobe of received broadband signals with a much lower frequency. Variable circular correlation time is designed to adapt to different signal strength conditions and thereby increase the operation flexibility of GNSS signal acquisition. The acquisition threshold is defined as the ratio of the highest and second highest correlation results in the search space of carrier frequency and code phase. Moreover, computational complexity of signal acquisition is formulated by amounts of multiplication and summation operations in the acquisition process. Comparative experiments and performance analysis are conducted on four sets of real GPS L2C signals with different sampling frequencies. The results indicate that the resampling strategy can effectively decrease computation and time cost by nearly 90–94% with just slight loss of acquisition sensitivity. With circular correlation time varying from 10 ms to 20 ms, the time cost of signal acquisition has increased by about 2.7–5.6% per millisecond, with most satellites acquired successfully. PMID:29495301
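The acquisition threshold defined here (ratio of the highest to the second-highest correlation peak) can be sketched compactly. The example below computes circular correlation over all code phases by FFT for a toy binary ranging code; in the paper's strategy the received signal would first be decimated by the resampling step, which is omitted here, and all names and parameters are ours.

```python
import numpy as np

rng = np.random.default_rng(9)

def acquisition_ratio(received, code, exclude=2):
    """Circular correlation over all code phases via FFT; returns the
    highest-to-second-highest peak ratio and the estimated code phase."""
    corr = np.abs(np.fft.ifft(np.fft.fft(received) * np.conj(np.fft.fft(code))))
    k = int(np.argmax(corr))
    mask = np.ones(len(corr), bool)
    mask[[(k + i) % len(corr) for i in range(-exclude, exclude + 1)]] = False
    return corr[k] / corr[mask].max(), k

code = rng.choice([-1.0, 1.0], 1023)                  # toy +/-1 ranging code
received = np.roll(code, 317) + rng.normal(0, 1.0, 1023)
ratio, phase = acquisition_ratio(received, code)
print("peak ratio:", round(ratio, 2), " estimated code phase:", phase)
```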
Resampling approach for anomalous change detection
NASA Astrophysics Data System (ADS)
Theiler, James; Perkins, Simon
2007-04-01
We investigate the problem of identifying pixels in pairs of co-registered images that correspond to real changes on the ground. Changes that are due to environmental differences (illumination, atmospheric distortion, etc.) or sensor differences (focus, contrast, etc.) will be widespread throughout the image, and the aim is to avoid these changes in favor of changes that occur in only one or a few pixels. Formal outlier detection schemes (such as the one-class support vector machine) can identify rare occurrences, but will be confounded by pixels that are "equally rare" in both images: they may be anomalous, but they are not changes. We describe a resampling scheme we have developed that formally addresses both of these issues, and reduces the problem to a binary classification, a problem for which a large variety of machine learning tools have been developed. In principle, the effects of misregistration will manifest themselves as pervasive changes, and our method will be robust against them - but in practice, misregistration remains a serious issue.
A program for handling map projections of small-scale geospatial raster data
Finn, Michael P.; Steinwand, Daniel R.; Trent, Jason R.; Buehler, Robert A.; Mattli, David M.; Yamamoto, Kristina H.
2012-01-01
Scientists routinely accomplish small-scale geospatial modeling using raster datasets of global extent. Such use often requires the projection of global raster datasets onto a map or the reprojection from a given map projection associated with a dataset. The distortion characteristics of these projection transformations can have significant effects on modeling results. Distortions associated with the reprojection of global data are generally greater than distortions associated with reprojections of larger-scale, localized areas. The accuracy of areas in projected raster datasets of global extent is dependent on spatial resolution. To address these problems of projection and the associated resampling that accompanies it, methods for framing the transformation space, direct point-to-point transformations rather than gridded transformation spaces, a solution to the wrap-around problem, and an approach to alternative resampling methods are presented. The implementations of these methods are provided in an open-source software package called MapImage (or mapIMG, for short), which is designed to function on a variety of computer architectures.
Improving the quality of extracting dynamics from interspike intervals via a resampling approach
NASA Astrophysics Data System (ADS)
Pavlova, O. N.; Pavlov, A. N.
2018-04-01
We address the problem of improving the quality of characterizing chaotic dynamics based on point processes produced by different types of neuron models. Despite the presence of embedding theorems for non-uniformly sampled dynamical systems, the case of short data analysis requires additional attention because the selection of algorithmic parameters may have an essential influence on estimated measures. We consider how the preliminary processing of interspike intervals (ISIs) can increase the precision of computing the largest Lyapunov exponent (LE). We report general features of characterizing chaotic dynamics from point processes and show that independently of the selected mechanism for spike generation, the performed preprocessing reduces computation errors when dealing with a limited amount of data.
Gleeson, Helena K; Wiley, Veronica; Wilcken, Bridget; Elliott, Elizabeth; Cowell, Christopher; Thonsett, Michael; Byrne, Geoffrey; Ambler, Geoffrey
2008-10-01
To assess the benefits and practicalities of setting up a newborn screening (NBS) program in Australia for congenital adrenal hyperplasia (CAH) through a 2-year pilot screening in ACT/NSW and comparison with case surveillance in other states. The pilot newborn screening occurred between 1/10/95 and 30/9/97 in NSW/ACT. Concurrently, case reporting for all new CAH cases occurred through the Australian Paediatric Surveillance Unit (APSU) across Australia. Details of clinical presentation, re-sampling and laboratory performance were assessed. 185,854 newborn infants were screened for CAH in NSW/ACT. Concurrently, 30 cases of CAH were reported to the APSU, twelve of which were from NSW/ACT. CAH incidence was 1 in 15,488 (screened population) vs 1 in 18,034 births (unscreened) (difference not significant). Median age at initial notification was day 8, with confirmed diagnosis at 13 (5-23) days in the screened population vs 16 (7-37) days in the unscreened population (not significant). Of the 5 clinically unsuspected males in the screened population, one had mild salt-wasting by the time of notification, compared with salt-wasting crisis in all 6 males from the unscreened population. 96% of results were reported by day 10. Resampling was requested in 637 cases (0.4%), and the median re-sampling delay was 11 (0-28) days, with higher resample rates in males (p < 0.0001). The within-laboratory cost per clinically unsuspected case was A$42,717. There seems good justification for NBS for CAH based on clear prevention of salt-wasting crises and their potential long-term consequences. Prospects also exist for enhancing screening performance.
USDA-ARS's Scientific Manuscript database
Satellite-based passive microwave remote sensing typically involves a scanning antenna that makes measurements at irregularly spaced locations. These locations can change on a day to day basis. Soil moisture products derived from satellite-based passive microwave remote sensing are usually resampled...
Liu, Fang; Shen, Changqing; He, Qingbo; Zhang, Ao; Liu, Yongbin; Kong, Fanrang
2014-01-01
A fault diagnosis strategy based on the wayside acoustic monitoring technique is investigated for locomotive bearing fault diagnosis. Inspired by the transient modeling analysis method based on correlation filtering analysis, a so-called Parametric-Mother-Doppler-Wavelet (PMDW) is constructed with six parameters, including a center characteristic frequency and five kinematic model parameters. A Doppler effect eliminator, containing a PMDW generator, a correlation filtering analysis module, and a signal resampler, is devised to eliminate the Doppler effect embedded in the recorded acoustic signal of the bearing. Through the Doppler effect eliminator, the five kinematic model parameters can be identified from the signal itself. Then, the signal resampler is applied to eliminate the Doppler effect using the identified parameters. With its ability to detect early bearing faults, the transient model analysis method is employed to detect localized bearing faults after the embedded Doppler effect has been eliminated. The effectiveness of the proposed fault diagnosis strategy is verified via simulation studies and applications to the diagnosis of locomotive roller bearing defects. PMID:24803197
A novel fruit shape classification method based on multi-scale analysis
NASA Astrophysics Data System (ADS)
Gui, Jiangsheng; Ying, Yibin; Rao, Xiuqin
2005-11-01
Shape is one of the major concerns in automated inspection and sorting of fruits, and it remains a difficult problem. In this research, we propose the multi-scale energy distribution (MSED) for object shape description, and explore the relationship between an object's shape and its boundary energy distribution across scales for shape extraction. MSED captures not only the main energy, which represents primary shape information at lower scales, but also subordinate energy, which represents local shape information at higher differential scales. Thus, it provides a natural tool for multi-resolution representation and can be used as a feature for shape classification. We address the three main processing steps in MSED-based shape classification, namely: 1) image preprocessing and citrus shape extraction, 2) shape resampling and shape feature normalization, and 3) energy decomposition by wavelet and classification by a BP neural network. Here, shape resampling draws 256 boundary pixels from a cubic-spline approximation of the original boundary in order to obtain uniform raw data. A probability function was defined and an effective method to select a start point was given through maximal expectation, which overcomes the inconvenience of traditional methods and yields rotation invariance. The method distinguishes relatively normal citrus from seriously abnormal fruit with a classification rate above 91.2%. The global correct classification rate is 89.77%, and our method is more effective than the traditional method. The global result can meet the requirements of fruit grading.
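The resampling step (256 uniformly spaced points from a cubic-spline fit of the boundary) is illustrated below with SciPy; the fruit-like toy outline and function name are ours, and the start-point selection and wavelet stages are not shown.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def resample_contour(x, y, n=256):
    """Fit a periodic cubic spline to a closed boundary and return n points
    uniformly spaced in the spline parameter."""
    x, y = np.r_[x, x[:1]], np.r_[y, y[:1]]   # close the curve explicitly
    tck, _ = splprep([x, y], s=0, per=True)
    u = np.linspace(0, 1, n, endpoint=False)
    return np.column_stack(splev(u, tck))

t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
r = 1.0 + 0.15 * np.cos(3 * t)                # toy fruit-like outline with lobes
pts = resample_contour(r * np.cos(t), r * np.sin(t))
print(pts.shape)                              # -> (256, 2)
```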
Computation of ancestry scores with mixed families and unrelated individuals.
Zhou, Yi-Hui; Marron, James S; Wright, Fred A
2018-03-01
The issue of robustness to family relationships in computing genotype ancestry scores such as eigenvector projections has received increased attention in genetic association, and is particularly challenging when sets of both unrelated individuals and closely related family members are included. The current standard is to compute loadings (left singular vectors) using unrelated individuals and to compute projected scores for remaining family members. However, projected ancestry scores from this approach suffer from shrinkage toward zero. We consider two main novel strategies: (i) matrix substitution based on decomposition of a target family-orthogonalized covariance matrix, and (ii) using family-averaged data to obtain loadings. We illustrate the performance via simulations, including resampling from 1000 Genomes Project data, and analysis of a cystic fibrosis dataset. The matrix substitution approach has similar performance to the current standard, but is simple and uses only a genotype covariance matrix, while the family-average method shows superior performance. Our approaches are accompanied by novel ancillary approaches that provide considerable insight, including individual-specific eigenvalue scree plots. © 2017 The Authors. Biometrics published by Wiley Periodicals, Inc. on behalf of International Biometric Society.
Chu, Hui-May; Ette, Ene I
2005-09-02
This study was performed to develop a new nonparametric approach for the estimation of a robust tissue-to-plasma ratio from extremely sparsely sampled paired data (i.e., one sample each from plasma and tissue per subject). The tissue-to-plasma ratio was estimated from paired/unpaired experimental data using the independent time points approach, area under the curve (AUC) values calculated with the naïve data averaging approach, and AUC values calculated using sampling-based approaches (e.g., the pseudoprofile-based bootstrap [PpbB] approach and the random sampling approach [our proposed approach]). The random sampling approach involves the use of a 2-phase algorithm. The convergence of the sampling/resampling approaches was investigated, as well as the robustness of the estimates produced by the different approaches. To evaluate the latter, new data sets were generated by introducing outlier(s) into the real data set. One to 2 concentration values were inflated by 10% to 40% from their original values to produce the outliers. Tissue-to-plasma ratios computed using the independent time points approach varied between 0 and 50 across time points. The ratio obtained from AUC values acquired using the naive data averaging approach was not associated with any measure of uncertainty or variability. Calculating the ratio without regard to pairing yielded poorer estimates. The random sampling and pseudoprofile-based bootstrap approaches yielded tissue-to-plasma ratios with uncertainty and variability. However, the random sampling approach, because of the 2-phase nature of its algorithm, yielded more robust estimates and required fewer replications. Therefore, a 2-phase random sampling approach is proposed for the robust estimation of tissue-to-plasma ratio from extremely sparsely sampled data.
Resampling probability values for weighted kappa with multiple raters.
Mielke, Paul W; Berry, Kenneth J; Johnston, Janis E
2008-04-01
A new procedure to compute weighted kappa with multiple raters is described. A resampling procedure to compute approximate probability values for weighted kappa with multiple raters is presented. Applications of weighted kappa are illustrated with an example analysis of classifications by three independent raters.
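The abstract is brief, so the sketch below fills in one plausible reading: a linear-weighted kappa averaged over rater pairs, with an approximate resampling P value obtained by independently permuting each rater's ratings. The pairwise-average statistic is a simplification of ours, not necessarily the authors' exact multi-rater statistic.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(11)

def weighted_kappa(a, b, k):
    """Linear-weighted kappa for two raters over categories 0..k-1."""
    obs = np.zeros((k, k))
    for i, j in zip(a, b):
        obs[i, j] += 1
    obs /= len(a)
    exp = np.outer(np.bincount(a, minlength=k),
                   np.bincount(b, minlength=k)) / len(a) ** 2
    w = np.abs(np.subtract.outer(np.arange(k), np.arange(k)))  # disagreement weights
    return 1 - (w * obs).sum() / (w * exp).sum()

def multi_kappa(R, k):
    """Simplified multi-rater statistic: mean kappa over all rater pairs."""
    return np.mean([weighted_kappa(R[i], R[j], k)
                    for i, j in combinations(range(len(R)), 2)])

R = rng.integers(0, 4, size=(3, 30))                   # 3 raters, 30 subjects
R[1] = np.clip(R[0] + rng.integers(-1, 2, 30), 0, 3)   # rater 2 tracks rater 1
obs = multi_kappa(R, 4)
null = [multi_kappa(np.array([rng.permutation(r) for r in R]), 4)
        for _ in range(2000)]
p = (np.sum(np.array(null) >= obs) + 1) / (len(null) + 1)
print("kappa:", round(obs, 3), " resampling P value:", round(p, 4))
```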
Automated coregistration of MTI spectral bands
NASA Astrophysics Data System (ADS)
Theiler, James P.; Galbraith, Amy E.; Pope, Paul A.; Ramsey, Keri A.; Szymanski, John J.
2002-08-01
In the focal plane of a pushbroom imager, a linear array of pixels is scanned across the scene, building up the image one row at a time. For the Multispectral Thermal Imager (MTI), each of fifteen different spectral bands has its own linear array. These arrays are pushed across the scene together, but since each band's array is at a different position on the focal plane, a separate image is produced for each band. The standard MTI data products (LEVEL1B_R_COREG and LEVEL1B_R_GEO) resample these separate images to a common grid and produce coregistered multispectral image cubes. The coregistration software employs a direct "dead reckoning" approach. Every pixel in the calibrated image is mapped to an absolute position on the surface of the earth, and these are resampled to produce an undistorted coregistered image of the scene. To do this requires extensive information regarding the satellite position and pointing as a function of time, the precise configuration of the focal plane, and the distortion due to the optics. These must be combined with knowledge about the position and altitude of the target on the rotating ellipsoidal earth. We will discuss the direct approach to MTI coregistration, as well as more recent attempts to tweak the precision of the band-to-band registration using correlations in the imagery itself.
NASA Astrophysics Data System (ADS)
Schonlau, William J.
2006-05-01
An immersive viewing engine providing basic telepresence functionality for a variety of application types is presented. Augmented reality, teleoperation and virtual reality applications all benefit from the use of head mounted display devices that present imagery appropriate to the user's head orientation at full frame rates. Our primary application is the viewing of remote environments, as with a camera equipped teleoperated vehicle. The conventional approach where imagery from a narrow field camera onboard the vehicle is presented to the user on a small rectangular screen is contrasted with an immersive viewing system where a cylindrical or spherical format image is received from a panoramic camera on the vehicle, resampled in response to sensed user head orientation and presented via wide field eyewear display, approaching 180 degrees of horizontal field. Of primary interest is the user's enhanced ability to perceive and understand image content, even when image resolution parameters are poor, due to the innate visual integration and 3-D model generation capabilities of the human visual system. A mathematical model for tracking user head position and resampling the panoramic image to attain distortion free viewing of the region appropriate to the user's current head pose is presented and consideration is given to providing the user with stereo viewing generated from depth map information derived using stereo from motion algorithms.
Exchangeability, extreme returns and Value-at-Risk forecasts
NASA Astrophysics Data System (ADS)
Huang, Chun-Kai; North, Delia; Zewotir, Temesgen
2017-07-01
In this paper, we propose a new approach to extreme value modelling for the forecasting of Value-at-Risk (VaR). In particular, the block maxima and the peaks-over-threshold methods are generalised to exchangeable random sequences. This caters for the dependencies, such as serial autocorrelation, of financial returns observed empirically. In addition, this approach allows for parameter variations within each VaR estimation window. Empirical prior distributions of the extreme value parameters are attained by using resampling procedures. We compare the results of our VaR forecasts with those of the unconditional extreme value theory (EVT) approach and the conditional GARCH-EVT model for robust conclusions.
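For orientation, the unconditional peaks-over-threshold baseline that the paper compares against can be sketched as follows, with a naive bootstrap standing in for the resampling of extreme value parameters; the exchangeable-sequence generalization is not shown, and the data, threshold and names are ours.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(13)

def pot_var(losses, u, q=0.99):
    """Peaks-over-threshold VaR: fit a GPD to the excesses over u, invert."""
    exc = losses[losses > u] - u
    xi, _, sigma = genpareto.fit(exc, floc=0)     # shape and scale, loc fixed at 0
    p_u = len(exc) / len(losses)                  # threshold exceedance probability
    return u + sigma / xi * (((1 - q) / p_u) ** (-xi) - 1)

losses = rng.standard_t(4, 2500)                  # heavy-tailed toy loss series
u = np.quantile(losses, 0.95)
point = pot_var(losses, u)
boot = [pot_var(rng.choice(losses, len(losses), replace=True), u)
        for _ in range(300)]                      # resampled VaR distribution
print("99% VaR:", round(point, 3),
      " 90% band:", np.round(np.quantile(boot, [0.05, 0.95]), 3))
```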
Ozçift, Akin
2011-05-01
Supervised classification algorithms are commonly used in the design of computer-aided diagnosis systems. In this study, we present a resampling-strategy-based Random Forests (RF) ensemble classifier to improve diagnosis of cardiac arrhythmia. Random Forests is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the classes output by the individual trees. In this way, an RF ensemble classifier performs better than a single tree from a classification performance point of view. In general, multiclass datasets with unbalanced sample-size distributions are difficult to analyze in terms of class discrimination. Cardiac arrhythmia is such a dataset: it has multiple classes with small sample sizes, and it is therefore a suitable testbed for our resampling-based training strategy. The dataset contains 452 samples in fourteen types of arrhythmias, and eleven of these classes have sample sizes of less than 15. Our diagnosis strategy consists of two parts: (i) a correlation-based feature selection algorithm is used to select relevant features from the cardiac arrhythmia dataset; (ii) the RF machine learning algorithm is used to evaluate the performance of the selected features with and without simple random sampling, to assess the efficiency of the proposed training strategy. The resultant accuracy of the classifier is 90.0%, which is quite high diagnostic performance for cardiac arrhythmia. Furthermore, three case studies, i.e., thyroid, cardiotocography and audiology, are used to benchmark the effectiveness of the proposed method. The results of the experiments demonstrate the efficiency of the random sampling strategy in training the RF ensemble classification algorithm. Copyright © 2011 Elsevier Ltd. All rights reserved.
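A minimal version of the training strategy (simple random oversampling of each class to the majority size, followed by an RF fit) can be sketched with scikit-learn; the toy data stand in for the arrhythmia set, and the correlation-based feature selection step is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils import resample

def balance_by_resampling(X, y, seed=0):
    """Oversample every class (with replacement) to the majority class size."""
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    Xb, yb = [], []
    for c in classes:
        Xb.append(resample(X[y == c], replace=True, n_samples=n_max,
                           random_state=seed))
        yb.append(np.full(n_max, c))
    return np.vstack(Xb), np.concatenate(yb)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(i, 1.0, (n, 10)) for i, n in enumerate([200, 30, 8])])
y = np.repeat([0, 1, 2], [200, 30, 8])             # heavily unbalanced classes
Xb, yb = balance_by_resampling(X, y)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xb, yb)
print("training-set accuracy:", round(clf.score(X, y), 3))
```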
Li, Jun; Tibshirani, Robert
2015-01-01
We discuss the identification of features that are associated with an outcome in RNA-Sequencing (RNA-Seq) and other sequencing-based comparative genomic experiments. RNA-Seq data takes the form of counts, so models based on the normal distribution are generally unsuitable. The problem is especially challenging because different sequencing experiments may generate quite different total numbers of reads, or ‘sequencing depths’. Existing methods for this problem are based on Poisson or negative binomial models: they are useful but can be heavily influenced by ‘outliers’ in the data. We introduce a simple, nonparametric method with resampling to account for the different sequencing depths. The new method is more robust than parametric methods. It can be applied to data with quantitative, survival, two-class or multiple-class outcomes. We compare our proposed method to Poisson and negative binomial-based methods in simulated and real data sets, and find that our method discovers more consistent patterns than competing methods. PMID:22127579
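One simple way to realize the depth adjustment described (not necessarily the authors' exact scheme) is binomial thinning of every sample down to the smallest sequencing depth, as sketched below with invented counts.

```python
import numpy as np

rng = np.random.default_rng(17)

def thin_to_common_depth(counts):
    """Binomial thinning of a genes x samples count matrix so that every
    sample is resampled to (approximately) the smallest sequencing depth."""
    depths = counts.sum(axis=0).astype(float)
    p = depths.min() / depths                 # per-sample keep probability
    return rng.binomial(counts, p[None, :])

counts = rng.negative_binomial(5, 0.3, size=(1000, 6)) * rng.integers(1, 4, (1, 6))
print("depths before:", counts.sum(0))
print("depths after :", thin_to_common_depth(counts).sum(0))
```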
NASA Astrophysics Data System (ADS)
Mishra, C.; Samantaray, A. K.; Chakraborty, G.
2016-09-01
Vibration analysis for the diagnosis of faults in rolling element bearings is complicated when the rotor speed is variable or slow. In the former case, the time intervals between the fault-induced impact responses in the vibration signal are non-uniform and the signal strength is variable. In the latter case, the fault-induced impact response strength is weak and generally buried in noise, i.e. noise dominates the signal. This article proposes a diagnosis scheme based on a combination of several signal processing techniques. The proposed scheme first represents the vibration signal in terms of uniformly resampled angular position of the rotor shaft, using interpolated instantaneous angular position measurements. Thereafter, intrinsic mode functions (IMFs) are generated through empirical mode decomposition (EMD) of the resampled vibration signal, followed by thresholding of the IMFs and signal reconstruction to de-noise the signal, and envelope order tracking to diagnose the faults. Data for validating the proposed diagnosis scheme are initially generated from a multi-body simulation model of a rolling element bearing, developed using the bond graph approach. This bond graph model includes the ball and cage dynamics, localized fault geometry, contact mechanics, rotor unbalance, and friction and slip effects. The diagnosis scheme is finally validated with experiments performed with the help of a machine fault simulator (MFS) system. Some fault scenarios that could not be experimentally recreated are then generated through simulations and analyzed with the developed diagnosis scheme.
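The first step, resampling the vibration signal onto a uniform shaft-angle grid from interpolated instantaneous angle measurements, is sketched below on a synthetic run-up; the signal model and all names are ours, and the EMD, thresholding and envelope stages are not shown.

```python
import numpy as np

def angular_resample(t, x, t_enc, theta_enc, samples_per_rev=256):
    """Order-tracking style resampling: interpolate the instantaneous shaft
    angle, then resample the signal at uniform angle increments."""
    theta = np.interp(t, t_enc, theta_enc)       # shaft angle at each sample
    n_rev = int(theta[-1] // (2 * np.pi))
    theta_u = np.arange(n_rev * samples_per_rev) * 2 * np.pi / samples_per_rev
    t_u = np.interp(theta_u, theta, t)           # invert the monotone angle(t)
    return np.interp(t_u, t, x)                  # signal on a uniform angle grid

fs, T = 8192, 4.0
t = np.arange(0, T, 1 / fs)
f_inst = 10 + 5 * t                              # speed ramps from 10 to 30 Hz
theta = 2 * np.pi * np.cumsum(f_inst) / fs       # shaft angle over time
x = np.sin(3 * theta) + 0.1 * np.random.default_rng(2).normal(size=len(t))
xa = angular_resample(t, x, t, theta)
# In the angle domain the 3rd shaft order becomes a fixed spectral line.
spec = np.abs(np.fft.rfft(xa * np.hanning(len(xa))))
print("dominant shaft order:", round(np.argmax(spec) * 256 / len(xa), 2))
```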
Narayan, Manjari; Allen, Genevera I.
2016-01-01
Many complex brain disorders, such as autism spectrum disorders, exhibit a wide range of symptoms and disability. To understand how brain communication is impaired in such conditions, functional connectivity studies seek to understand individual differences in brain network structure in terms of covariates that measure symptom severity. In practice, however, functional connectivity is not observed but estimated from complex and noisy neural activity measurements. Imperfect subject network estimates can compromise subsequent efforts to detect covariate effects on network structure. We address this problem in the case of Gaussian graphical models of functional connectivity, by proposing novel two-level models that treat both subject level networks and population level covariate effects as unknown parameters. To account for imperfectly estimated subject level networks when fitting these models, we propose two related approaches—R2 based on resampling and random effects test statistics, and R3 that additionally employs random adaptive penalization. Simulation studies using realistic graph structures reveal that R2 and R3 have superior statistical power to detect covariate effects compared to existing approaches, particularly when the number of within subject observations is comparable to the size of subject networks. Using our novel models and methods to study parts of the ABIDE dataset, we find evidence of hypoconnectivity associated with symptom severity in autism spectrum disorders, in frontoparietal and limbic systems as well as in anterior and posterior cingulate cortices. PMID:27147940
An empirical study using permutation-based resampling in meta-regression
2012-01-01
Background: In meta-regression, as the number of trials in the analyses decreases, the risk of false positives or false negatives increases. This is partly due to the assumption of normality, which may not hold in small samples. Creation of a distribution from the observed trials using permutation methods to calculate P values may allow for less spurious findings. Permutation has not been empirically tested in meta-regression. The objective of this study was to perform an empirical investigation to explore the differences in results for meta-analyses on a small number of trials using standard large-sample approaches versus permutation-based methods for meta-regression. Methods: We isolated a sample of randomized controlled clinical trials (RCTs) for interventions that have a small number of trials (herbal medicine trials). Trials were then grouped by herbal species and condition and assessed for methodological quality using the Jadad scale, and data were extracted for each outcome. Finally, we performed meta-analyses on the primary outcome of each group of trials and meta-regression for methodological quality subgroups within each meta-analysis. We used large-sample methods and permutation methods in our meta-regression modeling. We then compared final models and final P values between methods. Results: We collected 110 trials across 5 intervention/outcome pairings and 5 to 10 trials per covariate. When applying large-sample methods and permutation-based methods in our backwards stepwise regression, the covariates in the final models were identical in all cases. The P values for the covariates in the final model were larger in 78% (7/9) of the cases for permutation and identical for 22% (2/9) of the cases. Conclusions: We present empirical evidence that permutation-based resampling may not change final models when using backwards stepwise regression, but may increase P values in meta-regression of multiple covariates for a relatively small number of trials. PMID:22587815
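To make the permutation idea concrete, the sketch below runs a toy inverse-variance weighted meta-regression of effect size on a Jadad-style quality score and compares the observed slope t statistic with its distribution under random permutation of the covariate; the data are invented and the model is a simplified stand-in for the methods compared in the study.

```python
import numpy as np

rng = np.random.default_rng(21)

def wls_slope_t(effect, se, covariate):
    """Inverse-variance weighted meta-regression: slope and its t statistic."""
    w = 1.0 / se**2
    X = np.column_stack([np.ones_like(covariate), covariate])
    XtWX = X.T @ (X * w[:, None])
    beta = np.linalg.solve(XtWX, (X * w[:, None]).T @ effect)
    resid = effect - X @ beta
    s2 = (w * resid**2).sum() / (len(effect) - 2)     # multiplicative dispersion
    cov = s2 * np.linalg.inv(XtWX)
    return beta[1], beta[1] / np.sqrt(cov[1, 1])

effect = np.array([0.10, 0.25, 0.05, 0.40, 0.30, 0.15, 0.45, 0.20])
se     = np.array([0.12, 0.15, 0.10, 0.20, 0.18, 0.11, 0.25, 0.14])
jadad  = np.array([5, 3, 5, 1, 2, 4, 1, 3], float)    # quality covariate

slope, t_obs = wls_slope_t(effect, se, jadad)
null = [abs(wls_slope_t(effect, se, rng.permutation(jadad))[1])
        for _ in range(5000)]
p_perm = (np.sum(np.array(null) >= abs(t_obs)) + 1) / (len(null) + 1)
print("slope:", round(slope, 3), " permutation P value:", round(p_perm, 4))
```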
Chen, Ling; Feng, Yanqin; Sun, Jianguo
2017-10-01
This paper discusses regression analysis of clustered failure time data, which occur when the failure times of interest are collected from clusters. In particular, we consider the situation where the correlated failure times of interest may be related to cluster sizes. For inference, we present two estimation procedures, the weighted estimating equation-based method and the within-cluster resampling-based method, when the correlated failure times of interest arise from a class of additive transformation models. The former makes use of the inverse of cluster sizes as weights in the estimating equations, while the latter can be easily implemented by using the existing software packages for right-censored failure time data. An extensive simulation study is conducted and indicates that the proposed approaches work well in both the situations with and without informative cluster size. They are applied to a dental study that motivated this study.
Review and Analysis of Peak Tracking Techniques for Fiber Bragg Grating Sensors
2017-01-01
Fiber Bragg Grating (FBG) sensors are among the most popular elements in fiber optic sensor networks used for the direct measurement of temperature and strain. Modern FBG interrogation setups measure the FBG spectrum in real time and determine the shift of the Bragg wavelength of the FBG in order to estimate the physical parameters. The problem of determining the peak wavelength of the FBG from a spectral measurement limited in resolution and noise is referred to as the peak-tracking problem. In this work, several peak-tracking approaches are reviewed and classified, outlining their algorithmic implementations: methods based on direct estimation, interpolation, correlation, resampling, transforms, and optimization are discussed in all their proposed implementations. Then, a simulation based on coupled-mode theory compares the performance of the main peak-tracking methods in terms of accuracy and signal-to-noise-ratio resilience. PMID:29039804
Fast repurposing of high-resolution stereo video content for mobile use
NASA Astrophysics Data System (ADS)
Karaoglu, Ali; Lee, Bong Ho; Boev, Atanas; Cheong, Won-Sik; Gotchev, Atanas
2012-06-01
3D video content is captured and created mainly in high resolution, targeting big cinema or home TV screens. For 3D mobile devices, equipped with small-size auto-stereoscopic displays, such content has to be properly repurposed, preferably in real time. The repurposing requires not only spatial resizing but also properly maintaining the output stereo disparity, as it should deliver realistic, pleasant and harmless 3D perception. In this paper, we propose an approach to adapt the disparity range of the source video to the comfort disparity zone of the target display. To achieve this, we adapt the scale and the aspect ratio of the source video. We aim at maximizing the disparity range of the retargeted content within the comfort zone, and minimizing the letterboxing of the cropped content. The proposed algorithm consists of five stages. First, we analyse the display profile, which characterises what 3D content can be comfortably observed on the target display. Then, we perform fast disparity analysis of the input stereoscopic content. Instead of returning the dense disparity map, it returns an estimate of the disparity statistics (min, max, mean and variance) per frame. Additionally, we detect scene cuts, where sharp transitions in disparities occur. Based on the estimated input and desired output disparity ranges, we derive the optimal cropping parameters and scale of the cropping window, which would yield the targeted disparity range and minimize the area of cropped and letterboxed content. Once the rescaling and cropping parameters are known, we perform a resampling procedure using spline-based and perceptually optimized resampling (anti-aliasing) kernels, which also have a very efficient computational structure. Perceptual optimization is achieved by adjusting the cut-off frequency of the anti-aliasing filter to the throughput of the target display.
Xiao, Yongling; Abrahamowicz, Michal
2010-03-30
We propose two bootstrap-based methods to correct the standard errors (SEs) from Cox's model for within-cluster correlation of right-censored event times. The cluster-bootstrap method resamples, with replacement, only the clusters, whereas the two-step bootstrap method resamples (i) the clusters, and (ii) individuals within each selected cluster, with replacement. In simulations, we evaluate both methods and compare them with the existing robust variance estimator and the shared gamma frailty model, which are available in statistical software packages. We simulate clustered event time data, with latent cluster-level random effects, which are ignored in the conventional Cox's model. For cluster-level covariates, both proposed bootstrap methods yield accurate SEs and type I error rates, and acceptable coverage rates, regardless of the true random effects distribution, and avoid the serious variance under-estimation of conventional Cox-based standard errors. However, the two-step bootstrap method over-estimates the variance for individual-level covariates. We also apply the proposed bootstrap methods to obtain confidence bands around flexible estimates of time-dependent effects in a real-life analysis of clustered event times.
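The cluster-bootstrap loop itself is short; the sketch below shows the resampling structure with a generic `estimate` function standing in for refitting Cox's model (a hypothetical placeholder, not the authors' code):

import numpy as np

def cluster_bootstrap_se(clusters, estimate, n_boot=1000, rng=None):
    # clusters: list of per-cluster data arrays
    # estimate: maps pooled observations to a scalar coefficient estimate
    rng = np.random.default_rng(rng)
    n = len(clusters)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw whole clusters with replacement
        stats.append(estimate(np.concatenate([clusters[i] for i in idx])))
    return np.std(stats, ddof=1)  # bootstrap standard error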
Jodice, Patrick G.R.; Garman, S.L.; Collopy, Michael W.
2001-01-01
Marbled Murrelets (Brachyramphus marmoratus) are threatened seabirds that nest in coastal old-growth coniferous forests throughout much of their breeding range. Currently, observer-based audio-visual surveys are conducted at inland forest sites during the breeding season primarily to determine nesting distribution and breeding status, and are being used to estimate temporal or spatial trends in murrelet detections. Our goal was to assess the feasibility of using audio-visual survey data for such monitoring. We used an intensive field-based survey effort to record daily murrelet detections at seven survey stations in the Oregon Coast Range. We then used computer-aided resampling techniques to assess the effectiveness of twelve survey strategies with varying scheduling and a sampling intensity of 4-14 surveys per breeding season to estimate known means and SDs of murrelet detections. Most survey strategies we tested failed to provide estimates of detection means and SDs that were within ±20% of actual means and SDs. Estimates of daily detections were, however, frequently estimated to within ±50% of field data with sampling efforts of 14 days/breeding season. Additional resampling analyses with statistically generated detection data indicated that the temporal variability in detection data had a great effect on the reliability of the mean and SD estimates calculated from the twelve survey strategies, while the value of the mean had little effect. Effectiveness at estimating multi-year trends in detection data was similarly poor, indicating that audio-visual surveys might reliably detect only annual declines in murrelet detections on the order of 50% per year.
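The resampling logic used to score a survey strategy can be illustrated as follows: draw a schedule's worth of days from the full field record and check whether the resampled mean lands within ±20% of the full-record mean. A simplified sketch with purely random scheduling (the study's twelve strategies were more structured):

import numpy as np

def survey_strategy_hit_rate(daily_detections, n_surveys, n_sims=1000, rng=None):
    # fraction of simulated seasons whose estimated mean detection rate
    # falls within +/-20% of the 'true' (full field record) mean
    rng = np.random.default_rng(rng)
    daily_detections = np.asarray(daily_detections, float)
    true_mean = daily_detections.mean()
    hits = sum(
        abs(rng.choice(daily_detections, size=n_surveys, replace=False).mean()
            - true_mean) <= 0.2 * abs(true_mean)
        for _ in range(n_sims)
    )
    return hits / n_sims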
Application of cause-and-effect analysis to potentiometric titration.
Kufelnicki, A; Lis, S; Meinrath, G
2005-08-01
A first attempt has been made to interpret physicochemical data from potentiometric titration analysis in accordance with the complete measurement-uncertainty budget approach (bottom-up) of ISO and Eurachem. A cause-and-effect diagram is established and discussed. Titration data for arsenazo III are used as a basis for this discussion. The commercial software Superquad is used and applied within a computer-intensive resampling framework. The cause-and-effect diagram is applied to evaluation of seven protonation constants of arsenazo III in the pH range 2-10.7. The data interpretation is based on empirical probability distributions and their analysis by second-order correct confidence estimates. The evaluated data are applied in the calculation of a speciation diagram including uncertainty estimates using the probabilistic speciation software Ljungskile.
Building Intuitions about Statistical Inference Based on Resampling
ERIC Educational Resources Information Center
Watson, Jane; Chance, Beth
2012-01-01
Formal inference, which makes theoretical assumptions about distributions and applies hypothesis testing procedures with null and alternative hypotheses, is notoriously difficult for tertiary students to master. The debate about whether this content should appear in Years 11 and 12 of the "Australian Curriculum: Mathematics" has gone on…
Testing variance components by two jackknife methods
USDA-ARS?s Scientific Manuscript database
The jackknife method, a resampling technique, has been widely used for statistical tests for years. The pseudo-value based jackknife method (defined as the pseudo jackknife method) is commonly used to reduce the bias of an estimate; however, sometimes it can result in large variation for an estimate a...
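The pseudo-value construction referred to above is standard: with n observations, the i-th pseudo value is n times the full-sample estimate minus (n - 1) times the leave-one-out estimate; the pseudo values' mean gives a bias-reduced estimate and their empirical variance a variance estimate. A minimal sketch:

import numpy as np

def pseudo_jackknife(x, stat):
    # returns (bias-reduced estimate, jackknife variance of the estimate)
    x = np.asarray(x)
    n = len(x)
    full = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])  # leave-one-out
    pseudo = n * full - (n - 1) * loo                          # pseudo values
    return pseudo.mean(), pseudo.var(ddof=1) / n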
ERIC Educational Resources Information Center
Fan, Xitao
This paper empirically and systematically assessed the performance of the bootstrap resampling procedure as applied to a regression model. Parameter estimates from Monte Carlo experiments (repeated sampling from the population) and bootstrap experiments (repeated resampling from one original sample) were generated and compared. Sample…
De-Dopplerization of Acoustic Measurements
2017-08-10
…band energy obtained from fractional octave band digital filters generates a de-Dopplerized spectrum without complex resampling algorithms. An equation… fractional octave representation and smearing that occurs within the spectrum, digital filtering techniques were not considered by these earlier…
Thematic mapper design parameter investigation
NASA Technical Reports Server (NTRS)
Colby, C. P., Jr.; Wheeler, S. G.
1978-01-01
This study simulated the multispectral data sets to be expected from three different Thematic Mapper configurations, and the ground processing of these data sets by three different resampling techniques. The simulated data sets were then evaluated by processing them for multispectral classification, and the Thematic Mapper configuration and resampling technique that provided the best classification accuracy were identified.
NEAT: an efficient network enrichment analysis test.
Signorelli, Mirko; Vinciotti, Veronica; Wit, Ernst C
2016-09-05
Network enrichment analysis is a powerful method that allows gene enrichment analysis to be integrated with the information on relationships between genes provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks; they can be computationally slow and are based on normality assumptions. We propose NEAT, a test for network enrichment analysis. The test is based on the hypergeometric distribution, which naturally arises as the null distribution in this context. NEAT can be applied not only to undirected, but also to directed and partially directed networks. Our simulations indicate that NEAT is considerably faster than alternative resampling-based methods, and that its capacity to detect enrichments is at least as good as that of alternative tests. We discuss applications of NEAT to network analyses in yeast by testing for enrichment of the Environmental Stress Response target gene set with GO Slim and KEGG functional gene sets, and also by inspecting associations between functional sets themselves. NEAT is a flexible and efficient test for network enrichment analysis that aims to overcome some limitations of existing resampling-based tests. The method is implemented in the R package neat, which can be freely downloaded from CRAN ( https://cran.r-project.org/package=neat ).
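The computational appeal of NEAT is that the null is evaluated analytically rather than by resampling: the number of links from one gene set into another is compared against a hypergeometric tail. A sketch of that tail computation (the parameterization here is illustrative; NEAT's exact statistic is defined on network degrees):

from scipy.stats import hypergeom

def enrichment_pvalue(links_observed, links_total, links_into_B, links_from_A):
    # P(X >= links_observed) when links_from_A links are drawn from
    # links_total links, of which links_into_B point into gene set B
    return hypergeom.sf(links_observed - 1, links_total, links_into_B, links_from_A)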
Zhang, Jianhua; Li, Sunan; Wang, Rubin
2017-01-01
In this paper, we deal with the Mental Workload (MWL) classification problem based on measured physiological data. First we discussed the optimal depth (i.e., the number of hidden layers) and parameter optimization algorithms for the Convolutional Neural Networks (CNN). The base CNNs designed were tested according to five classification performance indices, namely Accuracy, Precision, F-measure, G-mean, and required training time. Then we developed an Ensemble Convolutional Neural Network (ECNN) to enhance the accuracy and robustness of the individual CNN model. For the ECNN design, three model aggregation approaches (weighted averaging, majority voting and stacking) were examined, and a resampling strategy was used to enhance the diversity of individual CNN models. The results of the MWL classification performance comparison indicated that the proposed ECNN framework can effectively improve MWL classification performance and features entirely automatic feature extraction and MWL classification, when compared with traditional machine learning methods.
NASA Astrophysics Data System (ADS)
Wechsung, Frank; Wechsung, Maximilian
2016-11-01
The STatistical Analogue Resampling Scheme (STARS) statistical approach was recently used to project changes of climate variables in Germany corresponding to a supposed degree of warming. We show by theoretical and empirical analysis that STARS simply transforms interannual gradients between warmer and cooler seasons into climate trends. According to STARS projections, summers in Germany will inevitably become drier and winters wetter under global warming. Due to the dominance of negative interannual correlations between precipitation and temperature during the year, STARS has a tendency to generate a net annual decrease in precipitation under mean German conditions. Furthermore, according to STARS, the annual level of global radiation would increase in Germany. STARS can still be used, e.g., for generating scenarios in vulnerability and uncertainty studies. However, it is not suitable as a climate downscaling tool to assess risks arising from changing climate at a spatial scale finer than that of a general circulation model (GCM).
Publications - GMC 366 | Alaska Division of Geological & Geophysical Surveys
DGGS GMC 366 Publication Details. Title: Makushin Geothermal Project ST-1R Core 2009 re-sampling and analysis: Analytical results for anomalous precious and base metals associated with geothermal systems
The Beginner's Guide to the Bootstrap Method of Resampling.
ERIC Educational Resources Information Center
Lane, Ginny G.
The bootstrap method of resampling can be useful in estimating the replicability of study results. The bootstrap procedure creates a mock population from a given sample of data from which multiple samples are then drawn. The method extends the usefulness of the jackknife procedure as it allows for computation of a given statistic across a maximal…
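In code, the procedure the guide describes is a short loop: resample the observed data with replacement, recompute the statistic, and read the spread of the recomputed values. A minimal percentile-interval sketch:

import numpy as np

def bootstrap_ci(sample, stat=np.mean, n_boot=2000, alpha=0.05, rng=None):
    # percentile bootstrap confidence interval for stat(sample)
    rng = np.random.default_rng(rng)
    sample = np.asarray(sample)
    boots = np.array([stat(rng.choice(sample, size=sample.size, replace=True))
                      for _ in range(n_boot)])
    return np.quantile(boots, [alpha / 2, 1 - alpha / 2])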
An Algorithm Framework for Isolating Anomalous Signals in Electromagnetic Data
NASA Astrophysics Data System (ADS)
Kappler, K. N.; Schneider, D.; Bleier, T.; MacLean, L. S.
2016-12-01
QuakeFinder and its international collaborators have installed and currently maintain an array of 165 three-axis induction magnetometer instrument sites in California, Peru, Taiwan, Greece, Chile and Sumatra. Based on research by Bleier et al. (2009), Fraser-Smith et al. (1990), and Freund (2007), the electromagnetic data from these instruments are being analyzed for pre-earthquake signatures. This analysis consists of both private research by QuakeFinder and institutional collaborators (PUCP in Peru, NCU in Taiwan, NOA in Greece, LASP at University of Colorado, Stanford, UCLA, NASA-ESI, NASA-AMES and USC-CSEP). QuakeFinder has developed an algorithm framework aimed at isolating anomalous signals (pulses) in the time series. Results are presented from an application of this framework to induction-coil magnetometer data. Our data-driven approach starts with sliding windows applied to uniformly resampled array data with a variety of lengths and overlaps. Data variance (a proxy for energy) is calculated on each window, and a short-term average/long-term average (STA/LTA) filter is applied to the variance time series. Pulse identification is done by flagging time intervals in the STA/LTA-filtered time series which exceed a threshold. Flagged time intervals are subsequently fed into a feature extraction program which computes statistical properties of the resampled data. These features are then filtered using a Principal Component Analysis (PCA)-based method to cluster similar pulses. We explore the extent to which this approach categorizes pulses with known sources (e.g., cars, lightning, etc.), so that the remaining pulses of unknown origin can be analyzed with respect to their relationship with seismicity. We seek a correlation between these daily pulse-counts (with known sources removed) and subsequent (days to weeks) seismic events greater than M5 within a 15 km radius. Thus we explore functions which map daily pulse-counts to a time series representing the likelihood of a seismic event occurring at some future time. These "pseudo-probabilities" can in turn be represented as Molchan diagrams. The Molchan curve provides an effective cost function for optimization and allows for a rigorous statistical assessment of the validity of pre-earthquake signals in the electromagnetic data.
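The STA/LTA pulse flagging described above reduces to a ratio of two running means of the windowed-variance series. A self-contained sketch (window lengths and the threshold of 4 are illustrative choices, not QuakeFinder's settings):

import numpy as np

def sta_lta(variance_series, n_sta, n_lta):
    # ratio of short-term to long-term trailing means; values >> 1 flag pulses
    x = np.asarray(variance_series, float)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    sta = (csum[n_sta:] - csum[:-n_sta]) / n_sta
    lta = (csum[n_lta:] - csum[:-n_lta]) / n_lta
    m = min(len(sta), len(lta))          # align both series at the record end
    return sta[-m:] / np.maximum(lta[-m:], 1e-12)

rng = np.random.default_rng(0)
series = rng.normal(size=5000) ** 2      # toy variance series
series[3000:3050] += 25.0                # injected pulse
ratio = sta_lta(series, n_sta=50, n_lta=1000)
pulses = np.flatnonzero(ratio > 4.0)     # indices offset by len(series) - len(ratio)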
A preclustering-based ensemble learning technique for acute appendicitis diagnoses.
Lee, Yen-Hsien; Hu, Paul Jen-Hwa; Cheng, Tsang-Hsiang; Huang, Te-Chia; Chuang, Wei-Yao
2013-06-01
Acute appendicitis is a common medical condition, whose effective, timely diagnosis can be difficult. A missed diagnosis not only puts the patient in danger but also requires additional resources for corrective treatments. An acute appendicitis diagnosis constitutes a classification problem, for which a further fundamental challenge pertains to the skewed outcome class distribution of instances in the training sample. A preclustering-based ensemble learning (PEL) technique aims to address the associated imbalanced sample learning problems and thereby support the timely, accurate diagnosis of acute appendicitis. The proposed PEL technique employs undersampling to reduce the number of majority-class instances in a training sample, uses preclustering to group similar majority-class instances into multiple groups, and selects from each group representative instances to create more balanced samples. The PEL technique thereby reduces potential information loss from random undersampling. It also takes advantage of ensemble learning to improve performance. We empirically evaluate this proposed technique with 574 clinical cases obtained from a comprehensive tertiary hospital in southern Taiwan, using several prevalent techniques and a salient scoring system as benchmarks. The comparative results show that PEL is more effective and less biased than any benchmarks. The proposed PEL technique seems more sensitive to identifying positive acute appendicitis than the commonly used Alvarado scoring system and exhibits higher specificity in identifying negative acute appendicitis. In addition, the sensitivity and specificity values of PEL appear higher than those of the investigated benchmarks that follow the resampling approach. Our analysis suggests PEL benefits from the more representative majority-class instances in the training sample. According to our overall evaluation results, PEL records the best overall performance, and its area under the curve measure reaches 0.619. The PEL technique is capable of addressing imbalanced sample learning associated with acute appendicitis diagnosis. Our evaluation results suggest PEL is less biased toward a positive or negative class than the investigated benchmark techniques. In addition, our results indicate the overall effectiveness of the proposed technique, compared with prevalent scoring systems or salient classification techniques that follow the resampling approach. Copyright © 2013 Elsevier B.V. All rights reserved.
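The preclustering step that distinguishes PEL from plain random undersampling can be sketched as: cluster the majority class and keep the instance nearest each centroid. A minimal Python sketch using k-means (an assumption for illustration; the paper's clustering choice may differ):

import numpy as np
from sklearn.cluster import KMeans

def precluster_undersample(X_majority, n_keep, seed=0):
    # keep one representative majority-class instance per cluster so the
    # reduced sample still covers the class's structure
    km = KMeans(n_clusters=n_keep, n_init=10, random_state=seed).fit(X_majority)
    keep = []
    for c in range(n_keep):
        members = np.flatnonzero(km.labels_ == c)
        d = np.linalg.norm(X_majority[members] - km.cluster_centers_[c], axis=1)
        keep.append(members[np.argmin(d)])  # member nearest the centroid
    return X_majority[np.array(keep)]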
Maximum likelihood resampling of noisy, spatially correlated data
NASA Astrophysics Data System (ADS)
Goff, J.; Jenkins, C.
2005-12-01
In any geologic application, noisy data are a source of consternation for researchers, inhibiting interpretability and marring images with unsightly and unrealistic artifacts. Filtering is the typical solution to dealing with noisy data. However, filtering commonly suffers from ad hoc (i.e., uncalibrated, ungoverned) application, which runs the risk of erasing high-variability components of the field in addition to the noise components. We present here an alternative to filtering: a newly developed methodology for correcting noise in data by finding the "best" value given the data value, its uncertainty, and the data values and uncertainties at proximal locations. The motivating rationale is that data points that are close to each other in space cannot differ by "too much", where how much is "too much" is governed by the field correlation properties. Data with large uncertainties will frequently violate this condition, and in such cases need to be corrected, or "resampled." The best solution for resampling is determined by the maximum of the likelihood function defined by the intersection of two probability density functions (pdfs): (1) the data pdf, with mean and variance determined by the data value and squared uncertainty, respectively, and (2) the geostatistical pdf, whose mean and variance are determined by the kriging algorithm applied to proximal data values. A Monte Carlo sampling of the data probability space eliminates non-uniqueness and weights the solution toward data values with lower uncertainties. A test with a synthetic data set sampled from a known field demonstrates quantitatively and qualitatively the improvement provided by the maximum likelihood resampling algorithm. The method is also applied to three marine geology/geophysics data examples: (1) three generations of bathymetric data on the New Jersey shelf with disparate data uncertainties; (2) mean grain size data from the Adriatic Sea, which is a combination of both analytic (low uncertainty) and word-based (higher uncertainty) sources; and (3) sidescan backscatter data from the Martha's Vineyard Coastal Observatory which are, as is typical for such data, affected by speckle noise.
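For the Gaussian case, the maximum of the likelihood defined by the product of the two pdfs has a closed form: an inverse-variance weighted combination of the data value and the kriged estimate. A sketch of that core step (the paper adds Monte Carlo sampling over the data probability space):

import numpy as np

def ml_resample(data_value, data_sigma, kriged_mean, kriged_sigma):
    # mode and spread of the product of N(data_value, data_sigma^2)
    # and N(kriged_mean, kriged_sigma^2)
    w1, w2 = 1.0 / data_sigma**2, 1.0 / kriged_sigma**2
    value = (w1 * data_value + w2 * kriged_mean) / (w1 + w2)
    return value, np.sqrt(1.0 / (w1 + w2))

Note how a data point with large uncertainty (small w1) is pulled strongly toward the geostatistical prediction, while a well-constrained point is left nearly unchanged.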
A New Approach for Subway Tunnel Deformation Monitoring: High-Resolution Terrestrial Laser Scanning
NASA Astrophysics Data System (ADS)
Li, J.; Wan, Y.; Gao, X.
2012-07-01
With improvements in the accuracy and efficiency of laser scanning technology, high-resolution terrestrial laser scanning (TLS) can acquire highly precise, densely distributed point clouds and can be applied to high-precision deformation monitoring of subway tunnels, high-speed railway bridges and other fields. In this paper, a new approach using a point-cloud segmentation method based on vectors of neighboring points and a surface-fitting method based on moving least squares was proposed and applied to subway tunnel deformation monitoring in Tianjin, combined with a new high-resolution terrestrial laser scanner (Riegl VZ-400). There were three main procedures. Firstly, a point cloud consisting of several scans was registered by a linearized iterative least-squares approach to improve the accuracy of registration, and several control points were acquired by total stations (TS) and then adjusted. Secondly, the registered point cloud was resampled and segmented based on vectors of neighboring points to select suitable points. Thirdly, the selected points were used to fit the subway tunnel surface with a moving least-squares algorithm. Then a series of parallel sections obtained from a temporal series of fitted tunnel surfaces were compared to analyze the deformation. Finally, the results of the approach in the z direction were compared with a fiber-optic displacement sensor approach, and the results in the x and y directions were compared with TS; the comparison showed that the accuracy errors in the x, y and z directions were about 1.5 mm, 2 mm and 1 mm, respectively. Therefore the new approach using high-resolution TLS can meet the demands of subway tunnel deformation monitoring.
Bennett, Iain; Paracha, Noman; Abrams, Keith; Ray, Joshua
2018-01-01
Rank Preserving Structural Failure Time models are one of the most commonly used statistical methods to adjust for treatment switching in oncology clinical trials. The method is often applied in a decision analytic model without appropriately accounting for additional uncertainty when determining the allocation of health care resources. The aim of the study is to describe novel approaches to adequately account for uncertainty when using a Rank Preserving Structural Failure Time model in a decision analytic model. Using two examples, we tested and compared the performance of the novel test-based method with the resampling bootstrap method and with the conventional approach of no adjustment. In the first example, we simulated life expectancy using a simple decision analytic model based on a hypothetical oncology trial with treatment switching. In the second example, we applied the adjustment method to published data when no individual patient data were available. Mean estimates of overall and incremental life expectancy were similar across methods. However, the bootstrapped and test-based estimates consistently produced greater estimates of uncertainty compared with the estimate without any adjustment applied. Similar results were observed when using the test-based approach on published data, showing that failing to adjust for uncertainty led to smaller confidence intervals. Both the bootstrapping and test-based approaches provide a solution to appropriately incorporate uncertainty, with the benefit that the latter can be implemented by researchers in the absence of individual patient data. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Chuan, Zun Liang; Ismail, Noriszura; Shinyie, Wendy Ling; Lit Ken, Tan; Fam, Soo-Fen; Senawi, Azlyna; Yusoff, Wan Nur Syahidah Wan
2018-04-01
Due to the limited availability of historical precipitation records, agglomerative hierarchical clustering algorithms are widely used to extrapolate information from gauged to ungauged precipitation catchments, yielding more reliable projections of extreme hydro-meteorological events such as extreme precipitation events. However, identifying the optimum number of homogeneous precipitation catchments based on the dendrogram produced by agglomerative hierarchical algorithms is very subjective. The main objective of this study is to propose an efficient regionalized algorithm to identify homogeneous precipitation catchments for non-stationary precipitation time series. The homogeneous precipitation catchments are identified using an average-linkage hierarchical clustering algorithm associated with multi-scale bootstrap resampling, with the uncentered correlation coefficient as the similarity measure. The regionalized homogeneous precipitation is consolidated using the K-sample Anderson-Darling non-parametric test. The analysis shows that the proposed regionalized algorithm performs better than the agglomerative hierarchical clustering algorithms proposed in previous studies.
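The clustering core of such a regionalization is compact: convert each catchment's series to a dissimilarity of one minus the uncentered correlation, then apply average linkage. A sketch (the multi-scale bootstrap used to assess cluster uncertainty is omitted here):

import numpy as np
from scipy.cluster.hierarchy import average, fcluster
from scipy.spatial.distance import squareform

def cluster_catchments(P, n_clusters):
    # P: (n_catchments, n_time) array of precipitation series
    u = P / np.linalg.norm(P, axis=1, keepdims=True)  # no mean removal: uncentered
    dist = np.clip(1.0 - u @ u.T, 0.0, None)          # 1 - uncentered correlation
    np.fill_diagonal(dist, 0.0)
    Z = average(squareform(dist, checks=False))       # average-linkage dendrogram
    return fcluster(Z, t=n_clusters, criterion='maxclust')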
Accelerated spike resampling for accurate multiple testing controls.
Harrison, Matthew T
2013-02-01
Controlling for multiple hypothesis tests using standard spike resampling techniques often requires prohibitive amounts of computation. Importance sampling techniques can be used to accelerate the computation. The general theory is presented, along with specific examples for testing differences across conditions using permutation tests and for testing pairwise synchrony and precise lagged-correlation between many simultaneously recorded spike trains using interval jitter.
Streiner, David L
2015-10-01
Testing many null hypotheses in a single study results in an increased probability of detecting a significant finding just by chance (the problem of multiplicity). Debates have raged over many years with regard to whether to correct for multiplicity and, if so, how it should be done. This article first discusses how multiple tests lead to an inflation of the α level, then explores the following different contexts in which multiplicity arises: testing for baseline differences in various types of studies, having >1 outcome variable, conducting statistical tests that produce >1 P value, taking multiple "peeks" at the data, and unplanned, post hoc analyses (i.e., "data dredging," "fishing expeditions," or "P-hacking"). It then discusses some of the methods that have been proposed for correcting for multiplicity, including single-step procedures (e.g., Bonferroni); multistep procedures, such as those of Holm, Hochberg, and Šidák; false discovery rate control; and resampling approaches. Note that these various approaches describe different aspects and are not necessarily mutually exclusive. For example, resampling methods could be used to control the false discovery rate or the family-wise error rate (as defined later in this article). However, the use of one of these approaches presupposes that we should correct for multiplicity, which is not universally accepted, and the article presents the arguments for and against such "correction." The final section brings together these threads and presents suggestions with regard to when it makes sense to apply the corrections and how to do so. © 2015 American Society for Nutrition.
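As one concrete example from the multistep family discussed here, Holm's step-down adjustment multiplies the i-th smallest P value by (m - i + 1) and enforces monotonicity; a minimal sketch:

import numpy as np

def holm_adjust(pvals):
    # Holm step-down adjusted P values, returned in the original order
    p = np.asarray(pvals, float)
    m = p.size
    order = np.argsort(p)
    scaled = (m - np.arange(m)) * p[order]               # (m - i + 1) * p_(i)
    adj_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adj = np.empty(m)
    adj[order] = adj_sorted
    return adj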
Shannon, Casey P; Chen, Virginia; Takhar, Mandeep; Hollander, Zsuzsanna; Balshaw, Robert; McManus, Bruce M; Tebbutt, Scott J; Sin, Don D; Ng, Raymond T
2016-11-14
Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules, from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how sensitive the outputs of these computational methods are to the input sample set, that is, their stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms. The stability of modules increased as sample size increased, and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permuted gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases. The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.
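The stability assessment can be pictured as follows: re-run module discovery on bootstrap resamples of the samples and record how well each original module is recovered. A schematic sketch (SABRE's actual similarity criterion may differ; `discover` is a hypothetical hook for any GNI algorithm):

import numpy as np

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def module_stability(module, discover, samples, n_boot=100, rng=None):
    # mean best-match Jaccard similarity of `module` across bootstrap reruns
    rng = np.random.default_rng(rng)
    n = len(samples)
    scores = []
    for _ in range(n_boot):
        resampled = [samples[i] for i in rng.integers(0, n, size=n)]
        modules = discover(resampled)  # list of gene sets from the resample
        scores.append(max((jaccard(module, m) for m in modules), default=0.0))
    return float(np.mean(scores))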
Pesticides in Wyoming Groundwater, 2008-10
Eddy-Miller, Cheryl A.; Bartos, Timothy T.; Taylor, Michelle L.
2013-01-01
Groundwater samples were collected from 296 wells during 1995-2006 as part of a baseline study of pesticides in Wyoming groundwater. In 2009, a previous report summarized the results of the baseline sampling and the statistical evaluation of the occurrence of pesticides in relation to selected natural and anthropogenic (human-related) characteristics. During 2008-10, the U.S. Geological Survey, in cooperation with the Wyoming Department of Agriculture, resampled a subset (52) of the 296 wells sampled during 1995-2006 baseline study in order to compare detected compounds and respective concentrations between the two sampling periods and to evaluate the detections of new compounds. The 52 wells were distributed similarly to sites used in the 1995-2006 baseline study with respect to geographic area and land use within the geographic area of interest. Because of the use of different types of reporting levels and variability in reporting-level values during both the 1995-2006 baseline study and the 2008-10 resampling study, analytical results received from the laboratory were recensored. Two levels of recensoring were used to compare pesticides—a compound-specific assessment level (CSAL) that differed by compound and a common assessment level (CAL) of 0.07 microgram per liter. The recensoring techniques and values used for both studies, with the exception of the pesticide 2,4-D methyl ester, were the same. Twenty-eight different pesticides were detected in samples from the 52 wells during the 2008-10 resampling study. Pesticide concentrations were compared with several U.S. Environmental Protection Agency drinking-water standards or health advisories for finished (treated) water established under the Safe Drinking Water Act. All detected pesticides were measured at concentrations smaller than U.S. Environmental Protection Agency drinking-water standards or health advisories where applicable (many pesticides did not have standards or advisories). One or more pesticides were detected at concentrations greater than the CAL in water from 16 of 52 wells sampled (about 31 percent) during the resampling study. Detected pesticides were classified into one of six types: herbicides, herbicide degradates, insecticides, insecticide degradates, fungicides, or fungicide degradates. At least 95 percent of detected pesticides were classified as herbicides or herbicide degradates. The number of different pesticides detected in samples from the 52 wells was similar between the 1995-2006 baseline study (30 different pesticides) and 2008-2010 resampling study (28 different pesticides). Thirteen pesticides were detected during both studies. The change in the number of pesticides detected (without regard to which pesticide was detected) in groundwater samples from each of the 52 wells was evaluated and the number of pesticides detected in groundwater did not change for most of the wells (32). Of those that did have a difference between the two studies, 17 wells had more pesticide detections in groundwater during the 1995-2006 baseline study, whereas only 3 wells had more detections during the 2008-2010 resampling study. The difference in pesticide concentrations in groundwater samples from each of the 52 wells was determined. Few changes in concentration between the 1995-2006 baseline study and the 2008-2010 resampling study were seen for most detected pesticides. Seven pesticides had a greater concentration detected in the groundwater from the same well during the baseline sampling compared to the resampling study. 
Concentrations of prometon, which was detected in 17 wells, were greater in the baseline study sample compared to the resampling study sample from the same well 100 percent of the time. The change in the number of pesticides detected (without regard to which pesticide was detected) in groundwater samples from each of the 52 wells with respect to land use and geographic area was calculated. All wells with land use classified as agricultural had the same or a smaller number of pesticides detected in the resampling study compared to the baseline study. All wells in the Bighorn Basin geographic area also had the same or a smaller number of pesticides detected in the resampling study compared to the baseline study.
Study on the Classification of GAOFEN-3 Polarimetric SAR Images Using Deep Neural Network
NASA Astrophysics Data System (ADS)
Zhang, J.; Zhang, J.; Zhao, Z.
2018-04-01
The polarimetric synthetic aperture radar (POLSAR) imaging principle dictates that image quality is affected by speckle noise, so the recognition accuracy of traditional image classification methods is reduced by this interference. Since their introduction, deep convolutional neural networks have reshaped traditional image processing and brought the field of computer vision to a new stage, thanks to a strong ability to learn deep features and an excellent ability to fit large datasets. Based on the basic characteristics of polarimetric SAR images, this paper studies surface cover classification using deep learning. We fused fully polarimetric SAR features at different scales into RGB images, iteratively trained the GoogLeNet convolutional neural network model on them, and then used the trained model to classify a validation dataset. First, referring to an optical image, we labeled the surface cover types of a GF-3 POLSAR image with 8 m resolution and then collected samples for each category. To meet the GoogLeNet model requirement of 256 × 256 pixel input images, and taking into account the limited resolution of the SAR data, the original image was resampled during pre-processing. POLSAR image slice samples at different scales, with sampling intervals of 2 m and 1 m, were trained separately and validated on the verification dataset. The training accuracy of the GoogLeNet model trained with the resampled 2 m polarimetric SAR images is 94.89 %, and that with the resampled 1 m images is 92.65 %.
NASA Astrophysics Data System (ADS)
Holoien, Thomas W.-S.; Marshall, Philip J.; Wechsler, Risa H.
2017-06-01
We describe two new open-source tools written in Python for performing extreme deconvolution Gaussian mixture modeling (XDGMM) and using a conditioned model to re-sample observed supernova and host galaxy populations. XDGMM is a new program that uses Gaussian mixtures to perform density estimation of noisy data using extreme deconvolution (XD) algorithms. Additionally, it has functionality not available in other XD tools. It allows the user to select between the AstroML and Bovy et al. fitting methods and is compatible with scikit-learn machine learning algorithms. Most crucially, it allows the user to condition a model based on the known values of a subset of parameters. This gives the user the ability to produce a tool that can predict unknown parameters based on a model that is conditioned on known values of other parameters. EmpiriciSN is an exemplary application of this functionality, which can be used to fit an XDGMM model to observed supernova/host data sets and predict likely supernova parameters using a model conditioned on observed host properties. It is primarily intended to simulate realistic supernovae for LSST data simulations based on empirical galaxy properties.
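The conditioning operation at the heart of this functionality has a closed form for Gaussian mixtures: each component is conditioned with the usual Gaussian formulas and its weight is rescaled by the likelihood of the known values. A standalone sketch of the mathematics (not the XDGMM package's code):

import numpy as np
from scipy.stats import multivariate_normal

def condition_gmm(weights, means, covs, idx_known, x_known):
    # returns the conditional mixture over the remaining dimensions
    d = means.shape[1]
    a = np.asarray(idx_known)
    b = np.setdiff1d(np.arange(d), a)
    new_w, new_mu, new_cov = [], [], []
    for w, mu, S in zip(weights, means, covs):
        Saa, Sab = S[np.ix_(a, a)], S[np.ix_(a, b)]
        Sba, Sbb = S[np.ix_(b, a)], S[np.ix_(b, b)]
        gain = Sba @ np.linalg.inv(Saa)
        new_mu.append(mu[b] + gain @ (x_known - mu[a]))   # conditional mean
        new_cov.append(Sbb - gain @ Sab)                  # conditional covariance
        new_w.append(w * multivariate_normal.pdf(x_known, mu[a], Saa))
    new_w = np.asarray(new_w)
    return new_w / new_w.sum(), np.asarray(new_mu), np.asarray(new_cov)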
Experimental study of digital image processing techniques for LANDSAT data
NASA Technical Reports Server (NTRS)
Rifman, S. S. (Principal Investigator); Allendoerfer, W. B.; Caron, R. H.; Pemberton, L. J.; Mckinnon, D. M.; Polanski, G.; Simon, K. W.
1976-01-01
The author has identified the following significant results. Results are reported for: (1) subscene registration, (2) full scene rectification and registration, (3) resampling techniques, and (4) ground control point (GCP) extraction. Subscenes (354 pixels x 234 lines) were registered to approximately 1/4 pixel accuracy and evaluated by change detection imagery for three cases: (1) bulk data registration, (2) precision correction of a reference subscene using GCP data, and (3) independently precision processed subscenes. Full scene rectification and registration results were evaluated by using a correlation technique to measure registration errors of 0.3 pixel rms throughout the full scene. Resampling evaluations of nearest neighbor and TRW cubic convolution processed data included change detection imagery and feature classification. Resampled data were also evaluated for an MSS scene containing specular solar reflections.
Resampling-Based Gap Analysis for Detecting Nodes with High Centrality on Large Social Network
2015-05-22
… the second one is a network extracted from a Japanese word-of-mouth communication site for cosmetics, "@cosme", consisting of 45,024 nodes …
Abuassba, Adnan O M; Zhang, Dezheng; Luo, Xiong; Shaheryar, Ahmad; Ali, Hazrat
2017-01-01
Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden-layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function that increases diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using a majority-vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets. PMID:28546808
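The construction generalizes beyond ELMs: train each ensemble member on a random resample with a randomly chosen base learner, then combine by majority vote. A sketch using scikit-learn classifiers as stand-ins for the paper's ELM variants (class labels assumed to be integer-coded):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def train_resampled_ensemble(X, y, n_members=9, frac=0.8, rng=None):
    # each member: a randomly chosen learner fit on a random subsample
    rng = np.random.default_rng(rng)
    bases = [lambda: LogisticRegression(max_iter=1000),
             lambda: SVC(),
             lambda: KNeighborsClassifier()]
    members = []
    for _ in range(n_members):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=True)
        members.append(bases[rng.integers(len(bases))]().fit(X[idx], y[idx]))
    return members

def majority_vote(members, X):
    votes = np.stack([m.predict(X) for m in members])  # (members, samples)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)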
TU-CD-BRA-01: A Novel 3D Registration Method for Multiparametric Radiological Images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Akhbardeh, A; Parekth, VS; Jacobs, MA
2015-06-15
Purpose: Multiparametric and multimodality radiological imaging methods, such as magnetic resonance imaging (MRI), computed tomography (CT), and positron emission tomography (PET), provide multiple types of tissue contrast and anatomical information for clinical diagnosis. However, these radiological modalities are acquired using very different technical parameters, e.g., field of view (FOV), matrix size, and scan planes, which can lead to challenges in registering the different data sets. Therefore, we developed a hybrid registration method based on 3D wavelet transformation and 3D interpolations that performs 3D resampling and rotation of the target radiological images without loss of information. Methods: T1-weighted, T2-weighted, diffusion-weighted imaging (DWI), dynamic contrast-enhanced (DCE) MRI and PET/CT were used in the registration algorithm, from breast and prostate data at 3T MRI and multimodality (PET/CT) cases. The hybrid registration scheme consists of several steps to reslice and match each modality using a combination of 3D wavelets, interpolations, and affine registration steps. First, orthogonal reslicing is performed to equalize FOV, matrix sizes and the number of slices using wavelet transformation. Second, angular resampling of the target data is performed to match the reference data. Finally, using the optimized angles from resampling, 3D registration is performed using a similarity transformation (scaling and translation) between the reference and the resliced target volume. After registration, the mean squared error (MSE) and Dice similarity (DS) between the reference and registered target volumes were calculated. Results: The 3D registration method registered synthetic and clinical data with significant improvement (p<0.05) in overlap between anatomical structures. After transforming and deforming the synthetic data, the MSE and Dice similarity were 0.12 and 0.99. The average improvement of the MSE was 62% in breast (0.27 to 0.10) and 63% in prostate (0.13 to 0.04; p<0.05). The Dice similarity improved by 8% in breast (0.91 to 0.99) and by 89% in prostate (0.01 to 0.90; p<0.05). Conclusion: Our 3D wavelet hybrid registration approach registered diverse breast and prostate data from different radiological images (MR/PET/CT) with high accuracy.
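The two evaluation metrics quoted in the results are worth stating precisely; a minimal sketch for intensity volumes and binary segmentation masks:

import numpy as np

def mse(reference, registered):
    # mean squared error between intensity volumes of equal shape
    return np.mean((np.asarray(reference, float) - np.asarray(registered, float)) ** 2)

def dice(mask_a, mask_b):
    # Dice similarity: 2|A intersect B| / (|A| + |B|) for binary masks
    a, b = np.asarray(mask_a, bool), np.asarray(mask_b, bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())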
NASA Astrophysics Data System (ADS)
Tweedie, C. E.; Ebert-May, D.; Hollister, R. D.; Johnson, D. R.; Lara, M. J.; Villarreal, S.; Spasojevic, M.; Webber, P.
2010-12-01
The International Polar Year-Back to the Future (IPY-BTF) is an endorsed International Polar Year project (IPY project #214). The overarching goal of this program is to determine how key structural and functional characteristics of high latitude/altitude terrestrial ecosystems have changed over the past 25 or more years and to assess whether such trajectories of change are likely to continue in the future. By rescuing data, revisiting and re-sampling historic research sites, and assessing environmental change over time, we aim to provide greater understanding of how tundra is changing and what the possible drivers of these changes are. Resampling of sites established by Patrick J. Webber between 1964 and 1975 in northern Baffin Island, northern Alaska and the Rocky Mountains forms a key contribution to the BTF project. Here we report on resampling efforts at each of these locations and initial results of a synthesis effort that finds similarities and differences in change between sites. Results suggest that although shifts in plant community composition are detectable at each location, the magnitude and direction of change differ among locations. Vegetation shifts along soil moisture gradients are apparent at most of the sites resampled. Interestingly, however, wet communities seem to have changed more than dry communities at the Arctic locations, while plant communities at the alpine site appear to be becoming more distinct regardless of soil moisture status. Ecosystem function studies performed in conjunction with plant community change suggest that there has been an increase in plant productivity at most sites resampled, especially in wet and mesic land cover types.
Detecting black bear source–sink dynamics using individual-based genetic graphs
Draheim, Hope M.; Moore, Jennifer A.; Etter, Dwayne; Winterstein, Scott R.; Scribner, Kim T.
2016-01-01
Source–sink dynamics affects population connectivity, spatial genetic structure and population viability for many species. We introduce a novel approach that uses individual-based genetic graphs to identify source–sink areas within a continuously distributed population of black bears (Ursus americanus) in the northern lower peninsula (NLP) of Michigan, USA. Black bear harvest samples (n = 569, from 2002, 2006 and 2010) were genotyped at 12 microsatellite loci and locations were compared across years to identify areas of consistent occupancy over time. We compared graph metrics estimated for a genetic model with metrics from 10 ecological models to identify ecological factors that were associated with sources and sinks. We identified 62 source nodes, 16 of which represent important source areas (net flux > 0.7) and 79 sink nodes. Source strength was significantly correlated with bear local harvest density (a proxy for bear density) and habitat suitability. Additionally, resampling simulations showed our approach is robust to potential sampling bias from uneven sample dispersion. Findings demonstrate black bears in the NLP exhibit asymmetric gene flow, and individual-based genetic graphs can characterize source–sink dynamics in continuously distributed species in the absence of discrete habitat patches. Our findings warrant consideration of undetected source–sink dynamics and their implications on harvest management of game species. PMID:27440668
Fourier Descriptor Analysis and Unification of Voice Range Profile Contours: Method and Applications
ERIC Educational Resources Information Center
Pabon, Peter; Ternstrom, Sten; Lamarche, Anick
2011-01-01
Purpose: To describe a method for unified description, statistical modeling, and comparison of voice range profile (VRP) contours, even from diverse sources. Method: A morphologic modeling technique, which is based on Fourier descriptors (FDs), is applied to the VRP contour. The technique, which essentially involves resampling of the curve of the…
Brandon M. Collins; Richard G. Everett; Scott L. Stephens
2011-01-01
We re-sampled areas included in an unbiased 1911 timber inventory conducted by the U.S. Forest Service over a 4000 ha study area. Over half of the re-sampled area burned in relatively recent management- and lightning-ignited fires. This allowed for comparisons of both areas that have experienced recent fire and areas with no recent fire, to the same areas historically...
Quasi-Epipolar Resampling of High Resolution Satellite Stereo Imagery for Semi Global Matching
NASA Astrophysics Data System (ADS)
Tatar, N.; Saadatseresht, M.; Arefi, H.; Hadavand, A.
2015-12-01
Semi-global matching is a well-known stereo matching algorithm in the photogrammetric and computer vision communities. Epipolar images are assumed as input to this algorithm. The epipolar geometry of linear array scanners is not a straight line, as it is in the case of frame cameras. Traditional epipolar resampling algorithms demand rational polynomial coefficients (RPCs), a physical sensor model or ground control points. In this paper we propose a new epipolar resampling method that works without the need for this information. In the proposed method, automatic feature extraction algorithms are employed to generate corresponding features for registering stereo pairs. The original images are also divided into small tiles. In this way, by omitting the need for extra information, the speed of the matching algorithm is increased and the need for large temporary memory is decreased. Our experiments on a GeoEye-1 stereo pair captured over Qom city in Iran demonstrate that the epipolar images are generated with sub-pixel accuracy.
Janssen, Steve M J; Chessa, Antonio G; Murre, Jaap M J
2007-10-01
The reminiscence bump is the effect that people recall more personal events from early adulthood than from childhood or later adulthood. The bump has been examined extensively. However, the question of whether the bump is caused by differential encoding or re-sampling is still unanswered. To examine this issue, participants were asked to name their three favourite books, movies, and records. Furthermore, they were asked when they first encountered them. We compared the temporal distributions and found that they all showed recency effects and reminiscence bumps. The distribution of favourite books had the largest recency effect and the distribution of favourite records had the largest reminiscence bump. We can explain these results by the difference in rehearsal. Books are read two or three times, movies are watched more frequently, whereas records are listened to numerous times. The results suggest that differential encoding initially causes the reminiscence bump and that re-sampling increases the bump further.
Sadatsafavi, Mohsen; Marra, Carlo; Aaron, Shawn; Bryan, Stirling
2014-06-03
Cost-effectiveness analyses (CEAs) that use patient-specific data from a randomized controlled trial (RCT) are popular, yet such CEAs are criticized because they neglect to incorporate evidence external to the trial. A popular method for quantifying uncertainty in a RCT-based CEA is the bootstrap. The objective of the present study was to further expand the bootstrap method of RCT-based CEA for the incorporation of external evidence. We utilize the Bayesian interpretation of the bootstrap and derive the distribution for the cost and effectiveness outcomes after observing the current RCT data and the external evidence. We propose simple modifications of the bootstrap for sampling from such posterior distributions. In a proof-of-concept case study, we use data from a clinical trial and incorporate external evidence on the effect size of treatments to illustrate the method in action. Compared to the parametric models of evidence synthesis, the proposed approach requires fewer distributional assumptions, does not require explicit modeling of the relation between external evidence and outcomes of interest, and is generally easier to implement. A drawback of this approach is potential computational inefficiency compared to the parametric Bayesian methods. The bootstrap method of RCT-based CEA can be extended to incorporate external evidence, while preserving its appealing features such as no requirement for parametric modeling of cost and effectiveness outcomes. PMID:24888356
Spatial Point Pattern Analysis of Neurons Using Ripley's K-Function in 3D
Jafari-Mamaghani, Mehrdad; Andersson, Mikael; Krieger, Patrik
2010-01-01
The aim of this paper is to apply a non-parametric statistical tool, Ripley's K-function, to analyze the 3-dimensional distribution of pyramidal neurons. Ripley's K-function is a widely used tool in spatial point pattern analysis, and several approaches exist for computing and analyzing it in 2D domains. Drawing consistent inferences on the underlying 3D point pattern distributions in various applications is of great importance, as the acquisition of 3D biological data now poses less of a challenge due to technological progress. To date, most applications of Ripley's K-function in 3D domains do not address edge correction, which is discussed thoroughly in this paper. The main goal is to extend the theoretical and practical utilization of Ripley's K-function and corresponding tests based on bootstrap resampling from 2D to 3D domains. PMID:20577588
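For orientation, a naive 3D estimate of Ripley's K on synthetic points might look like the sketch below; edge correction, the paper's central concern, is deliberately left out:

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
pts = rng.uniform(0, 100.0, size=(200, 3))    # hypothetical neuron positions
volume = 100.0 ** 3

def ripley_k_3d(points, radii, volume):
    """Naive 3D Ripley's K estimate; no edge correction applied."""
    n = len(points)
    d = pdist(points)                          # all pairwise distances
    lam = n / volume                           # intensity (points per unit volume)
    # Each unordered pair counts twice in the sum over ordered pairs.
    return np.array([2 * np.sum(d <= r) / (lam * n) for r in radii])

radii = np.linspace(1, 25, 25)
k_obs = ripley_k_3d(pts, radii, volume)
k_csr = 4 / 3 * np.pi * radii ** 3             # expectation under complete randomness
# ~0 at small radii; the growing negative drift at larger radii is the
# uncorrected edge effect the paper addresses.
print(np.round(k_obs - k_csr, 1))
```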
NASA Astrophysics Data System (ADS)
Zhang, Shengjun; Sandwell, David T.; Jin, Taoyong; Li, Dawei
2017-02-01
The accuracy and resolution of marine gravity fields derived from satellite altimetry mainly depend on the range precision and the density of the spatial data distribution. This paper aims at modeling a regional marine gravity field with improved accuracy and higher resolution (1′ × 1′) over the Southeastern China Seas using additional data from CryoSat-2 as well as new data from AltiKa. Three approaches are used to enhance the precision level of satellite-derived gravity anomalies. First, we evaluate a suite of published retracking algorithms and find the two-step retracker optimal for open-ocean waveforms. Second, we evaluate the filtering and resampling procedure used to reduce the full 20 or 40 Hz data to a lower rate with lower noise. We adopt a uniform low-pass filter for all altimeter missions, resample at 5 Hz, and then perform a second editing based on sea surface slope estimates from previous models. Third, we select the WHU12 model to update the corrections provided in the geophysical data record. We finally calculate the 1′ × 1′ marine gravity field model using the EGM2008 model as the reference field during the remove/restore procedure. The root mean squares of the discrepancies between the new result and DTU10, DTU13, V23.1, and EGM2008 are within the range of 1.8-3.9 mGal, while verification against shipboard gravity data shows that the new result reaches a level of accuracy comparable to DTU13 and slightly superior to V23.1, DTU10, and EGM2008. Moreover, the new result is about 2 mGal more accurate over open seas than over shallow coastal areas.
Bradu, Adrian; Kapinchev, Konstantin; Barnes, Frederick; Podoleanu, Adrian
2015-07-01
In a previous report, we demonstrated master-slave optical coherence tomography (MS-OCT), an OCT method that does not need resampling of data and can be used to deliver en face images from several depths simultaneously. In a separate report, we also demonstrated MS-OCT's capability of producing cross-sectional images of a quality similar to those provided by the traditional Fourier domain (FD) OCT technique, but at a much slower rate. Here, we demonstrate that by taking advantage of the parallel processing capabilities offered by the MS-OCT method, cross-sectional OCT images of the human retina can be produced in real time. We analyze the conditions that ensure true real-time B-scan imaging and demonstrate in vivo real-time images from the human fovea and optic nerve, with resolution and sensitivity comparable to those produced using the traditional FD-based method, but without the need for data resampling.
NASA Astrophysics Data System (ADS)
Coupon, Jean; Leauthaud, Alexie; Kilbinger, Martin; Medezinski, Elinor
2017-07-01
SWOT (Super W Of Theta) computes two-point statistics for very large data sets, based on “divide and conquer” algorithms: data storage in binary trees, approximation at large scale, parallelization (Open MPI), and bootstrap and jackknife resampling methods “on the fly”. It currently supports projected and 3D galaxy auto- and cross-correlations, galaxy-galaxy lensing, and weighted histograms.
Confidence Limits for the Indirect Effect: Distribution of the Product and Resampling Methods
ERIC Educational Resources Information Center
MacKinnon, David P.; Lockwood, Chondra M.; Williams, Jason
2004-01-01
The most commonly used method to test an indirect effect is to divide the estimate of the indirect effect by its standard error and compare the resulting z statistic with a critical value from the standard normal distribution. Confidence limits for the indirect effect are also typically based on critical values from the standard normal…
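The resampling alternative referred to in the title can be sketched as a percentile bootstrap of the product of regression coefficients; the data and effect sizes below are simulated for illustration only:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)                   # hypothetical predictor
m = 0.4 * x + rng.normal(size=n)         # mediator
y = 0.3 * m + 0.1 * x + rng.normal(size=n)

def indirect(x, m, y):
    """Indirect effect a*b: a from M ~ X, b is the slope on M in Y ~ M + X."""
    a = np.polyfit(x, m, 1)[0]
    X = np.column_stack([np.ones_like(x), m, x])
    b = np.linalg.lstsq(X, y, rcond=None)[0][1]
    return a * b

boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)           # resample cases with replacement
    boot.append(indirect(x[idx], m[idx], y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect = {indirect(x, m, y):.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

Unlike the z-statistic approach, the resulting interval is allowed to be asymmetric, which matters because the product of two normal variables is not itself normal.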
Air-to-Air Missile Vector Scoring
2012-03-22
SIR: sampling-importance resampling; EPF: extended particle filter; UPF: unscented particle filter. …an extended particle filter (EPF) or an unscented particle filter (UPF) [20]. The basic concept is to apply a bank of N EKF or UKF filters to move particles from… Van der Merwe, Doucet, Freitas and Wan provide a comprehensive discussion on the EPF and UPF, including algorithms for implementation [20].
NASA Astrophysics Data System (ADS)
Angrisano, Antonio; Maratea, Antonio; Gaglione, Salvatore
2018-01-01
In the absence of obstacles, a GPS device is generally able to provide continuous and accurate position estimates, while in urban scenarios buildings can generate multipath and echo-only phenomena that severely affect the continuity and accuracy of the estimates. Receiver autonomous integrity monitoring (RAIM) techniques are able to reduce the negative consequences of large blunders in urban scenarios, but require both good redundancy and low contamination to be effective. In this paper a resampling strategy based on the bootstrap is proposed as an alternative to RAIM, in order to estimate position accurately in cases of low redundancy and multiple blunders: starting with the pseudorange measurement model, at each epoch the available measurements are bootstrapped (that is, randomly sampled with replacement) and the generated a posteriori empirical distribution is exploited to derive the final position. Compared to the standard bootstrap, the sampling probabilities here are not uniform but vary according to an indicator of measurement quality. The proposed method has been compared with two different RAIM techniques on a data set collected in critical conditions, resulting in a clear improvement on all considered figures of merit.
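A toy illustration of the quality-weighted resampling step, with a 1-D stand-in for the positioning problem (the full pseudorange geometry is omitted, and all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 1-D stand-in: each "measurement" is the true value plus
# noise, with two large blunders mimicking multipath.
truth = 10.0
meas = truth + rng.normal(0, 1.0, 8)
meas[:2] += np.array([15.0, -12.0])             # blunders

# Quality indicator (e.g., from C/N0 or elevation); blunders rated low.
quality = np.array([0.2, 0.2, 1, 1, 1, 1, 1, 1], float)
p = quality / quality.sum()                      # resampling probabilities

estimates = []
for _ in range(2000):
    idx = rng.choice(len(meas), size=len(meas), replace=True, p=p)
    estimates.append(meas[idx].mean())           # per-replicate "position fix"

# Final estimate from the a posteriori empirical distribution.
print("plain mean :", round(meas.mean(), 2))
print("bootstrap  :", round(float(np.median(estimates)), 2))
```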
NASA Astrophysics Data System (ADS)
Wicaksono, Pramaditya; Salivian Wisnu Kumara, Ignatius; Kamal, Muhammad; Afif Fauzan, Muhammad; Zhafarina, Zhafirah; Agus Nurswantoro, Dwi; Noviaris Yogyantoro, Rifka
2017-12-01
Although spectrally different, seagrass species may not be mappable from multispectral remote sensing images due to the limitation of their spectral resolution. Therefore, it is important to quantitatively assess the possibility of mapping seagrass species using multispectral images by resampling seagrass species spectra to multispectral bands. Seagrass species spectra were measured on harvested seagrass leaves. The spectral resolutions of the multispectral images used in this research were adopted from WorldView-2, Quickbird, Sentinel-2A, ASTER VNIR, and Landsat 8 OLI. These images are widely available and can serve as a representative baseline for previous or future remote sensing images. The seagrass species considered in this research are Enhalus acoroides (Ea), Thalassodendron ciliatum (Tc), Thalassia hemprichii (Th), Cymodocea rotundata (Cr), Cymodocea serrulata (Cs), Halodule uninervis (Hu), Halodule pinifolia (Hp), Syringodium isoetifolium (Si), Halophila ovalis (Ho), and Halophila minor (Hm). The multispectral resampling analysis indicates that the resampled spectra exhibit a shape and pattern similar to the original spectra but less precise, losing the unique absorption features of the seagrass species. Relying on spectral bands alone, multispectral images are not effective in mapping these seagrass species individually, as shown by the poor and inconsistent results of the Spectral Angle Mapper (SAM) classification technique when classifying seagrass species using species spectra as pure endmembers. Only Sentinel-2A produced an acceptable classification result using SAM.
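The band-resampling step can be sketched as averaging a field spectrum inside each sensor band; the spectrum shape and boxcar band passes below are assumptions (real sensors have smooth relative spectral response curves):

```python
import numpy as np

# Hypothetical leaf spectrum: reflectance at 1 nm steps, 400-900 nm, with
# a crude green peak and red-edge shape.
wl = np.arange(400, 901)
refl = (0.05 + 0.04 * np.exp(-((wl - 550) ** 2) / 800)
             + 0.35 / (1 + np.exp(-(wl - 710) / 12)))

# Assumed boxcar band passes (nm), loosely in the style of a 4-band sensor.
bands = {"blue": (450, 510), "green": (530, 590),
         "red": (640, 670), "nir": (850, 880)}

def resample_to_bands(wl, refl, bands):
    """Average the spectrum inside each band (uniform response assumed)."""
    out = {}
    for name, (lo, hi) in bands.items():
        mask = (wl >= lo) & (wl <= hi)
        out[name] = round(float(refl[mask].mean()), 4)
    return out

# Narrow absorption features are averaged away, which is exactly why
# spectrally similar species become hard to separate after resampling.
print(resample_to_bands(wl, refl, bands))
```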
NASA Astrophysics Data System (ADS)
Wani, Omar; Beckers, Joost V. L.; Weerts, Albrecht H.; Solomatine, Dimitri P.
2017-08-01
A non-parametric method is applied to quantify residual uncertainty in hydrologic streamflow forecasting. This method acts as a post-processor on deterministic model forecasts and generates a residual uncertainty distribution. Based on instance-based learning, it uses a k-nearest-neighbour search for similar historical hydrometeorological conditions to determine uncertainty intervals from a set of historical errors, i.e. discrepancies between past forecasts and observations. The performance of this method is assessed using test cases of hydrologic forecasting in two UK rivers: the Severn and the Brue. Retrospective forecasts were made and their uncertainties estimated using kNN resampling and two alternative uncertainty estimators: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Results show that kNN uncertainty estimation produces accurate and narrow uncertainty intervals with good probability coverage. Analysis also shows that the performance of this technique depends on the choice of search space. Nevertheless, the accuracy and reliability of uncertainty intervals generated using kNN resampling are at least comparable to those produced by QR and UNEEC. It is concluded that kNN uncertainty estimation is an interesting alternative to other post-processors, like QR and UNEEC, for estimating forecast uncertainty. Apart from its concept being simple and well understood, an advantage of this method is that it is relatively easy to implement.
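A compact sketch of the kNN post-processing idea, with hypothetical predictors and historical errors (the value of k, the quantiles, and the Euclidean distance are illustrative choices, not the paper's exact settings):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical history: a predictor vector per past forecast (e.g., forecast
# flow and rainfall sum) and the error that forecast made.
X_hist = rng.normal(size=(1000, 2))
err_hist = 0.5 * X_hist[:, 0] + rng.normal(0, 0.3, 1000)

def knn_interval(x_now, X_hist, err_hist, k=50, q=(5, 95)):
    """Uncertainty interval from the errors of the k most similar past cases."""
    d = np.linalg.norm(X_hist - x_now, axis=1)
    nearest = np.argsort(d)[:k]
    return np.percentile(err_hist[nearest], q)

x_now = np.array([1.0, -0.2])      # today's hydrometeorological condition
forecast = 12.0                    # today's deterministic forecast
lo, hi = knn_interval(x_now, X_hist, err_hist)
print(f"forecast band: [{forecast + lo:.2f}, {forecast + hi:.2f}]")
```

The "choice of search space" issue the abstract mentions corresponds here to which columns go into `X_hist` and how they are scaled.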
Performing label-fusion-based segmentation using multiple automatically generated templates.
Chakravarty, M Mallar; Steadman, Patrick; van Eede, Matthijs C; Calcott, Rebecca D; Gu, Victoria; Shaw, Philip; Raznahan, Armin; Collins, D Louis; Lerch, Jason P
2013-10-01
Classically, model-based segmentation procedures match magnetic resonance imaging (MRI) volumes to an expertly labeled atlas using nonlinear registration. The accuracy of these techniques is limited by atlas biases, misregistration, and resampling error. Multi-atlas-based approaches are used as a remedy and involve matching each subject to a number of manually labeled templates. This approach yields numerous independent segmentations that are fused using a voxel-by-voxel label-voting procedure. In this article, we demonstrate how the multi-atlas approach can be extended to work with input atlases that are unique and extremely time consuming to construct, by generating a library of multiple automatically generated templates of different brains (MAGeT Brain). We demonstrate the efficacy of our method for the mouse and human using two different nonlinear registration algorithms (ANIMAL and ANTs). The input atlases consist of a high-resolution mouse brain atlas and an atlas of the human basal ganglia and thalamus derived from serial histological data. MAGeT Brain segmentation improves the identification of the mouse anterior commissure (mean Dice kappa value κ = 0.801), but may be encountering a ceiling effect for hippocampal segmentations. Applying MAGeT Brain to human subcortical structures improves segmentation accuracy for all structures compared to regular model-based techniques (κ = 0.845, 0.752, and 0.861 for the striatum, globus pallidus, and thalamus, respectively). Experiments performed with three manually derived input templates suggest that MAGeT Brain can approach or exceed the accuracy of multi-atlas label-fusion segmentation (κ = 0.894, 0.815, and 0.895 for the striatum, globus pallidus, and thalamus, respectively). Copyright © 2012 Wiley Periodicals, Inc.
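The voxel-by-voxel label-voting step that fuses the candidate segmentations can be sketched as a majority vote; the label stack below is random and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical stack: 9 candidate segmentations of the same 4x4x4 volume,
# with labels 0 (background), 1, 2, as produced by registering many templates.
cands = rng.integers(0, 3, size=(9, 4, 4, 4))

def vote_fusion(cands, n_labels=3):
    """Voxel-wise majority vote across candidate labelings."""
    counts = np.stack([(cands == lab).sum(axis=0) for lab in range(n_labels)])
    return counts.argmax(axis=0)     # most frequent label wins per voxel

fused = vote_fusion(cands)
print(fused.shape, np.unique(fused))
```

MAGeT Brain's contribution is upstream of this step: it multiplies a small number of expert atlases into a large template library so that the vote has many independent inputs.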
Zhao, Liping; Zhang, Zefeng; Kolm, Paul; Jasper, Susan; Lewis, Cheryl; Klein, Allan; Weintraub, William
2008-02-01
The ACUTE II study demonstrated that transesophageal echocardiographically guided cardioversion with enoxaparin in patients with atrial fibrillation was associated with a shorter initial hospital stay, more normal sinus rhythm at 5 weeks, and no significant differences in stroke, bleeding, or death compared with unfractionated heparin (UFH). The present study evaluated resource use and costs in the enoxaparin (n=76) and UFH (n=79) groups during 5-week follow-up. Resources included initial and subsequent hospitalizations, study drugs, outpatient services, and emergency room visits. Two approaches were employed for hospitalization costing: the first was based on the UB-92 formulation of the hospital bill and diagnosis-related groups; the second was based on UB-92 and imputation using multivariable linear regression. Costs for outpatient and emergency room visits were determined from the Medicare fee schedule. Sensitivity analysis was performed to assess the robustness of the results, and a bootstrap resampling approach was used to obtain confidence intervals (CIs) for the cost differences. Costs of initial and subsequent hospitalizations, outpatient procedures, and emergency room visits were lower in the enoxaparin group. Average total costs remained significantly lower for the enoxaparin group under both costing approaches ($5,800 vs $8,167, difference $2,367, 95% CI 855 to 4,388, for the first approach; $7,942 vs $10,076, difference $2,134, 95% CI 437 to 4,207, for the second approach). Sensitivity analysis showed that the cost differences between strategies are robust to variation in drug costs. In conclusion, the use of enoxaparin as a bridging therapy is a cost-saving strategy (similar clinical outcomes and lower costs) for atrial fibrillation.
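A sketch of the bootstrap percentile interval for a between-group cost difference, on invented lognormal cost data with the same sample sizes as the study arms:

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical 5-week total costs per patient (skewed, as costs usually are).
cost_enox = rng.lognormal(8.6, 0.5, 76)     # n = 76
cost_ufh = rng.lognormal(9.0, 0.5, 79)      # n = 79

diffs = []
for _ in range(10000):
    a = rng.choice(cost_enox, size=len(cost_enox), replace=True)
    b = rng.choice(cost_ufh, size=len(cost_ufh), replace=True)
    diffs.append(a.mean() - b.mean())        # resampled mean difference
lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"mean difference {np.mean(diffs):.0f}, 95% CI ({lo:.0f}, {hi:.0f})")
```

The percentile bootstrap is a natural fit here because cost data are strongly right-skewed, so normal-theory intervals on the difference of means can be unreliable at these sample sizes.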
NASA Astrophysics Data System (ADS)
van der Heijden, Sven; Callau Poduje, Ana; Müller, Hannes; Shehu, Bora; Haberlandt, Uwe; Lorenz, Manuel; Wagner, Sven; Kunstmann, Harald; Müller, Thomas; Mosthaf, Tobias; Bárdossy, András
2015-04-01
For the design and operation of urban drainage systems with numerical simulation models, long, continuous precipitation time series with high temporal resolution are necessary. Suitable observed time series are rare. As a result, design concepts often rely on uncertain or unsuitable precipitation data, which renders them uneconomic or unsustainable. An expedient alternative to observed data is the use of long, synthetic rainfall time series as input for the simulation models. Within the project SYNOPSE, several different methods to generate synthetic precipitation data for urban drainage modelling are advanced, tested, and compared. The presented study compares four different precipitation modelling approaches regarding their ability to reproduce rainfall and runoff characteristics: a parametric stochastic model (alternating renewal approach), a non-parametric stochastic model (resampling approach), a downscaling approach from a regional climate model, and a disaggregation approach based on daily precipitation measurements. All four models produce long precipitation time series with a temporal resolution of five minutes. The synthetic time series are first compared to observed rainfall reference time series. Comparison criteria include event-based statistics such as mean dry spell and wet spell duration, wet spell amount and intensity, long-term means of precipitation sum and number of events, and extreme value distributions for different durations. The series are then compared with respect to simulated discharge characteristics using an urban hydrological model on a fictitious sewage network. First results show that all rainfall models are suitable in principle, but with different strengths and weaknesses regarding the different rainfall and runoff characteristics considered.
Incremental terrain processing for large digital elevation models
NASA Astrophysics Data System (ADS)
Ye, Z.
2012-12-01
Zichuan Ye, Dean Djokic, Lori Armstrong (Esri, Redlands, CA, USA). Efficient analyses of large digital elevation models (DEM) require generation of additional DEM artifacts such as flow direction, flow accumulation and other DEM derivatives. When the DEMs to analyze have a large number of grid cells (usually > 1,000,000,000), the generation of these DEM derivatives is either impractical (it takes too long) or impossible (software is incapable of processing such a large number of cells). Different strategies and algorithms can be put in place to alleviate this situation. This paper describes an approach where the overall DEM is partitioned into smaller processing units that can be efficiently processed. The processed DEM derivatives for each partition can then be either mosaicked back into a single large entity or managed at the partition level. For dendritic terrain morphologies, the way in which partitions are derived and the order in which they are processed depend on the river and catchment patterns. These patterns are not available until the flow pattern of the whole region is created, which in turn cannot be established upfront due to the size issues. This paper describes a procedure that solves this problem: (1) resample the original large DEM grid so that the total number of cells is reduced to a level for which the drainage pattern can be established; (2) run standard terrain preprocessing operations on the resampled DEM to generate the river and catchment system; (3) define the processing units and their processing order based on the river and catchment system created in step (2); (4) based on the processing order, apply the analysis (e.g., the flow accumulation operation) to each of the processing units at full DEM resolution; (5) as each processing unit is processed, compare the resulting drainage pattern with the drainage pattern established at the coarser scale and adjust the drainage boundaries and rivers if necessary.
NASA Astrophysics Data System (ADS)
Moreno, Claudia E.; Guevara, Roger; Sánchez-Rojas, Gerardo; Téllez, Dianeis; Verdú, José R.
2008-01-01
Environmental assessment at the community level in highly diverse ecosystems is limited by taxonomic constraints and by statistical methods requiring true replicates. Our objective was to show how diverse systems can be studied at the community level using higher taxa as biodiversity surrogates and re-sampling methods to allow comparisons. To illustrate this we compared the abundance, richness, evenness and diversity of the litter fauna in a pine-oak forest in central Mexico among seasons, sites and collecting methods. We also assessed changes in the abundance of trophic guilds and evaluated the relationships between community parameters and litter attributes. With the direct search method we observed differences in the rate of taxa accumulation between sites. Bootstrap analysis showed that abundance varied significantly between seasons and sampling methods, but not between sites. In contrast, diversity and evenness were significantly higher at the managed than at the non-managed site. Tree regression models show that abundance varied mainly between seasons, whereas taxa richness was affected by litter attributes (composition and moisture content). The abundance of trophic guilds varied among methods and seasons, but overall we found that parasitoids, predators and detritivores decreased under management. Therefore, although our results suggest that management has positive effects on the richness and diversity of litter fauna, the analysis of trophic guilds revealed a contrasting story. Our results indicate that functional groups and re-sampling methods may be used as tools for describing community patterns in highly diverse systems. Higher-taxa surrogacy can also serve as a preliminary approach when specimens cannot be identified to a low taxonomic level in a reasonable period of time or with limited financial resources, though further studies are needed to test whether these results are system-specific or general with regard to land management.
Epistemic uncertainty in the location and magnitude of earthquakes in Italy from Macroseismic data
Bakun, W.H.; Gomez, Capera A.; Stucchi, M.
2011-01-01
Three independent techniques (Bakun and Wentworth, 1997; Boxer from Gasperini et al., 1999; and Macroseismic Estimation of Earthquake Parameters [MEEP; see Data and Resources section, deliverable D3] from R.M.W. Musson and M.J. Jimenez) have been proposed for estimating an earthquake location and magnitude from intensity data alone. The locations and magnitudes obtained for a given set of intensity data are almost always different, and no one technique is consistently best at matching instrumental locations and magnitudes of recent well-recorded earthquakes in Italy. Rather than attempting to select one of the three solutions as best, we use all three techniques to estimate the location and the magnitude and the epistemic uncertainties among them. The estimates are calculated using bootstrap resampled data sets with Monte Carlo sampling of a decision tree. The decision-tree branch weights are based on goodness-of-fit measures of location and magnitude for recent earthquakes. The location estimates are based on the spatial distribution of locations calculated from the bootstrap resampled data. The preferred source location is the locus of the maximum bootstrap location spatial density. The location uncertainty is obtained from contours of the bootstrap spatial density: 68% of the bootstrap locations are within the 68% confidence region, and so on. For large earthquakes, our preferred location is not associated with the epicenter but with a location on the extended rupture surface. For small earthquakes, the epicenters are generally consistent with the location uncertainties inferred from the intensity data if an epicenter inaccuracy of 2-3 km is allowed. The preferred magnitude is the median of the distribution of bootstrap magnitudes. As with location uncertainties, the uncertainties in magnitude are obtained from the distribution of bootstrap magnitudes: the bounds of the 68% uncertainty range enclose 68% of the bootstrap magnitudes, and so on. The instrumental magnitudes for large and small earthquakes are generally consistent with the confidence intervals inferred from the distribution of bootstrap resampled magnitudes.
Correcting Evaluation Bias of Relational Classifiers with Network Cross Validation
2010-01-01
classification algorithms: simple random resampling (RRS), equal-instance random resampling (ERS), and network cross-validation (NCV). The first two… an NCV procedure that eliminates overlap between test sets altogether. The procedure samples k disjoint test sets that will be used for evaluation… (propLabeled * S) nodes from trainPool; inferenceSet = network − trainSet; F = F ∪ <trainSet, testSet, inferenceSet>; end for; output: F. NCV addresses…
Forensic discrimination of copper wire using trace element concentrations.
Dettman, Joshua R; Cassabaum, Alyssa A; Saunders, Christopher P; Snyder, Deanna L; Buscaglia, JoAnn
2014-08-19
Copper may be recovered as evidence in high-profile cases such as thefts and improvised explosive device incidents; comparison of copper samples from the crime scene and those associated with the subject of an investigation can provide probative associative evidence and investigative support. A solution-based inductively coupled plasma mass spectrometry method for measuring trace element concentrations in high-purity copper was developed using standard reference materials. The method was evaluated for its ability to use trace element profiles to statistically discriminate between copper samples considering the precision of the measurement and manufacturing processes. The discriminating power was estimated by comparing samples chosen on the basis of the copper refining and production process to represent the within-source (samples expected to be similar) and between-source (samples expected to be different) variability using multivariate parametric- and empirical-based data simulation models with bootstrap resampling. If the false exclusion rate is set to 5%, >90% of the copper samples can be correctly determined to originate from different sources using a parametric-based model and >87% with an empirical-based approach. These results demonstrate the potential utility of the developed method for the comparison of copper samples encountered as forensic evidence.
Interpolation for de-Dopplerisation
NASA Astrophysics Data System (ADS)
Graham, W. R.
2018-05-01
'De-Dopplerisation' is one aspect of a problem frequently encountered in experimental acoustics: deducing an emitted source signal from received data. It is necessary when source and receiver are in relative motion, and requires interpolation of the measured signal. This introduces error. In acoustics, typical current practice is to employ linear interpolation and reduce error by over-sampling. In other applications, more advanced approaches with better performance have been developed. Associated with this work is a large body of theoretical analysis, much of which is highly specialised. Nonetheless, a simple and compact performance metric is available: the Fourier transform of the 'kernel' function underlying the interpolation method. Furthermore, in the acoustics context, it is a more appropriate indicator than other, more abstract, candidates. On this basis, interpolators from three families previously identified as promising (piecewise-polynomial, windowed-sinc, and B-spline-based) are compared. The results show that significant improvements over linear interpolation can straightforwardly be obtained. The recommended approach is B-spline-based interpolation, which performs best irrespective of accuracy specification. Its only drawback is a pre-filtering requirement, which represents an additional implementation cost compared to other methods. If this cost is unacceptable, and aliasing errors (on re-sampling) up to approximately 1% can be tolerated, a family of piecewise-cubic interpolators provides the best alternative.
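The kernel-transform metric can be reproduced numerically: sample a kernel densely, take its Fourier transform, and read off the response at the Nyquist frequency (0.5 cycles/sample). The sketch below compares the linear-interpolation triangle kernel with the cubic B-spline kernel (ignoring the B-spline pre-filter, which is part of the cost noted above):

```python
import numpy as np

def kern_linear(x):
    """Triangle kernel underlying linear interpolation."""
    return np.clip(1 - np.abs(x), 0, None)

def kern_bspline3(x):
    """Cubic B-spline kernel (needs pre-filtering when used to interpolate)."""
    ax = np.abs(x)
    return np.where(ax < 1, 2 / 3 - ax ** 2 + ax ** 3 / 2,
           np.where(ax < 2, (2 - ax) ** 3 / 6, 0.0))

# Dense sampling of each kernel, then its Fourier transform magnitude;
# the frequency axis is in cycles per sample, so Nyquist sits at 0.5.
dx = 0.01
x = np.arange(-8, 8, dx)
f = np.fft.rfftfreq(len(x), d=dx)
for name, k in [("linear", kern_linear), ("b-spline", kern_bspline3)]:
    K = np.abs(np.fft.rfft(k(x))) * dx
    print(name, "response at Nyquist:", K[np.argmin(np.abs(f - 0.5))].round(3))
```

A smaller response near Nyquist means less aliased energy leaks through on re-sampling, which is the sense in which one kernel "performs better" than another here.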
Enhancing hydrologic data assimilation by evolutionary Particle Filter and Markov Chain Monte Carlo
NASA Astrophysics Data System (ADS)
Abbaszadeh, Peyman; Moradkhani, Hamid; Yan, Hongxiang
2018-01-01
Particle filters (PFs) have received increasing attention from researchers in different disciplines, including the hydro-geosciences, as an effective tool to improve model predictions in nonlinear and non-Gaussian dynamical systems. The implementation of dual state and parameter estimation using PFs in hydrology has evolved since 2005 from the PF-SIR (sampling importance resampling) to PF-MCMC (Markov Chain Monte Carlo), and now to the most effective and robust framework through an evolutionary PF approach based on the Genetic Algorithm (GA) and MCMC, the so-called EPFM. In this framework, the prior distribution undergoes an evolutionary process based on the designed mutation and crossover operators of the GA. The merit of this approach is that the particles move to appropriate positions by means of the GA optimization, and the number of effective particles is then increased by means of MCMC, whereby particle degeneracy is avoided and particle diversity is improved. In this study, the usefulness and effectiveness of the proposed EPFM is investigated by applying the technique to a conceptual and highly nonlinear hydrologic model over four river basins located in different climatic and geographical regions of the United States. Both synthetic and real case studies demonstrate that the EPFM improves both state and parameter estimation more effectively and reliably than PF-MCMC.
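For context, the SIR baseline that PF-MCMC and EPFM build on hinges on a resampling step; below is a standard systematic-resampling sketch on a synthetic one-dimensional state (this is the baseline step only, not the EPFM itself):

```python
import numpy as np

def systematic_resample(weights, rng):
    """Systematic resampling: one uniform offset, then a comb of n points."""
    n = len(weights)
    positions = (rng.uniform() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(weights), positions)

rng = np.random.default_rng(7)
particles = rng.normal(0, 1, 500)              # hypothetical state particles
w = np.exp(-(particles - 0.8) ** 2)            # unnormalized likelihood weights
w /= w.sum()

idx = systematic_resample(w, rng)
particles = particles[idx]                     # low-weight particles die out
print("posterior mean ~", particles.mean().round(3))
```

Repeated resampling collapses diversity (degeneracy), which is precisely what the GA mutation/crossover moves and MCMC steps of the EPFM are designed to counteract.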
Accuracy of lung nodule density on HRCT: analysis by PSF-based image simulation.
Ohno, Ken; Ohkubo, Masaki; Marasinghe, Janaka C; Murao, Kohei; Matsumoto, Toru; Wada, Shinichi
2012-11-08
A computed tomography (CT) image simulation technique based on the point spread function (PSF) was applied to analyze the accuracy of CT-based clinical evaluations of lung nodule density. The PSF of the CT system was measured and used to perform lung nodule image simulation. The simulated image was then resampled at intervals equal to the pixel size and the slice interval found in clinical high-resolution CT (HRCT) images. On those images, the nodule density was measured by placing a region of interest (ROI) as commonly used in routine clinical practice, and the measured value was compared with the true value (the known density of the object function used in the image simulation). It was quantitatively determined that the measured nodule density depended on the nodule diameter and the image reconstruction parameters (kernel and slice thickness). In addition, the measured density fluctuated depending on the offset between the nodule center and the image voxel center. This fluctuation was reduced by decreasing the slice interval (i.e., with the use of overlapping reconstruction), leading to a stable density evaluation. Our proposed method of PSF-based image simulation accompanied by resampling enables a quantitative analysis of the accuracy of CT-based evaluations of lung nodule density. These results could potentially reveal clinical misreadings in diagnosis, and lead to more accurate and precise density evaluations. They would also be of value for determining the optimum scan and reconstruction parameters, such as image reconstruction kernels and slice thicknesses/intervals.
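The blur-then-resample logic can be illustrated in 1-D with an assumed Gaussian PSF (the actual method uses the measured PSF of the CT system); all numbers are arbitrary:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# Hypothetical 1-D stand-in: a 4 mm "nodule" of density 100 (arbitrary
# units) on a fine 0.05 mm grid, blurred by an assumed Gaussian PSF.
dx = 0.05
x = np.arange(-15, 15, dx)
truth = np.where(np.abs(x) <= 2.0, 100.0, 0.0)        # object function
blurred = gaussian_filter1d(truth, sigma=1.0 / dx)    # PSF sigma 1.0 mm

def measured_density(offset_mm, pixel_mm=0.6, roi_mm=0.6):
    """Resample at the clinical pixel size, then average a central ROI."""
    centers = np.arange(-15 + offset_mm, 15, pixel_mm)
    samples = np.interp(centers, x, blurred)
    return samples[np.abs(centers) <= roi_mm].mean()

# The measured value falls short of the truth and fluctuates with the
# offset between the nodule center and the sample (voxel) centers.
for off in (0.0, 0.2, 0.4):
    print(f"offset {off:.1f} mm -> {measured_density(off):6.1f} (true 100.0)")
```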
NASA Astrophysics Data System (ADS)
Laloy, Eric; Hérault, Romain; Lee, John; Jacques, Diederik; Linde, Niklas
2017-12-01
Efficient and high-fidelity prior sampling and inversion for complex geological media is still a largely unsolved challenge. Here, we use a deep neural network of the variational autoencoder type to construct a parametric low-dimensional base model parameterization of complex binary geological media. For inversion purposes, it has the attractive feature that random draws from an uncorrelated standard normal distribution yield model realizations with spatial characteristics that are in agreement with the training set. In comparison with the most commonly used parametric representations in probabilistic inversion, we find that our dimensionality reduction (DR) approach outperforms principal component analysis (PCA), optimization-PCA (OPCA) and discrete cosine transform (DCT) DR techniques for unconditional geostatistical simulation of a channelized prior model. For the considered examples, substantial compression ratios (200-500) are achieved. Given that the construction of our parameterization requires a training set of several tens of thousands of prior model realizations, our DR approach is more suited for probabilistic (or deterministic) inversion than for unconditional (or point-conditioned) geostatistical simulation. Probabilistic inversions of 2D steady-state and 3D transient hydraulic tomography data are used to demonstrate the DR-based inversion. For the 2D case study, the performance is superior compared to current state-of-the-art multiple-point statistics inversion by sequential geostatistical resampling (SGR). Inversion results for the 3D application are also encouraging.
The Improved Locating Algorithm of Particle Filter Based on ROS Robot
NASA Astrophysics Data System (ADS)
Fang, Xun; Fu, Xiaoyang; Sun, Ming
2018-03-01
This paper analyzes the basic theory and primary algorithms of real-time localization and SLAM for a robot based on the ROS system. It proposes an improved particle filter localization algorithm that effectively reduces the time needed to match laser radar scans to the map; additional ultra-wideband positioning directly improves the overall efficiency of the FastSLAM algorithm, which no longer needs to search the global map. Meanwhile, re-sampling is reduced by roughly five sixths, which removes the corresponding matching load from the robot's algorithm.
Proposed hardware architectures of particle filter for object tracking
NASA Astrophysics Data System (ADS)
Abd El-Halym, Howida A.; Mahmoud, Imbaby Ismail; Habib, SED
2012-12-01
In this article, efficient hardware architectures for the particle filter (PF) are presented. We propose three different architectures for Sequential Importance Resampling Filter (SIRF) implementation. The first architecture is a two-step sequential PF machine, where particle sampling, weight, and output calculations are carried out in parallel during the first step, followed by sequential resampling in the second step. For the weight computation step, a piecewise linear function is used instead of the classical exponential function; this decreases the complexity of the architecture without degrading the results. The second architecture speeds up the resampling step via a parallel, rather than serial, architecture, targeting a balance between hardware resources and speed of operation. The third architecture implements the SIRF as a distributed PF composed of several processing elements and a central unit. All the proposed architectures are captured in VHDL, synthesized using the Xilinx environment, and verified using the ModelSim simulator. Synthesis results confirmed the resource reduction and speed-up advantages of our architectures.
NASA Astrophysics Data System (ADS)
Guo, Jun; Lu, Siliang; Zhai, Chao; He, Qingbo
2018-02-01
An automatic bearing fault diagnosis method is proposed for permanent magnet synchronous generators (PMSGs), which are widely installed in wind turbines subject to low rotating speeds, speed fluctuations, and electrical device noise interference. The mechanical rotating angle curve is first extracted from the phase current of a PMSG by sequentially applying a series of algorithms. The synchronously sampled vibration signal of the faulty bearing is then resampled in the angular domain according to the obtained rotating phase information. Considering that the resampled vibration signal is still overwhelmed by heavy background noise, an adaptive stochastic resonance filter is applied to it to enhance the fault indicator and facilitate bearing fault identification. Two types of faulty bearings with different fault sizes in a PMSG test rig were tested to verify the effectiveness of the proposed method. The proposed method is fully automated and thus shows potential for convenient, highly efficient and in situ bearing fault diagnosis for wind turbines subjected to harsh environments.
Liu, Zhihua; Yang, Jian; He, Hong S.
2013-01-01
The relative importance of fuel, topography, and weather on fire spread varies at different spatial scales, but how the relative importance of these controls responds to changing spatial scales is poorly understood. We designed a “moving window” resampling technique that allowed us to quantify the relative importance of controls on fire spread at continuous spatial scales using boosted regression tree methods. This quantification allowed us to identify the threshold value of fire size at which the dominant control switches from fuel at small sizes to weather at large sizes. Topography had a fluctuating effect on fire spread across spatial scales, explaining 20-30% of relative importance. With increasing fire size, the dominant control switched from bottom-up controls (fuel and topography) to top-down controls (weather). Our analysis suggested that there is a threshold for fire size, above which fires are driven primarily by weather and are more likely to reach larger sizes. We suggest that this threshold, which may be ecosystem-specific, can be identified using our “moving window” resampling technique. Although the threshold derived from this analytical method may rely heavily on the sampling technique, our study introduces an easily implemented approach to identify scale thresholds in wildfire regimes. PMID:23383247
Wang, Xuefeng; Lee, Seunggeun; Zhu, Xiaofeng; Redline, Susan; Lin, Xihong
2013-12-01
Family-based genetic association studies of related individuals provide opportunities to detect genetic variants that complement studies of unrelated individuals. Most statistical methods for family association studies of common variants are single-marker based, testing one SNP at a time. In this paper, we consider testing the effect of an SNP set, e.g., SNPs in a gene, in family studies, for both continuous and discrete traits. Specifically, we propose a generalized estimating equation (GEE)-based kernel association test, a variance-component-based testing method, to test for the association between a phenotype and multiple variants in an SNP set jointly using family samples. The proposed approach allows for both continuous and discrete traits, where the correlation among family members is taken into account through the use of an empirical covariance estimator. We derive the theoretical distribution of the proposed statistic under the null and develop analytical methods to calculate the P-values. We also propose an efficient resampling method for correcting for small-sample bias in family studies. The proposed method allows for easy incorporation of covariates and SNP-SNP interactions. Simulation studies show that the proposed method properly controls type I error rates under both random and ascertained sampling schemes in family studies. We demonstrate through simulation studies that our approach has superior performance for association mapping compared to the single-marker minimum P-value GEE test for an SNP-set effect over a range of scenarios. We illustrate the application of the proposed method using data from the Cleveland Family GWAS Study. © 2013 WILEY PERIODICALS, INC.
A computer-vision-based rotating speed estimation method for motor bearing fault diagnosis
NASA Astrophysics Data System (ADS)
Wang, Xiaoxian; Guo, Jie; Lu, Siliang; Shen, Changqing; He, Qingbo
2017-06-01
Diagnosis of motor bearing faults under variable speed remains a challenge. In this study, a new computer-vision-based order tracking method is proposed to address this problem. First, a video recorded by a high-speed camera is analyzed with the speeded-up robust feature extraction and matching algorithm to obtain the instantaneous rotating speed (IRS) of the motor. Subsequently, an audio signal recorded by a microphone is equi-angle resampled for order tracking in accordance with the IRS curve, through which the frequency-domain signal is transferred to an angular-domain one. The envelope order spectrum is then calculated to determine the fault characteristic order, and finally the bearing fault pattern is determined. The effectiveness and robustness of the proposed method are verified on two brushless direct-current motor test rigs, in which two defective bearings and a healthy bearing are tested separately. This study provides a new noninvasive measurement approach that avoids the installation of a tachometer and overcomes the disadvantages of tacholess order tracking methods for motor bearing fault diagnosis under variable speed.
Choi, Ickwon; Kattan, Michael W; Wells, Brian J; Yu, Changhong
2012-01-01
In medicine, prognostic models that use clinicopathologic features to predict prognosis after a certain treatment have been externally validated and used in practice. In recent years, most research has focused on high-dimensional genomic data and small sample sizes. Since clinically similar but molecularly heterogeneous tumors may produce different clinical outcomes, the combination of clinical and genomic information, which may be complementary, is crucial to improve the quality of prognostic predictions. However, there is a lack of integration schemes for clinico-genomic models due to the P ≥ N problem, in particular for parsimonious models. We propose a methodology to build a reduced yet accurate integrative model using a hybrid approach based on the Cox regression model, which uses several dimension reduction techniques, L₂-penalized maximum likelihood estimation (PMLE), and resampling methods to tackle the problem. The predictive accuracy of the modeling approach is assessed by several metrics via an independent and thorough scheme that compares competing methods. In studies of breast cancer data on metastasis and death events, we show that the proposed methodology can improve prediction accuracy and build a final model with a hybrid signature that is parsimonious when integrating both types of variables.
Jollymore, Ashlee; Johnson, Mark S.; Hawthorne, Iain
2012-01-01
Organic material, including total and dissolved organic carbon (DOC), is ubiquitous within aquatic ecosystems, playing a variety of important and diverse biogeochemical and ecological roles. Determining how land-use changes affect DOC concentrations and bioavailability within aquatic ecosystems is an important means of evaluating the effects on ecological productivity and biogeochemical cycling. This paper presents a methodology case study looking at the deployment of a submersible UV-Vis absorbance spectrophotometer (UV-Vis spectro::lyzer model, s::can, Vienna, Austria) to determine stream organic carbon dynamics within a headwater catchment located near Campbell River (British Columbia, Canada). Field-based absorbance measurements of DOC were made before and after forest harvest, highlighting the advantages of high temporal resolution compared to traditional grab sampling and laboratory measurements. Details of remote deployment are described. High-frequency DOC data is explored by resampling the 30 min time series with a range of resampling time intervals (from daily to weekly time steps). DOC export was calculated for three months from the post-harvest data and resampled time series, showing that sampling frequency has a profound effect on total DOC export. DOC exports derived from weekly measurements were found to underestimate export by as much as 30% compared to DOC export calculated from high-frequency data. Additionally, the importance of the ability to remotely monitor the system through a recently deployed wireless connection is emphasized by examining causes of prior data losses, and how such losses may be prevented through the ability to react when environmental or power disturbances cause system interruption and data loss. PMID:22666002
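Why sampling frequency changes the export estimate can be seen in a synthetic sketch where short storm pulses dominate the flux (all numbers invented; with right-skewed fluxes, sparse grab-style sampling typically misses the pulses and tends to underestimate export):

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical 3 months of 30-min DOC concentration (mg/L) and discharge
# (L/s), with storm pulses in which both rise together.
n = 90 * 48
storm = np.clip(rng.normal(0, 1, n), 0, None) ** 3
conc = 2.0 + 1.5 * storm + rng.normal(0, 0.1, n)
flow = 50.0 + 400.0 * storm

dt = 1800.0                                      # seconds per 30-min step
full_export = np.sum(conc * flow * dt) / 1e6     # kg of DOC over the period

for step, label in [(48, "daily"), (48 * 7, "weekly")]:
    sub = slice(None, None, step)
    # Grab-sample style estimate: subsampled mean flux scaled to the period.
    est = np.mean(conc[sub] * flow[sub]) * dt * n / 1e6
    print(f"{label:6s} sampling: {100 * est / full_export:.0f}% of full export")
```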
0-2 Ma Paleomagnetic Field Behavior from Lava Flow Data Sets
NASA Astrophysics Data System (ADS)
Johnson, C. L.; Constable, C.; Tauxe, L.; Cromwell, G.
2010-12-01
The global time-averaged (TAF) structure of the paleomagnetic field and paleosecular variation (PSV) provide important constraints for numerical geodynamo simulations. Studies of the TAF have sought to characterize the nature of non-geocentric-axial-dipole contributions to the field, in particular any such contributions that may be diagnostic of the influence of core-mantle boundary conditions on field generation. Similarly, geographical variations in PSV are of interest, in particular the long-standing debate concerning anomalously low VGP (virtual geomagnetic pole) dispersion at Hawaii. Here, we analyze updated global directional data sets from lava flows. We present global models for the time-averaged field for the Brunhes and Matuyama epochs. New TAF models based on lava flow directional data for the Brunhes show longitudinal structure. In particular, high-latitude flux lobes are observed, constrained by improved data sets from N. and S. America, Japan, and New Zealand. Anomalous TAF structure is also observed in the region around Hawaii. At Hawaii, previous inferences of the anomalous TAF (large inclination anomaly) and PSV (low VGP dispersion) have been argued to be the result of temporal sampling bias toward young flows. We use resampling techniques to examine possible biases in the TAF and PSV incurred by uneven temporal sampling. Resampling of the paleodirectional data onto a uniform temporal distribution, incorporating site ages and age errors, leads to a TAF estimate for the Brunhes that is close to that reported for the actual data set, but an estimate for VGP dispersion that is increased relative to that obtained from the unevenly sampled data. Future investigations will incorporate the temporal resampling procedures into TAF modeling efforts, as well as recent progress in modeling the 0-2 Ma paleomagnetic dipole moment.
3D data processing with advanced computer graphics tools
NASA Astrophysics Data System (ADS)
Zhang, Song; Ekstrand, Laura; Grieve, Taylor; Eisenmann, David J.; Chumbley, L. Scott
2012-09-01
Often, the 3-D raw data coming from an optical profilometer contain spiky noise and an irregular grid, which make them difficult to analyze and to store because of their enormously large size. This paper addresses these two issues by substantially reducing the spiky noise of the 3-D raw data and by rapidly re-sampling the raw data into regular grids at any pixel size and any orientation with advanced computer graphics tools. Experimental results are presented to demonstrate the effectiveness of the proposed approach.
ERIC Educational Resources Information Center
Zhang, Dongbo; Koda, Keiko
2012-01-01
Within the Structural Equation Modeling framework, this study tested the direct and indirect effects of morphological awareness and lexical inferencing ability on L2 vocabulary knowledge and reading comprehension among advanced Chinese EFL readers in a university in China. Using both regular z-test and the bootstrapping (data-based resampling)…
NASA Astrophysics Data System (ADS)
Pezzani, Carlos M.; Bossio, José M.; Castellino, Ariel M.; Bossio, Guillermo R.; De Angelo, Cristian H.
2017-02-01
Condition monitoring of permanent magnet synchronous machines has gained interest due to their increasing use in applications such as electric traction and power generation. Particularly in wind power generation, non-invasive condition monitoring techniques are of great importance. Usually, in such applications access to the generator is complex and costly, while unexpected breakdowns result in high repair costs. This paper presents a technique that enables the use of vibration analysis for bearing fault detection in permanent magnet synchronous generators used in wind turbines. Given that in wind power applications the generator rotational speed may vary during normal operation, special sampling techniques are necessary to apply spectral analysis of mechanical vibrations. In this work, a resampling technique based on order tracking without measuring the rotor position is proposed. To synchronize sampling with rotor position, the rotor position is estimated from the angle of the voltage vector, obtained from a phase-locked loop synchronized with the generator voltages. The proposed strategy is validated by laboratory experiments on a permanent magnet synchronous generator. Results with single-point defects in the outer race of a bearing under variable speed and load conditions are presented.
Vehicle Fault Diagnose Based on Smart Sensor
NASA Astrophysics Data System (ADS)
Zhining, Li; Peng, Wang; Jianmin, Mei; Jianwei, Li; Fei, Teng
In a traditional vehicle fault diagnosis system, a computer with an A/D card is connected to many sensors. The disadvantages of this system are that the sensors can hardly be shared with the control system and other systems, too many connecting lines are required, and electromagnetic compatibility (EMC) is affected. In this paper, a smart speed sensor, smart acoustic pressure sensor, smart oil pressure sensor, smart acceleration sensor and smart order tracking sensor were designed to solve this problem. Via the CAN bus, these smart sensors, the fault diagnosis computer and other computers can be connected to establish a network system that monitors and controls the vehicle's diesel engine and other systems without any duplicated sensors. The hardware and software of the smart sensor system are introduced: the oil pressure, vibration and acoustic signals are resampled at constant angle increments to eliminate the influence of rotating speed. After this resampling, the signal in every working cycle can be averaged in the angle domain and subjected to other analyses, such as the order spectrum.
Grinde, Kelsey E.; Arbet, Jaron; Green, Alden; O'Connell, Michael; Valcarcel, Alessandra; Westra, Jason; Tintle, Nathan
2017-01-01
To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p < 2.2 × 10⁻⁶) and, consequently, substantially improves mean squared error and variant prioritization/ranking. The method is particularly helpful in adjustment for winner's curse effects when the initial gene-based test has low power and for relatively more common, non-causal variants. Adjustment for winner's curse is recommended for all post-hoc estimation and ranking of variants after a gene-based test. Further work is necessary to continue seeking ways to reduce bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures. PMID:28959274
Assessing dynamics, spatial scale, and uncertainty in task-related brain network analyses
Stephen, Emily P.; Lepage, Kyle Q.; Eden, Uri T.; Brunner, Peter; Schalk, Gerwin; Brumberg, Jonathan S.; Guenther, Frank H.; Kramer, Mark A.
2014-01-01
The brain is a complex network of interconnected elements, whose interactions evolve dynamically in time to cooperatively perform specific functions. A common technique to probe these interactions involves multi-sensor recordings of brain activity during a repeated task. Many techniques exist to characterize the resulting task-related activity, including establishing functional networks, which represent the statistical associations between brain areas. Although functional network inference is commonly employed to analyze neural time series data, techniques to assess the uncertainty—both in the functional network edges and the corresponding aggregate measures of network topology—are lacking. To address this, we describe a statistically principled approach for computing uncertainty in functional networks and aggregate network measures in task-related data. The approach is based on a resampling procedure that utilizes the trial structure common in experimental recordings. We show in simulations that this approach successfully identifies functional networks and associated measures of confidence emergent during a task in a variety of scenarios, including dynamically evolving networks. In addition, we describe a principled technique for establishing functional networks based on predetermined regions of interest using canonical correlation. Doing so provides additional robustness to the functional network inference. Finally, we illustrate the use of these methods on example invasive brain voltage recordings collected during an overt speech task. The general strategy described here—appropriate for static and dynamic network inference and different statistical measures of coupling—permits the evaluation of confidence in network measures in a variety of settings common to neuroscience. PMID:24678295
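The trial-resampling idea can be sketched as bootstrapping whole trials and recomputing the network each time; the data, induced coupling, and correlation-based edge measure below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(9)

# Hypothetical task data: trials x sensors x time, with sensors 0 and 1
# coupled during the task.
n_trials, n_sens, n_time = 60, 5, 200
data = rng.normal(size=(n_trials, n_sens, n_time))
data[:, 1] += 0.6 * data[:, 0]                   # induce one true edge

def edge_matrix(d):
    """Functional network: sensor-by-sensor correlation, averaged over trials."""
    return np.mean([np.corrcoef(trial) for trial in d], axis=0)

boot = []
for _ in range(500):
    idx = rng.integers(0, n_trials, n_trials)     # resample whole trials
    boot.append(edge_matrix(data[idx]))
boot = np.array(boot)

lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
sig = (lo > 0) | (hi < 0)                         # edge CIs excluding zero
np.fill_diagonal(sig, False)
print("edges with confident coupling:\n", sig.astype(int))
```

Resampling trials rather than individual samples preserves the within-trial temporal structure, which is why the trial structure of the recordings is what licenses the procedure.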
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shrestha, S; Vedantham, S; Karellas, A
Purpose: Detectors with hexagonal pixels require resampling to square pixels for distortion-free display of acquired images. In this work, the presampling modulation transfer function (MTF) of a hexagonal pixel array photon-counting CdTe detector for region-of-interest fluoroscopy was measured and the optimal square pixel size for resampling was determined. Methods: A 0.65 mm thick CdTe Schottky sensor capable of concurrently acquiring up to 3 energy-windowed images was operated in a single-energy-window mode to include ≥10 keV photons. The detector had hexagonal pixels with an apothem of 30 microns, resulting in pixel spacings of 60 and 51.96 microns along the two orthogonal directions. Images of a tungsten edge test device acquired under IEC RQA5 conditions were double Hough transformed to identify the edge and numerically differentiated. The presampling MTF was determined from the finely sampled line spread function, accounting for the hexagonal sampling. The optimal square pixel size was determined in two ways: the square pixel size for which the aperture function evaluated at the Nyquist frequencies along the two orthogonal directions matched that of the hexagonal pixel aperture functions, and the square pixel size for which the mean absolute difference between the square and hexagonal aperture functions was minimized over all frequencies up to the Nyquist limit. Results: Evaluation of the aperture functions over the entire frequency range resulted in a square pixel size of 53 microns, with less than 2% difference from the hexagonal pixel. Evaluation of the aperture functions at the Nyquist frequencies alone resulted in 54-micron square pixels. For the photon-counting CdTe detector, after resampling to 53-micron square pixels using quadratic interpolation, the presampling MTF at the Nyquist frequency of 9.434 cycles/mm along the two directions was 0.501 and 0.507. Conclusion: A hexagonal pixel array photon-counting CdTe detector, after resampling to square pixels, provides high-resolution imaging suitable for fluoroscopy.
Tacholess order-tracking approach for wind turbine gearbox fault detection
NASA Astrophysics Data System (ADS)
Wang, Yi; Xie, Yong; Xu, Guanghua; Zhang, Sicong; Hou, Chenggang
2017-09-01
Monitoring of wind turbines under variable-speed operating conditions has become an important issue in recent years. The gearbox of a wind turbine is the most important transmission unit; it generally exhibits complex vibration signatures due to random variations in operating conditions. Spectral analysis is one of the main approaches in vibration signal processing. However, spectral analysis is based on a stationarity assumption and is thus inapplicable to the fault diagnosis of wind turbines under variable-speed operating conditions. This constraint limits the application of spectral analysis to wind turbine diagnosis in industrial applications. Although order-tracking methods have been proposed for wind turbine fault detection, current methods are only applicable to cases in which the instantaneous shaft phase is available. For wind turbines with limited structural space, collecting phase signals with tachometers or encoders is difficult. In this study, a tacholess order-tracking method for wind turbines is proposed to overcome the limitations of traditional techniques. The proposed method extracts the instantaneous phase from the vibration signal, resamples the signal at equiangular increments, and calculates the order spectrum for wind turbine fault identification. The effectiveness of the proposed method is experimentally validated with the vibration signals of wind turbines.
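A hedged Python sketch of the three steps (instantaneous phase extraction, equiangular resampling, order spectrum). It assumes a band-pass-filtered shaft-rate component is available for Hilbert-transform phase estimation, which is one common tacholess choice rather than the paper's exact estimator; the unwrapped phase is assumed monotonically increasing.

```python
import numpy as np
from scipy.signal import hilbert

def order_spectrum(vib, shaft_component, samples_per_rev=64):
    # 1. Instantaneous phase (rad) of the shaft-rate component.
    phase = np.unwrap(np.angle(hilbert(shaft_component)))
    # 2. Resample the vibration signal at equiangular increments.
    n_rev = (phase[-1] - phase[0]) / (2 * np.pi)
    grid = np.linspace(phase[0], phase[-1], int(n_rev * samples_per_rev))
    vib_ang = np.interp(grid, phase, vib)      # angle-domain signal
    # 3. Order spectrum: FFT of the angle-domain signal.
    orders = np.fft.rfftfreq(len(vib_ang), d=1.0 / samples_per_rev)
    return orders, np.abs(np.fft.rfft(vib_ang)) / len(vib_ang)
```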
Joint scale-change models for recurrent events and failure time.
Xu, Gongjun; Chiou, Sy Han; Huang, Chiung-Yu; Wang, Mei-Cheng; Yan, Jun
2017-01-01
Recurrent event data arise frequently in various fields such as biomedical sciences, public health, engineering, and social sciences. In many instances, the observation of the recurrent event process can be stopped by the occurrence of a correlated failure event, such as treatment failure and death. In this article, we propose a joint scale-change model for the recurrent event process and the failure time, where a shared frailty variable is used to model the association between the two types of outcomes. In contrast to the popular Cox-type joint modeling approaches, the regression parameters in the proposed joint scale-change model have marginal interpretations. The proposed approach is robust in the sense that no parametric assumption is imposed on the distribution of the unobserved frailty and that we do not need the strong Poisson-type assumption for the recurrent event process. We establish consistency and asymptotic normality of the proposed semiparametric estimators under suitable regularity conditions. To estimate the corresponding variances of the estimators, we develop a computationally efficient resampling-based procedure. Simulation studies and an analysis of hospitalization data from the Danish Psychiatric Central Register illustrate the performance of the proposed method.
NASA Astrophysics Data System (ADS)
Patel, Nirmal; Sultana, Sharmin; Rashid, Tanweer; Krusienski, Dean; Audette, Michel A.
2015-03-01
This paper presents a methodology for the digital formatting of a printed atlas of the brainstem and the delineation of cranial nerves from this digital atlas. It also describes ongoing work on the 3D resampling and refinement of the 2D functional regions and nerve contours. In MRI-based anatomical modeling for neurosurgery planning and simulation, the complexity of the functional anatomy entails a digital atlas approach, rather than less descriptive voxel- or surface-based approaches. However, there is a shortage of descriptive digital atlases, in particular of the brainstem. Our approach proceeds from a series of numbered, contour-based sketches coinciding with slices of the brainstem, featuring both closed and open contours. The closed contours coincide with functionally relevant regions, and our objective is to fill in each corresponding label, which is analogous to painting numbered regions in a paint-by-numbers kit. Any open contour typically coincides with a cranial nerve. This 2D phase is needed in order to produce densely labeled regions that can be stacked to produce 3D regions, as well as to identify the embedded paths and outer attachment points of cranial nerves. Cranial nerves are modeled using an explicit contour-based technique called the 1-Simplex. The relevance of cranial nerve modeling in this project is two-fold: i) this atlas will fill a void left by the brain segmentation communities, as no suitable digital atlas of the brainstem exists, and ii) this atlas is necessary to make explicit the attachment points of major nerves (except I and II) having a cranial origin. Keywords: digital atlas, contour models, surface models
NASA Astrophysics Data System (ADS)
Plaza Guingla, D. A.; Pauwels, V. R.; De Lannoy, G. J.; Matgen, P.; Giustarini, L.; De Keyser, R.
2012-12-01
The objective of this work is to analyze the improvement in the performance of the particle filter obtained by including a resample-move step or by using a modified Gaussian particle filter. Specifically, the standard particle filter structure is altered by the inclusion of a Markov chain Monte Carlo move step. The second variant adopted in this study uses the moments of an ensemble Kalman filter analysis to define the importance density function within the Gaussian particle filter structure. Both variants of the standard particle filter are used in the assimilation of densely sampled discharge records into a conceptual rainfall-runoff model. In order to quantify the improvement obtained, discharge root mean square errors are compared for the different particle filters, as well as for the ensemble Kalman filter. First, a synthetic experiment is carried out. The results indicate that the performance of the standard particle filter can be improved by the inclusion of the resample-move step, but its effectiveness is restricted to situations with limited particle impoverishment. The results also show that the modified Gaussian particle filter outperforms the rest of the filters. Second, a real experiment is carried out in order to validate the findings from the synthetic experiment. The addition of the resample-move step does not show a considerable improvement due to performance limitations of the standard particle filter with real data. On the other hand, when an optimal importance density function is used in the Gaussian particle filter, the results show a considerably improved performance of the particle filter.
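The resample-move idea in a toy Python form: systematic resampling followed by one Metropolis move per particle to counter sample impoverishment. Here `log_post` stands in for the model's log-posterior over the current assimilation window and is an assumption of this sketch, not the study's rainfall-runoff code.

```python
import numpy as np

def systematic_resample(weights, rng):
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(weights), positions)

def resample_move(particles, weights, log_post, step=0.1, rng=None):
    rng = rng or np.random.default_rng(0)
    particles = particles[systematic_resample(weights, rng)]  # duplicates appear
    proposals = particles + step * rng.standard_normal(particles.shape)
    log_a = log_post(proposals) - log_post(particles)
    accept = np.log(rng.random(len(particles))) < log_a
    particles[accept] = proposals[accept]    # MCMC move re-diversifies particles
    return particles, np.full(len(particles), 1.0 / len(particles))
```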
Yang, Xianjin; Chen, Xiao; Carrigan, Charles R.; ...
2014-06-03
A parametric bootstrap approach is presented for uncertainty quantification (UQ) of CO₂ saturation derived from electrical resistance tomography (ERT) data collected at the Cranfield, Mississippi (USA) carbon sequestration site. There are many sources of uncertainty in ERT-derived CO₂ saturation, but we focus on how the ERT observation errors propagate to the estimated CO₂ saturation in a nonlinear inversion process. Our UQ approach consists of three steps. We first estimated the observational errors from a large number of reciprocal ERT measurements. The second step was to invert the pre-injection baseline data, and the resulting resistivity tomograph was used as the prior information for nonlinear inversion of the time-lapse data. We assigned a 3% random noise to the baseline model. Finally, we used a parametric bootstrap method to obtain bootstrap CO₂ saturation samples by deterministically solving a nonlinear inverse problem many times with resampled data and resampled baseline models. The mean and standard deviation of CO₂ saturation were then calculated from the bootstrap samples. We found that the maximum standard deviation of CO₂ saturation was around 6%, with a corresponding maximum saturation of 30%, for a data set collected 100 days after injection began. There was no apparent spatial correlation between the mean and standard deviation of CO₂ saturation, but the standard deviation values increased with time as the saturation increased. The uncertainty in CO₂ saturation also depends on the ERT reciprocal error threshold used to identify and remove noisy data, and on inversion constraints such as temporal roughness. Five hundred realizations, requiring 3.5 h on a single 12-core node, were needed for the nonlinear Monte Carlo inversion to arrive at stationary variances, while the Markov chain Monte Carlo (MCMC) stochastic inverse approach may take days for a global search. This indicates that UQ of 2D or 3D ERT inverse problems can be performed on a laptop or desktop PC.
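The parametric bootstrap loop itself is simple; a schematic Python version follows, with `invert` a placeholder for the deterministic nonlinear ERT inversion (resampling of the baseline model is omitted for brevity, and all names are illustrative).

```python
import numpy as np

def bootstrap_uq(data, noise_std, invert, n_boot=500, seed=0):
    """Mean and std of inverted models over noise-resampled data sets."""
    rng = np.random.default_rng(seed)
    models = np.array([invert(data + noise_std * rng.standard_normal(data.shape))
                       for _ in range(n_boot)])   # one deterministic inversion each
    return models.mean(axis=0), models.std(axis=0)  # pixel-wise mean and std
```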
Donovan, Rory M.; Tapia, Jose-Juan; Sullivan, Devin P.; Faeder, James R.; Murphy, Robert F.; Dittrich, Markus; Zuckerman, Daniel M.
2016-01-01
The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assigned weights and uses a statistical resampling strategy of replicating and pruning trajectories to focus computational effort on difficult-to-sample regions. The method can also generate unbiased estimates of non-equilibrium and equilibrium observables, sometimes with significantly less aggregate computing time than would be possible using standard parallelization. Here, we use WE to orchestrate particle-based kinetic Monte Carlo simulations, which include spatial geometry (e.g., of organelles, plasma membrane) and biochemical interactions among mobile molecular species. We study a series of models exhibiting spatial, temporal and biochemical complexity and show that although WE has important limitations, it can achieve performance significantly exceeding standard parallel simulation, by orders of magnitude for some observables.
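A toy illustration of the WE resampling step for one bin, using a multinomial draw as a simple stand-in for WE's actual split/merge rules; the key invariant, conservation of total weight within the bin, is asserted at the end. All names are illustrative.

```python
import numpy as np

def we_resample_bin(walkers, weights, target, rng):
    """Keep `target` walkers in the bin while conserving its total weight."""
    total = weights.sum()
    idx = rng.choice(len(walkers), size=target, p=weights / total)
    return walkers[idx], np.full(target, total / target)

rng = np.random.default_rng(1)
walkers = np.arange(5.0)                       # stand-ins for trajectory states
weights = np.array([0.5, 0.2, 0.2, 0.05, 0.05])
new_walkers, new_weights = we_resample_bin(walkers, weights, target=4, rng=rng)
assert np.isclose(new_weights.sum(), weights.sum())   # weight is conserved
```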
Refining mortality estimates in shark demographic analyses: a Bayesian inverse matrix approach.
Smart, Jonathan J; Punt, André E; White, William T; Simpfendorfer, Colin A
2018-01-18
Leslie matrix models are an important analysis tool in conservation biology that are applied to a diversity of taxa. The standard approach estimates the finite rate of population growth (λ) from a set of vital rates. In some instances, an estimate of λ is available, but the vital rates are poorly understood and can be solved for using an inverse matrix approach. However, these approaches are rarely attempted due to prerequisites of information on the structure of age or stage classes. This study addressed this issue by using a combination of Monte Carlo simulations and the sample-importance-resampling (SIR) algorithm to solve the inverse matrix problem without data on population structure. This approach was applied to the grey reef shark (Carcharhinus amblyrhynchos) from the Great Barrier Reef (GBR) in Australia to determine the demography of this population. Additionally, these outputs were applied to another heavily fished population from Papua New Guinea (PNG) that requires estimates of λ for fisheries management. The SIR analysis determined that natural mortality (M) and total mortality (Z) based on indirect methods have previously been overestimated for C. amblyrhynchos, leading to an underestimated λ. The updated Z distributions determined using SIR provided λ estimates that matched an empirical λ for the GBR population and corrected obvious error in the demographic parameters for the PNG population. This approach provides opportunity for the inverse matrix approach to be applied more broadly to situations where information on population structure is lacking. © 2018 by the Ecological Society of America.
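A minimal Python sketch of the SIR logic for this kind of inverse problem: mortality candidates drawn from a prior are weighted by agreement between the implied population growth rate and an empirical λ, then resampled. The toy Leslie matrix builder, prior range, and target λ are illustrative assumptions, not the paper's parameterization.

```python
import numpy as np

def growth_rate(M, fecundity=0.5, n_ages=10):
    """Dominant eigenvalue of a toy Leslie matrix with survival exp(-M)."""
    L = np.zeros((n_ages, n_ages))
    L[0, :] = fecundity                                   # toy fecundities
    L[np.arange(1, n_ages), np.arange(n_ages - 1)] = np.exp(-M)
    return np.max(np.abs(np.linalg.eigvals(L)))

rng = np.random.default_rng(0)
prior_M = rng.uniform(0.05, 0.5, size=10000)        # candidate natural mortalities
lams = np.array([growth_rate(m) for m in prior_M])
w = np.exp(-0.5 * ((lams - 1.05) / 0.02) ** 2)      # importance weights against an
w /= w.sum()                                        # assumed empirical lambda = 1.05
posterior_M = prior_M[rng.choice(len(prior_M), 2000, p=w)]   # resampling step
```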
NASA Technical Reports Server (NTRS)
Merola, John A.
1989-01-01
The LANDSAT Thematic Mapper (TM) scanner records reflected solar energy from the earth's surface in six wavelength regions, or bands, and one band that records emitted energy in the thermal region, giving a total of seven bands. Useful information about terrain morphometry was extracted from remote sensing measurements, and this information is used in an image-based terrain model for selected coastal geomorphic features in the Great Salt Lake Desert (GSLD). Technical developments include the incorporation of Aerial Profiling of Terrain System (APTS) data in satellite image analysis, and the production and use of 3-D surface plots of TM reflectance data. Also included in the technical developments are the analysis of the ground control point spatial distribution and its effects on geometric correction, and the terrain mapping procedure, which uses satellite data in a way that eliminates the need to degrade the data by resampling. The most common approach for terrain mapping with multispectral scanner data includes the techniques of pattern recognition and image classification, as opposed to direct measurement of radiance for identification of terrain features. The research approach in this investigation was based on an understanding of the characteristics of reflected light resulting from the variations in moisture and geometry related to terrain, as described by the physical laws of radiative transfer. The image-based terrain model provides quantitative information about the terrain morphometry based on the physical relationship between the TM data, the physical character of the GSLD, and the APTS measurements.
1986-11-01
We can formulate the general weighted resampling formulas by giving an interpolation formula and a sampling formula. Specifically...tessellation grids. 4.1. One-dimensional Adaptive Pyramid. We suggest an interest operator based on the local "busyness" of the data. It has been observed that in human perception a line with higher "busyness" seems longer than a straight line segment [6], as in Figure 7. Here, we will use a smoothed
Allison Bidlack; Sarah Bisbing; Brian Buma; David D’Amore; Paul Hennon; Thomas Heutte; John Krapek; Robin Mulvey; Lauren Oakes
2017-01-01
In their analysis of resampled and remeasured plot data from the USDA Forest Service Forest Inventory and Analysis (FIA) program, Barrett and Pattison (2017, Can. J. For. Res. 47(1): 97–105, doi:10.1139/cjfr-2016-0335) suggest that there is neither evidence of a recent regional decrease in yellow-cedar (Callitropsis nootkatensis...
A bootstrap estimation scheme for chemical compositional data with nondetects
Palarea-Albaladejo, J; Martín-Fernández, J.A; Olea, Ricardo A.
2014-01-01
The bootstrap method is commonly used to estimate the distribution of estimators and their associated uncertainty when explicit analytic expressions are not available or are difficult to obtain. It has been widely applied in environmental and geochemical studies, where the data generated often represent parts of a whole, typically chemical concentrations. This kind of constrained data is generically called compositional data, and it requires specialised statistical methods to properly account for its particular covariance structure. On the other hand, it is not unusual in practice that those data contain labels denoting nondetects, that is, concentrations falling below detection limits. Nondetects impede the implementation of the bootstrap and represent an additional source of uncertainty that must be taken into account. In this work, a bootstrap scheme is devised that handles nondetects by adding an imputation step within the resampling process and conveniently propagates their associated uncertainty. In doing so, it considers the constrained relationships between chemical concentrations originating from their compositional nature. Bootstrap estimates using a range of imputation methods, including new stochastic proposals, are compared across scenarios of increasing difficulty. They are formulated to meet compositional principles following the log-ratio approach, and an adjustment is introduced in the multivariate case to deal with nonclosed samples. Results suggest that nondetect bootstrap based on model-based imputation is generally preferable. A robust approach based on isometric log-ratio transformations appears to be particularly suited in this context. Computer routines in the R statistical programming language are provided.
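The paper provides R routines; purely to illustrate where the imputation step sits inside the resampling loop, here is a univariate Python sketch, with a simple uniform-below-detection-limit imputation standing in for the model-based, log-ratio-aware methods the study recommends.

```python
import numpy as np

def nondetect_bootstrap(x, detected, dl, stat=np.mean, n_boot=2000, seed=0):
    """Bootstrap `stat`, imputing nondetects inside each replicate."""
    rng = np.random.default_rng(seed)
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(x), len(x))        # resample with replacement
        xb, db = x[idx].copy(), detected[idx]
        xb[~db] = rng.uniform(0.0, dl, (~db).sum())  # impute the nondetects
        out[b] = stat(xb)
    return out                                # bootstrap distribution of the statistic
```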
Automatic detection of regions of interest in mammographic images
NASA Astrophysics Data System (ADS)
Cheng, Erkang; Ling, Haibin; Bakic, Predrag R.; Maidment, Andrew D. A.; Megalooikonomou, Vasileios
2011-03-01
This work is a part of our ongoing study aimed at comparing the topology of anatomical branching structures with the underlying image texture. Detection of regions of interest (ROIs) in clinical breast images serves as the first step in the development of an automated system for image analysis and breast cancer diagnosis. In this paper, we have investigated machine learning approaches for the task of identifying ROIs with visible breast ductal trees in a given galactographic image. Specifically, we have developed a boosting-based framework using the AdaBoost algorithm in combination with Haar wavelet features for ROI detection. Twenty-eight clinical galactograms with expert-annotated ROIs were used for training. Positive samples were generated by resampling near the annotated ROIs, and negative samples were generated randomly by image decomposition. Each detected ROI candidate was given a confidence score. Candidate ROIs with spatial overlap were merged and their confidence scores combined. We have compared three strategies for the elimination of false positives. The strategies differed in their approach to combining confidence scores: by summation, averaging, or selecting the maximum score. The strategies were compared based upon their spatial overlap with the annotated ROIs. Using a 4-fold cross-validation with the annotated clinical galactographic images, the summation strategy showed the best performance, with a 75% detection rate. When combining the top two candidates, the selection of the maximum score showed the best performance, with a 96% detection rate.
Application of microarray analysis on computer cluster and cloud platforms.
Bernau, C; Boulesteix, A-L; Knaus, J
2013-01-01
Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
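A small Python example of the computational independence being exploited: permutation iterations share no state, so they map directly onto worker processes, whether on a departmental cluster node or a rented cloud instance. The two-sample setup and counts are illustrative.

```python
import numpy as np
from functools import partial
from multiprocessing import Pool

def one_permutation(seed, pooled, n_x):
    """One independent permutation replicate of the mean difference."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(pooled)
    return perm[:n_x].mean() - perm[n_x:].mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.normal(0.3, 1, 40), rng.normal(0.0, 1, 40)
    pooled, observed = np.concatenate([x, y]), x.mean() - y.mean()
    with Pool(4) as pool:                      # iterations farm out with no communication
        null = np.array(pool.map(partial(one_permutation,
                                         pooled=pooled, n_x=len(x)),
                                 range(10000)))
    print((np.abs(null) >= abs(observed)).mean())   # permutation p-value
```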
Liao, Hstau Y.; Hashem, Yaser; Frank, Joachim
2015-01-01
Single-particle cryogenic electron microscopy (cryo-EM) is a powerful tool for the study of macromolecular structures at high resolution. Classification allows multiple structural states to be extracted and reconstructed from the same sample. One classification approach is via the covariance matrix, which captures the correlation between every pair of voxels. Earlier approaches employ computing-intensive resampling and estimate only the eigenvectors of the matrix, which are then used in a separate fast classification step. We propose an iterative scheme to explicitly estimate the covariance matrix in its entirety. In our approach, the flexibility in choosing the solution domain allows us to examine a part of the molecule in greater detail. 3D covariance maps obtained in this way from experimental data (cryo-EM images of the eukaryotic pre-initiation complex) prove to be in excellent agreement with conclusions derived by using traditional approaches, revealing in addition the interdependencies of ligand bindings and structural changes.
NASA Astrophysics Data System (ADS)
Kross, Angela; McNairn, Heather; Lapen, David; Sunohara, Mark; Champagne, Catherine
2015-02-01
Leaf area index (LAI) and biomass are important indicators of crop development, and the availability of this information during the growing season can support farmer decision-making processes. This study demonstrates the applicability of RapidEye multi-spectral data for the estimation of LAI and biomass of two crop types (corn and soybean) with different canopy structure, leaf structure and photosynthetic pathways. The advantages of RapidEye in terms of increased temporal resolution (∼daily), high spatial resolution (∼5 m) and enhanced spectral information (including a red-edge band) are explored as an individual sensor and as part of a multi-sensor constellation. Seven vegetation indices based on combinations of reflectance in the green, red, red-edge and near-infrared bands were derived from RapidEye imagery between 2011 and 2013. LAI and biomass data were collected during the same period for calibration and validation of the relationships between the vegetation indices and LAI and dry above-ground biomass. Most indices showed sensitivity to LAI from emergence up to 8 m2/m2. The normalized difference vegetation index (NDVI), the red-edge NDVI and the green NDVI were insensitive to crop type and had coefficients of variation (CV) ranging between 19 and 27% and coefficients of determination ranging between 86 and 88%. The NDVI performed best for the estimation of dry leaf biomass (CV = 27% and r2 = 0.90) and was also insensitive to crop type. The red-edge indices did not show any significant improvement in LAI and biomass estimation over traditional multispectral indices. Cumulative vegetation indices showed strong performance for the estimation of total dry above-ground biomass, especially for corn (CV ≤ 20%). This study demonstrated that continuous crop LAI monitoring over time and space at the field level can be achieved using a combination of RapidEye, Landsat and SPOT data and sensor-dependent best-fit functions. This approach eliminates or reduces the need for reflectance resampling, inter-calibration of vegetation indices and spatial resampling.
Poder, Thomas G; Kouakou, Christian R C; Bouchard, Pierre-Alexandre; Tremblay, Véronique; Blais, Sébastien; Maltais, François; Lellouche, François
2018-01-01
Objective: Conduct a cost-effectiveness analysis of FreeO2 technology versus manual oxygen-titration technology for patients with chronic obstructive pulmonary disease (COPD) hospitalised for acute exacerbations. Setting: Tertiary acute care hospital in Quebec, Canada. Participants: 47 patients with COPD hospitalised for acute exacerbations. Intervention: An automated oxygen-titration and oxygen-weaning technology. Methods and outcomes: The costs for hospitalisation and follow-up for 180 days were calculated using a microcosting approach and included the cost of FreeO2 technology. Incremental cost-effectiveness ratios (ICERs) were calculated using bootstrap resampling with 5000 replications. The main effect variable was the percentage of time spent at the target oxygen saturation (SpO2). The other two effect variables were the time spent in hyperoxia (target SpO2+5%) and in severe hypoxaemia (SpO2 <85%). The resamplings were based on data from a randomised controlled trial with 47 patients with COPD hospitalised for acute exacerbations. Results: FreeO2 generated savings of 20.7% of the per-patient costs at 180 days (ie, −$C2959.71). This decrease is nevertheless not significant at the 95% threshold (P=0.13), but the effect variables all improved (P<0.001). The improvement in the time spent at the target SpO2 was 56.3%. The ICERs indicate that FreeO2 technology is more cost-effective than manual oxygen titration, with a savings of −$C96.91 per percentage point of time spent at the target SpO2 (95% CI −301.26 to 116.96). Conclusion: FreeO2 technology could significantly enhance the efficiency of the health system by reducing per-patient costs at 180 days. A study with a larger patient sample needs to be carried out to confirm these preliminary results. Trial registration number: NCT01393015; Post-results.
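A schematic of the bootstrap ICER computation with 5000 replications, resampling patients within each arm. The variable names and the percentile confidence interval are assumptions of this sketch, not the study's code.

```python
import numpy as np

def bootstrap_icer(cost_t, eff_t, cost_c, eff_c, n_boot=5000, seed=0):
    """95% CI for incremental cost per unit of incremental effect."""
    rng = np.random.default_rng(seed)
    icers = np.empty(n_boot)
    for b in range(n_boot):
        it = rng.integers(0, len(cost_t), len(cost_t))   # resample treated arm
        ic = rng.integers(0, len(cost_c), len(cost_c))   # resample control arm
        d_cost = cost_t[it].mean() - cost_c[ic].mean()
        d_eff = eff_t[it].mean() - eff_c[ic].mean()      # e.g. % time at target SpO2
        icers[b] = d_cost / d_eff                        # unstable if d_eff ~ 0
    return np.percentile(icers, [2.5, 97.5])
```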
NASA Astrophysics Data System (ADS)
Vincent, Sébastien; Lemercier, Blandine; Berthier, Lionel; Walter, Christian
2015-04-01
Accurate soil information over large extents is essential for managing agronomic and environmental issues. Where it exists, information on soil is often sparse or available at a coarser resolution than required. Typically, the spatial distribution of soil at the regional scale is represented as a set of polygons defining soil map units (SMUs), each one describing several soil types that are not spatially delineated, and a semantic database describing these objects. Delineation of soil types within SMUs, i.e. spatial disaggregation of SMUs, allows the accuracy of soil information to be improved using legacy data. The aim of this study was to predict soil types by spatial disaggregation of SMUs through a decision tree approach, considering expert knowledge on soil-landscape relationships embedded in soil databases. The DSMART (Disaggregation and Harmonisation of Soil Map Units Through resampled Classification Trees) algorithm developed by Odgers et al. (2014) was used. It requires soil information, environmental covariates, and calibration samples to build and then extrapolate decision trees. To assign a soil type to a particular spatial position, a weighted random allocation approach is applied: each soil type in the SMU is weighted according to its assumed proportion of occurrence in the SMU. Thus, soil-landscape relationships are not considered in the current version of DSMART. Expert rules on soil distribution considering the relief, parent material and wetland locations were proposed to drive the procedure of allocating soil types to sampled positions, in order to integrate the soil-landscape relationships. Semantic information about the spatial organization of soil types within SMUs and exhaustive landscape descriptors were used. In the eastern part of Brittany (NW France), 171 soil types were described; their relative areas in the SMUs were estimated, and the geomorphological and geological contexts were recorded. The model predicted 144 soil types. An external validation was performed by comparing predicted soil types with those observed on available soil maps at scales of 1:25,000 or 1:50,000. Overall accuracies were 63.1% and 36.2%, respectively considering or not considering the adjacent pixels. The introduction of expert rules based on soil-landscape relationships to allocate soil types to calibration samples dramatically enhanced the results in comparison with a simple weighted random allocation procedure. It also enabled the production of a comprehensive soil map, retrieving the expected spatial organization of soils. Estimation of soil properties for various depths is planned using the disaggregated soil types, according to the GlobalSoilMap.net specifications. Odgers, N.P., Sun, W., McBratney, A.B., Minasny, B., Clifford, D., 2014. Disaggregating and harmonising soil map units through resampled classification trees. Geoderma 214, 91-100.
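For reference, the weighted random allocation that DSMART uses (and that the expert rules replace) reduces to a single weighted draw per sampled position, as in this illustrative Python snippet; the soil types, proportions and sample count are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
soil_types = np.array(["A", "B", "C"])     # soil types described in one SMU
proportions = np.array([0.6, 0.3, 0.1])    # their assumed shares within the SMU
allocated = rng.choice(soil_types, size=100, p=proportions)
# `allocated` labels the 100 sampled positions used to grow one classification
# tree; repeating the draw yields the resampled trees of the ensemble.
```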
NASA Astrophysics Data System (ADS)
Sargent, Steven D.; Greenman, Mark E.; Hansen, Scott M.
1998-11-01
The Spatial Infrared Imaging Telescope (SPIRIT III) is the primary sensor aboard the Midcourse Space Experiment (MSX), which was launched 24 April 1996. SPIRIT III included a Fourier transform spectrometer that collected terrestrial and celestial background phenomenology data for the Ballistic Missile Defense Organization (BMDO). This spectrometer used a helium-neon reference laser to measure the optical path difference (OPD) in the spectrometer and to command the analog-to-digital conversion of the infrared detector signals, thereby ensuring the data were sampled at precise increments of OPD. Spectrometer data must be sampled at accurate increments of OPD to optimize the spectral resolution and spectral position of the transformed spectra. Unfortunately, a failure in the power supply preregulator at the MSX spacecraft/SPIRIT III interface early in the mission forced the spectrometer to be operated without the reference laser until a failure investigation was completed. During this time data were collected in a backup mode that used an electronic clock to sample the data. These data were sampled evenly in time, and because the scan velocity varied, at nonuniform increments of OPD. The scan velocity profile depended on scan direction and scan length, and varied over time, greatly degrading the spectral resolution and spectral and radiometric accuracy of the measurements. The Convert software used to process the SPIRIT III data was modified to resample the clock-sampled data at even increments of OPD, using scan velocity profiles determined from ground and on-orbit data, greatly improving the quality of the clock-sampled data. This paper presents the resampling algorithm, the characterization of the scan velocity profiles, and the results of applying the resampling algorithm to on-orbit data.
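A simplified Python version of the correction: integrate the characterized scan-velocity profile to map sample times to OPD, then interpolate the interferogram onto an even OPD grid. Function and variable names are illustrative; the flight software's actual algorithm is not reproduced here.

```python
import numpy as np

def resample_to_opd(signal, t, velocity, d_opd):
    """Map clock samples to OPD via the velocity profile, then regrid."""
    # Trapezoidal integration of v(t) gives OPD at each clock sample.
    opd = np.concatenate([[0.0],
                          np.cumsum(np.diff(t) * 0.5 * (velocity[1:] + velocity[:-1]))])
    grid = np.arange(0.0, opd[-1], d_opd)      # even increments of OPD
    return grid, np.interp(grid, opd, signal)  # resampled interferogram
```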
Declustering of clustered preferential sampling for histogram and semivariogram inference
Olea, R.A.
2007-01-01
Measurements of attributes obtained more as a consequence of business ventures than sampling design frequently result in samplings that are preferential both in location and value, typically in the form of clusters along the pay. Preferential sampling requires preprocessing for the purpose of properly inferring characteristics of the parent population, such as the cumulative distribution and the semivariogram. Consideration of the distance to the nearest neighbor allows preparation of resampled sets that produce comparable results to those from previously proposed methods. Clustered sampling of size 140, taken from an exhaustive sampling, is employed to illustrate this approach. © International Association for Mathematical Geology 2007.
Video auto stitching in multicamera surveillance system
NASA Astrophysics Data System (ADS)
He, Bin; Zhao, Gang; Liu, Qifang; Li, Yangyang
2012-01-01
This paper concerns the problem of automatically stitching video in a multi-camera surveillance system. Previous approaches have used multiple calibrated cameras for video mosaicking in large-scale monitoring applications. In this work, we formulate video stitching as a multi-image registration and blending problem in which not all cameras need to be calibrated, except for a few selected master cameras. SURF is used to find matched pairs of image key points from different cameras, and the camera pose is then estimated and refined. A homography matrix is employed to calculate overlapping pixels, and a boundary resampling algorithm is finally applied to blend the images. Simulation results demonstrate the efficiency of our method.
Hierarchical animal movement models for population-level inference
Hooten, Mevin B.; Buderman, Frances E.; Brost, Brian M.; Hanks, Ephraim M.; Ivans, Jacob S.
2016-01-01
New methods for modeling animal movement based on telemetry data are developed regularly. With advances in telemetry capabilities, animal movement models are becoming increasingly sophisticated. Despite a need for population-level inference, animal movement models are still predominantly developed for individual-level inference. Most efforts to upscale the inference to the population level are either post hoc or complicated enough that only the developer can implement the model. Hierarchical Bayesian models provide an ideal platform for the development of population-level animal movement models but can be challenging to fit due to computational limitations or extensive tuning required. We propose a two-stage procedure for fitting hierarchical animal movement models to telemetry data. The two-stage approach is statistically rigorous and allows one to fit individual-level movement models separately, then resample them using a secondary MCMC algorithm. The primary advantages of the two-stage approach are that the first stage is easily parallelizable and the second stage is completely unsupervised, allowing for an automated fitting procedure in many cases. We demonstrate the two-stage procedure with two applications of animal movement models. The first application involves a spatial point process approach to modeling telemetry data, and the second involves a more complicated continuous-time discrete-space animal movement model. We fit these models to simulated data and real telemetry data arising from a population of monitored Canada lynx in Colorado, USA.
Babamoradi, Hamid; van den Berg, Frans; Rinnan, Åsmund
2016-02-18
In Multivariate Statistical Process Control (MSPC), when a fault is expected or detected in the process, contribution plots are essential for operators and optimization engineers in identifying those process variables that were affected by or might be the cause of the fault. The traditional way of interpreting a contribution plot is to examine the largest contributing process variables as the most probable faulty ones. This might result in false readings purely due to differences in natural variation, measurement uncertainties, etc. It is more reasonable to compare variable contributions for new process runs with historical results achieved under Normal Operating Conditions, where confidence limits (CLs) for contribution plots estimated from training data are used to judge new production runs. Asymptotic methods cannot provide confidence limits for contribution plots, leaving re-sampling methods as the only option. We suggest bootstrap re-sampling to build confidence limits for all contribution plots in online PCA-based MSPC. The new strategy for estimating CLs is compared to the previously reported CLs for contribution plots. An industrial batch process dataset is used to illustrate the concepts. Copyright © 2016 Elsevier B.V. All rights reserved.
Table-driven image transformation engine algorithm
NASA Astrophysics Data System (ADS)
Shichman, Marc
1993-04-01
A high speed image transformation engine (ITE) was designed and a prototype built for use in a generic electronic light table and image perspective transformation application code. The ITE takes any linear transformation, breaks the transformation into two passes and resamples the image appropriately for each pass. The system performance is achieved by driving the engine with a set of look up tables computed at start up time for the calculation of pixel output contributions. Anti-aliasing is done automatically in the image resampling process. Operations such as multiplications and trigonometric functions are minimized. This algorithm can be used for texture mapping, image perspective transformation, electronic light table, and virtual reality.
On removing interpolation and resampling artifacts in rigid image registration.
Aganj, Iman; Yeo, Boon Thye Thomas; Sabuncu, Mert R; Fischl, Bruce
2013-02-01
We show that image registration using conventional interpolation and summation approximations of continuous integrals can generally fail because of resampling artifacts. These artifacts negatively affect the accuracy of registration by producing local optima, altering the gradient, shifting the global optimum, and making rigid registration asymmetric. In this paper, after an extensive literature review, we demonstrate the causes of the artifacts by comparing inclusion and avoidance of resampling analytically. We show the sum-of-squared-differences cost function formulated as an integral to be more accurate compared with its traditional sum form in a simple case of image registration. We then discuss aliasing that occurs in rotation, which is due to the fact that an image represented in the Cartesian grid is sampled with different rates in different directions, and propose the use of oscillatory isotropic interpolation kernels, which allow better recovery of true global optima by overcoming this type of aliasing. Through our experiments on brain, fingerprint, and white noise images, we illustrate the superior performance of the integral registration cost function in both the Cartesian and spherical coordinates, and also validate the introduced radial interpolation kernel by demonstrating the improvement in registration.
Generating Virtual Patients by Multivariate and Discrete Re-Sampling Techniques.
Teutonico, D; Musuamba, F; Maas, H J; Facius, A; Yang, S; Danhof, M; Della Pasqua, O
2015-10-01
Clinical Trial Simulations (CTS) are a valuable tool for decision-making during drug development. However, to obtain realistic simulation scenarios, the patients included in the CTS must be representative of the target population. This is particularly important when covariate effects exist that may affect the outcome of a trial. The objective of our investigation was to evaluate and compare CTS results using re-sampling from a population pool and multivariate distributions to simulate patient covariates. COPD was selected as the paradigm disease for the purposes of our analysis, FEV1 was used as the response measure, and the effects of a hypothetical intervention were evaluated in different populations in order to assess the predictive performance of the two methods. Our results show that the multivariate distribution method produces realistic covariate correlations, comparable to those of the real population. Moreover, it allows simulation of patient characteristics beyond the limits of the inclusion and exclusion criteria in historical protocols. Both methods, discrete re-sampling and multivariate distributions, generate realistic pools of virtual patients. However, the use of a multivariate distribution enables more flexible simulation scenarios, since it is not necessarily bound to the existing covariate combinations in the available clinical data sets.
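The two covariate-generation strategies can be contrasted in a few lines of Python. The historical pool below is synthetic, and the multivariate normal is one simple choice of multivariate distribution; means and covariances are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
pool = rng.multivariate_normal([65.0, 1.4], [[64.0, 1.2], [1.2, 0.09]], 300)
# columns: age (years), FEV1 (L) -- a synthetic stand-in for historical data

discrete = pool[rng.integers(0, len(pool), 1000)]      # discrete re-sampling

mu, cov = pool.mean(axis=0), np.cov(pool, rowvar=False)
parametric = rng.multivariate_normal(mu, cov, 1000)    # multivariate distribution
# `parametric` keeps the age-FEV1 correlation but is not restricted to
# covariate combinations that actually occurred in the historical pool.
```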
Using informative priors in facies inversion: The case of C-ISR method
NASA Astrophysics Data System (ADS)
Valakas, G.; Modis, K.
2016-08-01
Inverse problems involving the characterization of hydraulic properties of groundwater flow systems by conditioning on observations of the state variables are mathematically ill-posed because they have multiple solutions and are sensitive to small changes in the data. In the framework of MCMC methods for nonlinear optimization, and under an iterative spatial resampling transition kernel, we present an algorithm for narrowing the prior and thus producing improved proposal realizations. To achieve this goal, we cosimulate the facies distribution conditionally to facies observations and normal-scores-transformed hydrologic response measurements, assuming a linear coregionalization model. The approach works by creating an importance sampling effect that steers the process to selected areas of the prior. The effectiveness of our approach is demonstrated by an example application on a synthetic underdetermined inverse problem in aquifer characterization.
Superresolution with the focused plenoptic camera
NASA Astrophysics Data System (ADS)
Georgiev, Todor; Chunev, Georgi; Lumsdaine, Andrew
2011-03-01
Digital images from a CCD or CMOS sensor with a color filter array must undergo a demosaicing process to combine the separate color samples into a single color image. This interpolation process can interfere with the subsequent superresolution process. Plenoptic superresolution, which relies on precise sub-pixel sampling across captured microimages, is particularly sensitive to such resampling of the raw data. In this paper we present an approach for superresolving plenoptic images that takes place at the time of demosaicing the raw color image data. Our approach exploits the interleaving provided by typical color filter arrays (e.g., Bayer filter) to further refine plenoptic sub-pixel sampling. Our rendering algorithm treats the color channels in a plenoptic image separately, which improves final superresolution by a factor of two. With appropriate plenoptic capture we show the theoretical possibility for rendering final images at full sensor resolution.
Paquet, Victor; Joseph, Caroline; D'Souza, Clive
2012-01-01
Anthropometric studies typically require a large number of individuals that are selected in a manner so that demographic characteristics that impact body size and function are proportionally representative of a user population. This sampling approach does not allow for an efficient characterization of the distribution of body sizes and functions of sub-groups within a population and the demographic characteristics of user populations can often change with time, limiting the application of the anthropometric data in design. The objective of this study is to demonstrate how demographically representative user populations can be developed from samples that are not proportionally representative in order to improve the application of anthropometric data in design. An engineering anthropometry problem of door width and clear floor space width is used to illustrate the value of the approach.
Wavelet-based characterization of gait signal for neurological abnormalities.
Baratin, E; Sugavaneswaran, L; Umapathy, K; Ioana, C; Krishnan, S
2015-02-01
Studies conducted by the World Health Organization (WHO) indicate that over one billion people suffer from neurological disorders worldwide, and the lack of efficient diagnosis procedures affects their therapeutic interventions. Characterizing certain pathologies of motor control to facilitate their diagnosis can be useful for quantitatively monitoring disease progression and for efficient treatment planning. To this end, we introduce a wavelet-based scheme for the effective characterization of gait associated with certain neurological disorders. In addition, since the data were recorded from a dynamic process, this work also investigates the need for gait signal re-sampling prior to the identification of signal markers in the presence of pathologies. To benefit automated discrimination of gait data, certain characteristic features are extracted from the wavelet-transformed signals. The performance of the proposed approach was evaluated using a database consisting of 15 Parkinson's disease (PD), 20 Huntington's disease (HD), 13 amyotrophic lateral sclerosis (ALS) and 16 healthy control subjects, and an average classification accuracy of 85% is achieved using an unbiased cross-validation strategy. The obtained results demonstrate the potential of the proposed methodology for computer-aided diagnosis and automatic characterization of certain neurological disorders. Copyright © 2015 Elsevier B.V. All rights reserved.
Tourscape: A systematic approach towards a sustainable rural tourism management
NASA Astrophysics Data System (ADS)
Lo, M. C.; Wang, Y. C.; Songan, P.; Yeo, A. W.
2014-02-01
Tourism plays an important role in the Malaysian economy, as it is considered to be one of the cornerstones of the country's economy. The purpose of this research is to conduct an analysis of the existing tourism industry in rural tourism destinations in Malaysia by examining the impact of economic, environmental, social and cultural factors of the tourism industry on local communities in Malaysia. 516 respondents comprising tourism stakeholders from 34 rural tourism sites in Malaysia took part voluntarily in this study. To assess the developed model, SmartPLS 2.0 (M3) was applied for path modeling, and bootstrapping with 200 re-samples was then applied to generate the standard errors of the estimates and t-values. Subsequently, a system named Tourscape was designed to manage the information. This system can be considered a benchmark for tourism industry stakeholders, as it is able to display the current situational analysis and the tourism health of selected tourism destination sites by capturing data and information not only from local communities but from industry players and tourists as well. The findings from this study reveal that the cooperation of various stakeholders has had a significant impact on the development of rural tourism.
NASA Astrophysics Data System (ADS)
Liu, Bilan; Qiu, Xing; Zhu, Tong; Tian, Wei; Hu, Rui; Ekholm, Sven; Schifitto, Giovanni; Zhong, Jianhui
2016-03-01
Subject-specific longitudinal DTI studies are vital for the investigation of pathological changes in lesions and disease evolution. Spatial Regression Analysis of Diffusion tensor imaging (SPREAD) is a non-parametric, permutation-based statistical framework that combines spatial regression and resampling techniques to achieve effective detection of localized longitudinal diffusion changes within the whole brain at the individual level without a priori hypotheses. However, boundary blurring and dislocation limit its sensitivity, especially towards detecting lesions of irregular shapes. In the present study, we propose an improved SPREAD method (iSPREAD) by incorporating a three-dimensional (3D) nonlinear anisotropic diffusion filtering method, which provides edge-preserving image smoothing through a nonlinear scale-space approach. The statistical inference based on iSPREAD was evaluated and compared with the original SPREAD method using both simulated and in vivo human brain data. Results demonstrate that the sensitivity and accuracy of the SPREAD method are improved substantially by incorporating nonlinear anisotropic filtering. iSPREAD identifies subject-specific longitudinal changes in the brain with improved sensitivity, accuracy, and enhanced statistical power, especially when the spatial correlation is heterogeneous among neighboring image pixels in DTI.
Confidence Intervals from Realizations of Simulated Nuclear Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Younes, W.; Ratkiewicz, A.; Ressler, J. J.
2017-09-28
Various statistical techniques are discussed that can be used to assign a level of confidence in the prediction of models that depend on input data with known uncertainties and correlations. The particular techniques reviewed in this paper are: 1) random realizations of the input data using Monte-Carlo methods, 2) the construction of confidence intervals to assess the reliability of model predictions, and 3) resampling techniques to impose statistical constraints on the input data based on additional information. These techniques are illustrated with a calculation of the k_eff value, based on the ²³⁵U(n,f) and ²³⁹Pu(n,f) cross sections.
Travelogue--a newcomer encounters statistics and the computer.
Bruce, Peter
2011-11-01
Computer-intensive methods have revolutionized statistics, giving rise to new areas of analysis and expertise in predictive analytics, image processing, pattern recognition, machine learning, genomic analysis, and more. Interest naturally centers on the new capabilities the computer allows the analyst to bring to the table. This article, instead, focuses on the account of how computer-based resampling methods, with their relative simplicity and transparency, enticed one individual, untutored in statistics or mathematics, on a long journey into learning statistics, then teaching it, then starting an education institution.
Incorporating advanced language models into the P300 speller using particle filtering
NASA Astrophysics Data System (ADS)
Speier, W.; Arnold, C. W.; Deshpande, A.; Knall, J.; Pouratian, N.
2015-08-01
Objective. The P300 speller is a common brain-computer interface (BCI) application designed to communicate language by detecting event related potentials in a subject’s electroencephalogram signal. Information about the structure of natural language can be valuable for BCI communication, but attempts to use this information have thus far been limited to rudimentary n-gram models. While more sophisticated language models are prevalent in natural language processing literature, current BCI analysis methods based on dynamic programming cannot handle their complexity. Approach. Sampling methods can overcome this complexity by estimating the posterior distribution without searching the entire state space of the model. In this study, we implement sequential importance resampling, a commonly used particle filtering (PF) algorithm, to integrate a probabilistic automaton language model. Main result. This method was first evaluated offline on a dataset of 15 healthy subjects, which showed significant increases in speed and accuracy when compared to standard classification methods as well as a recently published approach using a hidden Markov model (HMM). An online pilot study verified these results as the average speed and accuracy achieved using the PF method was significantly higher than that using the HMM method. Significance. These findings strongly support the integration of domain-specific knowledge into BCI classification to improve system performance.
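A generic sequential importance resampling step of the kind described, in Python. Here `transition` (the probabilistic automaton language model) and `likelihood` (the P300 classifier score) are assumed callables, and resampling is triggered by the usual effective-sample-size test rather than the study's exact schedule.

```python
import numpy as np

def sir_step(particles, weights, transition, likelihood, rng):
    """One sequential-importance-resampling update."""
    particles = transition(particles, rng)     # propagate via the language model
    weights = weights * likelihood(particles)  # reweight by the EEG evidence
    weights /= weights.sum()
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(particles):   # ESS too low?
        idx = rng.choice(len(particles), len(particles), p=weights)
        particles = particles[idx]             # importance resampling
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```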
A Dynamic Ensemble Framework for Mining Textual Streams with Class Imbalance
Song, Ge
2014-01-01
Textual stream classification has become a realistic and challenging issue, since large-scale, high-dimensional, and non-stationary streams with class imbalance are widely used in various real-life applications. Given the characteristics of textual streams, their classification is technically difficult, especially in an imbalanced environment. In this paper, we propose a new ensemble framework, clustering forest, for learning from textual imbalanced streams with concept drift (CFIM). The CFIM is based on ensemble learning and integrates a set of clustering trees (CTs). An adaptive selection method, which flexibly chooses the useful CTs according to the properties of the stream, is presented in CFIM. In particular, to deal with the problem of class imbalance, we collect and reuse both rare-class instances and misclassified instances from the historical chunks. Compared to most existing approaches, it is worth pointing out that our approach assumes that both the majority class and the rare class may suffer from concept drift. Thus the distribution of resampled instances is similar to the current concept. The effectiveness of CFIM is examined on five real-world textual streams under an imbalanced non-stationary environment. Experimental results demonstrate that CFIM achieves better performance than four state-of-the-art ensemble models.
Li, Songfeng; Wei, Jun; Chan, Heang-Ping; Helvie, Mark A; Roubidoux, Marilyn A; Lu, Yao; Zhou, Chuan; Hadjiiski, Lubomir M; Samala, Ravi K
2018-01-09
Breast density is one of the most significant factors associated with cancer risk. In this study, our purpose was to develop a supervised deep learning approach for automated estimation of percentage density (PD) on digital mammograms (DMs). The input 'for processing' DMs were first log-transformed, enhanced by a multi-resolution preprocessing scheme, and subsampled to a pixel size of 800 µm × 800 µm from 100 µm × 100 µm. A deep convolutional neural network (DCNN) was trained to estimate a probability map of breast density (PMD) using a domain adaptation resampling method. The PD was estimated as the ratio of the dense area to the breast area based on the PMD. The DCNN approach was compared to a feature-based statistical learning approach. Gray-level, texture and morphological features were extracted, and a least absolute shrinkage and selection operator was used to combine the features into a feature-based PMD. With the approval of the Institutional Review Board, we retrospectively collected a training set of 478 DMs and an independent test set of 183 DMs from patient files in our institution. Two experienced Mammography Quality Standards Act radiologists interactively segmented PD as the reference standard. Ten-fold cross-validation was used for model selection and evaluation with the training set. With cross-validation, DCNN obtained a Dice's coefficient (DC) of 0.79 ± 0.13 and Pearson's correlation (r) of 0.97, whereas feature-based learning obtained DC = 0.72 ± 0.18 and r = 0.85. For the independent test set, DCNN achieved DC = 0.76 ± 0.09 and r = 0.94, while feature-based learning achieved DC = 0.62 ± 0.21 and r = 0.75. Our DCNN approach was significantly better and more robust than the feature-based learning approach for automated PD estimation on DMs, demonstrating its potential use for automated density reporting as well as for model-based risk prediction.
Kolmogorov-Smirnov test for spatially correlated data
Olea, R.A.; Pawlowsky-Glahn, V.
2009-01-01
The Kolmogorov-Smirnov test is a convenient method for investigating whether two underlying univariate probability distributions can be regarded as indistinguishable from each other, or whether an underlying probability distribution differs from a hypothesized distribution. Application of the test requires that the sample be unbiased and the outcomes be independent and identically distributed, conditions that are violated to varying degrees by spatially continuous attributes, such as topographical elevation. A generalized form of the bootstrap method is used here for the purpose of modeling the distribution of the statistic D of the Kolmogorov-Smirnov test. The innovation is in the resampling, which in the traditional formulation of bootstrap is done by drawing from the empirical sample with replacement, presuming independence. The generalization consists of preparing resamplings with the same spatial correlation as the empirical sample. This is accomplished by reading the values of unconditional stochastic realizations at the sampling locations, realizations that are generated by simulated annealing. The new approach was tested on two empirical samples taken from an exhaustive sample closely following a lognormal distribution. One sample was a regular, unbiased sample while the other one was a clustered, preferential sample that had to be preprocessed. Our results show that the p-value for the spatially correlated case is always larger than the p-value of the statistic in the absence of spatial correlation, which is in agreement with the fact that the information content of an uncorrelated sample is larger than that of a spatially correlated sample of the same size. © Springer-Verlag 2008.
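For orientation, a minimal sketch of the baseline that this work generalizes: the classic iid bootstrap null distribution of the two-sample D statistic, on made-up lognormal data. The paper's actual innovation, drawing resamples with the same spatial correlation as the data from simulated-annealing realizations, is not reproduced here.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

def bootstrap_ks_pvalue(x, y, n_boot=2000):
    """Null distribution of the two-sample KS statistic D by classic
    iid bootstrap: draw with replacement from the pooled sample,
    presuming independence. The paper's generalization would instead
    draw resamples that reproduce the spatial correlation of the data."""
    d_obs = ks_2samp(x, y).statistic
    pooled = np.concatenate([x, y])
    d_null = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.choice(pooled, size=len(x), replace=True)
        yb = rng.choice(pooled, size=len(y), replace=True)
        d_null[b] = ks_2samp(xb, yb).statistic
    return d_obs, float(np.mean(d_null >= d_obs))

x = rng.lognormal(0.0, 0.5, 200)   # stand-ins for elevation samples
y = rng.lognormal(0.1, 0.5, 150)
print(bootstrap_ks_pvalue(x, y))
```

Ignoring spatial correlation in this baseline understates the p-value, which is exactly the effect the paper quantifies.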
NASA Astrophysics Data System (ADS)
Laloy, Eric; Linde, Niklas; Jacques, Diederik; Mariethoz, Grégoire
2016-04-01
The sequential geostatistical resampling (SGR) algorithm is a Markov chain Monte Carlo (MCMC) scheme for sampling from possibly non-Gaussian, complex spatially distributed prior models such as geologic facies or categorical fields. In this work, we highlight the limits of standard SGR for posterior inference of high-dimensional categorical fields with realistically complex likelihood landscapes and benchmark a parallel tempering implementation (PT-SGR). Our proposed PT-SGR approach is demonstrated using synthetic (error corrupted) data from steady-state flow and transport experiments in categorical 7575- and 10,000-dimensional 2D conductivity fields. In both case studies, every SGR trial gets trapped in local optima while PT-SGR maintains a higher diversity in the sampled model states. The advantage of PT-SGR is most apparent in an inverse transport problem where the posterior distribution is made bimodal by construction. PT-SGR then converges towards the appropriate data misfit much faster than SGR and partly recovers the two modes. In contrast, for the same computational resources, SGR does not fit the data to the appropriate error level and at best produces a locally optimal solution that looks visually similar to one of the two reference modes. Although PT-SGR clearly surpasses SGR in performance, our results also indicate that using a small number (16-24) of temperatures (and thus parallel cores) may not permit complete sampling of the posterior distribution by PT-SGR within a reasonable computational time (less than 1-2 weeks).
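To illustrate the tempering mechanism itself, a minimal parallel-tempering sketch on a deliberately bimodal continuous toy target; the categorical SGR proposal mechanism, the flow-and-transport likelihood, and the 16-24 temperature ladders of the paper are not reproduced, and the ladder and proposal scale below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_post(x):
    # Bimodal toy posterior with modes at -4 and +4.
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

temps = np.array([1.0, 2.0, 4.0, 8.0])   # temperature ladder
x = rng.normal(size=temps.size)          # one chain per temperature
keep = []

for it in range(20000):
    # Metropolis update within each tempered chain (target^(1/T)).
    prop = x + rng.normal(scale=1.0, size=x.size)
    accept = np.log(rng.random(x.size)) < (log_post(prop) - log_post(x)) / temps
    x = np.where(accept, prop, x)
    # Propose a swap between a random adjacent temperature pair.
    i = rng.integers(temps.size - 1)
    log_swap = (1 / temps[i] - 1 / temps[i + 1]) * (log_post(x[i + 1]) - log_post(x[i]))
    if np.log(rng.random()) < log_swap:
        x[i], x[i + 1] = x[i + 1], x[i]
    keep.append(x[0])                    # samples from the T=1 chain

keep = np.array(keep[2000:])
print("mass in each mode:", np.mean(keep > 0), np.mean(keep < 0))
```

A single chain at T=1 would typically stay in one mode; the hot chains cross the valley between modes and feed both down the ladder, which is the behavior the paper exploits for its bimodal transport problem.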
Dotsinsky, Ivan
2005-01-01
Background Public access defibrillators (PADs) are now available for more efficient and rapid treatment of out-of-hospital sudden cardiac arrest. PADs are normally used by untrained people on the streets and in sports centers, airports, and other public areas. Therefore, automated detection of ventricular fibrillation, or its exclusion, is of high importance. A special case exists at railway stations, where electric power-line frequency interference is significant. Many countries, especially in Europe, use 16.7 Hz AC power, which introduces high-level frequency-varying interference that may compromise fibrillation detection. Method Moving signal averaging is often used for 50/60 Hz interference suppression if its effect on the ECG spectrum has little importance (no morphological analysis is performed). This approach may also be applied to the railway situation, if the interference frequency is continuously detected so as to synchronize the analog-to-digital conversion (ADC) by introducing variable inter-sample intervals. A better solution consists of rated ADC, software frequency measuring, internal irregular re-sampling according to the interference frequency, and a moving average over a constant sample number, followed by regular back re-sampling. Results The proposed method leads to total railway interference cancellation, together with suppression of inherent noise, while the peak amplitudes of some sharp complexes are reduced. This reduction has a negligible effect on accurate fibrillation detection. Conclusion The method is developed in the MATLAB environment and represents a useful tool for real-time railway interference suppression. PMID:16309558
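A numpy sketch of the re-sampling chain described above, on synthetic waveforms (the paper's implementation is in MATLAB; here the drifting interference frequency is taken as known rather than measured in software, and all rates and amplitudes are invented):

```python
import numpy as np

fs = 500.0                      # assumed ADC rate, Hz
n_per = 30                      # samples kept per interference period
t = np.arange(0, 10, 1 / fs)
ecg = 0.1 * np.sin(2 * np.pi * 1.2 * t)                # toy "ECG"
f_int = 16.7 + 0.3 * np.sin(2 * np.pi * 0.05 * t)      # drifting railway frequency
signal = ecg + 0.5 * np.sin(2 * np.pi * np.cumsum(f_int) / fs)

# Irregular re-sampling: build a time grid holding exactly n_per
# samples per instantaneous interference period.
cycles = np.cumsum(f_int) / fs                          # cycles elapsed
cycle_grid = np.arange(cycles[0], cycles[-1], 1 / n_per)
t_irr = np.interp(cycle_grid, cycles, t)
x_irr = np.interp(t_irr, t, signal)

# A moving average over one full period cancels the interference.
x_filt = np.convolve(x_irr, np.ones(n_per) / n_per, mode="same")

# Regular back re-sampling to the original constant rate.
clean = np.interp(t, t_irr, x_filt)
print("residual rms:", float(np.sqrt(np.mean((clean - ecg) ** 2))))
```

Averaging over exactly one period nulls the fundamental of the interference regardless of its drifting frequency, which is why the irregular grid is recomputed from the measured frequency.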
Radiomic modeling of BI-RADS density categories
NASA Astrophysics Data System (ADS)
Wei, Jun; Chan, Heang-Ping; Helvie, Mark A.; Roubidoux, Marilyn A.; Zhou, Chuan; Hadjiiski, Lubomir
2017-03-01
Screening mammography is the most effective and low-cost method to date for early cancer detection. Mammographic breast density has been shown to be highly correlated with breast cancer risk. We are developing a radiomic model for BI-RADS density categorization on full-field digital mammography (FFDM) with a supervised machine learning approach. With IRB approval, we retrospectively collected 478 FFDMs from 478 women. As a gold standard, breast density was assessed by an MQSA radiologist based on BI-RADS categories. The raw FFDMs were used for computerized density assessment. Each raw FFDM first underwent a log-transform to approximate the x-ray sensitometric response, followed by multiscale processing to enhance the fibroglandular densities and parenchymal patterns. Three ROIs were automatically identified based on the keypoint distribution, where the keypoints were obtained as the extrema in the image Gaussian scale-space. A total of 73 features, including intensity and texture features that describe the density and the parenchymal pattern, were extracted from each breast. Our BI-RADS density estimator was constructed by using a random forest classifier. We used a 10-fold cross-validation resampling approach to estimate the errors. With the random forest classifier, the computerized density categories for 412 of the 478 cases agreed with the radiologist's assessment (weighted kappa = 0.93). The machine learning method with radiomic features as predictors demonstrated a high accuracy in classifying FFDMs into BI-RADS density categories. Further work is underway to improve our system performance as well as to perform independent testing on a large unseen FFDM set.
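A sketch of the classification-and-agreement step with scikit-learn, using random stand-in feature vectors (the real 73 features come from the log-transform/multiscale/keypoint pipeline described above, and the kappa weighting scheme is an assumption, since the abstract does not specify it):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
X = rng.normal(size=(478, 73))        # placeholder intensity/texture features
y = rng.integers(0, 4, size=478)      # placeholder BI-RADS categories a-d

clf = RandomForestClassifier(n_estimators=500, random_state=0)
y_hat = cross_val_predict(clf, X, y, cv=10)   # 10-fold cross-validation

agreement = int(np.sum(y_hat == y))
kappa = cohen_kappa_score(y, y_hat, weights="quadratic")  # weighting assumed
print(f"agreed on {agreement}/478 cases, weighted kappa = {kappa:.2f}")
```

With random placeholders the kappa is near zero, of course; the point is the evaluation pattern (out-of-fold predictions scored against the radiologist reference).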
Methods of Soil Resampling to Monitor Changes in the Chemical Concentrations of Forest Soils.
Lawrence, Gregory B; Fernandez, Ivan J; Hazlett, Paul W; Bailey, Scott W; Ross, Donald S; Villars, Thomas R; Quintana, Angelica; Ouimet, Rock; McHale, Michael R; Johnson, Chris E; Briggs, Russell D; Colter, Robert A; Siemion, Jason; Bartlett, Olivia L; Vargas, Olga; Antidormi, Michael R; Koppers, Mary M
2016-11-25
Recent soils research has shown that important chemical soil characteristics can change in less than a decade, often as the result of broad environmental changes. Repeated sampling to monitor these changes in forest soils is a relatively new practice that is not well documented in the literature and has only recently been broadly embraced by the scientific community. The objective of this protocol is therefore to synthesize the latest information on methods of soil resampling in a format that can be used to design and implement a soil monitoring program. Successful monitoring of forest soils requires that a study unit be defined within an area of forested land that can be characterized with replicate sampling locations. A resampling interval of 5 years is recommended, but if monitoring is done to evaluate a specific environmental driver, the rate of change expected in that driver should be taken into consideration. Here, we show that sampling of the profile can be done by horizon where boundaries can be clearly identified and horizons are sufficiently thick to remove soil without contamination from horizons above or below. Otherwise, sampling can be done by depth interval. Archiving of samples for future reanalysis is a key step in avoiding analytical bias and providing the opportunity for additional analyses as new questions arise. PMID:27911419
Kent, Robert; Landon, Matthew K.
2016-01-01
From 2004 to 2011, the U.S. Geological Survey collected samples from 1686 wells across the State of California as part of the California State Water Resources Control Board's Groundwater Ambient Monitoring and Assessment (GAMA) Priority Basin Project (PBP). From 2007 to 2013, 224 of these wells were resampled to assess temporal trends in water quality. The samples were analyzed for 216 water-quality constituents, including inorganic and organic compounds as well as isotopic tracers. The resampled wells were grouped into five hydrogeologic zones. A nonparametric hypothesis test was used to test the differences between initial sampling and resampling results to evaluate possible step trends in water quality, statewide and within each hydrogeologic zone. The hypothesis tests were performed on the 79 constituents that were detected in more than 5% of the samples collected during either sampling period in at least one hydrogeologic zone. Step trends were detected for 17 constituents. Increasing trends were detected for alkalinity, aluminum, beryllium, boron, lithium, orthophosphate, perchlorate, sodium, and specific conductance. Decreasing trends were detected for atrazine, cobalt, dissolved oxygen, lead, nickel, pH, simazine, and tritium. Tritium was expected to decrease due to decreasing values in precipitation, and the detection of this decrease indicates that the method is capable of resolving temporal trends.
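The abstract does not name the nonparametric test; as one plausible choice, a matched-pairs Wilcoxon signed-rank test for a step trend between the two sampling rounds, on invented concentration data:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
initial = rng.lognormal(1.0, 0.4, 224)                 # first-round results
repeat = initial * rng.lognormal(0.05, 0.10, 224)      # slight upward step

stat, p = wilcoxon(initial, repeat)                    # paired nonparametric test
if p < 0.05:
    direction = "increasing" if np.median(repeat - initial) > 0 else "decreasing"
    print(f"step trend detected ({direction}), p = {p:.4f}")
else:
    print(f"no step trend, p = {p:.4f}")
```

In the study this would be run per constituent and per hydrogeologic zone, restricted to constituents detected in more than 5% of samples.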
Robust and efficient method for matching features in omnidirectional images
NASA Astrophysics Data System (ADS)
Zhu, Qinyi; Zhang, Zhijiang; Zeng, Dan
2018-04-01
Binary descriptors have been widely used in many real-time applications due to their efficiency. These descriptors are commonly designed for perspective images but perform poorly on omnidirectional images, which are severely distorted. To address this issue, this paper proposes tangent plane BRIEF (TPBRIEF) and adapted log polar grid-based motion statistics (ALPGMS). TPBRIEF projects keypoints to a unit sphere and applies the fixed test set of the BRIEF descriptor on the tangent plane of the unit sphere. The fixed test set is then backprojected onto the original distorted images to construct the distortion-invariant descriptor. TPBRIEF directly enables keypoint detection and feature description on the original distorted images, whereas other approaches correct the distortion through image resampling, which introduces artifacts and adds time cost. With ALPGMS, omnidirectional images are divided into circular arches named adapted log polar grids. Whether a match is true or false is then determined by simply thresholding the match numbers in the grid pair where the two matched points are located. Experiments show that TPBRIEF greatly improves the feature matching accuracy and ALPGMS robustly removes wrong matches. Our proposed method outperforms the state-of-the-art methods.
A Robust Kalman Framework with Resampling and Optimal Smoothing
Kautz, Thomas; Eskofier, Bjoern M.
2015-01-01
The Kalman filter (KF) is an extremely powerful and versatile tool for signal processing that has been applied extensively in various fields. We introduce a novel Kalman-based analysis procedure that encompasses robustness towards outliers, Kalman smoothing and real-time conversion from non-uniformly sampled inputs to a constant output rate. These features have mostly been treated independently, so that not all of their benefits could be exploited at the same time. Here, we present a coherent analysis procedure that combines the aforementioned features and their benefits. To facilitate utilization of the proposed methodology and to ensure optimal performance, we also introduce a procedure to calculate all necessary parameters. Thereby, we substantially expand the versatility of one of the most widely used filtering approaches, taking full advantage of its most prevalent extensions. The applicability and superior performance of the proposed methods are demonstrated using simulated and real data. The possible areas of application for the presented analysis procedure range from movement analysis and medical imaging to brain-computer interfaces, robot navigation and meteorological studies. PMID:25734647
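A minimal sketch of one ingredient, the conversion from non-uniformly sampled inputs to a constant output rate with a constant-velocity Kalman filter; the outlier robustness and optimal smoothing of the paper, and its parameter-calculation procedure, are omitted, and all noise levels below are arbitrary:

```python
import numpy as np

def kf_resample(t_meas, z, t_out, q=1.0, r=0.25):
    """Constant-velocity KF over a merged event stream: predict to each
    event time; update on measurements, emit the state on output ticks."""
    events = sorted([(t, i, True) for i, t in enumerate(t_meas)] +
                    [(t, i, False) for i, t in enumerate(t_out)])
    x, P = np.array([z[0], 0.0]), np.eye(2) * 10.0
    H = np.array([[1.0, 0.0]])
    t_prev, out = events[0][0], []
    for t, i, is_meas in events:
        dt, t_prev = t - t_prev, t
        F = np.array([[1.0, dt], [0.0, 1.0]])
        Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
        x, P = F @ x, F @ P @ F.T + Q                 # predict to time t
        if is_meas:                                    # measurement update
            S = (H @ P @ H.T).item() + r
            K = (P @ H.T) / S
            x = x + (K * (z[i] - (H @ x).item())).ravel()
            P = (np.eye(2) - K @ H) @ P
        else:
            out.append(x[0])                           # constant-rate output
    return np.array(out)

rng = np.random.default_rng(0)
t_m = np.sort(rng.uniform(0, 10, 80))                  # jittered sample times
vals = np.sin(t_m) + 0.1 * rng.normal(size=80)
print(np.round(kf_resample(t_m, vals, np.linspace(0, 10, 21)), 2))
```

Because the transition matrix and process noise are rebuilt from each actual inter-event interval dt, irregular inputs and a regular output grid coexist in a single pass.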
Particle merging algorithm for PIC codes
NASA Astrophysics Data System (ADS)
Vranic, M.; Grismayer, T.; Martins, J. L.; Fonseca, R. A.; Silva, L. O.
2015-06-01
Particle-in-cell merging algorithms aim to dynamically resample the six-dimensional phase space occupied by particles without substantially distorting the physical description of the system. Whereas various approaches have been proposed in previous works, none of them seemed able to fully conserve charge, momentum, energy and their associated distributions. We describe here an alternative algorithm based on the coalescence of N massive or massless particles, considered to be close enough in phase space, into two new macro-particles. The local conservation of charge, momentum and energy is ensured by the resolution of a system of scalar equations. Various simulation comparisons have been carried out with and without the merging algorithm, from classical plasma physics problems to extreme scenarios where quantum electrodynamics is taken into account, showing, in addition to the conservation of local quantities, good reproducibility of the particle distributions. In cases where the number of particles would otherwise increase exponentially in the simulation box, dynamical merging permits a considerable speedup and significant memory savings, without which the simulations would be impossible to perform.
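A much-simplified sketch of the conservation idea: merging N weighted particles into an equal-weight pair, non-relativistic and 1D, so that charge (weight), momentum and kinetic energy are conserved exactly. The published algorithm works in full momentum space and also handles massless particles; neither is attempted here.

```python
import numpy as np

def merge_1d(w, p, m=1.0):
    """Coalesce N weighted particles into two macro-particles with
    the same total weight, momentum and kinetic energy (1D, m != 0)."""
    W = w.sum()                        # total weight ~ charge
    P = np.dot(w, p)                   # total momentum
    E = np.dot(w, p**2) / (2 * m)      # total kinetic energy
    p_bar = P / W
    # Two equal-weight particles at p_bar +/- delta; delta is fixed by
    # the energy constraint (non-negative by the Cauchy-Schwarz bound).
    delta = np.sqrt(max(2 * m * E / W - p_bar**2, 0.0))
    return np.array([W / 2, W / 2]), np.array([p_bar + delta, p_bar - delta])

rng = np.random.default_rng(0)
w, p = rng.uniform(0.5, 1.5, 16), rng.normal(0.0, 1.0, 16)
w2, p2 = merge_1d(w, p)
print(np.isclose(w2.sum(), w.sum()),                       # charge
      np.isclose(np.dot(w2, p2), np.dot(w, p)),            # momentum
      np.isclose(np.dot(w2, p2**2), np.dot(w, p**2)))      # energy
```

Merging into two particles rather than one is what makes the energy constraint satisfiable: a single macro-particle carrying the total momentum would in general have the wrong kinetic energy.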
Cooperative Autonomous Observation of Volcanic Environments with sUAS
NASA Astrophysics Data System (ADS)
Ravela, S.
2015-12-01
The Cooperative Autonomous Observing System Project (CAOS) at the MIT Earth Signals and Systems Group has developed methodology and systems for dynamically mapping coherent fluids such as plumes using small unmanned aircraft systems (sUAS). In the CAOS approach, two classes of sUAS, one remote and the other in-situ, implement a dynamic data-driven mapping system by closing the loop between Modeling, Estimation, Sampling, Planning and Control (MESPAC). The continually gathered measurements are assimilated to produce maps/analyses which also guide the sUAS network to adaptively resample the environment. Rather than scanning the volume in fixed Eulerian or Lagrangian flight plans, the adaptive nature of the sampling process enables objectives for efficiency and resilience to be incorporated. Modeling includes real-time prediction using two types of reduced models, one based on nowcasting remote observations of plume tracer using scale-cascaded alignment, and another based on dynamically deformable EOF/POD developed for coherent structures. Ensemble-based, information-theoretic machine learning approaches are used for the highly non-linear/non-Gaussian state/parameter estimation, and for planning. Control of the sUAS is based on model reference control coupled with hierarchical PID. MESPAC is implemented in part on a SkyCandy platform, and implements an airborne mesh that provides instantaneous situational awareness and redundant communication to an operating fleet. SkyCandy is deployed on Itzamna Aero's I9X/W UAS with low-cost sensors, and is currently being used to study the Popocatepetl volcano. Results suggest that operational communities can deploy low-cost sUAS for systematic monitoring while optimizing for efficiency and maximizing resilience. The CAOS methodology is applicable to many other environments where coherent structures are present in the background. More information can be found at caos.mit.edu.
Robust estimation of the proportion of treatment effect explained by surrogate marker information.
Parast, Layla; McDermott, Mary M; Tian, Lu
2016-05-10
In randomized treatment studies where the primary outcome requires long follow-up of patients and/or expensive or invasive obtainment procedures, the availability of a surrogate marker that could be used to estimate the treatment effect and could potentially be observed earlier than the primary outcome would allow researchers to make conclusions regarding the treatment effect with less required follow-up time and resources. The Prentice criterion for a valid surrogate marker requires that a test for treatment effect on the surrogate marker also be a valid test for treatment effect on the primary outcome of interest. Based on this criterion, methods have been developed to define and estimate the proportion of treatment effect on the primary outcome that is explained by the treatment effect on the surrogate marker. These methods aim to identify useful statistical surrogates that capture a large proportion of the treatment effect. However, current methods to estimate this proportion usually require restrictive model assumptions that may not hold in practice and thus may lead to biased estimates of this quantity. In this paper, we propose a nonparametric procedure to estimate the proportion of treatment effect on the primary outcome that is explained by the treatment effect on a potential surrogate marker and extend this procedure to a setting with multiple surrogate markers. We compare our approach with previously proposed model-based approaches and propose a variance estimation procedure based on a perturbation-resampling method. Simulation studies demonstrate that the procedure performs well in finite samples and outperforms model-based procedures when the specified models are not correct. We illustrate our proposed procedure using a data set from a randomized study investigating a group-mediated cognitive behavioral intervention for peripheral artery disease participants. Copyright © 2015 John Wiley & Sons, Ltd.
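The variance estimation above rests on perturbation resampling; a generic sketch of that device on a toy arm-difference statistic (the exponential mean-one weights are the standard choice, but the estimator below is illustrative, not the paper's proportion-explained estimator):

```python
import numpy as np

rng = np.random.default_rng(0)

def perturbation_se(estimator, data, n_perturb=1000):
    """Re-evaluate the estimator under iid mean-1 exponential weights
    on the observations; the SD across replicates estimates the SE."""
    stats = np.empty(n_perturb)
    for b in range(n_perturb):
        w = rng.exponential(1.0, len(data))
        stats[b] = estimator(data, w)
    return stats.std(ddof=1)

# Toy data: outcome y and treatment arm indicator.
y = np.r_[rng.normal(1.0, 1.0, 100), rng.normal(0.0, 1.0, 100)]
arm = np.r_[np.ones(100), np.zeros(100)]
data = np.c_[y, arm]

def arm_diff(d, w):
    y, a = d[:, 0], d[:, 1]
    return (np.average(y[a == 1], weights=w[a == 1])
            - np.average(y[a == 0], weights=w[a == 0]))

print("perturbation SE:", round(perturbation_se(arm_diff, data), 3))
```

Unlike the nonparametric bootstrap, no observation is ever dropped, which keeps nonsmooth estimators better behaved in small samples.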
Watermarking on 3D mesh based on spherical wavelet transform.
Jin, Jian-Qiu; Dai, Min-Ya; Bao, Hu-Jun; Peng, Qun-Sheng
2004-03-01
In this paper we propose a robust watermarking algorithm for 3D meshes. The algorithm is based on the spherical wavelet transform. Our basic idea is to decompose the original mesh into a series of details at different scales using the spherical wavelet transform; the watermark is then embedded into the different levels of detail. The embedding process includes: global sphere parameterization, spherical uniform sampling, the spherical wavelet forward transform, embedding of the watermark, the spherical wavelet inverse transform, and finally resampling of the watermarked mesh to recover the topological connectivity of the original model. Experiments showed that our algorithm improves the capacity of the watermark and the robustness of the watermarking against attacks.
Inferring microevolution from museum collections and resampling: lessons learned from Cepaea.
Ożgo, Małgorzata; Liew, Thor-Seng; Webster, Nicole B; Schilthuizen, Menno
2017-01-01
Natural history collections are an important and largely untapped source of long-term data on evolutionary changes in wild populations. Here, we utilize three large geo-referenced sets of samples of the common European land-snail Cepaea nemoralis stored in the collection of Naturalis Biodiversity Center in Leiden, the Netherlands. Resampling of these populations allowed us to gain insight into changes occurring over 95, 69, and 50 years. Cepaea nemoralis is polymorphic for the colour and banding of the shell; the mode of inheritance of these patterns is known, and the polymorphism is under both thermal and predatory selection. At two sites the general direction of changes was towards lighter shells (yellow and less heavily banded), which is consistent with predictions based on on-going climatic change. At one site no directional changes were detected. At all sites there were significant shifts in morph frequencies between years, and our study contributes to the recognition that short-term changes in the states of populations often exceed long-term trends. Our interpretation was limited by the few time points available in the studied collections. We therefore stress the need for natural history collections to routinely collect large samples of common species, to allow much more reliable hind-casting of evolutionary responses to environmental change.
Brunori, Paola; Masi, Piergiorgio; Faggiani, Luigi; Villani, Luciano; Tronchin, Michele; Galli, Claudio; Laube, Clarissa; Leoni, Antonella; Demi, Maila; La Gioia, Antonio
2011-04-11
Neonatal jaundice might lead to severe clinical consequences. Measurement of bilirubin in samples is subject to interference from hemolysis. Above a method-dependent cut-off value of measured hemolysis, the bilirubin value is not accepted and a new sample is required for evaluation, although this is not always possible, especially with newborns and cachectic oncological patients. When the use of different methods that are less prone to interference is not feasible, an alternative method for recovering the analytical significance of rejected data might help clinicians make appropriate decisions. We studied the effects of hemolysis on total bilirubin measurement, comparing hemolysis-interfered bilirubin measurements with the non-interfered values. Interference curves were extrapolated over a wide range of bilirubin (0-30 mg/dL) and hemolysis (H Index 0-1100). Interference "altitude" curves were calculated and plotted, and a bimodal acceptance table was calculated. The non-interfered bilirubin of a given sample was calculated by linear interpolation between the nearest lower and upper interference curves. Rejection of interference-sensitive data from hemolysed samples for any method should be based not upon the interferent concentration but upon a more complex algorithm based on the concentration-dependent bimodal interaction between the interfered analyte and the measured interferent. The altitude-curve cartography approach to interfered assays may help laboratories to build up their own method-dependent algorithm and to improve the trueness of their data by choosing a cut-off value different from the one (-10% interference) proposed by manufacturers. When re-sampling or an alternative method is not available, the altitude-curve cartography approach might also represent an alternative recovery method for the analytical significance of rejected data. Copyright © 2011 Elsevier B.V. All rights reserved.
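A sketch of the recovery step on invented interference curves (all numbers are illustrative placeholders, not the paper's measured altitude curves):

```python
import numpy as np

# Hypothetical altitude curves: measured bilirubin at several true
# calibration levels (rows) over a grid of H indices (columns).
h_grid = np.array([0, 200, 400, 600, 800, 1100])
true_levels = np.array([5.0, 10.0, 20.0, 30.0])        # mg/dL
measured = np.array([
    [ 5.0,  4.6,  4.1,  3.5,  2.8,  2.0],
    [10.0,  9.4,  8.6,  7.6,  6.4,  4.9],
    [20.0, 19.0, 17.6, 15.9, 13.8, 11.0],
    [30.0, 28.7, 26.9, 24.6, 21.7, 17.8],
])

def recover_bilirubin(bili_meas, h_index):
    """Slice each altitude curve at the sample's H index, then
    interpolate between the two curves bracketing the measured value
    to recover the non-interfered bilirubin."""
    meas_at_h = np.array([np.interp(h_index, h_grid, row) for row in measured])
    return float(np.interp(bili_meas, meas_at_h, true_levels))

print(recover_bilirubin(bili_meas=14.0, h_index=700))   # recovered mg/dL
```

This is the 'linear interpolation between the nearest lower and upper interference curves' step; building the curves themselves requires the method-specific interference experiments described above.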
NASA Astrophysics Data System (ADS)
Liu, Jie; Wang, Wilson; Ma, Fai
2011-07-01
System current state estimation (or condition monitoring) and future state prediction (or failure prognostics) constitute the core elements of condition-based maintenance programs. For complex systems whose internal state variables are either inaccessible to sensors or hard to measure under normal operational conditions, inference has to be made from indirect measurements using approaches such as Bayesian learning. In recent years, the auxiliary particle filter (APF) has gained popularity in Bayesian state estimation; the APF technique, however, has some potential limitations in real-world applications. For example, the diversity of the particles may deteriorate when the process noise is small, and the variance of the importance weights could become extremely large when the likelihood varies dramatically over the prior. To tackle these problems, a regularized auxiliary particle filter (RAPF) is developed in this paper for system state estimation and forecasting. The RAPF aims to improve the performance of the APF through two innovative steps: (1) regularize the approximating empirical density and redraw samples from a continuous distribution so as to diversify the particles; and (2) smooth out the rather diffuse proposals by a rejection/resampling approach so as to improve the robustness of particle filtering. The effectiveness of the proposed RAPF technique is evaluated through simulations of a nonlinear/non-Gaussian benchmark model for state estimation. It is also implemented in a real application, the remaining useful life (RUL) prediction of lithium-ion batteries.
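A sketch of the regularization step (step 1) on a simple nonlinear toy model: after resampling, particles are jittered with a kernel draw so the empirical density stays diverse even with small process noise. The auxiliary-variable and rejection/resampling refinements of the RAPF are not reproduced, and all noise levels are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def systematic_resample(w):
    positions = (rng.random() + np.arange(len(w))) / len(w)
    return np.searchsorted(np.cumsum(w), positions)

def regularized_pf_step(parts, z, f, h, q, r, bw=0.1):
    parts = f(parts) + rng.normal(0, q, parts.shape)     # propagate
    w = np.exp(-0.5 * (z - h(parts)) ** 2 / r)           # weight
    w /= w.sum()
    parts = parts[systematic_resample(w)]                # resample
    # Regularization: redraw from a kernel-smoothed density (Gaussian
    # kernel, bandwidth proportional to the particle spread).
    return parts + bw * parts.std() * rng.normal(size=parts.shape)

f = lambda x: 0.5 * x + 25 * x / (1 + x**2)              # toy dynamics
h = lambda x: x / 2 + x**2 / 20                          # toy nonlinear sensor
x, parts = 0.1, rng.normal(0, 2, 1000)
for k in range(30):
    x = f(x) + rng.normal(0, 0.1)                        # true state
    z = h(x) + rng.normal(0, 1.0)                        # observation
    parts = regularized_pf_step(parts, z, f, h, q=0.1, r=1.0)
print("estimate vs truth:", round(float(parts.mean()), 2), round(x, 2))
```

Without the final jitter, small process noise would collapse the resampled cloud onto a few repeated values, which is precisely the diversity loss the RAPF targets.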
Gridless, pattern-driven point cloud completion and extension
NASA Astrophysics Data System (ADS)
Gravey, Mathieu; Mariethoz, Gregoire
2016-04-01
While satellites offer Earth observation with wide coverage, other remote sensing techniques such as terrestrial LiDAR can acquire very high-resolution data over an area that is limited in extent and often discontinuous due to shadow effects. Here we propose a numerical approach to merge these two types of information, thereby reconstructing high-resolution data over a continuous large area. It is based on a pattern matching process that completes the areas where only low-resolution data are available, using bootstrapped high-resolution patterns. Currently, the most common approach to pattern matching is to interpolate the point data on a grid. While this approach is computationally efficient, it presents major drawbacks for point cloud processing, because a significant part of the information is lost in the point-to-grid resampling and a prohibitive amount of memory is needed to store large grids. To address these issues, we propose a gridless method that compares point cloud subsets without the need for a grid. On-the-fly interpolation involves a heavy computational load, which is met by using a highly optimized GPU implementation and a hierarchical pattern searching strategy. The method is illustrated using data from the Val d'Arolla, Swiss Alps, where high-resolution terrestrial LiDAR data are fused with lower-resolution Landsat and WorldView-3 acquisitions, such that the density of points is homogenized (data completion) and extended to a larger area (data extension).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Manoli, Gabriele; Rossi, Matteo
The modeling of unsaturated groundwater flow is affected by a high degree of uncertainty related to both measurement and model errors. Geophysical methods such as Electrical Resistivity Tomography (ERT) can provide useful indirect information on the hydrological processes occurring in the vadose zone. In this paper, we propose and test an iterated particle filter method to solve the coupled hydrogeophysical inverse problem. We focus on an infiltration test monitored by time-lapse ERT and modeled using Richards equation. The goal is to identify hydrological model parameters from ERT electrical potential measurements. Traditional uncoupled inversion relies on the solution of two sequential inverse problems, the first one applied to the ERT measurements, the second one to Richards equation. This approach does not ensure an accurate quantitative description of the physical state, typically violating mass balance. To avoid one of these two inversions and incorporate in the process more physical simulation constraints, we cast the problem within the framework of a SIR (Sequential Importance Resampling) data assimilation approach that uses a Richards equation solver to model the hydrological dynamics and a forward ERT simulator combined with Archie's law to serve as measurement model. ERT observations are then used to update the state of the system as well as to estimate the model parameters and their posterior distribution. The limitations of the traditional sequential Bayesian approach are investigated and an innovative iterative approach is proposed to estimate the model parameters with high accuracy. The numerical properties of the developed algorithm are verified on both homogeneous and heterogeneous synthetic test cases based on a real-world field experiment.
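A sketch of the SIR update at the heart of the approach, with a trivial stand-in for the coupled forward model (in the paper this is a Richards-equation solver chained with an ERT simulator and Archie's law; here one parameter and a linear map, with invented noise levels):

```python
import numpy as np

rng = np.random.default_rng(0)

def sir_update(params, d_obs, forward, sigma):
    """Weight each parameter particle by the Gaussian likelihood of the
    observed data under the forward model, then resample."""
    pred = np.array([forward(p) for p in params])
    misfit = np.sum((pred - d_obs) ** 2, axis=1)
    logw = -0.5 * misfit / sigma**2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(len(params), size=len(params), p=w)
    return params[idx]

forward = lambda k: k * np.linspace(1.0, 2.0, 8)   # stand-in "ERT response"
params = rng.uniform(0.1, 2.0, 500)                # prior particles
d_obs = forward(1.3) + rng.normal(0, 0.05, 8)      # synthetic observations
post = sir_update(params, d_obs, forward, sigma=0.05)
print("posterior mean ~", round(float(post.mean()), 3))
```

The iterated variant proposed in the paper repeats such updates to sharpen the parameter posterior, which plain sequential SIR does poorly when the initial ensemble is far from the truth.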
Atmospheric Science Data Center
2016-08-22
MISBR MISR Browse Data: Color browse image of the Ellipsoid product for each camera, resampled to 2.2 km resolution.
INS/GNSS Tightly-Coupled Integration Using Quaternion-Based AUPF for USV.
Xia, Guoqing; Wang, Guoqing
2016-08-02
This paper addresses the problem of integration of an Inertial Navigation System (INS) and a Global Navigation Satellite System (GNSS) for the purpose of developing a low-cost, robust and highly accurate navigation system for unmanned surface vehicles (USVs). A tightly-coupled integration approach is one of the most promising architectures to fuse the GNSS data with INS measurements. However, the resulting system and measurement models turn out to be nonlinear, and the sensor stochastic measurement errors are non-Gaussian in practical systems. The particle filter (PF), one of the most theoretically attractive non-linear/non-Gaussian estimation methods, is becoming more and more attractive in navigation applications. However, its large computational burden limits its practical usage. For the purpose of reducing the computational burden without degrading the system estimation accuracy, a quaternion-based adaptive unscented particle filter (AUPF), which combines the adaptive unscented Kalman filter (AUKF) with the PF, is proposed in this paper. The unscented Kalman filter (UKF) is used in the algorithm to improve the proposal distribution and generate posterior estimates, which specify the PF importance density function for generating particles more intelligently. In addition, the computational complexity of the filter is reduced by avoiding the re-sampling step. Furthermore, a residual-based covariance matching technique is used to adapt the measurement error covariance. A trajectory simulator based on a dynamic model of a USV is used to test the proposed algorithm. Results show that the quaternion-based AUPF can significantly improve the overall navigation accuracy and reliability.
Collell, Guillem; Prelec, Drazen; Patil, Kaustubh R
2018-01-31
Class imbalance presents a major hurdle in the application of classification methods. A commonly taken approach is to learn ensembles of classifiers using rebalanced data. Examples include bootstrap averaging (bagging) combined with either undersampling or oversampling of the minority-class examples. However, rebalancing methods entail asymmetric changes to the examples of different classes, which in turn can introduce their own biases. Furthermore, these methods often require specifying the performance measure of interest a priori, i.e., before learning. An alternative is to employ the threshold-moving technique, which applies a threshold to the continuous output of a model, offering the possibility to adapt to a performance measure a posteriori, i.e., as a plug-in method. Surprisingly, little attention has been paid to this combination of a bagging ensemble and threshold moving. In this paper, we study this combination and demonstrate its competitiveness. Contrary to the other resampling methods, we preserve the natural class distribution of the data, resulting in well-calibrated posterior probabilities. Additionally, we extend the proposed method to handle multiclass data. We validated our method on binary and multiclass benchmark data sets using both decision trees and neural networks as base classifiers. We perform analyses that provide insights into the proposed method.
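A compact scikit-learn sketch of the combination being advocated: bagging trained on the natural class distribution, with the decision threshold chosen a posteriori for the measure of interest (F1 here; tuned on the evaluation split only for brevity, where a separate validation split would be proper):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Imbalanced binary data; no undersampling or oversampling is applied.
X, y = make_classification(n_samples=4000, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

ens = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                        random_state=0).fit(X_tr, y_tr)
proba = ens.predict_proba(X_te)[:, 1]         # posterior probabilities

# Threshold moving: sweep the cutoff and plug in the best one.
grid = np.linspace(0.01, 0.99, 99)
f1 = [f1_score(y_te, (proba >= t).astype(int)) for t in grid]
best = grid[int(np.argmax(f1))]
print(f"F1 at 0.5: {f1_score(y_te, (proba >= 0.5).astype(int)):.3f}, "
      f"F1 at moved threshold {best:.2f}: {max(f1):.3f}")
```

Because the training distribution is untouched, the same ensemble can be re-thresholded later for a different performance measure, which is the plug-in property stressed above.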
Mazurowski, Maciej A; Zurada, Jacek M; Tourassi, Georgia D
2009-07-01
Ensemble classifiers have been shown to be efficient in multiple applications. In this article, the authors explore the effectiveness of ensemble classifiers in a case-based computer-aided diagnosis system for detection of masses in mammograms. They evaluate two general ways of constructing subclassifiers by resampling of the available development dataset: random division and random selection. Furthermore, they discuss the problem of selecting the ensemble size and propose two adaptive incremental techniques that automatically select the size for the problem at hand. All the techniques are evaluated with respect to a previously proposed information-theoretic CAD system (IT-CAD). The experimental results show that the examined ensemble techniques provide a statistically significant improvement (AUC = 0.905 ± 0.024) in performance as compared to the original IT-CAD system (AUC = 0.865 ± 0.029). Some of the techniques allow for a notable reduction in the total number of examples stored in the case base (to 1.3% of the original size), which, in turn, results in lower storage requirements and a shorter response time of the system. Among the methods examined in this article, the two proposed adaptive techniques are by far the most effective for this purpose. Furthermore, the authors provide some discussion and guidance for choosing the ensemble parameters.
Application of change-point problem to the detection of plant patches.
López, I; Gámez, M; Garay, J; Standovár, T; Varga, Z
2010-03-01
In ecology, if the considered area or space is large, the spatial distribution of individuals of a given plant species is never homogeneous; plants form different patches. The homogeneity change in space or in time (in particular, the related change-point problem) is an important research subject in mathematical statistics. For a given data system along a straight line, two areas are considered, where the data of each area come from different discrete distributions with unknown parameters. A method is presented for estimating the distribution change-point between the two areas, and an estimate is given for the distributions separated by the obtained change-point. The solution of this problem is based on the maximum likelihood method. Furthermore, based on an adaptation of the well-known bootstrap resampling, a method for estimating the so-called change-interval is also given. The latter approach is very general, since it applies not only to the maximum-likelihood estimate of the change-point but can also be used starting from any other change-point estimate known in the ecological literature. The proposed model is validated against typical ecological situations, providing at the same time a verification of the applied algorithms.
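A sketch of both steps, specialized to Poisson counts along a transect (the paper treats general discrete distributions; the patch means and lengths below are invented):

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
counts = np.r_[rng.poisson(2.0, 60), rng.poisson(6.0, 40)]   # two patches

def ml_changepoint(x):
    """Maximum-likelihood change-point for two Poisson patches with
    unknown means: scan all split positions, keep the best."""
    best_k, best_ll = 1, -np.inf
    for k in range(1, len(x)):
        ll = (poisson.logpmf(x[:k], x[:k].mean()).sum()
              + poisson.logpmf(x[k:], x[k:].mean()).sum())
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k

k_hat = ml_changepoint(counts)

# Bootstrap change-interval: resample each estimated patch with
# replacement and re-estimate the change-point.
boot = [ml_changepoint(np.r_[rng.choice(counts[:k_hat], k_hat),
                             rng.choice(counts[k_hat:], len(counts) - k_hat)])
        for _ in range(500)]
print(k_hat, np.percentile(boot, [2.5, 97.5]))
```

As noted above, the bootstrap interval construction is agnostic to how the point estimate was obtained, so any other change-point estimator could be plugged in place of ml_changepoint.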
Uga, Minako; Dan, Ippeita; Dan, Haruka; Kyutoku, Yasushi; Taguchi, Y-h; Watanabe, Eiju
2015-01-01
Recent advances in multichannel functional near-infrared spectroscopy (fNIRS) allow wide coverage of cortical areas while entailing the necessity to control family-wise errors (FWEs) due to increased multiplicity. Conventionally, the Bonferroni method has been used to control FWE. While Type I errors (false positives) can be strictly controlled, the application of a large number of channel settings may inflate the chance of Type II errors (false negatives). The Bonferroni-based methods are especially stringent in controlling Type I errors for the most activated channel with the smallest p value. To maintain a balance between Type I and Type II errors, effective multiplicity (Meff) derived from the eigenvalues of correlation matrices is a method that has been introduced in genetic studies. Thus, we explored its feasibility in multichannel fNIRS studies. Applying the Meff method to three kinds of experimental data with different activation profiles, we performed resampling simulations and found that Meff was controlled at 10 to 15 in a 44-channel setting. Consequently, the number of significantly activated channels remained almost constant regardless of the number of measured channels. We demonstrate that the Meff approach can be an effective alternative to Bonferroni-based methods for multichannel fNIRS studies. PMID:26157982
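A sketch of one common Meff recipe, the Nyholt/Cheverud eigenvalue-variance formula from the genetics literature that this line of work builds on (the paper's exact formula may differ); channel data are simulated with a shared physiological component:

```python
import numpy as np

def meff_nyholt(data):
    """Effective number of independent tests from the eigenvalues of
    the channel-by-channel correlation matrix:
    Meff = 1 + (M - 1) * (1 - Var(lambda) / M)."""
    eig = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))
    m = len(eig)
    return 1 + (m - 1) * (1 - np.var(eig, ddof=1) / m)

rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 1))                       # common physiology
chans = 0.7 * shared + 0.7 * rng.normal(size=(200, 44))  # 44 channels

m_eff = meff_nyholt(chans)
alpha = 1 - (1 - 0.05) ** (1 / m_eff)                    # Sidak-style threshold
print(f"Meff = {m_eff:.1f} of 44 channels, per-channel alpha = {alpha:.4f}")
```

Correlated channels drive Meff well below the nominal channel count, which is why the corrected threshold is far less punishing than Bonferroni's 0.05/44.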
Scanner imaging systems, aircraft
NASA Technical Reports Server (NTRS)
Ungar, S. G.
1982-01-01
The causes and effects of distortion in aircraft scanner data are reviewed, and an approach to reducing distortions by modelling the effect of aircraft motion on the scanner scene is discussed. With the advent of advanced satellite-borne scanner systems, the geometric and radiometric correction of aircraft scanner data has become increasingly important. Corrections are needed to reliably simulate observations obtained by such systems for purposes of evaluation. It is found that if sufficient navigational information is available, aircraft scanner coordinates may be related very precisely to planimetric ground coordinates. However, the potential for a multivalued remapping transformation (i.e., scan lines crossing each other) adds an inherent uncertainty to any radiometric resampling scheme that depends on the precise geometry of the scan and ground pattern.
Poder, Thomas G; Kouakou, Christian R C; Bouchard, Pierre-Alexandre; Tremblay, Véronique; Blais, Sébastien; Maltais, François; Lellouche, François
2018-01-23
Objective To conduct a cost-effectiveness analysis of FreeO2 technology versus manual oxygen-titration technology for patients with chronic obstructive pulmonary disease (COPD) hospitalised for acute exacerbations. Setting Tertiary acute care hospital in Quebec, Canada. Participants 47 patients with COPD hospitalised for acute exacerbations. Intervention An automated oxygen-titration and oxygen-weaning technology. Methods The costs for hospitalisation and follow-up for 180 days were calculated using a microcosting approach and included the cost of FreeO2 technology. Incremental cost-effectiveness ratios (ICERs) were calculated using bootstrap resampling with 5000 replications. The main effect variable was the percentage of time spent at the target oxygen saturation (SpO2). The other two effect variables were the time spent in hyperoxia (target SpO2 +5%) and in severe hypoxaemia (SpO2 <85%). The resamplings were based on data from a randomised controlled trial with 47 patients with COPD hospitalised for acute exacerbations. Results FreeO2 generated savings of 20.7% of the per-patient costs at 180 days (i.e., -$C2959.71). This decrease is nevertheless not significant at the 95% threshold (P=0.13), but the effect variables all improved (P<0.001). The improvement in the time spent at the target SpO2 was 56.3%. The ICERs indicate that FreeO2 technology is more cost-effective than manual oxygen titration, with savings of -$C96.91 per percentage point of time spent at the target SpO2 (95% CI -301.26 to 116.96). Conclusions FreeO2 technology could significantly enhance the efficiency of the health system by reducing per-patient costs at 180 days. A study with a larger patient sample needs to be carried out to confirm these preliminary results. Trial registration NCT01393015; Post-results. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
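A sketch of the bootstrap ICER computation with 5000 replications on invented per-patient costs and effects (the real analysis used trial microcosting data; the means, spreads and arm sizes below are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 23  # per arm, illustrative

# Placeholder 180-day costs ($C) and % time at target SpO2 per arm.
cost_free, eff_free = rng.normal(11300, 4000, n), rng.normal(81, 10, n)
cost_man, eff_man = rng.normal(14300, 5000, n), rng.normal(51, 15, n)

icers = np.empty(5000)
for b in range(5000):                        # bootstrap resampling
    i, j = rng.integers(0, n, n), rng.integers(0, n, n)
    d_cost = cost_free[i].mean() - cost_man[j].mean()
    d_eff = eff_free[i].mean() - eff_man[j].mean()
    icers[b] = d_cost / d_eff                # $C per % point at target

lo, hi = np.percentile(icers, [2.5, 97.5])
print(f"ICER = {np.median(icers):.2f} $C/point (95% CI {lo:.2f} to {hi:.2f})")
```

Negative ICERs here mean the technology both saves money and improves time on target, the dominant quadrant of the cost-effectiveness plane.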
Kim, Hyoungrae; Jang, Cheongyun; Yadav, Dharmendra K; Kim, Mi-Hyun
2017-03-23
The accuracy of any 3D-QSAR, pharmacophore, or 3D-similarity-based chemometric target-fishing model depends strongly on a reasonable sample of active conformations. A number of diverse conformational sampling algorithms exist that exhaustively generate enough conformers; model-building methods, however, rely on an explicit number of common conformers. In this work, we attempt to design clustering algorithms that automatically find a reasonable number of representative conformer ensembles from the asymmetric dissimilarity matrix generated with the OpenEye toolkit. RMSD was the key descriptor (variable): each column of the N × N matrix is treated as one of N variables describing the relationship (network) between one conformer (in a row) and the other N conformers. This approach was used to evaluate the performance of well-known clustering algorithms by comparing their ability to generate representative conformer ensembles, and to test them over different matrix transformation functions with respect to stability. In the network, the representative conformer group could be resampled by four kinds of algorithms with implicit parameters. The directed dissimilarity matrix is the only input to the clustering algorithms. The Dunn index, Davies-Bouldin index, eta-squared and omega-squared values were used to evaluate the clustering algorithms with respect to compactness and explanatory power. The evaluation also includes the reduction (abstraction) rate of the data, the correlation between the sizes of the population and the samples, the computational complexity, and the memory usage. Every algorithm could find representative conformers automatically without any user intervention, and they reduced the data to 14-19% of the original size within 1.13 s per sample at most. The clustering methods are simple and practical: they are fast and do not require explicit parameters. RCDTC achieved the maximum Dunn and omega-squared values of the four algorithms, in addition to a consistent reduction rate between the population size and the sample size. The performance of the clustering algorithms was consistent over different transformation functions. Moreover, the clustering method can also be applied to molecular dynamics sampling simulation results.
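A runnable toy of the precomputed-dissimilarity clustering and medoid selection (symmetric Euclidean stand-in for the asymmetric RMSD matrix, scikit-learn >= 1.2 for the metric= keyword, and silhouette used in place of the Dunn/Davies-Bouldin indices reported above):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Placeholder "conformers": two blobs in 3D instead of real geometries.
coords = np.r_[rng.normal(0, 1, (40, 3)), rng.normal(5, 1, (30, 3))]
D = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)  # N x N

labels = AgglomerativeClustering(n_clusters=2, metric="precomputed",
                                 linkage="average").fit_predict(D)

# Representative conformer per cluster: the medoid, i.e. the member
# with the smallest total dissimilarity to its cluster mates.
medoids = [np.flatnonzero(labels == c)[np.argmin(
               D[np.ix_(labels == c, labels == c)].sum(axis=1))]
           for c in np.unique(labels)]
print("medoids:", medoids,
      "silhouette:", round(silhouette_score(D, labels, metric="precomputed"), 3))
```

The medoids play the role of the representative conformer ensemble; in the paper the number of clusters falls out of the algorithms themselves rather than being fixed to 2 as here.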
Wang, Yunpeng; Thompson, Wesley K.; Schork, Andrew J.; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S.; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M.; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R.; Djurovic, Srdjan; O’Donovan, Michael; Visscher, Peter M.; Andreassen, Ole A.; Dale, Anders M.
2016-01-01
Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic ("z-score") of the SNP. We use a multiple logistic regression on z-scores to combine auxiliary information into a "relative enrichment score" for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences between the scale-mixture model and the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores, even when both are genome-wide significant (p < 5 × 10-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased the out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560
Kang, Le; Chen, Weijie; Petrick, Nicholas A.; Gallas, Brandon D.
2014-01-01
The area under the receiver operating characteristic (ROC) curve (AUC) is often used as a summary index of diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is a right-censored survival time, the C index, motivated as an extension of the AUC, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for the statistical comparison of two diagnostic or predictive systems, which could be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistic-based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the Type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimates and has satisfactory Type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study. PMID:25399736
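A reference O(n^2) sketch of the Harrell's C estimator underlying the comparison (the paper's variance and covariance estimators and the z-test are not reproduced; the survival and censoring models below are invented):

```python
import numpy as np

def harrell_c(time, event, score):
    """Among usable pairs (the earlier time is an observed event),
    the fraction where the higher risk score belongs to the earlier
    failure; score ties count 1/2."""
    conc = usable = 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if time[i] < time[j] and event[i] == 1:   # usable pair
                usable += 1
                if score[i] > score[j]:
                    conc += 1.0
                elif score[i] == score[j]:
                    conc += 0.5
    return conc / usable

rng = np.random.default_rng(0)
risk = rng.normal(size=300)                   # biomarker / risk score
t = rng.exponential(np.exp(-risk))            # higher risk, earlier failure
c = rng.exponential(2.0, 300)                 # independent censoring
time, event = np.minimum(t, c), (t <= c).astype(int)
print(f"C = {harrell_c(time, event, risk):.3f}")
```

Comparing two systems then amounts to computing both C estimates on the same subjects and testing the difference against its (co)variance, which is where the one-shot analytical estimator replaces the jackknife or bootstrap.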
Imaging through turbulence using a plenoptic sensor
NASA Astrophysics Data System (ADS)
Wu, Chensheng; Ko, Jonathan; Davis, Christopher C.
2015-09-01
Atmospheric turbulence can significantly affect imaging through paths near the ground. Atmospheric turbulence is generally treated as a time-varying inhomogeneity of the refractive index of the air, which disrupts the propagation of optical signals from the object to the viewer. Under conditions of deep or strong turbulence, the object is hard to recognize through direct imaging, and conventional imaging methods cannot handle these problems efficiently: the time required for lucky imaging can increase significantly, and image processing approaches require much more complex and iterative de-blurring algorithms. We propose an alternative approach using a plenoptic sensor to resample and analyze the image distortions. The plenoptic sensor uses a shared objective lens and a microlens array (MLA) to form a mini Keplerian telescope array. Therefore, the image obtained by a conventional method is separated into an array of images that contain multiple copies of the object's image and less correlated turbulence disturbances. A high-dimensional lucky imaging algorithm can then be performed on the video collected by the plenoptic sensor. The corresponding algorithm selects the most stable pixels from the various image cells and reconstructs the object's image as if only a weak turbulence effect were present. Then, by comparing the reconstructed image with the recorded images in each MLA cell, the difference can be regarded as the turbulence effect. As a result, retrieval of the object's image and extraction of the turbulence effect can be performed simultaneously.
GPU and APU computations of Finite Time Lyapunov Exponent fields
NASA Astrophysics Data System (ADS)
Conti, Christian; Rossinelli, Diego; Koumoutsakos, Petros
2012-03-01
We present GPU and APU accelerated computations of Finite-Time Lyapunov Exponent (FTLE) fields. The calculation of FTLEs is a computationally intensive process, as in order to obtain the sharp ridges associated with the Lagrangian Coherent Structures an extensive resampling of the flow field is required. The computational performance of this resampling is limited by the memory bandwidth of the underlying computer architecture. The present technique harnesses data-parallel execution of many-core architectures and relies on fast and accurate evaluations of moment conserving functions for the mesh to particle interpolations. We demonstrate how the computation of FTLEs can be efficiently performed on a GPU and on an APU through OpenCL and we report over one order of magnitude improvements over multi-threaded executions in FTLE computations of bluff body flows.
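A plain-CPU sketch of the computation being accelerated: seed a refined grid, advect it through a stand-in steady flow with Euler steps, and take the FTLE as the log of the square root of the largest Cauchy-Green eigenvalue (the paper's OpenCL kernels and moment-conserving mesh-particle interpolations are not reproduced):

```python
import numpy as np

def ftle_field(vel, xlim, ylim, T=5.0, dt=0.05, n=256):
    xs = np.linspace(*xlim, n)                  # resampled seed grid
    ys = np.linspace(*ylim, n)
    Px, Py = np.meshgrid(xs, ys)
    for _ in range(int(T / dt)):                # advect the seeds
        u, v = vel(Px, Py)
        Px, Py = Px + dt * u, Py + dt * v
    dx, dy = xs[1] - xs[0], ys[1] - ys[0]
    a = np.gradient(Px, dx, axis=1); b = np.gradient(Px, dy, axis=0)
    c = np.gradient(Py, dx, axis=1); d = np.gradient(Py, dy, axis=0)
    # Largest eigenvalue of C = F^T F for the 2x2 flow-map gradient F.
    tr = a * a + b * b + c * c + d * d
    det = (a * d - b * c) ** 2
    lam = 0.5 * (tr + np.sqrt(np.maximum(tr * tr - 4 * det, 0.0)))
    return np.log(np.sqrt(lam)) / abs(T)

gyre = lambda x, y: (-np.pi * np.sin(np.pi * x) * np.cos(np.pi * y),
                     np.pi * np.cos(np.pi * x) * np.sin(np.pi * y))
print(float(ftle_field(gyre, (0, 2), (0, 1)).max()))
```

The per-seed trajectory integrations are independent, which is what makes the workload map so cleanly onto GPU/APU data-parallel execution; the memory-bandwidth pressure comes from the dense resampled grid.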
NASA Technical Reports Server (NTRS)
Lawton, Pat
2004-01-01
The objective of this work was to support the design of improved IUE NEWSIPS high-dispersion extraction algorithms. The purpose of this work was to evaluate the use of the Linearized Image (LIHI) file versus the Re-Sampled Image (SIHI) file, to evaluate various extraction methods, and to design algorithms for the evaluation of IUE high-dispersion spectra. It was concluded that the use of the Re-Sampled Image (SIHI) file was acceptable. Since the Gaussian profile worked well for the core and the Lorentzian profile worked well for the wings, the Voigt profile was chosen for use in the extraction algorithm. It was found that the gamma and sigma parameters varied significantly across the detector, so gamma and sigma masks for the SWP detector were developed. Extraction code was written.
Fast Computation of the Two-Point Correlation Function in the Age of Big Data
NASA Astrophysics Data System (ADS)
Pellegrino, Andrew; Timlin, John
2018-01-01
We present a new code which quickly computes the two-point correlation function for large sets of astronomical data. This code combines the ease of use of Python with the speed of parallel shared libraries written in C. We include the capability to compute the auto- and cross-correlation statistics, and allow the user to calculate the three-dimensional and angular correlation functions. Additionally, the code automatically divides the user-provided sky masks into contiguous subsamples of similar size, using the HEALPix pixelization scheme, for the purpose of resampling. Errors are computed using jackknife and bootstrap resampling in a way that adds negligible extra runtime, even with many subsamples. We demonstrate speed comparable to other clustering codes, and verify the code's accuracy against known analytic results.
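A serial Python reference for the core estimator (Landy-Szalay, xi = (DD - 2DR + RR)/RR), with pair counts from SciPy k-d trees; the production code does this in parallel C and adds the HEALPix jackknife/bootstrap machinery:

```python
import numpy as np
from scipy.spatial import cKDTree

def landy_szalay(data, rand, edges):
    """3D two-point correlation function from normalized pair counts."""
    def shells(a, b):
        # Pairs per distance shell; drop the d <= edges[0] bin, which
        # also removes self-pairs in the auto-correlation case.
        return cKDTree(a).count_neighbors(cKDTree(b), edges,
                                          cumulative=False)[1:]
    nd, nr = len(data), len(rand)
    dd = shells(data, data) / 2 / (nd * (nd - 1) / 2)   # unordered pairs
    rr = shells(rand, rand) / 2 / (nr * (nr - 1) / 2)
    dr = shells(data, rand) / (nd * nr)
    return (dd - 2 * dr + rr) / rr

rng = np.random.default_rng(0)
data = rng.random((2000, 3))      # stand-in positions in a unit box
rand = rng.random((8000, 3))      # unclustered randoms, same "mask"
print(np.round(landy_szalay(data, rand, np.linspace(0.02, 0.2, 10)), 3))
```

For unclustered input, xi scatters around zero, a quick version of the analytic checks mentioned above.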
Spatial resampling of IDR frames for low bitrate video coding with HEVC
NASA Astrophysics Data System (ADS)
Hosking, Brett; Agrafiotis, Dimitris; Bull, David; Easton, Nick
2015-03-01
As the demand for higher quality and higher resolution video increases, many applications fail to meet this demand due to bandwidth restrictions. One factor contributing to this problem is the high bitrate requirement of the intra-coded Instantaneous Decoding Refresh (IDR) frames featured in all video coding standards. Frequent coding of IDR frames is essential for error resilience, as it prevents error propagation. However, because each one consumes a large portion of the available bitrate, the quality of subsequently coded frames is hindered by high levels of compression. This work presents a new technique, known as Spatial Resampling of IDR Frames (SRIF), and shows how it can increase rate-distortion performance by providing a higher and more consistent level of video quality at low bitrates.
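The core resampling idea, coding the IDR frame at reduced resolution and returning it to full resolution at the decoder, can be sketched as follows; this is a schematic illustration using cubic interpolation, not the SRIF implementation integrated into HEVC.

```python
import numpy as np
from scipy.ndimage import zoom

def resample_idr(frame, factor=0.5):
    """Encoder side: downsample an IDR frame before intra coding so it
    consumes a smaller share of the available bitrate."""
    return zoom(frame, factor, order=3)   # cubic interpolation

def restore_idr(decoded_lowres, original_shape):
    """Decoder side: upsample the decoded low-resolution IDR frame back
    to full resolution for display and for use as a reference."""
    fy = original_shape[0] / decoded_lowres.shape[0]
    fx = original_shape[1] / decoded_lowres.shape[1]
    return zoom(decoded_lowres, (fy, fx), order=3)
```

The bitrate saved on the IDR frame is then available to the predicted frames, which is where the more consistent quality at low bitrates comes from.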
More About the Phase-Synchronized Enhancement Method
NASA Technical Reports Server (NTRS)
Jong, Jen-Yi
2004-01-01
A report presents further details regarding the subject matter of "Phase-Synchronized Enhancement Method for Engine Diagnostics" (MFS-26435), NASA Tech Briefs, Vol. 22, No. 1 (January 1998), page 54. To recapitulate: The phase-synchronized enhancement method (PSEM) involves the digital resampling of a quasi-periodic signal in synchronism with the instantaneous phase of one of its spectral components. This resampling transforms the quasi-periodic signal into a periodic one more amenable to analysis. It is particularly useful for diagnosis of a rotating machine through analysis of vibration spectra that include components at the fundamental and harmonics of a slightly fluctuating rotation frequency. The report discusses the machinery-signal-analysis problem, outlines the PSEM algorithms, presents the mathematical basis of the PSEM, and presents examples of application of the PSEM in some computational simulations.
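A hedged sketch of the central PSEM step, resampling a signal at uniform increments of instantaneous phase, is shown below; in practice the tracked spectral component would first be isolated with a band-pass filter, and the function shown is our own schematic, not the report's algorithm.

```python
import numpy as np
from scipy.signal import hilbert

def phase_synchronous_resample(x, n_per_cycle=64):
    """Resample a quasi-periodic signal at uniform increments of the
    instantaneous phase of its dominant spectral component, turning a
    slightly fluctuating rotation into an exactly periodic record.

    x is assumed band-pass filtered around the tracked component so the
    unwrapped phase is monotonic.
    """
    phase = np.unwrap(np.angle(hilbert(x)))     # instantaneous phase (rad)
    # uniform phase grid: n_per_cycle samples per 2*pi of phase
    phi_uniform = np.arange(phase[0], phase[-1], 2 * np.pi / n_per_cycle)
    return np.interp(phi_uniform, phase, x)
```

After this resampling, harmonics of the fluctuating rotation frequency fall on fixed bins of the phase-domain spectrum, which is what makes the transformed signal more amenable to analysis.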
Comparisons of Reflectivities from the TRMM Precipitation Radar and Ground-Based Radars
NASA Technical Reports Server (NTRS)
Wang, Jianxin; Wolff, David B.
2008-01-01
Given the decade-long and highly successful Tropical Rainfall Measuring Mission (TRMM), it is now possible to provide quantitative comparisons of ground-based radars (GRs) with the space-borne TRMM precipitation radar (PR) with greater certainty, over longer time scales, and in various tropical climatological regions. This study develops an automated methodology to match and compare simultaneous TRMM PR and GR reflectivities at four primary TRMM Ground Validation (GV) sites: Houston, Texas (HSTN); Melbourne, Florida (MELB); Kwajalein, Republic of the Marshall Islands (KWAJ); and Darwin, Australia (DARW). Data from each instrument are resampled into a three-dimensional Cartesian coordinate system, and the horizontal displacement during the PR data resampling is corrected. Comparisons suggest that the PR suffers significant attenuation at lower levels, especially in convective rain. The attenuation correction performs quite well for convective rain but appears to slightly over-correct in stratiform rain. The PR and GR observations at HSTN, MELB and KWAJ agree to about 1 dB on average, with a few exceptions, while the GR at DARW requires +1 to -5 dB calibration corrections. One of the important findings of this study is that the GR calibration offset depends on the reflectivity magnitude. Hence, we propose that the calibration be carried out using a regression correction rather than by simply adding an offset value to all GR reflectivities. This methodology is developed in support of TRMM GV efforts to improve the accuracy of tropical rain estimates, and can also be applied to the proposed Global Precipitation Measurement mission and other related activities over the globe.
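A minimal sketch of the proposed regression-style calibration (as opposed to a constant offset) might look like this; the sample values are illustrative, not data from the study.

```python
import numpy as np

def regression_calibration(gr_dbz, pr_dbz):
    """Fit GR reflectivity against matched PR reflectivity so the
    correction varies with reflectivity magnitude, rather than applying
    a single offset to all GR values."""
    slope, intercept = np.polyfit(gr_dbz, pr_dbz, deg=1)
    return lambda z: slope * z + intercept

# hypothetical matched samples (dBZ) from the resampled Cartesian grid
gr = np.array([18.0, 24.5, 31.2, 38.9, 45.1])
pr = np.array([19.2, 25.1, 31.0, 37.5, 42.8])
calibrate = regression_calibration(gr, pr)
corrected = calibrate(gr)   # magnitude-dependent correction of GR values
```

With a constant offset, the residual GR-PR difference would still trend with reflectivity; the fitted slope absorbs that trend.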
Multilayered nonuniform sampling for three-dimensional scene representation
NASA Astrophysics Data System (ADS)
Lin, Huei-Yung; Xiao, Yu-Hua; Chen, Bo-Ren
2015-09-01
The representation of a three-dimensional (3-D) scene is essential in multiview imaging technologies. We present a unified geometry and texture representation based on global resampling of the scene. A layered data map representation with a distance-dependent nonuniform sampling strategy is proposed. It is capable of increasing the details of the 3-D structure locally and is compact in size. The 3-D point cloud obtained from the multilayered data map is used for view rendering. For any given viewpoint, image synthesis with different levels of detail is carried out using the quadtree-based nonuniformly sampled 3-D data points. Experimental results are presented using the 3-D models of reconstructed real objects.
Evaluating video digitizer errors
NASA Astrophysics Data System (ADS)
Peterson, C.
2016-01-01
Analog output video cameras remain popular for recording meteor data. Although these cameras uniformly employ electronic detectors with fixed pixel arrays, the digitization process requires resampling the horizontal lines as they are output in order to reconstruct the pixel data, usually resulting in a new data array of different horizontal dimensions than the native sensor. Pixel timing is not provided by the camera, and must be reconstructed based on line sync information embedded in the analog video signal. Using a technique based on hot pixels, I present evidence that jitter, sync detection, and other timing errors introduce both position and intensity errors which are not present in cameras which internally digitize their sensors and output the digital data directly.
Stefanik, Laura; Erdman, Lauren; Ameis, Stephanie H; Foussias, George; Mulsant, Benoit H; Behdinan, Tina; Goldenberg, Anna; O'Donnell, Lauren J; Voineskos, Aristotle N
2018-04-01
There is considerable heterogeneity in social cognitive and neurocognitive performance among people with schizophrenia spectrum disorders (SSD), autism spectrum disorders (ASD), bipolar disorder (BD), and healthy individuals. This study used Similarity Network Fusion (SNF), a novel data-driven approach, to identify participant similarity networks based on relationships among demographic, brain imaging, and behavioral data. T1-weighted and diffusion-weighted magnetic resonance images were obtained for 174 adolescents and young adults (aged 16-35 years) with an SSD (n=51), an ASD without intellectual disability (n=38), euthymic BD (n=34), and healthy controls (n=51). A battery of social cognitive and neurocognitive tasks was administered. Data integration, cluster determination, and biological group formation were then obtained using SNF. We identified four new groups of individuals, each with distinct neural circuit-cognitive profiles. The most influential variables driving the formation of the new groups were robustly reliable across embedded resampling techniques. The data-driven groups showed considerably greater differentiation on key social and neurocognitive circuit nodes than groups generated by diagnostic analyses or dimensional social cognitive analyses. The data-driven groups were validated through functional outcome and brain network property measures not included in the SNF model. Cutting across diagnostic boundaries, our approach can effectively identify new groups of people based on a profile of neuroimaging and behavioral data. Our findings bring us closer to disease subtyping that can be leveraged toward the targeting of specific neural circuitry among participant subgroups to ameliorate social cognitive and neurocognitive deficits.
Generalized Bootstrap Method for Assessment of Uncertainty in Semivariogram Inference
Olea, R.A.; Pardo-Iguzquiza, E.
2011-01-01
The semivariogram and its related function, the covariance, play a central role in classical geostatistics for modeling the average continuity of spatially correlated attributes. Whereas all methods are formulated in terms of the true semivariogram, in practice what can be used are estimated semivariograms and models based on samples. A generalized form of the bootstrap method to properly model spatially correlated data is used to advance knowledge about the reliability of empirical semivariograms and semivariogram models based on a single sample. Among the several methods available to generate spatially correlated resamples, we selected one based on the LU decomposition, and we used several examples to illustrate the approach. The first is a synthetic, isotropic, exhaustive sample following a normal distribution; the second is also synthetic but follows a non-Gaussian random field; and the third, an empirical sample, consists of actual raingauge measurements. Results show wider confidence intervals than those found previously by others with inadequate application of the bootstrap. Also, even for the Gaussian example, distributions for estimated semivariogram values and model parameters are positively skewed. In this sense, bootstrap percentile confidence intervals, which are not centered around the empirical semivariogram and do not require distributional assumptions for their construction, provide an achieved coverage similar to the nominal coverage. The latter cannot be achieved by symmetrical confidence intervals based on the standard error, regardless of whether the standard error is estimated from a parametric equation or from the bootstrap. © 2010 International Association for Mathematical Geosciences.
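The LU-based generation of spatially correlated resamples can be sketched as follows (here via the Cholesky factor, the LU decomposition of a positive-definite covariance); this is a schematic, not the authors' code.

```python
import numpy as np

def correlated_resamples(mean, cov, n_resamples, rng=None):
    """Generate spatially correlated resamples via the decomposition
    cov = L L^T: each resample is z* = mean + L e with e ~ N(0, I).

    mean : (n_sites,) vector of site values (or a fitted trend)
    cov  : (n_sites, n_sites) modeled covariance between sites
    """
    rng = np.random.default_rng(rng)
    L = np.linalg.cholesky(cov)
    e = rng.standard_normal((cov.shape[0], n_resamples))
    return mean[:, None] + L @ e    # (n_sites, n_resamples)
```

Each column is then treated like a new sample: an empirical semivariogram is refitted to it, and percentile confidence intervals are read off the distribution of refitted values, which is how the skewed, non-symmetric intervals in the study arise.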
Confidence intervals in Flow Forecasting by using artificial neural networks
NASA Astrophysics Data System (ADS)
Panagoulia, Dionysia; Tsekouras, George
2014-05-01
One of the major inadequacies in the implementation of Artificial Neural Networks (ANNs) for flow forecasting is the development of confidence intervals, because the relevant estimation cannot be carried out directly, in contrast to the classical forecasting methods. The variation in the ANN output is a measure of uncertainty in the model predictions based on the training data set. Different methods for uncertainty analysis, such as bootstrap, Bayesian and Monte Carlo methods, have already been proposed for hydrologic and geophysical models, while methods for confidence intervals, such as error output, re-sampling, and multi-linear regression adapted to ANNs, have been used for power load forecasting [1-2]. The aim of this paper is to present the re-sampling method for ANN prediction models and to develop it for next-day flow forecasting. The re-sampling method is based on ascending sorting of the errors between real and predicted values for all input vectors. The cumulative sample distribution function of the prediction errors is calculated, and the confidence intervals are estimated by keeping the intermediate values, rejecting the extreme values according to the desired confidence levels, and holding the intervals symmetrical in probability. For the application of the confidence intervals, input vectors are used from the Mesochora catchment in western-central Greece. The ANN's training algorithm is the stochastic back-propagation process with decreasing functions of learning rate and momentum term, for which an optimization process is conducted over the crucial parameter values, such as the number of neurons, the kind of activation functions, and the initial values and time parameters of the learning rate and momentum term. Input variables are historical data of previous days, such as flows and nonlinearly weather-related temperatures and rainfalls, selected by correlation analysis between the flow under prediction and each implicit input variable of different ANN structures [3]. The performance of each ANN structure is evaluated by voting analysis based on eleven criteria: the root mean square error (RMSE), the correlation index (R), the mean absolute percentage error (MAPE), the mean percentage error (MPE), the mean error (ME), the percentage volume in errors (VE), the percentage error in peak (MF), the normalized mean bias error (NMBE), the normalized root mean square error (NRMSE), the Nash-Sutcliffe model efficiency coefficient (E) and the modified Nash-Sutcliffe model efficiency coefficient (E1). The next-day flow for the test set is calculated using the best ANN structure's model. Consequently, the confidence intervals of various confidence levels for the training, evaluation and test sets are compared in order to explore how the confidence intervals generalise from the training and evaluation sets. [1] H.S. Hippert, C.E. Pedreira, R.C. Souza, "Neural networks for short-term load forecasting: A review and evaluation," IEEE Trans. on Power Systems, vol. 16, no. 1, 2001, pp. 44-55. [2] G. J. Tsekouras, N.E. Mastorakis, F.D. Kanellos, V.T. Kontargyri, C.D. Tsirekis, I.S. Karanasiou, Ch.N. Elias, A.D. Salis, P.A. Kontaxis, A.A.
Gialketsi: "Short term load forecasting in Greek interconnected power system using ANN: Confidence Interval using a novel re-sampling technique with corrective Factor", WSEAS International Conference on Circuits, Systems, Electronics, Control & Signal Processing, (CSECS '10), Vouliagmeni, Athens, Greece, December 29-31, 2010. [3] D. Panagoulia, I. Trichakis, G. J. Tsekouras: "Flow Forecasting via Artificial Neural Networks - A Study for Input Variables conditioned on atmospheric circulation", European Geosciences Union, General Assembly 2012 (NH1.1 / AS1.16 - Extreme meteorological and hydrological events induced by severe weather and climate change), Vienna, Austria, 22-27 April 2012.
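To make the re-sampling confidence-interval construction described above concrete, here is a minimal sketch under the assumption that prediction errors for all input vectors are available; the function name is ours.

```python
import numpy as np

def empirical_confidence_interval(y_true, y_pred, level=0.95):
    """Confidence interval from the sorted prediction errors: build the
    empirical CDF of the errors and cut off equal probability mass in
    both tails, so the interval is symmetrical in probability."""
    errors = np.sort(y_pred - y_true)        # ascending sorting of errors
    alpha = (1.0 - level) / 2.0
    lo = np.quantile(errors, alpha)          # lower tail cut-off
    hi = np.quantile(errors, 1.0 - alpha)    # upper tail cut-off
    return lo, hi   # add both to a new prediction to get its interval
```

Applying the same construction to the training, evaluation and test sets separately is what allows the generalisation of the intervals to be compared, as the abstract proposes.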
Hyperspectral Features of Oil-Polluted Sea Ice and the Response to the Contamination Area Fraction
Li, Ying; Liu, Chengyu; Xie, Feng
2018-01-01
Researchers have studied oil spills in open waters using remote sensors, but few have focused on extracting reflectance features of oil pollution on sea ice. An experiment was conducted on natural sea ice in Bohai Bay, China, to obtain the spectral reflectance of oil-contaminated sea ice. The spectral absorption index (SAI), spectral peak height (SPH), and wavelet detail coefficient (DWT d5) were calculated using stepwise multiple linear regression. The reflectances of several false targets, simulated by sediment, iron ore fines, coal dust, and melt pools, were also measured and analysed. The measured reflectances were resampled to the bands of five common sensors (GF-2, Landsat8-OLI, Sentinel3-OLCI, MODIS, and AVIRIS). Some significant spectral features could discriminate between oil-polluted and clean sea ice. The indices correlated well with the oil area fractions; all of the adjusted R2 values exceeded 0.9. The SPH model 1, based on spectral features at 507–670 and 1627–1746 nm, displayed the best fit. The resampled data indicated that these multi-spectral and hyper-spectral sensors could be used to detect crude oil on sea ice if the effects of noise and spatial resolution are neglected. The spectral features and their identified changes may provide a reference for sensor design and band selection. PMID:29342945
Resampling to Address the Winner's Curse in Genetic Association Analysis of Time to Event
Poirier, Julia G.; Faye, Laura L.; Dimitromanolakis, Apostolos; Paterson, Andrew D.; Sun, Lei
2015-01-01
The “winner's curse” is a subtle and difficult problem in the interpretation of genetic association, in which association estimates from large‐scale gene detection studies are larger in magnitude than those from subsequent replication studies. This is practically important because use of a biased estimate from the original study will yield an underestimate of sample size requirements for replication, leaving the investigators with an underpowered study. Motivated by investigation of the genetics of type 1 diabetes complications in a longitudinal cohort of participants in the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) Genetics Study, we apply a bootstrap resampling method in analysis of time to nephropathy under a Cox proportional hazards model, examining 1,213 single‐nucleotide polymorphisms (SNPs) in 201 candidate genes custom genotyped in 1,361 white probands. Among 15 top‐ranked SNPs, bias reduction in log hazard ratio estimates ranges from 43.1% to 80.5%. In simulation studies based on the observed DCCT/EDIC genotype data, genome‐wide bootstrap estimates for false-positive SNPs and for true-positive SNPs with low-to-moderate power are closer to the true values than uncorrected naïve estimates, but tend to overcorrect SNPs with high power. This bias-reduction technique is generally applicable for complex trait studies including quantitative, binary, and time-to-event traits. PMID:26411674
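A heavily simplified, schematic sketch of bootstrap bias reduction for a top-ranked effect is given below; it illustrates the idea of re-ranking within resamples, not the paper's exact estimator, and `estimates_fn` is a hypothetical user-supplied fitting routine (e.g., per-SNP Cox regressions).

```python
import numpy as np

def bootstrap_bias_reduced(estimates_fn, data, n_boot=500, rng=None):
    """Schematic bootstrap bias reduction for the top-ranked effect.

    estimates_fn(data) must return the vector of per-SNP effect
    estimates (e.g., log hazard ratios); data is an indexable array of
    subjects. The naive estimate of the top SNP is corrected by the
    average inflation seen when re-ranking within bootstrap resamples.
    """
    rng = np.random.default_rng(rng)
    est = estimates_fn(data)
    top = np.argmax(np.abs(est))
    inflation = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(data), len(data))   # resample subjects
        b_est = estimates_fn(data[idx])
        b_top = np.argmax(np.abs(b_est))              # re-rank in resample
        # inflation: the resample-winner's estimate in the resample
        # minus its estimate in the original data
        inflation.append(np.abs(b_est[b_top]) - np.abs(est[b_top]))
    return np.abs(est[top]) - np.mean(inflation)      # bias-reduced value
```

The correction shrinks estimates that owe part of their magnitude to selection, which is why it helps most for false positives and low-powered true positives, and can overcorrect high-powered ones.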
Coates, James; Jeyaseelan, Asha K; Ybarra, Norma; David, Marc; Faria, Sergio; Souhami, Luis; Cury, Fabio; Duclos, Marie; El Naqa, Issam
2015-04-01
We explore analytical and data-driven approaches to investigate the integration of genetic variations (single nucleotide polymorphisms [SNPs] and copy number variations [CNVs]) with dosimetric and clinical variables in modeling radiation-induced rectal bleeding (RB) and erectile dysfunction (ED) in prostate cancer patients. Sixty-two patients who underwent curative hypofractionated radiotherapy (66 Gy in 22 fractions) between 2002 and 2010 were retrospectively genotyped for CNV and SNP rs5489 in the xrcc1 DNA repair gene. Fifty-four patients had full dosimetric profiles. Two parallel modeling approaches were compared to assess the risk of severe RB (Grade⩾3) and ED (Grade⩾1): maximum-likelihood-estimated generalized Lyman-Kutcher-Burman (LKB) modeling and logistic regression. Statistical resampling based on cross-validation was used to evaluate model predictive power and generalizability to unseen data. Integration of the biological variables xrcc1 CNV and SNP improved the fit of the RB and ED analytical and data-driven models. Cross-validation of the generalized LKB models yielded increases in classification performance of 27.4% for RB and 14.6% for ED when xrcc1 CNV and SNP were included, respectively. Biological variables added to logistic regression modeling improved classification performance over standard dosimetric models by 33.5% for RB and 21.2% for ED. As a proof of concept, we demonstrated that the combination of genetic and dosimetric variables can provide significant improvement in NTCP prediction using analytical and data-driven approaches. The improvement in prediction performance was more pronounced in the data-driven approaches. Moreover, we have shown that CNVs, in addition to SNPs, may be useful structural genetic variants in predicting radiation toxicities. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
An object-oriented framework for medical image registration, fusion, and visualization.
Zhu, Yang-Ming; Cochoff, Steven M
2006-06-01
An object-oriented framework for image registration, fusion, and visualization was developed based on the classic model-view-controller paradigm. The framework employs many design patterns to facilitate legacy code reuse, manage software complexity, and enhance the maintainability and portability of the framework. Three sample applications built atop this framework illustrate its effectiveness: the first is for volume image grouping and re-sampling, the second for 2D registration and fusion, and the last for visualization of single images as well as registered volume images.
Parks, Nathan A.; Gannon, Matthew A.; Long, Stephanie M.; Young, Madeleine E.
2016-01-01
Analysis of event-related potential (ERP) data includes several steps to ensure that ERPs meet an appropriate level of signal quality. One such step, subject exclusion, rejects subject data if ERP waveforms fail to meet an appropriate level of signal quality. Subject exclusion is an important quality control step in the ERP analysis pipeline as it ensures that statistical inference is based only upon those subjects exhibiting clear evoked brain responses. This critical quality control step is most often performed simply through visual inspection of subject-level ERPs by investigators. Such an approach is qualitative, subjective, and susceptible to investigator bias, as there are no standards as to what constitutes an ERP of sufficient signal quality. Here, we describe a standardized and objective method for quantifying waveform quality in individual subjects and establishing criteria for subject exclusion. The approach uses bootstrap resampling of ERP waveforms (from a pool of all available trials) to compute a signal-to-noise ratio confidence interval (SNR-CI) for individual subject waveforms. The lower bound of this SNR-CI (SNRLB) yields an effective and objective measure of signal quality as it ensures that ERP waveforms statistically exceed a desired signal-to-noise criterion. SNRLB provides a quantifiable metric of individual subject ERP quality and eliminates the need for subjective evaluation of waveform quality by the investigator. We detail the SNR-CI methodology, establish the efficacy of employing this approach with Monte Carlo simulations, and demonstrate its utility in practice when applied to ERP datasets. PMID:26903849
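A minimal sketch of the bootstrap SNR-CI idea follows; the window definitions and the dB-scale SNR are our assumptions for illustration, so the paper's exact metric may differ.

```python
import numpy as np

def snr_lower_bound(trials, signal_win, noise_win, n_boot=1000,
                    alpha=0.05, rng=None):
    """Bootstrap lower confidence bound on ERP waveform SNR (SNR_LB).

    trials : (n_trials, n_samples) single-trial epochs for one subject
    signal_win, noise_win : slices for the post-stimulus signal window
        and the pre-stimulus baseline (noise) window
    """
    rng = np.random.default_rng(rng)
    n = trials.shape[0]
    snrs = np.empty(n_boot)
    for b in range(n_boot):
        # resample trials with replacement and average into an ERP
        erp = trials[rng.integers(0, n, n)].mean(axis=0)
        s = np.mean(erp[signal_win] ** 2)    # signal power
        v = np.mean(erp[noise_win] ** 2)     # noise power
        snrs[b] = 10.0 * np.log10(s / v)
    return np.quantile(snrs, alpha)          # lower bound of the SNR-CI
```

A subject is excluded only if this lower bound falls below a preset criterion, replacing visual inspection with a statistical rule.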
Spatial Resolution Characterization for QuickBird Image Products 2003-2004 Season
NASA Technical Reports Server (NTRS)
Blonski, Slawomir
2006-01-01
This presentation focuses on spatial resolution characterization for QuickBird panchromatic images in 2003-2004 and presents data measurements and analysis of SSC edge target deployment and edge response extraction and modeling. The results of the characterization are shown as values of the Modulation Transfer Function (MTF) at the Nyquist spatial frequency and as the Relative Edge Response (RER) components. The results show that RER is much less sensitive to accuracy of the curve fitting than the value of MTF at Nyquist frequency. Therefore, the RER/edge response slope is a more robust estimator of the digital image spatial resolution than the MTF. For the QuickBird panchromatic images, the RER is consistently equal to 0.5 for images processed with the Cubic Convolution resampling and to 0.8 for the MTF resampling.
Voice Conversion Using Pitch Shifting Algorithm by Time Stretching with PSOLA and Re-Sampling
NASA Astrophysics Data System (ADS)
Mousa, Allam
2010-01-01
Voice changing has many applications in industry and the commercial field. This paper emphasizes voice conversion using a pitch-shifting method that depends on detecting the pitch of the signal (fundamental frequency) using Simplified Inverse Filter Tracking (SIFT), changing it according to the target pitch period by time stretching with the Pitch Synchronous Overlap-Add (PSOLA) algorithm, and then resampling the signal in order to restore the original play rate. The same study was performed to examine the effect of voice conversion when Arabic speech signals are considered. Treatment of certain Arabic voiced vowels and conversion between male and female speech showed some expansion or compression in the resulting speech. A comparison in terms of pitch shifting is presented here. Analysis was performed for a single frame and for a full segmentation of speech.
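The final resampling stage of this pitch-shifting chain can be sketched as follows, assuming the PSOLA time-stretching has already been applied; this is an illustrative fragment, not the paper's implementation.

```python
import numpy as np
from scipy.signal import resample

def shift_pitch(stretched, factor):
    """Final stage of PSOLA-based pitch shifting.

    `stretched` is the signal after PSOLA time-stretching by `factor`
    (duration scaled by factor, pitch preserved). Resampling it back to
    the original number of samples restores the original duration and
    play rate while scaling the pitch by `factor` (>1 raises pitch).
    """
    n_out = int(round(len(stretched) / factor))
    return resample(stretched, n_out)   # Fourier-domain resampling
```

Splitting the job this way keeps the two effects independent: PSOLA moves duration without touching pitch, and resampling trades duration back for a pitch change.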
NASA Astrophysics Data System (ADS)
Wang, Jinliang; Wu, Xuejiao
2010-11-01
Geometric correction of imagery is a basic application of remote sensing technology, and its precision directly affects the accuracy and reliability of subsequent applications. The accuracy of geometric correction depends on many factors, including the correction model, the accuracy of the reference map, the number of ground control points (GCPs) and their spatial distribution, and the resampling method. The ETM+ image of the Kunming Dianchi Lake Basin and 1:50000 geographical maps were used to compare different correction methods. The results showed that: (1) correction errors exceeded one pixel, and sometimes several pixels, when the polynomial model was used; the correction accuracy was not stable when the Delaunay model was used; and correction errors were less than one pixel when the collinearity equation was used. (2) With 6, 9, 25 and 35 GCPs selected randomly for geometric correction using the polynomial model, the best result was obtained with 25 GCPs. (3) Among the resampling methods, nearest neighbor gave the highest image contrast and the fastest resampling, but poor continuity of pixel gray values; cubic convolution gave the worst contrast and the longest computation time. According to these results, bilinear resampling produced the best overall result.
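For reference, the bilinear resampling that performed best here can be sketched as a per-pixel interpolation; this is the textbook formulation, with in-bounds fractional coordinates assumed.

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Bilinear resampling of image `img` at fractional coordinates
    (x, y), as used when writing each pixel of the geometrically
    corrected image from the source image."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, img.shape[1] - 1)
    y1 = min(y0 + 1, img.shape[0] - 1)
    fx, fy = x - x0, y - y0
    # interpolate along x on the two bracketing rows, then along y
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bottom = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bottom
```

Averaging the four neighbors is what smooths the gray-value discontinuities of nearest neighbor while staying far cheaper than the 16-neighbor cubic convolution.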
Phu, Jack; Bui, Bang V; Kalloniatis, Michael; Khuu, Sieu K
2018-03-01
The number of subjects needed to establish the normative limits for visual field (VF) testing is not known. Using bootstrap resampling, we determined whether the ground truth mean, distribution limits, and standard deviation (SD) could be approximated using different set size (x) levels, in order to provide guidance for the number of healthy subjects required to obtain robust VF normative data. We analyzed the 500 Humphrey Field Analyzer (HFA) SITA-Standard results of 116 healthy subjects and 100 HFA full threshold results of 100 psychophysically experienced healthy subjects. These VFs were resampled (bootstrapped) to determine mean sensitivity, distribution limits (5th and 95th percentiles), and SD for different x and numbers of resamples. We also used the VF results of 122 glaucoma patients to determine the performance of ground truth and bootstrapped results in identifying and quantifying VF defects. An x of 150 (for SITA-Standard) and 60 (for full threshold) produced bootstrapped descriptive statistics that were no longer different from the original distribution limits and SD. Removing outliers produced similar results. Differences between original and bootstrapped limits in detecting glaucomatous defects were minimized at x = 250. Ground truth statistics of VF sensitivities could be approximated using set sizes that are significantly smaller than the original cohort. Outlier removal facilitates the use of Gaussian statistics and does not significantly affect the distribution limits. We provide guidance for choosing the cohort size for different levels of error when performing normative comparisons with glaucoma patients.
Bias Corrections for Regional Estimates of the Time-averaged Geomagnetic Field
NASA Astrophysics Data System (ADS)
Constable, C.; Johnson, C. L.
2009-05-01
We assess two sources of bias in the time-averaged geomagnetic field (TAF) and paleosecular variation (PSV): inadequate temporal sampling, and the use of unit vectors in deriving temporal averages of the regional geomagnetic field. For the temporal sampling question we use statistical resampling of existing data sets to minimize and correct for bias arising from uneven temporal sampling in TAF and PSV studies. The techniques are illustrated using data derived from Hawaiian lava flows for 0–5 Ma: directional observations are an updated version of a previously published compilation of paleomagnetic directional data centered on ±20° latitude by Lawrence et al. (2006); intensity data are drawn from Tauxe & Yamazaki (2007). We conclude that poor temporal sampling can produce biased estimates of TAF and PSV, and that resampling to an appropriate statistical distribution of ages reduces this bias. We suggest that similar resampling be attempted as a bias correction for all regional paleomagnetic data to be used in TAF and PSV modeling. The second potential source of bias is the use of directional data in place of full vector data to estimate the average field. This is investigated for the full vector subset of the updated Hawaiian data set. Lawrence, K.P., C.G. Constable, and C.L. Johnson, 2006, Geochem. Geophys. Geosyst., 7, Q07007, DOI 10.1029/2005GC001181. Tauxe, L., & Yamazaki, T., 2007, Treatise on Geophysics, 5, Geomagnetism, Elsevier, Amsterdam, Chapter 13, p. 509.
An Automated, Adaptive Framework for Optimizing Preprocessing Pipelines in Task-Based Functional MRI
Churchill, Nathan W.; Spring, Robyn; Afshin-Pour, Babak; Dong, Fan; Strother, Stephen C.
2015-01-01
BOLD fMRI is sensitive to blood-oxygenation changes correlated with brain function; however, it is limited by relatively weak signal and significant noise confounds. Many preprocessing algorithms have been developed to control noise and improve signal detection in fMRI. Although the chosen set of preprocessing and analysis steps (the “pipeline”) significantly affects signal detection, pipelines are rarely quantitatively validated in the neuroimaging literature, due to complex preprocessing interactions. This paper outlines and validates an adaptive resampling framework for evaluating and optimizing preprocessing choices by optimizing data-driven metrics of task prediction and spatial reproducibility. Compared to standard “fixed” preprocessing pipelines, this optimization approach significantly improves independent validation measures of within-subject test-retest reliability, between-subject activation overlap, and behavioural prediction accuracy. We demonstrate that preprocessing choices function as implicit model regularizers, and that improvements due to pipeline optimization generalize across a range of simple to complex experimental tasks and analysis models. Results are shown for brief scanning sessions (<3 minutes each), demonstrating that with pipeline optimization, it is possible to obtain reliable results and brain-behaviour correlations in relatively small datasets. PMID:26161667
Pallmann, Philip; Schaarschmidt, Frank; Hothorn, Ludwig A; Fischer, Christiane; Nacke, Heiko; Priesnitz, Kai U; Schork, Nicholas J
2012-11-01
Comparing diversities between groups is a task biologists are frequently faced with, for example in ecological field trials or when dealing with metagenomics data. However, researchers often waver about which measure of diversity to choose as there is a multitude of approaches available. As Jost (2008, Molecular Ecology, 17, 4015) has pointed out, widely used measures such as the Shannon or Simpson index have undesirable properties which make them hard to compare and interpret. Many of the problems associated with the use of these 'raw' indices can be corrected by transforming them into 'true' diversity measures. We introduce a technique that allows the comparison of two or more groups of observations and simultaneously tests a user-defined selection of a number of 'true' diversity measures. This procedure yields multiplicity-adjusted P-values according to the method of Westfall and Young (1993, Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment, 49, 941), which ensures that the rate of false positives (type I error) does not rise when the number of groups and/or diversity indices is extended. Software is available in the R package 'simboot'. © 2012 Blackwell Publishing Ltd.
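A minimal sketch of the single-step Westfall-Young maxT adjustment for a two-group comparison is shown below; the 'simboot' package is more general, and the t-statistic choice here is our own simplification.

```python
import numpy as np
from scipy import stats

def westfall_young_maxt(x, groups, n_perm=1000, rng=None):
    """Single-step Westfall-Young maxT adjustment for several diversity
    indices tested at once.

    x : (n_obs, n_indices) matrix of 'true' diversity values per plot
    groups : boolean array, True for group A, False for group B
    """
    rng = np.random.default_rng(rng)
    t_obs = np.abs(stats.ttest_ind(x[groups], x[~groups], axis=0).statistic)
    max_t = np.empty(n_perm)
    for p in range(n_perm):
        perm = rng.permutation(groups)     # relabel observations
        t = stats.ttest_ind(x[perm], x[~perm], axis=0).statistic
        max_t[p] = np.max(np.abs(t))       # max over all indices
    # adjusted p-value: share of permutation maxima exceeding each t_obs
    return (max_t[None, :] >= t_obs[:, None]).mean(axis=1)
```

Because every index is compared against the same permutation distribution of the maximum statistic, adding further indices does not inflate the family-wise type I error, which is the property the abstract highlights.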
Producing Alaska interim land cover maps from Landsat digital and ancillary data
Fitzpatrick-Lins, Katherine; Doughty, Eileen Flanagan; Shasby, Mark; Loveland, Thomas R.; Benjamin, Susan
1987-01-01
In 1985, the U.S. Geological Survey initiated a research program to produce 1:250,000-scale land cover maps of Alaska using digital Landsat multispectral scanner data and ancillary data and to evaluate the potential of establishing a statewide land cover mapping program using this approach. The geometrically corrected and resampled Landsat pixel data are registered to a Universal Transverse Mercator (UTM) projection, along with arc-second digital elevation model data used as an aid in the final computer classification. Area summaries of the land cover classes are extracted by merging the Landsat digital classification files with the U.S. Bureau of Land Management's Public Land Survey digital file. Registration of the digital land cover data is verified and control points are identified so that a laser plotter can produce screened film separates for printing the classification data at map scale directly from the digital file. The final land cover classification is retained both as a color map at 1:250,000 scale registered to the U.S. Geological Survey base map, with area summaries by township and range on the reverse, and as a digital file where it may be used as a category in a geographic information system.
Saunders, Christina T; Blume, Jeffrey D
2017-10-26
Mediation analysis explores the degree to which an exposure's effect on an outcome is diverted through a mediating variable. We describe a classical regression framework for conducting mediation analyses in which estimates of causal mediation effects and their variance are obtained from the fit of a single regression model. The vector of changes in exposure pathway coefficients, which we named the essential mediation components (EMCs), is used to estimate standard causal mediation effects. Because these effects are often simple functions of the EMCs, an analytical expression for their model-based variance follows directly. Given this formula, it is instructive to revisit the performance of routinely used variance approximations (e.g., delta method and resampling methods). Requiring the fit of only one model reduces the computation time required for complex mediation analyses and permits the use of a rich suite of regression tools that are not easily implemented on a system of three equations, as would be required in the Baron-Kenny framework. Using data from the BRAIN-ICU study, we provide examples to illustrate the advantages of this framework and compare it with the existing approaches. © The Author 2017. Published by Oxford University Press.
An empirical Bayes approach to network recovery using external knowledge.
Kpogbezan, Gino B; van der Vaart, Aad W; van Wieringen, Wessel N; Leday, Gwenaël G R; van de Wiel, Mark A
2017-09-01
Reconstruction of a high-dimensional network may benefit substantially from the inclusion of prior knowledge on the network topology. In the case of gene interaction networks such knowledge may come for instance from pathway repositories like KEGG, or be inferred from data of a pilot study. The Bayesian framework provides a natural means of including such prior knowledge. Based on a Bayesian Simultaneous Equation Model, we develop an appealing Empirical Bayes (EB) procedure that automatically assesses the agreement of the used prior knowledge with the data at hand. We use a variational Bayes method for approximating posterior densities and compare its accuracy with that of a Gibbs sampling strategy. Our method is computationally fast and can outperform known competitors. In a simulation study, we show that accurate prior data can greatly improve the reconstruction of the network, but need not harm the reconstruction if wrong. We demonstrate the benefits of the method in an analysis of gene expression data from GEO. In particular, the edges of the recovered network have superior reproducibility (compared to that of competitors) over resampled versions of the data. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Preacher, Kristopher J; Hayes, Andrew F
2008-08-01
Hypotheses involving mediation are common in the behavioral sciences. Mediation exists when a predictor affects a dependent variable indirectly through at least one intervening variable, or mediator. Methods to assess mediation involving multiple simultaneous mediators have received little attention in the methodological literature despite a clear need. We provide an overview of simple and multiple mediation and explore three approaches that can be used to investigate indirect processes, as well as methods for contrasting two or more mediators within a single model. We present an illustrative example, assessing and contrasting potential mediators of the relationship between the helpfulness of socialization agents and job satisfaction. We also provide SAS and SPSS macros, as well as Mplus and LISREL syntax, to facilitate the use of these methods in applications.
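In the spirit of the bootstrap procedures such macros implement, a minimal single-mediator sketch of a percentile bootstrap CI for the indirect effect a·b might look like this; the ordinary least squares paths and variable names are illustrative, not the macros' internals.

```python
import numpy as np

def boot_indirect(x, m, y, n_boot=5000, alpha=0.05, rng=None):
    """Percentile bootstrap CI for a simple indirect effect a*b.

    a-path: slope of mediator m regressed on predictor x.
    b-path: coefficient of m in the regression of y on m and x.
    """
    rng = np.random.default_rng(rng)
    n = len(x)
    ab = np.empty(n_boot)
    for k in range(n_boot):
        i = rng.integers(0, n, n)               # resample cases
        a = np.polyfit(x[i], m[i], 1)[0]        # a-path slope
        X = np.column_stack([np.ones(n), m[i], x[i]])
        b = np.linalg.lstsq(X, y[i], rcond=None)[0][1]  # b-path coef
        ab[k] = a * b
    return np.quantile(ab, [alpha / 2, 1 - alpha / 2])
```

The indirect effect is declared significant when this interval excludes zero; with multiple mediators, each product of paths gets its own bootstrapped distribution and contrasts between mediators can be bootstrapped the same way.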
A modification of the fusion model for log polar coordinates
NASA Technical Reports Server (NTRS)
Griswold, N. C.; Weiman, Carl F. R.
1990-01-01
The original fusion mechanism for stereo range analysis restricted the depth of field and therefore required a shift-variant mechanism in the peripheral area to find disparity. Misregistration was prevented by restricting the disparity detection range to a neighborhood spanned by the directional edge detection filters. This transformation was essentially accomplished by a nonuniform resampling of the original image in the horizontal direction. While this is easily implemented for digital processing, the approach does not (in the peripheral vision area) model the log-conformal mapping which is known to occur in the human visual system. This paper therefore modifies the original fusion concept in the peripheral area to include the polar exponential grid-to-log conformal tessellation. Examples of the fusion process resulting in accurate disparity values are given.
NASA Astrophysics Data System (ADS)
Montzka, Carsten; Herbst, Michael; Weihermüller, Lutz; Verhoef, Anne; Vereecken, Harry
2017-07-01
Agroecosystem models, regional and global climate models, and numerical weather prediction models require adequate parameterization of soil hydraulic properties. These properties are fundamental for describing and predicting water and energy exchange processes at the transition zone between solid earth and atmosphere, and regulate evapotranspiration, infiltration and runoff generation. Hydraulic parameters describing the soil water retention (WRC) and hydraulic conductivity (HCC) curves are typically derived from soil texture via pedotransfer functions (PTFs). Resampling of those parameters for specific model grids is typically performed by different aggregation approaches such as spatial averaging and the use of dominant textural properties or soil classes. These aggregation approaches introduce uncertainty, bias and parameter inconsistencies throughout spatial scales due to nonlinear relationships between hydraulic parameters and soil texture. Therefore, we present a method to scale hydraulic parameters to individual model grids and provide a global data set that overcomes the mentioned problems. The approach is based on Miller-Miller scaling in the relaxed form of Warrick, which fits the parameters of the WRC through all sub-grid WRCs to provide an effective parameterization for the grid cell at model resolution; at the same time it preserves the information of sub-grid variability of the water retention curve by deriving local scaling parameters. Based on the Mualem-van Genuchten approach we also derive the unsaturated hydraulic conductivity from the water retention functions, thereby assuming that the local parameters are also valid for this function. In addition, via the Warrick scaling parameter λ, information on global sub-grid scaling variance is given that enables modellers to improve dynamical downscaling of (regional) climate models or to perturb hydraulic parameters for model ensemble output generation. The present analysis is based on the ROSETTA PTF of Schaap et al. (2001) applied to the SoilGrids1km data set of Hengl et al. (2014). The example data set is provided at a global resolution of 0.25° at https://doi.org/10.1594/PANGAEA.870605.
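For reference, the Mualem-van Genuchten closed forms underlying the WRC and HCC mentioned above can be written compactly as follows; this shows only the standard functions, not the Miller-Miller/Warrick scaling step itself.

```python
import numpy as np

def van_genuchten_wrc(h, theta_r, theta_s, alpha, n):
    """van Genuchten water retention curve theta(h), suction head h > 0.

    theta_r, theta_s : residual and saturated water contents
    alpha (1/cm), n  : shape parameters (m = 1 - 1/n)
    """
    m = 1.0 - 1.0 / n
    se = (1.0 + (alpha * h) ** n) ** (-m)    # effective saturation
    return theta_r + (theta_s - theta_r) * se

def mualem_hcc(h, ks, alpha, n, l=0.5):
    """Mualem unsaturated hydraulic conductivity K(h) from the same
    parameters, with saturated conductivity ks and tortuosity l."""
    m = 1.0 - 1.0 / n
    se = (1.0 + (alpha * h) ** n) ** (-m)
    return ks * se ** l * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2
```

The nonlinearity of both functions in alpha and n is exactly why naive averaging of sub-grid parameters biases grid-cell behavior, motivating the scaling approach of the paper.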
NASA Astrophysics Data System (ADS)
Arnaud, Patrick; Cantet, Philippe; Odry, Jean
2017-11-01
Flood frequency analyses (FFAs) are needed for flood risk management. Many methods exist ranging from classical purely statistical approaches to more complex approaches based on process simulation. The results of these methods are associated with uncertainties that are sometimes difficult to estimate due to the complexity of the approaches or the number of parameters, especially for process simulation. This is the case of the simulation-based FFA approach called SHYREG presented in this paper, in which a rainfall generator is coupled with a simple rainfall-runoff model in an attempt to estimate the uncertainties due to the estimation of the seven parameters needed to estimate flood frequencies. The six parameters of the rainfall generator are mean values, so their theoretical distribution is known and can be used to estimate the generator uncertainties. In contrast, the theoretical distribution of the single hydrological model parameter is unknown; consequently, a bootstrap method is applied to estimate the calibration uncertainties. The propagation of uncertainty from the rainfall generator to the hydrological model is also taken into account. This method is applied to 1112 basins throughout France. Uncertainties coming from the SHYREG method and from purely statistical approaches are compared, and the results are discussed according to the length of the recorded observations, basin size and basin location. Uncertainties of the SHYREG method decrease as the basin size increases or as the length of the recorded flow increases. Moreover, the results show that the confidence intervals of the SHYREG method are relatively small despite the complexity of the method and the number of parameters (seven). This is due to the stability of the parameters and takes into account the dependence of uncertainties due to the rainfall model and the hydrological calibration. Indeed, the uncertainties on the flow quantiles are on the same order of magnitude as those associated with the use of a statistical law with two parameters (here generalised extreme value Type I distribution) and clearly lower than those associated with the use of a three-parameter law (here generalised extreme value Type II distribution). For extreme flood quantiles, the uncertainties are mostly due to the rainfall generator because of the progressive saturation of the hydrological model.
Optimization of Trade-offs in Error-free Image Transmission
NASA Astrophysics Data System (ADS)
Cox, Jerome R.; Moore, Stephen M.; Blaine, G. James; Zimmerman, John B.; Wallace, Gregory K.
1989-05-01
The availability of ubiquitous wide-area channels of both modest cost and higher transmission rate than voice-grade lines promises to allow the expansion of electronic radiology services to a larger community. The bandwidths of the new services becoming available from the Integrated Services Digital Network (ISDN) are typically limited to 128 Kb/s, almost two orders of magnitude lower than popular LANs can support. Using Discrete Cosine Transform (DCT) techniques, a compressed approximation to an image may be rapidly transmitted. However, intensity or resampling transformations of the reconstructed image may reveal otherwise invisible artifacts of the approximate encoding. A progressive transmission scheme reported in ISO Working Paper N800 offers an attractive solution to this problem by rapidly reconstructing an apparently undistorted image from the DCT coefficients and then subsequently transmitting the error image corresponding to the difference between the original and the reconstructed images. This approach achieves an error-free transmission without sacrificing the perception of rapid image delivery. Furthermore, subsequent intensity and resampling manipulations can be carried out with confidence. DCT coefficient precision affects the amount of error information that must be transmitted and, hence, the delivery speed of error-free images. This study calculates the overall information coding rate for six radiographic images as a function of DCT coefficient precision. The results demonstrate that a minimum occurs for each of the six images at an average coefficient precision of between 0.5 and 1.0 bits per pixel (b/p). Apparently undistorted versions of these six images can be transmitted with a coding rate of between 0.25 and 0.75 b/p while error-free versions can be transmitted with an overall coding rate between 4.5 and 6.5 b/p.
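The two-stage scheme, a quantized DCT approximation followed by a losslessly transmitted error image, can be sketched as follows; the quantization rule is an illustrative stand-in for the coefficient-precision settings studied in the paper.

```python
import numpy as np
from scipy.fft import dctn, idctn

def approximate_and_error(image, keep_bits=1.0):
    """Two-stage transmission sketch: send a coarsely quantized DCT
    approximation first, then the residual error image so the final
    result is error-free. `keep_bits` is a stand-in for coefficient
    precision: a coarser step corresponds to fewer bits per pixel."""
    coeffs = dctn(image.astype(float), norm='ortho')
    step = np.max(np.abs(coeffs)) / (2 ** (8 * keep_bits))  # illustrative
    quantized = np.round(coeffs / step) * step   # reduced-precision coeffs
    approx = np.round(idctn(quantized, norm='ortho'))
    error = image - approx       # transmitted losslessly afterwards
    return approx, error         # approx + error reproduces image exactly
```

Finer coefficient precision shrinks the error image but enlarges the first stage, so the total coding rate has the interior minimum the study measures.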
NASA Astrophysics Data System (ADS)
Miao, Yonghao; Zhao, Ming; Lin, Jing; Lei, Yaguo
2017-08-01
The extraction of periodic impulses, which are important indicators of rolling bearing faults, from vibration signals is of considerable significance for fault diagnosis. Maximum correlated kurtosis deconvolution (MCKD), developed from minimum entropy deconvolution (MED), has been proven an efficient tool for enhancing periodic impulses in the diagnosis of rolling element bearings and gearboxes. However, challenges remain when MCKD is applied to bearings operating under harsh working conditions. The difficulties mainly come from the rigorous requirements on multiple input parameters and the complicated resampling process. To overcome these limitations, an improved MCKD (IMCKD) is presented in this paper. The new method estimates the iterative period by calculating the autocorrelation of the envelope signal rather than relying on a provided prior period. Moreover, the iterative period gradually approaches the true fault period, as it is updated after every iterative step. The new method selects the maximum-kurtosis filtered signal as the final choice from all candidates within the assigned iteration count, so it is not misled by spurious high-kurtosis impulse signals. Compared with MCKD, IMCKD has three advantages. First, without requiring a prior period or a choice of the order of shift, IMCKD is more efficient and more robust. Second, the resampling process is unnecessary for IMCKD, which greatly simplifies the subsequent frequency spectrum analysis and envelope spectrum analysis without resetting the sampling rate. Third, IMCKD performs significantly better in diagnosing compound bearing faults, which expands its application range. Finally, the effectiveness and superiority of IMCKD are validated on a number of simulated bearing fault signals and by application to compound-fault and single-fault diagnosis of a locomotive bearing.
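The period-estimation step that replaces the user-supplied prior period in IMCKD can be sketched as follows; this shows only the envelope-autocorrelation idea, with function and parameter names of our own choosing.

```python
import numpy as np
from scipy.signal import hilbert

def estimate_fault_period(x, fs, min_period=None):
    """Estimate the iterative period from the autocorrelation of the
    signal envelope, rather than from a provided prior period.

    x : vibration signal, fs : sampling rate (Hz)
    min_period : optional lower bound (s) to skip the zero-lag region
    """
    env = np.abs(hilbert(x))              # envelope via Hilbert transform
    env = env - env.mean()
    ac = np.correlate(env, env, mode='full')[env.size - 1:]  # lags >= 0
    start = int(min_period * fs) if min_period else 1
    lag = start + np.argmax(ac[start:])   # first dominant peak
    return lag / fs                       # estimated period in seconds
```

Re-running this estimate after every deconvolution step is what lets the iterative period drift toward the true fault period instead of being fixed in advance.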
NASA Astrophysics Data System (ADS)
Khoshkbar Sadigh, Arash
Part I: Dynamic Voltage Restorer. In present power grids, voltage sags are recognized as a serious threat and a frequently occurring power-quality problem, with costly consequences such as tripping of sensitive loads and production loss. Consequently, the demand for high power quality and voltage stability has become a pressing issue. The dynamic voltage restorer (DVR), a custom power device, is a more effective and direct solution for "restoring" the quality of voltage at its load-side terminals when the quality of voltage at its source-side terminals is disturbed. In the first part of this thesis, a DVR configuration with no need for a bulky dc-link capacitor or energy storage is proposed. This reduces the size of the DVR and increases the reliability of the circuit. In addition, the proposed DVR topology is based on a high-frequency isolation transformer, resulting in a smaller transformer. The proposed DVR circuit, which is suitable for both low- and medium-voltage applications, is based on dc-ac converters connected in series to split the main dc link between the inputs of the dc-ac converters. This feature makes it possible to use modular dc-ac converters and to utilize low-voltage components in these converters whenever the DVR is required in medium-voltage applications. The proposed configuration is tested under different conditions of load power factor and grid voltage harmonics, and it is shown that the proposed DVR can compensate voltage sags effectively and protect sensitive loads. Following the proposed DVR topology, a fundamental voltage amplitude detection method applicable to both single- and three-phase systems for DVR applications is proposed. Its advantages include applicability in a distorted power grid without any low-pass filter, precise and reliable detection, and simple computation and implementation without a phase-locked loop or lookup table. The proposed method has been verified by simulation and experimental tests under various conditions, considering all possible cases such as different voltage sag depths (VSD), different points-on-wave (POW) at which the voltage sag occurs, harmonic distortion, line frequency variation, and phase jump (PJ). Furthermore, the ripple in the fundamental voltage amplitude calculated by the proposed method, and its error, are analyzed considering line frequency variation together with harmonic distortion. The best and worst detection times of the proposed method were measured as 1 ms and 8.8 ms, respectively. Finally, the proposed method is compared with other voltage sag detection methods available in the literature. Part 2: Power System Modeling for Renewable Energy Integration. As power distribution systems evolve into more complex networks, electrical engineers have to rely on software tools to perform circuit analysis. There are dozens of powerful software tools available on the market to perform power system studies. Although their main functions are similar, there are differences in features and formatting structures to suit specific applications. This creates challenges for transferring power system circuit model data (PSCMD) between different software packages and rebuilding the same circuit in the second software environment. The objective of this part of the thesis is to develop a Unified Platform (UP) to facilitate transferring PSCMD among different software packages and to relieve the challenges of the circuit model conversion process.
UP uses a commonly available spreadsheet file with a defined format that any home software can write data to and any destination software can read data from, via a script-based application called the PSCMD transfer application. The main considerations in developing the UP are to minimize manual intervention and to import a one-line diagram into the destination software, or export it from the source software, with all details needed for load flow, short circuit and other analyses. In this study, ETAP, OpenDSS, and GridLab-D are considered, and PSCMD transfer applications written in MATLAB have been developed for each of these to read the circuit model data provided in the UP spreadsheet. To test the developed PSCMD transfer applications, circuit model data of a test circuit and of a power distribution circuit from Southern California Edison (SCE), a utility company, both built in CYME, were exported into spreadsheet files according to the UP format. Thereafter, the circuit model data were imported successfully from the spreadsheet files into the above-mentioned software using the PSCMD transfer applications developed for each package. After the studied SCE circuit was transferred into OpenDSS using the proposed UP scheme and the developed application, it was studied to investigate the impacts of large-scale solar energy penetration. The main challenge of solar energy integration into the power grid is its intermittent nature (i.e., discontinuity of output power) due to cloud shading of photovoltaic panels, which depends on weather conditions. To conduct this study, the OpenDSS time-series simulation feature, required because of the intermittency of solar energy, is utilized. The study examines the impacts of intermittent solar energy penetration, especially at high-variability points, on voltage fluctuation and on the operation of capacitor banks and voltage regulators. In addition, the necessity of interpolating and resampling unequally spaced time-series measurement data into equally spaced time-series data, as well as the effect of the resampling time-interval on the amount of error, is discussed. Two applications are developed in MATLAB to perform the interpolation and resampling and to calculate the error for different resampling time-intervals, in order to identify a suitable resampling time-interval. Furthermore, an approach based on cumulative distributions of line/cable lengths by type and of load power ratings is presented to prioritize the loads, lines and cables at which meters should be installed to have the greatest effect on model validation.
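A minimal sketch of the interpolation/resampling utility described above, converting unequally spaced measurements to an equally spaced series and scoring the error for a candidate time-interval, might look like this (in Python for illustration, rather than the thesis's MATLAB applications).

```python
import numpy as np

def resample_uniform(t, x, dt):
    """Interpolate unequally spaced measurements (t, x) onto an equally
    spaced grid with interval dt, and report an error proxy for this dt.

    t must be strictly increasing; x are the measured values.
    """
    t_uni = np.arange(t[0], t[-1], dt)
    x_uni = np.interp(t_uni, t, x)          # linear interpolation
    # error proxy: interpolate back to the original timestamps and
    # compare with the measurements that were actually taken
    x_back = np.interp(t, t_uni, x_uni)
    rmse = np.sqrt(np.mean((x_back - x) ** 2))
    return t_uni, x_uni, rmse

# sweep candidate intervals to pick a suitable resampling time-interval:
# for dt in (1.0, 5.0, 15.0, 60.0):
#     print(dt, resample_uniform(t, x, dt)[2])
```

Sweeping dt this way exposes the trade-off the thesis discusses: coarser intervals shrink the time-series simulation but smooth away exactly the high-variability points that matter for solar intermittency.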
Inference on periodicity of circadian time series.
Costa, Maria J; Finkenstädt, Bärbel; Roche, Véronique; Lévi, Francis; Gould, Peter D; Foreman, Julia; Halliday, Karen; Hall, Anthony; Rand, David A
2013-09-01
Estimation of the period length of time-course data from cyclical biological processes, such as those driven by the circadian pacemaker, is crucial for inferring the properties of the biological clock found in many living organisms. We propose a methodology for period estimation based on spectrum resampling (SR) techniques. Simulation studies show that SR is superior and more robust to non-sinusoidal and noisy cycles than a currently used routine based on Fourier approximations. In addition, a simple fit to the oscillations using linear least squares is available, together with a non-parametric test for detecting changes in period length which allows for period estimates with different variances, as frequently encountered in practice. The proposed methods are motivated by and applied to various data examples from chronobiology.
Yeo, Boon Y.; McLaughlin, Robert A.; Kirk, Rodney W.; Sampson, David D.
2012-01-01
We present a high-resolution three-dimensional position tracking method that allows an optical coherence tomography (OCT) needle probe to be scanned laterally by hand, providing the high degree of flexibility and freedom required in clinical usage. The method is based on a magnetic tracking system, which is augmented by cross-correlation-based resampling and a two-stage moving window average algorithm to improve upon the tracker's limited intrinsic spatial resolution, achieving 18 µm RMS position accuracy. A proof-of-principle system was developed, with successful image reconstruction demonstrated on phantoms and on ex vivo human breast tissue validated against histology. This freehand scanning method could contribute toward clinical implementation of OCT needle imaging. PMID:22808429
The GOLM-database standard- a framework for time-series data management based on free software
NASA Astrophysics Data System (ADS)
Eichler, M.; Francke, T.; Kneis, D.; Reusser, D.
2009-04-01
Monitoring and modelling projects usually involve time series data originating from different sources. File formats, temporal resolution and meta-data documentation rarely adhere to a common standard. As a result, much effort is spent on converting, harmonizing, merging, checking, resampling and reformatting these data. Moreover, in work groups or during the course of time, these tasks tend to be carried out redundantly and repeatedly, especially when new data become available. The resulting duplication of data in various formats consumes additional resources. We propose a database structure and complementary scripts for facilitating these tasks. The GOLM (General Observation and Location Management) framework allows for import and storage of time series data of different types while assisting in meta-data documentation, plausibility checking and harmonization. The imported data can be visually inspected and their coverage among locations and variables may be visualized. Supplementing scripts provide options for data export for selected stations and variables and resampling of the data to the desired temporal resolution. These tools can, for example, be used for generating model input files or reports. Since GOLM fully supports network access, the system can be used efficiently by distributed working groups accessing the same data over the internet. GOLM's database structure and the complementary scripts can easily be customized to specific needs. All involved software, such as MySQL, R, PHP and OpenOffice, as well as the scripts for building and using the database, including documentation, are free for download. GOLM was developed out of the practical requirements of the OPAQUE project. It has been tested and further refined in the ERANET-CRUE and SESAM projects, all of which used GOLM to manage meteorological, hydrological and/or water quality data.
NASA Astrophysics Data System (ADS)
Wamser, Kyle
Hyperspectral imagery and the corresponding ability to conduct analysis below the pixel level have tremendous potential to aid in landcover monitoring. During large ecosystem restoration projects, monitoring specific aspects of the recovery over large and often inaccessible areas under constrained finances is a major challenge. The Civil Air Patrol's Airborne Real-time Cueing Hyperspectral Enhanced Reconnaissance (ARCHER) system can provide hyperspectral data in most parts of the United States at relatively low cost. Although designed specifically for locating downed aircraft, the imagery holds the potential to identify specific aspects of landcover at far greater fidelity than traditional multispectral means. The goals of this research were to improve the use of ARCHER hyperspectral imagery to classify sub-canopy and open-area vegetation in coniferous forests of the Southern Rockies, and to determine how much fidelity might be lost when a 1 meter spatial resolution baseline is resampled to 2 and 5 meter pixel sizes to simulate higher-altitude collection. Based on analysis comparing linear spectral unmixing with a traditional supervised classification, linear spectral unmixing proved statistically superior. More importantly, linear spectral unmixing provided additional sub-pixel information that was unavailable using other techniques. The second goal, determining fidelity loss as a function of spatial resolution, was more difficult to assess because of how the data are represented. Furthermore, the 2 and 5 meter imagery were obtained by resampling the 1 meter imagery and therefore may not be representative of the quality of imagery actually collected at 2 or 5 meters. Ultimately, the information derived from this research may be useful in better utilizing hyperspectral imagery for forest monitoring and assessment.
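A common way to realize the linear spectral unmixing used here, sketched below, solves a non-negative least-squares problem with a softly enforced sum-to-one constraint; this is a generic formulation, not the processing chain used in the research.

```python
import numpy as np
from scipy.optimize import nnls

def unmix(pixel, endmembers, delta=1e3):
    """Fully constrained linear spectral unmixing of one pixel.

    pixel      : (n_bands,) measured spectrum
    endmembers : (n_bands, n_endmembers) spectral library
    Abundances are non-negative; the sum-to-one constraint is imposed
    softly by augmenting the system with a heavily weighted row of ones
    (weight `delta`).
    """
    E = np.vstack([endmembers, delta * np.ones(endmembers.shape[1])])
    b = np.append(pixel, delta)
    abundances, _ = nnls(E, b)
    return abundances   # per-endmember sub-pixel fractions
```

The returned fractions are the sub-pixel information that a hard supervised classifier cannot provide, which is the advantage the study found for unmixing.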
A non-parametric peak calling algorithm for DamID-Seq.
Li, Renhua; Hempel, Leonie U; Jiang, Tingbo
2015-01-01
Protein-DNA interactions play a significant role in gene regulation and expression. In order to identify transcription factor binding sites (TFBS) of doublesex (DSX), an important transcription factor in sex determination, we applied the DNA adenine methylation identification (DamID) technology to the fat body tissue of Drosophila, followed by deep sequencing (DamID-Seq). One feature of DamID-Seq data is that induced adenine methylation signals are not assured to be symmetrically distributed at TFBS, which renders the existing peak calling algorithms for ChIP-Seq, including SPP and MACS, inappropriate for DamID-Seq data. This challenged us to develop a new algorithm for peak calling. A challenge in peak calling based on sequence data is estimating the averaged behavior of background signals. We applied a bootstrap resampling method to short sequence reads in the control (Dam only). After data quality checking and mapping reads to a reference genome, the peak calling procedure comprises the following steps: 1) reads resampling; 2) reads scaling (normalization) and computing signal-to-noise fold changes; 3) filtering; 4) calling peaks based on a statistically significant threshold. This is a non-parametric method for peak calling (NPPC). We also used irreproducible discovery rate (IDR) analysis, as well as ChIP-Seq data, to compare the peaks called by the NPPC. We identified approximately 6,000 peaks for DSX, which point to 1,225 genes related to the fat body tissue difference between female and male Drosophila. Statistical evidence from IDR analysis indicated that these peaks are reproducible across biological replicates. In addition, these peaks are comparable to those identified by use of ChIP-Seq on S2 cells, in terms of peak number, location, and peak width.
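The NPPC code itself is not given in the abstract; the sketch below only illustrates the bootstrap step, resampling binned Dam-only control counts to estimate the background level before computing fold changes. The bin counts and the percentile cutoff are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
control_bins = rng.poisson(5.0, size=10_000)     # stand-in Dam-only counts per bin
treatment_bins = rng.poisson(6.0, size=10_000)   # stand-in Dam-fusion counts

# Bootstrap the mean background signal from the control reads.
boot_means = np.array([
    rng.choice(control_bins, size=control_bins.size, replace=True).mean()
    for _ in range(1000)
])
background = boot_means.mean()

# Signal-to-noise fold change per bin; call peaks above a threshold.
fold = (treatment_bins + 1) / (background + 1)
peaks = np.flatnonzero(fold > np.quantile(fold, 0.99))
```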
NASA Astrophysics Data System (ADS)
Montzka, Carsten; Hendricks Franssen, Harrie-Jan; Moradkhani, Hamid; Pütz, Thomas; Han, Xujun; Vereecken, Harry
2013-04-01
An adequate description of soil hydraulic properties is essential for a good performance of hydrological forecasts. So far, several studies have shown that data assimilation can reduce parameter uncertainty by considering soil moisture observations. However, these observations and also the model forcings were recorded with a specific measurement error. It seems a logical step to base state updating and parameter estimation on observations made at multiple time steps, in order to reduce the influence of outliers at single time steps given measurement errors and unknown model forcings. Such outliers could result in erroneous state estimation as well as inadequate parameters. This has been one of the reasons to use a smoothing technique, as implemented for Bayesian data assimilation methods such as the Ensemble Kalman Filter (i.e. the Ensemble Kalman Smoother). Recently, an ensemble-based smoother has been developed for state updating with a SIR particle filter. However, this method has not been used for dual state-parameter estimation. In this contribution we present a Particle Smoother with sequential smoothing of particle weights for state and parameter resampling within a time window, as opposed to the single time step data assimilation used in filtering techniques. This can be seen as an intermediate variant between a parameter estimation technique using global optimization with estimation of single parameter sets valid for the whole period, and sequential Monte Carlo techniques with estimation of parameter sets evolving from one time step to another. The aims are i) to improve the forecast of evaporation and groundwater recharge by estimating hydraulic parameters, and ii) to reduce the impact of single erroneous model inputs/observations by a smoothing method. In order to validate the performance of the proposed method in a real world application, the experiment is conducted in a lysimeter environment.
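For background, the SIR building block that the proposed smoother extends can be sketched as a weight update followed by resampling; the Gaussian likelihood and all numbers below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def sir_step(particles, weights, obs, obs_std):
    """One update-resample cycle of a SIR particle filter."""
    lik = np.exp(-0.5 * ((obs - particles) / obs_std) ** 2)  # observation likelihood
    weights = weights * lik
    weights /= weights.sum()
    # Multinomial resampling: draw particles proportional to weight.
    idx = rng.choice(particles.size, size=particles.size, p=weights)
    return particles[idx], np.full(particles.size, 1.0 / particles.size)

particles = rng.normal(0.30, 0.05, size=500)   # e.g. soil moisture states
weights = np.full(500, 1.0 / 500)
particles, weights = sir_step(particles, weights, obs=0.35, obs_std=0.02)
```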
Ali, Sajid; Soubeyrand, Samuel; Gladieux, Pierre; Giraud, Tatiana; Leconte, Marc; Gautier, Angélique; Mboup, Mamadou; Chen, Wanquan; de Vallavieille-Pope, Claude; Enjalbert, Jérôme
2016-07-01
Inferring reproductive and demographic parameters of populations is crucial to our understanding of species ecology and evolutionary potential but can be challenging, especially in partially clonal organisms. Here, we describe a new and accurate method, cloncase, for estimating both the rate of sexual vs. asexual reproduction and the effective population size, based on the frequency of clonemate resampling across generations. Simulations showed that our method provides reliable estimates of sex frequency and effective population size for a wide range of parameters. The cloncase method was applied to Puccinia striiformis f.sp. tritici, a fungal pathogen causing stripe/yellow rust, an important wheat disease. This fungus is highly clonal in Europe but has been suggested to recombine in Asia. Using two temporally spaced samples of P. striiformis f.sp. tritici in China, the estimated sex frequency was 75% (i.e. three-quarters of individuals being sexually derived during the yearly sexual cycle), indicating a strong contribution of sexual reproduction to the life cycle of the pathogen in this area. The inferred effective population size of this partially clonal organism (Nc = 998) was in good agreement with estimates obtained using methods based on temporal variations in allelic frequencies. The cloncase estimator presented herein is the first method allowing accurate inference of both sex frequency and effective population size from population data without knowledge of recombination or mutation rates. cloncase can be applied to population genetic data from any organism with cyclical parthenogenesis and should in particular be very useful for improving our understanding of pest and microbial population biology. © 2016 John Wiley & Sons Ltd.
A novel iterative mixed model to remap three complex orthopedic traits in dogs
Huang, Meng; Hayward, Jessica J.; Corey, Elizabeth; Garrison, Susan J.; Wagner, Gabriela R.; Krotscheck, Ursula; Hayashi, Kei; Schweitzer, Peter A.; Lust, George; Boyko, Adam R.; Todhunter, Rory J.
2017-01-01
Hip dysplasia (HD), elbow dysplasia (ED), and rupture of the cranial (anterior) cruciate ligament (RCCL) are the most common complex orthopedic traits of dogs and all result in debilitating osteoarthritis. We reanalyzed previously reported data: the Norberg angle (a quantitative measure of HD) in 921 dogs, ED in 113 cases and 633 controls, and RCCL in 271 cases and 399 controls, and their genotypes at ~185,000 single nucleotide polymorphisms. A novel fixed and random model circulating probability unification (FarmCPU) function, with marker-based principal components and a kinship matrix to correct for population stratification, was used. A Bonferroni correction at p < 0.01 resulted in a genome-wide significance threshold of P < 6.96 × 10^−8. Six loci were identified: three for HD and three for RCCL. An associated locus at CFA28:34,369,342 for HD was described previously in the same dogs using a conventional mixed model. No loci had been identified for RCCL in the previous report, and the two ED loci from the previous report did not reach genome-wide significance using the FarmCPU model. These results were supported by simulation, which demonstrated that the FarmCPU held no power advantage over the linear mixed model for the ED sample but provided additional power for the HD and RCCL samples. Candidate genes for HD and RCCL are discussed. When using the FarmCPU software, we recommend a resampling test; the use of a positive control to determine the optimum pseudo quantitative trait nucleotide-based covariate structure of the model; and a negative control consisting of permutation testing and the identical resampling test as applied to the non-permuted phenotypes. PMID:28614352
Automatic aortic root segmentation in CTA whole-body dataset
NASA Astrophysics Data System (ADS)
Gao, Xinpei; Kitslaar, Pieter H.; Scholte, Arthur J. H. A.; Lelieveldt, Boudewijn P. F.; Dijkstra, Jouke; Reiber, Johan H. C.
2016-03-01
Transcatheter aortic valve replacement (TAVR) is an evolving technique for patients with severe aortic stenosis. Typically, in this application a CTA dataset of the patient's arterial system, from the subclavian artery to the femoral arteries, is obtained to evaluate the quality of the vascular access route and to analyze the aortic root to determine whether and which prosthesis should be used. In this paper, we concentrate on the automated segmentation of the aortic root. The purpose of this study was to automatically segment the aortic root in computed tomography angiography (CTA) datasets to support TAVR procedures. The method in this study includes 4 major steps. First, the patient's cardiac CTA image was resampled to reduce the computation time. Next, the cardiac CTA image was segmented using an atlas-based approach. The most similar atlas was selected from a total of 8 atlases based on its image similarity to the input CTA image. Third, the aortic root segmentation from the previous step was transferred to the patient's whole-body CTA image by affine registration and refined in the fourth step using a deformable subdivision surface model fitting procedure based on image intensity. The pipeline was applied to 20 patients. The ground truth was created by an analyst who semi-automatically corrected the contours of the automatic method, where necessary. The average Dice similarity index between the segmentations of the automatic method and the ground truth was found to be 0.965±0.024. In conclusion, the current results are very promising.
Estes, James A.; Tinker, M. Tim; Bodkin, James L.
2010-01-01
Recovery criteria for depleted species or populations normally are based on demographic measures, the goal being to maintain enough individuals over a sufficiently large area to assure a socially tolerable risk of future extinction. Such demographically based recovery criteria may be insufficient to restore the functional roles of strongly interacting species. We explored the idea of developing a recovery criterion for sea otters (Enhydra lutris) in the Aleutian archipelago on the basis of their keystone role in kelp forest ecosystems. We surveyed sea otters and rocky reef habitats at 34 island-time combinations. The system nearly always existed in either a kelp-dominated or deforested phase state, which was predictable from sea otter density. We used a resampling analysis of these data to show that the phase state at any particular island can be determined at 95% probability of correct classification with information from as few as six sites. When sea otter population status (and thus the phase state of the kelp forest) was allowed to vary randomly among islands, just 15 islands had to be sampled to estimate the true proportion that were kelp dominated (within 10%) with 90% confidence. We conclude that kelp forest phase state is a more appropriate, sensitive, and cost-effective measure of sea otter recovery than the more traditional demographically based metrics, and we suggest that similar approaches have broad potential utility in establishing recovery criteria for depleted populations of other functionally important species.
Mazurowski, Maciej A.; Zurada, Jacek M.; Tourassi, Georgia D.
2009-01-01
Ensemble classifiers have been shown to be effective in multiple applications. In this article, the authors explore the effectiveness of ensemble classifiers in a case-based computer-aided diagnosis system for detection of masses in mammograms. They evaluate two general ways of constructing subclassifiers by resampling of the available development dataset: random division and random selection. Furthermore, they discuss the problem of selecting the ensemble size and propose two adaptive incremental techniques that automatically select the size for the problem at hand. All the techniques are evaluated with respect to a previously proposed information-theoretic CAD system (IT-CAD). The experimental results show that the examined ensemble techniques provide a statistically significant improvement (AUC=0.905±0.024) in performance as compared to the original IT-CAD system (AUC=0.865±0.029). Some of the techniques allow for a notable reduction in the total number of examples stored in the case base (to 1.3% of the original size), which, in turn, results in lower storage requirements and a shorter response time of the system. Among the methods examined in this article, the two proposed adaptive techniques are by far the most effective for this purpose. Furthermore, the authors provide some discussion and guidance for choosing the ensemble parameters. PMID:19673196
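The two resampling schemes compared can be sketched at index level as follows; any base classifier could then be trained on each subset, and all sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
n_cases, n_subclassifiers = 1200, 10

# Random division: partition the development set into disjoint subsets.
division = np.array_split(rng.permutation(n_cases), n_subclassifiers)

# Random selection: draw each training subset independently, with replacement.
selection = [rng.choice(n_cases, size=n_cases // n_subclassifiers, replace=True)
             for _ in range(n_subclassifiers)]
```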
NASA Astrophysics Data System (ADS)
Beck, Christoph; Philipp, Andreas; Jacobeit, Jucundus
2015-08-01
This contribution investigates the relationship between the large-scale atmospheric circulation and interannual variations of the standardized precipitation index (SPI) in Central Europe. To this end, circulation types (CT) have been derived from a variety of circulation type classifications (CTC) applied to daily sea level pressure (SLP) data and mean circulation indices of vorticity (V), zonality (Z) and meridionality (M) have been calculated. Occurrence frequencies of CTs and circulation indices have been utilized as predictors within multiple regression models (MRM) for the estimation of gridded 3-month SPI values over Central Europe, for the period 1950 to 2010. CTC-based MRMs used in the analyses comprise variants concerning the basic method for CT classification, the number of CTs, the size and location of the spatial domain used for CTCs and the exclusive use of CT frequencies or the combined use of CT frequencies and mean circulation indices as predictors. Adequate MRM predictor combinations have been identified by applying stepwise multiple regression analyses within a resampling framework. The performance (robustness) of the resulting MRMs has been quantified based on a leave-one-out cross-validation procedure applying several skill scores. Furthermore, the relative importance of individual predictors has been estimated for each MRM. From these analyses, it can be stated that model skill is improved by (i) the consideration of vorticity characteristics within CTCs, (ii) a relatively small size of the spatial domain to which CTCs are applied and (iii) the inclusion of mean circulation indices. However, model skill exhibits distinct variations between seasons and regions. Whereas promising skill can be stated for the western and northwestern parts of the Central European domain, only unsatisfactory skill is reached in the more continental regions and particularly during summer. Thus, it can be concluded that the presented approaches feature the potential for the downscaling of Central European drought index variations from the large-scale circulation, at least for some regions. Further improvements of CTC-based approaches may be expected from the optimization of CTCs for explaining the SPI, e.g. via the inclusion of additional variables in the classification procedure.
Speckle reduction in digital holography with resampling ring masks
NASA Astrophysics Data System (ADS)
Zhang, Wenhui; Cao, Liangcai; Jin, Guofan
2018-01-01
One-shot digital holographic imaging has the advantages of high stability and low temporal cost; however, the reconstruction is affected by speckle noise. A resampling ring-mask method in the spectrum domain is proposed for speckle reduction. The useful spectrum of one hologram is divided into several sub-spectra by ring masks. In the reconstruction, the angular spectrum transform, which involves no approximation, is applied to guarantee calculation accuracy. N reconstructed amplitude images are calculated from the corresponding sub-spectra. Thanks to the random distribution of speckle, superimposing these N uncorrelated amplitude images leads to a final reconstructed image with lower speckle noise. Normalized relative standard deviation values of the reconstructed image are used to evaluate the speckle reduction. The effect of the method on the spatial resolution of the reconstructed image is also quantitatively evaluated. Experimental and simulation results prove the feasibility and effectiveness of the proposed method.
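A simplified sketch of the ring-mask idea, using a plain 2-D FFT instead of the full angular spectrum propagation of the paper; the number of rings and the input array are assumptions:

```python
import numpy as np

def ring_mask_average(hologram, n_rings=4):
    """Split the spectrum into concentric rings and average the
    reconstructed amplitudes to suppress uncorrelated speckle."""
    spec = np.fft.fftshift(np.fft.fft2(hologram))
    h, w = spec.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2)             # radius of each frequency bin
    edges = np.linspace(0, r.max() + 1e-9, n_rings + 1)
    amps = [np.abs(np.fft.ifft2(np.fft.ifftshift(
                np.where((r >= lo) & (r < hi), spec, 0))))
            for lo, hi in zip(edges[:-1], edges[1:])]
    return np.mean(amps, axis=0)                     # superimposed amplitude image
```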
Statistical wiring of thalamic receptive fields optimizes spatial sampling of the retinal image
Wang, Xin; Sommer, Friedrich T.; Hirsch, Judith A.
2014-01-01
It is widely assumed that mosaics of retinal ganglion cells establish the optimal representation of visual space. However, relay cells in the visual thalamus often receive convergent input from several retinal afferents and, in cat, outnumber ganglion cells. To explore how the thalamus transforms the retinal image, we built a model of the retinothalamic circuit using experimental data and simple wiring rules. The model shows how the thalamus might form a resampled map of visual space with the potential to facilitate detection of stimulus position in the presence of sensor noise. Bayesian decoding conducted with the model provides support for this scenario. Despite its benefits, however, resampling introduces image blur, thus impairing edge perception. Whole-cell recordings obtained in vivo suggest that this problem is mitigated by arrangements of excitation and inhibition within the receptive field that effectively boost contrast borders, much like strategies used in digital image processing. PMID:24559681
An Acoustic OFDM System with Symbol-by-Symbol Doppler Compensation for Underwater Communication
MinhHai, Tran; Rie, Saotome; Suzuki, Taisaku; Wada, Tomohisa
2016-01-01
We propose an acoustic OFDM system for underwater communication, specifically for vertical link communications such as between a robot on the sea bottom and a mother ship at the surface. The main contributions are (1) estimation of time-varying Doppler shift using continual pilots in conjunction with monitoring the drift of the power delay profile and (2) symbol-by-symbol Doppler compensation in the frequency domain by an ICI matrix representing nonuniform Doppler. In addition, we compare our proposal against a resampling method. Simulation and experimental results confirm that our system outperforms the resampling method when the velocity changes roughly over OFDM symbols. Overall, experimental results taken in Shizuoka, Japan, show that our system, using 16QAM and 64QAM, achieved a data throughput of 7.5 kbit/s with a transmitter moving at a maximum of 2 m/s, in a complicated trajectory, over 30 m vertically. PMID:27057558
Forecasting drought risks for a water supply storage system using bootstrap position analysis
Tasker, Gary; Dunne, Paul
1997-01-01
Forecasting the likelihood of drought conditions is an integral part of managing a water supply storage and delivery system. Position analysis uses a large number of possible flow sequences as inputs to a simulation of a water supply storage and delivery system. For a given set of operating rules and water use requirements, water managers can use such a model to forecast, a few months ahead, the likelihood of specified outcomes such as reservoir levels falling below a specified level or streamflows falling below statutory passing flows, conditioned on the current reservoir levels and streamflows. The large number of possible flow sequences is generated using a stochastic streamflow model with random resampling of innovations. The advantages of this resampling scheme, called bootstrap position analysis, are that it does not rely on the unverifiable assumption of normality and that it allows incorporation of long-range weather forecasts into the analysis.
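The innovation-resampling idea can be sketched with an AR(1) stand-in for the stochastic streamflow model; the fitted coefficient and residuals are assumed to be given rather than taken from the study:

```python
import numpy as np

rng = np.random.default_rng(3)

def bootstrap_traces(last_flow, phi, residuals, n_traces=1000, horizon=12):
    """Generate monthly flow traces by resampling fitted model residuals
    (innovations) instead of assuming they are normally distributed."""
    traces = np.empty((n_traces, horizon))
    for i in range(n_traces):
        q = last_flow
        for t in range(horizon):
            q = phi * q + rng.choice(residuals)   # bootstrapped innovation
            traces[i, t] = q
    return traces

traces = bootstrap_traces(last_flow=120.0, phi=0.7,
                          residuals=rng.normal(0, 15, 600))
```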
Simulation and statistics: Like rhythm and song
NASA Astrophysics Data System (ADS)
Othman, Abdul Rahman
2013-04-01
Simulation has been introduced to solve problems expressed in the form of systems. By using this technique, two kinds of problem can be overcome. First, a problem that has an analytical solution, but for which the cost of running a physical experiment is high in terms of money and lives. Second, a problem that exists but has no analytical solution. In the field of statistical inference the second kind is often encountered. With the advent of high-speed computing devices, a statistician can now use resampling techniques, such as the bootstrap and permutation tests, to form pseudo sampling distributions that lead to solutions of problems that cannot be solved analytically. This paper discusses how Monte Carlo simulation was, and still is, being used to verify analytical solutions in inference. It also discusses resampling techniques as simulation techniques. Misunderstandings about these two techniques are examined, and successful uses of both are explained.
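A compact example of the kind of pseudo sampling distribution discussed here: a permutation test for a difference in means, with all numbers invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
a, b = rng.normal(0.0, 1, 30), rng.normal(0.5, 1, 30)
observed = a.mean() - b.mean()

pooled = np.concatenate([a, b])
perm_stats = np.empty(10_000)
for i in range(perm_stats.size):
    rng.shuffle(pooled)                        # relabel the groups at random
    perm_stats[i] = pooled[:30].mean() - pooled[30:].mean()

# Two-sided p-value from the pseudo sampling distribution.
p_value = np.mean(np.abs(perm_stats) >= np.abs(observed))
```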
Bootstrap position analysis for forecasting low flow frequency
Tasker, Gary D.; Dunne, P.
1997-01-01
A method of random resampling of residuals from stochastic models is used to generate a large number of 12-month-long traces of natural monthly runoff to be used in a position analysis model for a water-supply storage and delivery system. Position analysis uses the traces to forecast the likelihood of specified outcomes such as reservoir levels falling below a specified level or streamflows falling below statutory passing flows conditioned on the current reservoir levels and streamflows. The advantages of this resampling scheme, called bootstrap position analysis, are that it does not rely on the unverifiable assumption of normality, fewer parameters need to be estimated directly from the data, and accounting for parameter uncertainty is easily done. For a given set of operating rules and water-use requirements for a system, water managers can use such a model as a decision-making tool to evaluate different operating rules. © ASCE.
Efficient geometric rectification techniques for spectral analysis algorithm
NASA Technical Reports Server (NTRS)
Chang, C. Y.; Pang, S. S.; Curlander, J. C.
1992-01-01
The spectral analysis algorithm is a viable technique for processing synthetic aperture radar (SAR) data at near-real-time throughput rates by trading off image resolution. One major challenge of the spectral analysis algorithm is that the output image, often referred to as the range-Doppler image, is represented on iso-range and iso-Doppler lines, a curved grid format. This phenomenon is known as the fan-shape effect. Therefore, resampling is required to convert the range-Doppler image into a rectangular grid format before the individual images can be overlaid to form seamless multi-look strip imagery. An efficient algorithm for geometric rectification of the range-Doppler image is presented. The proposed algorithm, realized in two one-dimensional resampling steps, takes into consideration the fan-shape phenomenon of the range-Doppler image as well as the high squint angle and updates of the cross-track and along-track Doppler parameters. No ground reference points are required.
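Not the JPL implementation, but the separable two-pass structure can be sketched as follows, assuming the output-grid position of each input sample is known and increases monotonically along each interpolation axis:

```python
import numpy as np

def rectify(image, col_pos, row_pos):
    """Two-pass 1-D resampling onto a uniform rectangular grid.

    col_pos[i, :] gives the output-grid positions of row i's samples
    (range direction); row_pos[:, j] does the same for column j
    (azimuth direction). Both must be increasing along their axis.
    """
    n_rows, n_cols = image.shape
    # Pass 1: resample each row onto integer output columns.
    pass1 = np.array([np.interp(np.arange(n_cols), col_pos[i], image[i])
                      for i in range(n_rows)])
    # Pass 2: resample each column onto integer output rows.
    return np.array([np.interp(np.arange(n_rows), row_pos[:, j], pass1[:, j])
                     for j in range(n_cols)]).T
```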
NASA Astrophysics Data System (ADS)
Rossi, M.; Luciani, S.; Valigi, D.; Kirschbaum, D.; Brunetti, M. T.; Peruccacci, S.; Guzzetti, F.
2017-05-01
Models for forecasting rainfall-induced landslides are mostly based on the identification of empirical rainfall thresholds obtained exploiting rain gauge data. Despite their increased availability, satellite rainfall estimates are scarcely used for this purpose. Satellite data should be useful in ungauged and remote areas, or should provide a significant spatial and temporal reference in gauged areas. In this paper, the analysis of the reliability of rainfall thresholds based on rainfall remote sensed and rain gauge data for the prediction of landslide occurrence is carried out. To date, the estimation of the uncertainty associated with the empirical rainfall thresholds is mostly based on a bootstrap resampling of the rainfall duration and the cumulated event rainfall pairs (D, E) characterizing rainfall events responsible for past failures. This estimation does not consider the measurement uncertainty associated with D and E. In the paper, we propose (i) a new automated procedure to reconstruct ED conditions responsible for the landslide triggering and their uncertainties, and (ii) three new methods to identify rainfall thresholds for possible landslide occurrence, exploiting rain gauge and satellite data. In particular, the proposed methods are based on Least Square (LS), Quantile Regression (QR) and Nonlinear Least Square (NLS) statistical approaches. We applied the new procedure and methods to define empirical rainfall thresholds and their associated uncertainties in the Umbria region (central Italy) using both rain-gauge measurements and satellite estimates. We finally validated the thresholds and tested the effectiveness of the different threshold definition methods with independent landslide information. Among these methods, NLS performed best in calculating thresholds over the full range of rainfall durations. We found that the thresholds obtained from satellite data are lower than those obtained from rain gauge measurements. This is in agreement with the literature, where satellite rainfall data underestimate the "ground" rainfall registered by rain gauges.
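As a simplified sketch of the LS variant (the QR and NLS fits are not reproduced here), a power-law threshold E = αD^β can be fitted in log space, with a bootstrap attaching an uncertainty to the intercept; all data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)
D = rng.uniform(1, 200, 150)                     # rainfall duration (h)
E = 5 * D ** 0.4 * rng.lognormal(0, 0.3, 150)    # cumulated event rainfall (mm)

logD, logE = np.log10(D), np.log10(E)
beta, log_alpha = np.polyfit(logD, logE, 1)      # slope and intercept in log space

# Bootstrap the (D, E) pairs to quantify the intercept uncertainty.
boot = []
for _ in range(1000):
    i = rng.integers(0, D.size, D.size)
    boot.append(np.polyfit(logD[i], logE[i], 1)[1])
alpha_sd = np.std(boot)
```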
A complex noise reduction method for improving visualization of SD-OCT skin biomedical images
NASA Astrophysics Data System (ADS)
Myakinin, Oleg O.; Zakharov, Valery P.; Bratchenko, Ivan A.; Kornilin, Dmitry V.; Khramov, Alexander G.
2014-05-01
In this paper we present an original noise reduction method for improving the visualization quality of SD-OCT biomedical images of skin and tumors. The principal advantages of OCT are high resolution and the possibility of in vivo analysis. We propose a two-stage algorithm: 1) processing of the raw one-dimensional A-scans of SD-OCT, and 2) removal of noise from the resulting B(C)-scans. The general mathematical methods of SD-OCT are unstable: if the noise of the CCD is 1.6% of the dynamic range, the resulting distortions already reach 25-40% of the dynamic range. At the first stage we use resampling of the A-scans and simple linear filters to reduce the amount of data and remove the noise of the CCD camera. The efficiency and improved productivity of this approach, and its preservation of the axial resolution, are shown. At the second stage we use an effective algorithm based on the Hilbert-Huang Transform for more accurate removal of noise peaks. The effectiveness of the proposed approach for visualization of malignant and benign skin tumors (melanoma, BCC etc.) and a significant improvement of the SNR level relative to different methods of noise reduction are shown. In this study we also consider a modification of this method depending on the specific hardware and software features of the OCT setup used. The basic version does not require any hardware modifications of existing equipment. The effectiveness of the proposed method for 3D visualization of tissues can simplify medical diagnosis in oncology.
Simultaneous F0-F1 modifications of Arabic for the improvement of natural-sounding
NASA Astrophysics Data System (ADS)
Ykhlef, F.; Bensebti, M.
2013-03-01
Pitch (F0) modification is one of the most important problems in the area of speech synthesis. Several techniques have been developed in the literature to achieve this goal. The main restrictions of these techniques lie in the modification range and in the quality, intelligibility and naturalness of the synthesised speech. The control of formants in a spoken language can significantly improve the naturalness of the synthesised speech. This improvement is mainly dependent on the control of the first formant (F1). Inspired by this observation, this article proposes a new approach that modifies both F0 and F1 of Arabic voiced sounds in order to improve the naturalness of the pitch-shifted speech. The developed strategy takes a parallel processing approach, in which the analysis segments are decomposed into sub-bands in the wavelet domain, modified in the desired sub-band by using a resampling technique and reconstructed without affecting the remaining sub-bands. Pitch marking and voicing detection are performed in the frequency decomposition step based on the comparison of the multi-level approximation and detail signals. The performance of the proposed technique is evaluated by listening tests and compared to the pitch synchronous overlap and add (PSOLA) technique at the third approximation level. Experimental results have shown that the manipulation in the wavelet domain of F0 in conjunction with F1 guarantees natural-sounding synthesised speech compared to the classical pitch modification technique. This improvement was particularly relevant for large pitch modifications.
Kang, Le; Chen, Weijie; Petrick, Nicholas A; Gallas, Brandon D
2015-02-20
The area under the receiver operating characteristic curve is often used as a summary index of diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of the area under the receiver operating characteristic curve, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, which could be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics-based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimations and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study. Copyright © 2014 John Wiley & Sons, Ltd.
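A bare-bones version of Harrell's C for right-censored data is easy to state: among pairs where the subject with the shorter time is observed to fail, count how often the higher risk score goes with the earlier failure (score ties are ignored in this sketch, and the data are invented):

```python
import numpy as np

def harrell_c(time, event, score):
    """Concordance of a risk score with right-censored survival times.
    event is 1 if the failure was observed, 0 if censored; a higher
    score is taken to predict an earlier failure."""
    conc = comp = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if event[i] == 1 and time[i] < time[j]:   # usable (comparable) pair
                comp += 1
                conc += int(score[i] > score[j])
    return conc / comp

t = np.array([5, 8, 3, 9, 6]); e = np.array([1, 0, 1, 1, 0])
risk = np.array([0.9, 0.2, 0.8, 0.1, 0.4])
print(harrell_c(t, e, risk))   # 6 of 7 comparable pairs concordant
```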
Falcaro, Milena; Carpenter, James R
2017-06-01
Population-based net survival by tumour stage at diagnosis is a key measure in cancer surveillance. Unfortunately, data on tumour stage are often missing for a non-negligible proportion of patients and the mechanism giving rise to the missingness is usually anything but completely at random. In this setting, restricting analysis to the subset of complete records typically gives biased results. Multiple imputation is a promising practical approach to the issues raised by the missing data, but its use in conjunction with the Pohar-Perme method for estimating net survival has not been formally evaluated. We performed a resampling study using colorectal cancer population-based registry data to evaluate the ability of multiple imputation, used along with the Pohar-Perme method, to deliver unbiased estimates of stage-specific net survival and recover missing stage information. We created 1000 independent data sets, each containing 5000 patients. Stage data were then made missing at random under two scenarios (30% and 50% missingness). Complete records analysis showed substantial bias and poor confidence interval coverage. Across both scenarios our multiple imputation strategy virtually eliminated the bias and greatly improved confidence interval coverage. In the presence of missing stage data complete records analysis often gives severely biased results. We showed that combining multiple imputation with the Pohar-Perme estimator provides a valid practical approach for the estimation of stage-specific colorectal cancer net survival. As usual, when the percentage of missing data is high the results should be interpreted cautiously and sensitivity analyses are recommended. Copyright © 2017 Elsevier Ltd. All rights reserved.
Geospatial Data Integration for Assessing Landslide Hazard on Engineered Slopes
NASA Astrophysics Data System (ADS)
Miller, P. E.; Mills, J. P.; Barr, S. L.; Birkinshaw, S. J.
2012-07-01
Road and rail networks are essential components of national infrastructures, underpinning the economy, and facilitating the mobility of goods and the human workforce. Earthwork slopes such as cuttings and embankments are primary components, and their reliability is of fundamental importance. However, instability and failure can occur, through processes such as landslides. Monitoring the condition of earthworks is a costly and continuous process for network operators, and currently, geospatial data is largely underutilised. The research presented here addresses this by combining airborne laser scanning and multispectral aerial imagery to develop a methodology for assessing landslide hazard. This is based on the extraction of key slope stability variables from the remotely sensed data. The methodology is implemented through numerical modelling, which is parameterised with the slope stability information, simulated climate conditions, and geotechnical properties. This allows determination of slope stability (expressed through the factor of safety) for a range of simulated scenarios. Regression analysis is then performed in order to develop a functional model relating slope stability to the input variables. The remotely sensed raster datasets are robustly re-sampled to two-dimensional cross-sections to facilitate meaningful interpretation of slope behaviour and mapping of landslide hazard. Results are stored in a geodatabase for spatial analysis within a GIS environment. For a test site located in England, UK, results have shown the utility of the approach in deriving practical hazard assessment information. Outcomes were compared to the network operator's hazard grading data, and show general agreement. The utility of the slope information was also assessed with respect to auto-population of slope geometry, and found to deliver significant improvements over the network operator's existing field-based approaches.
Backward Registration Based Aspect Ratio Similarity (ARS) for Image Retargeting Quality Assessment.
Zhang, Yabin; Fang, Yuming; Lin, Weisi; Zhang, Xinfeng; Li, Leida
2016-06-28
During the past few years, various kinds of content-aware image retargeting operators have been proposed for image resizing. However, the lack of effective objective retargeting quality assessment metrics limits the further development of image retargeting techniques. Different from traditional Image Quality Assessment (IQA) metrics, the quality degradation during image retargeting is caused by artificial retargeting modifications, and the difficulty for Image Retargeting Quality Assessment (IRQA) lies in the alteration of the image resolution and content, which makes it impossible to evaluate the quality degradation directly as in traditional IQA. In this paper, we interpret image retargeting in a unified framework of resampling grid generation and forward resampling. We show that geometric change estimation is an efficient way to clarify the relationship between the images. We formulate the geometric change estimation as a backward registration problem with a Markov Random Field (MRF) and provide an effective solution. The geometric change aims to provide evidence about how the original image is resized into the target image. Under the guidance of the geometric change, we develop a novel Aspect Ratio Similarity (ARS) metric to evaluate the visual quality of retargeted images by exploiting the local block changes with a visual importance pooling strategy. Experimental results on the publicly available MIT RetargetMe and CUHK datasets demonstrate that the proposed ARS can predict the visual quality of retargeted images more accurately than state-of-the-art IRQA metrics.
Bi-phasic trends in mercury concentrations in blood of Wisconsin common loons during 1992–2010
Meyer, Michael W.; Rasmussen, Paul W.; Watras, Carl J.; Fevold, Brick M.; Kenow, Kevin P.
2011-01-01
Wisconsin Department of Natural Resources (WDNR) assessed the ecological risk of mercury (Hg) in aquatic systems by monitoring common loon (Gavia immer) population dynamics and blood Hg concentrations. We report temporal trends in blood Hg concentrations based on 334 samples collected from adults recaptured in subsequent years (resampled 2-9 times) and on 421 blood samples from chicks collected at lakes resampled 2-8 times during 1992-2010. Temporal trends were identified with generalized additive mixed effects models (GAMMs) and mixed effects models to account for the potential lack of independence among observations from the same loon or the same lake. Trend analyses indicated that Hg concentrations in the blood of Wisconsin loons declined over the period 1992-2000 and increased during 2002-2010, but not to the level observed in the early 1990s. The best fitting linear mixed effects model included separate trends for the two time periods. The estimated trend in Hg concentration among the adult loon population during 1992-2000 was -2.6% per year and the estimated trend during 2002-2010 was +1.8% per year; chick blood Hg concentrations decreased by 6.5% per year during 1992-2000, but increased 1.8% per year during 2002-2010. This bi-phasic pattern is similar to trends observed for concentrations of methylmercury (MeHg) and SO4 in the lake water of a well-studied seepage lake (Little Rock Lake, Vilas County) within our study area. A cause-effect relationship between these independent trends is hypothesized.
NASA Astrophysics Data System (ADS)
Chahrazed, Yahiaoui; Jean-Louis, Lanet; Mohamed, Mezghiche; Karim, Tamine
2018-01-01
Fault attacks represent one of the most serious threats against Java Card security. They consist of physically perturbing chip components to introduce faults into the code execution. A fault may be induced using a laser beam that impacts opcodes and operands of instructions. This could lead to a mutation of the application code in such a way that it becomes hostile. Any successful attack may reveal secret information stored in the card or grant an undesired authorisation. We propose a methodology to recognise, during the development step, the patterns in Java Card applications that are sensitive to fault attacks. It is based on concepts from text categorisation and machine learning. In this method, we represent the patterns using opcode n-grams as features, and we evaluate different machine learning classifiers. The results show that the classifiers performed poorly when classifying dangerous sensitive patterns, due to the imbalance of our dataset: the number of dangerous sensitive patterns is much lower than the number of non-dangerous patterns. We used resampling techniques to balance the class distribution in our dataset. The experimental results indicated that the resampling techniques improved the accuracy of the classifiers. In addition, our proposed method reduces the execution time of sensitive pattern classification in comparison to the SmartCM tool, which is used in our study to evaluate the effect of faults on Java Card applications.
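The abstract does not name the specific resampling techniques used; a generic random-oversampling sketch, which balances a binary label vector by duplicating minority examples (rows of X standing in for opcode n-gram feature vectors), looks like this:

```python
import numpy as np

rng = np.random.default_rng(6)

def oversample(X, y):
    """Duplicate minority-class rows until both classes have equal size."""
    minority = int(y.sum() < len(y) / 2)        # label of the rarer class
    idx_min = np.flatnonzero(y == minority)
    idx_maj = np.flatnonzero(y != minority)
    extra = rng.choice(idx_min, size=len(idx_maj) - len(idx_min), replace=True)
    keep = np.concatenate([idx_maj, idx_min, extra])
    return X[keep], y[keep]

X = rng.random((1000, 64))                      # stand-in n-gram features
y = (rng.random(1000) < 0.1).astype(int)        # imbalanced labels
X_bal, y_bal = oversample(X, y)
```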
Estimating parasitic sea lamprey abundance in Lake Huron from heterogeneous data sources
Young, Robert J.; Jones, Michael L.; Bence, James R.; McDonald, Rodney B.; Mullett, Katherine M.; Bergstedt, Roger A.
2003-01-01
The Great Lakes Fishery Commission uses time series of transformer, parasitic, and spawning population estimates to evaluate the effectiveness of its sea lamprey (Petromyzon marinus) control program. This study used an inverse variance weighting method to integrate Lake Huron sea lamprey population estimates derived from two estimation procedures: 1) prediction of the lake-wide spawning population from a regression model based on stream size, and 2) whole-lake mark-recapture estimates. In addition, we used a re-sampling procedure to evaluate the effect of trading off sampling effort between the regression and mark-recapture models. Population estimates derived from the regression model ranged from 132,000 to 377,000, while mark-recapture estimates of recently metamorphosed juveniles and of parasitic sea lampreys ranged from 536,000 to 634,000 and from 484,000 to 1,608,000, respectively. The precision of the estimates varied greatly among estimation procedures and years. The integrated estimate from the mark-recapture and spawner regression procedures ranged from 252,000 to 702,000 transformers. The re-sampling procedure indicated that the regression model is more sensitive to reductions in sampling effort than the mark-recapture model. Reliance on either the regression or the mark-recapture model alone could produce misleading estimates of sea lamprey abundance and of the effect of the control program on it. These analyses indicate that the precision of the lake-wide population estimate can be maximized by re-allocating sampling effort from marking sea lampreys to trapping additional streams.
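The inverse variance weighting step itself is compact; with illustrative numbers (not taken from the study), the pooled estimate and its variance are:

```python
import numpy as np

estimates = np.array([300_000.0, 550_000.0])   # regression, mark-recapture
variances = np.array([9.0e9, 2.5e10])          # illustrative variances

w = 1.0 / variances
combined = np.sum(w * estimates) / np.sum(w)   # precision-weighted mean
combined_var = 1.0 / np.sum(w)                 # variance of the pooled estimate
```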
The Need of Nested Grids for Aerial and Satellite Images and Digital Elevation Models
NASA Astrophysics Data System (ADS)
Villa, G.; Mas, S.; Fernández-Villarino, X.; Martínez-Luceño, J.; Ojeda, J. C.; Pérez-Martín, B.; Tejeiro, J. A.; García-González, C.; López-Romero, E.; Soteres, C.
2016-06-01
Usual workflows for production, archiving, dissemination and use of Earth observation images (both aerial and from remote sensing satellites) pose big interoperability problems, for example: non-alignment of pixels at the different levels of the pyramids, which makes it impossible to overlay, compare and mosaic different orthoimages without resampling them, and the need to apply multiple resamplings and compression-decompression cycles. These problems cause great inefficiencies in production, dissemination through web services and processing in "Big Data" environments. Most of them can be avoided, or at least greatly reduced, with the use of a common "nested grid" for multiresolution production, archiving, dissemination and exploitation of orthoimagery, digital elevation models and other raster data. "Nested grids" are space allocation schemas that organize image footprints, pixel sizes and pixel positions at all pyramid levels, in order to achieve coherent and consistent multiresolution coverage of a whole working area. A "nested grid" must be complemented by an appropriate "tiling schema", ideally based on the "quad-tree" concept. In recent years a "de facto standard" grid and tiling schema has emerged and has been adopted by virtually all major geospatial data providers. It has also been adopted by OGC in its "WMTS Simple Profile" standard. In this paper we explain how the adequate use of this tiling schema as a common nested grid for orthoimagery, DEMs and other types of raster data constitutes the most practical solution to most of the interoperability problems of these types of data.
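The de facto quad-tree schema referred to is commonly implemented with the following lon/lat-to-tile arithmetic; this is the standard Web Mercator formulation, shown as a sketch rather than quoted from the WMTS Simple Profile text:

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Tile (x, y) indices in the common quad-tree Web Mercator schema."""
    n = 2 ** zoom                       # tiles per axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

print(lonlat_to_tile(-3.7, 40.4, 12))   # a tile over Madrid at zoom 12
```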
Comparing apples and oranges: the Community Intercomparison Suite
NASA Astrophysics Data System (ADS)
Schutgens, Nick; Stier, Philip; Pascoe, Stephen
2014-05-01
Visual representation and comparison of geoscientific datasets presents a huge challenge due to the large variety of file formats and spatio-temporal sampling of data (be they observations or simulations). The Community Intercomparison Suite attempts to greatly simplify these tasks for users by offering an intelligent but simple command line tool for visualisation and colocation of diverse datasets. In addition, CIS can subset and aggregate large datasets into smaller more manageable datasets. Our philosophy is to remove as much as possible the need for specialist knowledge by the user of the structure of a dataset. The colocation of observations with model data is as simple as: "cis col
Kafeshani, Farzaneh Alizadeh; Rajabpour, Ali; Aghajanzadeh, Sirous; Gholamian, Esmaeil; Farkhari, Mohammad
2018-04-02
Aphis spiraecola Patch, Aphis gossypii Glover, and Toxoptera aurantii Boyer de Fonscolombe are three important aphid pests of citrus orchards. In this study, spatial distributions of the aphids on two orange species, Satsuma mandarin and Thomson navel, were evaluated using Taylor's power law and Iwao's patchiness. In addition, a fixed-precision sequential sampling plan was developed for each species on each host plant using Green's model at precision levels of 0.25 and 0.1. The results revealed that spatial distribution parameters, and therefore the sampling plan, differed significantly with aphid and host plant species. Taylor's power law provides a better fit for the data than Iwao's patchiness regression. Except for T. aurantii on Thomson navel orange, the spatial distribution patterns of the aphids were aggregative on both citrus species; T. aurantii had a regular dispersion pattern on Thomson navel orange. Optimum sample sizes of the aphids varied from 30-2061 and 1-1622 shoots on Satsuma mandarin and Thomson navel orange, depending on aphid species and desired precision level. Calculated stop lines of the aphid species on Satsuma mandarin and Thomson navel orange ranged from 0.48 to 19 and from 0.19 to 80.4 aphids per 24 shoots, according to aphid species and desired precision level. The performance of the sampling plan was validated by resampling analysis using the Resampling for Validation of Sampling Plans (RVSP) software. This sampling program is useful for IPM programs targeting these aphids in citrus orchards.
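Taylor's power law states that variance scales with the mean as s² = a·m^b, so b is estimated by a log-log regression over samples; the counts below are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(7)
# Mean and variance of counts per shoot, one pair per sampling occasion.
means = rng.uniform(0.5, 20, 25)
variances = 2.0 * means ** 1.4 * rng.lognormal(0, 0.1, 25)

b, log_a = np.polyfit(np.log(means), np.log(variances), 1)
# b > 1 suggests aggregation, b = 1 randomness, b < 1 a regular pattern.
print(f"a = {np.exp(log_a):.2f}, b = {b:.2f}")
```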
Human Classification Based on Gestural Motions by Using Components of PCA
NASA Astrophysics Data System (ADS)
Aziz, Azri A.; Wan, Khairunizam; Za'aba, S. K.; B, Shahriman A.; Adnan, Nazrul H.; H, Asyekin; R, Zuradzman M.
2013-12-01
Lately, the study of human capabilities with the aim of integrating them into machines has become a popular topic. Humans are blessed with special abilities: they can hear, see, sense, speak, think and understand each other. Giving such abilities to machines in order to improve human life is researchers' aim for a better quality of life in the future. This research concentrated on human gestures, specifically arm motions, for distinguishing individuals, which led to the development of a hand gesture database. We try to differentiate human physical characteristics based on hand gestures represented by arm trajectories. Subjects were selected with different body sizes, and the acquired data then underwent a resampling process. The results discuss the classification of humans based on arm trajectories by using components of Principal Component Analysis (PCA).
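A schematic of the pipeline described, resampling each variable-length arm trajectory to a fixed length and projecting onto principal components (scikit-learn assumed; the trajectories are random stand-ins):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(8)

def resample_trajectory(traj, n_points=50):
    """Linearly resample a (T, 2) trajectory to a fixed number of points."""
    t_old = np.linspace(0, 1, len(traj))
    t_new = np.linspace(0, 1, n_points)
    return np.column_stack([np.interp(t_new, t_old, traj[:, d])
                            for d in range(traj.shape[1])])

trajs = [rng.normal(size=(rng.integers(40, 120), 2)).cumsum(axis=0)
         for _ in range(30)]                       # stand-in arm trajectories
X = np.array([resample_trajectory(t).ravel() for t in trajs])
components = PCA(n_components=3).fit_transform(X)  # features for classification
```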
Fast parallel image registration on CPU and GPU for diagnostic classification of Alzheimer's disease
Shamonin, Denis P.; Bron, Esther E.; Lelieveldt, Boudewijn P. F.; Smits, Marion; Klein, Stefan; Staring, Marius
2013-01-01
Nonrigid image registration is an important, but time-consuming task in medical image analysis. In typical neuroimaging studies, multiple image registrations are performed, e.g., for atlas-based segmentation or template construction. Faster image registration routines would therefore be beneficial. In this paper we explore acceleration of the image registration package elastix by a combination of several techniques: (i) parallelization on the CPU, to speed up the cost function derivative calculation; (ii) parallelization on the GPU building on and extending the OpenCL framework from ITKv4, to speed up the Gaussian pyramid computation and the image resampling step; (iii) exploitation of certain properties of the B-spline transformation model; (iv) further software optimizations. The accelerated registration tool is employed in a study on diagnostic classification of Alzheimer's disease and cognitively normal controls based on T1-weighted MRI. We selected 299 participants from the publicly available Alzheimer's Disease Neuroimaging Initiative database. Classification is performed with a support vector machine based on gray matter volumes as a marker for atrophy. We evaluated two types of strategies (voxel-wise and region-wise) that heavily rely on nonrigid image registration. Parallelization and optimization resulted in an acceleration factor of 4–5x on an 8-core machine. Using OpenCL a speedup factor of 2 was realized for computation of the Gaussian pyramids, and 15–60 for the resampling step, for larger images. The voxel-wise and the region-wise classification methods had an area under the receiver operator characteristic curve of 88 and 90%, respectively, both for standard and accelerated registration. We conclude that the image registration package elastix was substantially accelerated, with nearly identical results to the non-optimized version. The new functionality will become available in the next release of elastix as open source under the BSD license. PMID:24474917
NASA Astrophysics Data System (ADS)
Brown, James D.; Wu, Limin; He, Minxue; Regonda, Satish; Lee, Haksu; Seo, Dong-Jun
2014-11-01
Retrospective forecasts of precipitation, temperature, and streamflow were generated with the Hydrologic Ensemble Forecast Service (HEFS) of the U.S. National Weather Service (NWS) for a 20-year period between 1979 and 1999. The hindcasts were produced for two basins in each of four River Forecast Centers (RFCs), namely the Arkansas-Red Basin RFC, the Colorado Basin RFC, the California-Nevada RFC, and the Middle Atlantic RFC. Precipitation and temperature forecasts were produced with the HEFS Meteorological Ensemble Forecast Processor (MEFP). Inputs to the MEFP comprised "raw" precipitation and temperature forecasts from the frozen (circa 1997) version of the NWS Global Forecast System (GFS) and a climatological ensemble, which involved resampling historical observations in a moving window around the forecast valid date ("resampled climatology"). In both cases, the forecast horizon was 1-14 days. This paper outlines the hindcasting and verification strategy, and then focuses on the quality of the temperature and precipitation forecasts from the MEFP. A companion paper focuses on the quality of the streamflow forecasts from the HEFS. In general, the precipitation forecasts are more skillful than resampled climatology during the first week, but show little or no skill during the second week. In contrast, the temperature forecasts improve upon resampled climatology at all forecast lead times. However, there are notable differences among RFCs and for different seasons, aggregation periods and magnitudes of the observed and forecast variables, both for precipitation and temperature. For example, the MEFP-GFS precipitation forecasts show the highest correlations and greatest skill in the California-Nevada RFC, particularly during the wet season (November-April). While generally reliable, the MEFP forecasts typically underestimate the largest observed precipitation amounts (a Type-II conditional bias). As a statistical technique, the MEFP cannot detect, and thus appropriately correct for, conditions that are undetected by the GFS. The calibration of the MEFP to provide reliable and skillful forecasts of a range of precipitation amounts (not only large amounts) is a secondary factor responsible for these Type-II conditional biases. Interpretation of the verification results leads to guidance on the expected performance and limitations of the MEFP, together with recommendations on future enhancements.
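The "resampled climatology" baseline can be sketched as drawing ensemble members from historical observations whose day-of-year falls near the forecast valid date; the window width and the data here are assumptions, not the MEFP configuration:

```python
import numpy as np

rng = np.random.default_rng(9)

def resampled_climatology(doy, values, valid_doy, window=7, n_members=40):
    """Sample historical values within +/- window days of the valid
    day-of-year, wrapping around the end of the year."""
    diff = np.abs(doy - valid_doy)
    diff = np.minimum(diff, 365 - diff)       # circular day-of-year distance
    pool = values[diff <= window]
    return rng.choice(pool, size=n_members, replace=True)

doy = np.tile(np.arange(1, 366), 20)          # 20 years of daily records
precip = rng.gamma(0.5, 4.0, doy.size)        # stand-in precipitation
ensemble = resampled_climatology(doy, precip, valid_doy=120)
```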
Learning and Information Approaches for Inference in Dynamic Data-Driven Geophysical Applications
NASA Astrophysics Data System (ADS)
Ravela, S.
2015-12-01
Many geophysical inference problems are characterized by non-linear processes, high-dimensional models and complex uncertainties. A dynamic coupling between models, estimation, and sampling is typically sought to efficiently characterize and reduce uncertainty. This process is however fraught with several difficulties. Among them, the key difficulties are the ability to deal with model errors, the efficacy of uncertainty quantification, and data assimilation. In this presentation, we present three key ideas from learning and intelligent systems theory and apply them to two geophysical applications. The first idea is the use of Ensemble Learning to compensate for model error, the second is to develop tractable Information Theoretic Learning to deal with non-Gaussianity in inference, and the third is a Manifold Resampling technique for effective uncertainty quantification. We apply these methods first to the development of a cooperative autonomous observing system using sUAS for studying coherent structures. Second, we apply them to the problem of quantifying risk from hurricanes and storm surges in a changing climate. Results indicate that learning approaches can enable new effectiveness in cases where standard approaches to model reduction, uncertainty quantification and data assimilation fail.
NASA Astrophysics Data System (ADS)
Zhang, Hongjuan; Hendricks Franssen, Harrie-Jan; Han, Xujun; Vrugt, Jasper A.; Vereecken, Harry
2017-09-01
Land surface models (LSMs) use a large cohort of parameters and state variables to simulate the water and energy balance at the soil-atmosphere interface. Many of these model parameters cannot be measured directly in the field, and require calibration against measured fluxes of carbon dioxide, sensible and/or latent heat, and/or observations of the thermal and/or moisture state of the soil. Here, we evaluate the usefulness and applicability of four different data assimilation methods for joint parameter and state estimation of the Variable Infiltration Capacity model (VIC-3L) and the Community Land Model (CLM) using a 5-month calibration (assimilation) period (March-July 2012) of areal-averaged SPADE soil moisture measurements at 5, 20, and 50 cm depths in the Rollesbroich experimental test site in the Eifel mountain range in western Germany. We used the EnKF with state augmentation or dual estimation, respectively, and the residual resampling PF with either a simple, statistically deficient, or a more sophisticated, MCMC-based parameter resampling method. The performance of the calibrated LSMs was investigated using SPADE water content measurements from a 5-month evaluation period (August-December 2012). As expected, all DA methods enhance the ability of the VIC and CLM models to describe spatiotemporal patterns of moisture storage within the vadose zone of the Rollesbroich site, particularly if the maximum baseflow velocity (VIC) or the fractions of sand, clay, and organic matter of each layer (CLM) are estimated jointly with the model states of each soil layer. The differences between the soil moisture simulations of VIC-3L and CLM are much larger than the discrepancies among the four data assimilation methods. The EnKF with state augmentation or dual estimation yields the best performance of VIC-3L and CLM during the calibration and evaluation periods, yet results are in close agreement with the PF using MCMC resampling. Overall, CLM demonstrated the best performance for the Rollesbroich site. The large systematic underestimation of water storage at 50 cm depth by VIC-3L during the first few months of the evaluation period questions, in part, the validity of its fixed water table depth at the bottom of the modeled soil domain.
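For reference, residual resampling as used in the PF keeps ⌊N·wᵢ⌋ copies of each particle deterministically and fills the remainder multinomially from the residual weights; this is the standard formulation, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(10)

def residual_resample(weights):
    """Indices of the particles retained after residual resampling."""
    n = weights.size
    counts = np.floor(n * weights).astype(int)      # deterministic copies
    idx = np.repeat(np.arange(n), counts)
    n_rest = n - counts.sum()
    if n_rest > 0:
        residual = n * weights - counts             # leftover weight mass
        residual /= residual.sum()
        idx = np.concatenate([idx, rng.choice(n, size=n_rest, p=residual)])
    return idx

w = rng.dirichlet(np.ones(100))   # stand-in normalized particle weights
keep = residual_resample(w)
```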
Bayesian Optimization for Neuroimaging Pre-processing in Brain Age Classification and Prediction
Lancaster, Jenessa; Lorenz, Romy; Leech, Rob; Cole, James H.
2018-01-01
Neuroimaging-based age prediction using machine learning is proposed as a biomarker of brain aging, relating to cognitive performance, health outcomes and progression of neurodegenerative disease. However, even leading age-prediction algorithms contain measurement error, motivating efforts to improve experimental pipelines. T1-weighted MRI is commonly used for age prediction, and the pre-processing of these scans involves normalization to a common template and resampling to a common voxel size, followed by spatial smoothing. Resampling parameters are often selected arbitrarily. Here, we sought to improve brain-age prediction accuracy by optimizing resampling parameters using Bayesian optimization. Using data on N = 2003 healthy individuals (aged 16–90 years) we trained support vector machines to (i) distinguish between young (<22 years) and old (>50 years) brains (classification) and (ii) predict chronological age (regression). We also evaluated generalisability of the age-regression model to an independent dataset (CamCAN, N = 648, aged 18–88 years). Bayesian optimization was used to identify optimal voxel size and smoothing kernel size for each task. This procedure adaptively samples the parameter space to evaluate accuracy across a range of possible parameters, using independent sub-samples to iteratively assess different parameter combinations to arrive at optimal values. When distinguishing between young and old brains, a classification accuracy of 88.1% was achieved (optimal voxel size = 11.5 mm3, smoothing kernel = 2.3 mm). For predicting chronological age, a mean absolute error (MAE) of 5.08 years was achieved (optimal voxel size = 3.73 mm3, smoothing kernel = 3.68 mm). This compared with an MAE of 5.48 years using the default values of 1.5 mm3 and 4 mm respectively, though the 7.3% improvement was not statistically significant. When assessing generalisability, best performance was achieved when applying the entire Bayesian optimization framework to the new dataset, outperforming the parameters optimized for the initial training dataset. Our study outlines the proof-of-principle that neuroimaging models for brain-age prediction can use Bayesian optimization to derive case-specific pre-processing parameters. Our results suggest that different pre-processing parameters are selected when optimization is conducted in specific contexts. This potentially motivates use of optimization techniques at many different points during the experimental process, which may improve statistical sensitivity and reduce opportunities for experimenter-led bias. PMID:29483870
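As a rough illustration of the optimization loop described above, the sketch below runs Gaussian-process-based Bayesian optimization with an expected-improvement acquisition over the two pre-processing parameters. The objective `cv_accuracy` is a hypothetical stand-in for the actual pipeline (resample, smooth, train an SVM, cross-validate); everything else uses standard scikit-learn/scipy calls.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def cv_accuracy(voxel_mm, fwhm_mm):
    # Hypothetical stand-in: resample scans to voxel_mm, smooth with fwhm_mm,
    # train an SVM and return cross-validated accuracy. Here a toy surface.
    return -((voxel_mm - 3.7) ** 2 + 0.5 * (fwhm_mm - 3.7) ** 2)

rng = np.random.default_rng(0)
bounds = np.array([[1.0, 12.0], [0.0, 8.0]])       # voxel size, smoothing kernel

X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))   # initial design
y = np.array([cv_accuracy(*x) for x in X])

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(500, 2))
    mu, sd = gp.predict(cand, return_std=True)
    z = (mu - y.max()) / np.maximum(sd, 1e-9)
    ei = (mu - y.max()) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, cv_accuracy(*x_next))

print("best parameters:", X[np.argmax(y)], "score:", y.max())
```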
Hexagonal Pixels and Indexing Scheme for Binary Images
NASA Technical Reports Server (NTRS)
Johnson, Gordon G.
2004-01-01
A scheme for resampling binary-image data from a rectangular grid to a regular hexagonal grid, and an associated tree-structured pixel-indexing scheme keyed to the level of resolution, have been devised. This scheme could be utilized in conjunction with appropriate image-data-processing algorithms to enable automated retrieval and/or recognition of images. For some purposes, this scheme is superior to a prior scheme that relies on rectangular pixels: one example of such a purpose is recognition of fingerprints, which can be approximated more closely by use of line segments along hexagonal axes than by line segments along rectangular axes. This scheme could also be combined with algorithms for query-image-based retrieval of images via the Internet. A binary image on a rectangular grid is generated by raster scanning or by sampling on a stationary grid of rectangular pixels. In either case, each pixel (each cell in the rectangular grid) is denoted as either bright or dark, depending on whether the light level in the pixel is above or below a prescribed threshold. The binary data on such an image are stored in a matrix form that lends itself readily to searches for line segments aligned with either or both of the perpendicular coordinate axes. The first step in resampling onto a regular hexagonal grid is to make the resolution of the hexagonal grid fine enough to capture all the binary-image detail from the rectangular grid. In practice, this amounts to choosing a hexagonal-cell width equal to or less than a third of the rectangular-cell width. Once the data have been resampled onto the hexagonal grid, the image can readily be checked for line segments aligned with the hexagonal coordinate axes, which typically lie at angles of 30°, 90°, and 150° with respect to, say, the horizontal rectangular coordinate axis. Optionally, one can then rotate the rectangular image by 90°, then again sample onto the hexagonal grid and check for line segments at angles of 0°, 60°, and 120° to the original horizontal coordinate axis. The net result is that one has checked for line segments at angular intervals of 30°. For even finer angular resolution, one could, for example, rotate the rectangular-grid image ±45° before sampling to perform checking for line segments at angular intervals of 15°.
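One simple way to realize the rectangular-to-hexagonal resampling described above is to point-sample the binary image at the centres of a hexagonal lattice whose cell width is at most a third of the pixel width. The sketch below is an illustration under that assumption, not the NTRS implementation; alternate rows are offset by half a cell, as in a regular hexagonal packing.

```python
import numpy as np

def resample_to_hex(binary_img, hex_width):
    """Sample a binary image (unit-width rectangular pixels) onto a regular
    hexagonal grid by point-sampling each hexagon centre at the nearest pixel."""
    h, w = binary_img.shape
    dx = hex_width                       # horizontal centre spacing
    dy = hex_width * np.sqrt(3) / 2      # vertical spacing of hexagon rows
    cells = {}
    row, y = 0, 0.0
    while y < h:
        x = 0.0 if row % 2 == 0 else dx / 2   # odd rows offset half a cell
        col = 0
        while x < w:
            cells[(row, col)] = binary_img[min(int(y), h - 1), min(int(x), w - 1)]
            x += dx
            col += 1
        y += dy
        row += 1
    return cells

img = np.zeros((12, 12), dtype=int)
img[4, :] = 1                                     # a horizontal line segment
hex_cells = resample_to_hex(img, hex_width=1 / 3)  # width <= 1/3 of pixel width
print(sum(hex_cells.values()), "bright hexagonal cells")
```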
NASA Astrophysics Data System (ADS)
Olafsdottir, Kristin B.; Mudelsee, Manfred
2013-04-01
Estimation of Pearson's correlation coefficient between two time series, to evaluate the influence of one time-dependent variable on another, is one of the most frequently used statistical methods in the climate sciences. Various methods are used to estimate confidence intervals to support the correlation point estimate. Many of them make strong mathematical assumptions regarding distributional shape and serial correlation, which are rarely met. More robust statistical methods are needed to increase the accuracy of the confidence intervals. Bootstrap confidence intervals are estimated in the Fortran 90 program PearsonT (Mudelsee, 2003), where the main intention was to obtain an accurate confidence interval for the correlation coefficient between two time series by taking into account the serial dependence of the process that generated the data. However, Monte Carlo experiments show that the coverage accuracy for smaller data sizes can be improved. Here we adapt the PearsonT program into a new version, called PearsonT3, by calibrating the confidence interval to increase the coverage accuracy. Calibration is a bootstrap resampling technique which essentially performs a second bootstrap loop, resampling from the bootstrap resamples. Like the non-calibrated bootstrap confidence intervals, it offers robustness against the data distribution. A pairwise moving block bootstrap is used to preserve the serial correlation of both time series. The calibration is applied to standard-error-based bootstrap Student's t confidence intervals. The performance of the calibrated confidence intervals is examined with Monte Carlo simulations and compared with the performance of confidence intervals without calibration, that is, PearsonT. The coverage accuracy is evidently better for the calibrated confidence intervals, where the coverage error is acceptably small (i.e., within a few percentage points) already for data sizes as small as 20. One form of climate time series is output from numerical models which simulate the climate system. The method is applied to model data from the high-resolution ocean model INALT01, where the relationship between the Agulhas Leakage and the North Brazil Current is evaluated. Preliminary results show significant correlation between the two variables with a 10-year lag between them, which is roughly the time it takes Agulhas Leakage water to reach the North Brazil Current. Mudelsee, M., 2003. Estimating Pearson's correlation coefficient with bootstrap confidence interval from serially dependent time series. Mathematical Geology 35, 651-665.
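A minimal sketch of the non-calibrated core of this procedure, the pairwise moving block bootstrap with a standard-error-based Student's t interval, is given below; PearsonT3's calibration wraps a second, nested bootstrap loop around it, which is omitted here. All names and the df choice are illustrative, not the PearsonT code.

```python
import numpy as np
from scipy.stats import t as t_dist

def mbb_pearson_ci(x, y, block_len=4, n_boot=2000, alpha=0.05, seed=0):
    """Pairwise moving block bootstrap, standard-error-based Student's t CI
    for Pearson's r (calibration loop of PearsonT3 omitted for brevity)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    r_hat = np.corrcoef(x, y)[0, 1]
    starts = np.arange(n - block_len + 1)
    n_blocks = int(np.ceil(n / block_len))
    reps = np.empty(n_boot)
    for b in range(n_boot):
        s = rng.choice(starts, size=n_blocks, replace=True)
        idx = np.concatenate([np.arange(i, i + block_len) for i in s])[:n]
        reps[b] = np.corrcoef(x[idx], y[idx])[0, 1]   # same blocks for both series
    se = reps.std(ddof=1)
    q = t_dist.ppf(1 - alpha / 2, df=n - 2)           # simplified df choice
    return r_hat - q * se, r_hat + q * se

rng = np.random.default_rng(1)
e = rng.normal(size=200)
x = np.convolve(e, [1.0, 0.7, 0.4], mode="same")      # serially correlated series
y = 0.5 * x + rng.normal(size=200)
print(mbb_pearson_ci(x, y))
```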
Liang, Bin; Li, Yongbao; Wei, Ran; Guo, Bin; Xu, Xuang; Liu, Bo; Li, Jiafeng; Wu, Qiuwen; Zhou, Fugen
2018-01-05
With robot-controlled linac positioning, robotic radiotherapy systems such as CyberKnife significantly increase the freedom of radiation beam placement, but also impose more challenges on treatment plan optimization. The resampling mechanism in the vendor-supplied treatment planning system (MultiPlan) cannot fully explore the increased beam direction search space. Besides, a sparse treatment plan (using fewer beams) is desired to improve treatment efficiency. This study proposes a singular value decomposition linear programming (SVDLP) optimization technique for circular-collimator-based robotic radiotherapy. The SVDLP approach initializes the input beams by simulating the process of covering the entire target volume with equivalent beam tapers. The requirements on the dosimetric distribution are modeled as hard and soft constraints, and the sparsity of the treatment plan is achieved by compressive sensing. The proposed linear programming (LP) model optimizes the beam weights by minimizing the deviation from the soft constraints subject to the hard constraints, with a constraint on the l1-norm of the beam weight vector. A singular value decomposition (SVD) based acceleration technique was developed for the LP model. Based on the degeneracy of the influence matrix, the model is first compressed into a lower dimension for optimization, and then back-projected to reconstruct the beam weights. After beam weight optimization, the number of beams is reduced by removing the beams with low weight and optimizing the weights of the remaining beams using the same model. This beam reduction technique is further validated by a mixed integer programming (MIP) model. The SVDLP approach was tested on a lung case. The results demonstrate that the SVD acceleration technique speeds up the optimization by a factor of 4.8. Furthermore, the beam reduction achieves a plan quality similar to the globally optimal plan obtained by the MIP model, but is one to two orders of magnitude faster. The SVDLP approach was also tested and compared with MultiPlan on three clinical cases of varying complexity. In general, the plans generated by the SVDLP achieve a steeper dose gradient, better conformity and less damage to normal tissues. In conclusion, the SVDLP approach effectively improves the quality of the treatment plan thanks to the use of the complete beam search space, and this challenging optimization problem is handled effectively by the proposed SVD acceleration.
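Because beam weights are non-negative, the l1-norm constraint reduces to a bound on the sum of the weights, so the whole model is an ordinary LP. The sketch below shows that structure with a random, hypothetical influence matrix and omits the SVD compression: slack variables absorb soft-constraint deviations, while hard dose bounds and the l1 budget enter as inequalities. All sizes and numbers are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n_vox, n_beam = 40, 15
D = rng.random((n_vox, n_beam))          # influence (dose) matrix, hypothetical
d_target = np.full(n_vox, 1.0)           # prescribed dose at target voxels
d_max = 1.15 * d_target                  # hard upper dose bound
t = 4.0                                  # l1 budget on beam weights (sparsity)

# variables: [w (n_beam beam weights), s (n_vox soft-deviation slacks)]
c = np.concatenate([np.zeros(n_beam), np.ones(n_vox)])   # minimize total deviation
A_ub = np.block([
    [ D, -np.eye(n_vox)],                #  D w - s <= d_target
    [-D, -np.eye(n_vox)],                # -D w - s <= -d_target  (so |Dw - d| <= s)
    [ D,  np.zeros((n_vox, n_vox))],     #  D w <= d_max          (hard constraint)
])
b_ub = np.concatenate([d_target, -d_target, d_max])
A_l1 = np.concatenate([np.ones(n_beam), np.zeros(n_vox)])[None, :]
A_ub = np.vstack([A_ub, A_l1])           # sum(w) <= t  (= ||w||_1 since w >= 0)
b_ub = np.append(b_ub, t)

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None)] * (n_beam + n_vox), method="highs")
w = res.x[:n_beam]
print("active beams:", int(np.sum(w > 1e-6)))  # low-weight beams can then be pruned
```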
Osteosarcoma: Diagnostic dilemmas in histopathology and prognostic factors
Wadhwa, Neelam
2014-01-01
Osteosarcoma (OS), the commonest malignancy of osteoarticular origin, is a very aggressive neoplasm. Divergent histologic differentiation is common in OS; hence a triple diagnostic approach is essential in all cases. About 20% of cases are atypical owing to a lack of concurrence among clinicoradiologic and pathologic features, necessitating resampling. Recognition of specific anatomic and histologic variants is essential in view of their better outcome. Traditional prognostic factors of OS do stratify patients for short term outcome, but often fail to predict their long term outcome. Considering the negligible improvement in patient outcome during the last 20 years, the search for novel prognostic factors is in progress; candidates such as ezrin, vascular endothelial growth factor, chemokine receptors, and dysregulation of various micro ribonucleic acids are potentially promising. Their utility needs to be validated by long term followup studies before they are incorporated into routine clinical practice. PMID:24932029
Model Selection for Monitoring CO2 Plume during Sequestration
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-12-31
The model selection method developed as part of this project mainly includes four steps: (1) assessing the connectivity/dynamic characteristics of a large prior ensemble of models, (2) model clustering using multidimensional scaling coupled with k-means clustering, (3) model selection using Bayes' rule in the reduced model space, (4) model expansion using iterative resampling of the posterior models. The fourth step expresses one of the advantages of the method: it provides a built-in means of quantifying the uncertainty in predictions made with the selected models. In our application to plume monitoring, by expanding the posterior space of models, the final ensemble of representations of the geological model can be used to assess the uncertainty in predicting the future displacement of the CO2 plume. The software implementation of this approach is attached here.
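Step (2) of this workflow can be illustrated with standard tools. The sketch below embeds a hypothetical ensemble of 50 prior models, using pairwise distances between their simulated responses, with metric MDS and then clusters the embedding with k-means; steps (3) and (4) are indicated only schematically in the closing comment. All data here are synthetic stand-ins.

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# hypothetical stand-in: 50 prior geological models, each summarized by a
# 20-dimensional simulated plume response
responses = rng.random((50, 20))
dist = np.linalg.norm(responses[:, None, :] - responses[None, :, :], axis=-1)

# step (2): embed the models in 2D with metric MDS, then cluster with k-means
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dist)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(coords)
print(np.bincount(labels))   # cluster sizes in the reduced model space

# steps (3)-(4), schematically: weight clusters by their data misfit via
# Bayes' rule, then iteratively resample posterior models to expand the set
```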
NASA Astrophysics Data System (ADS)
Li, L.; Yang, C.
2017-12-01
Climate extremes often manifest as rare events in terms of surface air temperature and precipitation with an annual recurrence period. In order to represent the manifold characteristics of climate extremes for monitoring and analysis, the Expert Team on Climate Change Detection and Indices (ETCCDI) has worked out a set of 27 core indices based on daily temperature and precipitation data, describing extreme weather and climate events on an annual basis. The CLIMDEX project (http://www.climdex.org) has produced public domain datasets of such indices for data from a variety of sources, including output from global climate models (GCM) participating in the Coupled Model Intercomparison Project Phase 5 (CMIP5). Among the 27 ETCCDI indices, there are six percentile-based temperature extremes indices that fall into two groups: exceedance rates (ER) (TN10p, TN90p, TX10p and TX90p) and durations (CSDI and WSDI). Percentiles must be estimated prior to the calculation of the indices, and could be biased to a greater or lesser degree by the adopted algorithm. Such biases will in turn be propagated to the final results of the indices. CLIMDEX used an empirical quantile estimator combined with a bootstrap resampling procedure to reduce the inhomogeneity in the annual series of the ER indices. However, some problems remain in the CLIMDEX datasets, namely the overestimated climate variability due to unaccounted autocorrelation in the daily temperature data, seasonally varying biases, and inconsistency between the algorithms applied to the ER indices and to the duration indices. We now present new results for the six indices obtained through a semiparametric quantile regression approach applied to the CMIP5 model output. By using the base-period data as a whole and taking seasonality and autocorrelation into account, this approach successfully addresses the aforementioned issues and produces consistent results. The new datasets cover the historical and three projected (RCP2.6, RCP4.5 and RCP8.5) emission scenarios, run with a 19-member multimodel ensemble. We analyze changes in the six indices on global and regional scales over the 21st century relative to either the base period 1961-1990 or the reference period 1981-2000, and compare the results with those based on the CLIMDEX datasets.
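For concreteness, here is a minimal sketch of a percentile-based ER index (TX90p-style, on synthetic data and without the usual 5-day calendar window); the closing comment summarizes the CLIMDEX bootstrap used inside the base period. This is an illustration, not the CLIMDEX code.

```python
import numpy as np

def tx90p(base, series):
    """Annual exceedance rate (%) of daily Tmax above the base-period
    90th percentile, computed per calendar day (5-day window omitted)."""
    thr = np.percentile(base, 90, axis=0)        # base: (n_years, 365)
    return 100 * (series > thr).mean(axis=1)     # series: (m_years, 365)

rng = np.random.default_rng(0)
base = rng.normal(25, 5, size=(30, 365))         # 1961-1990, synthetic
future = rng.normal(26, 5, size=(20, 365))
print(tx90p(base, future))

# For index values *inside* the base period, CLIMDEX removes the target year,
# substitutes each of the remaining 29 years in turn (a bootstrap), and
# averages the resulting estimates to avoid an inhomogeneity at the
# base-period boundary.
```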
Pourhajibagher, Maryam; Raoofian, Reza; Ghorbanzadeh, Roghayeh; Bahador, Abbas
2018-03-01
The infected root canal system harbors one of the highest accumulations of polymicrobial infections. Since the eradication of endopathogenic microbiota is a major goal in endodontic infection therapy, photo-activated disinfection (PAD) can be used as an alternative therapeutic method in endodontic treatment. Compared to cultivation-based approaches, molecular techniques are more reliable for identifying microbial agents associated with endodontic infections. The purpose of this study was to evaluate the ability of a designed multiplex real-time PCR protocol for the rapid detection and quantification of six common microorganisms involved in endodontic infection before and after PAD. Samples were taken from the root canals of 50 patients with primary and secondary/persistent endodontic infections using sterile paper points. PAD with toluidine blue O (TBO) plus a diode laser was performed on root canals. Resampling was then performed, and the samples were transferred to transport medium. Then, the six target microorganisms were detected using multiplex real-time PCR before and after PAD. Veillonella parvula was found using multiplex real-time PCR to have the highest frequency among samples collected before PAD (29.4%), followed by Porphyromonas gingivalis (23.1%), Aggregatibacter actinomycetemcomitans (13.6%), Actinomyces naeslundii (13.0%), Enterococcus faecalis (11.5%), and Lactobacillus rhamnosus (9.4%). After TBO-mediated PAD, P. gingivalis strains, the most resistant microorganisms, were recovered in 41.7% of the samples using the molecular approach (P > 0.05). As the results show, multiplex real-time PCR, as an accurate high-throughput detection approach, and TBO-mediated PAD, as an efficient antimicrobial strategy that significantly reduces the endopathogenic count, can be used for the detection and treatment, respectively, of microbiota involved in infected root canals. Copyright © 2018 Elsevier B.V. All rights reserved.
Decision support for the selection of reference sites using 137Cs as a soil erosion tracer
NASA Astrophysics Data System (ADS)
Arata, Laura; Meusburger, Katrin; Bürge, Alexandra; Zehringer, Markus; Ketterer, Michael E.; Mabit, Lionel; Alewell, Christine
2017-08-01
The classical approach of using 137Cs as a soil erosion tracer is based on the comparison between stable reference sites and sites affected by soil redistribution processes; it enables the derivation of soil erosion and deposition rates. The method is associated with potentially large sources of uncertainty, much of it stemming from the selection of the reference sites. We propose a decision support tool to Check the Suitability of reference Sites (CheSS). Commonly, the variation among 137Cs inventories of spatial replicate reference samples is taken as the sole criterion to decide on the suitability of a reference inventory. Here we propose an extension of this procedure using a repeated sampling approach, in which the reference sites are resampled after a certain time period. Suitable reference sites are expected to show no significant temporal variation in their decay-corrected 137Cs depth profiles. Possible causes of variation are assessed by a decision tree. More specifically, the decision tree tests for (i) uncertainty connected to small-scale variability in 137Cs due to its heterogeneous initial fallout (such as in areas affected by the Chernobyl fallout), (ii) signs of erosion or deposition processes and (iii) artefacts due to the collection, preparation and measurement of the samples; (iv) finally, if none of the above can be assigned, the variation might be attributed to turbation processes (e.g. bioturbation, cryoturbation and mechanical turbation, such as avalanches or rockfalls). As an example, CheSS was applied in one Swiss alpine valley, where the apparent temporal variability called into question the suitability of the selected reference sites. In general we suggest the application of CheSS as a first step towards a comprehensible approach to testing the suitability of reference sites.
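The temporal-stability check at the heart of this repeated-sampling idea rests on decay-correcting the resampled inventories to a common reference date before comparing them. A minimal sketch with hypothetical inventories follows (the 137Cs half-life of 30.17 years is standard; the t-test stands in for whatever significance test is preferred):

```python
import numpy as np
from scipy.stats import ttest_ind

T_HALF = 30.17                      # 137Cs half-life in years
lam = np.log(2) / T_HALF

def decay_correct(inventory, years_since_ref):
    """Decay-correct a 137Cs inventory back to a common reference date."""
    return inventory * np.exp(lam * years_since_ref)

# hypothetical repeated sampling of a candidate reference site (Bq m^-2)
inv_2010 = np.array([2450.0, 2580.0, 2390.0, 2510.0])
inv_2016 = np.array([2280.0, 2330.0, 2190.0, 2400.0])   # resampled 6 yr later
corr_2016 = decay_correct(inv_2016, years_since_ref=6)

# a stable site should show no significant change after decay correction
print(ttest_ind(inv_2010, corr_2016))
```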
Controls on desert dune activity - a geospatial approach
NASA Astrophysics Data System (ADS)
Lancaster, N.; Hesse, P. P.
2017-12-01
Desert and other inland dunes occur on a wide spectrum of activity (defined loosely as the proportion of the surface area subject to sand movement), from unvegetated to sparsely vegetated "active" dunes through discontinuously vegetated inactive dunes to completely vegetated and degraded dunes. Many of the latter are relicts of past climatic conditions. Although field studies and modeling of the interactions between winds, vegetation cover, and dune activity can provide valuable insights, the response of dune systems to climate change and variability, past, present, and future, has until now been hampered by the lack of pertinent observational data on geomorphic and climatic boundary conditions and dune activity status for most dune areas. We have developed a GIS-based approach that permits analysis of boundary conditions and controls on dune activity at a range of spatial scales, from dunefield to global. In this approach, digital mapping of dune field and sand sea extent has been combined with systematic observations of dune activity at 0.2° intervals from high-resolution satellite image data, resulting in four classes of activity. Global gridded datasets at 1 km resolution for the aridity index (AI), precipitation, satellite-derived percent vegetation cover, and estimates of sand transport potential (DP) were re-sampled for each 0.2° grid cell, and dune activity was compared to vegetation cover, sand transport potential, precipitation, and the aridity index. Results so far indicate that there are broad-scale relationships between dunefield mean activity, climate, and vegetation cover. However, the scatter in the data suggests that other local factors may be at work. Intra-dunefield patterns are complex in many cases. Overall, much more work needs to be done to gain a full understanding of controls at different spatial and temporal scales, which can be facilitated by this spatial database.
Devenish Nelson, Eleanor S.; Harris, Stephen; Soulsbury, Carl D.; Richards, Shane A.; Stephens, Philip A.
2010-01-01
Background Demographic models are widely used in conservation and management, and their parameterisation often relies on data collected for other purposes. When underlying data lack clear indications of associated uncertainty, modellers often fail to account for that uncertainty in model outputs, such as estimates of population growth. Methodology/Principal Findings We applied a likelihood approach to infer uncertainty retrospectively from point estimates of vital rates. Combining this with resampling techniques and projection modelling, we show that confidence intervals for population growth estimates are easy to derive. We used similar techniques to examine the effects of sample size on uncertainty. Our approach is illustrated using data on the red fox, Vulpes vulpes, a predator of ecological and cultural importance, and the most widespread extant terrestrial mammal. We show that uncertainty surrounding estimated population growth rates can be high, even for relatively well-studied populations. Halving that uncertainty typically requires a quadrupling of sampling effort. Conclusions/Significance Our results compel caution when comparing demographic trends between populations without accounting for uncertainty. Our methods will be widely applicable to demographic studies of many species. PMID:21049049
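A bare-bones version of the resampling-plus-projection idea, with hypothetical fox vital rates and a two-age-class projection matrix, might look like the following; the population growth rate lambda is the dominant eigenvalue of each resampled matrix. The numbers and the normal resampling distributions are illustrative assumptions, not the study's data or likelihood model.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical red fox vital rates: point estimates with inferred SEs
sj_hat, sj_se = 0.55, 0.06    # juvenile survival
sa_hat, sa_se = 0.70, 0.04    # adult survival
f_hat, f_se = 2.0, 0.30       # female offspring per adult female

n_boot = 10_000
lam = np.empty(n_boot)
for b in range(n_boot):
    sj = np.clip(rng.normal(sj_hat, sj_se), 0.0, 1.0)
    sa = np.clip(rng.normal(sa_hat, sa_se), 0.0, 1.0)
    f = max(rng.normal(f_hat, f_se), 0.0)
    A = np.array([[0.0, f],            # two-age-class projection matrix
                  [sj,  sa]])
    lam[b] = np.real(np.linalg.eigvals(A)).max()   # dominant eigenvalue

lo, hi = np.percentile(lam, [2.5, 97.5])
print(f"lambda = {np.median(lam):.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```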
The seven deadly sins of DNA barcoding.
Collins, R A; Cruickshank, R H
2013-11-01
Despite the broad benefits that DNA barcoding can bring to a diverse range of biological disciplines, a number of shortcomings still exist in terms of the experimental design of studies incorporating this approach. One underlying reason for this lies in the confusion that often exists between species discovery and specimen identification, and this is reflected in the way that hypotheses are generated and tested. Although these aims can be associated, they are quite distinct and require different methodological approaches, but their conflation has led to the frequently inappropriate use of commonly used analytical methods such as neighbour-joining trees, bootstrap resampling and fixed distance thresholds. Furthermore, the misidentification of voucher specimens can also have serious implications for end users of reference libraries such as the Barcode of Life Data Systems, and in this regard we advocate increased diligence in the a priori identification of specimens to be used for this purpose. This commentary provides an assessment of seven deficiencies that we identify as common in the DNA barcoding literature, and outline some potential improvements for its adaptation and adoption towards more reliable and accurate outcomes. © 2012 John Wiley & Sons Ltd.
LANDSAT-D investigations in snow hydrology
NASA Technical Reports Server (NTRS)
Dozier, J. (Principal Investigator)
1984-01-01
Thematic mapper radiometric characteristics, snow/cloud reflectance, and atmospheric correction are discussed with application to determining the spectral albedo of snow. The geometric characteristics of TM and digital elevation data are examined. The geometric transformations and resampling required to coregister these data are discussed.
Castells, Xavier; Acebes, Juan José; Majós, Carles; Boluda, Susana; Julià-Sapé, Margarida; Candiota, Ana Paula; Ariño, Joaquín; Barceló, Anna; Arús, Carles
2015-01-01
Glioblastoma (Gb) is one of the most deadly tumors. Its molecular subtypes are yet to be fully characterized, while the attendant efforts for personalized medicine need to be intensified in relation to glioblastoma diagnosis, treatment, and prognosis. Several molecular signatures based on gene expression microarrays have been reported, but the use of microarrays in routine clinical practice is challenged by the attendant economic costs. Several authors have proposed discriminant equations based on RT-PCR. Still, the discriminant threshold is often incompletely described, which makes proper validation difficult. In a previous work, we reported two Gb subtypes based on the expression levels of four genes: CHI3L1, LDHA, LGALS1, and IGFBP3. One Gb subtype presented with low expression of the four genes mentioned, and of MGMT in a large portion of the patients (with anticipated high methylation of its promoter), and mutated IDH1. Here, we evaluate the robustness of the equations fitted with these genes using RT-PCR values in a set of 64 cases and, importantly, define an unequivocal discriminant threshold with a view to prognostic implications. We developed two approaches to generate the discriminant equations: 1) using the expression levels of the four genes mentioned above, and 2) using those genes displaying the highest correlation with survival among the aforementioned four, plus MGMT, in an attempt to further reduce the number of genes. The ease of applying the equations, the reduction in cost for raw data, and their robustness in terms of resampling-based classification accuracy warrant further evaluation of these equations to discern Gb tumor biopsy heterogeneity at the molecular level, diagnose potential malignancy, and assess the prognosis of individual patients with glioblastomas.
Molecular taxonomy of phytopathogenic fungi: a case study in Peronospora.
Göker, Markus; García-Blázquez, Gema; Voglmayr, Hermann; Tellería, M Teresa; Martín, María P
2009-07-29
Inappropriate taxon definitions may have severe consequences in many areas. For instance, biologically sensible species delimitation of plant pathogens is crucial for measures such as plant protection or biological control and for comparative studies involving model organisms. However, delimiting species is challenging in the case of organisms for which often only molecular data are available, such as prokaryotes, fungi, and many unicellular eukaryotes. Even in the case of organisms with well-established morphological characteristics, molecular taxonomy is often necessary to emend current taxonomic concepts and to analyze DNA sequences directly sampled from the environment. Typically, for this purpose clustering approaches to delineate molecular operational taxonomic units have been applied using arbitrary choices regarding the distance threshold values and the clustering algorithms. Here, we report on a clustering optimization method to establish a molecular taxonomy of Peronospora based on ITS nrDNA sequences. Peronospora is the largest genus within the downy mildews, which are obligate parasites of higher plants, and includes various economically important pathogens. The method determines the distance function and clustering setting that result in an optimal agreement with selected reference data. Optimization was based on both taxonomy-based and host-based reference information, yielding the same outcome. Resampling and permutation methods indicate that the method is robust regarding taxon sampling and errors in the reference data. Tests with newly obtained ITS sequences demonstrate the use of the re-classified dataset in molecular identification of downy mildews. A corrected taxonomy is provided for all Peronospora ITS sequences contained in public databases. Clustering optimization appears to be broadly applicable in automated, sequence-based taxonomy. The method connects traditional and modern taxonomic disciplines by specifically addressing the issue of how to optimally account for both traditional species concepts and genetic divergence.
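The clustering optimization can be sketched as a scan over linkage methods and distance thresholds, scored against a reference partition. In the sketch below the adjusted Rand index stands in for whatever agreement measure the authors used, and the distance matrix is a small synthetic stand-in for real ITS distances.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.metrics import adjusted_rand_score

def optimize_clustering(dist, reference_labels,
                        methods=("single", "average", "complete")):
    """Scan linkage methods and distance thresholds; keep the combination
    whose clusters agree best with the reference partition."""
    condensed = squareform(dist, checks=False)
    best = (None, None, -np.inf)
    for m in methods:
        Z = linkage(condensed, method=m)
        for thr in np.linspace(0.01, dist.max(), 100):
            labels = fcluster(Z, t=thr, criterion="distance")
            score = adjusted_rand_score(reference_labels, labels)
            if score > best[2]:
                best = (m, thr, score)
    return best

# hypothetical ITS pairwise distances for 8 specimens, 2 reference species
rng = np.random.default_rng(0)
d = rng.random((8, 8)) * 0.05
d[:4, 4:] += 0.3
d[4:, :4] += 0.3                 # clear between-species divergence
d = (d + d.T) / 2
np.fill_diagonal(d, 0.0)
print(optimize_clustering(d, [1, 1, 1, 1, 2, 2, 2, 2]))
```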
NASA Astrophysics Data System (ADS)
Han, Tao; Chen, Lingyun; Lai, Chao-Jen; Liu, Xinming; Shen, Youtao; Zhong, Yuncheng; Ge, Shuaiping; Yi, Ying; Wang, Tianpeng; Shaw, Chris C.
2009-02-01
Images of mastectomy breast specimens have been acquired with a bench-top experimental cone beam CT (CBCT) system. The resulting images have been segmented to model an uncompressed breast for simulation of various CBCT techniques. To further simulate conventional or tomosynthesis mammographic imaging for comparison with the CBCT technique, a deformation technique was developed to convert the CT data for an uncompressed breast into those for a compressed breast without altering the breast volume or regional breast density. With this technique, 3D breast deformation is separated into two 2D deformations in the coronal and axial views. To preserve the total breast volume and regional tissue composition, each 2D deformation step was achieved by altering the square pixels into rectangular ones with the pixel areas unchanged and resampling with the original square pixels using bilinear interpolation. The compression was modeled by first stretching the breast in the superior-inferior direction in the coronal view. The image data were first deformed by distorting the voxels with a uniform distortion ratio. These deformed data were then deformed again using distortion ratios varying with the breast thickness and re-sampled. The deformation procedures were then applied in the axial view to stretch the breast in the chest wall-to-nipple direction while shrinking it in the mediolateral direction; the data were then re-sampled and converted into data for uniform cubic voxels. Threshold segmentation was applied to the final deformed image data to obtain the 3D compressed breast model. Our results show that the original segmented CBCT image data were successfully converted into those for a compressed breast with the same volume and regional density preserved. Using this compressed breast model, conventional and tomosynthesis mammograms were simulated for comparison with CBCT.
Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A
2017-06-30
Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists have questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has limitations with small sample sizes. We used a pooled method in a nonparametric bootstrap test that may overcome the problems related to small samples in hypothesis testing. The present study compared the nonparametric bootstrap test with pooled resampling to the corresponding parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means as compared with the unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test, while maintaining the type I error probability under all conditions except Cauchy and extreme variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. The nonparametric bootstrap paired t-test also provided better performance than other alternatives. The nonparametric bootstrap test provided a benefit over the exact Kruskal-Wallis test. We suggest using the nonparametric bootstrap test with pooled resampling for comparing paired or unpaired means and for validating one-way analysis of variance test results for non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd.
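The pooled resampling scheme is easy to state: under the null hypothesis, both groups are drawn, with their original sizes, from the combined sample. A minimal sketch (illustrative, not the authors' code) follows.

```python
import numpy as np

def pooled_bootstrap_t_test(x, y, n_boot=10_000, seed=0):
    """Two-sample bootstrap t-test with pooled resampling: under H0 both
    groups are redrawn from the combined sample, preserving group sizes."""
    rng = np.random.default_rng(seed)

    def t_stat(a, b):
        return (a.mean() - b.mean()) / np.sqrt(
            a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))

    t_obs = t_stat(x, y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_boot):
        xb = rng.choice(pooled, size=len(x), replace=True)
        yb = rng.choice(pooled, size=len(y), replace=True)
        count += abs(t_stat(xb, yb)) >= abs(t_obs)
    return t_obs, count / n_boot          # statistic and bootstrap p-value

x = np.array([4.1, 5.3, 6.0, 3.8, 5.5])   # small, possibly non-normal samples
y = np.array([7.2, 8.1, 6.9, 9.0, 7.7])
print(pooled_bootstrap_t_test(x, y))
```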
NASA Astrophysics Data System (ADS)
Yepez, Santiago; Laraque, Alain; Martinez, Jean-Michel; De Sa, Jose; Carrera, Juan Manuel; Castellanos, Bartolo; Gallay, Marjorie; Lopez, Jose L.
2018-01-01
In this study, 81 Landsat-8 scenes acquired from 2013 to 2015 were used to estimate the suspended sediment concentration (SSC) in the Orinoco River at its main hydrological station at Ciudad Bolivar, Venezuela. This gauging station monitors an upstream area corresponding to 89% of the total catchment area, where the mean discharge is 33,000 m3·s-1. SSC spatial and temporal variabilities were analyzed in relation to the hydrological cycle and to local geomorphological characteristics of the river mainstream. Three types of atmospheric correction models were evaluated to correct the Landsat-8 images: DOS, FLAASH, and L8SR. Surface reflectance was compared with monthly water sampling to calibrate an SSC retrieval model using bootstrap resampling. A regression model based on surface reflectance at near-infrared wavelengths showed the best performance: R2 = 0.92 (N = 27) for the whole range of SSC (18 to 203 mg·l-1) measured at this station during the studied period. The method offers a simple new approach to estimate the SSC along the lower Orinoco River and demonstrates the feasibility and reliability of remote sensing images for mapping the spatiotemporal variability in sediment transport over large rivers.
Prediction of resource volumes at untested locations using simple local prediction models
Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.
2006-01-01
This paper shows how local spatial nonparametric prediction models can be applied to estimate volumes of recoverable gas resources at individual undrilled sites, at multiple sites on a regional scale, and to compute confidence bounds for regional volumes based on the distribution of those estimates. An approach that combines cross-validation, the jackknife, and bootstrap procedures is used to accomplish this task. Simulation experiments show that cross-validation can be applied beneficially to select an appropriate prediction model. The cross-validation procedure worked well for a wide range of different states of nature and levels of information. Jackknife procedures are used to compute individual prediction estimation errors at undrilled locations. The jackknife replicates also are used with a bootstrap resampling procedure to compute confidence bounds for the total volume. The method was applied to data (partitioned into a training set and target set) from the Devonian Antrim Shale continuous-type gas play in the Michigan Basin in Otsego County, Michigan. The analysis showed that the model estimate of total recoverable volumes at prediction sites is within 4 percent of the total observed volume. The model predictions also provide frequency distributions of the cell volumes at the production unit scale. Such distributions are the basis for subsequent economic analyses. © Springer Science+Business Media, LLC 2007.
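A loose sketch of the jackknife/bootstrap combination, applied to a single simple estimator (mean recoverable volume per cell, with synthetic lognormal training data) rather than the paper's spatial prediction model, might look like this:

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.lognormal(mean=2.0, sigma=0.8, size=60)   # hypothetical cell volumes

# jackknife replicates of the estimator (leave-one-out means)
n = len(train)
jack = np.array([np.delete(train, i).mean() for i in range(n)])
jack_se = np.sqrt((n - 1) / n * ((jack - jack.mean()) ** 2).sum())
print(f"jackknife SE of the mean: {jack_se:.2f}")

# bootstrap the jackknife replicates to bound the regional total
n_target = 200                                        # untested cells
totals = np.array([n_target * rng.choice(jack, size=n, replace=True).mean()
                   for _ in range(5000)])
lo, hi = np.percentile(totals, [5, 95])
print(f"total: {n_target * train.mean():.0f}, 90% bounds ({lo:.0f}, {hi:.0f})")
```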
Replicating studies in which samples of participants respond to samples of stimuli.
Westfall, Jacob; Judd, Charles M; Kenny, David A
2015-05-01
In a direct replication, the typical goal is to reproduce a prior experimental result with a new but comparable sample of participants in a high-powered replication study. Often in psychology, the research to be replicated involves a sample of participants responding to a sample of stimuli. In replicating such studies, we argue that the same criteria should be used in sampling stimuli as are used in sampling participants. Namely, a new but comparable sample of stimuli should be used to ensure that the original results are not due to idiosyncrasies of the original stimulus sample, and the stimulus sample must often be enlarged to ensure high statistical power. In support of the latter point, we discuss the fact that in experiments involving samples of stimuli, statistical power typically does not approach 1 as the number of participants goes to infinity. As an example of the importance of sampling new stimuli, we discuss the bygone literature on the risky shift phenomenon, which was almost entirely based on a single stimulus sample that was later discovered to be highly unrepresentative. We discuss the use of both resampled and expanded stimulus sets, that is, stimulus samples that include the original stimuli plus new stimuli. © The Author(s) 2015.
SAMPLE SIZE FOR SEASONAL MEAN CONCENTRATION, DEPOSITION VELOCITY AND DEPOSITION: A RESAMPLING STUDY
Methodologies are described to assign confidence statements to seasonal means of concentration (C), deposition velocity (Vd), and deposition categorized by species/parameters, sites, and seasons in the presence of missing data. Estimators of seasonal means with missing weekly dat...
MISR Level 1 Near Real Time Products
Atmospheric Science Data Center
2016-10-31
The MISR Near Real Time Level 1 data products ... km MISR swath and projected onto a Space-Oblique Mercator (SOM) map grid. The Ellipsoid-projected and Terrain-projected top-of-atmosphere (TOA) radiance products provide measurements respectively resampled onto the ...
Using and Evaluating Resampling Simulations in SPSS and Excel.
ERIC Educational Resources Information Center
Smith, Brad
2003-01-01
Describes and evaluates three computer-assisted simulations used with Statistical Package for the Social Sciences (SPSS) and Microsoft Excel. Designed the simulations to reinforce and enhance student understanding of sampling distributions, confidence intervals, and significance tests. Reports evaluations revealed improved student comprehension of…
Uncertainties in the cluster-cluster correlation function
NASA Astrophysics Data System (ADS)
Ling, E. N.; Frenk, C. S.; Barrow, J. D.
1986-12-01
The bootstrap resampling technique is applied to estimate sampling errors and significance levels of the two-point correlation functions determined for a subset of the CfA redshift survey of galaxies and a redshift sample of 104 Abell clusters. The angular correlation function for a sample of 1664 Abell clusters is also calculated. The standard errors in xi(r) for the Abell data are found to be considerably larger than quoted 'Poisson errors'. The best estimate for the ratio of the correlation length of Abell clusters (richness class R ≥ 1, distance class D ≤ 4) to that of CfA galaxies is 4.2 (+1.4/-1.0) (68th-percentile error). The enhancement of cluster clustering over galaxy clustering is statistically significant in the presence of resampling errors. The uncertainties found do not include the effects of possible systematic biases in the galaxy and cluster catalogs and could be regarded as lower bounds on the true uncertainty range.
Survival estimation and the effects of dependency among animals
Schmutz, Joel A.; Ward, David H.; Sedinger, James S.; Rexstad, Eric A.
1995-01-01
Survival models assume that fates of individuals are independent, yet the robustness of this assumption has been poorly quantified. We examine how empirically derived estimates of the variance of survival rates are affected by dependency in survival probability among individuals. We used Monte Carlo simulations to generate known amounts of dependency among pairs of individuals and analyzed these data with Kaplan-Meier and Cormack-Jolly-Seber models. Dependency significantly increased these empirical variances as compared to theoretically derived estimates of variance from the same populations. Using resighting data from 168 pairs of black brant, we used a resampling procedure and program RELEASE to estimate empirical and mean theoretical variances. We estimated that the relationship between paired individuals caused the empirical variance of the survival rate to be 155% larger than the empirical variance for unpaired individuals. Monte Carlo simulations and use of this resampling strategy can provide investigators with information on how robust their data are to this common assumption of independent survival probabilities.
Confidence limit calculation for antidotal potency ratio derived from lethal dose 50
Manage, Ananda; Petrikovics, Ilona
2013-01-01
AIM: To describe confidence interval calculation for antidotal potency ratios using the bootstrap method. METHODS: The nonparametric bootstrap method invented by Efron can easily be adapted to construct confidence intervals in situations like this. The bootstrap method is a resampling method in which the bootstrap samples are obtained by resampling from the original sample. RESULTS: The described confidence interval calculation using the bootstrap method does not require the sampling distribution of the antidotal potency ratio. This can be of substantial help to toxicologists, who are directed to employ the Dixon up-and-down method, with its lower number of animals, to determine lethal dose 50 values for characterizing the investigated toxic molecules and, eventually, the antidotal protection provided by the test antidotal systems. CONCLUSION: The described method can serve as a useful tool in various other applications. The simplicity of the method makes it easy to do the calculation in most programming software packages. PMID:25237618
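A minimal version of the described calculation, with hypothetical replicated LD50 estimates, resamples each group and takes percentiles of the resulting ratio distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical LD50 estimates from replicated up-and-down experiments (mg/kg)
ld50_antidote = np.array([12.1, 13.4, 11.8, 12.9, 13.1])
ld50_control = np.array([5.2, 4.9, 5.5, 5.1, 4.8])

ratios = np.empty(10_000)
for b in range(len(ratios)):
    a = rng.choice(ld50_antidote, size=len(ld50_antidote), replace=True)
    c = rng.choice(ld50_control, size=len(ld50_control), replace=True)
    ratios[b] = a.mean() / c.mean()        # antidotal potency ratio (APR)

lo, hi = np.percentile(ratios, [2.5, 97.5])
print(f"APR 95% bootstrap CI: ({lo:.2f}, {hi:.2f})")
```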
Smith, Dale L; Gozal, David; Hunter, Scott J; Kheirandish-Gozal, Leila
2017-01-01
Numerous studies over the past several decades have illustrated that children who suffer from sleep-disordered breathing (SDB) are at greater risk for cognitive, behavioral, and psychiatric problems. Although behavioral problems have been proposed as a potential mediator between SDB and cognitive functioning, these relationships have not been critically examined. This analysis is based on a community-based cohort of 1,115 children who underwent overnight polysomnography and cognitive and behavioral phenotyping. A structural model of the relationships between SDB, behavior, and cognition, together with two recently developed mediation approaches based on propensity score weighting and resampling, was used to assess the mediational role of parent-reported behavior and psychiatric problems in the relationship between SDB and cognitive functioning. Multiple models utilizing two different SDB definitions further explored direct effects of SDB on cognition as well as indirect effects through behavioral pathology. All models were adjusted for age, sex, race, BMI z-score, and asthma status. Indirect effects of SDB through behavior problems were significant in all mediation models, while direct effects of SDB on cognition were not. The findings were consistent across different mediation procedures and remained essentially unaltered when different criteria for SDB, behavior, and cognition were used. Potential effects of SDB on cognitive functioning appear to occur through behavioral problems that are detectable in this pediatric population. Thus, early attentional or behavioral pathology may be implicated in the cognitive functioning deficits associated with SDB, and may present an early morbidity-related susceptibility biomarker.
NASA Astrophysics Data System (ADS)
Maples, B. L.; Alvarez, L. V.; Moreno, H. A.; Chilson, P. B.; Segales, A.
2017-12-01
Given that classical in-situ direct surveying for geomorphological subsurface information in rivers is time-consuming, labor-intensive, costly, and often involves high-risk activities, non-intrusive technologies such as UAS- and LIDAR-based remote sensing have promising potential and benefits in terms of efficient and accurate measurement of channel topography over large areas within a short time; a tremendous amount of attention has therefore been paid to the development of these techniques. Over the past two decades, efforts have been undertaken to develop a specialized technique that can penetrate the water body and detect the channel bed to derive river and coastal bathymetry. In this research, we develop a low-cost, effective technique for water body bathymetry. With the use of a sUAS and a light-weight sonar, the bathymetry and volume of a small reservoir have been surveyed. The sUAS surveying approach is conducted at low altitude (2 meters above the water), with the sUAS towing a small boat carrying the sonar. A cluster analysis is conducted to optimize the sUAS data collection and minimize the standard deviation created by under-sampling in areas of highly variable bathymetry, so measurements are densified in regions featuring steep slopes and drastic changes in the reservoir bed. This technique provides flexibility, efficiency, and freedom from risk to humans while obtaining high-quality information. The irregularly-spaced bathymetric survey is then interpolated using unstructured Triangular Irregular Network (TIN)-based maps to avoid re-gridding or re-sampling issues.
Expedient Metrics to Describe Plant Community Change Across Gradients of Anthropogenic Influence
NASA Astrophysics Data System (ADS)
Marcelino, José A. P.; Weber, Everett; Silva, Luís; Garcia, Patrícia V.; Soares, António O.
2014-11-01
Human influence associated with land use may cause considerable biodiversity losses, namely in oceanic islands such as the Azores. Our goal was to identify plant indicator species across two gradients of increasing anthropogenic influence and management (arborescent and herbaceous communities) and to determine the similarity of plant communities in uncategorized vegetation plots to those in reference gradients, using metrics derived from R programming. We intend to test and provide an expedient way to determine the conservation value of a given uncategorized vegetation plot based on the number of native, endemic, introduced, and invasive indicator species present. Using the metric IndVal, plant taxa with a significant indicator value for each community type in the two anthropogenic gradients were determined. A new metric, ComVal, was developed to assess the similarity of an uncategorized vegetation plot to a reference community type, based on (i) the percentage of pre-defined indicator species from reference communities present in the vegetation plot, and (ii) the percentage of indicator species specific to a given reference community type present in the vegetation plot. Using a data resampling approach, the communities were randomly used as training or validation sets to classify vegetation plots based on ComVal. The percentage match with reference community types ranged from 77 to 100% and from 79 to 100% for herbaceous and arborescent vegetation plots, respectively. Both IndVal and ComVal are part of a suite of useful tools for characterizing plant communities and plant community change along gradients of anthropogenic influence without a priori knowledge of their biology and ecology.
Tres, A; van der Veer, G; Perez-Marin, M D; van Ruth, S M; Garrido-Varo, A
2012-08-22
Organic products tend to retail at a higher price than their conventional counterparts, which makes them susceptible to fraud. In this study we evaluate the application of near-infrared spectroscopy (NIRS) as a rapid, cost-effective method to verify the organic identity of feed for laying hens. For this purpose a total of 36 organic and 60 conventional feed samples from The Netherlands were measured by NIRS. A binary classification model (organic vs conventional feed) was developed using partial least squares discriminant analysis. Models were developed using five different data preprocessing techniques, which were externally validated by a stratified random resampling strategy using 1000 realizations. Spectral regions related to the protein and fat content were among the most important ones for the classification model. The models based on data preprocessed using direct orthogonal signal correction (DOSC), standard normal variate (SNV), and first and second derivatives provided the most successful results in terms of median sensitivity (0.91 in external validation) and median specificity (1.00 for external validation of SNV models and 0.94 for DOSC and first and second derivative models). A previously developed model, which was based on fatty acid fingerprinting of the same set of feed samples, provided a higher sensitivity (1.00). This shows that the NIRS-based approach provides a rapid and low-cost screening tool, whereas the fatty acid fingerprinting model can be used for further confirmation of the organic identity of feed samples for laying hens. These methods provide additional assurance to the administrative controls currently conducted in the organic feed sector.
NASA Astrophysics Data System (ADS)
Pan, Hao; Qu, Xinghua; Shi, Chunzhao; Zhang, Fumin; Li, Yating
2018-06-01
The non-uniform interval resampling method has been widely used in frequency modulated continuous wave (FMCW) laser ranging. In large-bandwidth, long-distance measurements, the range peak deteriorates due to fiber dispersion mismatch. In this study, we analyze the frequency-sampling error caused by the mismatch and measure it using the spectroscopy of a molecular frequency reference line. By using an adjacent-point replacement and spline interpolation technique, the sampling errors can be eliminated. The results demonstrate that the proposed method is suitable for resolution enhancement and high-precision measurement. Moreover, using the proposed method, we achieved an absolute distance precision better than 45 μm within 8 m.
Parasitic worms: how many really?
Strona, Giovanni; Fattorini, Simone
2014-04-01
Accumulation curves are useful tools to estimate species diversity. Here we argue that they can also be used in the study of global parasite species richness. Although this basic idea is not completely new, our approach differs from the previous ones as it treats each host species as an independent sample. We show that randomly resampling host-parasite records from the existing databases makes it possible to empirically model the relationship between the number of investigated host species, and the corresponding number of parasite species retrieved from those hosts. This method was tested on 21 inclusive lists of parasitic worms occurring on vertebrate hosts. All of the obtained models conform well to a power law curve. These curves were then used to estimate global parasite species richness. Results obtained with the new method suggest that current predictions are likely to severely overestimate parasite diversity. Copyright © 2014 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.
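The host-resampling accumulation curve and power-law extrapolation can be sketched as follows, with a synthetic host-parasite record list standing in for the real databases; all parameters are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
# hypothetical host-parasite records: each host carries a set of parasite
# species drawn from a pool of 500
hosts = [set(rng.choice(500, size=rng.integers(1, 15))) for _ in range(300)]

def accumulation_curve(hosts, n_perm=100):
    """Mean number of parasite species recovered from the first k randomly
    resampled host species, for k = 1..len(hosts)."""
    k = len(hosts)
    curve = np.zeros(k)
    for _ in range(n_perm):
        seen = set()
        for i, j in enumerate(rng.permutation(k)):
            seen |= hosts[j]
            curve[i] += len(seen)
    return curve / n_perm

y = accumulation_curve(hosts)
x = np.arange(1.0, len(y) + 1)
power = lambda h, c, z: c * h ** z              # S = c * H^z
(c, z), _ = curve_fit(power, x, y, p0=(1.0, 0.5))
print(f"fit: S = {c:.2f} * H^{z:.2f}; extrapolation to 50,000 hosts: "
      f"{power(50_000.0, c, z):,.0f} parasite species")
```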
Finding Kuiper Belt Objects Below the Detection Limit
NASA Astrophysics Data System (ADS)
Whidden, Peter; Kalmbach, Bryce; Bektesevic, Dino; Connolly, Andrew; Jones, Lynne; Smotherman, Hayden; Becker, Andrew
2018-01-01
We demonstrate a novel approach for uncovering the signatures of moving objects (e.g. Kuiper Belt Objects) below the detection thresholds of single astronomical images. To do so, we employ a matched filter moving at specific rates of proposed orbits through a time-domain dataset. This is analogous to the better-known "shift-and-stack" method; however, it uses neither direct shifting nor stacking of the image pixels. Instead of resampling the raw pixels to create an image stack, we integrate the object detection probabilities across multiple single-epoch images to accrue support for a proposed orbit. The filtering kernel provides a measure of the probability that an object is present along a given orbit, and enables the user to make principled decisions about when the search has been successful and when it may be terminated. The results we present here utilize GPUs to speed up the search by two orders of magnitude over CPU implementations.
A Stochastic Climate Generator for Agriculture in Southeast Asian Domains
NASA Astrophysics Data System (ADS)
Greene, A. M.; Allis, E. C.
2014-12-01
We extend a previously-described method for generating future climate scenarios, suitable for driving agricultural models, to selected domains in Lao PDR, Bangladesh and Indonesia. There are notable differences in climatology among the study regions, most importantly the inverse seasonal relationship of southeast Asian and Australian monsoons. These differences necessitate a partially-differentiated modeling approach, utilizing common features for better estimation while allowing independent modeling of divergent attributes. The method attempts to constrain uncertainty due to both anthropogenic and natural influences, providing a measure of how these effects may combine during specified future decades. Seasonal climate fields are downscaled to the daily time step by resampling the AgMERRA dataset, providing a full suite of agriculturally relevant variables and enabling the propagation of climate uncertainty to agricultural outputs. The role of this research in a broader project, conducted under the auspices of the International Fund for Agricultural Development (IFAD), is discussed.
Gangopadhyay, Subhrendu; McCabe, Gregory J.; Woodhouse, Connie A.
2015-01-01
In this paper, we present a methodology to use annual tree-ring chronologies and a monthly water balance model to generate annual reconstructions of water balance variables (e.g., potential evapotranspiration (PET), actual evapotranspiration (AET), snow water equivalent (SWE), soil moisture storage (SMS), and runoff (R)). The method involves resampling monthly temperature and precipitation from the instrumental record directed by variability indicated by the paleoclimate record. The generated time series of monthly temperature and precipitation are subsequently used as inputs to a monthly water balance model. The methodology is applied to the Upper Colorado River Basin, and results indicate that the methodology reliably simulates water-year runoff, maximum snow water equivalent, and seasonal soil moisture storage for the instrumental period. As a final application, the methodology is used to produce time series of PET, AET, SWE, SMS, and R for the 1404–1905 period for the Upper Colorado River Basin.
Error Mitigation for Short-Depth Quantum Circuits
NASA Astrophysics Data System (ADS)
Temme, Kristan; Bravyi, Sergey; Gambetta, Jay M.
2017-11-01
Two schemes are presented that mitigate the effect of errors and decoherence in short-depth quantum circuits. The size of the circuits for which these techniques can be applied is limited by the rate at which the errors in the computation are introduced. Near-term applications of early quantum devices, such as quantum simulations, rely on accurate estimates of expectation values to become relevant. Decoherence and gate errors lead to wrong estimates of the expectation values of observables used to evaluate the noisy circuit. The two schemes we discuss are deliberately simple and do not require additional qubit resources, so as to be as practically relevant as possible in current experiments. The first method, extrapolation to the zero-noise limit, subsequently cancels powers of the noise perturbations by an application of Richardson's deferred approach to the limit. The second method cancels errors by resampling randomized circuits according to a quasiprobability distribution.
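The first scheme can be made concrete with a small numerical sketch of Richardson extrapolation to the zero-noise limit: expectation values measured at several noise scale factors are combined with weights that cancel the leading powers of the noise rate. The quadratic noise model below is a toy stand-in for a real device.

```python
# Hedged sketch of zero-noise extrapolation: measure an expectation value
# at several artificially amplified noise levels c_i and combine the
# results with Richardson weights to cancel low-order noise terms. The
# quadratic noise model below is a stand-in for a real device.
import numpy as np

def noisy_expectation(c, e0=1.0, a1=-0.12, a2=0.02):
    # toy model: E(c) = E0 + a1*c + a2*c^2  (higher-order terms ignored)
    return e0 + a1 * c + a2 * c**2

c = np.array([1.0, 2.0, 3.0])          # noise scale factors
E = np.array([noisy_expectation(ci) for ci in c])

# Richardson weights gamma_i solve: sum gamma_i = 1 and sum gamma_i c_i^k = 0
# for k = 1..n-1, cancelling the first n-1 powers of the noise rate.
A = np.vstack([c**k for k in range(len(c))])
b = np.zeros(len(c)); b[0] = 1.0
gamma = np.linalg.solve(A, b)

print("raw (c=1) estimate:", E[0])
print("extrapolated estimate:", gamma @ E)   # recovers E0 exactly in this toy model
```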
Trends and Correlation Estimation in Climate Sciences: Effects of Timescale Errors
NASA Astrophysics Data System (ADS)
Mudelsee, M.; Bermejo, M. A.; Bickert, T.; Chirila, D.; Fohlmeister, J.; Köhler, P.; Lohmann, G.; Olafsdottir, K.; Scholz, D.
2012-12-01
Trend describes time-dependence in the first moment of a stochastic process, and correlation measures the linear relation between two random variables. Accurately estimating the trend and correlation, including uncertainties, from climate time series data in the uni- and bivariate domain, respectively, allows first-order insights into the geophysical process that generated the data. Timescale errors, ubiquitous in paleoclimatology, where archives are sampled for proxy measurements and dated, pose a problem to the estimation. Statistical science and the various applied research fields, including geophysics, have almost completely ignored this problem due to its theoretical near-intractability. However, computational adaptations or replacements of traditional error formulas have become technically feasible. This contribution gives a short overview of such an adaptation package: bootstrap resampling combined with parametric timescale simulation. We study linear regression, parametric change-point models and nonparametric smoothing for trend estimation. We introduce pairwise-moving block bootstrap resampling for correlation estimation. Both methods share robustness against autocorrelation and non-Gaussian distributional shape. We briefly touch on computing-intensive calibration of bootstrap confidence intervals and consider options to parallelize the related computer code. The following examples serve not only to illustrate the methods but tell their own climate stories: (1) the search for climate drivers of the Agulhas Current on recent timescales, (2) the comparison of three stalagmite-based proxy series of regional, western German climate over the later part of the Holocene, and (3) trends and transitions in benthic oxygen isotope time series from the Cenozoic. Financial support by Deutsche Forschungsgemeinschaft (FOR 668, FOR 1070, MU 1595/4-1) and the European Commission (MC ITN 238512, MC ITN 289447) is acknowledged.
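A bare-bones version of the pairwise-moving block bootstrap for correlation estimation might look like the following; the AR-like synthetic series and the block length are assumptions, and a real analysis would add the parametric timescale simulation discussed above.

```python
# Sketch of a pairwise moving block bootstrap for the correlation of two
# autocorrelated series: blocks of consecutive (x, y) PAIRS are resampled
# together, preserving both autocorrelation within each series and the
# cross-dependence between them. The block length choice is an assumption.
import numpy as np

rng = np.random.default_rng(3)
n, block_len, n_boot = 300, 20, 2000

# synthetic autocorrelated pair with true correlation built in via shared noise
shared = rng.normal(size=n)
x = np.convolve(shared + rng.normal(size=n), [0.5, 0.3, 0.2], mode="same")
y = np.convolve(shared + rng.normal(size=n), [0.5, 0.3, 0.2], mode="same")

def pairwise_mbb_corr(x, y, block_len, n_boot, rng):
    n = len(x)
    starts = np.arange(n - block_len + 1)
    n_blocks = int(np.ceil(n / block_len))
    reps = []
    for _ in range(n_boot):
        idx = np.concatenate([np.arange(s, s + block_len)
                              for s in rng.choice(starts, n_blocks)])[:n]
        reps.append(np.corrcoef(x[idx], y[idx])[0, 1])
    return np.array(reps)

reps = pairwise_mbb_corr(x, y, block_len, n_boot, rng)
print(f"r = {np.corrcoef(x, y)[0, 1]:.2f}, "
      f"95% CI = [{np.percentile(reps, 2.5):.2f}, {np.percentile(reps, 97.5):.2f}]")
```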
Dunham, Kylee; Grand, James B.
2016-01-01
We examined the effects of complexity and priors on the accuracy of models used to estimate ecological and observational processes, and to make predictions regarding population size and structure. State-space models are useful for estimating complex, unobservable population processes and making predictions about future populations based on limited data. To better understand the utility of state-space models in evaluating population dynamics, we used them in a Bayesian framework and compared the accuracy of models with differing complexity, with and without informative priors, using sequential importance sampling/resampling (SISR). Count data were simulated for 25 years using known parameters and observation process for each model. We used kernel smoothing to reduce the effect of particle depletion, which is common when estimating both states and parameters with SISR. Models using informative priors estimated parameter values and population size with greater accuracy than their non-informative counterparts. While the estimates of population size and trend did not suffer greatly in models using non-informative priors, the algorithm was unable to accurately estimate demographic parameters. This model framework provides reasonable estimates of population size when little to no information is available; however, when information on some vital rates is available, SISR can be used to obtain more precise estimates of population size and process. Incorporating model complexity such as that required by structured populations with stage-specific vital rates affects precision and accuracy when estimating latent population variables and predicting population dynamics. These results are important to consider when designing monitoring programs and conservation efforts requiring management of specific population segments.
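A stripped-down SISR filter for a scalar population model conveys the basic mechanics; the exponential-growth process, Poisson observation model, and uniform prior below are simplifying assumptions, and the study's kernel-smoothing step for parameter estimation is omitted.

```python
# Minimal sequential importance sampling/resampling (SISR) sketch for a
# scalar population state-space model: exponential growth with process
# noise and Poisson-distributed counts. Priors and noise levels are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(11)
T, n_particles, true_r = 25, 5000, 1.05

# simulate "truth" and observed counts
N = np.empty(T); N[0] = 100.0
for t in range(1, T):
    N[t] = true_r * N[t - 1] * np.exp(rng.normal(0, 0.05))
counts = rng.poisson(N)

particles = rng.uniform(50, 150, n_particles)     # prior on initial size
estimates = []
for t in range(T):
    if t > 0:   # propagate particles through the process model
        particles = true_r * particles * np.exp(rng.normal(0, 0.05, n_particles))
    # weight by the Poisson observation log-likelihood, then resample
    logw = counts[t] * np.log(particles) - particles
    w = np.exp(logw - logw.max()); w /= w.sum()
    particles = particles[rng.choice(n_particles, n_particles, p=w)]
    estimates.append(particles.mean())

print("final truth vs. filtered estimate:", N[-1], estimates[-1])
```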
Sabourin, Jeremy; Nobel, Andrew B.; Valdar, William
2014-01-01
Genomewide association studies sometimes identify loci at which both the number and identities of the underlying causal variants are ambiguous. In such cases, statistical methods that model effects of multiple SNPs simultaneously can help disentangle the observed patterns of association and provide information about how those SNPs could be prioritized for follow-up studies. Current multi-SNP methods, however, tend to assume that SNP effects are well captured by additive genetics; yet when genetic dominance is present, this assumption translates to reduced power and faulty prioritizations. We describe a statistical procedure for prioritizing SNPs at GWAS loci that efficiently models both additive and dominance effects. Our method, LLARRMA-dawg, combines a group LASSO procedure for sparse modeling of multiple SNP effects with a resampling procedure based on fractional observation weights; it estimates for each SNP the robustness of association with the phenotype both to sampling variation and to competing explanations from other SNPs. In producing a SNP prioritization that best identifies underlying true signals, we show that: our method easily outperforms a single marker analysis; when additive-only signals are present, our joint model for additive and dominance is equivalent to or only slightly less powerful than modeling additive-only effects; and, when dominance signals are present, even in combination with substantial additive effects, our joint model is unequivocally more powerful than a model assuming additivity. We also describe how performance can be improved through calibrated randomized penalization, and discuss how dominance in ungenotyped SNPs can be incorporated through either heterozygote dosage or multiple imputation. PMID:25417853
Haines, Terry; Bowles, Kelly-Ann
2017-01-17
Low back pain is a common and costly condition internationally. There is high need to identify effective and economically efficient means for managing this problem. This study aimed to explore the cost-effectiveness of a novel motion-sensor biofeedback treatment approach in addition to guidelines-based care compared to guidelines-based care alone, from a societal perspective over a 12 month time horizon. This was an incremental cost-effectiveness analysis conducted concurrently with a pilot, cluster randomized controlled trial. Health care resource use was collected using daily diaries and patient self-report at 3, 6 and 12 month follow-up assessments. Productivity was measured using industry classifications and participant self-reporting of ability to do their normal work with their present pain. Clinical effect was measured using the Patient Global Impression of Change measured at the 12 month follow-up assessment. Data were compared between groups using linear regression clustered by recruitment site. Bootstrap resampling was used to generate a visual representation of the 95% confidence interval for the incremental cost-effectiveness estimate. Two one-way sensitivity analyses were undertaken to examine the robustness of findings to key assumptions. There were n = 38 participants in the intervention group who completed the 12 month assessment and n = 45 in the control. The intervention group had greater use of trial-related medical and therapy resources [$477 per participant (95% CI: $447, $508)], but lower use of non-trial medical and therapy resources [$-53 per participant (95% CI: $-105, $-0)], and a greater improvement in productivity [$-5123 per participant (95% CI: $-10,174, $-72)]. Overall, the intervention dominated with a saving of $478,100 and an additional 41 participants self-rating as being very or much improved compared to the control. There was >99% confidence in this finding of dominance in both the primary and sensitivity analyses. The motion-sensor biofeedback treatment approach in addition to guidelines-based care appears to be both more clinically effective and economically efficient than guidelines-based care alone. This approach appears to be a viable means to manage low back pain and further research in this area should be a priority. The randomised trial this research was based upon was prospectively registered on March 25th 2009 with the Australian New Zealand Clinical Trials Registry: ACTRN12609000157279.
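The bootstrap step can be sketched as follows: participants are resampled with replacement within each arm and the incremental cost and effect are recomputed to map the uncertainty on the cost-effectiveness plane. The synthetic cost and outcome data are placeholders, not the trial's records.

```python
# Hedged sketch of the bootstrap step: resample participants with
# replacement within each arm and recompute incremental cost and effect
# to trace out the uncertainty cloud on the cost-effectiveness plane.
# The synthetic costs/outcomes below are placeholders, not trial data.
import numpy as np

rng = np.random.default_rng(5)
cost_tx  = rng.gamma(2.0, 400, 38)    # intervention arm, n = 38
cost_ctl = rng.gamma(2.0, 600, 45)    # control arm, n = 45
eff_tx   = rng.binomial(1, 0.6, 38)   # 1 = "very/much improved"
eff_ctl  = rng.binomial(1, 0.3, 45)

pairs = []
for _ in range(5000):
    bt = rng.integers(0, 38, 38)      # resample within each arm
    bc = rng.integers(0, 45, 45)
    d_cost = cost_tx[bt].mean() - cost_ctl[bc].mean()
    d_eff  = eff_tx[bt].mean()  - eff_ctl[bc].mean()
    pairs.append((d_cost, d_eff))

pairs = np.array(pairs)
dominant = np.mean((pairs[:, 0] < 0) & (pairs[:, 1] > 0))
print(f"fraction of replicates where the intervention dominates: {dominant:.2%}")
```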
Chroma sampling and modulation techniques in high dynamic range video coding
NASA Astrophysics Data System (ADS)
Dai, Wei; Krishnan, Madhu; Topiwala, Pankaj
2015-09-01
High Dynamic Range and Wide Color Gamut (HDR/WCG) Video Coding is an area of intense research interest in the engineering community, for potential near-term deployment in the marketplace. HDR greatly enhances the dynamic range of video content (up to 10,000 nits), as well as broadens the chroma representation (BT.2020). The resulting content offers new challenges in its coding and transmission. The Moving Picture Experts Group (MPEG) of the International Standards Organization (ISO) is currently exploring coding efficiency and/or the functionality enhancements of the recently developed HEVC video standard for HDR and WCG content. FastVDO has developed an advanced approach to coding HDR video, based on splitting the HDR signal into a smoothed luminance (SL) signal, and an associated base signal (B). Both signals are then chroma downsampled to YFbFr 4:2:0 signals, using advanced resampling filters, and coded using the Main10 High Efficiency Video Coding (HEVC) standard, which has been developed jointly by ISO/IEC MPEG and ITU-T WP3/16 (VCEG). Our proposal offers both efficient coding, and backwards compatibility with the existing HEVC Main10 Profile. That is, an existing Main10 decoder can produce a viewable standard dynamic range video, suitable for existing screens. Subjective tests show visible improvement over the anchors. Objective tests show a sizable gain of over 25% in PSNR (RGB domain) on average, for a key set of test clips selected by the ISO/MPEG committee.
ERIC Educational Resources Information Center
Peterson, Ivars
1991-01-01
A method that enables people to obtain the benefits of statistics and probability theory without the shortcomings of conventional methods because it is free of mathematical formulas and is easy to understand and use is described. A resampling technique called the "bootstrap" is discussed in terms of application and development. (KR)
NASA Astrophysics Data System (ADS)
Clark, E.; Wood, A.; Nijssen, B.; Clark, M. P.
2017-12-01
Short- to medium-range (1- to 7-day) streamflow forecasts are important for flood control operations and in issuing potentially life-saving flood warnings. In the U.S., the National Weather Service River Forecast Centers (RFCs) issue such forecasts in real time, depending heavily on a manual data assimilation (DA) approach. Forecasters adjust model inputs, states, parameters and outputs based on experience and consideration of a range of supporting real-time information. Achieving high-quality forecasts from new automated, centralized forecast systems will depend critically on the adequacy of automated DA approaches to make analogous corrections to the forecasting system. Such approaches would further enable systematic evaluation of real-time flood forecasting methods and strategies. Toward this goal, we have implemented a real-time Sequential Importance Resampling particle filter (SIR-PF) approach to assimilate observed streamflow into simulated initial hydrologic conditions (states) for initializing ensemble flood forecasts. Assimilating streamflow alone in SIR-PF improves simulated streamflow and soil moisture during the model spin-up period prior to a forecast, with consequent benefits for forecasts. Nevertheless, it only consistently limits error in simulated snow water equivalent during the snowmelt season and in basins where precipitation falls primarily as snow. We examine how the simulated initial conditions with and without SIR-PF propagate into 1- to 7-day ensemble streamflow forecasts. Forecasts are evaluated in terms of reliability and skill over a 10-year period from 2005 to 2015. The focus of this analysis is on how interactions between hydroclimate and SIR-PF performance impact forecast skill. To this end, we examine forecasts for 5 hydroclimatically diverse basins in the western U.S. Some of these basins receive most of their precipitation as snow, others as rain. Some freeze throughout the mid-winter while others experience significant mid-winter melt events. We describe the methodology and present seasonal and inter-basin variations in DA-enhanced forecast skill.
A versatile software package for inter-subject correlation based analyses of fMRI.
Kauppi, Jukka-Pekka; Pajula, Juha; Tohka, Jussi
2014-01-01
In the inter-subject correlation (ISC) based analysis of functional magnetic resonance imaging (fMRI) data, the extent of shared processing across subjects during the experiment is determined by calculating correlation coefficients between the fMRI time series of the subjects in the corresponding brain locations. This implies that ISC can be used to analyze fMRI data without explicitly modeling the stimulus, and thus ISC is a potential method for analyzing fMRI data acquired under complex naturalistic stimuli. Despite the suitability of the ISC-based approach for analyzing complex fMRI data, no generic software tools have been made available for this purpose, limiting the widespread use of ISC-based analysis techniques in the neuroimaging community. In this paper, we present a graphical user interface (GUI) based software package, ISC Toolbox, implemented in Matlab for computing various ISC-based analyses. Many advanced computations such as comparison of ISCs between different stimuli, time window ISC, and inter-subject phase synchronization are supported by the toolbox. The analyses are coupled with re-sampling based statistical inference. The ISC-based analyses are data- and computation-intensive, and the ISC Toolbox is equipped with mechanisms to execute the parallel computations in a cluster environment automatically, with automatic detection of the cluster environment in use. Currently, SGE-based (Oracle Grid Engine, Son of a Grid Engine, or Open Grid Scheduler) and Slurm environments are supported. In this paper, we present a detailed account of the methods behind the ISC Toolbox, describe the implementation of the toolbox, and demonstrate its possible use by summarizing selected example applications. We also report computation time experiments using both a single desktop computer and two grid environments, demonstrating that parallelization effectively reduces the computing time. The ISC Toolbox is available at https://code.google.com/p/isc-toolbox/
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
ERIC Educational Resources Information Center
Gwet, Kilem L.
2016-01-01
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
Rmax: A systematic approach to evaluate instrument sort performance using center stream catch
Riddell, Andrew; Gardner, Rui; Perez-Gonzalez, Alexis; Lopes, Telma; Martinez, Lola
2015-01-01
Sorting performance can be evaluated with regard to Purity, Yield and/or Recovery of the sorted fraction. Purity is a check on the quality of the sample and the sort decisions made by the instrument. Definitions of Recovery and Yield vary: some authors regard both as measures of how efficiently the instrument sorts the target particles from the original sample, while others distinguish Recovery from Yield, using the former to describe the accuracy of the instrument's sort count. Yield and Recovery are often neglected, mostly due to difficulties in their measurement. Purity of the sort product is often cited alone but is not sufficient to evaluate sorting performance. All three of these performance metrics require re-sampling of the sorted fraction. However, unlike Purity, calculating Yield and/or Recovery calls for the absolute counting of particles in the sorted fraction, which may not be feasible, particularly when dealing with rare populations and precious samples. In addition, the counting process itself involves large errors. Here we describe a new metric for evaluating instrument sort Recovery, defined as the number of particles sorted relative to the number of original particles to be sorted. This calculation requires only measuring the ratios of target and non-target populations in the original pre-sort sample and in the waste stream or center stream catch (CSC), avoiding re-sampling of the sorted fraction and absolute counting. We call this new metric Rmax, since it corresponds to the maximum expected Recovery for a particular set of instrument parameters. Rmax is ideal for evaluating and troubleshooting the optimal drop-charge delay of the sorter, or any instrument-related failures that will affect sort performance. It can be used as a daily quality control check but can be particularly useful for assessing instrument performance before single-cell sorting experiments. Because the sort fraction is not perturbed, Rmax can be calculated during the sort process, which is especially valuable for checking instrument performance during rare-population sorts. PMID:25747337
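One plausible formulation of the ratio-based calculation, under the assumption that essentially all non-target particles pass to the waste/CSC stream, is sketched below; it is an illustration of the idea rather than the paper's exact derivation.

```python
# Heavily hedged sketch of the ratio-based idea: if essentially all
# non-target particles end up in the waste/CSC stream, the target to
# non-target ratios measured pre-sort (r0) and in the CSC (rw) give an
# estimate of maximum recovery without re-sampling the sorted fraction.
# This is one plausible formulation, not the paper's exact formula.
def rmax(target_pre, nontarget_pre, target_csc, nontarget_csc):
    r0 = target_pre / nontarget_pre       # ratio in the original sample
    rw = target_csc / nontarget_csc       # ratio in the center stream catch
    # targets missing from the CSC are assumed to have been sorted:
    return 1.0 - rw / r0

# e.g. 20% target pre-sort, 2% target remaining in the CSC stream
print(f"Rmax = {rmax(20, 80, 2, 98):.2f}")   # ~0.92
```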
Enhancing sampling design in mist-net bat surveys by accounting for sample size optimization.
Trevelin, Leonardo Carreira; Novaes, Roberto Leonan Morim; Colas-Rosas, Paul François; Benathar, Thayse Cristhina Melo; Peres, Carlos A
2017-01-01
The advantages of mist-netting, the main technique used in Neotropical bat community studies to date, include logistical implementation, standardization and sampling representativeness. Nonetheless, study designs still have to deal with issues of detectability related to how different species behave and use the environment. Yet there is considerable sampling heterogeneity across available studies in the literature. Here, we approach the problem of sample size optimization. We evaluated the common sense hypothesis that the first six hours comprise the period of peak night activity for several species, thereby resulting in a representative sample for the whole night. To this end, we combined re-sampling techniques, species accumulation curves, threshold analysis, and community concordance of species compositional data, and applied them to datasets of three different Neotropical biomes (Amazonia, Atlantic Forest and Cerrado). We show that the strategy of restricting sampling to only six hours of the night frequently results in incomplete sampling representation of the entire bat community investigated. From a quantitative standpoint, results corroborated the existence of a major Sample Area effect in all datasets, although for the Amazonia dataset the six-hour strategy was significantly less species-rich after extrapolation, and for the Cerrado dataset it was more efficient. From the qualitative standpoint, however, results demonstrated that, for all three datasets, the identity of species that are effectively sampled will be inherently impacted by choices of sub-sampling schedule. We also propose an alternative six-hour sampling strategy (at the beginning and the end of a sample night) which performed better when resampling Amazonian and Atlantic Forest datasets on bat assemblages. Given the observed magnitude of our results, we propose that sample representativeness has to be carefully weighed against study objectives, and recommend that the trade-off between logistical constraints and additional sampling performance should be carefully evaluated.
NASA Technical Reports Server (NTRS)
Hejduk, M. D.; Cowardin, H. M.; Stansbery, Eugene G.
2012-01-01
In performing debris surveys of deep-space orbital regions, the considerable volume of the area to be surveyed and the increased orbital altitude suggest optical telescopes as the most efficient survey instruments; but to proceed this way, methodologies for debris object size estimation using only optical tracking and photometric information are needed. Basic photometry theory indicates that size estimation should be possible if satellite albedo and shape are known. One method for estimating albedo is to try to determine the object's material type photometrically, as one can determine the albedos of common satellite materials in the laboratory. Examination of laboratory filter photometry (using Johnson BVRI filters) on a set of satellite material samples indicates that most material types can be separated at the 1-sigma level via B-R versus R-I color differences with a relatively small amount of required resampling, and objects that remain ambiguous can be resolved by B-R versus B-V color differences and solar radiation pressure differences. To estimate shape, a technique advanced by Hall et al. [1], based on phase-brightness density curves and not requiring any a priori knowledge of attitude, has been modified slightly to try to make it more resistant to the specular characteristics of different materials and to reduce the number of samples necessary to make robust shape determinations. Working from a gallery of idealized debris shapes, the modified technique identifies most shapes within this gallery correctly, also with a relatively small amount of resampling. These results are, of course, based on relatively small laboratory investigations and simulated data, and expanded laboratory experimentation and further investigation with in situ survey measurements will be required in order to assess their actual efficacy under survey conditions; but these techniques show sufficient promise to justify this next level of analysis.
Estimating river discharge uncertainty by applying the Rating Curve Model
NASA Astrophysics Data System (ADS)
Barbetta, S.; Melone, F.; Franchini, M.; Moramarco, T.
2012-04-01
The knowledge of the flow discharge at a river site is necessary for planning and management of water resources as well as for monitoring and real-time forecasting purposes when significant flood events occur. In hydrological practice, the operational discharge measurement in medium and large rivers is mostly based on indirect approaches, converting the observed stage into discharge values using steady-flow rating curves. However, the stage-discharge relationship can be unknown for hydrometric sections where flow velocity measurements, particularly during high floods, are not available. To overcome this issue, a simplified approach named the Rating Curve Model (RCM), proposed by Moramarco et al. (Moramarco, T., Barbetta, S., Melone, F. & Singh, V.P., Relating local stage and remote discharge with significant lateral inflow, J. Hydrol. Engng ASCE, 10(1), 58–69, 2005), can be conveniently used. RCM turned out to be able to assess, with a high level of accuracy, the discharge hydrograph at a river site where only the stage is monitored while the flow is recorded at a different section along the river, even when significant lateral flows occur. The simple structure of the model depends on three parameters, two of which can be considered characteristic of the river reach, while the third is related to the wave travel time of floods. Considering that RCM lends itself well to predicting the stage-discharge relationship at a river site where only stages are recorded, an uncertainty analysis of the river discharge estimate is certainly of interest for hydrological practice. To this aim, the uncertainty characterizing the RCM outcomes is addressed in this work by considering two different procedures based on the Monte Carlo approach and the Generalized Likelihood Uncertainty Estimation (GLUE) method, respectively. The statistical distribution of parameters is found and a random re-sampling of parameters is done for assessing the 90% confidence interval (CI) of discharge estimates. In particular, for the latter approach the Nash-Sutcliffe coefficient is used as the likelihood measure. Two equipped river reaches of the Upper-Middle Tiber River basin, central Italy, are investigated as case studies. The results provided by the selected methodologies are discussed and compared, showing that all the computed CIs are satisfactory in terms of the percentage of included observed discharges, with similar percentages characterizing the bands assessed by both the Monte Carlo approach and the GLUE procedure.
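A schematic of the GLUE procedure as applied here: parameters are sampled by Monte Carlo, scored with the Nash-Sutcliffe coefficient as an informal likelihood, truncated at a behavioural threshold, and the 90% confidence band is taken from likelihood-weighted quantiles. The toy model, priors, and threshold below are assumptions for illustration, not the RCM itself.

```python
# Schematic GLUE procedure: sample three parameters at random, score each
# set with the Nash-Sutcliffe efficiency (NSE) as an informal likelihood,
# keep "behavioural" sets above a threshold, and form the 90% band from
# likelihood-weighted quantiles. The toy model stands in for the RCM.
import numpy as np

rng = np.random.default_rng(9)
t = np.linspace(0, 10, 200)
obs = 50 + 30 * np.exp(-0.5 * (t - 4) ** 2) + rng.normal(0, 1.5, t.size)

def model(theta):                      # stand-in for the rating-curve model
    a, b, lag = theta
    return a + b * np.exp(-0.5 * (t - lag) ** 2)

def nse(sim, obs):
    return 1 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Monte Carlo sampling of parameters from uniform priors
thetas = np.column_stack([rng.uniform(30, 70, 20000),
                          rng.uniform(10, 50, 20000),
                          rng.uniform(2, 6, 20000)])
sims = np.array([model(th) for th in thetas])
scores = np.array([nse(s, obs) for s in sims])

behavioural = scores > 0.5                      # acceptance threshold (assumed)
w = scores[behavioural]; w = w / w.sum()        # normalised likelihood weights
lo, hi = [], []
for i in range(t.size):                         # weighted 5%/95% quantiles
    order = np.argsort(sims[behavioural, i])
    cdf = np.cumsum(w[order])
    vals = sims[behavioural, i][order]
    lo.append(vals[np.searchsorted(cdf, 0.05)])
    hi.append(vals[np.searchsorted(cdf, 0.95)])
print("mean 90% band width:", np.mean(np.array(hi) - np.array(lo)))
```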
NASA Astrophysics Data System (ADS)
Frommholz, D.; Linkiewicz, M.; Poznanska, A. M.
2016-06-01
This paper proposes an in-line method for the simplified reconstruction of city buildings from nadir and oblique aerial images that at the same time are being used for multi-source texture mapping with minimal resampling. Further, the resulting unrectified texture atlases are analyzed for façade elements like windows to be reintegrated into the original 3D models. Tests on real-world data of Heligoland/Germany comprising more than 800 buildings exposed a median positional deviation of 0.31 m at the façades compared to the cadastral map, a correctness of 67% for the detected windows, and good visual quality when rendered with GPU-based perspective correction. As part of the process, building reconstruction takes the oriented input images and transforms them into dense point clouds by semi-global matching (SGM). The point sets undergo local RANSAC-based regression and topology analysis to detect adjacent planar surfaces and determine their semantics. Based on this information the roof, wall and ground surfaces found get intersected and limited in their extension to form a closed 3D building hull. For texture mapping, the hull polygons are projected into each possible input bitmap to find suitable color sources regarding coverage and resolution. Occlusions are detected by ray-casting a full-scale digital surface model (DSM) of the scene and stored in pixel-precise visibility maps. These maps are used to derive overlap statistics and radiometric adjustment coefficients to be applied when the visible image parts for each building polygon are copied into a compact texture atlas, without resampling whenever possible. The atlas bitmap is passed to a commercial object-based image analysis (OBIA) tool running a custom rule set to identify windows on the contained façade patches. Following multi-resolution segmentation and classification based on brightness and contrast differences, potential window objects are evaluated against geometric constraints and conditionally grown, fused and filtered morphologically. The output polygons are vectorized and reintegrated into the previously reconstructed buildings by sparsely ray-tracing their vertices. Finally, the enhanced 3D models are stored as textured geometry for visualization and as semantically annotated "LOD-2.5" CityGML objects for GIS applications.
Thompson, Steven K
2006-12-01
A flexible class of adaptive sampling designs is introduced for sampling in network and spatial settings. In the designs, selections are made sequentially with a mixture distribution based on an active set that changes as the sampling progresses, using network or spatial relationships as well as sample values. The new designs have certain advantages compared with previously existing adaptive and link-tracing designs, including control over sample sizes and of the proportion of effort allocated to adaptive selections. Efficient inference involves averaging over sample paths consistent with the minimal sufficient statistic. A Markov chain resampling method makes the inference computationally feasible. The designs are evaluated in network and spatial settings using two empirical populations: a hidden human population at high risk for HIV/AIDS and an unevenly distributed bird population.
2011-03-01
[Extraction residue: only table-of-contents and acronym-list fragments survive from this record, mentioning "resampling a second time", a plot of RSA bitgroup exponentiation with DAILMOM, DVFS (Dynamic Voltage and Frequency Switching), and MDPL (Masked Dual-Rail Precharge Logic); the recoverable abstract text concerns algorithms to prevent wholesale discovery of PINs and employee tampering in cryptographic systems.]
On the estimation of spread rate for a biological population
Jim Clark; Lajos Horváth; Mark Lewis
2001-01-01
We propose a nonparametric estimator for the rate of spread of an introduced population. We prove that the limit distribution of the estimator is normal or stable, depending on the behavior of the moment generating function. We show that resampling methods can also be used to approximate the distribution of the estimators.
ERIC Educational Resources Information Center
Du, Yunfei
This paper discusses the impact of sampling error on the construction of confidence intervals around effect sizes. Sampling error affects the location and precision of confidence intervals. Meta-analytic resampling demonstrates that confidence intervals can haphazardly bounce around the true population parameter. Special software with graphical…
Applying Bootstrap Resampling to Compute Confidence Intervals for Various Statistics with R
ERIC Educational Resources Information Center
Dogan, C. Deha
2017-01-01
Background: Most of the studies in academic journals use p values to represent statistical significance. However, this is not a good indicator of practical significance. Although confidence intervals provide information about the precision of point estimation, they are, unfortunately, rarely used. The infrequent use of confidence intervals might…
Exploring the Replicability of a Study's Results: Bootstrap Statistics for the Multivariate Case.
ERIC Educational Resources Information Center
Thompson, Bruce
Conventional statistical significance tests do not inform the researcher regarding the likelihood that results will replicate. One strategy for evaluating result replication is to use a "bootstrap" resampling of a study's data so that the stability of results across numerous configurations of the subjects can be explored. This paper…
Statistical process control for residential treated wood
Patricia K. Lebow; Timothy M. Young; Stan Lebow
2017-01-01
This paper is the first stage of a study that attempts to improve the process of manufacturing treated lumber through the use of statistical process control (SPC). Analysis of industrial and auditing agency data sets revealed there are differences between the industry and agency probability density functions (pdf) for normalized retention data. Resampling of batches of...
Explanation of Two Anomalous Results in Statistical Mediation Analysis
ERIC Educational Resources Information Center
Fritz, Matthew S.; Taylor, Aaron B.; MacKinnon, David P.
2012-01-01
Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special…
NASA Technical Reports Server (NTRS)
Park, Steve
1990-01-01
A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.
DOT National Transportation Integrated Search
2004-02-01
Researchers and practitioners are commonly faced with the problem of limited data in the evaluation of ITS systems. Due to high data collection costs and limited resources, they are often forced to make decisions about the efficacy of a system or tec...
Confidence Intervals for Effect Sizes: Applying Bootstrap Resampling
ERIC Educational Resources Information Center
Banjanovic, Erin S.; Osborne, Jason W.
2016-01-01
Confidence intervals for effect sizes (CIES) provide readers with an estimate of the strength of a reported statistic as well as the relative precision of the point estimate. These statistics offer more information and context than null hypothesis statistic testing. Although confidence intervals have been recommended by scholars for many years,…
NASA Astrophysics Data System (ADS)
Liu, Xingchen; Hu, Zhiyong; He, Qingbo; Zhang, Shangbin; Zhu, Jun
2017-10-01
Doppler distortion and background noise can reduce the effectiveness of wayside acoustic train bearing monitoring and fault diagnosis. This paper proposes a method of combining a microphone array and a matching pursuit algorithm to overcome these difficulties. First, a dictionary is constructed based on the characteristics and mechanism of a far-field assumption. Then, the angle of arrival of the train bearing is acquired by applying matching pursuit to analyze the acoustic array signals. Finally, after obtaining the resampling time series, the Doppler distortion can be corrected, which is convenient for further diagnostic work. Compared with traditional single-microphone Doppler correction methods, the advantages of the presented array method are its robustness to background noise and the fact that it requires almost no parameters to be measured in advance. Simulation and experimental studies show that the proposed method is effective in performing wayside acoustic bearing fault diagnosis.
A Noise-Filtered Under-Sampling Scheme for Imbalanced Classification.
Kang, Qi; Chen, XiaoShuang; Li, SiSi; Zhou, MengChu
2017-12-01
Under-sampling is a popular data preprocessing method for dealing with class imbalance problems, with the purposes of balancing datasets to achieve a high classification rate and avoiding the bias toward majority class examples. It always uses the full minority data in a training dataset. However, some noisy minority examples may reduce the performance of classifiers. In this paper, a new under-sampling scheme is proposed by incorporating a noise filter before executing resampling. In order to verify its efficiency, this scheme is implemented based on four popular under-sampling methods, i.e., Undersampling + Adaboost, RUSBoost, UnderBagging, and EasyEnsemble, through benchmarks and significance analysis. Furthermore, this paper also summarizes the relationship between algorithm performance and the imbalance ratio. Experimental results indicate that the proposed scheme can improve the original undersampling-based methods with significance in terms of three popular metrics for imbalanced classification, i.e., the area under the curve (AUC), F-measure, and G-mean.
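The two-stage scheme can be sketched generically: a noise filter removes minority examples whose neighbourhoods contradict their label, and random under-sampling then balances the classes. The kNN disagreement rule below is a stand-in for the paper's particular filter, not its exact algorithm.

```python
# Sketch of the scheme's two stages under stated assumptions: (1) filter
# noisy minority examples whose neighbourhood disagrees with their label,
# then (2) randomly under-sample the majority class to balance. A kNN
# disagreement rule stands in for the paper's noise filter.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def noise_filtered_undersample(X, y, k=5, rng=None):
    rng = rng or np.random.default_rng(0)
    minority, majority = 1, 0
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                    # idx[:, 0] is the point itself
    neigh_labels = y[idx[:, 1:]]
    # keep minority points with at least one same-class neighbour
    keep_min = np.where((y == minority) &
                        (neigh_labels == minority).any(axis=1))[0]
    maj = np.where(y == majority)[0]
    keep_maj = rng.choice(maj, size=len(keep_min), replace=False)
    sel = np.concatenate([keep_min, keep_maj])
    return X[sel], y[sel]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (950, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 950 + [1] * 50)
Xb, yb = noise_filtered_undersample(X, y, rng=rng)
print("class counts after resampling:", np.bincount(yb))
```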
A Powerful Test for Comparing Multiple Regression Functions.
Maity, Arnab
2012-09-01
In this article, we address the important problem of comparison of two or more population regression functions. Recently, Pardo-Fernández, Van Keilegom and González-Manteiga (2007) developed test statistics for simple nonparametric regression models: Y(ij) = θ(j)(Z(ij)) + σ(j)(Z(ij))∊(ij), based on empirical distributions of the errors in each population j = 1, … , J. In this paper, we propose a test for equality of the θ(j)(·) based on the concept of generalized likelihood ratio type statistics. We also generalize our test to other nonparametric regression setups, e.g., nonparametric logistic regression, where the log-likelihood for population j is any general smooth function [Formula: see text]. We describe a resampling procedure to obtain the critical values of the test. In addition, we present a simulation study to evaluate the performance of the proposed test and compare our results to those in Pardo-Fernández et al. (2007).
Robust Audio Watermarking Scheme Based on Deterministic Plus Stochastic Model
NASA Astrophysics Data System (ADS)
Dhar, Pranab Kumar; Kim, Cheol Hong; Kim, Jong-Myon
Digital watermarking has been widely used for protecting digital contents from unauthorized duplication. This paper proposes a new watermarking scheme based on spectral modeling synthesis (SMS) for copyright protection of digital contents. SMS defines a sound as a combination of deterministic events plus a stochastic component that makes it possible for a synthesized sound to attain all of the perceptual characteristics of the original sound. In our proposed scheme, watermarks are embedded into the highest prominent peak of the magnitude spectrum of each non-overlapping frame in peak trajectories. Simulation results indicate that the proposed watermarking scheme is highly robust against various kinds of attacks such as noise addition, cropping, re-sampling, re-quantization, and MP3 compression and achieves similarity values ranging from 17 to 22. In addition, our proposed scheme achieves signal-to-noise ratio (SNR) values ranging from 29 dB to 30 dB.
Multi-Sensor Registration of Earth Remotely Sensed Imagery
NASA Technical Reports Server (NTRS)
LeMoigne, Jacqueline; Cole-Rhodes, Arlene; Eastman, Roger; Johnson, Kisha; Morisette, Jeffrey; Netanyahu, Nathan S.; Stone, Harold S.; Zavorin, Ilya; Zukor, Dorothy (Technical Monitor)
2001-01-01
Assuming that approximate registration is given within a few pixels by a systematic correction system, we develop automatic image registration methods for multi-sensor data with the goal of achieving sub-pixel accuracy. Automatic image registration is usually defined by three steps: feature extraction, feature matching, and data resampling or fusion. Our previous work focused on image correlation methods based on the use of different features. In this paper, we study different feature matching techniques and present five algorithms where the features are either original gray levels or wavelet-like features, and the feature matching is based on gradient descent optimization, statistical robust matching, and mutual information. These algorithms are tested and compared on several multi-sensor datasets covering one of the EOS Core Sites, the Konza Prairie in Kansas, from four different sensors: IKONOS (4m), Landsat-7/ETM+ (30m), MODIS (500m), and SeaWiFS (1000m).
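Of the feature-matching strategies listed, mutual information is easy to demonstrate on a toy pair of images: candidate shifts are scored by the mutual information of the joint gray-level histogram and the best-scoring shift is kept. Real multi-sensor registration adds feature extraction, subpixel optimization, and resampling; the image pair below is synthetic.

```python
# Toy sketch of one feature-matching strategy named above: score integer
# pixel shifts between two images by mutual information computed from a
# joint histogram, and pick the shift that maximises it.
import numpy as np

def mutual_information(a, b, bins=32):
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    return np.sum(p[nz] * np.log(p[nz] / (px[:, None] * py[None, :])[nz]))

rng = np.random.default_rng(2)
ref = rng.random((128, 128))
moving = np.roll(ref, shift=(3, -2), axis=(0, 1)) + 0.05 * rng.normal(size=(128, 128))

# exhaustive search over small integer shifts
best = max(((dy, dx) for dy in range(-5, 6) for dx in range(-5, 6)),
           key=lambda s: mutual_information(ref, np.roll(moving, s, axis=(0, 1))))
print("recovered shift:", best)   # expect (-3, 2), undoing the applied roll
```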
Generation of the global cloud free data set of MODIS
NASA Astrophysics Data System (ADS)
Oguro, Y.; Tsuchiya, K.
To extract temporal change of the land cover from remotely sensed data from space, the generation of a reliable cloud-free data set is the first-priority item. With the objectives of generating accurate global basic data and of finding the effects of spectral and spatial resolution differences and observation time, an attempt is made to generate a reliable global cloud-free data set of Terra and Aqua MODIS utilizing personal computers. Out of 36 bands, seven bands with spectral features similar to those of Landsat TM (i.e., Bands 1 through 7) are selected. These bands cover the most important spectra for deriving land-cover features. The procedure of the data set generation is as follows: (1) Download the global Terra and Aqua MODIS daytime data (MOD02 Level-1B Calibrated Geolocation Data Set) at 250-meter (Bands 1 and 2) and 500-meter (Bands 3 through 7) resolution from the NASA web site. (2) Separate the data into several BSQ (Band SeQuential) image files and several text files with geolocation information of the pixels. (3) The geolocation information is given for pixels at intervals of several kilometers; based on this information, resampling of the data is made at 1/2- and 1/4-degree intervals of latitude and longitude, so that the resampled pixels are distributed in the latitude-longitude plane at 1/4-degree (high-resolution) and 1/2-degree (low-resolution) intervals. (4) A global data set for one day is composed. (5) Compute NDVI for each pixel. (6) Compare the NDVI values of successive days and keep the larger NDVI; at the same time, keep the values of each band for the day with the larger NDVI.
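Steps (5) and (6) amount to per-pixel maximum-NDVI compositing, sketched below under assumed array shapes; the running maximum keeps, for every pixel, the observation from the least cloud-contaminated day.

```python
# Compact sketch of steps (5)-(6): per-pixel NDVI from red/NIR bands and
# a running maximum-NDVI composite over successive days. Array shapes and
# reflectance ranges are assumptions for the demo.
import numpy as np

def ndvi(red, nir):
    return (nir - red) / np.maximum(nir + red, 1e-6)

def max_ndvi_composite(days_red, days_nir):
    """days_red/days_nir: (n_days, H, W) reflectance stacks."""
    best_ndvi = np.full(days_red.shape[1:], -np.inf)
    best_red = np.zeros_like(days_red[0])
    best_nir = np.zeros_like(days_nir[0])
    for red, nir in zip(days_red, days_nir):
        v = ndvi(red, nir)
        newer = v > best_ndvi          # keep the day with the larger NDVI
        best_ndvi = np.where(newer, v, best_ndvi)
        best_red = np.where(newer, red, best_red)
        best_nir = np.where(newer, nir, best_nir)
    return best_ndvi, best_red, best_nir

rng = np.random.default_rng(0)
red = rng.uniform(0.02, 0.2, (8, 180, 360))   # 8 days on a coarse global grid
nir = rng.uniform(0.1, 0.5, (8, 180, 360))
composite, _, _ = max_ndvi_composite(red, nir)
print("composite NDVI range:", composite.min(), composite.max())
```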
Uchiumi, Y; Ohtsuki, H; Sasaki, A
2017-12-01
Mutualism based on reciprocal exchange of costly services must avoid exploitation by 'free-riders'. Accordingly, hosts discriminate against free-riding symbionts in many mutualistic relationships. However, as the selective advantage of discriminators comes from the presence of variability in symbiont quality that they eliminate, discrimination and thus mutualism have been considered to be maintained only with an exogenous supply of free-riders. In this study, we tried to resolve this 'paradoxical' co-evolution of discrimination by hosts and cooperation by symbionts by comparing two different types of discrimination: 'one-shot' discrimination, where a host does not reacquire new symbionts after evicting free-riders, and 'resampling' discrimination, where a host does reacquire them from the environment. Our study shows that this apparently minor difference in discrimination types leads to qualitatively different evolutionary outcomes. First, although it has usually been considered that the benefit of discriminators is derived from the variability of symbiont quality, the benefit of a certain type of discriminator (e.g. one-shot discrimination) is proportional to the frequency of free-riders, in stark contrast to the case of resampling discrimination. As a result, one-shot discriminators can invade the free-rider/nondiscriminator population even if standing variation in symbiont quality is absent. Second, one-shot discriminators can also be maintained without an exogenous supply of free-riders and hence are free from the paradox of discrimination. Therefore, our results indicate that the paradox is not a common feature of the evolution of discrimination but is a problem of specific types of discrimination. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
Comparing apples and oranges: the Community Intercomparison Suite
NASA Astrophysics Data System (ADS)
Schutgens, Nick; Stier, Philip; Kershaw, Philip; Pascoe, Stephen
2015-04-01
Visual representation and comparison of geoscientific datasets presents a huge challenge due to the large variety of file formats and spatio-temporal sampling of data (be they observations or simulations). The Community Intercomparison Suite attempts to greatly simplify these tasks for users by offering an intelligent but simple command line tool for visualisation and colocation of diverse datasets. In addition, CIS can subset and aggregate large datasets into smaller more manageable datasets. Our philosophy is to remove as much as possible the need for specialist knowledge by the user of the structure of a dataset. The colocation of observations with model data is as simple as: "cis col
NASA Astrophysics Data System (ADS)
Yan, D.; Scott, R. L.; Moore, D. J.; Biederman, J. A.; Smith, W. K.
2017-12-01
Land surface phenology (LSP) - defined as remotely sensed seasonal variations in vegetation greenness - is intrinsically linked to seasonal carbon uptake, and is thus commonly used as a proxy for vegetation productivity (gross primary productivity; GPP). Yet, the relationship between LSP and GPP remains uncertain, particularly for understudied dryland ecosystems characterized by relatively large spatial and temporal variability. Here, we explored the relationship between LSP and the phenology of GPP for three dominant dryland ecosystem types, and we evaluated how these relationships change as a function of spatial and temporal scale. We focused on three long-term dryland eddy covariance flux tower sites: Walnut Gulch Lucky Hills Shrubland (WHS), Walnut Gulch Kendall Grassland (WKG), and Santa Rita Mesquite (SRM). We analyzed daily canopy-level, 16-day 30m, and 8-day 500m time series of greenness indices from PhenoCam, Landsat 7 ETM+/Landsat 8 OLI, and MODIS, respectively. We first quantified the impact of spatial scale by temporally resampling canopy-level PhenoCam, 30m Landsat, and 500m MODIS to 16-day intervals and then comparing against flux tower GPP estimates. We next quantified the impact of temporal scale by spatially resampling daily PhenoCam, 16-day Landsat, and 8-day MODIS to 500m time series and then comparing against flux tower GPP estimates. We find evidence of critical periods of decoupling between LSP and the phenology of GPP that vary according to the spatial and temporal scale, and as a function of ecosystem type. Our results provide key insight into dryland LSP and GPP dynamics that can be used in future efforts to improve ecosystem process models and satellite-based vegetation productivity algorithms.
NASA Astrophysics Data System (ADS)
Hale Topaloğlu, Raziye; Sertel, Elif; Musaoğlu, Nebiye
2016-06-01
This study aims to compare the classification accuracies of land cover/use maps created from Sentinel-2 and Landsat-8 data. Istanbul, a metropolitan city of Turkey with a population of around 14 million and diverse landscape characteristics, was selected as the study area. Water, forest, agricultural areas, grasslands, transport network, urban, airport-industrial units and barren land-mine land cover/use classes adapted from the CORINE nomenclature were used as the main land cover/use classes to identify. To fulfil the aims of this research, recently acquired Sentinel-2 (8 February 2016) and Landsat-8 (22 February 2016) images of Istanbul were obtained and image pre-processing steps like atmospheric and geometric correction were employed. Both Sentinel-2 and Landsat-8 images were resampled to 30 m pixel size after geometric correction, and similar spectral bands for both satellites were selected to create a common base for these multi-sensor data. Maximum Likelihood (MLC) and Support Vector Machine (SVM) supervised classification methods were applied to both data sets to accurately identify eight different land cover/use classes. An error matrix was created using the same reference points for the Sentinel-2 and Landsat-8 classifications. After the accuracy assessment, the results were compared to find out the best approach for creating a current land cover/use map of the region. The results of the MLC and SVM classification methods were compared for both images.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jiali; Han, Yuefeng; Stein, Michael L.
2016-02-10
The Weather Research and Forecast (WRF) model downscaling skill in extreme maximum daily temperature is evaluated by using the generalized extreme value (GEV) distribution. While the GEV distribution has been used extensively in climatology and meteorology for estimating probabilities of extreme events, accurately estimating GEV parameters based on data from a single pixel can be difficult, even with fairly long data records. This work proposes a simple method assuming that the shape parameter, the most difficult of the three parameters to estimate, does not vary over a relatively large region. This approach is applied to evaluate 31-year WRF-downscaled extreme maximum temperature through comparison with North American Regional Reanalysis (NARR) data. Uncertainty in GEV parameter estimates and the statistical significance in the differences of estimates between WRF and NARR are accounted for by conducting bootstrap resampling. Despite certain biases over parts of the United States, overall, WRF shows good agreement with NARR in the spatial pattern and magnitudes of GEV parameter estimates. Both WRF and NARR show a significant increase in extreme maximum temperature over the southern Great Plains and southeastern United States in January and over the western United States in July. The GEV model shows clear benefits from the regionally constant shape parameter assumption, for example, leading to estimates of the location and scale parameters of the model that show coherent spatial patterns.
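The estimation idea (a regionally constant shape parameter, per-pixel location and scale, and bootstrap resampling for uncertainty) can be sketched with scipy's GEV implementation; note that scipy's genextreme uses the c = -ξ sign convention. The pooled-fit shortcut and the synthetic maxima below are illustrative assumptions, not the paper's estimator.

```python
# Sketch of the estimation idea under stated assumptions: pool annual
# maxima from nearby pixels to fix a common shape parameter, then fit
# location/scale per pixel and bootstrap-resample years for uncertainty.
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(4)
n_years, n_pixels = 31, 9
true_c = -0.1                                   # shared shape (scipy convention)
maxima = genextreme.rvs(true_c, loc=30 + rng.normal(0, 1, n_pixels),
                        scale=3, size=(n_years, n_pixels), random_state=1)

# regional shape: crude pooled fit over all pixels
c_hat, _, _ = genextreme.fit(maxima.ravel())

# per-pixel location/scale with the shape held fixed
fits = [genextreme.fit(maxima[:, j], f0=c_hat) for j in range(n_pixels)]

# bootstrap years to gauge uncertainty in one pixel's location parameter
locs = []
for _ in range(500):
    sample = maxima[rng.integers(0, n_years, n_years), 0]
    _, loc, _ = genextreme.fit(sample, f0=c_hat)
    locs.append(loc)
print(f"pixel-0 location: {fits[0][1]:.1f} "
      f"(95% CI {np.percentile(locs, 2.5):.1f}-{np.percentile(locs, 97.5):.1f})")
```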
Wind Measurements from Arc Scans with Doppler Wind Lidar
Wang, H.; Barthelmie, R. J.; Clifton, Andy; ...
2015-11-25
Defining optimal scanning geometries for scanning lidars for wind energy applications is still an active field of research. This paper evaluates uncertainties associated with arc scan geometries and presents recommendations regarding optimal configurations in the atmospheric boundary layer. The analysis is based on arc scan data from a Doppler wind lidar with one elevation angle and seven azimuth angles spanning 30° and focuses on the estimation of 10-min mean wind speed and direction. When the flow is horizontally uniform, this approach can provide accurate wind measurements required for wind resource assessments, in part because of its high resampling rate. Retrieved wind velocities at a single range gate exhibit good correlation with data from a sonic anemometer on a nearby meteorological tower, and vertical profiles of horizontal wind speed, though derived from range gates located on a conical surface, match those measured by mast-mounted cup anemometers. Uncertainties in the retrieved wind velocity are related to high turbulent wind fluctuation and an inhomogeneous horizontal wind field. Moreover, the radial velocity variance is found to be a robust measure of the uncertainty of the retrieved wind speed because of its relationship to turbulence properties. It is further shown that the standard error of wind speed estimates can be minimized by increasing the azimuthal range beyond 30° and using five to seven azimuth angles.
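For a single range gate, the wind retrieval reduces to a linear least-squares fit of (u, v) to radial velocities measured at several azimuths, as sketched below; the elevation angle, azimuth span, and noise level are assumptions chosen to mimic the described geometry, and the neglect of vertical velocity is a simplification.

```python
# Toy sketch of retrieving a 10-min mean horizontal wind from arc scan
# radial velocities: with one elevation and several azimuths, the radial
# velocity is a sinusoid in azimuth, and (u, v) follow from a linear
# least-squares fit. Geometry values are assumptions for the demo.
import numpy as np

el = np.deg2rad(6.0)
az = np.deg2rad(np.linspace(75, 105, 7))        # 7 azimuths spanning 30 deg
u_true, v_true = 8.0, 3.0

rng = np.random.default_rng(8)
# radial velocity model (w neglected): vr = [u sin(az) + v cos(az)] cos(el)
vr = (u_true * np.sin(az) + v_true * np.cos(az)) * np.cos(el)
vr = vr + rng.normal(0, 0.3, az.size)           # turbulent fluctuation

A = np.column_stack([np.sin(az) * np.cos(el), np.cos(az) * np.cos(el)])
(u_hat, v_hat), *_ = np.linalg.lstsq(A, vr, rcond=None)

speed = np.hypot(u_hat, v_hat)
direction = (np.degrees(np.arctan2(u_hat, v_hat)) + 180) % 360  # met convention
print(f"retrieved speed {speed:.1f} m/s, direction {direction:.0f} deg")
```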
Applying causal mediation analysis to personality disorder research.
Walters, Glenn D
2018-01-01
This article is designed to address fundamental issues in the application of causal mediation analysis to research on personality disorders. Causal mediation analysis is used to identify mechanisms of effect by testing variables as putative links between the independent and dependent variables. As such, it would appear to have relevance to personality disorder research. It is argued that proper implementation of causal mediation analysis requires that investigators take several factors into account. These factors are discussed under 5 headings: variable selection, model specification, significance evaluation, effect size estimation, and sensitivity testing. First, care must be taken when selecting the independent, dependent, mediator, and control variables for a mediation analysis. Some variables make better mediators than others and all variables should be based on reasonably reliable indicators. Second, the mediation model needs to be properly specified. This requires that the data for the analysis be prospectively or historically ordered and possess proper causal direction. Third, it is imperative that the significance of the identified pathways be established, preferably with a nonparametric bootstrap resampling approach. Fourth, effect size estimates should be computed or competing pathways compared. Finally, investigators employing the mediation method are advised to perform a sensitivity analysis. Additional topics covered in this article include parallel and serial multiple mediation designs, moderation, and the relationship between mediation and moderation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
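A bare-bones version of the recommended nonparametric bootstrap for the indirect (mediated) effect, with simulated x, m, and y standing in for real personality-disorder variables:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
# Fabricated variables: x (independent), m (mediator), y (dependent).
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)                # a-path
y = 0.4 * m + 0.2 * x + rng.normal(size=n)      # b-path plus direct effect

def indirect_effect(x, m, y):
    a = np.polyfit(x, m, 1)[0]                  # slope of m on x
    design = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(design, y, rcond=None)[0][2]  # slope of y on m given x
    return a * b

# Nonparametric bootstrap: resample cases, recompute a*b, take percentiles.
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(indirect_effect(x[idx], m[idx], y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect {indirect_effect(x, m, y):.3f}, "
      f"95% CI [{lo:.3f}, {hi:.3f}]")
```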
Integrating scales of seagrass monitoring to meet conservation needs
Neckles, Hilary A.; Kopp, Blaine S.; Peterson, Bradley J.; Pooler, Penelope S.
2012-01-01
We evaluated a hierarchical framework for seagrass monitoring in two estuaries in the northeastern USA: Little Pleasant Bay, Massachusetts, and Great South Bay/Moriches Bay, New York. This approach includes three tiers of monitoring that are integrated across spatial scales and sampling intensities. We identified monitoring attributes for determining attainment of conservation objectives to protect seagrass ecosystems from estuarine nutrient enrichment. Existing mapping programs provided large-scale information on seagrass distribution and bed sizes (tier 1 monitoring). We supplemented this with bay-wide, quadrat-based assessments of seagrass percent cover and canopy height at permanent sampling stations following a spatially distributed random design (tier 2 monitoring). Resampling simulations showed that four observations per station were sufficient to minimize bias in estimating mean percent cover on a bay-wide scale, and sample sizes of 55 stations in a 624-ha system and 198 stations in a 9,220-ha system were sufficient to detect absolute temporal increases in seagrass abundance from 25% to 49% cover and from 4% to 12% cover, respectively. We made high-resolution measurements of seagrass condition (percent cover, canopy height, total and reproductive shoot density, biomass, and seagrass depth limit) at a representative index site in each system (tier 3 monitoring). Tier 3 data helped explain system-wide changes. Our results suggest tiered monitoring as an efficient and feasible way to detect and predict changes in seagrass systems relative to multi-scale conservation objectives.
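The resampling simulation for choosing the number of quadrat observations per station might look like the following sketch; the cover array is fabricated, and for simplicity the same quadrat subset is drawn for every station:

```python
import numpy as np

rng = np.random.default_rng(4)
# Fabricated tier 2 data: percent cover from 12 quadrat observations at each
# of 55 permanent stations in one bay.
cover = np.clip(rng.normal(30.0, 20.0, size=(55, 12)), 0.0, 100.0)
full_mean = cover.mean()

# Resampling simulation: bias and spread of the bay-wide mean when only k
# quadrat observations are taken per station.
for k in (1, 2, 4, 8):
    means = [cover[:, rng.choice(cover.shape[1], k, replace=False)].mean()
             for _ in range(1000)]
    print(f"{k} obs/station: bias {np.mean(means) - full_mean:+.2f}, "
          f"sd {np.std(means):.2f}")
```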
[Recognition of landscape characteristic scale based on two-dimension wavelet analysis].
Gao, Yan-Ni; Chen, Wei; He, Xing-Yuan; Li, Xiao-Yu
2010-06-01
Three wavelet bases, i.e., Haar, Daubechies, and Symlet, were chosen to assess the validity of two-dimension wavelet analysis for recognizing the characteristic scales of the urban, peri-urban, and rural landscapes of Shenyang. Because the transform scale of the two-dimension wavelet must be an integer power of 2, some characteristic scales cannot be accurately recognized. Therefore, the pixel resolution of the images was resampled to 3, 3.5, 4, and 4.5 m to densify the scales in the analysis. It was shown that two-dimension wavelet analysis worked effectively in detecting characteristic scales. Haar, Daubechies, and Symlet were the optimal wavelet bases for the peri-urban, urban, and rural landscapes, respectively. Both the Haar and Symlet bases performed well in recognizing the fine characteristic scale of the rural landscape and in detecting the boundary of the peri-urban landscape. The Daubechies and Symlet bases could also be used to detect the boundaries of the urban and rural landscapes, respectively.
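A hedged sketch of the scale analysis using the PyWavelets package (a library choice assumed here, not named in the abstract): detail-coefficient energy is computed per dyadic level for each of the three bases, and the energy peak marks a candidate characteristic scale.

```python
import numpy as np
import pywt

rng = np.random.default_rng(5)
img = rng.random((512, 512))  # stand-in for a resampled grayscale landscape image

for wavelet in ("haar", "db4", "sym4"):   # Haar, Daubechies, Symlet bases
    coeffs = pywt.wavedec2(img, wavelet, level=6)
    # coeffs[0] is the approximation; coeffs[1:] are (cH, cV, cD) detail
    # tuples ordered coarsest to finest.
    energies = [sum(float((c ** 2).sum()) for c in detail)
                for detail in coeffs[1:]]
    peak = int(np.argmax(energies))
    scale_px = 2 ** (len(energies) - peak)  # dyadic scale of the peak level
    print(f"{wavelet}: peak detail energy at ~{scale_px} px")
```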
Why continuous simulation? The role of antecedent moisture in design flood estimation
NASA Astrophysics Data System (ADS)
Pathiraja, S.; Westra, S.; Sharma, A.
2012-06-01
Continuous simulation for design flood estimation is increasingly becoming a viable alternative to traditional event-based methods. The advantage of continuous simulation approaches is that the catchment moisture state prior to the flood-producing rainfall event is implicitly incorporated within the modeling framework, provided the model has been calibrated and validated to produce reasonable simulations. This contrasts with event-based models, in which information about the expected sequence of rainfall and evaporation preceding the flood-producing rainfall event, together with catchment storage and infiltration properties, is commonly pooled into a single set of "loss" parameters that require adjustment through calibration. To identify the importance of accounting for antecedent moisture in flood modeling, this paper uses a continuous rainfall-runoff model calibrated to 45 catchments in the Murray-Darling Basin in Australia. Flood peaks derived using the historical daily rainfall record are compared with those derived using resampled daily rainfall, for which the sequencing of wet and dry days preceding the heavy rainfall event is removed. The analysis shows a consistent underestimation of design flood events when antecedent moisture is not properly simulated, which can be as much as 30% when only 1 or 2 days of antecedent rainfall are considered, compared to 5% when this is extended to 60 days of prior rainfall. These results show that, in general, it is necessary to consider both short-term memory in rainfall associated with synoptic-scale dependence and longer-term memory associated with variability at seasonal or longer time scales in order to obtain accurate design flood estimates.
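The resampling experiment, removing the observed sequencing of wet and dry days immediately before each flood-producing event, could be sketched as follows; the rainfall series and the fixed 365-day year are simplifications, and the function name is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(6)
rain = rng.gamma(0.4, 8.0, size=365 * 30)  # fabricated 30-yr daily record (mm)

def shuffle_antecedent(rain, k, rng):
    """Replace the k days preceding each year's maximum-rainfall day with
    days resampled at random from the record, destroying the observed
    wet/dry sequencing ahead of the flood-producing event."""
    out = rain.copy()
    for start in range(0, rain.size, 365):
        peak = start + int(np.argmax(rain[start:start + 365]))
        lo = max(start, peak - k)
        out[lo:peak] = rng.choice(rain, size=peak - lo)
    return out

# Driving a calibrated rainfall-runoff model with shuffle_antecedent(rain, 2, rng)
# versus shuffle_antecedent(rain, 60, rng) reproduces the short- versus
# long-memory comparison described above.
resampled = shuffle_antecedent(rain, 60, rng)
```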
Yoon, Kyong Sup; Previte, Domenic J.; Hodgdon, Hilliary E.; Poole, Bryan C.; Kwon, Deok Ho; El-Ghar, Gamal E. Abo; Lee, Si Hyeock; Clark, J. Marshall
2014-01-01
The study examines the extent and frequency of a knockdown-type resistance allele (kdr type) in North American populations of human head lice. Lice were collected from 32 locations in Canada and the United States. DNA was extracted from individual lice and used to determine their zygosity using the serial invasive signal amplification technique to detect the kdr-type T917I (TI) mutation, which is most responsible for nerve insensitivity that results in the kdr phenotype and permethrin resistance. Previously sampled sites were resampled to determine if the frequency of the TI mutation was changing. The TI frequency was also reevaluated using a quantitative sequencing method on pooled DNA samples from selected sites to validate this population genotyping method. Genotyping substantiated that TI occurs at high levels in North American lice (88.4%). Overall, the TI frequency in U.S. lice was 84.4% from 1999 to 2009, increased to 99.6% from 2007 to 2009, and was 97.1% in Canadian lice in 2008. Genotyping results using the serial invasive signal amplification reaction (99.54%) and quantitative sequencing (99.45%) techniques were highly correlated. Thus, the frequencies of TI in North American head louse populations were found to be uniformly high, which may be due to the high selection pressure from the intensive and widespread use of the pyrethrins- or pyrethroid-based pediculicides over many years, and is likely a main cause of increased pediculosis and failure of pyrethrins- or permethrin-based products in Canada and the United States. Alternative approaches to treatment of head lice infestations are critically needed. PMID:24724296
2013-01-01
Background Whereas the prognosis of second kidney transplant recipients (STR) compared to first recipients has been frequently analyzed, no study has addressed the issue of comparing the risk factor effects on graft failure between both groups. Methods Here, we propose two alternative strategies to study the heterogeneity of risk factors between two groups of patients: (i) a multiplicative-regression model for relative survival (MRS) and (ii) a stratified Cox model (SCM) specifying the graft rank as strata and assuming subvectors of the explanatory variables. These developments were motivated by the analysis of factors associated with time to graft failure (return to dialysis or patient death) in STR compared to first recipients. Estimation of the parameters was based on partial likelihood maximization. Monte-Carlo simulations associated with bootstrap re-sampling were performed to calculate the standard deviations for the MRS. Results We demonstrate, for the first time in renal transplantation, that: (i) male donor gender is a specific risk factor for STR, (ii) the adverse effect of recipient age is enhanced for STR and (iii) the graft failure risk related to donor age is attenuated for STR. Conclusion While the traditional Cox model did not provide original results relative to the renal transplantation literature, the proposed relative and stratified models revealed new findings that are useful for clinicians. These methodologies may be of interest in other medical fields when the principal objective is the comparison of risk factors between two populations. PMID:23915191
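For the stratified Cox model (ii), a minimal sketch using the lifelines package (a tool choice assumed here), with fabricated graft-survival data and graft rank as the stratification variable; the MRS model and the bootstrap standard deviations are not reproduced:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(7)
n = 500
# Fabricated graft-survival data; graft_rank distinguishes first (1) from
# second (2) transplants and serves as the stratification variable.
df = pd.DataFrame({
    "time": rng.exponential(8.0, n),
    "failed": rng.integers(0, 2, n),          # return to dialysis or death
    "donor_male": rng.integers(0, 2, n),
    "recipient_age": rng.normal(50.0, 12.0, n),
    "graft_rank": rng.integers(1, 3, n),
})

# Stratified Cox model: separate baseline hazards per graft rank; adding
# covariate-by-rank interaction terms would test whether risk-factor
# effects differ between first and second transplants.
cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="failed", strata=["graft_rank"])
cph.print_summary()
```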
Improving multispectral satellite image compression using onboard subpixel registration
NASA Astrophysics Data System (ADS)
Albinet, Mathieu; Camarero, Roberto; Isnard, Maxime; Poulet, Christophe; Perret, Jokin
2013-09-01
Future CNES earth observation missions will have to deal with an ever increasing telemetry data rate due to improvements in resolution and the addition of spectral bands. Current CNES image compressors implement a discrete wavelet transform (DWT) followed by a bit plane encoding (BPE), but only on a mono-spectral basis, and do not exploit the multispectral redundancy of the observed scenes. Recent CNES studies have proven a substantial gain in the achievable compression ratio, +20% to +40% on selected scenarios, by implementing a multispectral compression scheme based on a Karhunen-Loeve transform (KLT) followed by the classical DWT+BPE. But such results can be achieved only on perfectly registered bands; a registration error as low as 0.5 pixel negates all the benefits of multispectral compression. In this work, we first study the possibility of implementing multi-band subpixel onboard registration based on registration grids generated on-the-fly by the satellite attitude control system and simplified resampling and interpolation techniques. Indeed, band registration is usually performed on the ground using sophisticated techniques that are too computationally intensive for onboard use. This fully quantized algorithm is tuned to meet acceptable registration performance within stringent image quality criteria, with the objective of onboard real-time processing. In a second part, we describe an FPGA implementation developed to evaluate the design complexity and, by extrapolation, the data rate achievable on a space-qualified ASIC. Finally, we present the impact of this approach on the processing chain, not only onboard but also on the ground, and the impacts on the design of the instrument.
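The spectral KLT step can be sketched with a plain eigendecomposition of the band covariance; the cube below is synthetic, and the flight implementation is a fully quantized fixed-point transform, not this floating-point version:

```python
import numpy as np

rng = np.random.default_rng(8)
# Synthetic co-registered 4-band cube: a shared scene plus small per-band noise.
scene = rng.random((256, 256))
bands = np.stack([scene + 0.05 * rng.random((256, 256)) for _ in range(4)])

# KLT across the spectral dimension: eigendecomposition of the band covariance.
X = bands.reshape(bands.shape[0], -1)          # (n_bands, n_pixels)
Xc = X - X.mean(axis=1, keepdims=True)
eigval, eigvec = np.linalg.eigh(np.cov(Xc))    # eigenvalues in ascending order
components = eigvec[:, ::-1].T @ Xc            # most energetic component first

# With well-registered bands almost all variance lands in component 0; a
# misregistration near 0.5 pixel spreads energy back across components and
# erases the downstream DWT+BPE compression gain.
print(eigval[::-1] / eigval.sum())
```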
Adriaens, Els; Barroso, João; Eskes, Chantra; Hoffmann, Sebastian; McNamee, Pauline; Alépée, Nathalie; Bessou-Touya, Sandrine; De Smedt, Ann; De Wever, Bart; Pfannenbecker, Uwe; Tailhardat, Magalie; Zuang, Valérie
2014-03-01
For more than two decades, scientists have been trying to replace the regulatory in vivo Draize eye test by in vitro methods, but so far only partial replacement has been achieved. In order to better understand the reasons for this, historical in vivo rabbit data were analysed in detail and resampled with the purpose of (1) revealing which of the in vivo endpoints are most important in driving United Nations Globally Harmonized System/European Union Regulation on Classification, Labelling and Packaging (UN GHS/EU CLP) classification for serious eye damage/eye irritation and (2) evaluating the method's within-test variability for proposing acceptable and justifiable target values of sensitivity and specificity for alternative methods and their combinations in testing strategies. Among the Cat 1 chemicals evaluated, 36-65 % (depending on the database) were classified based only on persistence of effects, with the remaining being classified mostly based on severe corneal effects. Iritis was found to rarely drive the classification (<4 % of both Cat 1 and Cat 2 chemicals). The two most important endpoints driving Cat 2 classification are conjunctiva redness (75-81 %) and corneal opacity (54-75 %). The resampling analyses demonstrated an overall probability of at least 11 % that chemicals classified as Cat 1 by the Draize eye test could be equally identified as Cat 2 and of about 12 % for Cat 2 chemicals to be equally identified as No Cat. On the other hand, the over-classification error for No Cat and Cat 2 was negligible (<1 %), which strongly suggests a high over-predictive power of the Draize eye test. Moreover, our analyses of the classification drivers suggest a critical revision of the UN GHS/EU CLP decision criteria for the classification of chemicals based on Draize eye test data, in particular Cat 1 based only on persistence of conjunctiva effects or corneal opacity scores of 4. In order to successfully replace the regulatory in vivo Draize eye test, it will be important to recognise these uncertainties and to have in vitro tools to address the most important in vivo endpoints identified in this paper.
NASA Astrophysics Data System (ADS)
Durocher, M.; Mostofi Zadeh, S.; Burn, D. H.; Ashkar, F.
2017-12-01
Floods are one of the most costly hazards, and frequency analysis of river discharges is an important part of the tools at our disposal to evaluate their inherent risks and to provide an adequate response. In comparison to the common examination of annual streamflow maxima, peaks over threshold (POT) is an interesting alternative that makes better use of the available information by including more than one flood event per year (on average). However, a major challenge is the selection of a satisfactory threshold above which peaks are assumed to respect certain conditions necessary for an adequate estimation of the risk. Additionally, studies have shown that POT is also a valuable approach to investigate the evolution of flood regimes in the context of climate change. Recently, automatic procedures for threshold selection were suggested to guide that important choice, which otherwise relies on graphical tools and expert judgment. Furthermore, having an objective, automatic procedure allows the analysis to be quickly repeated on a large number of samples, which is useful in the context of large databases or for uncertainty analysis based on a resampling approach. This study investigates the impact of considering such procedures in a case study including many sites across Canada. A simulation study is conducted to evaluate the bias and predictive power of the automatic procedures under similar conditions, as well as to investigate the power of derived nonstationarity tests. The results obtained are also evaluated in the light of expert judgments established in a previous study. Ultimately, this study provides a thorough examination of the considerations that need to be addressed when conducting POT analysis using automatic threshold selection.
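One common automatic threshold recipe (not necessarily among those evaluated in the study) scans candidate thresholds and keeps the lowest at which a generalized Pareto fit to the exceedances passes a goodness-of-fit test; declustering of dependent peaks is omitted for brevity:

```python
import numpy as np
from scipy.stats import genpareto, kstest

rng = np.random.default_rng(9)
q = rng.gamma(2.0, 50.0, size=365 * 40)  # fabricated daily discharge record

# Scan candidate thresholds from the 90th percentile upward and keep the
# lowest one at which the GPD fit to the exceedances is not rejected.
for u in np.quantile(q, np.arange(0.90, 0.996, 0.005)):
    exceedances = q[q > u] - u
    c, loc, scale = genpareto.fit(exceedances, floc=0.0)
    p = kstest(exceedances, "genpareto", args=(c, loc, scale)).pvalue
    if p > 0.05:
        print(f"threshold {u:.1f}: {exceedances.size} peaks, KS p = {p:.2f}")
        break
else:
    print("no candidate threshold accepted")
```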
Mist net effort required to inventory a forest bat species assemblage.
Theodore J. Weller; Danny C. Lee
2007-01-01
Little quantitative information exists about the survey effort necessary to inventory temperate bat species assemblages. We used a bootstrap resampling algorithm to estimate the number of mist net surveys required to capture individuals from 9 species at both study area and site levels using data collected in a forested watershed in northwestern California, USA, during...
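The bootstrap estimate of required survey effort can be sketched as follows, with fabricated capture records in place of the California mist-net data:

```python
import numpy as np

rng = np.random.default_rng(10)
# Fabricated capture records: each survey night yields a set of species
# (0-8), with rarer species captured less often.
p_capture = np.array([0.50, 0.40, 0.35, 0.30, 0.20, 0.15, 0.10, 0.08, 0.05])
surveys = [set(np.nonzero(rng.random(9) < p_capture)[0]) for _ in range(120)]
target = set().union(*surveys)  # species actually present in the data

# Bootstrap: resample surveys with replacement and count how many are
# needed before every species has been captured at least once.
needed = []
for _ in range(1000):
    seen, count = set(), 0
    while not target <= seen:
        seen |= surveys[rng.integers(len(surveys))]
        count += 1
    needed.append(count)
print(f"median {int(np.median(needed))} surveys; "
      f"90th percentile {int(np.percentile(needed, 90))}")
```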
Long-Term Soil Chemistry Changes in Aggrading Forest Ecosystems
Jennifer D. Knoepp; Wayne T. Swank
1994-01-01
Assessing potential long-term forest productivity requires identification of the processes regulating chemical changes in forest soils. We resampled the litter layer and upper two mineral soil horizons, A and AB/BA, in two aggrading southern Appalachian watersheds 20 yr after an earlier sampling. Soils from a mixed-hardwood watershed exhibited a small but significant...
Jeffrey T. Walton
2008-01-01
Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...
The Relationship of Cohabitation and Mental Health: A Study of a Young Adult Cohort.
ERIC Educational Resources Information Center
Horwitz, Allan V.; White, Helene Raskin
1998-01-01
Uses a cohort of unmarried young adults who were sampled when they were 18, 21, or 24 years old and resampled seven years later. Results indicate no differences between cohabitors and married couples in levels of depression. Cohabiting men report more alcohol problems than married and single men; cohabiting women reported more alcohol…
Techniques for Down-Sampling a Measured Surface Height Map for Model Validation
NASA Technical Reports Server (NTRS)
Sidick, Erkin
2012-01-01
This software allows one to down-sample a measured surface map for model validation, not only without introducing any re-sampling errors, but also while eliminating the existing measurement noise and measurement errors. The software tool implementing the two new techniques can be used in all optical model validation processes involving large space optical surfaces.
Rangeland exclosures of northeastern Oregon: stories they tell (1936–2004).
Charles Grier Johnson
2007-01-01
Rangeland exclosures installed primarily in the 1960s, but with some from the 1940s, were resampled for changes in plant community structure and composition periodically from 1977 to 2004 on the Malheur, Umatilla, and Wallowa-Whitman National Forests in northeastern Oregon. They allow one to compare vegetation with all-ungulate exclusion (known historically as game...