Petraco, Ricardo; Dehbi, Hakim-Moulay; Howard, James P; Shun-Shin, Matthew J; Sen, Sayan; Nijjer, Sukhjinder S; Mayet, Jamil; Davies, Justin E; Francis, Darrel P
2018-01-01
Diagnostic accuracy is widely accepted by researchers and clinicians as an optimal expression of a test's performance. The aim of this study was to evaluate the effects of disease severity distribution on values of diagnostic accuracy as well as propose a sample-independent methodology to calculate and display accuracy of diagnostic tests. We evaluated the diagnostic relationship between two hypothetical methods to measure serum cholesterol (Chol rapid and Chol gold ) by generating samples with statistical software and (1) keeping the numerical relationship between methods unchanged and (2) changing the distribution of cholesterol values. Metrics of categorical agreement were calculated (accuracy, sensitivity and specificity). Finally, a novel methodology to display and calculate accuracy values was presented (the V-plot of accuracies). No single value of diagnostic accuracy can be used to describe the relationship between tests, as accuracy is a metric heavily affected by the underlying sample distribution. Our novel proposed methodology, the V-plot of accuracies, can be used as a sample-independent measure of a test performance against a reference gold standard.
Dehbi, Hakim-Moulay; Howard, James P; Shun-Shin, Matthew J; Sen, Sayan; Nijjer, Sukhjinder S; Mayet, Jamil; Davies, Justin E; Francis, Darrel P
2018-01-01
Background Diagnostic accuracy is widely accepted by researchers and clinicians as an optimal expression of a test’s performance. The aim of this study was to evaluate the effects of disease severity distribution on values of diagnostic accuracy as well as propose a sample-independent methodology to calculate and display accuracy of diagnostic tests. Methods and findings We evaluated the diagnostic relationship between two hypothetical methods to measure serum cholesterol (Cholrapid and Cholgold) by generating samples with statistical software and (1) keeping the numerical relationship between methods unchanged and (2) changing the distribution of cholesterol values. Metrics of categorical agreement were calculated (accuracy, sensitivity and specificity). Finally, a novel methodology to display and calculate accuracy values was presented (the V-plot of accuracies). Conclusion No single value of diagnostic accuracy can be used to describe the relationship between tests, as accuracy is a metric heavily affected by the underlying sample distribution. Our novel proposed methodology, the V-plot of accuracies, can be used as a sample-independent measure of a test performance against a reference gold standard. PMID:29387424
[Accuracy Check of Monte Carlo Simulation in Particle Therapy Using Gel Dosimeters].
Furuta, Takuya
2017-01-01
Gel dosimeters are a three-dimensional imaging tool for dose distribution induced by radiations. They can be used for accuracy check of Monte Carlo simulation in particle therapy. An application was reviewed in this article. An inhomogeneous biological sample placing a gel dosimeter behind it was irradiated by carbon beam. The recorded dose distribution in the gel dosimeter reflected the inhomogeneity of the biological sample. Monte Carlo simulation was conducted by reconstructing the biological sample from its CT image. The accuracy of the particle transport by Monte Carlo simulation was checked by comparing the dose distribution in the gel dosimeter between simulation and experiment.
Improving the accuracy of livestock distribution estimates through spatial interpolation.
Bryssinckx, Ward; Ducheyne, Els; Muhwezi, Bernard; Godfrey, Sunday; Mintiens, Koen; Leirs, Herwig; Hendrickx, Guy
2012-11-01
Animal distribution maps serve many purposes such as estimating transmission risk of zoonotic pathogens to both animals and humans. The reliability and usability of such maps is highly dependent on the quality of the input data. However, decisions on how to perform livestock surveys are often based on previous work without considering possible consequences. A better understanding of the impact of using different sample designs and processing steps on the accuracy of livestock distribution estimates was acquired through iterative experiments using detailed survey. The importance of sample size, sample design and aggregation is demonstrated and spatial interpolation is presented as a potential way to improve cattle number estimates. As expected, results show that an increasing sample size increased the precision of cattle number estimates but these improvements were mainly seen when the initial sample size was relatively low (e.g. a median relative error decrease of 0.04% per sampled parish for sample sizes below 500 parishes). For higher sample sizes, the added value of further increasing the number of samples declined rapidly (e.g. a median relative error decrease of 0.01% per sampled parish for sample sizes above 500 parishes. When a two-stage stratified sample design was applied to yield more evenly distributed samples, accuracy levels were higher for low sample densities and stabilised at lower sample sizes compared to one-stage stratified sampling. Aggregating the resulting cattle number estimates yielded significantly more accurate results because of averaging under- and over-estimates (e.g. when aggregating cattle number estimates from subcounty to district level, P <0.009 based on a sample of 2,077 parishes using one-stage stratified samples). During aggregation, area-weighted mean values were assigned to higher administrative unit levels. However, when this step is preceded by a spatial interpolation to fill in missing values in non-sampled areas, accuracy is improved remarkably. This counts especially for low sample sizes and spatially even distributed samples (e.g. P <0.001 for a sample of 170 parishes using one-stage stratified sampling and aggregation on district level). Whether the same observations apply on a lower spatial scale should be further investigated.
Soultan, Alaaeldin; Safi, Kamran
2017-01-01
Digitized species occurrence data provide an unprecedented source of information for ecologists and conservationists. Species distribution model (SDM) has become a popular method to utilise these data for understanding the spatial and temporal distribution of species, and for modelling biodiversity patterns. Our objective is to study the impact of noise in species occurrence data (namely sample size and positional accuracy) on the performance and reliability of SDM, considering the multiplicative impact of SDM algorithms, species specialisation, and grid resolution. We created a set of four 'virtual' species characterized by different specialisation levels. For each of these species, we built the suitable habitat models using five algorithms at two grid resolutions, with varying sample sizes and different levels of positional accuracy. We assessed the performance and reliability of the SDM according to classic model evaluation metrics (Area Under the Curve and True Skill Statistic) and model agreement metrics (Overall Concordance Correlation Coefficient and geographic niche overlap) respectively. Our study revealed that species specialisation had by far the most dominant impact on the SDM. In contrast to previous studies, we found that for widespread species, low sample size and low positional accuracy were acceptable, and useful distribution ranges could be predicted with as few as 10 species occurrences. Range predictions for narrow-ranged species, however, were sensitive to sample size and positional accuracy, such that useful distribution ranges required at least 20 species occurrences. Against expectations, the MAXENT algorithm poorly predicted the distribution of specialist species at low sample size.
Chen, Yibin; Chen, Jiaxi; Chen, Xuan; Wang, Min; Wang, Wei
2015-01-01
A new method of uniform sampling is evaluated in this paper. The items and indexes were adopted to evaluate the rationality of the uniform sampling. The evaluation items included convenience of operation, uniformity of sampling site distribution, and accuracy and precision of measured results. The evaluation indexes included operational complexity, occupation rate of sampling site in a row and column, relative accuracy of pill weight, and relative deviation of pill weight. They were obtained from three kinds of drugs with different shape and size by four kinds of sampling methods. Gray correlation analysis was adopted to make the comprehensive evaluation by comparing it with the standard method. The experimental results showed that the convenience of uniform sampling method was 1 (100%), odds ratio of occupation rate in a row and column was infinity, relative accuracy was 99.50-99.89%, reproducibility RSD was 0.45-0.89%, and weighted incidence degree exceeded the standard method. Hence, the uniform sampling method was easy to operate, and the selected samples were distributed uniformly. The experimental results demonstrated that the uniform sampling method has good accuracy and reproducibility, which can be put into use in drugs analysis.
Loce, R P; Jodoin, R E
1990-09-10
Using the tools of Fourier analysis, a sampling requirement is derived that assures that sufficient information is contained within the samples of a distribution to calculate accurately geometric moments of that distribution. The derivation follows the standard textbook derivation of the Whittaker-Shannon sampling theorem, which is used for reconstruction, but further insight leads to a coarser minimum sampling interval for moment determination. The need for fewer samples to determine moments agrees with intuition since less information should be required to determine a characteristic of a distribution compared with that required to construct the distribution. A formula for calculation of the moments from these samples is also derived. A numerical analysis is performed to quantify the accuracy of the calculated first moment for practical nonideal sampling conditions. The theory is applied to a high speed laser beam position detector, which uses the normalized first moment to measure raster line positional accuracy in a laser printer. The effects of the laser irradiance profile, sampling aperture, number of samples acquired, quantization, and noise are taken into account.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu Huijun; Gordon, J. James; Siebers, Jeffrey V.
2011-02-15
Purpose: A dosimetric margin (DM) is the margin in a specified direction between a structure and a specified isodose surface, corresponding to a prescription or tolerance dose. The dosimetric margin distribution (DMD) is the distribution of DMs over all directions. Given a geometric uncertainty model, representing inter- or intrafraction setup uncertainties or internal organ motion, the DMD can be used to calculate coverage Q, which is the probability that a realized target or organ-at-risk (OAR) dose metric D{sub v} exceeds the corresponding prescription or tolerance dose. Postplanning coverage evaluation quantifies the percentage of uncertainties for which target and OAR structuresmore » meet their intended dose constraints. The goal of the present work is to evaluate coverage probabilities for 28 prostate treatment plans to determine DMD sampling parameters that ensure adequate accuracy for postplanning coverage estimates. Methods: Normally distributed interfraction setup uncertainties were applied to 28 plans for localized prostate cancer, with prescribed dose of 79.2 Gy and 10 mm clinical target volume to planning target volume (CTV-to-PTV) margins. Using angular or isotropic sampling techniques, dosimetric margins were determined for the CTV, bladder and rectum, assuming shift invariance of the dose distribution. For angular sampling, DMDs were sampled at fixed angular intervals {omega} (e.g., {omega}=1 deg., 2 deg., 5 deg., 10 deg., 20 deg.). Isotropic samples were uniformly distributed on the unit sphere resulting in variable angular increments, but were calculated for the same number of sampling directions as angular DMDs, and accordingly characterized by the effective angular increment {omega}{sub eff}. In each direction, the DM was calculated by moving the structure in radial steps of size {delta}(=0.1,0.2,0.5,1 mm) until the specified isodose was crossed. Coverage estimation accuracy {Delta}Q was quantified as a function of the sampling parameters {omega} or {omega}{sub eff} and {delta}. Results: The accuracy of coverage estimates depends on angular and radial DMD sampling parameters {omega} or {omega}{sub eff} and {delta}, as well as the employed sampling technique. Target |{Delta}Q|<1% and OAR |{Delta}Q|<3% can be achieved with sampling parameters {omega} or {omega}{sub eff}=20 deg., {delta}=1 mm. Better accuracy (target |{Delta}Q|<0.5% and OAR |{Delta}Q|<{approx}1%) can be achieved with {omega} or {omega}{sub eff}=10 deg., {delta}=0.5 mm. As the number of sampling points decreases, the isotropic sampling method maintains better accuracy than fixed angular sampling. Conclusions: Coverage estimates for post-planning evaluation are essential since coverage values of targets and OARs often differ from the values implied by the static margin-based plans. Finer sampling of the DMD enables more accurate assessment of the effect of geometric uncertainties on coverage estimates prior to treatment. DMD sampling with {omega} or {omega}{sub eff}=10 deg. and {delta}=0.5 mm should be adequate for planning purposes.« less
Xu, Huijun; Gordon, J James; Siebers, Jeffrey V
2011-02-01
A dosimetric margin (DM) is the margin in a specified direction between a structure and a specified isodose surface, corresponding to a prescription or tolerance dose. The dosimetric margin distribution (DMD) is the distribution of DMs over all directions. Given a geometric uncertainty model, representing inter- or intrafraction setup uncertainties or internal organ motion, the DMD can be used to calculate coverage Q, which is the probability that a realized target or organ-at-risk (OAR) dose metric D, exceeds the corresponding prescription or tolerance dose. Postplanning coverage evaluation quantifies the percentage of uncertainties for which target and OAR structures meet their intended dose constraints. The goal of the present work is to evaluate coverage probabilities for 28 prostate treatment plans to determine DMD sampling parameters that ensure adequate accuracy for postplanning coverage estimates. Normally distributed interfraction setup uncertainties were applied to 28 plans for localized prostate cancer, with prescribed dose of 79.2 Gy and 10 mm clinical target volume to planning target volume (CTV-to-PTV) margins. Using angular or isotropic sampling techniques, dosimetric margins were determined for the CTV, bladder and rectum, assuming shift invariance of the dose distribution. For angular sampling, DMDs were sampled at fixed angular intervals w (e.g., w = 1 degree, 2 degrees, 5 degrees, 10 degrees, 20 degrees). Isotropic samples were uniformly distributed on the unit sphere resulting in variable angular increments, but were calculated for the same number of sampling directions as angular DMDs, and accordingly characterized by the effective angular increment omega eff. In each direction, the DM was calculated by moving the structure in radial steps of size delta (=0.1, 0.2, 0.5, 1 mm) until the specified isodose was crossed. Coverage estimation accuracy deltaQ was quantified as a function of the sampling parameters omega or omega eff and delta. The accuracy of coverage estimates depends on angular and radial DMD sampling parameters omega or omega eff and delta, as well as the employed sampling technique. Target deltaQ/ < l% and OAR /deltaQ/ < 3% can be achieved with sampling parameters omega or omega eef = 20 degrees, delta =1 mm. Better accuracy (target /deltaQ < 0.5% and OAR /deltaQ < approximately 1%) can be achieved with omega or omega eff = 10 degrees, delta = 0.5 mm. As the number of sampling points decreases, the isotropic sampling method maintains better accuracy than fixed angular sampling. Coverage estimates for post-planning evaluation are essential since coverage values of targets and OARs often differ from the values implied by the static margin-based plans. Finer sampling of the DMD enables more accurate assessment of the effect of geometric uncertainties on coverage estimates prior to treatment. DMD sampling with omega or omega eff = 10 degrees and delta = 0.5 mm should be adequate for planning purposes.
Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...
Detecting the Water-soluble Chloride Distribution of Cement Paste in a High-precision Way.
Chang, Honglei; Mu, Song
2017-11-21
To improve the accuracy of the chloride distribution along the depth of cement paste under cyclic wet-dry conditions, a new method is proposed to obtain a high-precision chloride profile. Firstly, paste specimens are molded, cured, and exposed to cyclic wet-dry conditions. Then, powder samples at different specimen depths are grinded when the exposure age is reached. Finally, the water-soluble chloride content is detected using a silver nitrate titration method, and chloride profiles are plotted. The key to improving the accuracy of the chloride distribution along the depth is to exclude the error in the powderization, which is the most critical step for testing the distribution of chloride. Based on the above concept, the grinding method in this protocol can be used to grind powder samples automatically layer by layer from the surface inward, and it should be noted that a very thin grinding thickness (less than 0.5 mm) with a minimum error less than 0.04 mm can be obtained. The chloride profile obtained by this method better reflects the chloride distribution in specimens, which helps researchers to capture the distribution features that are often overlooked. Furthermore, this method can be applied to studies in the field of cement-based materials, which require high chloride distribution accuracy.
Wickham, J.D.; Stehman, S.V.; Smith, J.H.; Wade, T.G.; Yang, L.
2004-01-01
Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, within-cluster correlation may reduce the precision of the accuracy estimates. The detailed population information to quantify a priori the effect of within-cluster correlation on precision is typically unavailable. Consequently, a convenient, practical approach to evaluate the likely performance of a two-stage cluster sample is needed. We describe such an a priori evaluation protocol focusing on the spatial distribution of the sample by land-cover class across different cluster sizes and costs of different sampling options, including options not imposing clustering. This protocol also assesses the two-stage design's adequacy for estimating the precision of accuracy estimates for rare land-cover classes. We illustrate the approach using two large-area, regional accuracy assessments from the National Land-Cover Data (NLCD), and describe how the a priorievaluation was used as a decision-making tool when implementing the NLCD design.
Viana, Duarte S; Santamaría, Luis; Figuerola, Jordi
2016-02-01
Propagule retention time is a key factor in determining propagule dispersal distance and the shape of "seed shadows". Propagules dispersed by animal vectors are either ingested and retained in the gut until defecation or attached externally to the body until detachment. Retention time is a continuous variable, but it is commonly measured at discrete time points, according to pre-established sampling time-intervals. Although parametric continuous distributions have been widely fitted to these interval-censored data, the performance of different fitting methods has not been evaluated. To investigate the performance of five different fitting methods, we fitted parametric probability distributions to typical discretized retention-time data with known distribution using as data-points either the lower, mid or upper bounds of sampling intervals, as well as the cumulative distribution of observed values (using either maximum likelihood or non-linear least squares for parameter estimation); then compared the estimated and original distributions to assess the accuracy of each method. We also assessed the robustness of these methods to variations in the sampling procedure (sample size and length of sampling time-intervals). Fittings to the cumulative distribution performed better for all types of parametric distributions (lognormal, gamma and Weibull distributions) and were more robust to variations in sample size and sampling time-intervals. These estimated distributions had negligible deviations of up to 0.045 in cumulative probability of retention times (according to the Kolmogorov-Smirnov statistic) in relation to original distributions from which propagule retention time was simulated, supporting the overall accuracy of this fitting method. In contrast, fitting the sampling-interval bounds resulted in greater deviations that ranged from 0.058 to 0.273 in cumulative probability of retention times, which may introduce considerable biases in parameter estimates. We recommend the use of cumulative probability to fit parametric probability distributions to propagule retention time, specifically using maximum likelihood for parameter estimation. Furthermore, the experimental design for an optimal characterization of unimodal propagule retention time should contemplate at least 500 recovered propagules and sampling time-intervals not larger than the time peak of propagule retrieval, except in the tail of the distribution where broader sampling time-intervals may also produce accurate fits.
Bellier, Edwige; Grøtan, Vidar; Engen, Steinar; Schartau, Ann Kristin; Diserud, Ola H; Finstad, Anders G
2012-10-01
Obtaining accurate estimates of diversity indices is difficult because the number of species encountered in a sample increases with sampling intensity. We introduce a novel method that requires that the presence of species in a sample to be assessed while the counts of the number of individuals per species are only required for just a small part of the sample. To account for species included as incidence data in the species abundance distribution, we modify the likelihood function of the classical Poisson log-normal distribution. Using simulated community assemblages, we contrast diversity estimates based on a community sample, a subsample randomly extracted from the community sample, and a mixture sample where incidence data are added to a subsample. We show that the mixture sampling approach provides more accurate estimates than the subsample and at little extra cost. Diversity indices estimated from a freshwater zooplankton community sampled using the mixture approach show the same pattern of results as the simulation study. Our method efficiently increases the accuracy of diversity estimates and comprehension of the left tail of the species abundance distribution. We show how to choose the scale of sample size needed for a compromise between information gained, accuracy of the estimates and cost expended when assessing biological diversity. The sample size estimates are obtained from key community characteristics, such as the expected number of species in the community, the expected number of individuals in a sample and the evenness of the community.
Fixed-interval matching-to-sample: intermatching time and intermatching error runs1
Nelson, Thomas D.
1978-01-01
Four pigeons were trained on a matching-to-sample task in which reinforcers followed either the first matching response (fixed interval) or the fifth matching response (tandem fixed-interval fixed-ratio) that occurred 80 seconds or longer after the last reinforcement. Relative frequency distributions of the matching-to-sample responses that concluded intermatching times and runs of mismatches (intermatching error runs) were computed for the final matching responses directly followed by grain access and also for the three matching responses immediately preceding the final match. Comparison of these two distributions showed that the fixed-interval schedule arranged for the preferential reinforcement of matches concluding relatively extended intermatching times and runs of mismatches. Differences in matching accuracy and rate during the fixed interval, compared to the tandem fixed-interval fixed-ratio, suggested that reinforcers following matches concluding various intermatching times and runs of mismatches influenced the rate and accuracy of the last few matches before grain access, but did not control rate and accuracy throughout the entire fixed-interval period. PMID:16812032
Egger, Alexander E; Theiner, Sarah; Kornauth, Christoph; Heffeter, Petra; Berger, Walter; Keppler, Bernhard K; Hartinger, Christian G
2014-09-01
Laser ablation-inductively coupled plasma-mass spectrometry (LA-ICP-MS) was used to study the spatially-resolved distribution of ruthenium and platinum in viscera (liver, kidney, spleen, and muscle) originating from mice treated with the investigational ruthenium-based antitumor compound KP1339 or cisplatin, a potent, but nephrotoxic clinically-approved platinum-based anticancer drug. Method development was based on homogenized Ru- and Pt-containing samples (22.0 and 0.257 μg g(-1), respectively). Averaging yielded satisfactory precision and accuracy for both concentrations (3-15% and 93-120%, respectively), however when considering only single data points, the highly concentrated Ru sample maintained satisfactory precision and accuracy, while the low concentrated Pt sample yielded low recoveries and precision, which could not be improved by use of internal standards ((115)In, (185)Re or (13)C). Matrix-matched standards were used for quantification in LA-ICP-MS which yielded comparable metal distributions, i.e., enrichment in the cortex of the kidney in comparison with the medulla, a homogenous distribution in the liver and the muscle and areas of enrichment in the spleen. Elemental distributions were assigned to histological structures exceeding 100 μm in size. The accuracy of a quantitative LA-ICP-MS imaging experiment was validated by an independent method using microwave-assisted digestion (MW) followed by direct infusion ICP-MS analysis.
Target Tracking Using SePDAF under Ambiguous Angles for Distributed Array Radar.
Long, Teng; Zhang, Honggang; Zeng, Tao; Chen, Xinliang; Liu, Quanhua; Zheng, Le
2016-09-09
Distributed array radar can improve radar detection capability and measurement accuracy. However, it will suffer cyclic ambiguity in its angle estimates according to the spatial Nyquist sampling theorem since the large sparse array is undersampling. Consequently, the state estimation accuracy and track validity probability degrades when the ambiguous angles are directly used for target tracking. This paper proposes a second probability data association filter (SePDAF)-based tracking method for distributed array radar. Firstly, the target motion model and radar measurement model is built. Secondly, the fusion result of each radar's estimation is employed to the extended Kalman filter (EKF) to finish the first filtering. Thirdly, taking this result as prior knowledge, and associating with the array-processed ambiguous angles, the SePDAF is applied to accomplish the second filtering, and then achieving a high accuracy and stable trajectory with relatively low computational complexity. Moreover, the azimuth filtering accuracy will be promoted dramatically and the position filtering accuracy will also improve. Finally, simulations illustrate the effectiveness of the proposed method.
Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, Gretchen G.
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. ?? 2006 Elsevier B.V. All rights reserved.
Xiao, Zhu; Havyarimana, Vincent; Li, Tong; Wang, Dong
2016-05-13
In this paper, a novel nonlinear framework of smoothing method, non-Gaussian delayed particle smoother (nGDPS), is proposed, which enables vehicle state estimation (VSE) with high accuracy taking into account the non-Gaussianity of the measurement and process noises. Within the proposed method, the multivariate Student's t-distribution is adopted in order to compute the probability distribution function (PDF) related to the process and measurement noises, which are assumed to be non-Gaussian distributed. A computation approach based on Ensemble Kalman Filter (EnKF) is designed to cope with the mean and the covariance matrix of the proposal non-Gaussian distribution. A delayed Gibbs sampling algorithm, which incorporates smoothing of the sampled trajectories over a fixed-delay, is proposed to deal with the sample degeneracy of particles. The performance is investigated based on the real-world data, which is collected by low-cost on-board vehicle sensors. The comparison study based on the real-world experiments and the statistical analysis demonstrates that the proposed nGDPS has significant improvement on the vehicle state accuracy and outperforms the existing filtering and smoothing methods.
Hu, Junguo; Zhou, Jian; Zhou, Guomo; Luo, Yiqi; Xu, Xiaojun; Li, Pingheng; Liang, Junyi
2016-01-01
Soil respiration inherently shows strong spatial variability. It is difficult to obtain an accurate characterization of soil respiration with an insufficient number of monitoring points. However, it is expensive and cumbersome to deploy many sensors. To solve this problem, we proposed employing the Bayesian Maximum Entropy (BME) algorithm, using soil temperature as auxiliary information, to study the spatial distribution of soil respiration. The BME algorithm used the soft data (auxiliary information) effectively to improve the estimation accuracy of the spatiotemporal distribution of soil respiration. Based on the functional relationship between soil temperature and soil respiration, the BME algorithm satisfactorily integrated soil temperature data into said spatial distribution. As a means of comparison, we also applied the Ordinary Kriging (OK) and Co-Kriging (Co-OK) methods. The results indicated that the root mean squared errors (RMSEs) and absolute values of bias for both Day 1 and Day 2 were the lowest for the BME method, thus demonstrating its higher estimation accuracy. Further, we compared the performance of the BME algorithm coupled with auxiliary information, namely soil temperature data, and the OK method without auxiliary information in the same study area for 9, 21, and 37 sampled points. The results showed that the RMSEs for the BME algorithm (0.972 and 1.193) were less than those for the OK method (1.146 and 1.539) when the number of sampled points was 9 and 37, respectively. This indicates that the former method using auxiliary information could reduce the required number of sampling points for studying spatial distribution of soil respiration. Thus, the BME algorithm, coupled with soil temperature data, can not only improve the accuracy of soil respiration spatial interpolation but can also reduce the number of sampling points.
Hu, Junguo; Zhou, Jian; Zhou, Guomo; Luo, Yiqi; Xu, Xiaojun; Li, Pingheng; Liang, Junyi
2016-01-01
Soil respiration inherently shows strong spatial variability. It is difficult to obtain an accurate characterization of soil respiration with an insufficient number of monitoring points. However, it is expensive and cumbersome to deploy many sensors. To solve this problem, we proposed employing the Bayesian Maximum Entropy (BME) algorithm, using soil temperature as auxiliary information, to study the spatial distribution of soil respiration. The BME algorithm used the soft data (auxiliary information) effectively to improve the estimation accuracy of the spatiotemporal distribution of soil respiration. Based on the functional relationship between soil temperature and soil respiration, the BME algorithm satisfactorily integrated soil temperature data into said spatial distribution. As a means of comparison, we also applied the Ordinary Kriging (OK) and Co-Kriging (Co-OK) methods. The results indicated that the root mean squared errors (RMSEs) and absolute values of bias for both Day 1 and Day 2 were the lowest for the BME method, thus demonstrating its higher estimation accuracy. Further, we compared the performance of the BME algorithm coupled with auxiliary information, namely soil temperature data, and the OK method without auxiliary information in the same study area for 9, 21, and 37 sampled points. The results showed that the RMSEs for the BME algorithm (0.972 and 1.193) were less than those for the OK method (1.146 and 1.539) when the number of sampled points was 9 and 37, respectively. This indicates that the former method using auxiliary information could reduce the required number of sampling points for studying spatial distribution of soil respiration. Thus, the BME algorithm, coupled with soil temperature data, can not only improve the accuracy of soil respiration spatial interpolation but can also reduce the number of sampling points. PMID:26807579
NASA Astrophysics Data System (ADS)
Li, Zhe; Feng, Jinchao; Liu, Pengyu; Sun, Zhonghua; Li, Gang; Jia, Kebin
2018-05-01
Temperature is usually considered as a fluctuation in near-infrared spectral measurement. Chemometric methods were extensively studied to correct the effect of temperature variations. However, temperature can be considered as a constructive parameter that provides detailed chemical information when systematically changed during the measurement. Our group has researched the relationship between temperature-induced spectral variation (TSVC) and normalized squared temperature. In this study, we focused on the influence of temperature distribution in calibration set. Multi-temperature calibration set selection (MTCS) method was proposed to improve the prediction accuracy by considering the temperature distribution of calibration samples. Furthermore, double-temperature calibration set selection (DTCS) method was proposed based on MTCS method and the relationship between TSVC and normalized squared temperature. We compare the prediction performance of PLS models based on random sampling method and proposed methods. The results from experimental studies showed that the prediction performance was improved by using proposed methods. Therefore, MTCS method and DTCS method will be the alternative methods to improve prediction accuracy in near-infrared spectral measurement.
Target Tracking Using SePDAF under Ambiguous Angles for Distributed Array Radar
Long, Teng; Zhang, Honggang; Zeng, Tao; Chen, Xinliang; Liu, Quanhua; Zheng, Le
2016-01-01
Distributed array radar can improve radar detection capability and measurement accuracy. However, it will suffer cyclic ambiguity in its angle estimates according to the spatial Nyquist sampling theorem since the large sparse array is undersampling. Consequently, the state estimation accuracy and track validity probability degrades when the ambiguous angles are directly used for target tracking. This paper proposes a second probability data association filter (SePDAF)-based tracking method for distributed array radar. Firstly, the target motion model and radar measurement model is built. Secondly, the fusion result of each radar’s estimation is employed to the extended Kalman filter (EKF) to finish the first filtering. Thirdly, taking this result as prior knowledge, and associating with the array-processed ambiguous angles, the SePDAF is applied to accomplish the second filtering, and then achieving a high accuracy and stable trajectory with relatively low computational complexity. Moreover, the azimuth filtering accuracy will be promoted dramatically and the position filtering accuracy will also improve. Finally, simulations illustrate the effectiveness of the proposed method. PMID:27618058
Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge.
Bannan, Caitlin C; Burley, Kalistyn H; Chiu, Michael; Shirts, Michael R; Gilson, Michael K; Mobley, David L
2016-11-01
In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty, but their model uncertainty-how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4 the top performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery, therefore some decrease in accuracy would be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided.
Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge
Bannan, Caitlin C.; Burley, Kalistyn H.; Chiu, Michael; Shirts, Michael R.; Gilson, Michael K.; Mobley, David L.
2016-01-01
In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty, but their model uncertainty – how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4 the top performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery, therefore some decrease in accuracy would be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided. PMID:27677750
Xiao, Zhu; Havyarimana, Vincent; Li, Tong; Wang, Dong
2016-01-01
In this paper, a novel nonlinear framework of smoothing method, non-Gaussian delayed particle smoother (nGDPS), is proposed, which enables vehicle state estimation (VSE) with high accuracy taking into account the non-Gaussianity of the measurement and process noises. Within the proposed method, the multivariate Student’s t-distribution is adopted in order to compute the probability distribution function (PDF) related to the process and measurement noises, which are assumed to be non-Gaussian distributed. A computation approach based on Ensemble Kalman Filter (EnKF) is designed to cope with the mean and the covariance matrix of the proposal non-Gaussian distribution. A delayed Gibbs sampling algorithm, which incorporates smoothing of the sampled trajectories over a fixed-delay, is proposed to deal with the sample degeneracy of particles. The performance is investigated based on the real-world data, which is collected by low-cost on-board vehicle sensors. The comparison study based on the real-world experiments and the statistical analysis demonstrates that the proposed nGDPS has significant improvement on the vehicle state accuracy and outperforms the existing filtering and smoothing methods. PMID:27187405
NASA Astrophysics Data System (ADS)
Wahl, N.; Hennig, P.; Wieser, H. P.; Bangert, M.
2017-07-01
The sensitivity of intensity-modulated proton therapy (IMPT) treatment plans to uncertainties can be quantified and mitigated with robust/min-max and stochastic/probabilistic treatment analysis and optimization techniques. Those methods usually rely on sparse random, importance, or worst-case sampling. Inevitably, this imposes a trade-off between computational speed and accuracy of the uncertainty propagation. Here, we investigate analytical probabilistic modeling (APM) as an alternative for uncertainty propagation and minimization in IMPT that does not rely on scenario sampling. APM propagates probability distributions over range and setup uncertainties via a Gaussian pencil-beam approximation into moments of the probability distributions over the resulting dose in closed form. It supports arbitrary correlation models and allows for efficient incorporation of fractionation effects regarding random and systematic errors. We evaluate the trade-off between run-time and accuracy of APM uncertainty computations on three patient datasets. Results are compared against reference computations facilitating importance and random sampling. Two approximation techniques to accelerate uncertainty propagation and minimization based on probabilistic treatment plan optimization are presented. Runtimes are measured on CPU and GPU platforms, dosimetric accuracy is quantified in comparison to a sampling-based benchmark (5000 random samples). APM accurately propagates range and setup uncertainties into dose uncertainties at competitive run-times (GPU ≤slant {5} min). The resulting standard deviation (expectation value) of dose show average global γ{3% / {3}~mm} pass rates between 94.2% and 99.9% (98.4% and 100.0%). All investigated importance sampling strategies provided less accuracy at higher run-times considering only a single fraction. Considering fractionation, APM uncertainty propagation and treatment plan optimization was proven to be possible at constant time complexity, while run-times of sampling-based computations are linear in the number of fractions. Using sum sampling within APM, uncertainty propagation can only be accelerated at the cost of reduced accuracy in variance calculations. For probabilistic plan optimization, we were able to approximate the necessary pre-computations within seconds, yielding treatment plans of similar quality as gained from exact uncertainty propagation. APM is suited to enhance the trade-off between speed and accuracy in uncertainty propagation and probabilistic treatment plan optimization, especially in the context of fractionation. This brings fully-fledged APM computations within reach of clinical application.
Wahl, N; Hennig, P; Wieser, H P; Bangert, M
2017-06-26
The sensitivity of intensity-modulated proton therapy (IMPT) treatment plans to uncertainties can be quantified and mitigated with robust/min-max and stochastic/probabilistic treatment analysis and optimization techniques. Those methods usually rely on sparse random, importance, or worst-case sampling. Inevitably, this imposes a trade-off between computational speed and accuracy of the uncertainty propagation. Here, we investigate analytical probabilistic modeling (APM) as an alternative for uncertainty propagation and minimization in IMPT that does not rely on scenario sampling. APM propagates probability distributions over range and setup uncertainties via a Gaussian pencil-beam approximation into moments of the probability distributions over the resulting dose in closed form. It supports arbitrary correlation models and allows for efficient incorporation of fractionation effects regarding random and systematic errors. We evaluate the trade-off between run-time and accuracy of APM uncertainty computations on three patient datasets. Results are compared against reference computations facilitating importance and random sampling. Two approximation techniques to accelerate uncertainty propagation and minimization based on probabilistic treatment plan optimization are presented. Runtimes are measured on CPU and GPU platforms, dosimetric accuracy is quantified in comparison to a sampling-based benchmark (5000 random samples). APM accurately propagates range and setup uncertainties into dose uncertainties at competitive run-times (GPU [Formula: see text] min). The resulting standard deviation (expectation value) of dose show average global [Formula: see text] pass rates between 94.2% and 99.9% (98.4% and 100.0%). All investigated importance sampling strategies provided less accuracy at higher run-times considering only a single fraction. Considering fractionation, APM uncertainty propagation and treatment plan optimization was proven to be possible at constant time complexity, while run-times of sampling-based computations are linear in the number of fractions. Using sum sampling within APM, uncertainty propagation can only be accelerated at the cost of reduced accuracy in variance calculations. For probabilistic plan optimization, we were able to approximate the necessary pre-computations within seconds, yielding treatment plans of similar quality as gained from exact uncertainty propagation. APM is suited to enhance the trade-off between speed and accuracy in uncertainty propagation and probabilistic treatment plan optimization, especially in the context of fractionation. This brings fully-fledged APM computations within reach of clinical application.
Le Boedec, Kevin
2016-12-01
According to international guidelines, parametric methods must be chosen for RI construction when the sample size is small and the distribution is Gaussian. However, normality tests may not be accurate at small sample size. The purpose of the study was to evaluate normality test performance to properly identify samples extracted from a Gaussian population at small sample sizes, and assess the consequences on RI accuracy of applying parametric methods to samples that falsely identified the parent population as Gaussian. Samples of n = 60 and n = 30 values were randomly selected 100 times from simulated Gaussian, lognormal, and asymmetric populations of 10,000 values. The sensitivity and specificity of 4 normality tests were compared. Reference intervals were calculated using 6 different statistical methods from samples that falsely identified the parent population as Gaussian, and their accuracy was compared. Shapiro-Wilk and D'Agostino-Pearson tests were the best performing normality tests. However, their specificity was poor at sample size n = 30 (specificity for P < .05: .51 and .50, respectively). The best significance levels identified when n = 30 were 0.19 for Shapiro-Wilk test and 0.18 for D'Agostino-Pearson test. Using parametric methods on samples extracted from a lognormal population but falsely identified as Gaussian led to clinically relevant inaccuracies. At small sample size, normality tests may lead to erroneous use of parametric methods to build RI. Using nonparametric methods (or alternatively Box-Cox transformation) on all samples regardless of their distribution or adjusting, the significance level of normality tests depending on sample size would limit the risk of constructing inaccurate RI. © 2016 American Society for Veterinary Clinical Pathology.
NASA Astrophysics Data System (ADS)
Rana, Parvez; Vauhkonen, Jari; Junttila, Virpi; Hou, Zhengyang; Gautam, Basanta; Cawkwell, Fiona; Tokola, Timo
2017-12-01
Large-diameter trees (taking DBH > 30 cm to define large trees) dominate the dynamics, function and structure of a forest ecosystem. The aim here was to employ sparse airborne laser scanning (ALS) data with a mean point density of 0.8 m-2 and the non-parametric k-most similar neighbour (k-MSN) to predict tree diameter at breast height (DBH) distributions in a subtropical forest in southern Nepal. The specific objectives were: (1) to evaluate the accuracy of the large-tree fraction of the diameter distribution; and (2) to assess the effect of the number of training areas (sample size, n) on the accuracy of the predicted tree diameter distribution. Comparison of the predicted distributions with empirical ones indicated that the large tree diameter distribution can be derived in a mixed species forest with a RMSE% of 66% and a bias% of -1.33%. It was also feasible to downsize the sample size without losing the interpretability capacity of the model. For large-diameter trees, even a reduction of half of the training plots (n = 250), giving a marginal increase in the RMSE% (1.12-1.97%) was reported compared with the original training plots (n = 500). To be consistent with these outcomes, the sample areas should capture the entire range of spatial and feature variability in order to reduce the occurrence of error.
NASA Astrophysics Data System (ADS)
Beecken, B. P.; Fossum, E. R.
1996-07-01
Standard statistical theory is used to calculate how the accuracy of a conversion-gain measurement depends on the number of samples. During the development of a theoretical basis for this calculation, a model is developed that predicts how the noise levels from different elements of an ideal detector array are distributed. The model can also be used to determine what dependence the accuracy of measured noise has on the size of the sample. These features have been confirmed by experiment, thus enhancing the credibility of the method for calculating the uncertainty of a measured conversion gain. detector-array uniformity, charge coupled device, active pixel sensor.
Effects of Sample Selection Bias on the Accuracy of Population Structure and Ancestry Inference
Shringarpure, Suyash; Xing, Eric P.
2014-01-01
Population stratification is an important task in genetic analyses. It provides information about the ancestry of individuals and can be an important confounder in genome-wide association studies. Public genotyping projects have made a large number of datasets available for study. However, practical constraints dictate that of a geographical/ethnic population, only a small number of individuals are genotyped. The resulting data are a sample from the entire population. If the distribution of sample sizes is not representative of the populations being sampled, the accuracy of population stratification analyses of the data could be affected. We attempt to understand the effect of biased sampling on the accuracy of population structure analysis and individual ancestry recovery. We examined two commonly used methods for analyses of such datasets, ADMIXTURE and EIGENSOFT, and found that the accuracy of recovery of population structure is affected to a large extent by the sample used for analysis and how representative it is of the underlying populations. Using simulated data and real genotype data from cattle, we show that sample selection bias can affect the results of population structure analyses. We develop a mathematical framework for sample selection bias in models for population structure and also proposed a correction for sample selection bias using auxiliary information about the sample. We demonstrate that such a correction is effective in practice using simulated and real data. PMID:24637351
Analysis of spatial distribution of land cover maps accuracy
NASA Astrophysics Data System (ADS)
Khatami, R.; Mountrakis, G.; Stehman, S. V.
2017-12-01
Land cover maps have become one of the most important products of remote sensing science. However, classification errors will exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map. These variations of accuracy will affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach of map accuracy assessment based on an error matrix does not capture the spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. Incorporation of spectral domain as explanatory feature spaces of classification accuracy interpolation was done for the first time in this research. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater. Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.
Accuracy or precision: Implications of sample design and methodology on abundance estimation
Kowalewski, Lucas K.; Chizinski, Christopher J.; Powell, Larkin A.; Pope, Kevin L.; Pegg, Mark A.
2015-01-01
Sampling by spatially replicated counts (point-count) is an increasingly popular method of estimating population size of organisms. Challenges exist when sampling by point-count method, and it is often impractical to sample entire area of interest and impossible to detect every individual present. Ecologists encounter logistical limitations that force them to sample either few large-sample units or many small sample-units, introducing biases to sample counts. We generated a computer environment and simulated sampling scenarios to test the role of number of samples, sample unit area, number of organisms, and distribution of organisms in the estimation of population sizes using N-mixture models. Many sample units of small area provided estimates that were consistently closer to true abundance than sample scenarios with few sample units of large area. However, sample scenarios with few sample units of large area provided more precise abundance estimates than abundance estimates derived from sample scenarios with many sample units of small area. It is important to consider accuracy and precision of abundance estimates during the sample design process with study goals and objectives fully recognized, although and with consequence, consideration of accuracy and precision of abundance estimates is often an afterthought that occurs during the data analysis process.
Astronomic Position Accuracy Capability Study.
1979-10-01
sample of 37 FOs produced coefficients of skewness and excess of -0.21 and -0.28, respectively. The kurtosis of 2.72 indicates a platykurtic ...The kurtosis of 2.72 indicates a platykurtic distribution. The KS Test for Goodness of Fit was used to verify or refute that this sample is from a
Ding, Qian; Wang, Yong; Zhuang, Dafang
2018-04-15
The appropriate spatial interpolation methods must be selected to analyze the spatial distributions of Potentially Toxic Elements (PTEs), which is a precondition for evaluating PTE pollution. The accuracy and effect of different spatial interpolation methods, which include inverse distance weighting interpolation (IDW) (power = 1, 2, 3), radial basis function interpolation (RBF) (basis function: thin-plate spline (TPS), spline with tension (ST), completely regularized spline (CRS), multiquadric (MQ) and inverse multiquadric (IMQ)) and ordinary kriging interpolation (OK) (semivariogram model: spherical, exponential, gaussian and linear), were compared using 166 unevenly distributed soil PTE samples (As, Pb, Cu and Zn) in the Suxian District, Chenzhou City, Hunan Province as the study subject. The reasons for the accuracy differences of the interpolation methods and the uncertainties of the interpolation results are discussed, then several suggestions for improving the interpolation accuracy are proposed, and the direction of pollution control is determined. The results of this study are as follows: (i) RBF-ST and OK (exponential) are the optimal interpolation methods for As and Cu, and the optimal interpolation method for Pb and Zn is RBF-IMQ. (ii) The interpolation uncertainty is positively correlated with the PTE concentration, and higher uncertainties are primarily distributed around mines, which is related to the strong spatial variability of PTE concentrations caused by human interference. (iii) The interpolation accuracy can be improved by increasing the sample size around the mines, introducing auxiliary variables in the case of incomplete sampling and adopting the partition prediction method. (iv) It is necessary to strengthen the prevention and control of As and Pb pollution, particularly in the central and northern areas. The results of this study can provide an effective reference for the optimization of interpolation methods and parameters for unevenly distributed soil PTE data in mining areas. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Furuta, T.; Maeyama, T.; Ishikawa, K. L.; Fukunishi, N.; Fukasaku, K.; Takagi, S.; Noda, S.; Himeno, R.; Hayashi, S.
2015-08-01
In this research, we used a 135 MeV/nucleon carbon-ion beam to irradiate a biological sample composed of fresh chicken meat and bones, which was placed in front of a PAGAT gel dosimeter, and compared the measured and simulated transverse-relaxation-rate (R2) distributions in the gel dosimeter. We experimentally measured the three-dimensional R2 distribution, which records the dose induced by particles penetrating the sample, by using magnetic resonance imaging. The obtained R2 distribution reflected the heterogeneity of the biological sample. We also conducted Monte Carlo simulations using the PHITS code by reconstructing the elemental composition of the biological sample from its computed tomography images while taking into account the dependence of the gel response on the linear energy transfer. The simulation reproduced the experimental distal edge structure of the R2 distribution with an accuracy under about 2 mm, which is approximately the same as the voxel size currently used in treatment planning.
Furuta, T; Maeyama, T; Ishikawa, K L; Fukunishi, N; Fukasaku, K; Takagi, S; Noda, S; Himeno, R; Hayashi, S
2015-08-21
In this research, we used a 135 MeV/nucleon carbon-ion beam to irradiate a biological sample composed of fresh chicken meat and bones, which was placed in front of a PAGAT gel dosimeter, and compared the measured and simulated transverse-relaxation-rate (R2) distributions in the gel dosimeter. We experimentally measured the three-dimensional R2 distribution, which records the dose induced by particles penetrating the sample, by using magnetic resonance imaging. The obtained R2 distribution reflected the heterogeneity of the biological sample. We also conducted Monte Carlo simulations using the PHITS code by reconstructing the elemental composition of the biological sample from its computed tomography images while taking into account the dependence of the gel response on the linear energy transfer. The simulation reproduced the experimental distal edge structure of the R2 distribution with an accuracy under about 2 mm, which is approximately the same as the voxel size currently used in treatment planning.
Optimizing the Terzaghi Estimator of the 3D Distribution of Rock Fracture Orientations
NASA Astrophysics Data System (ADS)
Tang, Huiming; Huang, Lei; Juang, C. Hsein; Zhang, Junrong
2017-08-01
Orientation statistics are prone to bias when surveyed with the scanline mapping technique in which the observed probabilities differ, depending on the intersection angle between the fracture and the scanline. This bias leads to 1D frequency statistical data that are poorly representative of the 3D distribution. A widely accessible estimator named after Terzaghi was developed to estimate 3D frequencies from 1D biased observations, but the estimation accuracy is limited for fractures at narrow intersection angles to scanlines (termed the blind zone). Although numerous works have concentrated on accuracy with respect to the blind zone, accuracy outside the blind zone has rarely been studied. This work contributes to the limited investigations of accuracy outside the blind zone through a qualitative assessment that deploys a mathematical derivation of the Terzaghi equation in conjunction with a quantitative evaluation that uses fractures simulation and verification of natural fractures. The results show that the estimator does not provide a precise estimate of 3D distributions and that the estimation accuracy is correlated with the grid size adopted by the estimator. To explore the potential for improving accuracy, the particular grid size producing maximum accuracy is identified from 168 combinations of grid sizes and two other parameters. The results demonstrate that the 2° × 2° grid size provides maximum accuracy for the estimator in most cases when applied outside the blind zone. However, if the global sample density exceeds 0.5°-2, then maximum accuracy occurs at a grid size of 1° × 1°.
NASA Astrophysics Data System (ADS)
Huang, Guoqin; Zhang, Meiqin; Huang, Hui; Guo, Hua; Xu, Xipeng
2018-04-01
Circular sawing is an important method for the processing of natural stone. The ability to predict sawing power is important in the optimisation, monitoring and control of the sawing process. In this paper, a predictive model (PFD) of sawing power, which is based on the tangential force distribution at the sawing contact zone, was proposed, experimentally validated and modified. With regard to the influence of sawing speed on tangential force distribution, the modified PFD (MPFD) performed with high predictive accuracy across a wide range of sawing parameters, including sawing speed. The mean maximum absolute error rate was within 6.78%, and the maximum absolute error rate was within 11.7%. The practicability of predicting sawing power by the MPFD with few initial experimental samples was proved in case studies. On the premise of high sample measurement accuracy, only two samples are required for a fixed sawing speed. The feasibility of applying the MPFD to optimise sawing parameters while lowering the energy consumption of the sawing system was validated. The case study shows that energy use was reduced 28% by optimising the sawing parameters. The MPFD model can be used to predict sawing power, optimise sawing parameters and control energy.
Design of Malaria Diagnostic Criteria for the Sysmex XE-2100 Hematology Analyzer
Campuzano-Zuluaga, Germán; Álvarez-Sánchez, Gonzalo; Escobar-Gallo, Gloria Elcy; Valencia-Zuluaga, Luz Marina; Ríos-Orrego, Alexandra Marcela; Pabón-Vidal, Adriana; Miranda-Arboleda, Andrés Felipe; Blair-Trujillo, Silvia; Campuzano-Maya, Germán
2010-01-01
Thick film, the standard diagnostic procedure for malaria, is not always ordered promptly. A failsafe diagnostic strategy using an XE-2100 analyzer is proposed, and for this strategy, malaria diagnostic models for the XE-2100 were developed and tested for accuracy. Two hundred eighty-one samples were distributed into Plasmodium vivax, P. falciparum, and acute febrile syndrome groups for model construction. Model validation was performed using 60% of malaria cases and a composite control group of samples from AFS and healthy participants from endemic and non-endemic regions. For P. vivax, two observer-dependent models (accuracy = 95.3–96.9%), one non–observer-dependent model using built-in variables (accuracy = 94.7%), and one non–observer-dependent model using new and built-in variables (accuracy = 96.8%) were developed. For P. falciparum, two non–observer-dependent models (accuracies = 85% and 89%) were developed. These models could be used by health personnel or be integrated as a malaria alarm for the XE-2100 to prompt early malaria microscopic diagnosis. PMID:20207864
Jia, Zhenyi; Zhou, Shenglu; Su, Quanlong; Yi, Haomin; Wang, Junxiao
2017-12-26
Soil pollution by metal(loid)s resulting from rapid economic development is a major concern. Accurately estimating the spatial distribution of soil metal(loid) pollution has great significance in preventing and controlling soil pollution. In this study, 126 topsoil samples were collected in Kunshan City and the geo-accumulation index was selected as a pollution index. We used Kriging interpolation and BP neural network methods to estimate the spatial distribution of arsenic (As) and cadmium (Cd) pollution in the study area. Additionally, we introduced a cross-validation method to measure the errors of the estimation results by the two interpolation methods and discussed the accuracy of the information contained in the estimation results. The conclusions are as follows: data distribution characteristics, spatial variability, and mean square errors (MSE) of the different methods showed large differences. Estimation results from BP neural network models have a higher accuracy, the MSE of As and Cd are 0.0661 and 0.1743, respectively. However, the interpolation results show significant skewed distribution, and spatial autocorrelation is strong. Using Kriging interpolation, the MSE of As and Cd are 0.0804 and 0.2983, respectively. The estimation results have poorer accuracy. Combining the two methods can improve the accuracy of the Kriging interpolation and more comprehensively represent the spatial distribution characteristics of metal(loid)s in regional soil. The study may provide a scientific basis and technical support for the regulation of soil metal(loid) pollution.
Franklin, J.; Wejnert, K.E.; Hathaway, S.A.; Rochester, C.J.; Fisher, R.N.
2009-01-01
Aim: Several studies have found that more accurate predictive models of species' occurrences can be developed for rarer species; however, one recent study found the relationship between range size and model performance to be an artefact of sample prevalence, that is, the proportion of presence versus absence observations in the data used to train the model. We examined the effect of model type, species rarity class, species' survey frequency, detectability and manipulated sample prevalence on the accuracy of distribution models developed for 30 reptile and amphibian species. Location: Coastal southern California, USA. Methods: Classification trees, generalized additive models and generalized linear models were developed using species presence and absence data from 420 locations. Model performance was measured using sensitivity, specificity and the area under the curve (AUC) of the receiver-operating characteristic (ROC) plot based on twofold cross-validation, or on bootstrapping. Predictors included climate, terrain, soil and vegetation variables. Species were assigned to rarity classes by experts. The data were sampled to generate subsets with varying ratios of presences and absences to test for the effect of sample prevalence. Join count statistics were used to characterize spatial dependence in the prediction errors. Results: Species in classes with higher rarity were more accurately predicted than common species, and this effect was independent of sample prevalence. Although positive spatial autocorrelation remained in the prediction errors, it was weaker than was observed in the species occurrence data. The differences in accuracy among model types were slight. Main conclusions: Using a variety of modelling methods, more accurate species distribution models were developed for rarer than for more common species. This was presumably because it is difficult to discriminate suitable from unsuitable habitat for habitat generalists, and not as an artefact of the effect of sample prevalence on model estimation. ?? 2008 The Authors.
New tools for evaluating LQAS survey designs
2014-01-01
Lot Quality Assurance Sampling (LQAS) surveys have become increasingly popular in global health care applications. Incorporating Bayesian ideas into LQAS survey design, such as using reasonable prior beliefs about the distribution of an indicator, can improve the selection of design parameters and decision rules. In this paper, a joint frequentist and Bayesian framework is proposed for evaluating LQAS classification accuracy and informing survey design parameters. Simple software tools are provided for calculating the positive and negative predictive value of a design with respect to an underlying coverage distribution and the selected design parameters. These tools are illustrated using a data example from two consecutive LQAS surveys measuring Oral Rehydration Solution (ORS) preparation. Using the survey tools, the dependence of classification accuracy on benchmark selection and the width of the ‘grey region’ are clarified in the context of ORS preparation across seven supervision areas. Following the completion of an LQAS survey, estimation of the distribution of coverage across areas facilitates quantifying classification accuracy and can help guide intervention decisions. PMID:24528928
New tools for evaluating LQAS survey designs.
Hund, Lauren
2014-02-15
Lot Quality Assurance Sampling (LQAS) surveys have become increasingly popular in global health care applications. Incorporating Bayesian ideas into LQAS survey design, such as using reasonable prior beliefs about the distribution of an indicator, can improve the selection of design parameters and decision rules. In this paper, a joint frequentist and Bayesian framework is proposed for evaluating LQAS classification accuracy and informing survey design parameters. Simple software tools are provided for calculating the positive and negative predictive value of a design with respect to an underlying coverage distribution and the selected design parameters. These tools are illustrated using a data example from two consecutive LQAS surveys measuring Oral Rehydration Solution (ORS) preparation. Using the survey tools, the dependence of classification accuracy on benchmark selection and the width of the 'grey region' are clarified in the context of ORS preparation across seven supervision areas. Following the completion of an LQAS survey, estimation of the distribution of coverage across areas facilitates quantifying classification accuracy and can help guide intervention decisions.
A nonvoxel-based dose convolution/superposition algorithm optimized for scalable GPU architectures.
Neylon, J; Sheng, K; Yu, V; Chen, Q; Low, D A; Kupelian, P; Santhanam, A
2014-10-01
Real-time adaptive planning and treatment has been infeasible due in part to its high computational complexity. There have been many recent efforts to utilize graphics processing units (GPUs) to accelerate the computational performance and dose accuracy in radiation therapy. Data structure and memory access patterns are the key GPU factors that determine the computational performance and accuracy. In this paper, the authors present a nonvoxel-based (NVB) approach to maximize computational and memory access efficiency and throughput on the GPU. The proposed algorithm employs a ray-tracing mechanism to restructure the 3D data sets computed from the CT anatomy into a nonvoxel-based framework. In a process that takes only a few milliseconds of computing time, the algorithm restructured the data sets by ray-tracing through precalculated CT volumes to realign the coordinate system along the convolution direction, as defined by zenithal and azimuthal angles. During the ray-tracing step, the data were resampled according to radial sampling and parallel ray-spacing parameters making the algorithm independent of the original CT resolution. The nonvoxel-based algorithm presented in this paper also demonstrated a trade-off in computational performance and dose accuracy for different coordinate system configurations. In order to find the best balance between the computed speedup and the accuracy, the authors employed an exhaustive parameter search on all sampling parameters that defined the coordinate system configuration: zenithal, azimuthal, and radial sampling of the convolution algorithm, as well as the parallel ray spacing during ray tracing. The angular sampling parameters were varied between 4 and 48 discrete angles, while both radial sampling and parallel ray spacing were varied from 0.5 to 10 mm. The gamma distribution analysis method (γ) was used to compare the dose distributions using 2% and 2 mm dose difference and distance-to-agreement criteria, respectively. Accuracy was investigated using three distinct phantoms with varied geometries and heterogeneities and on a series of 14 segmented lung CT data sets. Performance gains were calculated using three 256 mm cube homogenous water phantoms, with isotropic voxel dimensions of 1, 2, and 4 mm. The nonvoxel-based GPU algorithm was independent of the data size and provided significant computational gains over the CPU algorithm for large CT data sizes. The parameter search analysis also showed that the ray combination of 8 zenithal and 8 azimuthal angles along with 1 mm radial sampling and 2 mm parallel ray spacing maintained dose accuracy with greater than 99% of voxels passing the γ test. Combining the acceleration obtained from GPU parallelization with the sampling optimization, the authors achieved a total performance improvement factor of >175 000 when compared to our voxel-based ground truth CPU benchmark and a factor of 20 compared with a voxel-based GPU dose convolution method. The nonvoxel-based convolution method yielded substantial performance improvements over a generic GPU implementation, while maintaining accuracy as compared to a CPU computed ground truth dose distribution. Such an algorithm can be a key contribution toward developing tools for adaptive radiation therapy systems.
A nonvoxel-based dose convolution/superposition algorithm optimized for scalable GPU architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Neylon, J., E-mail: jneylon@mednet.ucla.edu; Sheng, K.; Yu, V.
Purpose: Real-time adaptive planning and treatment has been infeasible due in part to its high computational complexity. There have been many recent efforts to utilize graphics processing units (GPUs) to accelerate the computational performance and dose accuracy in radiation therapy. Data structure and memory access patterns are the key GPU factors that determine the computational performance and accuracy. In this paper, the authors present a nonvoxel-based (NVB) approach to maximize computational and memory access efficiency and throughput on the GPU. Methods: The proposed algorithm employs a ray-tracing mechanism to restructure the 3D data sets computed from the CT anatomy intomore » a nonvoxel-based framework. In a process that takes only a few milliseconds of computing time, the algorithm restructured the data sets by ray-tracing through precalculated CT volumes to realign the coordinate system along the convolution direction, as defined by zenithal and azimuthal angles. During the ray-tracing step, the data were resampled according to radial sampling and parallel ray-spacing parameters making the algorithm independent of the original CT resolution. The nonvoxel-based algorithm presented in this paper also demonstrated a trade-off in computational performance and dose accuracy for different coordinate system configurations. In order to find the best balance between the computed speedup and the accuracy, the authors employed an exhaustive parameter search on all sampling parameters that defined the coordinate system configuration: zenithal, azimuthal, and radial sampling of the convolution algorithm, as well as the parallel ray spacing during ray tracing. The angular sampling parameters were varied between 4 and 48 discrete angles, while both radial sampling and parallel ray spacing were varied from 0.5 to 10 mm. The gamma distribution analysis method (γ) was used to compare the dose distributions using 2% and 2 mm dose difference and distance-to-agreement criteria, respectively. Accuracy was investigated using three distinct phantoms with varied geometries and heterogeneities and on a series of 14 segmented lung CT data sets. Performance gains were calculated using three 256 mm cube homogenous water phantoms, with isotropic voxel dimensions of 1, 2, and 4 mm. Results: The nonvoxel-based GPU algorithm was independent of the data size and provided significant computational gains over the CPU algorithm for large CT data sizes. The parameter search analysis also showed that the ray combination of 8 zenithal and 8 azimuthal angles along with 1 mm radial sampling and 2 mm parallel ray spacing maintained dose accuracy with greater than 99% of voxels passing the γ test. Combining the acceleration obtained from GPU parallelization with the sampling optimization, the authors achieved a total performance improvement factor of >175 000 when compared to our voxel-based ground truth CPU benchmark and a factor of 20 compared with a voxel-based GPU dose convolution method. Conclusions: The nonvoxel-based convolution method yielded substantial performance improvements over a generic GPU implementation, while maintaining accuracy as compared to a CPU computed ground truth dose distribution. Such an algorithm can be a key contribution toward developing tools for adaptive radiation therapy systems.« less
Scheid, Anika; Nebel, Markus E
2012-07-09
Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case - without sacrificing much of the accuracy of the results. Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms.
2012-01-01
Background Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. Results In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case – without sacrificing much of the accuracy of the results. Conclusions Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms. PMID:22776037
The Estimation of Tree Posterior Probabilities Using Conditional Clade Probability Distributions
Larget, Bret
2013-01-01
In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample. [Bayesian phylogenetics; conditional clade distributions; improved accuracy; posterior probabilities of trees.] PMID:23479066
Sun, Chenglu; Li, Wei; Chen, Wei
2017-01-01
For extracting the pressure distribution image and respiratory waveform unobtrusively and comfortably, we proposed a smart mat which utilized a flexible pressure sensor array, printed electrodes and novel soft seven-layer structure to monitor those physiological information. However, in order to obtain high-resolution pressure distribution and more accurate respiratory waveform, it needs more time to acquire the pressure signal of all the pressure sensors embedded in the smart mat. In order to reduce the sampling time while keeping the same resolution and accuracy, a novel method based on compressed sensing (CS) theory was proposed. By utilizing the CS based method, 40% of the sampling time can be decreased by means of acquiring nearly one-third of original sampling points. Then several experiments were carried out to validate the performance of the CS based method. While less than one-third of original sampling points were measured, the correlation degree coefficient between reconstructed respiratory waveform and original waveform can achieve 0.9078, and the accuracy of the respiratory rate (RR) extracted from the reconstructed respiratory waveform can reach 95.54%. The experimental results demonstrated that the novel method can fit the high resolution smart mat system and be a viable option for reducing the sampling time of the pressure sensor array. PMID:28796188
NASA Technical Reports Server (NTRS)
1981-01-01
The locations of total ozone stations and of stratospheric ozone samplings were presented. The samplings are concentrated in three areas: Japan, Europe, and India. Approximately 75% of the total ozone measurements are made with Dobson instruments which offer the best international measurements. When well calibrated their accuracy is on the order of a few percent. It is found that although the total ozone percent is similar in both hemispheres, the northern hemisphere has 3 to 10% more ozone than the southern hemisphere. The close association between total ozone distribution and pressure distribution in the atmosphere is noted.
Optimal predictions in everyday cognition: the wisdom of individuals or crowds?
Mozer, Michael C; Pashler, Harold; Homaei, Hadjar
2008-10-01
Griffiths and Tenenbaum (2006) asked individuals to make predictions about the duration or extent of everyday events (e.g., cake baking times), and reported that predictions were optimal, employing Bayesian inference based on veridical prior distributions. Although the predictions conformed strikingly to statistics of the world, they reflect averages over many individuals. On the conjecture that the accuracy of the group response is chiefly a consequence of aggregating across individuals, we constructed simple, heuristic approximations to the Bayesian model premised on the hypothesis that individuals have access merely to a sample of k instances drawn from the relevant distribution. The accuracy of the group response reported by Griffiths and Tenenbaum could be accounted for by supposing that individuals each utilize only two instances. Moreover, the variability of the group data is more consistent with this small-sample hypothesis than with the hypothesis that people utilize veridical or nearly veridical representations of the underlying prior distributions. Our analyses lead to a qualitatively different view of how individuals reason from past experience than the view espoused by Griffiths and Tenenbaum. 2008 Cognitive Science Society, Inc.
Effects of sample size on KERNEL home range estimates
Seaman, D.E.; Millspaugh, J.J.; Kernohan, Brian J.; Brundige, Gary C.; Raedeke, Kenneth J.; Gitzen, Robert A.
1999-01-01
Kernel methods for estimating home range are being used increasingly in wildlife research, but the effect of sample size on their accuracy is not known. We used computer simulations of 10-200 points/home range and compared accuracy of home range estimates produced by fixed and adaptive kernels with the reference (REF) and least-squares cross-validation (LSCV) methods for determining the amount of smoothing. Simulated home ranges varied from simple to complex shapes created by mixing bivariate normal distributions. We used the size of the 95% home range area and the relative mean squared error of the surface fit to assess the accuracy of the kernel home range estimates. For both measures, the bias and variance approached an asymptote at about 50 observations/home range. The fixed kernel with smoothing selected by LSCV provided the least-biased estimates of the 95% home range area. All kernel methods produced similar surface fit for most simulations, but the fixed kernel with LSCV had the lowest frequency and magnitude of very poor estimates. We reviewed 101 papers published in The Journal of Wildlife Management (JWM) between 1980 and 1997 that estimated animal home ranges. A minority of these papers used nonparametric utilization distribution (UD) estimators, and most did not adequately report sample sizes. We recommend that home range studies using kernel estimates use LSCV to determine the amount of smoothing, obtain a minimum of 30 observations per animal (but preferably a?Y50), and report sample sizes in published results.
Zhou, Shenglu; Su, Quanlong; Yi, Haomin
2017-01-01
Soil pollution by metal(loid)s resulting from rapid economic development is a major concern. Accurately estimating the spatial distribution of soil metal(loid) pollution has great significance in preventing and controlling soil pollution. In this study, 126 topsoil samples were collected in Kunshan City and the geo-accumulation index was selected as a pollution index. We used Kriging interpolation and BP neural network methods to estimate the spatial distribution of arsenic (As) and cadmium (Cd) pollution in the study area. Additionally, we introduced a cross-validation method to measure the errors of the estimation results by the two interpolation methods and discussed the accuracy of the information contained in the estimation results. The conclusions are as follows: data distribution characteristics, spatial variability, and mean square errors (MSE) of the different methods showed large differences. Estimation results from BP neural network models have a higher accuracy, the MSE of As and Cd are 0.0661 and 0.1743, respectively. However, the interpolation results show significant skewed distribution, and spatial autocorrelation is strong. Using Kriging interpolation, the MSE of As and Cd are 0.0804 and 0.2983, respectively. The estimation results have poorer accuracy. Combining the two methods can improve the accuracy of the Kriging interpolation and more comprehensively represent the spatial distribution characteristics of metal(loid)s in regional soil. The study may provide a scientific basis and technical support for the regulation of soil metal(loid) pollution. PMID:29278363
Sworn testimony of the model evidence: Gaussian Mixture Importance (GAME) sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-07-01
What is the "best" model? The answer to this question lies in part in the eyes of the beholder, nevertheless a good model must blend rigorous theory with redeeming qualities such as parsimony and quality of fit. Model selection is used to make inferences, via weighted averaging, from a set of K candidate models, Mk; k=>(1,…,K>), and help identify which model is most supported by the observed data, Y>˜=>(y˜1,…,y˜n>). Here, we introduce a new and robust estimator of the model evidence, p>(Y>˜|Mk>), which acts as normalizing constant in the denominator of Bayes' theorem and provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. However, p>(Y>˜|Mk>) is analytically intractable for most practical modeling problems. Our method, coined GAussian Mixture importancE (GAME) sampling, uses bridge sampling of a mixture distribution fitted to samples of the posterior model parameter distribution derived from MCMC simulation. We benchmark the accuracy and reliability of GAME sampling by application to a diverse set of multivariate target distributions (up to 100 dimensions) with known values of p>(Y>˜|Mk>) and to hypothesis testing using numerical modeling of the rainfall-runoff transformation of the Leaf River watershed in Mississippi, USA. These case studies demonstrate that GAME sampling provides robust and unbiased estimates of the evidence at a relatively small computational cost outperforming commonly used estimators. The GAME sampler is implemented in the MATLAB package of DREAM and simplifies considerably scientific inquiry through hypothesis testing and model selection.
Normative Data on Audiovisual Speech Integration Using Sentence Recognition and Capacity Measures
Altieri, Nicholas; Hudock, Daniel
2016-01-01
Objective The ability to use visual speech cues and integrate them with auditory information is important, especially in noisy environments and for hearing-impaired (HI) listeners. Providing data on measures of integration skills that encompass accuracy and processing speed will benefit researchers and clinicians. Design The study consisted of two experiments: First, accuracy scores were obtained using CUNY sentences, and capacity measures that assessed reaction-time distributions were obtained from a monosyllabic word recognition task. Study Sample We report data on two measures of integration obtained from a sample comprised of 86 young and middle-age adult listeners: Results To summarize our results, capacity showed a positive correlation with accuracy measures of audiovisual benefit obtained from sentence recognition. More relevant, factor analysis indicated that a single-factor model captured audiovisual speech integration better than models containing more factors. Capacity exhibited strong loadings on the factor, while the accuracy-based measures from sentence recognition exhibited weaker loadings. Conclusions Results suggest that a listener’s integration skills may be assessed optimally using a measure that incorporates both processing speed and accuracy. PMID:26853446
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stanley, B.J.; Guiochon, G.
1994-11-01
Adsorption energy distributions (AEDs) are calculated from the classical, fundamental integral equation of adsorption using adsorption isotherms and the expectation-maximization method of parameter estimation. The adsorption isotherms are calculated from nonlinear elution profiles obtained from gas chromatographic data using the characteristic points method of finite concentration chromatography. Porous layer open tubular capillary columns are used to support the adsorbent. The performance of these columns is compared to that of packed columns in terms of their ability to supply accurate isotherm data and AEDs. The effect of the finite column efficiency and the limited loading factor on the accuracy of themore » estimated energy distributions is presented. This accuracy decreases with decreasing efficiency, and approximately 5000 theoretical plates are needed when the loading factor, L[sub f], equals 0.56 for sampling of a unimodal Gaussian distribution. Increasing L[sub f] further increases the contribution of finite efficiency to the AED and causes a divergence at the low-energy endpoint if too high. This occurs as the retention time approaches the holdup time. Data are presented for diethyl ether adsorption on porous silica and its C-18-bonded derivative. 36 refs., 8 figs., 2 tabs.« less
Influence of item distribution pattern and abundance on efficiency of benthic core sampling
Behney, Adam C.; O'Shaughnessy, Ryan; Eichholz, Michael W.; Stafford, Joshua D.
2014-01-01
ore sampling is a commonly used method to estimate benthic item density, but little information exists about factors influencing the accuracy and time-efficiency of this method. We simulated core sampling in a Geographic Information System framework by generating points (benthic items) and polygons (core samplers) to assess how sample size (number of core samples), core sampler size (cm2), distribution of benthic items, and item density affected the bias and precision of estimates of density, the detection probability of items, and the time-costs. When items were distributed randomly versus clumped, bias decreased and precision increased with increasing sample size and increased slightly with increasing core sampler size. Bias and precision were only affected by benthic item density at very low values (500–1,000 items/m2). Detection probability (the probability of capturing ≥ 1 item in a core sample if it is available for sampling) was substantially greater when items were distributed randomly as opposed to clumped. Taking more small diameter core samples was always more time-efficient than taking fewer large diameter samples. We are unable to present a single, optimal sample size, but provide information for researchers and managers to derive optimal sample sizes dependent on their research goals and environmental conditions.
System and method for high precision isotope ratio destructive analysis
Bushaw, Bruce A; Anheier, Norman C; Phillips, Jon R
2013-07-02
A system and process are disclosed that provide high accuracy and high precision destructive analysis measurements for isotope ratio determination of relative isotope abundance distributions in liquids, solids, and particulate samples. The invention utilizes a collinear probe beam to interrogate a laser ablated plume. This invention provides enhanced single-shot detection sensitivity approaching the femtogram range, and isotope ratios that can be determined at approximately 1% or better precision and accuracy (relative standard deviation).
Luo, Shezhou; Chen, Jing M; Wang, Cheng; Xi, Xiaohuan; Zeng, Hongcheng; Peng, Dailiang; Li, Dong
2016-05-30
Vegetation leaf area index (LAI), height, and aboveground biomass are key biophysical parameters. Corn is an important and globally distributed crop, and reliable estimations of these parameters are essential for corn yield forecasting, health monitoring and ecosystem modeling. Light Detection and Ranging (LiDAR) is considered an effective technology for estimating vegetation biophysical parameters. However, the estimation accuracies of these parameters are affected by multiple factors. In this study, we first estimated corn LAI, height and biomass (R2 = 0.80, 0.874 and 0.838, respectively) using the original LiDAR data (7.32 points/m2), and the results showed that LiDAR data could accurately estimate these biophysical parameters. Second, comprehensive research was conducted on the effects of LiDAR point density, sampling size and height threshold on the estimation accuracy of LAI, height and biomass. Our findings indicated that LiDAR point density had an important effect on the estimation accuracy for vegetation biophysical parameters, however, high point density did not always produce highly accurate estimates, and reduced point density could deliver reasonable estimation results. Furthermore, the results showed that sampling size and height threshold were additional key factors that affect the estimation accuracy of biophysical parameters. Therefore, the optimal sampling size and the height threshold should be determined to improve the estimation accuracy of biophysical parameters. Our results also implied that a higher LiDAR point density, larger sampling size and height threshold were required to obtain accurate corn LAI estimation when compared with height and biomass estimations. In general, our results provide valuable guidance for LiDAR data acquisition and estimation of vegetation biophysical parameters using LiDAR data.
Scanning fiber angle-resolved low coherence interferometry
Zhu, Yizheng; Terry, Neil G.; Wax, Adam
2010-01-01
We present a fiber-optic probe for Fourier-domain angle-resolved low coherence interferometry for the determination of depth-resolved scatterer size. The probe employs a scanning single-mode fiber to collect the angular scattering distribution of the sample, which is analyzed using the Mie theory to obtain the average size of the scatterers. Depth sectioning is achieved with low coherence Mach–Zehnder interferometry. In the sample arm of the interferometer, a fixed fiber illuminates the sample through an imaging lens and a collection fiber samples the backscattered angular distribution by scanning across the Fourier plane image of the sample. We characterize the optical performance of the probe and demonstrate the ability to execute depth-resolved sizing with subwavelength accuracy by using a double-layer phantom containing two sizes of polystyrene microspheres. PMID:19838271
Austin, Peter C; Steyerberg, Ewout W
2012-06-20
When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examine the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in combined sample of those with and without the condition. Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population.
A Dirichlet-Multinomial Bayes Classifier for Disease Diagnosis with Microbial Compositions.
Gao, Xiang; Lin, Huaiying; Dong, Qunfeng
2017-01-01
Dysbiosis of microbial communities is associated with various human diseases, raising the possibility of using microbial compositions as biomarkers for disease diagnosis. We have developed a Bayes classifier by modeling microbial compositions with Dirichlet-multinomial distributions, which are widely used to model multicategorical count data with extra variation. The parameters of the Dirichlet-multinomial distributions are estimated from training microbiome data sets based on maximum likelihood. The posterior probability of a microbiome sample belonging to a disease or healthy category is calculated based on Bayes' theorem, using the likelihood values computed from the estimated Dirichlet-multinomial distribution, as well as a prior probability estimated from the training microbiome data set or previously published information on disease prevalence. When tested on real-world microbiome data sets, our method, called DMBC (for Dirichlet-multinomial Bayes classifier), shows better classification accuracy than the only existing Bayesian microbiome classifier based on a Dirichlet-multinomial mixture model and the popular random forest method. The advantage of DMBC is its built-in automatic feature selection, capable of identifying a subset of microbial taxa with the best classification accuracy between different classes of samples based on cross-validation. This unique ability enables DMBC to maintain and even improve its accuracy at modeling species-level taxa. The R package for DMBC is freely available at https://github.com/qunfengdong/DMBC. IMPORTANCE By incorporating prior information on disease prevalence, Bayes classifiers have the potential to estimate disease probability better than other common machine-learning methods. Thus, it is important to develop Bayes classifiers specifically tailored for microbiome data. Our method shows higher classification accuracy than the only existing Bayesian classifier and the popular random forest method, and thus provides an alternative option for using microbial compositions for disease diagnosis.
Drift correction of the dissolved signal in single particle ICPMS.
Cornelis, Geert; Rauch, Sebastien
2016-07-01
A method is presented where drift, the random fluctuation of the signal intensity, is compensated for based on the estimation of the drift function by a moving average. It was shown using single particle ICPMS (spICPMS) measurements of 10 and 60 nm Au NPs that drift reduces accuracy of spICPMS analysis at the calibration stage and during calculations of the particle size distribution (PSD), but that the present method can again correct the average signal intensity as well as the signal distribution of particle-containing samples skewed by drift. Moreover, deconvolution, a method that models signal distributions of dissolved signals, fails in some cases when using standards and samples affected by drift, but the present method was shown to improve accuracy again. Relatively high particle signals have to be removed prior to drift correction in this procedure, which was done using a 3 × sigma method, and the signals are treated separately and added again. The method can also correct for flicker noise that increases when signal intensity is increased because of drift. The accuracy was improved in many cases when flicker correction was used, but when accurate results were obtained despite drift, the correction procedures did not reduce accuracy. The procedure may be useful to extract results from experimental runs that would otherwise have to be run again. Graphical Abstract A method is presented where a spICP-MS signal affected by drift (left) is corrected (right) by adjusting the local (moving) averages (green) and standard deviations (purple) to the respective values at a reference time (red). In combination with removing particle events (blue) in the case of calibration standards, this method is shown to obtain particle size distributions where that would otherwise be impossible, even when the deconvolution method is used to discriminate dissolved and particle signals.
NASA Astrophysics Data System (ADS)
Mo, S.; Lu, D.; Shi, X.; Zhang, G.; Ye, M.; Wu, J.
2016-12-01
Surrogate models have shown remarkable computational efficiency in hydrological simulations involving design space exploration, sensitivity analysis, uncertainty quantification, etc. The central task of constructing a global surrogate models is to achieve a prescribed approximation accuracy with as few original model executions as possible, which requires a good design strategy to optimize the distribution of data points in the parameter domains and an effective stopping criterion to automatically terminate the design process when desired approximation accuracy is achieved. This study proposes a novel adaptive sampling strategy, which starts from a small number of initial samples and adaptively selects additional samples by balancing the collection in unexplored regions and refinement in interesting areas. We define an efficient and effective evaluation metric basing on Taylor expansion to select the most promising potential samples from candidate points, and propose a robust stopping criterion basing on the approximation accuracy at new points to guarantee the achievement of desired accuracy. The numerical results of several benchmark analytical functions indicate that the proposed approach is more computationally efficient and robust than the widely used maximin distance design and two other well-known adaptive sampling strategies. The application to two complicated multiphase flow problems further demonstrates the efficiency and effectiveness of our method in constructing global surrogate models for high-dimensional and highly nonlinear problems. Acknowledgements: This work was financially supported by the National Nature Science Foundation of China grants No. 41030746 and 41172206.
Coalescence computations for large samples drawn from populations of time-varying sizes
Polanski, Andrzej; Szczesna, Agnieszka; Garbulowski, Mateusz; Kimmel, Marek
2017-01-01
We present new results concerning probability distributions of times in the coalescence tree and expected allele frequencies for coalescent with large sample size. The obtained results are based on computational methodologies, which involve combining coalescence time scale changes with techniques of integral transformations and using analytical formulae for infinite products. We show applications of the proposed methodologies for computing probability distributions of times in the coalescence tree and their limits, for evaluation of accuracy of approximate expressions for times in the coalescence tree and expected allele frequencies, and for analysis of large human mitochondrial DNA dataset. PMID:28170404
Thompson, Nicola D; Edwards, Jonathan R; Bamberg, Wendy; Beldavs, Zintars G; Dumyati, Ghinwa; Godine, Deborah; Maloney, Meghan; Kainer, Marion; Ray, Susan; Thompson, Deborah; Wilson, Lucy; Magill, Shelley S
2013-03-01
To evaluate the accuracy of weekly sampling of central line-associated bloodstream infection (CLABSI) denominator data to estimate central line-days (CLDs). Obtained CLABSI denominator logs showing daily counts of patient-days and CLD for 6-12 consecutive months from participants and CLABSI numerators and facility and location characteristics from the National Healthcare Safety Network (NHSN). Convenience sample of 119 inpatient locations in 63 acute care facilities within 9 states participating in the Emerging Infections Program. Actual CLD and estimated CLD obtained from sampling denominator data on all single-day and 2-day (day-pair) samples were compared by assessing the distributions of the CLD percentage error. Facility and location characteristics associated with increased precision of estimated CLD were assessed. The impact of using estimated CLD to calculate CLABSI rates was evaluated by measuring the change in CLABSI decile ranking. The distribution of CLD percentage error varied by the day and number of days sampled. On average, day-pair samples provided more accurate estimates than did single-day samples. For several day-pair samples, approximately 90% of locations had CLD percentage error of less than or equal to ±5%. A lower number of CLD per month was most significantly associated with poor precision in estimated CLD. Most locations experienced no change in CLABSI decile ranking, and no location's CLABSI ranking changed by more than 2 deciles. Sampling to obtain estimated CLD is a valid alternative to daily data collection for a large proportion of locations. Development of a sampling guideline for NHSN users is underway.
Li, Der-Chiang; Hu, Susan C; Lin, Liang-Sian; Yeh, Chun-Wu
2017-01-01
It is difficult for learning models to achieve high classification performances with imbalanced data sets, because with imbalanced data sets, when one of the classes is much larger than the others, most machine learning and data mining classifiers are overly influenced by the larger classes and ignore the smaller ones. As a result, the classification algorithms often have poor learning performances due to slow convergence in the smaller classes. To balance such data sets, this paper presents a strategy that involves reducing the sizes of the majority data and generating synthetic samples for the minority data. In the reducing operation, we use the box-and-whisker plot approach to exclude outliers and the Mega-Trend-Diffusion method to find representative data from the majority data. To generate the synthetic samples, we propose a counterintuitive hypothesis to find the distributed shape of the minority data, and then produce samples according to this distribution. Four real datasets were used to examine the performance of the proposed approach. We used paired t-tests to compare the Accuracy, G-mean, and F-measure scores of the proposed data pre-processing (PPDP) method merging in the D3C method (PPDP+D3C) with those of the one-sided selection (OSS), the well-known SMOTEBoost (SB) study, and the normal distribution-based oversampling (NDO) approach, and the proposed data pre-processing (PPDP) method. The results indicate that the classification performance of the proposed approach is better than that of above-mentioned methods.
Zollanvari, Amin; Dougherty, Edward R
2014-06-01
The most important aspect of any classifier is its error rate, because this quantifies its predictive capacity. Thus, the accuracy of error estimation is critical. Error estimation is problematic in small-sample classifier design because the error must be estimated using the same data from which the classifier has been designed. Use of prior knowledge, in the form of a prior distribution on an uncertainty class of feature-label distributions to which the true, but unknown, feature-distribution belongs, can facilitate accurate error estimation (in the mean-square sense) in circumstances where accurate completely model-free error estimation is impossible. This paper provides analytic asymptotically exact finite-sample approximations for various performance metrics of the resulting Bayesian Minimum Mean-Square-Error (MMSE) error estimator in the case of linear discriminant analysis (LDA) in the multivariate Gaussian model. These performance metrics include the first, second, and cross moments of the Bayesian MMSE error estimator with the true error of LDA, and therefore, the Root-Mean-Square (RMS) error of the estimator. We lay down the theoretical groundwork for Kolmogorov double-asymptotics in a Bayesian setting, which enables us to derive asymptotic expressions of the desired performance metrics. From these we produce analytic finite-sample approximations and demonstrate their accuracy via numerical examples. Various examples illustrate the behavior of these approximations and their use in determining the necessary sample size to achieve a desired RMS. The Supplementary Material contains derivations for some equations and added figures.
Measurement of 3D refractive index distribution by optical diffraction tomography
NASA Astrophysics Data System (ADS)
Chi, Weining; Wang, Dayong; Wang, Yunxin; Zhao, Jie; Rong, Lu; Yuan, Yuanyuan
2018-01-01
Optical Diffraction Tomography (ODT), as a novel 3D imaging technique, can obtain a 3D refractive index (RI) distribution to reveal the important optical properties of transparent samples. According to the theory of ODT, an optical diffraction tomography setup is built based on the Mach-Zehnder interferometer. The propagation direction of object beam is controlled by a 2D translation stage, and 121 holograms based on different illumination angles are recorded by a Charge-coupled Device (CCD). In order to prove the validity and accuracy of the ODT, the 3D RI profile of microsphere with a known RI is firstly measured. An iterative constraint algorithm is employed to improve the imaging accuracy effectively. The 3D morphology and average RI of the microsphere are consistent with that of the actual situation, and the RI error is less than 0.0033. Then, an optical element fabricated by laser with a non-uniform RI is taken as the sample. Its 3D RI profile is obtained by the optical diffraction tomography system.
Balss, Karin M; Long, Frederick H; Veselov, Vladimir; Orana, Argjenta; Akerman-Revis, Eugena; Papandreou, George; Maryanoff, Cynthia A
2008-07-01
Multivariate data analysis was applied to confocal Raman measurements on stents coated with the polymers and drug used in the CYPHER Sirolimus-eluting Coronary Stents. Partial least-squares (PLS) regression was used to establish three independent calibration curves for the coating constituents: sirolimus, poly(n-butyl methacrylate) [PBMA], and poly(ethylene-co-vinyl acetate) [PEVA]. The PLS calibrations were based on average spectra generated from each spatial location profiled. The PLS models were tested on six unknown stent samples to assess accuracy and precision. The wt % difference between PLS predictions and laboratory assay values for sirolimus was less than 1 wt % for the composite of the six unknowns, while the polymer models were estimated to be less than 0.5 wt % difference for the combined samples. The linearity and specificity of the three PLS models were also demonstrated with the three PLS models. In contrast to earlier univariate models, the PLS models achieved mass balance with better accuracy. This analysis was extended to evaluate the spatial distribution of the three constituents. Quantitative bitmap images of drug-eluting stent coatings are presented for the first time to assess the local distribution of components.
A Hybrid Semi-supervised Classification Scheme for Mining Multisource Geospatial Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vatsavai, Raju; Bhaduri, Budhendra L
2011-01-01
Supervised learning methods such as Maximum Likelihood (ML) are often used in land cover (thematic) classification of remote sensing imagery. ML classifier relies exclusively on spectral characteristics of thematic classes whose statistical distributions (class conditional probability densities) are often overlapping. The spectral response distributions of thematic classes are dependent on many factors including elevation, soil types, and ecological zones. A second problem with statistical classifiers is the requirement of large number of accurate training samples (10 to 30 |dimensions|), which are often costly and time consuming to acquire over large geographic regions. With the increasing availability of geospatial databases, itmore » is possible to exploit the knowledge derived from these ancillary datasets to improve classification accuracies even when the class distributions are highly overlapping. Likewise newer semi-supervised techniques can be adopted to improve the parameter estimates of statistical model by utilizing a large number of easily available unlabeled training samples. Unfortunately there is no convenient multivariate statistical model that can be employed for mulitsource geospatial databases. In this paper we present a hybrid semi-supervised learning algorithm that effectively exploits freely available unlabeled training samples from multispectral remote sensing images and also incorporates ancillary geospatial databases. We have conducted several experiments on real datasets, and our new hybrid approach shows over 25 to 35% improvement in overall classification accuracy over conventional classification schemes.« less
NASA Technical Reports Server (NTRS)
Johnson, Kenneth L.; White, K. Preston, Jr.
2012-01-01
The NASA Engineering and Safety Center was requested to improve on the Best Practices document produced for the NESC assessment, Verification of Probabilistic Requirements for the Constellation Program, by giving a recommended procedure for using acceptance sampling by variables techniques as an alternative to the potentially resource-intensive acceptance sampling by attributes method given in the document. In this paper, the results of empirical tests intended to assess the accuracy of acceptance sampling plan calculators implemented for six variable distributions are presented.
Norris, Darren; Fortin, Marie-Josée; Magnusson, William E.
2014-01-01
Background Ecological monitoring and sampling optima are context and location specific. Novel applications (e.g. biodiversity monitoring for environmental service payments) call for renewed efforts to establish reliable and robust monitoring in biodiversity rich areas. As there is little information on the distribution of biodiversity across the Amazon basin, we used altitude as a proxy for biological variables to test whether meso-scale variation can be adequately represented by different sample sizes in a standardized, regular-coverage sampling arrangement. Methodology/Principal Findings We used Shuttle-Radar-Topography-Mission digital elevation values to evaluate if the regular sampling arrangement in standard RAPELD (rapid assessments (“RAP”) over the long-term (LTER [“PELD” in Portuguese])) grids captured patters in meso-scale spatial variation. The adequacy of different sample sizes (n = 4 to 120) were examined within 32,325 km2/3,232,500 ha (1293×25 km2 sample areas) distributed across the legal Brazilian Amazon. Kolmogorov-Smirnov-tests, correlation and root-mean-square-error were used to measure sample representativeness, similarity and accuracy respectively. Trends and thresholds of these responses in relation to sample size and standard-deviation were modeled using Generalized-Additive-Models and conditional-inference-trees respectively. We found that a regular arrangement of 30 samples captured the distribution of altitude values within these areas. Sample size was more important than sample standard deviation for representativeness and similarity. In contrast, accuracy was more strongly influenced by sample standard deviation. Additionally, analysis of spatially interpolated data showed that spatial patterns in altitude were also recovered within areas using a regular arrangement of 30 samples. Conclusions/Significance Our findings show that the logistically feasible sample used in the RAPELD system successfully recovers meso-scale altitudinal patterns. This suggests that the sample size and regular arrangement may also be generally appropriate for quantifying spatial patterns in biodiversity at similar scales across at least 90% (≈5 million km2) of the Brazilian Amazon. PMID:25170894
Determining dynamical parameters of the Milky Way Galaxy based on high-accuracy radio astrometry
NASA Astrophysics Data System (ADS)
Honma, Mareki; Nagayama, Takumi; Sakai, Nobuyuki
2015-08-01
In this paper we evaluate how the dynamical structure of the Galaxy can be constrained by high-accuracy VLBI (Very Long Baseline Interferometry) astrometry such as VERA (VLBI Exploration of Radio Astrometry). We generate simulated samples of maser sources which follow the gas motion caused by a spiral or bar potential, with their distribution similar to those currently observed with VERA and VLBA (Very Long Baseline Array). We apply the Markov chain Monte Carlo analyses to the simulated sample sources to determine the dynamical parameter of the models. We show that one can successfully determine the initial model parameters if astrometric results are obtained for a few hundred sources with currently achieved astrometric accuracy. If astrometric data are available from 500 sources, the expected accuracy of R0 and Θ0 is ˜ 1% or higher, and parameters related to the spiral structure can be constrained by an error of 10% or with higher accuracy. We also show that the parameter determination accuracy is basically independent of the locations of resonances such as corotation and/or inner/outer Lindblad resonances. We also discuss the possibility of model selection based on the Bayesian information criterion (BIC), and demonstrate that BIC can be used to discriminate different dynamical models of the Galaxy.
Is Coefficient Alpha Robust to Non-Normal Data?
Sheng, Yanyan; Sheng, Zhaohui
2011-01-01
Coefficient alpha has been a widely used measure by which internal consistency reliability is assessed. In addition to essential tau-equivalence and uncorrelated errors, normality has been noted as another important assumption for alpha. Earlier work on evaluating this assumption considered either exclusively non-normal error score distributions, or limited conditions. In view of this and the availability of advanced methods for generating univariate non-normal data, Monte Carlo simulations were conducted to show that non-normal distributions for true or error scores do create problems for using alpha to estimate the internal consistency reliability. The sample coefficient alpha is affected by leptokurtic true score distributions, or skewed and/or kurtotic error score distributions. Increased sample sizes, not test lengths, help improve the accuracy, bias, or precision of using it with non-normal data. PMID:22363306
Menke, S.B.; Holway, D.A.; Fisher, R.N.; Jetz, W.
2009-01-01
Aim: Species distribution models (SDMs) or, more specifically, ecological niche models (ENMs) are a useful and rapidly proliferating tool in ecology and global change biology. ENMs attempt to capture associations between a species and its environment and are often used to draw biological inferences, to predict potential occurrences in unoccupied regions and to forecast future distributions under environmental change. The accuracy of ENMs, however, hinges critically on the quality of occurrence data. ENMs often use haphazardly collected data rather than data collected across the full spectrum of existing environmental conditions. Moreover, it remains unclear how processes affecting ENM predictions operate at different spatial scales. The scale (i.e. grain size) of analysis may be dictated more by the sampling regime than by biologically meaningful processes. The aim of our study is to jointly quantify how issues relating to region and scale affect ENM predictions using an economically important and ecologically damaging invasive species, the Argentine ant (Linepithema humile). Location: California, USA. Methods: We analysed the relationship between sampling sufficiency, regional differences in environmental parameter space and cell size of analysis and resampling environmental layers using two independently collected sets of presence/absence data. Differences in variable importance were determined using model averaging and logistic regression. Model accuracy was measured with area under the curve (AUC) and Cohen's kappa. Results: We first demonstrate that insufficient sampling of environmental parameter space can cause large errors in predicted distributions and biological interpretation. Models performed best when they were parametrized with data that sufficiently sampled environmental parameter space. Second, we show that altering the spatial grain of analysis changes the relative importance of different environmental variables. These changes apparently result from how environmental constraints and the sampling distributions of environmental variables change with spatial grain. Conclusions: These findings have clear relevance for biological inference. Taken together, our results illustrate potentially general limitations for ENMs, especially when such models are used to predict species occurrences in novel environments. We offer basic methodological and conceptual guidelines for appropriate sampling and scale matching. ?? 2009 The Authors Journal compilation ?? 2009 Blackwell Publishing.
Assessing the Application of a Geographic Presence-Only Model for Land Suitability Mapping
Heumann, Benjamin W.; Walsh, Stephen J.; McDaniel, Phillip M.
2011-01-01
Recent advances in ecological modeling have focused on novel methods for characterizing the environment that use presence-only data and machine-learning algorithms to predict the likelihood of species occurrence. These novel methods may have great potential for land suitability applications in the developing world where detailed land cover information is often unavailable or incomplete. This paper assesses the adaptation and application of the presence-only geographic species distribution model, MaxEnt, for agricultural crop suitability mapping in a rural Thailand where lowland paddy rice and upland field crops predominant. To assess this modeling approach, three independent crop presence datasets were used including a social-demographic survey of farm households, a remote sensing classification of land use/land cover, and ground control points, used for geodetic and thematic reference that vary in their geographic distribution and sample size. Disparate environmental data were integrated to characterize environmental settings across Nang Rong District, a region of approximately 1,300 sq. km in size. Results indicate that the MaxEnt model is capable of modeling crop suitability for upland and lowland crops, including rice varieties, although model results varied between datasets due to the high sensitivity of the model to the distribution of observed crop locations in geographic and environmental space. Accuracy assessments indicate that model outcomes were influenced by the sample size and the distribution of sample points in geographic and environmental space. The need for further research into accuracy assessments of presence-only models lacking true absence data is discussed. We conclude that the Maxent model can provide good estimates of crop suitability, but many areas need to be carefully scrutinized including geographic distribution of input data and assessment methods to ensure realistic modeling results. PMID:21860606
Occupational exposure decisions: can limited data interpretation training help improve accuracy?
Logan, Perry; Ramachandran, Gurumurthy; Mulhausen, John; Hewett, Paul
2009-06-01
Accurate exposure assessments are critical for ensuring that potentially hazardous exposures are properly identified and controlled. The availability and accuracy of exposure assessments can determine whether resources are appropriately allocated to engineering and administrative controls, medical surveillance, personal protective equipment and other programs designed to protect workers. A desktop study was performed using videos, task information and sampling data to evaluate the accuracy and potential bias of participants' exposure judgments. Desktop exposure judgments were obtained from occupational hygienists for material handling jobs with small air sampling data sets (0-8 samples) and without the aid of computers. In addition, data interpretation tests (DITs) were administered to participants where they were asked to estimate the 95th percentile of an underlying log-normal exposure distribution from small data sets. Participants were presented with an exposure data interpretation or rule of thumb training which included a simple set of rules for estimating 95th percentiles for small data sets from a log-normal population. DIT was given to each participant before and after the rule of thumb training. Results of each DIT and qualitative and quantitative exposure judgments were compared with a reference judgment obtained through a Bayesian probabilistic analysis of the sampling data to investigate overall judgment accuracy and bias. There were a total of 4386 participant-task-chemical judgments for all data collections: 552 qualitative judgments made without sampling data and 3834 quantitative judgments with sampling data. The DITs and quantitative judgments were significantly better than random chance and much improved by the rule of thumb training. In addition, the rule of thumb training reduced the amount of bias in the DITs and quantitative judgments. The mean DIT % correct scores increased from 47 to 64% after the rule of thumb training (P < 0.001). The accuracy for quantitative desktop judgments increased from 43 to 63% correct after the rule of thumb training (P < 0.001). The rule of thumb training did not significantly impact accuracy for qualitative desktop judgments. The finding that even some simple statistical rules of thumb improve judgment accuracy significantly suggests that hygienists need to routinely use statistical tools while making exposure judgments using monitoring data.
Vertical Accuracy Evaluation of Aster GDEM2 Over a Mountainous Area Based on Uav Photogrammetry
NASA Astrophysics Data System (ADS)
Liang, Y.; Qu, Y.; Guo, D.; Cui, T.
2018-05-01
Global digital elevation models (GDEM) provide elementary information on heights of the Earth's surface and objects on the ground. GDEMs have become an important data source for a range of applications. The vertical accuracy of a GDEM is critical for its applications. Nowadays UAVs has been widely used for large-scale surveying and mapping. Compared with traditional surveying techniques, UAV photogrammetry are more convenient and more cost-effective. UAV photogrammetry produces the DEM of the survey area with high accuracy and high spatial resolution. As a result, DEMs resulted from UAV photogrammetry can be used for a more detailed and accurate evaluation of the GDEM product. This study investigates the vertical accuracy (in terms of elevation accuracy and systematic errors) of the ASTER GDEM Version 2 dataset over a complex terrain based on UAV photogrammetry. Experimental results show that the elevation errors of ASTER GDEM2 are in normal distribution and the systematic error is quite small. The accuracy of the ASTER GDEM2 coincides well with that reported by the ASTER validation team. The accuracy in the research area is negatively correlated to both the slope of the terrain and the number of stereo observations. This study also evaluates the vertical accuracy of the up-sampled ASTER GDEM2. Experimental results show that the accuracy of the up-sampled ASTER GDEM2 data in the research area is not significantly reduced by the complexity of the terrain. The fine-grained accuracy evaluation of the ASTER GDEM2 is informative for the GDEM-supported UAV photogrammetric applications.
Evaluation of Techniques Used to Estimate Cortical Feature Maps
Katta, Nalin; Chen, Thomas L.; Watkins, Paul V.; Barbour, Dennis L.
2011-01-01
Functional properties of neurons are often distributed nonrandomly within a cortical area and form topographic maps that reveal insights into neuronal organization and interconnection. Some functional maps, such as in visual cortex, are fairly straightforward to discern with a variety of techniques, while other maps, such as in auditory cortex, have resisted easy characterization. In order to determine appropriate protocols for establishing accurate functional maps in auditory cortex, artificial topographic maps were probed under various conditions, and the accuracy of estimates formed from the actual maps was quantified. Under these conditions, low-complexity maps such as sound frequency can be estimated accurately with as few as 25 total samples (e.g., electrode penetrations or imaging pixels) if neural responses are averaged together. More samples are required to achieve the highest estimation accuracy for higher complexity maps, and averaging improves map estimate accuracy even more than increasing sampling density. Undersampling without averaging can result in misleading map estimates, while undersampling with averaging can lead to the false conclusion of no map when one actually exists. Uniform sample spacing only slightly improves map estimation over nonuniform sample spacing typical of serial electrode penetrations. Tessellation plots commonly used to visualize maps estimated using nonuniform sampling are always inferior to linearly interpolated estimates, although differences are slight at higher sampling densities. Within primary auditory cortex, then, multiunit sampling with at least 100 samples would likely result in reasonable feature map estimates for all but the highest complexity maps and the highest variability that might be expected. PMID:21889537
Vanguelova, E I; Bonifacio, E; De Vos, B; Hoosbeek, M R; Berger, T W; Vesterdal, L; Armolaitis, K; Celi, L; Dinca, L; Kjønaas, O J; Pavlenda, P; Pumpanen, J; Püttsepp, Ü; Reidy, B; Simončič, P; Tobin, B; Zhiyanski, M
2016-11-01
Spatially explicit knowledge of recent and past soil organic carbon (SOC) stocks in forests will improve our understanding of the effect of human- and non-human-induced changes on forest C fluxes. For SOC accounting, a minimum detectable difference must be defined in order to adequately determine temporal changes and spatial differences in SOC. This requires sufficiently detailed data to predict SOC stocks at appropriate scales within the required accuracy so that only significant changes are accounted for. When designing sampling campaigns, taking into account factors influencing SOC spatial and temporal distribution (such as soil type, topography, climate and vegetation) are needed to optimise sampling depths and numbers of samples, thereby ensuring that samples accurately reflect the distribution of SOC at a site. Furthermore, the appropriate scales related to the research question need to be defined: profile, plot, forests, catchment, national or wider. Scaling up SOC stocks from point sample to landscape unit is challenging, and thus requires reliable baseline data. Knowledge of the associated uncertainties related to SOC measures at each particular scale and how to reduce them is crucial for assessing SOC stocks with the highest possible accuracy at each scale. This review identifies where potential sources of errors and uncertainties related to forest SOC stock estimation occur at five different scales-sample, profile, plot, landscape/regional and European. Recommendations are also provided on how to reduce forest SOC uncertainties and increase efficiency of SOC assessment at each scale.
Ensemble-Biased Metadynamics: A Molecular Simulation Method to Sample Experimental Distributions
Marinelli, Fabrizio; Faraldo-Gómez, José D.
2015-01-01
We introduce an enhanced-sampling method for molecular dynamics (MD) simulations referred to as ensemble-biased metadynamics (EBMetaD). The method biases a conventional MD simulation to sample a molecular ensemble that is consistent with one or more probability distributions known a priori, e.g., experimental intramolecular distance distributions obtained by double electron-electron resonance or other spectroscopic techniques. To this end, EBMetaD adds an adaptive biasing potential throughout the simulation that discourages sampling of configurations inconsistent with the target probability distributions. The bias introduced is the minimum necessary to fulfill the target distributions, i.e., EBMetaD satisfies the maximum-entropy principle. Unlike other methods, EBMetaD does not require multiple simulation replicas or the introduction of Lagrange multipliers, and is therefore computationally efficient and straightforward in practice. We demonstrate the performance and accuracy of the method for a model system as well as for spin-labeled T4 lysozyme in explicit water, and show how EBMetaD reproduces three double electron-electron resonance distance distributions concurrently within a few tens of nanoseconds of simulation time. EBMetaD is integrated in the open-source PLUMED plug-in (www.plumed-code.org), and can be therefore readily used with multiple MD engines. PMID:26083917
Accuracy evaluation of an X-ray microtomography system.
Fernandes, Jaquiel S; Appoloni, Carlos R; Fernandes, Celso P
2016-06-01
Microstructural parameter evaluation of reservoir rocks is of great importance to petroleum production companies. In this connection, X-ray computed microtomography (μ-CT) has proven to be a quite useful method for the assessment of rocks, as it provides important microstructural parameters, such as porosity, permeability, pore size distribution and porous phase of the sample. X-ray computed microtomography is a non-destructive technique that enables the reuse of samples already measured and also yields 2-D cross-sectional images of the sample as well as volume rendering. This technique offers an additional advantage, as it does not require sample preparation, of reducing the measurement time, which is approximately one to three hours, depending on the spatial resolution used. Although this technique is extensively used, accuracy verification of measurements is hard to obtain because the existing calibrated samples (phantoms) have large volumes and are assessed in medical CT scanners with millimeter spatial resolution. Accordingly, this study aims to determine the accuracy of an X-ray computed microtomography system using a Skyscan 1172 X-ray microtomograph. To accomplish this investigation, it was used a nylon thread set with known appropriate diameter inserted into a glass tube. The results for porosity size and phase distribution by X-ray microtomography were very close to the geometrically calculated values. The geometrically calculated porosity and the porosity determined by the methodology using the μ-CT was 33.4±3.4% and 31.0±0.3%, respectively. The outcome of this investigation was excellent. It was also observed a small variability in the results along all 401 sections of the analyzed image. Minimum and maximum porosity values between the cross sections were 30.9% and 31.1%, respectively. A 3-D image representing the actual structure of the sample was also rendered from the 2-D images. Copyright © 2016 Elsevier Ltd. All rights reserved.
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given. classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multiviariate analysis of variance.
Exploring geo-tagged photos for land cover validation with deep learning
NASA Astrophysics Data System (ADS)
Xing, Hanfa; Meng, Yuan; Wang, Zixuan; Fan, Kaixuan; Hou, Dongyang
2018-07-01
Land cover validation plays an important role in the process of generating and distributing land cover thematic maps, which is usually implemented by high cost of sample interpretation with remotely sensed images or field survey. With an increasing availability of geo-tagged landscape photos, the automatic photo recognition methodologies, e.g., deep learning, can be effectively utilised for land cover applications. However, they have hardly been utilised in validation processes, as challenges remain in sample selection and classification for highly heterogeneous photos. This study proposed an approach to employ geo-tagged photos for land cover validation by using the deep learning technology. The approach first identified photos automatically based on the VGG-16 network. Then, samples for validation were selected and further classified by considering photos distribution and classification probabilities. The implementations were conducted for the validation of the GlobeLand30 land cover product in a heterogeneous area, western California. Experimental results represented promises in land cover validation, given that GlobeLand30 showed an overall accuracy of 83.80% with classified samples, which was close to the validation result of 80.45% based on visual interpretation. Additionally, the performances of deep learning based on ResNet-50 and AlexNet were also quantified, revealing no substantial differences in final validation results. The proposed approach ensures geo-tagged photo quality, and supports the sample classification strategy by considering photo distribution, with accuracy improvement from 72.07% to 79.33% compared with solely considering the single nearest photo. Consequently, the presented approach proves the feasibility of deep learning technology on land cover information identification of geo-tagged photos, and has a great potential to support and improve the efficiency of land cover validation.
Cell-free DNA fragment-size distribution analysis for non-invasive prenatal CNV prediction.
Arbabi, Aryan; Rampášek, Ladislav; Brudno, Michael
2016-06-01
Non-invasive detection of aneuploidies in a fetal genome through analysis of cell-free DNA circulating in the maternal plasma is becoming a routine clinical test. Such tests, which rely on analyzing the read coverage or the allelic ratios at single-nucleotide polymorphism (SNP) loci, are not sensitive enough for smaller sub-chromosomal abnormalities due to sequencing biases and paucity of SNPs in a genome. We have developed an alternative framework for identifying sub-chromosomal copy number variations in a fetal genome. This framework relies on the size distribution of fragments in a sample, as fetal-origin fragments tend to be smaller than those of maternal origin. By analyzing the local distribution of the cell-free DNA fragment sizes in each region, our method allows for the identification of sub-megabase CNVs, even in the absence of SNP positions. To evaluate the accuracy of our method, we used a plasma sample with the fetal fraction of 13%, down-sampled it to samples with coverage of 10X-40X and simulated samples with CNVs based on it. Our method had a perfect accuracy (both specificity and sensitivity) for detecting 5 Mb CNVs, and after reducing the fetal fraction (to 11%, 9% and 7%), it could correctly identify 98.82-100% of the 5 Mb CNVs and had a true-negative rate of 95.29-99.76%. Our source code is available on GitHub at https://github.com/compbio-UofT/FSDA CONTACT: : brudno@cs.toronto.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Chen, Zhiru; Hong, Wenxue
2016-02-01
Considering the low accuracy of prediction in the positive samples and poor overall classification effects caused by unbalanced sample data of MicroRNA (miRNA) target, we proposes a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm in this paper, an under-sampling based on the ensemble learning algorithm. The algorithm adopts SVM as learning algorithm and AdaBoost as integration framework, and embeds clustering-based under-sampling into the iterative process, aiming at reducing the degree of unbalanced distribution of positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates the abnormal ones in negative samples with robust sample weights smoothing mechanism so as to avoid over-learning. Finally, the prediction of miRNA target integrated classifier is achieved with the combination of multiple weak classifiers through the voting mechanism. The experiment revealed that the SVM-IUSW, compared with other algorithms on unbalanced dataset collection, could not only improve the accuracy of positive targets and the overall effect of classification, but also enhance the generalization ability of miRNA target classifier.
Characterizing regional soil mineral composition using spectroscopyand geostatistics
Mulder, V.L.; de Bruin, S.; Weyermann, J.; Kokaly, Raymond F.; Schaepman, M.E.
2013-01-01
This work aims at improving the mapping of major mineral variability at regional scale using scale-dependent spatial variability observed in remote sensing data. Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data and statistical methods were combined with laboratory-based mineral characterization of field samples to create maps of the distributions of clay, mica and carbonate minerals and their abundances. The Material Identification and Characterization Algorithm (MICA) was used to identify the spectrally-dominant minerals in field samples; these results were combined with ASTER data using multinomial logistic regression to map mineral distributions. X-ray diffraction (XRD)was used to quantify mineral composition in field samples. XRD results were combined with ASTER data using multiple linear regression to map mineral abundances. We testedwhether smoothing of the ASTER data to match the scale of variability of the target sample would improve model correlations. Smoothing was donewith Fixed Rank Kriging (FRK) to represent the mediumand long-range spatial variability in the ASTER data. Stronger correlations resulted using the smoothed data compared to results obtained with the original data. Highest model accuracies came from using both medium and long-range scaled ASTER data as input to the statistical models. High correlation coefficients were obtained for the abundances of calcite and mica (R2 = 0.71 and 0.70, respectively). Moderately-high correlation coefficients were found for smectite and kaolinite (R2 = 0.57 and 0.45, respectively). Maps of mineral distributions, obtained by relating ASTER data to MICA analysis of field samples, were found to characterize major soil mineral variability (overall accuracies for mica, smectite and kaolinite were 76%, 89% and 86% respectively). The results of this study suggest that the distributions of minerals and their abundances derived using FRK-smoothed ASTER data more closely match the spatial variability of soil and environmental properties at regional scale.
NASA Astrophysics Data System (ADS)
Yasui, Takeshi
2017-08-01
Optical frequency combs are innovative tools for broadband spectroscopy because a series of comb modes can serve as frequency markers that are traceable to a microwave frequency standard. However, a mode distribution that is too discrete limits the spectral sampling interval to the mode frequency spacing even though individual mode linewidth is sufficiently narrow. Here, using a combination of a spectral interleaving and dual-comb spectroscopy in the terahertz (THz) region, we achieved a spectral sampling interval equal to the mode linewidth rather than the mode spacing. The spectrally interleaved THz comb was realized by sweeping the laser repetition frequency and interleaving additional frequency marks. In low-pressure gas spectroscopy, we achieved an improved spectral sampling density of 2.5 MHz and enhanced spectral accuracy of 8.39 × 10-7 in the THz region. The proposed method is a powerful tool for simultaneously achieving high resolution, high accuracy, and broad spectral coverage in THz spectroscopy.
Level 1 environmental assessment performance evaluation. Final report jun 77-oct 78
DOE Office of Scientific and Technical Information (OSTI.GOV)
Estes, E.D.; Smith, F.; Wagoner, D.E.
1979-02-01
The report gives results of a two-phased evaluation of Level 1 environmental assessment procedures. Results from Phase I, a field evaluation of the Source Assessment Sampling System (SASS), showed that the SASS train performed well within the desired factor of 3 Level 1 accuracy limit. Three sample runs were made with two SASS trains sampling simultaneously and from approximately the same sampling point in a horizontal duct. A Method-5 train was used to estimate the 'true' particulate loading. The sampling systems were upstream of the control devices to ensure collection of sufficient material for comparison of total particulate, particle sizemore » distribution, organic classes, and trace elements. Phase II consisted of providing each of three organizations with three types of control samples to challenge the spectrum of Level 1 analytical procedures: an artificial sample in methylene chloride, an artificial sample on a flyash matrix, and a real sample composed of the combined XAD-2 resin extracts from all Phase I runs. Phase II results showed that when the Level 1 analytical procedures are carefully applied, data of acceptable accuracy is obtained. Estimates of intralaboratory and interlaboratory precision are made.« less
McDonald, Gene D; Storrie-Lombardi, Michael C
2006-02-01
The relative abundance of the protein amino acids has been previously investigated as a potential marker for biogenicity in meteoritic samples. However, these investigations were executed without a quantitative metric to evaluate distribution variations, and they did not account for the possibility of interdisciplinary systematic error arising from inter-laboratory differences in extraction and detection techniques. Principal component analysis (PCA), hierarchical cluster analysis (HCA), and stochastic probabilistic artificial neural networks (ANNs) were used to compare the distributions for nine protein amino acids previously reported for the Murchison carbonaceous chondrite, Mars meteorites (ALH84001, Nakhla, and EETA79001), prebiotic synthesis experiments, and terrestrial biota and sediments. These techniques allowed us (1) to identify a shift in terrestrial amino acid distributions secondary to diagenesis; (2) to detect differences in terrestrial distributions that may be systematic differences between extraction and analysis techniques in biological and geological laboratories; and (3) to determine that distributions in meteoritic samples appear more similar to prebiotic chemistry samples than they do to the terrestrial unaltered or diagenetic samples. Both diagenesis and putative interdisciplinary differences in analysis complicate interpretation of meteoritic amino acid distributions. We propose that the analysis of future samples from such diverse sources as meteoritic influx, sample return missions, and in situ exploration of Mars would be less ambiguous with adoption of standardized assay techniques, systematic inclusion of assay standards, and the use of a quantitative, probabilistic metric. We present here one such metric determined by sequential feature extraction and normalization (PCA), information-driven automated exploration of classification possibilities (HCA), and prediction of classification accuracy (ANNs).
2012-01-01
Background When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. Methods An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examine the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in combined sample of those with and without the condition. Results Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. Conclusions The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population. PMID:22716998
Hurks, Petra; Hendriksen, Jos; Dek, Joelle; Kooij, Andress
2016-04-01
This article investigated the accuracy of six short forms of the Dutch Wechsler Preschool and Primary Scale of Intelligence-Third edition (WPPSI-III-NL) in estimating intelligent quotient (IQ) scores in healthy children aged 4 to 7 years (N = 1,037). Overall, accuracy for each short form was studied, comparing IQ equivalences based on the short forms with the original WPPSI-III-NL Full Scale IQ (FSIQ) scores. Next, our sample was divided into three groups: children performing below average, average, or above average, based on the WPPSI-III-NL FSIQ estimates of the original long form, to study the accuracy of WPPSI-III-NL short forms at the tails of the FSIQ distribution. While studying the entire sample, all IQ estimates of the WPPSI-III-NL short forms correlated highly with the FSIQ estimates of the original long form (all rs ≥ .83). Correlations decreased significantly while studying only the tails of the IQ distribution (rs varied between .55 and .83). Furthermore, IQ estimates of the short forms deviated significantly from the FSIQ score of the original long form, when the IQ estimates were based on short forms containing only two subtests. In contrast, unlike the short forms that contained two to four subtests, the Wechsler Abbreviated Scale of Intelligence short form (containing the subtests Vocabulary, Similarities, Block Design, and Matrix Reasoning) and the General Ability Index short form (containing the subtests Vocabulary, Similarities, Comprehension, Block Design, Matrix Reasoning, and Picture Concepts) produced less variations when compared with the original FSIQ score. © The Author(s) 2015.
Effect of non-Poisson samples on turbulence spectra from laser velocimetry
NASA Technical Reports Server (NTRS)
Sree, Dave; Kjelgaard, Scott O.; Sellers, William L., III
1994-01-01
Spectral analysis of laser velocimetry (LV) data plays an important role in characterizing a turbulent flow and in estimating the associated turbulence scales, which can be helpful in validating theoretical and numerical turbulence models. The determination of turbulence scales is critically dependent on the accuracy of the spectral estimates. Spectral estimations from 'individual realization' laser velocimetry data are typically based on the assumption of a Poisson sampling process. What this Note has demonstrated is that the sampling distribution must be considered before spectral estimates are used to infer turbulence scales.
NASA Astrophysics Data System (ADS)
Fernández-Ruiz, Ramón; Friedrich K., E. Josue; Redrejo, M. J.
2018-02-01
The main goal of this work was to investigate, in a systematic way, the influence of the controlled modulation of the particle size distribution of a representative solid sample with respect to the more relevant analytical parameters of the Direct Solid Analysis (DSA) by Total-reflection X-Ray Fluorescence (TXRF) quantitative method. In particular, accuracy, uncertainty, linearity and detection limits were correlated with the main parameters of their size distributions for the following elements; Al, Si, P, S, K, Ca, Ti, V, Cr, Mn, Fe, Ni, Cu, Zn, As, Se, Rb, Sr, Ba and Pb. In all cases strong correlations were finded. The main conclusion of this work can be resumed as follows; the modulation of particles shape to lower average sizes next to a minimization of the width of particle size distributions, produce a strong increment of accuracy, minimization of uncertainties and limit of detections for DSA-TXRF methodology. These achievements allow the future use of the DSA-TXRF analytical methodology for development of ISO norms and standardized protocols for the direct analysis of solids by mean of TXRF.
An Investigation to Improve Classifier Accuracy for Myo Collected Data
2017-02-01
distribution is unlimited. 13. SUPPLEMENTARY NOTES 14. ABSTRACT A naïve Bayes classifier trained with 1,360 samples from 17 volunteers performs at...movement data from 17 volunteers . Each volunteer performed 8 gestures (Freeze, Rally Point, Hurry Up, Down, Come, Stop, Line Abreast Formation, and Vehicle...line chart was plotted for each gesture’s feature (e.g., Pitch, xAcc) per user. All 10 recorded samples of a particular gesture for a single volunteer
The effects of spatial sampling choices on MR temperature measurements.
Todd, Nick; Vyas, Urvi; de Bever, Josh; Payne, Allison; Parker, Dennis L
2011-02-01
The purpose of this article is to quantify the effects that spatial sampling parameters have on the accuracy of magnetic resonance temperature measurements during high intensity focused ultrasound treatments. Spatial resolution and position of the sampling grid were considered using experimental and simulated data for two different types of high intensity focused ultrasound heating trajectories (a single point and a 4-mm circle) with maximum measured temperature and thermal dose volume as the metrics. It is demonstrated that measurement accuracy is related to the curvature of the temperature distribution, where regions with larger spatial second derivatives require higher resolution. The location of the sampling grid relative temperature distribution has a significant effect on the measured values. When imaging at 1.0 × 1.0 × 3.0 mm(3) resolution, the measured values for maximum temperature and volume dosed to 240 cumulative equivalent minutes (CEM) or greater varied by 17% and 33%, respectively, for the single-point heating case, and by 5% and 18%, respectively, for the 4-mm circle heating case. Accurate measurement of the maximum temperature required imaging at 1.0 × 1.0 × 3.0 mm(3) resolution for the single-point heating case and 2.0 × 2.0 × 5.0 mm(3) resolution for the 4-mm circle heating case. Copyright © 2010 Wiley-Liss, Inc.
a Comparative Analysis of Five Cropland Datasets in Africa
NASA Astrophysics Data System (ADS)
Wei, Y.; Lu, M.; Wu, W.
2018-04-01
The food security, particularly in Africa, is a challenge to be resolved. The cropland area and spatial distribution obtained from remote sensing imagery are vital information. In this paper, according to cropland area and spatial location, we compare five global cropland datasets including CCI Land Cover, GlobCover, MODIS Collection 5, GlobeLand30 and Unified Cropland in circa 2010 of Africa in terms of cropland area and spatial location. The accuracy of cropland area calculated from five datasets was analyzed compared with statistic data. Based on validation samples, the accuracies of spatial location for the five cropland products were assessed by error matrix. The results show that GlobeLand30 has the best fitness with the statistics, followed by MODIS Collection 5 and Unified Cropland, GlobCover and CCI Land Cover have the lower accuracies. For the accuracy of spatial location of cropland, GlobeLand30 reaches the highest accuracy, followed by Unified Cropland, MODIS Collection 5 and GlobCover, CCI Land Cover has the lowest accuracy. The spatial location accuracy of five datasets in the Csa with suitable farming condition is generally higher than in the Bsk.
Statistical computation of tolerance limits
NASA Technical Reports Server (NTRS)
Wheeler, J. T.
1993-01-01
Based on a new theory, two computer codes were developed specifically to calculate the exact statistical tolerance limits for normal distributions within unknown means and variances for the one-sided and two-sided cases for the tolerance factor, k. The quantity k is defined equivalently in terms of the noncentral t-distribution by the probability equation. Two of the four mathematical methods employ the theory developed for the numerical simulation. Several algorithms for numerically integrating and iteratively root-solving the working equations are written to augment the program simulation. The program codes generate some tables of k's associated with the varying values of the proportion and sample size for each given probability to show accuracy obtained for small sample sizes.
Clemmons, Elizabeth A; Stovall, Melissa I; Owens, Devon C; Scott, Jessica A; Jones-Wilkes, Amelia C; Kempf, Doty J; Ethun, Kelly F
2016-01-01
Handheld, point-of-care glucometers are commonly used in NHP for clinical and research purposes, but whether these devices are appropriate for use in NHP is unknown. Other animal studies indicate that glucometers should be species-specific, given differences in glucose distribution between RBC and plasma; in addition, Hct and sampling site (venous compared with capillary) influence glucometer readings. Therefore, we compared the accuracy of 2 human and 2 veterinary glucometers at various Hct ranges in rhesus macaques (Macaca mulatta), sooty mangabeys (Cercocebus atys), and chimpanzees (Pan troglodytes) with that of standard laboratory glucose analysis. Subsequent analyses assessed the effect of hypoglycemia, hyperglycemia, and sampling site on glucometer accuracy. The veterinary glucometers overestimated blood glucose (BG) values in all species by 26 to 75 mg/dL. The mean difference between the human glucometers and the laboratory analyzer was 7 mg/dL or less in all species. The human glucometers overestimated BG in hypoglycemic mangabeys by 4 mg/dL and underestimated BG in hyperglycemic mangabeys by 11 mg/dL; similar patterns occurred in rhesus macaques. Hct did not affect glucometer accuracy, but all samples were within the range at which glucometers generally are accurate in humans. BG values were significantly lower in venous than capillary samples. The current findings show that veterinary glucometers intended for companion-animal species are inappropriate for use in the studied NHP species, whereas the human glucometers showed clinically acceptable accuracy in all 3 species. Finally, potential differences between venous and capillary BG values should be considered when comparing and evaluating results.
NASA Astrophysics Data System (ADS)
WANG, P. T.
2015-12-01
Groundwater modeling requires to assign hydrogeological properties to every numerical grid. Due to the lack of detailed information and the inherent spatial heterogeneity, geological properties can be treated as random variables. Hydrogeological property is assumed to be a multivariate distribution with spatial correlations. By sampling random numbers from a given statistical distribution and assigning a value to each grid, a random field for modeling can be completed. Therefore, statistics sampling plays an important role in the efficiency of modeling procedure. Latin Hypercube Sampling (LHS) is a stratified random sampling procedure that provides an efficient way to sample variables from their multivariate distributions. This study combines the the stratified random procedure from LHS and the simulation by using LU decomposition to form LULHS. Both conditional and unconditional simulations of LULHS were develpoed. The simulation efficiency and spatial correlation of LULHS are compared to the other three different simulation methods. The results show that for the conditional simulation and unconditional simulation, LULHS method is more efficient in terms of computational effort. Less realizations are required to achieve the required statistical accuracy and spatial correlation.
Soares, André E R; Schrago, Carlos G
2015-01-07
Although taxon sampling is commonly considered an important issue in phylogenetic inference, it is rarely considered in the Bayesian estimation of divergence times. In fact, the studies conducted to date have presented ambiguous results, and the relevance of taxon sampling for molecular dating remains unclear. In this study, we developed a series of simulations that, after six hundred Bayesian molecular dating analyses, allowed us to evaluate the impact of taxon sampling on chronological estimates under three scenarios of among-lineage rate heterogeneity. The first scenario allowed us to examine the influence of the number of terminals on the age estimates based on a strict molecular clock. The second scenario imposed an extreme example of lineage specific rate variation, and the third scenario permitted extensive rate variation distributed along the branches. We also analyzed empirical data on selected mitochondrial genomes of mammals. Our results showed that in the strict molecular-clock scenario (Case I), taxon sampling had a minor impact on the accuracy of the time estimates, although the precision of the estimates was greater with an increased number of terminals. The effect was similar in the scenario (Case III) based on rate variation distributed among the branches. Only under intensive rate variation among lineages (Case II) taxon sampling did result in biased estimates. The results of an empirical analysis corroborated the simulation findings. We demonstrate that taxonomic sampling affected divergence time inference but that its impact was significant if the rates deviated from those derived for the strict molecular clock. Increased taxon sampling improved the precision and accuracy of the divergence time estimates, but the impact on precision is more relevant. On average, biased estimates were obtained only if lineage rate variation was pronounced. Copyright © 2014 Elsevier Ltd. All rights reserved.
Active machine learning for rapid landslide inventory mapping with VHR satellite images (Invited)
NASA Astrophysics Data System (ADS)
Stumpf, A.; Lachiche, N.; Malet, J.; Kerle, N.; Puissant, A.
2013-12-01
VHR satellite images have become a primary source for landslide inventory mapping after major triggering events such as earthquakes and heavy rainfalls. Visual image interpretation is still the prevailing standard method for operational purposes but is time-consuming and not well suited to fully exploit the increasingly better supply of remote sensing data. Recent studies have addressed the development of more automated image analysis workflows for landslide inventory mapping. In particular object-oriented approaches that account for spatial and textural image information have been demonstrated to be more adequate than pixel-based classification but manually elaborated rule-based classifiers are difficult to adapt under changing scene characteristics. Machine learning algorithm allow learning classification rules for complex image patterns from labelled examples and can be adapted straightforwardly with available training data. In order to reduce the amount of costly training data active learning (AL) has evolved as a key concept to guide the sampling for many applications. The underlying idea of AL is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and data structure to iteratively select the most valuable samples that should be labelled by the user. With relatively few queries and labelled samples, an AL strategy yields higher accuracies than an equivalent classifier trained with many randomly selected samples. This study addressed the development of an AL method for landslide mapping from VHR remote sensing images with special consideration of the spatial distribution of the samples. Our approach [1] is based on the Random Forest algorithm and considers the classifier uncertainty as well as the variance of potential sampling regions to guide the user towards the most valuable sampling areas. The algorithm explicitly searches for compact regions and thereby avoids a spatially disperse sampling pattern inherent to most other AL methods. The accuracy, the sampling time and the computational runtime of the algorithm were evaluated on multiple satellite images capturing recent large scale landslide events. Sampling between 1-4% of the study areas the accuracies between 74% and 80% were achieved, whereas standard sampling schemes yielded only accuracies between 28% and 50% with equal sampling costs. Compared to commonly used point-wise AL algorithm the proposed approach significantly reduces the number of iterations and hence the computational runtime. Since the user can focus on relatively few compact areas (rather than on hundreds of distributed points) the overall labeling time is reduced by more than 50% compared to point-wise queries. An experimental evaluation of multiple expert mappings demonstrated strong relationships between the uncertainties of the experts and the machine learning model. It revealed that the achieved accuracies are within the range of the inter-expert disagreement and that it will be indispensable to consider ground truth uncertainties to truly achieve further enhancements in the future. The proposed method is generally applicable to a wide range of optical satellite images and landslide types. [1] A. Stumpf, N. Lachiche, J.-P. Malet, N. Kerle, and A. Puissant, Active learning in the spatial domain for remote sensing image classification, IEEE Transactions on Geosciece and Remote Sensing. 2013, DOI 10.1109/TGRS.2013.2262052.
Portable Electronic Nose Based on Electrochemical Sensors for Food Quality Assessment
Dymerski, Tomasz; Gębicki, Jacek; Namieśnik, Jacek
2017-01-01
The steady increase in global consumption puts a strain on agriculture and might lead to a decrease in food quality. Currently used techniques of food analysis are often labour-intensive and time-consuming and require extensive sample preparation. For that reason, there is a demand for novel methods that could be used for rapid food quality assessment. A technique based on the use of an array of chemical sensors for holistic analysis of the sample’s headspace is called electronic olfaction. In this article, a prototype of a portable, modular electronic nose intended for food analysis is described. Using the SVM method, it was possible to classify samples of poultry meat based on shelf-life with 100% accuracy, and also samples of rapeseed oil based on the degree of thermal degradation with 100% accuracy. The prototype was also used to detect adulterations of extra virgin olive oil with rapeseed oil with 82% overall accuracy. Due to the modular design, the prototype offers the advantages of solutions targeted for analysis of specific food products, at the same time retaining the flexibility of application. Furthermore, its portability allows the device to be used at different stages of the production and distribution process. PMID:29186754
Pepper seed variety identification based on visible/near-infrared spectral technology
NASA Astrophysics Data System (ADS)
Li, Cuiling; Wang, Xiu; Meng, Zhijun; Fan, Pengfei; Cai, Jichen
2016-11-01
Pepper is a kind of important fruit vegetable, with the expansion of pepper hybrid planting area, detection of pepper seed purity is especially important. This research used visible/near infrared (VIS/NIR) spectral technology to detect the variety of single pepper seed, and chose hybrid pepper seeds "Zhuo Jiao NO.3", "Zhuo Jiao NO.4" and "Zhuo Jiao NO.5" as research sample. VIS/NIR spectral data of 80 "Zhuo Jiao NO.3", 80 "Zhuo Jiao NO.4" and 80 "Zhuo Jiao NO.5" pepper seeds were collected, and the original spectral data was pretreated with standard normal variable (SNV) transform, first derivative (FD), and Savitzky-Golay (SG) convolution smoothing methods. Principal component analysis (PCA) method was adopted to reduce the dimension of the spectral data and extract principal components, according to the distribution of the first principal component (PC1) along with the second principal component(PC2) in the twodimensional plane, similarly, the distribution of PC1 coupled with the third principal component(PC3), and the distribution of PC2 combined with PC3, distribution areas of three varieties of pepper seeds were divided in each twodimensional plane, and the discriminant accuracy of PCA was tested through observing the distribution area of samples' principal components in validation set. This study combined PCA and linear discriminant analysis (LDA) to identify single pepper seed varieties, results showed that with the FD preprocessing method, the discriminant accuracy of pepper seed varieties was 98% for validation set, it concludes that using VIS/NIR spectral technology is feasible for identification of single pepper seed varieties.
Han, Zong-wei; Huang, Wei; Luo, Yun; Zhang, Chun-di; Qi, Da-cheng
2015-03-01
Taking the soil organic matter in eastern Zhongxiang County, Hubei Province, as a research object, thirteen sample sets from different regions were arranged surrounding the road network, the spatial configuration of which was optimized by the simulated annealing approach. The topographic factors of these thirteen sample sets, including slope, plane curvature, profile curvature, topographic wetness index, stream power index and sediment transport index, were extracted by the terrain analysis. Based on the results of optimization, a multiple linear regression model with topographic factors as independent variables was built. At the same time, a multilayer perception model on the basis of neural network approach was implemented. The comparison between these two models was carried out then. The results revealed that the proposed approach was practicable in optimizing soil sampling scheme. The optimal configuration was capable of gaining soil-landscape knowledge exactly, and the accuracy of optimal configuration was better than that of original samples. This study designed a sampling configuration to study the soil attribute distribution by referring to the spatial layout of road network, historical samples, and digital elevation data, which provided an effective means as well as a theoretical basis for determining the sampling configuration and displaying spatial distribution of soil organic matter with low cost and high efficiency.
Flameless atomic-absorption determination of gold in geological materials
Meier, A.L.
1980-01-01
Gold in geologic material is dissolved using a solution of hydrobromic acid and bromine, extracted with methyl isobutyl ketone, and determined using an atomic-absorption spectrophotometer equipped with a graphite furnace atomizer. A comparison of results obtained by this flameless atomic-absorption method on U.S. Geological Survey reference rocks and geochemical samples with reported values and with results obtained by flame atomic-absorption shows that reasonable accuracy is achieved with improved precision. The sensitivity, accuracy, and precision of the method allows acquisition of data on the distribution of gold at or below its crustal abundance. ?? 1980.
A randomised approach for NARX model identification based on a multivariate Bernoulli distribution
NASA Astrophysics Data System (ADS)
Bianchi, F.; Falsone, A.; Prandini, M.; Piroddi, L.
2017-04-01
The identification of polynomial NARX models is typically performed by incremental model building techniques. These methods assess the importance of each regressor based on the evaluation of partial individual models, which may ultimately lead to erroneous model selections. A more robust assessment of the significance of a specific model term can be obtained by considering ensembles of models, as done by the RaMSS algorithm. In that context, the identification task is formulated in a probabilistic fashion and a Bernoulli distribution is employed to represent the probability that a regressor belongs to the target model. Then, samples of the model distribution are collected to gather reliable information to update it, until convergence to a specific model. The basic RaMSS algorithm employs multiple independent univariate Bernoulli distributions associated to the different candidate model terms, thus overlooking the correlations between different terms, which are typically important in the selection process. Here, a multivariate Bernoulli distribution is employed, in which the sampling of a given term is conditioned by the sampling of the others. The added complexity inherent in considering the regressor correlation properties is more than compensated by the achievable improvements in terms of accuracy of the model selection process.
Accuracy of quadrat sampling in studying forest reproduction on cut-over areas
I. T. Haig
1929-01-01
The quadrat method, first introduced into ecological studies by Pound and Clements in i898, has been adopted by both foresters and ecologists as one of the most accurate means of studying the occurrence, distribution, and development of vegetation (Clements, '05; Weaver, '18). This method is unquestionably more precise than the descriptive method which it...
NASA Astrophysics Data System (ADS)
Zhang, G.; Lu, D.; Ye, M.; Gunzburger, M.
2011-12-01
Markov Chain Monte Carlo (MCMC) methods have been widely used in many fields of uncertainty analysis to estimate the posterior distributions of parameters and credible intervals of predictions in the Bayesian framework. However, in practice, MCMC may be computationally unaffordable due to slow convergence and the excessive number of forward model executions required, especially when the forward model is expensive to compute. Both disadvantages arise from the curse of dimensionality, i.e., the posterior distribution is usually a multivariate function of parameters. Recently, sparse grid method has been demonstrated to be an effective technique for coping with high-dimensional interpolation or integration problems. Thus, in order to accelerate the forward model and avoid the slow convergence of MCMC, we propose a new method for uncertainty analysis based on sparse grid interpolation and quasi-Monte Carlo sampling. First, we construct a polynomial approximation of the forward model in the parameter space by using the sparse grid interpolation. This approximation then defines an accurate surrogate posterior distribution that can be evaluated repeatedly at minimal computational cost. Second, instead of using MCMC, a quasi-Monte Carlo method is applied to draw samples in the parameter space. Then, the desired probability density function of each prediction is approximated by accumulating the posterior density values of all the samples according to the prediction values. Our method has the following advantages: (1) the polynomial approximation of the forward model on the sparse grid provides a very efficient evaluation of the surrogate posterior distribution; (2) the quasi-Monte Carlo method retains the same accuracy in approximating the PDF of predictions but avoids all disadvantages of MCMC. The proposed method is applied to a controlled numerical experiment of groundwater flow modeling. The results show that our method attains the same accuracy much more efficiently than traditional MCMC.
[External quality control system in medical microbiology and parasitology in the Czech Republic].
Slosárek, M; Petrás, P; Kríz, B
2004-11-01
The External Quality Control System (EQAS) of laboratory activities in medical microbiology and parasitology was implemented in the Czech Republic in 1993 with coded sera samples for diagnosis of viral hepatitis and bacterial strains for identification distributed to first participating laboratories. The number of sample types reached 31 in 2003 and the number of participating laboratories rised from 79 in 1993 to 421 in 2003. As many as 15.130 samples were distributed to the participating laboratories in 2003. Currently, almost all microbiology and parasitology laboratories in the Czech Republic involved in examination of clinical material participate in the EQAS. Based on the 11-year experience gained with the EQAS in the Czech Republic, the following benefits were observed: higher accuracy of results in different tests, standardisation of methods and the use of most suitable test kits.
Sampling procedures for inventory of commercial volume tree species in Amazon Forest.
Netto, Sylvio P; Pelissari, Allan L; Cysneiros, Vinicius C; Bonazza, Marcelo; Sanquetta, Carlos R
2017-01-01
The spatial distribution of tropical tree species can affect the consistency of the estimators in commercial forest inventories, therefore, appropriate sampling procedures are required to survey species with different spatial patterns in the Amazon Forest. For this, the present study aims to evaluate the conventional sampling procedures and introduce the adaptive cluster sampling for volumetric inventories of Amazonian tree species, considering the hypotheses that the density, the spatial distribution and the zero-plots affect the consistency of the estimators, and that the adaptive cluster sampling allows to obtain more accurate volumetric estimation. We use data from a census carried out in Jamari National Forest, Brazil, where trees with diameters equal to or higher than 40 cm were measured in 1,355 plots. Species with different spatial patterns were selected and sampled with simple random sampling, systematic sampling, linear cluster sampling and adaptive cluster sampling, whereby the accuracy of the volumetric estimation and presence of zero-plots were evaluated. The sampling procedures applied to species were affected by the low density of trees and the large number of zero-plots, wherein the adaptive clusters allowed concentrating the sampling effort in plots with trees and, thus, agglutinating more representative samples to estimate the commercial volume.
Zimmermann, N.E.; Edwards, T.C.; Moisen, Gretchen G.; Frescino, T.S.; Blackard, J.A.
2007-01-01
1. Compared to bioclimatic variables, remote sensing predictors are rarely used for predictive species modelling. When used, the predictors represent typically habitat classifications or filters rather than gradual spectral, surface or biophysical properties. Consequently, the full potential of remotely sensed predictors for modelling the spatial distribution of species remains unexplored. Here we analysed the partial contributions of remotely sensed and climatic predictor sets to explain and predict the distribution of 19 tree species in Utah. We also tested how these partial contributions were related to characteristics such as successional types or species traits. 2. We developed two spatial predictor sets of remotely sensed and topo-climatic variables to explain the distribution of tree species. We used variation partitioning techniques applied to generalized linear models to explore the combined and partial predictive powers of the two predictor sets. Non-parametric tests were used to explore the relationships between the partial model contributions of both predictor sets and species characteristics. 3. More than 60% of the variation explained by the models represented contributions by one of the two partial predictor sets alone, with topo-climatic variables outperforming the remotely sensed predictors. However, the partial models derived from only remotely sensed predictors still provided high model accuracies, indicating a significant correlation between climate and remote sensing variables. The overall accuracy of the models was high, but small sample sizes had a strong effect on cross-validated accuracies for rare species. 4. Models of early successional and broadleaf species benefited significantly more from adding remotely sensed predictors than did late seral and needleleaf species. The core-satellite species types differed significantly with respect to overall model accuracies. Models of satellite and urban species, both with low prevalence, benefited more from use of remotely sensed predictors than did the more frequent core species. 5. Synthesis and applications. If carefully prepared, remotely sensed variables are useful additional predictors for the spatial distribution of trees. Major improvements resulted for deciduous, early successional, satellite and rare species. The ability to improve model accuracy for species having markedly different life history strategies is a crucial step for assessing effects of global change. ?? 2007 The Authors.
ZIMMERMANN, N E; EDWARDS, T C; MOISEN, G G; FRESCINO, T S; BLACKARD, J A
2007-01-01
Compared to bioclimatic variables, remote sensing predictors are rarely used for predictive species modelling. When used, the predictors represent typically habitat classifications or filters rather than gradual spectral, surface or biophysical properties. Consequently, the full potential of remotely sensed predictors for modelling the spatial distribution of species remains unexplored. Here we analysed the partial contributions of remotely sensed and climatic predictor sets to explain and predict the distribution of 19 tree species in Utah. We also tested how these partial contributions were related to characteristics such as successional types or species traits. We developed two spatial predictor sets of remotely sensed and topo-climatic variables to explain the distribution of tree species. We used variation partitioning techniques applied to generalized linear models to explore the combined and partial predictive powers of the two predictor sets. Non-parametric tests were used to explore the relationships between the partial model contributions of both predictor sets and species characteristics. More than 60% of the variation explained by the models represented contributions by one of the two partial predictor sets alone, with topo-climatic variables outperforming the remotely sensed predictors. However, the partial models derived from only remotely sensed predictors still provided high model accuracies, indicating a significant correlation between climate and remote sensing variables. The overall accuracy of the models was high, but small sample sizes had a strong effect on cross-validated accuracies for rare species. Models of early successional and broadleaf species benefited significantly more from adding remotely sensed predictors than did late seral and needleleaf species. The core-satellite species types differed significantly with respect to overall model accuracies. Models of satellite and urban species, both with low prevalence, benefited more from use of remotely sensed predictors than did the more frequent core species. Synthesis and applications. If carefully prepared, remotely sensed variables are useful additional predictors for the spatial distribution of trees. Major improvements resulted for deciduous, early successional, satellite and rare species. The ability to improve model accuracy for species having markedly different life history strategies is a crucial step for assessing effects of global change. PMID:18642470
Petrovskaya, Natalia B.; Forbes, Emily; Petrovskii, Sergei V.; Walters, Keith F. A.
2018-01-01
Studies addressing many ecological problems require accurate evaluation of the total population size. In this paper, we revisit a sampling procedure used for the evaluation of the abundance of an invertebrate population from assessment data collected on a spatial grid of sampling locations. We first discuss how insufficient information about the spatial population density obtained on a coarse sampling grid may affect the accuracy of an evaluation of total population size. Such information deficit in field data can arise because of inadequate spatial resolution of the population distribution (spatially variable population density) when coarse grids are used, which is especially true when a strongly heterogeneous spatial population density is sampled. We then argue that the average trap count (the quantity routinely used to quantify abundance), if obtained from a sampling grid that is too coarse, is a random variable because of the uncertainty in sampling spatial data. Finally, we show that a probabilistic approach similar to bootstrapping techniques can be an efficient tool to quantify the uncertainty in the evaluation procedure in the presence of a spatial pattern reflecting a patchy distribution of invertebrates within the sampling grid. PMID:29495513
Radial q-space sampling for DSI.
Baete, Steven H; Yutzy, Stephen; Boada, Fernando E
2016-09-01
Diffusion spectrum imaging (DSI) has been shown to be an effective tool for noninvasively depicting the anatomical details of brain microstructure. Existing implementations of DSI sample the diffusion encoding space using a rectangular grid. Here we present a different implementation of DSI whereby a radially symmetric q-space sampling scheme for DSI is used to improve the angular resolution and accuracy of the reconstructed orientation distribution functions. Q-space is sampled by acquiring several q-space samples along a number of radial lines. Each of these radial lines in q-space is analytically connected to a value of the orientation distribution functions at the same angular location by the Fourier slice theorem. Computer simulations and in vivo brain results demonstrate that radial diffusion spectrum imaging correctly estimates the orientation distribution functions when moderately high b-values (4000 s/mm2) and number of q-space samples (236) are used. The nominal angular resolution of radial diffusion spectrum imaging depends on the number of radial lines used in the sampling scheme, and only weakly on the maximum b-value. In addition, the radial analytical reconstruction reduces truncation artifacts which affect Cartesian reconstructions. Hence, a radial acquisition of q-space can be favorable for DSI. Magn Reson Med 76:769-780, 2016. © 2015 Wiley Periodicals, Inc. © 2015 Wiley Periodicals, Inc.
The Applicability of Confidence Intervals of Quantiles for the Generalized Logistic Distribution
NASA Astrophysics Data System (ADS)
Shin, H.; Heo, J.; Kim, T.; Jung, Y.
2007-12-01
The generalized logistic (GL) distribution has been widely used for frequency analysis. However, there is a little study related to the confidence intervals that indicate the prediction accuracy of distribution for the GL distribution. In this paper, the estimation of the confidence intervals of quantiles for the GL distribution is presented based on the method of moments (MOM), maximum likelihood (ML), and probability weighted moments (PWM) and the asymptotic variances of each quantile estimator are derived as functions of the sample sizes, return periods, and parameters. Monte Carlo simulation experiments are also performed to verify the applicability of the derived confidence intervals of quantile. As the results, the relative bias (RBIAS) and relative root mean square error (RRMSE) of the confidence intervals generally increase as return period increases and reverse as sample size increases. And PWM for estimating the confidence intervals performs better than the other methods in terms of RRMSE when the data is almost symmetric while ML shows the smallest RBIAS and RRMSE when the data is more skewed and sample size is moderately large. The GL model was applied to fit the distribution of annual maximum rainfall data. The results show that there are little differences in the estimated quantiles between ML and PWM while distinct differences in MOM.
Advanced platform for the in-plane ZT measurement of thin films
NASA Astrophysics Data System (ADS)
Linseis, V.; Völklein, F.; Reith, H.; Nielsch, K.; Woias, P.
2018-01-01
The characterization of nanostructured samples with at least one restricted dimension like thin films or nanowires is challenging, but important to understand their structure and transport mechanism, and to improve current industrial products and production processes. We report on the 2nd generation of a measurement chip, which allows for a simplified sample preparation process, and the measurement of samples deposited from the liquid phase using techniques like spin coating and drop casting. The new design enables us to apply much higher temperature gradients for the Seebeck coefficient measurement in a shorter time, without influencing the sample holder's temperature distribution. Furthermore, a two membrane correction method for the 3ω thermal conductivity measurement will be presented, which takes the heat loss due to radiation into account and increases the accuracy of the measurement results significantly. Errors caused by different sample compositions, varying sample geometries, and different heat profiles are avoided with the presented measurement method. As a showcase study displaying the validity and accuracy of our platform, we present temperature-dependent measurements of the thermoelectric properties of an 84 nm Bi87Sb13 thin film and a 15 μm PEDOT:PSS thin film.
A Distributed Wireless Camera System for the Management of Parking Spaces.
Vítek, Stanislav; Melničuk, Petr
2017-12-28
The importance of detection of parking space availability is still growing, particularly in major cities. This paper deals with the design of a distributed wireless camera system for the management of parking spaces, which can determine occupancy of the parking space based on the information from multiple cameras. The proposed system uses small camera modules based on Raspberry Pi Zero and computationally efficient algorithm for the occupancy detection based on the histogram of oriented gradients (HOG) feature descriptor and support vector machine (SVM) classifier. We have included information about the orientation of the vehicle as a supporting feature, which has enabled us to achieve better accuracy. The described solution can deliver occupancy information at the rate of 10 parking spaces per second with more than 90% accuracy in a wide range of conditions. Reliability of the implemented algorithm is evaluated with three different test sets which altogether contain over 700,000 samples of parking spaces.
Spectrally interleaved, comb-mode-resolved spectroscopy using swept dual terahertz combs
Hsieh, Yi-Da; Iyonaga, Yuki; Sakaguchi, Yoshiyuki; Yokoyama, Shuko; Inaba, Hajime; Minoshima, Kaoru; Hindle, Francis; Araki, Tsutomu; Yasui, Takeshi
2014-01-01
Optical frequency combs are innovative tools for broadband spectroscopy because a series of comb modes can serve as frequency markers that are traceable to a microwave frequency standard. However, a mode distribution that is too discrete limits the spectral sampling interval to the mode frequency spacing even though individual mode linewidth is sufficiently narrow. Here, using a combination of a spectral interleaving and dual-comb spectroscopy in the terahertz (THz) region, we achieved a spectral sampling interval equal to the mode linewidth rather than the mode spacing. The spectrally interleaved THz comb was realized by sweeping the laser repetition frequency and interleaving additional frequency marks. In low-pressure gas spectroscopy, we achieved an improved spectral sampling density of 2.5 MHz and enhanced spectral accuracy of 8.39 × 10−7 in the THz region. The proposed method is a powerful tool for simultaneously achieving high resolution, high accuracy, and broad spectral coverage in THz spectroscopy. PMID:24448604
Liu, Geng; Niu, Junjie; Zhang, Chao; Guo, Guanlin
2015-12-01
Data distribution is usually skewed severely by the presence of hot spots in contaminated sites. This causes difficulties for accurate geostatistical data transformation. Three types of typical normal distribution transformation methods termed the normal score, Johnson, and Box-Cox transformations were applied to compare the effects of spatial interpolation with normal distribution transformation data of benzo(b)fluoranthene in a large-scale coking plant-contaminated site in north China. Three normal transformation methods decreased the skewness and kurtosis of the benzo(b)fluoranthene, and all the transformed data passed the Kolmogorov-Smirnov test threshold. Cross validation showed that Johnson ordinary kriging has a minimum root-mean-square error of 1.17 and a mean error of 0.19, which was more accurate than the other two models. The area with fewer sampling points and that with high levels of contamination showed the largest prediction standard errors based on the Johnson ordinary kriging prediction map. We introduce an ideal normal transformation method prior to geostatistical estimation for severely skewed data, which enhances the reliability of risk estimation and improves the accuracy for determination of remediation boundaries.
Homogeneity tests of clustered diagnostic markers with applications to the BioCycle Study
Tang, Liansheng Larry; Liu, Aiyi; Schisterman, Enrique F.; Zhou, Xiao-Hua; Liu, Catherine Chun-ling
2014-01-01
Diagnostic trials often require the use of a homogeneity test among several markers. Such a test may be necessary to determine the power both during the design phase and in the initial analysis stage. However, no formal method is available for the power and sample size calculation when the number of markers is greater than two and marker measurements are clustered in subjects. This article presents two procedures for testing the accuracy among clustered diagnostic markers. The first procedure is a test of homogeneity among continuous markers based on a global null hypothesis of the same accuracy. The result under the alternative provides the explicit distribution for the power and sample size calculation. The second procedure is a simultaneous pairwise comparison test based on weighted areas under the receiver operating characteristic curves. This test is particularly useful if a global difference among markers is found by the homogeneity test. We apply our procedures to the BioCycle Study designed to assess and compare the accuracy of hormone and oxidative stress markers in distinguishing women with ovulatory menstrual cycles from those without. PMID:22733707
Comparison of Optimal Design Methods in Inverse Problems
Banks, H. T.; Holm, Kathleen; Kappel, Franz
2011-01-01
Typical optimal design methods for inverse or parameter estimation problems are designed to choose optimal sampling distributions through minimization of a specific cost function related to the resulting error in parameter estimates. It is hoped that the inverse problem will produce parameter estimates with increased accuracy using data collected according to the optimal sampling distribution. Here we formulate the classical optimal design problem in the context of general optimization problems over distributions of sampling times. We present a new Prohorov metric based theoretical framework that permits one to treat succinctly and rigorously any optimal design criteria based on the Fisher Information Matrix (FIM). A fundamental approximation theory is also included in this framework. A new optimal design, SE-optimal design (standard error optimal design), is then introduced in the context of this framework. We compare this new design criteria with the more traditional D-optimal and E-optimal designs. The optimal sampling distributions from each design are used to compute and compare standard errors; the standard errors for parameters are computed using asymptotic theory or bootstrapping and the optimal mesh. We use three examples to illustrate ideas: the Verhulst-Pearl logistic population model [13], the standard harmonic oscillator model [13] and a popular glucose regulation model [16, 19, 29]. PMID:21857762
Adjemian, Jennifer C Z; Girvetz, Evan H; Beckett, Laurel; Foley, Janet E
2006-01-01
More than 20 species of fleas in California are implicated as potential vectors of Yersinia pestis. Extremely limited spatial data exist for plague vectors-a key component to understanding where the greatest risks for human, domestic animal, and wildlife health exist. This study increases the spatial data available for 13 potential plague vectors by using the ecological niche modeling system Genetic Algorithm for Rule-Set Production (GARP) to predict their respective distributions. Because the available sample sizes in our data set varied greatly from one species to another, we also performed an analysis of the robustness of GARP by using the data available for flea Oropsylla montana (Baker) to quantify the effects that sample size and the chosen explanatory variables have on the final species distribution map. GARP effectively modeled the distributions of 13 vector species. Furthermore, our analyses show that all of these modeled ranges are robust, with a sample size of six fleas or greater not significantly impacting the percentage of the in-state area where the flea was predicted to be found, or the testing accuracy of the model. The results of this study will help guide the sampling efforts of future studies focusing on plague vectors.
NASA Astrophysics Data System (ADS)
Stumpf, A.; Lachiche, N.; Malet, J.; Kerle, N.; Puissant, A.
2011-12-01
VHR satellite images have become a primary source for landslide inventory mapping after major triggering events such as earthquakes and heavy rainfalls. Visual image interpretation is still the prevailing standard method for operational purposes but is time-consuming and not well suited to fully exploit the increasingly better supply of remote sensing data. Recent studies have addressed the development of more automated image analysis workflows for landslide inventory mapping. In particular object-oriented approaches that account for spatial and textural image information have been demonstrated to be more adequate than pixel-based classification but manually elaborated rule-based classifiers are difficult to adapt under changing scene characteristics. Machine learning algorithm allow learning classification rules for complex image patterns from labelled examples and can be adapted straightforwardly with available training data. In order to reduce the amount of costly training data active learning (AL) has evolved as a key concept to guide the sampling for many applications. The underlying idea of AL is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and data structure to iteratively select the most valuable samples that should be labelled by the user. With relatively few queries and labelled samples, an AL strategy yields higher accuracies than an equivalent classifier trained with many randomly selected samples. This study addressed the development of an AL method for landslide mapping from VHR remote sensing images with special consideration of the spatial distribution of the samples. Our approach [1] is based on the Random Forest algorithm and considers the classifier uncertainty as well as the variance of potential sampling regions to guide the user towards the most valuable sampling areas. The algorithm explicitly searches for compact regions and thereby avoids a spatially disperse sampling pattern inherent to most other AL methods. The accuracy, the sampling time and the computational runtime of the algorithm were evaluated on multiple satellite images capturing recent large scale landslide events. Sampling between 1-4% of the study areas the accuracies between 74% and 80% were achieved, whereas standard sampling schemes yielded only accuracies between 28% and 50% with equal sampling costs. Compared to commonly used point-wise AL algorithm the proposed approach significantly reduces the number of iterations and hence the computational runtime. Since the user can focus on relatively few compact areas (rather than on hundreds of distributed points) the overall labeling time is reduced by more than 50% compared to point-wise queries. An experimental evaluation of multiple expert mappings demonstrated strong relationships between the uncertainties of the experts and the machine learning model. It revealed that the achieved accuracies are within the range of the inter-expert disagreement and that it will be indispensable to consider ground truth uncertainties to truly achieve further enhancements in the future. The proposed method is generally applicable to a wide range of optical satellite images and landslide types. [1] A. Stumpf, N. Lachiche, J.-P. Malet, N. Kerle, and A. Puissant, Active learning in the spatial domain for remote sensing image classification, IEEE Transactions on Geosciece and Remote Sensing. 2013, DOI 10.1109/TGRS.2013.2262052.
Using known populations of pronghorn to evaluate sampling plans and estimators
Kraft, K.M.; Johnson, D.H.; Samuelson, J.M.; Allen, S.H.
1995-01-01
Although sampling plans and estimators of abundance have good theoretical properties, their performance in real situations is rarely assessed because true population sizes are unknown. We evaluated widely used sampling plans and estimators of population size on 3 known clustered distributions of pronghorn (Antilocapra americana). Our criteria were accuracy of the estimate, coverage of 95% confidence intervals, and cost. Sampling plans were combinations of sampling intensities (16, 33, and 50%), sample selection (simple random sampling without replacement, systematic sampling, and probability proportional to size sampling with replacement), and stratification. We paired sampling plans with suitable estimators (simple, ratio, and probability proportional to size). We used area of the sampling unit as the auxiliary variable for the ratio and probability proportional to size estimators. All estimators were nearly unbiased, but precision was generally low (overall mean coefficient of variation [CV] = 29). Coverage of 95% confidence intervals was only 89% because of the highly skewed distribution of the pronghorn counts and small sample sizes, especially with stratification. Stratification combined with accurate estimates of optimal stratum sample sizes increased precision, reducing the mean CV from 33 without stratification to 25 with stratification; costs increased 23%. Precise results (mean CV = 13) but poor confidence interval coverage (83%) were obtained with simple and ratio estimators when the allocation scheme included all sampling units in the stratum containing most pronghorn. Although areas of the sampling units varied, ratio estimators and probability proportional to size sampling did not increase precision, possibly because of the clumped distribution of pronghorn. Managers should be cautious in using sampling plans and estimators to estimate abundance of aggregated populations.
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-11-22
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k -nearest neighbor ( k -NN) method produced the greatest accuracy. A geostatistically-weighted k -NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer's and user's accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas.
Kistner, Emily O; Muller, Keith E
2004-09-01
Intraclass correlation and Cronbach's alpha are widely used to describe reliability of tests and measurements. Even with Gaussian data, exact distributions are known only for compound symmetric covariance (equal variances and equal correlations). Recently, large sample Gaussian approximations were derived for the distribution functions. New exact results allow calculating the exact distribution function and other properties of intraclass correlation and Cronbach's alpha, for Gaussian data with any covariance pattern, not just compound symmetry. Probabilities are computed in terms of the distribution function of a weighted sum of independent chi-square random variables. New F approximations for the distribution functions of intraclass correlation and Cronbach's alpha are much simpler and faster to compute than the exact forms. Assuming the covariance matrix is known, the approximations typically provide sufficient accuracy, even with as few as ten observations. Either the exact or approximate distributions may be used to create confidence intervals around an estimate of reliability. Monte Carlo simulations led to a number of conclusions. Correctly assuming that the covariance matrix is compound symmetric leads to accurate confidence intervals, as was expected from previously known results. However, assuming and estimating a general covariance matrix produces somewhat optimistically narrow confidence intervals with 10 observations. Increasing sample size to 100 gives essentially unbiased coverage. Incorrectly assuming compound symmetry leads to pessimistically large confidence intervals, with pessimism increasing with sample size. In contrast, incorrectly assuming general covariance introduces only a modest optimistic bias in small samples. Hence the new methods seem preferable for creating confidence intervals, except when compound symmetry definitely holds.
NIST High Accuracy Reference Reflectometer-Spectrophotometer
Proctor, James E.; Yvonne Barnes, P.
1996-01-01
A new reflectometer-spectrophotometer has been designed and constructed using state-of-the-art technology to enhance optical properties of materials measurements over the ultraviolet, visible, and near-infrared (UV-Vis-NIR) wavelength range (200 nm to 2500 nm). The instrument, Spectral Tri-function Automated Reference Reflectometer (STARR), is capable of measuring specular and diffuse reflectance, bidirectional reflectance distribution function (BRDF) of diffuse samples, and both diffuse and non-diffuse transmittance. Samples up to 30 cm by 30 cm can be measured. The instrument and its characterization are described. PMID:27805081
Kamath, Ganesh; Kurnikov, Igor; Fain, Boris; Leontyev, Igor; Illarionov, Alexey; Butin, Oleg; Olevanov, Michael; Pereyaslavets, Leonid
2016-11-01
We present the performance of blind predictions of water-cyclohexane distribution coefficients for 53 drug-like compounds in the SAMPL5 challenge by three methods currently in use within our group. Two of them utilize QMPFF3 and ARROW, polarizable force-fields of varying complexity, and the third uses the General Amber Force-Field (GAFF). The polarizable FF's are implemented in an in-house MD package, Arbalest. We find that when we had time to parametrize the functional groups with care (batch 0), the polarizable force-fields outperformed the non-polarizable one. Conversely, on the full set of 53 compounds, GAFF performed better than both QMPFF3 and ARROW. We also describe the torsion-restrain method we used to improve sampling of molecular conformational space and thus the overall accuracy of prediction. The SAMPL5 challenge highlighted several drawbacks of our force-fields, such as our significant systematic over-estimation of hydrophobic interactions, specifically for alkanes and aromatic rings.
Ademi, Abdulakim; Grozdanov, Anita; Paunović, Perica; Dimitrov, Aleksandar T
2015-01-01
Summary A model consisting of an equation that includes graphene thickness distribution is used to calculate theoretical 002 X-ray diffraction (XRD) peak intensities. An analysis was performed upon graphene samples produced by two different electrochemical procedures: electrolysis in aqueous electrolyte and electrolysis in molten salts, both using a nonstationary current regime. Herein, the model is enhanced by a partitioning of the corresponding 2θ interval, resulting in significantly improved accuracy of the results. The model curves obtained exhibit excellent fitting to the XRD intensities curves of the studied graphene samples. The employed equation parameters make it possible to calculate the j-layer graphene region coverage of the graphene samples, and hence the number of graphene layers. The results of the thorough analysis are in agreement with the calculated number of graphene layers from Raman spectra C-peak position values and indicate that the graphene samples studied are few-layered. PMID:26665083
Visual accumulation tube for size analysis of sands
Colby, B.C.; Christensen, R.P.
1956-01-01
The visual-accumulation-tube method was developed primarily for making size analyses of the sand fractions of suspended-sediment and bed-material samples. Because the fundamental property governing the motion of a sediment particle in a fluid is believed to be its fall velocity. the analysis is designed to determine the fall-velocity-frequency distribution of the individual particles of the sample. The analysis is based on a stratified sedimentation system in which the sample is introduced at the top of a transparent settling tube containing distilled water. The procedure involves the direct visual tracing of the height of sediment accumulation in a contracted section at the bottom of the tube. A pen records the height on a moving chart. The method is simple and fast, provides a continuous and permanent record, gives highly reproducible results, and accurately determines the fall-velocity characteristics of the sample. The apparatus, procedure, results, and accuracy of the visual-accumulation-tube method for determining the sedimentation-size distribution of sands are presented in this paper.
Bansal, Ravi; Hao, Xuejun; Liu, Jun; Peterson, Bradley S.
2014-01-01
Many investigators have tried to apply machine learning techniques to magnetic resonance images (MRIs) of the brain in order to diagnose neuropsychiatric disorders. Usually the number of brain imaging measures (such as measures of cortical thickness and measures of local surface morphology) derived from the MRIs (i.e., their dimensionality) has been large (e.g. >10) relative to the number of participants who provide the MRI data (<100). Sparse data in a high dimensional space increases the variability of the classification rules that machine learning algorithms generate, thereby limiting the validity, reproducibility, and generalizability of those classifiers. The accuracy and stability of the classifiers can improve significantly if the multivariate distributions of the imaging measures can be estimated accurately. To accurately estimate the multivariate distributions using sparse data, we propose to estimate first the univariate distributions of imaging data and then combine them using a Copula to generate more accurate estimates of their multivariate distributions. We then sample the estimated Copula distributions to generate dense sets of imaging measures and use those measures to train classifiers. We hypothesize that the dense sets of brain imaging measures will generate classifiers that are stable to variations in brain imaging measures, thereby improving the reproducibility, validity, and generalizability of diagnostic classification algorithms in imaging datasets from clinical populations. In our experiments, we used both computer-generated and real-world brain imaging datasets to assess the accuracy of multivariate Copula distributions in estimating the corresponding multivariate distributions of real-world imaging data. Our experiments showed that diagnostic classifiers generated using imaging measures sampled from the Copula were significantly more accurate and more reproducible than were the classifiers generated using either the real-world imaging measures or their multivariate Gaussian distributions. Thus, our findings demonstrate that estimated multivariate Copula distributions can generate dense sets of brain imaging measures that can in turn be used to train classifiers, and those classifiers are significantly more accurate and more reproducible than are those generated using real-world imaging measures alone. PMID:25093634
d'Assuncao, Jefferson; Irwig, Les; Macaskill, Petra; Chan, Siew F; Richards, Adele; Farnsworth, Annabelle
2007-01-01
Objective To compare the accuracy of liquid based cytology using the computerised ThinPrep Imager with that of manually read conventional cytology. Design Prospective study. Setting Pathology laboratory in Sydney, Australia. Participants 55 164 split sample pairs (liquid based sample collected after conventional sample from one collection) from consecutive samples of women choosing both types of cytology and whose specimens were examined between August 2004 and June 2005. Main outcome measures Primary outcome was accuracy of slides for detecting squamous lesions. Secondary outcomes were rate of unsatisfactory slides, distribution of squamous cytological classifications, and accuracy of detecting glandular lesions. Results Fewer unsatisfactory slides were found for imager read cytology than for conventional cytology (1.8% v 3.1%; P<0.001). More slides were classified as abnormal by imager read cytology (7.4% v 6.0% overall and 2.8% v 2.2% for cervical intraepithelial neoplasia of grade 1 or higher). Among 550 patients in whom imager read cytology was cervical intraepithelial neoplasia grade 1 or higher and conventional cytology was less severe than grade 1, 133 of 380 biopsy samples taken were high grade histology. Among 294 patients in whom imager read cytology was less severe than cervical intraepithelial neoplasia grade 1 and conventional cytology was grade 1 or higher, 62 of 210 biopsy samples taken were high grade histology. Imager read cytology therefore detected 71 more cases of high grade histology than did conventional cytology, resulting from 170 more biopsies. Similar results were found when one pathologist reread the slides, masked to cytology results. Conclusion The ThinPrep Imager detects 1.29 more cases of histological high grade squamous disease per 1000 women screened than conventional cytology, with cervical intraepithelial neoplasia grade 1 as the threshold for referral to colposcopy. More imager read slides than conventional slides were satisfactory for examination and more contained low grade cytological abnormalities. PMID:17604301
Mapping stand-age distribution of Russian forests from satellite data
NASA Astrophysics Data System (ADS)
Chen, D.; Loboda, T. V.; Hall, A.; Channan, S.; Weber, C. Y.
2013-12-01
Russian boreal forest is a critical component of the global boreal biome as approximately two thirds of the boreal forest is located in Russia. Numerous studies have shown that wildfire and logging have led to extensive modifications of forest cover in the region since 2000. Forest disturbance and subsequent regrowth influences carbon and energy budgets and, in turn, affect climate. Several global and regional satellite-based data products have been developed from coarse (>100m) and moderate (10-100m) resolution imagery to monitor forest cover change over the past decade, record of forest cover change pre-dating year 2000 is very fragmented. Although by using stacks of Landsat images, some information regarding the past disturbances can be obtained, the quantity and locations of such stacks with sufficient number of images are extremely limited, especially in Eastern Siberia. This paper describes a modified method which is built upon previous work to hindcast the disturbance history and map stand-age distribution in the Russian boreal forest. Utilizing data from both Landsat and the Moderate Resolution Imaging Spectroradiometer (MODIS), a wall-to-wall map indicating the estimated age of forest in the Russian boreal forest is created. Our previous work has shown that disturbances can be mapped successfully up to 30 years in the past as the spectral signature of regrowing forests is statistically significantly different from that of mature forests. The presented algorithm ingests 55 multi-temporal stacks of Landsat imagery available over Russian forest before 2001 and processes through a standardized and semi-automated approach to extract training and validation data samples. Landsat data, dating back to 1984, are used to generate maps of forest disturbance using temporal shifts in Disturbance Index through the multi-temporal stack of imagery in selected locations. These maps are then used as reference data to train a decision tree classifier on 50 MODIS-based indices. The resultant map provides an estimate of forest age based on the regrowth curves observed from Landsat imagery. The accuracy of the resultant map is assessed against three datasets: 1) subset of the disturbance maps developed within the algorithm, 2) independent disturbance maps created by the Northern Eurasia Land Dynamics Analysis (NELDA) project, and 3) field-based stand-age distribution from forestry inventory units. The current version of the product presents a considerable improvement on the previous version which used Landsat data samples at a set of randomly selected locations, resulting a strong bias of the training samples towards the Landsat-rich regions (e.g. European Russia) whereas regions such as Siberia were under-sampled. Aiming at improving accuracy, the current method significantly increases the number of training Landsat samples compared to the previous work. Aside from the previously used data, the current method uses all available Landsat data for the under-sampled regions in order to increase the representativeness of the total samples. The finial accuracy assessment is still ongoing, however, the initial results suggested an overall accuracy expressed in Kappa > 0.8. We plan to release both the training data and the final disturbance map of the Russian boreal forest to the public after the validation is completed.
Constructing a Watts-Strogatz network from a small-world network with symmetric degree distribution.
Menezes, Mozart B C; Kim, Seokjin; Huang, Rongbing
2017-01-01
Though the small-world phenomenon is widespread in many real networks, it is still challenging to replicate a large network at the full scale for further study on its structure and dynamics when sufficient data are not readily available. We propose a method to construct a Watts-Strogatz network using a sample from a small-world network with symmetric degree distribution. Our method yields an estimated degree distribution which fits closely with that of a Watts-Strogatz network and leads into accurate estimates of network metrics such as clustering coefficient and degree of separation. We observe that the accuracy of our method increases as network size increases.
NASA Technical Reports Server (NTRS)
Shimizu, H.; Kobayasi, T.; Inaba, H.
1979-01-01
A method of remote measurement of the particle size and density distribution of water droplets was developed. In this method, the size of droplets is measured from the Mie scattering parameter which is defined as the total-to-backscattering ratio of the laser beam. The water density distribution is obtained by a combination of the Mie scattering parameter and the extinction coefficient of the laser beam. This method was examined experimentally for the mist generated by an ultrasonic mist generator and applied to clouds containing rain and snow. Compared with the conventional sampling method, the present method has advantages of remote measurement capability and improvement in accuracy.
Normative data on audiovisual speech integration using sentence recognition and capacity measures.
Altieri, Nicholas; Hudock, Daniel
2016-01-01
The ability to use visual speech cues and integrate them with auditory information is important, especially in noisy environments and for hearing-impaired (HI) listeners. Providing data on measures of integration skills that encompass accuracy and processing speed will benefit researchers and clinicians. The study consisted of two experiments: First, accuracy scores were obtained using City University of New York (CUNY) sentences, and capacity measures that assessed reaction-time distributions were obtained from a monosyllabic word recognition task. We report data on two measures of integration obtained from a sample comprised of 86 young and middle-age adult listeners: To summarize our results, capacity showed a positive correlation with accuracy measures of audiovisual benefit obtained from sentence recognition. More relevant, factor analysis indicated that a single-factor model captured audiovisual speech integration better than models containing more factors. Capacity exhibited strong loadings on the factor, while the accuracy-based measures from sentence recognition exhibited weaker loadings. Results suggest that a listener's integration skills may be assessed optimally using a measure that incorporates both processing speed and accuracy.
Polished sample preparing and backscattered electron imaging and of fly ash-cement paste
NASA Astrophysics Data System (ADS)
Feng, Shuxia; Li, Yanqi
2018-03-01
In recent decades, the technology of backscattered electron imaging and image analysis was applied in more and more study of mixed cement paste because of its special advantages. Test accuracy of this technology is affected by polished sample preparation and image acquisition. In our work, effects of two factors in polished sample preparing and backscattered electron imaging were investigated. The results showed that increasing smoothing pressure could improve the flatness of polished surface and then help to eliminate interference of morphology on grey level distribution of backscattered electron images; increasing accelerating voltage was beneficial to increase gray difference among different phases in backscattered electron images.
Molecular cancer classification using a meta-sample-based regularized robust coding method.
Wang, Shu-Lin; Sun, Liuchao; Fang, Jianwen
2014-01-01
Previous studies have demonstrated that machine learning based molecular cancer classification using gene expression profiling (GEP) data is promising for the clinic diagnosis and treatment of cancer. Novel classification methods with high efficiency and prediction accuracy are still needed to deal with high dimensionality and small sample size of typical GEP data. Recently the sparse representation (SR) method has been successfully applied to the cancer classification. Nevertheless, its efficiency needs to be improved when analyzing large-scale GEP data. In this paper we present the meta-sample-based regularized robust coding classification (MRRCC), a novel effective cancer classification technique that combines the idea of meta-sample-based cluster method with regularized robust coding (RRC) method. It assumes that the coding residual and the coding coefficient are respectively independent and identically distributed. Similar to meta-sample-based SR classification (MSRC), MRRCC extracts a set of meta-samples from the training samples, and then encodes a testing sample as the sparse linear combination of these meta-samples. The representation fidelity is measured by the l2-norm or l1-norm of the coding residual. Extensive experiments on publicly available GEP datasets demonstrate that the proposed method is more efficient while its prediction accuracy is equivalent to existing MSRC-based methods and better than other state-of-the-art dimension reduction based methods.
Compressive sampling of polynomial chaos expansions: Convergence analysis and sampling strategies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hampton, Jerrad; Doostan, Alireza, E-mail: alireza.doostan@colorado.edu
2015-01-01
Sampling orthogonal polynomial bases via Monte Carlo is of interest for uncertainty quantification of models with random inputs, using Polynomial Chaos (PC) expansions. It is known that bounding a probabilistic parameter, referred to as coherence, yields a bound on the number of samples necessary to identify coefficients in a sparse PC expansion via solution to an ℓ{sub 1}-minimization problem. Utilizing results for orthogonal polynomials, we bound the coherence parameter for polynomials of Hermite and Legendre type under their respective natural sampling distribution. In both polynomial bases we identify an importance sampling distribution which yields a bound with weaker dependence onmore » the order of the approximation. For more general orthonormal bases, we propose the coherence-optimal sampling: a Markov Chain Monte Carlo sampling, which directly uses the basis functions under consideration to achieve a statistical optimality among all sampling schemes with identical support. We demonstrate these different sampling strategies numerically in both high-order and high-dimensional, manufactured PC expansions. In addition, the quality of each sampling method is compared in the identification of solutions to two differential equations, one with a high-dimensional random input and the other with a high-order PC expansion. In both cases, the coherence-optimal sampling scheme leads to similar or considerably improved accuracy.« less
Peng, Hao; Yang, Yifan; Zhe, Shandian; Wang, Jian; Gribskov, Michael; Qi, Yuan
2017-01-01
Abstract Motivation High-throughput mRNA sequencing (RNA-Seq) is a powerful tool for quantifying gene expression. Identification of transcript isoforms that are differentially expressed in different conditions, such as in patients and healthy subjects, can provide insights into the molecular basis of diseases. Current transcript quantification approaches, however, do not take advantage of the shared information in the biological replicates, potentially decreasing sensitivity and accuracy. Results We present a novel hierarchical Bayesian model called Differentially Expressed Isoform detection from Multiple biological replicates (DEIsoM) for identifying differentially expressed (DE) isoforms from multiple biological replicates representing two conditions, e.g. multiple samples from healthy and diseased subjects. DEIsoM first estimates isoform expression within each condition by (1) capturing common patterns from sample replicates while allowing individual differences, and (2) modeling the uncertainty introduced by ambiguous read mapping in each replicate. Specifically, we introduce a Dirichlet prior distribution to capture the common expression pattern of replicates from the same condition, and treat the isoform expression of individual replicates as samples from this distribution. Ambiguous read mapping is modeled as a multinomial distribution, and ambiguous reads are assigned to the most probable isoform in each replicate. Additionally, DEIsoM couples an efficient variational inference and a post-analysis method to improve the accuracy and speed of identification of DE isoforms over alternative methods. Application of DEIsoM to an hepatocellular carcinoma (HCC) dataset identifies biologically relevant DE isoforms. The relevance of these genes/isoforms to HCC are supported by principal component analysis (PCA), read coverage visualization, and the biological literature. Availability and implementation The software is available at https://github.com/hao-peng/DEIsoM Contact pengh@alumni.purdue.edu Supplementary information Supplementary data are available at Bioinformatics online. PMID:28595376
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty
Baele, Guy; Lemey, Philippe; Suchard, Marc A.
2016-01-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
Wang, Qianqian; Zhao, Jing; Gong, Yong; Hao, Qun; Peng, Zhong
2017-11-20
A hybrid artificial bee colony (ABC) algorithm inspired by the best-so-far solution and bacterial chemotaxis was introduced to optimize the parameters of the five-parameter bidirectional reflectance distribution function (BRDF) model. To verify the performance of the hybrid ABC algorithm, we measured BRDF of three kinds of samples and simulated the undetermined parameters of the five-parameter BRDF model using the hybrid ABC algorithm and the genetic algorithm, respectively. The experimental results demonstrate that the hybrid ABC algorithm outperforms the genetic algorithm in convergence speed, accuracy, and time efficiency under the same conditions.
Liu, Wei; Kulin, Merima; Kazaz, Tarik; Shahid, Adnan; Moerman, Ingrid; De Poorter, Eli
2017-09-12
Driven by the fast growth of wireless communication, the trend of sharing spectrum among heterogeneous technologies becomes increasingly dominant. Identifying concurrent technologies is an important step towards efficient spectrum sharing. However, due to the complexity of recognition algorithms and the strict condition of sampling speed, communication systems capable of recognizing signals other than their own type are extremely rare. This work proves that multi-model distribution of the received signal strength indicator (RSSI) is related to the signals' modulation schemes and medium access mechanisms, and RSSI from different technologies may exhibit highly distinctive features. A distinction is made between technologies with a streaming or a non-streaming property, and appropriate feature spaces can be established either by deriving parameters such as packet duration from RSSI or directly using RSSI's probability distribution. An experimental study shows that even RSSI acquired at a sub-Nyquist sampling rate is able to provide sufficient features to differentiate technologies such as Wi-Fi, Long Term Evolution (LTE), Digital Video Broadcasting-Terrestrial (DVB-T) and Bluetooth. The usage of the RSSI distribution-based feature space is illustrated via a sample algorithm. Experimental evaluation indicates that more than 92% accuracy is achieved with the appropriate configuration. As the analysis of RSSI distribution is straightforward and less demanding in terms of system requirements, we believe it is highly valuable for recognition of wideband technologies on constrained devices in the context of dynamic spectrum access.
Liu, Wei; Kulin, Merima; Kazaz, Tarik; De Poorter, Eli
2017-01-01
Driven by the fast growth of wireless communication, the trend of sharing spectrum among heterogeneous technologies becomes increasingly dominant. Identifying concurrent technologies is an important step towards efficient spectrum sharing. However, due to the complexity of recognition algorithms and the strict condition of sampling speed, communication systems capable of recognizing signals other than their own type are extremely rare. This work proves that multi-model distribution of the received signal strength indicator (RSSI) is related to the signals’ modulation schemes and medium access mechanisms, and RSSI from different technologies may exhibit highly distinctive features. A distinction is made between technologies with a streaming or a non-streaming property, and appropriate feature spaces can be established either by deriving parameters such as packet duration from RSSI or directly using RSSI’s probability distribution. An experimental study shows that even RSSI acquired at a sub-Nyquist sampling rate is able to provide sufficient features to differentiate technologies such as Wi-Fi, Long Term Evolution (LTE), Digital Video Broadcasting-Terrestrial (DVB-T) and Bluetooth. The usage of the RSSI distribution-based feature space is illustrated via a sample algorithm. Experimental evaluation indicates that more than 92% accuracy is achieved with the appropriate configuration. As the analysis of RSSI distribution is straightforward and less demanding in terms of system requirements, we believe it is highly valuable for recognition of wideband technologies on constrained devices in the context of dynamic spectrum access. PMID:28895879
Modelling population distribution using remote sensing imagery and location-based data
NASA Astrophysics Data System (ADS)
Song, J.; Prishchepov, A. V.
2017-12-01
Detailed spatial distribution of population density is essential for city studies such as urban planning, environmental pollution and city emergency, even estimate pressure on the environment and human exposure and risks to health. However, most of the researches used census data as the detailed dynamic population distribution are difficult to acquire, especially in microscale research. This research describes a method using remote sensing imagery and location-based data to model population distribution at the function zone level. Firstly, urban functional zones within a city were mapped by high-resolution remote sensing images and POIs. The workflow of functional zones extraction includes five parts: (1) Urban land use classification. (2) Segmenting images in built-up area. (3) Identification of functional segments by POIs. (4) Identification of functional blocks by functional segmentation and weight coefficients. (5) Assessing accuracy by validation points. The result showed as Fig.1. Secondly, we applied ordinary least square and geographically weighted regression to assess spatial nonstationary relationship between light digital number (DN) and population density of sampling points. The two methods were employed to predict the population distribution over the research area. The R²of GWR model were in the order of 0.7 and typically showed significant variations over the region than traditional OLS model. The result showed as Fig.2.Validation with sampling points of population density demonstrated that the result predicted by the GWR model correlated well with light value. The result showed as Fig.3. Results showed: (1) Population density is not linear correlated with light brightness using global model. (2) VIIRS night-time light data could estimate population density integrating functional zones at city level. (3) GWR is a robust model to map population distribution, the adjusted R2 of corresponding GWR models were higher than the optimal OLS models, confirming that GWR models demonstrate better prediction accuracy. So this method provide detailed population density information for microscale citizen studies.
An integral conservative gridding--algorithm using Hermitian curve interpolation.
Volken, Werner; Frei, Daniel; Manser, Peter; Mini, Roberto; Born, Ernst J; Fix, Michael K
2008-11-07
The problem of re-sampling spatially distributed data organized into regular or irregular grids to finer or coarser resolution is a common task in data processing. This procedure is known as 'gridding' or 're-binning'. Depending on the quantity the data represents, the gridding-algorithm has to meet different requirements. For example, histogrammed physical quantities such as mass or energy have to be re-binned in order to conserve the overall integral. Moreover, if the quantity is positive definite, negative sampling values should be avoided. The gridding process requires a re-distribution of the original data set to a user-requested grid according to a distribution function. The distribution function can be determined on the basis of the given data by interpolation methods. In general, accurate interpolation with respect to multiple boundary conditions of heavily fluctuating data requires polynomial interpolation functions of second or even higher order. However, this may result in unrealistic deviations (overshoots or undershoots) of the interpolation function from the data. Accordingly, the re-sampled data may overestimate or underestimate the given data by a significant amount. The gridding-algorithm presented in this work was developed in order to overcome these problems. Instead of a straightforward interpolation of the given data using high-order polynomials, a parametrized Hermitian interpolation curve was used to approximate the integrated data set. A single parameter is determined by which the user can control the behavior of the interpolation function, i.e. the amount of overshoot and undershoot. Furthermore, it is shown how the algorithm can be extended to multidimensional grids. The algorithm was compared to commonly used gridding-algorithms using linear and cubic interpolation functions. It is shown that such interpolation functions may overestimate or underestimate the source data by about 10-20%, while the new algorithm can be tuned to significantly reduce these interpolation errors. The accuracy of the new algorithm was tested on a series of x-ray CT-images (head and neck, lung, pelvis). The new algorithm significantly improves the accuracy of the sampled images in terms of the mean square error and a quality index introduced by Wang and Bovik (2002 IEEE Signal Process. Lett. 9 81-4).
Wellek, Stefan
2017-02-28
In current practice, the most frequently applied approach to the handling of ties in the Mann-Whitney-Wilcoxon (MWW) test is based on the conditional distribution of the sum of mid-ranks, given the observed pattern of ties. Starting from this conditional version of the testing procedure, a sample size formula was derived and investigated by Zhao et al. (Stat Med 2008). In contrast, the approach we pursue here is a nonconditional one exploiting explicit representations for the variances of and the covariance between the two U-statistics estimators involved in the Mann-Whitney form of the test statistic. The accuracy of both ways of approximating the sample sizes required for attaining a prespecified level of power in the MWW test for superiority with arbitrarily tied data is comparatively evaluated by means of simulation. The key qualitative conclusions to be drawn from these numerical comparisons are as follows: With the sample sizes calculated by means of the respective formula, both versions of the test maintain the level and the prespecified power with about the same degree of accuracy. Despite the equivalence in terms of accuracy, the sample size estimates obtained by means of the new formula are in many cases markedly lower than that calculated for the conditional test. Perhaps, a still more important advantage of the nonconditional approach based on U-statistics is that it can be also adopted for noninferiority trials. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Leveraging 3D-HST Grism Redshifts to Quantify Photometric Redshift Performance
NASA Astrophysics Data System (ADS)
Bezanson, Rachel; Wake, David A.; Brammer, Gabriel B.; van Dokkum, Pieter G.; Franx, Marijn; Labbé, Ivo; Leja, Joel; Momcheva, Ivelina G.; Nelson, Erica J.; Quadri, Ryan F.; Skelton, Rosalind E.; Weiner, Benjamin J.; Whitaker, Katherine E.
2016-05-01
We present a study of photometric redshift accuracy in the 3D-HST photometric catalogs, using 3D-HST grism redshifts to quantify and dissect trends in redshift accuracy for galaxies brighter than JH IR > 24 with an unprecedented and representative high-redshift galaxy sample. We find an average scatter of 0.0197 ± 0.0003(1 + z) in the Skelton et al. photometric redshifts. Photometric redshift accuracy decreases with magnitude and redshift, but does not vary monotonically with color or stellar mass. The 1σ scatter lies between 0.01 and 0.03 (1 + z) for galaxies of all masses and colors below z < 2.5 (for JH IR < 24), with the exception of a population of very red (U - V > 2), dusty star-forming galaxies for which the scatter increases to ˜0.1 (1 + z). We find that photometric redshifts depend significantly on galaxy size; the largest galaxies at fixed magnitude have photo-zs with up to ˜30% more scatter and ˜5 times the outlier rate. Although the overall photometric redshift accuracy for quiescent galaxies is better than that for star-forming galaxies, scatter depends more strongly on magnitude and redshift than on galaxy type. We verify these trends using the redshift distributions of close pairs and extend the analysis to fainter objects, where photometric redshift errors further increase to ˜0.046 (1 + z) at {H}F160W=26. We demonstrate that photometric redshift accuracy is strongly filter dependent and quantify the contribution of multiple filter combinations. We evaluate the widths of redshift probability distribution functions and find that error estimates are underestimated by a factor of ˜1.1-1.6, but that uniformly broadening the distribution does not adequately account for fitting outliers. Finally, we suggest possible applications of these data in planning for current and future surveys and simulate photometric redshift performance in the Large Synoptic Survey Telescope, Dark Energy Survey (DES), and combined DES and Vista Hemisphere surveys.
Reinforcer control by comparison-stimulus color and location in a delayed matching-to-sample task.
Alsop, Brent; Jones, B Max
2008-05-01
Six pigeons were trained in a delayed matching-to-sample task involving bright- and dim-yellow samples on a central key, a five-peck response requirement to either sample, a constant 1.5-s delay, and the presentation of comparison stimuli composed of red on the left key and green on the right key or vice versa. Green-key responses were occasionally reinforced following the dimmer-yellow sample, and red-key responses were occasionally reinforced following the brighter-yellow sample. Reinforcer delivery was controlled such that the distribution of reinforcers across both comparison-stimulus color and comparison-stimulus location could be varied systematically and independently across conditions. Matching accuracy was high throughout. The ratio of left to right side-key responses increased as the ratio of left to right reinforcers increased, the ratio of red to green responses increased as the ratio of red to green reinforcers increased, and there was no interaction between these variables. However, side-key biases were more sensitive to the distribution of reinforcers across key location than were comparison-color biases to the distribution of reinforcers across key color. An extension of Davison and Tustin's (1978) model of DMTS performance fit the data well, but the results were also consistent with an alternative theory of conditional discrimination performance (Jones, 2003) that calls for a conceptually distinct quantitative model.
Aronoff, Justin M; Yoon, Yang-soo; Soli, Sigfrid D
2010-06-01
Stratified sampling plans can increase the accuracy and facilitate the interpretation of a dataset characterizing a large population. However, such sampling plans have found minimal use in hearing aid (HA) research, in part because of a paucity of quantitative data on the characteristics of HA users. The goal of this study was to devise a quantitatively derived stratified sampling plan for HA research, so that such studies will be more representative and generalizable, and the results obtained using this method are more easily reinterpreted as the population changes. Pure-tone average (PTA) and age information were collected for 84,200 HAs acquired in 2006 and 2007. The distribution of PTA and age was quantified for each HA type and for a composite of all HA users. Based on their respective distributions, PTA and age were each divided into three groups, the combination of which defined the stratification plan. The most populous PTA and age group was also subdivided, allowing greater homogeneity within strata. Finally, the percentage of users in each stratum was calculated. This article provides a stratified sampling plan for HA research, based on a quantitative analysis of the distribution of PTA and age for HA users. Adopting such a sampling plan will make HA research results more representative and generalizable. In addition, data acquired using such plans can be reinterpreted as the HA population changes.
IKONOS geometric characterization
Helder, Dennis; Coan, Michael; Patrick, Kevin; Gaska, Peter
2003-01-01
The IKONOS spacecraft acquired images on July 3, 17, and 25, and August 13, 2001 of Brookings SD, a small city in east central South Dakota, and on May 22, June 30, and July 30, 2000, of the rural area around the EROS Data Center. South Dakota State University (SDSU) evaluated the Brookings scenes and the USGS EROS Data Center (EDC) evaluated the other scenes. The images evaluated by SDSU utilized various natural objects and man-made features as identifiable targets randomly distribution throughout the scenes, while the images evaluated by EDC utilized pre-marked artificial points (panel points) to provide the best possible targets distributed in a grid pattern. Space Imaging provided products at different processing levels to each institution. For each scene, the pixel (line, sample) locations of the various targets were compared to field observed, survey-grade Global Positioning System locations. Patterns of error distribution for each product were plotted, and a variety of statistical statements of accuracy are made. The IKONOS sensor also acquired 12 pairs of stereo images of globally distributed scenes between April 2000 and April 2001. For each scene, analysts at the National Imagery and Mapping Agency (NIMA) compared derived photogrammetric coordinates to their corresponding NIMA field-surveyed ground control point (GCPs). NIMA analysts determined horizontal and vertical accuracies by averaging the differences between the derived photogrammetric points and the field-surveyed GCPs for all 12 stereo pairs. Patterns of error distribution for each scene are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chubar, Oleg, E-mail: chubar@bnl.gov; Chu, Yong S.; Huang, Xiaojing
2016-07-27
Commissioning of the first X-ray beamlines of NSLS-II included detailed measurements of spectral and spatial distributions of the radiation at different locations of the beamlines, from front-ends to sample positions. Comparison of some of these measurement results with high-accuracy calculations of synchrotron (undulator) emission and wavefront propagation through X-ray transport optics, performed using SRW code, is presented.
Photographic techniques for characterizing streambed particle sizes
Whitman, Matthew S.; Moran, Edward H.; Ourso, Robert T.
2003-01-01
We developed photographic techniques to characterize coarse (>2-mm) and fine (≤2-mm) streambed particle sizes in 12 streams in Anchorage, Alaska. Results were compared with current sampling techniques to assess which provided greater sampling efficiency and accuracy. The streams sampled were wadeable and contained gravel—cobble streambeds. Gradients ranged from about 5% at the upstream sites to about 0.25% at the downstream sites. Mean particle sizes and size-frequency distributions resulting from digitized photographs differed significantly from those resulting from Wolman pebble counts for five sites in the analysis. Wolman counts were biased toward selecting larger particles. Photographic analysis also yielded a greater number of measured particles (mean = 989) than did the Wolman counts (mean = 328). Stream embeddedness ratings assigned from field and photographic observations were significantly different at 5 of the 12 sites, although both types of ratings showed a positive relationship with digitized surface fines. Visual estimates of embeddedness and digitized surface fines may both be useful indicators of benthic conditions, but digitizing surface fines produces quantitative rather than qualitative data. Benefits of the photographic techniques include reduced field time, minimal streambed disturbance, convenience of postfield processing, easy sample archiving, and improved accuracy and replication potential.
New insights from cluster analysis methods for RNA secondary structure prediction
Rogers, Emily; Heitsch, Christine
2016-01-01
A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy (MFE) predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision come from viewing secondary structures not at the base pair level but at lower granularity/higher abstraction. This suggests that random errors affecting precision and systematic ones affecting accuracy are both reduced by this “fuzzier” view of secondary structures. Thus experimentalists who are willing to adopt a more rigorous, multilayered approach to secondary structure prediction by iterating through these levels of granularity will be much better able to capture fundamental aspects of RNA base pairing. PMID:26971529
Comparison of Methods for Analyzing Left-Censored Occupational Exposure Data
Huynh, Tran; Ramachandran, Gurumurthy; Banerjee, Sudipto; Monteiro, Joao; Stenzel, Mark; Sandler, Dale P.; Engel, Lawrence S.; Kwok, Richard K.; Blair, Aaron; Stewart, Patricia A.
2014-01-01
The National Institute for Environmental Health Sciences (NIEHS) is conducting an epidemiologic study (GuLF STUDY) to investigate the health of the workers and volunteers who participated from April to December of 2010 in the response and cleanup of the oil release after the Deepwater Horizon explosion in the Gulf of Mexico. The exposure assessment component of the study involves analyzing thousands of personal monitoring measurements that were collected during this effort. A substantial portion of these data has values reported by the analytic laboratories to be below the limits of detection (LOD). A simulation study was conducted to evaluate three established methods for analyzing data with censored observations to estimate the arithmetic mean (AM), geometric mean (GM), geometric standard deviation (GSD), and the 95th percentile (X0.95) of the exposure distribution: the maximum likelihood (ML) estimation, the β-substitution, and the Kaplan–Meier (K-M) methods. Each method was challenged with computer-generated exposure datasets drawn from lognormal and mixed lognormal distributions with sample sizes (N) varying from 5 to 100, GSDs ranging from 2 to 5, and censoring levels ranging from 10 to 90%, with single and multiple LODs. Using relative bias and relative root mean squared error (rMSE) as the evaluation metrics, the β-substitution method generally performed as well or better than the ML and K-M methods in most simulated lognormal and mixed lognormal distribution conditions. The ML method was suitable for large sample sizes (N ≥ 30) up to 80% censoring for lognormal distributions with small variability (GSD = 2–3). The K-M method generally provided accurate estimates of the AM when the censoring was <50% for lognormal and mixed distributions. The accuracy and precision of all methods decreased under high variability (GSD = 4 and 5) and small to moderate sample sizes (N < 20) but the β-substitution was still the best of the three methods. When using the ML method, practitioners are cautioned to be aware of different ways of estimating the AM as they could lead to biased interpretation. A limitation of the β-substitution method is the absence of a confidence interval for the estimate. More research is needed to develop methods that could improve the estimation accuracy for small sample sizes and high percent censored data and also provide uncertainty intervals. PMID:25261453
Accurate aging of juvenile salmonids using fork lengths
Sethi, Suresh; Gerken, Jonathon; Ashline, Joshua
2017-01-01
Juvenile salmon life history strategies, survival, and habitat interactions may vary by age cohort. However, aging individual juvenile fish using scale reading is time consuming and can be error prone. Fork length data are routinely measured while sampling juvenile salmonids. We explore the performance of aging juvenile fish based solely on fork length data, using finite Gaussian mixture models to describe multimodal size distributions and estimate optimal age-discriminating length thresholds. Fork length-based ages are compared against a validation set of juvenile coho salmon, Oncorynchus kisutch, aged by scales. Results for juvenile coho salmon indicate greater than 95% accuracy can be achieved by aging fish using length thresholds estimated from mixture models. Highest accuracy is achieved when aged fish are compared to length thresholds generated from samples from the same drainage, time of year, and habitat type (lentic versus lotic), although relatively high aging accuracy can still be achieved when thresholds are extrapolated to fish from populations in different years or drainages. Fork length-based aging thresholds are applicable for taxa for which multiple age cohorts coexist sympatrically. Where applicable, the method of aging individual fish is relatively quick to implement and can avoid ager interpretation bias common in scale-based aging.
A Distributed Wireless Camera System for the Management of Parking Spaces
Melničuk, Petr
2017-01-01
The importance of detection of parking space availability is still growing, particularly in major cities. This paper deals with the design of a distributed wireless camera system for the management of parking spaces, which can determine occupancy of the parking space based on the information from multiple cameras. The proposed system uses small camera modules based on Raspberry Pi Zero and computationally efficient algorithm for the occupancy detection based on the histogram of oriented gradients (HOG) feature descriptor and support vector machine (SVM) classifier. We have included information about the orientation of the vehicle as a supporting feature, which has enabled us to achieve better accuracy. The described solution can deliver occupancy information at the rate of 10 parking spaces per second with more than 90% accuracy in a wide range of conditions. Reliability of the implemented algorithm is evaluated with three different test sets which altogether contain over 700,000 samples of parking spaces. PMID:29283371
Jet Topics: Disentangling Quarks and Gluons at Colliders
NASA Astrophysics Data System (ADS)
Metodiev, Eric M.; Thaler, Jesse
2018-06-01
We introduce jet topics: a framework to identify underlying classes of jets from collider data. Because of a close mathematical relationship between distributions of observables in jets and emergent themes in sets of documents, we can apply recent techniques in "topic modeling" to extract jet topics from the data with minimal or no input from simulation or theory. As a proof of concept with parton shower samples, we apply jet topics to determine separate quark and gluon jet distributions for constituent multiplicity. We also determine separate quark and gluon rapidity spectra from a mixed Z -plus-jet sample. While jet topics are defined directly from hadron-level multidifferential cross sections, one can also predict jet topics from first-principles theoretical calculations, with potential implications for how to define quark and gluon jets beyond leading-logarithmic accuracy. These investigations suggest that jet topics will be useful for extracting underlying jet distributions and fractions in a wide range of contexts at the Large Hadron Collider.
Gilliom, Robert J.; Helsel, Dennis R.
1986-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1986-02-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensoredmore » observations, for determining the best performing parameter estimation method for any particular data det. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.« less
Estimation of distributional parameters for censored trace-level water-quality data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1984-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water-sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations,more » for determining the best-performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least-squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification. 6 figs., 6 tabs.« less
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-01-01
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer’s and user’s accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas. PMID:27879661
Saad, David A.; Schwarz, Gregory E.; Robertson, Dale M.; Booth, Nathaniel
2011-01-01
Stream-loading information was compiled from federal, state, and local agencies, and selected universities as part of an effort to develop regional SPAtially Referenced Regressions On Watershed attributes (SPARROW) models to help describe the distribution, sources, and transport of nutrients in streams throughout much of the United States. After screening, 2,739 sites, sampled by 73 agencies, were identified as having suitable data for calculating long-term mean annual nutrient loads required for SPARROW model calibration. These sites had a wide range in nutrient concentrations, loads, and yields, and environmental characteristics in their basins. An analysis of the accuracy in load estimates relative to site attributes indicated that accuracy in loads improve with increases in the number of observations, the proportion of uncensored data, and the variability in flow on observation days, whereas accuracy declines with increases in the root mean square error of the water-quality model, the flow-bias ratio, the number of days between samples, the variability in daily streamflow for the prediction period, and if the load estimate has been detrended. Based on compiled data, all areas of the country had recent declines in the number of sites with sufficient water-quality data to compute accurate annual loads and support regional modeling analyses. These declines were caused by decreases in the number of sites being sampled and data not being entered in readily accessible databases.
The fractionation of nickel between olivine and augite as a geothermometer
Hakli, T.A.; Wright, T.L.
1967-01-01
The coexisting olivine, clinopyroxene and glass of five samples collected from the Makaopuhi lava lake in Hawaii, at temperatures ranging from 1050 to 1160??C were analysed for nickel with an electron probe microanalyser. The results strongly suggest that the distribution of nickel between these three phase pairs well obeys the thermodynamic partition law, and that under favourable conditions, the distribution coefficients permit the estimation of the crystallisation temperature within an accuracy of 10-20??C. It is concluded that the application of the Makaopuhi data to plutonic and to other volcanic rocks should be carried out with caution because the effect of pressure and the changing composition of the phases upon the numerical values of the distribution coefficients is not known quantitatively. ?? 1967.
Gesteme-free context-aware adaptation of robot behavior in human-robot cooperation.
Nessi, Federico; Beretta, Elisa; Gatti, Cecilia; Ferrigno, Giancarlo; De Momi, Elena
2016-11-01
Cooperative robotics is receiving greater acceptance because the typical advantages provided by manipulators are combined with an intuitive usage. In particular, hands-on robotics may benefit from the adaptation of the assistant behavior with respect to the activity currently performed by the user. A fast and reliable classification of human activities is required, as well as strategies to smoothly modify the control of the manipulator. In this scenario, gesteme-based motion classification is inadequate because it needs the observation of a wide signal percentage and the definition of a rich vocabulary. In this work, a system able to recognize the user's current activity without a vocabulary of gestemes, and to accordingly adapt the manipulator's dynamic behavior is presented. An underlying stochastic model fits variations in the user's guidance forces and the resulting trajectories of the manipulator's end-effector with a set of Gaussian distribution. The high-level switching between these distributions is captured with hidden Markov models. The dynamic of the KUKA light-weight robot, a torque-controlled manipulator, is modified with respect to the classified activity using sigmoidal-shaped functions. The presented system is validated over a pool of 12 näive users in a scenario that addresses surgical targeting tasks on soft tissue. The robot's assistance is adapted in order to obtain a stiff behavior during activities that require critical accuracy constraint, and higher compliance during wide movements. Both the ability to provide the correct classification at each moment (sample accuracy) and the capability of correctly identify the correct sequence of activity (sequence accuracy) were evaluated. The proposed classifier is fast and accurate in all the experiments conducted (80% sample accuracy after the observation of ∼450ms of signal). Moreover, the ability of recognize the correct sequence of activities, without unwanted transitions is guaranteed (sequence accuracy ∼90% when computed far away from user desired transitions). Finally, the proposed activity-based adaptation of the robot's dynamic does not lead to a not smooth behavior (high smoothness, i.e. normalized jerk score <0.01). The provided system is able to dynamic assist the operator during cooperation in the presented scenario. Copyright © 2016 Elsevier B.V. All rights reserved.
Transfer learning for bimodal biometrics recognition
NASA Astrophysics Data System (ADS)
Dan, Zhiping; Sun, Shuifa; Chen, Yanfei; Gan, Haitao
2013-10-01
Biometrics recognition aims to identify and predict new personal identities based on their existing knowledge. As the use of multiple biometric traits of the individual may enables more information to be used for recognition, it has been proved that multi-biometrics can produce higher accuracy than single biometrics. However, a common problem with traditional machine learning is that the training and test data should be in the same feature space, and have the same underlying distribution. If the distributions and features are different between training and future data, the model performance often drops. In this paper, we propose a transfer learning method for face recognition on bimodal biometrics. The training and test samples of bimodal biometric images are composed of the visible light face images and the infrared face images. Our algorithm transfers the knowledge across feature spaces, relaxing the assumption of same feature space as well as same underlying distribution by automatically learning a mapping between two different but somewhat similar face images. According to the experiments in the face images, the results show that the accuracy of face recognition has been greatly improved by the proposed method compared with the other previous methods. It demonstrates the effectiveness and robustness of our method.
Distributed database kriging for adaptive sampling (D²KAS)
Roehm, Dominic; Pavel, Robert S.; Barros, Kipton; ...
2015-03-18
We present an adaptive sampling method supplemented by a distributed database and a prediction method for multiscale simulations using the Heterogeneous Multiscale Method. A finite-volume scheme integrates the macro-scale conservation laws for elastodynamics, which are closed by momentum and energy fluxes evaluated at the micro-scale. In the original approach, molecular dynamics (MD) simulations are launched for every macro-scale volume element. Our adaptive sampling scheme replaces a large fraction of costly micro-scale MD simulations with fast table lookup and prediction. The cloud database Redis provides the plain table lookup, and with locality aware hashing we gather input data for our predictionmore » scheme. For the latter we use kriging, which estimates an unknown value and its uncertainty (error) at a specific location in parameter space by using weighted averages of the neighboring points. We find that our adaptive scheme significantly improves simulation performance by a factor of 2.5 to 25, while retaining high accuracy for various choices of the algorithm parameters.« less
NASA Astrophysics Data System (ADS)
Benninghoff, L.; von Czarnowski, D.; Denkhaus, E.; Lemke, K.
1997-07-01
For the determination of trace element distributions of more than 20 elements in malignant and normal tissues of the human colon, tissue samples (approx. 400 mg wet weight) were digested with 3 ml of nitric acid (sub-boiled quality) by use of an autoclave system. The accuracy of measurements has been investigated by using certified materials. The analytical results were evaluated by using a spreadsheet program to give an overview of the element distribution in cancerous samples and in normal colon tissues. A further application, cluster analysis of the analytical results, was introduced to demonstrate the possibility of classification for cancer diagnosis. To confirm the results of cluster analysis, multivariate three-way principal component analysis was performed. Additionally, microtome frozen sections (10 μm) were prepared from the same tissue samples to compare the analytical results, i.e. the mass fractions of elements, according to the preparation method and to exclude systematic errors depending on the inhomogeneity of the tissues.
Lin, Yu-Pin; Chu, Hone-Jay; Huang, Yu-Long; Tang, Chia-Hsi; Rouhani, Shahrokh
2011-06-01
This study develops a stratified conditional Latin hypercube sampling (scLHS) approach for multiple, remotely sensed, normalized difference vegetation index (NDVI) images. The objective is to sample, monitor, and delineate spatiotemporal landscape changes, including spatial heterogeneity and variability, in a given area. The scLHS approach, which is based on the variance quadtree technique (VQT) and the conditional Latin hypercube sampling (cLHS) method, selects samples in order to delineate landscape changes from multiple NDVI images. The images are then mapped for calibration and validation by using sequential Gaussian simulation (SGS) with the scLHS selected samples. Spatial statistical results indicate that in terms of their statistical distribution, spatial distribution, and spatial variation, the statistics and variograms of the scLHS samples resemble those of multiple NDVI images more closely than those of cLHS and VQT samples. Moreover, the accuracy of simulated NDVI images based on SGS with scLHS samples is significantly better than that of simulated NDVI images based on SGS with cLHS samples and VQT samples, respectively. However, the proposed approach efficiently monitors the spatial characteristics of landscape changes, including the statistics, spatial variability, and heterogeneity of NDVI images. In addition, SGS with the scLHS samples effectively reproduces spatial patterns and landscape changes in multiple NDVI images.
Comparing ordinary kriging and inverse distance weighting for soil as pollution in Beijing.
Qiao, Pengwei; Lei, Mei; Yang, Sucai; Yang, Jun; Guo, Guanghui; Zhou, Xiaoyong
2018-06-01
Spatial interpolation method is the basis of soil heavy metal pollution assessment and remediation. The existing evaluation index for interpolation accuracy did not combine with actual situation. The selection of interpolation methods needs to be based on specific research purposes and research object characteristics. In this paper, As pollution in soils of Beijing was taken as an example. The prediction accuracy of ordinary kriging (OK) and inverse distance weighted (IDW) were evaluated based on the cross validation results and spatial distribution characteristics of influencing factors. The results showed that, under the condition of specific spatial correlation, the cross validation results of OK and IDW for every soil point and the prediction accuracy of spatial distribution trend are similar. But the prediction accuracy of OK for the maximum and minimum is less than IDW, while the number of high pollution areas identified by OK are less than IDW. It is difficult to identify the high pollution areas fully by OK, which shows that the smoothing effect of OK is obvious. In addition, with increasing of the spatial correlation of As concentration, the cross validation error of OK and IDW decreases, and the high pollution area identified by OK is approaching the result of IDW, which can identify the high pollution areas more comprehensively. However, because the semivariogram constructed by OK interpolation method is more subjective and requires larger number of soil samples, IDW is more suitable for spatial prediction of heavy metal pollution in soils.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Zhaoying; Liu, Jia; Zhou, Yufan
It has been very difficult to use popular elemental imaging techniques to image Li and B distribution in glass samples with nanoscale resolution. In this study, atom probe tomography (APT), time-of-flight secondary ion mass spectrometry (ToF-SIMS), and nanoscale secondary ion mass spectrometry (NanoSIMS) were used to image the distribution of Li and B in two representative glass samples. APT can provide three-dimensional Li and B imaging with very high spatial resolution (≤ 2 nm). In addition, absolute quantification of Li and B is possible, though room remains to improve accuracy. However, the major drawbacks of APT include limited field ofmore » view (normally ≤ 100 × 100 × 500 nm 3) and poor sample compatibility. As a comparison, ToF-SIMS and NanoSIMS are sample-friendly with flexible field of view (up to 500 × 500 μm 2 and image stitching is feasible); however, lateral resolution is limited to only about 100 nm. Therefore, SIMS and APT can be regarded as complementary techniques for nanoscale imaging Li and B in glass and other novel materials.« less
NASA Astrophysics Data System (ADS)
Wrable-Rose, Madeline; Primera-Pedrozo, Oliva M.; Pacheco-Londoño, Leonardo C.; Hernandez-Rivera, Samuel P.
2010-12-01
This research examines the surface contamination properties, trace sample preparation methodologies, detection systems response and generation of explosive contamination standards for trace detection systems. Homogeneous and reproducible sample preparation is relevant for trace detection of chemical threats, such as warfare agents, highly energetic materials (HEM) and toxic industrial chemicals. The objective of this research was to develop a technology capable of producing samples and standards of HEM with controlled size and distribution on a substrate to generate specimens that would reproduce real contamination conditions. The research activities included (1) a study of the properties of particles generated by two deposition techniques: sample smearing deposition and inkjet deposition, on gold-coated silicon, glass and stainless steel substrates; (2) characterization of composition, distribution and adhesion characteristics of deposits; (3) evaluation of accuracy and reproducibility for depositing neat highly energetic materials such as TNT, RDX and ammonium nitrate; (4) a study of HEM-surface interactions using FTIR-RAIRS; and (5) establishment of protocols for validation of surface concentration using destructive methods such as HPLC.
The NYU inverse swept wing code
NASA Technical Reports Server (NTRS)
Bauer, F.; Garabedian, P.; Mcfadden, G.
1983-01-01
An inverse swept wing code is described that is based on the widely used transonic flow program FLO22. The new code incorporates a free boundary algorithm permitting the pressure distribution to be prescribed over a portion of the wing surface. A special routine is included to calculate the wave drag, which can be minimized in its dependence on the pressure distribution. An alternate formulation of the boundary condition at infinity was introduced to enhance the speed and accuracy of the code. A FORTRAN listing of the code and a listing of a sample run are presented. There is also a user's manual as well as glossaries of input and output parameters.
Quantifying the effect of 3D spatial resolution on the accuracy of microstructural distributions
NASA Astrophysics Data System (ADS)
Loughnane, Gregory; Groeber, Michael; Uchic, Michael; Riley, Matthew; Shah, Megna; Srinivasan, Raghavan; Grandhi, Ramana
The choice of spatial resolution for experimentally-collected 3D microstructural data is often governed by general rules of thumb. For example, serial section experiments often strive to collect at least ten sections through the average feature-of-interest. However, the desire to collect high resolution data in 3D is greatly tempered by the exponential growth in collection times and data storage requirements. This paper explores the use of systematic down-sampling of synthetically-generated grain microstructures to examine the effect of resolution on the calculated distributions of microstructural descriptors such as grain size, number of nearest neighbors, aspect ratio, and Ω3.
An assessment of the direction-finding accuracy of bat biosonar beampatterns.
Gilani, Uzair S; Müller, Rolf
2016-02-01
In the biosonar systems of bats, emitted acoustic energy and receiver sensitivity are distributed over direction and frequency through beampattern functions that have diverse and often complicated geometries. This complexity could be used by the animals to determine the direction of incoming sounds based on spectral signatures. The present study has investigated how well bat biosonar beampatterns are suited for direction finding using a measure of the smallest estimator variance that is possible for a given direction [Cramér-Rao lower bound (CRLB)]. CRLB values were estimated for numerical beampattern estimates derived from 330 individual shape samples, 157 noseleaves (used for emission), and 173 outer ears (pinnae). At an assumed 60 dB signal-to-noise ratio, the average value of the CRLB was 3.9°, which is similar to previous behavioral findings. Distribution for the CRLBs in individual beampatterns had a positive skew indicating the existence of regions where a given beampattern does not support a high accuracy. The highest supported accuracies were for direction finding in elevation (with the exception of phyllostomid emission patterns). No large, obvious differences in the CRLB (greater 2° in the mean) were found between the investigated major taxonomic groups, suggesting that different bat species have access to similar direction-finding information.
Micro/Nano-scale Strain Distribution Measurement from Sampling Moiré Fringes.
Wang, Qinghua; Ri, Shien; Tsuda, Hiroshi
2017-05-23
This work describes the measurement procedure and principles of a sampling moiré technique for full-field micro/nano-scale deformation measurements. The developed technique can be performed in two ways: using the reconstructed multiplication moiré method or the spatial phase-shifting sampling moiré method. When the specimen grid pitch is around 2 pixels, 2-pixel sampling moiré fringes are generated to reconstruct a multiplication moiré pattern for a deformation measurement. Both the displacement and strain sensitivities are twice as high as in the traditional scanning moiré method in the same wide field of view. When the specimen grid pitch is around or greater than 3 pixels, multi-pixel sampling moiré fringes are generated, and a spatial phase-shifting technique is combined for a full-field deformation measurement. The strain measurement accuracy is significantly improved, and automatic batch measurement is easily achievable. Both methods can measure the two-dimensional (2D) strain distributions from a single-shot grid image without rotating the specimen or scanning lines, as in traditional moiré techniques. As examples, the 2D displacement and strain distributions, including the shear strains of two carbon fiber-reinforced plastic specimens, were measured in three-point bending tests. The proposed technique is expected to play an important role in the non-destructive quantitative evaluations of mechanical properties, crack occurrences, and residual stresses of a variety of materials.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pena, J.E.; Mannion, C.; Amalin, D.
2007-03-15
Taylor's power law and Iwao's patchiness regression were used to analyze spatial distribution of eggs of the Diaprepes root weevil, Diaprepes abbreviatus (L.), on silver buttonwood trees, Conocarpus erectus, during 1997 and 1998. Taylor's power law and Iwao's patchiness regression provided similar descriptions of variance-mean relationship for egg distribution within trees. Sample size requirements were determined. Information presented in this paper should help to improve accuracy and efficiency in sampling of the weevil eggs in the future. (author) [Spanish] Se utilizaron la ley de Taylor y la regresion de Iwao para analizar la distribucion de los huevos del picudo Diaprepes,more » Diaprepes abbreviatus (L.) en arboles de boton plateado, Conocarpus erectus. Los estudios fueron realizados durante 1997 y 1998. Tanto la ley de Taylor como la regression de Iwao dieron resultados similares en cuanto a la relacion de la varianza y el promedio para la distribucion de huevos del picudo en los arboles. Se determinaron los requerimentos del tamano del numero de muestras. En un futuro, la informacion que se presenta en este articulo puede ayudar a mejorar la eficiencia del muestreo de huevos de este picudo. (author)« less
NASA Astrophysics Data System (ADS)
Chen, Ming; Guo, Jiming; Li, Zhicai; Zhang, Peng; Wu, Junli; Song, Weiwei
2017-04-01
BDS precision orbit determination is a key content of the BDS application, but the inadequate ground stations and the poor distribution of the network are the main reasons for the low accuracy of BDS precise orbit determination. In this paper, the BDS precise orbit determination results are obtained by using the IGS MGEX stations and the Chinese national reference stations,the accuracy of orbit determination of GEO, IGSO and MEO is 10.3cm, 2.8cm and 3.2cm, and the radial accuracy is 1.6cm,1.9cm and 1.5cm.The influence of ground reference stations distribution on BDS precise orbit determination is studied. The results show that the Chinese national reference stations contribute significantly to the BDS orbit determination, the overlap precision of GEO/IGSO/MEO satellites were improved by 15.5%, 57.5% and 5.3% respectively after adding the Chinese stations.Finally, the results of ODOP(orbit distribution of precision) and SLR are verified. Key words: BDS precise orbit determination; accuracy assessment;Chinese national reference stations;reference stations distribution;orbit distribution of precision
NASA Astrophysics Data System (ADS)
Ye, Su; Pontius, Robert Gilmore; Rakshit, Rahul
2018-07-01
Object-based image analysis (OBIA) has gained widespread popularity for creating maps from remotely sensed data. Researchers routinely claim that OBIA procedures outperform pixel-based procedures; however, it is not immediately obvious how to evaluate the degree to which an OBIA map compares to reference information in a manner that accounts for the fact that the OBIA map consists of objects that vary in size and shape. Our study reviews 209 journal articles concerning OBIA published between 2003 and 2017. We focus on the three stages of accuracy assessment: (1) sampling design, (2) response design and (3) accuracy analysis. First, we report the literature's overall characteristics concerning OBIA accuracy assessment. Simple random sampling was the most used method among probability sampling strategies, slightly more than stratified sampling. Office interpreted remotely sensed data was the dominant reference source. The literature reported accuracies ranging from 42% to 96%, with an average of 85%. A third of the articles failed to give sufficient information concerning accuracy methodology such as sampling scheme and sample size. We found few studies that focused specifically on the accuracy of the segmentation. Second, we identify a recent increase of OBIA articles in using per-polygon approaches compared to per-pixel approaches for accuracy assessment. We clarify the impacts of the per-pixel versus the per-polygon approaches respectively on sampling, response design and accuracy analysis. Our review defines the technical and methodological needs in the current per-polygon approaches, such as polygon-based sampling, analysis of mixed polygons, matching of mapped with reference polygons and assessment of segmentation accuracy. Our review summarizes and discusses the current issues in object-based accuracy assessment to provide guidance for improved accuracy assessments for OBIA.
NASA Astrophysics Data System (ADS)
Voloshin, A. E.; Prostomolotov, A. I.; Verezub, N. A.
2016-11-01
The paper deals with the analysis of the accuracy of some one-dimensional (1D) analytical models of the axial distribution of impurities in the crystal grown from a melt. The models proposed by Burton-Prim-Slichter, Ostrogorsky-Muller and Garandet with co-authors are considered, these models are compared to the results of a two-dimensional (2D) numerical simulation. Stationary solutions as well as solutions for the initial transient regime obtained using these models are considered. The sources of errors are analyzed, a conclusion is made about the applicability of 1D analytical models for quantitative estimates of impurity incorporation into the crystal sample as well as for the solution of the inverse problems.
Distributed medical image analysis and diagnosis through crowd-sourced games: a malaria case study.
Mavandadi, Sam; Dimitrov, Stoyan; Feng, Steve; Yu, Frank; Sikora, Uzair; Yaglidere, Oguzhan; Padmanabhan, Swati; Nielsen, Karin; Ozcan, Aydogan
2012-01-01
In this work we investigate whether the innate visual recognition and learning capabilities of untrained humans can be used in conducting reliable microscopic analysis of biomedical samples toward diagnosis. For this purpose, we designed entertaining digital games that are interfaced with artificial learning and processing back-ends to demonstrate that in the case of binary medical diagnostics decisions (e.g., infected vs. uninfected), with the use of crowd-sourced games it is possible to approach the accuracy of medical experts in making such diagnoses. Specifically, using non-expert gamers we report diagnosis of malaria infected red blood cells with an accuracy that is within 1.25% of the diagnostics decisions made by a trained medical professional.
Identifying High-Rate Flows Based on Sequential Sampling
NASA Astrophysics Data System (ADS)
Zhang, Yu; Fang, Binxing; Luo, Hao
We consider the problem of fast identification of high-rate flows in backbone links with possibly millions of flows. Accurate identification of high-rate flows is important for active queue management, traffic measurement and network security such as detection of distributed denial of service attacks. It is difficult to directly identify high-rate flows in backbone links because tracking the possible millions of flows needs correspondingly large high speed memories. To reduce the measurement overhead, the deterministic 1-out-of-k sampling technique is adopted which is also implemented in Cisco routers (NetFlow). Ideally, a high-rate flow identification method should have short identification time, low memory cost and processing cost. Most importantly, it should be able to specify the identification accuracy. We develop two such methods. The first method is based on fixed sample size test (FSST) which is able to identify high-rate flows with user-specified identification accuracy. However, since FSST has to record every sampled flow during the measurement period, it is not memory efficient. Therefore the second novel method based on truncated sequential probability ratio test (TSPRT) is proposed. Through sequential sampling, TSPRT is able to remove the low-rate flows and identify the high-rate flows at the early stage which can reduce the memory cost and identification time respectively. According to the way to determine the parameters in TSPRT, two versions of TSPRT are proposed: TSPRT-M which is suitable when low memory cost is preferred and TSPRT-T which is suitable when short identification time is preferred. The experimental results show that TSPRT requires less memory and identification time in identifying high-rate flows while satisfying the accuracy requirement as compared to previously proposed methods.
Analysis of Compression Algorithm in Ground Collision Avoidance Systems (Auto-GCAS)
NASA Technical Reports Server (NTRS)
Schmalz, Tyler; Ryan, Jack
2011-01-01
Automatic Ground Collision Avoidance Systems (Auto-GCAS) utilizes Digital Terrain Elevation Data (DTED) stored onboard a plane to determine potential recovery maneuvers. Because of the current limitations of computer hardware on military airplanes such as the F-22 and F-35, the DTED must be compressed through a lossy technique called binary-tree tip-tilt. The purpose of this study is to determine the accuracy of the compressed data with respect to the original DTED. This study is mainly interested in the magnitude of the error between the two as well as the overall distribution of the errors throughout the DTED. By understanding how the errors of the compression technique are affected by various factors (topography, density of sampling points, sub-sampling techniques, etc.), modifications can be made to the compression technique resulting in better accuracy. This, in turn, would minimize unnecessary activation of A-GCAS during flight as well as maximizing its contribution to fighter safety.
ACTION-SPACE CLUSTERING OF TIDAL STREAMS TO INFER THE GALACTIC POTENTIAL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sanderson, Robyn E.; Helmi, Amina; Hogg, David W., E-mail: robyn@astro.columbia.edu
2015-03-10
We present a new method for constraining the Milky Way halo gravitational potential by simultaneously fitting multiple tidal streams. This method requires three-dimensional positions and velocities for all stars to be fit, but does not require identification of any specific stream or determination of stream membership for any star. We exploit the principle that the action distribution of stream stars is most clustered when the potential used to calculate the actions is closest to the true potential. Clustering is quantified with the Kullback-Leibler Divergence (KLD), which also provides conditional uncertainties for our parameter estimates. We show, for toy Gaia-like datamore » in a spherical isochrone potential, that maximizing the KLD of the action distribution relative to a smoother distribution recovers the input potential. The precision depends on the observational errors and number of streams; using K III giants as tracers, we measure the enclosed mass at the average radius of the sample stars accurate to 3% and precise to 20%-40%. Recovery of the scale radius is precise to 25%, biased 50% high by the small galactocentric distance range of stars in our mock sample (1-25 kpc, or about three scale radii, with mean 6.5 kpc). 20-25 streams with at least 100 stars each are required for a stable confidence interval. With radial velocities (RVs) to 100 kpc, all parameters are determined with ∼10% accuracy and 20% precision (1.3% accuracy for the enclosed mass), underlining the need to complete the RV catalog for faint halo stars observed by Gaia.« less
Dronova, Iryna; Spotswood, Erica N.; Suding, Katharine N.
2017-01-01
Understanding spatial distributions of invasive plant species at early infestation stages is critical for assessing the dynamics and underlying factors of invasions. Recent progress in very high resolution remote sensing is facilitating this task by providing high spatial detail over whole-site extents that are prohibitive to comprehensive ground surveys. This study assessed the opportunities and constraints to characterize landscape distribution of the invasive grass medusahead (Elymus caput-medusae) in a ∼36.8 ha grassland in California, United States from 0.15m-resolution visible/near-infrared aerial imagery at the stage of late spring phenological contrast with dominant grasses. We compared several object-based unsupervised, single-run supervised and hierarchical approaches to classify medusahead using spectral, textural, and contextual variables. Fuzzy accuracy assessment indicated that 44–100% of test medusahead samples were matched by its classified extents from different methods, while 63–83% of test samples classified as medusahead had this class as an acceptable candidate. Main sources of error included spectral similarity between medusahead and other green species and mixing of medusahead with other vegetation at variable densities. Adding texture attributes to spectral variables increased the accuracy of most classification methods, corroborating the informative value of local patterns under limited spectral data. The highest accuracy across different metrics was shown by the supervised single-run support vector machine with seven vegetation classes and Bayesian algorithms with three vegetation classes; however, their medusahead allocations showed some “spillover” effects due to misclassifications with other green vegetation. This issue was addressed by more complex hierarchical approaches, though their final accuracy did not exceed the best single-run methods. However, the comparison of classified medusahead extents with field segments of its patches overlapping with survey transects indicated that most methods tended to miss and/or over-estimate the length of the smallest patches and under-estimate the largest ones due to classification errors. Overall, the study outcomes support the potential of cost-effective, very high-resolution sensing for the site-scale detection of infestation hotspots that can be customized to plant phenological schedules. However, more accurate medusahead patch delineation in mixed-cover grasslands would benefit from testing hyperspectral data and using our study’s framework to inform and constrain the candidate vegetation classes in heterogeneous locations. PMID:28611806
Dronova, Iryna; Spotswood, Erica N; Suding, Katharine N
2017-01-01
Understanding spatial distributions of invasive plant species at early infestation stages is critical for assessing the dynamics and underlying factors of invasions. Recent progress in very high resolution remote sensing is facilitating this task by providing high spatial detail over whole-site extents that are prohibitive to comprehensive ground surveys. This study assessed the opportunities and constraints to characterize landscape distribution of the invasive grass medusahead ( Elymus caput-medusae ) in a ∼36.8 ha grassland in California, United States from 0.15m-resolution visible/near-infrared aerial imagery at the stage of late spring phenological contrast with dominant grasses. We compared several object-based unsupervised, single-run supervised and hierarchical approaches to classify medusahead using spectral, textural, and contextual variables. Fuzzy accuracy assessment indicated that 44-100% of test medusahead samples were matched by its classified extents from different methods, while 63-83% of test samples classified as medusahead had this class as an acceptable candidate. Main sources of error included spectral similarity between medusahead and other green species and mixing of medusahead with other vegetation at variable densities. Adding texture attributes to spectral variables increased the accuracy of most classification methods, corroborating the informative value of local patterns under limited spectral data. The highest accuracy across different metrics was shown by the supervised single-run support vector machine with seven vegetation classes and Bayesian algorithms with three vegetation classes; however, their medusahead allocations showed some "spillover" effects due to misclassifications with other green vegetation. This issue was addressed by more complex hierarchical approaches, though their final accuracy did not exceed the best single-run methods. However, the comparison of classified medusahead extents with field segments of its patches overlapping with survey transects indicated that most methods tended to miss and/or over-estimate the length of the smallest patches and under-estimate the largest ones due to classification errors. Overall, the study outcomes support the potential of cost-effective, very high-resolution sensing for the site-scale detection of infestation hotspots that can be customized to plant phenological schedules. However, more accurate medusahead patch delineation in mixed-cover grasslands would benefit from testing hyperspectral data and using our study's framework to inform and constrain the candidate vegetation classes in heterogeneous locations.
Selbig, William R.
2017-01-01
Collection of water-quality samples that accurately characterize average particle concentrations and distributions in channels can be complicated by large sources of variability. The U.S. Geological Survey (USGS) developed a fully automated Depth-Integrated Sample Arm (DISA) as a way to reduce bias and improve accuracy in water-quality concentration data. The DISA was designed to integrate with existing autosampler configurations commonly used for the collection of water-quality samples in vertical profile thereby providing a better representation of average suspended sediment and sediment-associated pollutant concentrations and distributions than traditional fixed-point samplers. In controlled laboratory experiments, known concentrations of suspended sediment ranging from 596 to 1,189 mg/L were injected into a 3 foot diameter closed channel (circular pipe) with regulated flows ranging from 1.4 to 27.8 ft3 /s. Median suspended sediment concentrations in water-quality samples collected using the DISA were within 7 percent of the known, injected value compared to 96 percent for traditional fixed-point samplers. Field evaluation of this technology in open channel fluvial systems showed median differences between paired DISA and fixed-point samples to be within 3 percent. The range of particle size measured in the open channel was generally that of clay and silt. Differences between the concentration and distribution measured between the two sampler configurations could potentially be much larger in open channels that transport larger particles, such as sand.
Estimation of AUC or Partial AUC under Test-Result-Dependent Sampling.
Wang, Xiaofei; Ma, Junling; George, Stephen; Zhou, Haibo
2012-01-01
The area under the ROC curve (AUC) and partial area under the ROC curve (pAUC) are summary measures used to assess the accuracy of a biomarker in discriminating true disease status. The standard sampling approach used in biomarker validation studies is often inefficient and costly, especially when ascertaining the true disease status is costly and invasive. To improve efficiency and reduce the cost of biomarker validation studies, we consider a test-result-dependent sampling (TDS) scheme, in which subject selection for determining the disease state is dependent on the result of a biomarker assay. We first estimate the test-result distribution using data arising from the TDS design. With the estimated empirical test-result distribution, we propose consistent nonparametric estimators for AUC and pAUC and establish the asymptotic properties of the proposed estimators. Simulation studies show that the proposed estimators have good finite sample properties and that the TDS design yields more efficient AUC and pAUC estimates than a simple random sampling (SRS) design. A data example based on an ongoing cancer clinical trial is provided to illustrate the TDS design and the proposed estimators. This work can find broad applications in design and analysis of biomarker validation studies.
Boykin, K.G.; Thompson, B.C.; Propeck-Gray, S.
2010-01-01
Despite widespread and long-standing efforts to model wildlife-habitat associations using remotely sensed and other spatially explicit data, there are relatively few evaluations of the performance of variables included in predictive models relative to actual features on the landscape. As part of the National Gap Analysis Program, we specifically examined physical site features at randomly selected sample locations in the Southwestern U.S. to assess degree of concordance with predicted features used in modeling vertebrate habitat distribution. Our analysis considered hypotheses about relative accuracy with respect to 30 vertebrate species selected to represent the spectrum of habitat generalist to specialist and categorization of site by relative degree of conservation emphasis accorded to the site. Overall comparison of 19 variables observed at 382 sample sites indicated ???60% concordance for 12 variables. Directly measured or observed variables (slope, soil composition, rock outcrop) generally displayed high concordance, while variables that required judgments regarding descriptive categories (aspect, ecological system, landform) were less concordant. There were no differences detected in concordance among taxa groups, degree of specialization or generalization of selected taxa, or land conservation categorization of sample sites with respect to all sites. We found no support for the hypothesis that accuracy of habitat models is inversely related to degree of taxa specialization when model features for a habitat specialist could be more difficult to represent spatially. Likewise, we did not find support for the hypothesis that physical features will be predicted with higher accuracy on lands with greater dedication to biodiversity conservation than on other lands because of relative differences regarding available information. Accuracy generally was similar (>60%) to that observed for land cover mapping at the ecological system level. These patterns demonstrate resilience of gap analysis deductive model processes to the type of remotely sensed or interpreted data used in habitat feature predictions. ?? 2010 Elsevier B.V.
MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data
Hu, Jiyuan; Li, Tengfei; Xiu, Zidi; Zhang, Hong
2015-01-01
Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and there does not exist any SNP caller that produces p-values for calling SNPs in a frequentist framework. To fill in this gap, we develop a new method MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situation, the involved parameter is very close to the boundary of the parametric space, so the standard large sample property is not suitable to evaluate the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package “MAFsnp” implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/. PMID:26309201
Fracture network created by 3D printer and its validation using CT images
NASA Astrophysics Data System (ADS)
Suzuki, A.; Watanabe, N.; Li, K.; Horne, R. N.
2017-12-01
Understanding flow mechanisms in fractured media is essential for geoscientific research and geological development industries. This study used 3D printed fracture networks in order to control the properties of fracture distributions inside the sample. The accuracy and appropriateness of creating samples by the 3D printer was investigated by using a X-ray CT scanner. The CT scan images suggest that the 3D printer is able to reproduce complex three-dimensional spatial distributions of fracture networks. Use of hexane after printing was found to be an effective way to remove wax for the post-treatment. Local permeability was obtained by the cubic law and used to calculate the global mean. The experimental value of the permeability was between the arithmetic and geometric means of the numerical results, which is consistent with conventional studies. This methodology based on 3D printed fracture networks can help validate existing flow modeling and numerical methods.
A random-sum Wilcoxon statistic and its application to analysis of ROC and LROC data.
Tang, Liansheng Larry; Balakrishnan, N
2011-01-01
The Wilcoxon-Mann-Whitney statistic is commonly used for a distribution-free comparison of two groups. One requirement for its use is that the sample sizes of the two groups are fixed. This is violated in some of the applications such as medical imaging studies and diagnostic marker studies; in the former, the violation occurs since the number of correctly localized abnormal images is random, while in the latter the violation is due to some subjects not having observable measurements. For this reason, we propose here a random-sum Wilcoxon statistic for comparing two groups in the presence of ties, and derive its variance as well as its asymptotic distribution for large sample sizes. The proposed statistic includes the regular Wilcoxon rank-sum statistic. Finally, we apply the proposed statistic for summarizing location response operating characteristic data from a liver computed tomography study, and also for summarizing diagnostic accuracy of biomarker data.
WAMS measurements pre-processing for detecting low-frequency oscillations in power systems
NASA Astrophysics Data System (ADS)
Kovalenko, P. Y.
2017-07-01
Processing the data received from measurement systems implies the situation when one or more registered values stand apart from the sample collection. These values are referred to as “outliers”. The processing results may be influenced significantly by the presence of those in the data sample under consideration. In order to ensure the accuracy of low-frequency oscillations detection in power systems the corresponding algorithm has been developed for the outliers detection and elimination. The algorithm is based on the concept of the irregular component of measurement signal. This component comprises measurement errors and is assumed to be Gauss-distributed random. The median filtering is employed to detect the values lying outside the range of the normally distributed measurement error on the basis of a 3σ criterion. The algorithm has been validated involving simulated signals and WAMS data as well.
Izco, J M; Tormo, M; Harris, A; Tong, P S; Jimenez-Flores, R
2003-01-01
Quantification of phosphate and citrate compounds is very important because their distribution between soluble and colloidal phases of milk and their interactions with milk proteins influence the stability and some functional properties of dairy products. The aim of this work was to optimize and validate a capillary electrophoresis method for the rapid determination of these compounds in milk. Various parameters affecting analysis have been optimized, including type, composition, and pH of the electrolyte, and sample extraction. Ethanol, acetonitrile, sulfuric acid, water at 50 degrees C or at room temperature were tested as sample buffers (SB). Water at room temperature yielded the best overall results and was chosen for further validation. The extraction time was checked and could be shortened to less than 1 min. Also, sample preparation was simplified to pipet 12 microl of milk into 1 ml of water containing 20 ppm of tartaric acid as an internal standard. The linearity of the method was excellent (R2 > 0.999) with CV values of response factors <3%. The detection limits for phosphate and citrate were 5.1 and 2.4 nM, respectively. The accuracy of the method was calculated for each compound (103.2 and 100.3%). In addition, citrate and phosphate content of several commercial milk samples were analyzed by this method, and the results deviated less than 5% from values obtained when analyzing the samples by official methods. To study the versatility of the technique, other dairy productssuch as cream cheese, yogurt, or Cheddar cheese were analyzed and accuracy was similar to milk in all products tested. The procedure is rapid and offers a very fast and simple sample preparation. Once the sample has arrived at the laboratory, less than 5 min (including handling, preparation, running, integration, and quantification) are necessary to determine the concentration of citric acid and inorganic phosphate. Because of the speed and accuracy of this method, it is promising as an analytical quantitative testing technique.
Distributed micro-radar system for detection and tracking of low-profile, low-altitude targets
NASA Astrophysics Data System (ADS)
Gorwara, Ashok; Molchanov, Pavlo
2016-05-01
Proposed airborne surveillance radar system can detect, locate, track, and classify low-profile, low-altitude targets: from traditional fixed and rotary wing aircraft to non-traditional targets like unmanned aircraft systems (drones) and even small projectiles. Distributed micro-radar system is the next step in the development of passive monopulse direction finder proposed by Stephen E. Lipsky in the 80s. To extend high frequency limit and provide high sensitivity over the broadband of frequencies, multiple angularly spaced directional antennas are coupled with front end circuits and separately connected to a direction finder processor by a digital interface. Integration of antennas with front end circuits allows to exclude waveguide lines which limits system bandwidth and creates frequency dependent phase errors. Digitizing of received signals proximate to antennas allows loose distribution of antennas and dramatically decrease phase errors connected with waveguides. Accuracy of direction finding in proposed micro-radar in this case will be determined by time accuracy of digital processor and sampling frequency. Multi-band, multi-functional antennas can be distributed around the perimeter of a Unmanned Aircraft System (UAS) and connected to the processor by digital interface or can be distributed between swarm/formation of mini/micro UAS and connected wirelessly. Expendable micro-radars can be distributed by perimeter of defense object and create multi-static radar network. Low-profile, lowaltitude, high speed targets, like small projectiles, create a Doppler shift in a narrow frequency band. This signal can be effectively filtrated and detected with high probability. Proposed micro-radar can work in passive, monostatic or bistatic regime.
Conceptual data sampling for breast cancer histology image classification.
Rezk, Eman; Awan, Zainab; Islam, Fahad; Jaoua, Ali; Al Maadeed, Somaya; Zhang, Nan; Das, Gautam; Rajpoot, Nasir
2017-10-01
Data analytics have become increasingly complicated as the amount of data has increased. One technique that is used to enable data analytics in large datasets is data sampling, in which a portion of the data is selected to preserve the data characteristics for use in data analytics. In this paper, we introduce a novel data sampling technique that is rooted in formal concept analysis theory. This technique is used to create samples reliant on the data distribution across a set of binary patterns. The proposed sampling technique is applied in classifying the regions of breast cancer histology images as malignant or benign. The performance of our method is compared to other classical sampling methods. The results indicate that our method is efficient and generates an illustrative sample of small size. It is also competing with other sampling methods in terms of sample size and sample quality represented in classification accuracy and F1 measure. Copyright © 2017 Elsevier Ltd. All rights reserved.
A number of articles have investigated the impact of sampling design on remotely sensed landcover accuracy estimates. Gong and Howarth (1990) found significant differences for Kappa accuracy values when comparing purepixel sampling, stratified random sampling, and stratified sys...
Green, Christopher T.; Zhang, Yong; Jurgens, Bryant C.; Starn, J. Jeffrey; Landon, Matthew K.
2014-01-01
Analytical models of the travel time distribution (TTD) from a source area to a sample location are often used to estimate groundwater ages and solute concentration trends. The accuracies of these models are not well known for geologically complex aquifers. In this study, synthetic datasets were used to quantify the accuracy of four analytical TTD models as affected by TTD complexity, observation errors, model selection, and tracer selection. Synthetic TTDs and tracer data were generated from existing numerical models with complex hydrofacies distributions for one public-supply well and 14 monitoring wells in the Central Valley, California. Analytical TTD models were calibrated to synthetic tracer data, and prediction errors were determined for estimates of TTDs and conservative tracer (NO3−) concentrations. Analytical models included a new, scale-dependent dispersivity model (SDM) for two-dimensional transport from the watertable to a well, and three other established analytical models. The relative influence of the error sources (TTD complexity, observation error, model selection, and tracer selection) depended on the type of prediction. Geological complexity gave rise to complex TTDs in monitoring wells that strongly affected errors of the estimated TTDs. However, prediction errors for NO3− and median age depended more on tracer concentration errors. The SDM tended to give the most accurate estimates of the vertical velocity and other predictions, although TTD model selection had minor effects overall. Adding tracers improved predictions if the new tracers had different input histories. Studies using TTD models should focus on the factors that most strongly affect the desired predictions.
Quantifying Rock Weakening Due to Decreasing Calcite Mineral Content by Numerical Simulations
2018-01-01
The quantification of changes in geomechanical properties due to chemical reactions is of paramount importance for geological subsurface utilisation, since mineral dissolution generally reduces rock stiffness. In the present study, the effective elastic moduli of two digital rock samples, the Fontainebleau and Bentheim sandstones, are numerically determined based on micro-CT images. Reduction in rock stiffness due to the dissolution of 10% calcite cement by volume out of the pore network is quantified for three synthetic spatial calcite distributions (coating, partial filling and random) using representative sub-cubes derived from the digital rock samples. Due to the reduced calcite content, bulk and shear moduli decrease by 34% and 38% in maximum, respectively. Total porosity is clearly the dominant parameter, while spatial calcite distribution has a minor impact, except for a randomly chosen cement distribution within the pore network. Moreover, applying an initial stiffness reduced by 47% for the calcite cement results only in a slightly weaker mechanical behaviour. Using the quantitative approach introduced here substantially improves the accuracy of predictions in elastic rock properties compared to general analytical methods, and further enables quantification of uncertainties related to spatial variations in porosity and mineral distribution. PMID:29614776
Quantifying Rock Weakening Due to Decreasing Calcite Mineral Content by Numerical Simulations.
Wetzel, Maria; Kempka, Thomas; Kühn, Michael
2018-04-01
The quantification of changes in geomechanical properties due to chemical reactions is of paramount importance for geological subsurface utilisation, since mineral dissolution generally reduces rock stiffness. In the present study, the effective elastic moduli of two digital rock samples, the Fontainebleau and Bentheim sandstones, are numerically determined based on micro-CT images. Reduction in rock stiffness due to the dissolution of 10% calcite cement by volume out of the pore network is quantified for three synthetic spatial calcite distributions (coating, partial filling and random) using representative sub-cubes derived from the digital rock samples. Due to the reduced calcite content, bulk and shear moduli decrease by 34% and 38% in maximum, respectively. Total porosity is clearly the dominant parameter, while spatial calcite distribution has a minor impact, except for a randomly chosen cement distribution within the pore network. Moreover, applying an initial stiffness reduced by 47% for the calcite cement results only in a slightly weaker mechanical behaviour. Using the quantitative approach introduced here substantially improves the accuracy of predictions in elastic rock properties compared to general analytical methods, and further enables quantification of uncertainties related to spatial variations in porosity and mineral distribution.
Kratzer, Markus; Lasnik, Michael; Röhrig, Sören; Teichert, Christian; Deluca, Marco
2018-01-11
Lead zirconate titanate (PZT) is one of the prominent materials used in polycrystalline piezoelectric devices. Since the ferroelectric domain orientation is the most important parameter affecting the electromechanical performance, analyzing the domain orientation distribution is of great importance for the development and understanding of improved piezoceramic devices. Here, vector piezoresponse force microscopy (vector-PFM) has been applied in order to reconstruct the ferroelectric domain orientation distribution function of polished sections of device-ready polycrystalline lead zirconate titanate (PZT) material. A measurement procedure and a computer program based on the software Mathematica have been developed to automatically evaluate the vector-PFM data for reconstructing the domain orientation function. The method is tested on differently in-plane and out-of-plane poled PZT samples, and the results reveal the expected domain patterns and allow determination of the polarization orientation distribution function at high accuracy.
ERIC Educational Resources Information Center
Goomas, David T.
2012-01-01
The effects of wireless ring scanners, which provided immediate auditory and visual feedback, were evaluated to increase the performance and accuracy of order selectors at a meat distribution center. The scanners not only increased performance and accuracy compared to paper pick sheets, but were also instrumental in immediate and accurate data…
Zhao, Yu; Zhao, Min; Jiang, Qi; Qin, Feng; Wang, Chengying; Xiong, Zhili; Wang, Shaojie; He, Zhonggui; Guo, Xingjie; Zhao, Longshan
2018-06-02
MP3950 is being developed as a gastroprokinetic candidate compound. To illustrate the pharmacokinetic profiles, absolute bioavailability after intravenous administration and oral administration with MP3950 as well as tissue distribution in vivo, an UPLC-MS/MS approach which was rapid and selective was developed to determine MP3950 in plasma and tissue of rat. Sample pre-treatment of plasma sample was one-step protein precipitation. 0.1% formic acid containing 5 mmol/L ammonium acetate-methanol(55/45,v/v) was used for isocratic elution on a Waters ACQUITY UPLC® BEH C18 (50 mm × 2.1 mm, 1.7 μm) to achieve the separation. The analysis was performed in MRM mode via positive ESI mode. LLOQ of the method was 10 ng/mL, and the linearity up to 10,000 ng/mL. The intra-day precision (relative standard deviation, RSD) was 4.0-9.0% and the inter-day precision was 4.2-10.6%. The accuracy (relative error, RE) was -1.2-2.4%. Tissue samples were collected from the brain, heart, liver, spleen, lung, kidney, stomach, duodenum, small intestine, large intestine, appendix and skeletal muscle. The same liquid chromatographic and mass spectrometric conditions were used, and it's proven that this method was feasible to analyze the MP3950 in tissues with good precision and accuracy over the range from 10 to 5000 ng·mL -1 . It was found that the concentration of MP3950 is higher in digestive system. The tissue distribution, pharmacokinetic and bioavailability of MP3950 in rats were carried out by the method for the first time, which can provide enough information for the further development and investigation of MP3950. Copyright © 2018. Published by Elsevier B.V.
Evaluation of centroiding algorithm error for Nano-JASMINE
NASA Astrophysics Data System (ADS)
Hara, Takuji; Gouda, Naoteru; Yano, Taihei; Yamada, Yoshiyuki
2014-08-01
The Nano-JASMINE mission has been designed to perform absolute astrometric measurements with unprecedented accuracy; the end-of-mission parallax standard error is required to be of the order of 3 milli arc seconds for stars brighter than 7.5 mag in the zw-band(0.6μm-1.0μm) .These requirements set a stringent constraint on the accuracy of the estimation of the location of the stellar image on the CCD for each observation. However each stellar images have individual shape depend on the spectral energy distribution of the star, the CCD properties, and the optics and its associated wave front errors. So it is necessity that the centroiding algorithm performs a high accuracy in any observables. Referring to the study of Gaia, we use LSF fitting method for centroiding algorithm, and investigate systematic error of the algorithm for Nano-JASMINE. Furthermore, we found to improve the algorithm by restricting sample LSF when we use a Principle Component Analysis. We show that centroiding algorithm error decrease after adapted the method.
Lahanas, M; Baltas, D; Giannouli, S; Milickovic, N; Zamboglou, N
2000-05-01
We have studied the accuracy of statistical parameters of dose distributions in brachytherapy using actual clinical implants. These include the mean, minimum and maximum dose values and the variance of the dose distribution inside the PTV (planning target volume), and on the surface of the PTV. These properties have been studied as a function of the number of uniformly distributed sampling points. These parameters, or the variants of these parameters, are used directly or indirectly in optimization procedures or for a description of the dose distribution. The accurate determination of these parameters depends on the sampling point distribution from which they have been obtained. Some optimization methods ignore catheters and critical structures surrounded by the PTV or alternatively consider as surface dose points only those on the contour lines of the PTV. D(min) and D(max) are extreme dose values which are either on the PTV surface or within the PTV. They must be avoided for specification and optimization purposes in brachytherapy. Using D(mean) and the variance of D which we have shown to be stable parameters, achieves a more reliable description of the dose distribution on the PTV surface and within the PTV volume than does D(min) and D(max). Generation of dose points on the real surface of the PTV is obligatory and the consideration of catheter volumes results in a realistic description of anatomical dose distributions.
Neutron-Star Radius from a Population of Binary Neutron Star Mergers.
Bose, Sukanta; Chakravarti, Kabir; Rezzolla, Luciano; Sathyaprakash, B S; Takami, Kentaro
2018-01-19
We show how gravitational-wave observations with advanced detectors of tens to several tens of neutron-star binaries can measure the neutron-star radius with an accuracy of several to a few percent, for mass and spatial distributions that are realistic, and with none of the sources located within 100 Mpc. We achieve such an accuracy by combining measurements of the total mass from the inspiral phase with those of the compactness from the postmerger oscillation frequencies. For estimating the measurement errors of these frequencies, we utilize analytical fits to postmerger numerical relativity waveforms in the time domain, obtained here for the first time, for four nuclear-physics equations of state and a couple of values for the mass. We further exploit quasiuniversal relations to derive errors in compactness from those frequencies. Measuring the average radius to well within 10% is possible for a sample of 100 binaries distributed uniformly in volume between 100 and 300 Mpc, so long as the equation of state is not too soft or the binaries are not too heavy. We also give error estimates for the Einstein Telescope.
Three validation metrics for automated probabilistic image segmentation of brain tumours
Zou, Kelly H.; Wells, William M.; Kikinis, Ron; Warfield, Simon K.
2005-01-01
SUMMARY The validity of brain tumour segmentation is an important issue in image processing because it has a direct impact on surgical planning. We examined the segmentation accuracy based on three two-sample validation metrics against the estimated composite latent gold standard, which was derived from several experts’ manual segmentations by an EM algorithm. The distribution functions of the tumour and control pixel data were parametrically assumed to be a mixture of two beta distributions with different shape parameters. We estimated the corresponding receiver operating characteristic curve, Dice similarity coefficient, and mutual information, over all possible decision thresholds. Based on each validation metric, an optimal threshold was then computed via maximization. We illustrated these methods on MR imaging data from nine brain tumour cases of three different tumour types, each consisting of a large number of pixels. The automated segmentation yielded satisfactory accuracy with varied optimal thresholds. The performances of these validation metrics were also investigated via Monte Carlo simulation. Extensions of incorporating spatial correlation structures using a Markov random field model were considered. PMID:15083482
Measuring experimental cyclohexane-water distribution coefficients for the SAMPL5 challenge
NASA Astrophysics Data System (ADS)
Rustenburg, Ariën S.; Dancer, Justin; Lin, Baiwei; Feng, Jianwen A.; Ortwine, Daniel F.; Mobley, David L.; Chodera, John D.
2016-11-01
Small molecule distribution coefficients between immiscible nonaqueuous and aqueous phases—such as cyclohexane and water—measure the degree to which small molecules prefer one phase over another at a given pH. As distribution coefficients capture both thermodynamic effects (the free energy of transfer between phases) and chemical effects (protonation state and tautomer effects in aqueous solution), they provide an exacting test of the thermodynamic and chemical accuracy of physical models without the long correlation times inherent to the prediction of more complex properties of relevance to drug discovery, such as protein-ligand binding affinities. For the SAMPL5 challenge, we carried out a blind prediction exercise in which participants were tasked with the prediction of distribution coefficients to assess its potential as a new route for the evaluation and systematic improvement of predictive physical models. These measurements are typically performed for octanol-water, but we opted to utilize cyclohexane for the nonpolar phase. Cyclohexane was suggested to avoid issues with the high water content and persistent heterogeneous structure of water-saturated octanol phases, since it has greatly reduced water content and a homogeneous liquid structure. Using a modified shake-flask LC-MS/MS protocol, we collected cyclohexane/water distribution coefficients for a set of 53 druglike compounds at pH 7.4. These measurements were used as the basis for the SAMPL5 Distribution Coefficient Challenge, where 18 research groups predicted these measurements before the experimental values reported here were released. In this work, we describe the experimental protocol we utilized for measurement of cyclohexane-water distribution coefficients, report the measured data, propose a new bootstrap-based data analysis procedure to incorporate multiple sources of experimental error, and provide insights to help guide future iterations of this valuable exercise in predictive modeling.
NASA Astrophysics Data System (ADS)
Wang, Li; Li, Feng; Xing, Jian
2017-10-01
In this paper, a hybrid artificial bee colony (ABC) algorithm and pattern search (PS) method is proposed and applied for recovery of particle size distribution (PSD) from spectral extinction data. To be more useful and practical, size distribution function is modelled as the general Johnson's ? function that can overcome the difficulty of not knowing the exact type beforehand encountered in many real circumstances. The proposed hybrid algorithm is evaluated through simulated examples involving unimodal, bimodal and trimodal PSDs with different widths and mean particle diameters. For comparison, all examples are additionally validated by the single ABC algorithm. In addition, the performance of the proposed algorithm is further tested by actual extinction measurements with real standard polystyrene samples immersed in water. Simulation and experimental results illustrate that the hybrid algorithm can be used as an effective technique to retrieve the PSDs with high reliability and accuracy. Compared with the single ABC algorithm, our proposed algorithm can produce more accurate and robust inversion results while taking almost comparative CPU time over ABC algorithm alone. The superiority of ABC and PS hybridization strategy in terms of reaching a better balance of estimation accuracy and computation effort increases its potentials as an excellent inversion technique for reliable and efficient actual measurement of PSD.
Monte Carlo based, patient-specific RapidArc QA using Linac log files.
Teke, Tony; Bergman, Alanah M; Kwa, William; Gill, Bradford; Duzenli, Cheryl; Popescu, I Antoniu
2010-01-01
A Monte Carlo (MC) based QA process to validate the dynamic beam delivery accuracy for Varian RapidArc (Varian Medical Systems, Palo Alto, CA) using Linac delivery log files (DynaLog) is presented. Using DynaLog file analysis and MC simulations, the goal of this article is to (a) confirm that adequate sampling is used in the RapidArc optimization algorithm (177 static gantry angles) and (b) to assess the physical machine performance [gantry angle and monitor unit (MU) delivery accuracy]. Ten clinically acceptable RapidArc treatment plans were generated for various tumor sites and delivered to a water-equivalent cylindrical phantom on the treatment unit. Three Monte Carlo simulations were performed to calculate dose to the CT phantom image set: (a) One using a series of static gantry angles defined by 177 control points with treatment planning system (TPS) MLC control files (planning files), (b) one using continuous gantry rotation with TPS generated MLC control files, and (c) one using continuous gantry rotation with actual Linac delivery log files. Monte Carlo simulated dose distributions are compared to both ionization chamber point measurements and with RapidArc TPS calculated doses. The 3D dose distributions were compared using a 3D gamma-factor analysis, employing a 3%/3 mm distance-to-agreement criterion. The dose difference between MC simulations, TPS, and ionization chamber point measurements was less than 2.1%. For all plans, the MC calculated 3D dose distributions agreed well with the TPS calculated doses (gamma-factor values were less than 1 for more than 95% of the points considered). Machine performance QA was supplemented with an extensive DynaLog file analysis. A DynaLog file analysis showed that leaf position errors were less than 1 mm for 94% of the time and there were no leaf errors greater than 2.5 mm. The mean standard deviation in MU and gantry angle were 0.052 MU and 0.355 degrees, respectively, for the ten cases analyzed. The accuracy and flexibility of the Monte Carlo based RapidArc QA system were demonstrated. Good machine performance and accurate dose distribution delivery of RapidArc plans were observed. The sampling used in the TPS optimization algorithm was found to be adequate.
Toward accelerating landslide mapping with interactive machine learning techniques
NASA Astrophysics Data System (ADS)
Stumpf, André; Lachiche, Nicolas; Malet, Jean-Philippe; Kerle, Norman; Puissant, Anne
2013-04-01
Despite important advances in the development of more automated methods for landslide mapping from optical remote sensing images, the elaboration of inventory maps after major triggering events still remains a tedious task. Image classification with expert defined rules typically still requires significant manual labour for the elaboration and adaption of rule sets for each particular case. Machine learning algorithm, on the contrary, have the ability to learn and identify complex image patterns from labelled examples but may require relatively large amounts of training data. In order to reduce the amount of required training data active learning has evolved as key concept to guide the sampling for applications such as document classification, genetics and remote sensing. The general underlying idea of most active learning approaches is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and/or the data structure to iteratively select the most valuable samples that should be labelled by the user and added in the training set. With relatively few queries and labelled samples, an active learning strategy should ideally yield at least the same accuracy than an equivalent classifier trained with many randomly selected samples. Our study was dedicated to the development of an active learning approach for landslide mapping from VHR remote sensing images with special consideration of the spatial distribution of the samples. The developed approach is a region-based query heuristic that enables to guide the user attention towards few compact spatial batches rather than distributed points resulting in time savings of 50% and more compared to standard active learning techniques. The approach was tested with multi-temporal and multi-sensor satellite images capturing recent large scale triggering events in Brazil and China and demonstrated balanced user's and producer's accuracies between 74% and 80%. The assessment also included an experimental evaluation of the uncertainties of manual mappings from multiple experts and demonstrated strong relationships between the uncertainty of the experts and the machine learning model.
Nearest neighbor density ratio estimation for large-scale applications in astronomy
NASA Astrophysics Data System (ADS)
Kremer, J.; Gieseke, F.; Steenstrup Pedersen, K.; Igel, C.
2015-09-01
In astronomical applications of machine learning, the distribution of objects used for building a model is often different from the distribution of the objects the model is later applied to. This is known as sample selection bias, which is a major challenge for statistical inference as one can no longer assume that the labeled training data are representative. To address this issue, one can re-weight the labeled training patterns to match the distribution of unlabeled data that are available already in the training phase. There are many examples in practice where this strategy yielded good results, but estimating the weights reliably from a finite sample is challenging. We consider an efficient nearest neighbor density ratio estimator that can exploit large samples to increase the accuracy of the weight estimates. To solve the problem of choosing the right neighborhood size, we propose to use cross-validation on a model selection criterion that is unbiased under covariate shift. The resulting algorithm is our method of choice for density ratio estimation when the feature space dimensionality is small and sample sizes are large. The approach is simple and, because of the model selection, robust. We empirically find that it is on a par with established kernel-based methods on relatively small regression benchmark datasets. However, when applied to large-scale photometric redshift estimation, our approach outperforms the state-of-the-art.
Ground Truth Sampling and LANDSAT Accuracy Assessment
NASA Technical Reports Server (NTRS)
Robinson, J. W.; Gunther, F. J.; Campbell, W. J.
1982-01-01
It is noted that the key factor in any accuracy assessment of remote sensing data is the method used for determining the ground truth, independent of the remote sensing data itself. The sampling and accuracy procedures developed for nuclear power plant siting study are described. The purpose of the sampling procedure was to provide data for developing supervised classifications for two study sites and for assessing the accuracy of that and the other procedures used. The purpose of the accuracy assessment was to allow the comparison of the cost and accuracy of various classification procedures as applied to various data types.
Multiplatform sampling (ship, aircraft, and satellite) of a Gulf Stream warm core ring
NASA Technical Reports Server (NTRS)
Smith, Raymond C.; Brown, Otis B.; Hoge, Frank E.; Baker, Karen S.; Evans, Robert H.
1987-01-01
The purpose of this paper is to demonstrate the ability to meet the need to measure distributions of physical and biological properties of the ocean over large areas synoptically and over long time periods by means of remote sensing utilizing contemporaneous buoy, ship, aircraft, and satellite (i.e., multiplatform) sampling strategies. A mapping of sea surface temperature and chlorophyll fields in a Gulf Stream warm core ring using the multiplatform approach is described. Sampling capabilities of each sensing system are discussed as background for the data collected by means of these three dissimilar methods. Commensurate space/time sample sets from each sensing system are compared, and their relative accuracies in space and time are determined. The three-dimensional composite maps derived from the data set provide a synoptic perspective unobtainable from single platforms alone.
Sampling effects on the identification of roadkill hotspots: Implications for survey design.
Santos, Sara M; Marques, J Tiago; Lourenço, André; Medinas, Denis; Barbosa, A Márcia; Beja, Pedro; Mira, António
2015-10-01
Although locating wildlife roadkill hotspots is essential to mitigate road impacts, the influence of study design on hotspot identification remains uncertain. We evaluated how sampling frequency affects the accuracy of hotspot identification, using a dataset of vertebrate roadkills (n = 4427) recorded over a year of daily surveys along 37 km of roads. "True" hotspots were identified using this baseline dataset, as the 500-m segments where the number of road-killed vertebrates exceeded the upper 95% confidence limit of the mean, assuming a Poisson distribution of road-kills per segment. "Estimated" hotspots were identified likewise, using datasets representing progressively lower sampling frequencies, which were produced by extracting data from the baseline dataset at appropriate time intervals (1-30 days). Overall, 24.3% of segments were "true" hotspots, concentrating 40.4% of roadkills. For different groups, "true" hotspots accounted from 6.8% (bats) to 29.7% (small birds) of road segments, concentrating from <40% (frogs and toads, snakes) to >60% (lizards, lagomorphs, carnivores) of roadkills. Spatial congruence between "true" and "estimated" hotspots declined rapidly with increasing time interval between surveys, due primarily to increasing false negatives (i.e., missing "true" hotspots). There were also false positives (i.e., wrong "estimated" hotspots), particularly at low sampling frequencies. Spatial accuracy decay with increasing time interval between surveys was higher for smaller-bodied (amphibians, reptiles, small birds, small mammals) than for larger-bodied species (birds of prey, hedgehogs, lagomorphs, carnivores). Results suggest that widely used surveys at weekly or longer intervals may produce poor estimates of roadkill hotspots, particularly for small-bodied species. Surveying daily or at two-day intervals may be required to achieve high accuracy in hotspot identification for multiple species. Copyright © 2015 Elsevier Ltd. All rights reserved.
Chen, Hua; Chen, Kun
2013-01-01
The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n − An(t) follows a Poisson distribution, and as m → n, n(n−1)Tm/2N(0) follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference. PMID:23666939
Chen, Hua; Chen, Kun
2013-07-01
The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n - An(t) follows a Poisson distribution, and as m → n, $$n\\left(n-1\\right){T}_{m}/2N\\left(0\\right)$$ follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference.
Icing Characteristics of Low Altitude, Supercooled Layer Clouds. Revision
1980-05-01
Droplet Size Distribution 5. Icing Rate Meters C. Accuracy and Sources of Error in the Measurements from the Period 1944-1950 11 1. Rotating...whether currently available LWC meters and icing rate detectors will give re- liable results when flown on helicopters. Concerning the forecasting...Max Dia. Size Distrib. Meter Samples 4 1944 MSP DP -- Al .... 4 6 1946 OR 2,4RC 2,4RHC Al 4RMC -- 3 7 1946-47 NEMO, 4RMC 4RMC AI 4RMC - 31 TN,OH, IN
Chai, Xin; Wang, Qisong; Zhao, Yongping; Li, Yongqiang; Liu, Dan; Liu, Xin; Bai, Ou
2017-01-01
Electroencephalography (EEG)-based emotion recognition is an important element in psychiatric health diagnosis for patients. However, the underlying EEG sensor signals are always non-stationary if they are sampled from different experimental sessions or subjects. This results in the deterioration of the classification performance. Domain adaptation methods offer an effective way to reduce the discrepancy of marginal distribution. However, for EEG sensor signals, both marginal and conditional distributions may be mismatched. In addition, the existing domain adaptation strategies always require a high level of additional computation. To address this problem, a novel strategy named adaptive subspace feature matching (ASFM) is proposed in this paper in order to integrate both the marginal and conditional distributions within a unified framework (without any labeled samples from target subjects). Specifically, we develop a linear transformation function which matches the marginal distributions of the source and target subspaces without a regularization term. This significantly decreases the time complexity of our domain adaptation procedure. As a result, both marginal and conditional distribution discrepancies between the source domain and unlabeled target domain can be reduced, and logistic regression (LR) can be applied to the new source domain in order to train a classifier for use in the target domain, since the aligned source domain follows a distribution which is similar to that of the target domain. We compare our ASFM method with six typical approaches using a public EEG dataset with three affective states: positive, neutral, and negative. Both offline and online evaluations were performed. The subject-to-subject offline experimental results demonstrate that our component achieves a mean accuracy and standard deviation of 80.46% and 6.84%, respectively, as compared with a state-of-the-art method, the subspace alignment auto-encoder (SAAE), which achieves values of 77.88% and 7.33% on average, respectively. For the online analysis, the average classification accuracy and standard deviation of ASFM in the subject-to-subject evaluation for all the 15 subjects in a dataset was 75.11% and 7.65%, respectively, gaining a significant performance improvement compared to the best baseline LR which achieves 56.38% and 7.48%, respectively. The experimental results confirm the effectiveness of the proposed method relative to state-of-the-art methods. Moreover, computational efficiency of the proposed ASFM method is much better than standard domain adaptation; if the numbers of training samples and test samples are controlled within certain range, it is suitable for real-time classification. It can be concluded that ASFM is a useful and effective tool for decreasing domain discrepancy and reducing performance degradation across subjects and sessions in the field of EEG-based emotion recognition. PMID:28467371
Chai, Xin; Wang, Qisong; Zhao, Yongping; Li, Yongqiang; Liu, Dan; Liu, Xin; Bai, Ou
2017-05-03
Electroencephalography (EEG)-based emotion recognition is an important element in psychiatric health diagnosis for patients. However, the underlying EEG sensor signals are always non-stationary if they are sampled from different experimental sessions or subjects. This results in the deterioration of the classification performance. Domain adaptation methods offer an effective way to reduce the discrepancy of marginal distribution. However, for EEG sensor signals, both marginal and conditional distributions may be mismatched. In addition, the existing domain adaptation strategies always require a high level of additional computation. To address this problem, a novel strategy named adaptive subspace feature matching (ASFM) is proposed in this paper in order to integrate both the marginal and conditional distributions within a unified framework (without any labeled samples from target subjects). Specifically, we develop a linear transformation function which matches the marginal distributions of the source and target subspaces without a regularization term. This significantly decreases the time complexity of our domain adaptation procedure. As a result, both marginal and conditional distribution discrepancies between the source domain and unlabeled target domain can be reduced, and logistic regression (LR) can be applied to the new source domain in order to train a classifier for use in the target domain, since the aligned source domain follows a distribution which is similar to that of the target domain. We compare our ASFM method with six typical approaches using a public EEG dataset with three affective states: positive, neutral, and negative. Both offline and online evaluations were performed. The subject-to-subject offline experimental results demonstrate that our component achieves a mean accuracy and standard deviation of 80.46% and 6.84%, respectively, as compared with a state-of-the-art method, the subspace alignment auto-encoder (SAAE), which achieves values of 77.88% and 7.33% on average, respectively. For the online analysis, the average classification accuracy and standard deviation of ASFM in the subject-to-subject evaluation for all the 15 subjects in a dataset was 75.11% and 7.65%, respectively, gaining a significant performance improvement compared to the best baseline LR which achieves 56.38% and 7.48%, respectively. The experimental results confirm the effectiveness of the proposed method relative to state-of-the-art methods. Moreover, computational efficiency of the proposed ASFM method is much better than standard domain adaptation; if the numbers of training samples and test samples are controlled within certain range, it is suitable for real-time classification. It can be concluded that ASFM is a useful and effective tool for decreasing domain discrepancy and reducing performance degradation across subjects and sessions in the field of EEG-based emotion recognition.
Particle shape accounts for instrumental discrepancy in ice core dust size distributions
NASA Astrophysics Data System (ADS)
Folden Simonsen, Marius; Cremonesi, Llorenç; Baccolo, Giovanni; Bosch, Samuel; Delmonte, Barbara; Erhardt, Tobias; Kjær, Helle Astrid; Potenza, Marco; Svensson, Anders; Vallelonga, Paul
2018-05-01
The Klotz Abakus laser sensor and the Coulter counter are both used for measuring the size distribution of insoluble mineral dust particles in ice cores. While the Coulter counter measures particle volume accurately, the equivalent Abakus instrument measurement deviates substantially from the Coulter counter. We show that the difference between the Abakus and the Coulter counter measurements is mainly caused by the irregular shape of dust particles in ice core samples. The irregular shape means that a new calibration routine based on standard spheres is necessary for obtaining fully comparable data. This new calibration routine gives an increased accuracy to Abakus measurements, which may improve future ice core record intercomparisons. We derived an analytical model for extracting the aspect ratio of dust particles from the difference between Abakus and Coulter counter data. For verification, we measured the aspect ratio of the same samples directly using a single-particle extinction and scattering instrument. The results demonstrate that the model is accurate enough to discern between samples of aspect ratio 0.3 and 0.4 using only the comparison of Abakus and Coulter counter data.
NASA Technical Reports Server (NTRS)
Kaufman, L. G., II; Johnson, C. B.
1984-01-01
Aerodynamic surface heating rate distributions in three dimensional shock wave boundary layer interaction flow regions are presented for a generic set of model configurations representative of the aft portion of hypersonic aircraft. Heat transfer data were obtained using the phase change coating technique (paint) and, at particular spanwise and streamwise stations for sample cases, by the thin wall transient temperature technique (thermocouples). Surface oil flow patterns are also shown. The good accuracy of the detailed heat transfer data, as attested in part by their repeatability, is attributable partially to the comparatively high temperature potential of the NASA-Langley Mach 8 Variable Density Tunnel. The data are well suited to help guide heating analyses of Mach 8 aircraft, and should be considered in formulating improvements to empiric analytic methods for calculating heat transfer rate coefficient distributions.
Alternative evaluation metrics for risk adjustment methods.
Park, Sungchul; Basu, Anirban
2018-06-01
Risk adjustment is instituted to counter risk selection by accurately equating payments with expected expenditures. Traditional risk-adjustment methods are designed to estimate accurate payments at the group level. However, this generates residual risks at the individual level, especially for high-expenditure individuals, thereby inducing health plans to avoid those with high residual risks. To identify an optimal risk-adjustment method, we perform a comprehensive comparison of prediction accuracies at the group level, at the tail distributions, and at the individual level across 19 estimators: 9 parametric regression, 7 machine learning, and 3 distributional estimators. Using the 2013-2014 MarketScan database, we find that no one estimator performs best in all prediction accuracies. Generally, machine learning and distribution-based estimators achieve higher group-level prediction accuracy than parametric regression estimators. However, parametric regression estimators show higher tail distribution prediction accuracy and individual-level prediction accuracy, especially at the tails of the distribution. This suggests that there is a trade-off in selecting an appropriate risk-adjustment method between estimating accurate payments at the group level and lower residual risks at the individual level. Our results indicate that an optimal method cannot be determined solely on the basis of statistical metrics but rather needs to account for simulating plans' risk selective behaviors. Copyright © 2018 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Iino, Shota; Ito, Riho; Doi, Kento; Imaizumi, Tomoyuki; Hikosaka, Shuhei
2017-10-01
In the developing countries, urban areas are expanding rapidly. With the rapid developments, a short term monitoring of urban changes is important. A constant observation and creation of urban distribution map of high accuracy and without noise pollution are the key issues for the short term monitoring. SAR satellites are highly suitable for day or night and regardless of atmospheric weather condition observations for this type of study. The current study highlights the methodology of generating high-accuracy urban distribution maps derived from the SAR satellite imagery based on Convolutional Neural Network (CNN), which showed the outstanding results for image classification. Several improvements on SAR polarization combinations and dataset construction were performed for increasing the accuracy. As an additional data, Digital Surface Model (DSM), which are useful to classify land cover, were added to improve the accuracy. From the obtained result, high-accuracy urban distribution map satisfying the quality for short-term monitoring was generated. For the evaluation, urban changes were extracted by taking the difference of urban distribution maps. The change analysis with time series of imageries revealed the locations of urban change areas for short-term. Comparisons with optical satellites were performed for validating the results. Finally, analysis of the urban changes combining X-band, L-band and C-band SAR satellites was attempted to increase the opportunity of acquiring satellite imageries. Further analysis will be conducted as future work of the present study
NASA Astrophysics Data System (ADS)
Montereale Gavazzi, G.; Madricardo, F.; Janowski, L.; Kruss, A.; Blondel, P.; Sigovini, M.; Foglini, F.
2016-03-01
Recent technological developments of multibeam echosounder systems (MBES) allow mapping of benthic habitats with unprecedented detail. MBES can now be employed in extremely shallow waters, challenging data acquisition (as these instruments were often designed for deeper waters) and data interpretation (honed on datasets with resolution sometimes orders of magnitude lower). With extremely high-resolution bathymetry and co-located backscatter data, it is now possible to map the spatial distribution of fine scale benthic habitats, even identifying the acoustic signatures of single sponges. In this context, it is necessary to understand which of the commonly used segmentation methods is best suited to account for such level of detail. At the same time, new sampling protocols for precisely geo-referenced ground truth data need to be developed to validate the benthic environmental classification. This study focuses on a dataset collected in a shallow (2-10 m deep) tidal channel of the Lagoon of Venice, Italy. Using 0.05-m and 0.2-m raster grids, we compared a range of classifications, both pixel-based and object-based approaches, including manual, Maximum Likelihood Classifier, Jenks Optimization clustering, textural analysis and Object Based Image Analysis. Through a comprehensive and accurately geo-referenced ground truth dataset, we were able to identify five different classes of the substrate composition, including sponges, mixed submerged aquatic vegetation, mixed detritic bottom (fine and coarse) and unconsolidated bare sediment. We computed estimates of accuracy (namely Overall, User, Producer Accuracies and the Kappa statistic) by cross tabulating predicted and reference instances. Overall, pixel based segmentations produced the highest accuracies and the accuracy assessment is strongly dependent on the number of classes chosen for the thematic output. Tidal channels in the Venice Lagoon are extremely important in terms of habitats and sediment distribution, particularly within the context of the new tidal barrier being built. However, they had remained largely unexplored until now, because of the surveying challenges. The application of this remote sensing approach, combined with targeted sampling, opens a new perspective in the monitoring of benthic habitats in view of a knowledge-based management of natural resources in shallow coastal areas.
Predictive accuracy of particle filtering in dynamic models supporting outbreak projections.
Safarishahrbijari, Anahita; Teyhouee, Aydin; Waldner, Cheryl; Liu, Juxin; Osgood, Nathaniel D
2017-09-26
While a new generation of computational statistics algorithms and availability of data streams raises the potential for recurrently regrounding dynamic models with incoming observations, the effectiveness of such arrangements can be highly subject to specifics of the configuration (e.g., frequency of sampling and representation of behaviour change), and there has been little attempt to identify effective configurations. Combining dynamic models with particle filtering, we explored a solution focusing on creating quickly formulated models regrounded automatically and recurrently as new data becomes available. Given a latent underlying case count, we assumed that observed incident case counts followed a negative binomial distribution. In accordance with the condensation algorithm, each such observation led to updating of particle weights. We evaluated the effectiveness of various particle filtering configurations against each other and against an approach without particle filtering according to the accuracy of the model in predicting future prevalence, given data to a certain point and a norm-based discrepancy metric. We examined the effectiveness of particle filtering under varying times between observations, negative binomial dispersion parameters, and rates with which the contact rate could evolve. We observed that more frequent observations of empirical data yielded super-linearly improved accuracy in model predictions. We further found that for the data studied here, the most favourable assumptions to make regarding the parameters associated with the negative binomial distribution and changes in contact rate were robust across observation frequency and the observation point in the outbreak. Combining dynamic models with particle filtering can perform well in projecting future evolution of an outbreak. Most importantly, the remarkable improvements in predictive accuracy resulting from more frequent sampling suggest that investments to achieve efficient reporting mechanisms may be more than paid back by improved planning capacity. The robustness of the results on particle filter configuration in this case study suggests that it may be possible to formulate effective standard guidelines and regularized approaches for such techniques in particular epidemiological contexts. Most importantly, the work tentatively suggests potential for health decision makers to secure strong guidance when anticipating outbreak evolution for emerging infectious diseases by combining even very rough models with particle filtering method.
Uemoto, Yoshinobu; Sasaki, Shinji; Kojima, Takatoshi; Sugimoto, Yoshikazu; Watanabe, Toshio
2015-11-19
Genetic variance that is not captured by single nucleotide polymorphisms (SNPs) is due to imperfect linkage disequilibrium (LD) between SNPs and quantitative trait loci (QTLs), and the extent of LD between SNPs and QTLs depends on different minor allele frequencies (MAF) between them. To evaluate the impact of MAF of QTLs on genomic evaluation, we performed a simulation study using real cattle genotype data. In total, 1368 Japanese Black cattle and 592,034 SNPs (Illumina BovineHD BeadChip) were used. We simulated phenotypes using real genotypes under different scenarios, varying the MAF categories, QTL heritability, number of QTLs, and distribution of QTL effect. After generating true breeding values and phenotypes, QTL heritability was estimated and the prediction accuracy of genomic estimated breeding value (GEBV) was assessed under different SNP densities, prediction models, and population size by a reference-test validation design. The extent of LD between SNPs and QTLs in this population was higher in the QTLs with high MAF than in those with low MAF. The effect of MAF of QTLs depended on the genetic architecture, evaluation strategy, and population size in genomic evaluation. In genetic architecture, genomic evaluation was affected by the MAF of QTLs combined with the QTL heritability and the distribution of QTL effect. The number of QTL was not affected on genomic evaluation if the number of QTL was more than 50. In the evaluation strategy, we showed that different SNP densities and prediction models affect the heritability estimation and genomic prediction and that this depends on the MAF of QTLs. In addition, accurate QTL heritability and GEBV were obtained using denser SNP information and the prediction model accounted for the SNPs with low and high MAFs. In population size, a large sample size is needed to increase the accuracy of GEBV. The MAF of QTL had an impact on heritability estimation and prediction accuracy. Most genetic variance can be captured using denser SNPs and the prediction model accounted for MAF, but a large sample size is needed to increase the accuracy of GEBV under all QTL MAF categories.
Chang, Ching-Min; Lo, Yu-Lung; Tran, Nghia-Khanh; Chang, Yu-Jen
2018-03-20
A method is proposed for characterizing the optical properties of articular cartilage sliced from a pig's thighbone using a Stokes-Mueller polarimetry technique. The principal axis angle, phase retardance, optical rotation angle, circular diattenuation, diattenuation axis angle, linear diattenuation, and depolarization index properties of the cartilage sample are all decoupled in the proposed analytical model. Consequently, the accuracy and robustness of the extracted results are improved. The glucose concentration, collagen distribution, and scattering properties of samples from various depths of the articular cartilage are systematically explored via an inspection of the related parameters. The results show that the glucose concentration and scattering effect are both enhanced in the superficial region of the cartilage. By contrast, the collagen density increases with an increasing sample depth.
Bayesian view of single-qubit clocks, and an energy versus accuracy tradeoff
NASA Astrophysics Data System (ADS)
Gopalkrishnan, Manoj; Kandula, Varshith; Sriram, Praveen; Deshpande, Abhishek; Muralidharan, Bhaskaran
2017-09-01
We bring a Bayesian approach to the analysis of clocks. Using exponential distributions as priors for clocks, we analyze how well one can keep time with a single qubit freely precessing under a magnetic field. We find that, at least with a single qubit, quantum mechanics does not allow exact timekeeping, in contrast to classical mechanics, which does. We find the design of the single-qubit clock that leads to maximum accuracy. Further, we find an energy versus accuracy tradeoff—the energy cost is at least kBT times the improvement in accuracy as measured by the entropy reduction in going from the prior distribution to the posterior distribution. We propose a physical realization of the single-qubit clock using charge transport across a capacitively coupled quantum dot.
Zarco-Perello, Salvador; Simões, Nuno
2017-01-01
Information about the distribution and abundance of the habitat-forming sessile organisms in marine ecosystems is of great importance for conservation and natural resource managers. Spatial interpolation methodologies can be useful to generate this information from in situ sampling points, especially in circumstances where remote sensing methodologies cannot be applied due to small-scale spatial variability of the natural communities and low light penetration in the water column. Interpolation methods are widely used in environmental sciences; however, published studies using these methodologies in coral reef science are scarce. We compared the accuracy of the two most commonly used interpolation methods in all disciplines, inverse distance weighting (IDW) and ordinary kriging (OK), to predict the distribution and abundance of hard corals, octocorals, macroalgae, sponges and zoantharians and identify hotspots of these habitat-forming organisms using data sampled at three different spatial scales (5, 10 and 20 m) in Madagascar reef, Gulf of Mexico. The deeper sandy environments of the leeward and windward regions of Madagascar reef were dominated by macroalgae and seconded by octocorals. However, the shallow rocky environments of the reef crest had the highest richness of habitat-forming groups of organisms; here, we registered high abundances of octocorals and macroalgae, with sponges, Millepora alcicornis and zoantharians dominating in some patches, creating high levels of habitat heterogeneity. IDW and OK generated similar maps of distribution for all the taxa; however, cross-validation tests showed that IDW outperformed OK in the prediction of their abundances. When the sampling distance was at 20 m, both interpolation techniques performed poorly, but as the sampling was done at shorter distances prediction accuracies increased, especially for IDW. OK had higher mean prediction errors and failed to correctly interpolate the highest abundance values measured in situ , except for macroalgae, whereas IDW had lower mean prediction errors and high correlations between predicted and measured values in all cases when sampling was every 5 m. The accurate spatial interpolations created using IDW allowed us to see the spatial variability of each taxa at a biological and spatial resolution that remote sensing would not have been able to produce. Our study sets the basis for further research projects and conservation management in Madagascar reef and encourages similar studies in the region and other parts of the world where remote sensing technologies are not suitable for use.
Simões, Nuno
2017-01-01
Information about the distribution and abundance of the habitat-forming sessile organisms in marine ecosystems is of great importance for conservation and natural resource managers. Spatial interpolation methodologies can be useful to generate this information from in situ sampling points, especially in circumstances where remote sensing methodologies cannot be applied due to small-scale spatial variability of the natural communities and low light penetration in the water column. Interpolation methods are widely used in environmental sciences; however, published studies using these methodologies in coral reef science are scarce. We compared the accuracy of the two most commonly used interpolation methods in all disciplines, inverse distance weighting (IDW) and ordinary kriging (OK), to predict the distribution and abundance of hard corals, octocorals, macroalgae, sponges and zoantharians and identify hotspots of these habitat-forming organisms using data sampled at three different spatial scales (5, 10 and 20 m) in Madagascar reef, Gulf of Mexico. The deeper sandy environments of the leeward and windward regions of Madagascar reef were dominated by macroalgae and seconded by octocorals. However, the shallow rocky environments of the reef crest had the highest richness of habitat-forming groups of organisms; here, we registered high abundances of octocorals and macroalgae, with sponges, Millepora alcicornis and zoantharians dominating in some patches, creating high levels of habitat heterogeneity. IDW and OK generated similar maps of distribution for all the taxa; however, cross-validation tests showed that IDW outperformed OK in the prediction of their abundances. When the sampling distance was at 20 m, both interpolation techniques performed poorly, but as the sampling was done at shorter distances prediction accuracies increased, especially for IDW. OK had higher mean prediction errors and failed to correctly interpolate the highest abundance values measured in situ, except for macroalgae, whereas IDW had lower mean prediction errors and high correlations between predicted and measured values in all cases when sampling was every 5 m. The accurate spatial interpolations created using IDW allowed us to see the spatial variability of each taxa at a biological and spatial resolution that remote sensing would not have been able to produce. Our study sets the basis for further research projects and conservation management in Madagascar reef and encourages similar studies in the region and other parts of the world where remote sensing technologies are not suitable for use. PMID:29204321
Sample size in studies on diagnostic accuracy in ophthalmology: a literature survey.
Bochmann, Frank; Johnson, Zoe; Azuara-Blanco, Augusto
2007-07-01
To assess the sample sizes used in studies on diagnostic accuracy in ophthalmology. Design and sources: A survey literature published in 2005. The frequency of reporting calculations of sample sizes and the samples' sizes were extracted from the published literature. A manual search of five leading clinical journals in ophthalmology with the highest impact (Investigative Ophthalmology and Visual Science, Ophthalmology, Archives of Ophthalmology, American Journal of Ophthalmology and British Journal of Ophthalmology) was conducted by two independent investigators. A total of 1698 articles were identified, of which 40 studies were on diagnostic accuracy. One study reported that sample size was calculated before initiating the study. Another study reported consideration of sample size without calculation. The mean (SD) sample size of all diagnostic studies was 172.6 (218.9). The median prevalence of the target condition was 50.5%. Only a few studies consider sample size in their methods. Inadequate sample sizes in diagnostic accuracy studies may result in misleading estimates of test accuracy. An improvement over the current standards on the design and reporting of diagnostic studies is warranted.
Farrell, Mary Beth
2018-06-01
This article is the second part of a continuing education series reviewing basic statistics that nuclear medicine and molecular imaging technologists should understand. In this article, the statistics for evaluating interpretation accuracy, significance, and variance are discussed. Throughout the article, actual statistics are pulled from the published literature. We begin by explaining 2 methods for quantifying interpretive accuracy: interreader and intrareader reliability. Agreement among readers can be expressed simply as a percentage. However, the Cohen κ-statistic is a more robust measure of agreement that accounts for chance. The higher the κ-statistic is, the higher is the agreement between readers. When 3 or more readers are being compared, the Fleiss κ-statistic is used. Significance testing determines whether the difference between 2 conditions or interventions is meaningful. Statistical significance is usually expressed using a number called a probability ( P ) value. Calculation of P value is beyond the scope of this review. However, knowing how to interpret P values is important for understanding the scientific literature. Generally, a P value of less than 0.05 is considered significant and indicates that the results of the experiment are due to more than just chance. Variance, standard deviation (SD), confidence interval, and standard error (SE) explain the dispersion of data around a mean of a sample drawn from a population. SD is commonly reported in the literature. A small SD indicates that there is not much variation in the sample data. Many biologic measurements fall into what is referred to as a normal distribution taking the shape of a bell curve. In a normal distribution, 68% of the data will fall within 1 SD, 95% will fall within 2 SDs, and 99.7% will fall within 3 SDs. Confidence interval defines the range of possible values within which the population parameter is likely to lie and gives an idea of the precision of the statistic being measured. A wide confidence interval indicates that if the experiment were repeated multiple times on other samples, the measured statistic would lie within a wide range of possibilities. The confidence interval relies on the SE. © 2018 by the Society of Nuclear Medicine and Molecular Imaging.
Large scale intercomparison of aerosol trace element analysis by different analytical methods
NASA Astrophysics Data System (ADS)
Bombelka, E.; Richter, F.-W.; Ries, H.; Wätjen, U.
1984-04-01
The general agreement of PIXE analysis with other methods (INAA, XRF, AAS, OES-ICP, and PhAA) is very good based on the analysis of filter pieces taken from 250 aerosol samples. It is better than 5% for Pb and Zn, better than 10% for V, Cr, and Mn, indicating that the accuracy of PIXE analysis can be within 10%. For elements such as Cd and Sb, difficult to analyze by PIXE because of their low mass content in the sample, the agreement is given mainly by the reproducibility of the method (20% to 30%). Similar agreement is found for sulfur, after taking account of the depth distribution of the aerosol in the filter.
A multicolor imaging pyrometer
NASA Technical Reports Server (NTRS)
Frish, Michael B.; Frank, Jonathan H.
1989-01-01
A multicolor imaging pyrometer was designed for accurately and precisely measuring the temperature distribution histories of small moving samples. The device projects six different color images of the sample onto a single charge coupled device array that provides an RS-170 video signal to a computerized frame grabber. The computer automatically selects which one of the six images provides useful data, and converts that information to a temperature map. By measuring the temperature of molten aluminum heated in a kiln, a breadboard version of the device was shown to provide high accuracy in difficult measurement situations. It is expected that this pyrometer will ultimately find application in measuring the temperature of materials undergoing radiant heating in a microgravity acoustic levitation furnace.
A multicolor imaging pyrometer
NASA Astrophysics Data System (ADS)
Frish, Michael B.; Frank, Jonathan H.
1989-06-01
A multicolor imaging pyrometer was designed for accurately and precisely measuring the temperature distribution histories of small moving samples. The device projects six different color images of the sample onto a single charge coupled device array that provides an RS-170 video signal to a computerized frame grabber. The computer automatically selects which one of the six images provides useful data, and converts that information to a temperature map. By measuring the temperature of molten aluminum heated in a kiln, a breadboard version of the device was shown to provide high accuracy in difficult measurement situations. It is expected that this pyrometer will ultimately find application in measuring the temperature of materials undergoing radiant heating in a microgravity acoustic levitation furnace.
Empirical evaluation of data normalization methods for molecular classification.
Huang, Huei-Chung; Qin, Li-Xuan
2018-01-01
Data artifacts due to variations in experimental handling are ubiquitous in microarray studies, and they can lead to biased and irreproducible findings. A popular approach to correct for such artifacts is through post hoc data adjustment such as data normalization. Statistical methods for data normalization have been developed and evaluated primarily for the discovery of individual molecular biomarkers. Their performance has rarely been studied for the development of multi-marker molecular classifiers-an increasingly important application of microarrays in the era of personalized medicine. In this study, we set out to evaluate the performance of three commonly used methods for data normalization in the context of molecular classification, using extensive simulations based on re-sampling from a unique pair of microRNA microarray datasets for the same set of samples. The data and code for our simulations are freely available as R packages at GitHub. In the presence of confounding handling effects, all three normalization methods tended to improve the accuracy of the classifier when evaluated in an independent test data. The level of improvement and the relative performance among the normalization methods depended on the relative level of molecular signal, the distributional pattern of handling effects (e.g., location shift vs scale change), and the statistical method used for building the classifier. In addition, cross-validation was associated with biased estimation of classification accuracy in the over-optimistic direction for all three normalization methods. Normalization may improve the accuracy of molecular classification for data with confounding handling effects; however, it cannot circumvent the over-optimistic findings associated with cross-validation for assessing classification accuracy.
NASA Astrophysics Data System (ADS)
Wang, Tao; Wang, Guilin; Zhu, Dengchao; Li, Shengyi
2015-02-01
In order to meet the requirement of aerodynamics, the infrared domes or windows with conformal and thin-wall structure becomes the development trend of high-speed aircrafts in the future. But these parts usually have low stiffness, the cutting force will change along with the axial position, and it is very difficult to meet the requirement of shape accuracy by single machining. Therefore, on-machine measurement and compensating turning are used to control the shape errors caused by the fluctuation of cutting force and the change of stiffness. In this paper, on the basis of ultra precision diamond lathe, a contact measuring system with five DOFs is developed to achieve on-machine measurement of conformal thin-wall parts with high accuracy. According to high gradient surface, the optimizing algorithm is designed on the distribution of measuring points by using the data screening method. The influence rule of sampling frequency is analyzed on measuring errors, the best sampling frequency is found out based on planning algorithm, the effect of environmental factors and the fitting errors are controlled within lower range, and the measuring accuracy of conformal dome is greatly improved in the process of on-machine measurement. According to MgF2 conformal dome with high gradient, the compensating turning is implemented by using the designed on-machine measuring algorithm. The shape error is less than PV 0.8μm, greatly superior compared with PV 3μm before compensating turning, which verifies the correctness of measuring algorithm.
Ramsthaler, F; Kreutz, K; Verhoff, M A
2007-11-01
It has been generally accepted in skeletal sex determination that the use of metric methods is limited due to the population dependence of the multivariate algorithms. The aim of the study was to verify the applicability of software-based sex estimations outside the reference population group for which discriminant equations have been developed. We examined 98 skulls from recent forensic cases of known age, sex, and Caucasian ancestry from cranium collections in Frankfurt and Mainz (Germany) to determine the accuracy of sex determination using the statistical software solution Fordisc which derives its database and functions from the US American Forensic Database. In a comparison between metric analysis using Fordisc and morphological determination of sex, average accuracy for both sexes was 86 vs 94%, respectively, and males were identified more accurately than females. The ratio of the true test result rate to the false test result rate was not statistically different for the two methodological approaches at a significance level of 0.05 but was statistically different at a level of 0.10 (p=0.06). Possible explanations for this difference comprise different ancestry, age distribution, and socio-economic status compared to the Fordisc reference sample. It is likely that a discriminant function analysis on the basis of more similar European reference samples will lead to more valid and reliable sexing results. The use of Fordisc as a single method for the estimation of sex of recent skeletal remains in Europe cannot be recommended without additional morphological assessment and without a built-in software update based on modern European reference samples.
[Effect of Characteristic Variable Extraction on Accuracy of Cu in Navel Orange Peel by LIBS].
Li, Wen-bing; Yao, Ming-yin; Huang, Lin; Chen, Tian-bing; Zheng, Jian-hong; Fan, Shi-quan; Liu Mu-hua HE, Mu-hua; Lin, Jin-long; Ouyang, Jing-yi
2015-07-01
Heavy metals pollution in foodstuffs is more and more serious. It is impossible to satisfy the modern agricultural development by conventional chemical analysis. Laser induced breakdown spectroscopy (LIBS) is an emerging technology with the characteristic of rapid and nondestructive detection. But LIBS' s repeatability, sensitivity and accuracy has much room to improve. In this work, heavy metal Cu in Gannan Navel Orange which is the Jiangxi specialty fruit will be predicted by LIBS. Firstly, the navel orange samples were contaminated in our lab. The spectra of samples were collected by irradiating the peel by optimized LIBS parameters. The laser energy was set as 20 mJ, delay time of Spectral Data Gathering was set as 1.2 micros, the integration time of Spectral data gathering was set as 2 ms. The real concentration in samples was obtained by AAS (atom absorption spectroscopy). The characteristic variables Cu I 324.7 and Cu I 327.4 were extracted. And the calibration model was constructed between LIBS spectra and real concentration about Cu. The results show that relative error of the predicted concentrations of three relational model were 7.01% or less, reached a minimum of 0.02%, 0.01% and 0.02% respectively. The average relative errors were 2.33%, 3.10% and 26.3%. Tests showed that different characteristic variables decided different accuracy. It is very important to choose suitable characteristic variable. At the same time, this work is helpful to explore the distribution of heavy metals between pulp and peel.
The weighted function method: A handy tool for flood frequency analysis or just a curiosity?
NASA Astrophysics Data System (ADS)
Bogdanowicz, Ewa; Kochanek, Krzysztof; Strupczewski, Witold G.
2018-04-01
The idea of the Weighted Function (WF) method for estimation of Pearson type 3 (Pe3) distribution introduced by Ma in 1984 has been revised and successfully applied for shifted inverse Gaussian (IGa3) distribution. Also the conditions of WF applicability to a shifted distribution have been formulated. The accuracy of WF flood quantiles for both Pe3 and IGa3 distributions was assessed by Monte Caro simulations under the true and false distribution assumption versus the maximum likelihood (MLM), moment (MOM) and L-moments (LMM) methods. Three datasets of annual peak flows of Polish catchments serve the case studies to compare the results of the WF, MOM, MLM and LMM performance for the real flood data. For the hundred-year flood the WF method revealed the explicit superiority only over the MLM surpassing the MOM and especially LMM both for the true and false distributional assumption with respect to relative bias and relative mean root square error values. Generally, the WF method performs well and for hydrological sample size and constitutes good alternative for the estimation of the flood upper quantiles.
A handheld computer-aided diagnosis system and simulated analysis
NASA Astrophysics Data System (ADS)
Su, Mingjian; Zhang, Xuejun; Liu, Brent; Su, Kening; Louie, Ryan
2016-03-01
This paper describes a Computer Aided Diagnosis (CAD) system based on cellphone and distributed cluster. One of the bottlenecks in building a CAD system for clinical practice is the storage and process of mass pathology samples freely among different devices, and normal pattern matching algorithm on large scale image set is very time consuming. Distributed computation on cluster has demonstrated the ability to relieve this bottleneck. We develop a system enabling the user to compare the mass image to a dataset with feature table by sending datasets to Generic Data Handler Module in Hadoop, where the pattern recognition is undertaken for the detection of skin diseases. A single and combination retrieval algorithm to data pipeline base on Map Reduce framework is used in our system in order to make optimal choice between recognition accuracy and system cost. The profile of lesion area is drawn by doctors manually on the screen, and then uploads this pattern to the server. In our evaluation experiment, an accuracy of 75% diagnosis hit rate is obtained by testing 100 patients with skin illness. Our system has the potential help in building a novel medical image dataset by collecting large amounts of gold standard during medical diagnosis. Once the project is online, the participants are free to join and eventually an abundant sample dataset will soon be gathered enough for learning. These results demonstrate our technology is very promising and expected to be used in clinical practice.
Cui, Jiwen; Zhao, Shiyuan; Yang, Di; Ding, Zhenyang
2018-02-20
We use a spectrum interpolation technique to improve the distributed strain measurement accuracy in a Rayleigh-scatter-based optical frequency domain reflectometry sensing system. We demonstrate that strain accuracy is not limited by the "uncertainty principle" that exists in the time-frequency analysis. Different interpolation methods are investigated and used to improve the accuracy of peak position of the cross-correlation and, therefore, improve the accuracy of the strain. Interpolation implemented by padding zeros on one side of the windowed data in the spatial domain, before the inverse fast Fourier transform, is found to have the best accuracy. Using this method, the strain accuracy and resolution are both improved without decreasing the spatial resolution. The strain of 3 μϵ within the spatial resolution of 1 cm at the position of 21.4 m is distinguished, and the measurement uncertainty is 3.3 μϵ.
Sample Complexity Bounds for Differentially Private Learning
Chaudhuri, Kamalika; Hsu, Daniel
2013-01-01
This work studies the problem of privacy-preserving classification – namely, learning a classifier from sensitive data while preserving the privacy of individuals in the training set. In particular, the learning algorithm is required in this problem to guarantee differential privacy, a very strong notion of privacy that has gained significant attention in recent years. A natural question to ask is: what is the sample requirement of a learning algorithm that guarantees a certain level of privacy and accuracy? We address this question in the context of learning with infinite hypothesis classes when the data is drawn from a continuous distribution. We first show that even for very simple hypothesis classes, any algorithm that uses a finite number of examples and guarantees differential privacy must fail to return an accurate classifier for at least some unlabeled data distributions. This result is unlike the case with either finite hypothesis classes or discrete data domains, in which distribution-free private learning is possible, as previously shown by Kasiviswanathan et al. (2008). We then consider two approaches to differentially private learning that get around this lower bound. The first approach is to use prior knowledge about the unlabeled data distribution in the form of a reference distribution chosen independently of the sensitive data. Given such a reference , we provide an upper bound on the sample requirement that depends (among other things) on a measure of closeness between and the unlabeled data distribution. Our upper bound applies to the non-realizable as well as the realizable case. The second approach is to relax the privacy requirement, by requiring only label-privacy – namely, that the only labels (and not the unlabeled parts of the examples) be considered sensitive information. An upper bound on the sample requirement of learning with label privacy was shown by Chaudhuri et al. (2006); in this work, we show a lower bound. PMID:25285183
Measurement of the Vertical Distribution of Aerosol by Globally Distributed MP Lidar Network Sites
NASA Technical Reports Server (NTRS)
Spinhirne, James; Welton, Judd; Campbell, James; Starr, David OC. (Technical Monitor)
2001-01-01
The global distribution of aerosol has an important influence on climate through the scattering and absorption of shortwave radiation and through modification of cloud optical properties. Current satellite and other data already provide a great amount of information on aerosol distribution. However there are critical parameters that can only be obtained by active optical profiling. For aerosol, no passive technique can adequately resolve the height profile of aerosol. The aerosol height distribution is required for any model for aerosol transport and the height resolved radiative heating/cooling effect of aerosol. The Geoscience Laser Altimeter System (GLAS) is an orbital lidar to be launched by 2002. GLAS will provide global measurements of the height distribution of aerosol. The sampling will be limited by nadir only coverage. There is a need for local sites to address sampling, and accuracy factors. Full time measurements of the vertical distribution of aerosol are now being acquired at a number of globally distributed MP (micro pulse) lidar sites. The MP lidar systems provide profiling of all significant cloud and aerosol to the limit of signal attenuation from compact, eye safe instruments. There are currently six sites in operation and over a dozen planned. At all sites there are a complement of passive aerosol and radiation measurements supporting the lidar data. Four of the installations are at Atmospheric Radiation Measurement program sites. The aerosol measurements, retrievals and data products from the network sites will be discussed. The current and planned application of data to supplement satellite aerosol measurements is covered.
Multicategory reclassification statistics for assessing improvements in diagnostic accuracy
Li, Jialiang; Jiang, Binyan; Fine, Jason P.
2013-01-01
In this paper, we extend the definitions of the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI) in the context of multicategory classification. Both measures were proposed in Pencina and others (2008. Evaluating the added predictive ability of a new marker: from area under the receiver operating characteristic (ROC) curve to reclassification and beyond. Statistics in Medicine 27, 157–172) as numeric characterizations of accuracy improvement for binary diagnostic tests and were shown to have certain advantage over analyses based on ROC curves or other regression approaches. Estimation and inference procedures for the multiclass NRI and IDI are provided in this paper along with necessary asymptotic distributional results. Simulations are conducted to study the finite-sample properties of the proposed estimators. Two medical examples are considered to illustrate our methodology. PMID:23197381
Modelling of thick composites using a layerwise laminate theory
NASA Technical Reports Server (NTRS)
Robbins, D. H., Jr.; Reddy, J. N.
1993-01-01
The layerwise laminate theory of Reddy (1987) is used to develop a layerwise, two-dimensional, displacement-based, finite element model of laminated composite plates that assumes a piecewise continuous distribution of the tranverse strains through the laminate thickness. The resulting layerwise finite element model is capable of computing interlaminar stresses and other localized effects with the same level of accuracy as a conventional 3D finite element model. Although the total number of degrees of freedom are comparable in both models, the layerwise model maintains a 2D-type data structure that provides several advantages over a conventional 3D finite element model, e.g. simplified input data, ease of mesh alteration, and faster element stiffness matrix formulation. Two sample problems are provided to illustrate the accuracy of the present model in computing interlaminar stresses for laminates in bending and extension.
NASA Technical Reports Server (NTRS)
Dobson, M. C.; Ulaby, F. T.; Moezzi, S.; Roth, E.
1983-01-01
Simulated C-band radar imagery for a 124-km by 108-km test site in eastern Kansas is used to classify soil moisture. Simulated radar resolutions are 100 m by 100 m, 1 km by 1 km, and 3 km by 3 km, and each is processed using more than 23 independent samples. Moisture classification errors are examined as a function of land-cover distribution, field-size distribution, and local topographic relief for the full test site and also for subregions of cropland, urban areas, woodland, and pasture/rangeland. Results show that a radar resolution of 100 m by 100 m yields the most robust classification accuracies.
Laboratory evaluation of the Sequoia Scientific LISST-ABS acoustic backscatter sediment sensor
Snazelle, Teri T.
2017-12-18
Sequoia Scientific’s LISST-ABS is an acoustic backscatter sensor designed to measure suspended-sediment concentration at a point source. Three LISST-ABS were evaluated at the U.S. Geological Survey (USGS) Hydrologic Instrumentation Facility (HIF). Serial numbers 6010, 6039, and 6058 were assessed for accuracy in solutions with varying particle-size distributions and for the effect of temperature on sensor accuracy. Certified sediment samples composed of different ranges of particle size were purchased from Powder Technology Inc. These sediment samples were 30–80-micron (µm) Arizona Test Dust; less than 22-µm ISO 12103-1, A1 Ultrafine Test Dust; and 149-µm MIL-STD 810E Silica Dust. The sensor was able to accurately measure suspended-sediment concentration when calibrated with sediment of the same particle-size distribution as the measured. Overall testing demonstrated that sensors calibrated with finer sized sediments overdetect sediment concentrations with coarser sized sediments, and sensors calibrated with coarser sized sediments do not detect increases in sediment concentrations from small and fine sediments. These test results are not unexpected for an acoustic-backscatter device and stress the need for using accurate site-specific particle-size distributions during sensor calibration. When calibrated for ultrafine dust with a less than 22-µm particle size (silt) and with the Arizona Test Dust with a 30–80-µm range, the data from sensor 6039 were biased high when fractions of the coarser (149-µm) Silica Dust were added. Data from sensor 6058 showed similar results with an elevated response to coarser material when calibrated with a finer particle-size distribution and a lack of detection when subjected to finer particle-size sediment. Sensor 6010 was also tested for the effect of dissimilar particle size during the calibration and showed little effect. Subsequent testing revealed problems with this sensor, including an inadequate temperature compensation, making this data questionable. The sensor was replaced by Sequoia Scientific with serial number 6039. Results from the extended temperature testing showed proper temperature compensation for sensor 6039, and results from the dissimilar calibration/testing particle-size distribution closely corroborated the results from sensor 6058.
Parent, Francois; Loranger, Sebastien; Mandal, Koushik Kanti; Iezzi, Victor Lambin; Lapointe, Jerome; Boisvert, Jean-Sébastien; Baiad, Mohamed Diaa; Kadoury, Samuel; Kashyap, Raman
2017-04-01
We demonstrate a novel approach to enhance the precision of surgical needle shape tracking based on distributed strain sensing using optical frequency domain reflectometry (OFDR). The precision enhancement is provided by using optical fibers with high scattering properties. Shape tracking of surgical tools using strain sensing properties of optical fibers has seen increased attention in recent years. Most of the investigations made in this field use fiber Bragg gratings (FBG), which can be used as discrete or quasi-distributed strain sensors. By using a truly distributed sensing approach (OFDR), preliminary results show that the attainable accuracy is comparable to accuracies reported in the literature using FBG sensors for tracking applications (~1mm). We propose a technique that enhanced our accuracy by 47% using UV exposed fibers, which have higher light scattering compared to un-exposed standard single mode fibers. Improving the experimental setup will enhance the accuracy provided by shape tracking using OFDR and will contribute significantly to clinical applications.
Fast and Accurate Support Vector Machines on Large Scale Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vishnu, Abhinav; Narasimhan, Jayenthi; Holder, Larry
Support Vector Machines (SVM) is a supervised Machine Learning and Data Mining (MLDM) algorithm, which has become ubiquitous largely due to its high accuracy and obliviousness to dimensionality. The objective of SVM is to find an optimal boundary --- also known as hyperplane --- which separates the samples (examples in a dataset) of different classes by a maximum margin. Usually, very few samples contribute to the definition of the boundary. However, existing parallel algorithms use the entire dataset for finding the boundary, which is sub-optimal for performance reasons. In this paper, we propose a novel distributed memory algorithm to eliminatemore » the samples which do not contribute to the boundary definition in SVM. We propose several heuristics, which range from early (aggressive) to late (conservative) elimination of the samples, such that the overall time for generating the boundary is reduced considerably. In a few cases, a sample may be eliminated (shrunk) pre-emptively --- potentially resulting in an incorrect boundary. We propose a scalable approach to synchronize the necessary data structures such that the proposed algorithm maintains its accuracy. We consider the necessary trade-offs of single/multiple synchronization using in-depth time-space complexity analysis. We implement the proposed algorithm using MPI and compare it with libsvm--- de facto sequential SVM software --- which we enhance with OpenMP for multi-core/many-core parallelism. Our proposed approach shows excellent efficiency using up to 4096 processes on several large datasets such as UCI HIGGS Boson dataset and Offending URL dataset.« less
Evaluation of accelerometer based multi-sensor versus single-sensor activity recognition systems.
Gao, Lei; Bourke, A K; Nelson, John
2014-06-01
Physical activity has a positive impact on people's well-being and it had been shown to decrease the occurrence of chronic diseases in the older adult population. To date, a substantial amount of research studies exist, which focus on activity recognition using inertial sensors. Many of these studies adopt a single sensor approach and focus on proposing novel features combined with complex classifiers to improve the overall recognition accuracy. In addition, the implementation of the advanced feature extraction algorithms and the complex classifiers exceed the computing ability of most current wearable sensor platforms. This paper proposes a method to adopt multiple sensors on distributed body locations to overcome this problem. The objective of the proposed system is to achieve higher recognition accuracy with "light-weight" signal processing algorithms, which run on a distributed computing based sensor system comprised of computationally efficient nodes. For analysing and evaluating the multi-sensor system, eight subjects were recruited to perform eight normal scripted activities in different life scenarios, each repeated three times. Thus a total of 192 activities were recorded resulting in 864 separate annotated activity states. The methods for designing such a multi-sensor system required consideration of the following: signal pre-processing algorithms, sampling rate, feature selection and classifier selection. Each has been investigated and the most appropriate approach is selected to achieve a trade-off between recognition accuracy and computing execution time. A comparison of six different systems, which employ single or multiple sensors, is presented. The experimental results illustrate that the proposed multi-sensor system can achieve an overall recognition accuracy of 96.4% by adopting the mean and variance features, using the Decision Tree classifier. The results demonstrate that elaborate classifiers and feature sets are not required to achieve high recognition accuracies on a multi-sensor system. Copyright © 2014 IPEM. Published by Elsevier Ltd. All rights reserved.
The Detection and Statistics of Giant Arcs behind CLASH Clusters
NASA Astrophysics Data System (ADS)
Xu, Bingxiao; Postman, Marc; Meneghetti, Massimo; Seitz, Stella; Zitrin, Adi; Merten, Julian; Maoz, Dani; Frye, Brenda; Umetsu, Keiichi; Zheng, Wei; Bradley, Larry; Vega, Jesus; Koekemoer, Anton
2016-02-01
We developed an algorithm to find and characterize gravitationally lensed galaxies (arcs) to perform a comparison of the observed and simulated arc abundance. Observations are from the Cluster Lensing And Supernova survey with Hubble (CLASH). Simulated CLASH images are created using the MOKA package and also clusters selected from the high-resolution, hydrodynamical simulations, MUSIC, over the same mass and redshift range as the CLASH sample. The algorithm's arc elongation accuracy, completeness, and false positive rate are determined and used to compute an estimate of the true arc abundance. We derive a lensing efficiency of 4 ± 1 arcs (with length ≥6″ and length-to-width ratio ≥7) per cluster for the X-ray-selected CLASH sample, 4 ± 1 arcs per cluster for the MOKA-simulated sample, and 3 ± 1 arcs per cluster for the MUSIC-simulated sample. The observed and simulated arc statistics are in full agreement. We measure the photometric redshifts of all detected arcs and find a median redshift zs = 1.9 with 33% of the detected arcs having zs > 3. We find that the arc abundance does not depend strongly on the source redshift distribution but is sensitive to the mass distribution of the dark matter halos (e.g., the c-M relation). Our results show that consistency between the observed and simulated distributions of lensed arc sizes and axial ratios can be achieved by using cluster-lensing simulations that are carefully matched to the selection criteria used in the observations.
Are all data created equal?--Exploring some boundary conditions for a lazy intuitive statistician.
Lindskog, Marcus; Winman, Anders
2014-01-01
The study investigated potential effects of the presentation order of numeric information on retrospective subjective judgments of descriptive statistics of this information. The studies were theoretically motivated by the assumption in the naïve sampling model of independence between temporal encoding order of data in long-term memory and retrieval probability (i.e. as implied by a "random sampling" from memory metaphor). In Experiment 1, participants experienced Arabic numbers that varied in distribution shape/variability between the first and the second half of the information sequence. Results showed no effects of order on judgments of mean, variability or distribution shape. To strengthen the interpretation of these results, Experiment 2 used a repeated judgment procedure, with an initial judgment occurring prior to the change in distribution shape of the information half-way through data presentation. The results of Experiment 2 were in line with those from Experiment 1, and in addition showed that the act of making explicit judgments did not impair accuracy of later judgments, as would be suggested by an anchoring and insufficient adjustment strategy. Overall, the results indicated that participants were very responsive to the properties of the data while at the same time being more or less immune to order effects. The results were interpreted as being in line with the naïve sampling models in which values are stored as exemplars and sampled randomly from long-term memory.
Dopant mapping in thin FIB prepared silicon samples by Off-Axis Electron Holography.
Pantzer, Adi; Vakahy, Atsmon; Eliyahou, Zohar; Levi, George; Horvitz, Dror; Kohn, Amit
2014-03-01
Modern semiconductor devices function due to accurate dopant distribution. Off-Axis Electron Holography (OAEH) in the transmission electron microscope (TEM) can map quantitatively the electrostatic potential in semiconductors with high spatial resolution. For the microelectronics industry, ongoing reduction of device dimensions, 3D device geometry, and failure analysis of specific devices require preparation of thin TEM samples, under 70 nm thick, by focused ion beam (FIB). Such thicknesses, which are considerably thinner than the values reported to date in the literature, are challenging due to FIB induced damage and surface depletion effects. Here, we report on preparation of TEM samples of silicon PN junctions in the FIB completed by low-energy (5 keV) ion milling, which reduced amorphization of the silicon to 10nm thick. Additional perpendicular FIB sectioning enabled a direct measurement of the TEM sample thickness in order to determine accurately the crystalline thickness of the sample. Consequently, we find that the low-energy milling also resulted in a negligible thickness of electrically inactive regions, approximately 4nm thick. The influence of TEM sample thickness, FIB induced damage and doping concentrations on the accuracy of the OAEH measurements were examined by comparison to secondary ion mass spectrometry measurements as well as to 1D and 3D simulations of the electrostatic potentials. We conclude that for TEM samples down to 100 nm thick, OAEH measurements of Si-based PN junctions, for the doping levels examined here, resulted in quantitative mapping of potential variations, within ~0.1 V. For thinner TEM samples, down to 20 nm thick, mapping of potential variations is qualitative, due to a reduced accuracy of ~0.3 V. This article is dedicated to the memory of Zohar Eliyahou. Copyright © 2014 Elsevier B.V. All rights reserved.
Lakshmanan, Manu N.; Greenberg, Joel A.; Samei, Ehsan; Kapadia, Anuj J.
2016-01-01
Abstract. A scatter imaging technique for the differentiation of cancerous and healthy breast tissue in a heterogeneous sample is introduced in this work. Such a technique has potential utility in intraoperative margin assessment during lumpectomy procedures. In this work, we investigate the feasibility of the imaging method for tumor classification using Monte Carlo simulations and physical experiments. The coded aperture coherent scatter spectral imaging technique was used to reconstruct three-dimensional (3-D) images of breast tissue samples acquired through a single-position snapshot acquisition, without rotation as is required in coherent scatter computed tomography. We perform a quantitative assessment of the accuracy of the cancerous voxel classification using Monte Carlo simulations of the imaging system; describe our experimental implementation of coded aperture scatter imaging; show the reconstructed images of the breast tissue samples; and present segmentations of the 3-D images in order to identify the cancerous and healthy tissue in the samples. From the Monte Carlo simulations, we find that coded aperture scatter imaging is able to reconstruct images of the samples and identify the distribution of cancerous and healthy tissues (i.e., fibroglandular, adipose, or a mix of the two) inside them with a cancerous voxel identification sensitivity, specificity, and accuracy of 92.4%, 91.9%, and 92.0%, respectively. From the experimental results, we find that the technique is able to identify cancerous and healthy tissue samples and reconstruct differential coherent scatter cross sections that are highly correlated with those measured by other groups using x-ray diffraction. Coded aperture scatter imaging has the potential to provide scatter images that automatically differentiate cancerous and healthy tissue inside samples within a time on the order of a minute per slice. PMID:26962543
Lakshmanan, Manu N; Greenberg, Joel A; Samei, Ehsan; Kapadia, Anuj J
2016-01-01
A scatter imaging technique for the differentiation of cancerous and healthy breast tissue in a heterogeneous sample is introduced in this work. Such a technique has potential utility in intraoperative margin assessment during lumpectomy procedures. In this work, we investigate the feasibility of the imaging method for tumor classification using Monte Carlo simulations and physical experiments. The coded aperture coherent scatter spectral imaging technique was used to reconstruct three-dimensional (3-D) images of breast tissue samples acquired through a single-position snapshot acquisition, without rotation as is required in coherent scatter computed tomography. We perform a quantitative assessment of the accuracy of the cancerous voxel classification using Monte Carlo simulations of the imaging system; describe our experimental implementation of coded aperture scatter imaging; show the reconstructed images of the breast tissue samples; and present segmentations of the 3-D images in order to identify the cancerous and healthy tissue in the samples. From the Monte Carlo simulations, we find that coded aperture scatter imaging is able to reconstruct images of the samples and identify the distribution of cancerous and healthy tissues (i.e., fibroglandular, adipose, or a mix of the two) inside them with a cancerous voxel identification sensitivity, specificity, and accuracy of 92.4%, 91.9%, and 92.0%, respectively. From the experimental results, we find that the technique is able to identify cancerous and healthy tissue samples and reconstruct differential coherent scatter cross sections that are highly correlated with those measured by other groups using x-ray diffraction. Coded aperture scatter imaging has the potential to provide scatter images that automatically differentiate cancerous and healthy tissue inside samples within a time on the order of a minute per slice.
Measuring atmospheric aerosols of organic origin on multirotor Unmanned Aerial Vehicles (UAVs).
NASA Astrophysics Data System (ADS)
Crazzolara, Claudio; Platis, Andreas; Bange, Jens
2017-04-01
In-situ measurements of the spatial distribution and transportation of atmospheric organic particles such as pollen and spores are of great interdisciplinary interest such as: - In agriculture to investigate the spread of transgenetic material, - In paleoclimatology to improve the accuracy of paleoclimate models derived from pollen grains retrieved from sediments, and - In meteorology/climate research to determine the role of spores and pollen acting as nuclei in cloud formation processes. The few known state of the art in-situ measurement systems are using passive sampling units carried by fixed wing UAVs, thus providing only limited spatial resolution of aerosol concentration. Also the passively sampled air volume is determined with low accuracy as it is only calculated by the length of the flight path. We will present a new approach, which is based on the use of a multirotor UAV providing a versatile platform. On this UAV an optical particle counter in addition to a particle collecting unit, e.g. a conventional filter element and/or a inertial mass separator were installed. Both sampling units were driven by a mass flow controlled blower. This allows not only an accurate determination of the number and size concentration, but also an exact classification of the type of collected aerosol particles as well as an accurate determination of the sampled air volume. In addition, due to the application of a multirotor UAV with its automated position stabilisation system, the aerosol concentration can be measured with a very high spatial resolution of less than 1 m in all three dimensions. The combination of comprehensive determination of number, type and classification of aerosol particles in combination with the very high spatial resolution provides not only valuable progress in agriculture, paleoclimatology and meteorology, but also opens up the application of multirotor UAVs in new fields, for example for precise determination of the mechanisms of generation and distribution of fine particulate matter as the result of road traffic.
NASA Astrophysics Data System (ADS)
Chung, Kee-Choo; Park, Hwangseo
2016-11-01
The performance of the extended solvent-contact model has been addressed in the SAMPL5 blind prediction challenge for distribution coefficient (LogD) of drug-like molecules with respect to the cyclohexane/water partitioning system. All the atomic parameters defined for 41 atom types in the solvation free energy function were optimized by operating a standard genetic algorithm with respect to water and cyclohexane solvents. In the parameterizations for cyclohexane, the experimental solvation free energy (Δ G sol ) data of 15 molecules for 1-octanol were combined with those of 77 molecules for cyclohexane to construct a training set because Δ G sol values of the former were unavailable for cyclohexane in publicly accessible databases. Using this hybrid training set, we established the LogD prediction model with the correlation coefficient ( R), average error (AE), and root mean square error (RMSE) of 0.55, 1.53, and 3.03, respectively, for the comparison of experimental and computational results for 53 SAMPL5 molecules. The modest accuracy in LogD prediction could be attributed to the incomplete optimization of atomic solvation parameters for cyclohexane. With respect to 31 SAMPL5 molecules containing the atom types for which experimental reference data for Δ G sol were available for both water and cyclohexane, the accuracy in LogD prediction increased remarkably with the R, AE, and RMSE values of 0.82, 0.89, and 1.60, respectively. This significant enhancement in performance stemmed from the better optimization of atomic solvation parameters by limiting the element of training set to the molecules with experimental Δ G sol data for cyclohexane. Due to the simplicity in model building and to low computational cost for parameterizations, the extended solvent-contact model is anticipated to serve as a valuable computational tool for LogD prediction upon the enrichment of experimental Δ G sol data for organic solvents.
Using Historical Atlas Data to Develop High-Resolution Distribution Models of Freshwater Fishes
Huang, Jian; Frimpong, Emmanuel A.
2015-01-01
Understanding the spatial pattern of species distributions is fundamental in biogeography, and conservation and resource management applications. Most species distribution models (SDMs) require or prefer species presence and absence data for adequate estimation of model parameters. However, observations with unreliable or unreported species absences dominate and limit the implementation of SDMs. Presence-only models generally yield less accurate predictions of species distribution, and make it difficult to incorporate spatial autocorrelation. The availability of large amounts of historical presence records for freshwater fishes of the United States provides an opportunity for deriving reliable absences from data reported as presence-only, when sampling was predominantly community-based. In this study, we used boosted regression trees (BRT), logistic regression, and MaxEnt models to assess the performance of a historical metacommunity database with inferred absences, for modeling fish distributions, investigating the effect of model choice and data properties thereby. With models of the distribution of 76 native, non-game fish species of varied traits and rarity attributes in four river basins across the United States, we show that model accuracy depends on data quality (e.g., sample size, location precision), species’ rarity, statistical modeling technique, and consideration of spatial autocorrelation. The cross-validation area under the receiver-operating-characteristic curve (AUC) tended to be high in the spatial presence-absence models at the highest level of resolution for species with large geographic ranges and small local populations. Prevalence affected training but not validation AUC. The key habitat predictors identified and the fish-habitat relationships evaluated through partial dependence plots corroborated most previous studies. The community-based SDM framework broadens our capability to model species distributions by innovatively removing the constraint of lack of species absence data, thus providing a robust prediction of distribution for stream fishes in other regions where historical data exist, and for other taxa (e.g., benthic macroinvertebrates, birds) usually observed by community-based sampling designs. PMID:26075902
[Ecology suitability study of Ephedra intermedia].
Ma, Xiao-Hui; Lu, You-Yuan; Huang, De-Dong; Zhu, Tian-Tian; Lv, Pei-Lin; Jin, Ling
2017-06-01
The study aims at predicting ecological suitability of Ephedra intermedia in China by using maximum entropy Maxent model combined with GIS, and finding the main ecological factors affecting the distribution of E. intermedia suitability in appropriate growth area. Thirty-eight collected samples of E. intermedia and E. intermedia and 116 distribution information from CVH information using ArcGIS technology were analyzed. MaxEnt model was applied to forecast the E. intermedia in our country's ecology. E. intermedia MaxEnt ROC curve model training data and testing data sets the AUC value was 0.986 and 0.958, respectively, which were greater than 0.9, tending to be 1.The calculated E. intermedia habitat suitability by the model showed a high accuracy and credibility, which indicated that MaxEnt model could well predict the potential distribution area of E. intermedia in China. Copyright© by the Chinese Pharmaceutical Association.
Barnard, P.L.; Rubin, D.M.; Harney, J.; Mustain, N.
2007-01-01
This extensive field test of an autocorrelation technique for determining grain size from digital images was conducted using a digital bed-sediment camera, or 'beachball' camera. Using 205 sediment samples and >1200 images from a variety of beaches on the west coast of the US, grain size ranging from sand to granules was measured from field samples using both the autocorrelation technique developed by Rubin [Rubin, D.M., 2004. A simple autocorrelation algorithm for determining grain size from digital images of sediment. Journal of Sedimentary Research, 74(1): 160-165.] and traditional methods (i.e. settling tube analysis, sieving, and point counts). To test the accuracy of the digital-image grain size algorithm, we compared results with manual point counts of an extensive image data set in the Santa Barbara littoral cell. Grain sizes calculated using the autocorrelation algorithm were highly correlated with the point counts of the same images (r2 = 0.93; n = 79) and had an error of only 1%. Comparisons of calculated grain sizes and grain sizes measured from grab samples demonstrated that the autocorrelation technique works well on high-energy dissipative beaches with well-sorted sediment such as in the Pacific Northwest (r2 ??? 0.92; n = 115). On less dissipative, more poorly sorted beaches such as Ocean Beach in San Francisco, results were not as good (r2 ??? 0.70; n = 67; within 3% accuracy). Because the algorithm works well compared with point counts of the same image, the poorer correlation with grab samples must be a result of actual spatial and vertical variability of sediment in the field; closer agreement between grain size in the images and grain size of grab samples can be achieved by increasing the sampling volume of the images (taking more images, distributed over a volume comparable to that of a grab sample). In all field tests the autocorrelation method was able to predict the mean and median grain size with ???96% accuracy, which is more than adequate for the majority of sedimentological applications, especially considering that the autocorrelation technique is estimated to be at least 100 times faster than traditional methods.
Urban Land Cover Mapping Accuracy Assessment - A Cost-benefit Analysis Approach
NASA Astrophysics Data System (ADS)
Xiao, T.
2012-12-01
One of the most important components in urban land cover mapping is mapping accuracy assessment. Many statistical models have been developed to help design simple schemes based on both accuracy and confidence levels. It is intuitive that an increased number of samples increases the accuracy as well as the cost of an assessment. Understanding cost and sampling size is crucial in implementing efficient and effective of field data collection. Few studies have included a cost calculation component as part of the assessment. In this study, a cost-benefit sampling analysis model was created by combining sample size design and sampling cost calculation. The sampling cost included transportation cost, field data collection cost, and laboratory data analysis cost. Simple Random Sampling (SRS) and Modified Systematic Sampling (MSS) methods were used to design sample locations and to extract land cover data in ArcGIS. High resolution land cover data layers of Denver, CO and Sacramento, CA, street networks, and parcel GIS data layers were used in this study to test and verify the model. The relationship between the cost and accuracy was used to determine the effectiveness of each sample method. The results of this study can be applied to other environmental studies that require spatial sampling.
Zhang, Cuicui; Liang, Xuefeng; Matsuyama, Takashi
2014-12-08
Multi-camera networks have gained great interest in video-based surveillance systems for security monitoring, access control, etc. Person re-identification is an essential and challenging task in multi-camera networks, which aims to determine if a given individual has already appeared over the camera network. Individual recognition often uses faces as a trial and requires a large number of samples during the training phrase. This is difficult to fulfill due to the limitation of the camera hardware system and the unconstrained image capturing conditions. Conventional face recognition algorithms often encounter the "small sample size" (SSS) problem arising from the small number of training samples compared to the high dimensionality of the sample space. To overcome this problem, interest in the combination of multiple base classifiers has sparked research efforts in ensemble methods. However, existing ensemble methods still open two questions: (1) how to define diverse base classifiers from the small data; (2) how to avoid the diversity/accuracy dilemma occurring during ensemble. To address these problems, this paper proposes a novel generic learning-based ensemble framework, which augments the small data by generating new samples based on a generic distribution and introduces a tailored 0-1 knapsack algorithm to alleviate the diversity/accuracy dilemma. More diverse base classifiers can be generated from the expanded face space, and more appropriate base classifiers are selected for ensemble. Extensive experimental results on four benchmarks demonstrate the higher ability of our system to cope with the SSS problem compared to the state-of-the-art system.
Zhang, Cuicui; Liang, Xuefeng; Matsuyama, Takashi
2014-01-01
Multi-camera networks have gained great interest in video-based surveillance systems for security monitoring, access control, etc. Person re-identification is an essential and challenging task in multi-camera networks, which aims to determine if a given individual has already appeared over the camera network. Individual recognition often uses faces as a trial and requires a large number of samples during the training phrase. This is difficult to fulfill due to the limitation of the camera hardware system and the unconstrained image capturing conditions. Conventional face recognition algorithms often encounter the “small sample size” (SSS) problem arising from the small number of training samples compared to the high dimensionality of the sample space. To overcome this problem, interest in the combination of multiple base classifiers has sparked research efforts in ensemble methods. However, existing ensemble methods still open two questions: (1) how to define diverse base classifiers from the small data; (2) how to avoid the diversity/accuracy dilemma occurring during ensemble. To address these problems, this paper proposes a novel generic learning-based ensemble framework, which augments the small data by generating new samples based on a generic distribution and introduces a tailored 0–1 knapsack algorithm to alleviate the diversity/accuracy dilemma. More diverse base classifiers can be generated from the expanded face space, and more appropriate base classifiers are selected for ensemble. Extensive experimental results on four benchmarks demonstrate the higher ability of our system to cope with the SSS problem compared to the state-of-the-art system. PMID:25494350
Global habitat suitability for framework-forming cold-water corals.
Davies, Andrew J; Guinotte, John M
2011-04-15
Predictive habitat models are increasingly being used by conservationists, researchers and governmental bodies to identify vulnerable ecosystems and species' distributions in areas that have not been sampled. However, in the deep sea, several limitations have restricted the widespread utilisation of this approach. These range from issues with the accuracy of species presences, the lack of reliable absence data and the limited spatial resolution of environmental factors known or thought to control deep-sea species' distributions. To address these problems, global habitat suitability models have been generated for five species of framework-forming scleractinian corals by taking the best available data and using a novel approach to generate high resolution maps of seafloor conditions. High-resolution global bathymetry was used to resample gridded data from sources such as World Ocean Atlas to produce continuous 30-arc second (∼1 km(2)) global grids for environmental, chemical and physical data of the world's oceans. The increased area and resolution of the environmental variables resulted in a greater number of coral presence records being incorporated into habitat models and higher accuracy of model predictions. The most important factors in determining cold-water coral habitat suitability were depth, temperature, aragonite saturation state and salinity. Model outputs indicated the majority of suitable coral habitat is likely to occur on the continental shelves and slopes of the Atlantic, South Pacific and Indian Oceans. The North Pacific has very little suitable scleractinian coral habitat. Numerous small scale features (i.e., seamounts), which have not been sampled or identified as having a high probability of supporting cold-water coral habitat were identified in all ocean basins. Field validation of newly identified areas is needed to determine the accuracy of model results, assess the utility of modelling efforts to identify vulnerable marine ecosystems for inclusion in future marine protected areas and reduce coral bycatch by commercial fisheries.
Spatial tools for managing hemlock woolly adelgid in the southern Appalachians
NASA Astrophysics Data System (ADS)
Koch, Frank Henry, Jr.
The hemlock woolly adelgid (Adelges tsugae) has recently spread into the southern Appalachians. This insect attacks both native hemlock species (Tsuga canadensis and T. caroliniana ), has no natural enemies, and can kill hemlocks within four years. Biological control displays promise for combating the pest, but counter-measures are impeded because adelgid and hemlock distribution patterns have been detailed poorly. We developed a spatial management system to better target control efforts, with two components: (1) a protocol for mapping hemlock stands, and (2) a technique to map areas at risk of imminent infestation. To construct a hemlock classifier, we used topographically normalized satellite images from Great Smoky Mountains National Park. Employing a decision tree approach that supplemented image spectral data with several environmental variables, we generated rules distinguishing hemlock areas from other forest types. We then implemented these rules in a geographic information system and generated hemlock distribution maps. Assessment yielded an overall thematic accuracy of 90% for one study area, and 75% accuracy in capturing hemlocks in a second study area. To map areas at risk, we combined first-year infestation locations from Great Smoky Mountains National Park and the Blue Ridge Parkway with points from uninfested hemlock stands, recording a suite of environmental variables for each point. We applied four different multivariate classification techniques to generate models from this sample predicting locations with high infestation risk, and used the resulting models to generate risk maps for the study region. All techniques performed well, accurately capturing 70--90% of training and validation samples, with the logistic regression model best balancing accuracy and regional applicability. Areas close to trails, roads, and streams appear to have the highest initial risk, perhaps due to bird- or human-mediated dispersal. Both components of our management system are general enough for use throughout the southern Appalachians. Overlay of derived maps will allow forest managers to reduce the area where they must focus their control efforts and thus allocate resources more efficiently.
Le, Minh Uyen Thi; Son, Jin Gyeong; Shon, Hyun Kyoung; Park, Jeong Hyang; Lee, Sung Bae; Lee, Tae Geol
2018-03-30
Time-of-flight secondary ion mass spectrometry (ToF-SIMS) imaging elucidates molecular distributions in tissue sections, providing useful information about the metabolic pathways linked to diseases. However, delocalization of the analytes and inadequate tissue adherence during sample preparation are among some of the unfortunate phenomena associated with this technique due to their role in the reduction of the quality, reliability, and spatial resolution of the ToF-SIMS images. For these reasons, ToF-SIMS imaging requires a more rigorous sample preparation method in order to preserve the natural state of the tissues. The traditional thaw-mounting method is particularly vulnerable to altered distributions of the analytes due to thermal effects, as well as to tissue shrinkage. In the present study, the authors made comparisons of different tissue mounting methods, including the thaw-mounting method. The authors used conductive tape as the tissue-mounting material on the substrate because it does not require heat from the finger for the tissue section to adhere to the substrate and can reduce charge accumulation during data acquisition. With the conductive-tape sampling method, they were able to acquire reproducible tissue sections and high-quality images without redistribution of the molecules. Also, the authors were successful in preserving the natural states and chemical distributions of the different components of fat metabolites such as diacylglycerol and fatty acids by using the tape-supported sampling in microRNA-14 (miR-14) deleted Drosophila models. The method highlighted here shows an improvement in the accuracy of mass spectrometric imaging of tissue samples.
Walsh, Noreen M; Lai, Jonathan; Hanly, John G; Green, Peter J; Bosisio, Francesca; Garcias-Ladaria, Juan; Cerroni, Lorenzo
2015-01-01
Hypertrophic discoid lupus erythematosus (HDLE), a rare variant of lupus skin disease, is difficult to distinguish from squamous neoplasms and certain dermatoses microscopically. Recently, recognition of the pathogenetic significance of plasmacytoid dendritic cells (PDCS) in cutaneous lupus erythematosus (LE) and of their patterns of distribution in different manifestations of the disease prompted us to study their diagnostic value in the context of HDLE. Using immunohistochemistry (CD123) to label the cells, we examined their quantities and patterns of distribution in 27 tissue samples of HDLE from nine patients compared with 39 inflammatory and neoplastic control samples from 36 patients. Using three parameters pertaining to PDCs: (i) their representation of 10% or more of the inflammatory infiltrate, (ii) their arrangement in clusters of 10 cells or more and (iii) their presence at the dermoepidermal junction, we found them to have significant diagnostic value, with accuracies of 77%, 74% and 71%, respectively. This study supports the careful descriptive observations of previous authors in the field. It also lends validity to the diagnostic step of mapping, immunohistochemically, the density and distribution of PDCs in suspected cases of HDLE. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Hierarchical Probabilistic Inference of Cosmic Shear
NASA Astrophysics Data System (ADS)
Schneider, Michael D.; Hogg, David W.; Marshall, Philip J.; Dawson, William A.; Meyers, Joshua; Bard, Deborah J.; Lang, Dustin
2015-07-01
Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probabilitiy distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.
Propagation of the velocity model uncertainties to the seismic event location
NASA Astrophysics Data System (ADS)
Gesret, A.; Desassis, N.; Noble, M.; Romary, T.; Maisons, C.
2015-01-01
Earthquake hypocentre locations are crucial in many domains of application (academic and industrial) as seismic event location maps are commonly used to delineate faults or fractures. The interpretation of these maps depends on location accuracy and on the reliability of the associated uncertainties. The largest contribution to location and uncertainty errors is due to the fact that the velocity model errors are usually not correctly taken into account. We propose a new Bayesian formulation that integrates properly the knowledge on the velocity model into the formulation of the probabilistic earthquake location. In this work, the velocity model uncertainties are first estimated with a Bayesian tomography of active shot data. We implement a sampling Monte Carlo type algorithm to generate velocity models distributed according to the posterior distribution. In a second step, we propagate the velocity model uncertainties to the seismic event location in a probabilistic framework. This enables to obtain more reliable hypocentre locations as well as their associated uncertainties accounting for picking and velocity model uncertainties. We illustrate the tomography results and the gain in accuracy of earthquake location for two synthetic examples and one real data case study in the context of induced microseismicity.
NASA Astrophysics Data System (ADS)
Wang, Bingjie; Sun, Qi; Pi, Shaohua; Wu, Hongyan
2014-09-01
In this paper, feature extraction and pattern recognition of the distributed optical fiber sensing signal have been studied. We adopt Mel-Frequency Cepstral Coefficient (MFCC) feature extraction, wavelet packet energy feature extraction and wavelet packet Shannon entropy feature extraction methods to obtain sensing signals (such as speak, wind, thunder and rain signals, etc.) characteristic vectors respectively, and then perform pattern recognition via RBF neural network. Performances of these three feature extraction methods are compared according to the results. We choose MFCC characteristic vector to be 12-dimensional. For wavelet packet feature extraction, signals are decomposed into six layers by Daubechies wavelet packet transform, in which 64 frequency constituents as characteristic vector are respectively extracted. In the process of pattern recognition, the value of diffusion coefficient is introduced to increase the recognition accuracy, while keeping the samples for testing algorithm the same. Recognition results show that wavelet packet Shannon entropy feature extraction method yields the best recognition accuracy which is up to 97%; the performance of 12-dimensional MFCC feature extraction method is less satisfactory; the performance of wavelet packet energy feature extraction method is the worst.
NASA Astrophysics Data System (ADS)
Kairn, T.; Asena, A.; Crowe, S. B.; Livingstone, A.; Papworth, D.; Smith, S.; Sutherland, B.; Sylvander, S.; Franich, R. D.; Trapp, J. V.
2017-05-01
This study investigated the use of the TruView xylenol-orange-based gel and VISTA optical CT scanner (both by Modus Medical Inc, London, Canada), for use in verifying the accuracy of planned dose distributions for hypo-fractionated (stereotactic) vertebral treatments. Gel measurements were carried out using three stereotactic vertebral treatments and compared with planned doses calculated using the Eclipse treatment planning system (Varian Medical Systems, Palo Alto, USA) as well as with film measurements made using Gafchromic EBT3 film (Ashland Inc, Covington, USA), to investigate the accuracy of the gel system. The gel was calibrated with reference to a moderate-dose gradient region in one of the gel samples. Generally, the gel measurements were able to approximate the close agreement between the doses calculated by the treatment planning system and the doses measured using film (which agreed with each other within 2%), despite lower resolution and bit depth. Poorer agreement was observed when the dose delivered to the gel exceeded the range of doses delivered in the calibration region. This commercial gel dosimetry system may be used to verify hypo-fractionated treatments of vertebral targets, although separate gel calibration measurements are recommended.
NASA Astrophysics Data System (ADS)
Fusco, Terence; Bi, Yaxin; Nugent, Chris; Wu, Shengli
2016-08-01
We can see that the data imputation approach using the Regression CTA has performed more favourably when compared with the alternative methods on this dataset. We now have the evidence to show that this method is viable moving forward with further research in this area. The weighted distribution experiments have provided us with a more balanced and appropriate ratio for snail density classification purposes when using either the 3 or 5 category combination. The most desirable results are found when using 3 categories of SD with the weighted distribution of classes being 20-60-20. This information reflects the optimum classification accuracy across the data range and can be applied to any novel environment feature dataset pertaining to Schistosomiasis vector classification. ITSVM has provided us with a method of labelling SD data which we can use for classification with epidemic disease prediction research. The confidence level selection enables consistent labelling accuracy for bespoke requirements when classifying the data from each year. The SMOTE Equilibrium proposed method has yielded a slight increase with each multiple of synthetic instances that are compounded to the training dataset. The reduction of overfitting and increase of data instances has shown a gradual classification accuracy increase across the data for each year. We will now test to see what the optimum synthetic instance incremental increase is across our data and apply this to our experiments with this research.
Comparison of optimal design methods in inverse problems
NASA Astrophysics Data System (ADS)
Banks, H. T.; Holm, K.; Kappel, F.
2011-07-01
Typical optimal design methods for inverse or parameter estimation problems are designed to choose optimal sampling distributions through minimization of a specific cost function related to the resulting error in parameter estimates. It is hoped that the inverse problem will produce parameter estimates with increased accuracy using data collected according to the optimal sampling distribution. Here we formulate the classical optimal design problem in the context of general optimization problems over distributions of sampling times. We present a new Prohorov metric-based theoretical framework that permits one to treat succinctly and rigorously any optimal design criteria based on the Fisher information matrix. A fundamental approximation theory is also included in this framework. A new optimal design, SE-optimal design (standard error optimal design), is then introduced in the context of this framework. We compare this new design criterion with the more traditional D-optimal and E-optimal designs. The optimal sampling distributions from each design are used to compute and compare standard errors; the standard errors for parameters are computed using asymptotic theory or bootstrapping and the optimal mesh. We use three examples to illustrate ideas: the Verhulst-Pearl logistic population model (Banks H T and Tran H T 2009 Mathematical and Experimental Modeling of Physical and Biological Processes (Boca Raton, FL: Chapman and Hall/CRC)), the standard harmonic oscillator model (Banks H T and Tran H T 2009) and a popular glucose regulation model (Bergman R N, Ider Y Z, Bowden C R and Cobelli C 1979 Am. J. Physiol. 236 E667-77 De Gaetano A and Arino O 2000 J. Math. Biol. 40 136-68 Toffolo G, Bergman R N, Finegood D T, Bowden C R and Cobelli C 1980 Diabetes 29 979-90).
Sepehrband, Farshid; Choupan, Jeiran; Caruyer, Emmanuel; Kurniawan, Nyoman D; Gal, Yaniv; Tieng, Quang M; McMahon, Katie L; Vegh, Viktor; Reutens, David C; Yang, Zhengyi
2014-01-01
We describe and evaluate a pre-processing method based on a periodic spiral sampling of diffusion-gradient directions for high angular resolution diffusion magnetic resonance imaging. Our pre-processing method incorporates prior knowledge about the acquired diffusion-weighted signal, facilitating noise reduction. Periodic spiral sampling of gradient direction encodings results in an acquired signal in each voxel that is pseudo-periodic with characteristics that allow separation of low-frequency signal from high frequency noise. Consequently, it enhances local reconstruction of the orientation distribution function used to define fiber tracks in the brain. Denoising with periodic spiral sampling was tested using synthetic data and in vivo human brain images. The level of improvement in signal-to-noise ratio and in the accuracy of local reconstruction of fiber tracks was significantly improved using our method.
Joint measurement of complementary observables in moment tomography
NASA Astrophysics Data System (ADS)
Teo, Yong Siah; Müller, Christian R.; Jeong, Hyunseok; Hradil, Zdeněk; Řeháček, Jaroslav; Sánchez-Soto, Luis L.
Wigner and Husimi quasi-distributions, owing to their functional regularity, give the two archetypal and equivalent representations of all observable-parameters in continuous-variable quantum information. Balanced homodyning (HOM) and heterodyning (HET) that correspond to their associated sampling procedures, on the other hand, fare very differently concerning their state or parameter reconstruction accuracies. We present a general theory of a now-known fact that HET can be tomographically more powerful than balanced homodyning to many interesting classes of single-mode quantum states, and discuss the treatment for two-mode sources.
Multiscale implementation of infinite-swap replica exchange molecular dynamics.
Yu, Tang-Qing; Lu, Jianfeng; Abrams, Cameron F; Vanden-Eijnden, Eric
2016-10-18
Replica exchange molecular dynamics (REMD) is a popular method to accelerate conformational sampling of complex molecular systems. The idea is to run several replicas of the system in parallel at different temperatures that are swapped periodically. These swaps are typically attempted every few MD steps and accepted or rejected according to a Metropolis-Hastings criterion. This guarantees that the joint distribution of the composite system of replicas is the normalized sum of the symmetrized product of the canonical distributions of these replicas at the different temperatures. Here we propose a different implementation of REMD in which (i) the swaps obey a continuous-time Markov jump process implemented via Gillespie's stochastic simulation algorithm (SSA), which also samples exactly the aforementioned joint distribution and has the advantage of being rejection free, and (ii) this REMD-SSA is combined with the heterogeneous multiscale method to accelerate the rate of the swaps and reach the so-called infinite-swap limit that is known to optimize sampling efficiency. The method is easy to implement and can be trivially parallelized. Here we illustrate its accuracy and efficiency on the examples of alanine dipeptide in vacuum and C-terminal β-hairpin of protein G in explicit solvent. In this latter example, our results indicate that the landscape of the protein is a triple funnel with two folded structures and one misfolded structure that are stabilized by H-bonds.
Precision time distribution within a deep space communications complex
NASA Technical Reports Server (NTRS)
Curtright, J. B.
1972-01-01
The Precision Time Distribution System (PTDS) at the Golstone Deep Space Communications Complex is a practical application of existing technology to the solution of a local problem. The problem was to synchronize four station timing systems to a master source with a relative accuracy consistently and significantly better than 10 microseconds. The solution involved combining a precision timing source, an automatic error detection assembly and a microwave distribution network into an operational system. Upon activation of the completed PTDS two years ago, synchronization accuracy at Goldstone (two station relative) was improved by an order of magnitude. It is felt that the validation of the PTDS mechanization is now completed. Other facilities which have site dispersion and synchronization accuracy requirements similar to Goldstone may find the PTDS mechanization useful in solving their problem. At present, the two station relative synchronization accuracy at Goldstone is better than one microsecond.
Estimate Soil Erodibility Factors Distribution for Maioli Block
NASA Astrophysics Data System (ADS)
Lee, Wen-Ying
2014-05-01
The natural conditions in Taiwan are poor. Because of the steep slopes, rushing river and fragile geology, soil erosion turn into a serious problem. Not only undermine the sloping landscape, but also created sediment disaster like that reservoir sedimentation, river obstruction…etc. Therefore, predict and control the amount of soil erosion has become an important research topic. Soil erodibility factor (K) is a quantitative index of distinguish the ability of soil to resist the erosion separation and handling. Taiwan soil erodibility factors have been calculated 280 soil samples' erodibility factors by Wann and Huang (1989) use the Wischmeier and Smith nomorgraph. 221 samples were collected at the Maioli block in Miaoli. The coordinates of every sample point and the land use situations were recorded. The physical properties were analyzed for each sample. Three estimation methods, consist of Kriging, Inverse Distance Weighted (IDW) and Spline, were applied to estimate soil erodibility factors distribution for Maioli block by using 181 points data, and the remaining 40 points for the validation. Then, the SPSS regression analysis was used to comparison of the accuracy of the training data and validation data by three different methods. Then, the best method can be determined. In the future, we can used this method to predict the soil erodibility factors in other areas.
Measurement and analysis of x-ray absorption in Al and MgF2 plasmas heated by Z-pinch radiation.
Rochau, Gregory A; Bailey, J E; Macfarlane, J J
2005-12-01
High-power Z pinches on Sandia National Laboratories' Z facility can be used in a variety of experiments to radiatively heat samples placed some distance away from the Z-pinch plasma. In such experiments, the heating radiation spectrum is influenced by both the Z-pinch emission and the re-emission of radiation from the high-Z surfaces that make up the Z-pinch diode. To test the understanding of the amplitude and spectral distribution of the heating radiation, thin foils containing both Al and MgF2 were heated by a 100-130 TW Z pinch. The heating of these samples was studied through the ionization distribution in each material as measured by x-ray absorption spectra. The resulting plasma conditions are inferred from a least-squares comparison between the measured spectra and calculations of the Al and Mg 1s-->2p absorption over a large range of temperatures and densities. These plasma conditions are then compared to radiation-hydrodynamics simulations of the sample dynamics and are found to agree within 1sigma to the best-fit conditions. This agreement indicates that both the driving radiation spectrum and the heating of the Al and MgF2 samples is understood within the accuracy of the spectroscopic method.
Tallman, Sean D; Winburn, Allysha P
2015-09-01
Ancestry assessment from the postcranial skeleton presents a significant challenge to forensic anthropologists. However, metric dimensions of the femur subtrochanteric region are believed to distinguish between individuals of Asian and non-Asian descent. This study tests the discriminatory power of subtrochanteric shape using modern samples of 128 Thai and 77 White American males. Results indicate that the samples' platymeric index distributions are significantly different (p≤0.001), with the Thai platymeric index range generally lower and the White American range generally higher. While the application of ancestry assessment methods developed from Native American subtrochanteric data results in low correct classification rates for the Thai sample (50.8-57.8%), adapting these methods to the current samples leads to better classification. The Thai data may be more useful in forensic analysis than previously published subtrochanteric data derived from Native American samples. Adapting methods to include appropriate geographic and contemporaneous populations increases the accuracy of femur subtrochanteric ancestry methods. © 2015 American Academy of Forensic Sciences.
Novikov, I; Fund, N; Freedman, L S
2010-01-15
Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
Torres, Daiane Placido; Martins-Teixeira, Maristela Braga; Cadore, Solange; Queiroz, Helena Müller
2015-01-01
A method for the determination of total mercury in fresh fish and shrimp samples by solid sampling thermal decomposition/amalgamation atomic absorption spectrometry (TDA AAS) has been validated following international foodstuff protocols in order to fulfill the Brazilian National Residue Control Plan. The experimental parameters have been previously studied and optimized according to specific legislation on validation and inorganic contaminants in foodstuff. Linearity, sensitivity, specificity, detection and quantification limits, precision (repeatability and within-laboratory reproducibility), robustness as well as accuracy of the method have been evaluated. Linearity of response was satisfactory for the two range concentrations available on the TDA AAS equipment, between approximately 25.0 and 200.0 μg kg(-1) (square regression) and 250.0 and 2000.0 μg kg(-1) (linear regression) of mercury. The residues for both ranges were homoscedastic and independent, with normal distribution. Correlation coefficients obtained for these ranges were higher than 0.995. Limits of quantification (LOQ) and of detection of the method (LDM), based on signal standard deviation (SD) for a low-in-mercury sample, were 3.0 and 1.0 μg kg(-1), respectively. Repeatability of the method was better than 4%. Within-laboratory reproducibility achieved a relative SD better than 6%. Robustness of the current method was evaluated and pointed sample mass as a significant factor. Accuracy (assessed as the analyte recovery) was calculated on basis of the repeatability, and ranged from 89% to 99%. The obtained results showed the suitability of the present method for direct mercury measurement in fresh fish and shrimp samples and the importance of monitoring the analysis conditions for food control purposes. Additionally, the competence of this method was recognized by accreditation under the standard ISO/IEC 17025.
Molecular Isotopic Distribution Analysis (MIDAs) with Adjustable Mass Accuracy
NASA Astrophysics Data System (ADS)
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2014-01-01
In this paper, we present Molecular Isotopic Distribution Analysis (MIDAs), a new software tool designed to compute molecular isotopic distributions with adjustable accuracies. MIDAs offers two algorithms, one polynomial-based and one Fourier-transform-based, both of which compute molecular isotopic distributions accurately and efficiently. The polynomial-based algorithm contains few novel aspects, whereas the Fourier-transform-based algorithm consists mainly of improvements to other existing Fourier-transform-based algorithms. We have benchmarked the performance of the two algorithms implemented in MIDAs with that of eight software packages (BRAIN, Emass, Mercury, Mercury5, NeutronCluster, Qmass, JFC, IC) using a consensus set of benchmark molecules. Under the proposed evaluation criteria, MIDAs's algorithms, JFC, and Emass compute with comparable accuracy the coarse-grained (low-resolution) isotopic distributions and are more accurate than the other software packages. For fine-grained isotopic distributions, we compared IC, MIDAs's polynomial algorithm, and MIDAs's Fourier transform algorithm. Among the three, IC and MIDAs's polynomial algorithm compute isotopic distributions that better resemble their corresponding exact fine-grained (high-resolution) isotopic distributions. MIDAs can be accessed freely through a user-friendly web-interface at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/midas/index.html.
Molecular Isotopic Distribution Analysis (MIDAs) with adjustable mass accuracy.
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2014-01-01
In this paper, we present Molecular Isotopic Distribution Analysis (MIDAs), a new software tool designed to compute molecular isotopic distributions with adjustable accuracies. MIDAs offers two algorithms, one polynomial-based and one Fourier-transform-based, both of which compute molecular isotopic distributions accurately and efficiently. The polynomial-based algorithm contains few novel aspects, whereas the Fourier-transform-based algorithm consists mainly of improvements to other existing Fourier-transform-based algorithms. We have benchmarked the performance of the two algorithms implemented in MIDAs with that of eight software packages (BRAIN, Emass, Mercury, Mercury5, NeutronCluster, Qmass, JFC, IC) using a consensus set of benchmark molecules. Under the proposed evaluation criteria, MIDAs's algorithms, JFC, and Emass compute with comparable accuracy the coarse-grained (low-resolution) isotopic distributions and are more accurate than the other software packages. For fine-grained isotopic distributions, we compared IC, MIDAs's polynomial algorithm, and MIDAs's Fourier transform algorithm. Among the three, IC and MIDAs's polynomial algorithm compute isotopic distributions that better resemble their corresponding exact fine-grained (high-resolution) isotopic distributions. MIDAs can be accessed freely through a user-friendly web-interface at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/midas/index.html.
Mapping Wintering Waterfowl Distributions Using Weather Surveillance Radar
Buler, Jeffrey J.; Randall, Lori A.; Fleskes, Joseph P.; Barrow, Wylie C.; Bogart, Tianna; Kluver, Daria
2012-01-01
The current network of weather surveillance radars within the United States readily detects flying birds and has proven to be a useful remote-sensing tool for ornithological study. Radar reflectivity measures serve as an index to bird density and have been used to quantitatively map landbird distributions during migratory stopover by sampling birds aloft at the onset of nocturnal migratory flights. Our objective was to further develop and validate a similar approach for mapping wintering waterfowl distributions using weather surveillance radar observations at the onset of evening flights. We evaluated data from the Sacramento, CA radar (KDAX) during winters 1998–1999 and 1999–2000. We determined an optimal sampling time by evaluating the accuracy and precision of radar observations at different times during the onset of evening flight relative to observed diurnal distributions of radio-marked birds on the ground. The mean time of evening flight initiation occurred 23 min after sunset with the strongest correlations between reflectivity and waterfowl density on the ground occurring almost immediately after flight initiation. Radar measures became more spatially homogeneous as evening flight progressed because birds dispersed from their departure locations. Radars effectively detected birds to a mean maximum range of 83 km during the first 20 min of evening flight. Using a sun elevation angle of −5° (28 min after sunset) as our optimal sampling time, we validated our approach using KDAX data and additional data from the Beale Air Force Base, CA (KBBX) radar during winter 1998–1999. Bias-adjusted radar reflectivity of waterfowl aloft was positively related to the observed diurnal density of radio-marked waterfowl locations on the ground. Thus, weather radars provide accurate measures of relative wintering waterfowl density that can be used to comprehensively map their distributions over large spatial extents. PMID:22911816
Mapping wintering waterfowl distributions using weather surveillance radar.
Buler, Jeffrey J; Randall, Lori A; Fleskes, Joseph P; Barrow, Wylie C; Bogart, Tianna; Kluver, Daria
2012-01-01
The current network of weather surveillance radars within the United States readily detects flying birds and has proven to be a useful remote-sensing tool for ornithological study. Radar reflectivity measures serve as an index to bird density and have been used to quantitatively map landbird distributions during migratory stopover by sampling birds aloft at the onset of nocturnal migratory flights. Our objective was to further develop and validate a similar approach for mapping wintering waterfowl distributions using weather surveillance radar observations at the onset of evening flights. We evaluated data from the Sacramento, CA radar (KDAX) during winters 1998-1999 and 1999-2000. We determined an optimal sampling time by evaluating the accuracy and precision of radar observations at different times during the onset of evening flight relative to observed diurnal distributions of radio-marked birds on the ground. The mean time of evening flight initiation occurred 23 min after sunset with the strongest correlations between reflectivity and waterfowl density on the ground occurring almost immediately after flight initiation. Radar measures became more spatially homogeneous as evening flight progressed because birds dispersed from their departure locations. Radars effectively detected birds to a mean maximum range of 83 km during the first 20 min of evening flight. Using a sun elevation angle of -5° (28 min after sunset) as our optimal sampling time, we validated our approach using KDAX data and additional data from the Beale Air Force Base, CA (KBBX) radar during winter 1998-1999. Bias-adjusted radar reflectivity of waterfowl aloft was positively related to the observed diurnal density of radio-marked waterfowl locations on the ground. Thus, weather radars provide accurate measures of relative wintering waterfowl density that can be used to comprehensively map their distributions over large spatial extents.
Domain-wall excitations in the two-dimensional Ising spin glass
NASA Astrophysics Data System (ADS)
Khoshbakht, Hamid; Weigel, Martin
2018-02-01
The Ising spin glass in two dimensions exhibits rich behavior with subtle differences in the scaling for different coupling distributions. We use recently developed mappings to graph-theoretic problems together with highly efficient implementations of combinatorial optimization algorithms to determine exact ground states for systems on square lattices with up to 10 000 ×10 000 spins. While these mappings only work for planar graphs, for example for systems with periodic boundary conditions in at most one direction, we suggest here an iterative windowing technique that allows one to determine ground states for fully periodic samples up to sizes similar to those for the open-periodic case. Based on these techniques, a large number of disorder samples are used together with a careful finite-size scaling analysis to determine the stiffness exponents and domain-wall fractal dimensions with unprecedented accuracy, our best estimates being θ =-0.2793 (3 ) and df=1.273 19 (9 ) for Gaussian couplings. For bimodal disorder, a new uniform sampling algorithm allows us to study the domain-wall fractal dimension, finding df=1.279 (2 ) . Additionally, we also investigate the distributions of ground-state energies, of domain-wall energies, and domain-wall lengths.
NASA Astrophysics Data System (ADS)
Jia, Jia; Cheng, Shuiyuan; Yao, Sen; Xu, Tiebing; Zhang, Tingting; Ma, Yuetao; Wang, Hongliang; Duan, Wenjiao
2018-06-01
As one of the highest energy consumption and pollution industries, the iron and steel industry is regarded as a most important source of particulate matter emission. In this study, chemical components of size-segregated particulate matters (PM) emitted from different manufacturing units in iron and steel industry were sampled by a comprehensive sampling system. Results showed that the average particle mass concentration was highest in sintering process, followed by puddling, steelmaking and then rolling processes. PM samples were divided into eight size fractions for testing the chemical components, SO42- and NH4+ distributed more into fine particles while most of the Ca2+ was concentrated in coarse particles, the size distribution of mineral elements depended on the raw materials applied. Moreover, local database with PM chemical source profiles of iron and steel industry were built and applied in CMAQ modeling for simulating SO42- and NO3- concentration, results showed that the accuracy of model simulation improved with local chemical source profiles compared to the SPECIATE database. The results gained from this study are expected to be helpful to understand the components of PM in iron and steel industry and contribute to the source apportionment researches.
Seli, Paul; Cheyne, James Allan; Smilek, Daniel
2012-03-01
In two studies of a GO-NOGO task assessing sustained attention, we examined the effects of (1) altering speed-accuracy trade-offs through instructions (emphasizing both speed and accuracy or accuracy only) and (2) auditory alerts distributed throughout the task. Instructions emphasizing accuracy reduced errors and changed the distribution of GO trial RTs. Additionally, correlations between errors and increasing RTs produced a U-function; excessively fast and slow RTs accounted for much of the variance of errors. Contrary to previous reports, alerts increased errors and RT variability. The results suggest that (1) standard instructions for sustained attention tasks, emphasizing speed and accuracy equally, produce errors arising from attempts to conform to the misleading requirement for speed, which become conflated with attention-lapse produced errors and (2) auditory alerts have complex, and sometimes deleterious, effects on attention. We argue that instructions emphasizing accuracy provide a more precise assessment of attention lapses in sustained attention tasks. Copyright © 2011 Elsevier Inc. All rights reserved.
Empirical evaluation of data normalization methods for molecular classification
Huang, Huei-Chung
2018-01-01
Background Data artifacts due to variations in experimental handling are ubiquitous in microarray studies, and they can lead to biased and irreproducible findings. A popular approach to correct for such artifacts is through post hoc data adjustment such as data normalization. Statistical methods for data normalization have been developed and evaluated primarily for the discovery of individual molecular biomarkers. Their performance has rarely been studied for the development of multi-marker molecular classifiers—an increasingly important application of microarrays in the era of personalized medicine. Methods In this study, we set out to evaluate the performance of three commonly used methods for data normalization in the context of molecular classification, using extensive simulations based on re-sampling from a unique pair of microRNA microarray datasets for the same set of samples. The data and code for our simulations are freely available as R packages at GitHub. Results In the presence of confounding handling effects, all three normalization methods tended to improve the accuracy of the classifier when evaluated in an independent test data. The level of improvement and the relative performance among the normalization methods depended on the relative level of molecular signal, the distributional pattern of handling effects (e.g., location shift vs scale change), and the statistical method used for building the classifier. In addition, cross-validation was associated with biased estimation of classification accuracy in the over-optimistic direction for all three normalization methods. Conclusion Normalization may improve the accuracy of molecular classification for data with confounding handling effects; however, it cannot circumvent the over-optimistic findings associated with cross-validation for assessing classification accuracy. PMID:29666754
Szardenings, Carsten; Kuhn, Jörg-Tobias; Ranger, Jochen; Holling, Heinz
2017-01-01
The respective roles of the approximate number system (ANS) and an access deficit (AD) in developmental dyscalculia (DD) are not well-known. Most studies rely on response times (RTs) or accuracy (error rates) separately. We analyzed the results of two samples of elementary school children in symbolic magnitude comparison (MC) and non-symbolic MC using a diffusion model. This approach uses the joint distribution of both RTs and accuracy in order to synthesize measures closer to ability and response caution or response conservatism. The latter can be understood in the context of the speed-accuracy tradeoff: It expresses how much a subject trades in speed for improved accuracy. We found significant effects of DD on both ability (negative) and response caution (positive) in MC tasks and a negative interaction of DD with symbolic task material on ability. These results support that DD subjects suffer from both an impaired ANS and an AD and in particular support that slower RTs of children with DD are indeed related to impaired processing of numerical information. An interaction effect of symbolic task material and DD (low mathematical ability) on response caution could not be refuted. However, in a sample more representative of the general population we found a negative association of mathematical ability and response caution in symbolic but not in non-symbolic task material. The observed differences in response behavior highlight the importance of accounting for response caution in the analysis of MC tasks. The results as a whole present a good example of the benefits of a diffusion model analysis.
Szardenings, Carsten; Kuhn, Jörg-Tobias; Ranger, Jochen; Holling, Heinz
2018-01-01
The respective roles of the approximate number system (ANS) and an access deficit (AD) in developmental dyscalculia (DD) are not well-known. Most studies rely on response times (RTs) or accuracy (error rates) separately. We analyzed the results of two samples of elementary school children in symbolic magnitude comparison (MC) and non-symbolic MC using a diffusion model. This approach uses the joint distribution of both RTs and accuracy in order to synthesize measures closer to ability and response caution or response conservatism. The latter can be understood in the context of the speed-accuracy tradeoff: It expresses how much a subject trades in speed for improved accuracy. We found significant effects of DD on both ability (negative) and response caution (positive) in MC tasks and a negative interaction of DD with symbolic task material on ability. These results support that DD subjects suffer from both an impaired ANS and an AD and in particular support that slower RTs of children with DD are indeed related to impaired processing of numerical information. An interaction effect of symbolic task material and DD (low mathematical ability) on response caution could not be refuted. However, in a sample more representative of the general population we found a negative association of mathematical ability and response caution in symbolic but not in non-symbolic task material. The observed differences in response behavior highlight the importance of accounting for response caution in the analysis of MC tasks. The results as a whole present a good example of the benefits of a diffusion model analysis. PMID:29379450
Research on sparse feature matching of improved RANSAC algorithm
NASA Astrophysics Data System (ADS)
Kong, Xiangsi; Zhao, Xian
2018-04-01
In this paper, a sparse feature matching method based on modified RANSAC algorithm is proposed to improve the precision and speed. Firstly, the feature points of the images are extracted using the SIFT algorithm. Then, the image pair is matched roughly by generating SIFT feature descriptor. At last, the precision of image matching is optimized by the modified RANSAC algorithm,. The RANSAC algorithm is improved from three aspects: instead of the homography matrix, this paper uses the fundamental matrix generated by the 8 point algorithm as the model; the sample is selected by a random block selecting method, which ensures the uniform distribution and the accuracy; adds sequential probability ratio test(SPRT) on the basis of standard RANSAC, which cut down the overall running time of the algorithm. The experimental results show that this method can not only get higher matching accuracy, but also greatly reduce the computation and improve the matching speed.
Pine, P S; Boedigheimer, M; Rosenzweig, B A; Turpaz, Y; He, Y D; Delenstarr, G; Ganter, B; Jarnagin, K; Jones, W D; Reid, L H; Thompson, K L
2008-11-01
Effective use of microarray technology in clinical and regulatory settings is contingent on the adoption of standard methods for assessing performance. The MicroArray Quality Control project evaluated the repeatability and comparability of microarray data on the major commercial platforms and laid the groundwork for the application of microarray technology to regulatory assessments. However, methods for assessing performance that are commonly applied to diagnostic assays used in laboratory medicine remain to be developed for microarray assays. A reference system for microarray performance evaluation and process improvement was developed that includes reference samples, metrics and reference datasets. The reference material is composed of two mixes of four different rat tissue RNAs that allow defined target ratios to be assayed using a set of tissue-selective analytes that are distributed along the dynamic range of measurement. The diagnostic accuracy of detected changes in expression ratios, measured as the area under the curve from receiver operating characteristic plots, provides a single commutable value for comparing assay specificity and sensitivity. The utility of this system for assessing overall performance was evaluated for relevant applications like multi-laboratory proficiency testing programs and single-laboratory process drift monitoring. The diagnostic accuracy of detection of a 1.5-fold change in signal level was found to be a sensitive metric for comparing overall performance. This test approaches the technical limit for reliable discrimination of differences between two samples using this technology. We describe a reference system that provides a mechanism for internal and external assessment of laboratory proficiency with microarray technology and is translatable to performance assessments on other whole-genome expression arrays used for basic and clinical research.
Lakshmanan, Manu N.; Greenberg, Joel A.; Samei, Ehsan; Kapadia, Anuj J.
2017-01-01
Abstract. Although transmission-based x-ray imaging is the most commonly used imaging approach for breast cancer detection, it exhibits false negative rates higher than 15%. To improve cancer detection accuracy, x-ray coherent scatter computed tomography (CSCT) has been explored to potentially detect cancer with greater consistency. However, the 10-min scan duration of CSCT limits its possible clinical applications. The coded aperture coherent scatter spectral imaging (CACSSI) technique has been shown to reduce scan time through enabling single-angle imaging while providing high detection accuracy. Here, we use Monte Carlo simulations to test analytical optimization studies of the CACSSI technique, specifically for detecting cancer in ex vivo breast samples. An anthropomorphic breast tissue phantom was modeled, a CACSSI imaging system was virtually simulated to image the phantom, a diagnostic voxel classification algorithm was applied to all reconstructed voxels in the phantom, and receiver-operator characteristics analysis of the voxel classification was used to evaluate and characterize the imaging system for a range of parameters that have been optimized in a prior analytical study. The results indicate that CACSSI is able to identify the distribution of cancerous and healthy tissues (i.e., fibroglandular, adipose, or a mix of the two) in tissue samples with a cancerous voxel identification area-under-the-curve of 0.94 through a scan lasting less than 10 s per slice. These results show that coded aperture scatter imaging has the potential to provide scatter images that automatically differentiate cancerous and healthy tissue within ex vivo samples. Furthermore, the results indicate potential CACSSI imaging system configurations for implementation in subsequent imaging development studies. PMID:28331884
Enabling phenotypic big data with PheNorm.
Yu, Sheng; Ma, Yumeng; Gronsbell, Jessica; Cai, Tianrun; Ananthakrishnan, Ashwin N; Gainer, Vivian S; Churchill, Susanne E; Szolovits, Peter; Murphy, Shawn N; Kohane, Isaac S; Liao, Katherine P; Cai, Tianxi
2018-01-01
Electronic health record (EHR)-based phenotyping infers whether a patient has a disease based on the information in his or her EHR. A human-annotated training set with gold-standard disease status labels is usually required to build an algorithm for phenotyping based on a set of predictive features. The time intensiveness of annotation and feature curation severely limits the ability to achieve high-throughput phenotyping. While previous studies have successfully automated feature curation, annotation remains a major bottleneck. In this paper, we present PheNorm, a phenotyping algorithm that does not require expert-labeled samples for training. The most predictive features, such as the number of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes or mentions of the target phenotype, are normalized to resemble a normal mixture distribution with high area under the receiver operating curve (AUC) for prediction. The transformed features are then denoised and combined into a score for accurate disease classification. We validated the accuracy of PheNorm with 4 phenotypes: coronary artery disease, rheumatoid arthritis, Crohn's disease, and ulcerative colitis. The AUCs of the PheNorm score reached 0.90, 0.94, 0.95, and 0.94 for the 4 phenotypes, respectively, which were comparable to the accuracy of supervised algorithms trained with sample sizes of 100-300, with no statistically significant difference. The accuracy of the PheNorm algorithms is on par with algorithms trained with annotated samples. PheNorm fully automates the generation of accurate phenotyping algorithms and demonstrates the capacity for EHR-driven annotations to scale to the next level - phenotypic big data. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Putter, Roland; Doré, Olivier; Das, Sudeep
2014-01-10
Cross correlations between the galaxy number density in a lensing source sample and that in an overlapping spectroscopic sample can in principle be used to calibrate the lensing source redshift distribution. In this paper, we study in detail to what extent this cross-correlation method can mitigate the loss of cosmological information in upcoming weak lensing surveys (combined with a cosmic microwave background prior) due to lack of knowledge of the source distribution. We consider a scenario where photometric redshifts are available and find that, unless the photometric redshift distribution p(z {sub ph}|z) is calibrated very accurately a priori (bias andmore » scatter known to ∼0.002 for, e.g., EUCLID), the additional constraint on p(z {sub ph}|z) from the cross-correlation technique to a large extent restores the cosmological information originally lost due to the uncertainty in dn/dz(z). Considering only the gain in photo-z accuracy and not the additional cosmological information, enhancements of the dark energy figure of merit of up to a factor of four (40) can be achieved for a SuMIRe-like (EUCLID-like) combination of lensing and redshift surveys, where SuMIRe stands for Subaru Measurement of Images and Redshifts). However, the success of the method is strongly sensitive to our knowledge of the galaxy bias evolution in the source sample and we find that a percent level bias prior is needed to optimize the gains from the cross-correlation method (i.e., to approach the cosmology constraints attainable if the bias was known exactly).« less
Accuracy of remotely sensed data: Sampling and analysis procedures
NASA Technical Reports Server (NTRS)
Congalton, R. G.; Oderwald, R. G.; Mead, R. A.
1982-01-01
A review and update of the discrete multivariate analysis techniques used for accuracy assessment is given. A listing of the computer program written to implement these techniques is given. New work on evaluating accuracy assessment using Monte Carlo simulation with different sampling schemes is given. The results of matrices from the mapping effort of the San Juan National Forest is given. A method for estimating the sample size requirements for implementing the accuracy assessment procedures is given. A proposed method for determining the reliability of change detection between two maps of the same area produced at different times is given.
Lightdrum—Portable Light Stage for Accurate BTF Measurement on Site
Havran, Vlastimil; Hošek, Jan; Němcová, Šárka; Čáp, Jiří; Bittner, Jiří
2017-01-01
We propose a miniaturised light stage for measuring the bidirectional reflectance distribution function (BRDF) and the bidirectional texture function (BTF) of surfaces on site in real world application scenarios. The main principle of our lightweight BTF acquisition gantry is a compact hemispherical skeleton with cameras along the meridian and with light emitting diode (LED) modules shining light onto a sample surface. The proposed device is portable and achieves a high speed of measurement while maintaining high degree of accuracy. While the positions of the LEDs are fixed on the hemisphere, the cameras allow us to cover the range of the zenith angle from 0∘ to 75∘ and by rotating the cameras along the axis of the hemisphere we can cover all possible camera directions. This allows us to take measurements with almost the same quality as existing stationary BTF gantries. Two degrees of freedom can be set arbitrarily for measurements and the other two degrees of freedom are fixed, which provides a tradeoff between accuracy of measurements and practical applicability. Assuming that a measured sample is locally flat and spatially accessible, we can set the correct perpendicular direction against the measured sample by means of an auto-collimator prior to measuring. Further, we have designed and used a marker sticker method to allow for the easy rectification and alignment of acquired images during data processing. We show the results of our approach by images rendered for 36 measured material samples. PMID:28241466
Accurate and fast multiple-testing correction in eQTL studies.
Sul, Jae Hoon; Raj, Towfique; de Jong, Simone; de Bakker, Paul I W; Raychaudhuri, Soumya; Ophoff, Roel A; Stranger, Barbara E; Eskin, Eleazar; Han, Buhm
2015-06-04
In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. For performing multiple-testing correction, a permutation test is widely used. Because of growing sample sizes of eQTL studies, however, the permutation test has become a computational bottleneck in eQTL studies. In this paper, we propose an efficient approach for correcting for multiple testing and assess eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We have shown that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Varadarajan, Divya; Haldar, Justin P
2017-11-01
The data measured in diffusion MRI can be modeled as the Fourier transform of the Ensemble Average Propagator (EAP), a probability distribution that summarizes the molecular diffusion behavior of the spins within each voxel. This Fourier relationship is potentially advantageous because of the extensive theory that has been developed to characterize the sampling requirements, accuracy, and stability of linear Fourier reconstruction methods. However, existing diffusion MRI data sampling and signal estimation methods have largely been developed and tuned without the benefit of such theory, instead relying on approximations, intuition, and extensive empirical evaluation. This paper aims to address this discrepancy by introducing a novel theoretical signal processing framework for diffusion MRI. The new framework can be used to characterize arbitrary linear diffusion estimation methods with arbitrary q-space sampling, and can be used to theoretically evaluate and compare the accuracy, resolution, and noise-resilience of different data acquisition and parameter estimation techniques. The framework is based on the EAP, and makes very limited modeling assumptions. As a result, the approach can even provide new insight into the behavior of model-based linear diffusion estimation methods in contexts where the modeling assumptions are inaccurate. The practical usefulness of the proposed framework is illustrated using both simulated and real diffusion MRI data in applications such as choosing between different parameter estimation methods and choosing between different q-space sampling schemes. Copyright © 2017 Elsevier Inc. All rights reserved.
Sampling design for spatially distributed hydrogeologic and environmental processes
Christakos, G.; Olea, R.A.
1992-01-01
A methodology for the design of sampling networks over space is proposed. The methodology is based on spatial random field representations of nonhomogeneous natural processes, and on optimal spatial estimation techniques. One of the most important results of random field theory for physical sciences is its rationalization of correlations in spatial variability of natural processes. This correlation is extremely important both for interpreting spatially distributed observations and for predictive performance. The extent of site sampling and the types of data to be collected will depend on the relationship of subsurface variability to predictive uncertainty. While hypothesis formulation and initial identification of spatial variability characteristics are based on scientific understanding (such as knowledge of the physics of the underlying phenomena, geological interpretations, intuition and experience), the support offered by field data is statistically modelled. This model is not limited by the geometric nature of sampling and covers a wide range in subsurface uncertainties. A factorization scheme of the sampling error variance is derived, which possesses certain atttactive properties allowing significant savings in computations. By means of this scheme, a practical sampling design procedure providing suitable indices of the sampling error variance is established. These indices can be used by way of multiobjective decision criteria to obtain the best sampling strategy. Neither the actual implementation of the in-situ sampling nor the solution of the large spatial estimation systems of equations are necessary. The required values of the accuracy parameters involved in the network design are derived using reference charts (readily available for various combinations of data configurations and spatial variability parameters) and certain simple yet accurate analytical formulas. Insight is gained by applying the proposed sampling procedure to realistic examples related to sampling problems in two dimensions. ?? 1992.
NASA Astrophysics Data System (ADS)
Kostencka, Julianna; Kozacki, Tomasz; Hennelly, Bryan; Sheridan, John T.
2017-06-01
Holographic tomography (HT) allows noninvasive, quantitative, 3D imaging of transparent microobjects, such as living biological cells and fiber optics elements. The technique is based on acquisition of multiple scattered fields for various sample perspectives using digital holographic microscopy. Then, the captured data is processed with one of the tomographic reconstruction algorithms, which enables 3D reconstruction of refractive index distribution. In our recent works we addressed the issue of spatially variant accuracy of the HT reconstructions, which results from the insufficient model of diffraction that is applied in the widely-used tomographic reconstruction algorithms basing on the Rytov approximation. In the present study, we continue investigating the spatially variant properties of the HT imaging, however, we are now focusing on the limited spatial size of holograms as a source of this problem. Using the Wigner distribution representation and the Ewald sphere approach, we show that the limited size of the holograms results in a decreased quality of tomographic imaging in off-center regions of the HT reconstructions. This is because the finite detector extent becomes a limiting aperture that prohibits acquisition of full information about diffracted fields coming from the out-of-focus structures of a sample. The incompleteness of the data results in an effective truncation of the tomographic transfer function for the out-of-center regions of the tomographic image. In this paper, the described effect is quantitatively characterized for three types of the tomographic systems: the configuration with 1) object rotation, 2) scanning of the illumination direction, 3) the hybrid HT solution combing both previous approaches.
The Use of a Predictive Habitat Model and a Fuzzy Logic Approach for Marine Management and Planning
Hattab, Tarek; Ben Rais Lasram, Frida; Albouy, Camille; Sammari, Chérif; Romdhane, Mohamed Salah; Cury, Philippe; Leprieur, Fabien; Le Loc’h, François
2013-01-01
Bottom trawl survey data are commonly used as a sampling technique to assess the spatial distribution of commercial species. However, this sampling technique does not always correctly detect a species even when it is present, and this can create significant limitations when fitting species distribution models. In this study, we aim to test the relevance of a mixed methodological approach that combines presence-only and presence-absence distribution models. We illustrate this approach using bottom trawl survey data to model the spatial distributions of 27 commercially targeted marine species. We use an environmentally- and geographically-weighted method to simulate pseudo-absence data. The species distributions are modelled using regression kriging, a technique that explicitly incorporates spatial dependence into predictions. Model outputs are then used to identify areas that met the conservation targets for the deployment of artificial anti-trawling reefs. To achieve this, we propose the use of a fuzzy logic framework that accounts for the uncertainty associated with different model predictions. For each species, the predictive accuracy of the model is classified as ‘high’. A better result is observed when a large number of occurrences are used to develop the model. The map resulting from the fuzzy overlay shows that three main areas have a high level of agreement with the conservation criteria. These results align with expert opinion, confirming the relevance of the proposed methodology in this study. PMID:24146867
Calibrating photometric redshifts of luminous red galaxies
Padmanabhan, Nikhil; Budavari, Tamas; Schlegel, David J.; ...
2005-05-01
We discuss the construction of a photometric redshift catalogue of luminous red galaxies (LRGs) from the Sloan Digital Sky Survey (SDSS), emphasizing the principal steps necessary for constructing such a catalogue: (i) photometrically selecting the sample, (ii) measuring photometric redshifts and their error distributions, and (iii) estimating the true redshift distribution. We compare two photometric redshift algorithms for these data and find that they give comparable results. Calibrating against the SDSS and SDSS–2dF (Two Degree Field) spectroscopic surveys, we find that the photometric redshift accuracy is σ~ 0.03 for redshifts less than 0.55 and worsens at higher redshift (~ 0.06more » for z < 0.7). These errors are caused by photometric scatter, as well as systematic errors in the templates, filter curves and photometric zero-points. We also parametrize the photometric redshift error distribution with a sum of Gaussians and use this model to deconvolve the errors from the measured photometric redshift distribution to estimate the true redshift distribution. We pay special attention to the stability of this deconvolution, regularizing the method with a prior on the smoothness of the true redshift distribution. The methods that we develop are applicable to general photometric redshift surveys.« less
Sewage Reflects the Microbiomes of Human Populations
Newton, Ryan J.; McLellan, Sandra L.; Dila, Deborah K.; Vineis, Joseph H.; Morrison, Hilary G.; Eren, A. Murat
2015-01-01
ABSTRACT Molecular characterizations of the gut microbiome from individual human stool samples have identified community patterns that correlate with age, disease, diet, and other human characteristics, but resources for marker gene studies that consider microbiome trends among human populations scale with the number of individuals sampled from each population. As an alternative strategy for sampling populations, we examined whether sewage accurately reflects the microbial community of a mixture of stool samples. We used oligotyping of high-throughput 16S rRNA gene sequence data to compare the bacterial distribution in a stool data set to a sewage influent data set from 71 U.S. cities. On average, only 15% of sewage sample sequence reads were attributed to human fecal origin, but sewage recaptured most (97%) human fecal oligotypes. The most common oligotypes in stool matched the most common and abundant in sewage. After informatically separating sequences of human fecal origin, sewage samples exhibited ~3× greater diversity than stool samples. Comparisons among municipal sewage communities revealed the ubiquitous and abundant occurrence of 27 human fecal oligotypes, representing an apparent core set of organisms in U.S. populations. The fecal community variability among U.S. populations was significantly lower than among individuals. It clustered into three primary community structures distinguished by oligotypes from either: Bacteroidaceae, Prevotellaceae, or Lachnospiraceae/Ruminococcaceae. These distribution patterns reflected human population variation and predicted whether samples represented lean or obese populations with 81 to 89% accuracy. Our findings demonstrate that sewage represents the fecal microbial community of human populations and captures population-level traits of the human microbiome. PMID:25714718
"Tools For Analysis and Visualization of Large Time- Varying CFD Data Sets"
NASA Technical Reports Server (NTRS)
Wilhelms, Jane; vanGelder, Allen
1999-01-01
During the four years of this grant (including the one year extension), we have explored many aspects of the visualization of large CFD (Computational Fluid Dynamics) datasets. These have included new direct volume rendering approaches, hierarchical methods, volume decimation, error metrics, parallelization, hardware texture mapping, and methods for analyzing and comparing images. First, we implemented an extremely general direct volume rendering approach that can be used to render rectilinear, curvilinear, or tetrahedral grids, including overlapping multiple zone grids, and time-varying grids. Next, we developed techniques for associating the sample data with a k-d tree, a simple hierarchial data model to approximate samples in the regions covered by each node of the tree, and an error metric for the accuracy of the model. We also explored a new method for determining the accuracy of approximate models based on the light field method described at ACM SIGGRAPH (Association for Computing Machinery Special Interest Group on Computer Graphics) '96. In our initial implementation, we automatically image the volume from 32 approximately evenly distributed positions on the surface of an enclosing tessellated sphere. We then calculate differences between these images under different conditions of volume approximation or decimation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Edwards, R.L.; Chen, J.H.; Ku, T.L.
1987-06-19
The development of mass spectrometric techniques for determination of STTh abundance has made it possible to reduce analytical errors in STYU-STUU-STTh dating of corals even with very small samples. Samples of 6 x 10Y atoms of STTh can be measured to an accuracy of +/- 3% (2sigma) and 3 x 10 atoms of STTh can be measured to an accuracy of +/- 0.2%. The time range over which useful age data on corals can be obtained now ranges from about 50 to about 500,000 years. For young corals, this approach may be preferable to UC dating. The precision should makemore » it possible to critically test the Milankovitch hypothesis concerning Pleistocene climate fluctuations. Analyses of a number of corals that grew during the last interglacial period yield ages of 122,000 to 130,000 years. The ages coincide with, or slightly postdate, the summer solar insolation high at 65N latitude which occurred 128,000 years ago. This supports the idea that changes in Pleistocene climate can be the result of variations in the distribution of solar insolation caused by changes in the geometry of the earth's orbit and rotation axis.« less
Vital sign sensing method based on EMD in terahertz band
NASA Astrophysics Data System (ADS)
Xu, Zhengwu; Liu, Tong
2014-12-01
Non-contact respiration and heartbeat rates detection could be applied to find survivors trapped in the disaster or the remote monitoring of the respiration and heartbeat of a patient. This study presents an improved algorithm that extracts the respiration and heartbeat rates of humans by utilizing the terahertz radar, which further lessens the effects of noise, suppresses the cross-term, and enhances the detection accuracy. A human target echo model for the terahertz radar is first presented. Combining the over-sampling method, low-pass filter, and Empirical Mode Decomposition improves the signal-to-noise ratio. The smoothed pseudo Wigner-Ville distribution time-frequency technique and the centroid of the spectrogram are used to estimate the instantaneous velocity of the target's cardiopulmonary motion. The down-sampling method is adopted to prevent serious distortion. Finally, a second time-frequency analysis is applied to the centroid curve to extract the respiration and heartbeat rates of the individual. Simulation results show that compared with the previously presented vital sign sensing method, the improved algorithm enhances the signal-to-noise ratio to 1 dB with a detection accuracy of 80%. The improved algorithm is an effective approach for the detection of respiration and heartbeat signal in a complicated environment.
Bilbao, Aivett; Gibbons, Bryson C.; Slysz, Gordon W.; ...
2017-11-06
We present that the mass accuracy and peak intensity of ions detected by mass spectrometry (MS) measurements are essential to facilitate compound identification and quantitation. However, high concentration species can yield erroneous results if their ion intensities reach beyond the limits of the detection system, leading to distorted and non-ideal detector response (e.g. saturation), and largely precluding the calculation of accurate m/z and intensity values. Here we present an open source computational method to correct peaks above a defined intensity (saturated) threshold determined by the MS instrumentation such as the analog-to-digital converters or time-to-digital converters used in conjunction with time-of-flightmore » MS. Here, in this method, the isotopic envelope for each observed ion above the saturation threshold is compared to its expected theoretical isotopic distribution. The most intense isotopic peak for which saturation does not occur is then utilized to re-calculate the precursor m/z and correct the intensity, resulting in both higher mass accuracy and greater dynamic range. The benefits of this approach were evaluated with proteomic and lipidomic datasets of varying complexities. After correcting the high concentration species, reduced mass errors and enhanced dynamic range were observed for both simple and complex omic samples. Specifically, the mass error dropped by more than 50% in most cases for highly saturated species and dynamic range increased by 1–2 orders of magnitude for peptides in a blood serum sample.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bilbao, Aivett; Gibbons, Bryson C.; Slysz, Gordon W.
The mass accuracy and peak intensity of ions detected by mass spectrometry (MS) measurements are essential to facilitate compound identification and quantitation. However, high concentration species can easily cause problems if their ion intensities reach beyond the limits of the detection system, leading to distorted and non-ideal detector response (e.g. saturation), and largely precluding the calculation of accurate m/z and intensity values. Here we present an open source computational method to correct peaks above a defined intensity (saturated) threshold determined by the MS instrumentation such as the analog-to-digital converters or time-to-digital converters used in conjunction with time-of-flight MS. In thismore » method, the isotopic envelope for each observed ion above the saturation threshold is compared to its expected theoretical isotopic distribution. The most intense isotopic peak for which saturation does not occur is then utilized to re-calculate the precursor m/z and correct the intensity, resulting in both higher mass accuracy and greater dynamic range. The benefits of this approach were evaluated with proteomic and lipidomic datasets of varying complexities. After correcting the high concentration species, reduced mass errors and enhanced dynamic range were observed for both simple and complex omic samples. Specifically, the mass error dropped by more than 50% in most cases with highly saturated species and dynamic range increased by 1-2 orders of magnitude for peptides in a blood serum sample.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bilbao, Aivett; Gibbons, Bryson C.; Slysz, Gordon W.
We present that the mass accuracy and peak intensity of ions detected by mass spectrometry (MS) measurements are essential to facilitate compound identification and quantitation. However, high concentration species can yield erroneous results if their ion intensities reach beyond the limits of the detection system, leading to distorted and non-ideal detector response (e.g. saturation), and largely precluding the calculation of accurate m/z and intensity values. Here we present an open source computational method to correct peaks above a defined intensity (saturated) threshold determined by the MS instrumentation such as the analog-to-digital converters or time-to-digital converters used in conjunction with time-of-flightmore » MS. Here, in this method, the isotopic envelope for each observed ion above the saturation threshold is compared to its expected theoretical isotopic distribution. The most intense isotopic peak for which saturation does not occur is then utilized to re-calculate the precursor m/z and correct the intensity, resulting in both higher mass accuracy and greater dynamic range. The benefits of this approach were evaluated with proteomic and lipidomic datasets of varying complexities. After correcting the high concentration species, reduced mass errors and enhanced dynamic range were observed for both simple and complex omic samples. Specifically, the mass error dropped by more than 50% in most cases for highly saturated species and dynamic range increased by 1–2 orders of magnitude for peptides in a blood serum sample.« less
NASA Astrophysics Data System (ADS)
Iwaki, Y.
2010-07-01
The Quality Assurance (QA) of measurand has been discussed over many years by Quality Engineering (QE). It is need to more discuss about ISO standard. It is mining to find out root fault element for improvement of measured accuracy, and it remove. The accuracy assurance needs to investigate the Reference Material (RM) for calibration and an improvement accuracy of data processing. This research follows the accuracy improvement in field of data processing by how to improve of accuracy. As for the fault element relevant to measurement accuracy, in many cases, two or more element is buried exist. The QE is to assume the generating frequency of fault state, and it is solving from higher ranks for fault factor first by "Failure Mode and Effects Analysis (FMEA)". Then QE investigate the root cause over the fault element by "Root Cause Analysis (RCA)" and "Fault Tree Analysis (FTA)" and calculate order to the generating element of assume specific fault. These days comes, the accuracy assurance of measurement result became duty in the Professional Test (PT). ISO standard was legislated by ISO-GUM (Guide of express Uncertainty in Measurement) as guidance of an accuracy assurance in 1993 [1] for QA. Analysis method of ISO-GUM is changed into Exploratory Data Analysis (EDA) from Analysis of Valiance (ANOVA). EDA calculate one by one until an assurance performance is obtained according to "Law of the propagation of uncertainty". If the truth value was unknown, ISO-GUM is changed into reference value. A reference value set up by the EDA and it does check with a Key Comparison (KC) method. KC is comparing between null hypothesis and frequency hypothesis. It performs operation of assurance by ISO-GUM in order of standard uncertainty, the combined uncertainty of many fault elements and an expansion uncertain for assurance. An assurance value is authorized by multiplying the final expansion uncertainty [2] by K of coverage factor. K-value is calculated from the Effective Free Degree (EFD) which thought the number of samples is important. Free degree is based on maximum likelihood method of an improved information criterion (AIC) for a Quality Control (QC). The assurance performance of ISO-GUM is come out by set up of the confidence interval [3] and is decided. The result of research of "Decided level/Minimum Detectable Concentration (DL/MDC)" was able to profit by the operation. QE has developed for the QC of industry. However, these have been processed by regression analysis by making frequency probability of a statistic value into normalized distribution. The occurrence probability of the statistics value of a fault element which is accompanied element by a natural phenomenon becomes an abnormal distribution in many cases. The abnormal distribution needs to obtain an assurance value by other method than statistical work of type B in ISO-GUM. It is tried fusion the improvement of worker by QE became important for reservation of the reliability of measurement accuracy and safety. This research was to make the result of Blood Chemical Analysis (BCA) in the field of clinical test.
Finite element model updating using the shadow hybrid Monte Carlo technique
NASA Astrophysics Data System (ADS)
Boulkaibet, I.; Mthembu, L.; Marwala, T.; Friswell, M. I.; Adhikari, S.
2015-02-01
Recent research in the field of finite element model updating (FEM) advocates the adoption of Bayesian analysis techniques to dealing with the uncertainties associated with these models. However, Bayesian formulations require the evaluation of the Posterior Distribution Function which may not be available in analytical form. This is the case in FEM updating. In such cases sampling methods can provide good approximations of the Posterior distribution when implemented in the Bayesian context. Markov Chain Monte Carlo (MCMC) algorithms are the most popular sampling tools used to sample probability distributions. However, the efficiency of these algorithms is affected by the complexity of the systems (the size of the parameter space). The Hybrid Monte Carlo (HMC) offers a very important MCMC approach to dealing with higher-dimensional complex problems. The HMC uses the molecular dynamics (MD) steps as the global Monte Carlo (MC) moves to reach areas of high probability where the gradient of the log-density of the Posterior acts as a guide during the search process. However, the acceptance rate of HMC is sensitive to the system size as well as the time step used to evaluate the MD trajectory. To overcome this limitation we propose the use of the Shadow Hybrid Monte Carlo (SHMC) algorithm. The SHMC algorithm is a modified version of the Hybrid Monte Carlo (HMC) and designed to improve sampling for large-system sizes and time steps. This is done by sampling from a modified Hamiltonian function instead of the normal Hamiltonian function. In this paper, the efficiency and accuracy of the SHMC method is tested on the updating of two real structures; an unsymmetrical H-shaped beam structure and a GARTEUR SM-AG19 structure and is compared to the application of the HMC algorithm on the same structures.
NASA Astrophysics Data System (ADS)
Sitko, Rafał
2008-11-01
Knowledge of X-ray tube spectral distribution is necessary in theoretical methods of matrix correction, i.e. in both fundamental parameter (FP) methods and theoretical influence coefficient algorithms. Thus, the influence of X-ray tube distribution on the accuracy of the analysis of thin films and bulk samples is presented. The calculations are performed using experimental X-ray tube spectra taken from the literature and theoretical X-ray tube spectra evaluated by three different algorithms proposed by Pella et al. (X-Ray Spectrom. 14 (1985) 125-135), Ebel (X-Ray Spectrom. 28 (1999) 255-266), and Finkelshtein and Pavlova (X-Ray Spectrom. 28 (1999) 27-32). In this study, Fe-Cr-Ni system is selected as an example and the calculations are performed for X-ray tubes commonly applied in X-ray fluorescence analysis (XRF), i.e., Cr, Mo, Rh and W. The influence of X-ray tube spectra on FP analysis is evaluated when quantification is performed using various types of calibration samples. FP analysis of bulk samples is performed using pure-element bulk standards and multielement bulk standards similar to the analyzed material, whereas for FP analysis of thin films, the bulk and thin pure-element standards are used. For the evaluation of the influence of X-ray tube spectra on XRF analysis performed by theoretical influence coefficient methods, two algorithms for bulk samples are selected, i.e. Claisse-Quintin (Can. Spectrosc. 12 (1967) 129-134) and COLA algorithms (G.R. Lachance, Paper Presented at the International Conference on Industrial Inorganic Elemental Analysis, Metz, France, June 3, 1981) and two algorithms (constant and linear coefficients) for thin films recently proposed by Sitko (X-Ray Spectrom. 37 (2008) 265-272).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holzgrewe, F.; Hegedues, F.; Paratte, J.M.
1995-03-01
The light water reactor BOXER code was used to determine the fast azimuthal neutron fluence distribution at the inner surface of the reactor pressure vessel after the tenth cycle of a pressurized water reactor (PWR). Using a cross-section library in 45 groups, fixed-source calculations in transport theory and x-y geometry were carried out to determine the fast azimuthal neutron flux distribution at the inner surface of the pressure vessel for four different cycles. From these results, the fast azimuthal neutron fluence after the tenth cycle was estimated and compared with the results obtained from scraping test experiments. In these experiments,more » small samples of material were taken from the inner surface of the pressure vessel. The fast neutron fluence was then determined form the measured activity of the samples. Comparing the BOXER and scraping test results have maximal differences of 15%, which is very good, considering the factor of 10{sup 3} neutron attenuation between the reactor core and the pressure vessel. To compare the BOXER results with an independent code, the 21st cycle of the PWR was also calculated with the TWODANT two-dimensional transport code, using the same group structure and cross-section library. Deviations in the fast azimuthal flux distribution were found to be <3%, which verifies the accuracy of the BOXER results.« less
Experimental study of low-cost fiber optic distributed temperature sensor system performance
NASA Astrophysics Data System (ADS)
Dashkov, Michael V.; Zharkov, Alexander D.
2016-03-01
The distributed control of temperature is an actual task for various application such as oil & gas fields, high-voltage power lines, fire alarm systems etc. The most perspective are optical fiber distributed temperature sensors (DTS). They have advantages on accuracy, resolution and range, but have a high cost. Nevertheless, for some application the accuracy of measurement and localization aren't so important as cost. The results of an experimental study of low-cost Raman based DTS based on standard OTDR are represented.
THE DETECTION AND STATISTICS OF GIANT ARCS BEHIND CLASH CLUSTERS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Bingxiao; Zheng, Wei; Postman, Marc
We developed an algorithm to find and characterize gravitationally lensed galaxies (arcs) to perform a comparison of the observed and simulated arc abundance. Observations are from the Cluster Lensing And Supernova survey with Hubble (CLASH). Simulated CLASH images are created using the MOKA package and also clusters selected from the high-resolution, hydrodynamical simulations, MUSIC, over the same mass and redshift range as the CLASH sample. The algorithm's arc elongation accuracy, completeness, and false positive rate are determined and used to compute an estimate of the true arc abundance. We derive a lensing efficiency of 4 ± 1 arcs (with length ≥6″ andmore » length-to-width ratio ≥7) per cluster for the X-ray-selected CLASH sample, 4 ± 1 arcs per cluster for the MOKA-simulated sample, and 3 ± 1 arcs per cluster for the MUSIC-simulated sample. The observed and simulated arc statistics are in full agreement. We measure the photometric redshifts of all detected arcs and find a median redshift z{sub s} = 1.9 with 33% of the detected arcs having z{sub s} > 3. We find that the arc abundance does not depend strongly on the source redshift distribution but is sensitive to the mass distribution of the dark matter halos (e.g., the c–M relation). Our results show that consistency between the observed and simulated distributions of lensed arc sizes and axial ratios can be achieved by using cluster-lensing simulations that are carefully matched to the selection criteria used in the observations.« less
NASA Astrophysics Data System (ADS)
Ravisankar, R.; Manikandan, E.; Dheenathayalu, M.; Rao, Brahmaji; Seshadreesan, N. P.; Nair, K. G. M.
2006-10-01
Beach rocks are a peculiar type of formation when compared to other types of rocks. Rare earth element (REE) concentrations in beach rock samples collected from the South East Coast of Tamilnadu, India, have been measured using the instrumental neutron activation analysis (INAA) single comparator K0 method. The irradiations were carried out using a thermal neutron flux of ˜10 11 n cm -2 s -1 at 20 kW power using the Kalpakkam mini reactor (KAMINI), IGCAR, Kalpakkam, Tamilnadu. Accuracy and precision were evaluated by assaying irradiated standard reference material (SRM 1646a estuarine sediment). The results being found to be in good agreement with certified values. REE elements have been determined from 15 samples using high-resolution gamma spectrometry. The geochemical behavior of REE in beach rock, in particular REE (chondrite-normalized) pattern has been studied.
Satellites for the study of ocean primary productivity
NASA Technical Reports Server (NTRS)
Smith, R. C.; Baker, K. S.
1983-01-01
The use of remote sensing techniques for obtaining estimates of global marine primary productivity is examined. It is shown that remote sensing and multiplatform (ship, aircraft, and satellite) sampling strategies can be used to significantly lower the variance in estimates of phytoplankton abundance and of population growth rates from the values obtained using the C-14 method. It is noted that multiplatform sampling strategies are essential to assess the mean and variance of phytoplankton biomass on a regional or on a global basis. The relative errors associated with shipboard and satellite estimates of phytoplankton biomass and primary productivity, as well as the increased statistical accuracy possible from the utilization of contemporaneous data from both sampling platforms, are examined. It is shown to be possible to follow changes in biomass and the distribution patterns of biomass as a function of time with the use of satellite imagery.
The accuracy of selected land use and land cover maps at scales of 1:250,000 and 1:100,000
Fitzpatrick-Lins, Katherine
1980-01-01
Land use and land cover maps produced by the U.S. Geological Survey are found to meet or exceed the established standard of accuracy. When analyzed using a point sampling technique and binomial probability theory, several maps, illustrative of those produced for different parts of the country, were found to meet or exceed accuracies of 85 percent. Those maps tested were Tampa, Fla., Portland, Me., Charleston, W. Va., and Greeley, Colo., published at a scale of 1:250,000, and Atlanta, Ga., and Seattle and Tacoma, Wash., published at a scale of 1:100,000. For each map, the values were determined by calculating the ratio of the total number of points correctly interpreted to the total number of points sampled. Six of the seven maps tested have accuracies of 85 percent or better at the 95-percent lower confidence limit. When the sample data for predominant categories (those sampled with a significant number of points) were grouped together for all maps, accuracies of those predominant categories met the 85-percent accuracy criterion, with one exception. One category, Residential, had less than 85-percent accuracy at the 95-percent lower confidence limit. Nearly all residential land sampled was mapped correctly, but some areas of other land uses were mapped incorrectly as Residential.
NASA Astrophysics Data System (ADS)
Kankare, Ville; Vauhkonen, Jari; Tanhuanpää, Topi; Holopainen, Markus; Vastaranta, Mikko; Joensuu, Marianna; Krooks, Anssi; Hyyppä, Juha; Hyyppä, Hannu; Alho, Petteri; Viitala, Risto
2014-11-01
Detailed information about timber assortments and diameter distributions is required in forest management. Forest owners can make better decisions concerning the timing of timber sales and forest companies can utilize more detailed information to optimize their wood supply chain from forest to factory. The objective here was to compare the accuracies of high-density laser scanning techniques for the estimation of tree-level diameter distribution and timber assortments. We also introduce a method that utilizes a combination of airborne and terrestrial laser scanning in timber assortment estimation. The study was conducted in Evo, Finland. Harvester measurements were used as a reference for 144 trees within a single clear-cut stand. The results showed that accurate tree-level timber assortments and diameter distributions can be obtained, using terrestrial laser scanning (TLS) or a combination of TLS and airborne laser scanning (ALS). Saw log volumes were estimated with higher accuracy than pulpwood volumes. The saw log volumes were estimated with relative root-mean-squared errors of 17.5% and 16.8% with TLS and a combination of TLS and ALS, respectively. The respective accuracies for pulpwood were 60.1% and 59.3%. The differences in the bucking method used also caused some large errors. In addition, tree quality factors highly affected the bucking accuracy, especially with pulpwood volume.
Page, Michael M; Taranto, Mario; Ramsay, Duncan; van Schie, Greg; Glendenning, Paul; Gillett, Melissa J; Vasikaran, Samuel D
2018-01-01
Objective Primary aldosteronism is a curable cause of hypertension which can be treated surgically or medically depending on the findings of adrenal vein sampling studies. Adrenal vein sampling studies are technically demanding with a high failure rate in many centres. The use of intraprocedural cortisol measurement could improve the success rates of adrenal vein sampling but may be impracticable due to cost and effects on procedural duration. Design Retrospective review of the results of adrenal vein sampling procedures since commencement of point-of-care cortisol measurement using a novel single-use semi-quantitative measuring device for cortisol, the adrenal vein sampling Accuracy Kit. Success rate and complications of adrenal vein sampling procedures before and after use of the adrenal vein sampling Accuracy Kit. Routine use of the adrenal vein sampling Accuracy Kit device for intraprocedural measurement of cortisol commenced in 2016. Results Technical success rate of adrenal vein sampling increased from 63% of 99 procedures to 90% of 48 procedures ( P = 0.0007) after implementation of the adrenal vein sampling Accuracy Kit. Failure of right adrenal vein cannulation was the main reason for an unsuccessful study. Radiation dose decreased from 34.2 Gy.cm 2 (interquartile range, 15.8-85.9) to 15.7 Gy.cm 2 (6.9-47.3) ( P = 0.009). No complications were noted, and implementation costs were minimal. Conclusions Point-of-care cortisol measurement during adrenal vein sampling improved cannulation success rates and reduced radiation exposure. The use of the adrenal vein sampling Accuracy Kit is now standard practice at our centre.
Ruangsetakit, Varee
2015-11-01
To re-examine relative accuracy of intraocular lens (IOL) power calculation of immersion ultrasound biometry (IUB) and partial coherence interferometry (PCI) based on a new approach that limits its interest on the cases in which the IUB's IOL and PCI's IOL assignments disagree. Prospective observational study of 108 eyes that underwent cataract surgeries at Taksin Hospital. Two halves ofthe randomly chosen sample eyes were implanted with the IUB- and PCI-assigned lens. Postoperative refractive errors were measured in the fifth week. More accurate calculation was based on significantly smaller mean absolute errors (MAEs) and root mean squared errors (RMSEs) away from emmetropia. The distributions of the errors were examined to ensure that the higher accuracy was significant clinically as well. The (MAEs, RMSEs) were smaller for PCI of (0.5106 diopter (D), 0.6037D) than for IUB of (0.7000D, 0.8062D). The higher accuracy was principally contributedfrom negative errors, i.e., myopia. The MAEs and RMSEs for (IUB, PCI)'s negative errors were (0.7955D, 0.5185D) and (0.8562D, 0.5853D). Their differences were significant. The 72.34% of PCI errors fell within a clinically accepted range of ± 0.50D, whereas 50% of IUB errors did. PCI's higher accuracy was significant statistically and clinically, meaning that lens implantation based on PCI's assignments could improve postoperative outcomes over those based on IUB's assignments.
Effect of separate sampling on classification accuracy.
Shahrokh Esfahani, Mohammad; Dougherty, Edward R
2014-01-15
Measurements are commonly taken from two phenotypes to build a classifier, where the number of data points from each class is predetermined, not random. In this 'separate sampling' scenario, the data cannot be used to estimate the class prior probabilities. Moreover, predetermined class sizes can severely degrade classifier performance, even for large samples. We employ simulations using both synthetic and real data to show the detrimental effect of separate sampling on a variety of classification rules. We establish propositions related to the effect on the expected classifier error owing to a sampling ratio different from the population class ratio. From these we derive a sample-based minimax sampling ratio and provide an algorithm for approximating it from the data. We also extend to arbitrary distributions the classical population-based Anderson linear discriminant analysis minimax sampling ratio derived from the discriminant form of the Bayes classifier. All the codes for synthetic data and real data examples are written in MATLAB. A function called mmratio, whose output is an approximation of the minimax sampling ratio of a given dataset, is also written in MATLAB. All the codes are available at: http://gsp.tamu.edu/Publications/supplementary/shahrokh13b.
Use of Dual-wavelength Radar for Snow Parameter Estimates
NASA Technical Reports Server (NTRS)
Liao, Liang; Meneghini, Robert; Iguchi, Toshio; Detwiler, Andrew
2005-01-01
Use of dual-wavelength radar, with properly chosen wavelengths, will significantly lessen the ambiguities in the retrieval of microphysical properties of hydrometeors. In this paper, a dual-wavelength algorithm is described to estimate the characteristic parameters of the snow size distributions. An analysis of the computational results, made at X and Ka bands (T-39 airborne radar) and at S and X bands (CP-2 ground-based radar), indicates that valid estimates of the median volume diameter of snow particles, D(sub 0), should be possible if one of the two wavelengths of the radar operates in the non-Rayleigh scattering region. However, the accuracy may be affected to some extent if the shape factors of the Gamma function used for describing the particle distribution are chosen far from the true values or if cloud water attenuation is significant. To examine the validity and accuracy of the dual-wavelength radar algorithms, the algorithms are applied to the data taken from the Convective and Precipitation-Electrification Experiment (CaPE) in 1991, in which the dual-wavelength airborne radar was coordinated with in situ aircraft particle observations and ground-based radar measurements. Having carefully co-registered the data obtained from the different platforms, the airborne radar-derived size distributions are then compared with the in-situ measurements and ground-based radar. Good agreement is found for these comparisons despite the uncertainties resulting from mismatches of the sample volumes among the different sensors as well as spatial and temporal offsets.
Multi-level methods and approximating distribution functions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilson, D., E-mail: daniel.wilson@dtc.ox.ac.uk; Baker, R. E.
2016-07-15
Biochemical reaction networks are often modelled using discrete-state, continuous-time Markov chains. System statistics of these Markov chains usually cannot be calculated analytically and therefore estimates must be generated via simulation techniques. There is a well documented class of simulation techniques known as exact stochastic simulation algorithms, an example of which is Gillespie’s direct method. These algorithms often come with high computational costs, therefore approximate stochastic simulation algorithms such as the tau-leap method are used. However, in order to minimise the bias in the estimates generated using them, a relatively small value of tau is needed, rendering the computational costs comparablemore » to Gillespie’s direct method. The multi-level Monte Carlo method (Anderson and Higham, Multiscale Model. Simul. 10:146–179, 2012) provides a reduction in computational costs whilst minimising or even eliminating the bias in the estimates of system statistics. This is achieved by first crudely approximating required statistics with many sample paths of low accuracy. Then correction terms are added until a required level of accuracy is reached. Recent literature has primarily focussed on implementing the multi-level method efficiently to estimate a single system statistic. However, it is clearly also of interest to be able to approximate entire probability distributions of species counts. We present two novel methods that combine known techniques for distribution reconstruction with the multi-level method. We demonstrate the potential of our methods using a number of examples.« less
The accuracy of thematic map products is not spatially homogenous, but instead variable across most landscapes. Properly analyzing and representing the spatial distribution (pattern) of thematic map accuracy would provide valuable user information for assessing appropriate applic...
NASA Astrophysics Data System (ADS)
Zhang, Yang; Liu, Wei; Li, Xiaodong; Yang, Fan; Gao, Peng; Jia, Zhenyuan
2015-10-01
Large-scale triangulation scanning measurement systems are widely used to measure the three-dimensional profile of large-scale components and parts. The accuracy and speed of the laser stripe center extraction are essential for guaranteeing the accuracy and efficiency of the measuring system. However, in the process of large-scale measurement, multiple factors can cause deviation of the laser stripe center, including the spatial light intensity distribution, material reflectivity characteristics, and spatial transmission characteristics. A center extraction method is proposed for improving the accuracy of the laser stripe center extraction based on image evaluation of Gaussian fitting structural similarity and analysis of the multiple source factors. First, according to the features of the gray distribution of the laser stripe, evaluation of the Gaussian fitting structural similarity is estimated to provide a threshold value for center compensation. Then using the relationships between the gray distribution of the laser stripe and the multiple source factors, a compensation method of center extraction is presented. Finally, measurement experiments for a large-scale aviation composite component are carried out. The experimental results for this specific implementation verify the feasibility of the proposed center extraction method and the improved accuracy for large-scale triangulation scanning measurements.
LaHaye, N. L.; Harilal, S. S.; Diwakar, P. K.; Hassanein, A.; Kulkarni, P.
2015-01-01
We investigated the role of femtosecond (fs) laser wavelength on laser ablation (LA) and its relation to laser generated aerosol counts and particle distribution, inductively coupled plasma-mass spectrometry (ICP-MS) signal intensity, detection limits, and elemental fractionation. Four different NIST standard reference materials (610, 613, 615, and 616) were ablated using 400 nm and 800 nm fs laser pulses to study the effect of wavelength on laser ablation rate, accuracy, precision, and fractionation. Our results show that the detection limits are lower for 400 nm laser excitation than 800 nm laser excitation at lower laser energies but approximately equal at higher energies. Ablation threshold was also found to be lower for 400 nm than 800 nm laser excitation. Particle size distributions are very similar for 400 nm and 800 nm wavelengths; however, they differ significantly in counts at similar laser fluence levels. This study concludes that 400 nm LA is more beneficial for sample introduction in ICP-MS, particularly when lower laser energies are to be used for ablation. PMID:26640294
Quantifying the line-of-sight mass distributions for time-delay lenses with stellar masses
NASA Astrophysics Data System (ADS)
Rusu, Cristian; Fassnacht, Chris; Treu, Tommaso; Suyu, Sherry; Auger, Matt; Koopmans, Leon; Marshall, Phil; Wong, Kenneth; Collett, Thomas; Agnello, Adriano; Blandford, Roger; Courbin, Frederic; Hilbert, Stefan; Meylan, Georges; Sluse, Dominique
2014-12-01
Measuring cosmological parameters with a realistic account of systematic uncertainties is currently one of the principal challenges of physical cosmology. Building on our recent successes with two gravitationally lensed systems, we have started a program to achieve accurate cosmographic measurements from five gravitationally lensed quasars. We aim at measuring H_0 with an accuracy better than 4%, comparable to but independent from measurements by current BAO, SN or Cepheid programs. The largest current contributor to the error budget in our sample is uncertainty about the line-of-sight mass distribution and environment of the lens systems. In this proposal, we request wide-field u-band imaging of the only lens in our sample without already available Spitzer/IRCA observations, B1608+656. The proposed observations are critical for reducing these uncertainties by providing accurate redshifts and in particular stellar masses for galaxies in the light cones of the target lens system. This will establish lensing as a powerful and independent tool for determining cosmography, in preparation for the hundreds of time-delay lenses that will be discovered by future surveys.
Multi-Wavelength Photomagnetic Imaging for Oral Cancer
NASA Astrophysics Data System (ADS)
Marks, Michael
In this study, a multi-wavelength Photomagnetic Imaging (PMI) system is developed and evaluated with experimental studies.. PMI measures temperature increases in samples illuminated by near-infrared light sources using magnetic resonance thermometry. A multiphysics solver combining light and heat transfer models the spatiotemporal distribution of the temperature change. The PMI system develop in this work uses three lasers of varying wavelength (785 nm, 808 nm, 860 nm) to heat the sample. By using multiple wavelengths, we enable the PMI system to quantify the relative concentrations of optical contrast in turbid media and monitor their distribution, at a higher resolution than conventional diffuse optical imaging. The data collected from agarose phantoms with multiple embedded contrast agents designed to simulate the optical properties of oxy- and deoxy-hemoglobin is presented. The reconstructed images demonstrate that multi-wavelength PMI can resolve this complex inclusion structure with high resolution and recover the concentration of each contrast agent with high quantitative accuracy. The modified multi-wavelength PMI system operates under the maximum skin exposure limits defined by the American National Standards Institute, to enable future clinical applications.
König, Gerhard; Miller, Benjamin T; Boresch, Stefan; Wu, Xiongwu; Brooks, Bernard R
2012-10-09
One of the key requirements for the accurate calculation of free energy differences is proper sampling of conformational space. Especially in biological applications, molecular dynamics simulations are often confronted with rugged energy surfaces and high energy barriers, leading to insufficient sampling and, in turn, poor convergence of the free energy results. In this work, we address this problem by employing enhanced sampling methods. We explore the possibility of using self-guided Langevin dynamics (SGLD) to speed up the exploration process in free energy simulations. To obtain improved free energy differences from such simulations, it is necessary to account for the effects of the bias due to the guiding forces. We demonstrate how this can be accomplished for the Bennett's acceptance ratio (BAR) and the enveloping distribution sampling (EDS) methods. While BAR is considered among the most efficient methods available for free energy calculations, the EDS method developed by Christ and van Gunsteren is a promising development that reduces the computational costs of free energy calculations by simulating a single reference state. To evaluate the accuracy of both approaches in connection with enhanced sampling, EDS was implemented in CHARMM. For testing, we employ benchmark systems with analytical reference results and the mutation of alanine to serine. We find that SGLD with reweighting can provide accurate results for BAR and EDS where conventional molecular dynamics simulations fail. In addition, we compare the performance of EDS with other free energy methods. We briefly discuss the implications of our results and provide practical guidelines for conducting free energy simulations with SGLD.
A computer program for geochemical analysis of acid-rain and other low-ionic-strength, acidic waters
Johnsson, P.A.; Lord, D.G.
1987-01-01
ARCHEM, a computer program written in FORTRAN 77, is designed primarily for use in the routine geochemical interpretation of low-ionic-strength, acidic waters. On the basis of chemical analyses of the water, and either laboratory or field determinations of pH, temperature, and dissolved oxygen, the program calculates the equilibrium distribution of major inorganic aqueous species and of inorganic aluminum complexes. The concentration of the organic anion is estimated from the dissolved organic concentration. Ionic ferrous iron is calculated from the dissolved oxygen concentration. Ionic balances and comparisons of computed with measured specific conductances are performed as checks on the analytical accuracy of chemical analyses. ARCHEM may be tailored easily to fit different sampling protocols, and may be run on multiple sample analyses. (Author 's abstract)
Geostatistical Sampling Methods for Efficient Uncertainty Analysis in Flow and Transport Problems
NASA Astrophysics Data System (ADS)
Liodakis, Stylianos; Kyriakidis, Phaedon; Gaganis, Petros
2015-04-01
In hydrogeological applications involving flow and transport of in heterogeneous porous media the spatial distribution of hydraulic conductivity is often parameterized in terms of a lognormal random field based on a histogram and variogram model inferred from data and/or synthesized from relevant knowledge. Realizations of simulated conductivity fields are then generated using geostatistical simulation involving simple random (SR) sampling and are subsequently used as inputs to physically-based simulators of flow and transport in a Monte Carlo framework for evaluating the uncertainty in the spatial distribution of solute concentration due to the uncertainty in the spatial distribution of hydraulic con- ductivity [1]. Realistic uncertainty analysis, however, calls for a large number of simulated concentration fields; hence, can become expensive in terms of both time and computer re- sources. A more efficient alternative to SR sampling is Latin hypercube (LH) sampling, a special case of stratified random sampling, which yields a more representative distribution of simulated attribute values with fewer realizations [2]. Here, term representative implies realizations spanning efficiently the range of possible conductivity values corresponding to the lognormal random field. In this work we investigate the efficiency of alternative methods to classical LH sampling within the context of simulation of flow and transport in a heterogeneous porous medium. More precisely, we consider the stratified likelihood (SL) sampling method of [3], in which attribute realizations are generated using the polar simulation method by exploring the geometrical properties of the multivariate Gaussian distribution function. In addition, we propose a more efficient version of the above method, here termed minimum energy (ME) sampling, whereby a set of N representative conductivity realizations at M locations is constructed by: (i) generating a representative set of N points distributed on the surface of a M-dimensional, unit radius hyper-sphere, (ii) relocating the N points on a representative set of N hyper-spheres of different radii, and (iii) transforming the coordinates of those points to lie on N different hyper-ellipsoids spanning the multivariate Gaussian distribution. The above method is applied in a dimensionality reduction context by defining flow-controlling points over which representative sampling of hydraulic conductivity is performed, thus also accounting for the sensitivity of the flow and transport model to the input hydraulic conductivity field. The performance of the various stratified sampling methods, LH, SL, and ME, is compared to that of SR sampling in terms of reproduction of ensemble statistics of hydraulic conductivity and solute concentration for different sample sizes N (numbers of realizations). The results indicate that ME sampling constitutes an equally if not more efficient simulation method than LH and SL sampling, as it can reproduce to a similar extent statistics of the conductivity and concentration fields, yet with smaller sampling variability than SR sampling. References [1] Gutjahr A.L. and Bras R.L. Spatial variability in subsurface flow and transport: A review. Reliability Engineering & System Safety, 42, 293-316, (1993). [2] Helton J.C. and Davis F.J. Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliability Engineering & System Safety, 81, 23-69, (2003). [3] Switzer P. Multiple simulation of spatial fields. In: Heuvelink G, Lemmens M (eds) Proceedings of the 4th International Symposium on Spatial Accuracy Assessment in Natural Resources and Environmental Sciences, Coronet Books Inc., pp 629?635 (2000).
Eisenberg, Sarita; Guo, Ling-Yu
2016-05-01
This article reviews the existing literature on the diagnostic accuracy of two grammatical accuracy measures for differentiating children with and without language impairment (LI) at preschool and early school age based on language samples. The first measure, the finite verb morphology composite (FVMC), is a narrow grammatical measure that computes children's overall accuracy of four verb tense morphemes. The second measure, percent grammatical utterances (PGU), is a broader grammatical measure that computes children's accuracy in producing grammatical utterances. The extant studies show that FVMC demonstrates acceptable (i.e., 80 to 89% accurate) to good (i.e., 90% accurate or higher) diagnostic accuracy for children between 4;0 (years;months) and 6;11 in conversational or narrative samples. In contrast, PGU yields acceptable to good diagnostic accuracy for children between 3;0 and 8;11 regardless of sample types. Given the diagnostic accuracy shown in the literature, we suggest that FVMC and PGU can be used as one piece of evidence for identifying children with LI in assessment when appropriate. However, FVMC or PGU should not be used as therapy goals directly. Instead, when children are low in FVMC or PGU, we suggest that follow-up analyses should be conducted to determine the verb tense morphemes or grammatical structures that children have difficulty with. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Impact of dose calibrators quality control programme in Argentina
NASA Astrophysics Data System (ADS)
Furnari, J. C.; de Cabrejas, M. L.; del C. Rotta, M.; Iglicki, F. A.; Milá, M. I.; Magnavacca, C.; Dima, J. C.; Rodríguez Pasqués, R. H.
1992-02-01
The national Quality Control (QC) programme for radionuclide calibrators started 12 years ago. Accuracy and the implementation of a QC programme were evaluated over all these years at 95 nuclear medicine laboratories where dose calibrators were in use. During all that time, the Metrology Group of CNEA has distributed 137Cs sealed sources to check stability and has been performing periodic "checking rounds" and postal surveys using unknown samples (external quality control). An account of the results of both methods is presented. At present, more of 65% of the dose calibrators measure activities with an error less than 10%.
On the accuracy of ERS-1 orbit predictions
NASA Technical Reports Server (NTRS)
Koenig, Rolf; Li, H.; Massmann, Franz-Heinrich; Raimondo, J. C.; Rajasenan, C.; Reigber, C.
1993-01-01
Since the launch of ERS-1, the D-PAF (German Processing and Archiving Facility) provides regularly orbit predictions for the worldwide SLR (Satellite Laser Ranging) tracking network. The weekly distributed orbital elements are so called tuned IRV's and tuned SAO-elements. The tuning procedure, designed to improve the accuracy of the recovery of the orbit at the stations, is discussed based on numerical results. This shows that tuning of elements is essential for ERS-1 with the currently applied tracking procedures. The orbital elements are updated by daily distributed time bias functions. The generation of the time bias function is explained. Problems and numerical results are presented. The time bias function increases the prediction accuracy considerably. Finally, the quality assessment of ERS-1 orbit predictions is described. The accuracy is compiled for about 250 days since launch. The average accuracy lies in the range of 50-100 ms and has considerably improved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Feng, Huan; Qian, Yu; Cochran, J. Kirk
Here, this article reports a nanometer-scale investigation of trace element (As, Ca, Cr, Cu, Fe, Mn, Ni, S and Zn) distributions in the root system Spartina alterniflora during dormancy. The sample was collected on a salt marsh island in Jamaica Bay, New York, in April 2015 and the root was cross-sectioned with 10 μm resolution. Synchrotron X-ray nanofluorescence was applied to map the trace element distributions in selected areas of the root epidermis and endodermis. The sampling resolution was 60 nm to increase the measurement accuracy and reduce the uncertainty. The results indicate that the elemental concentrations in the epidermis,more » outer endodermis and inner endodermis are significantly (p < 0.01) different. The root endodermis has relatively higher concentrations of these elements than the root epidermis. Furthermore, this high resolution measurement indicates that the elemental concentrations in the outer endodermis are significantly (p < 0.01) higher than those in the inner endodermis. These results suggest that the Casparian strip may play a role in governing the aplastic transport of these elements. Pearson correlation analysis on the average concentrations of each element in the selected areas shows that most of the elements are significantly (p < 0.05) correlated, which suggests that these elements may share the same transport pathways.« less
Feng, Huan; Qian, Yu; Cochran, J. Kirk; ...
2017-01-18
Here, this article reports a nanometer-scale investigation of trace element (As, Ca, Cr, Cu, Fe, Mn, Ni, S and Zn) distributions in the root system Spartina alterniflora during dormancy. The sample was collected on a salt marsh island in Jamaica Bay, New York, in April 2015 and the root was cross-sectioned with 10 μm resolution. Synchrotron X-ray nanofluorescence was applied to map the trace element distributions in selected areas of the root epidermis and endodermis. The sampling resolution was 60 nm to increase the measurement accuracy and reduce the uncertainty. The results indicate that the elemental concentrations in the epidermis,more » outer endodermis and inner endodermis are significantly (p < 0.01) different. The root endodermis has relatively higher concentrations of these elements than the root epidermis. Furthermore, this high resolution measurement indicates that the elemental concentrations in the outer endodermis are significantly (p < 0.01) higher than those in the inner endodermis. These results suggest that the Casparian strip may play a role in governing the aplastic transport of these elements. Pearson correlation analysis on the average concentrations of each element in the selected areas shows that most of the elements are significantly (p < 0.05) correlated, which suggests that these elements may share the same transport pathways.« less
HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schneider, Michael D.; Dawson, William A.; Hogg, David W.
2015-07-01
Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxymore » properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probabilitiy distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.« less
A Novel Energy-Efficient Approach for Human Activity Recognition.
Zheng, Lingxiang; Wu, Dihong; Ruan, Xiaoyang; Weng, Shaolin; Peng, Ao; Tang, Biyu; Lu, Hai; Shi, Haibin; Zheng, Huiru
2017-09-08
In this paper, we propose a novel energy-efficient approach for mobile activity recognition system (ARS) to detect human activities. The proposed energy-efficient ARS, using low sampling rates, can achieve high recognition accuracy and low energy consumption. A novel classifier that integrates hierarchical support vector machine and context-based classification (HSVMCC) is presented to achieve a high accuracy of activity recognition when the sampling rate is less than the activity frequency, i.e., the Nyquist sampling theorem is not satisfied. We tested the proposed energy-efficient approach with the data collected from 20 volunteers (14 males and six females) and the average recognition accuracy of around 96.0% was achieved. Results show that using a low sampling rate of 1Hz can save 17.3% and 59.6% of energy compared with the sampling rates of 5 Hz and 50 Hz. The proposed low sampling rate approach can greatly reduce the power consumption while maintaining high activity recognition accuracy. The composition of power consumption in online ARS is also investigated in this paper.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hammond, Glenn Edward; Song, Xuehang; Ye, Ming
A new approach is developed to delineate the spatial distribution of discrete facies (geological units that have unique distributions of hydraulic, physical, and/or chemical properties) conditioned not only on direct data (measurements directly related to facies properties, e.g., grain size distribution obtained from borehole samples) but also on indirect data (observations indirectly related to facies distribution, e.g., hydraulic head and tracer concentration). Our method integrates for the first time ensemble data assimilation with traditional transition probability-based geostatistics. The concept of level set is introduced to build shape parameterization that allows transformation between discrete facies indicators and continuous random variables. Themore » spatial structure of different facies is simulated by indicator models using conditioning points selected adaptively during the iterative process of data assimilation. To evaluate the new method, a two-dimensional semi-synthetic example is designed to estimate the spatial distribution and permeability of two distinct facies from transient head data induced by pumping tests. The example demonstrates that our new method adequately captures the spatial pattern of facies distribution by imposing spatial continuity through conditioning points. The new method also reproduces the overall response in hydraulic head field with better accuracy compared to data assimilation with no constraints on spatial continuity on facies.« less
Tarafder, M R; Carabin, H; Joseph, L; Balolong, E; Olveda, R; McGarvey, S T
2010-03-15
The accuracy of the Kato-Katz technique in identifying individuals with soil-transmitted helminth (STH) infections is limited by day-to-day variation in helminth egg excretion, confusion with other parasites and the laboratory technicians' experience. We aimed to estimate the sensitivity and specificity of the Kato-Katz technique to detect infection with Ascaris lumbricoides, hookworm and Trichuris trichiura using a Bayesian approach in the absence of a 'gold standard'. Data were obtained from a longitudinal study conducted between January 2004 and December 2005 in Samar Province, the Philippines. Each participant provided between one and three stool samples over consecutive days. Stool samples were examined using the Kato-Katz technique and reported as positive or negative for STHs. In the presence of measurement error, the true status of each individual is considered as latent data. Using a Bayesian method, we calculated marginal posterior densities of sensitivity and specificity parameters from the product of the likelihood function of observed and latent data. A uniform prior distribution was used (beta distribution: alpha=1, beta=1). A total of 5624 individuals provided at least one stool sample. One, two and three stool samples were provided by 1582, 1893 and 2149 individuals, respectively. All STHs showed variation in test results from day to day. Sensitivity estimates of the Kato-Katz technique for one stool sample were 96.9% (95% Bayesian Credible Interval [BCI]: 96.1%, 97.6%), 65.2% (60.0%, 69.8%) and 91.4% (90.5%, 92.3%), for A. lumbricoides, hookworm and T. trichiura, respectively. Specificity estimates for one stool sample were 96.1% (95.5%, 96.7%), 93.8% (92.4%, 95.4%) and 94.4% (93.2%, 95.5%), for A. lumbricoides, hookworm and T. trichiura, respectively. Our results show that the Kato-Katz technique can perform with reasonable accuracy with one day's stool collection for A. lumbricoides and T. trichiura. Low sensitivity of the Kato-Katz for detection of hookworm infection may be related to rapid degeneration of delicate hookworm eggs with time. (c) 2009 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.
Classification accuracy for stratification with remotely sensed data
Raymond L. Czaplewski; Paul L. Patterson
2003-01-01
Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an âerror matrix,â which is...
Oh, Jeongsu; Choi, Chi-Hwan; Park, Min-Kyu; Kim, Byung Kwon; Hwang, Kyuin; Lee, Sang-Heon; Hong, Soon Gyu; Nasir, Arshan; Cho, Wan-Sup; Kim, Kyung Mo
2016-01-01
High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to different organisms present in the environmental samples. Typically, analysis of microbial diversity in bioinformatics starts from pre-processing followed by clustering 16S rRNA reads into relatively fewer operational taxonomic units (OTUs). The OTUs are reliable indicators of microbial diversity and greatly accelerate the downstream analysis time. However, existing hierarchical clustering algorithms that are generally more accurate than greedy heuristic algorithms struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, which is the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology-a distributed data structure to store all data in the main memory of multiple computing nodes. The IMDG technology helps CLUSTOM-CLOUD to enhance both its capability of handling larger datasets and its computational scalability better than its ancestor, CLUSTOM, while maintaining high accuracy. Clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using the small laboratory cluster (10 nodes) and under the Amazon EC2 cloud-computing environments. Under the laboratory environment, it required only ~3 hours to process dataset of size 200 K reads regardless of the complexity of the human microbiome data. In turn, one million reads were processed in approximately 20, 14, and 11 hours when utilizing 20, 30, and 40 nodes on the Amazon EC2 cloud-computing environment. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is also a scalable distributed processing system. The comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm. CLUSTOM-CLOUD is written in JAVA and is freely available at http://clustomcloud.kopri.re.kr.
Park, Min-Kyu; Kim, Byung Kwon; Hwang, Kyuin; Lee, Sang-Heon; Hong, Soon Gyu; Nasir, Arshan; Cho, Wan-Sup; Kim, Kyung Mo
2016-01-01
High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to different organisms present in the environmental samples. Typically, analysis of microbial diversity in bioinformatics starts from pre-processing followed by clustering 16S rRNA reads into relatively fewer operational taxonomic units (OTUs). The OTUs are reliable indicators of microbial diversity and greatly accelerate the downstream analysis time. However, existing hierarchical clustering algorithms that are generally more accurate than greedy heuristic algorithms struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, which is the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology–a distributed data structure to store all data in the main memory of multiple computing nodes. The IMDG technology helps CLUSTOM-CLOUD to enhance both its capability of handling larger datasets and its computational scalability better than its ancestor, CLUSTOM, while maintaining high accuracy. Clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using the small laboratory cluster (10 nodes) and under the Amazon EC2 cloud-computing environments. Under the laboratory environment, it required only ~3 hours to process dataset of size 200 K reads regardless of the complexity of the human microbiome data. In turn, one million reads were processed in approximately 20, 14, and 11 hours when utilizing 20, 30, and 40 nodes on the Amazon EC2 cloud-computing environment. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is also a scalable distributed processing system. The comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm. CLUSTOM-CLOUD is written in JAVA and is freely available at http://clustomcloud.kopri.re.kr. PMID:26954507
NASA Astrophysics Data System (ADS)
Obuchi, Tomoyuki; Cocco, Simona; Monasson, Rémi
2015-11-01
We consider the problem of learning a target probability distribution over a set of N binary variables from the knowledge of the expectation values (with this target distribution) of M observables, drawn uniformly at random. The space of all probability distributions compatible with these M expectation values within some fixed accuracy, called version space, is studied. We introduce a biased measure over the version space, which gives a boost increasing exponentially with the entropy of the distributions and with an arbitrary inverse `temperature' Γ . The choice of Γ allows us to interpolate smoothly between the unbiased measure over all distributions in the version space (Γ =0) and the pointwise measure concentrated at the maximum entropy distribution (Γ → ∞ ). Using the replica method we compute the volume of the version space and other quantities of interest, such as the distance R between the target distribution and the center-of-mass distribution over the version space, as functions of α =(log M)/N and Γ for large N. Phase transitions at critical values of α are found, corresponding to qualitative improvements in the learning of the target distribution and to the decrease of the distance R. However, for fixed α the distance R does not vary with Γ which means that the maximum entropy distribution is not closer to the target distribution than any other distribution compatible with the observable values. Our results are confirmed by Monte Carlo sampling of the version space for small system sizes (N≤ 10).
Zeng, Chen; Xu, Huiping; Fischer, Andrew M.
2016-01-01
Ocean color remote sensing significantly contributes to our understanding of phytoplankton distribution and abundance and primary productivity in the Southern Ocean (SO). However, the current SO in situ optical database is still insufficient and unevenly distributed. This limits the ability to produce robust and accurate measurements of satellite-based chlorophyll. Based on data collected on cruises around the Antarctica Peninsula (AP) on January 2014 and 2016, this research intends to enhance our knowledge of SO water and atmospheric optical characteristics and address satellite algorithm deficiency of ocean color products. We collected high resolution in situ water leaving reflectance (±1 nm band resolution), simultaneous in situ chlorophyll-a concentrations and satellite (MODIS and VIIRS) water leaving reflectance. Field samples show that clouds have a great impact on the visible green bands and are difficult to detect because NASA protocols apply the NIR band as a cloud contamination threshold. When compared to global case I water, water around the AP has lower water leaving reflectance and a narrower blue-green band ratio, which explains chlorophyll-a underestimation in high chlorophyll-a regions and overestimation in low chlorophyll-a regions. VIIRS shows higher spatial coverage and detection accuracy than MODIS. After coefficient improvement, VIIRS is able to predict chlorophyll a with 53% accuracy. PMID:27941596
Zeng, Chen; Xu, Huiping; Fischer, Andrew M
2016-12-07
Ocean color remote sensing significantly contributes to our understanding of phytoplankton distribution and abundance and primary productivity in the Southern Ocean (SO). However, the current SO in situ optical database is still insufficient and unevenly distributed. This limits the ability to produce robust and accurate measurements of satellite-based chlorophyll. Based on data collected on cruises around the Antarctica Peninsula (AP) on January 2014 and 2016, this research intends to enhance our knowledge of SO water and atmospheric optical characteristics and address satellite algorithm deficiency of ocean color products. We collected high resolution in situ water leaving reflectance (±1 nm band resolution), simultaneous in situ chlorophyll-a concentrations and satellite (MODIS and VIIRS) water leaving reflectance. Field samples show that clouds have a great impact on the visible green bands and are difficult to detect because NASA protocols apply the NIR band as a cloud contamination threshold. When compared to global case I water, water around the AP has lower water leaving reflectance and a narrower blue-green band ratio, which explains chlorophyll-a underestimation in high chlorophyll-a regions and overestimation in low chlorophyll-a regions. VIIRS shows higher spatial coverage and detection accuracy than MODIS. After coefficient improvement, VIIRS is able to predict chlorophyll a with 53% accuracy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Solaimani, Mohiuddin; Iftekhar, Mohammed; Khan, Latifur
Anomaly detection refers to the identi cation of an irregular or unusual pat- tern which deviates from what is standard, normal, or expected. Such deviated patterns typically correspond to samples of interest and are assigned different labels in different domains, such as outliers, anomalies, exceptions, or malware. Detecting anomalies in fast, voluminous streams of data is a formidable chal- lenge. This paper presents a novel, generic, real-time distributed anomaly detection framework for heterogeneous streaming data where anomalies appear as a group. We have developed a distributed statistical approach to build a model and later use it to detect anomaly. Asmore » a case study, we investigate group anomaly de- tection for a VMware-based cloud data center, which maintains a large number of virtual machines (VMs). We have built our framework using Apache Spark to get higher throughput and lower data processing time on streaming data. We have developed a window-based statistical anomaly detection technique to detect anomalies that appear sporadically. We then relaxed this constraint with higher accuracy by implementing a cluster-based technique to detect sporadic and continuous anomalies. We conclude that our cluster-based technique out- performs other statistical techniques with higher accuracy and lower processing time.« less
Remote Sensing of Energy Distribution Characteristics over the Tibet
NASA Astrophysics Data System (ADS)
Shi, J.; Husi, L.; Wang, T.
2017-12-01
The overall objective of our study is to quantify the spatiotemporal characteristics and changes of typical factors dominating water and energy cycles in the Tibet region. Especially, we focus on variables of clouds optical & microphysical parameters, surface shortwave and longwave radiation. Clouds play a key role in the Tibetan region's water and energy cycles. They seriously impact the precipitation, temperature and surface energy distribution. Considering that proper cloud products with relatively higher spatial and temporal sampling and with satisfactory accuracy are serious lacking in the Tibet region, except cloud optical thickness, cloud effective radius and liquid/ice water content, the cloud coverage dynamics at hourly scales also analyzed jointly based on measurements of Himawari-8, and MODIS. Surface radiation, as an important energy source in perturbating the Tibet's evapotranspiration, snow and glacier melting, is a controlling factor in energy balance in the Tibet region. All currently available radiation products in this area are not suitable for regional scale study of water and energy exchange and snow/glacier melting due to their coarse resolution and low accuracies because of cloud and topography. A strategy for deriving land surface upward and downward radiation by fusing optical and microwave remote sensing data is proposed. At the same time, the big topographic effect on the surface radiation are also modelled and analyzed over the Tibet region.
Klein, Mike E.; Zatorre, Robert J.
2015-01-01
In categorical perception (CP), continuous physical signals are mapped to discrete perceptual bins: mental categories not found in the physical world. CP has been demonstrated across multiple sensory modalities and, in audition, for certain over-learned speech and musical sounds. The neural basis of auditory CP, however, remains ambiguous, including its robustness in nonspeech processes and the relative roles of left/right hemispheres; primary/nonprimary cortices; and ventral/dorsal perceptual processing streams. Here, highly trained musicians listened to 2-tone musical intervals, which they perceive categorically while undergoing functional magnetic resonance imaging. Multivariate pattern analyses were performed after grouping sounds by interval quality (determined by frequency ratio between tones) or pitch height (perceived noncategorically, frequency ratios remain constant). Distributed activity patterns in spheres of voxels were used to determine sound sample identities. For intervals, significant decoding accuracy was observed in the right superior temporal and left intraparietal sulci, with smaller peaks observed homologously in contralateral hemispheres. For pitch height, no significant decoding accuracy was observed, consistent with the non-CP of this dimension. These results suggest that similar mechanisms are operative for nonspeech categories as for speech; espouse roles for 2 segregated processing streams; and support hierarchical processing models for CP. PMID:24488957
Cryogenic thermometry for refrigerant distribution system of JT-60SA
NASA Astrophysics Data System (ADS)
Natsume, K.; Murakami, H.; Kizu, K.; Yoshida, K.; Koide, Y.
2015-12-01
JT-60SA is a fully superconducting fusion experimental device involving Japan and Europe. The cryogenic system supplies supercritical or gaseous helium to superconducting coils through valve boxes or coil terminal boxes and in-cryostat pipes. There are 86 temperature measurement points at 4 K along the distribution line. Resistance temperature sensors will be installed on cooling pipes in vacuum. In this work, two sensor attachment methods, two types of sensor, two thermal anchoring methods, and two sensor fixation materials have been experimentally evaluated in terms of accuracy and mass productivity. Finally, the verification test of thermometry has been conducted using the sample pipe fabricated in the same way to the production version, which has been decided by the comparison experiments. The TVO sensor is attached by the saddle method with Apiezon N grease and the measurement wires made of phosphor bronze are wound on the pipe with Stycast 2850FT as the thermal anchoring. A Cernox sensor is directly immersed in liquid helium as a reference thermometer during the experiment. The measured temperature difference between the attached one and reference one has been within ±15 mK in the range of 3.40-4.73 K. It has satisfies the accuracy requirement of 0.1 K.
A preliminary study on identification of Thai rice samples by INAA and statistical analysis
NASA Astrophysics Data System (ADS)
Kongsri, S.; Kukusamude, C.
2017-09-01
This study aims to investigate the elemental compositions in 93 Thai rice samples using instrumental neutron activation analysis (INAA) and to identify rice according to their types and rice cultivars using statistical analysis. As, Mg, Cl, Al, Br, Mn, K, Rb and Zn in Thai jasmine rice and Sung Yod rice samples were successfully determined by INAA. The accuracy and precision of the INAA method were verified by SRM 1568a Rice Flour. All elements were found to be in a good agreement with the certified values. The precisions in term of %RSD were lower than 7%. The LODs were obtained in range of 0.01 to 29 mg kg-1. The concentration of 9 elements distributed in Thai rice samples was evaluated and used as chemical indicators to identify the type of rice samples. The result found that Mg, Cl, As, Br, Mn, K, Rb, and Zn concentrations in Thai jasmine rice samples are significantly different but there was no evidence that Al is significantly different from concentration in Sung Yod rice samples at 95% confidence interval. Our results may provide preliminary information for discrimination of rice samples and may be useful database of Thai rice.
Emperical Tests of Acceptance Sampling Plans
NASA Technical Reports Server (NTRS)
White, K. Preston, Jr.; Johnson, Kenneth L.
2012-01-01
Acceptance sampling is a quality control procedure applied as an alternative to 100% inspection. A random sample of items is drawn from a lot to determine the fraction of items which have a required quality characteristic. Both the number of items to be inspected and the criterion for determining conformance of the lot to the requirement are given by an appropriate sampling plan with specified risks of Type I and Type II sampling errors. In this paper, we present the results of empirical tests of the accuracy of selected sampling plans reported in the literature. These plans are for measureable quality characteristics which are known have either binomial, exponential, normal, gamma, Weibull, inverse Gaussian, or Poisson distributions. In the main, results support the accepted wisdom that variables acceptance plans are superior to attributes (binomial) acceptance plans, in the sense that these provide comparable protection against risks at reduced sampling cost. For the Gaussian and Weibull plans, however, there are ranges of the shape parameters for which the required sample sizes are in fact larger than the corresponding attributes plans, dramatically so for instances of large skew. Tests further confirm that the published inverse-Gaussian (IG) plan is flawed, as reported by White and Johnson (2011).
Short-arc measurement and fitting based on the bidirectional prediction of observed data
NASA Astrophysics Data System (ADS)
Fei, Zhigen; Xu, Xiaojie; Georgiadis, Anthimos
2016-02-01
To measure a short arc is a notoriously difficult problem. In this study, the bidirectional prediction method based on the Radial Basis Function Neural Network (RBFNN) to the observed data distributed along a short arc is proposed to increase the corresponding arc length, and thus improve its fitting accuracy. Firstly, the rationality of regarding observed data as a time series is discussed in accordance with the definition of a time series. Secondly, the RBFNN is constructed to predict the observed data where the interpolation method is used for enlarging the size of training examples in order to improve the learning accuracy of the RBFNN’s parameters. Finally, in the numerical simulation section, we focus on simulating how the size of the training sample and noise level influence the learning error and prediction error of the built RBFNN. Typically, the observed data coming from a 5{}^\\circ short arc are used to evaluate the performance of the Hyper method known as the ‘unbiased fitting method of circle’ with a different noise level before and after prediction. A number of simulation experiments reveal that the fitting stability and accuracy of the Hyper method after prediction are far superior to the ones before prediction.
To address accuracy and precision using methods from analytical chemistry and computational physics.
Kozmutza, Cornelia; Picó, Yolanda
2009-04-01
In this work the pesticides were determined by liquid chromatography-mass spectrometry (LC-MS). In present study the occurrence of imidacloprid in 343 samples of oranges, tangerines, date plum, and watermelons from Valencian Community (Spain) has been investigated. The nine additional pesticides were chosen as they have been recommended for orchard treatment together with imidacloprid. The Mulliken population analysis has been applied to present the charge distribution in imidacloprid. Partitioned energy terms and the virial ratios have been calculated for certain molecules entering in interaction. A new technique based on the comparison of the decomposed total energy terms at various configurations is demonstrated in this work. The interaction ability could be established correctly in the studied case. An attempt is also made in this work to address accuracy and precision. These quantities are well-known in experimental measurements. In case precise theoretical description is achieved for the contributing monomers and also for the interacting complex structure some properties of this latter system can be predicted to quite a good accuracy. Based on simple hypothetical considerations we estimate the impact of applying computations on reducing the amount of analytical work.
High-precision positioning system of four-quadrant detector based on the database query
NASA Astrophysics Data System (ADS)
Zhang, Xin; Deng, Xiao-guo; Su, Xiu-qin; Zheng, Xiao-qiang
2015-02-01
The fine pointing mechanism of the Acquisition, Pointing and Tracking (APT) system in free space laser communication usually use four-quadrant detector (QD) to point and track the laser beam accurately. The positioning precision of QD is one of the key factors of the pointing accuracy to APT system. A positioning system is designed based on FPGA and DSP in this paper, which can realize the sampling of AD, the positioning algorithm and the control of the fast swing mirror. We analyze the positioning error of facular center calculated by universal algorithm when the facular energy obeys Gauss distribution from the working principle of QD. A database is built by calculation and simulation with MatLab software, in which the facular center calculated by universal algorithm is corresponded with the facular center of Gaussian beam, and the database is stored in two pieces of E2PROM as the external memory of DSP. The facular center of Gaussian beam is inquiry in the database on the basis of the facular center calculated by universal algorithm in DSP. The experiment results show that the positioning accuracy of the high-precision positioning system is much better than the positioning accuracy calculated by universal algorithm.
Aguirre-Gutiérrez, Jesús; Carvalheiro, Luísa G; Polce, Chiara; van Loon, E Emiel; Raes, Niels; Reemer, Menno; Biesmeijer, Jacobus C
2013-01-01
Understanding species distributions and the factors limiting them is an important topic in ecology and conservation, including in nature reserve selection and predicting climate change impacts. While Species Distribution Models (SDM) are the main tool used for these purposes, choosing the best SDM algorithm is not straightforward as these are plentiful and can be applied in many different ways. SDM are used mainly to gain insight in 1) overall species distributions, 2) their past-present-future probability of occurrence and/or 3) to understand their ecological niche limits (also referred to as ecological niche modelling). The fact that these three aims may require different models and outputs is, however, rarely considered and has not been evaluated consistently. Here we use data from a systematically sampled set of species occurrences to specifically test the performance of Species Distribution Models across several commonly used algorithms. Species range in distribution patterns from rare to common and from local to widespread. We compare overall model fit (representing species distribution), the accuracy of the predictions at multiple spatial scales, and the consistency in selection of environmental correlations all across multiple modelling runs. As expected, the choice of modelling algorithm determines model outcome. However, model quality depends not only on the algorithm, but also on the measure of model fit used and the scale at which it is used. Although model fit was higher for the consensus approach and Maxent, Maxent and GAM models were more consistent in estimating local occurrence, while RF and GBM showed higher consistency in environmental variables selection. Model outcomes diverged more for narrowly distributed species than for widespread species. We suggest that matching study aims with modelling approach is essential in Species Distribution Models, and provide suggestions how to do this for different modelling aims and species' data characteristics (i.e. sample size, spatial distribution).
NASA Astrophysics Data System (ADS)
Godsey, S. E.; Kirchner, J. W.
2008-12-01
The mean residence time - the average time that it takes rainfall to reach the stream - is a basic parameter used to characterize catchment processes. Heterogeneities in these processes lead to a distribution of travel times around the mean residence time. By examining this travel time distribution, we can better predict catchment response to contamination events. A catchment system with shorter residence times or narrower distributions will respond quickly to contamination events, whereas systems with longer residence times or longer-tailed distributions will respond more slowly to those same contamination events. The travel time distribution of a catchment is typically inferred from time series of passive tracers (e.g., water isotopes or chloride) in precipitation and streamflow. Variations in the tracer concentration in streamflow are usually damped compared to those in precipitation, because precipitation inputs from different storms (with different tracer signatures) are mixed within the catchment. Mathematically, this mixing process is represented by the convolution of the travel time distribution and the precipitation tracer inputs to generate the stream tracer outputs. Because convolution in the time domain is equivalent to multiplication in the frequency domain, it is relatively straightforward to estimate the parameters of the travel time distribution in either domain. In the time domain, the parameters describing the travel time distribution are typically estimated by maximizing the goodness of fit between the modeled and measured tracer outputs. In the frequency domain, the travel time distribution parameters can be estimated by fitting a power-law curve to the ratio of precipitation spectral power to stream spectral power. Differences between the methods of parameter estimation in the time and frequency domain mean that these two methods may respond differently to variations in data quality, record length and sampling frequency. Here we evaluate how well these two methods of travel time parameter estimation respond to different sources of uncertainty and compare the methods to one another. We do this by generating synthetic tracer input time series of different lengths, and convolve these with specified travel-time distributions to generate synthetic output time series. We then sample both the input and output time series at various sampling intervals and corrupt the time series with realistic error structures. Using these 'corrupted' time series, we infer the apparent travel time distribution, and compare it to the known distribution that was used to generate the synthetic data in the first place. This analysis allows us to quantify how different record lengths, sampling intervals, and error structures in the tracer measurements affect the apparent mean residence time and the apparent shape of the travel time distribution.
Measurement and analysis of x-ray absorption in Al and MgF2 plasmas heated by Z-pinch radiation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
MacFarlane, Joseph John; Rochau, Gregory Alan; Bailey, James E.
2005-06-01
High-power Z pinches on Sandia National Laboratories Z facility can be used in a variety of experiments to radiatively heat samples placed some distance away from the Z-pinch plasma. In such experiments, the heating radiation spectrum is influenced by both the Z-pinch emission and the re-emission of radiation from the high-Z surfaces that make up the Z-pinch diode. To test the understanding of the amplitude and spectral distribution of the heating radiation, thin foils containing both Al and MgF{sub 2} were heated by a 100-130 TW Z pinch. The heating of these samples was studied through the ionization distribution inmore » each material as measured by x-ray absorption spectra. The resulting plasma conditions are inferred from a least-squares comparison between the measured spectra and calculations of the Al and Mg 1s {yields} 2p absorption over a large range of temperatures and densities. These plasma conditions are then compared to radiation-hydrodynamics simulations of the sample dynamics and are found to agree within 1{sigma} to the best-fit conditions. This agreement indicates that both the driving radiation spectrum and the heating of the Al and MgF{sub 2} samples is understood within the accuracy of the spectroscopic method.« less
Mulware, Stephen Juma
2015-01-01
The properties of many biological materials often depend on the spatial distribution and concentration of the trace elements present in a matrix. Scientists have over the years tried various techniques including classical physical and chemical analyzing techniques each with relative level of accuracy. However, with the development of spatially sensitive submicron beams, the nuclear microprobe techniques using focused proton beams for the elemental analysis of biological materials have yielded significant success. In this paper, the basic principles of the commonly used microprobe techniques of STIM, RBS, and PIXE for trace elemental analysis are discussed. The details for sample preparation, the detection, and data collection and analysis are discussed. Finally, an application of the techniques to analysis of corn roots for elemental distribution and concentration is presented.
Prediction of Ba, Mn and Zn for tropical soils using iron oxides and magnetic susceptibility
NASA Astrophysics Data System (ADS)
Marques Júnior, José; Arantes Camargo, Livia; Reynaldo Ferracciú Alleoni, Luís; Tadeu Pereira, Gener; De Bortoli Teixeira, Daniel; Santos Rabelo de Souza Bahia, Angelica
2017-04-01
Agricultural activity is an important source of potentially toxic elements (PTEs) in soil worldwide but particularly in heavily farmed areas. Spatial distribution characterization of PTE contents in farming areas is crucial to assess further environmental impacts caused by soil contamination. Designing prediction models become quite useful to characterize the spatial variability of continuous variables, as it allows prediction of soil attributes that might be difficult to attain in a large number of samples through conventional methods. This study aimed to evaluate, in three geomorphic surfaces of Oxisols, the capacity for predicting PTEs (Ba, Mn, Zn) and their spatial variability using iron oxides and magnetic susceptibility (MS). Soil samples were collected from three geomorphic surfaces and analyzed for chemical, physical, mineralogical properties, as well as magnetic susceptibility (MS). PTE prediction models were calibrated by multiple linear regression (MLR). MLR calibration accuracy was evaluated using the coefficient of determination (R2). PTE spatial distribution maps were built using the values calculated by the calibrated models that reached the best accuracy by means of geostatistics. The high correlations between the attributes clay, MS, hematite (Hm), iron oxides extracted by sodium dithionite-citrate-bicarbonate (Fed), and iron oxides extracted using acid ammonium oxalate (Feo) with the elements Ba, Mn, and Zn enabled them to be selected as predictors for PTEs. Stepwise multiple linear regression showed that MS and Fed were the best PTE predictors individually, as they promoted no significant increase in R2 when two or more attributes were considered together. The MS-calibrated models for Ba, Mn, and Zn prediction exhibited R2 values of 0.88, 0.66, and 0.55, respectively. These are promising results since MS is a fast, cheap, and non-destructive tool, allowing the prediction of a large number of samples, which in turn enables detailed mapping of large areas. MS predicted values enabled the characterization and the understanding of spatial variability of the studied PTEs.
Accuracy assessment with complex sampling designs
Raymond L. Czaplewski
2010-01-01
A reliable accuracy assessment of remotely sensed geospatial data requires a sufficiently large probability sample of expensive reference data. Complex sampling designs reduce cost or increase precision, especially with regional, continental and global projects. The General Restriction (GR) Estimator and the Recursive Restriction (RR) Estimator separate a complex...
Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, withi...
Li, Jing; Ma, Bo; Zhang, Qi; Yang, Xiaojing; Sun, Jingjing; Tang, Bowen; Cui, Guangbo; Yao, Di; Liu, Lei; Gu, Guiying; Zhu, Jianwei; Wei, Ping; Ouyang, Pingkai
2014-11-01
A highly selective and sensitive method for simultaneous quantitation of osthole, bergapten and isopimpinellin in rat plasma and tissues was developed by liquid chromatography-tandem quadrupole mass spectrometry (LC-MS/MS). After liquid-liquid extraction of samples with methyl tert-butyl ether, the analytes and dextrorphan (internal standard, IS) were separated by a Hypersil GOLD AQ C18 column with gradient elution of acetonitrile and water containing 0.5‰ formic acid. Three determinands were detected using an electrospray ionization (ESI) tandem mass spectrometry in the multiple reaction monitoring (MRM) modes with positive electrospray ionization. Calibration curves were recovered over the concentration ranges of 1-200 ng/ml, 1-500 ng/ml, 0.25-200 ng/ml for osthole, bergapten and isopimpinellin in plasma; 1-100 ng/ml, 1-500 ng/ml, 0.5-100 ng/ml for osthole, bergapten and isopimpinellin in tissues, respectively. The intra-day precision (R.S.D.) was within 13.90% and the intra-day accuracy (R.E.) was within -6.27 to 6.84% in all biological matrixes. The inter-day precision (R.S.D.) was less than 13.66% and the inter-day accuracy (R.E.) was within -10.64 to 13.04%. Then the method was successfully applied to investigate plasma pharmacokinetic study and tissue distribution of osthole, bergapten and isopimpinellin in rats after oral administration of Fructus Cnidii extraction, especially for testis/uterus tissue distribution. The results demonstrated that osthole, bergapten and isopimpinellin were absorbed and eliminated rapidly with wide distributions in rats. Distribution data of these three bioactive components in testis/uterus tissues could offer useful information for the further preclinical and clinical studies of Fructus Cnidii in the treatment of genital system disease. Copyright © 2014 Elsevier B.V. All rights reserved.
Predicted deep-sea coral habitat suitability for the U.S. West coast.
Guinotte, John M; Davies, Andrew J
2014-01-01
Regional scale habitat suitability models provide finer scale resolution and more focused predictions of where organisms may occur. Previous modelling approaches have focused primarily on local and/or global scales, while regional scale models have been relatively few. In this study, regional scale predictive habitat models are presented for deep-sea corals for the U.S. West Coast (California, Oregon and Washington). Model results are intended to aid in future research or mapping efforts and to assess potential coral habitat suitability both within and outside existing bottom trawl closures (i.e. Essential Fish Habitat (EFH)) and identify suitable habitat within U.S. National Marine Sanctuaries (NMS). Deep-sea coral habitat suitability was modelled at 500 m×500 m spatial resolution using a range of physical, chemical and environmental variables known or thought to influence the distribution of deep-sea corals. Using a spatial partitioning cross-validation approach, maximum entropy models identified slope, temperature, salinity and depth as important predictors for most deep-sea coral taxa. Large areas of highly suitable deep-sea coral habitat were predicted both within and outside of existing bottom trawl closures and NMS boundaries. Predicted habitat suitability over regional scales are not currently able to identify coral areas with pin point accuracy and probably overpredict actual coral distribution due to model limitations and unincorporated variables (i.e. data on distribution of hard substrate) that are known to limit their distribution. Predicted habitat results should be used in conjunction with multibeam bathymetry, geological mapping and other tools to guide future research efforts to areas with the highest probability of harboring deep-sea corals. Field validation of predicted habitat is needed to quantify model accuracy, particularly in areas that have not been sampled.
Predicted Deep-Sea Coral Habitat Suitability for the U.S. West Coast
Guinotte, John M.; Davies, Andrew J.
2014-01-01
Regional scale habitat suitability models provide finer scale resolution and more focused predictions of where organisms may occur. Previous modelling approaches have focused primarily on local and/or global scales, while regional scale models have been relatively few. In this study, regional scale predictive habitat models are presented for deep-sea corals for the U.S. West Coast (California, Oregon and Washington). Model results are intended to aid in future research or mapping efforts and to assess potential coral habitat suitability both within and outside existing bottom trawl closures (i.e. Essential Fish Habitat (EFH)) and identify suitable habitat within U.S. National Marine Sanctuaries (NMS). Deep-sea coral habitat suitability was modelled at 500 m×500 m spatial resolution using a range of physical, chemical and environmental variables known or thought to influence the distribution of deep-sea corals. Using a spatial partitioning cross-validation approach, maximum entropy models identified slope, temperature, salinity and depth as important predictors for most deep-sea coral taxa. Large areas of highly suitable deep-sea coral habitat were predicted both within and outside of existing bottom trawl closures and NMS boundaries. Predicted habitat suitability over regional scales are not currently able to identify coral areas with pin point accuracy and probably overpredict actual coral distribution due to model limitations and unincorporated variables (i.e. data on distribution of hard substrate) that are known to limit their distribution. Predicted habitat results should be used in conjunction with multibeam bathymetry, geological mapping and other tools to guide future research efforts to areas with the highest probability of harboring deep-sea corals. Field validation of predicted habitat is needed to quantify model accuracy, particularly in areas that have not been sampled. PMID:24759613
NASA Astrophysics Data System (ADS)
Massey, Richard
Cropland characteristics and accurate maps of their spatial distribution are required to develop strategies for global food security by continental-scale assessments and agricultural land use policies. North America is the major producer and exporter of coarse grains, wheat, and other crops. While cropland characteristics such as crop types are available at country-scales in North America, however, at continental-scale cropland products are lacking at fine sufficient resolution such as 30m. Additionally, applications of automated, open, and rapid methods to map cropland characteristics over large areas without the need of ground samples are needed on efficient high performance computing platforms for timely and long-term cropland monitoring. In this study, I developed novel, automated, and open methods to map cropland extent, crop intensity, and crop types in the North American continent using large remote sensing datasets on high-performance computing platforms. First, a novel method was developed in this study to fuse pixel-based classification of continental-scale Landsat data using Random Forest algorithm available on Google Earth Engine cloud computing platform with an object-based classification approach, recursive hierarchical segmentation (RHSeg) to map cropland extent at continental scale. Using the fusion method, a continental-scale cropland extent map for North America at 30m spatial resolution for the nominal year 2010 was produced. In this map, the total cropland area for North America was estimated at 275.2 million hectares (Mha). This map was assessed for accuracy using randomly distributed samples derived from United States Department of Agriculture (USDA) cropland data layer (CDL), Agriculture and Agri-Food Canada (AAFC) annual crop inventory (ACI), Servicio de Informacion Agroalimentaria y Pesquera (SIAP), Mexico's agricultural boundaries, and photo-interpretation of high-resolution imagery. The overall accuracies of the map are 93.4% with a producer's accuracy for crop class at 85.4% and user's accuracy of 74.5% across the continent. The sub-country statistics including state-wise and county-wise cropland statistics derived from this map compared well in regression models resulting in R2 > 0.84. Secondly, an automated phenological pattern matching (PPM) method to efficiently map cropping intensity was also developed in this study. This study presents a continental-scale cropping intensity map for the North American continent at 250m spatial resolution for 2010. In this map, the total areas for single crop, double crop, continuous crop, and fallow were estimated to be 123.5 Mha, 11.1 Mha, 64.0 Mha, and 83.4 Mha, respectively. This map was assessed using limited country-level reference datasets derived from United States Department of Agriculture cropland data layer and Agriculture and Agri-Food Canada annual crop inventory with overall accuracies of 79.8% and 80.2%, respectively. Third, two novel and automated decision tree classification approaches to map crop types across the conterminous United States (U.S.) using MODIS 250 m resolution data: 1) generalized, and 2) year-specific classification were developed. The classification approaches use similarities and dissimilarities in crop type phenology derived from NDVI time-series data for the two approaches. Annual crop type maps were produced for 8 major crop types in the United States using the generalized classification approach for 2001-2014 and the year-specific approach for 2008, 2010, 2011 and 2012. The year-specific classification had overall accuracies greater than 78%, while the generalized classifier had accuracies greater than 75% for the conterminous U.S. for 2008, 2010, 2011, and 2012. The generalized classifier enables automated and routine crop type mapping without repeated and expensive ground sample collection year after year with overall accuracies > 70% across all independent years. Taken together, these cropland products of extent, cropping intensity, and crop types, are significantly beneficial in agricultural and water use planning and monitoring to formulate policies towards global and North American food security issues.
ERIC Educational Resources Information Center
Ikeda, Kenji; Ueno, Taiji; Ito, Yuichi; Kitagami, Shinji; Kawaguchi, Jun
2017-01-01
Humans can pronounce a nonword (e.g., rint). Some researchers have interpreted this behavior as requiring a sequential mechanism by which a grapheme-phoneme correspondence rule is applied to each grapheme in turn. However, several parallel-distributed processing (PDP) models in English have simulated human nonword reading accuracy without a…
NASA Technical Reports Server (NTRS)
Mahoney, M. J.; Ismail, S.; Browell, E. V.; Ferrare, R. A.; Kooi, S. A.; Brasseur, L.; Notari, A.; Petway, L.; Brackett, V.; Clayton, M.;
2002-01-01
LASE measures high resolution moisture, aerosol, and cloud distributions not available from conventional observations. LASE water vapor measurements were compared with dropsondes to evaluate their accuracy. LASE water vapor measurements were used to assess the capability of hurricane models to improve their track accuracy by 100 km on 3 day forecasts using Florida State University models.
Mizinga, Kemmy M; Burnett, Thomas J; Brunelle, Sharon L; Wallace, Michael A; Coleman, Mark R
2018-05-01
The U.S. Department of Agriculture, Food Safety Inspection Service regulatory method for monensin, Chemistry Laboratory Guidebook CLG-MON, is a semiquantitative bioautographic method adopted in 1991. Official Method of AnalysisSM (OMA) 2011.24, a modern quantitative and confirmatory LC-tandem MS method, uses no chlorinated solvents and has several advantages, including ease of use, ready availability of reagents and materials, shorter run-time, and higher throughput than CLG-MON. Therefore, a bridging study was conducted to support the replacement of method CLG-MON with OMA 2011.24 for regulatory use. Using fortified bovine tissue samples, CLG-MON yielded accuracies of 80-120% in 44 of the 56 samples tested (one sample had no result, six samples had accuracies of >120%, and five samples had accuracies of 40-160%), but the semiquantitative nature of CLG-MON prevented assessment of precision, whereas OMA 2011.24 had accuracies of 88-110% and RSDr of 0.00-15.6%. Incurred residue results corroborated these results, demonstrating improved accuracy (83.3-114%) and good precision (RSDr of 2.6-20.5%) for OMA 2011.24 compared with CLG-MON (accuracy generally within 80-150%, with exceptions). Furthermore, χ2 analysis revealed no statistically significant difference between the two methods. Thus, the microbiological activity of monensin correlated with the determination of monensin A in bovine tissues, and OMA 2011.24 provided improved accuracy and precision over CLG-MON.
NASA Astrophysics Data System (ADS)
Rupasinghe, P. A.; Markle, C. E.; Marcaccio, J. V.; Chow-Fraser, P.
2017-12-01
Phragmites australis (European common reed), is a relatively recent invader of wetlands and beaches in Ontario. It can establish large homogenous stands within wetlands and disperse widely throughout the landscape by wind and vehicular traffic. A first step in managing this invasive species includes accurate mapping and quantification of its distribution. This is challenging because Phragimtes is distributed in a large spatial extent, which makes the mapping more costly and time consuming. Here, we used freely available multispectral satellite images taken monthly (cloud free images as available) for the calendar year to determine the optimum phenological state of Phragmites that would allow it to be accurately identified using remote sensing data. We analyzed time series, Landsat-8 OLI and Sentinel-2 images for Big Creek Wildlife Area, ON using image classification (Support Vector Machines), Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI). We used field sampling data and high resolution image collected using Unmanned Aerial Vehicle (UAV; 8 cm spatial resolution) as training data and for the validation of the classified images. The accuracy for all land cover classes and for Phragmites alone were low at both the start and end of the calendar year, but reached overall accuracy >85% by mid to late summer. The highest classification accuracies for Landsat-8 OLI were associated with late July and early August imagery. We observed similar trends using the Sentinel-2 images, with higher overall accuracy for all land cover classes and for Phragmites alone from late July to late September. During this period, we found the greatest difference between Phragmites and Typha, commonly confused classes, with respect to near-infrared and shortwave infrared reflectance. Therefore, the unique spectral signature of Phragmites can be attributed to both the level of greenness and factors related to water content in the leaves during late summer. Landsat-8 OLI or Sentinel-2 images acquired in late summer can be used as a cost effective approach to mapping Phragmites at a large spatial scale without sacrificing accuracy.
Local indicators of geocoding accuracy (LIGA): theory and application
Jacquez, Geoffrey M; Rommel, Robert
2009-01-01
Background Although sources of positional error in geographic locations (e.g. geocoding error) used for describing and modeling spatial patterns are widely acknowledged, research on how such error impacts the statistical results has been limited. In this paper we explore techniques for quantifying the perturbability of spatial weights to different specifications of positional error. Results We find that a family of curves describes the relationship between perturbability and positional error, and use these curves to evaluate sensitivity of alternative spatial weight specifications to positional error both globally (when all locations are considered simultaneously) and locally (to identify those locations that would benefit most from increased geocoding accuracy). We evaluate the approach in simulation studies, and demonstrate it using a case-control study of bladder cancer in south-eastern Michigan. Conclusion Three results are significant. First, the shape of the probability distributions of positional error (e.g. circular, elliptical, cross) has little impact on the perturbability of spatial weights, which instead depends on the mean positional error. Second, our methodology allows researchers to evaluate the sensitivity of spatial statistics to positional accuracy for specific geographies. This has substantial practical implications since it makes possible routine sensitivity analysis of spatial statistics to positional error arising in geocoded street addresses, global positioning systems, LIDAR and other geographic data. Third, those locations with high perturbability (most sensitive to positional error) and high leverage (that contribute the most to the spatial weight being considered) will benefit the most from increased positional accuracy. These are rapidly identified using a new visualization tool we call the LIGA scatterplot. Herein lies a paradox for spatial analysis: For a given level of positional error increasing sample density to more accurately follow the underlying population distribution increases perturbability and introduces error into the spatial weights matrix. In some studies positional error may not impact the statistical results, and in others it might invalidate the results. We therefore must understand the relationships between positional accuracy and the perturbability of the spatial weights in order to have confidence in a study's results. PMID:19863795
NASA Astrophysics Data System (ADS)
Chu, Huaqiang; Liu, Fengshan; Consalvi, Jean-Louis
2014-08-01
The relationship between the spectral line based weighted-sum-of-gray-gases (SLW) model and the full-spectrum k-distribution (FSK) model in isothermal and homogeneous media is investigated in this paper. The SLW transfer equation can be derived from the FSK transfer equation expressed in the k-distribution function without approximation. It confirms that the SLW model is equivalent to the FSK model in the k-distribution function form. The numerical implementation of the SLW relies on a somewhat arbitrary discretization of the absorption cross section whereas the FSK model finds the spectrally integrated intensity by integration over the smoothly varying cumulative-k distribution function using a Gaussian quadrature scheme. The latter is therefore in general more efficient as a fewer number of gray gases is required to achieve a prescribed accuracy. Sample numerical calculations were conducted to demonstrate the different efficiency of these two methods. The FSK model is found more accurate than the SLW model in radiation transfer in H2O; however, the SLW model is more accurate in media containing CO2 as the only radiating gas due to its explicit treatment of ‘clear gas.’
Baranowski, Tom; Baranowski, Janice C; Watson, Kathleen B; Martin, Shelby; Beltran, Alicia; Islam, Noemi; Dadabhoy, Hafza; Adame, Su-heyla; Cullen, Karen; Thompson, Debbe; Buday, Richard; Subar, Amy
2011-03-01
To test the effect of image size and presence of size cues on the accuracy of portion size estimation by children. Children were randomly assigned to seeing images with or without food size cues (utensils and checked tablecloth) and were presented with sixteen food models (foods commonly eaten by children) in varying portion sizes, one at a time. They estimated each food model's portion size by selecting a digital food image. The same food images were presented in two ways: (i) as small, graduated portion size images all on one screen or (ii) by scrolling across large, graduated portion size images, one per sequential screen. Laboratory-based with computer and food models. Volunteer multi-ethnic sample of 120 children, equally distributed by gender and ages (8 to 13 years) in 2008-2009. Average percentage of correctly classified foods was 60·3 %. There were no differences in accuracy by any design factor or demographic characteristic. Multiple small pictures on the screen at once took half the time to estimate portion size compared with scrolling through large pictures. Larger pictures had more overestimation of size. Multiple images of successively larger portion sizes of a food on one computer screen facilitated quicker portion size responses with no decrease in accuracy. This is the method of choice for portion size estimation on a computer.
Därr, Roland; Kuhn, Matthias; Bode, Christoph; Bornstein, Stefan R; Pacak, Karel; Lenders, Jacques W M; Eisenhofer, Graeme
2017-06-01
To determine the accuracy of biochemical tests for the diagnosis of pheochromocytoma and paraganglioma. A search of the PubMed database was conducted for English-language articles published between October 1958 and December 2016 on the biochemical diagnosis of pheochromocytoma and paraganglioma using immunoassay methods or high-performance liquid chromatography with coulometric/electrochemical or tandem mass spectrometric detection for measurement of fractionated metanephrines in 24-h urine collections or plasma-free metanephrines obtained under seated or supine blood sampling conditions. Application of the Standards for Reporting of Diagnostic Studies Accuracy Group criteria yielded 23 suitable articles. Summary receiver operating characteristic analysis revealed sensitivities/specificities of 94/93% and 91/93% for measurement of plasma-free metanephrines and urinary fractionated metanephrines using high-performance liquid chromatography or immunoassay methods, respectively. Partial areas under the curve were 0.947 vs. 0.911. Irrespective of the analytical method, sensitivity was significantly higher for supine compared with seated sampling, 95 vs. 89% (p < 0.02), while specificity was significantly higher for supine sampling compared with 24-h urine, 95 vs. 90% (p < 0.03). Partial areas under the curve were 0.942, 0.913, and 0.932 for supine sampling, seated sampling, and urine. Test accuracy increased linearly from 90 to 93% for 24-h urine at prevalence rates of 0.0-1.0, decreased linearly from 94 to 89% for seated sampling and was constant at 95% for supine conditions. Current tests for the biochemical diagnosis of pheochromocytoma and paraganglioma show excellent diagnostic accuracy. Supine sampling conditions and measurement of plasma-free metanephrines using high-performance liquid chromatography with coulometric/electrochemical or tandem mass spectrometric detection provides the highest accuracy at all prevalence rates.
Rousselet, Jérôme; Imbert, Charles-Edouard; Dekri, Anissa; Garcia, Jacques; Goussard, Francis; Vincent, Bruno; Denux, Olivier; Robinet, Christelle; Dorkeld, Franck; Roques, Alain; Rossi, Jean-Pierre
2013-01-01
Mapping species spatial distribution using spatial inference and prediction requires a lot of data. Occurrence data are generally not easily available from the literature and are very time-consuming to collect in the field. For that reason, we designed a survey to explore to which extent large-scale databases such as Google maps and Google Street View could be used to derive valid occurrence data. We worked with the Pine Processionary Moth (PPM) Thaumetopoea pityocampa because the larvae of that moth build silk nests that are easily visible. The presence of the species at one location can therefore be inferred from visual records derived from the panoramic views available from Google Street View. We designed a standardized procedure allowing evaluating the presence of the PPM on a sampling grid covering the landscape under study. The outputs were compared to field data. We investigated two landscapes using grids of different extent and mesh size. Data derived from Google Street View were highly similar to field data in the large-scale analysis based on a square grid with a mesh of 16 km (96% of matching records). Using a 2 km mesh size led to a strong divergence between field and Google-derived data (46% of matching records). We conclude that Google database might provide useful occurrence data for mapping the distribution of species which presence can be visually evaluated such as the PPM. However, the accuracy of the output strongly depends on the spatial scales considered and on the sampling grid used. Other factors such as the coverage of Google Street View network with regards to sampling grid size and the spatial distribution of host trees with regards to road network may also be determinant.
Dekri, Anissa; Garcia, Jacques; Goussard, Francis; Vincent, Bruno; Denux, Olivier; Robinet, Christelle; Dorkeld, Franck; Roques, Alain; Rossi, Jean-Pierre
2013-01-01
Mapping species spatial distribution using spatial inference and prediction requires a lot of data. Occurrence data are generally not easily available from the literature and are very time-consuming to collect in the field. For that reason, we designed a survey to explore to which extent large-scale databases such as Google maps and Google street view could be used to derive valid occurrence data. We worked with the Pine Processionary Moth (PPM) Thaumetopoea pityocampa because the larvae of that moth build silk nests that are easily visible. The presence of the species at one location can therefore be inferred from visual records derived from the panoramic views available from Google street view. We designed a standardized procedure allowing evaluating the presence of the PPM on a sampling grid covering the landscape under study. The outputs were compared to field data. We investigated two landscapes using grids of different extent and mesh size. Data derived from Google street view were highly similar to field data in the large-scale analysis based on a square grid with a mesh of 16 km (96% of matching records). Using a 2 km mesh size led to a strong divergence between field and Google-derived data (46% of matching records). We conclude that Google database might provide useful occurrence data for mapping the distribution of species which presence can be visually evaluated such as the PPM. However, the accuracy of the output strongly depends on the spatial scales considered and on the sampling grid used. Other factors such as the coverage of Google street view network with regards to sampling grid size and the spatial distribution of host trees with regards to road network may also be determinant. PMID:24130675
Nelson, Sarah C.; Stilp, Adrienne M.; Papanicolaou, George J.; Taylor, Kent D.; Rotter, Jerome I.; Thornton, Timothy A.; Laurie, Cathy C.
2016-01-01
Imputation is commonly used in genome-wide association studies to expand the set of genetic variants available for analysis. Larger and more diverse reference panels, such as the final Phase 3 of the 1000 Genomes Project, hold promise for improving imputation accuracy in genetically diverse populations such as Hispanics/Latinos in the USA. Here, we sought to empirically evaluate imputation accuracy when imputing to a 1000 Genomes Phase 3 versus a Phase 1 reference, using participants from the Hispanic Community Health Study/Study of Latinos. Our assessments included calculating the correlation between imputed and observed allelic dosage in a subset of samples genotyped on a supplemental array. We observed that the Phase 3 reference yielded higher accuracy at rare variants, but that the two reference panels were comparable at common variants. At a sample level, the Phase 3 reference improved imputation accuracy in Hispanic/Latino samples from the Caribbean more than for Mainland samples, which we attribute primarily to the additional reference panel samples available in Phase 3. We conclude that a 1000 Genomes Project Phase 3 reference panel can yield improved imputation accuracy compared with Phase 1, particularly for rare variants and for samples of certain genetic ancestry compositions. Our findings can inform imputation design for other genome-wide association studies of participants with diverse ancestries, especially as larger and more diverse reference panels continue to become available. PMID:27346520
Turkers in Africa: A Crowdsourcing Approach to Improving Agricultural Landcover Maps
NASA Astrophysics Data System (ADS)
Estes, L. D.; Caylor, K. K.; Choi, J.
2012-12-01
In the coming decades a substantial portion of Africa is expected to be transformed to agriculture. The scale of this conversion may match or exceed that which occurred in the Brazilian Cerrado and Argentinian Pampa in recent years. Tracking the rate and extent of this conversion will depend on having an accurate baseline of the current extent of croplands. Continent-wide baseline data do exist, but the accuracy of these relatively coarse resolution, remotely sensed assessments is suspect in many regions. To develop more accurate maps of the distribution and nature of African croplands, we develop a distributed "crowdsourcing" approach that harnesses human eyeballs and image interpretation capabilities. Our initial goal is to assess the accuracy of existing agricultural land cover maps, but ultimately we aim to generate "wall-to-wall" cropland maps that can be revisited and updated to track agricultural transformation. Our approach utilizes the freely avail- able, high-resolution satellite imagery provided by Google Earth, combined with Amazon.com's Mechanical Turk platform, an online service that provides a large, global pool of workers (known as "Turkers") who perform "Human Intelligence Tasks" (HITs) for a fee. Using open-source R and python software, we select a random sample of 1 km2 cells from a grid placed over our study area, stratified by field density classes drawn from one of the coarse-scale land cover maps, and send these in batches to Mechanical Turk for processing. Each Turker is required to conduct an initial training session, on the basis of which they are assigned an accuracy score that determines whether the Turker is allowed to proceed with mapping tasks. Completed mapping tasks are automatically retrieved and processed on our server, and subject to two further quality control measures. The first of these is a measure of the spatial accuracy of Turker mapped areas compared to a "gold standard" maps from selected locations that are randomly inserted (at relatively low frequency, ˜1/100) into batches sent to Mechanical Turk. This check provides a measure of overall map accuracy, and is used to update individual Turker's accuracy scores, which is the basis for determining pay rates. The second measure compares the area of each mapped Turkers' results with the expected area derived from existing land cover data, accepting or rejecting each Turker's batch based on how closely the two distributions match, with accuracy scores adjusted accordingly. Those two checks balance the need to ensure mapping quality with the overall cost of the project. Our initial study is developed for South Africa, where an existing dataset of hand digitized fields commissioned by the South African Department of Agriculture provides our validation and gold standard data. We compare our Turker-produced results with these existing maps, and with the the coarser-scaled land cover datasets, providing insight into their relative accuracies, classified according to cropland type (e.g. small-scale/subsistence cropping; large-scale commercial farms), and provide information on the cost effectiveness of our approach.
Wogan, Guinevere O. U.
2016-01-01
A primary assumption of environmental niche models (ENMs) is that models are both accurate and transferable across geography or time; however, recent work has shown that models may be accurate but not highly transferable. While some of this is due to modeling technique, individual species ecologies may also underlie this phenomenon. Life history traits certainly influence the accuracy of predictive ENMs, but their impact on model transferability is less understood. This study investigated how life history traits influence the predictive accuracy and transferability of ENMs using historically calibrated models for birds. In this study I used historical occurrence and climate data (1950-1990s) to build models for a sample of birds, and then projected them forward to the ‘future’ (1960-1990s). The models were then validated against models generated from occurrence data at that ‘future’ time. Internal and external validation metrics, as well as metrics assessing transferability, and Generalized Linear Models were used to identify life history traits that were significant predictors of accuracy and transferability. This study found that the predictive ability of ENMs differs with regard to life history characteristics such as range, migration, and habitat, and that the rarity versus commonness of a species affects the predicted stability and overlap and hence the transferability of projected models. Projected ENMs with both high accuracy and transferability scores, still sometimes suffered from over- or under- predicted species ranges. Life history traits certainly influenced the accuracy of predictive ENMs for birds, but while aspects of geographic range impact model transferability, the mechanisms underlying this are less understood. PMID:26959979
Thaitrong, Numrin; Kim, Hanyoup; Renzi, Ronald F; Bartsch, Michael S; Meagher, Robert J; Patel, Kamlesh D
2012-12-01
We have developed an automated quality control (QC) platform for next-generation sequencing (NGS) library characterization by integrating a droplet-based digital microfluidic (DMF) system with a capillary-based reagent delivery unit and a quantitative CE module. Using an in-plane capillary-DMF interface, a prepared sample droplet was actuated into position between the ground electrode and the inlet of the separation capillary to complete the circuit for an electrokinetic injection. Using a DNA ladder as an internal standard, the CE module with a compact LIF detector was capable of detecting dsDNA in the range of 5-100 pg/μL, suitable for the amount of DNA required by the Illumina Genome Analyzer sequencing platform. This DMF-CE platform consumes tenfold less sample volume than the current Agilent BioAnalyzer QC technique, preserving precious sample while providing necessary sensitivity and accuracy for optimal sequencing performance. The ability of this microfluidic system to validate NGS library preparation was demonstrated by examining the effects of limited-cycle PCR amplification on the size distribution and the yield of Illumina-compatible libraries, demonstrating that as few as ten cycles of PCR bias the size distribution of the library toward undesirable larger fragments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
An evaluation of the ELT-8 hematology analyzer.
Raik, E; McPherson, J; Barton, L; Hewitt, B S; Powell, E G; Gordon, S
1982-04-01
The TMELT-8 Hematology Analyzer is a fully automated cell counter which utilizes laser light scattering and hydrodynamic focusing to provide an 8 parameter whole blood count. The instrument consists of a sample handler with ticket printer, and a data handler with visual display unit, It accepts 100 microliter samples of venous or capillary blood and prints the values for WCC, RCC, Hb, Hct, MCV, MCH, MCHC and platelet count on to a standard result card. All operational and quality control functions, including graphic display of relative cell size distribution, can be obtained from the visual display unit and can also be printed as a permanent record if required. In a limited evaluation of the ELT-8, precision, linearity, accuracy, lack of sample carry-over and user acceptance were excellent. Reproducible values were obtained for all parameters after overnight storage of samples. Reagent usage and running costs were lower than for the Coulter S and the Coulter S Plus. The ease of processing capillary samples was considered to be a major advantage. The histograms served to alert the operator to a number of abnormalities, some of which were clinically significant.
Hydrogen concentration analysis in clinopyroxene using proton-proton scattering analysis
NASA Astrophysics Data System (ADS)
Weis, Franz A.; Ros, Linus; Reichart, Patrick; Skogby, Henrik; Kristiansson, Per; Dollinger, Günther
2018-02-01
Traditional methods to measure water in nominally anhydrous minerals (NAMs) are, for example, Fourier transformed infrared (FTIR) spectroscopy or secondary ion mass spectrometry (SIMS). Both well-established methods provide a low detection limit as well as high spatial resolution yet may require elaborate sample orientation or destructive sample preparation. Here we analyze the water content in erupted volcanic clinopyroxene phenocrysts by proton-proton scattering and reproduce water contents measured by FTIR spectroscopy. We show that this technique provides significant advantages over other methods as it can provide a three-dimensional distribution of hydrogen within a crystal, making the identification of potential inclusions possible as well as elimination of surface contamination. The sample analysis is also independent of crystal structure and orientation and independent of matrix effects other than sample density. The results are used to validate the accuracy of wavenumber-dependent vs. mineral-specific molar absorption coefficients in FTIR spectroscopy. In addition, we present a new method for the sample preparation of very thin crystals suitable for proton-proton scattering analysis using relatively low accelerator potentials.
Allen, Y.C.; Wilson, C.A.; Roberts, H.H.; Supan, J.
2005-01-01
Sidescan sonar holds great promise as a tool to quantitatively depict the distribution and extent of benthic habitats in Louisiana's turbid estuaries. In this study, we describe an effective protocol for acoustic sampling in this environment. We also compared three methods of classification in detail: mean-based thresholding, supervised, and unsupervised techniques to classify sidescan imagery into categories of mud and shell. Classification results were compared to ground truth results using quadrat and dredge sampling. Supervised classification gave the best overall result (kappa = 75%) when compared to quadrat results. Classification accuracy was less robust when compared to all dredge samples (kappa = 21-56%), but increased greatly (90-100%) when only dredge samples taken from acoustically homogeneous areas were considered. Sidescan sonar when combined with ground truth sampling at an appropriate scale can be effectively used to establish an accurate substrate base map for both research applications and shellfish management. The sidescan imagery presented here also provides, for the first time, a detailed presentation of oyster habitat patchiness and scale in a productive oyster growing area.
Flores, Araceli V; Pérez, Carlos A; Arruda, Marco A Z
2004-02-27
In the present paper, lithium was determined in river sediment using slurry sampling and electrothermal atomic absorption spectrometry (ET AAS) after L'vov platform coating with zirconium (as a permanent chemical modifier). The performance of this modifier and its distribution on the L'vov platform after different heating cycles were evaluated using synchrotron radiation X-ray fluorescence (SRXRF) and imaging scanning electron microscopy (SEM) techniques. The analytical conditions for lithium determination in river sediment slurries were also investigated and the best conditions were obtained employing 1300 and 2300 degrees C for pyrolysis and atomization temperatures, respectively. In addition, 100mg of sediment samples were prepared using 4.0moll(-1) HNO(3). The Zr-coating permitted lithium determination with good precision and accuracy after 480 heating cycles using the same platform for slurry samples. The sediment samples were collected from five different points of the Cachoeira river, São Paulo, Brazil. The detection and quantification limits were, respectively, 0.07 and 0.23mugl(-1).
Shen, Xiaomeng; Hu, Qiang; Li, Jun; Wang, Jianmin; Qu, Jun
2015-10-02
Comprehensive and accurate evaluation of data quality and false-positive biomarker discovery is critical to direct the method development/optimization for quantitative proteomics, which nonetheless remains challenging largely due to the high complexity and unique features of proteomic data. Here we describe an experimental null (EN) method to address this need. Because the method experimentally measures the null distribution (either technical or biological replicates) using the same proteomic samples, the same procedures and the same batch as the case-vs-contol experiment, it correctly reflects the collective effects of technical variability (e.g., variation/bias in sample preparation, LC-MS analysis, and data processing) and project-specific features (e.g., characteristics of the proteome and biological variation) on the performances of quantitative analysis. To show a proof of concept, we employed the EN method to assess the quantitative accuracy and precision and the ability to quantify subtle ratio changes between groups using different experimental and data-processing approaches and in various cellular and tissue proteomes. It was found that choices of quantitative features, sample size, experimental design, data-processing strategies, and quality of chromatographic separation can profoundly affect quantitative precision and accuracy of label-free quantification. The EN method was also demonstrated as a practical tool to determine the optimal experimental parameters and rational ratio cutoff for reliable protein quantification in specific proteomic experiments, for example, to identify the necessary number of technical/biological replicates per group that affords sufficient power for discovery. Furthermore, we assessed the ability of EN method to estimate levels of false-positives in the discovery of altered proteins, using two concocted sample sets mimicking proteomic profiling using technical and biological replicates, respectively, where the true-positives/negatives are known and span a wide concentration range. It was observed that the EN method correctly reflects the null distribution in a proteomic system and accurately measures false altered proteins discovery rate (FADR). In summary, the EN method provides a straightforward, practical, and accurate alternative to statistics-based approaches for the development and evaluation of proteomic experiments and can be universally adapted to various types of quantitative techniques.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ainsworth, Nathan; Hariri, Ali; Prabakar, Kumaraguru
Power hardware-in-the-loop (PHIL) simulation, where actual hardware under text is coupled with a real-time digital model in closed loop, is a powerful tool for analyzing new methods of control for emerging distributed power systems. However, without careful design and compensation of the interface between the simulated and actual systems, PHIL simulations may exhibit instability and modeling inaccuracies. This paper addresses issues that arise in the PHIL simulation of a hardware battery inverter interfaced with a simulated distribution feeder. Both the stability and accuracy issues are modeled and characterized, and a methodology for design of PHIL interface compensation to ensure stabilitymore » and accuracy is presented. The stability and accuracy of the resulting compensated PHIL simulation is then shown by experiment.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prabakar, Kumaraguru; Ainsworth, Nathan; Pratt, Annabelle
Power hardware-in-the-loop (PHIL) simulation, where actual hardware under text is coupled with a real-time digital model in closed loop, is a powerful tool for analyzing new methods of control for emerging distributed power systems. However, without careful design and compensation of the interface between the simulated and actual systems, PHIL simulations may exhibit instability and modeling inaccuracies. This paper addresses issues that arise in the PHIL simulation of a hardware battery inverter interfaced with a simulated distribution feeder. Both the stability and accuracy issues are modeled and characterized, and a methodology for design of PHIL interface compensation to ensure stabilitymore » and accuracy is presented. The stability and accuracy of the resulting compensated PHIL simulation is then shown by experiment.« less
Plotnikov, Nikolay V
2014-08-12
Proposed in this contribution is a protocol for calculating fine-physics (e.g., ab initio QM/MM) free-energy surfaces at a high level of accuracy locally (e.g., only at reactants and at the transition state for computing the activation barrier) from targeted fine-physics sampling and extensive exploratory coarse-physics sampling. The full free-energy surface is still computed but at a lower level of accuracy from coarse-physics sampling. The method is analytically derived in terms of the umbrella sampling and the free-energy perturbation methods which are combined with the thermodynamic cycle and the targeted sampling strategy of the paradynamics approach. The algorithm starts by computing low-accuracy fine-physics free-energy surfaces from the coarse-physics sampling in order to identify the reaction path and to select regions for targeted sampling. Thus, the algorithm does not rely on the coarse-physics minimum free-energy reaction path. Next, segments of high-accuracy free-energy surface are computed locally at selected regions from the targeted fine-physics sampling and are positioned relative to the coarse-physics free-energy shifts. The positioning is done by averaging the free-energy perturbations computed with multistep linear response approximation method. This method is analytically shown to provide results of the thermodynamic integration and the free-energy interpolation methods, while being extremely simple in implementation. Incorporating the metadynamics sampling to the algorithm is also briefly outlined. The application is demonstrated by calculating the B3LYP//6-31G*/MM free-energy barrier for an enzymatic reaction using a semiempirical PM6/MM reference potential. These modifications allow computing the activation free energies at a significantly reduced computational cost but at the same level of accuracy compared to computing full potential of mean force.
2015-01-01
Proposed in this contribution is a protocol for calculating fine-physics (e.g., ab initio QM/MM) free-energy surfaces at a high level of accuracy locally (e.g., only at reactants and at the transition state for computing the activation barrier) from targeted fine-physics sampling and extensive exploratory coarse-physics sampling. The full free-energy surface is still computed but at a lower level of accuracy from coarse-physics sampling. The method is analytically derived in terms of the umbrella sampling and the free-energy perturbation methods which are combined with the thermodynamic cycle and the targeted sampling strategy of the paradynamics approach. The algorithm starts by computing low-accuracy fine-physics free-energy surfaces from the coarse-physics sampling in order to identify the reaction path and to select regions for targeted sampling. Thus, the algorithm does not rely on the coarse-physics minimum free-energy reaction path. Next, segments of high-accuracy free-energy surface are computed locally at selected regions from the targeted fine-physics sampling and are positioned relative to the coarse-physics free-energy shifts. The positioning is done by averaging the free-energy perturbations computed with multistep linear response approximation method. This method is analytically shown to provide results of the thermodynamic integration and the free-energy interpolation methods, while being extremely simple in implementation. Incorporating the metadynamics sampling to the algorithm is also briefly outlined. The application is demonstrated by calculating the B3LYP//6-31G*/MM free-energy barrier for an enzymatic reaction using a semiempirical PM6/MM reference potential. These modifications allow computing the activation free energies at a significantly reduced computational cost but at the same level of accuracy compared to computing full potential of mean force. PMID:25136268
Doble, Brett; Lorgelly, Paula
2016-04-01
To determine the external validity of existing mapping algorithms for predicting EQ-5D-3L utility values from EORTC QLQ-C30 responses and to establish their generalizability in different types of cancer. A main analysis (pooled) sample of 3560 observations (1727 patients) and two disease severity patient samples (496 and 93 patients) with repeated observations over time from Cancer 2015 were used to validate the existing algorithms. Errors were calculated between observed and predicted EQ-5D-3L utility values using a single pooled sample and ten pooled tumour type-specific samples. Predictive accuracy was assessed using mean absolute error (MAE) and standardized root-mean-squared error (RMSE). The association between observed and predicted EQ-5D utility values and other covariates across the distribution was tested using quantile regression. Quality-adjusted life years (QALYs) were calculated using observed and predicted values to test responsiveness. Ten 'preferred' mapping algorithms were identified. Two algorithms estimated via response mapping and ordinary least-squares regression using dummy variables performed well on number of validation criteria, including accurate prediction of the best and worst QLQ-C30 health states, predicted values within the EQ-5D tariff range, relatively small MAEs and RMSEs, and minimal differences between estimated QALYs. Comparison of predictive accuracy across ten tumour type-specific samples highlighted that algorithms are relatively insensitive to grouping by tumour type and affected more by differences in disease severity. Two of the 'preferred' mapping algorithms suggest more accurate predictions, but limitations exist. We recommend extensive scenario analyses if mapped utilities are used in cost-utility analyses.
Wang, Junxiao; Wang, Xiaorui; Zhou, Shenglu; Wu, Shaohua; Zhu, Yan; Lu, Chunfeng
2016-01-01
With China’s rapid economic development, the reduction in arable land has emerged as one of the most prominent problems in the nation. The long-term dynamic monitoring of arable land quality is important for protecting arable land resources. An efficient practice is to select optimal sample points while obtaining accurate predictions. To this end, the selection of effective points from a dense set of soil sample points is an urgent problem. In this study, data were collected from Donghai County, Jiangsu Province, China. The number and layout of soil sample points are optimized by considering the spatial variations in soil properties and by using an improved simulated annealing (SA) algorithm. The conclusions are as follows: (1) Optimization results in the retention of more sample points in the moderate- and high-variation partitions of the study area; (2) The number of optimal sample points obtained with the improved SA algorithm is markedly reduced, while the accuracy of the predicted soil properties is improved by approximately 5% compared with the raw data; (3) With regard to the monitoring of arable land quality, a dense distribution of sample points is needed to monitor the granularity. PMID:27706051
Evaluation and optimization of sampling errors for the Monte Carlo Independent Column Approximation
NASA Astrophysics Data System (ADS)
Räisänen, Petri; Barker, W. Howard
2004-07-01
The Monte Carlo Independent Column Approximation (McICA) method for computing domain-average broadband radiative fluxes is unbiased with respect to the full ICA, but its flux estimates contain conditional random noise. McICA's sampling errors are evaluated here using a global climate model (GCM) dataset and a correlated-k distribution (CKD) radiation scheme. Two approaches to reduce McICA's sampling variance are discussed. The first is to simply restrict all of McICA's samples to cloudy regions. This avoids wasting precious few samples on essentially homogeneous clear skies. Clear-sky fluxes need to be computed separately for this approach, but this is usually done in GCMs for diagnostic purposes anyway. Second, accuracy can be improved by repeated sampling, and averaging those CKD terms with large cloud radiative effects. Although this naturally increases computational costs over the standard CKD model, random errors for fluxes and heating rates are reduced by typically 50% to 60%, for the present radiation code, when the total number of samples is increased by 50%. When both variance reduction techniques are applied simultaneously, globally averaged flux and heating rate random errors are reduced by a factor of #3.
Radial q-space sampling for DSI
Baete, Steven H.; Yutzy, Stephen; Boada, Fernando, E.
2015-01-01
Purpose Diffusion Spectrum Imaging (DSI) has been shown to be an effective tool for non-invasively depicting the anatomical details of brain microstructure. Existing implementations of DSI sample the diffusion encoding space using a rectangular grid. Here we present a different implementation of DSI whereby a radially symmetric q-space sampling scheme for DSI (RDSI) is used to improve the angular resolution and accuracy of the reconstructed Orientation Distribution Functions (ODF). Methods Q-space is sampled by acquiring several q-space samples along a number of radial lines. Each of these radial lines in q-space is analytically connected to a value of the ODF at the same angular location by the Fourier slice theorem. Results Computer simulations and in vivo brain results demonstrate that RDSI correctly estimates the ODF when moderately high b-values (4000 s/mm2) and number of q-space samples (236) are used. Conclusion The nominal angular resolution of RDSI depends on the number of radial lines used in the sampling scheme, and only weakly on the maximum b-value. In addition, the radial analytical reconstruction reduces truncation artifacts which affect Cartesian reconstructions. Hence, a radial acquisition of q-space can be favorable for DSI. PMID:26363002
Yao, Rongjiang; Yang, Jingsong; Wu, Danhua; Xie, Wenping; Gao, Peng; Jin, Wenhui
2016-01-01
Reliable and real-time information on soil and crop properties is important for the development of management practices in accordance with the requirements of a specific soil and crop within individual field units. This is particularly the case in salt-affected agricultural landscape where managing the spatial variability of soil salinity is essential to minimize salinization and maximize crop output. The primary objectives were to use linear mixed-effects model for soil salinity and crop yield calibration with horizontal and vertical electromagnetic induction (EMI) measurements as ancillary data, to characterize the spatial distribution of soil salinity and crop yield and to verify the accuracy of spatial estimation. Horizontal and vertical EMI (type EM38) measurements at 252 locations were made during each survey, and root zone soil samples and crop samples at 64 sampling sites were collected. This work was periodically conducted on eight dates from June 2012 to May 2013 in a coastal salt-affected mud farmland. Multiple linear regression (MLR) and restricted maximum likelihood (REML) were applied to calibrate root zone soil salinity (ECe) and crop annual output (CAO) using ancillary data, and spatial distribution of soil ECe and CAO was generated using digital soil mapping (DSM) and the precision of spatial estimation was examined using the collected meteorological and groundwater data. Results indicated that a reduced model with EMh as a predictor was satisfactory for root zone ECe calibration, whereas a full model with both EMh and EMv as predictors met the requirement of CAO calibration. The obtained distribution maps of ECe showed consistency with those of EMI measurements at the corresponding time, and the spatial distribution of CAO generated from ancillary data showed agreement with that derived from raw crop data. Statistics of jackknifing procedure confirmed that the spatial estimation of ECe and CAO exhibited reliability and high accuracy. A general increasing trend of ECe was observed and moderately saline and very saline soils were predominant during the survey period. The temporal dynamics of root zone ECe coincided with those of daily rainfall, water table and groundwater data. Long-range EMI surveys and data collection are needed to capture the spatial and temporal variability of soil and crop parameters. Such results allowed us to conclude that, cost-effective and efficient EMI surveys, as one part of multi-source data for DSM, could be successfully used to characterize the spatial variability of soil salinity, to monitor the spatial and temporal dynamics of soil salinity, and to spatially estimate potential crop yield. PMID:27203697
Yao, Rongjiang; Yang, Jingsong; Wu, Danhua; Xie, Wenping; Gao, Peng; Jin, Wenhui
2016-01-01
Reliable and real-time information on soil and crop properties is important for the development of management practices in accordance with the requirements of a specific soil and crop within individual field units. This is particularly the case in salt-affected agricultural landscape where managing the spatial variability of soil salinity is essential to minimize salinization and maximize crop output. The primary objectives were to use linear mixed-effects model for soil salinity and crop yield calibration with horizontal and vertical electromagnetic induction (EMI) measurements as ancillary data, to characterize the spatial distribution of soil salinity and crop yield and to verify the accuracy of spatial estimation. Horizontal and vertical EMI (type EM38) measurements at 252 locations were made during each survey, and root zone soil samples and crop samples at 64 sampling sites were collected. This work was periodically conducted on eight dates from June 2012 to May 2013 in a coastal salt-affected mud farmland. Multiple linear regression (MLR) and restricted maximum likelihood (REML) were applied to calibrate root zone soil salinity (ECe) and crop annual output (CAO) using ancillary data, and spatial distribution of soil ECe and CAO was generated using digital soil mapping (DSM) and the precision of spatial estimation was examined using the collected meteorological and groundwater data. Results indicated that a reduced model with EMh as a predictor was satisfactory for root zone ECe calibration, whereas a full model with both EMh and EMv as predictors met the requirement of CAO calibration. The obtained distribution maps of ECe showed consistency with those of EMI measurements at the corresponding time, and the spatial distribution of CAO generated from ancillary data showed agreement with that derived from raw crop data. Statistics of jackknifing procedure confirmed that the spatial estimation of ECe and CAO exhibited reliability and high accuracy. A general increasing trend of ECe was observed and moderately saline and very saline soils were predominant during the survey period. The temporal dynamics of root zone ECe coincided with those of daily rainfall, water table and groundwater data. Long-range EMI surveys and data collection are needed to capture the spatial and temporal variability of soil and crop parameters. Such results allowed us to conclude that, cost-effective and efficient EMI surveys, as one part of multi-source data for DSM, could be successfully used to characterize the spatial variability of soil salinity, to monitor the spatial and temporal dynamics of soil salinity, and to spatially estimate potential crop yield.
Evaluating performance of stormwater sampling approaches using a dynamic watershed model.
Ackerman, Drew; Stein, Eric D; Ritter, Kerry J
2011-09-01
Accurate quantification of stormwater pollutant levels is essential for estimating overall contaminant discharge to receiving waters. Numerous sampling approaches exist that attempt to balance accuracy against the costs associated with the sampling method. This study employs a novel and practical approach of evaluating the accuracy of different stormwater monitoring methodologies using stormflows and constituent concentrations produced by a fully validated continuous simulation watershed model. A major advantage of using a watershed model to simulate pollutant concentrations is that a large number of storms representing a broad range of conditions can be applied in testing the various sampling approaches. Seventy-eight distinct methodologies were evaluated by "virtual samplings" of 166 simulated storms of varying size, intensity and duration, representing 14 years of storms in Ballona Creek near Los Angeles, California. The 78 methods can be grouped into four general strategies: volume-paced compositing, time-paced compositing, pollutograph sampling, and microsampling. The performances of each sampling strategy was evaluated by comparing the (1) median relative error between the virtually sampled and the true modeled event mean concentration (EMC) of each storm (accuracy), (2) median absolute deviation about the median or "MAD" of the relative error or (precision), and (3) the percentage of storms where sampling methods were within 10% of the true EMC (combined measures of accuracy and precision). Finally, costs associated with site setup, sampling, and laboratory analysis were estimated for each method. Pollutograph sampling consistently outperformed the other three methods both in terms of accuracy and precision, but was the most costly method evaluated. Time-paced sampling consistently underestimated while volume-paced sampling over estimated the storm EMCs. Microsampling performance approached that of pollutograph sampling at a substantial cost savings. The most efficient method for routine stormwater monitoring in terms of a balance between performance and cost was volume-paced microsampling, with variable sample pacing to ensure that the entirety of the storm was captured. Pollutograph sampling is recommended if the data are to be used for detailed analysis of runoff dynamics.
A Metastatistical Approach to Satellite Estimates of Extreme Rainfall Events
NASA Astrophysics Data System (ADS)
Zorzetto, E.; Marani, M.
2017-12-01
The estimation of the average recurrence interval of intense rainfall events is a central issue for both hydrologic modeling and engineering design. These estimates require the inference of the properties of the right tail of the statistical distribution of precipitation, a task often performed using the Generalized Extreme Value (GEV) distribution, estimated either from a samples of annual maxima (AM) or with a peaks over threshold (POT) approach. However, these approaches require long and homogeneous rainfall records, which often are not available, especially in the case of remote-sensed rainfall datasets. We use here, and tailor it to remotely-sensed rainfall estimates, an alternative approach, based on the metastatistical extreme value distribution (MEVD), which produces estimates of rainfall extreme values based on the probability distribution function (pdf) of all measured `ordinary' rainfall event. This methodology also accounts for the interannual variations observed in the pdf of daily rainfall by integrating over the sample space of its random parameters. We illustrate the application of this framework to the TRMM Multi-satellite Precipitation Analysis rainfall dataset, where MEVD optimally exploits the relatively short datasets of satellite-sensed rainfall, while taking full advantage of its high spatial resolution and quasi-global coverage. Accuracy of TRMM precipitation estimates and scale issues are here investigated for a case study located in the Little Washita watershed, Oklahoma, using a dense network of rain gauges for independent ground validation. The methodology contributes to our understanding of the risk of extreme rainfall events, as it allows i) an optimal use of the TRMM datasets in estimating the tail of the probability distribution of daily rainfall, and ii) a global mapping of daily rainfall extremes and distributional tail properties, bridging the existing gaps in rain gauges networks.
NASA Astrophysics Data System (ADS)
Choi, A.; Heymans, C.; Blake, C.; Hildebrandt, H.; Duncan, C. A. J.; Erben, T.; Nakajima, R.; Van Waerbeke, L.; Viola, M.
2016-12-01
We determine the accuracy of galaxy redshift distributions as estimated from photometric redshift probability distributions p(z). Our method utilizes measurements of the angular cross-correlation between photometric galaxies and an overlapping sample of galaxies with spectroscopic redshifts. We describe the redshift leakage from a galaxy photometric redshift bin j into a spectroscopic redshift bin I using the sum of the p(z) for the galaxies residing in bin j. We can then predict the angular cross-correlation between photometric and spectroscopic galaxies due to intrinsic galaxy clustering when I ≠ j as a function of the measured angular cross-correlation when I = j. We also account for enhanced clustering arising from lensing magnification using a halo model. The comparison of this prediction with the measured signal provides a consistency check on the validity of using the summed p(z) to determine galaxy redshift distributions in cosmological analyses, as advocated by the Canada-France-Hawaii Telescope Lensing Survey (CFHTLenS). We present an analysis of the photometric redshifts measured by CFHTLenS, which overlaps the Baryon Oscillation Spectroscopic Survey (BOSS). We also analyse the Red-sequence Cluster Lensing Survey, which overlaps both BOSS and the WiggleZ Dark Energy Survey. We find that the summed p(z) from both surveys are generally biased with respect to the true underlying distributions. If unaccounted for, this bias would lead to errors in cosmological parameter estimation from CFHTLenS by less than ˜4 per cent. For photometric redshift bins which spatially overlap in 3D with our spectroscopic sample, we determine redshift bias corrections which can be used in future cosmological analyses that rely on accurate galaxy redshift distributions.
Millard, Pierre; Massou, Stéphane; Portais, Jean-Charles; Létisse, Fabien
2014-10-21
Mass spectrometry (MS) is widely used for isotopic studies of metabolism in which detailed information about biochemical processes is obtained from the analysis of isotope incorporation into metabolites. The biological value of such experiments is dependent on the accuracy of the isotopic measurements. Using MS, isotopologue distributions are measured from the quantitative analysis of isotopic clusters. These measurements are prone to various biases, which can occur during the experimental workflow and/or MS analysis. The lack of relevant standards limits investigations of the quality of the measured isotopologue distributions. To meet that need, we developed a complete theoretical and experimental framework for the biological production of metabolites with fully controlled and predictable labeling patterns. This strategy is valid for different isotopes and different types of metabolisms and organisms, and was applied to two model microorganisms, Pichia augusta and Escherichia coli, cultivated on (13)C-labeled methanol and acetate as sole carbon source, respectively. The isotopic composition of the substrates was designed to obtain samples in which the isotopologue distribution of all the metabolites should give the binomial coefficients found in Pascal's triangle. The strategy was validated on a liquid chromatography-tandem mass spectrometry (LC-MS/MS) platform by quantifying the complete isotopologue distributions of different intracellular metabolites, which were in close agreement with predictions. This strategy can be used to evaluate entire experimental workflows (from sampling to data processing) or different analytical platforms in the context of isotope labeling experiments.
ERIC Educational Resources Information Center
Pfaffel, Andreas; Spiel, Christiane
2016-01-01
Approaches to correcting correlation coefficients for range restriction have been developed under the framework of large sample theory. The accuracy of missing data techniques for correcting correlation coefficients for range restriction has thus far only been investigated with relatively large samples. However, researchers and evaluators are…
A Novel Energy-Efficient Approach for Human Activity Recognition
Zheng, Lingxiang; Wu, Dihong; Ruan, Xiaoyang; Weng, Shaolin; Tang, Biyu; Lu, Hai; Shi, Haibin
2017-01-01
In this paper, we propose a novel energy-efficient approach for mobile activity recognition system (ARS) to detect human activities. The proposed energy-efficient ARS, using low sampling rates, can achieve high recognition accuracy and low energy consumption. A novel classifier that integrates hierarchical support vector machine and context-based classification (HSVMCC) is presented to achieve a high accuracy of activity recognition when the sampling rate is less than the activity frequency, i.e., the Nyquist sampling theorem is not satisfied. We tested the proposed energy-efficient approach with the data collected from 20 volunteers (14 males and six females) and the average recognition accuracy of around 96.0% was achieved. Results show that using a low sampling rate of 1Hz can save 17.3% and 59.6% of energy compared with the sampling rates of 5 Hz and 50 Hz. The proposed low sampling rate approach can greatly reduce the power consumption while maintaining high activity recognition accuracy. The composition of power consumption in online ARS is also investigated in this paper. PMID:28885560
Lovett, M W
1984-05-01
Children referred with specific reading dysfunction were subtyped as accuracy disabled or rate disabled according to criteria developed from an information processing model of reading skill. Multiple measures of oral and written language development were compared for two subtyped samples matched on age, sex, and IQ. The two samples were comparable in reading fluency, reading comprehension, word knowledge, and word retrieval functions. Accuracy disabled readers demonstrated inferior decoding and spelling skills. The accuracy disabled sample proved deficient in their understanding of oral language structure and in their ability to associate unfamiliar pseudowords and novel symbols in a task designed to simulate some of the learning involved in initial reading acquisition. It was suggested that these two samples of disabled readers may be best described with respect to their relative standing along a theoretical continuum of normal reading development.
Power calculation for comparing diagnostic accuracies in a multi-reader, multi-test design.
Kim, Eunhee; Zhang, Zheng; Wang, Youdan; Zeng, Donglin
2014-12-01
Receiver operating characteristic (ROC) analysis is widely used to evaluate the performance of diagnostic tests with continuous or ordinal responses. A popular study design for assessing the accuracy of diagnostic tests involves multiple readers interpreting multiple diagnostic test results, called the multi-reader, multi-test design. Although several different approaches to analyzing data from this design exist, few methods have discussed the sample size and power issues. In this article, we develop a power formula to compare the correlated areas under the ROC curves (AUC) in a multi-reader, multi-test design. We present a nonparametric approach to estimate and compare the correlated AUCs by extending DeLong et al.'s (1988, Biometrics 44, 837-845) approach. A power formula is derived based on the asymptotic distribution of the nonparametric AUCs. Simulation studies are conducted to demonstrate the performance of the proposed power formula and an example is provided to illustrate the proposed procedure. © 2014, The International Biometric Society.
Newborn screening healthcare information system based on service-oriented architecture.
Hsieh, Sung-Huai; Hsieh, Sheau-Ling; Chien, Yin-Hsiu; Weng, Yung-Ching; Hsu, Kai-Ping; Chen, Chi-Huang; Tu, Chien-Ming; Wang, Zhenyu; Lai, Feipei
2010-08-01
In this paper, we established a newborn screening system under the HL7/Web Services frameworks. We rebuilt the NTUH Newborn Screening Laboratory's original standalone architecture, having various heterogeneous systems operating individually, and restructured it into a Service-Oriented Architecture (SOA), distributed platform for further integrity and enhancements of sample collections, testing, diagnoses, evaluations, treatments or follow-up services, screening database management, as well as collaboration, communication among hospitals; decision supports and improving screening accuracy over the Taiwan neonatal systems are also addressed. In addition, the new system not only integrates the newborn screening procedures among phlebotomy clinics, referral hospitals, as well as the newborn screening center in Taiwan, but also introduces new models of screening procedures for the associated, medical practitioners. Furthermore, it reduces the burden of manual operations, especially the reporting services, those were heavily dependent upon previously. The new system can accelerate the whole procedures effectively and efficiently. It improves the accuracy and the reliability of the screening by ensuring the quality control during the processing as well.
Aldhous, Marian C; Abu Bakar, Suhaili; Prescott, Natalie J; Palla, Raquel; Soo, Kimberley; Mansfield, John C; Mathew, Christopher G; Satsangi, Jack; Armour, John A L
2010-12-15
The copy number variation in beta-defensin genes on human chromosome 8 has been proposed to underlie susceptibility to inflammatory disorders, but presents considerable challenges for accurate typing on the scale required for adequately powered case-control studies. In this work, we have used accurate methods of copy number typing based on the paralogue ratio test (PRT) to assess beta-defensin copy number in more than 1500 UK DNA samples including more than 1000 cases of Crohn's disease. A subset of 625 samples was typed using both PRT-based methods and standard real-time PCR methods, from which direct comparisons highlight potentially serious shortcomings of a real-time PCR assay for typing this variant. Comparing our PRT-based results with two previous studies based only on real-time PCR, we find no evidence to support the reported association of Crohn's disease with either low or high beta-defensin copy number; furthermore, it is noteworthy that there are disagreements between different studies on the observed frequency distribution of copy number states among European controls. We suggest safeguards to be adopted in assessing and reporting the accuracy of copy number measurement, with particular emphasis on integer clustering of results, to avoid reporting of spurious associations in future case-control studies.
Selecting the optimum plot size for a California design-based stream and wetland mapping program.
Lackey, Leila G; Stein, Eric D
2014-04-01
Accurate estimates of the extent and distribution of wetlands and streams are the foundation of wetland monitoring, management, restoration, and regulatory programs. Traditionally, these estimates have relied on comprehensive mapping. However, this approach is prohibitively resource-intensive over large areas, making it both impractical and statistically unreliable. Probabilistic (design-based) approaches to evaluating status and trends provide a more cost-effective alternative because, compared with comprehensive mapping, overall extent is inferred from mapping a statistically representative, randomly selected subset of the target area. In this type of design, the size of sample plots has a significant impact on program costs and on statistical precision and accuracy; however, no consensus exists on the appropriate plot size for remote monitoring of stream and wetland extent. This study utilized simulated sampling to assess the performance of four plot sizes (1, 4, 9, and 16 km(2)) for three geographic regions of California. Simulation results showed smaller plot sizes (1 and 4 km(2)) were most efficient for achieving desired levels of statistical accuracy and precision. However, larger plot sizes were more likely to contain rare and spatially limited wetland subtypes. Balancing these considerations led to selection of 4 km(2) for the California status and trends program.
Decomposing ADHD-Related Effects in Response Speed and Variability
Karalunas, Sarah L.; Huang-Pollock, Cynthia L.; Nigg, Joel T.
2012-01-01
Objective Slow and variable reaction times (RTs) on fast tasks are such a prominent feature of Attention Deficit Hyperactivity Disorder (ADHD) that any theory must account for them. However, this has proven difficult because the cognitive mechanisms responsible for this effect remain unexplained. Although speed and variability are typically correlated, it is unclear whether single or multiple mechanisms are responsible for group differences in each. RTs are a result of several semi-independent processes, including stimulus encoding, rate of information processing, speed-accuracy trade-offs, and motor response, which have not been previously well characterized. Method A diffusion model was applied to RTs from a forced-choice RT paradigm in two large, independent case-control samples (NCohort 1= 214 and N Cohort 2=172). The decomposition measured three validated parameters that account for the full RT distribution, and assessed reproducibility of ADHD effects. Results In both samples, group differences in traditional RT variables were explained by slow information processing speed, and unrelated to speed-accuracy trade-offs or non-decisional processes (e.g. encoding, motor response). Conclusions RT speed and variability in ADHD may be explained by a single information processing parameter, potentially simplifying explanations that assume different mechanisms are required to account for group differences in the mean and variability of RTs. PMID:23106115
Improved electron probe microanalysis of trace elements in quartz
Donovan, John J.; Lowers, Heather; Rusk, Brian G.
2011-01-01
Quartz occurs in a wide range of geologic environments throughout the Earth's crust. The concentration and distribution of trace elements in quartz provide information such as temperature and other physical conditions of formation. Trace element analyses with modern electron-probe microanalysis (EPMA) instruments can achieve 99% confidence detection of ~100 ppm with fairly minimal effort for many elements in samples of low to moderate average atomic number such as many common oxides and silicates. However, trace element measurements below 100 ppm in many materials are limited, not only by the precision of the background measurement, but also by the accuracy with which background levels are determined. A new "blank" correction algorithm has been developed and tested on both Cameca and JEOL instruments, which applies a quantitative correction to the emitted X-ray intensities during the iteration of the sample matrix correction based on a zero level (or known trace) abundance calibration standard. This iterated blank correction, when combined with improved background fit models, and an "aggregate" intensity calculation utilizing multiple spectrometer intensities in software for greater geometric efficiency, yields a detection limit of 2 to 3 ppm for Ti and 6 to 7 ppm for Al in quartz at 99% t-test confidence with similar levels for absolute accuracy.
Aldhous, Marian C.; Abu Bakar, Suhaili; Prescott, Natalie J.; Palla, Raquel; Soo, Kimberley; Mansfield, John C.; Mathew, Christopher G.; Satsangi, Jack; Armour, John A.L.
2010-01-01
The copy number variation in beta-defensin genes on human chromosome 8 has been proposed to underlie susceptibility to inflammatory disorders, but presents considerable challenges for accurate typing on the scale required for adequately powered case–control studies. In this work, we have used accurate methods of copy number typing based on the paralogue ratio test (PRT) to assess beta-defensin copy number in more than 1500 UK DNA samples including more than 1000 cases of Crohn's disease. A subset of 625 samples was typed using both PRT-based methods and standard real-time PCR methods, from which direct comparisons highlight potentially serious shortcomings of a real-time PCR assay for typing this variant. Comparing our PRT-based results with two previous studies based only on real-time PCR, we find no evidence to support the reported association of Crohn's disease with either low or high beta-defensin copy number; furthermore, it is noteworthy that there are disagreements between different studies on the observed frequency distribution of copy number states among European controls. We suggest safeguards to be adopted in assessing and reporting the accuracy of copy number measurement, with particular emphasis on integer clustering of results, to avoid reporting of spurious associations in future case–control studies. PMID:20858604
Evaluation of Fiber Bragg Grating and Distributed Optical Fiber Temperature Sensors
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCary, Kelly Marie
Fiber optic temperature sensors were evaluated in the High Temperature Test Lab (HTTL) to determine the accuracy of the measurements at various temperatures. A distributed temperature sensor was evaluated up to 550C and a fiber Bragg grating sensor was evaluated up to 750C. HTTL measurements indicate that there is a drift in fiber Bragg sensor over time of approximately -10C with higher accuracy at temperatures above 300C. The distributed sensor produced some bad data points at and above 500C but produced measurements with less than 2% error at increasing temperatures up to 400C
NASA Astrophysics Data System (ADS)
Ma, Qian; Xia, Houping; Xu, Qiang; Zhao, Lei
2018-05-01
A new method combining Tikhonov regularization and kernel matrix optimization by multi-wavelength incidence is proposed for retrieving particle size distribution (PSD) in an independent model with improved accuracy and stability. In comparison to individual regularization or multi-wavelength least squares, the proposed method exhibited better anti-noise capability, higher accuracy and stability. While standard regularization typically makes use of the unit matrix, it is not universal for different PSDs, particularly for Junge distributions. Thus, a suitable regularization matrix was chosen by numerical simulation, with the second-order differential matrix found to be appropriate for most PSD types.
NASA Astrophysics Data System (ADS)
Sun, Aihui; Tian, Xiaolin; Kong, Yan; Jiang, Zhilong; Liu, Fei; Xue, Liang; Wang, Shouyu; Liu, Cheng
2018-01-01
As a lensfree imaging technique, ptychographic iterative engine (PIE) method can provide both quantitative sample amplitude and phase distributions avoiding aberration. However, it requires field of view (FoV) scanning often relying on mechanical translation, which not only slows down measuring speed, but also introduces mechanical errors decreasing both resolution and accuracy in retrieved information. In order to achieve high-accurate quantitative imaging with fast speed, digital micromirror device (DMD) is adopted in PIE for large FoV scanning controlled by on/off state coding by DMD. Measurements were implemented using biological samples as well as USAF resolution target, proving high resolution in quantitative imaging using the proposed system. Considering its fast and accurate imaging capability, it is believed the DMD based PIE technique provides a potential solution for medical observation and measurements.
3D resolved mapping of optical aberrations in thick tissues
Zeng, Jun; Mahou, Pierre; Schanne-Klein, Marie-Claire; Beaurepaire, Emmanuel; Débarre, Delphine
2012-01-01
We demonstrate a simple method for mapping optical aberrations with 3D resolution within thick samples. The method relies on the local measurement of the variation in image quality with externally applied aberrations. We discuss the accuracy of the method as a function of the signal strength and of the aberration amplitude and we derive the achievable resolution for the resulting measurements. We then report on measured 3D aberration maps in human skin biopsies and mouse brain slices. From these data, we analyse the consequences of tissue structure and refractive index distribution on aberrations and imaging depth in normal and cleared tissue samples. The aberration maps allow the estimation of the typical aplanetism region size over which aberrations can be uniformly corrected. This method and data pave the way towards efficient correction strategies for tissue imaging applications. PMID:22876353
Liu, An; Wijesiri, Buddhi; Hong, Nian; Zhu, Panfeng; Egodawatta, Prasanna; Goonetilleke, Ashantha
2018-05-08
Road deposited pollutants (build-up) are continuously re-distributed by external factors such as traffic and wind turbulence, influencing stormwater runoff quality. However, current stormwater quality modelling approaches do not account for the re-distribution of pollutants. This undermines the accuracy of stormwater quality predictions, constraining the design of effective stormwater treatment measures. This study, using over 1000 data points, developed a Bayesian Network modelling approach to investigate the re-distribution of pollutant build-up on urban road surfaces. BTEX, which are a group of highly toxic pollutants, was the case study pollutants. Build-up sampling was undertaken in Shenzhen, China, using a dry and wet vacuuming method. The research outcomes confirmed that the vehicle type and particle size significantly influence the re-distribution of particle-bound BTEX. Compared to heavy-duty traffic in commercial areas, light-duty traffic dominates the re-distribution of particles of all size ranges. In industrial areas, heavy-duty traffic re-distributes particles >75 μm, and light-duty traffic re-distributes particles <75 μm. In residential areas, light-duty traffic re-distributes particles >300 μm and <75 μm and heavy-duty traffic re-distributes particles in the 300-150 μm range. The study results provide important insights to improve stormwater quality modelling and the interpretation of modelling outcomes, contributing to safeguard the urban water environment. Copyright © 2018 Elsevier B.V. All rights reserved.
Landenburger, L.; Lawrence, R.L.; Podruzny, S.; Schwartz, C.C.
2008-01-01
Moderate resolution satellite imagery traditionally has been thought to be inadequate for mapping vegetation at the species level. This has made comprehensive mapping of regional distributions of sensitive species, such as whitebark pine, either impractical or extremely time consuming. We sought to determine whether using a combination of moderate resolution satellite imagery (Landsat Enhanced Thematic Mapper Plus), extensive stand data collected by land management agencies for other purposes, and modern statistical classification techniques (boosted classification trees) could result in successful mapping of whitebark pine. Overall classification accuracies exceeded 90%, with similar individual class accuracies. Accuracies on a localized basis varied based on elevation. Accuracies also varied among administrative units, although we were not able to determine whether these differences related to inherent spatial variations or differences in the quality of available reference data.
40 CFR 92.127 - Emission measurement accuracy.
Code of Federal Regulations, 2010 CFR
2010-07-01
... Emission measurement accuracy. (a) Good engineering practice dictates that exhaust emission sample analyzer... resolution read-out systems such as computers, data loggers, etc., can provide sufficient accuracy and...
40 CFR 92.127 - Emission measurement accuracy.
Code of Federal Regulations, 2014 CFR
2014-07-01
... Emission measurement accuracy. (a) Good engineering practice dictates that exhaust emission sample analyzer... resolution read-out systems such as computers, data loggers, etc., can provide sufficient accuracy and...
40 CFR 92.127 - Emission measurement accuracy.
Code of Federal Regulations, 2013 CFR
2013-07-01
... Emission measurement accuracy. (a) Good engineering practice dictates that exhaust emission sample analyzer... resolution read-out systems such as computers, data loggers, etc., can provide sufficient accuracy and...
40 CFR 92.127 - Emission measurement accuracy.
Code of Federal Regulations, 2012 CFR
2012-07-01
... Emission measurement accuracy. (a) Good engineering practice dictates that exhaust emission sample analyzer... resolution read-out systems such as computers, data loggers, etc., can provide sufficient accuracy and...
40 CFR 92.127 - Emission measurement accuracy.
Code of Federal Regulations, 2011 CFR
2011-07-01
... Emission measurement accuracy. (a) Good engineering practice dictates that exhaust emission sample analyzer... resolution read-out systems such as computers, data loggers, etc., can provide sufficient accuracy and...
The Herschel Multi-Tiered Extragalactic Survey: SPIRE-mm Photometric Redshifts
NASA Technical Reports Server (NTRS)
Roseboom, I. G.; Ivison, R. J.; Greve, T. R.; Amblard, A.; Arumugam, V.; Auld, R.; Aussel, H.; Bethermin, M.; Blain, A.; Bock, J.;
2011-01-01
We investigate the potential of submm-mm and submm-mm-radio photometric red-shifts using a sample of mm-selected sources as seen at 250, 350 and 500 micrometers by the SPIRE instrument on Herschel. From a sample of 63 previously identified mm-sources with reliable radio identifications in the GOODS-N and Lockman Hole North fields 46 (73 per cent) are found to have detections in at least one SPIRE band. We explore the observed submm/mm colour evolution with redshift, finding that the colours of mm-sources are adequately described by a modified blackbody with constant optical depth Tau = (nu/nu(0))beta where beta = +1.8 and nu(0) = c/100 micrometers. We find a tight correlation between dust temperature and IR luminosity. Using a single model of the dust temperature and IR luminosity relation we derive photometric redshift estimates for the 46 SPIRE detected mm-sources. Testing against the 22 sources with known spectroscopic, or good quality optical/near-IR photometric, redshifts we find submm/mm photometric redshifts offer a redshift accuracy of |delta z|/(1+z) = 0.16 (less than |delta z| greater than = 0.51). Including constraints from the radio-far IR correlation the accuracy is improved to |delta z|/(1 + z) = 0.15 (less than |delta z| greater than = 0.45). We estimate the redshift distribution of mm-selected sources finding a significant excess at z greater than 3 when compared to 850 micrometer selected samples.
Pearce, J; Ferrier, S; Scotts, D
2001-06-01
To use models of species distributions effectively in conservation planning, it is important to determine the predictive accuracy of such models. Extensive modelling of the distribution of vascular plant and vertebrate fauna species within north-east New South Wales has been undertaken by linking field survey data to environmental and geographical predictors using logistic regression. These models have been used in the development of a comprehensive and adequate reserve system within the region. We evaluate the predictive accuracy of models for 153 small reptile, arboreal marsupial, diurnal bird and vascular plant species for which independent evaluation data were available. The predictive performance of each model was evaluated using the relative operating characteristic curve to measure discrimination capacity. Good discrimination ability implies that a model's predictions provide an acceptable index of species occurrence. The discrimination capacity of 89% of the models was significantly better than random, with 70% of the models providing high levels of discrimination. Predictions generated by this type of modelling therefore provide a reasonably sound basis for regional conservation planning. The discrimination ability of models was highest for the less mobile biological groups, particularly the vascular plants and small reptiles. In the case of diurnal birds, poor performing models tended to be for species which occur mainly within specific habitats not well sampled by either the model development or evaluation data, highly mobile species, species that are locally nomadic or those that display very broad habitat requirements. Particular care needs to be exercised when employing models for these types of species in conservation planning.
The systematic component of phylogenetic error as a function of taxonomic sampling under parsimony.
Debry, Ronald W
2005-06-01
The effect of taxonomic sampling on phylogenetic accuracy under parsimony is examined by simulating nucleotide sequence evolution. Random error is minimized by using very large numbers of simulated characters. This allows estimation of the consistency behavior of parsimony, even for trees with up to 100 taxa. Data were simulated on 8 distinct 100-taxon model trees and analyzed as stratified subsets containing either 25 or 50 taxa, in addition to the full 100-taxon data set. Overall accuracy decreased in a majority of cases when taxa were added. However, the magnitude of change in the cases in which accuracy increased was larger than the magnitude of change in the cases in which accuracy decreased, so, on average, overall accuracy increased as more taxa were included. A stratified sampling scheme was used to assess accuracy for an initial subsample of 25 taxa. The 25-taxon analyses were compared to 50- and 100-taxon analyses that were pruned to include only the original 25 taxa. On average, accuracy for the 25 taxa was improved by taxon addition, but there was considerable variation in the degree of improvement among the model trees and across different rates of substitution.
Monaghan, Kieran A.
2016-01-01
Natural ecological variability and analytical design can bias the derived value of a biotic index through the variable influence of indicator body-size, abundance, richness, and ascribed tolerance scores. Descriptive statistics highlight this risk for 26 aquatic indicator systems; detailed analysis is provided for contrasting weighted-average indices applying the example of the BMWP, which has the best supporting data. Differences in body size between taxa from respective tolerance classes is a common feature of indicator systems; in some it represents a trend ranging from comparatively small pollution tolerant to larger intolerant organisms. Under this scenario, the propensity to collect a greater proportion of smaller organisms is associated with negative bias however, positive bias may occur when equipment (e.g. mesh-size) selectively samples larger organisms. Biotic indices are often derived from systems where indicator taxa are unevenly distributed along the gradient of tolerance classes. Such skews in indicator richness can distort index values in the direction of taxonomically rich indicator classes with the subsequent degree of bias related to the treatment of abundance data. The misclassification of indicator taxa causes bias that varies with the magnitude of the misclassification, the relative abundance of misclassified taxa and the treatment of abundance data. These artifacts of assessment design can compromise the ability to monitor biological quality. The statistical treatment of abundance data and the manipulation of indicator assignment and class richness can be used to improve index accuracy. While advances in methods of data collection (i.e. DNA barcoding) may facilitate improvement, the scope to reduce systematic bias is ultimately limited to a strategy of optimal compromise. The shortfall in accuracy must be addressed by statistical pragmatism. At any particular site, the net bias is a probabilistic function of the sample data, resulting in an error variance around an average deviation. Following standardized protocols and assigning precise reference conditions, the error variance of their comparative ratio (test-site:reference) can be measured and used to estimate the accuracy of the resultant assessment. PMID:27392036
A simplified analytical random walk model for proton dose calculation
NASA Astrophysics Data System (ADS)
Yao, Weiguang; Merchant, Thomas E.; Farr, Jonathan B.
2016-10-01
We propose an analytical random walk model for proton dose calculation in a laterally homogeneous medium. A formula for the spatial fluence distribution of primary protons is derived. The variance of the spatial distribution is in the form of a distance-squared law of the angular distribution. To improve the accuracy of dose calculation in the Bragg peak region, the energy spectrum of the protons is used. The accuracy is validated against Monte Carlo simulation in water phantoms with either air gaps or a slab of bone inserted. The algorithm accurately reflects the dose dependence on the depth of the bone and can deal with small-field dosimetry. We further applied the algorithm to patients’ cases in the highly heterogeneous head and pelvis sites and used a gamma test to show the reasonable accuracy of the algorithm in these sites. Our algorithm is fast for clinical use.
Lin, Zhixiong; Liu, Haiyan; Riniker, Sereina; van Gunsteren, Wilfred F
2011-12-13
Enveloping distribution sampling (EDS) is a powerful method to compute relative free energies from simulation. So far, the EDS method has only been applied to alchemical free energy differences, i.e., between different Hamiltonians defining different systems, and not yet to obtain free energy differences between different conformations or conformational states of a system. In this article, we extend the EDS formalism such that it can be applied to compute free energy differences of different conformations and apply it to compute the relative free enthalpy ΔG of 310-, α-, and π-helices of an alanine deca-peptide in explicit water solvent. The resulting ΔG values are compared to those obtained by standard thermodynamic integration (TI) and from so-called end-state simulations. A TI simulation requires the definition of a λ-dependent pathway which in the present case is based on hydrogen bonds of the different helical conformations. The values of ⟨(∂VTI)/(∂λ)⟩λ show a sharp change for a particular range of λ values, which is indicative of an energy barrier along the pathway, which lowers the accuracy of the resulting ΔG value. In contrast, in a two-state EDS simulation, an unphysical reference-state Hamiltonian which connects the parts of conformational space that are relevant to the different end states is constructed automatically; that is, no pathway needs to be defined. In the simulation using this reference state, both helices were sampled, and many transitions between them occurred, thus ensuring the accuracy of the resulting free enthalpy difference. According to the EDS simulations, the free enthalpy differences of the π-helix and the 310-helix versus the α-helix are 5 kJ mol(-1) and 47 kJ mol(-1), respectively, for an alanine deca-peptide in explicit SPC water solvent using the GROMOS 53A6 force field. The EDS method, which is a particular form of umbrella sampling, is thus applicable to compute free energy differences between conformational states as well as between systems and has definite advantages over the traditional TI and umbrella sampling methods to compute relative free energies.
Object-based vegetation classification with high resolution remote sensing imagery
NASA Astrophysics Data System (ADS)
Yu, Qian
Vegetation species are valuable indicators to understand the earth system. Information from mapping of vegetation species and community distribution at large scales provides important insight for studying the phenological (growth) cycles of vegetation and plant physiology. Such information plays an important role in land process modeling including climate, ecosystem and hydrological models. The rapidly growing remote sensing technology has increased its potential in vegetation species mapping. However, extracting information at a species level is still a challenging research topic. I proposed an effective method for extracting vegetation species distribution from remotely sensed data and investigated some ways for accuracy improvement. The study consists of three phases. Firstly, a statistical analysis was conducted to explore the spatial variation and class separability of vegetation as a function of image scale. This analysis aimed to confirm that high resolution imagery contains the information on spatial vegetation variation and these species classes can be potentially separable. The second phase was a major effort in advancing classification by proposing a method for extracting vegetation species from high spatial resolution remote sensing data. The proposed classification employs an object-based approach that integrates GIS and remote sensing data and explores the usefulness of ancillary information. The whole process includes image segmentation, feature generation and selection, and nearest neighbor classification. The third phase introduces a spatial regression model for evaluating the mapping quality from the above vegetation classification results. The effects of six categories of sample characteristics on the classification uncertainty are examined: topography, sample membership, sample density, spatial composition characteristics, training reliability and sample object features. This evaluation analysis answered several interesting scientific questions such as (1) whether the sample characteristics affect the classification accuracy and how significant if it does; (2) how much variance of classification uncertainty can be explained by above factors. This research is carried out on a hilly peninsular area in Mediterranean climate, Point Reyes National Seashore (PRNS) in Northern California. The area mainly consists of a heterogeneous, semi-natural broadleaf and conifer woodland, shrub land, and annual grassland. A detailed list of vegetation alliances is used in this study. Research results from the first phase indicates that vegetation spatial variation as reflected by the average local variance (ALV) keeps a high level of magnitude between 1 m and 4 m resolution. (Abstract shortened by UMI.)
INTERIM REPORT ON THE EVOLUTION AND ...
A demonstration of screening technologies for determining the presence of dioxin and dioxin-like compounds in soil and sediment was conducted under the U.S. Environmental Protection Agency's(EPA's) Superfund Innovative Technology Evaluation Program in Saginaw, Michigan in 2004. The objectives of the demonstration included evaluating each participating technology's accuracy, precision, sensitivity, sample throughput, tendency for matrix effects, and cost. The test also included an assessment of how well the technology's results compared to those generated by established laboratory methods using high-resolution mass spectrometry (HRMS). The demonstration objectives were accomplished by evaluating the results generated by each technology from 209 soil, sediment, and extract samples. The test samples included performance evaluation (PE) samples (i.e., contaminant concentrations were certified or the samples were spiked with known contaminants) and environmental samples collected from 10 different sampling locations. The PE and environmental samples were distributed to the technology developers in blind, random order. One of the participants in the original SITE demonstration was Hybrizyme Corporation, which demonstrated the use of the AhRC PCR Kit. The AhRC PCR Kit was a technology that reported the concentration of aryl hydrocarbon receptor (AhR) binding compounds in a sample, with units reported as Ah Receptor Binding Units (AhRBU). At the time of the original dem
Erbe, M; Hayes, B J; Matukumalli, L K; Goswami, S; Bowman, P J; Reich, C M; Mason, B A; Goddard, M E
2012-07-01
Achieving accurate genomic estimated breeding values for dairy cattle requires a very large reference population of genotyped and phenotyped individuals. Assembling such reference populations has been achieved for breeds such as Holstein, but is challenging for breeds with fewer individuals. An alternative is to use a multi-breed reference population, such that smaller breeds gain some advantage in accuracy of genomic estimated breeding values (GEBV) from information from larger breeds. However, this requires that marker-quantitative trait loci associations persist across breeds. Here, we assessed the gain in accuracy of GEBV in Jersey cattle as a result of using a combined Holstein and Jersey reference population, with either 39,745 or 624,213 single nucleotide polymorphism (SNP) markers. The surrogate used for accuracy was the correlation of GEBV with daughter trait deviations in a validation population. Two methods were used to predict breeding values, either a genomic BLUP (GBLUP_mod), or a new method, BayesR, which used a mixture of normal distributions as the prior for SNP effects, including one distribution that set SNP effects to zero. The GBLUP_mod method scaled both the genomic relationship matrix and the additive relationship matrix to a base at the time the breeds diverged, and regressed the genomic relationship matrix to account for sampling errors in estimating relationship coefficients due to a finite number of markers, before combining the 2 matrices. Although these modifications did result in less biased breeding values for Jerseys compared with an unmodified genomic relationship matrix, BayesR gave the highest accuracies of GEBV for the 3 traits investigated (milk yield, fat yield, and protein yield), with an average increase in accuracy compared with GBLUP_mod across the 3 traits of 0.05 for both Jerseys and Holsteins. The advantage was limited for either Jerseys or Holsteins in using 624,213 SNP rather than 39,745 SNP (0.01 for Holsteins and 0.03 for Jerseys, averaged across traits). Even this limited and nonsignificant advantage was only observed when BayesR was used. An alternative panel, which extracted the SNP in the transcribed part of the bovine genome from the 624,213 SNP panel (to give 58,532 SNP), performed better, with an increase in accuracy of 0.03 for Jerseys across traits. This panel captures much of the increased genomic content of the 624,213 SNP panel, with the advantage of a greatly reduced number of SNP effects to estimate. Taken together, using this panel, a combined breed reference and using BayesR rather than GBLUP_mod increased the accuracy of GEBV in Jerseys from 0.43 to 0.52, averaged across the 3 traits. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Mohammed A. Kalkhan; Robin M. Reich; Raymond L. Czaplewski
1996-01-01
A Monte Carlo simulation was used to evaluate the statistical properties of measures of association and the Kappa statistic under double sampling with replacement. Three error matrices representing three levels of classification accuracy of Landsat TM Data consisting of four forest cover types in North Carolina. The overall accuracy of the five indices ranged from 0.35...
A Surrogate-based Adaptive Sampling Approach for History Matching and Uncertainty Quantification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Weixuan; Zhang, Dongxiao; Lin, Guang
A critical procedure in reservoir simulations is history matching (or data assimilation in a broader sense), which calibrates model parameters such that the simulation results are consistent with field measurements, and hence improves the credibility of the predictions given by the simulations. Often there exist non-unique combinations of parameter values that all yield the simulation results matching the measurements. For such ill-posed history matching problems, Bayesian theorem provides a theoretical foundation to represent different solutions and to quantify the uncertainty with the posterior PDF. Lacking an analytical solution in most situations, the posterior PDF may be characterized with a samplemore » of realizations, each representing a possible scenario. A novel sampling algorithm is presented here for the Bayesian solutions to history matching problems. We aim to deal with two commonly encountered issues: 1) as a result of the nonlinear input-output relationship in a reservoir model, the posterior distribution could be in a complex form, such as multimodal, which violates the Gaussian assumption required by most of the commonly used data assimilation approaches; 2) a typical sampling method requires intensive model evaluations and hence may cause unaffordable computational cost. In the developed algorithm, we use a Gaussian mixture model as the proposal distribution in the sampling process, which is simple but also flexible to approximate non-Gaussian distributions and is particularly efficient when the posterior is multimodal. Also, a Gaussian process is utilized as a surrogate model to speed up the sampling process. Furthermore, an iterative scheme of adaptive surrogate refinement and re-sampling ensures sampling accuracy while keeping the computational cost at a minimum level. The developed approach is demonstrated with an illustrative example and shows its capability in handling the above-mentioned issues. Multimodal posterior of the history matching problem is captured and are used to give a reliable production prediction with uncertainty quantification. The new algorithm reveals a great improvement in terms of computational efficiency comparing previously studied approaches for the sample problem.« less
NASA Technical Reports Server (NTRS)
Card, Don H.; Strong, Laurence L.
1989-01-01
An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.
Dietary survey methods. 1. A semi-weighted technique for measuring dietary intake within families.
Nelson, M; Nettleton, P A
1980-10-01
Family diet studies which measure total family consumption can determine only the average nutrient intake. A method has been devised to measure all family members' individual diets concurrently in order to learn how food and nutrient intake is distributed within the family. In this semi-weighed method, the total quantity of food available for consumption by the family is weighted at time of preparation or serving, and the distribution between family members is recorded in household measures. The method is described in detail. It provides data on individual consumption with an accuracy approaching that of a weighed survey. A co-operation rate of 73 per cent in a random sample of 74 households with two adults and two or three children indicates that this semi-weighed method can be used to assess family diets in a broad cross-section of socio-economic backgounds.
Information retrieval from wide-band meteorological data - An example
NASA Technical Reports Server (NTRS)
Adelfang, S. I.; Smith, O. E.
1983-01-01
The methods proposed by Smith and Adelfang (1981) and Smith et al. (1982) are used to calculate probabilities over rectangles and sectors of the gust magnitude-gust length plane; probabilities over the same regions are also calculated from the observed distributions and a comparison is also presented to demonstrate the accuracy of the statistical model. These and other statistical results are calculated from samples of Jimsphere wind profiles at Cape Canaveral. The results are presented for a variety of wavelength bands, altitudes, and seasons. It is shown that wind perturbations observed in Jimsphere wind profiles in various wavelength bands can be analyzed by using digital filters. The relationship between gust magnitude and gust length is modeled with the bivariate gamma distribution. It is pointed out that application of the model to calculate probabilities over specific areas of the gust magnitude-gust length plane can be useful in aerospace design.
Model improvements to simulate charging in SEM
NASA Astrophysics Data System (ADS)
Arat, K. T.; Klimpel, T.; Hagen, C. W.
2018-03-01
Charging of insulators is a complex phenomenon to simulate since the accuracy of the simulations is very sensitive to the interaction of electrons with matter and electric fields. In this study, we report model improvements for a previously developed Monte-Carlo simulator to more accurately simulate samples that charge. The improvements include both modelling of low energy electron scattering and charging of insulators. The new first-principle scattering models provide a more realistic charge distribution cloud in the material, and a better match between non-charging simulations and experimental results. Improvements on charging models mainly focus on redistribution of the charge carriers in the material with an induced conductivity (EBIC) and a breakdown model, leading to a smoother distribution of the charges. Combined with a more accurate tracing of low energy electrons in the electric field, we managed to reproduce the dynamically changing charging contrast due to an induced positive surface potential.
[Image fusion: use in the control of the distribution of prostatic biopsies].
Mozer, Pierre; Baumann, Michaël; Chevreau, Grégoire; Troccaz, Jocelyne
2008-02-01
Prostate biopsies are performed under 2D TransRectal UltraSound (US) guidance by sampling the prostate according to a predefined pattern. Modern image processing tools allow better control of biopsy distribution. We evaluated the accuracy of a single operator performing a pattern of 12 ultrasound-guided biopsies by registering 3D ultrasound control images acquired after each biopsy. For each patient, prostate image alignment was performed automatically with a voxel-based registration algorithm allowing visualization of each biopsy trajectory in a single ultrasound reference volume. On average, the operator reached the target in 60% of all cases. This study shows that it is difficult to accurately reach targets in the prostate using 2D ultrasound. In the near future, real-time fusion of MRI and US images will allow selection of a target in previously acquired MR images and biopsy of this target by US guidance.
Wide area methane emissions mapping with airborne IPDA lidar
NASA Astrophysics Data System (ADS)
Bartholomew, Jarett; Lyman, Philip; Weimer, Carl; Tandy, William
2017-08-01
Methane emissions from natural gas production, storage, and transportation are potential sources of greenhouse gas emissions. Methane leaks also constitute revenue loss potential from operations. Since 2013, Ball Aerospace has been developing advanced airborne sensors using integrated path differential absorption (IPDA) LIDAR instrumentation to identify methane, propane, and longer-chain alkanes in the lowest region of the atmosphere. Additional funding has come from the U.S. Department of Transportation, Pipeline and Hazardous Materials Administration (PHMSA) to upgrade instrumentation to a broader swath coverage of up to 400 meters while maintaining high spatial sampling resolution and geolocation accuracy. Wide area coverage allows efficient mapping of emissions from gathering and distribution networks, processing facilities, landfills, natural seeps, and other distributed methane sources. This paper summarizes the benefits of advanced instrumentation for aerial methane emission mapping, describes the operating characteristics and design of this upgraded IPDA instrumentation, and reviews technical challenges encountered during development and deployment.
Yang, Y Isaac; Zhang, Jun; Che, Xing; Yang, Lijiang; Gao, Yi Qin
2016-03-07
In order to efficiently overcome high free energy barriers embedded in a complex energy landscape and calculate overall thermodynamics properties using molecular dynamics simulations, we developed and implemented a sampling strategy by combining the metadynamics with (selective) integrated tempering sampling (ITS/SITS) method. The dominant local minima on the potential energy surface (PES) are partially exalted by accumulating history-dependent potentials as in metadynamics, and the sampling over the entire PES is further enhanced by ITS/SITS. With this hybrid method, the simulated system can be rapidly driven across the dominant barrier along selected collective coordinates. Then, ITS/SITS ensures a fast convergence of the sampling over the entire PES and an efficient calculation of the overall thermodynamic properties of the simulation system. To test the accuracy and efficiency of this method, we first benchmarked this method in the calculation of ϕ - ψ distribution of alanine dipeptide in explicit solvent. We further applied it to examine the design of template molecules for aromatic meta-C-H activation in solutions and investigate solution conformations of the nonapeptide Bradykinin involving slow cis-trans isomerizations of three proline residues.
Unsupervised active learning based on hierarchical graph-theoretic clustering.
Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve
2009-10-01
Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.
NASA Astrophysics Data System (ADS)
Yang, Y. Isaac; Zhang, Jun; Che, Xing; Yang, Lijiang; Gao, Yi Qin
2016-03-01
In order to efficiently overcome high free energy barriers embedded in a complex energy landscape and calculate overall thermodynamics properties using molecular dynamics simulations, we developed and implemented a sampling strategy by combining the metadynamics with (selective) integrated tempering sampling (ITS/SITS) method. The dominant local minima on the potential energy surface (PES) are partially exalted by accumulating history-dependent potentials as in metadynamics, and the sampling over the entire PES is further enhanced by ITS/SITS. With this hybrid method, the simulated system can be rapidly driven across the dominant barrier along selected collective coordinates. Then, ITS/SITS ensures a fast convergence of the sampling over the entire PES and an efficient calculation of the overall thermodynamic properties of the simulation system. To test the accuracy and efficiency of this method, we first benchmarked this method in the calculation of ϕ - ψ distribution of alanine dipeptide in explicit solvent. We further applied it to examine the design of template molecules for aromatic meta-C—H activation in solutions and investigate solution conformations of the nonapeptide Bradykinin involving slow cis-trans isomerizations of three proline residues.
Fallon, J.D.; McChesney, J.A.
1993-01-01
Surface-water-quality data were collected from the lower Kansas River Basin in Kansas and Nebraska. The data are presented in 17 tables consisting of physical properties, concentrations of dissolved solids and major ions, dissolved and total nutrients, dissolved and total major metals and trace elements, radioactivity, organic carbon, pesticides and other synthetic-organic compounds, bacteria and chlorophyll-a, in water; particle-size distributions and concentrations of major metals and trace elements in suspended and streambed sediment; and concentrations of synthetic-organic compounds in streambed sediment. The data are grouped within each table by sampling sites, arranged in downstream order. Ninety-one sites were sampled in the study area. These sampling sites are classified in three, non-exclusive categories (fixed, synoptic, and miscellaneous sites) on the basis of sampling frequency and location. Sampling sites are presented on a plate and in 3 tables, cross-referenced by downstream order, alphabetical order, U.S. Geological Survey identification number, sampling-site classification category, and types of analyses performed at each site. The methods used to collect, analyze, and verify the accuracy of the data also are presented. (USGS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Y. Isaac; Zhang, Jun; Che, Xing
2016-03-07
In order to efficiently overcome high free energy barriers embedded in a complex energy landscape and calculate overall thermodynamics properties using molecular dynamics simulations, we developed and implemented a sampling strategy by combining the metadynamics with (selective) integrated tempering sampling (ITS/SITS) method. The dominant local minima on the potential energy surface (PES) are partially exalted by accumulating history-dependent potentials as in metadynamics, and the sampling over the entire PES is further enhanced by ITS/SITS. With this hybrid method, the simulated system can be rapidly driven across the dominant barrier along selected collective coordinates. Then, ITS/SITS ensures a fast convergence ofmore » the sampling over the entire PES and an efficient calculation of the overall thermodynamic properties of the simulation system. To test the accuracy and efficiency of this method, we first benchmarked this method in the calculation of ϕ − ψ distribution of alanine dipeptide in explicit solvent. We further applied it to examine the design of template molecules for aromatic meta-C—H activation in solutions and investigate solution conformations of the nonapeptide Bradykinin involving slow cis-trans isomerizations of three proline residues.« less
Cahill, John F.; Kertesz, Vilmos; Porta, Tiffany; ...
2018-02-08
Rationale: Laser microdissection-liquid vortex capture/electrospray ionization mass spectrometry (LMD-LVC/ESI-MS) has potential for on-line classification of tissue but an investigation into what analytical conditions provide best spectral differentiation has not been conducted. The effects of solvent, ionization polarity, and spectral acquisition parameters on differentiation of mouse brain tissue regions are described.Methods: Individual 40 × 40 μm microdissections from cortex, white, grey, granular, and nucleus regions of mouse brain tissue were analyzed using different capture/ESI solvents, in positive and negative ion mode ESI, using time-of-flight (TOF)-MS and sequential window acquisitions of all theoretical spectra (SWATH)-MS (a permutation of tandem-MS), and combinations thereof.more » Principal component analysis-linear discriminant analysis (PCA-LDA), applied to each mass spectral dataset, was used to determine the accuracy of differentiation of mouse brain tissue regions. Results: Mass spectral differences associated with capture/ESI solvent composition manifested as altered relative distributions of ions rather than the presence or absence of unique ions. In negative ion mode ESI, 80/20 (v/v) methanol/water yielded spectra with low signal/noise ratios relative to other solvents. PCA-LDA models acquired using 90/10 (v/v) methanol/chloroform differentiated tissue regions with 100% accuracy while data collected using methanol misclassified some samples. The combination of SWATH-MS and TOF-MS data improved differentiation accuracy.Conclusions: Combined TOF-MS and SWATH-MS data differentiated white, grey, granular, and nucleus mouse tissue regions with greater accuracy than when solely using TOF-MS data. Using 90/10 (v/v) methanol/chloroform, tissue regions were perfectly differentiated. Lastly, these results will guide future studies looking to utilize the potential of LMD-LVC/ESI-MS for tissue and disease differentiation.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cahill, John F.; Kertesz, Vilmos; Porta, Tiffany
Rationale: Laser microdissection-liquid vortex capture/electrospray ionization mass spectrometry (LMD-LVC/ESI-MS) has potential for on-line classification of tissue but an investigation into what analytical conditions provide best spectral differentiation has not been conducted. The effects of solvent, ionization polarity, and spectral acquisition parameters on differentiation of mouse brain tissue regions are described.Methods: Individual 40 × 40 μm microdissections from cortex, white, grey, granular, and nucleus regions of mouse brain tissue were analyzed using different capture/ESI solvents, in positive and negative ion mode ESI, using time-of-flight (TOF)-MS and sequential window acquisitions of all theoretical spectra (SWATH)-MS (a permutation of tandem-MS), and combinations thereof.more » Principal component analysis-linear discriminant analysis (PCA-LDA), applied to each mass spectral dataset, was used to determine the accuracy of differentiation of mouse brain tissue regions. Results: Mass spectral differences associated with capture/ESI solvent composition manifested as altered relative distributions of ions rather than the presence or absence of unique ions. In negative ion mode ESI, 80/20 (v/v) methanol/water yielded spectra with low signal/noise ratios relative to other solvents. PCA-LDA models acquired using 90/10 (v/v) methanol/chloroform differentiated tissue regions with 100% accuracy while data collected using methanol misclassified some samples. The combination of SWATH-MS and TOF-MS data improved differentiation accuracy.Conclusions: Combined TOF-MS and SWATH-MS data differentiated white, grey, granular, and nucleus mouse tissue regions with greater accuracy than when solely using TOF-MS data. Using 90/10 (v/v) methanol/chloroform, tissue regions were perfectly differentiated. Lastly, these results will guide future studies looking to utilize the potential of LMD-LVC/ESI-MS for tissue and disease differentiation.« less
Urine sampling techniques in symptomatic primary-care patients: a diagnostic accuracy review.
Holm, Anne; Aabenhus, Rune
2016-06-08
Choice of urine sampling technique in urinary tract infection may impact diagnostic accuracy and thus lead to possible over- or undertreatment. Currently no evidencebased consensus exists regarding correct sampling technique of urine from women with symptoms of urinary tract infection in primary care. The aim of this study was to determine the accuracy of urine culture from different sampling-techniques in symptomatic non-pregnant women in primary care. A systematic review was conducted by searching Medline and Embase for clinical studies conducted in primary care using a randomized or paired design to compare the result of urine culture obtained with two or more collection techniques in adult, female, non-pregnant patients with symptoms of urinary tract infection. We evaluated quality of the studies and compared accuracy based on dichotomized outcomes. We included seven studies investigating urine sampling technique in 1062 symptomatic patients in primary care. Mid-stream-clean-catch had a positive predictive value of 0.79 to 0.95 and a negative predictive value close to 1 compared to sterile techniques. Two randomized controlled trials found no difference in infection rate between mid-stream-clean-catch, mid-stream-urine and random samples. At present, no evidence suggests that sampling technique affects the accuracy of the microbiological diagnosis in non-pregnant women with symptoms of urinary tract infection in primary care. However, the evidence presented is in-direct and the difference between mid-stream-clean-catch, mid-stream-urine and random samples remains to be investigated in a paired design to verify the present findings.
NASA Astrophysics Data System (ADS)
Wei, Lin-Yang; Qi, Hong; Ren, Ya-Tao; Ruan, Li-Ming
2016-11-01
Inverse estimation of the refractive index distribution in one-dimensional participating media with graded refractive index (GRI) is investigated. The forward radiative transfer problem is solved by the Chebyshev collocation spectral method. The stochastic particle swarm optimization (SPSO) algorithm is employed to retrieve three kinds of GRI distribution, i.e. the linear, sinusoidal and quadratic GRI distribution. The retrieval accuracy of GRI distribution with different wall emissivity, optical thickness, absorption coefficients and scattering coefficients are discussed thoroughly. To improve the retrieval accuracy of quadratic GRI distribution, a double-layer model is proposed to supply more measurement information. The influence of measurement errors upon the precision of estimated results is also investigated. Considering the GRI distribution is unknown beforehand in practice, a quadratic function is employed to retrieve the linear GRI by SPSO algorithm. All the results show that the SPSO algorithm is applicable to retrieve different GRI distributions in participating media accurately even with noisy data.
Improving absolute gravity estimates by the L p -norm approximation of the ballistic trajectory
NASA Astrophysics Data System (ADS)
Nagornyi, V. D.; Svitlov, S.; Araya, A.
2016-04-01
Iteratively re-weighted least squares (IRLS) were used to simulate the L p -norm approximation of the ballistic trajectory in absolute gravimeters. Two iterations of the IRLS delivered sufficient accuracy of the approximation without a significant bias. The simulations were performed on different samplings and perturbations of the trajectory. For the platykurtic distributions of the perturbations, the L p -approximation with 3 < p < 4 was found to yield several times more precise gravity estimates compared to the standard least-squares. The simulation results were confirmed by processing real gravity observations performed at the excessive noise conditions.
Mapping permafrost in the boreal forest with Thematic Mapper satellite data
NASA Technical Reports Server (NTRS)
Morrissey, L. A.; Strong, L. L.; Card, D. H.
1986-01-01
A geographic data base incorporating Landsat TM data was used to develop and evaluate logistic discriminant functions for predicting the distribution of permafrost in a boreal forest watershed. The data base included both satellite-derived information and ancillary map data. Five permafrost classifications were developed from a stratified random sample of the data base and evaluated by comparison with a photo-interpreted permafrost map using contingency table analysis and soil temperatures recorded at sites within the watershed. A classification using a TM thermal band and a TM-derived vegetation map as independent variables yielded the highest mapping accuracy for all permafrost categories.
Guitet, Stéphane; Hérault, Bruno; Molto, Quentin; Brunaux, Olivier; Couteron, Pierre
2015-01-01
Precise mapping of above-ground biomass (AGB) is a major challenge for the success of REDD+ processes in tropical rainforest. The usual mapping methods are based on two hypotheses: a large and long-ranged spatial autocorrelation and a strong environment influence at the regional scale. However, there are no studies of the spatial structure of AGB at the landscapes scale to support these assumptions. We studied spatial variation in AGB at various scales using two large forest inventories conducted in French Guiana. The dataset comprised 2507 plots (0.4 to 0.5 ha) of undisturbed rainforest distributed over the whole region. After checking the uncertainties of estimates obtained from these data, we used half of the dataset to develop explicit predictive models including spatial and environmental effects and tested the accuracy of the resulting maps according to their resolution using the rest of the data. Forest inventories provided accurate AGB estimates at the plot scale, for a mean of 325 Mg.ha-1. They revealed high local variability combined with a weak autocorrelation up to distances of no more than10 km. Environmental variables accounted for a minor part of spatial variation. Accuracy of the best model including spatial effects was 90 Mg.ha-1 at plot scale but coarse graining up to 2-km resolution allowed mapping AGB with accuracy lower than 50 Mg.ha-1. Whatever the resolution, no agreement was found with available pan-tropical reference maps at all resolutions. We concluded that the combined weak autocorrelation and weak environmental effect limit AGB maps accuracy in rainforest, and that a trade-off has to be found between spatial resolution and effective accuracy until adequate "wall-to-wall" remote sensing signals provide reliable AGB predictions. Waiting for this, using large forest inventories with low sampling rate (<0.5%) may be an efficient way to increase the global coverage of AGB maps with acceptable accuracy at kilometric resolution.
Modeling occupancy distribution in large spaces with multi-feature classification algorithm
Wang, Wei; Chen, Jiayu; Hong, Tianzhen
2018-04-07
We present that occupancy information enables robust and flexible control of heating, ventilation, and air-conditioning (HVAC) systems in buildings. In large spaces, multiple HVAC terminals are typically installed to provide cooperative services for different thermal zones, and the occupancy information determines the cooperation among terminals. However, a person count at room-level does not adequately optimize HVAC system operation due to the movement of occupants within the room that creates uneven load distribution. Without accurate knowledge of the occupants’ spatial distribution, the uneven distribution of occupants often results in under-cooling/heating or over-cooling/heating in some thermal zones. Therefore, the lack of high-resolutionmore » occupancy distribution is often perceived as a bottleneck for future improvements to HVAC operation efficiency. To fill this gap, this study proposes a multi-feature k-Nearest-Neighbors (k-NN) classification algorithm to extract occupancy distribution through reliable, low-cost Bluetooth Low Energy (BLE) networks. An on-site experiment was conducted in a typical office of an institutional building to demonstrate the proposed methods, and the experiment outcomes of three case studies were examined to validate detection accuracy. One method based on City Block Distance (CBD) was used to measure the distance between detected occupancy distribution and ground truth and assess the results of occupancy distribution. Finally, the results show the accuracy when CBD = 1 is over 71.4% and the accuracy when CBD = 2 can reach up to 92.9%.« less
Modeling occupancy distribution in large spaces with multi-feature classification algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Wei; Chen, Jiayu; Hong, Tianzhen
We present that occupancy information enables robust and flexible control of heating, ventilation, and air-conditioning (HVAC) systems in buildings. In large spaces, multiple HVAC terminals are typically installed to provide cooperative services for different thermal zones, and the occupancy information determines the cooperation among terminals. However, a person count at room-level does not adequately optimize HVAC system operation due to the movement of occupants within the room that creates uneven load distribution. Without accurate knowledge of the occupants’ spatial distribution, the uneven distribution of occupants often results in under-cooling/heating or over-cooling/heating in some thermal zones. Therefore, the lack of high-resolutionmore » occupancy distribution is often perceived as a bottleneck for future improvements to HVAC operation efficiency. To fill this gap, this study proposes a multi-feature k-Nearest-Neighbors (k-NN) classification algorithm to extract occupancy distribution through reliable, low-cost Bluetooth Low Energy (BLE) networks. An on-site experiment was conducted in a typical office of an institutional building to demonstrate the proposed methods, and the experiment outcomes of three case studies were examined to validate detection accuracy. One method based on City Block Distance (CBD) was used to measure the distance between detected occupancy distribution and ground truth and assess the results of occupancy distribution. Finally, the results show the accuracy when CBD = 1 is over 71.4% and the accuracy when CBD = 2 can reach up to 92.9%.« less
On-line analysis of algae in water by discrete three-dimensional fluorescence spectroscopy.
Zhao, Nanjing; Zhang, Xiaoling; Yin, Gaofang; Yang, Ruifang; Hu, Li; Chen, Shuang; Liu, Jianguo; Liu, Wenqing
2018-03-19
In view of the problem of the on-line measurement of algae classification, a method of algae classification and concentration determination based on the discrete three-dimensional fluorescence spectra was studied in this work. The discrete three-dimensional fluorescence spectra of twelve common species of algae belonging to five categories were analyzed, the discrete three-dimensional standard spectra of five categories were built, and the recognition, classification and concentration prediction of algae categories were realized by the discrete three-dimensional fluorescence spectra coupled with non-negative weighted least squares linear regression analysis. The results show that similarities between discrete three-dimensional standard spectra of different categories were reduced and the accuracies of recognition, classification and concentration prediction of the algae categories were significantly improved. By comparing with that of the chlorophyll a fluorescence excitation spectra method, the recognition accuracy rate in pure samples by discrete three-dimensional fluorescence spectra is improved 1.38%, and the recovery rate and classification accuracy in pure diatom samples 34.1% and 46.8%, respectively; the recognition accuracy rate of mixed samples by discrete-three dimensional fluorescence spectra is enhanced by 26.1%, the recovery rate of mixed samples with Chlorophyta 37.8%, and the classification accuracy of mixed samples with diatoms 54.6%.
Combining Soil Databases for Topsoil Organic Carbon Mapping in Europe.
Aksoy, Ece; Yigini, Yusuf; Montanarella, Luca
2016-01-01
Accuracy in assessing the distribution of soil organic carbon (SOC) is an important issue because of playing key roles in the functions of both natural ecosystems and agricultural systems. There are several studies in the literature with the aim of finding the best method to assess and map the distribution of SOC content for Europe. Therefore this study aims searching for another aspect of this issue by looking to the performances of using aggregated soil samples coming from different studies and land-uses. The total number of the soil samples in this study was 23,835 and they're collected from the "Land Use/Cover Area frame Statistical Survey" (LUCAS) Project (samples from agricultural soil), BioSoil Project (samples from forest soil), and "Soil Transformations in European Catchments" (SoilTrEC) Project (samples from local soil data coming from six different critical zone observatories (CZOs) in Europe). Moreover, 15 spatial indicators (slope, aspect, elevation, compound topographic index (CTI), CORINE land-cover classification, parent material, texture, world reference base (WRB) soil classification, geological formations, annual average temperature, min-max temperature, total precipitation and average precipitation (for years 1960-1990 and 2000-2010)) were used as auxiliary variables in this prediction. One of the most popular geostatistical techniques, Regression-Kriging (RK), was applied to build the model and assess the distribution of SOC. This study showed that, even though RK method was appropriate for successful SOC mapping, using combined databases was not helpful to increase the statistical significance of the method results for assessing the SOC distribution. According to our results; SOC variation was mainly affected by elevation, slope, CTI, average temperature, average and total precipitation, texture, WRB and CORINE variables for Europe scale in our model. Moreover, the highest average SOC contents were found in the wetland areas; agricultural areas have much lower soil organic carbon content than forest and semi natural areas; Ireland, Sweden and Finland has the highest SOC, on the contrary, Portugal, Poland, Hungary, Spain, Italy have the lowest values with the average 3%.
Combining Soil Databases for Topsoil Organic Carbon Mapping in Europe
Aksoy, Ece
2016-01-01
Accuracy in assessing the distribution of soil organic carbon (SOC) is an important issue because of playing key roles in the functions of both natural ecosystems and agricultural systems. There are several studies in the literature with the aim of finding the best method to assess and map the distribution of SOC content for Europe. Therefore this study aims searching for another aspect of this issue by looking to the performances of using aggregated soil samples coming from different studies and land-uses. The total number of the soil samples in this study was 23,835 and they’re collected from the “Land Use/Cover Area frame Statistical Survey” (LUCAS) Project (samples from agricultural soil), BioSoil Project (samples from forest soil), and “Soil Transformations in European Catchments” (SoilTrEC) Project (samples from local soil data coming from six different critical zone observatories (CZOs) in Europe). Moreover, 15 spatial indicators (slope, aspect, elevation, compound topographic index (CTI), CORINE land-cover classification, parent material, texture, world reference base (WRB) soil classification, geological formations, annual average temperature, min-max temperature, total precipitation and average precipitation (for years 1960–1990 and 2000–2010)) were used as auxiliary variables in this prediction. One of the most popular geostatistical techniques, Regression-Kriging (RK), was applied to build the model and assess the distribution of SOC. This study showed that, even though RK method was appropriate for successful SOC mapping, using combined databases was not helpful to increase the statistical significance of the method results for assessing the SOC distribution. According to our results; SOC variation was mainly affected by elevation, slope, CTI, average temperature, average and total precipitation, texture, WRB and CORINE variables for Europe scale in our model. Moreover, the highest average SOC contents were found in the wetland areas; agricultural areas have much lower soil organic carbon content than forest and semi natural areas; Ireland, Sweden and Finland has the highest SOC, on the contrary, Portugal, Poland, Hungary, Spain, Italy have the lowest values with the average 3%. PMID:27011357
Wu, Mixia; Shu, Yu; Li, Zhaohai; Liu, Aiyi
2016-01-01
A sequential design is proposed to test whether the accuracy of a binary diagnostic biomarker meets the minimal level of acceptance. The accuracy of a binary diagnostic biomarker is a linear combination of the marker’s sensitivity and specificity. The objective of the sequential method is to minimize the maximum expected sample size under the null hypothesis that the marker’s accuracy is below the minimal level of acceptance. The exact results of two-stage designs based on Youden’s index and efficiency indicate that the maximum expected sample sizes are smaller than the sample sizes of the fixed designs. Exact methods are also developed for estimation, confidence interval and p-value concerning the proposed accuracy index upon termination of the sequential testing. PMID:26947768
A Variance Distribution Model of Surface EMG Signals Based on Inverse Gamma Distribution.
Hayashi, Hideaki; Furui, Akira; Kurita, Yuichi; Tsuji, Toshio
2017-11-01
Objective: This paper describes the formulation of a surface electromyogram (EMG) model capable of representing the variance distribution of EMG signals. Methods: In the model, EMG signals are handled based on a Gaussian white noise process with a mean of zero for each variance value. EMG signal variance is taken as a random variable that follows inverse gamma distribution, allowing the representation of noise superimposed onto this variance. Variance distribution estimation based on marginal likelihood maximization is also outlined in this paper. The procedure can be approximated using rectified and smoothed EMG signals, thereby allowing the determination of distribution parameters in real time at low computational cost. Results: A simulation experiment was performed to evaluate the accuracy of distribution estimation using artificially generated EMG signals, with results demonstrating that the proposed model's accuracy is higher than that of maximum-likelihood-based estimation. Analysis of variance distribution using real EMG data also suggested a relationship between variance distribution and signal-dependent noise. Conclusion: The study reported here was conducted to examine the performance of a proposed surface EMG model capable of representing variance distribution and a related distribution parameter estimation method. Experiments using artificial and real EMG data demonstrated the validity of the model. Significance: Variance distribution estimated using the proposed model exhibits potential in the estimation of muscle force. Objective: This paper describes the formulation of a surface electromyogram (EMG) model capable of representing the variance distribution of EMG signals. Methods: In the model, EMG signals are handled based on a Gaussian white noise process with a mean of zero for each variance value. EMG signal variance is taken as a random variable that follows inverse gamma distribution, allowing the representation of noise superimposed onto this variance. Variance distribution estimation based on marginal likelihood maximization is also outlined in this paper. The procedure can be approximated using rectified and smoothed EMG signals, thereby allowing the determination of distribution parameters in real time at low computational cost. Results: A simulation experiment was performed to evaluate the accuracy of distribution estimation using artificially generated EMG signals, with results demonstrating that the proposed model's accuracy is higher than that of maximum-likelihood-based estimation. Analysis of variance distribution using real EMG data also suggested a relationship between variance distribution and signal-dependent noise. Conclusion: The study reported here was conducted to examine the performance of a proposed surface EMG model capable of representing variance distribution and a related distribution parameter estimation method. Experiments using artificial and real EMG data demonstrated the validity of the model. Significance: Variance distribution estimated using the proposed model exhibits potential in the estimation of muscle force.
An evaluation of methods for estimating decadal stream loads
NASA Astrophysics Data System (ADS)
Lee, Casey J.; Hirsch, Robert M.; Schwarz, Gregory E.; Holtschlag, David J.; Preston, Stephen D.; Crawford, Charles G.; Vecchia, Aldo V.
2016-11-01
Effective management of water resources requires accurate information on the mass, or load of water-quality constituents transported from upstream watersheds to downstream receiving waters. Despite this need, no single method has been shown to consistently provide accurate load estimates among different water-quality constituents, sampling sites, and sampling regimes. We evaluate the accuracy of several load estimation methods across a broad range of sampling and environmental conditions. This analysis uses random sub-samples drawn from temporally-dense data sets of total nitrogen, total phosphorus, nitrate, and suspended-sediment concentration, and includes measurements of specific conductance which was used as a surrogate for dissolved solids concentration. Methods considered include linear interpolation and ratio estimators, regression-based methods historically employed by the U.S. Geological Survey, and newer flexible techniques including Weighted Regressions on Time, Season, and Discharge (WRTDS) and a generalized non-linear additive model. No single method is identified to have the greatest accuracy across all constituents, sites, and sampling scenarios. Most methods provide accurate estimates of specific conductance (used as a surrogate for total dissolved solids or specific major ions) and total nitrogen - lower accuracy is observed for the estimation of nitrate, total phosphorus and suspended sediment loads. Methods that allow for flexibility in the relation between concentration and flow conditions, specifically Beale's ratio estimator and WRTDS, exhibit greater estimation accuracy and lower bias. Evaluation of methods across simulated sampling scenarios indicate that (1) high-flow sampling is necessary to produce accurate load estimates, (2) extrapolation of sample data through time or across more extreme flow conditions reduces load estimate accuracy, and (3) WRTDS and methods that use a Kalman filter or smoothing to correct for departures between individual modeled and observed values benefit most from more frequent water-quality sampling.
An evaluation of methods for estimating decadal stream loads
Lee, Casey; Hirsch, Robert M.; Schwarz, Gregory E.; Holtschlag, David J.; Preston, Stephen D.; Crawford, Charles G.; Vecchia, Aldo V.
2016-01-01
Effective management of water resources requires accurate information on the mass, or load of water-quality constituents transported from upstream watersheds to downstream receiving waters. Despite this need, no single method has been shown to consistently provide accurate load estimates among different water-quality constituents, sampling sites, and sampling regimes. We evaluate the accuracy of several load estimation methods across a broad range of sampling and environmental conditions. This analysis uses random sub-samples drawn from temporally-dense data sets of total nitrogen, total phosphorus, nitrate, and suspended-sediment concentration, and includes measurements of specific conductance which was used as a surrogate for dissolved solids concentration. Methods considered include linear interpolation and ratio estimators, regression-based methods historically employed by the U.S. Geological Survey, and newer flexible techniques including Weighted Regressions on Time, Season, and Discharge (WRTDS) and a generalized non-linear additive model. No single method is identified to have the greatest accuracy across all constituents, sites, and sampling scenarios. Most methods provide accurate estimates of specific conductance (used as a surrogate for total dissolved solids or specific major ions) and total nitrogen – lower accuracy is observed for the estimation of nitrate, total phosphorus and suspended sediment loads. Methods that allow for flexibility in the relation between concentration and flow conditions, specifically Beale’s ratio estimator and WRTDS, exhibit greater estimation accuracy and lower bias. Evaluation of methods across simulated sampling scenarios indicate that (1) high-flow sampling is necessary to produce accurate load estimates, (2) extrapolation of sample data through time or across more extreme flow conditions reduces load estimate accuracy, and (3) WRTDS and methods that use a Kalman filter or smoothing to correct for departures between individual modeled and observed values benefit most from more frequent water-quality sampling.
The influence of sampling interval on the accuracy of trail impact assessment
Leung, Y.-F.; Marion, J.L.
1999-01-01
Trail impact assessment and monitoring (IA&M) programs have been growing in importance and application in recreation resource management at protected areas. Census-based and sampling-based approaches have been developed in such programs, with systematic point sampling being the most common survey design. This paper examines the influence of sampling interval on the accuracy of estimates for selected trail impact problems. A complete census of four impact types on 70 trails in Great Smoky Mountains National Park was utilized as the base data set for the analyses. The census data were resampled at increasing intervals to create a series of simulated point data sets. Estimates of frequency of occurrence and lineal extent for the four impact types were compared with the census data set. The responses of accuracy loss on lineal extent estimates to increasing sampling intervals varied across different impact types, while the responses on frequency of occurrence estimates were consistent, approximating an inverse asymptotic curve. These findings suggest that systematic point sampling may be an appropriate method for estimating the lineal extent but not the frequency of trail impacts. Sample intervals of less than 100 m appear to yield an excellent level of accuracy for the four impact types evaluated. Multiple regression analysis results suggest that appropriate sampling intervals are more likely to be determined by the type of impact in question rather than the length of trail. The census-based trail survey and the resampling-simulation method developed in this study can be a valuable first step in establishing long-term trail IA&M programs, in which an optimal sampling interval range with acceptable accuracy is determined before investing efforts in data collection.
NASA Technical Reports Server (NTRS)
Wey, Thomas; Liu, Nan-Suey
2008-01-01
This paper at first describes the fluid network approach recently implemented into the National Combustion Code (NCC) for the simulation of transport of aerosols (volatile particles and soot) in the particulate sampling systems. This network-based approach complements the other two approaches already in the NCC, namely, the lower-order temporal approach and the CFD-based approach. The accuracy and the computational costs of these three approaches are then investigated in terms of their application to the prediction of particle losses through sample transmission and distribution lines. Their predictive capabilities are assessed by comparing the computed results with the experimental data. The present work will help establish standard methodologies for measuring the size and concentration of particles in high-temperature, high-velocity jet engine exhaust. Furthermore, the present work also represents the first step of a long term effort of validating physics-based tools for the prediction of aircraft particulate emissions.
MultiNest: Efficient and Robust Bayesian Inference
NASA Astrophysics Data System (ADS)
Feroz, F.; Hobson, M. P.; Bridges, M.
2011-09-01
We present further development and the first public release of our multimodal nested sampling algorithm, called MultiNest. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson (2008), which itself significantly outperformed existing MCMC techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MultiNest algorithm is demonstrated by application to two toy problems and to a cosmological inference problem focusing on the extension of the vanilla LambdaCDM model to include spatial curvature and a varying equation of state for dark energy. The MultiNest software is fully parallelized using MPI and includes an interface to CosmoMC. It will also be released as part of the SuperBayeS package, for the analysis of supersymmetric theories of particle physics, at this http URL.
Assessment of spatial distribution of soil heavy metals using ANN-GA, MSLR and satellite imagery.
Naderi, Arman; Delavar, Mohammad Amir; Kaboudin, Babak; Askari, Mohammad Sadegh
2017-05-01
This study aims to assess and compare heavy metal distribution models developed using stepwise multiple linear regression (MSLR) and neural network-genetic algorithm model (ANN-GA) based on satellite imagery. The source identification of heavy metals was also explored using local Moran index. Soil samples (n = 300) were collected based on a grid and pH, organic matter, clay, iron oxide contents cadmium (Cd), lead (Pb) and zinc (Zn) concentrations were determined for each sample. Visible/near-infrared reflectance (VNIR) within the electromagnetic ranges of satellite imagery was applied to estimate heavy metal concentrations in the soil using MSLR and ANN-GA models. The models were evaluated and ANN-GA model demonstrated higher accuracy, and the autocorrelation results showed higher significant clusters of heavy metals around the industrial zone. The higher concentration of Cd, Pb and Zn was noted under industrial lands and irrigation farming in comparison to barren and dryland farming. Accumulation of industrial wastes in roads and streams was identified as main sources of pollution, and the concentration of soil heavy metals was reduced by increasing the distance from these sources. In comparison to MLSR, ANN-GA provided a more accurate indirect assessment of heavy metal concentrations in highly polluted soils. The clustering analysis provided reliable information about the spatial distribution of soil heavy metals and their sources.
On distributed wavefront reconstruction for large-scale adaptive optics systems.
de Visser, Cornelis C; Brunner, Elisabeth; Verhaegen, Michel
2016-05-01
The distributed-spline-based aberration reconstruction (D-SABRE) method is proposed for distributed wavefront reconstruction with applications to large-scale adaptive optics systems. D-SABRE decomposes the wavefront sensor domain into any number of partitions and solves a local wavefront reconstruction problem on each partition using multivariate splines. D-SABRE accuracy is within 1% of a global approach with a speedup that scales quadratically with the number of partitions. The D-SABRE is compared to the distributed cumulative reconstruction (CuRe-D) method in open-loop and closed-loop simulations using the YAO adaptive optics simulation tool. D-SABRE accuracy exceeds CuRe-D for low levels of decomposition, and D-SABRE proved to be more robust to variations in the loop gain.
NASA Astrophysics Data System (ADS)
De Geyter, Gert; Baes, Maarten; Camps, Peter; Fritz, Jacopo; De Looze, Ilse; Hughes, Thomas M.; Viaene, Sébastien; Gentile, Gianfranco
2014-06-01
We investigate the amount and spatial distribution of interstellar dust in edge-on spiral galaxies, using detailed radiative transfer modelling of a homogeneous sample of 12 galaxies selected from the Calar Alto Legacy Integral Field Area survey. Our automated fitting routine, FITSKIRT, was first validated against artificial data. This is done by simultaneously reproducing the Sloan Digital Sky Survey g-, r-, i- and z-band observations of a toy model in order to combine the information present in the different bands. We show that this combined, oligochromatic fitting has clear advantages over standard monochromatic fitting especially regarding constraints on the dust properties. We model all galaxies in our sample using a three-component model, consisting of a double-exponential disc to describe the stellar and dust discs and using a Sérsic profile to describe the central bulge. The full model contains 19 free parameters, and we are able to constrain all these parameters to a satisfactory level of accuracy without human intervention or strong boundary conditions. Apart from two galaxies, the entire sample can be accurately reproduced by our model. We find that the dust disc is about 75 per cent more extended but only half as high as the stellar disc. The average face-on optical depth in the V band is 0.76 and the spread of 0.60 within our sample is quite substantial, which indicates that some spiral galaxies are relatively opaque even when seen face-on.
Nestor, Sean M; Gibson, Erin; Gao, Fu-Qiang; Kiss, Alex; Black, Sandra E
2013-02-01
Hippocampal volumetry derived from structural MRI is increasingly used to delineate regions of interest for functional measurements, assess efficacy in therapeutic trials of Alzheimer's disease (AD) and has been endorsed by the new AD diagnostic guidelines as a radiological marker of disease progression. Unfortunately, morphological heterogeneity in AD can prevent accurate demarcation of the hippocampus. Recent developments in automated volumetry commonly use multi-template fusion driven by expert manual labels, enabling highly accurate and reproducible segmentation in disease and healthy subjects. However, there are several protocols to define the hippocampus anatomically in vivo, and the method used to generate atlases may impact automatic accuracy and sensitivity - particularly in pathologically heterogeneous samples. Here we report a fully automated segmentation technique that provides a robust platform to directly evaluate both technical and biomarker performance in AD among anatomically unique labeling protocols. For the first time we test head-to-head the performance of five common hippocampal labeling protocols for multi-atlas based segmentation, using both the Sunnybrook Longitudinal Dementia Study and the entire Alzheimer's Disease Neuroimaging Initiative 1 (ADNI-1) baseline and 24-month dataset. We based these atlas libraries on the protocols of (Haller et al., 1997; Killiany et al., 1993; Malykhin et al., 2007; Pantel et al., 2000; Pruessner et al., 2000), and a single operator performed all manual tracings to generate de facto "ground truth" labels. All methods distinguished between normal elders, mild cognitive impairment (MCI), and AD in the expected directions, and showed comparable correlations with measures of episodic memory performance. Only more inclusive protocols distinguished between stable MCI and MCI-to-AD converters, and had slightly better associations with episodic memory. Moreover, we demonstrate that protocols including more posterior anatomy and dorsal white matter compartments furnish the best voxel-overlap accuracies (Dice Similarity Coefficient=0.87-0.89), compared to expert manual tracings, and achieve the smallest sample sizes required to power clinical trials in MCI and AD. The greatest distribution of errors was localized to the caudal hippocampus and the alveus-fimbria compartment when these regions were excluded. The definition of the medial body did not significantly alter accuracy among more comprehensive protocols. Voxel-overlap accuracies between automatic and manual labels were lower for the more pathologically heterogeneous Sunnybrook study in comparison to the ADNI-1 sample. Finally, accuracy among protocols appears to significantly differ the most in AD subjects compared to MCI and normal elders. Together, these results suggest that selection of a candidate protocol for fully automatic multi-template based segmentation in AD can influence both segmentation accuracy when compared to expert manual labels and performance as a biomarker in MCI and AD. Copyright © 2012 Elsevier Inc. All rights reserved.
Nestor, Sean M.; Gibson, Erin; Gao, Fu-Qiang; Kiss, Alex; Black, Sandra E.
2012-01-01
Hippocampal volumetry derived from structural MRI is increasingly used to delineate regions of interest for functional measurements, assess efficacy in therapeutic trials of Alzheimer’s disease (AD) and has been endorsed by the new AD diagnostic guidelines as a radiological marker of disease progression. Unfortunately, morphological heterogeneity in AD can prevent accurate demarcation of the hippocampus. Recent developments in automated volumetry commonly use multitemplate fusion driven by expert manual labels, enabling highly accurate and reproducible segmentation in disease and healthy subjects. However, there are several protocols to define the hippocampus anatomically in vivo, and the method used to generate atlases may impact automatic accuracy and sensitivity – particularly in pathologically heterogeneous samples. Here we report a fully automated segmentation technique that provides a robust platform to directly evaluate both technical and biomarker performance in AD among anatomically unique labeling protocols. For the first time we test head-to-head the performance of five common hippocampal labeling protocols for multi-atlas based segmentation, using both the Sunnybrook Longitudinal Dementia Study and the entire Alzheimer’s Disease Neuroimaging Initiative 1 (ADNI-1) baseline and 24-month dataset. We based these atlas libraries on the protocols of (Haller et al., 1997; Killiany et al., 1993; Malykhin et al., 2007; Pantel et al., 2000; Pruessner et al., 2000), and a single operator performed all manual tracings to generate de facto “ground truth” labels. All methods distinguished between normal elders, mild cognitive impairment (MCI), and AD in the expected directions, and showed comparable correlations with measures of episodic memory performance. Only more inclusive protocols distinguished between stable MCI and MCI-to-AD converters, and had slightly better associations with episodic memory. Moreover, we demonstrate that protocols including more posterior anatomy and dorsal white matter compartments furnish the best voxel-overlap accuracies (Dice Similarity Coefficient = 0.87–0.89), compared to expert manual tracings, and achieve the smallest sample sizes required to power clinical trials in MCI and AD. The greatest distribution of errors was localized to the caudal hippocampus and alveus-fimbria compartment when these regions were excluded. The definition of the medial body did not significantly alter accuracy among more comprehensive protocols. Voxel-overlap accuracies between automatic and manual labels were lower for the more pathologically heterogeneous Sunnybrook study in comparison to the ADNI-1 sample. Finally, accuracy among protocols appears to significantly differ the most in AD subjects compared to MCI and normal elders. Together, these results suggest that selection of a candidate protocol for fully automatic multi-template based segmentation in AD can influence both segmentation accuracy when compared to expert manual labels and performance as a biomarker in MCI and AD. PMID:23142652
Accuracy Assessment of Coastal Topography Derived from Uav Images
NASA Astrophysics Data System (ADS)
Long, N.; Millescamps, B.; Pouget, F.; Dumon, A.; Lachaussée, N.; Bertin, X.
2016-06-01
To monitor coastal environments, Unmanned Aerial Vehicle (UAV) is a low-cost and easy to use solution to enable data acquisition with high temporal frequency and spatial resolution. Compared to Light Detection And Ranging (LiDAR) or Terrestrial Laser Scanning (TLS), this solution produces Digital Surface Model (DSM) with a similar accuracy. To evaluate the DSM accuracy on a coastal environment, a campaign was carried out with a flying wing (eBee) combined with a digital camera. Using the Photoscan software and the photogrammetry process (Structure From Motion algorithm), a DSM and an orthomosaic were produced. Compared to GNSS surveys, the DSM accuracy is estimated. Two parameters are tested: the influence of the methodology (number and distribution of Ground Control Points, GCPs) and the influence of spatial image resolution (4.6 cm vs 2 cm). The results show that this solution is able to reproduce the topography of a coastal area with a high vertical accuracy (< 10 cm). The georeferencing of the DSM require a homogeneous distribution and a large number of GCPs. The accuracy is correlated with the number of GCPs (use 19 GCPs instead of 10 allows to reduce the difference of 4 cm); the required accuracy should be dependant of the research problematic. Last, in this particular environment, the presence of very small water surfaces on the sand bank does not allow to improve the accuracy when the spatial resolution of images is decreased.
Classification of stellar spectra with SVM based on within-class scatter and between-class scatter
NASA Astrophysics Data System (ADS)
Liu, Zhong-bao; Zhou, Fang-xiao; Qin, Zhen-tao; Luo, Xue-gang; Zhang, Jing
2018-07-01
Support Vector Machine (SVM) is a popular data mining technique, and it has been widely applied in astronomical tasks, especially in stellar spectra classification. Since SVM doesn't take the data distribution into consideration, and therefore, its classification efficiencies can't be greatly improved. Meanwhile, SVM ignores the internal information of the training dataset, such as the within-class structure and between-class structure. In view of this, we propose a new classification algorithm-SVM based on Within-Class Scatter and Between-Class Scatter (WBS-SVM) in this paper. WBS-SVM tries to find an optimal hyperplane to separate two classes. The difference is that it incorporates minimum within-class scatter and maximum between-class scatter in Linear Discriminant Analysis (LDA) into SVM. These two scatters represent the distributions of the training dataset, and the optimization of WBS-SVM ensures the samples in the same class are as close as possible and the samples in different classes are as far as possible. Experiments on the K-, F-, G-type stellar spectra from Sloan Digital Sky Survey (SDSS), Data Release 8 show that our proposed WBS-SVM can greatly improve the classification accuracies.
Quantum-assisted learning of graphical models with arbitrary pairwise connectivity
NASA Astrophysics Data System (ADS)
Realpe-Gómez, John; Benedetti, Marcello; Biswas, Rupak; Perdomo-Ortiz, Alejandro
Mainstream machine learning techniques rely heavily on sampling from generally intractable probability distributions. There is increasing interest in the potential advantages of using quantum computing technologies as sampling engines to speedup these tasks. However, some pressing challenges in state-of-the-art quantum annealers have to be overcome before we can assess their actual performance. The sparse connectivity, resulting from the local interaction between quantum bits in physical hardware implementations, is considered the most severe limitation to the quality of constructing powerful machine learning models. Here we show how to surpass this `curse of limited connectivity' bottleneck and illustrate our findings by training probabilistic generative models with arbitrary pairwise connectivity on a real dataset of handwritten digits and two synthetic datasets in experiments with up to 940 quantum bits. Our model can be trained in quantum hardware without full knowledge of the effective parameters specifying the corresponding Boltzmann-like distribution. Therefore, the need to infer the effective temperature at each iteration is avoided, speeding up learning, and the effect of noise in the control parameters is mitigated, improving accuracy. This work was supported in part by NASA, AFRL, ODNI, and IARPA.
Consistent Adjoint Driven Importance Sampling using Space, Energy and Angle
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peplow, Douglas E.; Mosher, Scott W; Evans, Thomas M
2012-08-01
For challenging radiation transport problems, hybrid methods combine the accuracy of Monte Carlo methods with the global information present in deterministic methods. One of the most successful hybrid methods is CADIS Consistent Adjoint Driven Importance Sampling. This method uses a deterministic adjoint solution to construct a biased source distribution and consistent weight windows to optimize a specific tally in a Monte Carlo calculation. The method has been implemented into transport codes using just the spatial and energy information from the deterministic adjoint and has been used in many applications to compute tallies with much higher figures-of-merit than analog calculations. CADISmore » also outperforms user-supplied importance values, which usually take long periods of user time to develop. This work extends CADIS to develop weight windows that are a function of the position, energy, and direction of the Monte Carlo particle. Two types of consistent source biasing are presented: one method that biases the source in space and energy while preserving the original directional distribution and one method that biases the source in space, energy, and direction. Seven simple example problems are presented which compare the use of the standard space/energy CADIS with the new space/energy/angle treatments.« less
NASA Astrophysics Data System (ADS)
Buzulukov, Yu; Antsiferova, A.; Demin, V. A.; Demin, V. F.; Kashkarov, P.
2015-11-01
The method to measure the mass of inorganic nanoparticles in biological (or any other samples) using nanoparticles labeled with radioactive tracers is developed and applied to practice. The tracers are produced in original nanoparticles by radioactive activation of some of their atomic nuclei. The method of radioactive tracers demonstrates a sensitivity, specificity and accuracy equal or better than popular methods of optical and mass spectrometry, or electron microscopy and has some specific advantages. The method can be used for study of absorption, distribution, metabolism and excretion in living organism, as well as in ecological and fundamental research. It was used in practice to study absorption, distribution, metabolism and excretion of nanoparticles of Ag, Au, Se, ZnO, TiO2 as well as to study transportation of silver nanoparticles through the barriers of blood-brain, placenta and milk gland of rats. Brief descriptions of data obtained in experiments with application of this method included in the article. The method was certified in Russian Federation standard system GOST-R and recommended by the Russian Federation regulation authority ROSPOTREBNADZOR for measuring of toxicokinetic and organotropy parameters of nanoparticles.
Antonini, Filippo; Fuccio, Lorenzo; Giorgini, Sara; Fabbri, Carlo; Frazzoni, Leonardo; Scarpelli, Marina; Macarri, Giampiero
2017-08-01
While the presence of biliary stent significantly decreases the accuracy of endoscopic ultrasound (EUS) for pancreatic head cancer staging, its impact on the EUS-guided sampling accuracy is still debated. Furthermore, data on EUS-fine needle biopsy (EUS-FNB) using core biopsy needles in patients with pancreatic mass and biliary stent are lacking. The aim of this study was to evaluate the influence of biliary stent on the adequacy and accuracy of EUS-FNB in patients with pancreatic head mass. All patients who underwent EUS-guided sampling with core needles of solid pancreatic head masses causing obstructive jaundice were retrospectively identified in a single tertiary referral center. Adequacy, defined as the rate of cases in which a tissue specimen for proper examination was achieved, with and without biliary stent, was the primary outcome measure. The diagnostic accuracy and complication rate were the secondary outcome measures. A total of 130 patients with pancreatic head mass causing biliary obstruction were included in the study: 74 cases of them were sampled without stent and 56 cases with plastic stent in situ. The adequacy was 96.4% in the stent group and 90.5% in the group without stent (p=0.190). No significant differences were observed for sensitivity (88.9% vs. 85.9%), specificity (100% for both groups), and accuracy (89.3% vs. 86.5%) between those with and without stent, respectively. The accuracy was not influenced by the timing of stenting (<48h or ≥48h before EUS). No EUS-FNB related complications were recorded. The presence of biliary stent does not influence the tissue sampling adequacy, the diagnostic accuracy and the complication rate of EUS-FNB of pancreatic head masses performed with core biopsy needles. Copyright © 2017 Editrice Gastroenterologica Italiana S.r.l. Published by Elsevier Ltd. All rights reserved.
Good Practices in Free-energy Calculations
NASA Technical Reports Server (NTRS)
Pohorille, Andrew; Jarzynski, Christopher; Chipot, Christopher
2013-01-01
As access to computational resources continues to increase, free-energy calculations have emerged as a powerful tool that can play a predictive role in drug design. Yet, in a number of instances, the reliability of these calculations can be improved significantly if a number of precepts, or good practices are followed. For the most part, the theory upon which these good practices rely has been known for many years, but often overlooked, or simply ignored. In other cases, the theoretical developments are too recent for their potential to be fully grasped and merged into popular platforms for the computation of free-energy differences. The current best practices for carrying out free-energy calculations will be reviewed demonstrating that, at little to no additional cost, free-energy estimates could be markedly improved and bounded by meaningful error estimates. In energy perturbation and nonequilibrium work methods, monitoring the probability distributions that underlie the transformation between the states of interest, performing the calculation bidirectionally, stratifying the reaction pathway and choosing the most appropriate paradigms and algorithms for transforming between states offer significant gains in both accuracy and precision. In thermodynamic integration and probability distribution (histogramming) methods, properly designed adaptive techniques yield nearly uniform sampling of the relevant degrees of freedom and, by doing so, could markedly improve efficiency and accuracy of free energy calculations without incurring any additional computational expense.
Wang, Zhi-Jie; Jiao, Ju-Ying; Lei, Bo; Su, Yuan
2015-09-01
Remote sensing can provide large-scale spatial data for the detection of vegetation types. In this study, two shortwave infrared spectral bands (TM5 and TM7) and one visible spectral band (TM3) of Landsat 5 TM data were used to detect five typical vegetation types (communities dominated by Bothriochloa ischaemum, Artemisia gmelinii, Hippophae rhamnoides, Robinia pseudoacacia, and Quercus liaotungensis) using 270 field survey data in the Yanhe watershed on the Loess Plateau. The relationships between 200 field data points and their corresponding radiance reflectance were analyzed, and the equation termed the vegetation type index (VTI) was generated. The VTI values of five vegetation types were calculated, and the accuracy was tested using the remaining 70 field data points. The applicability of VTI was also tested by the distribution of vegetation type of two small watersheds in the Yanhe watershed and field sample data collected from other regions (Ziwuling Region, Huangling County, and Luochuan County) on the Loess Plateau. The results showed that the VTI can effectively detect the five vegetation types with an average accuracy exceeding 80 % and a representativeness above 85 %. As a new approach for monitoring vegetation types using remote sensing at a larger regional scale, VTI can play an important role in the assessment of vegetation restoration and in the investigation of the spatial distribution and community diversity of vegetation on the Loess Plateau.
Lange, Berit; Cohn, Jennifer; Roberts, Teri; Camp, Johannes; Chauffour, Jeanne; Gummadi, Nina; Ishizaki, Azumi; Nagarathnam, Anupriya; Tuaillon, Edouard; van de Perre, Philippe; Pichler, Christine; Easterbrook, Philippa; Denkinger, Claudia M
2017-11-01
Dried blood spots (DBS) are a convenient tool to enable diagnostic testing for viral diseases due to transport, handling and logistical advantages over conventional venous blood sampling. A better understanding of the performance of serological testing for hepatitis C (HCV) and hepatitis B virus (HBV) from DBS is important to enable more widespread use of this sampling approach in resource limited settings, and to inform the 2017 World Health Organization (WHO) guidance on testing for HBV/HCV. We conducted two systematic reviews and meta-analyses on the diagnostic accuracy of HCV antibody (HCV-Ab) and HBV surface antigen (HBsAg) from DBS samples compared to venous blood samples. MEDLINE, EMBASE, Global Health and Cochrane library were searched for studies that assessed diagnostic accuracy with DBS and agreement between DBS and venous sampling. Heterogeneity of results was assessed and where possible a pooled analysis of sensitivity and specificity was performed using a bivariate analysis with maximum likelihood estimate and 95% confidence intervals (95%CI). We conducted a narrative review on the impact of varying storage conditions or limits of detection in subsets of samples. The QUADAS-2 tool was used to assess risk of bias. For the diagnostic accuracy of HBsAg from DBS compared to venous blood, 19 studies were included in a quantitative meta-analysis, and 23 in a narrative review. Pooled sensitivity and specificity were 98% (95%CI:95%-99%) and 100% (95%CI:99-100%), respectively. For the diagnostic accuracy of HCV-Ab from DBS, 19 studies were included in a pooled quantitative meta-analysis, and 23 studies were included in a narrative review. Pooled estimates of sensitivity and specificity were 98% (CI95%:95-99) and 99% (CI95%:98-100), respectively. Overall quality of studies and heterogeneity were rated as moderate in both systematic reviews. HCV-Ab and HBsAg testing using DBS compared to venous blood sampling was associated with excellent diagnostic accuracy. However, generalizability is limited as no uniform protocol was applied and most studies did not use fresh samples. Future studies on diagnostic accuracy should include an assessment of impact of environmental conditions common in low resource field settings. Manufacturers also need to formally validate their assays for DBS for use with their commercial assays.
Utilizing Metalized Fabrics for Liquid and Rip Detection and Localization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holland, Stephen; Mahan, Cody; Kuhn, Michael J
2013-01-01
This paper proposes a novel technique for utilizing conductive textiles as a distributed sensor for detecting and localizing liquids (e.g., blood), rips (e.g., bullet holes), and potentially biosignals. The proposed technique is verified through both simulation and experimental measurements. Circuit theory is utilized to depict conductive fabric as a bounded, near-infinite grid of resistors. Solutions to the well-known infinite resistance grid problem are used to confirm the accuracy and validity of this modeling approach. Simulations allow for discontinuities to be placed within the resistor matrix to illustrate the effects of bullet holes within the fabric. A real-time experimental system wasmore » developed that uses a multiplexed Wheatstone bridge approach to reconstruct the resistor grid across the conductive fabric and detect liquids and rips. The resistor grid model is validated through a comparison of simulated and experimental results. Results suggest accuracy proportional to the electrode spacing in determining the presence and location of discontinuities in conductive fabric samples. Future work is focused on refining the experimental system to provide more accuracy in detecting and localizing events as well as developing a complete prototype that can be deployed for field testing. Potential applications include intelligent clothing, flexible, lightweight sensing systems, and combat wound detection.« less
Liu, Derek; Sloboda, Ron S
2014-05-01
Boyer and Mok proposed a fast calculation method employing the Fourier transform (FT), for which calculation time is independent of the number of seeds but seed placement is restricted to calculation grid points. Here an interpolation method is described enabling unrestricted seed placement while preserving the computational efficiency of the original method. The Iodine-125 seed dose kernel was sampled and selected values were modified to optimize interpolation accuracy for clinically relevant doses. For each seed, the kernel was shifted to the nearest grid point via convolution with a unit impulse, implemented in the Fourier domain. The remaining fractional shift was performed using a piecewise third-order Lagrange filter. Implementation of the interpolation method greatly improved FT-based dose calculation accuracy. The dose distribution was accurate to within 2% beyond 3 mm from each seed. Isodose contours were indistinguishable from explicit TG-43 calculation. Dose-volume metric errors were negligible. Computation time for the FT interpolation method was essentially the same as Boyer's method. A FT interpolation method for permanent prostate brachytherapy TG-43 dose calculation was developed which expands upon Boyer's original method and enables unrestricted seed placement. The proposed method substantially improves the clinically relevant dose accuracy with negligible additional computation cost, preserving the efficiency of the original method.
A three-parameter model for classifying anurans into four genera based on advertisement calls.
Gingras, Bruno; Fitch, William Tecumseh
2013-01-01
The vocalizations of anurans are innate in structure and may therefore contain indicators of phylogenetic history. Thus, advertisement calls of species which are more closely related phylogenetically are predicted to be more similar than those of distant species. This hypothesis was evaluated by comparing several widely used machine-learning algorithms. Recordings of advertisement calls from 142 species belonging to four genera were analyzed. A logistic regression model, using mean values for dominant frequency, coefficient of variation of root-mean square energy, and spectral flux, correctly classified advertisement calls with regard to genus with an accuracy above 70%. Similar accuracy rates were obtained using these parameters with a support vector machine model, a K-nearest neighbor algorithm, and a multivariate Gaussian distribution classifier, whereas a Gaussian mixture model performed slightly worse. In contrast, models based on mel-frequency cepstral coefficients did not fare as well. Comparable accuracy levels were obtained on out-of-sample recordings from 52 of the 142 original species. The results suggest that a combination of low-level acoustic attributes is sufficient to discriminate efficiently between the vocalizations of these four genera, thus supporting the initial premise and validating the use of high-throughput algorithms on animal vocalizations to evaluate phylogenetic hypotheses.
An efficient algorithm for generating random number pairs drawn from a bivariate normal distribution
NASA Technical Reports Server (NTRS)
Campbell, C. W.
1983-01-01
An efficient algorithm for generating random number pairs from a bivariate normal distribution was developed. Any desired value of the two means, two standard deviations, and correlation coefficient can be selected. Theoretically the technique is exact and in practice its accuracy is limited only by the quality of the uniform distribution random number generator, inaccuracies in computer function evaluation, and arithmetic. A FORTRAN routine was written to check the algorithm and good accuracy was obtained. Some small errors in the correlation coefficient were observed to vary in a surprisingly regular manner. A simple model was developed which explained the qualities aspects of the errors.
Lee, Bang Yeon; Kang, Su-Tae; Yun, Hae-Bum; Kim, Yun Yong
2016-01-12
The distribution of fiber orientation is an important factor in determining the mechanical properties of fiber-reinforced concrete. This study proposes a new image analysis technique for improving the evaluation accuracy of fiber orientation distribution in the sectional image of fiber-reinforced concrete. A series of tests on the accuracy of fiber detection and the estimation performance of fiber orientation was performed on artificial fiber images to assess the validity of the proposed technique. The validation test results showed that the proposed technique estimates the distribution of fiber orientation more accurately than the direct measurement of fiber orientation by image analysis.
Lee, Bang Yeon; Kang, Su-Tae; Yun, Hae-Bum; Kim, Yun Yong
2016-01-01
The distribution of fiber orientation is an important factor in determining the mechanical properties of fiber-reinforced concrete. This study proposes a new image analysis technique for improving the evaluation accuracy of fiber orientation distribution in the sectional image of fiber-reinforced concrete. A series of tests on the accuracy of fiber detection and the estimation performance of fiber orientation was performed on artificial fiber images to assess the validity of the proposed technique. The validation test results showed that the proposed technique estimates the distribution of fiber orientation more accurately than the direct measurement of fiber orientation by image analysis. PMID:28787839
Cross-coherent vector sensor processing for spatially distributed glider networks.
Nichols, Brendan; Sabra, Karim G
2015-09-01
Autonomous underwater gliders fitted with vector sensors can be used as a spatially distributed sensor array to passively locate underwater sources. However, to date, the positional accuracy required for robust array processing (especially coherent processing) is not achievable using dead-reckoning while the gliders remain submerged. To obtain such accuracy, the gliders can be temporarily surfaced to allow for global positioning system contact, but the acoustically active sea surface introduces locally additional sensor noise. This letter demonstrates that cross-coherent array processing, which inherently mitigates the effects of local noise, outperforms traditional incoherent processing source localization methods for this spatially distributed vector sensor network.
Dem Local Accuracy Patterns in Land-Use/Land-Cover Classification
NASA Astrophysics Data System (ADS)
Katerji, Wassim; Farjas Abadia, Mercedes; Morillo Balsera, Maria del Carmen
2016-01-01
Global and nation-wide DEM do not preserve the same height accuracy throughout the area of study. Instead of assuming a single RMSE value for the whole area, this study proposes a vario-model that divides the area into sub-regions depending on the land-use / landcover (LULC) classification, and assigns a local accuracy per each zone, as these areas share similar terrain formation and roughness, and tend to have similar DEM accuracies. A pilot study over Lebanon using the SRTM and ASTER DEMs, combined with a set of 1,105 randomly distributed ground control points (GCPs) showed that even though the inputDEMs have different spatial and temporal resolution, and were collected using difierent techniques, their accuracy varied similarly when changing over difierent LULC classes. Furthermore, validating the generated vario-models proved that they provide a closer representation of the accuracy to the validating GCPs than the conventional RMSE, by 94% and 86% for the SRTMand ASTER respectively. Geostatistical analysis of the input and output datasets showed that the results have a normal distribution, which support the generalization of the proven hypothesis, making this finding applicable to other input datasets anywhere around the world.
RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2007-01-01
Background The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The T-tests may be used to measure the reliability of reported statistics. When combined with reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253
Capalbo, Antonio; Wright, Graham; Elliott, Thomas; Ubaldi, Filippo Maria; Rienzi, Laura; Nagy, Zsolt Peter
2013-08-01
Does comprehensive chromosome screening (CCS) of cells sampled from the blastocyst trophectoderm (TE) accurately predict the chromosome complement of the inner cell mass (ICM)? Comprehensive chromosome screening of a TE sample is unlikely to be confounded by mosaicism and has the potential for high diagnostic accuracy. The effectiveness of chromosome aneuploidy screening is limited by the technologies available and chromosome mosaicism in the embryo. Combined with improving methods for cryopreservation and blastocyst culture, TE biopsy and CCS is considered to be a promising approach to select diploid embryos for transfer. The study was performed between January 2011 and August 2011. In the first part, a new ICM isolation method was developed and tested on 20 good morphology blastocysts. In the main phase of the study, fluorescence in situ hybridization (FISH) was used to reanalyse the ICMs and TEs separated from 70 embryos obtained from 26 patients undergoing blastocyst stage array comparative genome hybridization (aCGH) PGS cycles. The isolated ICM and TE fractions were characterized by immunostaining for KRT18. Then, non-transferrable cryopreserved embryos were selected for the FISH reanalysis based on previous genetic diagnosis obtained by TE aCGH analysis. Blastocysts either diploid for chromosome copy number (20) or diagnosed as single- (40) or double aneuploid (10) were included after preparing the embryo into one ICM and three equal-sized TE sections. Accuracy of the aCGH was measured based on FISH reanalysis. Chromosomal segregations resulting in diploid/aneuploid mosaicism were classified as 'low-', 'medium-' and 'high-' grade and categorized with respect to their distribution (1TE, 2TE, 3TE, ICM or ALL embryo). Linear regression model was used to test the relationship between the distributions and the proportion of aneuploid cells across the four embryo sections. Fisher's exact test was used to test for random allocation of aneuploid cells between TE and ICM. All ICM biopsy procedures displayed ICM cells in the recovered fraction with a mean number of ICM cells of 26.2 and a mean TE cell contamination rate of 2%. By FISH reanalysis of previously aCGH-screened blastocysts, a total of 66 aneuploidies were scored, 52 (78.8%) observed in all cells and 14 (21.2%) mosaic. Overall, mosaic chromosomal errors were observed only in 11 out of 70 blastocysts (15.7%) but only 2 cases were classified as mosaic diploid/aneuploid (2.9%). Sensitivity and specificity of aCGH on TE clinical biopsies were 98.0 and 100% per embryo and 95.2 and 99.8% per chromosome, respectively. Linear regression analysis performed on the 11 mosaic diploid/aneuploid chromosomal segregations showed a significant positive correlation between the distribution and the proportion of aneuploid cells across the four-blastocyst sections (P < 0.01). In addition, regression analysis revealed that both the grade and the distribution of mosaic abnormal cells were significantly correlated with the likelihood of being diagnosed by aCGH performed on clinical TE biopsies (P = 0.019 and P < 0.01, respectively). Fisher's exact test for the 66 aneuploidies recorded showed no preferential allocation of abnormal cells between ICM and TE (P = 0.33). The study is limited to non-transferable embryos, reanalyzed for only nine chromosomes and excludes segmental imbalance and uniparental disomy. The prevalence of aneuploidy in the study group is likely to be higher than in the general population of clinical PGD embryos. This study showed high accuracy of diagnosis achievable during blastocyst stage PGS cycles coupled with 24-chromosomes molecular karyotyping analysis. The new ICM isolation strategy developed may open new possibilities for basic research in embryology and for clinical grade derivation of human embryonic stem cells. No specific funding was sought or obtained for this study.
Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei
2018-02-01
Diagnosis of Parkinson's disease (PD) based on speech data has been proved to be an effective way in recent years. However, current researches just care about the feature extraction and classifier design, and do not consider the instance selection. Former research by authors showed that the instance selection can lead to improvement on classification accuracy. However, no attention is paid on the relationship between speech sample and feature until now. Therefore, a new diagnosis algorithm of PD is proposed in this paper by simultaneously selecting speech sample and feature based on relevant feature weighting algorithm and multiple kernel method, so as to find their synergy effects, thereby improving classification accuracy. Experimental results showed that this proposed algorithm obtained apparent improvement on classification accuracy. It can obtain mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. Besides, the proposed algorithm detected the synergy effects of speech sample and feature, which is valuable for speech marker extraction.
Blagus, Rok; Lusa, Lara
2015-11-04
Prediction models are used in clinical research to develop rules that can be used to accurately predict the outcome of the patients based on some of their characteristics. They represent a valuable tool in the decision making process of clinicians and health policy makers, as they enable them to estimate the probability that patients have or will develop a disease, will respond to a treatment, or that their disease will recur. The interest devoted to prediction models in the biomedical community has been growing in the last few years. Often the data used to develop the prediction models are class-imbalanced as only few patients experience the event (and therefore belong to minority class). Prediction models developed using class-imbalanced data tend to achieve sub-optimal predictive accuracy in the minority class. This problem can be diminished by using sampling techniques aimed at balancing the class distribution. These techniques include under- and oversampling, where a fraction of the majority class samples are retained in the analysis or new samples from the minority class are generated. The correct assessment of how the prediction model is likely to perform on independent data is of crucial importance; in the absence of an independent data set, cross-validation is normally used. While the importance of correct cross-validation is well documented in the biomedical literature, the challenges posed by the joint use of sampling techniques and cross-validation have not been addressed. We show that care must be taken to ensure that cross-validation is performed correctly on sampled data, and that the risk of overestimating the predictive accuracy is greater when oversampling techniques are used. Examples based on the re-analysis of real datasets and simulation studies are provided. We identify some results from the biomedical literature where the incorrect cross-validation was performed, where we expect that the performance of oversampling techniques was heavily overestimated.
Improved Time-Lapsed Angular Scattering Microscopy of Single Cells
NASA Astrophysics Data System (ADS)
Cannaday, Ashley E.
By measuring angular scattering patterns from biological samples and fitting them with a Mie theory model, one can estimate the organelle size distribution within many cells. Quantitative organelle sizing of ensembles of cells using this method has been well established. Our goal is to develop the methodology to extend this approach to the single cell level, measuring the angular scattering at multiple time points and estimating the non-nuclear organelle size distribution parameters. The diameters of individual organelle-size beads were successfully extracted using scattering measurements with a minimum deflection angle of 20 degrees. However, the accuracy of size estimates can be limited by the angular range detected. In particular, simulations by our group suggest that, for cell organelle populations with a broader size distribution, the accuracy of size prediction improves substantially if the minimum angle of detection angle is 15 degrees or less. The system was therefore modified to collect scattering angles down to 10 degrees. To confirm experimentally that size predictions will become more stable when lower scattering angles are detected, initial validations were performed on individual polystyrene beads ranging in diameter from 1 to 5 microns. We found that the lower minimum angle enabled the width of this delta-function size distribution to be predicted more accurately. Scattering patterns were then acquired and analyzed from single mouse squamous cell carcinoma cells at multiple time points. The scattering patterns exhibit angular dependencies that look unlike those of any single sphere size, but are well-fit by a broad distribution of sizes, as expected. To determine the fluctuation level in the estimated size distribution due to measurement imperfections alone, formaldehyde-fixed cells were measured. Subsequent measurements on live (non-fixed) cells revealed an order of magnitude greater fluctuation in the estimated sizes compared to fixed cells. With our improved and better-understood approach to single cell angular scattering, we are now capable of reliably detecting changes in organelle size predictions due to biological causes above our measurement error of 20 nm, which enables us to apply our system to future studies of the investigation of various single cell biological processes.
Abstract for poster presentation:
Site-specific accuracy assessments evaluate fine-scale accuracy of land-use/land-cover(LULC) datasets but provide little insight into accuracy of area estimates of LULC
classes derived from sampling units of varying size. Additiona...
Fulford, Janice M.; Clayton, Christopher S.
2015-10-09
The calibration device and proposed method were used to calibrate a sample of in-service USGS steel and electric groundwater tapes. The sample of in-service groundwater steel tapes were in relatively good condition. All steel tapes, except one, were accurate to ±0.01 ft per 100 ft over their entire length. One steel tape, which had obvious damage in the first hundred feet, was marginally outside the accuracy of ±0.01 ft per 100 ft by 0.001 ft. The sample of in-service groundwater-level electric tapes were in a range of conditions—from like new, with cosmetic damage, to nonfunctional. The in-service electric tapes did not meet the USGS accuracy recommendation of ±0.01 ft. In-service electric tapes, except for the nonfunctional tape, were accurate to about ±0.03 ft per 100 ft. A comparison of new with in-service electric tapes found that steel-core electric tapes maintained their length and accuracy better than electric tapes without a steel core. The in-service steel tapes could be used as is and achieve USGS accuracy recommendations for groundwater-level measurements. The in-service electric tapes require tape corrections to achieve USGS accuracy recommendations for groundwater-level measurement.