Fisher classifier and its probability of error estimation
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
Computationally efficient expressions are derived for estimating the probability of error using the leave-one-out method. The optimal threshold for the classification of patterns projected onto Fisher's direction is derived. A simple generalization of the Fisher classifier to multiple classes is presented. Computational expressions are developed for estimating the probability of error of the multiclass Fisher classifier.
Inverse sequential detection of parameter changes in developing time series
NASA Technical Reports Server (NTRS)
Radok, Uwe; Brown, Timothy J.
1992-01-01
Progressive values of two probabilities are obtained for parameter estimates derived from an existing set of values and from the same set enlarged by one or more new values, respectively. One probability is that of erroneously preferring the second of these estimates for the existing data ('type 1 error'), while the second probability is that of erroneously accepting their estimates for the enlarged test ('type 2 error'). A more stable combined 'no change' probability which always falls between 0.5 and 0 is derived from the (logarithmic) width of the uncertainty region of an equivalent 'inverted' sequential probability ratio test (SPRT, Wald 1945) in which the error probabilities are calculated rather than prescribed. A parameter change is indicated when the compound probability undergoes a progressive decrease. The test is explicitly formulated and exemplified for Gaussian samples.
Estimating parameters for probabilistic linkage of privacy-preserved datasets.
Brown, Adrian P; Randall, Sean M; Ferrante, Anna M; Semmens, James B; Boyd, James H
2017-07-10
Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the probabilities and threshold values for probabilistic privacy-preserved record linkage using Bloom filters. Our method was tested through a simulation study using synthetic data, followed by an application using real-world administrative data. Synthetic datasets were generated with error rates from zero to 20% error. Our method was used to estimate parameters (probabilities and thresholds) for de-duplication linkages. Linkage quality was determined by F-measure. Each dataset was privacy-preserved using separate Bloom filters for each field. Match probabilities were estimated using the expectation-maximisation (EM) algorithm on the privacy-preserved data. Threshold cut-off values were determined by an extension to the EM algorithm allowing linkage quality to be estimated for each possible threshold. De-duplication linkages of each privacy-preserved dataset were performed using both estimated and calculated probabilities. Linkage quality using the F-measure at the estimated threshold values was also compared to the highest F-measure. Three large administrative datasets were used to demonstrate the applicability of the probability and threshold estimation technique on real-world data. Linkage of the synthetic datasets using the estimated probabilities produced an F-measure that was comparable to the F-measure using calculated probabilities, even with up to 20% error. Linkage of the administrative datasets using estimated probabilities produced an F-measure that was higher than the F-measure using calculated probabilities. Further, the threshold estimation yielded results for F-measure that were only slightly below the highest possible for those probabilities. The method appears highly accurate across a spectrum of datasets with varying degrees of error. As there are few alternatives for parameter estimation, the approach is a major step towards providing a complete operational approach for probabilistic linkage of privacy-preserved datasets.
Error Patterns in Ordering Fractions among At-Risk Fourth-Grade Students
Malone, Amelia S.; Fuchs, Lynn S.
2016-01-01
The 3 purposes of this study were to: (a) describe fraction ordering errors among at-risk 4th-grade students; (b) assess the effect of part-whole understanding and accuracy of fraction magnitude estimation on the probability of committing errors; and (c) examine the effect of students' ability to explain comparing problems on the probability of committing errors. Students (n = 227) completed a 9-item ordering test. A high proportion (81%) of problems were completed incorrectly. Most (65% of) errors were due to students misapplying whole number logic to fractions. Fraction-magnitude estimation skill, but not part-whole understanding, significantly predicted the probability of committing this type of error. Implications for practice are discussed. PMID:26966153
Human Error Analysis in a Permit to Work System: A Case Study in a Chemical Plant
Jahangiri, Mehdi; Hoboubi, Naser; Rostamabadi, Akbar; Keshavarzi, Sareh; Hosseini, Ali Akbar
2015-01-01
Background A permit to work (PTW) is a formal written system to control certain types of work which are identified as potentially hazardous. However, human error in PTW processes can lead to an accident. Methods This cross-sectional, descriptive study was conducted to estimate the probability of human errors in PTW processes in a chemical plant in Iran. In the first stage, through interviewing the personnel and studying the procedure in the plant, the PTW process was analyzed using the hierarchical task analysis technique. In doing so, PTW was considered as a goal and detailed tasks to achieve the goal were analyzed. In the next step, the standardized plant analysis risk-human (SPAR-H) reliability analysis method was applied for estimation of human error probability. Results The mean probability of human error in the PTW system was estimated to be 0.11. The highest probability of human error in the PTW process was related to flammable gas testing (50.7%). Conclusion The SPAR-H method applied in this study could analyze and quantify the potential human errors and extract the required measures for reducing the error probabilities in PTW system. Some suggestions to reduce the likelihood of errors, especially in the field of modifying the performance shaping factors and dependencies among tasks are provided. PMID:27014485
Butler, Troy; Wildey, Timothy
2018-01-01
In thist study, we develop a procedure to utilize error estimates for samples of a surrogate model to compute robust upper and lower bounds on estimates of probabilities of events. We show that these error estimates can also be used in an adaptive algorithm to simultaneously reduce the computational cost and increase the accuracy in estimating probabilities of events using computationally expensive high-fidelity models. Specifically, we introduce the notion of reliability of a sample of a surrogate model, and we prove that utilizing the surrogate model for the reliable samples and the high-fidelity model for the unreliable samples gives preciselymore » the same estimate of the probability of the output event as would be obtained by evaluation of the original model for each sample. The adaptive algorithm uses the additional evaluations of the high-fidelity model for the unreliable samples to locally improve the surrogate model near the limit state, which significantly reduces the number of high-fidelity model evaluations as the limit state is resolved. Numerical results based on a recently developed adjoint-based approach for estimating the error in samples of a surrogate are provided to demonstrate (1) the robustness of the bounds on the probability of an event, and (2) that the adaptive enhancement algorithm provides a more accurate estimate of the probability of the QoI event than standard response surface approximation methods at a lower computational cost.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Butler, Troy; Wildey, Timothy
In thist study, we develop a procedure to utilize error estimates for samples of a surrogate model to compute robust upper and lower bounds on estimates of probabilities of events. We show that these error estimates can also be used in an adaptive algorithm to simultaneously reduce the computational cost and increase the accuracy in estimating probabilities of events using computationally expensive high-fidelity models. Specifically, we introduce the notion of reliability of a sample of a surrogate model, and we prove that utilizing the surrogate model for the reliable samples and the high-fidelity model for the unreliable samples gives preciselymore » the same estimate of the probability of the output event as would be obtained by evaluation of the original model for each sample. The adaptive algorithm uses the additional evaluations of the high-fidelity model for the unreliable samples to locally improve the surrogate model near the limit state, which significantly reduces the number of high-fidelity model evaluations as the limit state is resolved. Numerical results based on a recently developed adjoint-based approach for estimating the error in samples of a surrogate are provided to demonstrate (1) the robustness of the bounds on the probability of an event, and (2) that the adaptive enhancement algorithm provides a more accurate estimate of the probability of the QoI event than standard response surface approximation methods at a lower computational cost.« less
Nematode Damage Functions: The Problems of Experimental and Sampling Error
Ferris, H.
1984-01-01
The development and use of pest damage functions involves measurement and experimental errors associated with cultural, environmental, and distributional factors. Damage predictions are more valuable if considered with associated probability. Collapsing population densities into a geometric series of population classes allows a pseudo-replication removal of experimental and sampling error in damage function development. Recognition of the nature of sampling error for aggregated populations allows assessment of probability associated with the population estimate. The product of the probabilities incorporated in the damage function and in the population estimate provides a basis for risk analysis of the yield loss prediction and the ensuing management decision. PMID:19295865
Probability shapes perceptual precision: A study in orientation estimation.
Jabar, Syaheed B; Anderson, Britt
2015-12-01
Probability is known to affect perceptual estimations, but an understanding of mechanisms is lacking. Moving beyond binary classification tasks, we had naive participants report the orientation of briefly viewed gratings where we systematically manipulated contingent probability. Participants rapidly developed faster and more precise estimations for high-probability tilts. The shapes of their error distributions, as indexed by a kurtosis measure, also showed a distortion from Gaussian. This kurtosis metric was robust, capturing probability effects that were graded, contextual, and varying as a function of stimulus orientation. Our data can be understood as a probability-induced reduction in the variability or "shape" of estimation errors, as would be expected if probability affects the perceptual representations. As probability manipulations are an implicit component of many endogenous cuing paradigms, changes at the perceptual level could account for changes in performance that might have traditionally been ascribed to "attention." (c) 2015 APA, all rights reserved).
Nonparametric probability density estimation by optimization theoretic techniques
NASA Technical Reports Server (NTRS)
Scott, D. W.
1976-01-01
Two nonparametric probability density estimators are considered. The first is the kernel estimator. The problem of choosing the kernel scaling factor based solely on a random sample is addressed. An interactive mode is discussed and an algorithm proposed to choose the scaling factor automatically. The second nonparametric probability estimate uses penalty function techniques with the maximum likelihood criterion. A discrete maximum penalized likelihood estimator is proposed and is shown to be consistent in the mean square error. A numerical implementation technique for the discrete solution is discussed and examples displayed. An extensive simulation study compares the integrated mean square error of the discrete and kernel estimators. The robustness of the discrete estimator is demonstrated graphically.
Identification of dynamic systems, theory and formulation
NASA Technical Reports Server (NTRS)
Maine, R. E.; Iliff, K. W.
1985-01-01
The problem of estimating parameters of dynamic systems is addressed in order to present the theoretical basis of system identification and parameter estimation in a manner that is complete and rigorous, yet understandable with minimal prerequisites. Maximum likelihood and related estimators are highlighted. The approach used requires familiarity with calculus, linear algebra, and probability, but does not require knowledge of stochastic processes or functional analysis. The treatment emphasizes unification of the various areas in estimation in dynamic systems is treated as a direct outgrowth of the static system theory. Topics covered include basic concepts and definitions; numerical optimization methods; probability; statistical estimators; estimation in static systems; stochastic processes; state estimation in dynamic systems; output error, filter error, and equation error methods of parameter estimation in dynamic systems, and the accuracy of the estimates.
Wald Sequential Probability Ratio Test for Analysis of Orbital Conjunction Data
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell; Markley, F. Landis; Gold, Dara
2013-01-01
We propose a Wald Sequential Probability Ratio Test for analysis of commonly available predictions associated with spacecraft conjunctions. Such predictions generally consist of a relative state and relative state error covariance at the time of closest approach, under the assumption that prediction errors are Gaussian. We show that under these circumstances, the likelihood ratio of the Wald test reduces to an especially simple form, involving the current best estimate of collision probability, and a similar estimate of collision probability that is based on prior assumptions about the likelihood of collision.
Genetic Algorithm-Based Motion Estimation Method using Orientations and EMGs for Robot Controls
Chae, Jeongsook; Jin, Yong; Sung, Yunsick
2018-01-01
Demand for interactive wearable devices is rapidly increasing with the development of smart devices. To accurately utilize wearable devices for remote robot controls, limited data should be analyzed and utilized efficiently. For example, the motions by a wearable device, called Myo device, can be estimated by measuring its orientation, and calculating a Bayesian probability based on these orientation data. Given that Myo device can measure various types of data, the accuracy of its motion estimation can be increased by utilizing these additional types of data. This paper proposes a motion estimation method based on weighted Bayesian probability and concurrently measured data, orientations and electromyograms (EMG). The most probable motion among estimated is treated as a final estimated motion. Thus, recognition accuracy can be improved when compared to the traditional methods that employ only a single type of data. In our experiments, seven subjects perform five predefined motions. When orientation is measured by the traditional methods, the sum of the motion estimation errors is 37.3%; likewise, when only EMG data are used, the error in motion estimation by the proposed method was also 37.3%. The proposed combined method has an error of 25%. Therefore, the proposed method reduces motion estimation errors by 12%. PMID:29324641
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-15
....gov/acs/www/ or contact the Census Bureau's Social, Economic, and Housing Statistics Division at (301...) Sampling Error, which consists of the error that arises from the use of probability sampling to create the... direction; and (2) Sampling Error, which consists of the error that arises from the use of probability...
Field evaluation of distance-estimation error during wetland-dependent bird surveys
Nadeau, Christopher P.; Conway, Courtney J.
2012-01-01
Context: The most common methods to estimate detection probability during avian point-count surveys involve recording a distance between the survey point and individual birds detected during the survey period. Accurately measuring or estimating distance is an important assumption of these methods; however, this assumption is rarely tested in the context of aural avian point-count surveys. Aims: We expand on recent bird-simulation studies to document the error associated with estimating distance to calling birds in a wetland ecosystem. Methods: We used two approaches to estimate the error associated with five surveyor's distance estimates between the survey point and calling birds, and to determine the factors that affect a surveyor's ability to estimate distance. Key results: We observed biased and imprecise distance estimates when estimating distance to simulated birds in a point-count scenario (x̄error = -9 m, s.d.error = 47 m) and when estimating distances to real birds during field trials (x̄error = 39 m, s.d.error = 79 m). The amount of bias and precision in distance estimates differed among surveyors; surveyors with more training and experience were less biased and more precise when estimating distance to both real and simulated birds. Three environmental factors were important in explaining the error associated with distance estimates, including the measured distance from the bird to the surveyor, the volume of the call and the species of bird. Surveyors tended to make large overestimations to birds close to the survey point, which is an especially serious error in distance sampling. Conclusions: Our results suggest that distance-estimation error is prevalent, but surveyor training may be the easiest way to reduce distance-estimation error. Implications: The present study has demonstrated how relatively simple field trials can be used to estimate the error associated with distance estimates used to estimate detection probability during avian point-count surveys. Evaluating distance-estimation errors will allow investigators to better evaluate the accuracy of avian density and trend estimates. Moreover, investigators who evaluate distance-estimation errors could employ recently developed models to incorporate distance-estimation error into analyses. We encourage further development of such models, including the inclusion of such models into distance-analysis software.
Asteroid orbital error analysis: Theory and application
NASA Technical Reports Server (NTRS)
Muinonen, K.; Bowell, Edward
1992-01-01
We present a rigorous Bayesian theory for asteroid orbital error estimation in which the probability density of the orbital elements is derived from the noise statistics of the observations. For Gaussian noise in a linearized approximation the probability density is also Gaussian, and the errors of the orbital elements at a given epoch are fully described by the covariance matrix. The law of error propagation can then be applied to calculate past and future positional uncertainty ellipsoids (Cappellari et al. 1976, Yeomans et al. 1987, Whipple et al. 1991). To our knowledge, this is the first time a Bayesian approach has been formulated for orbital element estimation. In contrast to the classical Fisherian school of statistics, the Bayesian school allows a priori information to be formally present in the final estimation. However, Bayesian estimation does give the same results as Fisherian estimation when no priori information is assumed (Lehtinen 1988, and reference therein).
Method of estimating natural recharge to the Edwards Aquifer in the San Antonio area, Texas
Puente, Celso
1978-01-01
The principal errors in the estimates of annual recharge are related to errors in estimating runoff in ungaged areas, which represent about 30 percent of the infiltration area. The estimated long-term average annual recharge in each basin, however, is probably representative of the actual recharge because the averaging procedure tends to cancel out the major errors.
Error Patterns in Ordering Fractions among At-Risk Fourth-Grade Students
ERIC Educational Resources Information Center
Malone, Amelia Schneider; Fuchs, Lynn S.
2015-01-01
The 3 purposes of this study were to: (a) describe fraction ordering errors among at-risk 4th-grade students; (b) assess the effect of part-whole understanding and accuracy of fraction magnitude estimation on the probability of committing errors; and (c) examine the effect of students' ability to explain comparing problems on the probability of…
Causal inference with measurement error in outcomes: Bias analysis and estimation methods.
Shu, Di; Yi, Grace Y
2017-01-01
Inverse probability weighting estimation has been popularly used to consistently estimate the average treatment effect. Its validity, however, is challenged by the presence of error-prone variables. In this paper, we explore the inverse probability weighting estimation with mismeasured outcome variables. We study the impact of measurement error for both continuous and discrete outcome variables and reveal interesting consequences of the naive analysis which ignores measurement error. When a continuous outcome variable is mismeasured under an additive measurement error model, the naive analysis may still yield a consistent estimator; when the outcome is binary, we derive the asymptotic bias in a closed-form. Furthermore, we develop consistent estimation procedures for practical scenarios where either validation data or replicates are available. With validation data, we propose an efficient method for estimation of average treatment effect; the efficiency gain is substantial relative to usual methods of using validation data. To provide protection against model misspecification, we further propose a doubly robust estimator which is consistent even when either the treatment model or the outcome model is misspecified. Simulation studies are reported to assess the performance of the proposed methods. An application to a smoking cessation dataset is presented.
NASA Technical Reports Server (NTRS)
Lin, Shu; Fossorier, Marc
1998-01-01
In a coded communication system with equiprobable signaling, MLD minimizes the word error probability and delivers the most likely codeword associated with the corresponding received sequence. This decoding has two drawbacks. First, minimization of the word error probability is not equivalent to minimization of the bit error probability. Therefore, MLD becomes suboptimum with respect to the bit error probability. Second, MLD delivers a hard-decision estimate of the received sequence, so that information is lost between the input and output of the ML decoder. This information is important in coded schemes where the decoded sequence is further processed, such as concatenated coding schemes, multi-stage and iterative decoding schemes. In this chapter, we first present a decoding algorithm which both minimizes bit error probability, and provides the corresponding soft information at the output of the decoder. This algorithm is referred to as the MAP (maximum aposteriori probability) decoding algorithm.
McClintock, Brett T.; Bailey, Larissa L.; Pollock, Kenneth H.; Simons, Theodore R.
2010-01-01
The recent surge in the development and application of species occurrence models has been associated with an acknowledgment among ecologists that species are detected imperfectly due to observation error. Standard models now allow unbiased estimation of occupancy probability when false negative detections occur, but this is conditional on no false positive detections and sufficient incorporation of explanatory variables for the false negative detection process. These assumptions are likely reasonable in many circumstances, but there is mounting evidence that false positive errors and detection probability heterogeneity may be much more prevalent in studies relying on auditory cues for species detection (e.g., songbird or calling amphibian surveys). We used field survey data from a simulated calling anuran system of known occupancy state to investigate the biases induced by these errors in dynamic models of species occurrence. Despite the participation of expert observers in simplified field conditions, both false positive errors and site detection probability heterogeneity were extensive for most species in the survey. We found that even low levels of false positive errors, constituting as little as 1% of all detections, can cause severe overestimation of site occupancy, colonization, and local extinction probabilities. Further, unmodeled detection probability heterogeneity induced substantial underestimation of occupancy and overestimation of colonization and local extinction probabilities. Completely spurious relationships between species occurrence and explanatory variables were also found. Such misleading inferences would likely have deleterious implications for conservation and management programs. We contend that all forms of observation error, including false positive errors and heterogeneous detection probabilities, must be incorporated into the estimation framework to facilitate reliable inferences about occupancy and its associated vital rate parameters.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-12
... Household Economic Statistics Division at (301) 763-3243. Under the advice of the Census Bureau, HHS..., which consists of the error that arises from the use of probability sampling to create the sample. For...) Sampling Error, which consists of the error that arises from the use of probability sampling to create the...
Noise Estimation and Adaptive Encoding for Asymmetric Quantum Error Correcting Codes
NASA Astrophysics Data System (ADS)
Florjanczyk, Jan; Brun, Todd; CenterQuantum Information Science; Technology Team
We present a technique that improves the performance of asymmetric quantum error correcting codes in the presence of biased qubit noise channels. Our study is motivated by considering what useful information can be learned from the statistics of syndrome measurements in stabilizer quantum error correcting codes (QECC). We consider the case of a qubit dephasing channel where the dephasing axis is unknown and time-varying. We are able to estimate the dephasing angle from the statistics of the standard syndrome measurements used in stabilizer QECC's. We use this estimate to rotate the computational basis of the code in such a way that the most likely type of error is covered by the highest distance of the asymmetric code. In particular, we use the [ [ 15 , 1 , 3 ] ] shortened Reed-Muller code which can correct one phase-flip error but up to three bit-flip errors. In our simulations, we tune the computational basis to match the estimated dephasing axis which in turn leads to a decrease in the probability of a phase-flip error. With a sufficiently accurate estimate of the dephasing axis, our memory's effective error is dominated by the much lower probability of four bit-flips. Aro MURI Grant No. W911NF-11-1-0268.
A multistate dynamic site occupancy model for spatially aggregated sessile communities
Fukaya, Keiichi; Royle, J. Andrew; Okuda, Takehiro; Nakaoka, Masahiro; Noda, Takashi
2017-01-01
Estimation of transition probabilities of sessile communities seems easy in principle but may still be difficult in practice because resampling error (i.e. a failure to resample exactly the same location at fixed points) may cause significant estimation bias. Previous studies have developed novel analytical methods to correct for this estimation bias. However, they did not consider the local structure of community composition induced by the aggregated distribution of organisms that is typically observed in sessile assemblages and is very likely to affect observations.We developed a multistate dynamic site occupancy model to estimate transition probabilities that accounts for resampling errors associated with local community structure. The model applies a nonparametric multivariate kernel smoothing methodology to the latent occupancy component to estimate the local state composition near each observation point, which is assumed to determine the probability distribution of data conditional on the occurrence of resampling error.By using computer simulations, we confirmed that an observation process that depends on local community structure may bias inferences about transition probabilities. By applying the proposed model to a real data set of intertidal sessile communities, we also showed that estimates of transition probabilities and of the properties of community dynamics may differ considerably when spatial dependence is taken into account.Results suggest the importance of accounting for resampling error and local community structure for developing management plans that are based on Markovian models. Our approach provides a solution to this problem that is applicable to broad sessile communities. It can even accommodate an anisotropic spatial correlation of species composition, and may also serve as a basis for inferring complex nonlinear ecological dynamics.
Conflict Probability Estimation for Free Flight
NASA Technical Reports Server (NTRS)
Paielli, Russell A.; Erzberger, Heinz
1996-01-01
The safety and efficiency of free flight will benefit from automated conflict prediction and resolution advisories. Conflict prediction is based on trajectory prediction and is less certain the farther in advance the prediction, however. An estimate is therefore needed of the probability that a conflict will occur, given a pair of predicted trajectories and their levels of uncertainty. A method is developed in this paper to estimate that conflict probability. The trajectory prediction errors are modeled as normally distributed, and the two error covariances for an aircraft pair are combined into a single equivalent covariance of the relative position. A coordinate transformation is then used to derive an analytical solution. Numerical examples and Monte Carlo validation are presented.
Modeling habitat dynamics accounting for possible misclassification
Veran, Sophie; Kleiner, Kevin J.; Choquet, Remi; Collazo, Jaime; Nichols, James D.
2012-01-01
Land cover data are widely used in ecology as land cover change is a major component of changes affecting ecological systems. Landscape change estimates are characterized by classification errors. Researchers have used error matrices to adjust estimates of areal extent, but estimation of land cover change is more difficult and more challenging, with error in classification being confused with change. We modeled land cover dynamics for a discrete set of habitat states. The approach accounts for state uncertainty to produce unbiased estimates of habitat transition probabilities using ground information to inform error rates. We consider the case when true and observed habitat states are available for the same geographic unit (pixel) and when true and observed states are obtained at one level of resolution, but transition probabilities estimated at a different level of resolution (aggregations of pixels). Simulation results showed a strong bias when estimating transition probabilities if misclassification was not accounted for. Scaling-up does not necessarily decrease the bias and can even increase it. Analyses of land cover data in the Southeast region of the USA showed that land change patterns appeared distorted if misclassification was not accounted for: rate of habitat turnover was artificially increased and habitat composition appeared more homogeneous. Not properly accounting for land cover misclassification can produce misleading inferences about habitat state and dynamics and also misleading predictions about species distributions based on habitat. Our models that explicitly account for state uncertainty should be useful in obtaining more accurate inferences about change from data that include errors.
Klaus, Christian A; Carrasco, Luis E; Goldberg, Daniel W; Henry, Kevin A; Sherman, Recinda L
2015-09-15
The utility of patient attributes associated with the spatiotemporal analysis of medical records lies not just in their values but also the strength of association between them. Estimating the extent to which a hierarchy of conditional probability exists between patient attribute associations such as patient identifying fields, patient and date of diagnosis, and patient and address at diagnosis is fundamental to estimating the strength of association between patient and geocode, and patient and enumeration area. We propose a hierarchy for the attribute associations within medical records that enable spatiotemporal relationships. We also present a set of metrics that store attribute association error probability (AAEP), to estimate error probability for all attribute associations upon which certainty in a patient geocode depends. A series of experiments were undertaken to understand how error estimation could be operationalized within health data and what levels of AAEP in real data reveal themselves using these methods. Specifically, the goals of this evaluation were to (1) assess if the concept of our error assessment techniques could be implemented by a population-based cancer registry; (2) apply the techniques to real data from a large health data agency and characterize the observed levels of AAEP; and (3) demonstrate how detected AAEP might impact spatiotemporal health research. We present an evaluation of AAEP metrics generated for cancer cases in a North Carolina county. We show examples of how we estimated AAEP for selected attribute associations and circumstances. We demonstrate the distribution of AAEP in our case sample across attribute associations, and demonstrate ways in which disease registry specific operations influence the prevalence of AAEP estimates for specific attribute associations. The effort to detect and store estimates of AAEP is worthwhile because of the increase in confidence fostered by the attribute association level approach to the assessment of uncertainty in patient geocodes, relative to existing geocoding related uncertainty metrics.
Learn-as-you-go acceleration of cosmological parameter estimates
NASA Astrophysics Data System (ADS)
Aslanyan, Grigor; Easther, Richard; Price, Layne C.
2015-09-01
Cosmological analyses can be accelerated by approximating slow calculations using a training set, which is either precomputed or generated dynamically. However, this approach is only safe if the approximations are well understood and controlled. This paper surveys issues associated with the use of machine-learning based emulation strategies for accelerating cosmological parameter estimation. We describe a learn-as-you-go algorithm that is implemented in the Cosmo++ code and (1) trains the emulator while simultaneously estimating posterior probabilities; (2) identifies unreliable estimates, computing the exact numerical likelihoods if necessary; and (3) progressively learns and updates the error model as the calculation progresses. We explicitly describe and model the emulation error and show how this can be propagated into the posterior probabilities. We apply these techniques to the Planck likelihood and the calculation of ΛCDM posterior probabilities. The computation is significantly accelerated without a pre-defined training set and uncertainties in the posterior probabilities are subdominant to statistical fluctuations. We have obtained a speedup factor of 6.5 for Metropolis-Hastings and 3.5 for nested sampling. Finally, we discuss the general requirements for a credible error model and show how to update them on-the-fly.
Learn-as-you-go acceleration of cosmological parameter estimates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aslanyan, Grigor; Easther, Richard; Price, Layne C., E-mail: g.aslanyan@auckland.ac.nz, E-mail: r.easther@auckland.ac.nz, E-mail: lpri691@aucklanduni.ac.nz
2015-09-01
Cosmological analyses can be accelerated by approximating slow calculations using a training set, which is either precomputed or generated dynamically. However, this approach is only safe if the approximations are well understood and controlled. This paper surveys issues associated with the use of machine-learning based emulation strategies for accelerating cosmological parameter estimation. We describe a learn-as-you-go algorithm that is implemented in the Cosmo++ code and (1) trains the emulator while simultaneously estimating posterior probabilities; (2) identifies unreliable estimates, computing the exact numerical likelihoods if necessary; and (3) progressively learns and updates the error model as the calculation progresses. We explicitlymore » describe and model the emulation error and show how this can be propagated into the posterior probabilities. We apply these techniques to the Planck likelihood and the calculation of ΛCDM posterior probabilities. The computation is significantly accelerated without a pre-defined training set and uncertainties in the posterior probabilities are subdominant to statistical fluctuations. We have obtained a speedup factor of 6.5 for Metropolis-Hastings and 3.5 for nested sampling. Finally, we discuss the general requirements for a credible error model and show how to update them on-the-fly.« less
Potter, Gail E; Smieszek, Timo; Sailer, Kerstin
2015-09-01
Face-to-face social contacts are potentially important transmission routes for acute respiratory infections, and understanding the contact network can improve our ability to predict, contain, and control epidemics. Although workplaces are important settings for infectious disease transmission, few studies have collected workplace contact data and estimated workplace contact networks. We use contact diaries, architectural distance measures, and institutional structures to estimate social contact networks within a Swiss research institute. Some contact reports were inconsistent, indicating reporting errors. We adjust for this with a latent variable model, jointly estimating the true (unobserved) network of contacts and duration-specific reporting probabilities. We find that contact probability decreases with distance, and that research group membership, role, and shared projects are strongly predictive of contact patterns. Estimated reporting probabilities were low only for 0-5 min contacts. Adjusting for reporting error changed the estimate of the duration distribution, but did not change the estimates of covariate effects and had little effect on epidemic predictions. Our epidemic simulation study indicates that inclusion of network structure based on architectural and organizational structure data can improve the accuracy of epidemic forecasting models.
Potter, Gail E.; Smieszek, Timo; Sailer, Kerstin
2015-01-01
Face-to-face social contacts are potentially important transmission routes for acute respiratory infections, and understanding the contact network can improve our ability to predict, contain, and control epidemics. Although workplaces are important settings for infectious disease transmission, few studies have collected workplace contact data and estimated workplace contact networks. We use contact diaries, architectural distance measures, and institutional structures to estimate social contact networks within a Swiss research institute. Some contact reports were inconsistent, indicating reporting errors. We adjust for this with a latent variable model, jointly estimating the true (unobserved) network of contacts and duration-specific reporting probabilities. We find that contact probability decreases with distance, and that research group membership, role, and shared projects are strongly predictive of contact patterns. Estimated reporting probabilities were low only for 0–5 min contacts. Adjusting for reporting error changed the estimate of the duration distribution, but did not change the estimates of covariate effects and had little effect on epidemic predictions. Our epidemic simulation study indicates that inclusion of network structure based on architectural and organizational structure data can improve the accuracy of epidemic forecasting models. PMID:26634122
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.
Two-step estimation in ratio-of-mediator-probability weighted causal mediation analysis.
Bein, Edward; Deutsch, Jonah; Hong, Guanglei; Porter, Kristin E; Qin, Xu; Yang, Cheng
2018-04-15
This study investigates appropriate estimation of estimator variability in the context of causal mediation analysis that employs propensity score-based weighting. Such an analysis decomposes the total effect of a treatment on the outcome into an indirect effect transmitted through a focal mediator and a direct effect bypassing the mediator. Ratio-of-mediator-probability weighting estimates these causal effects by adjusting for the confounding impact of a large number of pretreatment covariates through propensity score-based weighting. In step 1, a propensity score model is estimated. In step 2, the causal effects of interest are estimated using weights derived from the prior step's regression coefficient estimates. Statistical inferences obtained from this 2-step estimation procedure are potentially problematic if the estimated standard errors of the causal effect estimates do not reflect the sampling uncertainty in the estimation of the weights. This study extends to ratio-of-mediator-probability weighting analysis a solution to the 2-step estimation problem by stacking the score functions from both steps. We derive the asymptotic variance-covariance matrix for the indirect effect and direct effect 2-step estimators, provide simulation results, and illustrate with an application study. Our simulation results indicate that the sampling uncertainty in the estimated weights should not be ignored. The standard error estimation using the stacking procedure offers a viable alternative to bootstrap standard error estimation. We discuss broad implications of this approach for causal analysis involving propensity score-based weighting. Copyright © 2018 John Wiley & Sons, Ltd.
Effects of structural error on the estimates of parameters of dynamical systems
NASA Technical Reports Server (NTRS)
Hadaegh, F. Y.; Bekey, G. A.
1986-01-01
In this paper, the notion of 'near-equivalence in probability' is introduced for identifying a system in the presence of several error sources. Following some basic definitions, necessary and sufficient conditions for the identifiability of parameters are given. The effects of structural error on the parameter estimates for both the deterministic and stochastic cases are considered.
Gilliom, Robert J.; Helsel, Dennis R.
1986-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1986-02-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensoredmore » observations, for determining the best performing parameter estimation method for any particular data det. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.« less
NASA Astrophysics Data System (ADS)
Huo, Ming-Xia; Li, Ying
2017-12-01
Quantum error correction is important to quantum information processing, which allows us to reliably process information encoded in quantum error correction codes. Efficient quantum error correction benefits from the knowledge of error rates. We propose a protocol for monitoring error rates in real time without interrupting the quantum error correction. Any adaptation of the quantum error correction code or its implementation circuit is not required. The protocol can be directly applied to the most advanced quantum error correction techniques, e.g. surface code. A Gaussian processes algorithm is used to estimate and predict error rates based on error correction data in the past. We find that using these estimated error rates, the probability of error correction failures can be significantly reduced by a factor increasing with the code distance.
Probability of misclassifying biological elements in surface waters.
Loga, Małgorzata; Wierzchołowska-Dziedzic, Anna
2017-11-24
Measurement uncertainties are inherent to assessment of biological indices of water bodies. The effect of these uncertainties on the probability of misclassification of ecological status is the subject of this paper. Four Monte-Carlo (M-C) models were applied to simulate the occurrence of random errors in the measurements of metrics corresponding to four biological elements of surface waters: macrophytes, phytoplankton, phytobenthos, and benthic macroinvertebrates. Long series of error-prone measurement values of these metrics, generated by M-C models, were used to identify cases in which values of any of the four biological indices lay outside of the "true" water body class, i.e., outside the class assigned from the actual physical measurements. Fraction of such cases in the M-C generated series was used to estimate the probability of misclassification. The method is particularly useful for estimating the probability of misclassification of the ecological status of surface water bodies in the case of short sequences of measurements of biological indices. The results of the Monte-Carlo simulations show a relatively high sensitivity of this probability to measurement errors of the river macrophyte index (MIR) and high robustness to measurement errors of the benthic macroinvertebrate index (MMI). The proposed method of using Monte-Carlo models to estimate the probability of misclassification has significant potential for assessing the uncertainty of water body status reported to the EC by the EU member countries according to WFD. The method can be readily applied also in risk assessment of water management decisions before adopting the status dependent corrective actions.
Wind power error estimation in resource assessments.
Rodríguez, Osvaldo; Del Río, Jesús A; Jaramillo, Oscar A; Martínez, Manuel
2015-01-01
Estimating the power output is one of the elements that determine the techno-economic feasibility of a renewable project. At present, there is a need to develop reliable methods that achieve this goal, thereby contributing to wind power penetration. In this study, we propose a method for wind power error estimation based on the wind speed measurement error, probability density function, and wind turbine power curves. This method uses the actual wind speed data without prior statistical treatment based on 28 wind turbine power curves, which were fitted by Lagrange's method, to calculate the estimate wind power output and the corresponding error propagation. We found that wind speed percentage errors of 10% were propagated into the power output estimates, thereby yielding an error of 5%. The proposed error propagation complements the traditional power resource assessments. The wind power estimation error also allows us to estimate intervals for the power production leveled cost or the investment time return. The implementation of this method increases the reliability of techno-economic resource assessment studies.
Wind Power Error Estimation in Resource Assessments
Rodríguez, Osvaldo; del Río, Jesús A.; Jaramillo, Oscar A.; Martínez, Manuel
2015-01-01
Estimating the power output is one of the elements that determine the techno-economic feasibility of a renewable project. At present, there is a need to develop reliable methods that achieve this goal, thereby contributing to wind power penetration. In this study, we propose a method for wind power error estimation based on the wind speed measurement error, probability density function, and wind turbine power curves. This method uses the actual wind speed data without prior statistical treatment based on 28 wind turbine power curves, which were fitted by Lagrange's method, to calculate the estimate wind power output and the corresponding error propagation. We found that wind speed percentage errors of 10% were propagated into the power output estimates, thereby yielding an error of 5%. The proposed error propagation complements the traditional power resource assessments. The wind power estimation error also allows us to estimate intervals for the power production leveled cost or the investment time return. The implementation of this method increases the reliability of techno-economic resource assessment studies. PMID:26000444
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. ?? 2007 National Ground Water Association.
Estimation of distributional parameters for censored trace-level water-quality data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1984-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water-sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations,more » for determining the best-performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least-squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification. 6 figs., 6 tabs.« less
Objective Analysis of Oceanic Data for Coast Guard Trajectory Models Phase II
1997-12-01
as outliers depends on the desired probability of false alarm, Pfa values, which is the probability of marking a valid point as an outlier. Table 2-2...constructed to minimize the mean-squared prediction error of the grid point estimate under the constraint that the estimate is unbiased . The...prediction error, e= Zl(S) _oizl(Si)+oC1iZz(S) (2.44) subject to the constraints of unbiasedness , • c/1 = 1,and (2.45) i SCC12 = 0. (2.46) Denoting
NASA Technical Reports Server (NTRS)
Huddleston, Lisa L.; Roeder, William; Merceret, Francis J.
2010-01-01
A technique has been developed to calculate the probability that any nearby lightning stroke is within any radius of any point of interest. In practice, this provides the probability that a nearby lightning stroke was within a key distance of a facility, rather than the error ellipses centered on the stroke. This process takes the current bivariate Gaussian distribution of probability density provided by the current lightning location error ellipse for the most likely location of a lightning stroke and integrates it to get the probability that the stroke is inside any specified radius. This new facility-centric technique will be much more useful to the space launch customers and may supersede the lightning error ellipse approach discussed in [5], [6].
Probabilistic confidence for decisions based on uncertain reliability estimates
NASA Astrophysics Data System (ADS)
Reid, Stuart G.
2013-05-01
Reliability assessments are commonly carried out to provide a rational basis for risk-informed decisions concerning the design or maintenance of engineering systems and structures. However, calculated reliabilities and associated probabilities of failure often have significant uncertainties associated with the possible estimation errors relative to the 'true' failure probabilities. For uncertain probabilities of failure, a measure of 'probabilistic confidence' has been proposed to reflect the concern that uncertainty about the true probability of failure could result in a system or structure that is unsafe and could subsequently fail. The paper describes how the concept of probabilistic confidence can be applied to evaluate and appropriately limit the probabilities of failure attributable to particular uncertainties such as design errors that may critically affect the dependability of risk-acceptance decisions. This approach is illustrated with regard to the dependability of structural design processes based on prototype testing with uncertainties attributable to sampling variability.
Wedell, Douglas H; Moro, Rodrigo
2008-04-01
Two experiments used within-subject designs to examine how conjunction errors depend on the use of (1) choice versus estimation tasks, (2) probability versus frequency language, and (3) conjunctions of two likely events versus conjunctions of likely and unlikely events. All problems included a three-option format verified to minimize misinterpretation of the base event. In both experiments, conjunction errors were reduced when likely events were conjoined. Conjunction errors were also reduced for estimations compared with choices, with this reduction greater for likely conjuncts, an interaction effect. Shifting conceptual focus from probabilities to frequencies did not affect conjunction error rates. Analyses of numerical estimates for a subset of the problems provided support for the use of three general models by participants for generating estimates. Strikingly, the order in which the two tasks were carried out did not affect the pattern of results, supporting the idea that the mode of responding strongly determines the mode of thinking about conjunctions and hence the occurrence of the conjunction fallacy. These findings were evaluated in terms of implications for rationality of human judgment and reasoning.
Austin, Peter C
2016-12-30
Propensity score methods are used to reduce the effects of observed confounding when using observational data to estimate the effects of treatments or exposures. A popular method of using the propensity score is inverse probability of treatment weighting (IPTW). When using this method, a weight is calculated for each subject that is equal to the inverse of the probability of receiving the treatment that was actually received. These weights are then incorporated into the analyses to minimize the effects of observed confounding. Previous research has found that these methods result in unbiased estimation when estimating the effect of treatment on survival outcomes. However, conventional methods of variance estimation were shown to result in biased estimates of standard error. In this study, we conducted an extensive set of Monte Carlo simulations to examine different methods of variance estimation when using a weighted Cox proportional hazards model to estimate the effect of treatment. We considered three variance estimation methods: (i) a naïve model-based variance estimator; (ii) a robust sandwich-type variance estimator; and (iii) a bootstrap variance estimator. We considered estimation of both the average treatment effect and the average treatment effect in the treated. We found that the use of a bootstrap estimator resulted in approximately correct estimates of standard errors and confidence intervals with the correct coverage rates. The other estimators resulted in biased estimates of standard errors and confidence intervals with incorrect coverage rates. Our simulations were informed by a case study examining the effect of statin prescribing on mortality. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
On the Probability of Error and Stochastic Resonance in Discrete Memoryless Channels
2013-12-01
Information - Driven Doppler Shift Estimation and Compensation Methods for Underwater Wireless Sensor Networks ”, which is to analyze and develop... underwater wireless sensor networks . We formulated an analytic relationship that relates the average probability of error to the systems parameters, the...thesis, we studied the performance of Discrete Memoryless Channels (DMC), arising in the context of cooperative underwater wireless sensor networks
Analysis of the impact of error detection on computer performance
NASA Technical Reports Server (NTRS)
Shin, K. C.; Lee, Y. H.
1983-01-01
Conventionally, reliability analyses either assume that a fault/error is detected immediately following its occurrence, or neglect damages caused by latent errors. Though unrealistic, this assumption was imposed in order to avoid the difficulty of determining the respective probabilities that a fault induces an error and the error is then detected in a random amount of time after its occurrence. As a remedy for this problem a model is proposed to analyze the impact of error detection on computer performance under moderate assumptions. Error latency, the time interval between occurrence and the moment of detection, is used to measure the effectiveness of a detection mechanism. This model is used to: (1) predict the probability of producing an unreliable result, and (2) estimate the loss of computation due to fault and/or error.
Multinomial mixture model with heterogeneous classification probabilities
Holland, M.D.; Gray, B.R.
2011-01-01
Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. ?? 2010 Springer Science+Business Media, LLC.
Przemyslaw, Baranski; Pawel, Strumillo
2012-01-01
The paper presents an algorithm for estimating a pedestrian location in an urban environment. The algorithm is based on the particle filter and uses different data sources: a GPS receiver, inertial sensors, probability maps and a stereo camera. Inertial sensors are used to estimate a relative displacement of a pedestrian. A gyroscope estimates a change in the heading direction. An accelerometer is used to count a pedestrian's steps and their lengths. The so-called probability maps help to limit GPS inaccuracy by imposing constraints on pedestrian kinematics, e.g., it is assumed that a pedestrian cannot cross buildings, fences etc. This limits position inaccuracy to ca. 10 m. Incorporation of depth estimates derived from a stereo camera that are compared to the 3D model of an environment has enabled further reduction of positioning errors. As a result, for 90% of the time, the algorithm is able to estimate a pedestrian location with an error smaller than 2 m, compared to an error of 6.5 m for a navigation based solely on GPS. PMID:22969321
A posteriori error estimates in voice source recovery
NASA Astrophysics Data System (ADS)
Leonov, A. S.; Sorokin, V. N.
2017-12-01
The inverse problem of voice source pulse recovery from a segment of a speech signal is under consideration. A special mathematical model is used for the solution that relates these quantities. A variational method of solving inverse problem of voice source recovery for a new parametric class of sources, that is for piecewise-linear sources (PWL-sources), is proposed. Also, a technique for a posteriori numerical error estimation for obtained solutions is presented. A computer study of the adequacy of adopted speech production model with PWL-sources is performed in solving the inverse problems for various types of voice signals, as well as corresponding study of a posteriori error estimates. Numerical experiments for speech signals show satisfactory properties of proposed a posteriori error estimates, which represent the upper bounds of possible errors in solving the inverse problem. The estimate of the most probable error in determining the source-pulse shapes is about 7-8% for the investigated speech material. It is noted that a posteriori error estimates can be used as a criterion of the quality for obtained voice source pulses in application to speaker recognition.
NASA Technical Reports Server (NTRS)
Huddleston, Lisa L.; Roeder, William P.; Merceret, Francis J.
2010-01-01
A new technique has been developed to estimate the probability that a nearby cloud-to-ground lightning stroke was within a specified radius of any point of interest. This process uses the bivariate Gaussian distribution of probability density provided by the current lightning location error ellipse for the most likely location of a lightning stroke and integrates it to determine the probability that the stroke is inside any specified radius of any location, even if that location is not centered on or even within the location error ellipse. This technique is adapted from a method of calculating the probability of debris collision with spacecraft. Such a technique is important in spaceport processing activities because it allows engineers to quantify the risk of induced current damage to critical electronics due to nearby lightning strokes. This technique was tested extensively and is now in use by space launch organizations at Kennedy Space Center and Cape Canaveral Air Force station.
Wang, Yao; Jing, Lei; Ke, Hong-Liang; Hao, Jian; Gao, Qun; Wang, Xiao-Xun; Sun, Qiang; Xu, Zhi-Jun
2016-09-20
The accelerated aging tests under electric stress for one type of LED lamp are conducted, and the differences between online and offline tests of the degradation of luminous flux are studied in this paper. The transformation of the two test modes is achieved with an adjustable AC voltage stabilized power source. Experimental results show that the exponential fitting of the luminous flux degradation in online tests possesses a higher fitting degree for most lamps, and the degradation rate of the luminous flux by online tests is always lower than that by offline tests. Bayes estimation and Weibull distribution are used to calculate the failure probabilities under the accelerated voltages, and then the reliability of the lamps under rated voltage of 220 V is estimated by use of the inverse power law model. Results show that the relative error of the lifetime estimation by offline tests increases as the failure probability decreases, and it cannot be neglected when the failure probability is less than 1%. The relative errors of lifetime estimation are 7.9%, 5.8%, 4.2%, and 3.5%, at the failure probabilities of 0.1%, 1%, 5%, and 10%, respectively.
Cao, Youfang; Terebus, Anna; Liang, Jie
2016-01-01
The discrete chemical master equation (dCME) provides a general framework for studying stochasticity in mesoscopic reaction networks. Since its direct solution rapidly becomes intractable due to the increasing size of the state space, truncation of the state space is necessary for solving most dCMEs. It is therefore important to assess the consequences of state space truncations so errors can be quantified and minimized. Here we describe a novel method for state space truncation. By partitioning a reaction network into multiple molecular equivalence groups (MEG), we truncate the state space by limiting the total molecular copy numbers in each MEG. We further describe a theoretical framework for analysis of the truncation error in the steady state probability landscape using reflecting boundaries. By aggregating the state space based on the usage of a MEG and constructing an aggregated Markov process, we show that the truncation error of a MEG can be asymptotically bounded by the probability of states on the reflecting boundary of the MEG. Furthermore, truncating states of an arbitrary MEG will not undermine the estimated error of truncating any other MEGs. We then provide an overall error estimate for networks with multiple MEGs. To rapidly determine the appropriate size of an arbitrary MEG, we also introduce an a priori method to estimate the upper bound of its truncation error. This a priori estimate can be rapidly computed from reaction rates of the network, without the need of costly trial solutions of the dCME. As examples, we show results of applying our methods to the four stochastic networks of 1) the birth and death model, 2) the single gene expression model, 3) the genetic toggle switch model, and 4) the phage lambda bistable epigenetic switch model. We demonstrate how truncation errors and steady state probability landscapes can be computed using different sizes of the MEG(s) and how the results validate out theories. Overall, the novel state space truncation and error analysis methods developed here can be used to ensure accurate direct solutions to the dCME for a large number of stochastic networks. PMID:27105653
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cao, Youfang; Terebus, Anna; Liang, Jie
The discrete chemical master equation (dCME) provides a general framework for studying stochasticity in mesoscopic reaction networks. Since its direct solution rapidly becomes intractable due to the increasing size of the state space, truncation of the state space is necessary for solving most dCMEs. It is therefore important to assess the consequences of state space truncations so errors can be quantified and minimized. Here we describe a novel method for state space truncation. By partitioning a reaction network into multiple molecular equivalence groups (MEGs), we truncate the state space by limiting the total molecular copy numbers in each MEG. Wemore » further describe a theoretical framework for analysis of the truncation error in the steady-state probability landscape using reflecting boundaries. By aggregating the state space based on the usage of a MEG and constructing an aggregated Markov process, we show that the truncation error of a MEG can be asymptotically bounded by the probability of states on the reflecting boundary of the MEG. Furthermore, truncating states of an arbitrary MEG will not undermine the estimated error of truncating any other MEGs. We then provide an overall error estimate for networks with multiple MEGs. To rapidly determine the appropriate size of an arbitrary MEG, we also introduce an a priori method to estimate the upper bound of its truncation error. This a priori estimate can be rapidly computed from reaction rates of the network, without the need of costly trial solutions of the dCME. As examples, we show results of applying our methods to the four stochastic networks of (1) the birth and death model, (2) the single gene expression model, (3) the genetic toggle switch model, and (4) the phage lambda bistable epigenetic switch model. We demonstrate how truncation errors and steady-state probability landscapes can be computed using different sizes of the MEG(s) and how the results validate our theories. Overall, the novel state space truncation and error analysis methods developed here can be used to ensure accurate direct solutions to the dCME for a large number of stochastic networks.« less
Cao, Youfang; Terebus, Anna; Liang, Jie
2016-04-22
The discrete chemical master equation (dCME) provides a general framework for studying stochasticity in mesoscopic reaction networks. Since its direct solution rapidly becomes intractable due to the increasing size of the state space, truncation of the state space is necessary for solving most dCMEs. It is therefore important to assess the consequences of state space truncations so errors can be quantified and minimized. Here we describe a novel method for state space truncation. By partitioning a reaction network into multiple molecular equivalence groups (MEGs), we truncate the state space by limiting the total molecular copy numbers in each MEG. Wemore » further describe a theoretical framework for analysis of the truncation error in the steady-state probability landscape using reflecting boundaries. By aggregating the state space based on the usage of a MEG and constructing an aggregated Markov process, we show that the truncation error of a MEG can be asymptotically bounded by the probability of states on the reflecting boundary of the MEG. Furthermore, truncating states of an arbitrary MEG will not undermine the estimated error of truncating any other MEGs. We then provide an overall error estimate for networks with multiple MEGs. To rapidly determine the appropriate size of an arbitrary MEG, we also introduce an a priori method to estimate the upper bound of its truncation error. This a priori estimate can be rapidly computed from reaction rates of the network, without the need of costly trial solutions of the dCME. As examples, we show results of applying our methods to the four stochastic networks of (1) the birth and death model, (2) the single gene expression model, (3) the genetic toggle switch model, and (4) the phage lambda bistable epigenetic switch model. We demonstrate how truncation errors and steady-state probability landscapes can be computed using different sizes of the MEG(s) and how the results validate our theories. Overall, the novel state space truncation and error analysis methods developed here can be used to ensure accurate direct solutions to the dCME for a large number of stochastic networks.« less
Optimal estimation for discrete time jump processes
NASA Technical Reports Server (NTRS)
Vaca, M. V.; Tretter, S. A.
1977-01-01
Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are obtained. The approach is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. A general representation for optimum estimates and recursive equations for minimum mean squared error (MMSE) estimates are obtained. MMSE estimates are nonlinear functions of the observations. The problem of estimating the rate of a DTJP when the rate is a random variable with a probability density function of the form cx super K (l-x) super m and show that the MMSE estimates are linear in this case. This class of density functions explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.
Optimal estimation for discrete time jump processes
NASA Technical Reports Server (NTRS)
Vaca, M. V.; Tretter, S. A.
1978-01-01
Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are derived. The approach used is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. Thus a general representation is obtained for optimum estimates, and recursive equations are derived for minimum mean-squared error (MMSE) estimates. In general, MMSE estimates are nonlinear functions of the observations. The problem is considered of estimating the rate of a DTJP when the rate is a random variable with a beta probability density function and the jump amplitudes are binomially distributed. It is shown that the MMSE estimates are linear. The class of beta density functions is rather rich and explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.
Optimum nonparametric estimation of population density based on ordered distances
Patil, S.A.; Kovner, J.L.; Burnham, Kenneth P.
1982-01-01
The asymptotic mean and error mean square are determined for the nonparametric estimator of plant density by distance sampling proposed by Patil, Burnham and Kovner (1979, Biometrics 35, 597-604. On the basis of these formulae, a bias-reduced version of this estimator is given, and its specific form is determined which gives minimum mean square error under varying assumptions about the true probability density function of the sampled data. Extension is given to line-transect sampling.
An evaluation of multipass electrofishing for estimating the abundance of stream-dwelling salmonids
James T. Peterson; Russell F. Thurow; John W. Guzevich
2004-01-01
Failure to estimate capture efficiency, defined as the probability of capturing individual fish, can introduce a systematic error or bias into estimates of fish abundance. We evaluated the efficacy of multipass electrofishing removal methods for estimating fish abundance by comparing estimates of capture efficiency from multipass removal estimates to capture...
Grizzly Bear Noninvasive Genetic Tagging Surveys: Estimating the Magnitude of Missed Detections.
Fisher, Jason T; Heim, Nicole; Code, Sandra; Paczkowski, John
2016-01-01
Sound wildlife conservation decisions require sound information, and scientists increasingly rely on remotely collected data over large spatial scales, such as noninvasive genetic tagging (NGT). Grizzly bears (Ursus arctos), for example, are difficult to study at population scales except with noninvasive data, and NGT via hair trapping informs management over much of grizzly bears' range. Considerable statistical effort has gone into estimating sources of heterogeneity, but detection error-arising when a visiting bear fails to leave a hair sample-has not been independently estimated. We used camera traps to survey grizzly bear occurrence at fixed hair traps and multi-method hierarchical occupancy models to estimate the probability that a visiting bear actually leaves a hair sample with viable DNA. We surveyed grizzly bears via hair trapping and camera trapping for 8 monthly surveys at 50 (2012) and 76 (2013) sites in the Rocky Mountains of Alberta, Canada. We used multi-method occupancy models to estimate site occupancy, probability of detection, and conditional occupancy at a hair trap. We tested the prediction that detection error in NGT studies could be induced by temporal variability within season, leading to underestimation of occupancy. NGT via hair trapping consistently underestimated grizzly bear occupancy at a site when compared to camera trapping. At best occupancy was underestimated by 50%; at worst, by 95%. Probability of false absence was reduced through successive surveys, but this mainly accounts for error imparted by movement among repeated surveys, not necessarily missed detections by extant bears. The implications of missed detections and biased occupancy estimates for density estimation-which form the crux of management plans-require consideration. We suggest hair-trap NGT studies should estimate and correct detection error using independent survey methods such as cameras, to ensure the reliability of the data upon which species management and conservation actions are based.
Error Patterns in Ordering Fractions among At-Risk Fourth-Grade Students
ERIC Educational Resources Information Center
Malone, Amelia S.; Fuchs, Lynn S.
2017-01-01
The three purposes of this study were to (a) describe fraction ordering errors among at-risk fourth grade students, (b) assess the effect of part-whole understanding and accuracy of fraction magnitude estimation on the probability of committing errors, and (c) examine the effect of students' ability to explain comparing problems on the probability…
Jonsen, Ian
2016-02-08
State-space models provide a powerful way to scale up inference of movement behaviours from individuals to populations when the inference is made across multiple individuals. Here, I show how a joint estimation approach that assumes individuals share identical movement parameters can lead to improved inference of behavioural states associated with different movement processes. I use simulated movement paths with known behavioural states to compare estimation error between nonhierarchical and joint estimation formulations of an otherwise identical state-space model. Behavioural state estimation error was strongly affected by the degree of similarity between movement patterns characterising the behavioural states, with less error when movements were strongly dissimilar between states. The joint estimation model improved behavioural state estimation relative to the nonhierarchical model for simulated data with heavy-tailed Argos location errors. When applied to Argos telemetry datasets from 10 Weddell seals, the nonhierarchical model estimated highly uncertain behavioural state switching probabilities for most individuals whereas the joint estimation model yielded substantially less uncertainty. The joint estimation model better resolved the behavioural state sequences across all seals. Hierarchical or joint estimation models should be the preferred choice for estimating behavioural states from animal movement data, especially when location data are error-prone.
NASA Technical Reports Server (NTRS)
Barth, Timothy J.
2016-01-01
This chapter discusses the ongoing development of combined uncertainty and error bound estimates for computational fluid dynamics (CFD) calculations subject to imposed random parameters and random fields. An objective of this work is the construction of computable error bound formulas for output uncertainty statistics that guide CFD practitioners in systematically determining how accurately CFD realizations should be approximated and how accurately uncertainty statistics should be approximated for output quantities of interest. Formal error bounds formulas for moment statistics that properly account for the presence of numerical errors in CFD calculations and numerical quadrature errors in the calculation of moment statistics have been previously presented in [8]. In this past work, hierarchical node-nested dense and sparse tensor product quadratures are used to calculate moment statistics integrals. In the present work, a framework has been developed that exploits the hierarchical structure of these quadratures in order to simplify the calculation of an estimate of the quadrature error needed in error bound formulas. When signed estimates of realization error are available, this signed error may also be used to estimate output quantity of interest probability densities as a means to assess the impact of realization error on these density estimates. Numerical results are presented for CFD problems with uncertainty to demonstrate the capabilities of this framework.
Discrepancy-based error estimates for Quasi-Monte Carlo III. Error distributions and central limits
NASA Astrophysics Data System (ADS)
Hoogland, Jiri; Kleiss, Ronald
1997-04-01
In Quasi-Monte Carlo integration, the integration error is believed to be generally smaller than in classical Monte Carlo with the same number of integration points. Using an appropriate definition of an ensemble of quasi-random point sets, we derive various results on the probability distribution of the integration error, which can be compared to the standard Central Limit Theorem for normal stochastic sampling. In many cases, a Gaussian error distribution is obtained.
Simulation of rare events in quantum error correction
NASA Astrophysics Data System (ADS)
Bravyi, Sergey; Vargo, Alexander
2013-12-01
We consider the problem of calculating the logical error probability for a stabilizer quantum code subject to random Pauli errors. To access the regime of large code distances where logical errors are extremely unlikely we adopt the splitting method widely used in Monte Carlo simulations of rare events and Bennett's acceptance ratio method for estimating the free energy difference between two canonical ensembles. To illustrate the power of these methods in the context of error correction, we calculate the logical error probability PL for the two-dimensional surface code on a square lattice with a pair of holes for all code distances d≤20 and all error rates p below the fault-tolerance threshold. Our numerical results confirm the expected exponential decay PL˜exp[-α(p)d] and provide a simple fitting formula for the decay rate α(p). Both noiseless and noisy syndrome readout circuits are considered.
Multiple symbol partially coherent detection of MPSK
NASA Technical Reports Server (NTRS)
Simon, M. K.; Divsalar, D.
1992-01-01
It is shown that by using the known (or estimated) value of carrier tracking loop signal to noise ratio (SNR) in the decision metric, it is possible to improve the error probability performance of a partially coherent multiple phase-shift-keying (MPSK) system relative to that corresponding to the commonly used ideal coherent decision rule. Using a maximum-likeihood approach, an optimum decision metric is derived and shown to take the form of a weighted sum of the ideal coherent decision metric (i.e., correlation) and the noncoherent decision metric which is optimum for differential detection of MPSK. The performance of a receiver based on this optimum decision rule is derived and shown to provide continued improvement with increasing length of observation interval (data symbol sequence length). Unfortunately, increasing the observation length does not eliminate the error floor associated with the finite loop SNR. Nevertheless, in the limit of infinite observation length, the average error probability performance approaches the algebraic sum of the error floor and the performance of ideal coherent detection, i.e., at any error probability above the error floor, there is no degradation due to the partial coherence. It is shown that this limiting behavior is virtually achievable with practical size observation lengths. Furthermore, the performance is quite insensitive to mismatch between the estimate of loop SNR (e.g., obtained from measurement) fed to the decision metric and its true value. These results may be of use in low-cost Earth-orbiting or deep-space missions employing coded modulations.
NASA Technical Reports Server (NTRS)
Huddleston, Lisa L.; Roeder, William P.; Merceret, Francis J.
2011-01-01
A new technique has been developed to estimate the probability that a nearby cloud to ground lightning stroke was within a specified radius of any point of interest. This process uses the bivariate Gaussian distribution of probability density provided by the current lightning location error ellipse for the most likely location of a lightning stroke and integrates it to determine the probability that the stroke is inside any specified radius of any location, even if that location is not centered on or even with the location error ellipse. This technique is adapted from a method of calculating the probability of debris collision with spacecraft. Such a technique is important in spaceport processing activities because it allows engineers to quantify the risk of induced current damage to critical electronics due to nearby lightning strokes. This technique was tested extensively and is now in use by space launch organizations at Kennedy Space Center and Cape Canaveral Air Force Station. Future applications could include forensic meteorology.
NASA Technical Reports Server (NTRS)
Huddleston, Lisa; Roeder, WIlliam P.; Merceret, Francis J.
2011-01-01
A new technique has been developed to estimate the probability that a nearby cloud-to-ground lightning stroke was within a specified radius of any point of interest. This process uses the bivariate Gaussian distribution of probability density provided by the current lightning location error ellipse for the most likely location of a lightning stroke and integrates it to determine the probability that the stroke is inside any specified radius of any location, even if that location is not centered on or even within the location error ellipse. This technique is adapted from a method of calculating the probability of debris collision with spacecraft. Such a technique is important in spaceport processing activities because it allows engineers to quantify the risk of induced current damage to critical electronics due to nearby lightning strokes. This technique was tested extensively and is now in use by space launch organizations at Kennedy Space Center and Cape Canaveral Air Force station. Future applications could include forensic meteorology.
Circular Probable Error for Circular and Noncircular Gaussian Impacts
2012-09-01
1M simulated impacts ph(k)=mean(imp(:,1).^2+imp(:,2).^2<=CEP^2); % hit frequency on CEP end phit (j)=mean(ph...avg 100 hit frequencies to “incr n” end % GRAPHICS plot(i, phit ,’r-’); % error exponent versus Ph estimate
Quantifying seining detection probability for fishes of Great Plains sand‐bed rivers
Mollenhauer, Robert; Logue, Daniel R.; Brewer, Shannon K.
2018-01-01
Species detection error (i.e., imperfect and variable detection probability) is an essential consideration when investigators map distributions and interpret habitat associations. When fish detection error that is due to highly variable instream environments needs to be addressed, sand‐bed streams of the Great Plains represent a unique challenge. We quantified seining detection probability for diminutive Great Plains fishes across a range of sampling conditions in two sand‐bed rivers in Oklahoma. Imperfect detection resulted in underestimates of species occurrence using naïve estimates, particularly for less common fishes. Seining detection probability also varied among fishes and across sampling conditions. We observed a quadratic relationship between water depth and detection probability, in which the exact nature of the relationship was species‐specific and dependent on water clarity. Similarly, the direction of the relationship between water clarity and detection probability was species‐specific and dependent on differences in water depth. The relationship between water temperature and detection probability was also species dependent, where both the magnitude and direction of the relationship varied among fishes. We showed how ignoring detection error confounded an underlying relationship between species occurrence and water depth. Despite imperfect and heterogeneous detection, our results support that determining species absence can be accomplished with two to six spatially replicated seine hauls per 200‐m reach under average sampling conditions; however, required effort would be higher under certain conditions. Detection probability was low for the Arkansas River Shiner Notropis girardi, which is federally listed as threatened, and more than 10 seine hauls per 200‐m reach would be required to assess presence across sampling conditions. Our model allows scientists to estimate sampling effort to confidently assess species occurrence, which maximizes the use of available resources. Increased implementation of approaches that consider detection error promote ecological advancements and conservation and management decisions that are better informed.
Liu, Xiaoming; Fu, Yun-Xin; Maxwell, Taylor J.; Boerwinkle, Eric
2010-01-01
It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4Neμ, population exponential growth rate R, and error rate ɛ, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ɛ, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. PMID:19952140
Quantifying uncertainty in geoacoustic inversion. II. Application to broadband, shallow-water data.
Dosso, Stan E; Nielsen, Peter L
2002-01-01
This paper applies the new method of fast Gibbs sampling (FGS) to estimate the uncertainties of seabed geoacoustic parameters in a broadband, shallow-water acoustic survey, with the goal of interpreting the survey results and validating the method for experimental data. FGS applies a Bayesian approach to geoacoustic inversion based on sampling the posterior probability density to estimate marginal probability distributions and parameter covariances. This requires knowledge of the statistical distribution of the data errors, including both measurement and theory errors, which is generally not available. Invoking the simplifying assumption of independent, identically distributed Gaussian errors allows a maximum-likelihood estimate of the data variance and leads to a practical inversion algorithm. However, it is necessary to validate these assumptions, i.e., to verify that the parameter uncertainties obtained represent meaningful estimates. To this end, FGS is applied to a geoacoustic experiment carried out at a site off the west coast of Italy where previous acoustic and geophysical studies have been performed. The parameter uncertainties estimated via FGS are validated by comparison with: (i) the variability in the results of inverting multiple independent data sets collected during the experiment; (ii) the results of FGS inversion of synthetic test cases designed to simulate the experiment and data errors; and (iii) the available geophysical ground truth. Comparisons are carried out for a number of different source bandwidths, ranges, and levels of prior information, and indicate that FGS provides reliable and stable uncertainty estimates for the geoacoustic inverse problem.
Iterative updating of model error for Bayesian inversion
NASA Astrophysics Data System (ADS)
Calvetti, Daniela; Dunlop, Matthew; Somersalo, Erkki; Stuart, Andrew
2018-02-01
In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.
Neokosmidis, Ioannis; Kamalakis, Thomas; Chipouras, Aristides; Sphicopoulos, Thomas
2005-01-01
The performance of high-powered wavelength-division multiplexed (WDM) optical networks can be severely degraded by four-wave-mixing- (FWM-) induced distortion. The multicanonical Monte Carlo method (MCMC) is used to calculate the probability-density function (PDF) of the decision variable of a receiver, limited by FWM noise. Compared with the conventional Monte Carlo method previously used to estimate this PDF, the MCMC method is much faster and can accurately estimate smaller error probabilities. The method takes into account the correlation between the components of the FWM noise, unlike the Gaussian model, which is shown not to provide accurate results.
NASA Astrophysics Data System (ADS)
Pernot, Pascal; Savin, Andreas
2018-06-01
Benchmarking studies in computational chemistry use reference datasets to assess the accuracy of a method through error statistics. The commonly used error statistics, such as the mean signed and mean unsigned errors, do not inform end-users on the expected amplitude of prediction errors attached to these methods. We show that, the distributions of model errors being neither normal nor zero-centered, these error statistics cannot be used to infer prediction error probabilities. To overcome this limitation, we advocate for the use of more informative statistics, based on the empirical cumulative distribution function of unsigned errors, namely, (1) the probability for a new calculation to have an absolute error below a chosen threshold and (2) the maximal amplitude of errors one can expect with a chosen high confidence level. Those statistics are also shown to be well suited for benchmarking and ranking studies. Moreover, the standard error on all benchmarking statistics depends on the size of the reference dataset. Systematic publication of these standard errors would be very helpful to assess the statistical reliability of benchmarking conclusions.
NASA Technical Reports Server (NTRS)
Mashiku, Alinda; Garrison, James L.; Carpenter, J. Russell
2012-01-01
The tracking of space objects requires frequent and accurate monitoring for collision avoidance. As even collision events with very low probability are important, accurate prediction of collisions require the representation of the full probability density function (PDF) of the random orbit state. Through representing the full PDF of the orbit state for orbit maintenance and collision avoidance, we can take advantage of the statistical information present in the heavy tailed distributions, more accurately representing the orbit states with low probability. The classical methods of orbit determination (i.e. Kalman Filter and its derivatives) provide state estimates based on only the second moments of the state and measurement errors that are captured by assuming a Gaussian distribution. Although the measurement errors can be accurately assumed to have a Gaussian distribution, errors with a non-Gaussian distribution could arise during propagation between observations. Moreover, unmodeled dynamics in the orbit model could introduce non-Gaussian errors into the process noise. A Particle Filter (PF) is proposed as a nonlinear filtering technique that is capable of propagating and estimating a more complete representation of the state distribution as an accurate approximation of a full PDF. The PF uses Monte Carlo runs to generate particles that approximate the full PDF representation. The PF is applied in the estimation and propagation of a highly eccentric orbit and the results are compared to the Extended Kalman Filter and Splitting Gaussian Mixture algorithms to demonstrate its proficiency.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, H; Chen, Z; Nath, R
Purpose: kV fluoroscopic imaging combined with MV treatment beam imaging has been investigated for intrafractional motion monitoring and correction. It is, however, subject to additional kV imaging dose to normal tissue. To balance tracking accuracy and imaging dose, we previously proposed an adaptive imaging strategy to dynamically decide future imaging type and moments based on motion tracking uncertainty. kV imaging may be used continuously for maximal accuracy or only when the position uncertainty (probability of out of threshold) is high if a preset imaging dose limit is considered. In this work, we propose more accurate methods to estimate tracking uncertaintymore » through analyzing acquired data in real-time. Methods: We simulated motion tracking process based on a previously developed imaging framework (MV + initial seconds of kV imaging) using real-time breathing data from 42 patients. Motion tracking errors for each time point were collected together with the time point’s corresponding features, such as tumor motion speed and 2D tracking error of previous time points, etc. We tested three methods for error uncertainty estimation based on the features: conditional probability distribution, logistic regression modeling, and support vector machine (SVM) classification to detect errors exceeding a threshold. Results: For conditional probability distribution, polynomial regressions on three features (previous tracking error, prediction quality, and cosine of the angle between the trajectory and the treatment beam) showed strong correlation with the variation (uncertainty) of the mean 3D tracking error and its standard deviation: R-square = 0.94 and 0.90, respectively. The logistic regression and SVM classification successfully identified about 95% of tracking errors exceeding 2.5mm threshold. Conclusion: The proposed methods can reliably estimate the motion tracking uncertainty in real-time, which can be used to guide adaptive additional imaging to confirm the tumor is within the margin or initialize motion compensation if it is out of the margin.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Charles, B.N.
1955-05-12
Charts of the geographical distribution of the annual and seasonal D-values and their standard deviations at altitudes of 4500, 6000, and 7000 feeet over Eurasia are derived, which are used to estimate the frequency of baro system errors.
Estimating alarm thresholds and the number of components in mixture distributions
NASA Astrophysics Data System (ADS)
Burr, Tom; Hamada, Michael S.
2012-09-01
Mixtures of probability distributions arise in many nuclear assay and forensic applications, including nuclear weapon detection, neutron multiplicity counting, and in solution monitoring (SM) for nuclear safeguards. SM data is increasingly used to enhance nuclear safeguards in aqueous reprocessing facilities having plutonium in solution form in many tanks. This paper provides background for mixture probability distributions and then focuses on mixtures arising in SM data. SM data can be analyzed by evaluating transfer-mode residuals defined as tank-to-tank transfer differences, and wait-mode residuals defined as changes during non-transfer modes. A previous paper investigated impacts on transfer-mode and wait-mode residuals of event marking errors which arise when the estimated start and/or stop times of tank events such as transfers are somewhat different from the true start and/or stop times. Event marking errors contribute to non-Gaussian behavior and larger variation than predicted on the basis of individual tank calibration studies. This paper illustrates evidence for mixture probability distributions arising from such event marking errors and from effects such as condensation or evaporation during non-transfer modes, and pump carryover during transfer modes. A quantitative assessment of the sample size required to adequately characterize a mixture probability distribution arising in any context is included.
Maximum entropy approach to statistical inference for an ocean acoustic waveguide.
Knobles, D P; Sagers, J D; Koch, R A
2012-02-01
A conditional probability distribution suitable for estimating the statistical properties of ocean seabed parameter values inferred from acoustic measurements is derived from a maximum entropy principle. The specification of the expectation value for an error function constrains the maximization of an entropy functional. This constraint determines the sensitivity factor (β) to the error function of the resulting probability distribution, which is a canonical form that provides a conservative estimate of the uncertainty of the parameter values. From the conditional distribution, marginal distributions for individual parameters can be determined from integration over the other parameters. The approach is an alternative to obtaining the posterior probability distribution without an intermediary determination of the likelihood function followed by an application of Bayes' rule. In this paper the expectation value that specifies the constraint is determined from the values of the error function for the model solutions obtained from a sparse number of data samples. The method is applied to ocean acoustic measurements taken on the New Jersey continental shelf. The marginal probability distribution for the values of the sound speed ratio at the surface of the seabed and the source levels of a towed source are examined for different geoacoustic model representations. © 2012 Acoustical Society of America
Haber, M; An, Q; Foppa, I M; Shay, D K; Ferdinands, J M; Orenstein, W A
2015-05-01
As influenza vaccination is now widely recommended, randomized clinical trials are no longer ethical in many populations. Therefore, observational studies on patients seeking medical care for acute respiratory illnesses (ARIs) are a popular option for estimating influenza vaccine effectiveness (VE). We developed a probability model for evaluating and comparing bias and precision of estimates of VE against symptomatic influenza from two commonly used case-control study designs: the test-negative design and the traditional case-control design. We show that when vaccination does not affect the probability of developing non-influenza ARI then VE estimates from test-negative design studies are unbiased even if vaccinees and non-vaccinees have different probabilities of seeking medical care against ARI, as long as the ratio of these probabilities is the same for illnesses resulting from influenza and non-influenza infections. Our numerical results suggest that in general, estimates from the test-negative design have smaller bias compared to estimates from the traditional case-control design as long as the probability of non-influenza ARI is similar among vaccinated and unvaccinated individuals. We did not find consistent differences between the standard errors of the estimates from the two study designs.
Estimating Climatological Bias Errors for the Global Precipitation Climatology Project (GPCP)
NASA Technical Reports Server (NTRS)
Adler, Robert; Gu, Guojun; Huffman, George
2012-01-01
A procedure is described to estimate bias errors for mean precipitation by using multiple estimates from different algorithms, satellite sources, and merged products. The Global Precipitation Climatology Project (GPCP) monthly product is used as a base precipitation estimate, with other input products included when they are within +/- 50% of the GPCP estimates on a zonal-mean basis (ocean and land separately). The standard deviation s of the included products is then taken to be the estimated systematic, or bias, error. The results allow one to examine monthly climatologies and the annual climatology, producing maps of estimated bias errors, zonal-mean errors, and estimated errors over large areas such as ocean and land for both the tropics and the globe. For ocean areas, where there is the largest question as to absolute magnitude of precipitation, the analysis shows spatial variations in the estimated bias errors, indicating areas where one should have more or less confidence in the mean precipitation estimates. In the tropics, relative bias error estimates (s/m, where m is the mean precipitation) over the eastern Pacific Ocean are as large as 20%, as compared with 10%-15% in the western Pacific part of the ITCZ. An examination of latitudinal differences over ocean clearly shows an increase in estimated bias error at higher latitudes, reaching up to 50%. Over land, the error estimates also locate regions of potential problems in the tropics and larger cold-season errors at high latitudes that are due to snow. An empirical technique to area average the gridded errors (s) is described that allows one to make error estimates for arbitrary areas and for the tropics and the globe (land and ocean separately, and combined). Over the tropics this calculation leads to a relative error estimate for tropical land and ocean combined of 7%, which is considered to be an upper bound because of the lack of sign-of-the-error canceling when integrating over different areas with a different number of input products. For the globe the calculated relative error estimate from this study is about 9%, which is also probably a slight overestimate. These tropical and global estimated bias errors provide one estimate of the current state of knowledge of the planet's mean precipitation.
Computer-aided diagnosis with potential application to rapid detection of disease outbreaks.
Burr, Tom; Koster, Frederick; Picard, Rick; Forslund, Dave; Wokoun, Doug; Joyce, Ed; Brillman, Judith; Froman, Phil; Lee, Jack
2007-04-15
Our objectives are to quickly interpret symptoms of emergency patients to identify likely syndromes and to improve population-wide disease outbreak detection. We constructed a database of 248 syndromes, each syndrome having an estimated probability of producing any of 85 symptoms, with some two-way, three-way, and five-way probabilities reflecting correlations among symptoms. Using these multi-way probabilities in conjunction with an iterative proportional fitting algorithm allows estimation of full conditional probabilities. Combining these conditional probabilities with misdiagnosis error rates and incidence rates via Bayes theorem, the probability of each syndrome is estimated. We tested a prototype of computer-aided differential diagnosis (CADDY) on simulated data and on more than 100 real cases, including West Nile Virus, Q fever, SARS, anthrax, plague, tularaemia and toxic shock cases. We conclude that: (1) it is important to determine whether the unrecorded positive status of a symptom means that the status is negative or that the status is unknown; (2) inclusion of misdiagnosis error rates produces more realistic results; (3) the naive Bayes classifier, which assumes all symptoms behave independently, is slightly outperformed by CADDY, which includes available multi-symptom information on correlations; as more information regarding symptom correlations becomes available, the advantage of CADDY over the naive Bayes classifier should increase; (4) overlooking low-probability, high-consequence events is less likely if the standard output summary is augmented with a list of rare syndromes that are consistent with observed symptoms, and (5) accumulating patient-level probabilities across a larger population can aid in biosurveillance for disease outbreaks. c 2007 John Wiley & Sons, Ltd.
Analytical performance evaluation of SAR ATR with inaccurate or estimated models
NASA Astrophysics Data System (ADS)
DeVore, Michael D.
2004-09-01
Hypothesis testing algorithms for automatic target recognition (ATR) are often formulated in terms of some assumed distribution family. The parameter values corresponding to a particular target class together with the distribution family constitute a model for the target's signature. In practice such models exhibit inaccuracy because of incorrect assumptions about the distribution family and/or because of errors in the assumed parameter values, which are often determined experimentally. Model inaccuracy can have a significant impact on performance predictions for target recognition systems. Such inaccuracy often causes model-based predictions that ignore the difference between assumed and actual distributions to be overly optimistic. This paper reports on research to quantify the effect of inaccurate models on performance prediction and to estimate the effect using only trained parameters. We demonstrate that for large observation vectors the class-conditional probabilities of error can be expressed as a simple function of the difference between two relative entropies. These relative entropies quantify the discrepancies between the actual and assumed distributions and can be used to express the difference between actual and predicted error rates. Focusing on the problem of ATR from synthetic aperture radar (SAR) imagery, we present estimators of the probabilities of error in both ideal and plug-in tests expressed in terms of the trained model parameters. These estimators are defined in terms of unbiased estimates for the first two moments of the sample statistic. We present an analytical treatment of these results and include demonstrations from simulated radar data.
Saviane, Chiara; Silver, R Angus
2006-06-15
Synapses play a crucial role in information processing in the brain. Amplitude fluctuations of synaptic responses can be used to extract information about the mechanisms underlying synaptic transmission and its modulation. In particular, multiple-probability fluctuation analysis can be used to estimate the number of functional release sites, the mean probability of release and the amplitude of the mean quantal response from fits of the relationship between the variance and mean amplitude of postsynaptic responses, recorded at different probabilities. To determine these quantal parameters, calculate their uncertainties and the goodness-of-fit of the model, it is important to weight the contribution of each data point in the fitting procedure. We therefore investigated the errors associated with measuring the variance by determining the best estimators of the variance of the variance and have used simulations of synaptic transmission to test their accuracy and reliability under different experimental conditions. For central synapses, which generally have a low number of release sites, the amplitude distribution of synaptic responses is not normal, thus the use of a theoretical variance of the variance based on the normal assumption is not a good approximation. However, appropriate estimators can be derived for the population and for limited sample sizes using a more general expression that involves higher moments and introducing unbiased estimators based on the h-statistics. Our results are likely to be relevant for various applications of fluctuation analysis when few channels or release sites are present.
Estimation of State Transition Probabilities: A Neural Network Model
NASA Astrophysics Data System (ADS)
Saito, Hiroshi; Takiyama, Ken; Okada, Masato
2015-12-01
Humans and animals can predict future states on the basis of acquired knowledge. This prediction of the state transition is important for choosing the best action, and the prediction is only possible if the state transition probability has already been learned. However, how our brains learn the state transition probability is unknown. Here, we propose a simple algorithm for estimating the state transition probability by utilizing the state prediction error. We analytically and numerically confirmed that our algorithm is able to learn the probability completely with an appropriate learning rate. Furthermore, our learning rule reproduced experimentally reported psychometric functions and neural activities in the lateral intraparietal area in a decision-making task. Thus, our algorithm might describe the manner in which our brains learn state transition probabilities and predict future states.
Using satellite radiotelemetry data to delineate and manage wildlife populations
Amstrup, Steven C.; McDonald, T.L.; Durner, George M.
2004-01-01
The greatest promise of radiotelemetry always has been a better understanding of animal movements. Telemetry has helped us know when animals are active, how active they are, how far and how fast they move, the geographic areas they occupy, and whether individuals vary in these traits. Unfortunately, the inability to estimate the error in animals utilization distributions (UDs), has prevented probabilistic linkage of movements data, which are always retrospective, with future management actions. We used the example of the harvested population of polar bears (Ursus maritimus) in the Southern Beaufort Sea to illustrate a method that provides that linkage. We employed a 2-dimensional Gaussian kernel density estimator to smooth and scale frequencies of polar bear radio locations within cells of a grid overlying our study area. True 2-dimensional smoothing allowed us to create accurate descriptions of the UDs of individuals and groups of bears. We used a new method of clustering, based upon the relative use collared bears made of each cell in our grid, to assign individual animals to populations. We applied the fast Fourier transform to make bootstrapped estimates of the error in UDs computationally feasible. Clustering and kernel smoothing identified 3 populations of polar bears in the region between Wrangel Island, Russia, and Banks Island, Canada. The relative probability of occurrence of animals from each population varied significantly among grid cells distributed across the study area. We displayed occurrence probabilities as contour maps wherein each contour line corresponded with a change in relative probability. Only at the edges of our study area and in some offshore regions were bootstrapped estimates of error in occurrence probabilities too high to allow prediction. Error estimates, which also were displayed as contours, allowed us to show that occurrence probabilities did not vary by season. Near Barrow, Alaska, 50% of bears observed are predicted to be from the Chukchi Sea population and 50% from the Southern Beaufort Sea population. At Tuktoyaktuk, Northwest Territories, Canada, 50% are from the Southern Beaufort Sea and 50% from the Northern Beaufort Sea population. The methods described here will aid managers of all wildlife that can be studied by telemetry to allocate harvests and other human perturbations to the appropriate populations, make risk assessments, and predict impacts of human activities. They will aid researchers by providing the refined descriptions of study populations that are necessary for population estimation and other investigative tasks. Arctic, Beaufort Sea, boundaries, clustering, Fourier transform, kernel, management, polar bears, population delineation, radiotelemetry, satellite, smoothing, Ursus maritimus
Improved Analysis of GW150914 Using a Fully Spin-Precessing Waveform Model
NASA Astrophysics Data System (ADS)
Abbott, B. P.; Abbott, R.; Abbott, T. D.; Abernathy, M. R.; Acernese, F.; Ackley, K.; Adams, C.; Adams, T.; Addesso, P.; Adhikari, R. X.; Adya, V. B.; Affeldt, C.; Agathos, M.; Agatsuma, K.; Aggarwal, N.; Aguiar, O. D.; Aiello, L.; Ain, A.; Ajith, P.; Allen, B.; Allocca, A.; Altin, P. A.; Anderson, S. B.; Anderson, W. G.; Arai, K.; Araya, M. C.; Arceneaux, C. C.; Areeda, J. S.; Arnaud, N.; Arun, K. G.; Ascenzi, S.; Ashton, G.; Ast, M.; Aston, S. M.; Astone, P.; Aufmuth, P.; Aulbert, C.; Babak, S.; Bacon, P.; Bader, M. K. M.; Baker, P. T.; Baldaccini, F.; Ballardin, G.; Ballmer, S. W.; Barayoga, J. C.; Barclay, S. E.; Barish, B. C.; Barker, D.; Barone, F.; Barr, B.; Barsotti, L.; Barsuglia, M.; Barta, D.; Bartlett, J.; Bartos, I.; Bassiri, R.; Basti, A.; Batch, J. C.; Baune, C.; Bavigadda, V.; Bazzan, M.; Bejger, M.; Bell, A. S.; Berger, B. K.; Bergmann, G.; Berry, C. P. L.; Bersanetti, D.; Bertolini, A.; Betzwieser, J.; Bhagwat, S.; Bhandare, R.; Bilenko, I. A.; Billingsley, G.; Birch, J.; Birney, R.; Birnholtz, O.; Biscans, S.; Bisht, A.; Bitossi, M.; Biwer, C.; Bizouard, M. A.; Blackburn, J. K.; Blair, C. D.; Blair, D. G.; Blair, R. M.; Bloemen, S.; Bock, O.; Boer, M.; Bogaert, G.; Bogan, C.; Bohe, A.; Bond, C.; Bondu, F.; Bonnand, R.; Boom, B. A.; Bork, R.; Boschi, V.; Bose, S.; Bouffanais, Y.; Bozzi, A.; Bradaschia, C.; Brady, P. R.; Braginsky, V. B.; Branchesi, M.; Brau, J. E.; Briant, T.; Brillet, A.; Brinkmann, M.; Brisson, V.; Brockill, P.; Broida, J. E.; Brooks, A. F.; Brown, D. A.; Brown, D. D.; Brown, N. M.; Brunett, S.; Buchanan, C. C.; Buikema, A.; Bulik, T.; Bulten, H. J.; Buonanno, A.; Buskulic, D.; Buy, C.; Byer, R. L.; Cabero, M.; Cadonati, L.; Cagnoli, G.; Cahillane, C.; Calderón Bustillo, J.; Callister, T.; Calloni, E.; Camp, J. B.; Cannon, K. C.; Cao, J.; Capano, C. D.; Capocasa, E.; Carbognani, F.; Caride, S.; Casanueva Diaz, C.; Casentini, J.; Caudill, S.; Cavaglià, M.; Cavalier, F.; Cavalieri, R.; Cella, G.; Cepeda, C. B.; Cerboni Baiardi, L.; Cerretani, G.; Cesarini, E.; Chamberlin, S. J.; Chan, M.; Chao, S.; Charlton, P.; Chassande-Mottin, E.; Cheeseboro, B. D.; Chen, H. Y.; Chen, Y.; Cheng, C.; Chincarini, A.; Chiummo, A.; Cho, H. S.; Cho, M.; Chow, J. H.; Christensen, N.; Chu, Q.; Chua, S.; Chung, S.; Ciani, G.; Clara, F.; Clark, J. A.; Cleva, F.; Coccia, E.; Cohadon, P.-F.; Colla, A.; Collette, C. G.; Cominsky, L.; Constancio, M.; Conte, A.; Conti, L.; Cook, D.; Corbitt, T. R.; Cornish, N.; Corsi, A.; Cortese, S.; Costa, C. A.; Coughlin, M. W.; Coughlin, S. B.; Coulon, J.-P.; Countryman, S. T.; Couvares, P.; Cowan, E. E.; Coward, D. M.; Cowart, M. J.; Coyne, D. C.; Coyne, R.; Craig, K.; Creighton, J. D. E.; Cripe, J.; Crowder, S. G.; Cumming, A.; Cunningham, L.; Cuoco, E.; Dal Canton, T.; Danilishin, S. L.; D'Antonio, S.; Danzmann, K.; Darman, N. S.; Dasgupta, A.; Da Silva Costa, C. F.; Dattilo, V.; Dave, I.; Davier, M.; Davies, G. S.; Daw, E. J.; Day, R.; De, S.; DeBra, D.; Debreczeni, G.; Degallaix, J.; De Laurentis, M.; Deléglise, S.; Del Pozzo, W.; Denker, T.; Dent, T.; Dergachev, V.; De Rosa, R.; DeRosa, R. T.; DeSalvo, R.; Devine, R. C.; Dhurandhar, S.; Díaz, M. C.; Di Fiore, L.; Di Giovanni, M.; Di Girolamo, T.; Di Lieto, A.; Di Pace, S.; Di Palma, I.; Di Virgilio, A.; Dolique, V.; Donovan, F.; Dooley, K. L.; Doravari, S.; Douglas, R.; Downes, T. P.; Drago, M.; Drever, R. W. P.; Driggers, J. C.; Ducrot, M.; Dwyer, S. E.; Edo, T. B.; Edwards, M. C.; Effler, A.; Eggenstein, H.-B.; Ehrens, P.; Eichholz, J.; Eikenberry, S. S.; Engels, W.; Essick, R. C.; Etienne, Z.; Etzel, T.; Evans, M.; Evans, T. M.; Everett, R.; Factourovich, M.; Fafone, V.; Fair, H.; Fairhurst, S.; Fan, X.; Fang, Q.; Farinon, S.; Farr, B.; Farr, W. M.; Fauchon-Jones, E.; Favata, M.; Fays, M.; Fehrmann, H.; Fejer, M. M.; Fenyvesi, E.; Ferrante, I.; Ferreira, E. C.; Ferrini, F.; Fidecaro, F.; Fiori, I.; Fiorucci, D.; Fisher, R. P.; Flaminio, R.; Fletcher, M.; Fournier, J.-D.; Frasca, S.; Frasconi, F.; Frei, Z.; Freise, A.; Frey, R.; Frey, V.; Fritschel, P.; Frolov, V. V.; Fulda, P.; Fyffe, M.; Gabbard, H. A. G.; Gaebel, S.; Gair, J. R.; Gammaitoni, L.; Gaonkar, S. G.; Garufi, F.; Gaur, G.; Gehrels, N.; Gemme, G.; Geng, P.; Genin, E.; Gennai, A.; George, J.; Gergely, L.; Germain, V.; Ghosh, Abhirup; Ghosh, Archisman; Ghosh, S.; Giaime, J. A.; Giardina, K. D.; Giazotto, A.; Gill, K.; Glaefke, A.; Goetz, E.; Goetz, R.; Gondan, L.; González, G.; Gonzalez Castro, J. M.; Gopakumar, A.; Gordon, N. A.; Gorodetsky, M. L.; Gossan, S. E.; Gosselin, M.; Gouaty, R.; Grado, A.; Graef, C.; Graff, P. B.; Granata, M.; Grant, A.; Gras, S.; Gray, C.; Greco, G.; Green, A. C.; Groot, P.; Grote, H.; Grunewald, S.; Guidi, G. M.; Guo, X.; Gupta, A.; Gupta, M. K.; Gushwa, K. E.; Gustafson, E. K.; Gustafson, R.; Hacker, J. J.; Hall, B. R.; Hall, E. D.; Hammond, G.; Haney, M.; Hanke, M. M.; Hanks, J.; Hanna, C.; Hannam, M. D.; Hanson, J.; Hardwick, T.; Harms, J.; Harry, G. M.; Harry, I. W.; Hart, M. J.; Hartman, M. T.; Haster, C.-J.; Haughian, K.; Healy, J.; Heidmann, A.; Heintze, M. C.; Heitmann, H.; Hello, P.; Hemming, G.; Hendry, M.; Heng, I. S.; Hennig, J.; Henry, J.; Heptonstall, A. W.; Heurs, M.; Hild, S.; Hoak, D.; Hofman, D.; Holt, K.; Holz, D. E.; Hopkins, P.; Hough, J.; Houston, E. A.; Howell, E. J.; Hu, Y. M.; Huang, S.; Huerta, E. A.; Huet, D.; Hughey, B.; Husa, S.; Huttner, S. H.; Huynh-Dinh, T.; Indik, N.; Ingram, D. R.; Inta, R.; Isa, H. N.; Isac, J.-M.; Isi, M.; Isogai, T.; Iyer, B. R.; Izumi, K.; Jacqmin, T.; Jang, H.; Jani, K.; Jaranowski, P.; Jawahar, S.; Jian, L.; Jiménez-Forteza, F.; Johnson, W. W.; Johnson-McDaniel, N. K.; Jones, D. I.; Jones, R.; Jonker, R. J. G.; Ju, L.; K, Haris; Kalaghatgi, C. V.; Kalogera, V.; Kandhasamy, S.; Kang, G.; Kanner, J. B.; Kapadia, S. J.; Karki, S.; Karvinen, K. S.; Kasprzack, M.; Katsavounidis, E.; Katzman, W.; Kaufer, S.; Kaur, T.; Kawabe, K.; Kéfélian, F.; Kehl, M. S.; Keitel, D.; Kelley, D. B.; Kells, W.; Kennedy, R.; Key, J. S.; Khalili, F. Y.; Khan, I.; Khan, S.; Khan, Z.; Khazanov, E. A.; Kijbunchoo, N.; Kim, Chi-Woong; Kim, Chunglee; Kim, J.; Kim, K.; Kim, N.; Kim, W.; Kim, Y.-M.; Kimbrell, S. J.; King, E. J.; King, P. J.; Kissel, J. S.; Klein, B.; Kleybolte, L.; Klimenko, S.; Koehlenbeck, S. M.; Koley, S.; Kondrashov, V.; Kontos, A.; Korobko, M.; Korth, W. Z.; Kowalska, I.; Kozak, D. B.; Kringel, V.; Królak, A.; Krueger, C.; Kuehn, G.; Kumar, P.; Kumar, R.; Kuo, L.; Kutynia, A.; Lackey, B. D.; Landry, M.; Lange, J.; Lantz, B.; Lasky, P. D.; Laxen, M.; Lazzarini, A.; Lazzaro, C.; Leaci, P.; Leavey, S.; Lebigot, E. O.; Lee, C. H.; Lee, H. K.; Lee, H. M.; Lee, K.; Lenon, A.; Leonardi, M.; Leong, J. R.; Leroy, N.; Letendre, N.; Levin, Y.; Lewis, J. B.; Li, T. G. F.; Libson, A.; Littenberg, T. B.; Lockerbie, N. A.; Lombardi, A. L.; London, L. T.; Lord, J. E.; Lorenzini, M.; Loriette, V.; Lormand, M.; Losurdo, G.; Lough, J. D.; Lousto, C. O.; Lovelace, G.; Lück, H.; Lundgren, A. P.; Lynch, R.; Ma, Y.; Machenschalk, B.; MacInnis, M.; Macleod, D. M.; Magaña-Sandoval, F.; Magaña Zertuche, L.; Magee, R. M.; Majorana, E.; Maksimovic, I.; Malvezzi, V.; Man, N.; Mandic, V.; Mangano, V.; Mansell, G. L.; Manske, M.; Mantovani, M.; Marchesoni, F.; Marion, F.; Márka, S.; Márka, Z.; Markosyan, A. S.; Maros, E.; Martelli, F.; Martellini, L.; Martin, I. W.; Martynov, D. V.; Marx, J. N.; Mason, K.; Masserot, A.; Massinger, T. J.; Masso-Reid, M.; Mastrogiovanni, S.; Matichard, F.; Matone, L.; Mavalvala, N.; Mazumder, N.; McCarthy, R.; McClelland, D. E.; McCormick, S.; McGuire, S. C.; McIntyre, G.; McIver, J.; McManus, D. J.; McRae, T.; McWilliams, S. T.; Meacher, D.; Meadors, G. D.; Meidam, J.; Melatos, A.; Mendell, G.; Mercer, R. A.; Merilh, E. L.; Merzougui, M.; Meshkov, S.; Messenger, C.; Messick, C.; Metzdorff, R.; Meyers, P. M.; Mezzani, F.; Miao, H.; Michel, C.; Middleton, H.; Mikhailov, E. E.; Milano, L.; Miller, A. L.; Miller, A.; Miller, B. B.; Miller, J.; Millhouse, M.; Minenkov, Y.; Ming, J.; Mirshekari, S.; Mishra, C.; Mitra, S.; Mitrofanov, V. P.; Mitselmakher, G.; Mittleman, R.; Moggi, A.; Mohan, M.; Mohapatra, S. R. P.; Montani, M.; Moore, B. C.; Moore, C. J.; Moraru, D.; Moreno, G.; Morriss, S. R.; Mossavi, K.; Mours, B.; Mow-Lowry, C. M.; Mueller, G.; Muir, A. W.; Mukherjee, Arunava; Mukherjee, D.; Mukherjee, S.; Mukund, N.; Mullavey, A.; Munch, J.; Murphy, D. J.; Murray, P. G.; Mytidis, A.; Nardecchia, I.; Naticchioni, L.; Nayak, R. K.; Nedkova, K.; Nelemans, G.; Nelson, T. J. N.; Neri, M.; Neunzert, A.; Newton, G.; Nguyen, T. T.; Nielsen, A. B.; Nissanke, S.; Nitz, A.; Nocera, F.; Nolting, D.; Normandin, M. E. N.; Nuttall, L. K.; Oberling, J.; Ochsner, E.; O'Dell, J.; Oelker, E.; Ogin, G. H.; Oh, J. J.; Oh, S. H.; Ohme, F.; Oliver, M.; Oppermann, P.; Oram, Richard J.; O'Reilly, B.; O'Shaughnessy, R.; Ottaway, D. J.; Overmier, H.; Owen, B. J.; Pai, A.; Pai, S. A.; Palamos, J. R.; Palashov, O.; Palomba, C.; Pal-Singh, A.; Pan, H.; Pankow, C.; Pannarale, F.; Pant, B. C.; Paoletti, F.; Paoli, A.; Papa, M. A.; Paris, H. R.; Parker, W.; Pascucci, D.; Pasqualetti, A.; Passaquieti, R.; Passuello, D.; Patricelli, B.; Patrick, Z.; Pearlstone, B. L.; Pedraza, M.; Pedurand, R.; Pekowsky, L.; Pele, A.; Penn, S.; Perreca, A.; Perri, L. M.; Pfeiffer, H. P.; Phelps, M.; Piccinni, O. J.; Pichot, M.; Piergiovanni, F.; Pierro, V.; Pillant, G.; Pinard, L.; Pinto, I. M.; Pitkin, M.; Poe, M.; Poggiani, R.; Popolizio, P.; Post, A.; Powell, J.; Prasad, J.; Predoi, V.; Prestegard, T.; Price, L. R.; Prijatelj, M.; Principe, M.; Privitera, S.; Prix, R.; Prodi, G. A.; Prokhorov, L.; Puncken, O.; Punturo, M.; Puppo, P.; Pürrer, M.; Qi, H.; Qin, J.; Qiu, S.; Quetschke, V.; Quintero, E. A.; Quitzow-James, R.; Raab, F. J.; Rabeling, D. S.; Radkins, H.; Raffai, P.; Raja, S.; Rajan, C.; Rakhmanov, M.; Rapagnani, P.; Raymond, V.; Razzano, M.; Re, V.; Read, J.; Reed, C. M.; Regimbau, T.; Rei, L.; Reid, S.; Reitze, D. H.; Rew, H.; Reyes, S. D.; Ricci, F.; Riles, K.; Rizzo, M.; Robertson, N. A.; Robie, R.; Robinet, F.; Rocchi, A.; Rolland, L.; Rollins, J. G.; Roma, V. J.; Romano, R.; Romanov, G.; Romie, J. H.; Rosińska, D.; Rowan, S.; Rüdiger, A.; Ruggi, P.; Ryan, K.; Sachdev, S.; Sadecki, T.; Sadeghian, L.; Sakellariadou, M.; Salconi, L.; Saleem, M.; Salemi, F.; Samajdar, A.; Sammut, L.; Sanchez, E. J.; Sandberg, V.; Sandeen, B.; Sanders, J. R.; Sassolas, B.; Sathyaprakash, B. S.; Saulson, P. R.; Sauter, O. E. S.; Savage, R. L.; Sawadsky, A.; Schale, P.; Schilling, R.; Schmidt, J.; Schmidt, P.; Schnabel, R.; Schofield, R. M. S.; Schönbeck, A.; Schreiber, E.; Schuette, D.; Schutz, B. F.; Scott, J.; Scott, S. M.; Sellers, D.; Sengupta, A. S.; Sentenac, D.; Sequino, V.; Sergeev, A.; Setyawati, Y.; Shaddock, D. A.; Shaffer, T.; Shahriar, M. S.; Shaltev, M.; Shapiro, B.; Shawhan, P.; Sheperd, A.; Shoemaker, D. H.; Shoemaker, D. M.; Siellez, K.; Siemens, X.; Sieniawska, M.; Sigg, D.; Silva, A. D.; Singer, A.; Singer, L. P.; Singh, A.; Singh, R.; Singhal, A.; Sintes, A. M.; Slagmolen, B. J. J.; Smith, J. R.; Smith, N. D.; Smith, R. J. E.; Son, E. J.; Sorazu, B.; Sorrentino, F.; Souradeep, T.; Srivastava, A. K.; Staley, A.; Steinke, M.; Steinlechner, J.; Steinlechner, S.; Steinmeyer, D.; Stephens, B. C.; Stevenson, S. P.; Stone, R.; Strain, K. A.; Straniero, N.; Stratta, G.; Strauss, N. A.; Strigin, S.; Sturani, R.; Stuver, A. L.; Summerscales, T. Z.; Sun, L.; Sunil, S.; Sutton, P. J.; Swinkels, B. L.; Szczepańczyk, M. J.; Tacca, M.; Talukder, D.; Tanner, D. B.; Tápai, M.; Tarabrin, S. P.; Taracchini, A.; Taylor, R.; Theeg, T.; Thirugnanasambandam, M. P.; Thomas, E. G.; Thomas, M.; Thomas, P.; Thorne, K. A.; Thorne, K. S.; Thrane, E.; Tiwari, S.; Tiwari, V.; Tokmakov, K. V.; Toland, K.; Tomlinson, C.; Tonelli, M.; Tornasi, Z.; Torres, C. V.; Torrie, C. I.; Töyrä, D.; Travasso, F.; Traylor, G.; Trifirò, D.; Tringali, M. C.; Trozzo, L.; Tse, M.; Turconi, M.; Tuyenbayev, D.; Ugolini, D.; Unnikrishnan, C. S.; Urban, A. L.; Usman, S. A.; Vahlbruch, H.; Vajente, G.; Valdes, G.; Vallisneri, M.; van Bakel, N.; van Beuzekom, M.; van den Brand, J. F. J.; Van Den Broeck, C.; Vander-Hyde, D. C.; van der Schaaf, L.; van der Sluys, M. V.; van Heijningen, J. V.; Vano-Vinuales, A.; van Veggel, A. A.; Vardaro, M.; Vass, S.; Vasúth, M.; Vaulin, R.; Vecchio, A.; Vedovato, G.; Veitch, J.; Veitch, P. J.; Venkateswara, K.; Verkindt, D.; Vetrano, F.; Viceré, A.; Vinciguerra, S.; Vine, D. J.; Vinet, J.-Y.; Vitale, S.; Vo, T.; Vocca, H.; Vorvick, C.; Voss, D. V.; Vousden, W. D.; Vyatchanin, S. P.; Wade, A. R.; Wade, L. E.; Wade, M.; Walker, M.; Wallace, L.; Walsh, S.; Wang, G.; Wang, H.; Wang, M.; Wang, X.; Wang, Y.; Ward, R. L.; Warner, J.; Was, M.; Weaver, B.; Wei, L.-W.; Weinert, M.; Weinstein, A. J.; Weiss, R.; Wen, L.; Weßels, P.; Westphal, T.; Wette, K.; Whelan, J. T.; Whiting, B. F.; Williams, R. D.; Williamson, A. R.; Willis, J. L.; Willke, B.; Wimmer, M. H.; Winkler, W.; Wipf, C. C.; Wittel, H.; Woan, G.; Woehler, J.; Worden, J.; Wright, J. L.; Wu, D. S.; Wu, G.; Yablon, J.; Yam, W.; Yamamoto, H.; Yancey, C. C.; Yu, H.; Yvert, M.; ZadroŻny, A.; Zangrando, L.; Zanolin, M.; Zendri, J.-P.; Zevin, M.; Zhang, L.; Zhang, M.; Zhang, Y.; Zhao, C.; Zhou, M.; Zhou, Z.; Zhu, X. J.; Zucker, M. E.; Zuraw, S. E.; Zweizig, J.; Boyle, M.; Brügmann, B.; Campanelli, M.; Chu, T.; Clark, M.; Haas, R.; Hemberger, D.; Hinder, I.; Kidder, L. E.; Kinsey, M.; Laguna, P.; Ossokine, S.; Pan, Y.; Röver, C.; Scheel, M.; Szilagyi, B.; Teukolsky, S.; Zlochower, Y.; LIGO Scientific Collaboration; Virgo Collaboration
2016-10-01
This paper presents updated estimates of source parameters for GW150914, a binary black-hole coalescence event detected by the Laser Interferometer Gravitational-wave Observatory (LIGO) in 2015 [Abbott et al. Phys. Rev. Lett. 116, 061102 (2016).]. Abbott et al. [Phys. Rev. Lett. 116, 241102 (2016).] presented parameter estimation of the source using a 13-dimensional, phenomenological precessing-spin model (precessing IMRPhenom) and an 11-dimensional nonprecessing effective-one-body (EOB) model calibrated to numerical-relativity simulations, which forces spin alignment (nonprecessing EOBNR). Here, we present new results that include a 15-dimensional precessing-spin waveform model (precessing EOBNR) developed within the EOB formalism. We find good agreement with the parameters estimated previously [Abbott et al. Phys. Rev. Lett. 116, 241102 (2016).], and we quote updated component masses of 35-3+5 M⊙ and 3 0-4+3 M⊙ (where errors correspond to 90% symmetric credible intervals). We also present slightly tighter constraints on the dimensionless spin magnitudes of the two black holes, with a primary spin estimate <0.65 and a secondary spin estimate <0.75 at 90% probability. Abbott et al. [Phys. Rev. Lett. 116, 241102 (2016).] estimated the systematic parameter-extraction errors due to waveform-model uncertainty by combining the posterior probability densities of precessing IMRPhenom and nonprecessing EOBNR. Here, we find that the two precessing-spin models are in closer agreement, suggesting that these systematic errors are smaller than previously quoted.
Patrick L. Zimmerman; Greg C. Liknes
2010-01-01
Dot grids are often used to estimate the proportion of land cover belonging to some class in an aerial photograph. Interpreter misclassification is an often-ignored source of error in dot-grid sampling that has the potential to significantly bias proportion estimates. For the case when the true class of items is unknown, we present a maximum-likelihood estimator of...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Yunlong; Wang, Aiping; Guo, Lei
This paper presents an error-entropy minimization tracking control algorithm for a class of dynamic stochastic system. The system is represented by a set of time-varying discrete nonlinear equations with non-Gaussian stochastic input, where the statistical properties of stochastic input are unknown. By using Parzen windowing with Gaussian kernel to estimate the probability densities of errors, recursive algorithms are then proposed to design the controller such that the tracking error can be minimized. The performance of the error-entropy minimization criterion is compared with the mean-square-error minimization in the simulation results.
Survival estimates for Florida manatees from the photo-identification of individuals
Langtimm, C.A.; Beck, C.A.; Edwards, H.H.; Fick-Child, K. J.; Ackerman, B.B.; Barton, S.L.; Hartley, W.C.
2004-01-01
We estimated adult survival probabilities for the endangered Florida manatee (Trichechus manatus latirostris) in four regional populations using photo-identification data and open-population capture-recapture statistical models. The mean annual adult survival probability over the most recent 10-yr period of available estimates was as follows: Northwest - 0.956 (SE 0.007), Upper St. Johns River - 0.960 (0.011), Atlantic Coast - 0.937 (0.008), and Southwest - 0.908 (0.019). Estimates of temporal variance independent of sampling error, calculated from the survival estimates, indicated constant survival in the Upper St. Johns River, true temporal variability in the Northwest and Atlantic Coast, and large sampling variability obscuring estimates for the Southwest. Calf and subadult survival probabilities were estimated for the Upper St. Johns River from the only available data for known-aged individuals: 0.810 (95% CI 0.727-0.873) for 1st year calves, 0.915 (0.827-0.960) for 2nd year calves, and 0.969 (0.946-0.982) for manatee 3 yr or older. These estimates of survival probabilities and temporal variance, in conjunction with estimates of reproduction probabilities from photoidentification data can be used to model manatee population dynamics, estimate population growth rates, and provide an integrated measure of regional status.
Multistage classification of multispectral Earth observational data: The design approach
NASA Technical Reports Server (NTRS)
Bauer, M. E. (Principal Investigator); Muasher, M. J.; Landgrebe, D. A.
1981-01-01
An algorithm is proposed which predicts the optimal features at every node in a binary tree procedure. The algorithm estimates the probability of error by approximating the area under the likelihood ratio function for two classes and taking into account the number of training samples used in estimating each of these two classes. Some results on feature selection techniques, particularly in the presence of a very limited set of training samples, are presented. Results comparing probabilities of error predicted by the proposed algorithm as a function of dimensionality as compared to experimental observations are shown for aircraft and LANDSAT data. Results are obtained for both real and simulated data. Finally, two binary tree examples which use the algorithm are presented to illustrate the usefulness of the procedure.
ERIC Educational Resources Information Center
Rodriguez, Paul F.
2009-01-01
Memory systems are known to be influenced by feedback and error processing, but it is not well known what aspects of outcome contingencies are related to different memory systems. Here we use the Rescorla-Wagner model to estimate prediction errors in an fMRI study of stimulus-outcome association learning. The conditional probabilities of outcomes…
An Enhanced Non-Coherent Pre-Filter Design for Tracking Error Estimation in GNSS Receivers.
Luo, Zhibin; Ding, Jicheng; Zhao, Lin; Wu, Mouyan
2017-11-18
Tracking error estimation is of great importance in global navigation satellite system (GNSS) receivers. Any inaccurate estimation for tracking error will decrease the signal tracking ability of signal tracking loops and the accuracies of position fixing, velocity determination, and timing. Tracking error estimation can be done by traditional discriminator, or Kalman filter-based pre-filter. The pre-filter can be divided into two categories: coherent and non-coherent. This paper focuses on the performance improvements of non-coherent pre-filter. Firstly, the signal characteristics of coherent and non-coherent integration-which are the basis of tracking error estimation-are analyzed in detail. After that, the probability distribution of estimation noise of four-quadrant arctangent (ATAN2) discriminator is derived according to the mathematical model of coherent integration. Secondly, the statistical property of observation noise of non-coherent pre-filter is studied through Monte Carlo simulation to set the observation noise variance matrix correctly. Thirdly, a simple fault detection and exclusion (FDE) structure is introduced to the non-coherent pre-filter design, and thus its effective working range for carrier phase error estimation extends from (-0.25 cycle, 0.25 cycle) to (-0.5 cycle, 0.5 cycle). Finally, the estimation accuracies of discriminator, coherent pre-filter, and the enhanced non-coherent pre-filter are evaluated comprehensively through the carefully designed experiment scenario. The pre-filter outperforms traditional discriminator in estimation accuracy. In a highly dynamic scenario, the enhanced non-coherent pre-filter provides accuracy improvements of 41.6%, 46.4%, and 50.36% for carrier phase error, carrier frequency error, and code phase error estimation, respectively, when compared with coherent pre-filter. The enhanced non-coherent pre-filter outperforms the coherent pre-filter in code phase error estimation when carrier-to-noise density ratio is less than 28.8 dB-Hz, in carrier frequency error estimation when carrier-to-noise density ratio is less than 20 dB-Hz, and in carrier phase error estimation when carrier-to-noise density belongs to (15, 23) dB-Hz ∪ (26, 50) dB-Hz.
The influence of random element displacement on DOA estimates obtained with (Khatri-Rao-)root-MUSIC.
Inghelbrecht, Veronique; Verhaevert, Jo; van Hecke, Tanja; Rogier, Hendrik
2014-11-11
Although a wide range of direction of arrival (DOA) estimation algorithms has been described for a diverse range of array configurations, no specific stochastic analysis framework has been established to assess the probability density function of the error on DOA estimates due to random errors in the array geometry. Therefore, we propose a stochastic collocation method that relies on a generalized polynomial chaos expansion to connect the statistical distribution of random position errors to the resulting distribution of the DOA estimates. We apply this technique to the conventional root-MUSIC and the Khatri-Rao-root-MUSIC methods. According to Monte-Carlo simulations, this novel approach yields a speedup by a factor of more than 100 in terms of CPU-time for a one-dimensional case and by a factor of 56 for a two-dimensional case.
NASA Astrophysics Data System (ADS)
Berlanga, Juan M.; Harbaugh, John W.
The Tabasco region contains a number of major oilfields, including some of the emerging "giant" oil fields which have received extensive publicity. Fields in the Tabasco region are associated with large geologic structures which are detected readily by seismic surveys. The structures seem to be associated with deepseated movement of salt, and they are complexly faulted. Some structures have as much as 1000 milliseconds relief of seismic lines. A study, interpreting the structure of the area, used initially only a fraction of the total seismic lines That part of Tabasco region that has been studied was surveyed with a close-spaced rectilinear network of seismic lines. A, interpreting the structure of the area, used initially only a fraction of the total seismic data available. The purpose was to compare "predictions" of reflection time based on widely spaced seismic lines, with "results" obtained along more closely spaced lines. This process of comparison simulates the sequence of events in which a reconnaissance network of seismic lines is used to guide a succession of progressively more closely spaced lines. A square gridwork was established with lines spaced at 10 km intervals, and using machine contour maps, compared the results with those obtained with seismic grids employing spacings of 5 and 2.5 km respectively. The comparisons of predictions based on widely spaced lines with observations along closely spaced lines provide information by which an error function can be established. The error at any point can be defined as the difference between the predicted value for that point, and the subsequently observed value at that point. Residuals obtained by fitting third-degree polynomial trend surfaces were used for comparison. The root mean square of the error measurement, (expressed in seconds or milliseconds reflection time) was found to increase more or less linearly with distance from the nearest seismic point. Oil-occurrence probabilities were established on the basis of frequency distributions of trend-surface residuals obtained by fitting and subtracting polynomial trend surfaces from the machine-contoured reflection time maps. We found that there is a strong preferential relationship between the occurrence of petroleum (i.e. its presence versus absence) and particular ranges of trend-surface residual values. An estimate of the probability of oil occurring at any particular geographic point can be calculated on the basis of the estimated trend-surface residual value. This estimate, however, must be tempered by the probable error in the estimate of the residual value provided by the error function. The result, we believe, is a simple but effective procedure for estimating exploration outcome probabilities where seismic data provide the principal form of information in advance of drilling. Implicit in this approach is the comparison between a maturely explored area, for which both seismic and production data are available, and which serves as a statistical "training area", with the "target" area which is undergoing exploration and for which probability forecasts are to be calculated.
Probability theory, not the very guide of life.
Juslin, Peter; Nilsson, Håkan; Winman, Anders
2009-10-01
Probability theory has long been taken as the self-evident norm against which to evaluate inductive reasoning, and classical demonstrations of violations of this norm include the conjunction error and base-rate neglect. Many of these phenomena require multiplicative probability integration, whereas people seem more inclined to linear additive integration, in part, at least, because of well-known capacity constraints on controlled thought. In this article, the authors show with computer simulations that when based on approximate knowledge of probabilities, as is routinely the case in natural environments, linear additive integration can yield as accurate estimates, and as good average decision returns, as estimates based on probability theory. It is proposed that in natural environments people have little opportunity or incentive to induce the normative rules of probability theory and, given their cognitive constraints, linear additive integration may often offer superior bounded rationality.
Propensity Score Weighting with Error-Prone Covariates
ERIC Educational Resources Information Center
McCaffrey, Daniel F.; Lockwood, J. R.; Setodji, Claude M.
2011-01-01
Inverse probability weighting (IPW) estimates are widely used in applications where data are missing due to nonresponse or censoring or in observational studies of causal effects where the counterfactuals cannot be observed. This extensive literature has shown the estimators to be consistent and asymptotically normal under very general conditions,…
Wang, Dan; Silkie, Sarah S; Nelson, Kara L; Wuertz, Stefan
2010-09-01
Cultivation- and library-independent, quantitative PCR-based methods have become the method of choice in microbial source tracking. However, these qPCR assays are not 100% specific and sensitive for the target sequence in their respective hosts' genome. The factors that can lead to false positive and false negative information in qPCR results are well defined. It is highly desirable to have a way of removing such false information to estimate the true concentration of host-specific genetic markers and help guide the interpretation of environmental monitoring studies. Here we propose a statistical model based on the Law of Total Probability to predict the true concentration of these markers. The distributions of the probabilities of obtaining false information are estimated from representative fecal samples of known origin. Measurement error is derived from the sample precision error of replicated qPCR reactions. Then, the Monte Carlo method is applied to sample from these distributions of probabilities and measurement error. The set of equations given by the Law of Total Probability allows one to calculate the distribution of true concentrations, from which their expected value, confidence interval and other statistical characteristics can be easily evaluated. The output distributions of predicted true concentrations can then be used as input to watershed-wide total maximum daily load determinations, quantitative microbial risk assessment and other environmental models. This model was validated by both statistical simulations and real world samples. It was able to correct the intrinsic false information associated with qPCR assays and output the distribution of true concentrations of Bacteroidales for each animal host group. Model performance was strongly affected by the precision error. It could perform reliably and precisely when the standard deviation of the precision error was small (≤ 0.1). Further improvement on the precision of sample processing and qPCR reaction would greatly improve the performance of the model. This methodology, built upon Bacteroidales assays, is readily transferable to any other microbial source indicator where a universal assay for fecal sources of that indicator exists. Copyright © 2010 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Chang, Alfred T. C.; Chiu, Long S.; Wilheit, Thomas T.
1993-01-01
Global averages and random errors associated with the monthly oceanic rain rates derived from the Special Sensor Microwave/Imager (SSM/I) data using the technique developed by Wilheit et al. (1991) are computed. Accounting for the beam-filling bias, a global annual average rain rate of 1.26 m is computed. The error estimation scheme is based on the existence of independent (morning and afternoon) estimates of the monthly mean. Calculations show overall random errors of about 50-60 percent for each 5 deg x 5 deg box. The results are insensitive to different sampling strategy (odd and even days of the month). Comparison of the SSM/I estimates with raingage data collected at the Pacific atoll stations showed a low bias of about 8 percent, a correlation of 0.7, and an rms difference of 55 percent.
An Enhanced Non-Coherent Pre-Filter Design for Tracking Error Estimation in GNSS Receivers
Luo, Zhibin; Ding, Jicheng; Zhao, Lin; Wu, Mouyan
2017-01-01
Tracking error estimation is of great importance in global navigation satellite system (GNSS) receivers. Any inaccurate estimation for tracking error will decrease the signal tracking ability of signal tracking loops and the accuracies of position fixing, velocity determination, and timing. Tracking error estimation can be done by traditional discriminator, or Kalman filter-based pre-filter. The pre-filter can be divided into two categories: coherent and non-coherent. This paper focuses on the performance improvements of non-coherent pre-filter. Firstly, the signal characteristics of coherent and non-coherent integration—which are the basis of tracking error estimation—are analyzed in detail. After that, the probability distribution of estimation noise of four-quadrant arctangent (ATAN2) discriminator is derived according to the mathematical model of coherent integration. Secondly, the statistical property of observation noise of non-coherent pre-filter is studied through Monte Carlo simulation to set the observation noise variance matrix correctly. Thirdly, a simple fault detection and exclusion (FDE) structure is introduced to the non-coherent pre-filter design, and thus its effective working range for carrier phase error estimation extends from (−0.25 cycle, 0.25 cycle) to (−0.5 cycle, 0.5 cycle). Finally, the estimation accuracies of discriminator, coherent pre-filter, and the enhanced non-coherent pre-filter are evaluated comprehensively through the carefully designed experiment scenario. The pre-filter outperforms traditional discriminator in estimation accuracy. In a highly dynamic scenario, the enhanced non-coherent pre-filter provides accuracy improvements of 41.6%, 46.4%, and 50.36% for carrier phase error, carrier frequency error, and code phase error estimation, respectively, when compared with coherent pre-filter. The enhanced non-coherent pre-filter outperforms the coherent pre-filter in code phase error estimation when carrier-to-noise density ratio is less than 28.8 dB-Hz, in carrier frequency error estimation when carrier-to-noise density ratio is less than 20 dB-Hz, and in carrier phase error estimation when carrier-to-noise density belongs to (15, 23) dB-Hz ∪ (26, 50) dB-Hz. PMID:29156581
Inferences about landbird abundance from count data: recent advances and future directions
Nichols, J.D.; Thomas, L.; Conn, P.B.; Thomson, David L.; Cooch, Evan G.; Conroy, Michael J.
2009-01-01
We summarize results of a November 2006 workshop dealing with recent research on the estimation of landbird abundance from count data. Our conceptual framework includes a decomposition of the probability of detecting a bird potentially exposed to sampling efforts into four separate probabilities. Primary inference methods are described and include distance sampling, multiple observers, time of detection, and repeated counts. The detection parameters estimated by these different approaches differ, leading to different interpretations of resulting estimates of density and abundance. Simultaneous use of combinations of these different inference approaches can not only lead to increased precision but also provides the ability to decompose components of the detection process. Recent efforts to test the efficacy of these different approaches using natural systems and a new bird radio test system provide sobering conclusions about the ability of observers to detect and localize birds in auditory surveys. Recent research is reported on efforts to deal with such potential sources of error as bird misclassification, measurement error, and density gradients. Methods for inference about spatial and temporal variation in avian abundance are outlined. Discussion topics include opinions about the need to estimate detection probability when drawing inference about avian abundance, methodological recommendations based on the current state of knowledge and suggestions for future research.
Building a kinetic Monte Carlo model with a chosen accuracy.
Bhute, Vijesh J; Chatterjee, Abhijit
2013-06-28
The kinetic Monte Carlo (KMC) method is a popular modeling approach for reaching large materials length and time scales. The KMC dynamics is erroneous when atomic processes that are relevant to the dynamics are missing from the KMC model. Recently, we had developed for the first time an error measure for KMC in Bhute and Chatterjee [J. Chem. Phys. 138, 084103 (2013)]. The error measure, which is given in terms of the probability that a missing process will be selected in the correct dynamics, requires estimation of the missing rate. In this work, we present an improved procedure for estimating the missing rate. The estimate found using the new procedure is within an order of magnitude of the correct missing rate, unlike our previous approach where the estimate was larger by orders of magnitude. This enables one to find the error in the KMC model more accurately. In addition, we find the time for which the KMC model can be used before a maximum error in the dynamics has been reached.
Hazard Function Estimation with Cause-of-Death Data Missing at Random.
Wang, Qihua; Dinse, Gregg E; Liu, Chunling
2012-04-01
Hazard function estimation is an important part of survival analysis. Interest often centers on estimating the hazard function associated with a particular cause of death. We propose three nonparametric kernel estimators for the hazard function, all of which are appropriate when death times are subject to random censorship and censoring indicators can be missing at random. Specifically, we present a regression surrogate estimator, an imputation estimator, and an inverse probability weighted estimator. All three estimators are uniformly strongly consistent and asymptotically normal. We derive asymptotic representations of the mean squared error and the mean integrated squared error for these estimators and we discuss a data-driven bandwidth selection method. A simulation study, conducted to assess finite sample behavior, demonstrates that the proposed hazard estimators perform relatively well. We illustrate our methods with an analysis of some vascular disease data.
Peterson, J.; Dunham, J.B.
2003-01-01
Effective conservation efforts for at-risk species require knowledge of the locations of existing populations. Species presence can be estimated directly by conducting field-sampling surveys or alternatively by developing predictive models. Direct surveys can be expensive and inefficient, particularly for rare and difficult-to-sample species, and models of species presence may produce biased predictions. We present a Bayesian approach that combines sampling and model-based inferences for estimating species presence. The accuracy and cost-effectiveness of this approach were compared to those of sampling surveys and predictive models for estimating the presence of the threatened bull trout ( Salvelinus confluentus ) via simulation with existing models and empirical sampling data. Simulations indicated that a sampling-only approach would be the most effective and would result in the lowest presence and absence misclassification error rates for three thresholds of detection probability. When sampling effort was considered, however, the combined approach resulted in the lowest error rates per unit of sampling effort. Hence, lower probability-of-detection thresholds can be specified with the combined approach, resulting in lower misclassification error rates and improved cost-effectiveness.
NASA Astrophysics Data System (ADS)
Mori, Shohei; Hirata, Shinnosuke; Yamaguchi, Tadashi; Hachiya, Hiroyuki
To develop a quantitative diagnostic method for liver fibrosis using an ultrasound B-mode image, a probability imaging method of tissue characteristics based on a multi-Rayleigh model, which expresses a probability density function of echo signals from liver fibrosis, has been proposed. In this paper, an effect of non-speckle echo signals on tissue characteristics estimated from the multi-Rayleigh model was evaluated. Non-speckle signals were determined and removed using the modeling error of the multi-Rayleigh model. The correct tissue characteristics of fibrotic tissue could be estimated with the removal of non-speckle signals.
Bayes Error Rate Estimation Using Classifier Ensembles
NASA Technical Reports Server (NTRS)
Tumer, Kagan; Ghosh, Joydeep
2003-01-01
The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating th is rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or finding bounds for the Bayes error, in general, yield rather weak results for small sample sizes; unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, a recombined through averaging. Second, we bolster this approach by adding an information theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that just looks at the class labels indicated by ensem ble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Problem benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.
Grizzly Bear Noninvasive Genetic Tagging Surveys: Estimating the Magnitude of Missed Detections
Fisher, Jason T.; Heim, Nicole; Code, Sandra; Paczkowski, John
2016-01-01
Sound wildlife conservation decisions require sound information, and scientists increasingly rely on remotely collected data over large spatial scales, such as noninvasive genetic tagging (NGT). Grizzly bears (Ursus arctos), for example, are difficult to study at population scales except with noninvasive data, and NGT via hair trapping informs management over much of grizzly bears’ range. Considerable statistical effort has gone into estimating sources of heterogeneity, but detection error–arising when a visiting bear fails to leave a hair sample–has not been independently estimated. We used camera traps to survey grizzly bear occurrence at fixed hair traps and multi-method hierarchical occupancy models to estimate the probability that a visiting bear actually leaves a hair sample with viable DNA. We surveyed grizzly bears via hair trapping and camera trapping for 8 monthly surveys at 50 (2012) and 76 (2013) sites in the Rocky Mountains of Alberta, Canada. We used multi-method occupancy models to estimate site occupancy, probability of detection, and conditional occupancy at a hair trap. We tested the prediction that detection error in NGT studies could be induced by temporal variability within season, leading to underestimation of occupancy. NGT via hair trapping consistently underestimated grizzly bear occupancy at a site when compared to camera trapping. At best occupancy was underestimated by 50%; at worst, by 95%. Probability of false absence was reduced through successive surveys, but this mainly accounts for error imparted by movement among repeated surveys, not necessarily missed detections by extant bears. The implications of missed detections and biased occupancy estimates for density estimation–which form the crux of management plans–require consideration. We suggest hair-trap NGT studies should estimate and correct detection error using independent survey methods such as cameras, to ensure the reliability of the data upon which species management and conservation actions are based. PMID:27603134
Elliott, Rachel A; Putman, Koen D; Franklin, Matthew; Annemans, Lieven; Verhaeghe, Nick; Eden, Martin; Hayre, Jasdeep; Rodgers, Sarah; Sheikh, Aziz; Avery, Anthony J
2014-06-01
We recently showed that a pharmacist-led information technology-based intervention (PINCER) was significantly more effective in reducing medication errors in general practices than providing simple feedback on errors, with cost per error avoided at £79 (US$131). We aimed to estimate cost effectiveness of the PINCER intervention by combining effectiveness in error reduction and intervention costs with the effect of the individual errors on patient outcomes and healthcare costs, to estimate the effect on costs and QALYs. We developed Markov models for each of six medication errors targeted by PINCER. Clinical event probability, treatment pathway, resource use and costs were extracted from literature and costing tariffs. A composite probabilistic model combined patient-level error models with practice-level error rates and intervention costs from the trial. Cost per extra QALY and cost-effectiveness acceptability curves were generated from the perspective of NHS England, with a 5-year time horizon. The PINCER intervention generated £2,679 less cost and 0.81 more QALYs per practice [incremental cost-effectiveness ratio (ICER): -£3,037 per QALY] in the deterministic analysis. In the probabilistic analysis, PINCER generated 0.001 extra QALYs per practice compared with simple feedback, at £4.20 less per practice. Despite this extremely small set of differences in costs and outcomes, PINCER dominated simple feedback with a mean ICER of -£3,936 (standard error £2,970). At a ceiling 'willingness-to-pay' of £20,000/QALY, PINCER reaches 59 % probability of being cost effective. PINCER produced marginal health gain at slightly reduced overall cost. Results are uncertain due to the poor quality of data to inform the effect of avoiding errors.
NASA Technical Reports Server (NTRS)
Tranter, W. H.; Turner, M. D.
1977-01-01
Techniques are developed to estimate power gain, delay, signal-to-noise ratio, and mean square error in digital computer simulations of lowpass and bandpass systems. The techniques are applied to analog and digital communications. The signal-to-noise ratio estimates are shown to be maximum likelihood estimates in additive white Gaussian noise. The methods are seen to be especially useful for digital communication systems where the mapping from the signal-to-noise ratio to the error probability can be obtained. Simulation results show the techniques developed to be accurate and quite versatile in evaluating the performance of many systems through digital computer simulation.
Improved Analysis of GW150914 Using a Fully Spin-Precessing Waveform Model
NASA Technical Reports Server (NTRS)
Abbott, B. P.; Abbott, R.; Abbott, T. D.; Abernathy, M. R.; Acernese, F.; Ackley, K.; Adams, C.; Adams, T.; Addesso, P.; Camp, J. B.;
2016-01-01
This paper presents updated estimates of source parameters for GW150914, a binary black-hole coalescence event detected by the Laser Interferometer Gravitational-wave Observatory (LIGO) in 2015 [Abbott et al. Phys. Rev. Lett. 116, 061102 (2016).]. Abbott et al. [Phys. Rev. Lett. 116, 241102 (2016).] presented parameter estimation of the source using a 13-dimensional, phenomenological precessing-spin model (precessing IMRPhenom) and an 11-dimensional nonprecessing effective-one-body (EOB) model calibrated to numerical-relativity simulations, which forces spin alignment (nonprecessing EOBNR). Here, we present new results that include a 15-dimensional precessing-spin waveform model (precessing EOBNR) developed within the EOB formalism. We find good agreement with the parameters estimated previously [Abbott et al. Phys. Rev. Lett. 116, 241102 (2016).], and we quote updated component masses of 35(+5)(-3) solar M; and 30(+3)(-4) solar M; (where errors correspond to 90 symmetric credible intervals). We also present slightly tighter constraints on the dimensionless spin magnitudes of the two black holes, with a primary spin estimate is less than 0.65 and a secondary spin estimate is less than 0.75 at 90% probability. Abbott et al. [Phys. Rev. Lett. 116, 241102 (2016).] estimated the systematic parameter-extraction errors due to waveform-model uncertainty by combining the posterior probability densities of precessing IMRPhenom and nonprecessing EOBNR. Here, we find that the two precessing-spin models are in closer agreement, suggesting that these systematic errors are smaller than previously quoted.
NASA Technical Reports Server (NTRS)
Kumar, Rajendra (Inventor)
1991-01-01
A multistage estimator is provided for the parameters of a received carrier signal possibly phase-modulated by unknown data and experiencing very high Doppler, Doppler rate, etc., as may arise, for example, in the case of Global Positioning Systems (GPS) where the signal parameters are directly related to the position, velocity and jerk of the GPS ground-based receiver. In a two-stage embodiment of the more general multistage scheme, the first stage, selected to be a modified least squares algorithm referred to as differential least squares (DLS), operates as a coarse estimator resulting in higher rms estimation errors but with a relatively small probability of the frequency estimation error exceeding one-half of the sampling frequency, provides relatively coarse estimates of the frequency and its derivatives. The second stage of the estimator, an extended Kalman filter (EKF), operates on the error signal available from the first stage refining the overall estimates of the phase along with a more refined estimate of frequency as well and in the process also reduces the number of cycle slips.
NASA Technical Reports Server (NTRS)
Kumar, Rajendra (Inventor)
1990-01-01
A multistage estimator is provided for the parameters of a received carrier signal possibly phase-modulated by unknown data and experiencing very high Doppler, Doppler rate, etc., as may arise, for example, in the case of Global Positioning Systems (GPS) where the signal parameters are directly related to the position, velocity and jerk of the GPS ground-based receiver. In a two-stage embodiment of the more general multistage scheme, the first stage, selected to be a modified least squares algorithm referred to as differential least squares (DLS), operates as a coarse estimator resulting in higher rms estimation errors but with a relatively small probability of the frequency estimation error exceeding one-half of the sampling frequency, provides relatively coarse estimates of the frequency and its derivatives. The second stage of the estimator, an extended Kalman filter (EKF), operates on the error signal available from the first stage refining the overall estimates of the phase along with a more refined estimate of frequency as well and in the process also reduces the number of cycle slips.
Estimating Single-Event Logic Cross Sections in Advanced Technologies
NASA Astrophysics Data System (ADS)
Harrington, R. C.; Kauppila, J. S.; Warren, K. M.; Chen, Y. P.; Maharrey, J. A.; Haeffner, T. D.; Loveless, T. D.; Bhuva, B. L.; Bounasser, M.; Lilja, K.; Massengill, L. W.
2017-08-01
Reliable estimation of logic single-event upset (SEU) cross section is becoming increasingly important for predicting the overall soft error rate. As technology scales and single-event transient (SET) pulse widths shrink to widths on the order of the setup-and-hold time of flip-flops, the probability of latching an SET as an SEU must be reevaluated. In this paper, previous assumptions about the relationship of SET pulsewidth to the probability of latching an SET are reconsidered and a model for transient latching probability has been developed for advanced technologies. A method using the improved transient latching probability and SET data is used to predict logic SEU cross section. The presented model has been used to estimate combinational logic SEU cross sections in 32-nm partially depleted silicon-on-insulator (SOI) technology given experimental heavy-ion SET data. Experimental SEU data show good agreement with the model presented in this paper.
Time synchronization of a frequency-hopped MFSK communication system
NASA Technical Reports Server (NTRS)
Simon, M. K.; Polydoros, A.; Huth, G. K.
1981-01-01
In a frequency-hopped (FH) multiple-frequency-shift-keyed (MFSK) communication system, frequency hopping causes the necessary frequency transitions for time synchronization estimation rather than the data sequence as in the conventional (nonfrequency-hopped) system. Making use of this observation, this paper presents a fine synchronization (i.e., time errors of less than a hop duration) technique for estimation of FH timing. The performance degradation due to imperfect FH time synchronization is found in terms of the effect on bit error probability as a function of full-band or partial-band noise jamming levels and of the number of hops used in the FH timing estimate.
Modified fast frequency acquisition via adaptive least squares algorithm
NASA Technical Reports Server (NTRS)
Kumar, Rajendra (Inventor)
1992-01-01
A method and the associated apparatus for estimating the amplitude, frequency, and phase of a signal of interest are presented. The method comprises the following steps: (1) inputting the signal of interest; (2) generating a reference signal with adjustable amplitude, frequency and phase at an output thereof; (3) mixing the signal of interest with the reference signal and a signal 90 deg out of phase with the reference signal to provide a pair of quadrature sample signals comprising respectively a difference between the signal of interest and the reference signal and a difference between the signal of interest and the signal 90 deg out of phase with the reference signal; (4) using the pair of quadrature sample signals to compute estimates of the amplitude, frequency, and phase of an error signal comprising the difference between the signal of interest and the reference signal employing a least squares estimation; (5) adjusting the amplitude, frequency, and phase of the reference signal from the numerically controlled oscillator in a manner which drives the error signal towards zero; and (6) outputting the estimates of the amplitude, frequency, and phase of the error signal in combination with the reference signal to produce a best estimate of the amplitude, frequency, and phase of the signal of interest. The preferred method includes the step of providing the error signal as a real time confidence measure as to the accuracy of the estimates wherein the closer the error signal is to zero, the higher the probability that the estimates are accurate. A matrix in the estimation algorithm provides an estimate of the variance of the estimation error.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-15
... contact the Census Bureau's Social, Economic and Housing Statistics Division at (301) 763- 3243. Under the... the use of probability sampling to create the sample. For additional information about the accuracy of... consists of the error that arises from the use of probability sampling to create the sample. \\2\\ These...
Reliability of drivers in urban intersections.
Gstalter, Herbert; Fastenmeier, Wolfgang
2010-01-01
The concept of human reliability has been widely used in industrial settings by human factors experts to optimise the person-task fit. Reliability is estimated by the probability that a task will successfully be completed by personnel in a given stage of system operation. Human Reliability Analysis (HRA) is a technique used to calculate human error probabilities as the ratio of errors committed to the number of opportunities for that error. To transfer this notion to the measurement of car driver reliability the following components are necessary: a taxonomy of driving tasks, a definition of correct behaviour in each of these tasks, a list of errors as deviations from the correct actions and an adequate observation method to register errors and opportunities for these errors. Use of the SAFE-task analysis procedure recently made it possible to derive driver errors directly from the normative analysis of behavioural requirements. Driver reliability estimates could be used to compare groups of tasks (e.g. different types of intersections with their respective regulations) as well as groups of drivers' or individual drivers' aptitudes. This approach was tested in a field study with 62 drivers of different age groups. The subjects drove an instrumented car and had to complete an urban test route, the main features of which were 18 intersections representing six different driving tasks. The subjects were accompanied by two trained observers who recorded driver errors using standardized observation sheets. Results indicate that error indices often vary between both the age group of drivers and the type of driving task. The highest error indices occurred in the non-signalised intersection tasks and the roundabout, which exactly equals the corresponding ratings of task complexity from the SAFE analysis. A comparison of age groups clearly shows the disadvantage of older drivers, whose error indices in nearly all tasks are significantly higher than those of the other groups. The vast majority of these errors could be explained by high task load in the intersections, as they represent difficult tasks. The discussion shows how reliability estimates can be used in a constructive way to propose changes in car design, intersection layout and regulation as well as driver training.
Anytime synthetic projection: Maximizing the probability of goal satisfaction
NASA Technical Reports Server (NTRS)
Drummond, Mark; Bresina, John L.
1990-01-01
A projection algorithm is presented for incremental control rule synthesis. The algorithm synthesizes an initial set of goal achieving control rules using a combination of situation probability and estimated remaining work as a search heuristic. This set of control rules has a certain probability of satisfying the given goal. The probability is incrementally increased by synthesizing additional control rules to handle 'error' situations the execution system is likely to encounter when following the initial control rules. By using situation probabilities, the algorithm achieves a computationally effective balance between the limited robustness of triangle tables and the absolute robustness of universal plans.
Hazard Function Estimation with Cause-of-Death Data Missing at Random
Wang, Qihua; Dinse, Gregg E.; Liu, Chunling
2010-01-01
Hazard function estimation is an important part of survival analysis. Interest often centers on estimating the hazard function associated with a particular cause of death. We propose three nonparametric kernel estimators for the hazard function, all of which are appropriate when death times are subject to random censorship and censoring indicators can be missing at random. Specifically, we present a regression surrogate estimator, an imputation estimator, and an inverse probability weighted estimator. All three estimators are uniformly strongly consistent and asymptotically normal. We derive asymptotic representations of the mean squared error and the mean integrated squared error for these estimators and we discuss a data-driven bandwidth selection method. A simulation study, conducted to assess finite sample behavior, demonstrates that the proposed hazard estimators perform relatively well. We illustrate our methods with an analysis of some vascular disease data. PMID:22267874
Constrained motion estimation-based error resilient coding for HEVC
NASA Astrophysics Data System (ADS)
Guo, Weihan; Zhang, Yongfei; Li, Bo
2018-04-01
Unreliable communication channels might lead to packet losses and bit errors in the videos transmitted through it, which will cause severe video quality degradation. This is even worse for HEVC since more advanced and powerful motion estimation methods are introduced to further remove the inter-frame dependency and thus improve the coding efficiency. Once a Motion Vector (MV) is lost or corrupted, it will cause distortion in the decoded frame. More importantly, due to motion compensation, the error will propagate along the motion prediction path, accumulate over time, and significantly degrade the overall video presentation quality. To address this problem, we study the problem of encoder-sider error resilient coding for HEVC and propose a constrained motion estimation scheme to mitigate the problem of error propagation to subsequent frames. The approach is achieved by cutting off MV dependencies and limiting the block regions which are predicted by temporal motion vector. The experimental results show that the proposed method can effectively suppress the error propagation caused by bit errors of motion vector and can improve the robustness of the stream in the bit error channels. When the bit error probability is 10-5, an increase of the decoded video quality (PSNR) by up to1.310dB and on average 0.762 dB can be achieved, compared to the reference HEVC.
Muhlfeld, Clint C.; Taper, Mark L.; Staples, David F.; Shepard, Bradley B.
2006-01-01
Despite the widespread use of redd counts to monitor trends in salmonid populations, few studies have evaluated the uncertainties in observed counts. We assessed the variability in redd counts for migratory bull trout Salvelinus confluentus among experienced observers in Lion and Goat creeks, which are tributaries to the Swan River, Montana. We documented substantially lower observer variability in bull trout redd counts than did previous studies. Observer counts ranged from 78% to 107% of our best estimates of true redd numbers in Lion Creek and from 90% to 130% of our best estimates in Goat Creek. Observers made both errors of omission and errors of false identification, and we modeled this combination by use of a binomial probability of detection and a Poisson count distribution of false identifications. Redd detection probabilities were high (mean = 83%) and exhibited no significant variation among observers (SD = 8%). We applied this error structure to annual redd counts in the Swan River basin (1982–2004) to correct for observer error and thus derived more accurate estimates of redd numbers and associated confidence intervals. Our results indicate that bias in redd counts can be reduced if experienced observers are used to conduct annual redd counts. Future studies should assess both sources of observer error to increase the validity of using redd counts for inferring true redd numbers in different basins. This information will help fisheries biologists to more precisely monitor population trends, identify recovery and extinction thresholds for conservation and recovery programs, ascertain and predict how management actions influence distribution and abundance, and examine effects of recovery and restoration activities.
Estimating trace-suspect match probabilities for singleton Y-STR haplotypes using coalescent theory.
Andersen, Mikkel Meyer; Caliebe, Amke; Jochens, Arne; Willuweit, Sascha; Krawczak, Michael
2013-02-01
Estimation of match probabilities for singleton haplotypes of lineage markers, i.e. for haplotypes observed only once in a reference database augmented by a suspect profile, is an important problem in forensic genetics. We compared the performance of four estimators of singleton match probabilities for Y-STRs, namely the count estimate, both with and without Brenner's so-called 'kappa correction', the surveying estimate, and a previously proposed, but rarely used, coalescent-based approach implemented in the BATWING software. Extensive simulation with BATWING of the underlying population history, haplotype evolution and subsequent database sampling revealed that the coalescent-based approach is characterized by lower bias and lower mean squared error than the uncorrected count estimator and the surveying estimator. Moreover, in contrast to the two count estimators, both the surveying and the coalescent-based approach exhibited a good correlation between the estimated and true match probabilities. However, although its overall performance is thus better than that of any other recognized method, the coalescent-based estimator is still computation-intense on the verge of general impracticability. Its application in forensic practice therefore will have to be limited to small reference databases, or to isolated cases of particular interest, until more powerful algorithms for coalescent simulation have become available. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Inference of emission rates from multiple sources using Bayesian probability theory.
Yee, Eugene; Flesch, Thomas K
2010-03-01
The determination of atmospheric emission rates from multiple sources using inversion (regularized least-squares or best-fit technique) is known to be very susceptible to measurement and model errors in the problem, rendering the solution unusable. In this paper, a new perspective is offered for this problem: namely, it is argued that the problem should be addressed as one of inference rather than inversion. Towards this objective, Bayesian probability theory is used to estimate the emission rates from multiple sources. The posterior probability distribution for the emission rates is derived, accounting fully for the measurement errors in the concentration data and the model errors in the dispersion model used to interpret the data. The Bayesian inferential methodology for emission rate recovery is validated against real dispersion data, obtained from a field experiment involving various source-sensor geometries (scenarios) consisting of four synthetic area sources and eight concentration sensors. The recovery of discrete emission rates from three different scenarios obtained using Bayesian inference and singular value decomposition inversion are compared and contrasted.
On land-use modeling: A treatise of satellite imagery data and misclassification error
NASA Astrophysics Data System (ADS)
Sandler, Austin M.
Recent availability of satellite-based land-use data sets, including data sets with contiguous spatial coverage over large areas, relatively long temporal coverage, and fine-scale land cover classifications, is providing new opportunities for land-use research. However, care must be used when working with these datasets due to misclassification error, which causes inconsistent parameter estimates in the discrete choice models typically used to model land-use. I therefore adapt the empirical correction methods developed for other contexts (e.g., epidemiology) so that they can be applied to land-use modeling. I then use a Monte Carlo simulation, and an empirical application using actual satellite imagery data from the Northern Great Plains, to compare the results of a traditional model ignoring misclassification to those from models accounting for misclassification. Results from both the simulation and application indicate that ignoring misclassification will lead to biased results. Even seemingly insignificant levels of misclassification error (e.g., 1%) result in biased parameter estimates, which alter marginal effects enough to affect policy inference. At the levels of misclassification typical in current satellite imagery datasets (e.g., as high as 35%), ignoring misclassification can lead to systematically erroneous land-use probabilities and substantially biased marginal effects. The correction methods I propose, however, generate consistent parameter estimates and therefore consistent estimates of marginal effects and predicted land-use probabilities.
Distribution of the two-sample t-test statistic following blinded sample size re-estimation.
Lu, Kaifeng
2016-05-01
We consider the blinded sample size re-estimation based on the simple one-sample variance estimator at an interim analysis. We characterize the exact distribution of the standard two-sample t-test statistic at the final analysis. We describe a simulation algorithm for the evaluation of the probability of rejecting the null hypothesis at given treatment effect. We compare the blinded sample size re-estimation method with two unblinded methods with respect to the empirical type I error, the empirical power, and the empirical distribution of the standard deviation estimator and final sample size. We characterize the type I error inflation across the range of standardized non-inferiority margin for non-inferiority trials, and derive the adjusted significance level to ensure type I error control for given sample size of the internal pilot study. We show that the adjusted significance level increases as the sample size of the internal pilot study increases. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simonen, E.P.; Johnson, K.I.; Simonen, F.A.
The Vessel Integrity Simulation Analysis (VISA-II) code was developed to allow calculations of the failure probability of a reactor pressure vessel subject to defined pressure/temperature transients. A version of the code, revised by Pacific Northwest Laboratory for the US Nuclear Regulatory Commission, was used to evaluate the sensitivities of calculated through-wall flaw probability to material, flaw and calculational assumptions. Probabilities were more sensitive to flaw assumptions than to material or calculational assumptions. Alternative flaw assumptions changed the probabilities by one to two orders of magnitude, whereas alternative material assumptions typically changed the probabilities by a factor of two or less.more » Flaw shape, flaw through-wall position and flaw inspection were sensitivities examined. Material property sensitivities included the assumed distributions in copper content and fracture toughness. Methods of modeling flaw propagation that were evaluated included arrest/reinitiation toughness correlations, multiple toughness values along the length of a flaw, flaw jump distance for each computer simulation and added error in estimating irradiated properties caused by the trend curve correlation error.« less
Peak-flow characteristics of Virginia streams
Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute
2011-01-01
Peak-flow annual exceedance probabilities, also called probability-percent chance flow estimates, and regional regression equations are provided describing the peak-flow characteristics of Virginia streams. Statistical methods are used to evaluate peak-flow data. Analysis of Virginia peak-flow data collected from 1895 through 2007 is summarized. Methods are provided for estimating unregulated peak flow of gaged and ungaged streams. Station peak-flow characteristics identified by fitting the logarithms of annual peak flows to a Log Pearson Type III frequency distribution yield annual exceedance probabilities of 0.5, 0.4292, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, and 0.002 for 476 streamgaging stations. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression model equations for six physiographic regions to estimate regional annual exceedance probabilities at gaged and ungaged sites. Weighted peak-flow values that combine annual exceedance probabilities computed from gaging station data and from regional regression equations provide improved peak-flow estimates. Text, figures, and lists are provided summarizing selected peak-flow sites, delineated physiographic regions, peak-flow estimates, basin characteristics, regional regression model equations, error estimates, definitions, data sources, and candidate regression model equations. This study supersedes previous studies of peak flows in Virginia.
An adaptive threshold detector and channel parameter estimator for deep space optical communications
NASA Technical Reports Server (NTRS)
Arabshahi, P.; Mukai, R.; Yan, T. -Y.
2001-01-01
This paper presents a method for optimal adaptive setting of ulse-position-modulation pulse detection thresholds, which minimizes the total probability of error for the dynamically fading optical fee space channel.
Dionisio, Kathie L; Chang, Howard H; Baxter, Lisa K
2016-11-25
Exposure measurement error in copollutant epidemiologic models has the potential to introduce bias in relative risk (RR) estimates. A simulation study was conducted using empirical data to quantify the impact of correlated measurement errors in time-series analyses of air pollution and health. ZIP-code level estimates of exposure for six pollutants (CO, NO x , EC, PM 2.5 , SO 4 , O 3 ) from 1999 to 2002 in the Atlanta metropolitan area were used to calculate spatial, population (i.e. ambient versus personal), and total exposure measurement error. Empirically determined covariance of pollutant concentration pairs and the associated measurement errors were used to simulate true exposure (exposure without error) from observed exposure. Daily emergency department visits for respiratory diseases were simulated using a Poisson time-series model with a main pollutant RR = 1.05 per interquartile range, and a null association for the copollutant (RR = 1). Monte Carlo experiments were used to evaluate the impacts of correlated exposure errors of different copollutant pairs. Substantial attenuation of RRs due to exposure error was evident in nearly all copollutant pairs studied, ranging from 10 to 40% attenuation for spatial error, 3-85% for population error, and 31-85% for total error. When CO, NO x or EC is the main pollutant, we demonstrated the possibility of false positives, specifically identifying significant, positive associations for copollutants based on the estimated type I error rate. The impact of exposure error must be considered when interpreting results of copollutant epidemiologic models, due to the possibility of attenuation of main pollutant RRs and the increased probability of false positives when measurement error is present.
A comparison of abundance estimates from extended batch-marking and Jolly–Seber-type experiments
Cowen, Laura L E; Besbeas, Panagiotis; Morgan, Byron J T; Schwarz, Carl J
2014-01-01
Little attention has been paid to the use of multi-sample batch-marking studies, as it is generally assumed that an individual's capture history is necessary for fully efficient estimates. However, recently, Huggins et al. (2010) present a pseudo-likelihood for a multi-sample batch-marking study where they used estimating equations to solve for survival and capture probabilities and then derived abundance estimates using a Horvitz–Thompson-type estimator. We have developed and maximized the likelihood for batch-marking studies. We use data simulated from a Jolly–Seber-type study and convert this to what would have been obtained from an extended batch-marking study. We compare our abundance estimates obtained from the Crosbie–Manly–Arnason–Schwarz (CMAS) model with those of the extended batch-marking model to determine the efficiency of collecting and analyzing batch-marking data. We found that estimates of abundance were similar for all three estimators: CMAS, Huggins, and our likelihood. Gains are made when using unique identifiers and employing the CMAS model in terms of precision; however, the likelihood typically had lower mean square error than the pseudo-likelihood method of Huggins et al. (2010). When faced with designing a batch-marking study, researchers can be confident in obtaining unbiased abundance estimators. Furthermore, they can design studies in order to reduce mean square error by manipulating capture probabilities and sample size. PMID:24558576
Taking error into account when fitting models using Approximate Bayesian Computation.
van der Vaart, Elske; Prangle, Dennis; Sibly, Richard M
2018-03-01
Stochastic computer simulations are often the only practical way of answering questions relating to ecological management. However, due to their complexity, such models are difficult to calibrate and evaluate. Approximate Bayesian Computation (ABC) offers an increasingly popular approach to this problem, widely applied across a variety of fields. However, ensuring the accuracy of ABC's estimates has been difficult. Here, we obtain more accurate estimates by incorporating estimation of error into the ABC protocol. We show how this can be done where the data consist of repeated measures of the same quantity and errors may be assumed to be normally distributed and independent. We then derive the correct acceptance probabilities for a probabilistic ABC algorithm, and update the coverage test with which accuracy is assessed. We apply this method, which we call error-calibrated ABC, to a toy example and a realistic 14-parameter simulation model of earthworms that is used in environmental risk assessment. A comparison with exact methods and the diagnostic coverage test show that our approach improves estimation of parameter values and their credible intervals for both models. © 2017 by the Ecological Society of America.
Gaussian Hypothesis Testing and Quantum Illumination.
Wilde, Mark M; Tomamichel, Marco; Lloyd, Seth; Berta, Mario
2017-09-22
Quantum hypothesis testing is one of the most basic tasks in quantum information theory and has fundamental links with quantum communication and estimation theory. In this paper, we establish a formula that characterizes the decay rate of the minimal type-II error probability in a quantum hypothesis test of two Gaussian states given a fixed constraint on the type-I error probability. This formula is a direct function of the mean vectors and covariance matrices of the quantum Gaussian states in question. We give an application to quantum illumination, which is the task of determining whether there is a low-reflectivity object embedded in a target region with a bright thermal-noise bath. For the asymmetric-error setting, we find that a quantum illumination transmitter can achieve an error probability exponent stronger than a coherent-state transmitter of the same mean photon number, and furthermore, that it requires far fewer trials to do so. This occurs when the background thermal noise is either low or bright, which means that a quantum advantage is even easier to witness than in the symmetric-error setting because it occurs for a larger range of parameters. Going forward from here, we expect our formula to have applications in settings well beyond those considered in this paper, especially to quantum communication tasks involving quantum Gaussian channels.
NASA Astrophysics Data System (ADS)
Kirstetter, P.; Hong, Y.; Gourley, J. J.; Chen, S.; Flamig, Z.; Zhang, J.; Howard, K.; Petersen, W. A.
2011-12-01
Proper characterization of the error structure of TRMM Precipitation Radar (PR) quantitative precipitation estimation (QPE) is needed for their use in TRMM combined products, water budget studies and hydrological modeling applications. Due to the variety of sources of error in spaceborne radar QPE (attenuation of the radar signal, influence of land surface, impact of off-nadir viewing angle, etc.) and the impact of correction algorithms, the problem is addressed by comparison of PR QPEs with reference values derived from ground-based measurements (GV) using NOAA/NSSL's National Mosaic QPE (NMQ) system. An investigation of this subject has been carried out at the PR estimation scale (instantaneous and 5 km) on the basis of a 3-month-long data sample. A significant effort has been carried out to derive a bias-corrected, robust reference rainfall source from NMQ. The GV processing details will be presented along with preliminary results of PR's error characteristics using contingency table statistics, probability distribution comparisons, scatter plots, semi-variograms, and systematic biases and random errors.
Estimation in a discrete tail rate family of recapture sampling models
NASA Technical Reports Server (NTRS)
Gupta, Rajan; Lee, Larry D.
1990-01-01
In the context of recapture sampling design for debugging experiments the problem of estimating the error or hitting rate of the faults remaining in a system is considered. Moment estimators are derived for a family of models in which the rate parameters are assumed proportional to the tail probabilities of a discrete distribution on the positive integers. The estimators are shown to be asymptotically normal and fully efficient. Their fixed sample properties are compared, through simulation, with those of the conditional maximum likelihood estimators.
Detection of presence of chemical precursors
NASA Technical Reports Server (NTRS)
Li, Jing (Inventor); Meyyappan, Meyya (Inventor); Lu, Yijiang (Inventor)
2009-01-01
Methods and systems for determining if one or more target molecules are present in a gas, by exposing a functionalized carbon nanostructure (CNS) to the gas and measuring an electrical parameter value EPV(n) associated with each of N CNS sub-arrays. In a first embodiment, a most-probable concentration value C(opt) is estimated, and an error value, depending upon differences between the measured values EPV(n) and corresponding values EPV(n;C(opt)) is computed. If the error value is less than a first error threshold value, the system interprets this as indicating that the target molecule is present in a concentration C.apprxeq.C(opt). A second embodiment uses extensive statistical and vector space analysis to estimate target molecule concentration.
NASA Technical Reports Server (NTRS)
Mobasseri, B. G.; Mcgillem, C. D.; Anuta, P. E. (Principal Investigator)
1978-01-01
The author has identified the following significant results. The probability of correct classification of various populations in data was defined as the primary performance index. The multispectral data being of multiclass nature as well, required a Bayes error estimation procedure that was dependent on a set of class statistics alone. The classification error was expressed in terms of an N dimensional integral, where N was the dimensionality of the feature space. The multispectral scanner spatial model was represented by a linear shift, invariant multiple, port system where the N spectral bands comprised the input processes. The scanner characteristic function, the relationship governing the transformation of the input spatial, and hence, spectral correlation matrices through the systems, was developed.
Assessing Gaussian Assumption of PMU Measurement Error Using Field Data
Wang, Shaobu; Zhao, Junbo; Huang, Zhenyu; ...
2017-10-13
Gaussian PMU measurement error has been assumed for many power system applications, such as state estimation, oscillatory modes monitoring, voltage stability analysis, to cite a few. This letter proposes a simple yet effective approach to assess this assumption by using the stability property of a probability distribution and the concept of redundant measurement. Extensive results using field PMU data from WECC system reveal that the Gaussian assumption is questionable.
Thyroid cancer following scalp irradiation: a reanalysis accounting for uncertainty in dosimetry.
Schafer, D W; Lubin, J H; Ron, E; Stovall, M; Carroll, R J
2001-09-01
In the 1940s and 1950s, over 20,000 children in Israel were treated for tinea capitis (scalp ringworm) by irradiation to induce epilation. Follow-up studies showed that the radiation exposure was associated with the development of malignant thyroid neoplasms. Despite this clear evidence of an effect, the magnitude of the dose-response relationship is much less clear because of probable errors in individual estimates of dose to the thyroid gland. Such errors have the potential to bias dose-response estimation, a potential that was not widely appreciated at the time of the original analyses. We revisit this issue, describing in detail how errors in dosimetry might occur, and we develop a new dose-response model that takes the uncertainties of the dosimetry into account. Our model for the uncertainty in dosimetry is a complex and new variant of the classical multiplicative Berkson error model, having components of classical multiplicative measurement error as well as missing data. Analysis of the tinea capitis data suggests that measurement error in the dosimetry has only a negligible effect on dose-response estimation and inference as well as on the modifying effect of age at exposure.
Quantum error-correction failure distributions: Comparison of coherent and stochastic error models
NASA Astrophysics Data System (ADS)
Barnes, Jeff P.; Trout, Colin J.; Lucarelli, Dennis; Clader, B. D.
2017-06-01
We compare failure distributions of quantum error correction circuits for stochastic errors and coherent errors. We utilize a fully coherent simulation of a fault-tolerant quantum error correcting circuit for a d =3 Steane and surface code. We find that the output distributions are markedly different for the two error models, showing that no simple mapping between the two error models exists. Coherent errors create very broad and heavy-tailed failure distributions. This suggests that they are susceptible to outlier events and that mean statistics, such as pseudothreshold estimates, may not provide the key figure of merit. This provides further statistical insight into why coherent errors can be so harmful for quantum error correction. These output probability distributions may also provide a useful metric that can be utilized when optimizing quantum error correcting codes and decoding procedures for purely coherent errors.
A comparison of the weights-of-evidence method and probabilistic neural networks
Singer, Donald A.; Kouda, Ryoichi
1999-01-01
The need to integrate large quantities of digital geoscience information to classify locations as mineral deposits or nondeposits has been met by the weights-of-evidence method in many situations. Widespread selection of this method may be more the result of its ease of use and interpretation rather than comparisons with alternative methods. A comparison of the weights-of-evidence method to probabilistic neural networks is performed here with data from Chisel Lake-Andeson Lake, Manitoba, Canada. Each method is designed to estimate the probability of belonging to learned classes where the estimated probabilities are used to classify the unknowns. Using these data, significantly lower classification error rates were observed for the neural network, not only when test and training data were the same (0.02 versus 23%), but also when validation data, not used in any training, were used to test the efficiency of classification (0.7 versus 17%). Despite these data containing too few deposits, these tests of this set of data demonstrate the neural network's ability at making unbiased probability estimates and lower error rates when measured by number of polygons or by the area of land misclassified. For both methods, independent validation tests are required to ensure that estimates are representative of real-world results. Results from the weights-of-evidence method demonstrate a strong bias where most errors are barren areas misclassified as deposits. The weights-of-evidence method is based on Bayes rule, which requires independent variables in order to make unbiased estimates. The chi-square test for independence indicates no significant correlations among the variables in the Chisel Lake–Andeson Lake data. However, the expected number of deposits test clearly demonstrates that these data violate the independence assumption. Other, independent simulations with three variables show that using variables with correlations of 1.0 can double the expected number of deposits as can correlations of −1.0. Studies done in the 1970s on methods that use Bayes rule show that moderate correlations among attributes seriously affect estimates and even small correlations lead to increases in misclassifications. Adverse effects have been observed with small to moderate correlations when only six to eight variables were used. Consistent evidence of upward biased probability estimates from multivariate methods founded on Bayes rule must be of considerable concern to institutions and governmental agencies where unbiased estimates are required. In addition to increasing the misclassification rate, biased probability estimates make classification into deposit and nondeposit classes an arbitrary subjective decision. The probabilistic neural network has no problem dealing with correlated variables—its performance depends strongly on having a thoroughly representative training set. Probabilistic neural networks or logistic regression should receive serious consideration where unbiased estimates are required. The weights-of-evidence method would serve to estimate thresholds between anomalies and background and for exploratory data analysis.
A Track Initiation Method for the Underwater Target Tracking Environment
NASA Astrophysics Data System (ADS)
Li, Dong-dong; Lin, Yang; Zhang, Yao
2018-04-01
A novel efficient track initiation method is proposed for the harsh underwater target tracking environment (heavy clutter and large measurement errors): track splitting, evaluating, pruning and merging method (TSEPM). Track initiation demands that the method should determine the existence and initial state of a target quickly and correctly. Heavy clutter and large measurement errors certainly pose additional difficulties and challenges, which deteriorate and complicate the track initiation in the harsh underwater target tracking environment. There are three primary shortcomings for the current track initiation methods to initialize a target: (a) they cannot eliminate the turbulences of clutter effectively; (b) there may be a high false alarm probability and low detection probability of a track; (c) they cannot estimate the initial state for a new confirmed track correctly. Based on the multiple hypotheses tracking principle and modified logic-based track initiation method, in order to increase the detection probability of a track, track splitting creates a large number of tracks which include the true track originated from the target. And in order to decrease the false alarm probability, based on the evaluation mechanism, track pruning and track merging are proposed to reduce the false tracks. TSEPM method can deal with the track initiation problems derived from heavy clutter and large measurement errors, determine the target's existence and estimate its initial state with the least squares method. What's more, our method is fully automatic and does not require any kind manual input for initializing and tuning any parameter. Simulation results indicate that our new method improves significantly the performance of the track initiation in the harsh underwater target tracking environment.
Estimation of the probability of success in petroleum exploration
Davis, J.C.
1977-01-01
A probabilistic model for oil exploration can be developed by assessing the conditional relationship between perceived geologic variables and the subsequent discovery of petroleum. Such a model includes two probabilistic components, the first reflecting the association between a geologic condition (structural closure, for example) and the occurrence of oil, and the second reflecting the uncertainty associated with the estimation of geologic variables in areas of limited control. Estimates of the conditional relationship between geologic variables and subsequent production can be found by analyzing the exploration history of a "training area" judged to be geologically similar to the exploration area. The geologic variables are assessed over the training area using an historical subset of the available data, whose density corresponds to the present control density in the exploration area. The success or failure of wells drilled in the training area subsequent to the time corresponding to the historical subset provides empirical estimates of the probability of success conditional upon geology. Uncertainty in perception of geological conditions may be estimated from the distribution of errors made in geologic assessment using the historical subset of control wells. These errors may be expressed as a linear function of distance from available control. Alternatively, the uncertainty may be found by calculating the semivariogram of the geologic variables used in the analysis: the two procedures will yield approximately equivalent results. The empirical probability functions may then be transferred to the exploration area and used to estimate the likelihood of success of specific exploration plays. These estimates will reflect both the conditional relationship between the geological variables used to guide exploration and the uncertainty resulting from lack of control. The technique is illustrated with case histories from the mid-Continent area of the U.S.A. ?? 1977 Plenum Publishing Corp.
System statistical reliability model and analysis
NASA Technical Reports Server (NTRS)
Lekach, V. S.; Rood, H.
1973-01-01
A digital computer code was developed to simulate the time-dependent behavior of the 5-kwe reactor thermoelectric system. The code was used to determine lifetime sensitivity coefficients for a number of system design parameters, such as thermoelectric module efficiency and degradation rate, radiator absorptivity and emissivity, fuel element barrier defect constant, beginning-of-life reactivity, etc. A probability distribution (mean and standard deviation) was estimated for each of these design parameters. Then, error analysis was used to obtain a probability distribution for the system lifetime (mean = 7.7 years, standard deviation = 1.1 years). From this, the probability that the system will achieve the design goal of 5 years lifetime is 0.993. This value represents an estimate of the degradation reliability of the system.
Classification based upon gene expression data: bias and precision of error rates.
Wood, Ian A; Visscher, Peter M; Mengersen, Kerrie L
2007-06-01
Gene expression data offer a large number of potentially useful predictors for the classification of tissue samples into classes, such as diseased and non-diseased. The predictive error rate of classifiers can be estimated using methods such as cross-validation. We have investigated issues of interpretation and potential bias in the reporting of error rate estimates. The issues considered here are optimization and selection biases, sampling effects, measures of misclassification rate, baseline error rates, two-level external cross-validation and a novel proposal for detection of bias using the permutation mean. Reporting an optimal estimated error rate incurs an optimization bias. Downward bias of 3-5% was found in an existing study of classification based on gene expression data and may be endemic in similar studies. Using a simulated non-informative dataset and two example datasets from existing studies, we show how bias can be detected through the use of label permutations and avoided using two-level external cross-validation. Some studies avoid optimization bias by using single-level cross-validation and a test set, but error rates can be more accurately estimated via two-level cross-validation. In addition to estimating the simple overall error rate, we recommend reporting class error rates plus where possible the conditional risk incorporating prior class probabilities and a misclassification cost matrix. We also describe baseline error rates derived from three trivial classifiers which ignore the predictors. R code which implements two-level external cross-validation with the PAMR package, experiment code, dataset details and additional figures are freely available for non-commercial use from http://www.maths.qut.edu.au/profiles/wood/permr.jsp
Knock probability estimation through an in-cylinder temperature model with exogenous noise
NASA Astrophysics Data System (ADS)
Bares, P.; Selmanaj, D.; Guardiola, C.; Onder, C.
2018-01-01
This paper presents a new knock model which combines a deterministic knock model based on the in-cylinder temperature and an exogenous noise disturbing this temperature. The autoignition of the end-gas is modelled by an Arrhenius-like function and the knock probability is estimated by propagating a virtual error probability distribution. Results show that the random nature of knock can be explained by uncertainties at the in-cylinder temperature estimation. The model only has one parameter for calibration and thus can be easily adapted online. In order to reduce the measurement uncertainties associated with the air mass flow sensor, the trapped mass is derived from the in-cylinder pressure resonance, which improves the knock probability estimation and reduces the number of sensors needed for the model. A four stroke SI engine was used for model validation. By varying the intake temperature, the engine speed, the injected fuel mass, and the spark advance, specific tests were conducted, which furnished data with various knock intensities and probabilities. The new model is able to predict the knock probability within a sufficient range at various operating conditions. The trapped mass obtained by the acoustical model was compared in steady conditions by using a fuel balance and a lambda sensor and differences below 1 % were found.
Subramanian, Sundarraman
2008-01-01
This article concerns asymptotic theory for a new estimator of a survival function in the missing censoring indicator model of random censorship. Specifically, the large sample results for an inverse probability-of-non-missingness weighted estimator of the cumulative hazard function, so far not available, are derived, including an almost sure representation with rate for a remainder term, and uniform strong consistency with rate of convergence. The estimator is based on a kernel estimate for the conditional probability of non-missingness of the censoring indicator. Expressions for its bias and variance, in turn leading to an expression for the mean squared error as a function of the bandwidth, are also obtained. The corresponding estimator of the survival function, whose weak convergence is derived, is asymptotically efficient. A numerical study, comparing the performances of the proposed and two other currently existing efficient estimators, is presented. PMID:18953423
Subramanian, Sundarraman
2006-01-01
This article concerns asymptotic theory for a new estimator of a survival function in the missing censoring indicator model of random censorship. Specifically, the large sample results for an inverse probability-of-non-missingness weighted estimator of the cumulative hazard function, so far not available, are derived, including an almost sure representation with rate for a remainder term, and uniform strong consistency with rate of convergence. The estimator is based on a kernel estimate for the conditional probability of non-missingness of the censoring indicator. Expressions for its bias and variance, in turn leading to an expression for the mean squared error as a function of the bandwidth, are also obtained. The corresponding estimator of the survival function, whose weak convergence is derived, is asymptotically efficient. A numerical study, comparing the performances of the proposed and two other currently existing efficient estimators, is presented.
Quantifying errors without random sampling.
Phillips, Carl V; LaPole, Luwanna M
2003-06-12
All quantifications of mortality, morbidity, and other health measures involve numerous sources of error. The routine quantification of random sampling error makes it easy to forget that other sources of error can and should be quantified. When a quantification does not involve sampling, error is almost never quantified and results are often reported in ways that dramatically overstate their precision. We argue that the precision implicit in typical reporting is problematic and sketch methods for quantifying the various sources of error, building up from simple examples that can be solved analytically to more complex cases. There are straightforward ways to partially quantify the uncertainty surrounding a parameter that is not characterized by random sampling, such as limiting reported significant figures. We present simple methods for doing such quantifications, and for incorporating them into calculations. More complicated methods become necessary when multiple sources of uncertainty must be combined. We demonstrate that Monte Carlo simulation, using available software, can estimate the uncertainty resulting from complicated calculations with many sources of uncertainty. We apply the method to the current estimate of the annual incidence of foodborne illness in the United States. Quantifying uncertainty from systematic errors is practical. Reporting this uncertainty would more honestly represent study results, help show the probability that estimated values fall within some critical range, and facilitate better targeting of further research.
Sampling considerations for disease surveillance in wildlife populations
Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.
2008-01-01
Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.
Fusing metabolomics data sets with heterogeneous measurement errors
Waaijenborg, Sandra; Korobko, Oksana; Willems van Dijk, Ko; Lips, Mirjam; Hankemeier, Thomas; Wilderjans, Tom F.; Smilde, Age K.
2018-01-01
Combining different metabolomics platforms can contribute significantly to the discovery of complementary processes expressed under different conditions. However, analysing the fused data might be hampered by the difference in their quality. In metabolomics data, one often observes that measurement errors increase with increasing measurement level and that different platforms have different measurement error variance. In this paper we compare three different approaches to correct for the measurement error heterogeneity, by transformation of the raw data, by weighted filtering before modelling and by a modelling approach using a weighted sum of residuals. For an illustration of these different approaches we analyse data from healthy obese and diabetic obese individuals, obtained from two metabolomics platforms. Concluding, the filtering and modelling approaches that both estimate a model of the measurement error did not outperform the data transformation approaches for this application. This is probably due to the limited difference in measurement error and the fact that estimation of measurement error models is unstable due to the small number of repeats available. A transformation of the data improves the classification of the two groups. PMID:29698490
Murad, Havi; Kipnis, Victor; Freedman, Laurence S
2016-10-01
Assessing interactions in linear regression models when covariates have measurement error (ME) is complex.We previously described regression calibration (RC) methods that yield consistent estimators and standard errors for interaction coefficients of normally distributed covariates having classical ME. Here we extend normal based RC (NBRC) and linear RC (LRC) methods to a non-classical ME model, and describe more efficient versions that combine estimates from the main study and internal sub-study. We apply these methods to data from the Observing Protein and Energy Nutrition (OPEN) study. Using simulations we show that (i) for normally distributed covariates efficient NBRC and LRC were nearly unbiased and performed well with sub-study size ≥200; (ii) efficient NBRC had lower MSE than efficient LRC; (iii) the naïve test for a single interaction had type I error probability close to the nominal significance level, whereas efficient NBRC and LRC were slightly anti-conservative but more powerful; (iv) for markedly non-normal covariates, efficient LRC yielded less biased estimators with smaller variance than efficient NBRC. Our simulations suggest that it is preferable to use: (i) efficient NBRC for estimating and testing interaction effects of normally distributed covariates and (ii) efficient LRC for estimating and testing interactions for markedly non-normal covariates. © The Author(s) 2013.
Detection of sea otters in boat-based surveys of Prince William Sound, Alaska
Udevitz, Mark S.; Bodkin, James L.; Costa, Daniel P.
1995-01-01
Boat-based surveys have been commonly used to monitor sea otter populations, but there has been little quantitative work to evaluate detection biases that may affect these surveys. We used ground-based observers to investigate sea otter detection probabilities in a boat-based survey of Prince William Sound, Alaska. We estimated that 30% of the otters present on surveyed transects were not detected by boat crews. Approximately half (53%) of the undetected otters were missed because the otters left the transects, apparently in response to the approaching boat. Unbiased estimates of detection probabilities will be required for obtaining unbiased population estimates from boat-based surveys of sea otters. Therefore, boat-based surveys should include methods to estimate sea otter detection probabilities under the conditions specific to each survey. Unbiased estimation of detection probabilities with ground-based observers requires either that the ground crews detect all of the otters in observed subunits, or that there are no errors in determining which crews saw each detected otter. Ground-based observer methods may be appropriate in areas where nearly all of the sea otter habitat is potentially visible from ground-based vantage points.
Blöchliger, Nicolas; Keller, Peter M; Böttger, Erik C; Hombach, Michael
2017-09-01
The procedure for setting clinical breakpoints (CBPs) for antimicrobial susceptibility has been poorly standardized with respect to population data, pharmacokinetic parameters and clinical outcome. Tools to standardize CBP setting could result in improved antibiogram forecast probabilities. We propose a model to estimate probabilities for methodological categorization errors and defined zones of methodological uncertainty (ZMUs), i.e. ranges of zone diameters that cannot reliably be classified. The impact of ZMUs on methodological error rates was used for CBP optimization. The model distinguishes theoretical true inhibition zone diameters from observed diameters, which suffer from methodological variation. True diameter distributions are described with a normal mixture model. The model was fitted to observed inhibition zone diameters of clinical Escherichia coli strains. Repeated measurements for a quality control strain were used to quantify methodological variation. For 9 of 13 antibiotics analysed, our model predicted error rates of < 0.1% applying current EUCAST CBPs. Error rates were > 0.1% for ampicillin, cefoxitin, cefuroxime and amoxicillin/clavulanic acid. Increasing the susceptible CBP (cefoxitin) and introducing ZMUs (ampicillin, cefuroxime, amoxicillin/clavulanic acid) decreased error rates to < 0.1%. ZMUs contained low numbers of isolates for ampicillin and cefuroxime (3% and 6%), whereas the ZMU for amoxicillin/clavulanic acid contained 41% of all isolates and was considered not practical. We demonstrate that CBPs can be improved and standardized by minimizing methodological categorization error rates. ZMUs may be introduced if an intermediate zone is not appropriate for pharmacokinetic/pharmacodynamic or drug dosing reasons. Optimized CBPs will provide a standardized antibiotic susceptibility testing interpretation at a defined level of probability. © The Author 2017. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Toroody, Ahmad Bahoo; Abaiee, Mohammad Mahdi; Gholamnia, Reza; Ketabdari, Mohammad Javad
2016-09-01
Owing to the increase in unprecedented accidents with new root causes in almost all operational areas, the importance of risk management has dramatically risen. Risk assessment, one of the most significant aspects of risk management, has a substantial impact on the system-safety level of organizations, industries, and operations. If the causes of all kinds of failure and the interactions between them are considered, effective risk assessment can be highly accurate. A combination of traditional risk assessment approaches and modern scientific probability methods can help in realizing better quantitative risk assessment methods. Most researchers face the problem of minimal field data with respect to the probability and frequency of each failure. Because of this limitation in the availability of epistemic knowledge, it is important to conduct epistemic estimations by applying the Bayesian theory for identifying plausible outcomes. In this paper, we propose an algorithm and demonstrate its application in a case study for a light-weight lifting operation in the Persian Gulf of Iran. First, we identify potential accident scenarios and present them in an event tree format. Next, excluding human error, we use the event tree to roughly estimate the prior probability of other hazard-promoting factors using a minimal amount of field data. We then use the Success Likelihood Index Method (SLIM) to calculate the probability of human error. On the basis of the proposed event tree, we use the Bayesian network of the provided scenarios to compensate for the lack of data. Finally, we determine the resulting probability of each event based on its evidence in the epistemic estimation format by building on two Bayesian network types: the probability of hazard promotion factors and the Bayesian theory. The study results indicate that despite the lack of available information on the operation of floating objects, a satisfactory result can be achieved using epistemic data.
Cluster membership probability: polarimetric approach
NASA Astrophysics Data System (ADS)
Medhi, Biman J.; Tamura, Motohide
2013-04-01
Interstellar polarimetric data of the six open clusters Hogg 15, NGC 6611, NGC 5606, NGC 6231, NGC 5749 and NGC 6250 have been used to estimate the membership probability for the stars within them. For proper-motion member stars, the membership probability estimated using the polarimetric data is in good agreement with the proper-motion cluster membership probability. However, for proper-motion non-member stars, the membership probability estimated by the polarimetric method is in total disagreement with the proper-motion cluster membership probability. The inconsistencies in the determined memberships may be because of the fundamental differences between the two methods of determination: one is based on stellar proper motion in space and the other is based on selective extinction of the stellar output by the asymmetric aligned dust grains present in the interstellar medium. The results and analysis suggest that the scatter of the Stokes vectors q (per cent) and u (per cent) for the proper-motion member stars depends on the interstellar and intracluster differential reddening in the open cluster. It is found that this method could be used to estimate the cluster membership probability if we have additional polarimetric and photometric information for a star to identify it as a probable member/non-member of a particular cluster, such as the maximum wavelength value (λmax), the unit weight error of the fit (σ1), the dispersion in the polarimetric position angles (overline{ɛ }), reddening (E(B - V)) or the differential intracluster reddening (ΔE(B - V)). This method could also be used to estimate the membership probability of known member stars having no membership probability as well as to resolve disagreements about membership among different proper-motion surveys.
Flood-frequency characteristics of Wisconsin streams
Walker, John F.; Peppler, Marie C.; Danz, Mari E.; Hubbard, Laura E.
2017-05-22
Flood-frequency characteristics for 360 gaged sites on unregulated rural streams in Wisconsin are presented for percent annual exceedance probabilities ranging from 0.2 to 50 using a statewide skewness map developed for this report. Equations of the relations between flood-frequency and drainage-basin characteristics were developed by multiple-regression analyses. Flood-frequency characteristics for ungaged sites on unregulated, rural streams can be estimated by use of the equations presented in this report. The State was divided into eight areas of similar physiographic characteristics. The most significant basin characteristics are drainage area, soil saturated hydraulic conductivity, main-channel slope, and several land-use variables. The standard error of prediction for the equation for the 1-percent annual exceedance probability flood ranges from 56 to 70 percent for Wisconsin Streams; these values are larger than results presented in previous reports. The increase in the standard error of prediction is likely due to increased variability of the annual-peak discharges, resulting in increased variability in the magnitude of flood peaks at higher frequencies. For each of the unregulated rural streamflow-gaging stations, a weighted estimate based on the at-site log Pearson type III analysis and the multiple regression results was determined. The weighted estimate generally has a lower uncertainty than either the Log Pearson type III or multiple regression estimates. For regulated streams, a graphical method for estimating flood-frequency characteristics was developed from the relations of discharge and drainage area for selected annual exceedance probabilities. Graphs for the major regulated streams in Wisconsin are presented in the report.
Correcting for Measurement Error in Time-Varying Covariates in Marginal Structural Models.
Kyle, Ryan P; Moodie, Erica E M; Klein, Marina B; Abrahamowicz, Michał
2016-08-01
Unbiased estimation of causal parameters from marginal structural models (MSMs) requires a fundamental assumption of no unmeasured confounding. Unfortunately, the time-varying covariates used to obtain inverse probability weights are often error-prone. Although substantial measurement error in important confounders is known to undermine control of confounders in conventional unweighted regression models, this issue has received comparatively limited attention in the MSM literature. Here we propose a novel application of the simulation-extrapolation (SIMEX) procedure to address measurement error in time-varying covariates, and we compare 2 approaches. The direct approach to SIMEX-based correction targets outcome model parameters, while the indirect approach corrects the weights estimated using the exposure model. We assess the performance of the proposed methods in simulations under different clinically plausible assumptions. The simulations demonstrate that measurement errors in time-dependent covariates may induce substantial bias in MSM estimators of causal effects of time-varying exposures, and that both proposed SIMEX approaches yield practically unbiased estimates in scenarios featuring low-to-moderate degrees of error. We illustrate the proposed approach in a simple analysis of the relationship between sustained virological response and liver fibrosis progression among persons infected with hepatitis C virus, while accounting for measurement error in γ-glutamyltransferase, using data collected in the Canadian Co-infection Cohort Study from 2003 to 2014. © The Author 2016. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Kang, Le; Carter, Randy; Darcy, Kathleen; Kauderer, James; Liao, Shu-Yuan
2013-01-01
In this article we use a latent class model (LCM) with prevalence modeled as a function of covariates to assess diagnostic test accuracy in situations where the true disease status is not observed, but observations on three or more conditionally independent diagnostic tests are available. A fast Monte Carlo EM (MCEM) algorithm with binary (disease) diagnostic data is implemented to estimate parameters of interest; namely, sensitivity, specificity, and prevalence of the disease as a function of covariates. To obtain standard errors for confidence interval construction of estimated parameters, the missing information principle is applied to adjust information matrix estimates. We compare the adjusted information matrix based standard error estimates with the bootstrap standard error estimates both obtained using the fast MCEM algorithm through an extensive Monte Carlo study. Simulation demonstrates that the adjusted information matrix approach estimates the standard error similarly with the bootstrap methods under certain scenarios. The bootstrap percentile intervals have satisfactory coverage probabilities. We then apply the LCM analysis to a real data set of 122 subjects from a Gynecologic Oncology Group (GOG) study of significant cervical lesion (S-CL) diagnosis in women with atypical glandular cells of undetermined significance (AGC) to compare the diagnostic accuracy of a histology-based evaluation, a CA-IX biomarker-based test and a human papillomavirus (HPV) DNA test. PMID:24163493
Understanding seasonal variability of uncertainty in hydrological prediction
NASA Astrophysics Data System (ADS)
Li, M.; Wang, Q. J.
2012-04-01
Understanding uncertainty in hydrological prediction can be highly valuable for improving the reliability of streamflow prediction. In this study, a monthly water balance model, WAPABA, in a Bayesian joint probability with error models are presented to investigate the seasonal dependency of prediction error structure. A seasonal invariant error model, analogous to traditional time series analysis, uses constant parameters for model error and account for no seasonal variations. In contrast, a seasonal variant error model uses a different set of parameters for bias, variance and autocorrelation for each individual calendar month. Potential connection amongst model parameters from similar months is not considered within the seasonal variant model and could result in over-fitting and over-parameterization. A hierarchical error model further applies some distributional restrictions on model parameters within a Bayesian hierarchical framework. An iterative algorithm is implemented to expedite the maximum a posterior (MAP) estimation of a hierarchical error model. Three error models are applied to forecasting streamflow at a catchment in southeast Australia in a cross-validation analysis. This study also presents a number of statistical measures and graphical tools to compare the predictive skills of different error models. From probability integral transform histograms and other diagnostic graphs, the hierarchical error model conforms better to reliability when compared to the seasonal invariant error model. The hierarchical error model also generally provides the most accurate mean prediction in terms of the Nash-Sutcliffe model efficiency coefficient and the best probabilistic prediction in terms of the continuous ranked probability score (CRPS). The model parameters of the seasonal variant error model are very sensitive to each cross validation, while the hierarchical error model produces much more robust and reliable model parameters. Furthermore, the result of the hierarchical error model shows that most of model parameters are not seasonal variant except for error bias. The seasonal variant error model is likely to use more parameters than necessary to maximize the posterior likelihood. The model flexibility and robustness indicates that the hierarchical error model has great potential for future streamflow predictions.
NASA Astrophysics Data System (ADS)
Xia, Xintao; Wang, Zhongyu
2008-10-01
For some methods of stability analysis of a system using statistics, it is difficult to resolve the problems of unknown probability distribution and small sample. Therefore, a novel method is proposed in this paper to resolve these problems. This method is independent of probability distribution, and is useful for small sample systems. After rearrangement of the original data series, the order difference and two polynomial membership functions are introduced to estimate the true value, the lower bound and the supper bound of the system using fuzzy-set theory. Then empirical distribution function is investigated to ensure confidence level above 95%, and the degree of similarity is presented to evaluate stability of the system. Cases of computer simulation investigate stable systems with various probability distribution, unstable systems with linear systematic errors and periodic systematic errors and some mixed systems. The method of analysis for systematic stability is approved.
A note on variance estimation in random effects meta-regression.
Sidik, Kurex; Jonkman, Jeffrey N
2005-01-01
For random effects meta-regression inference, variance estimation for the parameter estimates is discussed. Because estimated weights are used for meta-regression analysis in practice, the assumed or estimated covariance matrix used in meta-regression is not strictly correct, due to possible errors in estimating the weights. Therefore, this note investigates the use of a robust variance estimation approach for obtaining variances of the parameter estimates in random effects meta-regression inference. This method treats the assumed covariance matrix of the effect measure variables as a working covariance matrix. Using an example of meta-analysis data from clinical trials of a vaccine, the robust variance estimation approach is illustrated in comparison with two other methods of variance estimation. A simulation study is presented, comparing the three methods of variance estimation in terms of bias and coverage probability. We find that, despite the seeming suitability of the robust estimator for random effects meta-regression, the improved variance estimator of Knapp and Hartung (2003) yields the best performance among the three estimators, and thus may provide the best protection against errors in the estimated weights.
Yoshizaki, J.; Pollock, K.H.; Brownie, C.; Webster, R.A.
2009-01-01
Misidentification of animals is potentially important when naturally existing features (natural tags) are used to identify individual animals in a capture-recapture study. Photographic identification (photoID) typically uses photographic images of animals' naturally existing features as tags (photographic tags) and is subject to two main causes of identification errors: those related to quality of photographs (non-evolving natural tags) and those related to changes in natural marks (evolving natural tags). The conventional methods for analysis of capture-recapture data do not account for identification errors, and to do so requires a detailed understanding of the misidentification mechanism. Focusing on the situation where errors are due to evolving natural tags, we propose a misidentification mechanism and outline a framework for modeling the effect of misidentification in closed population studies. We introduce methods for estimating population size based on this model. Using a simulation study, we show that conventional estimators can seriously overestimate population size when errors due to misidentification are ignored, and that, in comparison, our new estimators have better properties except in cases with low capture probabilities (<0.2) or low misidentification rates (<2.5%). ?? 2009 by the Ecological Society of America.
NASA Astrophysics Data System (ADS)
Al-Mudhafar, W. J.
2013-12-01
Precisely prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships to estimate the properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution to perform an accurate reservoir model for optimal future reservoir performance. In this paper, the facies estimation has been done through Multinomial logistic regression (MLR) with respect to the well logs and core data in a well in upper sandstone formation of South Rumaila oil field. The entire independent variables are gamma rays, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. Firstly, Robust Sequential Imputation Algorithm has been considered to impute the missing data. This algorithm starts from a complete subset of the dataset and estimates sequentially the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then, the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to estimate the maximum likelihood and minimize the standard error for the nonlinear relationships between facies & core and log data. The MLR is used to predict the probabilities of the different possible facies given each independent variable by constructing a linear predictor function having a set of weights that are linearly combined with the independent variables by using a dot product. Beta distribution of facies has been considered as prior knowledge and the resulted predicted probability (posterior) has been estimated from MLR based on Baye's theorem that represents the relationship between predicted probability (posterior) with the conditional probability and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap should be carried out to estimate extra-sample prediction error by randomly drawing datasets with replacement from the training data. Each sample has the same size of the original training set and it can be conducted N times to produce N bootstrap datasets to re-fit the model accordingly to decrease the squared difference between the estimated and observed categorical variables (facies) leading to decrease the degree of uncertainty.
High dimensional linear regression models under long memory dependence and measurement error
NASA Astrophysics Data System (ADS)
Kaul, Abhishek
This dissertation consists of three chapters. The first chapter introduces the models under consideration and motivates problems of interest. A brief literature review is also provided in this chapter. The second chapter investigates the properties of Lasso under long range dependent model errors. Lasso is a computationally efficient approach to model selection and estimation, and its properties are well studied when the regression errors are independent and identically distributed. We study the case, where the regression errors form a long memory moving average process. We establish a finite sample oracle inequality for the Lasso solution. We then show the asymptotic sign consistency in this setup. These results are established in the high dimensional setup (p> n) where p can be increasing exponentially with n. Finally, we show the consistency, n½ --d-consistency of Lasso, along with the oracle property of adaptive Lasso, in the case where p is fixed. Here d is the memory parameter of the stationary error sequence. The performance of Lasso is also analysed in the present setup with a simulation study. The third chapter proposes and investigates the properties of a penalized quantile based estimator for measurement error models. Standard formulations of prediction problems in high dimension regression models assume the availability of fully observed covariates and sub-Gaussian and homogeneous model errors. This makes these methods inapplicable to measurement errors models where covariates are unobservable and observations are possibly non sub-Gaussian and heterogeneous. We propose weighted penalized corrected quantile estimators for the regression parameter vector in linear regression models with additive measurement errors, where unobservable covariates are nonrandom. The proposed estimators forgo the need for the above mentioned model assumptions. We study these estimators in both the fixed dimension and high dimensional sparse setups, in the latter setup, the dimensionality can grow exponentially with the sample size. In the fixed dimensional setting we provide the oracle properties associated with the proposed estimators. In the high dimensional setting, we provide bounds for the statistical error associated with the estimation, that hold with asymptotic probability 1, thereby providing the ℓ1-consistency of the proposed estimator. We also establish the model selection consistency in terms of the correctly estimated zero components of the parameter vector. A simulation study that investigates the finite sample accuracy of the proposed estimator is also included in this chapter.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morley, Steven
The PyForecastTools package provides Python routines for calculating metrics for model validation, forecast verification and model comparison. For continuous predictands the package provides functions for calculating bias (mean error, mean percentage error, median log accuracy, symmetric signed bias), and for calculating accuracy (mean squared error, mean absolute error, mean absolute scaled error, normalized RMSE, median symmetric accuracy). Convenience routines to calculate the component parts (e.g. forecast error, scaled error) of each metric are also provided. To compare models the package provides: generic skill score; percent better. Robust measures of scale including median absolute deviation, robust standard deviation, robust coefficient ofmore » variation and the Sn estimator are all provided by the package. Finally, the package implements Python classes for NxN contingency tables. In the case of a multi-class prediction, accuracy and skill metrics such as proportion correct and the Heidke and Peirce skill scores are provided as object methods. The special case of a 2x2 contingency table inherits from the NxN class and provides many additional metrics for binary classification: probability of detection, probability of false detection, false alarm ration, threat score, equitable threat score, bias. Confidence intervals for many of these quantities can be calculated using either the Wald method or Agresti-Coull intervals.« less
Reconciling uncertain costs and benefits in bayes nets for invasive species management
Burgman, M.A.; Wintle, B.A.; Thompson, C.A.; Moilanen, A.; Runge, M.C.; Ben-Haim, Y.
2010-01-01
Bayes nets are used increasingly to characterize environmental systems and formalize probabilistic reasoning to support decision making. These networks treat probabilities as exact quantities. Sensitivity analysis can be used to evaluate the importance of assumptions and parameter estimates. Here, we outline an application of info-gap theory to Bayes nets that evaluates the sensitivity of decisions to possibly large errors in the underlying probability estimates and utilities. We apply it to an example of management and eradication of Red Imported Fire Ants in Southern Queensland, Australia and show how changes in management decisions can be justified when uncertainty is considered. ?? 2009 Society for Risk Analysis.
Rekaya, Romdhane; Smith, Shannon; Hay, El Hamidi; Farhat, Nourhene; Aggrey, Samuel E
2016-01-01
Errors in the binary status of some response traits are frequent in human, animal, and plant applications. These error rates tend to differ between cases and controls because diagnostic and screening tests have different sensitivity and specificity. This increases the inaccuracies of classifying individuals into correct groups, giving rise to both false-positive and false-negative cases. The analysis of these noisy binary responses due to misclassification will undoubtedly reduce the statistical power of genome-wide association studies (GWAS). A threshold model that accommodates varying diagnostic errors between cases and controls was investigated. A simulation study was carried out where several binary data sets (case-control) were generated with varying effects for the most influential single nucleotide polymorphisms (SNPs) and different diagnostic error rate for cases and controls. Each simulated data set consisted of 2000 individuals. Ignoring misclassification resulted in biased estimates of true influential SNP effects and inflated estimates for true noninfluential markers. A substantial reduction in bias and increase in accuracy ranging from 12% to 32% was observed when the misclassification procedure was invoked. In fact, the majority of influential SNPs that were not identified using the noisy data were captured using the proposed method. Additionally, truly misclassified binary records were identified with high probability using the proposed method. The superiority of the proposed method was maintained across different simulation parameters (misclassification rates and odds ratios) attesting to its robustness.
Aggregate and individual replication probability within an explicit model of the research process.
Miller, Jeff; Schwarz, Wolf
2011-09-01
We study a model of the research process in which the true effect size, the replication jitter due to changes in experimental procedure, and the statistical error of effect size measurement are all normally distributed random variables. Within this model, we analyze the probability of successfully replicating an initial experimental result by obtaining either a statistically significant result in the same direction or any effect in that direction. We analyze both the probability of successfully replicating a particular experimental effect (i.e., the individual replication probability) and the average probability of successful replication across different studies within some research context (i.e., the aggregate replication probability), and we identify the conditions under which the latter can be approximated using the formulas of Killeen (2005a, 2007). We show how both of these probabilities depend on parameters of the research context that would rarely be known in practice. In addition, we show that the statistical uncertainty associated with the size of an initial observed effect would often prevent accurate estimation of the desired individual replication probability even if these research context parameters were known exactly. We conclude that accurate estimates of replication probability are generally unattainable.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Ping; Wang, Chenyu; Li, Mingjie
In general, the modeling errors of dynamic system model are a set of random variables. The traditional performance index of modeling such as means square error (MSE) and root means square error (RMSE) can not fully express the connotation of modeling errors with stochastic characteristics both in the dimension of time domain and space domain. Therefore, the probability density function (PDF) is introduced to completely describe the modeling errors in both time scales and space scales. Based on it, a novel wavelet neural network (WNN) modeling method is proposed by minimizing the two-dimensional (2D) PDF shaping of modeling errors. First,more » the modeling error PDF by the tradional WNN is estimated using data-driven kernel density estimation (KDE) technique. Then, the quadratic sum of 2D deviation between the modeling error PDF and the target PDF is utilized as performance index to optimize the WNN model parameters by gradient descent method. Since the WNN has strong nonlinear approximation and adaptive capability, and all the parameters are well optimized by the proposed method, the developed WNN model can make the modeling error PDF track the target PDF, eventually. Simulation example and application in a blast furnace ironmaking process show that the proposed method has a higher modeling precision and better generalization ability compared with the conventional WNN modeling based on MSE criteria. Furthermore, the proposed method has more desirable estimation for modeling error PDF that approximates to a Gaussian distribution whose shape is high and narrow.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Ping; Wang, Chenyu; Li, Mingjie
In general, the modeling errors of dynamic system model are a set of random variables. The traditional performance index of modeling such as means square error (MSE) and root means square error (RMSE) cannot fully express the connotation of modeling errors with stochastic characteristics both in the dimension of time domain and space domain. Therefore, the probability density function (PDF) is introduced to completely describe the modeling errors in both time scales and space scales. Based on it, a novel wavelet neural network (WNN) modeling method is proposed by minimizing the two-dimensional (2D) PDF shaping of modeling errors. First, themore » modeling error PDF by the traditional WNN is estimated using data-driven kernel density estimation (KDE) technique. Then, the quadratic sum of 2D deviation between the modeling error PDF and the target PDF is utilized as performance index to optimize the WNN model parameters by gradient descent method. Since the WNN has strong nonlinear approximation and adaptive capability, and all the parameters are well optimized by the proposed method, the developed WNN model can make the modeling error PDF track the target PDF, eventually. Simulation example and application in a blast furnace ironmaking process show that the proposed method has a higher modeling precision and better generalization ability compared with the conventional WNN modeling based on MSE criteria. However, the proposed method has more desirable estimation for modeling error PDF that approximates to a Gaussian distribution whose shape is high and narrow.« less
Zhou, Ping; Wang, Chenyu; Li, Mingjie; ...
2018-01-31
In general, the modeling errors of dynamic system model are a set of random variables. The traditional performance index of modeling such as means square error (MSE) and root means square error (RMSE) cannot fully express the connotation of modeling errors with stochastic characteristics both in the dimension of time domain and space domain. Therefore, the probability density function (PDF) is introduced to completely describe the modeling errors in both time scales and space scales. Based on it, a novel wavelet neural network (WNN) modeling method is proposed by minimizing the two-dimensional (2D) PDF shaping of modeling errors. First, themore » modeling error PDF by the traditional WNN is estimated using data-driven kernel density estimation (KDE) technique. Then, the quadratic sum of 2D deviation between the modeling error PDF and the target PDF is utilized as performance index to optimize the WNN model parameters by gradient descent method. Since the WNN has strong nonlinear approximation and adaptive capability, and all the parameters are well optimized by the proposed method, the developed WNN model can make the modeling error PDF track the target PDF, eventually. Simulation example and application in a blast furnace ironmaking process show that the proposed method has a higher modeling precision and better generalization ability compared with the conventional WNN modeling based on MSE criteria. However, the proposed method has more desirable estimation for modeling error PDF that approximates to a Gaussian distribution whose shape is high and narrow.« less
Bernoulli, Darwin, and Sagan: the probability of life on other planets
NASA Astrophysics Data System (ADS)
Rossmo, D. Kim
2017-04-01
The recent discovery that billions of planets in the Milky Way Galaxy may be in circumstellar habitable zones has renewed speculation over the possibility of extraterrestrial life. The Drake equation is a probabilistic framework for estimating the number of technological advanced civilizations in our Galaxy; however, many of the equation's component probabilities are either unknown or have large error intervals. In this paper, a different method of examining this question is explored, one that replaces the various Drake factors with the single estimate for the probability of life existing on Earth. This relationship can be described by the binomial distribution if the presence of life on a given number of planets is equated to successes in a Bernoulli trial. The question of exoplanet life may then be reformulated as follows - given the probability of one or more independent successes for a given number of trials, what is the probability of two or more successes? Some of the implications of this approach for finding life on exoplanets are discussed.
Airborne radar technology for windshear detection
NASA Technical Reports Server (NTRS)
Hibey, Joseph L.; Khalaf, Camille S.
1988-01-01
The objectives and accomplishments of the two-and-a-half year effort to describe how returns from on-board Doppler radar are to be used to detect the presence of a wind shear are reported. The problem is modeled as one of first passage in terms of state variables, the state estimates are generated by a bank of extended Kalman filters working in parallel, and the decision strategy involves the use of a voting algorithm for a series of likelihood ratio tests. The performance issue for filtering is addressed in terms of error-covariance reduction and filter divergence, and the performance issue for detection is addressed in terms of using a probability measure transformation to derive theoretical expressions for the error probabilities of a false alarm and a miss.
The theory precision analyse of RFM localization of satellite remote sensing imagery
NASA Astrophysics Data System (ADS)
Zhang, Jianqing; Xv, Biao
2009-11-01
The tradition method of detecting precision of Rational Function Model(RFM) is to make use of a great deal check points, and it calculates mean square error through comparing calculational coordinate with known coordinate. This method is from theory of probability, through a large number of samples to statistic estimate value of mean square error, we can think its estimate value approaches in its true when samples are well enough. This paper is from angle of survey adjustment, take law of propagation of error as the theory basis, and it calculates theory precision of RFM localization. Then take the SPOT5 three array imagery as experiment data, and the result of traditional method and narrated method in the paper are compared, while has confirmed tradition method feasible, and answered its theory precision question from the angle of survey adjustment.
Estimating error rates for firearm evidence identifications in forensic science
Song, John; Vorburger, Theodore V.; Chu, Wei; Yen, James; Soons, Johannes A.; Ott, Daniel B.; Zhang, Nien Fan
2018-01-01
Estimating error rates for firearm evidence identification is a fundamental challenge in forensic science. This paper describes the recently developed congruent matching cells (CMC) method for image comparisons, its application to firearm evidence identification, and its usage and initial tests for error rate estimation. The CMC method divides compared topography images into correlation cells. Four identification parameters are defined for quantifying both the topography similarity of the correlated cell pairs and the pattern congruency of the registered cell locations. A declared match requires a significant number of CMCs, i.e., cell pairs that meet all similarity and congruency requirements. Initial testing on breech face impressions of a set of 40 cartridge cases fired with consecutively manufactured pistol slides showed wide separation between the distributions of CMC numbers observed for known matching and known non-matching image pairs. Another test on 95 cartridge cases from a different set of slides manufactured by the same process also yielded widely separated distributions. The test results were used to develop two statistical models for the probability mass function of CMC correlation scores. The models were applied to develop a framework for estimating cumulative false positive and false negative error rates and individual error rates of declared matches and non-matches for this population of breech face impressions. The prospect for applying the models to large populations and realistic case work is also discussed. The CMC method can provide a statistical foundation for estimating error rates in firearm evidence identifications, thus emulating methods used for forensic identification of DNA evidence. PMID:29331680
Estimating error rates for firearm evidence identifications in forensic science.
Song, John; Vorburger, Theodore V; Chu, Wei; Yen, James; Soons, Johannes A; Ott, Daniel B; Zhang, Nien Fan
2018-03-01
Estimating error rates for firearm evidence identification is a fundamental challenge in forensic science. This paper describes the recently developed congruent matching cells (CMC) method for image comparisons, its application to firearm evidence identification, and its usage and initial tests for error rate estimation. The CMC method divides compared topography images into correlation cells. Four identification parameters are defined for quantifying both the topography similarity of the correlated cell pairs and the pattern congruency of the registered cell locations. A declared match requires a significant number of CMCs, i.e., cell pairs that meet all similarity and congruency requirements. Initial testing on breech face impressions of a set of 40 cartridge cases fired with consecutively manufactured pistol slides showed wide separation between the distributions of CMC numbers observed for known matching and known non-matching image pairs. Another test on 95 cartridge cases from a different set of slides manufactured by the same process also yielded widely separated distributions. The test results were used to develop two statistical models for the probability mass function of CMC correlation scores. The models were applied to develop a framework for estimating cumulative false positive and false negative error rates and individual error rates of declared matches and non-matches for this population of breech face impressions. The prospect for applying the models to large populations and realistic case work is also discussed. The CMC method can provide a statistical foundation for estimating error rates in firearm evidence identifications, thus emulating methods used for forensic identification of DNA evidence. Published by Elsevier B.V.
Transmuted of Rayleigh Distribution with Estimation and Application on Noise Signal
NASA Astrophysics Data System (ADS)
Ahmed, Suhad; Qasim, Zainab
2018-05-01
This paper deals with transforming one parameter Rayleigh distribution, into transmuted probability distribution through introducing a new parameter (λ), since this studied distribution is necessary in representing signal data distribution and failure data model the value of this transmuted parameter |λ| ≤ 1, is also estimated as well as the original parameter (⊖) by methods of moments and maximum likelihood using different sample size (n=25, 50, 75, 100) and comparing the results of estimation by statistical measure (mean square error, MSE).
Performance of concatenated Reed-Solomon/Viterbi channel coding
NASA Technical Reports Server (NTRS)
Divsalar, D.; Yuen, J. H.
1982-01-01
The concatenated Reed-Solomon (RS)/Viterbi coding system is reviewed. The performance of the system is analyzed and results are derived with a new simple approach. A functional model for the input RS symbol error probability is presented. Based on this new functional model, we compute the performance of a concatenated system in terms of RS word error probability, output RS symbol error probability, bit error probability due to decoding failure, and bit error probability due to decoding error. Finally we analyze the effects of the noisy carrier reference and the slow fading on the system performance.
Letcher, B.H.; Horton, G.E.
2008-01-01
We estimated the magnitude and shape of size-dependent survival (SDS) across multiple sampling intervals for two cohorts of stream-dwelling Atlantic salmon (Salmo salar) juveniles using multistate capture-mark-recapture (CMR) models. Simulations designed to test the effectiveness of multistate models for detecting SDS in our system indicated that error in SDS estimates was low and that both time-invariant and time-varying SDS could be detected with sample sizes of >250, average survival of >0.6, and average probability of capture of >0.6, except for cases of very strong SDS. In the field (N ??? 750, survival 0.6-0.8 among sampling intervals, probability of capture 0.6-0.8 among sampling occasions), about one-third of the sampling intervals showed evidence of SDS, with poorer survival of larger fish during the age-2+ autumn and quadratic survival (opposite direction between cohorts) during age-1+ spring. The varying magnitude and shape of SDS among sampling intervals suggest a potential mechanism for the maintenance of the very wide observed size distributions. Estimating SDS using multistate CMR models appears complementary to established approaches, can provide estimates with low error, and can be used to detect intermittent SDS. ?? 2008 NRC Canada.
Agoritsas, Thomas; Courvoisier, Delphine S; Combescure, Christophe; Deom, Marie; Perneger, Thomas V
2011-04-01
The probability of a disease following a diagnostic test depends on the sensitivity and specificity of the test, but also on the prevalence of the disease in the population of interest (or pre-test probability). How physicians use this information is not well known. To assess whether physicians correctly estimate post-test probability according to various levels of prevalence and explore this skill across respondent groups. Randomized trial. Population-based sample of 1,361 physicians of all clinical specialties. We described a scenario of a highly accurate screening test (sensitivity 99% and specificity 99%) in which we randomly manipulated the prevalence of the disease (1%, 2%, 10%, 25%, 95%, or no information). We asked physicians to estimate the probability of disease following a positive test (categorized as <60%, 60-79%, 80-94%, 95-99.9%, and >99.9%). Each answer was correct for a different version of the scenario, and no answer was possible in the "no information" scenario. We estimated the proportion of physicians proficient in assessing post-test probability as the proportion of correct answers beyond the distribution of answers attributable to guessing. Most respondents in each of the six groups (67%-82%) selected a post-test probability of 95-99.9%, regardless of the prevalence of disease and even when no information on prevalence was provided. This answer was correct only for a prevalence of 25%. We estimated that 9.1% (95% CI 6.0-14.0) of respondents knew how to assess correctly the post-test probability. This proportion did not vary with clinical experience or practice setting. Most physicians do not take into account the prevalence of disease when interpreting a positive test result. This may cause unnecessary testing and diagnostic errors.
NASA Technical Reports Server (NTRS)
Massey, J. L.
1976-01-01
The very low error probability obtained with long error-correcting codes results in a very small number of observed errors in simulation studies of practical size and renders the usual confidence interval techniques inapplicable to the observed error probability. A natural extension of the notion of a 'confidence interval' is made and applied to such determinations of error probability by simulation. An example is included to show the surprisingly great significance of as few as two decoding errors in a very large number of decoding trials.
Estimating transition probabilities in unmarked populations --entropy revisited
Cooch, E.G.; Link, W.A.
1999-01-01
The probability of surviving and moving between 'states' is of great interest to biologists. Robust estimation of these transitions using multiple observations of individually identifiable marked individuals has received considerable attention in recent years. However, in some situations, individuals are not identifiable (or have a very low recapture rate), although all individuals in a sample can be assigned to a particular state (e.g. breeding or non-breeding) without error. In such cases, only aggregate data (number of individuals in a given state at each occasion) are available. If the underlying matrix of transition probabilities does not vary through time and aggregate data are available for several time periods, then it is possible to estimate these parameters using least-squares methods. Even when such data are available, this assumption of stationarity will usually be deemed overly restrictive and, frequently, data will only be available for two time periods. In these cases, the problem reduces to estimating the most likely matrix (or matrices) leading to the observed frequency distribution of individuals in each state. An entropy maximization approach has been previously suggested. In this paper, we show that the entropy approach rests on a particular limiting assumption, and does not provide estimates of latent population parameters (the transition probabilities), but rather predictions of realized rates.
NASA Technical Reports Server (NTRS)
Chang, C. Y.; Curlander, J. C.
1992-01-01
Estimation of the Doppler centroid ambiguity is a necessary element of the signal processing for SAR systems with large antenna pointing errors. Without proper resolution of the Doppler centroid estimation (DCE) ambiguity, the image quality will be degraded in the system impulse response function and the geometric fidelity. Two techniques for resolution of DCE ambiguity for the spaceborne SAR are presented; they include a brief review of the range cross-correlation technique and presentation of a new technique using multiple pulse repetition frequencies (PRFs). For SAR systems, where other performance factors control selection of the PRF's, an algorithm is devised to resolve the ambiguity that uses PRF's of arbitrary numerical values. The performance of this multiple PRF technique is analyzed based on a statistical error model. An example is presented that demonstrates for the Shuttle Imaging Radar-C (SIR-C) C-band SAR, the probability of correct ambiguity resolution is higher than 95 percent for antenna attitude errors as large as 3 deg.
Zhao, Wei; Cella, Massimo; Della Pasqua, Oscar; Burger, David; Jacqz-Aigrain, Evelyne
2012-01-01
AIMS To develop a population pharmacokinetic model for abacavir in HIV-infected infants and toddlers, which will be used to describe both once and twice daily pharmacokinetic profiles, identify covariates that explain variability and propose optimal time points to optimize the area under the concentration–time curve (AUC) targeted dosage and individualize therapy. METHODS The pharmacokinetics of abacavir was described with plasma concentrations from 23 patients using nonlinear mixed-effects modelling (NONMEM) software. A two-compartment model with first-order absorption and elimination was developed. The final model was validated using bootstrap, visual predictive check and normalized prediction distribution errors. The Bayesian estimator was validated using the cross-validation and simulation–estimation method. RESULTS The typical population pharmacokinetic parameters and relative standard errors (RSE) were apparent systemic clearance (CL) 13.4 l h−1 (RSE 6.3%), apparent central volume of distribution 4.94 l (RSE 28.7%), apparent peripheral volume of distribution 8.12 l (RSE14.2%), apparent intercompartment clearance 1.25 l h−1 (RSE 16.9%) and absorption rate constant 0.758 h−1 (RSE 5.8%). The covariate analysis identified weight as the individual factor influencing the apparent oral clearance: CL = 13.4 × (weight/12)1.14. The maximum a posteriori probability Bayesian estimator, based on three concentrations measured at 0, 1 or 2, and 3 h after drug intake allowed predicting individual AUC0–t. CONCLUSIONS The population pharmacokinetic model developed for abacavir in HIV-infected infants and toddlers accurately described both once and twice daily pharmacokinetic profiles. The maximum a posteriori probability Bayesian estimator of AUC0–t was developed from the final model and can be used routinely to optimize individual dosing. PMID:21988586
Frequency synchronization of a frequency-hopped MFSK communication system
NASA Technical Reports Server (NTRS)
Huth, G. K.; Polydoros, A.; Simon, M. K.
1981-01-01
This paper presents the performance of fine-frequency synchronization. The performance degradation due to imperfect frequency synchronization is found in terms of the effect on bit error probability as a function of full-band or partial-band noise jamming levels and of the number of frequency hops used in the estimator. The effect of imperfect fine-time synchronization is also included in the calculation of fine-frequency synchronization performance to obtain the overall performance degradation due to synchronization errors.
Tentori, Katya; Chater, Nick; Crupi, Vincenzo
2016-04-01
Inductive reasoning requires exploiting links between evidence and hypotheses. This can be done focusing either on the posterior probability of the hypothesis when updated on the new evidence or on the impact of the new evidence on the credibility of the hypothesis. But are these two cognitive representations equally reliable? This study investigates this question by comparing probability and impact judgments on the same experimental materials. The results indicate that impact judgments are more consistent in time and more accurate than probability judgments. Impact judgments also predict the direction of errors in probability judgments. These findings suggest that human inductive reasoning relies more on estimating evidential impact than on posterior probability. Copyright © 2015 Cognitive Science Society, Inc.
Berkvens, Rafael; Peremans, Herbert; Weyn, Maarten
2016-10-02
Localization systems are increasingly valuable, but their location estimates are only useful when the uncertainty of the estimate is known. This uncertainty is currently calculated as the location error given a ground truth, which is then used as a static measure in sometimes very different environments. In contrast, we propose the use of the conditional entropy of a posterior probability distribution as a complementary measure of uncertainty. This measure has the advantage of being dynamic, i.e., it can be calculated during localization based on individual sensor measurements, does not require a ground truth, and can be applied to discrete localization algorithms. Furthermore, for every consistent location estimation algorithm, both the location error and the conditional entropy measures must be related, i.e., a low entropy should always correspond with a small location error, while a high entropy can correspond with either a small or large location error. We validate this relationship experimentally by calculating both measures of uncertainty in three publicly available datasets using probabilistic Wi-Fi fingerprinting with eight different implementations of the sensor model. We show that the discrepancy between these measures, i.e., many location estimates having a high location error while simultaneously having a low conditional entropy, is largest for the least realistic implementations of the probabilistic sensor model. Based on the results presented in this paper, we conclude that conditional entropy, being dynamic, complementary to location error, and applicable to both continuous and discrete localization, provides an important extra means of characterizing a localization method.
Berkvens, Rafael; Peremans, Herbert; Weyn, Maarten
2016-01-01
Localization systems are increasingly valuable, but their location estimates are only useful when the uncertainty of the estimate is known. This uncertainty is currently calculated as the location error given a ground truth, which is then used as a static measure in sometimes very different environments. In contrast, we propose the use of the conditional entropy of a posterior probability distribution as a complementary measure of uncertainty. This measure has the advantage of being dynamic, i.e., it can be calculated during localization based on individual sensor measurements, does not require a ground truth, and can be applied to discrete localization algorithms. Furthermore, for every consistent location estimation algorithm, both the location error and the conditional entropy measures must be related, i.e., a low entropy should always correspond with a small location error, while a high entropy can correspond with either a small or large location error. We validate this relationship experimentally by calculating both measures of uncertainty in three publicly available datasets using probabilistic Wi-Fi fingerprinting with eight different implementations of the sensor model. We show that the discrepancy between these measures, i.e., many location estimates having a high location error while simultaneously having a low conditional entropy, is largest for the least realistic implementations of the probabilistic sensor model. Based on the results presented in this paper, we conclude that conditional entropy, being dynamic, complementary to location error, and applicable to both continuous and discrete localization, provides an important extra means of characterizing a localization method. PMID:27706099
Multimedia data from two probability-based exposure studies were investigated in terms of how missing data and measurement-error imprecision affected estimation of population parameters and associations. Missing data resulted mainly from individuals' refusing to participate in c...
NASA Astrophysics Data System (ADS)
Liu, Deyang; An, Ping; Ma, Ran; Yang, Chao; Shen, Liquan; Li, Kai
2016-07-01
Three-dimensional (3-D) holoscopic imaging, also known as integral imaging, light field imaging, or plenoptic imaging, can provide natural and fatigue-free 3-D visualization. However, a large amount of data is required to represent the 3-D holoscopic content. Therefore, efficient coding schemes for this particular type of image are needed. A 3-D holoscopic image coding scheme with kernel-based minimum mean square error (MMSE) estimation is proposed. In the proposed scheme, the coding block is predicted by an MMSE estimator under statistical modeling. In order to obtain the signal statistical behavior, kernel density estimation (KDE) is utilized to estimate the probability density function of the statistical modeling. As bandwidth estimation (BE) is a key issue in the KDE problem, we also propose a BE method based on kernel trick. The experimental results demonstrate that the proposed scheme can achieve a better rate-distortion performance and a better visual rendering quality.
About an adaptively weighted Kaplan-Meier estimate.
Plante, Jean-François
2009-09-01
The minimum averaged mean squared error nonparametric adaptive weights use data from m possibly different populations to infer about one population of interest. The definition of these weights is based on the properties of the empirical distribution function. We use the Kaplan-Meier estimate to let the weights accommodate right-censored data and use them to define the weighted Kaplan-Meier estimate. The proposed estimate is smoother than the usual Kaplan-Meier estimate and converges uniformly in probability to the target distribution. Simulations show that the performances of the weighted Kaplan-Meier estimate on finite samples exceed that of the usual Kaplan-Meier estimate. A case study is also presented.
Estimation of proportions in mixed pixels through their region characterization
NASA Technical Reports Server (NTRS)
Chittineni, C. B. (Principal Investigator)
1981-01-01
A region of mixed pixels can be characterized through the probability density function of proportions of classes in the pixels. Using information from the spectral vectors of a given set of pixels from the mixed pixel region, expressions are developed for obtaining the maximum likelihood estimates of the parameters of probability density functions of proportions. The proportions of classes in the mixed pixels can then be estimated. If the mixed pixels contain objects of two classes, the computation can be reduced by transforming the spectral vectors using a transformation matrix that simultaneously diagonalizes the covariance matrices of the two classes. If the proportions of the classes of a set of mixed pixels from the region are given, then expressions are developed for obtaining the estmates of the parameters of the probability density function of the proportions of mixed pixels. Development of these expressions is based on the criterion of the minimum sum of squares of errors. Experimental results from the processing of remotely sensed agricultural multispectral imagery data are presented.
Fancher, Chris M.; Han, Zhen; Levin, Igor; Page, Katharine; Reich, Brian J.; Smith, Ralph C.; Wilson, Alyson G.; Jones, Jacob L.
2016-01-01
A Bayesian inference method for refining crystallographic structures is presented. The distribution of model parameters is stochastically sampled using Markov chain Monte Carlo. Posterior probability distributions are constructed for all model parameters to properly quantify uncertainty by appropriately modeling the heteroskedasticity and correlation of the error structure. The proposed method is demonstrated by analyzing a National Institute of Standards and Technology silicon standard reference material. The results obtained by Bayesian inference are compared with those determined by Rietveld refinement. Posterior probability distributions of model parameters provide both estimates and uncertainties. The new method better estimates the true uncertainties in the model as compared to the Rietveld method. PMID:27550221
Schillaci, Michael A; Schillaci, Mario E
2009-02-01
The use of small sample sizes in human and primate evolutionary research is commonplace. Estimating how well small samples represent the underlying population, however, is not commonplace. Because the accuracy of determinations of taxonomy, phylogeny, and evolutionary process are dependant upon how well the study sample represents the population of interest, characterizing the uncertainty, or potential error, associated with analyses of small sample sizes is essential. We present a method for estimating the probability that the sample mean is within a desired fraction of the standard deviation of the true mean using small (n<10) or very small (n < or = 5) sample sizes. This method can be used by researchers to determine post hoc the probability that their sample is a meaningful approximation of the population parameter. We tested the method using a large craniometric data set commonly used by researchers in the field. Given our results, we suggest that sample estimates of the population mean can be reasonable and meaningful even when based on small, and perhaps even very small, sample sizes.
Deffner, Veronika; Küchenhoff, Helmut; Breitner, Susanne; Schneider, Alexandra; Cyrys, Josef; Peters, Annette
2018-05-01
The ultrafine particle measurements in the Augsburger Umweltstudie, a panel study conducted in Augsburg, Germany, exhibit measurement error from various sources. Measurements of mobile devices show classical possibly individual-specific measurement error; Berkson-type error, which may also vary individually, occurs, if measurements of fixed monitoring stations are used. The combination of fixed site and individual exposure measurements results in a mixture of the two error types. We extended existing bias analysis approaches to linear mixed models with a complex error structure including individual-specific error components, autocorrelated errors, and a mixture of classical and Berkson error. Theoretical considerations and simulation results show, that autocorrelation may severely change the attenuation of the effect estimations. Furthermore, unbalanced designs and the inclusion of confounding variables influence the degree of attenuation. Bias correction with the method of moments using data with mixture measurement error partially yielded better results compared to the usage of incomplete data with classical error. Confidence intervals (CIs) based on the delta method achieved better coverage probabilities than those based on Bootstrap samples. Moreover, we present the application of these new methods to heart rate measurements within the Augsburger Umweltstudie: the corrected effect estimates were slightly higher than their naive equivalents. The substantial measurement error of ultrafine particle measurements has little impact on the results. The developed methodology is generally applicable to longitudinal data with measurement error. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Data processing 1: Advancements in machine analysis of multispectral data
NASA Technical Reports Server (NTRS)
Swain, P. H.
1972-01-01
Multispectral data processing procedures are outlined beginning with the data display process used to accomplish data editing and proceeding through clustering, feature selection criterion for error probability estimation, and sample clustering and sample classification. The effective utilization of large quantities of remote sensing data by formulating a three stage sampling model for evaluation of crop acreage estimates represents an improvement in determining the cost benefit relationship associated with remote sensing technology.
Multilevel sequential Monte Carlo samplers
Beskos, Alexandros; Jasra, Ajay; Law, Kody; ...
2016-08-24
Here, we study the approximation of expectations w.r.t. probability distributions associated to the solution of partial differential equations (PDEs); this scenario appears routinely in Bayesian inverse problems. In practice, one often has to solve the associated PDE numerically, using, for instance finite element methods and leading to a discretisation bias, with the step-size level h L. In addition, the expectation cannot be computed analytically and one often resorts to Monte Carlo methods. In the context of this problem, it is known that the introduction of the multilevel Monte Carlo (MLMC) method can reduce the amount of computational effort to estimate expectations, for a given level of error. This is achieved via a telescoping identity associated to a Monte Carlo approximation of a sequence of probability distributions with discretisation levelsmore » $${\\infty}$$ >h 0>h 1 ...>h L. In many practical problems of interest, one cannot achieve an i.i.d. sampling of the associated sequence of probability distributions. A sequential Monte Carlo (SMC) version of the MLMC method is introduced to deal with this problem. In conclusion, it is shown that under appropriate assumptions, the attractive property of a reduction of the amount of computational effort to estimate expectations, for a given level of error, can be maintained within the SMC context.« less
Sinner, Jim; Ellis, Joanne; Kandlikar, Milind; Halpern, Benjamin S.; Satterfield, Terre; Chan, Kai
2017-01-01
The elicitation of expert judgment is an important tool for assessment of risks and impacts in environmental management contexts, and especially important as decision-makers face novel challenges where prior empirical research is lacking or insufficient. Evidence-driven elicitation approaches typically involve techniques to derive more accurate probability distributions under fairly specific contexts. Experts are, however, prone to overconfidence in their judgements. Group elicitations with diverse experts can reduce expert overconfidence by allowing cross-examination and reassessment of prior judgements, but groups are also prone to uncritical “groupthink” errors. When the problem context is underspecified the probability that experts commit groupthink errors may increase. This study addresses how structured workshops affect expert variability among and certainty within responses in a New Zealand case study. We find that experts’ risk estimates before and after a workshop differ, and that group elicitations provided greater consistency of estimates, yet also greater uncertainty among experts, when addressing prominent impacts to four different ecosystem services in coastal New Zealand. After group workshops, experts provided more consistent ranking of risks and more consistent best estimates of impact through increased clarity in terminology and dampening of extreme positions, yet probability distributions for impacts widened. The results from this case study suggest that group elicitations have favorable consequences for the quality and uncertainty of risk judgments within and across experts, making group elicitation techniques invaluable tools in contexts of limited data. PMID:28767694
NASA Technical Reports Server (NTRS)
Tsaoussi, Lucia S.; Koblinsky, Chester J.
1994-01-01
In order to facilitate the use of satellite-derived sea surface topography and velocity oceanographic models, methodology is presented for deriving the total error covariance and its geographic distribution from TOPEX/POSEIDON measurements. The model is formulated using a parametric model fit to the altimeter range observations. The topography and velocity modeled with spherical harmonic expansions whose coefficients are found through optimal adjustment to the altimeter range residuals using Bayesian statistics. All other parameters, including the orbit, geoid, surface models, and range corrections are provided as unadjusted parameters. The maximum likelihood estimates and errors are derived from the probability density function of the altimeter range residuals conditioned with a priori information. Estimates of model errors for the unadjusted parameters are obtained from the TOPEX/POSEIDON postlaunch verification results and the error covariances for the orbit and the geoid, except for the ocean tides. The error in the ocean tides is modeled, first, as the difference between two global tide models and, second, as the correction to the present tide model, the correction derived from the TOPEX/POSEIDON data. A formal error covariance propagation scheme is used to derive the total error. Our global total error estimate for the TOPEX/POSEIDON topography relative to the geoid for one 10-day period is found tio be 11 cm RMS. When the error in the geoid is removed, thereby providing an estimate of the time dependent error, the uncertainty in the topography is 3.5 cm root mean square (RMS). This level of accuracy is consistent with direct comparisons of TOPEX/POSEIDON altimeter heights with tide gauge measurements at 28 stations. In addition, the error correlation length scales are derived globally in both east-west and north-south directions, which should prove useful for data assimilation. The largest error correlation length scales are found in the tropics. Errors in the velocity field are smallest in midlatitude regions. For both variables the largest errors caused by uncertainty in the geoid. More accurate representations of the geoid await a dedicated geopotential satellite mission. Substantial improvements in the accuracy of ocean tide models are expected in the very near future from research with TOPEX/POSEIDON data.
NASA Astrophysics Data System (ADS)
Nanjo, K. Z.; Sakai, S.; Kato, A.; Tsuruoka, H.; Hirata, N.
2013-05-01
Seismicity in southern Kanto activated with the 2011 March 11 Tohoku earthquake of magnitude M9.0, but does this cause a significant difference in the probability of more earthquakes at the present or in the To? future answer this question, we examine the effect of a change in the seismicity rate on the probability of earthquakes. Our data set is from the Japan Meteorological Agency earthquake catalogue, downloaded on 2012 May 30. Our approach is based on time-dependent earthquake probabilistic calculations, often used for aftershock hazard assessment, and are based on two statistical laws: the Gutenberg-Richter (GR) frequency-magnitude law and the Omori-Utsu (OU) aftershock-decay law. We first confirm that the seismicity following a quake of M4 or larger is well modelled by the GR law with b ˜ 1. Then, there is good agreement with the OU law with p ˜ 0.5, which indicates that the slow decay was notably significant. Based on these results, we then calculate the most probable estimates of future M6-7-class events for various periods, all with a starting date of 2012 May 30. The estimates are higher than pre-quake levels if we consider a period of 3-yr duration or shorter. However, for statistics-based forecasting such as this, errors that arise from parameter estimation must be considered. Taking into account the contribution of these errors to the probability calculations, we conclude that any increase in the probability of earthquakes is insignificant. Although we try to avoid overstating the change in probability, our observations combined with results from previous studies support the likelihood that afterslip (fault creep) in southern Kanto will slowly relax a stress step caused by the Tohoku earthquake. This afterslip in turn reminds us of the potential for stress redistribution to the surrounding regions. We note the importance of varying hazards not only in time but also in space to improve the probabilistic seismic hazard assessment for southern Kanto.
Wang, Jian; Shete, Sanjay
2011-11-01
We recently proposed a bias correction approach to evaluate accurate estimation of the odds ratio (OR) of genetic variants associated with a secondary phenotype, in which the secondary phenotype is associated with the primary disease, based on the original case-control data collected for the purpose of studying the primary disease. As reported in this communication, we further investigated the type I error probabilities and powers of the proposed approach, and compared the results to those obtained from logistic regression analysis (with or without adjustment for the primary disease status). We performed a simulation study based on a frequency-matching case-control study with respect to the secondary phenotype of interest. We examined the empirical distribution of the natural logarithm of the corrected OR obtained from the bias correction approach and found it to be normally distributed under the null hypothesis. On the basis of the simulation study results, we found that the logistic regression approaches that adjust or do not adjust for the primary disease status had low power for detecting secondary phenotype associated variants and highly inflated type I error probabilities, whereas our approach was more powerful for identifying the SNP-secondary phenotype associations and had better-controlled type I error probabilities. © 2011 Wiley Periodicals, Inc.
Feed-forward frequency offset estimation for 32-QAM optical coherent detection.
Xiao, Fei; Lu, Jianing; Fu, Songnian; Xie, Chenhui; Tang, Ming; Tian, Jinwen; Liu, Deming
2017-04-17
Due to the non-rectangular distribution of the constellation points, traditional fast Fourier transform based frequency offset estimation (FFT-FOE) is no longer suitable for 32-QAM signal. Here, we report a modified FFT-FOE technique by selecting and digitally amplifying the inner QPSK ring of 32-QAM after the adaptive equalization, which is defined as QPSK-selection assisted FFT-FOE. Simulation results show that no FOE error occurs with a FFT size of only 512 symbols, when the signal-to-noise ratio (SNR) is above 17.5 dB using our proposed FOE technique. However, the error probability of traditional FFT-FOE scheme for 32-QAM is always intolerant. Finally, our proposed FOE scheme functions well for 10 Gbaud dual polarization (DP)-32-QAM signal to reach 20% forward error correction (FEC) threshold of BER=2×10-2, under the scenario of back-to-back (B2B) transmission.
Nichols, James D.; Hines, James E.; Pollock, Kenneth H.; Hinz, Robert L.; Link, William A.
1994-01-01
The proportion of animals in a population that breeds is an important determinant of population growth rate. Usual estimates of this quantity from field sampling data assume that the probability of appearing in the capture or count statistic is the same for animals that do and do not breed. A similar assumption is required by most existing methods used to test ecologically interesting hypotheses about reproductive costs using field sampling data. However, in many field sampling situations breeding and nonbreeding animals are likely to exhibit different probabilities of being seen or caught. In this paper, we propose the use of multistate capture-recapture models for these estimation and testing problems. This methodology permits a formal test of the hypothesis of equal capture/sighting probabilities for breeding and nonbreeding individuals. Two estimators of breeding proportion (and associated standard errors) are presented, one for the case of equal capture probabilities and one for the case of unequal capture probabilities. The multistate modeling framework also yields formal tests of hypotheses about reproductive costs to future reproduction or survival or both fitness components. The general methodology is illustrated using capture-recapture data on female meadow voles, Microtus pennsylvanicus. Resulting estimates of the proportion of reproductively active females showed strong seasonal variation, as expected, with low breeding proportions in midwinter. We found no evidence of reproductive costs extracted in subsequent survival or reproduction. We believe that this methodological framework has wide application to problems in animal ecology concerning breeding proportions and phenotypic reproductive costs.
Recurrent vulvovaginal candidiasis.
Blostein, Freida; Levin-Sparenberg, Elizabeth; Wagner, Julian; Foxman, Betsy
2017-09-01
Recurrent vulvovaginal candidiasis (RVVC), multiple episodes of vulvovaginal candidiasis (VVC; vaginal yeast infection) within a 12-month period, adversely affects quality of life, mental health, and sexual activity. Diagnosis is not straightforward, as VVC is defined by the combination of often nonspecific vaginal symptoms and the presence of yeast-which is a common vaginal commensal. Estimating the incidence and prevalence is challenging: most VVC is diagnosed and treated empirically, the availability for purchase of effective therapies over the counter enables self-diagnosis and treatment, and the duration of the relatively benign VVC symptoms is short, introducing errors into any estimates relying on medical records or patient recall. We evaluate current estimates of VVC and RVVC and provide new prevalence estimates using data from a 2011 seven-country (n = 7345) internet panel survey on VVC conducted by Ipsos Health (https://www.ipsos.com/en). We also evaluate information on VVC-associated visits using the National Ambulatory Medical Care Survey. The estimated probability of VVC by age 50 varied widely by country (from 23% to 49%, mean 39%), as did the estimated probability of RVVC after VVC (from 14% to 28%, mean 23%). However estimated, the probability of RVVC was high suggesting RVVC is a common condition. Copyright © 2017 Elsevier Inc. All rights reserved.
Wiens, J. David; Kolar, Patrick S.; Fuller, Mark R.; Hunt, W. Grainger; Hunt, Teresa
2015-01-01
We used a multistate occupancy sampling design to estimate occupancy, breeding success, and abundance of territorial pairs of golden eagles (Aquila chrysaetos) in the Diablo Range, California, in 2014. This method uses the spatial pattern of detections and non-detections over repeated visits to survey sites to estimate probabilities of occupancy and successful reproduction while accounting for imperfect detection of golden eagles and their young during surveys. The estimated probability of detecting territorial pairs of golden eagles and their young was less than 1 and varied with time of the breeding season, as did the probability of correctly classifying a pair’s breeding status. Imperfect detection and breeding classification led to a sizeable difference between the uncorrected, naïve estimate of the proportion of occupied sites where successful reproduction was observed (0.20) and the model-based estimate (0.30). The analysis further indicated a relatively high overall probability of landscape occupancy by pairs of golden eagles (0.67, standard error = 0.06), but that areas with the greatest occupancy and reproductive potential were patchily distributed. We documented a total of 138 territorial pairs of golden eagles during surveys completed in the 2014 breeding season, which represented about one-half of the 280 pairs we estimated to occur in the broader 5,169-square kilometer region sampled. The study results emphasize the importance of accounting for imperfect detection and spatial heterogeneity in studies of site occupancy, breeding success, and abundance of golden eagles.
Error reduction in EMG signal decomposition
Kline, Joshua C.
2014-01-01
Decomposition of the electromyographic (EMG) signal into constituent action potentials and the identification of individual firing instances of each motor unit in the presence of ambient noise are inherently probabilistic processes, whether performed manually or with automated algorithms. Consequently, they are subject to errors. We set out to classify and reduce these errors by analyzing 1,061 motor-unit action-potential trains (MUAPTs), obtained by decomposing surface EMG (sEMG) signals recorded during human voluntary contractions. Decomposition errors were classified into two general categories: location errors representing variability in the temporal localization of each motor-unit firing instance and identification errors consisting of falsely detected or missed firing instances. To mitigate these errors, we developed an error-reduction algorithm that combines multiple decomposition estimates to determine a more probable estimate of motor-unit firing instances with fewer errors. The performance of the algorithm is governed by a trade-off between the yield of MUAPTs obtained above a given accuracy level and the time required to perform the decomposition. When applied to a set of sEMG signals synthesized from real MUAPTs, the identification error was reduced by an average of 1.78%, improving the accuracy to 97.0%, and the location error was reduced by an average of 1.66 ms. The error-reduction algorithm in this study is not limited to any specific decomposition strategy. Rather, we propose it be used for other decomposition methods, especially when analyzing precise motor-unit firing instances, as occurs when measuring synchronization. PMID:25210159
The maximum entropy method of moments and Bayesian probability theory
NASA Astrophysics Data System (ADS)
Bretthorst, G. Larry
2013-08-01
The problem of density estimation occurs in many disciplines. For example, in MRI it is often necessary to classify the types of tissues in an image. To perform this classification one must first identify the characteristics of the tissues to be classified. These characteristics might be the intensity of a T1 weighted image and in MRI many other types of characteristic weightings (classifiers) may be generated. In a given tissue type there is no single intensity that characterizes the tissue, rather there is a distribution of intensities. Often this distributions can be characterized by a Gaussian, but just as often it is much more complicated. Either way, estimating the distribution of intensities is an inference problem. In the case of a Gaussian distribution, one must estimate the mean and standard deviation. However, in the Non-Gaussian case the shape of the density function itself must be inferred. Three common techniques for estimating density functions are binned histograms [1, 2], kernel density estimation [3, 4], and the maximum entropy method of moments [5, 6]. In the introduction, the maximum entropy method of moments will be reviewed. Some of its problems and conditions under which it fails will be discussed. Then in later sections, the functional form of the maximum entropy method of moments probability distribution will be incorporated into Bayesian probability theory. It will be shown that Bayesian probability theory solves all of the problems with the maximum entropy method of moments. One gets posterior probabilities for the Lagrange multipliers, and, finally, one can put error bars on the resulting estimated density function.
NASA Astrophysics Data System (ADS)
Ha, Taesung
A probabilistic risk assessment (PRA) was conducted for a loss of coolant accident, (LOCA) in the McMaster Nuclear Reactor (MNR). A level 1 PRA was completed including event sequence modeling, system modeling, and quantification. To support the quantification of the accident sequence identified, data analysis using the Bayesian method and human reliability analysis (HRA) using the accident sequence evaluation procedure (ASEP) approach were performed. Since human performance in research reactors is significantly different from that in power reactors, a time-oriented HRA model (reliability physics model) was applied for the human error probability (HEP) estimation of the core relocation. This model is based on two competing random variables: phenomenological time and performance time. The response surface and direct Monte Carlo simulation with Latin Hypercube sampling were applied for estimating the phenomenological time, whereas the performance time was obtained from interviews with operators. An appropriate probability distribution for the phenomenological time was assigned by statistical goodness-of-fit tests. The human error probability (HEP) for the core relocation was estimated from these two competing quantities: phenomenological time and operators' performance time. The sensitivity of each probability distribution in human reliability estimation was investigated. In order to quantify the uncertainty in the predicted HEPs, a Bayesian approach was selected due to its capability of incorporating uncertainties in model itself and the parameters in that model. The HEP from the current time-oriented model was compared with that from the ASEP approach. Both results were used to evaluate the sensitivity of alternative huinan reliability modeling for the manual core relocation in the LOCA risk model. This exercise demonstrated the applicability of a reliability physics model supplemented with a. Bayesian approach for modeling human reliability and its potential usefulness of quantifying model uncertainty as sensitivity analysis in the PRA model.
Uncertainty Forecasts Improve Weather-Related Decisions and Attenuate the Effects of Forecast Error
ERIC Educational Resources Information Center
Joslyn, Susan L.; LeClerc, Jared E.
2012-01-01
Although uncertainty is inherent in weather forecasts, explicit numeric uncertainty estimates are rarely included in public forecasts for fear that they will be misunderstood. Of particular concern are situations in which precautionary action is required at low probabilities, often the case with severe events. At present, a categorical weather…
Multifractal diffusion entropy analysis: Optimal bin width of probability histograms
NASA Astrophysics Data System (ADS)
Jizba, Petr; Korbel, Jan
2014-11-01
In the framework of Multifractal Diffusion Entropy Analysis we propose a method for choosing an optimal bin-width in histograms generated from underlying probability distributions of interest. The method presented uses techniques of Rényi’s entropy and the mean squared error analysis to discuss the conditions under which the error in the multifractal spectrum estimation is minimal. We illustrate the utility of our approach by focusing on a scaling behavior of financial time series. In particular, we analyze the S&P500 stock index as sampled at a daily rate in the time period 1950-2013. In order to demonstrate a strength of the method proposed we compare the multifractal δ-spectrum for various bin-widths and show the robustness of the method, especially for large values of q. For such values, other methods in use, e.g., those based on moment estimation, tend to fail for heavy-tailed data or data with long correlations. Connection between the δ-spectrum and Rényi’s q parameter is also discussed and elucidated on a simple example of multiscale time series.
Hard choices in assessing survival past dams — a comparison of single- and paired-release strategies
Zydlewski, Joseph D.; Stich, Daniel S.; Sigourney, Douglas B.
2017-01-01
Mark–recapture models are widely used to estimate survival of salmon smolts migrating past dams. Paired releases have been used to improve estimate accuracy by removing components of mortality not attributable to the dam. This method is accompanied by reduced precision because (i) sample size is reduced relative to a single, large release; and (ii) variance calculations inflate error. We modeled an idealized system with a single dam to assess trade-offs between accuracy and precision and compared methods using root mean squared error (RMSE). Simulations were run under predefined conditions (dam mortality, background mortality, detection probability, and sample size) to determine scenarios when the paired release was preferable to a single release. We demonstrate that a paired-release design provides a theoretical advantage over a single-release design only at large sample sizes and high probabilities of detection. At release numbers typical of many survival studies, paired release can result in overestimation of dam survival. Failures to meet model assumptions of a paired release may result in further overestimation of dam-related survival. Under most conditions, a single-release strategy was preferable.
Why We Should Not Be Indifferent to Specification Choices for Difference-in-Differences.
Ryan, Andrew M; Burgess, James F; Dimick, Justin B
2015-08-01
To evaluate the effects of specification choices on the accuracy of estimates in difference-in-differences (DID) models. Process-of-care quality data from Hospital Compare between 2003 and 2009. We performed a Monte Carlo simulation experiment to estimate the effect of an imaginary policy on quality. The experiment was performed for three different scenarios in which the probability of treatment was (1) unrelated to pre-intervention performance; (2) positively correlated with pre-intervention levels of performance; and (3) positively correlated with pre-intervention trends in performance. We estimated alternative DID models that varied with respect to the choice of data intervals, the comparison group, and the method of obtaining inference. We assessed estimator bias as the mean absolute deviation between estimated program effects and their true value. We evaluated the accuracy of inferences through statistical power and rates of false rejection of the null hypothesis. Performance of alternative specifications varied dramatically when the probability of treatment was correlated with pre-intervention levels or trends. In these cases, propensity score matching resulted in much more accurate point estimates. The use of permutation tests resulted in lower false rejection rates for the highly biased estimators, but the use of clustered standard errors resulted in slightly lower false rejection rates for the matching estimators. When treatment and comparison groups differed on pre-intervention levels or trends, our results supported specifications for DID models that include matching for more accurate point estimates and models using clustered standard errors or permutation tests for better inference. Based on our findings, we propose a checklist for DID analysis. © Health Research and Educational Trust.
Liu, Xian; Engel, Charles C
2012-12-20
Researchers often encounter longitudinal health data characterized with three or more ordinal or nominal categories. Random-effects multinomial logit models are generally applied to account for potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed for correctly predicting outcome probabilities. This study attempts to go beyond existing work by developing a retransformation method that derives longitudinal growth trajectories of unbiased health probabilities. We estimated variances of the predicted probabilities by using the delta method. Additionally, we transformed the covariates' regression coefficients on the multinomial logit function, not substantively meaningful, to the conditional effects on the predicted probabilities. The empirical illustration uses the longitudinal data from the Asset and Health Dynamics among the Oldest Old. Our analysis compared three sets of the predicted probabilities of three health states at six time points, obtained from, respectively, the retransformation method, the best linear unbiased prediction, and the fixed-effects approach. The results demonstrate that neglect of retransforming random errors in the random-effects multinomial logit model results in severely biased longitudinal trajectories of health probabilities as well as overestimated effects of covariates on the probabilities. Copyright © 2012 John Wiley & Sons, Ltd.
Olson, Scott A.; with a section by Veilleux, Andrea G.
2014-01-01
This report provides estimates of flood discharges at selected annual exceedance probabilities (AEPs) for streamgages in and adjacent to Vermont and equations for estimating flood discharges at AEPs of 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent (recurrence intervals of 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-years, respectively) for ungaged, unregulated, rural streams in Vermont. The equations were developed using generalized least-squares regression. Flood-frequency and drainage-basin characteristics from 145 streamgages were used in developing the equations. The drainage-basin characteristics used as explanatory variables in the regression equations include drainage area, percentage of wetland area, and the basin-wide mean of the average annual precipitation. The average standard errors of prediction for estimating the flood discharges at the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent AEP with these equations are 34.9, 36.0, 38.7, 42.4, 44.9, 47.3, 50.7, and 55.1 percent, respectively. Flood discharges at selected AEPs for streamgages were computed by using the Expected Moments Algorithm. To improve estimates of the flood discharges for given exceedance probabilities at streamgages in Vermont, a new generalized skew coefficient was developed. The new generalized skew for the region is a constant, 0.44. The mean square error of the generalized skew coefficient is 0.078. This report describes a technique for using results from the regression equations to adjust an AEP discharge computed from a streamgage record. This report also describes a technique for using a drainage-area adjustment to estimate flood discharge at a selected AEP for an ungaged site upstream or downstream from a streamgage. The final regression equations and the flood-discharge frequency data used in this study will be available in StreamStats. StreamStats is a World Wide Web application providing automated regression-equation solutions for user-selected sites on streams.
Multiclass Bayes error estimation by a feature space sampling technique
NASA Technical Reports Server (NTRS)
Mobasseri, B. G.; Mcgillem, C. D.
1979-01-01
A general Gaussian M-class N-feature classification problem is defined. An algorithm is developed that requires the class statistics as its only input and computes the minimum probability of error through use of a combined analytical and numerical integration over a sequence simplifying transformations of the feature space. The results are compared with those obtained by conventional techniques applied to a 2-class 4-feature discrimination problem with results previously reported and 4-class 4-feature multispectral scanner Landsat data classified by training and testing of the available data.
Change-in-ratio estimators for populations with more than two subclasses
Udevitz, Mark S.; Pollock, Kenneth H.
1991-01-01
Change-in-ratio methods have been developed to estimate the size of populations with two or three population subclasses. Most of these methods require the often unreasonable assumption of equal sampling probabilities for individuals in all subclasses. This paper presents new models based on the weaker assumption that ratios of sampling probabilities are constant over time for populations with three or more subclasses. Estimation under these models requires that a value be assumed for one of these ratios when there are two samples. Explicit expressions are given for the maximum likelihood estimators under models for two samples with three or more subclasses and for three samples with two subclasses. A numerical method using readily available statistical software is described for obtaining the estimators and their standard errors under all of the models. Likelihood ratio tests that can be used in model selection are discussed. Emphasis is on the two-sample, three-subclass models for which Monte-Carlo simulation results and an illustrative example are presented.
Probability Distribution Extraction from TEC Estimates based on Kernel Density Estimation
NASA Astrophysics Data System (ADS)
Demir, Uygar; Toker, Cenk; Çenet, Duygu
2016-07-01
Statistical analysis of the ionosphere, specifically the Total Electron Content (TEC), may reveal important information about its temporal and spatial characteristics. One of the core metrics that express the statistical properties of a stochastic process is its Probability Density Function (pdf). Furthermore, statistical parameters such as mean, variance and kurtosis, which can be derived from the pdf, may provide information about the spatial uniformity or clustering of the electron content. For example, the variance differentiates between a quiet ionosphere and a disturbed one, whereas kurtosis differentiates between a geomagnetic storm and an earthquake. Therefore, valuable information about the state of the ionosphere (and the natural phenomena that cause the disturbance) can be obtained by looking at the statistical parameters. In the literature, there are publications which try to fit the histogram of TEC estimates to some well-known pdf.s such as Gaussian, Exponential, etc. However, constraining a histogram to fit to a function with a fixed shape will increase estimation error, and all the information extracted from such pdf will continue to contain this error. In such techniques, it is highly likely to observe some artificial characteristics in the estimated pdf which is not present in the original data. In the present study, we use the Kernel Density Estimation (KDE) technique to estimate the pdf of the TEC. KDE is a non-parametric approach which does not impose a specific form on the TEC. As a result, better pdf estimates that almost perfectly fit to the observed TEC values can be obtained as compared to the techniques mentioned above. KDE is particularly good at representing the tail probabilities, and outliers. We also calculate the mean, variance and kurtosis of the measured TEC values. The technique is applied to the ionosphere over Turkey where the TEC values are estimated from the GNSS measurement from the TNPGN-Active (Turkish National Permanent GNSS Network) network. This study is supported by by TUBITAK 115E915 and Joint TUBITAK 114E092 and AS CR14/001 projects.
Observation of sea-ice dynamics using synthetic aperture radar images: Automated analysis
NASA Technical Reports Server (NTRS)
Vesecky, John F.; Samadani, Ramin; Smith, Martha P.; Daida, Jason M.; Bracewell, Ronald N.
1988-01-01
The European Space Agency's ERS-1 satellite, as well as others planned to follow, is expected to carry synthetic-aperture radars (SARs) over the polar regions beginning in 1989. A key component in utilization of these SAR data is an automated scheme for extracting the sea-ice velocity field from a time sequence of SAR images of the same geographical region. Two techniques for automated sea-ice tracking, image pyramid area correlation (hierarchical correlation) and feature tracking, are described. Each technique is applied to a pair of Seasat SAR sea-ice images. The results compare well with each other and with manually tracked estimates of the ice velocity. The advantages and disadvantages of these automated methods are pointed out. Using these ice velocity field estimates it is possible to construct one sea-ice image from the other member of the pair. Comparing the reconstructed image with the observed image, errors in the estimated velocity field can be recognized and a useful probable error display created automatically to accompany ice velocity estimates. It is suggested that this error display may be useful in segmenting the sea ice observed into regions that move as rigid plates of significant ice velocity shear and distortion.
Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats.
Fungtammasan, Arkarachai; Tomaszkiewicz, Marta; Campos-Sánchez, Rebeca; Eckert, Kristin A; DeGiorgio, Michael; Makova, Kateryna D
2016-10-01
Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA-DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
NASA Astrophysics Data System (ADS)
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
Pidlisecky, Adam; Haines, S.S.
2011-01-01
Conventional processing methods for seismic cone penetrometer data present several shortcomings, most notably the absence of a robust velocity model uncertainty estimate. We propose a new seismic cone penetrometer testing (SCPT) data-processing approach that employs Bayesian methods to map measured data errors into quantitative estimates of model uncertainty. We first calculate travel-time differences for all permutations of seismic trace pairs. That is, we cross-correlate each trace at each measurement location with every trace at every other measurement location to determine travel-time differences that are not biased by the choice of any particular reference trace and to thoroughly characterize data error. We calculate a forward operator that accounts for the different ray paths for each measurement location, including refraction at layer boundaries. We then use a Bayesian inversion scheme to obtain the most likely slowness (the reciprocal of velocity) and a distribution of probable slowness values for each model layer. The result is a velocity model that is based on correct ray paths, with uncertainty bounds that are based on the data error. ?? NRC Research Press 2011.
Array coding for large data memories
NASA Technical Reports Server (NTRS)
Tranter, W. H.
1982-01-01
It is pointed out that an array code is a convenient method for storing large quantities of data. In a typical application, the array consists of N data words having M symbols in each word. The probability of undetected error is considered, taking into account three symbol error probabilities which are of interest, and a formula for determining the probability of undetected error. Attention is given to the possibility of reading data into the array using a digital communication system with symbol error probability p. Two different schemes are found to be of interest. The conducted analysis of array coding shows that the probability of undetected error is very small even for relatively large arrays.
Stauffer, Glenn E.; Rotella, Jay J.; Garrott, Robert A.; Kendall, William L.
2014-01-01
In colonial-breeding species, prebreeders often emigrate temporarily from natal reproductive colonies then subsequently return for one or more years before producing young. Variation in attendance–nonattendance patterns can have implications for subsequent recruitment. We used open robust-design multistate models and 28 years of encounter data for prebreeding female Weddell seals (Leptonychotes weddellii [Lesson]) to evaluate hypotheses about (1) the relationships of temporary emigration (TE) probabilities to environmental and population size covariates and (2) motivations for attendance and consequences of nonattendance for subsequent probability of recruitment to the breeding population. TE probabilities were density dependent (βˆBPOP = 0.66, = 0.17; estimated effects [β] and standard errors of population size in the previous year) and increased when the fast-ice edge was distant from the breeding colonies (βˆDIST = 0.75, = 0.04; estimated effects and standard errors of distance to the sea-ice edge in the current year on TE probability in the current year) and were strongly age and state dependent. These results suggest that trade-offs between potential benefits and costs of colony attendance vary annually and might influence motivation to attend colonies. Recruitment probabilities were greatest for seals that consistently attended colonies in two or more years (e.g., = 0.56, SD = 0.17) and lowest for seals that never or inconsistently attended prior to recruitment (e.g., = 0.32, SD = 0.15), where denotes the mean recruitment probability (over all years) for 10-year-old seals for the specified prebreeder state. In colonial-breeding seabirds, repeated colony attendance increases subsequent probability of recruitment to the adult breeding population; our results suggest similar implications for a marine mammal and are consistent with the hypothesis that prebreeders were motivated to attend reproductive colonies to gain reproductive skills or perhaps to optimally synchronize estrus through close association with mature breeding females.
NASA Astrophysics Data System (ADS)
Dettmer, Jan; Molnar, Sheri; Steininger, Gavin; Dosso, Stan E.; Cassidy, John F.
2012-02-01
This paper applies a general trans-dimensional Bayesian inference methodology and hierarchical autoregressive data-error models to the inversion of microtremor array dispersion data for shear wave velocity (vs) structure. This approach accounts for the limited knowledge of the optimal earth model parametrization (e.g. the number of layers in the vs profile) and of the data-error statistics in the resulting vs parameter uncertainty estimates. The assumed earth model parametrization influences estimates of parameter values and uncertainties due to different parametrizations leading to different ranges of data predictions. The support of the data for a particular model is often non-unique and several parametrizations may be supported. A trans-dimensional formulation accounts for this non-uniqueness by including a model-indexing parameter as an unknown so that groups of models (identified by the indexing parameter) are considered in the results. The earth model is parametrized in terms of a partition model with interfaces given over a depth-range of interest. In this work, the number of interfaces (layers) in the partition model represents the trans-dimensional model indexing. In addition, serial data-error correlations are addressed by augmenting the geophysical forward model with a hierarchical autoregressive error model that can account for a wide range of error processes with a small number of parameters. Hence, the limited knowledge about the true statistical distribution of data errors is also accounted for in the earth model parameter estimates, resulting in more realistic uncertainties and parameter values. Hierarchical autoregressive error models do not rely on point estimates of the model vector to estimate data-error statistics, and have no requirement for computing the inverse or determinant of a data-error covariance matrix. This approach is particularly useful for trans-dimensional inverse problems, as point estimates may not be representative of the state space that spans multiple subspaces of different dimensionalities. The order of the autoregressive process required to fit the data is determined here by posterior residual-sample examination and statistical tests. Inference for earth model parameters is carried out on the trans-dimensional posterior probability distribution by considering ensembles of parameter vectors. In particular, vs uncertainty estimates are obtained by marginalizing the trans-dimensional posterior distribution in terms of vs-profile marginal distributions. The methodology is applied to microtremor array dispersion data collected at two sites with significantly different geology in British Columbia, Canada. At both sites, results show excellent agreement with estimates from invasive measurements.
Armstrong, Bonnie; Spaniol, Julia; Persaud, Nav
2018-02-13
Clinicians often overestimate the probability of a disease given a positive test result (positive predictive value; PPV) and the probability of no disease given a negative test result (negative predictive value; NPV). The purpose of this study was to investigate whether experiencing simulated patient cases (ie, an 'experience format') would promote more accurate PPV and NPV estimates compared with a numerical format. Participants were presented with information about three diagnostic tests for the same fictitious disease and were asked to estimate the PPV and NPV of each test. Tests varied with respect to sensitivity and specificity. Information about each test was presented once in the numerical format and once in the experience format. The study used a 2 (format: numerical vs experience) × 3 (diagnostic test: gold standard vs low sensitivity vs low specificity) within-subjects design. The study was completed online, via Qualtrics (Provo, Utah, USA). 50 physicians (12 clinicians and 38 residents) from the Department of Family and Community Medicine at St Michael's Hospital in Toronto, Canada, completed the study. All participants had completed at least 1 year of residency. Estimation accuracy was quantified by the mean absolute error (MAE; absolute difference between estimate and true predictive value). PPV estimation errors were larger in the numerical format (MAE=32.6%, 95% CI 26.8% to 38.4%) compared with the experience format (MAE=15.9%, 95% CI 11.8% to 20.0%, d =0.697, P<0.001). Likewise, NPV estimation errors were larger in the numerical format (MAE=24.4%, 95% CI 14.5% to 34.3%) than in the experience format (MAE=11.0%, 95% CI 6.5% to 15.5%, d =0.303, P=0.015). Exposure to simulated patient cases promotes accurate estimation of predictive values in clinicians. This finding carries implications for diagnostic training and practice. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Minimax Quantum Tomography: Estimators and Relative Entropy Bounds.
Ferrie, Christopher; Blume-Kohout, Robin
2016-03-04
A minimax estimator has the minimum possible error ("risk") in the worst case. We construct the first minimax estimators for quantum state tomography with relative entropy risk. The minimax risk of nonadaptive tomography scales as O(1/sqrt[N])-in contrast to that of classical probability estimation, which is O(1/N)-where N is the number of copies of the quantum state used. We trace this deficiency to sampling mismatch: future observations that determine risk may come from a different sample space than the past data that determine the estimate. This makes minimax estimators very biased, and we propose a computationally tractable alternative with similar behavior in the worst case, but superior accuracy on most states.
Maximizing the Detection Probability of Kilonovae Associated with Gravitational Wave Observations
NASA Astrophysics Data System (ADS)
Chan, Man Leong; Hu, Yi-Ming; Messenger, Chris; Hendry, Martin; Heng, Ik Siong
2017-01-01
Estimates of the source sky location for gravitational wave signals are likely to span areas of up to hundreds of square degrees or more, making it very challenging for most telescopes to search for counterpart signals in the electromagnetic spectrum. To boost the chance of successfully observing such counterparts, we have developed an algorithm that optimizes the number of observing fields and their corresponding time allocations by maximizing the detection probability. As a proof-of-concept demonstration, we optimize follow-up observations targeting kilonovae using telescopes including the CTIO-Dark Energy Camera, Subaru-HyperSuprimeCam, Pan-STARRS, and the Palomar Transient Factory. We consider three simulated gravitational wave events with 90% credible error regions spanning areas from ∼ 30 {\\deg }2 to ∼ 300 {\\deg }2. Assuming a source at 200 {Mpc}, we demonstrate that to obtain a maximum detection probability, there is an optimized number of fields for any particular event that a telescope should observe. To inform future telescope design studies, we present the maximum detection probability and corresponding number of observing fields for a combination of limiting magnitudes and fields of view over a range of parameters. We show that for large gravitational wave error regions, telescope sensitivity rather than field of view is the dominating factor in maximizing the detection probability.
Corrected score estimation in the proportional hazards model with misclassified discrete covariates
Zucker, David M.; Spiegelman, Donna
2013-01-01
SUMMARY We consider Cox proportional hazards regression when the covariate vector includes error-prone discrete covariates along with error-free covariates, which may be discrete or continuous. The misclassification in the discrete error-prone covariates is allowed to be of any specified form. Building on the work of Nakamura and his colleagues, we present a corrected score method for this setting. The method can handle all three major study designs (internal validation design, external validation design, and replicate measures design), both functional and structural error models, and time-dependent covariates satisfying a certain ‘localized error’ condition. We derive the asymptotic properties of the method and indicate how to adjust the covariance matrix of the regression coefficient estimates to account for estimation of the misclassification matrix. We present the results of a finite-sample simulation study under Weibull survival with a single binary covariate having known misclassification rates. The performance of the method described here was similar to that of related methods we have examined in previous works. Specifically, our new estimator performed as well as or, in a few cases, better than the full Weibull maximum likelihood estimator. We also present simulation results for our method for the case where the misclassification probabilities are estimated from an external replicate measures study. Our method generally performed well in these simulations. The new estimator has a broader range of applicability than many other estimators proposed in the literature, including those described in our own earlier work, in that it can handle time-dependent covariates with an arbitrary misclassification structure. We illustrate the method on data from a study of the relationship between dietary calcium intake and distal colon cancer. PMID:18219700
Palmer, Katherine A; Shane, Rita; Wu, Cindy N; Bell, Douglas S; Diaz, Frank; Cook-Wiens, Galen; Jackevicius, Cynthia A
2016-01-01
Objective We sought to assess the potential of a widely available source of electronic medication data to prevent medication history errors and resultant inpatient order errors. Methods We used admission medication history (AMH) data from a recent clinical trial that identified 1017 AMH errors and 419 resultant inpatient order errors among 194 hospital admissions of predominantly older adult patients on complex medication regimens. Among the subset of patients for whom we could access current Surescripts electronic pharmacy claims data (SEPCD), two pharmacists independently assessed error severity and our main outcome, which was whether SEPCD (1) was unrelated to the medication error; (2) probably would not have prevented the error; (3) might have prevented the error; or (4) probably would have prevented the error. Results Seventy patients had both AMH errors and current, accessible SEPCD. SEPCD probably would have prevented 110 (35%) of 315 AMH errors and 46 (31%) of 147 resultant inpatient order errors. When we excluded the least severe medication errors, SEPCD probably would have prevented 99 (47%) of 209 AMH errors and 37 (61%) of 61 resultant inpatient order errors. SEPCD probably would have prevented at least one AMH error in 42 (60%) of 70 patients. Conclusion When current SEPCD was available for older adult patients on complex medication regimens, it had substantial potential to prevent AMH errors and resultant inpatient order errors, with greater potential to prevent more severe errors. Further study is needed to measure the benefit of SEPCD in actual use at hospital admission. PMID:26911817
A Generally Robust Approach for Testing Hypotheses and Setting Confidence Intervals for Effect Sizes
ERIC Educational Resources Information Center
Keselman, H. J.; Algina, James; Lix, Lisa M.; Wilcox, Rand R.; Deering, Kathleen N.
2008-01-01
Standard least squares analysis of variance methods suffer from poor power under arbitrarily small departures from normality and fail to control the probability of a Type I error when standard assumptions are violated. This article describes a framework for robust estimation and testing that uses trimmed means with an approximate degrees of…
ERIC Educational Resources Information Center
Wedell, Douglas H.; Moro, Rodrigo
2008-01-01
Two experiments used within-subject designs to examine how conjunction errors depend on the use of (1) choice versus estimation tasks, (2) probability versus frequency language, and (3) conjunctions of two likely events versus conjunctions of likely and unlikely events. All problems included a three-option format verified to minimize…
NASA Technical Reports Server (NTRS)
Soneira, R. M.; Bahcall, J. N.
1981-01-01
Probabilities are calculated for acquiring suitable guide stars (GS) with the fine guidance system (FGS) of the space telescope. A number of the considerations and techniques described are also relevant for other space astronomy missions. The constraints of the FGS are reviewed. The available data on bright star densities are summarized and a previous error in the literature is corrected. Separate analytic and Monte Carlo calculations of the probabilities are described. A simulation of space telescope pointing is carried out using the Weistrop north galactic pole catalog of bright stars. Sufficient information is presented so that the probabilities of acquisition can be estimated as a function of position in the sky. The probability of acquiring suitable guide stars is greatly increased if the FGS can allow an appreciable difference between the (bright) primary GS limiting magnitude and the (fainter) secondary GS limiting magnitude.
Voxel-based statistical analysis of uncertainties associated with deformable image registration
NASA Astrophysics Data System (ADS)
Li, Shunshan; Glide-Hurst, Carri; Lu, Mei; Kim, Jinkoo; Wen, Ning; Adams, Jeffrey N.; Gordon, James; Chetty, Indrin J.; Zhong, Hualiang
2013-09-01
Deformable image registration (DIR) algorithms have inherent uncertainties in their displacement vector fields (DVFs).The purpose of this study is to develop an optimal metric to estimate DIR uncertainties. Six computational phantoms have been developed from the CT images of lung cancer patients using a finite element method (FEM). The FEM generated DVFs were used as a standard for registrations performed on each of these phantoms. A mechanics-based metric, unbalanced energy (UE), was developed to evaluate these registration DVFs. The potential correlation between UE and DIR errors was explored using multivariate analysis, and the results were validated by landmark approach and compared with two other error metrics: DVF inverse consistency (IC) and image intensity difference (ID). Landmark-based validation was performed using the POPI-model. The results show that the Pearson correlation coefficient between UE and DIR error is rUE-error = 0.50. This is higher than rIC-error = 0.29 for IC and DIR error and rID-error = 0.37 for ID and DIR error. The Pearson correlation coefficient between UE and the product of the DIR displacements and errors is rUE-error × DVF = 0.62 for the six patients and rUE-error × DVF = 0.73 for the POPI-model data. It has been demonstrated that UE has a strong correlation with DIR errors, and the UE metric outperforms the IC and ID metrics in estimating DIR uncertainties. The quantified UE metric can be a useful tool for adaptive treatment strategies, including probability-based adaptive treatment planning.
Associations between errors and contributing factors in aircraft maintenance
NASA Technical Reports Server (NTRS)
Hobbs, Alan; Williamson, Ann
2003-01-01
In recent years cognitive error models have provided insights into the unsafe acts that lead to many accidents in safety-critical environments. Most models of accident causation are based on the notion that human errors occur in the context of contributing factors. However, there is a lack of published information on possible links between specific errors and contributing factors. A total of 619 safety occurrences involving aircraft maintenance were reported using a self-completed questionnaire. Of these occurrences, 96% were related to the actions of maintenance personnel. The types of errors that were involved, and the contributing factors associated with those actions, were determined. Each type of error was associated with a particular set of contributing factors and with specific occurrence outcomes. Among the associations were links between memory lapses and fatigue and between rule violations and time pressure. Potential applications of this research include assisting with the design of accident prevention strategies, the estimation of human error probabilities, and the monitoring of organizational safety performance.
Snakebite mortality in the world
Swaroop, S.; Grab, B.
1954-01-01
In examining the relative importance of snakebite mortality in different parts of the world, the authors review the information collected concerning both snakebite mortality and the species of snake incriminated. Available statistical data are known to be unreliable and at best can serve to provide only an approximate and highly conservative estimate of the relative magnitude of the snakebite problem. The sources of error inherent in the data are discussed, and estimates are made of the probable mortality from snakebite in various areas of the world. PMID:13150169
Landsat D Thematic Mapper image dimensionality reduction and geometric correction accuracy
NASA Technical Reports Server (NTRS)
Ford, G. E.
1986-01-01
To characterize and quantify the performance of the Landsat thematic mapper (TM), techniques for dimensionality reduction by linear transformation have been studied and evaluated and the accuracy of the correction of geometric errors in TM images analyzed. Theoretical evaluations and comparisons for existing methods for the design of linear transformation for dimensionality reduction are presented. These methods include the discrete Karhunen Loeve (KL) expansion, Multiple Discriminant Analysis (MDA), Thematic Mapper (TM)-Tasseled Cap Linear Transformation and Singular Value Decomposition (SVD). A unified approach to these design problems is presented in which each method involves optimizing an objective function with respect to the linear transformation matrix. From these studies, four modified methods are proposed. They are referred to as the Space Variant Linear Transformation, the KL Transform-MDA hybrid method, and the First and Second Version of the Weighted MDA method. The modifications involve the assignment of weights to classes to achieve improvements in the class conditional probability of error for classes with high weights. Experimental evaluations of the existing and proposed methods have been performed using the six reflective bands of the TM data. It is shown that in terms of probability of classification error and the percentage of the cumulative eigenvalues, the six reflective bands of the TM data require only a three dimensional feature space. It is shown experimentally as well that for the proposed methods, the classes with high weights have improvements in class conditional probability of error estimates as expected.
Zhang, Yongsheng; Wei, Heng; Zheng, Kangning
2017-01-01
Considering that metro network expansion brings us with more alternative routes, it is attractive to integrate the impacts of routes set and the interdependency among alternative routes on route choice probability into route choice modeling. Therefore, the formulation, estimation and application of a constrained multinomial probit (CMNP) route choice model in the metro network are carried out in this paper. The utility function is formulated as three components: the compensatory component is a function of influencing factors; the non-compensatory component measures the impacts of routes set on utility; following a multivariate normal distribution, the covariance of error component is structured into three parts, representing the correlation among routes, the transfer variance of route, and the unobserved variance respectively. Considering multidimensional integrals of the multivariate normal probability density function, the CMNP model is rewritten as Hierarchical Bayes formula and M-H sampling algorithm based Monte Carlo Markov Chain approach is constructed to estimate all parameters. Based on Guangzhou Metro data, reliable estimation results are gained. Furthermore, the proposed CMNP model also shows a good forecasting performance for the route choice probabilities calculation and a good application performance for transfer flow volume prediction. PMID:28591188
The Significance of the Record Length in Flood Frequency Analysis
NASA Astrophysics Data System (ADS)
Senarath, S. U.
2013-12-01
Of all of the potential natural hazards, flood is the most costly in many regions of the world. For example, floods cause over a third of Europe's average annual catastrophe losses and affect about two thirds of the people impacted by natural catastrophes. Increased attention is being paid to determining flow estimates associated with pre-specified return periods so that flood-prone areas can be adequately protected against floods of particular magnitudes or return periods. Flood frequency analysis, which is conducted by using an appropriate probability density function that fits the observed annual maximum flow data, is frequently used for obtaining these flow estimates. Consequently, flood frequency analysis plays an integral role in determining the flood risk in flood prone watersheds. A long annual maximum flow record is vital for obtaining accurate estimates of discharges associated with high return period flows. However, in many areas of the world, flood frequency analysis is conducted with limited flow data or short annual maximum flow records. These inevitably lead to flow estimates that are subject to error. This is especially the case with high return period flow estimates. In this study, several statistical techniques are used to identify errors caused by short annual maximum flow records. The flow estimates used in the error analysis are obtained by fitting a log-Pearson III distribution to the flood time-series. These errors can then be used to better evaluate the return period flows in data limited streams. The study findings, therefore, have important implications for hydrologists, water resources engineers and floodplain managers.
A new parametric method to smooth time-series data of metabolites in metabolic networks.
Miyawaki, Atsuko; Sriyudthsak, Kansuporn; Hirai, Masami Yokota; Shiraishi, Fumihide
2016-12-01
Mathematical modeling of large-scale metabolic networks usually requires smoothing of metabolite time-series data to account for measurement or biological errors. Accordingly, the accuracy of smoothing curves strongly affects the subsequent estimation of model parameters. Here, an efficient parametric method is proposed for smoothing metabolite time-series data, and its performance is evaluated. To simplify parameter estimation, the method uses S-system-type equations with simple power law-type efflux terms. Iterative calculation using this method was found to readily converge, because parameters are estimated stepwise. Importantly, smoothing curves are determined so that metabolite concentrations satisfy mass balances. Furthermore, the slopes of smoothing curves are useful in estimating parameters, because they are probably close to their true behaviors regardless of errors that may be present in the actual data. Finally, calculations for each differential equation were found to converge in much less than one second if initial parameters are set at appropriate (guessed) values. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Aldrich, R. C.; Dana, R. W.; Roberts, E. H. (Principal Investigator)
1977-01-01
The author has identified the following significant results. A stratified random sample using LANDSAT band 5 and 7 panchromatic prints resulted in estimates of water in counties with sampling errors less than + or - 9% (67% probability level). A forest inventory using a four band LANDSAT color composite resulted in estimates of forest area by counties that were within + or - 6.7% and + or - 3.7% respectively (67% probability level). Estimates of forest area for counties by computer assisted techniques were within + or - 21% of operational forest survey figures and for all counties the difference was only one percent. Correlations of airborne terrain reflectance measurements with LANDSAT radiance verified a linear atmospheric model with an additive (path radiance) term and multiplicative (transmittance) term. Coefficients of determination for 28 of the 32 modeling attempts, not adverseley affected by rain shower occurring between the times of LANDSAT passage and aircraft overflights, exceeded 0.83.
Kennedy, Jeffrey R.; Paretti, Nicholas V.; Veilleux, Andrea G.
2014-01-01
Regression equations, which allow predictions of n-day flood-duration flows for selected annual exceedance probabilities at ungaged sites, were developed using generalized least-squares regression and flood-duration flow frequency estimates at 56 streamgaging stations within a single, relatively uniform physiographic region in the central part of Arizona, between the Colorado Plateau and Basin and Range Province, called the Transition Zone. Drainage area explained most of the variation in the n-day flood-duration annual exceedance probabilities, but mean annual precipitation and mean elevation were also significant variables in the regression models. Standard error of prediction for the regression equations varies from 28 to 53 percent and generally decreases with increasing n-day duration. Outside the Transition Zone there are insufficient streamgaging stations to develop regression equations, but flood-duration flow frequency estimates are presented at select streamgaging stations.
Mechanical failure probability of glasses in Earth orbit
NASA Technical Reports Server (NTRS)
Kinser, Donald L.; Wiedlocher, David E.
1992-01-01
Results of five years of earth-orbital exposure on mechanical properties of glasses indicate that radiation effects on mechanical properties of glasses, for the glasses examined, are less than the probable error of measurement. During the 5 year exposure, seven micrometeorite or space debris impacts occurred on the samples examined. These impacts were located in locations which were not subjected to effective mechanical testing, hence limited information on their influence upon mechanical strength was obtained. Combination of these results with micrometeorite and space debris impact frequency obtained by other experiments permits estimates of the failure probability of glasses exposed to mechanical loading under earth-orbit conditions. This probabilistic failure prediction is described and illustrated with examples.
Hsueh, Ya-seng Arthur; Brando, Alex; Dunt, David; Anjou, Mitchell D; Boudville, Andrea; Taylor, Hugh
2013-12-01
To estimate the costs of the extra resources required to close the gap of vision between Indigenous and non-Indigenous Australians. Constructing comprehensive eye care pathways for Indigenous Australians with their related probabilities, to capture full eye care usage compared with current usage rate for cataract surgery, refractive error and diabetic retinopathy using the best available data. Urban and remote regions of Australia. The provision of eye care for cataract surgery, refractive error and diabetic retinopathy. Estimated cost needed for full access, estimated current spending and estimated extra cost required to close the gaps of cataract surgery, refractive error and diabetic retinopathy for Indigenous Australians. Total cost needed for full coverage of all three major eye conditions is $45.5 million per year in 2011 Australian dollars. Current annual spending is $17.4 million. Additional yearly cost required to close the gap of vision is $28 million. This includes extra-capped funds of $3 million from the Commonwealth Government and $2 million from the State and Territory Governments. Additional coordination costs per year are $13.3 million. Although available data are limited, this study has produced the first estimates that are indicative of the need for planning and provide equity in eye care. © 2013 The Authors. Australian Journal of Rural Health © National Rural Health Alliance Inc.
Comparison of estimators of standard deviation for hydrologic time series
Tasker, Gary D.; Gilroy, Edward J.
1982-01-01
Unbiasing factors as a function of serial correlation, ρ, and sample size, n for the sample standard deviation of a lag one autoregressive model were generated by random number simulation. Monte Carlo experiments were used to compare the performance of several alternative methods for estimating the standard deviation σ of a lag one autoregressive model in terms of bias, root mean square error, probability of underestimation, and expected opportunity design loss. Three methods provided estimates of σ which were much less biased but had greater mean square errors than the usual estimate of σ: s = (1/(n - 1) ∑ (xi −x¯)2)½. The three methods may be briefly characterized as (1) a method using a maximum likelihood estimate of the unbiasing factor, (2) a method using an empirical Bayes estimate of the unbiasing factor, and (3) a robust nonparametric estimate of σ suggested by Quenouille. Because s tends to underestimate σ, its use as an estimate of a model parameter results in a tendency to underdesign. If underdesign losses are considered more serious than overdesign losses, then the choice of one of the less biased methods may be wise.
Error decomposition and estimation of inherent optical properties.
Salama, Mhd Suhyb; Stein, Alfred
2009-09-10
We describe a methodology to quantify and separate the errors of inherent optical properties (IOPs) derived from ocean-color model inversion. Their total error is decomposed into three different sources, namely, model approximations and inversion, sensor noise, and atmospheric correction. Prior information on plausible ranges of observation, sensor noise, and inversion goodness-of-fit are employed to derive the posterior probability distribution of the IOPs. The relative contribution of each error component to the total error budget of the IOPs, all being of stochastic nature, is then quantified. The method is validated with the International Ocean Colour Coordinating Group (IOCCG) data set and the NASA bio-Optical Marine Algorithm Data set (NOMAD). The derived errors are close to the known values with correlation coefficients of 60-90% and 67-90% for IOCCG and NOMAD data sets, respectively. Model-induced errors inherent to the derived IOPs are between 10% and 57% of the total error, whereas atmospheric-induced errors are in general above 43% and up to 90% for both data sets. The proposed method is applied to synthesized and in situ measured populations of IOPs. The mean relative errors of the derived values are between 2% and 20%. A specific error table to the Medium Resolution Imaging Spectrometer (MERIS) sensor is constructed. It serves as a benchmark to evaluate the performance of the atmospheric correction method and to compute atmospheric-induced errors. Our method has a better performance and is more appropriate to estimate actual errors of ocean-color derived products than the previously suggested methods. Moreover, it is generic and can be applied to quantify the error of any derived biogeophysical parameter regardless of the used derivation.
Jung, R.E.; Royle, J. Andrew; Sauer, J.R.; Addison, C.; Rau, R.D.; Shirk, J.L.; Whissel, J.C.
2005-01-01
Stream salamanders in the family Plethodontidae constitute a large biomass in and near headwater streams in the eastern United States and are promising indicators of stream ecosystem health. Many studies of stream salamanders have relied on population indices based on counts rather than population estimates based on techniques such as capture-recapture and removal. Application of estimation procedures allows the calculation of detection probabilities (the proportion of total animals present that are detected during a survey) and their associated sampling error, and may be essential for determining salamander population sizes and trends. In 1999, we conducted capture-recapture and removal population estimation methods for Desmognathus salamanders at six streams in Shenandoah National Park, Virginia, USA. Removal sampling appeared more efficient and detection probabilities from removal data were higher than those from capture-recapture. During 2001-2004, we used removal estimation at eight streams in the park to assess the usefulness of this technique for long-term monitoring of stream salamanders. Removal detection probabilities ranged from 0.39 to 0.96 for Desmognathus, 0.27 to 0.89 for Eurycea and 0.27 to 0.75 for northern spring (Gyrinophilus porphyriticus) and northern red (Pseudotriton ruber) salamanders across stream transects. Detection probabilities did not differ across years for Desmognathus and Eurycea, but did differ among streams for Desmognathus. Population estimates of Desmognathus decreased between 2001-2002 and 2003-2004 which may be related to changes in stream flow conditions. Removal-based procedures may be a feasible approach for population estimation of salamanders, but field methods should be designed to meet the assumptions of the sampling procedures. New approaches to estimating stream salamander populations are discussed.
Investigation of scene identification algorithms for radiation budget measurements
NASA Technical Reports Server (NTRS)
Diekmann, F. J.
1986-01-01
The computation of Earth radiation budget from satellite measurements requires the identification of the scene in order to select spectral factors and bidirectional models. A scene identification procedure is developed for AVHRR SW and LW data by using two radiative transfer models. These AVHRR GAC pixels are then attached to corresponding ERBE pixels and the results are sorted into scene identification probability matrices. These scene intercomparisons show that there generally is a higher tendency for underestimation of cloudiness over ocean at high cloud amounts, e.g., mostly cloudy instead of overcast, partly cloudy instead of mostly cloudy, for the ERBE relative to the AVHRR results. Reasons for this are explained. Preliminary estimates of the errors of exitances due to scene misidentification demonstrates the high dependency on the probability matrices. While the longwave error can generally be neglected the shortwave deviations have reached maximum values of more than 12% of the respective exitances.
Error probability for RFID SAW tags with pulse position coding and peak-pulse detection.
Shmaliy, Yuriy S; Plessky, Victor; Cerda-Villafaña, Gustavo; Ibarra-Manzano, Oscar
2012-11-01
This paper addresses the code reading error probability (EP) in radio-frequency identification (RFID) SAW tags with pulse position coding (PPC) and peak-pulse detection. EP is found in a most general form, assuming M groups of codes with N slots each and allowing individual SNRs in each slot. The basic case of zero signal in all off-pulses and equal signals in all on-pulses is investigated in detail. We show that if a SAW-tag with PPC is designed such that the spurious responses are attenuated by more than 20 dB below on-pulses, then EP can be achieved at the level of 10(-8) (one false read per 108 readings) with SNR >17 dB for any reasonable M and N. The tag reader range is estimated as a function of the transmitted power and EP.
NASA Technical Reports Server (NTRS)
Van Buren, Dave
1986-01-01
Equivalent width data from Copernicus and IUE appear to have an exponential, rather than a Gaussian distribution of errors. This is probably because there is one dominant source of error: the assignment of the background continuum shape. The maximum likelihood method of parameter estimation is presented for the case of exponential statistics, in enough generality for application to many problems. The method is applied to global fitting of Si II, Fe II, and Mn II oscillator strengths and interstellar gas parameters along many lines of sight. The new values agree in general with previous determinations but are usually much more tightly constrained. Finally, it is shown that care must be taken in deriving acceptable regions of parameter space because the probability contours are not generally ellipses whose axes are parallel to the coordinate axes.
A goodness-of-fit test for capture-recapture model M(t) under closure
Stanley, T.R.; Burnham, K.P.
1999-01-01
A new, fully efficient goodness-of-fit test for the time-specific closed-population capture-recapture model M(t) is presented. This test is based on the residual distribution of the capture history data given the maximum likelihood parameter estimates under model M(t), is partitioned into informative components, and is based on chi-square statistics. Comparison of this test with Leslie's test (Leslie, 1958, Journal of Animal Ecology 27, 84- 86) for model M(t), using Monte Carlo simulations, shows the new test generally outperforms Leslie's test. The new test is frequently computable when Leslie's test is not, has Type I error rates that are closer to nominal error rates than Leslie's test, and is sensitive to behavioral variation and heterogeneity in capture probabilities. Leslie's test is not sensitive to behavioral variation in capture probabilities but, when computable, has greater power to detect heterogeneity than the new test.
Two proposed convergence criteria for Monte Carlo solutions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forster, R.A.; Pederson, S.P.; Booth, T.E.
1992-01-01
The central limit theorem (CLT) can be applied to a Monte Carlo solution if two requirements are satisfied: (1) The random variable has a finite mean and a finite variance; and (2) the number N of independent observations grows large. When these two conditions are satisfied, a confidence interval (CI) based on the normal distribution with a specified coverage probability can be formed. The first requirement is generally satisfied by the knowledge of the Monte Carlo tally being used. The Monte Carlo practitioner has a limited number of marginal methods to assess the fulfillment of the second requirement, such asmore » statistical error reduction proportional to 1/[radical]N with error magnitude guidelines. Two proposed methods are discussed in this paper to assist in deciding if N is large enough: estimating the relative variance of the variance (VOV) and examining the empirical history score probability density function (pdf).« less
Probable errors in width distributions of sea ice leads measured along a transect
NASA Technical Reports Server (NTRS)
Key, J.; Peckham, S.
1991-01-01
The degree of error expected in the measurement of widths of sea ice leads along a single transect are examined in a probabilistic sense under assumed orientation and width distributions, where both isotropic and anisotropic lead orientations are examined. Methods are developed for estimating the distribution of 'actual' widths (measured perpendicular to the local lead orientation) knowing the 'apparent' width distribution (measured along the transect), and vice versa. The distribution of errors, defined as the difference between the actual and apparent lead width, can be estimated from the two width distributions, and all moments of this distribution can be determined. The problem is illustrated with Landsat imagery and the procedure is applied to a submarine sonar transect. Results are determined for a range of geometries, and indicate the importance of orientation information if data sampled along a transect are to be used for the description of lead geometries. While the application here is to sea ice leads, the methodology can be applied to measurements of any linear feature.
Martire, Kristy A; Growns, Bethany; Navarro, Danielle J
2018-04-17
Forensic handwriting examiners currently testify to the origin of questioned handwriting for legal purposes. However, forensic scientists are increasingly being encouraged to assign probabilities to their observations in the form of a likelihood ratio. This study is the first to examine whether handwriting experts are able to estimate the frequency of US handwriting features more accurately than novices. The results indicate that the absolute error for experts was lower than novices, but the size of the effect is modest, and the overall error rate even for experts is large enough as to raise questions about whether their estimates can be sufficiently trustworthy for presentation in courts. When errors are separated into effects caused by miscalibration and those caused by imprecision, we find systematic differences between individuals. Finally, we consider several ways of aggregating predictions from multiple experts, suggesting that quite substantial improvements in expert predictions are possible when a suitable aggregation method is used.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, X; Gao, H; Schuemann, J
2015-06-15
Purpose: The Monte Carlo (MC) method is a gold standard for dose calculation in radiotherapy. However, it is not a priori clear how many particles need to be simulated to achieve a given dose accuracy. Prior error estimate and stopping criterion are not well established for MC. This work aims to fill this gap. Methods: Due to the statistical nature of MC, our approach is based on one-sample t-test. We design the prior error estimate method based on the t-test, and then use this t-test based error estimate for developing a simulation stopping criterion. The three major components are asmore » follows.First, the source particles are randomized in energy, space and angle, so that the dose deposition from a particle to the voxel is independent and identically distributed (i.i.d.).Second, a sample under consideration in the t-test is the mean value of dose deposition to the voxel by sufficiently large number of source particles. Then according to central limit theorem, the sample as the mean value of i.i.d. variables is normally distributed with the expectation equal to the true deposited dose.Third, the t-test is performed with the null hypothesis that the difference between sample expectation (the same as true deposited dose) and on-the-fly calculated mean sample dose from MC is larger than a given error threshold, in addition to which users have the freedom to specify confidence probability and region of interest in the t-test based stopping criterion. Results: The method is validated for proton dose calculation. The difference between the MC Result based on the t-test prior error estimate and the statistical Result by repeating numerous MC simulations is within 1%. Conclusion: The t-test based prior error estimate and stopping criterion are developed for MC and validated for proton dose calculation. Xiang Hong and Hao Gao were partially supported by the NSFC (#11405105), the 973 Program (#2015CB856000) and the Shanghai Pujiang Talent Program (#14PJ1404500)« less
Temporal scaling in information propagation.
Huang, Junming; Li, Chao; Wang, Wen-Qiang; Shen, Hua-Wei; Li, Guojie; Cheng, Xue-Qi
2014-06-18
For the study of information propagation, one fundamental problem is uncovering universal laws governing the dynamics of information propagation. This problem, from the microscopic perspective, is formulated as estimating the propagation probability that a piece of information propagates from one individual to another. Such a propagation probability generally depends on two major classes of factors: the intrinsic attractiveness of information and the interactions between individuals. Despite the fact that the temporal effect of attractiveness is widely studied, temporal laws underlying individual interactions remain unclear, causing inaccurate prediction of information propagation on evolving social networks. In this report, we empirically study the dynamics of information propagation, using the dataset from a population-scale social media website. We discover a temporal scaling in information propagation: the probability a message propagates between two individuals decays with the length of time latency since their latest interaction, obeying a power-law rule. Leveraging the scaling law, we further propose a temporal model to estimate future propagation probabilities between individuals, reducing the error rate of information propagation prediction from 6.7% to 2.6% and improving viral marketing with 9.7% incremental customers.
Temporal scaling in information propagation
NASA Astrophysics Data System (ADS)
Huang, Junming; Li, Chao; Wang, Wen-Qiang; Shen, Hua-Wei; Li, Guojie; Cheng, Xue-Qi
2014-06-01
For the study of information propagation, one fundamental problem is uncovering universal laws governing the dynamics of information propagation. This problem, from the microscopic perspective, is formulated as estimating the propagation probability that a piece of information propagates from one individual to another. Such a propagation probability generally depends on two major classes of factors: the intrinsic attractiveness of information and the interactions between individuals. Despite the fact that the temporal effect of attractiveness is widely studied, temporal laws underlying individual interactions remain unclear, causing inaccurate prediction of information propagation on evolving social networks. In this report, we empirically study the dynamics of information propagation, using the dataset from a population-scale social media website. We discover a temporal scaling in information propagation: the probability a message propagates between two individuals decays with the length of time latency since their latest interaction, obeying a power-law rule. Leveraging the scaling law, we further propose a temporal model to estimate future propagation probabilities between individuals, reducing the error rate of information propagation prediction from 6.7% to 2.6% and improving viral marketing with 9.7% incremental customers.
Low-flow characteristics of Virginia streams
Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute
2011-01-01
Low-flow annual non-exceedance probabilities (ANEP), called probability-percent chance (P-percent chance) flow estimates, regional regression equations, and transfer methods are provided describing the low-flow characteristics of Virginia streams. Statistical methods are used to evaluate streamflow data. Analysis of Virginia streamflow data collected from 1895 through 2007 is summarized. Methods are provided for estimating low-flow characteristics of gaged and ungaged streams. The 1-, 4-, 7-, and 30-day average streamgaging station low-flow characteristics for 290 long-term, continuous-record, streamgaging stations are determined, adjusted for instances of zero flow using a conditional probability adjustment method, and presented for non-exceedance probabilities of 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, and 0.005. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression equations to estimate annual non-exceedance probabilities at gaged and ungaged sites and are summarized for 290 long-term, continuous-record streamgaging stations, 136 short-term, continuous-record streamgaging stations, and 613 partial-record streamgaging stations. Regional regression equations for six physiographic regions use basin characteristics to estimate 1-, 4-, 7-, and 30-day average low-flow annual non-exceedance probabilities at gaged and ungaged sites. Weighted low-flow values that combine computed streamgaging station low-flow characteristics and annual non-exceedance probabilities from regional regression equations provide improved low-flow estimates. Regression equations developed using the Maintenance of Variance with Extension (MOVE.1) method describe the line of organic correlation (LOC) with an appropriate index site for low-flow characteristics at 136 short-term, continuous-record streamgaging stations and 613 partial-record streamgaging stations. Monthly streamflow statistics computed on the individual daily mean streamflows of selected continuous-record streamgaging stations and curves describing flow-duration are presented. Text, figures, and lists are provided summarizing low-flow estimates, selected low-flow sites, delineated physiographic regions, basin characteristics, regression equations, error estimates, definitions, and data sources. This study supersedes previous studies of low flows in Virginia.
Ulas, Arife; Silay, Kamile; Akinci, Sema; Dede, Didem Sener; Akinci, Muhammed Bulent; Sendur, Mehmet Ali Nahit; Cubukcu, Erdem; Coskun, Hasan Senol; Degirmenci, Mustafa; Utkan, Gungor; Ozdemir, Nuriye; Isikdogan, Abdurrahman; Buyukcelik, Abdullah; Inanc, Mevlude; Bilici, Ahmet; Odabasi, Hatice; Cihan, Sener; Avci, Nilufer; Yalcin, Bulent
2015-01-01
Medication errors in oncology may cause severe clinical problems due to low therapeutic indices and high toxicity of chemotherapeutic agents. We aimed to investigate unintentional medication errors and underlying factors during chemotherapy preparation and administration based on a systematic survey conducted to reflect oncology nurses experience. This study was conducted in 18 adult chemotherapy units with volunteer participation of 206 nurses. A survey developed by primary investigators and medication errors (MAEs) defined preventable errors during prescription of medication, ordering, preparation or administration. The survey consisted of 4 parts: demographic features of nurses; workload of chemotherapy units; errors and their estimated monthly number during chemotherapy preparation and administration; and evaluation of the possible factors responsible from ME. The survey was conducted by face to face interview and data analyses were performed with descriptive statistics. Chi-square or Fisher exact tests were used for a comparative analysis of categorical data. Some 83.4% of the 210 nurses reported one or more than one error during chemotherapy preparation and administration. Prescribing or ordering wrong doses by physicians (65.7%) and noncompliance with administration sequences during chemotherapy administration (50.5%) were the most common errors. The most common estimated average monthly error was not following the administration sequence of the chemotherapeutic agents (4.1 times/month, range 1-20). The most important underlying reasons for medication errors were heavy workload (49.7%) and insufficient number of staff (36.5%). Our findings suggest that the probability of medication error is very high during chemotherapy preparation and administration, the most common involving prescribing and ordering errors. Further studies must address the strategies to minimize medication error in chemotherapy receiving patients, determine sufficient protective measures and establishing multistep control mechanisms.
Rainfall estimation for real time flood monitoring using geostationary meteorological satellite data
NASA Astrophysics Data System (ADS)
Veerakachen, Watcharee; Raksapatcharawong, Mongkol
2015-09-01
Rainfall estimation by geostationary meteorological satellite data provides good spatial and temporal resolutions. This is advantageous for real time flood monitoring and warning systems. However, a rainfall estimation algorithm developed in one region needs to be adjusted for another climatic region. This work proposes computationally-efficient rainfall estimation algorithms based on an Infrared Threshold Rainfall (ITR) method calibrated with regional ground truth. Hourly rain gauge data collected from 70 stations around the Chao-Phraya river basin were used for calibration and validation of the algorithms. The algorithm inputs were derived from FY-2E satellite observations consisting of infrared and water vapor imagery. The results were compared with the Global Satellite Mapping of Precipitation (GSMaP) near real time product (GSMaP_NRT) using the probability of detection (POD), root mean square error (RMSE) and linear correlation coefficient (CC) as performance indices. Comparison with the GSMaP_NRT product for real time monitoring purpose shows that hourly rain estimates from the proposed algorithm with the error adjustment technique (ITR_EA) offers higher POD and approximately the same RMSE and CC with less data latency.
Engdahl, N.B.; Vogler, E.T.; Weissmann, G.S.
2010-01-01
River-aquifer exchange is considered within a transition probability framework along the Rio Grande in Albuquerque, New Mexico, to provide a stochastic estimate of aquifer heterogeneity and river loss. Six plausible hydrofacies configurations were determined using categorized drill core and wetland survey data processed through the TPROGS geostatistical package. A base case homogeneous model was also constructed for comparison. River loss was simulated for low, moderate, and high Rio Grande stages and several different riverside drain stage configurations. Heterogeneity effects were quantified by determining the mean and variance of the K field for each realization compared to the root-mean-square (RMS) error of the observed groundwater head data. Simulation results showed that the heterogeneous models produced smaller estimates of loss than the homogeneous approximation. Differences between heterogeneous and homogeneous model results indicate that the use of a homogeneous K in a regional-scale model may result in an overestimation of loss but comparable RMS error. We find that the simulated river loss is dependent on the aquifer structure and is most sensitive to the volumetric proportion of fines within the river channel. Copyright 2010 by the American Geophysical Union.
Assessing flight safety differences between the United States regional and major airlines
NASA Astrophysics Data System (ADS)
Sharp, Broderick H.
During 2008, the U.S. domestic airline departures exceeded 28,000 flights per day. Thirty-nine or less than 0.2 of 1% of these flights resulted in operational incidents or accidents. However, even a low percentage of airline accidents and incidents continue to cause human suffering and property loss. The charge of this study was the comparison of U.S. major and regional airline safety histories. The study spans safety events from January 1982 through December 2008. In this quantitative analysis, domestic major and regional airlines were statistically tested for their flight safety differences. Four major airlines and thirty-seven regional airlines qualified for the safety study which compared the airline groups' fatal accidents, incidents, non-fatal accidents, pilot errors, and the remaining six safety event probable cause types. The six other probable cause types are mechanical failure, weather, air traffic control, maintenance, other, and unknown causes. The National Transportation Safety Board investigated each airline safety event, and assigned a probable cause to each event. A sample of 500 events was randomly selected from the 1,391 airlines' accident and incident population. The airline groups' safety event probabilities were estimated using the least squares linear regression. A probability significance level of 5% was chosen to conclude the appropriate research question hypothesis. The airline fatal accidents and incidents probability levels were 1.2% and 0.05% respectively. These two research questions did not reach the 5% significance level threshold. Therefore, the airline groups' fatal accidents and non-destructive incidents probabilities favored the airline groups' safety differences hypothesis. The linear progression estimates for the remaining three research questions were 71.5% for non-fatal accidents, 21.8% for the pilot errors, and 7.4% significance level for the six probable causes. These research questions' linear regressions are greater than the 5% level. Consequently, these three research questions favored airline groups' safety similarities hypothesis. The study indicates the U.S. domestic major airlines were safer than the regional airlines. Ideas for potential airline safety progress can examine pilot fatigue, the airline groups' hiring policies, the government's airline oversight personnel, or the comparison of individual airline's operational policies.
ERIC Educational Resources Information Center
O'Connell, Ann Aileen
The relationships among types of errors observed during probability problem solving were studied. Subjects were 50 graduate students in an introductory probability and statistics course. Errors were classified as text comprehension, conceptual, procedural, and arithmetic. Canonical correlation analysis was conducted on the frequencies of specific…
Sampling Error in Relation to Cyst Nematode Population Density Estimation in Small Field Plots.
Župunski, Vesna; Jevtić, Radivoje; Jokić, Vesna Spasić; Župunski, Ljubica; Lalošević, Mirjana; Ćirić, Mihajlo; Ćurčić, Živko
2017-06-01
Cyst nematodes are serious plant-parasitic pests which could cause severe yield losses and extensive damage. Since there is still very little information about error of population density estimation in small field plots, this study contributes to the broad issue of population density assessment. It was shown that there was no significant difference between cyst counts of five or seven bulk samples taken per each 1-m 2 plot, if average cyst count per examined plot exceeds 75 cysts per 100 g of soil. Goodness of fit of data to probability distribution tested with χ 2 test confirmed a negative binomial distribution of cyst counts for 21 out of 23 plots. The recommended measure of sampling precision of 17% expressed through coefficient of variation ( cv ) was achieved if the plots of 1 m 2 contaminated with more than 90 cysts per 100 g of soil were sampled with 10-core bulk samples taken in five repetitions. If plots were contaminated with less than 75 cysts per 100 g of soil, 10-core bulk samples taken in seven repetitions gave cv higher than 23%. This study indicates that more attention should be paid on estimation of sampling error in experimental field plots to ensure more reliable estimation of population density of cyst nematodes.
NASA Astrophysics Data System (ADS)
Liu, Yang; Pu, Huangsheng; Zhang, Xi; Li, Baojuan; Liang, Zhengrong; Lu, Hongbing
2017-03-01
Arterial spin labeling (ASL) provides a noninvasive measurement of cerebral blood flow (CBF). Due to relatively low spatial resolution, the accuracy of CBF measurement is affected by the partial volume (PV) effect. To obtain accurate CBF estimation, the contribution of each tissue type in the mixture is desirable. In general, this can be obtained according to the registration of ASL and structural image in current ASL studies. This approach can obtain probability of each tissue type inside each voxel, but it also introduces error, which include error of registration algorithm and imaging itself error in scanning of ASL and structural image. Therefore, estimation of mixture percentage directly from ASL data is greatly needed. Under the assumption that ASL signal followed the Gaussian distribution and each tissue type is independent, a maximum a posteriori expectation-maximization (MAP-EM) approach was formulated to estimate the contribution of each tissue type to the observed perfusion signal at each voxel. Considering the sensitivity of MAP-EM to the initialization, an approximately accurate initialization was obtain using 3D Fuzzy c-means method. Our preliminary results demonstrated that the GM and WM pattern across the perfusion image can be sufficiently visualized by the voxel-wise tissue mixtures, which may be promising for the diagnosis of various brain diseases.
Limitations of shallow nets approximation.
Lin, Shao-Bo
2017-10-01
In this paper, we aim at analyzing the approximation abilities of shallow networks in reproducing kernel Hilbert spaces (RKHSs). We prove that there is a probability measure such that the achievable lower bound for approximating by shallow nets can be realized for all functions in balls of reproducing kernel Hilbert space with high probability, which is different with the classical minimax approximation error estimates. This result together with the existing approximation results for deep nets shows the limitations for shallow nets and provides a theoretical explanation on why deep nets perform better than shallow nets. Copyright © 2017 Elsevier Ltd. All rights reserved.
Active life expectancy from annual follow-up data with missing responses.
Izmirlian, G; Brock, D; Ferrucci, L; Phillips, C
2000-03-01
Active life expectancy (ALE) at a given age is defined as the expected remaining years free of disability. In this study, three categories of health status are defined according to the ability to perform activities of daily living independently. Several studies have used increment-decrement life tables to estimate ALE, without error analysis, from only a baseline and one follow-up interview. The present work conducts an individual-level covariate analysis using a three-state Markov chain model for multiple follow-up data. Using a logistic link, the model estimates single-year transition probabilities among states of health, accounting for missing interviews. This approach has the advantages of smoothing subsequent estimates and increased power by using all follow-ups. We compute ALE and total life expectancy from these estimated single-year transition probabilities. Variance estimates are computed using the delta method. Data from the Iowa Established Population for the Epidemiologic Study of the Elderly are used to test the effects of smoking on ALE on all 5-year age groups past 65 years, controlling for sex and education.
NASA Astrophysics Data System (ADS)
Clarke, John R.; Southerland, David
1999-07-01
Semi-closed circuit underwater breathing apparatus (UBA) provide a constant flow of mixed gas containing oxygen and nitrogen or helium to a diver. However, as a diver's work rate and metabolic oxygen consumption varies, the oxygen percentages within the UBA can change dramatically. Hence, even a resting diver can become hypoxic and become at risk for oxygen induced seizures. Conversely, a hard working diver can become hypoxic and lose consciousness. Unfortunately, current semi-closed UBA do not contain oxygen monitors. We describe a simple oxygen monitoring system designed and prototyped at the Navy Experimental Diving Unit. The main monitor components include a PIC microcontroller, analog-to-digital converter, bicolor LED, and oxygen sensor. The LED, affixed to the diver's mask is steady green if the oxygen partial pressure is within pre- defined acceptable limits. A more advanced monitor with a depth senor and additional computational circuitry could be used to estimate metabolic oxygen consumption. The computational algorithm uses the oxygen partial pressure and the diver's depth to compute O2 using the steady state solution of the differential equation describing oxygen concentrations within the UBA. Consequently, dive transients induce errors in the O2 estimation. To evalute these errors, we used a computer simulation of semi-closed circuit UBA dives to generate transient rich data as input to the estimation algorithm. A step change in simulated O2 elicits a monoexponential change in the estimated O2 with a time constant of 5 to 10 minutes. Methods for predicting error and providing a probable error indication to the diver are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ren, Shangjie; Department of Radiation Oncology, Stanford University School of Medicine, Palo Alto, California; Hara, Wendy
Purpose: To develop a reliable method to estimate electron density based on anatomic magnetic resonance imaging (MRI) of the brain. Methods and Materials: We proposed a unifying multi-atlas approach for electron density estimation based on standard T1- and T2-weighted MRI. First, a composite atlas was constructed through a voxelwise matching process using multiple atlases, with the goal of mitigating effects of inherent anatomic variations between patients. Next we computed for each voxel 2 kinds of conditional probabilities: (1) electron density given its image intensity on T1- and T2-weighted MR images; and (2) electron density given its spatial location in a referencemore » anatomy, obtained by deformable image registration. These were combined into a unifying posterior probability density function using the Bayesian formalism, which provided the optimal estimates for electron density. We evaluated the method on 10 patients using leave-one-patient-out cross-validation. Receiver operating characteristic analyses for detecting different tissue types were performed. Results: The proposed method significantly reduced the errors in electron density estimation, with a mean absolute Hounsfield unit error of 119, compared with 140 and 144 (P<.0001) using conventional T1-weighted intensity and geometry-based approaches, respectively. For detection of bony anatomy, the proposed method achieved an 89% area under the curve, 86% sensitivity, 88% specificity, and 90% accuracy, which improved upon intensity and geometry-based approaches (area under the curve: 79% and 80%, respectively). Conclusion: The proposed multi-atlas approach provides robust electron density estimation and bone detection based on anatomic MRI. If validated on a larger population, our work could enable the use of MRI as a primary modality for radiation treatment planning.« less
Trimming and procrastination as inversion techniques
NASA Astrophysics Data System (ADS)
Backus, George E.
1996-12-01
By examining the processes of truncating and approximating the model space (trimming it), and by committing to neither the objectivist nor the subjectivist interpretation of probability (procrastinating), we construct a formal scheme for solving linear and non-linear geophysical inverse problems. The necessary prior information about the correct model xE can be either a collection of inequalities or a probability measure describing where xE was likely to be in the model space X before the data vector y0 was measured. The results of the inversion are (1) a vector z0 that estimates some numerical properties zE of xE; (2) an estimate of the error δz = z0 - zE. As y0 is finite dimensional, so is z0, and hence in principle inversion cannot describe all of xE. The error δz is studied under successively more specialized assumptions about the inverse problem, culminating in a complete analysis of the linear inverse problem with a prior quadratic bound on xE. Our formalism appears to encompass and provide error estimates for many of the inversion schemes current in geomagnetism, and would be equally applicable in geodesy and seismology if adequate prior information were available there. As an idealized example we study the magnetic field at the core-mantle boundary, using satellite measurements of field elements at sites assumed to be almost uniformly distributed on a single spherical surface. Magnetospheric currents are neglected and the crustal field is idealized as a random process with rotationally invariant statistics. We find that an appropriate data compression diagonalizes the variance matrix of the crustal signal and permits an analytic trimming of the idealized problem.
Cole, Stephen R.; Jacobson, Lisa P.; Tien, Phyllis C.; Kingsley, Lawrence; Chmiel, Joan S.; Anastos, Kathryn
2010-01-01
To estimate the net effect of imperfectly measured highly active antiretroviral therapy on incident acquired immunodeficiency syndrome or death, the authors combined inverse probability-of-treatment-and-censoring weighted estimation of a marginal structural Cox model with regression-calibration methods. Between 1995 and 2007, 950 human immunodeficiency virus–positive men and women were followed in 2 US cohort studies. During 4,054 person-years, 374 initiated highly active antiretroviral therapy, 211 developed acquired immunodeficiency syndrome or died, and 173 dropped out. Accounting for measured confounders and determinants of dropout, the weighted hazard ratio for acquired immunodeficiency syndrome or death comparing use of highly active antiretroviral therapy in the prior 2 years with no therapy was 0.36 (95% confidence limits: 0.21, 0.61). This association was relatively constant over follow-up (P = 0.19) and stronger than crude or adjusted hazard ratios of 0.75 and 0.95, respectively. Accounting for measurement error in reported exposure using external validation data on 331 men and women provided a hazard ratio of 0.17, with bias shifted from the hazard ratio to the estimate of precision as seen by the 2.5-fold wider confidence limits (95% confidence limits: 0.06, 0.43). Marginal structural measurement-error models can simultaneously account for 3 major sources of bias in epidemiologic research: validated exposure measurement error, measured selection bias, and measured time-fixed and time-varying confounding. PMID:19934191
ECG fiducial point extraction using switching Kalman filter.
Akhbari, Mahsa; Ghahjaverestan, Nasim Montazeri; Shamsollahi, Mohammad B; Jutten, Christian
2018-04-01
In this paper, we propose a novel method for extracting fiducial points (FPs) of the beats in electrocardiogram (ECG) signals using switching Kalman filter (SKF). In this method, according to McSharry's model, ECG waveforms (P-wave, QRS complex and T-wave) are modeled with Gaussian functions and ECG baselines are modeled with first order auto regressive models. In the proposed method, a discrete state variable called "switch" is considered that affects only the observation equations. We denote a mode as a specific observation equation and switch changes between 7 modes and corresponds to different segments of an ECG beat. At each time instant, the probability of each mode is calculated and compared among two consecutive modes and a path is estimated, which shows the relation of each part of the ECG signal to the mode with the maximum probability. ECG FPs are found from the estimated path. For performance evaluation, the Physionet QT database is used and the proposed method is compared with methods based on wavelet transform, partially collapsed Gibbs sampler (PCGS) and extended Kalman filter. For our proposed method, the mean error and the root mean square error across all FPs are 2 ms (i.e. less than one sample) and 14 ms, respectively. These errors are significantly smaller than those obtained using other methods. The proposed method achieves lesser RMSE and smaller variability with respect to others. Copyright © 2018 Elsevier B.V. All rights reserved.
Vanin, Evgeny; Jacobsen, Gunnar
2010-03-01
The Bit-Error-Ratio (BER) floor caused by the laser phase noise in the optical fiber communication system with differential quadrature phase shift keying (DQPSK) and coherent detection followed by digital signal processing (DSP) is analytically evaluated. An in-phase and quadrature (I&Q) receiver with a carrier phase recovery using DSP is considered. The carrier phase recovery is based on a phase estimation of a finite sum (block) of the signal samples raised to the power of four and the phase unwrapping at transitions between blocks. It is demonstrated that errors generated at block transitions cause the dominating contribution to the system BER floor when the impact of the additive noise is negligibly small in comparison with the effect of the laser phase noise. Even the BER floor in the case when the phase unwrapping is omitted is analytically derived and applied to emphasize the crucial importance of this signal processing operation. The analytical results are verified by full Monte Carlo simulations. The BER for another type of DQPSK receiver operation, which is based on differential phase detection, is also obtained in the analytical form using the principle of conditional probability. The principle of conditional probability is justified in the case of differential phase detection due to statistical independency of the laser phase noise induced signal phase error and the additive noise contributions. Based on the achieved analytical results the laser linewidth tolerance is calculated for different system cases.
Hughes-Jones, N C; Hunt, V A; Maycock, W D; Wesley, E D; Vallet, L
1978-01-01
An analysis of the assay of 28 preparations of anti-D immunoglobulin using a radioisotope method carried out at 6-montly intervals for 2--4.5 years showed an average fall in anti-D concentration of 10.6% each year, with 99% confidence limits of 6.8--14.7%. The fall in anti-D concentration after storage at 37 degrees C for 1 month was less than 8%, the minimum change that could be detected. No significant change in physical characteristics of the immunoglobulin were detected. The error of a single estimate of anti-D by the radioisotope method (125I-labelled anti-IgG) used here was calculated to be such that the true value probably (p = 0.95) lay between 66 and 150% of the estimated value.
Lash, Timothy L
2007-11-26
The associations of pesticide exposure with disease outcomes are estimated without the benefit of a randomized design. For this reason and others, these studies are susceptible to systematic errors. I analyzed studies of the associations between alachlor and glyphosate exposure and cancer incidence, both derived from the Agricultural Health Study cohort, to quantify the bias and uncertainty potentially attributable to systematic error. For each study, I identified the prominent result and important sources of systematic error that might affect it. I assigned probability distributions to the bias parameters that allow quantification of the bias, drew a value at random from each assigned distribution, and calculated the estimate of effect adjusted for the biases. By repeating the draw and adjustment process over multiple iterations, I generated a frequency distribution of adjusted results, from which I obtained a point estimate and simulation interval. These methods were applied without access to the primary record-level dataset. The conventional estimates of effect associating alachlor and glyphosate exposure with cancer incidence were likely biased away from the null and understated the uncertainty by quantifying only random error. For example, the conventional p-value for a test of trend in the alachlor study equaled 0.02, whereas fewer than 20% of the bias analysis iterations yielded a p-value of 0.02 or lower. Similarly, the conventional fully-adjusted result associating glyphosate exposure with multiple myleoma equaled 2.6 with 95% confidence interval of 0.7 to 9.4. The frequency distribution generated by the bias analysis yielded a median hazard ratio equal to 1.5 with 95% simulation interval of 0.4 to 8.9, which was 66% wider than the conventional interval. Bias analysis provides a more complete picture of true uncertainty than conventional frequentist statistical analysis accompanied by a qualitative description of study limitations. The latter approach is likely to lead to overconfidence regarding the potential for causal associations, whereas the former safeguards against such overinterpretations. Furthermore, such analyses, once programmed, allow rapid implementation of alternative assignments of probability distributions to the bias parameters, so elevate the plane of discussion regarding study bias from characterizing studies as "valid" or "invalid" to a critical and quantitative discussion of sources of uncertainty.
Drought Persistence Errors in Global Climate Models
NASA Astrophysics Data System (ADS)
Moon, H.; Gudmundsson, L.; Seneviratne, S. I.
2018-04-01
The persistence of drought events largely determines the severity of socioeconomic and ecological impacts, but the capability of current global climate models (GCMs) to simulate such events is subject to large uncertainties. In this study, the representation of drought persistence in GCMs is assessed by comparing state-of-the-art GCM model simulations to observation-based data sets. For doing so, we consider dry-to-dry transition probabilities at monthly and annual scales as estimates for drought persistence, where a dry status is defined as negative precipitation anomaly. Though there is a substantial spread in the drought persistence bias, most of the simulations show systematic underestimation of drought persistence at global scale. Subsequently, we analyzed to which degree (i) inaccurate observations, (ii) differences among models, (iii) internal climate variability, and (iv) uncertainty of the employed statistical methods contribute to the spread in drought persistence errors using an analysis of variance approach. The results show that at monthly scale, model uncertainty and observational uncertainty dominate, while the contribution from internal variability is small in most cases. At annual scale, the spread of the drought persistence error is dominated by the statistical estimation error of drought persistence, indicating that the partitioning of the error is impaired by the limited number of considered time steps. These findings reveal systematic errors in the representation of drought persistence in current GCMs and suggest directions for further model improvement.
NASA Technical Reports Server (NTRS)
Gracey, William; Jewel, Joseph W., Jr.; Carpenter, Gene T.
1960-01-01
The overall errors of the service altimeter installations of a variety of civil transport, military, and general-aviation airplanes have been experimentally determined during normal landing-approach and take-off operations. The average height above the runway at which the data were obtained was about 280 feet for the landings and about 440 feet for the take-offs. An analysis of the data obtained from 196 airplanes during 415 landing approaches and from 70 airplanes during 152 take-offs showed that: 1. The overall error of the altimeter installations in the landing- approach condition had a probable value (50 percent probability) of +/- 36 feet and a maximum probable value (99.7 percent probability) of +/- 159 feet with a bias of +10 feet. 2. The overall error in the take-off condition had a probable value of +/- 47 feet and a maximum probable value of +/- 207 feet with a bias of -33 feet. 3. The overall errors of the military airplanes were generally larger than those of the civil transports in both the landing-approach and take-off conditions. In the landing-approach condition the probable error and the maximum probable error of the military airplanes were +/- 43 and +/- 189 feet, respectively, with a bias of +15 feet, whereas those for the civil transports were +/- 22 and +/- 96 feet, respectively, with a bias of +1 foot. 4. The bias values of the error distributions (+10 feet for the landings and -33 feet for the take-offs) appear to represent a measure of the hysteresis characteristics (after effect and recovery) and friction of the instrument and the pressure lag of the tubing-instrument system.
Dehghan, Ashraf; Abumasoudi, Rouhollah Sheikh; Ehsanpour, Soheila
2016-01-01
Infertility and errors in the process of its treatment have a negative impact on infertile couples. The present study was aimed to identify and assess the common errors in the reception process by applying the approach of "failure modes and effects analysis" (FMEA). In this descriptive cross-sectional study, the admission process of fertility and infertility center of Isfahan was selected for evaluation of its errors based on the team members' decision. At first, the admission process was charted through observations and interviewing employees, holding multiple panels, and using FMEA worksheet, which has been used in many researches all over the world and also in Iran. Its validity was evaluated through content and face validity, and its reliability was evaluated through reviewing and confirmation of the obtained information by the FMEA team, and eventually possible errors, causes, and three indicators of severity of effect, probability of occurrence, and probability of detection were determined and corrective actions were proposed. Data analysis was determined by the number of risk priority (RPN) which is calculated by multiplying the severity of effect, probability of occurrence, and probability of detection. Twenty-five errors with RPN ≥ 125 was detected through the admission process, in which six cases of error had high priority in terms of severity and occurrence probability and were identified as high-risk errors. The team-oriented method of FMEA could be useful for assessment of errors and also to reduce the occurrence probability of errors.
Dehghan, Ashraf; Abumasoudi, Rouhollah Sheikh; Ehsanpour, Soheila
2016-01-01
Background: Infertility and errors in the process of its treatment have a negative impact on infertile couples. The present study was aimed to identify and assess the common errors in the reception process by applying the approach of “failure modes and effects analysis” (FMEA). Materials and Methods: In this descriptive cross-sectional study, the admission process of fertility and infertility center of Isfahan was selected for evaluation of its errors based on the team members’ decision. At first, the admission process was charted through observations and interviewing employees, holding multiple panels, and using FMEA worksheet, which has been used in many researches all over the world and also in Iran. Its validity was evaluated through content and face validity, and its reliability was evaluated through reviewing and confirmation of the obtained information by the FMEA team, and eventually possible errors, causes, and three indicators of severity of effect, probability of occurrence, and probability of detection were determined and corrective actions were proposed. Data analysis was determined by the number of risk priority (RPN) which is calculated by multiplying the severity of effect, probability of occurrence, and probability of detection. Results: Twenty-five errors with RPN ≥ 125 was detected through the admission process, in which six cases of error had high priority in terms of severity and occurrence probability and were identified as high-risk errors. Conclusions: The team-oriented method of FMEA could be useful for assessment of errors and also to reduce the occurrence probability of errors. PMID:28194208
Does a better model yield a better argument? An info-gap analysis
NASA Astrophysics Data System (ADS)
Ben-Haim, Yakov
2017-04-01
Theories, models and computations underlie reasoned argumentation in many areas. The possibility of error in these arguments, though of low probability, may be highly significant when the argument is used in predicting the probability of rare high-consequence events. This implies that the choice of a theory, model or computational method for predicting rare high-consequence events must account for the probability of error in these components. However, error may result from lack of knowledge or surprises of various sorts, and predicting the probability of error is highly uncertain. We show that the putatively best, most innovative and sophisticated argument may not actually have the lowest probability of error. Innovative arguments may entail greater uncertainty than more standard but less sophisticated methods, creating an innovation dilemma in formulating the argument. We employ info-gap decision theory to characterize and support the resolution of this problem and present several examples.
Artes, Paul H; Henson, David B; Harper, Robert; McLeod, David
2003-06-01
To compare a multisampling suprathreshold strategy with conventional suprathreshold and full-threshold strategies in detecting localized visual field defects and in quantifying the area of loss. Probability theory was applied to examine various suprathreshold pass criteria (i.e., the number of stimuli that have to be seen for a test location to be classified as normal). A suprathreshold strategy that requires three seen or three missed stimuli per test location (multisampling suprathreshold) was selected for further investigation. Simulation was used to determine how the multisampling suprathreshold, conventional suprathreshold, and full-threshold strategies detect localized field loss. To determine the systematic error and variability in estimates of loss area, artificial fields were generated with clustered defects (0-25 field locations with 8- and 16-dB loss) and, for each condition, the number of test locations classified as defective (suprathreshold strategies) and with pattern deviation probability less than 5% (full-threshold strategy), was derived from 1000 simulated test results. The full-threshold and multisampling suprathreshold strategies had similar sensitivity to field loss. Both detected defects earlier than the conventional suprathreshold strategy. The pattern deviation probability analyses of full-threshold results underestimated the area of field loss. The conventional suprathreshold perimetry also underestimated the defect area. With multisampling suprathreshold perimetry, the estimates of defect area were less variable and exhibited lower systematic error. Multisampling suprathreshold paradigms may be a powerful alternative to other strategies of visual field testing. Clinical trials are needed to verify these findings.
Degradation data analysis based on a generalized Wiener process subject to measurement error
NASA Astrophysics Data System (ADS)
Li, Junxing; Wang, Zhihua; Zhang, Yongbo; Fu, Huimin; Liu, Chengrui; Krishnaswamy, Sridhar
2017-09-01
Wiener processes have received considerable attention in degradation modeling over the last two decades. In this paper, we propose a generalized Wiener process degradation model that takes unit-to-unit variation, time-correlated structure and measurement error into considerations simultaneously. The constructed methodology subsumes a series of models studied in the literature as limiting cases. A simple method is given to determine the transformed time scale forms of the Wiener process degradation model. Then model parameters can be estimated based on a maximum likelihood estimation (MLE) method. The cumulative distribution function (CDF) and the probability distribution function (PDF) of the Wiener process with measurement errors are given based on the concept of the first hitting time (FHT). The percentiles of performance degradation (PD) and failure time distribution (FTD) are also obtained. Finally, a comprehensive simulation study is accomplished to demonstrate the necessity of incorporating measurement errors in the degradation model and the efficiency of the proposed model. Two illustrative real applications involving the degradation of carbon-film resistors and the wear of sliding metal are given. The comparative results show that the constructed approach can derive a reasonable result and an enhanced inference precision.
Bayesian assessment of overtriage and undertriage at a level I trauma centre.
DiDomenico, Paul B; Pietzsch, Jan B; Paté-Cornell, M Elisabeth
2008-07-13
We analysed the trauma triage system at a specific level I trauma centre to assess rates of over- and undertriage and to support recommendations for system improvements. The triage process is designed to estimate the severity of patient injury and allocate resources accordingly, with potential errors of overestimation (overtriage) consuming excess resources and underestimation (undertriage) potentially leading to medical errors.We first modelled the overall trauma system using risk analysis methods to understand interdependencies among the actions of the participants. We interviewed six experienced trauma surgeons to obtain their expert opinion of the over- and undertriage rates occurring in the trauma centre. We then assessed actual over- and undertriage rates in a random sample of 86 trauma cases collected over a six-week period at the same centre. We employed Bayesian analysis to quantitatively combine the data with the prior probabilities derived from expert opinion in order to obtain posterior distributions. The results were estimates of overtriage and undertriage in 16.1 and 4.9% of patients, respectively. This Bayesian approach, which provides a quantitative assessment of the error rates using both case data and expert opinion, provides a rational means of obtaining a best estimate of the system's performance. The overall approach that we describe in this paper can be employed more widely to analyse complex health care delivery systems, with the objective of reduced errors, patient risk and excess costs.
Treatment planning for prostate focal laser ablation in the face of needle placement uncertainty
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cepek, Jeremy, E-mail: jcepek@robarts.ca; Fenster, Aaron; Lindner, Uri
2014-01-15
Purpose: To study the effect of needle placement uncertainty on the expected probability of achieving complete focal target destruction in focal laser ablation (FLA) of prostate cancer. Methods: Using a simplified model of prostate cancer focal target, and focal laser ablation region shapes, Monte Carlo simulations of needle placement error were performed to estimate the probability of completely ablating a region of target tissue. Results: Graphs of the probability of complete focal target ablation are presented over clinically relevant ranges of focal target sizes and shapes, ablation region sizes, and levels of needle placement uncertainty. In addition, a table ismore » provided for estimating the maximum target size that is treatable. The results predict that targets whose length is at least 5 mm smaller than the diameter of each ablation region can be confidently ablated using, at most, four laser fibers if the standard deviation in each component of needle placement error is less than 3 mm. However, targets larger than this (i.e., near to or exceeding the diameter of each ablation region) require more careful planning. This process is facilitated by using the table provided. Conclusions: The probability of completely ablating a focal target using FLA is sensitive to the level of needle placement uncertainty, especially as the target length approaches and becomes greater than the diameter of ablated tissue that each individual laser fiber can achieve. The results of this work can be used to help determine individual patient eligibility for prostate FLA, to guide the planning of prostate FLA, and to quantify the clinical benefit of using advanced systems for accurate needle delivery for this treatment modality.« less
Nangia, Shikha; Jasper, Ahren W; Miller, Thomas F; Truhlar, Donald G
2004-02-22
The most widely used algorithm for Monte Carlo sampling of electronic transitions in trajectory surface hopping (TSH) calculations is the so-called anteater algorithm, which is inefficient for sampling low-probability nonadiabatic events. We present a new sampling scheme (called the army ants algorithm) for carrying out TSH calculations that is applicable to systems with any strength of coupling. The army ants algorithm is a form of rare event sampling whose efficiency is controlled by an input parameter. By choosing a suitable value of the input parameter the army ants algorithm can be reduced to the anteater algorithm (which is efficient for strongly coupled cases), and by optimizing the parameter the army ants algorithm may be efficiently applied to systems with low-probability events. To demonstrate the efficiency of the army ants algorithm, we performed atom-diatom scattering calculations on a model system involving weakly coupled electronic states. Fully converged quantum mechanical calculations were performed, and the probabilities for nonadiabatic reaction and nonreactive deexcitation (quenching) were found to be on the order of 10(-8). For such low-probability events the anteater sampling scheme requires a large number of trajectories ( approximately 10(10)) to obtain good statistics and converged semiclassical results. In contrast by using the new army ants algorithm converged results were obtained by running 10(5) trajectories. Furthermore, the results were found to be in excellent agreement with the quantum mechanical results. Sampling errors were estimated using the bootstrap method, which is validated for use with the army ants algorithm. (c) 2004 American Institute of Physics.
NASA Technical Reports Server (NTRS)
Backus, George E.
1999-01-01
The purpose of the grant was to study how prior information about the geomagnetic field can be used to interpret surface and satellite magnetic measurements, to generate quantitative descriptions of prior information that might be so used, and to use this prior information to obtain from satellite data a model of the core field with statistically justifiable error estimates. The need for prior information in geophysical inversion has long been recognized. Data sets are finite, and faithful descriptions of aspects of the earth almost always require infinite-dimensional model spaces. By themselves, the data can confine the correct earth model only to an infinite-dimensional subset of the model space. Earth properties other than direct functions of the observed data cannot be estimated from those data without prior information about the earth. Prior information is based on what the observer already knows before the data become available. Such information can be "hard" or "soft". Hard information is a belief that the real earth must lie in some known region of model space. For example, the total ohmic dissipation in the core is probably less that the total observed geothermal heat flow out of the earth's surface. (In principle, ohmic heat in the core can be recaptured to help drive the dynamo, but this effect is probably small.) "Soft" information is a probability distribution on the model space, a distribution that the observer accepts as a quantitative description of her/his beliefs about the earth. The probability distribution can be a subjective prior in the sense of Bayes or the objective result of a statistical study of previous data or relevant theories.
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basismore » of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ye, Xin; Garikapati, Venu M.; You, Daehyun
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basismore » of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.« less
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Sargent, Daniel J.; Buyse, Marc; Burzykowski, Tomasz
2011-01-01
SUMMARY Using multiple historical trials with surrogate and true endpoints, we consider various models to predict the effect of treatment on a true endpoint in a target trial in which only a surrogate endpoint is observed. This predicted result is computed using (1) a prediction model (mixture, linear, or principal stratification) estimated from historical trials and the surrogate endpoint of the target trial and (2) a random extrapolation error estimated from successively leaving out each trial among the historical trials. The method applies to either binary outcomes or survival to a particular time that is computed from censored survival data. We compute a 95% confidence interval for the predicted result and validate its coverage using simulation. To summarize the additional uncertainty from using a predicted instead of true result for the estimated treatment effect, we compute its multiplier of standard error. Software is available for download. PMID:21838732
NASA Technical Reports Server (NTRS)
Varnai, Tamas; Marshak, Alexander
2000-01-01
This paper presents a simple approach to estimate the uncertainties that arise in satellite retrievals of cloud optical depth when the retrievals use one-dimensional radiative transfer theory for heterogeneous clouds that have variations in all three dimensions. For the first time, preliminary error bounds are set to estimate the uncertainty of cloud optical depth retrievals. These estimates can help us better understand the nature of uncertainties that three-dimensional effects can introduce into retrievals of this important product of the MODIS instrument. The probability distribution of resulting retrieval errors is examined through theoretical simulations of shortwave cloud reflection for a wide variety of cloud fields. The results are used to illustrate how retrieval uncertainties change with observable and known parameters, such as solar elevation or cloud brightness. Furthermore, the results indicate that a tendency observed in an earlier study, clouds appearing thicker for oblique sun, is indeed caused by three-dimensional radiative effects.
Semiblind channel estimation for MIMO-OFDM systems
NASA Astrophysics Data System (ADS)
Chen, Yi-Sheng; Song, Jyu-Han
2012-12-01
This article proposes a semiblind channel estimation method for multiple-input multiple-output orthogonal frequency-division multiplexing systems based on circular precoding. Relying on the precoding scheme at the transmitters, the autocorrelation matrix of the received data induces a structure relating the outer product of the channel frequency response matrix and precoding coefficients. This structure makes it possible to extract information about channel product matrices, which can be used to form a Hermitian matrix whose positive eigenvalues and corresponding eigenvectors yield the channel impulse response matrix. This article also tests the resistance of the precoding design to finite-sample estimation errors, and explores the effects of the precoding scheme on channel equalization by performing pairwise error probability analysis. The proposed method is immune to channel zero locations, and is reasonably robust to channel order overestimation. The proposed method is applicable to the scenarios in which the number of transmitters exceeds that of the receivers. Simulation results demonstrate the performance of the proposed method and compare it with some existing methods.
NASA Astrophysics Data System (ADS)
Olurotimi, E. O.; Sokoya, O.; Ojo, J. S.; Owolawi, P. A.
2018-03-01
Rain height is one of the significant parameters for prediction of rain attenuation for Earth-space telecommunication links, especially those operating at frequencies above 10 GHz. This study examines Three-parameter Dagum distribution of the rain height over Durban, South Africa. 5-year data were used to study the monthly, seasonal, and annual variations using the parameters estimated by the maximum likelihood of the distribution. The performance estimation of the distribution was determined using the statistical goodness of fit. Three-parameter Dagum distribution shows an appropriate distribution for the modeling of rain height over Durban with the Root Mean Square Error of 0.26. Also, the shape and scale parameters for the distribution show a wide variation. The probability exceedance of time for 0.01% indicates the high probability of rain attenuation at higher frequencies.
Probabilities and statistics for backscatter estimates obtained by a scatterometer
NASA Technical Reports Server (NTRS)
Pierson, Willard J., Jr.
1989-01-01
Methods for the recovery of winds near the surface of the ocean from measurements of the normalized radar backscattering cross section must recognize and make use of the statistics (i.e., the sampling variability) of the backscatter measurements. Radar backscatter values from a scatterometer are random variables with expected values given by a model. A model relates backscatter to properties of the waves on the ocean, which are in turn generated by the winds in the atmospheric marine boundary layer. The effective wind speed and direction at a known height for a neutrally stratified atmosphere are the values to be recovered from the model. The probability density function for the backscatter values is a normal probability distribution with the notable feature that the variance is a known function of the expected value. The sources of signal variability, the effects of this variability on the wind speed estimation, and criteria for the acceptance or rejection of models are discussed. A modified maximum likelihood method for estimating wind vectors is described. Ways to make corrections for the kinds of errors found for the Seasat SASS model function are described, and applications to a new scatterometer are given.
Unifying distance-based goodness-of-fit indicators for hydrologic model assessment
NASA Astrophysics Data System (ADS)
Cheng, Qinbo; Reinhardt-Imjela, Christian; Chen, Xi; Schulte, Achim
2014-05-01
The goodness-of-fit indicator, i.e. efficiency criterion, is very important for model calibration. However, recently the knowledge about the goodness-of-fit indicators is all empirical and lacks a theoretical support. Based on the likelihood theory, a unified distance-based goodness-of-fit indicator termed BC-GED model is proposed, which uses the Box-Cox (BC) transformation to remove the heteroscedasticity of model errors and the generalized error distribution (GED) with zero-mean to fit the distribution of model errors after BC. The BC-GED model can unify all recent distance-based goodness-of-fit indicators, and reveals the mean square error (MSE) and the mean absolute error (MAE) that are widely used goodness-of-fit indicators imply statistic assumptions that the model errors follow the Gaussian distribution and the Laplace distribution with zero-mean, respectively. The empirical knowledge about goodness-of-fit indicators can be also easily interpreted by BC-GED model, e.g. the sensitivity to high flow of the goodness-of-fit indicators with large power of model errors results from the low probability of large model error in the assumed distribution of these indicators. In order to assess the effect of the parameters (i.e. the BC transformation parameter λ and the GED kurtosis coefficient β also termed the power of model errors) of BC-GED model on hydrologic model calibration, six cases of BC-GED model were applied in Baocun watershed (East China) with SWAT-WB-VSA model. Comparison of the inferred model parameters and model simulation results among the six indicators demonstrates these indicators can be clearly separated two classes by the GED kurtosis β: β >1 and β ≤ 1. SWAT-WB-VSA calibrated by the class β >1 of distance-based goodness-of-fit indicators captures high flow very well and mimics the baseflow very badly, but it calibrated by the class β ≤ 1 mimics the baseflow very well, because first the larger value of β, the greater emphasis is put on high flow and second the derivative of GED probability density function at zero is zero as β >1, but discontinuous as β ≤ 1, and even infinity as β < 1 with which the maximum likelihood estimation can guarantee the model errors approach zero as well as possible. The BC-GED that estimates the parameters (i.e. λ and β) of BC-GED model as well as hydrologic model parameters is the best distance-based goodness-of-fit indicator because not only the model validation using groundwater levels is very good, but also the model errors fulfill the statistic assumption best. However, in some cases of model calibration with a few observations e.g. calibration of single-event model, for avoiding estimation of the parameters of BC-GED model the MAE i.e. the boundary indicator (β = 1) of the two classes, can replace the BC-GED, because the model validation of MAE is best.
Representation of layer-counted proxy records as probability densities on error-free time axes
NASA Astrophysics Data System (ADS)
Boers, Niklas; Goswami, Bedartha; Ghil, Michael
2016-04-01
Time series derived from paleoclimatic proxy records exhibit substantial dating uncertainties in addition to the measurement errors of the proxy values. For radiometrically dated proxy archives, Goswami et al. [1] have recently introduced a framework rooted in Bayesian statistics that successfully propagates the dating uncertainties from the time axis to the proxy axis. The resulting proxy record consists of a sequence of probability densities over the proxy values, conditioned on prescribed age values. One of the major benefits of this approach is that the proxy record is represented on an accurate, error-free time axis. Such unambiguous dating is crucial, for instance, in comparing different proxy records. This approach, however, is not directly applicable to proxy records with layer-counted chronologies, as for example ice cores, which are typically dated by counting quasi-annually deposited ice layers. Hence the nature of the chronological uncertainty in such records is fundamentally different from that in radiometrically dated ones. Here, we introduce a modification of the Goswami et al. [1] approach that is specifically designed for layer-counted proxy records, instead of radiometrically dated ones. We apply our method to isotope ratios and dust concentrations in the NGRIP core, using a published 60,000-year chronology [2]. It is shown that the further one goes into the past, the more the layer-counting errors accumulate and lead to growing uncertainties in the probability density sequence for the proxy values that results from the proposed approach. For the older parts of the record, these uncertainties affect more and more a statistically sound estimation of proxy values. This difficulty implies that great care has to be exercised when comparing and in particular aligning specific events among different layer-counted proxy records. On the other hand, when attempting to derive stochastic dynamical models from the proxy records, one is only interested in the relative changes, i.e. in the increments of the proxy values. In such cases, only the relative (non-cumulative) counting errors matter. For the example of the NGRIP records, we show that a precise estimation of these relative changes is in fact possible. References: [1] Goswami et al., Nonlin. Processes Geophys. (2014) [2] Svensson et al., Clim. Past (2008)
Using beta binomials to estimate classification uncertainty for ensemble models.
Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin
2014-01-01
Quantitative structure-activity (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprised of logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the number of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
NASA Astrophysics Data System (ADS)
Maggioni, V.; Massari, C.; Ciabatta, L.; Brocca, L.
2016-12-01
Accurate quantitative precipitation estimation is of great importance for water resources management, agricultural planning, and forecasting and monitoring of natural hazards such as flash floods and landslides. In situ observations are limited around the Earth, especially in remote areas (e.g., complex terrain, dense vegetation), but currently available satellite precipitation products are able to provide global precipitation estimates with an accuracy that depends upon many factors (e.g., type of storms, temporal sampling, season, etc.). The recent SM2RAIN approach proposes to estimate rainfall by using satellite soil moisture observations. As opposed to traditional satellite precipitation methods, which sense cloud properties to retrieve instantaneous estimates, this new bottom-up approach makes use of two consecutive soil moisture measurements for obtaining an estimate of the fallen precipitation within the interval between two satellite overpasses. As a result, the nature of the measurement is different and complementary to the one of classical precipitation products and could provide a different valid perspective to substitute or improve current rainfall estimates. However, uncertainties in the SM2RAIN product are still not well known and could represent a limitation in utilizing this dataset for hydrological applications. Therefore, quantifying the uncertainty associated with SM2RAIN is necessary for enabling its use. The study is conducted over the Italian territory for a 5-yr period (2010-2014). A number of satellite precipitation error properties, typically used in error modeling, are investigated and include probability of detection, false alarm rates, missed events, spatial correlation of the error, and hit biases. After this preliminary uncertainty analysis, the potential of applying the stochastic rainfall error model SREM2D to correct SM2RAIN and to improve its performance in hydrologic applications is investigated. The use of SREM2D for characterizing the error in precipitation by SM2RAIN would be highly useful for the merging and the integration steps in its algorithm, i.e., the merging of multiple soil moisture derived products (e.g., SMAP, SMOS, ASCAT) and the integration of soil moisture derived and state of the art satellite precipitation products (e.g., GPM IMERG).
Modeling misidentification errors that result from use of genetic tags in capture-recapture studies
Yoshizaki, J.; Brownie, C.; Pollock, K.H.; Link, W.A.
2011-01-01
Misidentification of animals is potentially important when naturally existing features (natural tags) such as DNA fingerprints (genetic tags) are used to identify individual animals. For example, when misidentification leads to multiple identities being assigned to an animal, traditional estimators tend to overestimate population size. Accounting for misidentification in capture-recapture models requires detailed understanding of the mechanism. Using genetic tags as an example, we outline a framework for modeling the effect of misidentification in closed population studies when individual identification is based on natural tags that are consistent over time (non-evolving natural tags). We first assume a single sample is obtained per animal for each capture event, and then generalize to the case where multiple samples (such as hair or scat samples) are collected per animal per capture occasion. We introduce methods for estimating population size and, using a simulation study, we show that our new estimators perform well for cases with moderately high capture probabilities or high misidentification rates. In contrast, conventional estimators can seriously overestimate population size when errors due to misidentification are ignored. ?? 2009 Springer Science+Business Media, LLC.
Minimax Quantum Tomography: Estimators and Relative Entropy Bounds
Ferrie, Christopher; Blume-Kohout, Robin
2016-03-04
A minimax estimator has the minimum possible error (“risk”) in the worst case. Here we construct the first minimax estimators for quantum state tomography with relative entropy risk. The minimax risk of nonadaptive tomography scales as O (1/more » $$\\sqrt{N}$$ ) —in contrast to that of classical probability estimation, which is O (1/N) —where N is the number of copies of the quantum state used. We trace this deficiency to sampling mismatch: future observations that determine risk may come from a different sample space than the past data that determine the estimate. Lastly, this makes minimax estimators very biased, and we propose a computationally tractable alternative with similar behavior in the worst case, but superior accuracy on most states.« less
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations in linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Genetic Algorithm for Optimization: Preprocessing with n Dimensional Bisection and Error Estimation
NASA Technical Reports Server (NTRS)
Sen, S. K.; Shaykhian, Gholam Ali
2006-01-01
A knowledge of the appropriate values of the parameters of a genetic algorithm (GA) such as the population size, the shrunk search space containing the solution, crossover and mutation probabilities is not available a priori for a general optimization problem. Recommended here is a polynomial-time preprocessing scheme that includes an n-dimensional bisection and that determines the foregoing parameters before deciding upon an appropriate GA for all problems of similar nature and type. Such a preprocessing is not only fast but also enables us to get the global optimal solution and its reasonably narrow error bounds with a high degree of confidence.
Evaluation of some random effects methodology applicable to bird ringing data
Burnham, K.P.; White, Gary C.
2002-01-01
Existing models for ring recovery and recapture data analysis treat temporal variations in annual survival probability (S) as fixed effects. Often there is no explainable structure to the temporal variation in S1,..., Sk; random effects can then be a useful model: Si = E(S) + ??i. Here, the temporal variation in survival probability is treated as random with average value E(??2) = ??2. This random effects model can now be fit in program MARK. Resultant inferences include point and interval estimation for process variation, ??2, estimation of E(S) and var (E??(S)) where the latter includes a component for ??2 as well as the traditional component for v??ar(S??\\S??). Furthermore, the random effects model leads to shrinkage estimates, Si, as improved (in mean square error) estimators of Si compared to the MLE, S??i, from the unrestricted time-effects model. Appropriate confidence intervals based on the Si are also provided. In addition, AIC has been generalized to random effects models. This paper presents results of a Monte Carlo evaluation of inference performance under the simple random effects model. Examined by simulation, under the simple one group Cormack-Jolly-Seber (CJS) model, are issues such as bias of ??s2, confidence interval coverage on ??2, coverage and mean square error comparisons for inference about Si based on shrinkage versus maximum likelihood estimators, and performance of AIC model selection over three models: Si ??? S (no effects), Si = E(S) + ??i (random effects), and S1,..., Sk (fixed effects). For the cases simulated, the random effects methods performed well and were uniformly better than fixed effects MLE for the Si.
Probability of undetected error after decoding for a concatenated coding scheme
NASA Technical Reports Server (NTRS)
Costello, D. J., Jr.; Lin, S.
1984-01-01
A concatenated coding scheme for error control in data communications is analyzed. In this scheme, the inner code is used for both error correction and detection, however the outer code is used only for error detection. A retransmission is requested if the outer code detects the presence of errors after the inner code decoding. Probability of undetected error is derived and bounded. A particular example, proposed for NASA telecommand system is analyzed.
Ennis, Erin J; Foley, Joe P
2016-07-15
A stochastic approach was utilized to estimate the probability of a successful isocratic or gradient separation in conventional chromatography for numbers of sample components, peak capacities, and saturation factors ranging from 2 to 30, 20-300, and 0.017-1, respectively. The stochastic probabilities were obtained under conditions of (i) constant peak width ("gradient" conditions) and (ii) peak width increasing linearly with time ("isocratic/constant N" conditions). The isocratic and gradient probabilities obtained stochastically were compared with the probabilities predicted by Martin et al. [Anal. Chem., 58 (1986) 2200-2207] and Davis and Stoll [J. Chromatogr. A, (2014) 128-142]; for a given number of components and peak capacity the same trend is always observed: probability obtained with the isocratic stochastic approach
Weathering the storm: hurricanes and birth outcomes.
Currie, Janet; Rossin-Slater, Maya
2013-05-01
A growing literature suggests that stressful events in pregnancy can have negative effects on birth outcomes. Some of the estimates in this literature may be affected by small samples, omitted variables, endogenous mobility in response to disasters, and errors in the measurement of gestation, as well as by a mechanical correlation between longer gestation and the probability of having been exposed. We use millions of individual birth records to examine the effects of exposure to hurricanes during pregnancy, and the sensitivity of the estimates to these econometric problems. We find that exposure to a hurricane during pregnancy increases the probability of abnormal conditions of the newborn such as being on a ventilator more than 30min and meconium aspiration syndrome (MAS). Although we are able to reproduce previous estimates of effects on birth weight and gestation, our results suggest that measured effects of stressful events on these outcomes are sensitive to specification and it is preferable to use more sensitive indicators of newborn health. Copyright © 2013 Elsevier B.V. All rights reserved.
Weathering the Storm: Hurricanes and Birth Outcomes
Currie, Janet
2013-01-01
A growing literature suggests that stressful events in pregnancy can have negative effects on birth outcomes. Some of the estimates in this literature may be affected by small samples, omitted variables, endogenous mobility in response to disasters, and errors in the measurement of gestation, as well as by a mechanical correlation between longer gestation and the probability of having been exposed. We use millions of individual birth records to examine the effects of exposure to hurricanes during pregnancy, and the sensitivity of the estimates to these econometric problems. We find that exposure to a hurricane during pregnancy increases the probability of abnormal conditions of the newborn such as being on a ventilator more than 30 minutes and meconium aspiration syndrome (MAS). Although we are able to reproduce previous estimates of effects on birth weight and gestation, our results suggest that measured effects of stressful events on these outcomes are sensitive to specification and it is preferable to use more sensitive indicators of newborn health. PMID:23500506
Measurement of reaeration coefficients for selected Florida streams
Hampson, P.S.; Coffin, J.E.
1989-01-01
A total of 29 separate reaeration coefficient determinations were performed on 27 subreaches of 12 selected Florida streams between October 1981 and May 1985. Measurements performed prior to June 1984 were made using the peak and area methods with ethylene and propane as the tracer gases. Later measurements utilized the steady-state method with propane as the only tracer gas. The reaeration coefficients ranged from 1.07 to 45.9 days with a mean estimated probable error of +/16.7%. Ten predictive equations (compiled from the literature) were also evaluated using the measured coefficients. The most representative equation was one of the energy dissipation type with a standard error of 60.3%. Seven of the 10 predictive additional equations were modified using the measured coefficients and nonlinear regression techniques. The most accurate of the developed equations was also of the energy dissipation form and had a standard error of 54.9%. For 5 of the 13 subreaches in which both ethylene and propane were used, the ethylene data resulted in substantially larger reaeration coefficient values which were rejected. In these reaches, ethylene concentrations were probably significantly affected by one or more electrophilic addition reactions known to occur in aqueous media. (Author 's abstract)
Wright, Wilson J.; Irvine, Kathryn M.
2017-01-01
We examined data on white pine blister rust (blister rust) collected during the monitoring of whitebark pine trees in the Greater Yellowstone Ecosystem (from 2004-2015). Summaries of repeat observations performed by multiple independent observers are reviewed and discussed. These summaries show variability among observers and the potential for errors being made in blister rust status. Based on this assessment, we utilized occupancy models to analyze blister rust prevalence while explicitly accounting for imperfect detection. Available covariates were used to model both the probability of a tree being infected with blister rust and the probability of an observer detecting the infection. The fitted model provided strong evidence that the probability of blister rust infection increases as tree diameter increases and decreases as site elevation increases. Most importantly, we found evidence of heterogeneity in detection probabilities related to tree size and average slope of a transect. These results suggested that detecting the presence of blister rust was more difficult in larger trees. Also, there was evidence that blister rust was easier to detect on transects located on steeper slopes. Our model accounted for potential impacts of observer experience on blister rust detection probabilities and also showed moderate variability among the different observers in their ability to detect blister rust. Based on these model results, we suggest that multiple observer sampling continue in future field seasons in order to allow blister rust prevalence estimates to be corrected for imperfect detection. We suggest that the multiple observer effort be spread out across many transects (instead of concentrated at a few each field season) while retaining the overall proportion of trees with multiple observers around 5-20%. Estimates of prevalence are confounded with detection unless it is explicitly accounted for in an analysis and we demonstrate how an occupancy model can be used to do account for this source of observation error.
Mitigating leakage errors due to cavity modes in a superconducting quantum computer
NASA Astrophysics Data System (ADS)
McConkey, T. G.; Béjanin, J. H.; Earnest, C. T.; McRae, C. R. H.; Pagel, Z.; Rinehart, J. R.; Mariantoni, M.
2018-07-01
A practical quantum computer requires quantum bit (qubit) operations with low error probabilities in extensible architectures. We study a packaging method that makes it possible to address hundreds of superconducting qubits by means of coaxial Pogo pins. A qubit chip is housed in a superconducting box, where both box and chip dimensions lead to unwanted modes that can interfere with qubit operations. We analyze these interference effects in the context of qubit coherent leakage and qubit decoherence induced by damped modes. We propose two methods, half-wave fencing and antinode pinning, to mitigate the resulting errors by detuning the resonance frequency of the modes from the qubit frequency. We perform electromagnetic field simulations indicating that the resonance frequency of the modes increases with the number of installed pins and can be engineered to be significantly higher than the highest qubit frequency. We estimate that the error probabilities and decoherence rates due to suitably shifted modes in realistic scenarios can be up to two orders of magnitude lower than the state-of-the-art superconducting qubit error and decoherence rates. Our methods can be extended to different types of packages that do not rely on Pogo pins. Conductive bump bonds, for example, can serve the same purpose in qubit architectures based on flip chip technology. Metalized vias, instead, can be used to mitigate modes due to the increasing size of the dielectric substrate on which qubit arrays are patterned.
Self-calibration of photometric redshift scatter in weak-lensing surveys
Zhang, Pengjie; Pen, Ue -Li; Bernstein, Gary
2010-06-11
Photo-z errors, especially catastrophic errors, are a major uncertainty for precision weak lensing cosmology. We find that the shear-(galaxy number) density and density-density cross correlation measurements between photo-z bins, available from the same lensing surveys, contain valuable information for self-calibration of the scattering probabilities between the true-z and photo-z bins. The self-calibration technique we propose does not rely on cosmological priors nor parameterization of the photo-z probability distribution function, and preserves all of the cosmological information available from shear-shear measurement. We estimate the calibration accuracy through the Fisher matrix formalism. We find that, for advanced lensing surveys such as themore » planned stage IV surveys, the rate of photo-z outliers can be determined with statistical uncertainties of 0.01-1% for z < 2 galaxies. Among the several sources of calibration error that we identify and investigate, the galaxy distribution bias is likely the most dominant systematic error, whereby photo-z outliers have different redshift distributions and/or bias than non-outliers from the same bin. This bias affects all photo-z calibration techniques based on correlation measurements. As a result, galaxy bias variations of O(0.1) produce biases in photo-z outlier rates similar to the statistical errors of our method, so this galaxy distribution bias may bias the reconstructed scatters at several-σ level, but is unlikely to completely invalidate the self-calibration technique.« less
A probabilistic method for the estimation of residual risk in donated blood.
Bish, Ebru K; Ragavan, Prasanna K; Bish, Douglas R; Slonim, Anthony D; Stramer, Susan L
2014-10-01
The residual risk (RR) of transfusion-transmitted infections, including the human immunodeficiency virus and hepatitis B and C viruses, is typically estimated by the incidence[Formula: see text]window period model, which relies on the following restrictive assumptions: Each screening test, with probability 1, (1) detects an infected unit outside of the test's window period; (2) fails to detect an infected unit within the window period; and (3) correctly identifies an infection-free unit. These assumptions need not hold in practice due to random or systemic errors and individual variations in the window period. We develop a probability model that accurately estimates the RR by relaxing these assumptions, and quantify their impact using a published cost-effectiveness study and also within an optimization model. These assumptions lead to inaccurate estimates in cost-effectiveness studies and to sub-optimal solutions in the optimization model. The testing solution generated by the optimization model translates into fewer expected infections without an increase in the testing cost. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Evaluation of statistical models for forecast errors from the HBV model
NASA Astrophysics Data System (ADS)
Engeland, Kolbjørn; Renard, Benjamin; Steinsland, Ingelin; Kolberg, Sjur
2010-04-01
SummaryThree statistical models for the forecast errors for inflow into the Langvatn reservoir in Northern Norway have been constructed and tested according to the agreement between (i) the forecast distribution and the observations and (ii) median values of the forecast distribution and the observations. For the first model observed and forecasted inflows were transformed by the Box-Cox transformation before a first order auto-regressive model was constructed for the forecast errors. The parameters were conditioned on weather classes. In the second model the Normal Quantile Transformation (NQT) was applied on observed and forecasted inflows before a similar first order auto-regressive model was constructed for the forecast errors. For the third model positive and negative errors were modeled separately. The errors were first NQT-transformed before conditioning the mean error values on climate, forecasted inflow and yesterday's error. To test the three models we applied three criterions: we wanted (a) the forecast distribution to be reliable; (b) the forecast intervals to be narrow; (c) the median values of the forecast distribution to be close to the observed values. Models 1 and 2 gave almost identical results. The median values improved the forecast with Nash-Sutcliffe R eff increasing from 0.77 for the original forecast to 0.87 for the corrected forecasts. Models 1 and 2 over-estimated the forecast intervals but gave the narrowest intervals. Their main drawback was that the distributions are less reliable than Model 3. For Model 3 the median values did not fit well since the auto-correlation was not accounted for. Since Model 3 did not benefit from the potential variance reduction that lies in bias estimation and removal it gave on average wider forecasts intervals than the two other models. At the same time Model 3 on average slightly under-estimated the forecast intervals, probably explained by the use of average measures to evaluate the fit.
Seismic Characterization of the Newberry and Cooper Basin EGS Sites
NASA Astrophysics Data System (ADS)
Templeton, D. C.; Wang, J.; Goebel, M.; Johannesson, G.; Myers, S. C.; Harris, D.; Cladouhos, T. T.
2015-12-01
To aid in the seismic characterization of Engineered Geothermal Systems (EGS), we enhance traditional microearthquake detection and location methodologies at two EGS systems: the Newberry EGS site and the Habanero EGS site in the Cooper Basin of South Australia. We apply the Matched Field Processing (MFP) seismic imaging technique to detect new seismic events using known discrete microearthquake sources. Events identified using MFP typically have smaller magnitudes or occur within the coda of a larger event. Additionally, we apply a Bayesian multiple-event location algorithm, called MicroBayesLoc, to estimate the 95% probability ellipsoids for events with high signal-to-noise ratios (SNR). Such probability ellipsoid information can provide evidence for determining if a seismic lineation is real, or simply within the anticipated error range. At the Newberry EGS site, 235 events were reported in the original catalog. MFP identified 164 additional events (an increase of over 70% more events). For the relocated events in the Newberry catalog, we can distinguish two distinct seismic swarms that fall outside of one another's 95% probability error ellipsoids.This work performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Estimated Probability of a Cervical Spine Injury During an ISS Mission
NASA Technical Reports Server (NTRS)
Brooker, John E.; Weaver, Aaron S.; Myers, Jerry G.
2013-01-01
Introduction: The Integrated Medical Model (IMM) utilizes historical data, cohort data, and external simulations as input factors to provide estimates of crew health, resource utilization and mission outcomes. The Cervical Spine Injury Module (CSIM) is an external simulation designed to provide the IMM with parameter estimates for 1) a probability distribution function (PDF) of the incidence rate, 2) the mean incidence rate, and 3) the standard deviation associated with the mean resulting from injury/trauma of the neck. Methods: An injury mechanism based on an idealized low-velocity blunt impact to the superior posterior thorax of an ISS crewmember was used as the simulated mission environment. As a result of this impact, the cervical spine is inertially loaded from the mass of the head producing an extension-flexion motion deforming the soft tissues of the neck. A multibody biomechanical model was developed to estimate the kinematic and dynamic response of the head-neck system from a prescribed acceleration profile. Logistic regression was performed on a dataset containing AIS1 soft tissue neck injuries from rear-end automobile collisions with published Neck Injury Criterion values producing an injury transfer function (ITF). An injury event scenario (IES) was constructed such that crew 1 is moving through a primary or standard translation path transferring large volume equipment impacting stationary crew 2. The incidence rate for this IES was estimated from in-flight data and used to calculate the probability of occurrence. The uncertainty in the model input factors were estimated from representative datasets and expressed in terms of probability distributions. A Monte Carlo Method utilizing simple random sampling was employed to propagate both aleatory and epistemic uncertain factors. Scatterplots and partial correlation coefficients (PCC) were generated to determine input factor sensitivity. CSIM was developed in the SimMechanics/Simulink environment with a Monte Carlo wrapper (MATLAB) used to integrate the components of the module. Results: The probability of generating an AIS1 soft tissue neck injury from the extension/flexion motion induced by a low-velocity blunt impact to the superior posterior thorax was fitted with a lognormal PDF with mean 0.26409, standard deviation 0.11353, standard error of mean 0.00114, and 95% confidence interval [0.26186, 0.26631]. Combining the probability of an AIS1 injury with the probability of IES occurrence was fitted with a Johnson SI PDF with mean 0.02772, standard deviation 0.02012, standard error of mean 0.00020, and 95% confidence interval [0.02733, 0.02812]. The input factor sensitivity analysis in descending order was IES incidence rate, ITF regression coefficient 1, impactor initial velocity, ITF regression coefficient 2, and all others (equipment mass, crew 1 body mass, crew 2 body mass) insignificant. Verification and Validation (V&V): The IMM V&V, based upon NASA STD 7009, was implemented which included an assessment of the data sets used to build CSIM. The documentation maintained includes source code comments and a technical report. The software code and documentation is under Subversion configuration management. Kinematic validation was performed by comparing the biomechanical model output to established corridors.
The Foraging Ecology of Royal and Sandwich Terns in North Carolina, USA
McGinnis, T.W.; Emslie, S.D.
2001-01-01
Population sizes of territorial male red-winged blackbirds (Agelaius phoeniceus) were determined with counts of territorial males (area count) and a Petersen-Lincoln Index method for roadsides (roadside estimate). Weather conditions and time of day did not influence either method. Combined roadside estimates had smaller error bounds than the individual transect estimates and were not hindered by the problem of zero recaptures. Roadside estimates were usually one-half as large as the area counts, presumably due to an observer bias for marked birds. The roadside estimate provides only an index of major changes in populations of territorial male redwings. When the roadside estimate is employed, the area count should be used to determine the amount and nature of observer bias. For small population surveys, the area count is probably more reliable and accurate than the roadside estimate.
Determining population size of territorial red-winged blackbirds
Albers, P.H.
1976-01-01
Population sizes of territorial male red-winged blackbirds (Agelaius phoeniceus) were determined with counts of territorial males (area count) and a Petersen-Lincoln Index method for roadsides (roadside estimate). Weather conditions and time of day did not influence either method. Combined roadside estimates had smaller error bounds than the individual transect estimates and were not hindered by the problem of zero recaptures. Roadside estimates were usually one-half as large as the area counts, presumably due to an observer bias for marked birds. The roadside estimate provides only an index of major changes in populations of territorial male redwings. When the roadside estimate is employed, the area count should be used to determine the amount and nature of observer bias. For small population surveys, the area count is probably more reliable and accurate than the roadside estimate.
On the timing problem in optical PPM communications.
NASA Technical Reports Server (NTRS)
Gagliardi, R. M.
1971-01-01
Investigation of the effects of imperfect timing in a direct-detection (noncoherent) optical system using pulse-position-modulation bits. Special emphasis is placed on specification of timing accuracy, and an examination of system degradation when this accuracy is not attained. Bit error probabilities are shown as a function of timing errors, from which average error probabilities can be computed for specific synchronization methods. Of significant importance is shown to be the presence of a residual, or irreducible error probability, due entirely to the timing system, that cannot be overcome by the data channel.
Staid, Andrea; Watson, Jean -Paul; Wets, Roger J. -B.; ...
2017-07-11
Forecasts of available wind power are critical in key electric power systems operations planning problems, including economic dispatch and unit commitment. Such forecasts are necessarily uncertain, limiting the reliability and cost effectiveness of operations planning models based on a single deterministic or “point” forecast. A common approach to address this limitation involves the use of a number of probabilistic scenarios, each specifying a possible trajectory of wind power production, with associated probability. We present and analyze a novel method for generating probabilistic wind power scenarios, leveraging available historical information in the form of forecasted and corresponding observed wind power timemore » series. We estimate non-parametric forecast error densities, specifically using epi-spline basis functions, allowing us to capture the skewed and non-parametric nature of error densities observed in real-world data. We then describe a method to generate probabilistic scenarios from these basis functions that allows users to control for the degree to which extreme errors are captured.We compare the performance of our approach to the current state-of-the-art considering publicly available data associated with the Bonneville Power Administration, analyzing aggregate production of a number of wind farms over a large geographic region. Finally, we discuss the advantages of our approach in the context of specific power systems operations planning problems: stochastic unit commitment and economic dispatch. Here, our methodology is embodied in the joint Sandia – University of California Davis Prescient software package for assessing and analyzing stochastic operations strategies.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Staid, Andrea; Watson, Jean -Paul; Wets, Roger J. -B.
Forecasts of available wind power are critical in key electric power systems operations planning problems, including economic dispatch and unit commitment. Such forecasts are necessarily uncertain, limiting the reliability and cost effectiveness of operations planning models based on a single deterministic or “point” forecast. A common approach to address this limitation involves the use of a number of probabilistic scenarios, each specifying a possible trajectory of wind power production, with associated probability. We present and analyze a novel method for generating probabilistic wind power scenarios, leveraging available historical information in the form of forecasted and corresponding observed wind power timemore » series. We estimate non-parametric forecast error densities, specifically using epi-spline basis functions, allowing us to capture the skewed and non-parametric nature of error densities observed in real-world data. We then describe a method to generate probabilistic scenarios from these basis functions that allows users to control for the degree to which extreme errors are captured.We compare the performance of our approach to the current state-of-the-art considering publicly available data associated with the Bonneville Power Administration, analyzing aggregate production of a number of wind farms over a large geographic region. Finally, we discuss the advantages of our approach in the context of specific power systems operations planning problems: stochastic unit commitment and economic dispatch. Here, our methodology is embodied in the joint Sandia – University of California Davis Prescient software package for assessing and analyzing stochastic operations strategies.« less
Gonthier, Gerard
2007-01-01
A graphical method that uses continuous water-level and barometric-pressure data was developed to estimate barometric efficiency. A plot of nearly continuous water level (on the y-axis), as a function of nearly continuous barometric pressure (on the x-axis), will plot as a line curved into a series of connected elliptical loops. Each loop represents a barometric-pressure fluctuation. The negative of the slope of the major axis of an elliptical loop will be the ratio of water-level change to barometric-pressure change, which is the sum of the barometric efficiency plus the error. The negative of the slope of the preferred orientation of many elliptical loops is an estimate of the barometric efficiency. The slope of the preferred orientation of many elliptical loops is approximately the median of the slopes of the major axes of the elliptical loops. If water-level change that is not caused by barometric-pressure change does not correlate with barometric-pressure change, the probability that the error will be greater than zero will be the same as the probability that it will be less than zero. As a result, the negative of the median of the slopes for many loops will be close to the barometric efficiency. The graphical method provided a rapid assessment of whether a well was affected by barometric-pressure change and also provided a rapid estimate of barometric efficiency. The graphical method was used to assess which wells at Air Force Plant 6, Marietta, Georgia, had water levels affected by barometric-pressure changes during a 2003 constant-discharge aquifer test. The graphical method was also used to estimate barometric efficiency. Barometric-efficiency estimates from the graphical method were compared to those of four other methods: average of ratios, median of ratios, Clark, and slope. The two methods (the graphical and median-of-ratios methods) that used the median values of water-level change divided by barometric-pressure change appeared to be most resistant to error caused by barometric-pressure-independent water-level change. The graphical method was particularly resistant to large amounts of barometric-pressure-independent water-level change, having an average and standard deviation of error for control wells that was less than one-quarter that of the other four methods. When using the graphical method, it is advisable that more than one person select the slope or that the same person fits the same data several times to minimize the effect of subjectivity. Also, a long study period should be used (at least 60 days) to ensure that loops affected by large amounts of barometric-pressure-independent water-level change do not significantly contribute to error in the barometric-efficiency estimate.
Iachan, Ronaldo; H. Johnson, Christopher; L. Harding, Richard; Kyle, Tonja; Saavedra, Pedro; L. Frazier, Emma; Beer, Linda; L. Mattson, Christine; Skarbinski, Jacek
2016-01-01
Background: Health surveys of the general US population are inadequate for monitoring human immunodeficiency virus (HIV) infection because the relatively low prevalence of the disease (<0.5%) leads to small subpopulation sample sizes. Objective: To collect a nationally and locally representative probability sample of HIV-infected adults receiving medical care to monitor clinical and behavioral outcomes, supplementing the data in the National HIV Surveillance System. This paper describes the sample design and weighting methods for the Medical Monitoring Project (MMP) and provides estimates of the size and characteristics of this population. Methods: To develop a method for obtaining valid, representative estimates of the in-care population, we implemented a cross-sectional, three-stage design that sampled 23 jurisdictions, then 691 facilities, then 9,344 HIV patients receiving medical care, using probability-proportional-to-size methods. The data weighting process followed standard methods, accounting for the probabilities of selection at each stage and adjusting for nonresponse and multiplicity. Nonresponse adjustments accounted for differing response at both facility and patient levels. Multiplicity adjustments accounted for visits to more than one HIV care facility. Results: MMP used a multistage stratified probability sampling design that was approximately self-weighting in each of the 23 project areas and nationally. The probability sample represents the estimated 421,186 HIV-infected adults receiving medical care during January through April 2009. Methods were efficient (i.e., induced small, unequal weighting effects and small standard errors for a range of weighted estimates). Conclusion: The information collected through MMP allows monitoring trends in clinical and behavioral outcomes and informs resource allocation for treatment and prevention activities. PMID:27651851
NASA Astrophysics Data System (ADS)
Pace, Phillip Eric; Tan, Chew Kung; Ong, Chee K.
2018-02-01
Direction finding (DF) systems are fundamental electronic support measures for electronic warfare. A number of DF techniques have been developed over the years; however, these systems are limited in bandwidth and resolution and suffer from a complex design for frequency downconversion. The design of a photonic DF technique for the detection and DF of low probability of intercept (LPI) signals is investigated. Key advantages of this design include a small baseline, wide bandwidth, high resolution, minimal space, weight, and power requirement. A robust postprocessing algorithm that utilizes the minimum Euclidean distance detector provides consistence and accurate estimation of angle of arrival (AoA) for a wide range of LPI waveforms. Experimental tests using frequency modulation continuous wave (FMCW) and P4 modulation signals were conducted in an anechoic chamber to verify the system design. Test results showed that the photonic DF system is capable of measuring the AoA of the LPI signals with 1-deg resolution over a 180 deg field-of-view. For an FMCW signal, the AoA was determined with a RMS error of 0.29 deg at 1-deg resolution. For a P4 coded signal, the RMS error in estimating the AoA is 0.32 deg at 1-deg resolution.
Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R
2012-01-01
The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l. a major black-fly vector of Onchoceriasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and, (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S.damnosum s.l. riverine larval habitat explanatory attributes regardless how they are treated (e.g., independent, autoregressive, Toeplitz, etc). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into proc genmod. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed also in ArcGIS using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61m wavbands ). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter estimators from the sampled data. Thereafter, Durbin-Watson test statistics were used to test the null hypothesis that the regression residuals were not autocorrelated against the alternative that the residuals followed an autoregressive process in AUTOREG. Bayesian uncertainty matrices were also constructed employing normal priors for each of the sampled estimators in PROC MCMC. The residuals revealed both spatially structured and unstructured error effects in the high and low ABR-stratified clusters. The analyses also revealed that the estimators, levels of turbidity and presence of rocks were statistically significant for the high-ABR-stratified clusters, while the estimators distance between habitats and floating vegetation were important for the low-ABR-stratified cluster. Varying and constant coefficient regression models, ABR- stratified GIS-generated clusters, sub-meter resolution satellite imagery, a robust residual intra-cluster diagnostic test, MBR-based histograms, eigendecomposition spatial filter algorithms and Bayesian matrices can enable accurate autoregressive estimation of latent uncertainity affects and other residual error probabilities (i.e., heteroskedasticity) for testing correlations between georeferenced S. damnosum s.l. riverine larval habitat estimators. The asymptotic distribution of the resulting residual adjusted intra-cluster predictor error autocovariate coefficients can thereafter be established while estimates of the asymptotic variance can lead to the construction of approximate confidence intervals for accurately targeting productive S. damnosum s.l habitats based on spatiotemporal field-sampled count data.
Incorporating harvest rates into the sex-age-kill model for white-tailed deer
Norton, Andrew S.; Diefenbach, Duane R.; Rosenberry, Christopher S.; Wallingford, Bret D.
2013-01-01
Although monitoring population trends is an essential component of game species management, wildlife managers rarely have complete counts of abundance. Often, they rely on population models to monitor population trends. As imperfect representations of real-world populations, models must be rigorously evaluated to be applied appropriately. Previous research has evaluated population models for white-tailed deer (Odocoileus virginianus); however, the precision and reliability of these models when tested against empirical measures of variability and bias largely is untested. We were able to statistically evaluate the Pennsylvania sex-age-kill (PASAK) population model using realistic error measured using data from 1,131 radiocollared white-tailed deer in Pennsylvania from 2002 to 2008. We used these data and harvest data (number killed, age-sex structure, etc.) to estimate precision of abundance estimates, identify the most efficient harvest data collection with respect to precision of parameter estimates, and evaluate PASAK model robustness to violation of assumptions. Median coefficient of variation (CV) estimates by Wildlife Management Unit, 13.2% in the most recent year, were slightly above benchmarks recommended for managing game species populations. Doubling reporting rates by hunters or doubling the number of deer checked by personnel in the field reduced median CVs to recommended levels. The PASAK model was robust to errors in estimates for adult male harvest rates but was sensitive to errors in subadult male harvest rates, especially in populations with lower harvest rates. In particular, an error in subadult (1.5-yr-old) male harvest rates resulted in the opposite error in subadult male, adult female, and juvenile population estimates. Also, evidence of a greater harvest probability for subadult female deer when compared with adult (≥2.5-yr-old) female deer resulted in a 9.5% underestimate of the population using the PASAK model. Because obtaining appropriate sample sizes, by management unit, to estimate harvest rate parameters each year may be too expensive, assumptions of constant annual harvest rates may be necessary. However, if changes in harvest regulations or hunter behavior influence subadult male harvest rates, the PASAK model could provide an unreliable index to population changes.
Zhao, Wei; Cella, Massimo; Della Pasqua, Oscar; Burger, David; Jacqz-Aigrain, Evelyne
2012-04-01
Abacavir is used to treat HIV infection in both adults and children. The recommended paediatric dose is 8 mg kg(-1) twice daily up to a maximum of 300 mg twice daily. Weight was identified as the central covariate influencing pharmacokinetics of abacavir in children. A population pharmacokinetic model was developed to describe both once and twice daily pharmacokinetic profiles of abacavir in infants and toddlers. Standard dosage regimen is associated with large interindividual variability in abacavir concentrations. A maximum a posteriori probability Bayesian estimator of AUC(0-) (t) based on three time points (0, 1 or 2, and 3 h) is proposed to support area under the concentration-time curve (AUC) targeted individualized therapy in infants and toddlers. To develop a population pharmacokinetic model for abacavir in HIV-infected infants and toddlers, which will be used to describe both once and twice daily pharmacokinetic profiles, identify covariates that explain variability and propose optimal time points to optimize the area under the concentration-time curve (AUC) targeted dosage and individualize therapy. The pharmacokinetics of abacavir was described with plasma concentrations from 23 patients using nonlinear mixed-effects modelling (NONMEM) software. A two-compartment model with first-order absorption and elimination was developed. The final model was validated using bootstrap, visual predictive check and normalized prediction distribution errors. The Bayesian estimator was validated using the cross-validation and simulation-estimation method. The typical population pharmacokinetic parameters and relative standard errors (RSE) were apparent systemic clearance (CL) 13.4 () h−1 (RSE 6.3%), apparent central volume of distribution 4.94 () (RSE 28.7%), apparent peripheral volume of distribution 8.12 () (RSE14.2%), apparent intercompartment clearance 1.25 () h−1 (RSE 16.9%) and absorption rate constant 0.758 h−1 (RSE 5.8%). The covariate analysis identified weight as the individual factor influencing the apparent oral clearance: CL = 13.4 × (weight/12)1.14. The maximum a posteriori probability Bayesian estimator, based on three concentrations measured at 0, 1 or 2, and 3 h after drug intake allowed predicting individual AUC0–t. The population pharmacokinetic model developed for abacavir in HIV-infected infants and toddlers accurately described both once and twice daily pharmacokinetic profiles. The maximum a posteriori probability Bayesian estimator of AUC(0-) (t) was developed from the final model and can be used routinely to optimize individual dosing. © 2011 The Authors. British Journal of Clinical Pharmacology © 2011 The British Pharmacological Society.
A robust algorithm for automated target recognition using precomputed radar cross sections
NASA Astrophysics Data System (ADS)
Ehrman, Lisa M.; Lanterman, Aaron D.
2004-09-01
Passive radar is an emerging technology that offers a number of unique benefits, including covert operation. Many such systems are already capable of detecting and tracking aircraft. The goal of this work is to develop a robust algorithm for adding automated target recognition (ATR) capabilities to existing passive radar systems. In previous papers, we proposed conducting ATR by comparing the precomputed RCS of known targets to that of detected targets. To make the precomputed RCS as accurate as possible, a coordinated flight model is used to estimate aircraft orientation. Once the aircraft's position and orientation are known, it is possible to determine the incident and observed angles on the aircraft, relative to the transmitter and receiver. This makes it possible to extract the appropriate radar cross section (RCS) from our simulated database. This RCS is then scaled to account for propagation losses and the receiver's antenna gain. A Rician likelihood model compares these expected signals from different targets to the received target profile. We have previously employed Monte Carlo runs to gauge the probability of error in the ATR algorithm; however, generation of a statistically significant set of Monte Carlo runs is computationally intensive. As an alternative to Monte Carlo runs, we derive the relative entropy (also known as Kullback-Liebler distance) between two Rician distributions. Since the probability of Type II error in our hypothesis testing problem can be expressed as a function of the relative entropy via Stein's Lemma, this provides us with a computationally efficient method for determining an upper bound on our algorithm's performance. It also provides great insight into the types of classification errors we can expect from our algorithm. This paper compares the numerically approximated probability of Type II error with the results obtained from a set of Monte Carlo runs.
The discrete Laplace exponential family and estimation of Y-STR haplotype frequencies.
Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels
2013-07-21
Estimating haplotype frequencies is important in e.g. forensic genetics, where the frequencies are needed to calculate the likelihood ratio for the evidential weight of a DNA profile found at a crime scene. Estimation is naturally based on a population model, motivating the investigation of the Fisher-Wright model of evolution for haploid lineage DNA markers. An exponential family (a class of probability distributions that is well understood in probability theory such that inference is easily made by using existing software) called the 'discrete Laplace distribution' is described. We illustrate how well the discrete Laplace distribution approximates a more complicated distribution that arises by investigating the well-known population genetic Fisher-Wright model of evolution by a single-step mutation process. It was shown how the discrete Laplace distribution can be used to estimate haplotype frequencies for haploid lineage DNA markers (such as Y-chromosomal short tandem repeats), which in turn can be used to assess the evidential weight of a DNA profile found at a crime scene. This was done by making inference in a mixture of multivariate, marginally independent, discrete Laplace distributions using the EM algorithm to estimate the probabilities of membership of a set of unobserved subpopulations. The discrete Laplace distribution can be used to estimate haplotype frequencies with lower prediction error than other existing estimators. Furthermore, the calculations could be performed on a normal computer. This method was implemented in the freely available open source software R that is supported on Linux, MacOS and MS Windows. Copyright © 2013 Elsevier Ltd. All rights reserved.
Analytical Plug-In Method for Kernel Density Estimator Applied to Genetic Neutrality Study
NASA Astrophysics Data System (ADS)
Troudi, Molka; Alimi, Adel M.; Saoudi, Samir
2008-12-01
The plug-in method enables optimization of the bandwidth of the kernel density estimator in order to estimate probability density functions (pdfs). Here, a faster procedure than that of the common plug-in method is proposed. The mean integrated square error (MISE) depends directly upon [InlineEquation not available: see fulltext.] which is linked to the second-order derivative of the pdf. As we intend to introduce an analytical approximation of [InlineEquation not available: see fulltext.], the pdf is estimated only once, at the end of iterations. These two kinds of algorithm are tested on different random variables having distributions known for their difficult estimation. Finally, they are applied to genetic data in order to provide a better characterisation in the mean of neutrality of Tunisian Berber populations.
Perreault Levasseur, Laurence; Hezaveh, Yashar D.; Wechsler, Risa H.
2017-11-15
In Hezaveh et al. (2017) we showed that deep learning can be used for model parameter estimation and trained convolutional neural networks to determine the parameters of strong gravitational lensing systems. Here we demonstrate a method for obtaining the uncertainties of these parameters. We review the framework of variational inference to obtain approximate posteriors of Bayesian neural networks and apply it to a network trained to estimate the parameters of the Singular Isothermal Ellipsoid plus external shear and total flux magnification. We show that the method can capture the uncertainties due to different levels of noise in the input data,more » as well as training and architecture-related errors made by the network. To evaluate the accuracy of the resulting uncertainties, we calculate the coverage probabilities of marginalized distributions for each lensing parameter. By tuning a single hyperparameter, the dropout rate, we obtain coverage probabilities approximately equal to the confidence levels for which they were calculated, resulting in accurate and precise uncertainty estimates. Our results suggest that neural networks can be a fast alternative to Monte Carlo Markov Chains for parameter uncertainty estimation in many practical applications, allowing more than seven orders of magnitude improvement in speed.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perreault Levasseur, Laurence; Hezaveh, Yashar D.; Wechsler, Risa H.
In Hezaveh et al. (2017) we showed that deep learning can be used for model parameter estimation and trained convolutional neural networks to determine the parameters of strong gravitational lensing systems. Here we demonstrate a method for obtaining the uncertainties of these parameters. We review the framework of variational inference to obtain approximate posteriors of Bayesian neural networks and apply it to a network trained to estimate the parameters of the Singular Isothermal Ellipsoid plus external shear and total flux magnification. We show that the method can capture the uncertainties due to different levels of noise in the input data,more » as well as training and architecture-related errors made by the network. To evaluate the accuracy of the resulting uncertainties, we calculate the coverage probabilities of marginalized distributions for each lensing parameter. By tuning a single hyperparameter, the dropout rate, we obtain coverage probabilities approximately equal to the confidence levels for which they were calculated, resulting in accurate and precise uncertainty estimates. Our results suggest that neural networks can be a fast alternative to Monte Carlo Markov Chains for parameter uncertainty estimation in many practical applications, allowing more than seven orders of magnitude improvement in speed.« less
Sample Size Determination for Rasch Model Tests
ERIC Educational Resources Information Center
Draxler, Clemens
2010-01-01
This paper is concerned with supplementing statistical tests for the Rasch model so that additionally to the probability of the error of the first kind (Type I probability) the probability of the error of the second kind (Type II probability) can be controlled at a predetermined level by basing the test on the appropriate number of observations.…
Prediction-error variance in Bayesian model updating: a comparative study
NASA Astrophysics Data System (ADS)
Asadollahi, Parisa; Li, Jian; Huang, Yong
2017-04-01
In Bayesian model updating, the likelihood function is commonly formulated by stochastic embedding in which the maximum information entropy probability model of prediction error variances plays an important role and it is Gaussian distribution subject to the first two moments as constraints. The selection of prediction error variances can be formulated as a model class selection problem, which automatically involves a trade-off between the average data-fit of the model class and the information it extracts from the data. Therefore, it is critical for the robustness in the updating of the structural model especially in the presence of modeling errors. To date, three ways of considering prediction error variances have been seem in the literature: 1) setting constant values empirically, 2) estimating them based on the goodness-of-fit of the measured data, and 3) updating them as uncertain parameters by applying Bayes' Theorem at the model class level. In this paper, the effect of different strategies to deal with the prediction error variances on the model updating performance is investigated explicitly. A six-story shear building model with six uncertain stiffness parameters is employed as an illustrative example. Transitional Markov Chain Monte Carlo is used to draw samples of the posterior probability density function of the structure model parameters as well as the uncertain prediction variances. The different levels of modeling uncertainty and complexity are modeled through three FE models, including a true model, a model with more complexity, and a model with modeling error. Bayesian updating is performed for the three FE models considering the three aforementioned treatments of the prediction error variances. The effect of number of measurements on the model updating performance is also examined in the study. The results are compared based on model class assessment and indicate that updating the prediction error variances as uncertain parameters at the model class level produces more robust results especially when the number of measurement is small.
A post audit of a model-designed ground water extraction system.
Andersen, Peter F; Lu, Silong
2003-01-01
Model post audits test the predictive capabilities of ground water models and shed light on their practical limitations. In the work presented here, ground water model predictions were used to design an extraction/treatment/injection system at a military ammunition facility and then were re-evaluated using site-specific water-level data collected approximately one year after system startup. The water-level data indicated that performance specifications for the design, i.e., containment, had been achieved over the required area, but that predicted water-level changes were greater than observed, particularly in the deeper zones of the aquifer. Probable model error was investigated by determining the changes that were required to obtain an improved match to observed water-level changes. This analysis suggests that the originally estimated hydraulic properties were in error by a factor of two to five. These errors may have resulted from attributing less importance to data from deeper zones of the aquifer and from applying pumping test results to a volume of material that was larger than the volume affected by the pumping test. To determine the importance of these errors to the predictions of interest, the models were used to simulate the capture zones resulting from the originally estimated and updated parameter values. The study suggests that, despite the model error, the ground water model contributed positively to the design of the remediation system.
Multiple-Event Seismic Location Using the Markov-Chain Monte Carlo Technique
NASA Astrophysics Data System (ADS)
Myers, S. C.; Johannesson, G.; Hanley, W.
2005-12-01
We develop a new multiple-event location algorithm (MCMCloc) that utilizes the Markov-Chain Monte Carlo (MCMC) method. Unlike most inverse methods, the MCMC approach produces a suite of solutions, each of which is consistent with observations and prior estimates of data and model uncertainties. Model parameters in MCMCloc consist of event hypocenters, and travel-time predictions. Data are arrival time measurements and phase assignments. Posteriori estimates of event locations, path corrections, pick errors, and phase assignments are made through analysis of the posteriori suite of acceptable solutions. Prior uncertainty estimates include correlations between travel-time predictions, correlations between measurement errors, the probability of misidentifying one phase for another, and the probability of spurious data. Inclusion of prior constraints on location accuracy allows direct utilization of ground-truth locations or well-constrained location parameters (e.g. from InSAR) that aid in the accuracy of the solution. Implementation of a correlation structure for travel-time predictions allows MCMCloc to operate over arbitrarily large geographic areas. Transition in behavior between a multiple-event locator for tightly clustered events and a single-event locator for solitary events is controlled by the spatial correlation of travel-time predictions. We test the MCMC locator on a regional data set of Nevada Test Site nuclear explosions. Event locations and origin times are known for these events, allowing us to test the features of MCMCloc using a high-quality ground truth data set. Preliminary tests suggest that MCMCloc provides excellent relative locations, often outperforming traditional multiple-event location algorithms, and excellent absolute locations are attained when constraints from one or more ground truth event are included. When phase assignments are switched, we find that MCMCloc properly corrects the error when predicted arrival times are separated by several seconds. In cases where the predicted arrival times are within the combined uncertainty of prediction and measurement errors, MCMCloc determines the probability of one or the other phase assignment and propagates this uncertainty into all model parameters. We find that MCMCloc is a promising method for simultaneously locating large, geographically distributed data sets. Because we incorporate prior knowledge on many parameters, MCMCloc is ideal for combining trusted data with data of unknown reliability. This work was performed under the auspices of the U.S. Department of Energy by the University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48, Contribution UCRL-ABS-215048
Population viability analysis with species occurrence data from museum collections.
Skarpaas, Olav; Stabbetorp, Odd E
2011-06-01
The most comprehensive data on many species come from scientific collections. Thus, we developed a method of population viability analysis (PVA) in which this type of occurrence data can be used. In contrast to classical PVA, our approach accounts for the inherent observation error in occurrence data and allows the estimation of the population parameters needed for viability analysis. We tested the sensitivity of the approach to spatial resolution of the data, length of the time series, sampling effort, and detection probability with simulated data and conducted PVAs for common, rare, and threatened species. We compared the results of these PVAs with results of standard method PVAs in which observation error is ignored. Our method provided realistic estimates of population growth terms and quasi-extinction risk in cases in which the standard method without observation error could not. For low values of any of the sampling variables we tested, precision decreased, and in some cases biased estimates resulted. The results of our PVAs with the example species were consistent with information in the literature on these species. Our approach may facilitate PVA for a wide range of species of conservation concern for which demographic data are lacking but occurrence data are readily available. ©2011 Society for Conservation Biology.
Dorazio, Robert M.
2012-01-01
Several models have been developed to predict the geographic distribution of a species by combining measurements of covariates of occurrence at locations where the species is known to be present with measurements of the same covariates at other locations where species occurrence status (presence or absence) is unknown. In the absence of species detection errors, spatial point-process models and binary-regression models for case-augmented surveys provide consistent estimators of a species’ geographic distribution without prior knowledge of species prevalence. In addition, these regression models can be modified to produce estimators of species abundance that are asymptotically equivalent to those of the spatial point-process models. However, if species presence locations are subject to detection errors, neither class of models provides a consistent estimator of covariate effects unless the covariates of species abundance are distinct and independently distributed from the covariates of species detection probability. These analytical results are illustrated using simulation studies of data sets that contain a wide range of presence-only sample sizes. Analyses of presence-only data of three avian species observed in a survey of landbirds in western Montana and northern Idaho are compared with site-occupancy analyses of detections and nondetections of these species.
Encircling the dark: constraining dark energy via cosmic density in spheres
NASA Astrophysics Data System (ADS)
Codis, S.; Pichon, C.; Bernardeau, F.; Uhlemann, C.; Prunet, S.
2016-08-01
The recently published analytic probability density function for the mildly non-linear cosmic density field within spherical cells is used to build a simple but accurate maximum likelihood estimate for the redshift evolution of the variance of the density, which, as expected, is shown to have smaller relative error than the sample variance. This estimator provides a competitive probe for the equation of state of dark energy, reaching a few per cent accuracy on wp and wa for a Euclid-like survey. The corresponding likelihood function can take into account the configuration of the cells via their relative separations. A code to compute one-cell-density probability density functions for arbitrary initial power spectrum, top-hat smoothing and various spherical-collapse dynamics is made available online, so as to provide straightforward means of testing the effect of alternative dark energy models and initial power spectra on the low-redshift matter distribution.
NASA Astrophysics Data System (ADS)
Ballari, D.; Castro, E.; Campozano, L.
2016-06-01
Precipitation monitoring is of utmost importance for water resource management. However, in regions of complex terrain such as Ecuador, the high spatio-temporal precipitation variability and the scarcity of rain gauges, make difficult to obtain accurate estimations of precipitation. Remotely sensed estimated precipitation, such as the Multi-satellite Precipitation Analysis TRMM, can cope with this problem after a validation process, which must be representative in space and time. In this work we validate monthly estimates from TRMM 3B43 satellite precipitation (0.25° x 0.25° resolution), by using ground data from 14 rain gauges in Ecuador. The stations are located in the 3 most differentiated regions of the country: the Pacific coastal plains, the Andean highlands, and the Amazon rainforest. Time series, between 1998 - 2010, of imagery and rain gauges were compared using statistical error metrics such as bias, root mean square error, and Pearson correlation; and with detection indexes such as probability of detection, equitable threat score, false alarm rate and frequency bias index. The results showed that precipitation seasonality is well represented and TRMM 3B43 acceptably estimates the monthly precipitation in the three regions of the country. According to both, statistical error metrics and detection indexes, the coastal and Amazon regions are better estimated quantitatively than the Andean highlands. Additionally, it was found that there are better estimations for light precipitation rates. The present validation of TRMM 3B43 provides important results to support further studies on calibration and bias correction of precipitation in ungagged watershed basins.
Sensitivity of feedforward neural networks to weight errors
NASA Technical Reports Server (NTRS)
Stevenson, Maryhelen; Widrow, Bernard; Winter, Rodney
1990-01-01
An analysis is made of the sensitivity of feedforward layered networks of Adaline elements (threshold logic units) to weight errors. An approximation is derived which expresses the probability of error for an output neuron of a large network (a network with many neurons per layer) as a function of the percentage change in the weights. As would be expected, the probability of error increases with the number of layers in the network and with the percentage change in the weights. The probability of error is essentially independent of the number of weights per neuron and of the number of neurons per layer, as long as these numbers are large (on the order of 100 or more).
Predicting the Earth encounters of (99942) Apophis
NASA Technical Reports Server (NTRS)
Giorgini, Jon D.; Benner, Lance A. M.; Ostro, Steven J.; Nolan, Michael C.; Busch, Michael W.
2007-01-01
Arecibo delay-Doppler measurements of (99942) Apophis in 2005 and 2006 resulted in a five standard-deviation trajectory correction to the optically predicted close approach distance to Earth in 2029. The radar measurements reduced the volume of the statistical uncertainty region entering the encounter to 7.3% of the pre-radar solution, but increased the trajectory uncertainty growth rate across the encounter by 800% due to the closer predicted approach to the Earth. A small estimated Earth impact probability remained for 2036. With standard-deviation plane-of-sky position uncertainties for 2007-2010 already less than 0.2 arcsec, the best near-term ground-based optical astrometry can only weakly affect the trajectory estimate. While the potential for impact in 2036 will likely be excluded in 2013 (if not 2011) using ground-based optical measurements, approximations within the Standard Dynamical Model (SDM) used to estimate and predict the trajectory from the current era are sufficient to obscure the difference between a predicted impact and a miss in 2036 by altering the dynamics leading into the 2029 encounter. Normal impact probability assessments based on the SDM become problematic without knowledge of the object's physical properties; impact could be excluded while the actual dynamics still permit it. Calibrated position uncertainty intervals are developed to compensate for this by characterizing the minimum and maximum effect of physical parameters on the trajectory. Uncertainty in accelerations related to solar radiation can cause between 82 and 4720 Earth-radii of trajectory change relative to the SDM by 2036. If an actionable hazard exists, alteration by 2-10% of Apophis' total absorption of solar radiation in 2018 could be sufficient to produce a six standard-deviation trajectory change by 2036 given physical characterization; even a 0.5% change could produce a trajectory shift of one Earth-radius by 2036 for all possible spin-poles and likely masses. Planetary ephemeris uncertainties are the next greatest source of systematic error, causing up to 23 Earth-radii of uncertainty. The SDM Earth point-mass assumption introduces an additional 2.9 Earth-radii of prediction error by 2036. Unmodeled asteroid perturbations produce as much as 2.3 Earth-radii of error. We find no future small-body encounters likely to yield an Apophis mass determination prior to 2029. However, asteroid (144898) 2004 VD17, itself having a statistical Earth impact in 2102, will probably encounter Apophis at 6.7 lunar distances in 2034, their uncertainty regions coming as close as 1.6 lunar distances near the center of both SDM probability distributions.
Rothmann, Mark
2005-01-01
When testing the equality of means from two different populations, a t-test or large sample normal test tend to be performed. For these tests, when the sample size or design for the second sample is dependent on the results of the first sample, the type I error probability is altered for each specific possibility in the null hypothesis. We will examine the impact on the type I error probabilities for two confidence interval procedures and procedures using test statistics when the design for the second sample or experiment is dependent on the results from the first sample or experiment (or series of experiments). Ways for controlling a desired maximum type I error probability or a desired type I error rate will be discussed. Results are applied to the setting of noninferiority comparisons in active controlled trials where the use of a placebo is unethical.
Inverse and forward modeling under uncertainty using MRE-based Bayesian approach
NASA Astrophysics Data System (ADS)
Hou, Z.; Rubin, Y.
2004-12-01
A stochastic inverse approach for subsurface characterization is proposed and applied to shallow vadose zone at a winery field site in north California and to a gas reservoir at the Ormen Lange field site in the North Sea. The approach is formulated in a Bayesian-stochastic framework, whereby the unknown parameters are identified in terms of their statistical moments or their probabilities. Instead of the traditional single-valued estimation /prediction provided by deterministic methods, the approach gives a probability distribution for an unknown parameter. This allows calculating the mean, the mode, and the confidence interval, which is useful for a rational treatment of uncertainty and its consequences. The approach also allows incorporating data of various types and different error levels, including measurements of state variables as well as information such as bounds on or statistical moments of the unknown parameters, which may represent prior information. To obtain minimally subjective prior probabilities required for the Bayesian approach, the principle of Minimum Relative Entropy (MRE) is employed. The approach is tested in field sites for flow parameters identification and soil moisture estimation in the vadose zone and for gas saturation estimation at great depth below the ocean floor. Results indicate the potential of coupling various types of field data within a MRE-based Bayesian formalism for improving the estimation of the parameters of interest.
Poster error probability in the Mu-11 Sequential Ranging System
NASA Technical Reports Server (NTRS)
Coyle, C. W.
1981-01-01
An expression is derived for the posterior error probability in the Mu-2 Sequential Ranging System. An algorithm is developed which closely bounds the exact answer and can be implemented in the machine software. A computer simulation is provided to illustrate the improved level of confidence in a ranging acquisition using this figure of merit as compared to that using only the prior probabilities. In a simulation of 20,000 acquisitions with an experimentally determined threshold setting, the algorithm detected 90% of the actual errors and made false indication of errors on 0.2% of the acquisitions.
Structural Reliability Using Probability Density Estimation Methods Within NESSUS
NASA Technical Reports Server (NTRS)
Chamis, Chrisos C. (Technical Monitor); Godines, Cody Ric
2003-01-01
A reliability analysis studies a mathematical model of a physical system taking into account uncertainties of design variables and common results are estimations of a response density, which also implies estimations of its parameters. Some common density parameters include the mean value, the standard deviation, and specific percentile(s) of the response, which are measures of central tendency, variation, and probability regions, respectively. Reliability analyses are important since the results can lead to different designs by calculating the probability of observing safe responses in each of the proposed designs. All of this is done at the expense of added computational time as compared to a single deterministic analysis which will result in one value of the response out of many that make up the density of the response. Sampling methods, such as monte carlo (MC) and latin hypercube sampling (LHS), can be used to perform reliability analyses and can compute nonlinear response density parameters even if the response is dependent on many random variables. Hence, both methods are very robust; however, they are computationally expensive to use in the estimation of the response density parameters. Both methods are 2 of 13 stochastic methods that are contained within the Numerical Evaluation of Stochastic Structures Under Stress (NESSUS) program. NESSUS is a probabilistic finite element analysis (FEA) program that was developed through funding from NASA Glenn Research Center (GRC). It has the additional capability of being linked to other analysis programs; therefore, probabilistic fluid dynamics, fracture mechanics, and heat transfer are only a few of what is possible with this software. The LHS method is the newest addition to the stochastic methods within NESSUS. Part of this work was to enhance NESSUS with the LHS method. The new LHS module is complete, has been successfully integrated with NESSUS, and been used to study four different test cases that have been proposed by the Society of Automotive Engineers (SAE). The test cases compare different probabilistic methods within NESSUS because it is important that a user can have confidence that estimates of stochastic parameters of a response will be within an acceptable error limit. For each response, the mean, standard deviation, and 0.99 percentile, are repeatedly estimated which allows confidence statements to be made for each parameter estimated, and for each method. Thus, the ability of several stochastic methods to efficiently and accurately estimate density parameters is compared using four valid test cases. While all of the reliability methods used performed quite well, for the new LHS module within NESSUS it was found that it had a lower estimation error than MC when they were used to estimate the mean, standard deviation, and 0.99 percentile of the four different stochastic responses. Also, LHS required a smaller amount of calculations to obtain low error answers with a high amount of confidence than MC. It can therefore be stated that NESSUS is an important reliability tool that has a variety of sound probabilistic methods a user can employ and the newest LHS module is a valuable new enhancement of the program.
NASA Technical Reports Server (NTRS)
Furnstenau, Norbert; Ellis, Stephen R.
2015-01-01
In order to determine the required visual frame rate (FR) for minimizing prediction errors with out-the-window video displays at remote/virtual airport towers, thirteen active air traffic controllers viewed high dynamic fidelity simulations of landing aircraft and decided whether aircraft would stop as if to be able to make a turnoff or whether a runway excursion would be expected. The viewing conditions and simulation dynamics replicated visual rates and environments of transport aircraft landing at small commercial airports. The required frame rate was estimated using Bayes inference on prediction errors by linear FRextrapolation of event probabilities conditional on predictions (stop, no-stop). Furthermore estimates were obtained from exponential model fits to the parametric and non-parametric perceptual discriminabilities d' and A (average area under ROC-curves) as dependent on FR. Decision errors are biased towards preference of overshoot and appear due to illusionary increase in speed at low frames rates. Both Bayes and A - extrapolations yield a framerate requirement of 35 < FRmin < 40 Hz. When comparing with published results [12] on shooter game scores the model based d'(FR)-extrapolation exhibits the best agreement and indicates even higher FRmin > 40 Hz for minimizing decision errors. Definitive recommendations require further experiments with FR > 30 Hz.
On estimating probability of presence from use-availability or presence-background data.
Phillips, Steven J; Elith, Jane
2013-06-01
A fundamental ecological modeling task is to estimate the probability that a species is present in (or uses) a site, conditional on environmental variables. For many species, available data consist of "presence" data (locations where the species [or evidence of it] has been observed), together with "background" data, a random sample of available environmental conditions. Recently published papers disagree on whether probability of presence is identifiable from such presence-background data alone. This paper aims to resolve the disagreement, demonstrating that additional information is required. We defined seven simulated species representing various simple shapes of response to environmental variables (constant, linear, convex, unimodal, S-shaped) and ran five logistic model-fitting methods using 1000 presence samples and 10 000 background samples; the simulations were repeated 100 times. The experiment revealed a stark contrast between two groups of methods: those based on a strong assumption that species' true probability of presence exactly matches a given parametric form had highly variable predictions and much larger RMS error than methods that take population prevalence (the fraction of sites in which the species is present) as an additional parameter. For six species, the former group grossly under- or overestimated probability of presence. The cause was not model structure or choice of link function, because all methods were logistic with linear and, where necessary, quadratic terms. Rather, the experiment demonstrates that an estimate of prevalence is not just helpful, but is necessary (except in special cases) for identifying probability of presence. We therefore advise against use of methods that rely on the strong assumption, due to Lele and Keim (recently advocated by Royle et al.) and Lancaster and Imbens. The methods are fragile, and their strong assumption is unlikely to be true in practice. We emphasize, however, that we are not arguing against standard statistical methods such as logistic regression, generalized linear models, and so forth, none of which requires the strong assumption. If probability of presence is required for a given application, there is no panacea for lack of data. Presence-background data must be augmented with an additional datum, e.g., species' prevalence, to reliably estimate absolute (rather than relative) probability of presence.
Performance Analysis of HF Band FB-MC-SS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hussein Moradi; Stephen Andrew Laraway; Behrouz Farhang-Boroujeny
Abstract—In a recent paper [1] the filter bank multicarrier spread spectrum (FB-MC-SS) waveform was proposed for wideband spread spectrum HF communications. A significant benefit of this waveform is robustness against narrow and partial band interference. Simulation results in [1] demonstrated good performance in a wideband HF channel over a wide range of conditions. In this paper we present a theoretical analysis of the bit error probably for this system. Our analysis tailors the results from [2] where BER performance was analyzed for maximum ration combining systems that accounted for correlation between subcarriers and channel estimation error. Equations are give formore » BER that closely match the simulated performance in most situations.« less
Cuba: Multidimensional numerical integration library
NASA Astrophysics Data System (ADS)
Hahn, Thomas
2016-08-01
The Cuba library offers four independent routines for multidimensional numerical integration: Vegas, Suave, Divonne, and Cuhre. The four algorithms work by very different methods, and can integrate vector integrands and have very similar Fortran, C/C++, and Mathematica interfaces. Their invocation is very similar, making it easy to cross-check by substituting one method by another. For further safeguarding, the output is supplemented by a chi-square probability which quantifies the reliability of the error estimate.
Statistical plant set estimation using Schroeder-phased multisinusoidal input design
NASA Technical Reports Server (NTRS)
Bayard, D. S.
1992-01-01
A frequency domain method is developed for plant set estimation. The estimation of a plant 'set' rather than a point estimate is required to support many methods of modern robust control design. The approach here is based on using a Schroeder-phased multisinusoid input design which has the special property of placing input energy only at the discrete frequency points used in the computation. A detailed analysis of the statistical properties of the frequency domain estimator is given, leading to exact expressions for the probability distribution of the estimation error, and many important properties. It is shown that, for any nominal parametric plant estimate, one can use these results to construct an overbound on the additive uncertainty to any prescribed statistical confidence. The 'soft' bound thus obtained can be used to replace 'hard' bounds presently used in many robust control analysis and synthesis methods.
A Quantum Theoretical Explanation for Probability Judgment Errors
ERIC Educational Resources Information Center
Busemeyer, Jerome R.; Pothos, Emmanuel M.; Franco, Riccardo; Trueblood, Jennifer S.
2011-01-01
A quantum probability model is introduced and used to explain human probability judgment errors including the conjunction and disjunction fallacies, averaging effects, unpacking effects, and order effects on inference. On the one hand, quantum theory is similar to other categorization and memory models of cognition in that it relies on vector…
A concatenated coding scheme for error control
NASA Technical Reports Server (NTRS)
Lin, S.
1985-01-01
A concatenated coding scheme for error contol in data communications was analyzed. The inner code is used for both error correction and detection, however the outer code is used only for error detection. A retransmission is requested if either the inner code decoder fails to make a successful decoding or the outer code decoder detects the presence of errors after the inner code decoding. Probability of undetected error of the proposed scheme is derived. An efficient method for computing this probability is presented. Throughout efficiency of the proposed error control scheme incorporated with a selective repeat ARQ retransmission strategy is analyzed.
The effect of timing errors in optical digital systems.
NASA Technical Reports Server (NTRS)
Gagliardi, R. M.
1972-01-01
The use of digital transmission with narrow light pulses appears attractive for data communications, but carries with it a stringent requirement on system bit timing. The effects of imperfect timing in direct-detection (noncoherent) optical binary systems are investigated using both pulse-position modulation and on-off keying for bit transmission. Particular emphasis is placed on specification of timing accuracy and an examination of system degradation when this accuracy is not attained. Bit error probabilities are shown as a function of timing errors from which average error probabilities can be computed for specific synchronization methods. Of significance is the presence of a residual or irreducible error probability in both systems, due entirely to the timing system, which cannot be overcome by the data channel.
Drought Persistence in Models and Observations
NASA Astrophysics Data System (ADS)
Moon, Heewon; Gudmundsson, Lukas; Seneviratne, Sonia
2017-04-01
Many regions of the world have experienced drought events that persisted several years and caused substantial economic and ecological impacts in the 20th century. However, it remains unclear whether there are significant trends in the frequency or severity of these prolonged drought events. In particular, an important issue is linked to systematic biases in the representation of persistent drought events in climate models, which impedes analysis related to the detection and attribution of drought trends. This study assesses drought persistence errors in global climate model (GCM) simulations from the 5th phase of Coupled Model Intercomparison Project (CMIP5), in the period of 1901-2010. The model simulations are compared with five gridded observational data products. The analysis focuses on two aspects: the identification of systematic biases in the models and the partitioning of the spread of drought-persistence-error into four possible sources of uncertainty: model uncertainty, observation uncertainty, internal climate variability and the estimation error of drought persistence. We use monthly and yearly dry-to-dry transition probabilities as estimates for drought persistence with drought conditions defined as negative precipitation anomalies. For both time scales we find that most model simulations consistently underestimated drought persistence except in a few regions such as India and Eastern South America. Partitioning the spread of the drought-persistence-error shows that at the monthly time scale model uncertainty and observation uncertainty are dominant, while the contribution from internal variability does play a minor role in most cases. At the yearly scale, the spread of the drought-persistence-error is dominated by the estimation error, indicating that the partitioning is not statistically significant, due to a limited number of considered time steps. These findings reveal systematic errors in the representation of drought persistence in current climate models and highlight the main contributors of uncertainty of drought-persistence-error. Future analyses will focus on investigating the temporal propagation of drought persistence to better understand the causes for the identified errors in the representation of drought persistence in state-of-the-art climate models.
Locally Weighted Score Estimation for Quantile Classification in Binary Regression Models
Rice, John D.; Taylor, Jeremy M. G.
2016-01-01
One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this is given to us a priori, it is sensible to incorporate the threshold into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but diers from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice. PMID:28018492
A new multistage groundwater transport inverse method: presentation, evaluation, and implications
Anderman, Evan R.; Hill, Mary C.
1999-01-01
More computationally efficient methods of using concentration data are needed to estimate groundwater flow and transport parameters. This work introduces and evaluates a three‐stage nonlinear‐regression‐based iterative procedure in which trial advective‐front locations link decoupled flow and transport models. Method accuracy and efficiency are evaluated by comparing results to those obtained when flow‐ and transport‐model parameters are estimated simultaneously. The new method is evaluated as conclusively as possible by using a simple test case that includes distinct flow and transport parameters, but does not include any approximations that are problem dependent. The test case is analytical; the only flow parameter is a constant velocity, and the transport parameters are longitudinal and transverse dispersivity. Any difficulties detected using the new method in this ideal situation are likely to be exacerbated in practical problems. Monte‐Carlo analysis of observation error ensures that no specific error realization obscures the results. Results indicate that, while this, and probably other, multistage methods do not always produce optimal parameter estimates, the computational advantage may make them useful in some circumstances, perhaps as a precursor to using a simultaneous method.
Peak-flow frequency relations and evaluation of the peak-flow gaging network in Nebraska
Soenksen, Philip J.; Miller, Lisa D.; Sharpe, Jennifer B.; Watton, Jason R.
1999-01-01
Estimates of peak-flow magnitude and frequency are required for the efficient design of structures that convey flood flows or occupy floodways, such as bridges, culverts, and roads. The U.S. Geological Survey, in cooperation with the Nebraska Department of Roads, conducted a study to update peak-flow frequency analyses for selected streamflow-gaging stations, develop a new set of peak-flow frequency relations for ungaged streams, and evaluate the peak-flow gaging-station network for Nebraska. Data from stations located in or within about 50 miles of Nebraska were analyzed using guidelines of the Interagency Advisory Committee on Water Data in Bulletin 17B. New generalized skew relations were developed for use in frequency analyses of unregulated streams. Thirty-three drainage-basin characteristics related to morphology, soils, and precipitation were quantified using a geographic information system, related computer programs, and digital spatial data.For unregulated streams, eight sets of regional regression equations relating drainage-basin to peak-flow characteristics were developed for seven regions of the state using a generalized least squares procedure. Two sets of regional peak-flow frequency equations were developed for basins with average soil permeability greater than 4 inches per hour, and six sets of equations were developed for specific geographic areas, usually based on drainage-basin boundaries. Standard errors of estimate for the 100-year frequency equations (1percent probability) ranged from 12.1 to 63.8 percent. For regulated reaches of nine streams, graphs of peak flow for standard frequencies and distance upstream of the mouth were estimated.The regional networks of streamflow-gaging stations on unregulated streams were analyzed to evaluate how additional data might affect the average sampling errors of the newly developed peak-flow equations for the 100-year frequency occurrence. Results indicated that data from new stations, rather than more data from existing stations, probably would produce the greatest reduction in average sampling errors of the equations.
Newell, Felicity L.; Sheehan, James; Wood, Petra Bohall; Rodewald, Amanda D.; Buehler, David A.; Keyser, Patrick D.; Larkin, Jeffrey L.; Beachy, Tiffany A.; Bakermans, Marja H.; Boves, Than J.; Evans, Andrea; George, Gregory A.; McDermott, Molly E.; Perkins, Kelly A.; White, Matthew; Wigley, T. Bently
2013-01-01
Point counts are commonly used to assess changes in bird abundance, including analytical approaches such as distance sampling that estimate density. Point-count methods have come under increasing scrutiny because effects of detection probability and field error are difficult to quantify. For seven forest songbirds, we compared fixed-radii counts (50 m and 100 m) and density estimates obtained from distance sampling to known numbers of birds determined by territory mapping. We applied point-count analytic approaches to a typical forest management question and compared results to those obtained by territory mapping. We used a before–after control impact (BACI) analysis with a data set collected across seven study areas in the central Appalachians from 2006 to 2010. Using a 50-m fixed radius, variance in error was at least 1.5 times that of the other methods, whereas a 100-m fixed radius underestimated actual density by >3 territories per 10 ha for the most abundant species. Distance sampling improved accuracy and precision compared to fixed-radius counts, although estimates were affected by birds counted outside 10-ha units. In the BACI analysis, territory mapping detected an overall treatment effect for five of the seven species, and effects were generally consistent each year. In contrast, all point-count methods failed to detect two treatment effects due to variance and error in annual estimates. Overall, our results highlight the need for adequate sample sizes to reduce variance, and skilled observers to reduce the level of error in point-count data. Ultimately, the advantages and disadvantages of different survey methods should be considered in the context of overall study design and objectives, allowing for trade-offs among effort, accuracy, and power to detect treatment effects.
Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.
2015-09-28
Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentage of annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.
Methods for determining time of death.
Madea, Burkhard
2016-12-01
Medicolegal death time estimation must estimate the time since death reliably. Reliability can only be provided empirically by statistical analysis of errors in field studies. Determining the time since death requires the calculation of measurable data along a time-dependent curve back to the starting point. Various methods are used to estimate the time since death. The current gold standard for death time estimation is a previously established nomogram method based on the two-exponential model of body cooling. Great experimental and practical achievements have been realized using this nomogram method. To reduce the margin of error of the nomogram method, a compound method was developed based on electrical and mechanical excitability of skeletal muscle, pharmacological excitability of the iris, rigor mortis, and postmortem lividity. Further increasing the accuracy of death time estimation involves the development of conditional probability distributions for death time estimation based on the compound method. Although many studies have evaluated chemical methods of death time estimation, such methods play a marginal role in daily forensic practice. However, increased precision of death time estimation has recently been achieved by considering various influencing factors (i.e., preexisting diseases, duration of terminal episode, and ambient temperature). Putrefactive changes may be used for death time estimation in water-immersed bodies. Furthermore, recently developed technologies, such as H magnetic resonance spectroscopy, can be used to quantitatively study decompositional changes. This review addresses the gold standard method of death time estimation in forensic practice and promising technological and scientific developments in the field.
Signal processing of white-light interferometric low-finesse fiber-optic Fabry-Perot sensors.
Ma, Cheng; Wang, Anbo
2013-01-10
Signal processing for low-finesse fiber-optic Fabry-Perot sensors based on white-light interferometry is investigated. The problem is demonstrated as analogous to the parameter estimation of a noisy, real, discrete harmonic of finite length. The Cramer-Rao bounds for the estimators are given, and three algorithms are evaluated and proven to approach the bounds. A long-standing problem with these types of sensors is the unpredictable jumps in the phase estimation. Emphasis is made on the property and mechanism of the "total phase" estimator in reducing the estimation error, and a varying phase term in the total phase is identified to be responsible for the unwanted demodulation jumps. The theories are verified by simulation and experiment. A solution to reducing the probability of jump is demonstrated. © 2013 Optical Society of America
More on the decoder error probability for Reed-Solomon codes
NASA Technical Reports Server (NTRS)
Cheung, K.-M.
1987-01-01
The decoder error probability for Reed-Solomon codes (more generally, linear maximum distance separable codes) is examined. McEliece and Swanson offered an upper bound on P sub E (u), the decoder error probability given that u symbol errors occurs. This upper bound is slightly greater than Q, the probability that a completely random error pattern will cause decoder error. By using a combinatoric technique, the principle of inclusion and exclusion, an exact formula for P sub E (u) is derived. The P sub e (u) for the (255, 223) Reed-Solomon Code used by NASA, and for the (31,15) Reed-Solomon code (JTIDS code), are calculated using the exact formula, and the P sub E (u)'s are observed to approach the Q's of the codes rapidly as u gets larger. An upper bound for the expression is derived, and is shown to decrease nearly exponentially as u increases. This proves analytically that P sub E (u) indeed approaches Q as u becomes large, and some laws of large numbers come into play.
Reasoning in psychosis: risky but not necessarily hasty.
Moritz, Steffen; Scheu, Florian; Andreou, Christina; Pfueller, Ute; Weisbrod, Matthias; Roesch-Ely, Daniela
2016-01-01
A liberal acceptance (LA) threshold for hypotheses has been put forward to explain the well-replicated "jumping to conclusions" (JTC) bias in psychosis, particularly in patients with paranoid symptoms. According to this account, schizophrenia patients rest their decisions on lower subjective probability estimates. The initial formulation of the LA account also predicts an absence of the JTC bias under high task ambiguity (i.e., if more than one response option surpasses the subjective acceptance threshold). Schizophrenia patients (n = 62) with current or former delusions and healthy controls (n = 30) were compared on six scenarios of a variant of the beads task paradigm. Decision-making was assessed under low and high task ambiguity. Along with decision judgments (optional), participants were required to provide probability estimates for each option in order to determine decision thresholds (i.e., the probability the individual deems sufficient for a decision). In line with the LA account, schizophrenia patients showed a lowered decision threshold compared to controls (82% vs. 93%) which predicted both more errors and less draws to decisions. Group differences on thresholds were comparable across conditions. At the same time, patients did not show hasty decision-making, reflecting overall lowered probability estimates in patients. Results confirm core predictions derived from the LA account. Our results may (partly) explain why hasty decision-making is sometimes aggravated and sometimes abolished in psychosis. The proneness to make risky decisions may contribute to the pathogenesis of psychosis. A revised LA account is put forward.
NASA Astrophysics Data System (ADS)
Wani, Omar; Beckers, Joost V. L.; Weerts, Albrecht H.; Solomatine, Dimitri P.
2017-08-01
A non-parametric method is applied to quantify residual uncertainty in hydrologic streamflow forecasting. This method acts as a post-processor on deterministic model forecasts and generates a residual uncertainty distribution. Based on instance-based learning, it uses a k nearest-neighbour search for similar historical hydrometeorological conditions to determine uncertainty intervals from a set of historical errors, i.e. discrepancies between past forecast and observation. The performance of this method is assessed using test cases of hydrologic forecasting in two UK rivers: the Severn and Brue. Forecasts in retrospect were made and their uncertainties were estimated using kNN resampling and two alternative uncertainty estimators: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Results show that kNN uncertainty estimation produces accurate and narrow uncertainty intervals with good probability coverage. Analysis also shows that the performance of this technique depends on the choice of search space. Nevertheless, the accuracy and reliability of uncertainty intervals generated using kNN resampling are at least comparable to those produced by QR and UNEEC. It is concluded that kNN uncertainty estimation is an interesting alternative to other post-processors, like QR and UNEEC, for estimating forecast uncertainty. Apart from its concept being simple and well understood, an advantage of this method is that it is relatively easy to implement.
An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor.
Xu, He; Ding, Ye; Li, Peng; Wang, Ruchuan; Li, Yizhu
2017-08-05
The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergence navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The Traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K -Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ren, S; Tianjin University, Tianjin; Hara, W
Purpose: MRI has a number of advantages over CT as a primary modality for radiation treatment planning (RTP). However, one key bottleneck problem still remains, which is the lack of electron density information in MRI. In the work, a reliable method to map electron density is developed by leveraging the differential contrast of multi-parametric MRI. Methods: We propose a probabilistic Bayesian approach for electron density mapping based on T1 and T2-weighted MRI, using multiple patients as atlases. For each voxel, we compute two conditional probabilities: (1) electron density given its image intensity on T1 and T2-weighted MR images, and (2)more » electron density given its geometric location in a reference anatomy. The two sources of information (image intensity and spatial location) are combined into a unifying posterior probability density function using the Bayesian formalism. The mean value of the posterior probability density function provides the estimated electron density. Results: We evaluated the method on 10 head and neck patients and performed leave-one-out cross validation (9 patients as atlases and remaining 1 as test). The proposed method significantly reduced the errors in electron density estimation, with a mean absolute HU error of 138, compared with 193 for the T1-weighted intensity approach and 261 without density correction. For bone detection (HU>200), the proposed method had an accuracy of 84% and a sensitivity of 73% at specificity of 90% (AUC = 87%). In comparison, the AUC for bone detection is 73% and 50% using the intensity approach and without density correction, respectively. Conclusion: The proposed unifying method provides accurate electron density estimation and bone detection based on multi-parametric MRI of the head with highly heterogeneous anatomy. This could allow for accurate dose calculation and reference image generation for patient setup in MRI-based radiation treatment planning.« less
Lobach, Iryna; Mallick, Bani; Carroll, Raymond J
2011-01-01
Case-control studies are widely used to detect gene-environment interactions in the etiology of complex diseases. Many variables that are of interest to biomedical researchers are difficult to measure on an individual level, e.g. nutrient intake, cigarette smoking exposure, long-term toxic exposure. Measurement error causes bias in parameter estimates, thus masking key features of data and leading to loss of power and spurious/masked associations. We develop a Bayesian methodology for analysis of case-control studies for the case when measurement error is present in an environmental covariate and the genetic variable has missing data. This approach offers several advantages. It allows prior information to enter the model to make estimation and inference more precise. The environmental covariates measured exactly are modeled completely nonparametrically. Further, information about the probability of disease can be incorporated in the estimation procedure to improve quality of parameter estimates, what cannot be done in conventional case-control studies. A unique feature of the procedure under investigation is that the analysis is based on a pseudo-likelihood function therefore conventional Bayesian techniques may not be technically correct. We propose an approach using Markov Chain Monte Carlo sampling as well as a computationally simple method based on an asymptotic posterior distribution. Simulation experiments demonstrated that our method produced parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a population-based case-control study of the association between calcium intake with the risk of colorectal adenoma development.
Neural dynamics of reward probability coding: a Magnetoencephalographic study in humans
Thomas, Julie; Vanni-Mercier, Giovanna; Dreher, Jean-Claude
2013-01-01
Prediction of future rewards and discrepancy between actual and expected outcomes (prediction error) are crucial signals for adaptive behavior. In humans, a number of fMRI studies demonstrated that reward probability modulates these two signals in a large brain network. Yet, the spatio-temporal dynamics underlying the neural coding of reward probability remains unknown. Here, using magnetoencephalography, we investigated the neural dynamics of prediction and reward prediction error computations while subjects learned to associate cues of slot machines with monetary rewards with different probabilities. We showed that event-related magnetic fields (ERFs) arising from the visual cortex coded the expected reward value 155 ms after the cue, demonstrating that reward value signals emerge early in the visual stream. Moreover, a prediction error was reflected in ERF peaking 300 ms after the rewarded outcome and showing decreasing amplitude with higher reward probability. This prediction error signal was generated in a network including the anterior and posterior cingulate cortex. These findings pinpoint the spatio-temporal characteristics underlying reward probability coding. Together, our results provide insights into the neural dynamics underlying the ability to learn probabilistic stimuli-reward contingencies. PMID:24302894
A method to estimate groundwater depletion from confining layers
Konikow, Leonard F.; Neuzil, Christopher E.
2007-01-01
Although depletion of storage in low‐permeability confining layers is the source of much of the groundwater produced from many confined aquifer systems, it is all too frequently overlooked or ignored. This makes effective management of groundwater resources difficult by masking how much water has been derived from storage and, in some cases, the total amount of water that has been extracted from an aquifer system. Analyzing confining layer storage is viewed as troublesome because of the additional computational burden and because the hydraulic properties of confining layers are poorly known. In this paper we propose a simplified method for computing estimates of confining layer depletion, as well as procedures for approximating confining layer hydraulic conductivity (K) and specific storage (Ss) using geologic information. The latter makes the technique useful in developing countries and other settings where minimal data are available or when scoping calculations are needed. As such, our approach may be helpful for estimating the global transfer of groundwater to surface water. A test of the method on a synthetic system suggests that the computational errors will generally be small. Larger errors will probably result from inaccuracy in confining layer property estimates, but these may be no greater than errors in more sophisticated analyses. The technique is demonstrated by application to two aquifer systems: the Dakota artesian aquifer system in South Dakota and the coastal plain aquifer system in Virginia. In both cases, depletion from confining layers was substantially larger than depletion from the aquifers.
Levin, Gregory P; Emerson, Sarah C; Emerson, Scott S
2014-09-01
Many papers have introduced adaptive clinical trial methods that allow modifications to the sample size based on interim estimates of treatment effect. There has been extensive commentary on type I error control and efficiency considerations, but little research on estimation after an adaptive hypothesis test. We evaluate the reliability and precision of different inferential procedures in the presence of an adaptive design with pre-specified rules for modifying the sampling plan. We extend group sequential orderings of the outcome space based on the stage at stopping, likelihood ratio statistic, and sample mean to the adaptive setting in order to compute median-unbiased point estimates, exact confidence intervals, and P-values uniformly distributed under the null hypothesis. The likelihood ratio ordering is found to average shorter confidence intervals and produce higher probabilities of P-values below important thresholds than alternative approaches. The bias adjusted mean demonstrates the lowest mean squared error among candidate point estimates. A conditional error-based approach in the literature has the benefit of being the only method that accommodates unplanned adaptations. We compare the performance of this and other methods in order to quantify the cost of failing to plan ahead in settings where adaptations could realistically be pre-specified at the design stage. We find the cost to be meaningful for all designs and treatment effects considered, and to be substantial for designs frequently proposed in the literature. © 2014, The International Biometric Society.
Switzer, P.; Harden, J.W.; Mark, R.K.
1988-01-01
A statistical method for estimating rates of soil development in a given region based on calibration from a series of dated soils is used to estimate ages of soils in the same region that are not dated directly. The method is designed specifically to account for sampling procedures and uncertainties that are inherent in soil studies. Soil variation and measurement error, uncertainties in calibration dates and their relation to the age of the soil, and the limited number of dated soils are all considered. Maximum likelihood (ML) is employed to estimate a parametric linear calibration curve, relating soil development to time or age on suitably transformed scales. Soil variation on a geomorphic surface of a certain age is characterized by replicate sampling of soils on each surface; such variation is assumed to have a Gaussian distribution. The age of a geomorphic surface is described by older and younger bounds. This technique allows age uncertainty to be characterized by either a Gaussian distribution or by a triangular distribution using minimum, best-estimate, and maximum ages. The calibration curve is taken to be linear after suitable (in certain cases logarithmic) transformations, if required, of the soil parameter and age variables. Soil variability, measurement error, and departures from linearity are described in a combined fashion using Gaussian distributions with variances particular to each sampled geomorphic surface and the number of sample replicates. Uncertainty in age of a geomorphic surface used for calibration is described using three parameters by one of two methods. In the first method, upper and lower ages are specified together with a coverage probability; this specification is converted to a Gaussian distribution with the appropriate mean and variance. In the second method, "absolute" older and younger ages are specified together with a most probable age; this specification is converted to an asymmetric triangular distribution with mode at the most probable age. The statistical variability of the ML-estimated calibration curve is assessed by a Monte Carlo method in which simulated data sets repeatedly are drawn from the distributional specification; calibration parameters are reestimated for each such simulation in order to assess their statistical variability. Several examples are used for illustration. The age of undated soils in a related setting may be estimated from the soil data using the fitted calibration curve. A second simulation to assess age estimate variability is described and applied to the examples. ?? 1988 International Association for Mathematical Geology.
Imperfect pathogen detection from non-invasive skin swabs biases disease inference
DiRenzo, Graziella V.; Grant, Evan H. Campbell; Longo, Ana; Che-Castaldo, Christian; Zamudio, Kelly R.; Lips, Karen
2018-01-01
1. Conservation managers rely on accurate estimates of disease parameters, such as pathogen prevalence and infection intensity, to assess disease status of a host population. However, these disease metrics may be biased if low-level infection intensities are missed by sampling methods or laboratory diagnostic tests. These false negatives underestimate pathogen prevalence and overestimate mean infection intensity of infected individuals. 2. Our objectives were two-fold. First, we quantified false negative error rates of Batrachochytrium dendrobatidis on non-invasive skin swabs collected from an amphibian community in El Copé, Panama. We swabbed amphibians twice in sequence, and we used a recently developed hierarchical Bayesian estimator to assess disease status of the population. Second, we developed a novel hierarchical Bayesian model to simultaneously account for imperfect pathogen detection from field sampling and laboratory diagnostic testing. We evaluated the performance of the model using simulations and varying sampling design to quantify the magnitude of bias in estimates of pathogen prevalence and infection intensity. 3. We show that Bd detection probability from skin swabs was related to host infection intensity, where Bd infections < 10 zoospores have < 95% probability of being detected. If imperfect Bd detection was not considered, then Bd prevalence was underestimated by as much as 16%. In the Bd-amphibian system, this indicates a need to correct for imperfect pathogen detection caused by skin swabs in persisting host communities with low-level infections. More generally, our results have implications for study designs in other disease systems, particularly those with similar objectives, biology, and sampling decisions. 4. Uncertainty in pathogen detection is an inherent property of most sampling protocols and diagnostic tests, where the magnitude of bias depends on the study system, type of infection, and false negative error rates. Given that it may be difficult to know this information in advance, we advocate that the most cautious approach is to assume all errors are possible and to accommodate them by adjusting sampling designs. The modeling framework presented here improves the accuracy in estimating pathogen prevalence and infection intensity.
Estimating Seven Coefficients of Pairwise Relatedness Using Population-Genomic Data
Ackerman, Matthew S.; Johri, Parul; Spitze, Ken; Xu, Sen; Doak, Thomas G.; Young, Kimberly; Lynch, Michael
2017-01-01
Population structure can be described by genotypic-correlation coefficients between groups of individuals, the most basic of which are the pairwise relatedness coefficients between any two individuals. There are nine pairwise relatedness coefficients in the most general model, and we show that these can be reduced to seven coefficients for biallelic loci. Although all nine coefficients can be estimated from pedigrees, six coefficients have been beyond empirical reach. We provide a numerical optimization procedure that estimates all seven reduced coefficients from population-genomic data. Simulations show that the procedure is nearly unbiased, even at 3× coverage, and errors in five of the seven coefficients are statistically uncorrelated. The remaining two coefficients have a negative correlation of errors, but their sum provides an unbiased assessment of the overall correlation of heterozygosity between two individuals. Application of these new methods to four populations of the freshwater crustacean Daphnia pulex reveal the occurrence of half siblings in our samples, as well as a number of identical individuals that are likely obligately asexual clone mates. Statistically significant negative estimates of these pairwise relatedness coefficients, including inbreeding coefficients that were typically negative, underscore the difficulties that arise when interpreting genotypic correlations as estimations of the probability that alleles are identical by descent. PMID:28341647
NASA Astrophysics Data System (ADS)
Perreault Levasseur, Laurence; Hezaveh, Yashar D.; Wechsler, Risa H.
2017-11-01
In Hezaveh et al. we showed that deep learning can be used for model parameter estimation and trained convolutional neural networks to determine the parameters of strong gravitational-lensing systems. Here we demonstrate a method for obtaining the uncertainties of these parameters. We review the framework of variational inference to obtain approximate posteriors of Bayesian neural networks and apply it to a network trained to estimate the parameters of the Singular Isothermal Ellipsoid plus external shear and total flux magnification. We show that the method can capture the uncertainties due to different levels of noise in the input data, as well as training and architecture-related errors made by the network. To evaluate the accuracy of the resulting uncertainties, we calculate the coverage probabilities of marginalized distributions for each lensing parameter. By tuning a single variational parameter, the dropout rate, we obtain coverage probabilities approximately equal to the confidence levels for which they were calculated, resulting in accurate and precise uncertainty estimates. Our results suggest that the application of approximate Bayesian neural networks to astrophysical modeling problems can be a fast alternative to Monte Carlo Markov Chains, allowing orders of magnitude improvement in speed.
Information matrix estimation procedures for cognitive diagnostic models.
Liu, Yanlou; Xin, Tao; Andersson, Björn; Tian, Wei
2018-03-06
Two new methods to estimate the asymptotic covariance matrix for marginal maximum likelihood estimation of cognitive diagnosis models (CDMs), the inverse of the observed information matrix and the sandwich-type estimator, are introduced. Unlike several previous covariance matrix estimators, the new methods take into account both the item and structural parameters. The relationships between the observed information matrix, the empirical cross-product information matrix, the sandwich-type covariance matrix and the two approaches proposed by de la Torre (2009, J. Educ. Behav. Stat., 34, 115) are discussed. Simulation results show that, for a correctly specified CDM and Q-matrix or with a slightly misspecified probability model, the observed information matrix and the sandwich-type covariance matrix exhibit good performance with respect to providing consistent standard errors of item parameter estimates. However, with substantial model misspecification only the sandwich-type covariance matrix exhibits robust performance. © 2018 The British Psychological Society.
Bit Error Probability for Maximum Likelihood Decoding of Linear Block Codes
NASA Technical Reports Server (NTRS)
Lin, Shu; Fossorier, Marc P. C.; Rhee, Dojun
1996-01-01
In this paper, the bit error probability P(sub b) for maximum likelihood decoding of binary linear codes is investigated. The contribution of each information bit to P(sub b) is considered. For randomly generated codes, it is shown that the conventional approximation at high SNR P(sub b) is approximately equal to (d(sub H)/N)P(sub s), where P(sub s) represents the block error probability, holds for systematic encoding only. Also systematic encoding provides the minimum P(sub b) when the inverse mapping corresponding to the generator matrix of the code is used to retrieve the information sequence. The bit error performances corresponding to other generator matrix forms are also evaluated. Although derived for codes with a generator matrix randomly generated, these results are shown to provide good approximations for codes used in practice. Finally, for decoding methods which require a generator matrix with a particular structure such as trellis decoding or algebraic-based soft decision decoding, equivalent schemes that reduce the bit error probability are discussed.
U.S. Maternally Linked Birth Records May Be Biased for Hispanics and Other Population Groups
LEISS, JACK K.; GILES, DENISE; SULLIVAN, KRISTIN M.; MATHEWS, RAHEL; SENTELLE, GLENDA; TOMASHEK, KAY M.
2010-01-01
Purpose To advance understanding of linkage error in U.S. maternally linked datasets, and how the error may affect results of studies based on the linked data. Methods North Carolina birth and fetal death records for 1988-1997 were maternally linked (n=1,030,029). The maternal set probability, defined as the probability that all records assigned to the same maternal set do in fact represent events to the same woman, was used to assess differential maternal linkage error across race/ethnic groups. Results Maternal set probabilities were lower for records specifying Asian or Hispanic race/ethnicity, suggesting greater maternal linkage error. The lower probabilities for Hispanics were concentrated in women of Mexican origin who were not born in the United States. Conclusions Differential maternal linkage error may be a source of bias in studies using U.S. maternally linked datasets to make comparisons between Hispanics and other groups or among Hispanic subgroups. Methods to quantify and adjust for this potential bias are needed. PMID:20006273
An error analysis perspective for patient alignment systems.
Figl, Michael; Kaar, Marcus; Hoffman, Rainer; Kratochwil, Alfred; Hummel, Johann
2013-09-01
This paper analyses the effects of error sources which can be found in patient alignment systems. As an example, an ultrasound (US) repositioning system and its transformation chain are assessed. The findings of this concept can also be applied to any navigation system. In a first step, all error sources were identified and where applicable, corresponding target registration errors were computed. By applying error propagation calculations on these commonly used registration/calibration and tracking errors, we were able to analyse the components of the overall error. Furthermore, we defined a special situation where the whole registration chain reduces to the error caused by the tracking system. Additionally, we used a phantom to evaluate the errors arising from the image-to-image registration procedure, depending on the image metric used. We have also discussed how this analysis can be applied to other positioning systems such as Cone Beam CT-based systems or Brainlab's ExacTrac. The estimates found by our error propagation analysis are in good agreement with the numbers found in the phantom study but significantly smaller than results from patient evaluations. We probably underestimated human influences such as the US scan head positioning by the operator and tissue deformation. Rotational errors of the tracking system can multiply these errors, depending on the relative position of tracker and probe. We were able to analyse the components of the overall error of a typical patient positioning system. We consider this to be a contribution to the optimization of the positioning accuracy for computer guidance systems.
On the error probability of general tree and trellis codes with applications to sequential decoding
NASA Technical Reports Server (NTRS)
Johannesson, R.
1973-01-01
An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random binary tree codes is derived and shown to be independent of the length of the tree. An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random L-branch binary trellis codes of rate R = 1/n is derived which separates the effects of the tail length T and the memory length M of the code. It is shown that the bound is independent of the length L of the information sequence. This implication is investigated by computer simulations of sequential decoding utilizing the stack algorithm. These simulations confirm the implication and further suggest an empirical formula for the true undetected decoding error probability with sequential decoding.
NASA Astrophysics Data System (ADS)
Balaji, K. A.; Prabu, K.
2018-03-01
There is an immense demand for high bandwidth and high data rate systems, which is fulfilled by wireless optical communication or free space optics (FSO). Hence FSO gained a pivotal role in research which has a added advantage of both cost-effective and licence free huge bandwidth. Unfortunately the optical signal in free space suffers from irradiance and phase fluctuations due to atmospheric turbulence and pointing errors which deteriorates the signal and degrades the performance of communication system over longer distance which is undesirable. In this paper, we have considered polarization shift keying (POLSK) system applied with wavelength and time diversity technique over Malaga(M)distribution to mitigate turbulence induced fading. We derived closed form mathematical expressions for estimating the systems outage probability and average bit error rate (BER). Ultimately from the results we can infer that wavelength and time diversity schemes enhances these systems performance.
Estimation and correction of visibility bias in aerial surveys of wintering ducks
Pearse, A.T.; Gerard, P.D.; Dinsmore, S.J.; Kaminski, R.M.; Reinecke, K.J.
2008-01-01
Incomplete detection of all individuals leading to negative bias in abundance estimates is a pervasive source of error in aerial surveys of wildlife, and correcting that bias is a critical step in improving surveys. We conducted experiments using duck decoys as surrogates for live ducks to estimate bias associated with surveys of wintering ducks in Mississippi, USA. We found detection of decoy groups was related to wetland cover type (open vs. forested), group size (1?100 decoys), and interaction of these variables. Observers who detected decoy groups reported counts that averaged 78% of the decoys actually present, and this counting bias was not influenced by either covariate cited above. We integrated this sightability model into estimation procedures for our sample surveys with weight adjustments derived from probabilities of group detection (estimated by logistic regression) and count bias. To estimate variances of abundance estimates, we used bootstrap resampling of transects included in aerial surveys and data from the bias-correction experiment. When we implemented bias correction procedures on data from a field survey conducted in January 2004, we found bias-corrected estimates of abundance increased 36?42%, and associated standard errors increased 38?55%, depending on species or group estimated. We deemed our method successful for integrating correction of visibility bias in an existing sample survey design for wintering ducks in Mississippi, and we believe this procedure could be implemented in a variety of sampling problems for other locations and species.
Regression without truth with Markov chain Monte-Carlo
NASA Astrophysics Data System (ADS)
Madan, Hennadii; Pernuš, Franjo; Likar, Boštjan; Å piclin, Žiga
2017-03-01
Regression without truth (RWT) is a statistical technique for estimating error model parameters of each method in a group of methods used for measurement of a certain quantity. A very attractive aspect of RWT is that it does not rely on a reference method or "gold standard" data, which is otherwise difficult RWT was used for a reference-free performance comparison of several methods for measuring left ventricular ejection fraction (EF), i.e. a percentage of blood leaving the ventricle each time the heart contracts, and has since been applied for various other quantitative imaging biomarkerss (QIBs). Herein, we show how Markov chain Monte-Carlo (MCMC), a computational technique for drawing samples from a statistical distribution with probability density function known only up to a normalizing coefficient, can be used to augment RWT to gain a number of important benefits compared to the original approach based on iterative optimization. For instance, the proposed MCMC-based RWT enables the estimation of joint posterior distribution of the parameters of the error model, straightforward quantification of uncertainty of the estimates, estimation of true value of the measurand and corresponding credible intervals (CIs), does not require a finite support for prior distribution of the measureand generally has a much improved robustness against convergence to non-global maxima. The proposed approach is validated using synthetic data that emulate the EF data for 45 patients measured with 8 different methods. The obtained results show that 90% CI of the corresponding parameter estimates contain the true values of all error model parameters and the measurand. A potential real-world application is to take measurements of a certain QIB several different methods and then use the proposed framework to compute the estimates of the true values and their uncertainty, a vital information for diagnosis based on QIB.
NASA Astrophysics Data System (ADS)
Cheng, Qin-Bo; Chen, Xi; Xu, Chong-Yu; Reinhardt-Imjela, Christian; Schulte, Achim
2014-11-01
In this study, the likelihood functions for uncertainty analysis of hydrological models are compared and improved through the following steps: (1) the equivalent relationship between the Nash-Sutcliffe Efficiency coefficient (NSE) and the likelihood function with Gaussian independent and identically distributed residuals is proved; (2) a new estimation method of the Box-Cox transformation (BC) parameter is developed to improve the effective elimination of the heteroscedasticity of model residuals; and (3) three likelihood functions-NSE, Generalized Error Distribution with BC (BC-GED) and Skew Generalized Error Distribution with BC (BC-SGED)-are applied for SWAT-WB-VSA (Soil and Water Assessment Tool - Water Balance - Variable Source Area) model calibration in the Baocun watershed, Eastern China. Performances of calibrated models are compared using the observed river discharges and groundwater levels. The result shows that the minimum variance constraint can effectively estimate the BC parameter. The form of the likelihood function significantly impacts on the calibrated parameters and the simulated results of high and low flow components. SWAT-WB-VSA with the NSE approach simulates flood well, but baseflow badly owing to the assumption of Gaussian error distribution, where the probability of the large error is low, but the small error around zero approximates equiprobability. By contrast, SWAT-WB-VSA with the BC-GED or BC-SGED approach mimics baseflow well, which is proved in the groundwater level simulation. The assumption of skewness of the error distribution may be unnecessary, because all the results of the BC-SGED approach are nearly the same as those of the BC-GED approach.
1993-03-10
template which runs a Romberg algorithm in the background to numerically integrate the BVN [12:257]. Appendix A als- lists the results from two other...for computing these values: a Taylor series expansion, the Romberg algorithm , and the CBN technique. Appendix A lists CEPpop. values for eleven...determining factor in this selection process. Of the 175 populations ex- amined in the experiment, the MathCAD version of the Romberg algorithm failed
A Game Theoretic Framework for Power Control in Wireless Sensor Networks (POSTPRINT)
2010-02-01
which the sensor nodes compute based on past observations. Correspondingly, Pe can only be estimated; for example, with a noncoherent FSK modula...bit error probability for the link (i ! j) is given by some inverse function of j. For example, with noncoherent FSK modulation scheme, Pe ¼ 0:5e j...show the results for two different modulation schemes: DPSK and noncoherent PSK. As expected, with improvement in channel condition, i.e., with increase
Lee, Hwan Young; Song, Injee; Ha, Eunho; Cho, Sung-Bae; Yang, Woo Ick; Shin, Kyoung-Jin
2008-01-01
Background For the past few years, scientific controversy has surrounded the large number of errors in forensic and literature mitochondrial DNA (mtDNA) data. However, recent research has shown that using mtDNA phylogeny and referring to known mtDNA haplotypes can be useful for checking the quality of sequence data. Results We developed a Web-based bioinformatics resource "mtDNAmanager" that offers a convenient interface supporting the management and quality analysis of mtDNA sequence data. The mtDNAmanager performs computations on mtDNA control-region sequences to estimate the most-probable mtDNA haplogroups and retrieves similar sequences from a selected database. By the phased designation of the most-probable haplogroups (both expected and estimated haplogroups), mtDNAmanager enables users to systematically detect errors whilst allowing for confirmation of the presence of clear key diagnostic mutations and accompanying mutations. The query tools of mtDNAmanager also facilitate database screening with two options of "match" and "include the queried nucleotide polymorphism". In addition, mtDNAmanager provides Web interfaces for users to manage and analyse their own data in batch mode. Conclusion The mtDNAmanager will provide systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at . PMID:19014619
Towards a novel look on low-frequency climate reconstructions
NASA Astrophysics Data System (ADS)
Kamenik, Christian; Goslar, Tomasz; Hicks, Sheila; Barnekow, Lena; Huusko, Antti
2010-05-01
Information on low-frequency (millennial to sub-centennial) climate change is often derived from sedimentary archives, such as peat profiles or lake sediments. Usually, these archives have non-annual and varying time resolution. Their dating is mainly based on radionuclides, which provide probabilistic age-depth relationships with complex error structures. Dating uncertainties impede the interpretation of sediment-based climate reconstructions. They complicate the calculation of time-dependent rates. In most cases, they make any calibration in time impossible. Sediment-based climate proxies are therefore often presented as a single, best-guess time series without proper calibration and error estimation. Errors along time and dating errors that propagate into the calculation of time-dependent rates are neglected. Our objective is to overcome the aforementioned limitations by using a 'swarm' or 'ensemble' of reconstructions instead of a single best-guess. The novelty of our approach is to take into account age-depth uncertainties by permuting through a large number of potential age-depth relationships of the archive of interest. For each individual permutation we can then calculate rates, calibrate proxies in time, and reconstruct the climate-state variable of interest. From the resulting swarm of reconstructions, we can derive realistic estimates of even complex error structures. The likelihood of reconstructions is visualized by a grid of two-dimensional kernels that take into account probabilities along time and the climate-state variable of interest simultaneously. For comparison and regional synthesis, likelihoods can be scored against other independent climate time series.
Honest Importance Sampling with Multiple Markov Chains
Tan, Aixin; Doss, Hani; Hobert, James P.
2017-01-01
Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection. PMID:28701855
Honest Importance Sampling with Multiple Markov Chains.
Tan, Aixin; Doss, Hani; Hobert, James P
2015-01-01
Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π 1 , is used to estimate an expectation with respect to another, π . The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π 1 is replaced by a Harris ergodic Markov chain with invariant density π 1 , then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π 1 , …, π k , are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection.
Non-Parametric Collision Probability for Low-Velocity Encounters
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell
2007-01-01
An implicit, but not necessarily obvious, assumption in all of the current techniques for assessing satellite collision probability is that the relative position uncertainty is perfectly correlated in time. If there is any mis-modeling of the dynamics in the propagation of the relative position error covariance matrix, time-wise de-correlation of the uncertainty will increase the probability of collision over a given time interval. The paper gives some examples that illustrate this point. This paper argues that, for the present, Monte Carlo analysis is the best available tool for handling low-velocity encounters, and suggests some techniques for addressing the issues just described. One proposal is for the use of a non-parametric technique that is widely used in actuarial and medical studies. The other suggestion is that accurate process noise models be used in the Monte Carlo trials to which the non-parametric estimate is applied. A further contribution of this paper is a description of how the time-wise decorrelation of uncertainty increases the probability of collision.
Methods for estimating magnitude and frequency of floods in Montana based on data through 1983
Omang, R.J.; Parrett, Charles; Hull, J.A.
1986-01-01
Equations are presented for estimating flood magnitudes for ungaged sites in Montana based on data through 1983. The State was divided into eight regions based on hydrologic conditions, and separate multiple regression equations were developed for each region. These equations relate annual flood magnitudes and frequencies to basin characteristics and are applicable only to natural flow streams. In three of the regions, equations also were developed relating flood magnitudes and frequencies to basin characteristics and channel geometry measurements. The standard errors of estimate for an exceedance probability of 1% ranged from 39% to 87%. Techniques are described for estimating annual flood magnitude and flood frequency information at ungaged sites based on data from gaged sites on the same stream. Included are curves relating flood frequency information to drainage area for eight major streams in the State. Maximum known flood magnitudes in Montana are compared with estimated 1 %-chance flood magnitudes and with maximum known floods in the United States. Values of flood magnitudes for selected exceedance probabilities and values of significant basin characteristics and channel geometry measurements for all gaging stations used in the analysis are tabulated. Included are 375 stations in Montana and 28 nearby stations in Canada and adjoining States. (Author 's abstract)
Entanglement-enhanced Neyman-Pearson target detection using quantum illumination
NASA Astrophysics Data System (ADS)
Zhuang, Quntao; Zhang, Zheshen; Shapiro, Jeffrey H.
2017-08-01
Quantum illumination (QI) provides entanglement-based target detection---in an entanglement-breaking environment---whose performance is significantly better than that of optimum classical-illumination target detection. QI's performance advantage was established in a Bayesian setting with the target presumed equally likely to be absent or present and error probability employed as the performance metric. Radar theory, however, eschews that Bayesian approach, preferring the Neyman-Pearson performance criterion to avoid the difficulties of accurately assigning prior probabilities to target absence and presence and appropriate costs to false-alarm and miss errors. We have recently reported an architecture---based on sum-frequency generation (SFG) and feedforward (FF) processing---for minimum error-probability QI target detection with arbitrary prior probabilities for target absence and presence. In this paper, we use our results for FF-SFG reception to determine the receiver operating characteristic---detection probability versus false-alarm probability---for optimum QI target detection under the Neyman-Pearson criterion.
Bayesian statistics and Monte Carlo methods
NASA Astrophysics Data System (ADS)
Koch, K. R.
2018-03-01
The Bayesian approach allows an intuitive way to derive the methods of statistics. Probability is defined as a measure of the plausibility of statements or propositions. Three rules are sufficient to obtain the laws of probability. If the statements refer to the numerical values of variables, the so-called random variables, univariate and multivariate distributions follow. They lead to the point estimation by which unknown quantities, i.e. unknown parameters, are computed from measurements. The unknown parameters are random variables, they are fixed quantities in traditional statistics which is not founded on Bayes' theorem. Bayesian statistics therefore recommends itself for Monte Carlo methods, which generate random variates from given distributions. Monte Carlo methods, of course, can also be applied in traditional statistics. The unknown parameters, are introduced as functions of the measurements, and the Monte Carlo methods give the covariance matrix and the expectation of these functions. A confidence region is derived where the unknown parameters are situated with a given probability. Following a method of traditional statistics, hypotheses are tested by determining whether a value for an unknown parameter lies inside or outside the confidence region. The error propagation of a random vector by the Monte Carlo methods is presented as an application. If the random vector results from a nonlinearly transformed vector, its covariance matrix and its expectation follow from the Monte Carlo estimate. This saves a considerable amount of derivatives to be computed, and errors of the linearization are avoided. The Monte Carlo method is therefore efficient. If the functions of the measurements are given by a sum of two or more random vectors with different multivariate distributions, the resulting distribution is generally not known. TheMonte Carlo methods are then needed to obtain the covariance matrix and the expectation of the sum.
Alternate methods for FAAT S-curve generation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kaufman, A.M.
The FAAT (Foreign Asset Assessment Team) assessment methodology attempts to derive a probability of effect as a function of incident field strength. The probability of effect is the likelihood that the stress put on a system exceeds its strength. In the FAAT methodology, both the stress and strength are random variables whose statistical properties are estimated by experts. Each random variable has two components of uncertainty: systematic and random. The systematic uncertainty drives the confidence bounds in the FAAT assessment. Its variance can be reduced by improved information. The variance of the random uncertainty is not reducible. The FAAT methodologymore » uses an assessment code called ARES to generate probability of effect curves (S-curves) at various confidence levels. ARES assumes log normal distributions for all random variables. The S-curves themselves are log normal cumulants associated with the random portion of the uncertainty. The placement of the S-curves depends on confidence bounds. The systematic uncertainty in both stress and strength is usually described by a mode and an upper and lower variance. Such a description is not consistent with the log normal assumption of ARES and an unsatisfactory work around solution is used to obtain the required placement of the S-curves at each confidence level. We have looked into this situation and have found that significant errors are introduced by this work around. These errors are at least several dB-W/cm{sup 2} at all confidence levels, but they are especially bad in the estimate of the median. In this paper, we suggest two alternate solutions for the placement of S-curves. To compare these calculational methods, we have tabulated the common combinations of upper and lower variances and generated the relevant S-curves offsets from the mode difference of stress and strength.« less
NASA Astrophysics Data System (ADS)
Chen, Po-Hao; Botzolakis, Emmanuel; Mohan, Suyash; Bryan, R. N.; Cook, Tessa
2016-03-01
In radiology, diagnostic errors occur either through the failure of detection or incorrect interpretation. Errors are estimated to occur in 30-35% of all exams and contribute to 40-54% of medical malpractice litigations. In this work, we focus on reducing incorrect interpretation of known imaging features. Existing literature categorizes cognitive bias leading a radiologist to an incorrect diagnosis despite having correctly recognized the abnormal imaging features: anchoring bias, framing effect, availability bias, and premature closure. Computational methods make a unique contribution, as they do not exhibit the same cognitive biases as a human. Bayesian networks formalize the diagnostic process. They modify pre-test diagnostic probabilities using clinical and imaging features, arriving at a post-test probability for each possible diagnosis. To translate Bayesian networks to clinical practice, we implemented an entirely web-based open-source software tool. In this tool, the radiologist first selects a network of choice (e.g. basal ganglia). Then, large, clearly labeled buttons displaying salient imaging features are displayed on the screen serving both as a checklist and for input. As the radiologist inputs the value of an extracted imaging feature, the conditional probabilities of each possible diagnosis are updated. The software presents its level of diagnostic discrimination using a Pareto distribution chart, updated with each additional imaging feature. Active collaboration with the clinical radiologist is a feasible approach to software design and leads to design decisions closely coupling the complex mathematics of conditional probability in Bayesian networks with practice.
Evaluation of Satellite and Model Precipitation Products Over Turkey
NASA Astrophysics Data System (ADS)
Yilmaz, M. T.; Amjad, M.
2017-12-01
Satellite-based remote sensing, gauge stations, and models are the three major platforms to acquire precipitation dataset. Among them satellites and models have the advantage of retrieving spatially and temporally continuous and consistent datasets, while the uncertainty estimates of these retrievals are often required for many hydrological studies to understand the source and the magnitude of the uncertainty in hydrological response parameters. In this study, satellite and model precipitation data products are validated over various temporal scales (daily, 3-daily, 7-daily, 10-daily and monthly) using in-situ measured precipitation observations from a network of 733 gauges from all over the Turkey. Tropical Rainfall Measurement Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B42 version 7 and European Center of Medium-Range Weather Forecast (ECMWF) model estimates (daily, 3-daily, 7-daily and 10-daily accumulated forecast) are used in this study. Retrievals are evaluated for their mean and standard deviation and their accuracies are evaluated via bias, root mean square error, error standard deviation and correlation coefficient statistics. Intensity vs frequency analysis and some contingency table statistics like percent correct, probability of detection, false alarm ratio and critical success index are determined using daily time-series. Both ECMWF forecasts and TRMM observations, on average, overestimate the precipitation compared to gauge estimates; wet biases are 10.26 mm/month and 8.65 mm/month, respectively for ECMWF and TRMM. RMSE values of ECMWF forecasts and TRMM estimates are 39.69 mm/month and 41.55 mm/month, respectively. Monthly correlations between Gauges-ECMWF, Gauges-TRMM and ECMWF-TRMM are 0.76, 0.73 and 0.81, respectively. The model and the satellite error statistics are further compared against the gauges error statistics based on inverse distance weighting (IWD) analysis. Both the model and satellite data have less IWD errors (14.72 mm/month and 10.75 mm/month, respectively) compared to gauges IWD error (21.58 mm/month). These results show that, on average, ECMWF forecast data have higher skill than TRMM observations. Overall, both ECMWF forecast data and TRMM observations show good potential for catchment scale hydrological analysis.
Morris, Gail; Conner, L Mike
2017-01-01
Global positioning system (GPS) technologies have improved the ability of researchers to monitor wildlife; however, use of these technologies is often limited by monetary costs. Some researchers have begun to use commercially available GPS loggers as a less expensive means of tracking wildlife, but data regarding performance of these devices are limited. We tested a commercially available GPS logger (i-gotU GT-120) by placing loggers at ground control points with locations known to < 30 cm. In a preliminary investigation, we collected locations every 15 minutes for several days to estimate location error (LE) and circular error probable (CEP). Using similar methods, we then investigated the influence of cover on LE, CEP, and fix success rate (FSR) by constructing cover over ground control points. We found mean LE was < 10 m and mean 50% CEP was < 7 m. FSR was not significantly influenced by cover and in all treatments remained near 100%. Cover had a minor but significant effect on LE. Denser cover was associated with higher mean LE, but the difference in LE between the no cover and highest cover treatments was only 2.2 m. Finally, the most commonly used commercially available devices provide a measure of estimated horizontal position error (EHPE) which potentially may be used to filter inaccurate locations. Using data combined from the preliminary and cover investigations, we modeled LE as a function of EHPE and number of satellites. We found support for use of both EHPE and number of satellites in predicting LE; however, use of EHPE to filter inaccurate locations resulted in the loss of many locations with low error in return for only modest improvements in LE. Even without filtering, the accuracy of the logger was likely sufficient for studies which can accept average location errors of approximately 10 m.
Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.
Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R
2018-05-26
Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occu31 pancy estimates for some bat species. Using simulations we show the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study the designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits). Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases spatial extents for statistical inferences. Our results suggest that ignoring the mechanism for how locations were selected for data collection (e.g., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information in addition to the data themselves must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.
Estimating linear-nonlinear models using Rényi divergences
Kouh, Minjoon; Sharpee, Tatyana O.
2009-01-01
This paper compares a family of methods for characterizing neural feature selectivity using natural stimuli in the framework of the linear-nonlinear model. In this model, the spike probability depends in a nonlinear way on a small number of stimulus dimensions. The relevant stimulus dimensions can be found by optimizing a Rényi divergence that quantifies a change in the stimulus distribution associated with the arrival of single spikes. Generally, good reconstructions can be obtained based on optimization of Rényi divergence of any order, even in the limit of small numbers of spikes. However, the smallest error is obtained when the Rényi divergence of order 1 is optimized. This type of optimization is equivalent to information maximization, and is shown to saturate the Cramér-Rao bound describing the smallest error allowed for any unbiased method. We also discuss conditions under which information maximization provides a convenient way to perform maximum likelihood estimation of linear-nonlinear models from neural data. PMID:19568981
Estimating linear-nonlinear models using Renyi divergences.
Kouh, Minjoon; Sharpee, Tatyana O
2009-01-01
This article compares a family of methods for characterizing neural feature selectivity using natural stimuli in the framework of the linear-nonlinear model. In this model, the spike probability depends in a nonlinear way on a small number of stimulus dimensions. The relevant stimulus dimensions can be found by optimizing a Rényi divergence that quantifies a change in the stimulus distribution associated with the arrival of single spikes. Generally, good reconstructions can be obtained based on optimization of Rényi divergence of any order, even in the limit of small numbers of spikes. However, the smallest error is obtained when the Rényi divergence of order 1 is optimized. This type of optimization is equivalent to information maximization, and is shown to saturate the Cramer-Rao bound describing the smallest error allowed for any unbiased method. We also discuss conditions under which information maximization provides a convenient way to perform maximum likelihood estimation of linear-nonlinear models from neural data.
NASA Astrophysics Data System (ADS)
Azarnavid, Babak; Parand, Kourosh; Abbasbandy, Saeid
2018-06-01
This article discusses an iterative reproducing kernel method with respect to its effectiveness and capability of solving a fourth-order boundary value problem with nonlinear boundary conditions modeling beams on elastic foundations. Since there is no method of obtaining reproducing kernel which satisfies nonlinear boundary conditions, the standard reproducing kernel methods cannot be used directly to solve boundary value problems with nonlinear boundary conditions as there is no knowledge about the existence and uniqueness of the solution. The aim of this paper is, therefore, to construct an iterative method by the use of a combination of reproducing kernel Hilbert space method and a shooting-like technique to solve the mentioned problems. Error estimation for reproducing kernel Hilbert space methods for nonlinear boundary value problems have yet to be discussed in the literature. In this paper, we present error estimation for the reproducing kernel method to solve nonlinear boundary value problems probably for the first time. Some numerical results are given out to demonstrate the applicability of the method.
The random coding bound is tight for the average code.
NASA Technical Reports Server (NTRS)
Gallager, R. G.
1973-01-01
The random coding bound of information theory provides a well-known upper bound to the probability of decoding error for the best code of a given rate and block length. The bound is constructed by upperbounding the average error probability over an ensemble of codes. The bound is known to give the correct exponential dependence of error probability on block length for transmission rates above the critical rate, but it gives an incorrect exponential dependence at rates below a second lower critical rate. Here we derive an asymptotic expression for the average error probability over the ensemble of codes used in the random coding bound. The result shows that the weakness of the random coding bound at rates below the second critical rate is due not to upperbounding the ensemble average, but rather to the fact that the best codes are much better than the average at low rates.
A universal approximation to grain size from images of non-cohesive sediment
Buscombe, D.; Rubin, D.M.; Warrick, J.A.
2010-01-01
The two-dimensional spectral decomposition of an image of sediment provides a direct statistical estimate, grid-by-number style, of the mean of all intermediate axes of all single particles within the image. We develop and test this new method which, unlike existing techniques, requires neither image processing algorithms for detection and measurement of individual grains, nor calibration. The only information required of the operator is the spatial resolution of the image. The method is tested with images of bed sediment from nine different sedimentary environments (five beaches, three rivers, and one continental shelf), across the range 0.1 mm to 150 mm, taken in air and underwater. Each population was photographed using a different camera and lighting conditions. We term it a “universal approximation” because it has produced accurate estimates for all populations we have tested it with, without calibration. We use three approaches (theory, computational experiments, and physical experiments) to both understand and explore the sensitivities and limits of this new method. Based on 443 samples, the root-mean-squared (RMS) error between size estimates from the new method and known mean grain size (obtained from point counts on the image) was found to be ±≈16%, with a 95% probability of estimates within ±31% of the true mean grain size (measured in a linear scale). The RMS error reduces to ≈11%, with a 95% probability of estimates within ±20% of the true mean grain size if point counts from a few images are used to correct bias for a specific population of sediment images. It thus appears it is transferable between sedimentary populations with different grain size, but factors such as particle shape and packing may introduce bias which may need to be calibrated for. For the first time, an attempt has been made to mathematically relate the spatial distribution of pixel intensity within the image of sediment to the grain size.
A universal approximation of grain size from images of noncohesive sediment
NASA Astrophysics Data System (ADS)
Buscombe, D.; Rubin, D. M.; Warrick, J. A.
2010-06-01
The two-dimensional spectral decomposition of an image of sediment provides a direct statistical estimate, grid-by-number style, of the mean of all intermediate axes of all single particles within the image. We develop and test this new method which, unlike existing techniques, requires neither image processing algorithms for detection and measurement of individual grains, nor calibration. The only information required of the operator is the spatial resolution of the image. The method is tested with images of bed sediment from nine different sedimentary environments (five beaches, three rivers, and one continental shelf), across the range 0.1 mm to 150 mm, taken in air and underwater. Each population was photographed using a different camera and lighting conditions. We term it a "universal approximation" because it has produced accurate estimates for all populations we have tested it with, without calibration. We use three approaches (theory, computational experiments, and physical experiments) to both understand and explore the sensitivities and limits of this new method. Based on 443 samples, the root-mean-squared (RMS) error between size estimates from the new method and known mean grain size (obtained from point counts on the image) was found to be ±≈16%, with a 95% probability of estimates within ±31% of the true mean grain size (measured in a linear scale). The RMS error reduces to ≈11%, with a 95% probability of estimates within ±20% of the true mean grain size if point counts from a few images are used to correct bias for a specific population of sediment images. It thus appears it is transferable between sedimentary populations with different grain size, but factors such as particle shape and packing may introduce bias which may need to be calibrated for. For the first time, an attempt has been made to mathematically relate the spatial distribution of pixel intensity within the image of sediment to the grain size.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Audenaert, Koenraad M. R., E-mail: koenraad.audenaert@rhul.ac.uk; Department of Physics and Astronomy, University of Ghent, S9, Krijgslaan 281, B-9000 Ghent; Mosonyi, Milán, E-mail: milan.mosonyi@gmail.com
2014-10-01
We consider the multiple hypothesis testing problem for symmetric quantum state discrimination between r given states σ₁, …, σ{sub r}. By splitting up the overall test into multiple binary tests in various ways we obtain a number of upper bounds on the optimal error probability in terms of the binary error probabilities. These upper bounds allow us to deduce various bounds on the asymptotic error rate, for which it has been hypothesized that it is given by the multi-hypothesis quantum Chernoff bound (or Chernoff divergence) C(σ₁, …, σ{sub r}), as recently introduced by Nussbaum and Szkoła in analogy with Salikhov'smore » classical multi-hypothesis Chernoff bound. This quantity is defined as the minimum of the pairwise binary Chernoff divergences min{sub j« less
Statistical design and analysis for plant cover studies with multiple sources of observation errors
Wright, Wilson; Irvine, Kathryn M.; Warren, Jeffrey M .; Barnett, Jenny K.
2017-01-01
Effective wildlife habitat management and conservation requires understanding the factors influencing distribution and abundance of plant species. Field studies, however, have documented observation errors in visually estimated plant cover including measurements which differ from the true value (measurement error) and not observing a species that is present within a plot (detection error). Unlike the rapid expansion of occupancy and N-mixture models for analysing wildlife surveys, development of statistical models accounting for observation error in plants has not progressed quickly. Our work informs development of a monitoring protocol for managed wetlands within the National Wildlife Refuge System.Zero-augmented beta (ZAB) regression is the most suitable method for analysing areal plant cover recorded as a continuous proportion but assumes no observation errors. We present a model extension that explicitly includes the observation process thereby accounting for both measurement and detection errors. Using simulations, we compare our approach to a ZAB regression that ignores observation errors (naïve model) and an “ad hoc” approach using a composite of multiple observations per plot within the naïve model. We explore how sample size and within-season revisit design affect the ability to detect a change in mean plant cover between 2 years using our model.Explicitly modelling the observation process within our framework produced unbiased estimates and nominal coverage of model parameters. The naïve and “ad hoc” approaches resulted in underestimation of occurrence and overestimation of mean cover. The degree of bias was primarily driven by imperfect detection and its relationship with cover within a plot. Conversely, measurement error had minimal impacts on inferences. We found >30 plots with at least three within-season revisits achieved reasonable posterior probabilities for assessing change in mean plant cover.For rapid adoption and application, code for Bayesian estimation of our single-species ZAB with errors model is included. Practitioners utilizing our R-based simulation code can explore trade-offs among different survey efforts and parameter values, as we did, but tuned to their own investigation. Less abundant plant species of high ecological interest may warrant the additional cost of gathering multiple independent observations in order to guard against erroneous conclusions.
NASA Astrophysics Data System (ADS)
Cecinati, Francesca; Rico-Ramirez, Miguel Angel; Heuvelink, Gerard B. M.; Han, Dawei
2017-05-01
The application of radar quantitative precipitation estimation (QPE) to hydrology and water quality models can be preferred to interpolated rainfall point measurements because of the wide coverage that radars can provide, together with a good spatio-temporal resolutions. Nonetheless, it is often limited by the proneness of radar QPE to a multitude of errors. Although radar errors have been widely studied and techniques have been developed to correct most of them, residual errors are still intrinsic in radar QPE. An estimation of uncertainty of radar QPE and an assessment of uncertainty propagation in modelling applications is important to quantify the relative importance of the uncertainty associated to radar rainfall input in the overall modelling uncertainty. A suitable tool for this purpose is the generation of radar rainfall ensembles. An ensemble is the representation of the rainfall field and its uncertainty through a collection of possible alternative rainfall fields, produced according to the observed errors, their spatial characteristics, and their probability distribution. The errors are derived from a comparison between radar QPE and ground point measurements. The novelty of the proposed ensemble generator is that it is based on a geostatistical approach that assures a fast and robust generation of synthetic error fields, based on the time-variant characteristics of errors. The method is developed to meet the requirement of operational applications to large datasets. The method is applied to a case study in Northern England, using the UK Met Office NIMROD radar composites at 1 km resolution and at 1 h accumulation on an area of 180 km by 180 km. The errors are estimated using a network of 199 tipping bucket rain gauges from the Environment Agency. 183 of the rain gauges are used for the error modelling, while 16 are kept apart for validation. The validation is done by comparing the radar rainfall ensemble with the values recorded by the validation rain gauges. The validated ensemble is then tested on a hydrological case study, to show the advantage of probabilistic rainfall for uncertainty propagation. The ensemble spread only partially captures the mismatch between the modelled and the observed flow. The residual uncertainty can be attributed to other sources of uncertainty, in particular to model structural uncertainty, parameter identification uncertainty, uncertainty in other inputs, and uncertainty in the observed flow.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martin, Peter R., E-mail: pmarti46@uwo.ca; Cool, Derek W.; Romagnoli, Cesare
2014-07-15
Purpose: Magnetic resonance imaging (MRI)-targeted, 3D transrectal ultrasound (TRUS)-guided “fusion” prostate biopsy intends to reduce the ∼23% false negative rate of clinical two-dimensional TRUS-guided sextant biopsy. Although it has been reported to double the positive yield, MRI-targeted biopsies continue to yield false negatives. Therefore, the authors propose to investigate how biopsy system needle delivery error affects the probability of sampling each tumor, by accounting for uncertainties due to guidance system error, image registration error, and irregular tumor shapes. Methods: T2-weighted, dynamic contrast-enhanced T1-weighted, and diffusion-weighted prostate MRI and 3D TRUS images were obtained from 49 patients. A radiologist and radiologymore » resident contoured 81 suspicious regions, yielding 3D tumor surfaces that were registered to the 3D TRUS images using an iterative closest point prostate surface-based method to yield 3D binary images of the suspicious regions in the TRUS context. The probabilityP of obtaining a sample of tumor tissue in one biopsy core was calculated by integrating a 3D Gaussian distribution over each suspicious region domain. Next, the authors performed an exhaustive search to determine the maximum root mean squared error (RMSE, in mm) of a biopsy system that gives P ≥ 95% for each tumor sample, and then repeated this procedure for equal-volume spheres corresponding to each tumor sample. Finally, the authors investigated the effect of probe-axis-direction error on measured tumor burden by studying the relationship between the error and estimated percentage of core involvement. Results: Given a 3.5 mm RMSE for contemporary fusion biopsy systems,P ≥ 95% for 21 out of 81 tumors. The authors determined that for a biopsy system with 3.5 mm RMSE, one cannot expect to sample tumors of approximately 1 cm{sup 3} or smaller with 95% probability with only one biopsy core. The predicted maximum RMSE giving P ≥ 95% for each tumor was consistently greater when using spherical tumor shapes as opposed to no shape assumption. However, an assumption of spherical tumor shape for RMSE = 3.5 mm led to a mean overestimation of tumor sampling probabilities of 3%, implying that assuming spherical tumor shape may be reasonable for many prostate tumors. The authors also determined that a biopsy system would need to have a RMS needle delivery error of no more than 1.6 mm in order to sample 95% of tumors with one core. The authors’ experiments also indicated that the effect of axial-direction error on the measured tumor burden was mitigated by the 18 mm core length at 3.5 mm RMSE. Conclusions: For biopsy systems with RMSE ≥ 3.5 mm, more than one biopsy core must be taken from the majority of tumors to achieveP ≥ 95%. These observations support the authors’ perspective that some tumors of clinically significant sizes may require more than one biopsy attempt in order to be sampled during the first biopsy session. This motivates the authors’ ongoing development of an approach to optimize biopsy plans with the aim of achieving a desired probability of obtaining a sample from each tumor, while minimizing the number of biopsies. Optimized planning of within-tumor targets for MRI-3D TRUS fusion biopsy could support earlier diagnosis of prostate cancer while it remains localized to the gland and curable.« less
Madsen, Thomas; Braun, Danielle; Peng, Gang; Parmigiani, Giovanni; Trippa, Lorenzo
2018-06-25
The Elston-Stewart peeling algorithm enables estimation of an individual's probability of harboring germline risk alleles based on pedigree data, and serves as the computational backbone of important genetic counseling tools. However, it remains limited to the analysis of risk alleles at a small number of genetic loci because its computing time grows exponentially with the number of loci considered. We propose a novel, approximate version of this algorithm, dubbed the peeling and paring algorithm, which scales polynomially in the number of loci. This allows extending peeling-based models to include many genetic loci. The algorithm creates a trade-off between accuracy and speed, and allows the user to control this trade-off. We provide exact bounds on the approximation error and evaluate it in realistic simulations. Results show that the loss of accuracy due to the approximation is negligible in important applications. This algorithm will improve genetic counseling tools by increasing the number of pathogenic risk alleles that can be addressed. To illustrate we create an extended five genes version of BRCAPRO, a widely used model for estimating the carrier probabilities of BRCA1 and BRCA2 risk alleles and assess its computational properties. © 2018 WILEY PERIODICALS, INC.
NASA Technical Reports Server (NTRS)
Shapiro, Jeffrey H.
1992-01-01
Phase measurements on a single-mode radiation field are examined from a system-theoretic viewpoint. Quantum estimation theory is used to establish the primacy of the Susskind-Glogower (SG) phase operator; its phase eigenkets generate the probability operator measure (POM) for maximum likelihood phase estimation. A commuting observables description for the SG-POM on a signal x apparatus state space is derived. It is analogous to the signal-band x image-band formulation for optical heterodyne detection. Because heterodyning realizes the annihilation operator POM, this analogy may help realize the SG-POM. The wave function representation associated with the SG POM is then used to prove the duality between the phase measurement and the number operator measurement, from which a number-phase uncertainty principle is obtained, via Fourier theory, without recourse to linearization. Fourier theory is also employed to establish the principle of number-ket causality, leading to a Paley-Wiener condition that must be satisfied by the phase-measurement probability density function (PDF) for a single-mode field in an arbitrary quantum state. Finally, a two-mode phase measurement is shown to afford phase-conjugate quantum communication at zero error probability with finite average photon number. Application of this construct to interferometric precision measurements is briefly discussed.
A concatenated coding scheme for error control
NASA Technical Reports Server (NTRS)
Kasami, T.; Fujiwara, T.; Lin, S.
1986-01-01
In this paper, a concatenated coding scheme for error control in data communications is presented and analyzed. In this scheme, the inner code is used for both error correction and detection; however, the outer code is used only for error detection. A retransmission is requested if either the inner code decoder fails to make a successful decoding or the outer code decoder detects the presence of errors after the inner code decoding. Probability of undetected error (or decoding error) of the proposed scheme is derived. An efficient method for computing this probability is presented. Throughput efficiency of the proposed error control scheme incorporated with a selective-repeat ARQ retransmission strategy is also analyzed. Three specific examples are presented. One of the examples is proposed for error control in the NASA Telecommand System.
A bootstrap method for estimating uncertainty of water quality trends
Hirsch, Robert M.; Archfield, Stacey A.; DeCicco, Laura
2015-01-01
Estimation of the direction and magnitude of trends in surface water quality remains a problem of great scientific and practical interest. The Weighted Regressions on Time, Discharge, and Season (WRTDS) method was recently introduced as an exploratory data analysis tool to provide flexible and robust estimates of water quality trends. This paper enhances the WRTDS method through the introduction of the WRTDS Bootstrap Test (WBT), an extension of WRTDS that quantifies the uncertainty in WRTDS-estimates of water quality trends and offers various ways to visualize and communicate these uncertainties. Monte Carlo experiments are applied to estimate the Type I error probabilities for this method. WBT is compared to other water-quality trend-testing methods appropriate for data sets of one to three decades in length with sampling frequencies of 6–24 observations per year. The software to conduct the test is in the EGRETci R-package.
Uncertainty of exploitation estimates made from tag returns
Miranda, L.E.; Brock, R.E.; Dorr, B.S.
2002-01-01
Over 6,000 crappies Pomoxis spp. were tagged in five water bodies to estimate exploitation rates by anglers. Exploitation rates were computed as the percentage of tags returned after adjustment for three sources of uncertainty: postrelease mortality due to the tagging process, tag loss, and the reporting rate of tagged fish. Confidence intervals around exploitation rates were estimated by resampling from the probability distributions of tagging mortality, tag loss, and reporting rate. Estimates of exploitation rates ranged from 17% to 54% among the five study systems. Uncertainty around estimates of tagging mortality, tag loss, and reporting resulted in 90% confidence intervals around the median exploitation rate as narrow as 15 percentage points and as broad as 46 percentage points. The greatest source of estimation error was uncertainty about tag reporting. Because the large investments required by tagging and reward operations produce imprecise estimates of the exploitation rate, it may be worth considering other approaches to estimating it or simply circumventing the exploitation question altogether.
Markov chains for testing redundant software
NASA Technical Reports Server (NTRS)
White, Allan L.; Sjogren, Jon A.
1988-01-01
A preliminary design for a validation experiment has been developed that addresses several problems unique to assuring the extremely high quality of multiple-version programs in process-control software. The procedure uses Markov chains to model the error states of the multiple version programs. The programs are observed during simulated process-control testing, and estimates are obtained for the transition probabilities between the states of the Markov chain. The experimental Markov chain model is then expanded into a reliability model that takes into account the inertia of the system being controlled. The reliability of the multiple version software is computed from this reliability model at a given confidence level using confidence intervals obtained for the transition probabilities during the experiment. An example demonstrating the method is provided.
Advances in the meta-analysis of heterogeneous clinical trials II: The quality effects model.
Doi, Suhail A R; Barendregt, Jan J; Khan, Shahjahan; Thalib, Lukman; Williams, Gail M
2015-11-01
This article examines the performance of the updated quality effects (QE) estimator for meta-analysis of heterogeneous studies. It is shown that this approach leads to a decreased mean squared error (MSE) of the estimator while maintaining the nominal level of coverage probability of the confidence interval. Extensive simulation studies confirm that this approach leads to the maintenance of the correct coverage probability of the confidence interval, regardless of the level of heterogeneity, as well as a lower observed variance compared to the random effects (RE) model. The QE model is robust to subjectivity in quality assessment down to completely random entry, in which case its MSE equals that of the RE estimator. When the proposed QE method is applied to a meta-analysis of magnesium for myocardial infarction data, the pooled mortality odds ratio (OR) becomes 0.81 (95% CI 0.61-1.08) which favors the larger studies but also reflects the increased uncertainty around the pooled estimate. In comparison, under the RE model, the pooled mortality OR is 0.71 (95% CI 0.57-0.89) which is less conservative than that of the QE results. The new estimation method has been implemented into the free meta-analysis software MetaXL which allows comparison of alternative estimators and can be downloaded from www.epigear.com. Copyright © 2015 Elsevier Inc. All rights reserved.
Karanth, K.Ullas; Chundawat, Raghunandan S.; Nichols, James D.; Kumar, N. Samba
2004-01-01
Tropical dry-deciduous forests comprise more than 45% of the tiger (Panthera tigris) habitat in India. However, in the absence of rigorously derived estimates of ecological densities of tigers in dry forests, critical baseline data for managing tiger populations are lacking. In this study tiger densities were estimated using photographic capture–recapture sampling in the dry forests of Panna Tiger Reserve in Central India. Over a 45-day survey period, 60 camera trap sites were sampled in a well-protected part of the 542-km2 reserve during 2002. A total sampling effort of 914 camera-trap-days yielded photo-captures of 11 individual tigers over 15 sampling occasions that effectively covered a 418-km2 area. The closed capture–recapture model Mh, which incorporates individual heterogeneity in capture probabilities, fitted these photographic capture history data well. The estimated capture probability/sample, p̂= 0.04, resulted in an estimated tiger population size and standard error (N̂(SÊN̂)) of 29 (9.65), and a density (D̂(SÊD̂)) of 6.94 (3.23) tigers/100 km2. The estimated tiger density matched predictions based on prey abundance. Our results suggest that, if managed appropriately, the available dry forest habitat in India has the potential to support a population size of about 9000 wild tigers.
Jasra, Ajay; Law, Kody J. H.; Zhou, Yan
2016-01-01
Our paper considers uncertainty quantification for an elliptic nonlocal equation. In particular, it is assumed that the parameters which define the kernel in the nonlocal operator are uncertain and a priori distributed according to a probability measure. It is shown that the induced probability measure on some quantities of interest arising from functionals of the solution to the equation with random inputs is well-defined,s as is the posterior distribution on parameters given observations. As the elliptic nonlocal equation cannot be solved approximate posteriors are constructed. The multilevel Monte Carlo (MLMC) and multilevel sequential Monte Carlo (MLSMC) sampling algorithms are usedmore » for a priori and a posteriori estimation, respectively, of quantities of interest. Furthermore, these algorithms reduce the amount of work to estimate posterior expectations, for a given level of error, relative to Monte Carlo and i.i.d. sampling from the posterior at a given level of approximation of the solution of the elliptic nonlocal equation.« less
Optimum quantum receiver for detecting weak signals in PAM communication systems
NASA Astrophysics Data System (ADS)
Sharma, Navneet; Rawat, Tarun Kumar; Parthasarathy, Harish; Gautam, Kumar
2017-09-01
This paper deals with the modeling of an optimum quantum receiver for pulse amplitude modulator (PAM) communication systems. The information bearing sequence {I_k}_{k=0}^{N-1} is estimated using the maximum likelihood (ML) method. The ML method is based on quantum mechanical measurements of an observable X in the Hilbert space of the quantum system at discrete times, when the Hamiltonian of the system is perturbed by an operator obtained by modulating a potential V with a PAM signal derived from the information bearing sequence {I_k}_{k=0}^{N-1}. The measurement process at each time instant causes collapse of the system state to an observable eigenstate. All probabilities of getting different outcomes from an observable are calculated using the perturbed evolution operator combined with the collapse postulate. For given probability densities, calculation of the mean square error evaluates the performance of the receiver. Finally, we present an example involving estimating an information bearing sequence that modulates a quantum electromagnetic field incident on a quantum harmonic oscillator.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jasra, Ajay; Law, Kody J. H.; Zhou, Yan
Our paper considers uncertainty quantification for an elliptic nonlocal equation. In particular, it is assumed that the parameters which define the kernel in the nonlocal operator are uncertain and a priori distributed according to a probability measure. It is shown that the induced probability measure on some quantities of interest arising from functionals of the solution to the equation with random inputs is well-defined,s as is the posterior distribution on parameters given observations. As the elliptic nonlocal equation cannot be solved approximate posteriors are constructed. The multilevel Monte Carlo (MLMC) and multilevel sequential Monte Carlo (MLSMC) sampling algorithms are usedmore » for a priori and a posteriori estimation, respectively, of quantities of interest. Furthermore, these algorithms reduce the amount of work to estimate posterior expectations, for a given level of error, relative to Monte Carlo and i.i.d. sampling from the posterior at a given level of approximation of the solution of the elliptic nonlocal equation.« less
Estimation of the limit of detection using information theory measures.
Fonollosa, Jordi; Vergara, Alexander; Huerta, Ramón; Marco, Santiago
2014-01-31
Definitions of the limit of detection (LOD) based on the probability of false positive and/or false negative errors have been proposed over the past years. Although such definitions are straightforward and valid for any kind of analytical system, proposed methodologies to estimate the LOD are usually simplified to signals with Gaussian noise. Additionally, there is a general misconception that two systems with the same LOD provide the same amount of information on the source regardless of the prior probability of presenting a blank/analyte sample. Based upon an analogy between an analytical system and a binary communication channel, in this paper we show that the amount of information that can be extracted from an analytical system depends on the probability of presenting the two different possible states. We propose a new definition of LOD utilizing information theory tools that deals with noise of any kind and allows the introduction of prior knowledge easily. Unlike most traditional LOD estimation approaches, the proposed definition is based on the amount of information that the chemical instrumentation system provides on the chemical information source. Our findings indicate that the benchmark of analytical systems based on the ability to provide information about the presence/absence of the analyte (our proposed approach) is a more general and proper framework, while converging to the usual values when dealing with Gaussian noise. Copyright © 2013 Elsevier B.V. All rights reserved.
Performance in population models for count data, part II: a new SAEM algorithm
Savic, Radojka; Lavielle, Marc
2009-01-01
Analysis of count data from clinical trials using mixed effect analysis has recently become widely used. However, algorithms available for the parameter estimation, including LAPLACE and Gaussian quadrature (GQ), are associated with certain limitations, including bias in parameter estimates and the long analysis runtime. The stochastic approximation expectation maximization (SAEM) algorithm has proven to be a very efficient and powerful tool in the analysis of continuous data. The aim of this study was to implement and investigate the performance of a new SAEM algorithm for application to count data. A new SAEM algorithm was implemented in MATLAB for estimation of both, parameters and the Fisher information matrix. Stochastic Monte Carlo simulations followed by re-estimation were performed according to scenarios used in previous studies (part I) to investigate properties of alternative algorithms (1). A single scenario was used to explore six probability distribution models. For parameter estimation, the relative bias was less than 0.92% and 4.13 % for fixed and random effects, for all models studied including ones accounting for over- or under-dispersion. Empirical and estimated relative standard errors were similar, with distance between them being <1.7 % for all explored scenarios. The longest CPU time was 95s for parameter estimation and 56s for SE estimation. The SAEM algorithm was extended for analysis of count data. It provides accurate estimates of both, parameters and standard errors. The estimation is significantly faster compared to LAPLACE and GQ. The algorithm is implemented in Monolix 3.1, (beta-version available in July 2009). PMID:19680795
Performance analysis of the word synchronization properties of the outer code in a TDRSS decoder
NASA Technical Reports Server (NTRS)
Costello, D. J., Jr.; Lin, S.
1984-01-01
A self-synchronizing coding scheme for NASA's TDRSS satellite system is a concatenation of a (2,1,7) inner convolutional code with a (255,223) Reed-Solomon outer code. Both symbol and word synchronization are achieved without requiring that any additional symbols be transmitted. An important parameter which determines the performance of the word sync procedure is the ratio of the decoding failure probability to the undetected error probability. Ideally, the former should be as small as possible compared to the latter when the error correcting capability of the code is exceeded. A computer simulation of a (255,223) Reed-Solomon code as carried out. Results for decoding failure probability and for undetected error probability are tabulated and compared.
Linhart, S. Mike; Nania, Jon F.; Sanders, Curtis L.; Archfield, Stacey A.
2012-01-01
The U.S. Geological Survey (USGS) maintains approximately 148 real-time streamgages in Iowa for which daily mean streamflow information is available, but daily mean streamflow data commonly are needed at locations where no streamgages are present. Therefore, the USGS conducted a study as part of a larger project in cooperation with the Iowa Department of Natural Resources to develop methods to estimate daily mean streamflow at locations in ungaged watersheds in Iowa by using two regression-based statistical methods. The regression equations for the statistical methods were developed from historical daily mean streamflow and basin characteristics from streamgages within the study area, which includes the entire State of Iowa and adjacent areas within a 50-mile buffer of Iowa in neighboring states. Results of this study can be used with other techniques to determine the best method for application in Iowa and can be used to produce a Web-based geographic information system tool to compute streamflow estimates automatically. The Flow Anywhere statistical method is a variation of the drainage-area-ratio method, which transfers same-day streamflow information from a reference streamgage to another location by using the daily mean streamflow at the reference streamgage and the drainage-area ratio of the two locations. The Flow Anywhere method modifies the drainage-area-ratio method in order to regionalize the equations for Iowa and determine the best reference streamgage from which to transfer same-day streamflow information to an ungaged location. Data used for the Flow Anywhere method were retrieved for 123 continuous-record streamgages located in Iowa and within a 50-mile buffer of Iowa. The final regression equations were computed by using either left-censored regression techniques with a low limit threshold set at 0.1 cubic feet per second (ft3/s) and the daily mean streamflow for the 15th day of every other month, or by using an ordinary-least-squares multiple linear regression method and the daily mean streamflow for the 15th day of every other month. The Flow Duration Curve Transfer method was used to estimate unregulated daily mean streamflow from the physical and climatic characteristics of gaged basins. For the Flow Duration Curve Transfer method, daily mean streamflow quantiles at the ungaged site were estimated with the parameter-based regression model, which results in a continuous daily flow-duration curve (the relation between exceedance probability and streamflow for each day of observed streamflow) at the ungaged site. By the use of a reference streamgage, the Flow Duration Curve Transfer is converted to a time series. Data used in the Flow Duration Curve Transfer method were retrieved for 113 continuous-record streamgages in Iowa and within a 50-mile buffer of Iowa. The final statewide regression equations for Iowa were computed by using a weighted-least-squares multiple linear regression method and were computed for the 0.01-, 0.05-, 0.10-, 0.15-, 0.20-, 0.30-, 0.40-, 0.50-, 0.60-, 0.70-, 0.80-, 0.85-, 0.90-, and 0.95-exceedance probability statistics determined from the daily mean streamflow with a reporting limit set at 0.1 ft3/s. The final statewide regression equation for Iowa computed by using left-censored regression techniques was computed for the 0.99-exceedance probability statistic determined from the daily mean streamflow with a low limit threshold and a reporting limit set at 0.1 ft3/s. For the Flow Anywhere method, results of the validation study conducted by using six streamgages show that differences between the root-mean-square error and the mean absolute error ranged from 1,016 to 138 ft3/s, with the larger value signifying a greater occurrence of outliers between observed and estimated streamflows. Root-mean-square-error values ranged from 1,690 to 237 ft3/s. Values of the percent root-mean-square error ranged from 115 percent to 26.2 percent. The logarithm (base 10) streamflow percent root-mean-square error ranged from 13.0 to 5.3 percent. Root-mean-square-error observations standard-deviation-ratio values ranged from 0.80 to 0.40. Percent-bias values ranged from 25.4 to 4.0 percent. Untransformed streamflow Nash-Sutcliffe efficiency values ranged from 0.84 to 0.35. The logarithm (base 10) streamflow Nash-Sutcliffe efficiency values ranged from 0.86 to 0.56. For the streamgage with the best agreement between observed and estimated streamflow, higher streamflows appear to be underestimated. For the streamgage with the worst agreement between observed and estimated streamflow, low flows appear to be overestimated whereas higher flows seem to be underestimated. Estimated cumulative streamflows for the period October 1, 2004, to September 30, 2009, are underestimated by -25.8 and -7.4 percent for the closest and poorest comparisons, respectively. For the Flow Duration Curve Transfer method, results of the validation study conducted by using the same six streamgages show that differences between the root-mean-square error and the mean absolute error ranged from 437 to 93.9 ft3/s, with the larger value signifying a greater occurrence of outliers between observed and estimated streamflows. Root-mean-square-error values ranged from 906 to 169 ft3/s. Values of the percent root-mean-square-error ranged from 67.0 to 25.6 percent. The logarithm (base 10) streamflow percent root-mean-square error ranged from 12.5 to 4.4 percent. Root-mean-square-error observations standard-deviation-ratio values ranged from 0.79 to 0.40. Percent-bias values ranged from 22.7 to 0.94 percent. Untransformed streamflow Nash-Sutcliffe efficiency values ranged from 0.84 to 0.38. The logarithm (base 10) streamflow Nash-Sutcliffe efficiency values ranged from 0.89 to 0.48. For the streamgage with the closest agreement between observed and estimated streamflow, there is relatively good agreement between observed and estimated streamflows. For the streamgage with the poorest agreement between observed and estimated streamflow, streamflows appear to be substantially underestimated for much of the time period. Estimated cumulative streamflow for the period October 1, 2004, to September 30, 2009, are underestimated by -9.3 and -22.7 percent for the closest and poorest comparisons, respectively.
Allodji, Rodrigue S; Schwartz, Boris; Diallo, Ibrahima; Agbovon, Césaire; Laurier, Dominique; de Vathaire, Florent
2015-08-01
Analyses of the Life Span Study (LSS) of Japanese atomic bombing survivors have routinely incorporated corrections for additive classical measurement errors using regression calibration. Recently, several studies reported that the efficiency of the simulation-extrapolation method (SIMEX) is slightly more accurate than the simple regression calibration method (RCAL). In the present paper, the SIMEX and RCAL methods have been used to address errors in atomic bomb survivor dosimetry on solid cancer and leukaemia mortality risk estimates. For instance, it is shown that using the SIMEX method, the ERR/Gy is increased by an amount of about 29 % for all solid cancer deaths using a linear model compared to the RCAL method, and the corrected EAR 10(-4) person-years at 1 Gy (the linear terms) is decreased by about 8 %, while the corrected quadratic term (EAR 10(-4) person-years/Gy(2)) is increased by about 65 % for leukaemia deaths based on a linear-quadratic model. The results with SIMEX method are slightly higher than published values. The observed differences were probably due to the fact that with the RCAL method the dosimetric data were partially corrected, while all doses were considered with the SIMEX method. Therefore, one should be careful when comparing the estimated risks and it may be useful to use several correction techniques in order to obtain a range of corrected estimates, rather than to rely on a single technique. This work will enable to improve the risk estimates derived from LSS data, and help to make more reliable the development of radiation protection standards.
Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)
NASA Astrophysics Data System (ADS)
Kasibhatla, P.
2004-12-01
In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities and inverse source estimates are derived for fixed values of pdf paramaters. While the advantage of this approach is that closed form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov Chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more gereral statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups and followed a common modeling and analysis protocoal. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.
Gosho, Masahiko; Hirakawa, Akihiro; Noma, Hisashi; Maruo, Kazushi; Sato, Yasunori
2017-10-01
In longitudinal clinical trials, some subjects will drop out before completing the trial, so their measurements towards the end of the trial are not obtained. Mixed-effects models for repeated measures (MMRM) analysis with "unstructured" (UN) covariance structure are increasingly common as a primary analysis for group comparisons in these trials. Furthermore, model-based covariance estimators have been routinely used for testing the group difference and estimating confidence intervals of the difference in the MMRM analysis using the UN covariance. However, using the MMRM analysis with the UN covariance could lead to convergence problems for numerical optimization, especially in trials with a small-sample size. Although the so-called sandwich covariance estimator is robust to misspecification of the covariance structure, its performance deteriorates in settings with small-sample size. We investigated the performance of the sandwich covariance estimator and covariance estimators adjusted for small-sample bias proposed by Kauermann and Carroll ( J Am Stat Assoc 2001; 96: 1387-1396) and Mancl and DeRouen ( Biometrics 2001; 57: 126-134) fitting simpler covariance structures through a simulation study. In terms of the type 1 error rate and coverage probability of confidence intervals, Mancl and DeRouen's covariance estimator with compound symmetry, first-order autoregressive (AR(1)), heterogeneous AR(1), and antedependence structures performed better than the original sandwich estimator and Kauermann and Carroll's estimator with these structures in the scenarios where the variance increased across visits. The performance based on Mancl and DeRouen's estimator with these structures was nearly equivalent to that based on the Kenward-Roger method for adjusting the standard errors and degrees of freedom with the UN structure. The model-based covariance estimator with the UN structure under unadjustment of the degrees of freedom, which is frequently used in applications, resulted in substantial inflation of the type 1 error rate. We recommend the use of Mancl and DeRouen's estimator in MMRM analysis if the number of subjects completing is ( n + 5) or less, where n is the number of planned visits. Otherwise, the use of Kenward and Roger's method with UN structure should be the best way.
Robust linear discriminant models to solve financial crisis in banking sectors
NASA Astrophysics Data System (ADS)
Lim, Yai-Fung; Yahaya, Sharipah Soaad Syed; Idris, Faoziah; Ali, Hazlina; Omar, Zurni
2014-12-01
Linear discriminant analysis (LDA) is a widely-used technique in patterns classification via an equation which will minimize the probability of misclassifying cases into their respective categories. However, the performance of classical estimators in LDA highly depends on the assumptions of normality and homoscedasticity. Several robust estimators in LDA such as Minimum Covariance Determinant (MCD), S-estimators and Minimum Volume Ellipsoid (MVE) are addressed by many authors to alleviate the problem of non-robustness of the classical estimates. In this paper, we investigate on the financial crisis of the Malaysian banking institutions using robust LDA and classical LDA methods. Our objective is to distinguish the "distress" and "non-distress" banks in Malaysia by using the LDA models. Hit ratio is used to validate the accuracy predictive of LDA models. The performance of LDA is evaluated by estimating the misclassification rate via apparent error rate. The results and comparisons show that the robust estimators provide a better performance than the classical estimators for LDA.
Self-calibration method without joint iteration for distributed small satellite SAR systems
NASA Astrophysics Data System (ADS)
Xu, Qing; Liao, Guisheng; Liu, Aifei; Zhang, Juan
2013-12-01
The performance of distributed small satellite synthetic aperture radar systems degrades significantly due to the unavoidable array errors, including gain, phase, and position errors, in real operating scenarios. In the conventional method proposed in (IEEE T Aero. Elec. Sys. 42:436-451, 2006), the spectrum components within one Doppler bin are considered as calibration sources. However, it is found in this article that the gain error estimation and the position error estimation in the conventional method can interact with each other. The conventional method may converge to suboptimal solutions in large position errors since it requires the joint iteration between gain-phase error estimation and position error estimation. In addition, it is also found that phase errors can be estimated well regardless of position errors when the zero Doppler bin is chosen. In this article, we propose a method obtained by modifying the conventional one, based on these two observations. In this modified method, gain errors are firstly estimated and compensated, which eliminates the interaction between gain error estimation and position error estimation. Then, by using the zero Doppler bin data, the phase error estimation can be performed well independent of position errors. Finally, position errors are estimated based on the Taylor-series expansion. Meanwhile, the joint iteration between gain-phase error estimation and position error estimation is not required. Therefore, the problem of suboptimal convergence, which occurs in the conventional method, can be avoided with low computational method. The modified method has merits of faster convergence and lower estimation error compared to the conventional one. Theoretical analysis and computer simulation results verified the effectiveness of the modified method.
Heritability of lenticular myopia in English Springer spaniels.
Kubai, Melissa A; Labelle, Amber L; Hamor, Ralph E; Mutti, Donald O; Famula, Thomas R; Murphy, Christopher J
2013-11-08
We determined whether naturally-occurring lenticular myopia in English Springer spaniels (ESS) has a genetic component. Streak retinoscopy was performed on 226 related ESS 30 minutes after the onset of pharmacologic mydriasis and cycloplegia. A pedigree was constructed to determine relationships between affected offspring and parents. Estimation of heritability was done in a Bayesian analysis (facilitated by the MCMCglmm package of R) of refractive error in a model, including terms for sex and coat color. Myopia was defined as ≤-0.5 diopters (D) spherical equivalent. The median refractive error for ESS was 0.25 D (range, -3.5 to +4.5 D). Median age was 0.2 years (range, 0.1-15 years). The prevalence of myopia in related ESS was 19% (42/226). The ESS had a strong correlation (r = 0.95) for refractive error between the two eyes. Moderate heritability was present for refractive error with a mean value of 0.29 (95% highest probability density, 0.07-0.50). The distribution of refractive error, and subsequently lenticular myopia, has a moderate genetic component in ESS. Further investigation of genes responsible for regulation of the development of refractive ocular components in canines is warranted.
Rong, Hao; Tian, Jin; Zhao, Tingdi
2016-01-01
In traditional approaches of human reliability assessment (HRA), the definition of the error producing conditions (EPCs) and the supporting guidance are such that some of the conditions (especially organizational or managerial conditions) can hardly be included, and thus the analysis is burdened with incomprehensiveness without reflecting the temporal trend of human reliability. A method based on system dynamics (SD), which highlights interrelationships among technical and organizational aspects that may contribute to human errors, is presented to facilitate quantitatively estimating the human error probability (HEP) and its related variables changing over time in a long period. Taking the Minuteman III missile accident in 2008 as a case, the proposed HRA method is applied to assess HEP during missile operations over 50 years by analyzing the interactions among the variables involved in human-related risks; also the critical factors are determined in terms of impact that the variables have on risks in different time periods. It is indicated that both technical and organizational aspects should be focused on to minimize human errors in a long run. Copyright © 2015 Elsevier Ltd and The Ergonomics Society. All rights reserved.
NASA Technical Reports Server (NTRS)
Long, S. A. T.
1973-01-01
The triangulation method developed specifically for the Barium Ion Cloud Project is discussed. Expression for the four displacement errors, the three slope errors, and the curvature error in the triangulation solution due to a probable error in the lines-of-sight from the observation stations to points on the cloud are derived. The triangulation method is then used to determine the effect of the following on these different errors in the solution: the number and location of the stations, the observation duration, east-west cloud drift, the number of input data points, and the addition of extra cameras to one of the stations. The pointing displacement errors, and the pointing slope errors are compared. The displacement errors in the solution due to a probable error in the position of a moving station plus the weighting factors for the data from the moving station are also determined.
Meuwissen, Theo H E; Indahl, Ulf G; Ødegård, Jørgen
2017-12-27
Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iteration and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. The BayesC model assumes a priori that markers have normally distributed effects with probability [Formula: see text] and no effect with probability (1 - [Formula: see text]). Marker effects and their PEV are estimated by using SVD and the posterior probability of the marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction. For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP effects (SNP-BLUP model). When reducing marker density from WGS data to 30 K, SNP-BLUP tended to yield the highest accuracies, at least in the short term. Based on SVD of the genotype matrix, we developed a direct method for the calculation of BayesC estimates of marker effects. Although SVD- and MCMC-based marker effects differed slightly, their prediction accuracies were similar. Assuming that the SVD of the marker genotype matrix is already performed for other reasons (e.g. for SNP-BLUP), computation times for the BayesC predictions were comparable to those of SNP-BLUP.
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification per- formance by 6% compared to maximum likelihood, with a 34% gain compared to random classification, respectively. We found that the original DRG, coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already lead to improved audit efficiency in an operational capacity.
Doi, Suhail A R; Barendregt, Jan J; Khan, Shahjahan; Thalib, Lukman; Williams, Gail M
2015-11-01
This article examines an improved alternative to the random effects (RE) model for meta-analysis of heterogeneous studies. It is shown that the known issues of underestimation of the statistical error and spuriously overconfident estimates with the RE model can be resolved by the use of an estimator under the fixed effect model assumption with a quasi-likelihood based variance structure - the IVhet model. Extensive simulations confirm that this estimator retains a correct coverage probability and a lower observed variance than the RE model estimator, regardless of heterogeneity. When the proposed IVhet method is applied to the controversial meta-analysis of intravenous magnesium for the prevention of mortality after myocardial infarction, the pooled OR is 1.01 (95% CI 0.71-1.46) which not only favors the larger studies but also indicates more uncertainty around the point estimate. In comparison, under the RE model the pooled OR is 0.71 (95% CI 0.57-0.89) which, given the simulation results, reflects underestimation of the statistical error. Given the compelling evidence generated, we recommend that the IVhet model replace both the FE and RE models. To facilitate this, it has been implemented into free meta-analysis software called MetaXL which can be downloaded from www.epigear.com. Copyright © 2015 Elsevier Inc. All rights reserved.
Connick, M J; Beckman, E; Ibusuki, T; Malone, L; Tweedy, S M
2016-11-01
The International Paralympic Committee has a maximum allowable standing height (MASH) rule that limits stature to a pre-trauma estimation. The MASH rule reduces the probability that bilateral lower limb amputees use disproportionately long prostheses in competition. Although there are several methods for estimating stature, the validity of these methods has not been compared. To identify the most appropriate method for the MASH rule, this study aimed to compare the criterion validity of estimations resulting from the current method, the Contini method, and four Canda methods (Canda-1, Canda-2, Canda-3, and Canda-4). Stature, ulna length, demispan, sitting height, thigh length, upper arm length, and forearm length measurements in 31 males and 30 females were used to calculate the respective estimation for each method. Results showed that Canda-1 (based on four anthropometric variables) produced the smallest error and best fitted the data in males and females. The current method was associated with the largest error of those tests because it increasingly overestimated height in people with smaller stature. The results suggest that the set of Canda equations provide a more valid MASH estimation in people with a range of upper limb and bilateral lower limb amputations compared with the current method. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Paz-García, David A; Munguía-Vega, Adrián; Plomozo-Lugo, Tomas; Weaver, Amy Hudson
2017-04-01
We developed a set of hypervariable microsatellite markers for the Pacific red snapper (Lutjanus peru), an economically important marine fish for small-scale fisheries in the west coast of Mexico. We performed shotgun genome sequencing with the 454 XL titanium chemistry and used bioinformatic tools to search for perfect microsatellite loci. We selected 66 primer pairs that were synthesized and genotyped in an ABI PRISM 3730XL DNA sequencer in 32 individuals from the Gulf of California. We estimated levels of genetic diversity, deviations from linkage and Hardy-Weinberg equilibrium, estimated the frequency of null alleles and the probability of individual identity for the new markers. We reanalyzed 16 loci in 16 individuals to estimate genotyping error rates. Eighteen loci failed to amplify, 16 loci were discarded due to unspecific amplifications and 32 loci (14 tetranucleotide and 18 dinucleotide) were successfully scored. The average number of alleles per locus was 21 (±6.87, SD) and ranged from 8 to 34. The average observed and expected heterozygosities were 0.787 (±0.144 SD, range 0.250-0.935) and 0.909 (±0.122 SD, range 0.381-0.965), respectively. No significant linkage was detected. Eight loci showed deviations from Hardy-Weinberg equilibrium, and from these, four loci showed moderate null allele frequencies (0.104-0.220). The probability of individual identity for the new loci was 1.46 -62 . Genotyping error rates averaged 9.58%. The new markers will be useful to investigate patterns of larval dispersal, metapopulation dynamics, fine-scale genetic structure and diversity aimed to inform the implementation of spatially explicit fisheries management strategies in the Gulf of California.
Gaussian copula as a likelihood function for environmental models
NASA Astrophysics Data System (ADS)
Wani, O.; Espadas, G.; Cecinati, F.; Rieckermann, J.
2017-12-01
Parameter estimation of environmental models always comes with uncertainty. To formally quantify this parametric uncertainty, a likelihood function needs to be formulated, which is defined as the probability of observations given fixed values of the parameter set. A likelihood function allows us to infer parameter values from observations using Bayes' theorem. The challenge is to formulate a likelihood function that reliably describes the error generating processes which lead to the observed monitoring data, such as rainfall and runoff. If the likelihood function is not representative of the error statistics, the parameter inference will give biased parameter values. Several uncertainty estimation methods that are currently being used employ Gaussian processes as a likelihood function, because of their favourable analytical properties. Box-Cox transformation is suggested to deal with non-symmetric and heteroscedastic errors e.g. for flow data which are typically more uncertain in high flows than in periods with low flows. Problem with transformations is that the results are conditional on hyper-parameters, for which it is difficult to formulate the analyst's belief a priori. In an attempt to address this problem, in this research work we suggest learning the nature of the error distribution from the errors made by the model in the "past" forecasts. We use a Gaussian copula to generate semiparametric error distributions . 1) We show that this copula can be then used as a likelihood function to infer parameters, breaking away from the practice of using multivariate normal distributions. Based on the results from a didactical example of predicting rainfall runoff, 2) we demonstrate that the copula captures the predictive uncertainty of the model. 3) Finally, we find that the properties of autocorrelation and heteroscedasticity of errors are captured well by the copula, eliminating the need to use transforms. In summary, our findings suggest that copulas are an interesting departure from the usage of fully parametric distributions as likelihood functions - and they could help us to better capture the statistical properties of errors and make more reliable predictions.
NASA Astrophysics Data System (ADS)
Mauder, M.; Huq, S.; De Roo, F.; Foken, T.; Manhart, M.; Schmid, H. P. E.
2017-12-01
The Campbell CSAT3 sonic anemometer is one of the most widely used instruments for eddy-covariance measurement. However, conflicting estimates for the probe-induced flow distortion error of this instrument have been reported recently, and those error estimates range between 3% and 14% for the measurement of vertical velocity fluctuations. This large discrepancy between the different studies can probably be attributed to the different experimental approaches applied. In order to overcome the limitations of both field intercomparison experiments and wind tunnel experiments, we propose a new approach that relies on virtual measurements in a large-eddy simulation (LES) environment. In our experimental set-up, we generate horizontal and vertical velocity fluctuations at frequencies that typically dominate the turbulence spectra of the surface layer. The probe-induced flow distortion error of a CSAT3 is then quantified by this numerical wind tunnel approach while the statistics of the prescribed inflow signal are taken as reference or etalon. The resulting relative error is found to range from 3% to 7% and from 1% to 3% for the standard deviation of the vertical and the horizontal velocity component, respectively, depending on the orientation of the CSAT3 in the flow field. We further demonstrate that these errors are independent of the frequency of fluctuations at the inflow of the simulation. The analytical corrections proposed by Kaimal et al. (Proc Dyn Flow Conf, 551-565, 1978) and Horst et al. (Boundary-Layer Meteorol, 155, 371-395, 2015) are compared against our simulated results, and we find that they indeed reduce the error by up to three percentage points. However, these corrections fail to reproduce the azimuth-dependence of the error that we observe. Moreover, we investigate the general Reynolds number dependence of the flow distortion error by more detailed idealized simulations.
NASA Astrophysics Data System (ADS)
Liu, Yiming; Shi, Yimin; Bai, Xuchao; Zhan, Pei
2018-01-01
In this paper, we study the estimation for the reliability of a multicomponent system, named N- M-cold-standby redundancy system, based on progressive Type-II censoring sample. In the system, there are N subsystems consisting of M statistically independent distributed strength components, and only one of these subsystems works under the impact of stresses at a time and the others remain as standbys. Whenever the working subsystem fails, one from the standbys takes its place. The system fails when the entire subsystems fail. It is supposed that the underlying distributions of random strength and stress both belong to the generalized half-logistic distribution with different shape parameter. The reliability of the system is estimated by using both classical and Bayesian statistical inference. Uniformly minimum variance unbiased estimator and maximum likelihood estimator for the reliability of the system are derived. Under squared error loss function, the exact expression of the Bayes estimator for the reliability of the system is developed by using the Gauss hypergeometric function. The asymptotic confidence interval and corresponding coverage probabilities are derived based on both the Fisher and the observed information matrices. The approximate highest probability density credible interval is constructed by using Monte Carlo method. Monte Carlo simulations are performed to compare the performances of the proposed reliability estimators. A real data set is also analyzed for an illustration of the findings.
Performance of cellular frequency-hopped spread-spectrum radio networks
NASA Astrophysics Data System (ADS)
Gluck, Jeffrey W.; Geraniotis, Evaggelos
1989-10-01
Multiple access interference is characterized for cellular mobile networks, in which users are assumed to be Poisson-distributed in the plane and employ frequency-hopped spread-spectrum signaling with transmitter-oriented assignment of frequency-hopping patterns. Exact expressions for the bit error probabilities are derived for binary coherently demodulated systems without coding. Approximations for the packet error probability are derived for coherent and noncoherent systems and these approximations are applied when forward-error-control coding is employed. In all cases, the effects of varying interference power are accurately taken into account according to some propagation law. Numerical results are given in terms of bit error probability for the exact case and throughput for the approximate analyses. Comparisons are made with previously derived bounds and it is shown that these tend to be very pessimistic.
NASA Technical Reports Server (NTRS)
Massey, J. L.
1976-01-01
Virtually all previously-suggested rate 1/2 binary convolutional codes with KE = 24 are compared. Their distance properties are given; and their performance, both in computation and in error probability, with sequential decoding on the deep-space channel is determined by simulation. Recommendations are made both for the choice of a specific KE = 24 code as well as for codes to be included in future coding standards for the deep-space channel. A new result given in this report is a method for determining the statistical significance of error probability data when the error probability is so small that it is not feasible to perform enough decoding simulations to obtain more than a very small number of decoding errors.
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models; and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of tests errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
Ice Cores Dating With a New Inverse Method Taking Account of the Flow Modeling Errors
NASA Astrophysics Data System (ADS)
Lemieux-Dudon, B.; Parrenin, F.; Blayo, E.
2007-12-01
Deep ice cores extracted from Antarctica or Greenland recorded a wide range of past climatic events. In order to contribute to the Quaternary climate system understanding, the calculation of an accurate depth-age relationship is a crucial point. Up to now ice chronologies for deep ice cores estimated with inverse approaches are based on quite simplified ice-flow models that fail to reproduce flow irregularities and consequently to respect all available set of age markers. We describe in this paper, a new inverse method that takes into account the model uncertainty in order to circumvent the restrictions linked to the use of simplified flow models. This method uses first guesses on two flow physical entities, the ice thinning function and the accumulation rate and then identifies correction functions on both flow entities. We highlight two major benefits brought by this new method: first of all the ability to respect large set of observations and as a consequence, the feasibility to estimate a synchronized common ice chronology for several cores at the same time. This inverse approach relies on a bayesian framework. To respect the positive constraint on the searched correction functions, we assume lognormal probability distribution on one hand for the background errors, but also for one particular set of the observation errors. We test this new inversion method on three cores simultaneously (the two EPICA cores : DC and DML and the Vostok core) and we assimilate more than 150 observations (e.g.: age markers, stratigraphic links,...). We analyze the sensitivity of the solution with respect to the background information, especially the prior error covariance matrix. The confidence intervals based on the posterior covariance matrix calculation, are estimated on the correction functions and for the first time on the overall output chronologies.
Bayesian network models for error detection in radiotherapy plans
NASA Astrophysics Data System (ADS)
Kalet, Alan M.; Gennari, John H.; Ford, Eric C.; Phillips, Mark H.
2015-04-01
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network’s conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures.
Wilkinson, Michael
2014-03-01
Decisions about support for predictions of theories in light of data are made using statistical inference. The dominant approach in sport and exercise science is the Neyman-Pearson (N-P) significance-testing approach. When applied correctly it provides a reliable procedure for making dichotomous decisions for accepting or rejecting zero-effect null hypotheses with known and controlled long-run error rates. Type I and type II error rates must be specified in advance and the latter controlled by conducting an a priori sample size calculation. The N-P approach does not provide the probability of hypotheses or indicate the strength of support for hypotheses in light of data, yet many scientists believe it does. Outcomes of analyses allow conclusions only about the existence of non-zero effects, and provide no information about the likely size of true effects or their practical/clinical value. Bayesian inference can show how much support data provide for different hypotheses, and how personal convictions should be altered in light of data, but the approach is complicated by formulating probability distributions about prior subjective estimates of population effects. A pragmatic solution is magnitude-based inference, which allows scientists to estimate the true magnitude of population effects and how likely they are to exceed an effect magnitude of practical/clinical importance, thereby integrating elements of subjective Bayesian-style thinking. While this approach is gaining acceptance, progress might be hastened if scientists appreciate the shortcomings of traditional N-P null hypothesis significance testing.
Nesvizhskii, Alexey I.
2010-01-01
This manuscript provides a comprehensive review of the peptide and protein identification process using tandem mass spectrometry (MS/MS) data generated in shotgun proteomic experiments. The commonly used methods for assigning peptide sequences to MS/MS spectra are critically discussed and compared, from basic strategies to advanced multi-stage approaches. A particular attention is paid to the problem of false-positive identifications. Existing statistical approaches for assessing the significance of peptide to spectrum matches are surveyed, ranging from single-spectrum approaches such as expectation values to global error rate estimation procedures such as false discovery rates and posterior probabilities. The importance of using auxiliary discriminant information (mass accuracy, peptide separation coordinates, digestion properties, and etc.) is discussed, and advanced computational approaches for joint modeling of multiple sources of information are presented. This review also includes a detailed analysis of the issues affecting the interpretation of data at the protein level, including the amplification of error rates when going from peptide to protein level, and the ambiguities in inferring the identifies of sample proteins in the presence of shared peptides. Commonly used methods for computing protein-level confidence scores are discussed in detail. The review concludes with a discussion of several outstanding computational issues. PMID:20816881
Robust Characterization of Loss Rates
NASA Astrophysics Data System (ADS)
Wallman, Joel J.; Barnhill, Marie; Emerson, Joseph
2015-08-01
Many physical implementations of qubits—including ion traps, optical lattices and linear optics—suffer from loss. A nonzero probability of irretrievably losing a qubit can be a substantial obstacle to fault-tolerant methods of processing quantum information, requiring new techniques to safeguard against loss that introduce an additional overhead that depends upon the loss rate. Here we present a scalable and platform-independent protocol for estimating the average loss rate (averaged over all input states) resulting from an arbitrary Markovian noise process, as well as an independent estimate of detector efficiency. Moreover, we show that our protocol gives an additional constraint on estimated parameters from randomized benchmarking that improves the reliability of the estimated error rate and provides a new indicator for non-Markovian signatures in the experimental data. We also derive a bound for the state-dependent loss rate in terms of the average loss rate.
2018-01-01
Propensity score methods are increasingly being used to estimate the effects of treatments and exposures when using observational data. The propensity score was initially developed for use with binary exposures. The generalized propensity score (GPS) is an extension of the propensity score for use with quantitative or continuous exposures (eg, dose or quantity of medication, income, or years of education). We used Monte Carlo simulations to examine the performance of different methods of using the GPS to estimate the effect of continuous exposures on binary outcomes. We examined covariate adjustment using the GPS and weighting using weights based on the inverse of the GPS. We examined both the use of ordinary least squares to estimate the propensity function and the use of the covariate balancing propensity score algorithm. The use of methods based on the GPS was compared with the use of G‐computation. All methods resulted in essentially unbiased estimation of the population dose‐response function. However, GPS‐based weighting tended to result in estimates that displayed greater variability and had higher mean squared error when the magnitude of confounding was strong. Of the methods based on the GPS, covariate adjustment using the GPS tended to result in estimates with lower variability and mean squared error when the magnitude of confounding was strong. We illustrate the application of these methods by estimating the effect of average neighborhood income on the probability of death within 1 year of hospitalization for an acute myocardial infarction. PMID:29508424
Austin, Peter C
2018-05-20
Propensity score methods are increasingly being used to estimate the effects of treatments and exposures when using observational data. The propensity score was initially developed for use with binary exposures. The generalized propensity score (GPS) is an extension of the propensity score for use with quantitative or continuous exposures (eg, dose or quantity of medication, income, or years of education). We used Monte Carlo simulations to examine the performance of different methods of using the GPS to estimate the effect of continuous exposures on binary outcomes. We examined covariate adjustment using the GPS and weighting using weights based on the inverse of the GPS. We examined both the use of ordinary least squares to estimate the propensity function and the use of the covariate balancing propensity score algorithm. The use of methods based on the GPS was compared with the use of G-computation. All methods resulted in essentially unbiased estimation of the population dose-response function. However, GPS-based weighting tended to result in estimates that displayed greater variability and had higher mean squared error when the magnitude of confounding was strong. Of the methods based on the GPS, covariate adjustment using the GPS tended to result in estimates with lower variability and mean squared error when the magnitude of confounding was strong. We illustrate the application of these methods by estimating the effect of average neighborhood income on the probability of death within 1 year of hospitalization for an acute myocardial infarction. © 2018 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Bias in the Wagner-Nelson estimate of the fraction of drug absorbed.
Wang, Yibin; Nedelman, Jerry
2002-04-01
To examine and quantify bias in the Wagner-Nelson estimate of the fraction of drug absorbed resulting from the estimation error of the elimination rate constant (k), measurement error of the drug concentration, and the truncation error in the area under the curve. Bias in the Wagner-Nelson estimate was derived as a function of post-dosing time (t), k, ratio of absorption rate constant to k (r), and the coefficient of variation for estimates of k (CVk), or CV% for the observed concentration, by assuming a one-compartment model and using an independent estimate of k. The derived functions were used for evaluating the bias with r = 0.5, 3, or 6; k = 0.1 or 0.2; CV, = 0.2 or 0.4; and CV, =0.2 or 0.4; for t = 0 to 30 or 60. Estimation error of k resulted in an upward bias in the Wagner-Nelson estimate that could lead to the estimate of the fraction absorbed being greater than unity. The bias resulting from the estimation error of k inflates the fraction of absorption vs. time profiles mainly in the early post-dosing period. The magnitude of the bias in the Wagner-Nelson estimate resulting from estimation error of k was mainly determined by CV,. The bias in the Wagner-Nelson estimate resulting from to estimation error in k can be dramatically reduced by use of the mean of several independent estimates of k, as in studies for development of an in vivo-in vitro correlation. The truncation error in the area under the curve can introduce a negative bias in the Wagner-Nelson estimate. This can partially offset the bias resulting from estimation error of k in the early post-dosing period. Measurement error of concentration does not introduce bias in the Wagner-Nelson estimate. Estimation error of k results in an upward bias in the Wagner-Nelson estimate, mainly in the early drug absorption phase. The truncation error in AUC can result in a downward bias, which may partially offset the upward bias due to estimation error of k in the early absorption phase. Measurement error of concentration does not introduce bias. The joint effect of estimation error of k and truncation error in AUC can result in a non-monotonic fraction-of-drug-absorbed-vs-time profile. However, only estimation error of k can lead to the Wagner-Nelson estimate of fraction of drug absorbed greater than unity.
Commentary: Reducing diagnostic errors: another role for checklists?
Winters, Bradford D; Aswani, Monica S; Pronovost, Peter J
2011-03-01
Diagnostic errors are a widespread problem, although the true magnitude is unknown because they cannot currently be measured validly. These errors have received relatively little attention despite alarming estimates of associated harm and death. One promising intervention to reduce preventable harm is the checklist. This intervention has proven successful in aviation, in which situations are linear and deterministic (one alarm goes off and a checklist guides the flight crew to evaluate the cause). In health care, problems are multifactorial and complex. A checklist has been used to reduce central-line-associated bloodstream infections in intensive care units. Nevertheless, this checklist was incorporated in a culture-based safety program that engaged and changed behaviors and used robust measurement of infections to evaluate progress. In this issue, Ely and colleagues describe how three checklists could reduce the cognitive biases and mental shortcuts that underlie diagnostic errors, but point out that these tools still need to be tested. To be effective, they must reduce diagnostic errors (efficacy) and be routinely used in practice (effectiveness). Such tools must intuitively support how the human brain works, and under time pressures, clinicians rarely think in conditional probabilities when making decisions. To move forward, it is necessary to accurately measure diagnostic errors (which could come from mapping out the diagnostic process as the medication process has done and measuring errors at each step) and pilot test interventions such as these checklists to determine whether they work.
A complete representation of uncertainties in layer-counted paleoclimatic archives
NASA Astrophysics Data System (ADS)
Boers, Niklas; Goswami, Bedartha; Ghil, Michael
2017-09-01
Accurate time series representation of paleoclimatic proxy records is challenging because such records involve dating errors in addition to proxy measurement errors. Rigorous attention is rarely given to age uncertainties in paleoclimatic research, although the latter can severely bias the results of proxy record analysis. Here, we introduce a Bayesian approach to represent layer-counted proxy records - such as ice cores, sediments, corals, or tree rings - as sequences of probability distributions on absolute, error-free time axes. The method accounts for both proxy measurement errors and uncertainties arising from layer-counting-based dating of the records. An application to oxygen isotope ratios from the North Greenland Ice Core Project (NGRIP) record reveals that the counting errors, although seemingly small, lead to substantial uncertainties in the final representation of the oxygen isotope ratios. In particular, for the older parts of the NGRIP record, our results show that the total uncertainty originating from dating errors has been seriously underestimated. Our method is next applied to deriving the overall uncertainties of the Suigetsu radiocarbon comparison curve, which was recently obtained from varved sediment cores at Lake Suigetsu, Japan. This curve provides the only terrestrial radiocarbon comparison for the time interval 12.5-52.8 kyr BP. The uncertainties derived here can be readily employed to obtain complete error estimates for arbitrary radiometrically dated proxy records of this recent part of the last glacial interval.
Adaptive aperture for Geiger mode avalanche photodiode flash ladar systems.
Wang, Liang; Han, Shaokun; Xia, Wenze; Lei, Jieyu
2018-02-01
Although the Geiger-mode avalanche photodiode (GM-APD) flash ladar system offers the advantages of high sensitivity and simple construction, its detection performance is influenced not only by the incoming signal-to-noise ratio but also by the absolute number of noise photons. In this paper, we deduce a hyperbolic approximation to estimate the noise-photon number from the false-firing percentage in a GM-APD flash ladar system under dark conditions. By using this hyperbolic approximation function, we introduce a method to adapt the aperture to reduce the number of incoming background-noise photons. Finally, the simulation results show that the adaptive-aperture method decreases the false probability in all cases, increases the detection probability provided that the signal exceeds the noise, and decreases the average ranging error per frame.
Adaptive aperture for Geiger mode avalanche photodiode flash ladar systems
NASA Astrophysics Data System (ADS)
Wang, Liang; Han, Shaokun; Xia, Wenze; Lei, Jieyu
2018-02-01
Although the Geiger-mode avalanche photodiode (GM-APD) flash ladar system offers the advantages of high sensitivity and simple construction, its detection performance is influenced not only by the incoming signal-to-noise ratio but also by the absolute number of noise photons. In this paper, we deduce a hyperbolic approximation to estimate the noise-photon number from the false-firing percentage in a GM-APD flash ladar system under dark conditions. By using this hyperbolic approximation function, we introduce a method to adapt the aperture to reduce the number of incoming background-noise photons. Finally, the simulation results show that the adaptive-aperture method decreases the false probability in all cases, increases the detection probability provided that the signal exceeds the noise, and decreases the average ranging error per frame.
Supernova Cosmology Inference with Probabilistic Photometric Redshifts (SCIPPR)
NASA Astrophysics Data System (ADS)
Peters, Christina; Malz, Alex; Hlozek, Renée
2018-01-01
The Bayesian Estimation Applied to Multiple Species (BEAMS) framework employs probabilistic supernova type classifications to do photometric SN cosmology. This work extends BEAMS to replace high-confidence spectroscopic redshifts with photometric redshift probability density functions, a capability that will be essential in the era the Large Synoptic Survey Telescope and other next-generation photometric surveys where it will not be possible to perform spectroscopic follow up on every SN. We present the Supernova Cosmology Inference with Probabilistic Photometric Redshifts (SCIPPR) Bayesian hierarchical model for constraining the cosmological parameters from photometric lightcurves and host galaxy photometry, which includes selection effects and is extensible to uncertainty in the redshift-dependent supernova type proportions. We create a pair of realistic mock catalogs of joint posteriors over supernova type, redshift, and distance modulus informed by photometric supernova lightcurves and over redshift from simulated host galaxy photometry. We perform inference under our model to obtain a joint posterior probability distribution over the cosmological parameters and compare our results with other methods, namely: a spectroscopic subset, a subset of high probability photometrically classified supernovae, and reducing the photometric redshift probability to a single measurement and error bar.
Frequency domain analysis of errors in cross-correlations of ambient seismic noise
NASA Astrophysics Data System (ADS)
Liu, Xin; Ben-Zion, Yehuda; Zigone, Dimitri
2016-12-01
We analyse random errors (variances) in cross-correlations of ambient seismic noise in the frequency domain, which differ from previous time domain methods. Extending previous theoretical results on ensemble averaged cross-spectrum, we estimate confidence interval of stacked cross-spectrum of finite amount of data at each frequency using non-overlapping windows with fixed length. The extended theory also connects amplitude and phase variances with the variance of each complex spectrum value. Analysis of synthetic stationary ambient noise is used to estimate the confidence interval of stacked cross-spectrum obtained with different length of noise data corresponding to different number of evenly spaced windows of the same duration. This method allows estimating Signal/Noise Ratio (SNR) of noise cross-correlation in the frequency domain, without specifying filter bandwidth or signal/noise windows that are needed for time domain SNR estimations. Based on synthetic ambient noise data, we also compare the probability distributions, causal part amplitude and SNR of stacked cross-spectrum function using one-bit normalization or pre-whitening with those obtained without these pre-processing steps. Natural continuous noise records contain both ambient noise and small earthquakes that are inseparable from the noise with the existing pre-processing steps. Using probability distributions of random cross-spectrum values based on the theoretical results provides an effective way to exclude such small earthquakes, and additional data segments (outliers) contaminated by signals of different statistics (e.g. rain, cultural noise), from continuous noise waveforms. This technique is applied to constrain values and uncertainties of amplitude and phase velocity of stacked noise cross-spectrum at different frequencies, using data from southern California at both regional scale (˜35 km) and dense linear array (˜20 m) across the plate-boundary faults. A block bootstrap resampling method is used to account for temporal correlation of noise cross-spectrum at low frequencies (0.05-0.2 Hz) near the ocean microseismic peaks.
Nonlinear Demodulation and Channel Coding in EBPSK Scheme
Chen, Xianqing; Wu, Lenan
2012-01-01
The extended binary phase shift keying (EBPSK) is an efficient modulation technique, and a special impacting filter (SIF) is used in its demodulator to improve the bit error rate (BER) performance. However, the conventional threshold decision cannot achieve the optimum performance, and the SIF brings more difficulty in obtaining the posterior probability for LDPC decoding. In this paper, we concentrate not only on reducing the BER of demodulation, but also on providing accurate posterior probability estimates (PPEs). A new approach for the nonlinear demodulation based on the support vector machine (SVM) classifier is introduced. The SVM method which selects only a few sampling points from the filter output was used for getting PPEs. The simulation results show that the accurate posterior probability can be obtained with this method and the BER performance can be improved significantly by applying LDPC codes. Moreover, we analyzed the effect of getting the posterior probability with different methods and different sampling rates. We show that there are more advantages of the SVM method under bad condition and it is less sensitive to the sampling rate than other methods. Thus, SVM is an effective method for EBPSK demodulation and getting posterior probability for LDPC decoding. PMID:23213281
Nonlinear demodulation and channel coding in EBPSK scheme.
Chen, Xianqing; Wu, Lenan
2012-01-01
The extended binary phase shift keying (EBPSK) is an efficient modulation technique, and a special impacting filter (SIF) is used in its demodulator to improve the bit error rate (BER) performance. However, the conventional threshold decision cannot achieve the optimum performance, and the SIF brings more difficulty in obtaining the posterior probability for LDPC decoding. In this paper, we concentrate not only on reducing the BER of demodulation, but also on providing accurate posterior probability estimates (PPEs). A new approach for the nonlinear demodulation based on the support vector machine (SVM) classifier is introduced. The SVM method which selects only a few sampling points from the filter output was used for getting PPEs. The simulation results show that the accurate posterior probability can be obtained with this method and the BER performance can be improved significantly by applying LDPC codes. Moreover, we analyzed the effect of getting the posterior probability with different methods and different sampling rates. We show that there are more advantages of the SVM method under bad condition and it is less sensitive to the sampling rate than other methods. Thus, SVM is an effective method for EBPSK demodulation and getting posterior probability for LDPC decoding.
A two-phase sampling design for increasing detections of rare species in occupancy surveys
Pacifici, Krishna; Dorazio, Robert M.; Dorazio, Michael J.
2012-01-01
1. Occupancy estimation is a commonly used tool in ecological studies owing to the ease at which data can be collected and the large spatial extent that can be covered. One major obstacle to using an occupancy-based approach is the complications associated with designing and implementing an efficient survey. These logistical challenges become magnified when working with rare species when effort can be wasted in areas with none or very few individuals. 2. Here, we develop a two-phase sampling approach that mitigates these problems by using a design that places more effort in areas with higher predicted probability of occurrence. We compare our new sampling design to traditional single-season occupancy estimation under a range of conditions and population characteristics. We develop an intuitive measure of predictive error to compare the two approaches and use simulations to assess the relative accuracy of each approach. 3. Our two-phase approach exhibited lower predictive error rates compared to the traditional single-season approach in highly spatially correlated environments. The difference was greatest when detection probability was high (0·75) regardless of the habitat or sample size. When the true occupancy rate was below 0·4 (0·05-0·4), we found that allocating 25% of the sample to the first phase resulted in the lowest error rates. 4. In the majority of scenarios, the two-phase approach showed lower error rates compared to the traditional single-season approach suggesting our new approach is fairly robust to a broad range of conditions and design factors and merits use under a wide variety of settings. 5. Synthesis and applications. Conservation and management of rare species are a challenging task facing natural resource managers. It is critical for studies involving rare species to efficiently allocate effort and resources as they are usually of a finite nature. We believe our approach provides a framework for optimal allocation of effort while maximizing the information content of the data in an attempt to provide the highest conservation value per unit of effort.
NASA Astrophysics Data System (ADS)
Goh, Shu Ting
Spacecraft formation flying navigation continues to receive a great deal of interest. The research presented in this dissertation focuses on developing methods for estimating spacecraft absolute and relative positions, assuming measurements of only relative positions using wireless sensors. The implementation of the extended Kalman filter to the spacecraft formation navigation problem results in high estimation errors and instabilities in state estimation at times. This is due to the high nonlinearities in the system dynamic model. Several approaches are attempted in this dissertation aiming at increasing the estimation stability and improving the estimation accuracy. A differential geometric filter is implemented for spacecraft positions estimation. The differential geometric filter avoids the linearization step (which is always carried out in the extended Kalman filter) through a mathematical transformation that converts the nonlinear system into a linear system. A linear estimator is designed in the linear domain, and then transformed back to the physical domain. This approach demonstrated better estimation stability for spacecraft formation positions estimation, as detailed in this dissertation. The constrained Kalman filter is also implemented for spacecraft formation flying absolute positions estimation. The orbital motion of a spacecraft is characterized by two range extrema (perigee and apogee). At the extremum, the rate of change of a spacecraft's range vanishes. This motion constraint can be used to improve the position estimation accuracy. The application of the constrained Kalman filter at only two points in the orbit causes filter instability. Two variables are introduced into the constrained Kalman filter to maintain the stability and improve the estimation accuracy. An extended Kalman filter is implemented as a benchmark for comparison with the constrained Kalman filter. Simulation results show that the constrained Kalman filter provides better estimation accuracy as compared with the extended Kalman filter. A Weighted Measurement Fusion Kalman Filter (WMFKF) is proposed in this dissertation. In wireless localizing sensors, a measurement error is proportional to the distance of the signal travels and sensor noise. In this proposed Weighted Measurement Fusion Kalman Filter, the signal traveling time delay is not modeled; however, each measurement is weighted based on the measured signal travel distance. The obtained estimation performance is compared to the standard Kalman filter in two scenarios. The first scenario assumes using a wireless local positioning system in a GPS denied environment. The second scenario assumes the availability of both the wireless local positioning system and GPS measurements. The simulation results show that the WMFKF has similar accuracy performance as the standard Kalman Filter (KF) in the GPS denied environment. However, the WMFKF maintains the position estimation error within its expected error boundary when the WLPS detection range limit is above 30km. In addition, the WMFKF has a better accuracy and stability performance when GPS is available. Also, the computational cost analysis shows that the WMFKF has less computational cost than the standard KF, and the WMFKF has higher ellipsoid error probable percentage than the standard Measurement Fusion method. A method to determine the relative attitudes between three spacecraft is developed. The method requires four direction measurements between the three spacecraft. The simulation results and covariance analysis show that the method's error falls within a three sigma boundary without exhibiting any singularity issues. A study of the accuracy of the proposed method with respect to the shape of the spacecraft formation is also presented.
Rew, Mary Beth; Robbins, Jooke; Mattila, David; Palsbøll, Per J; Bérube, Martine
2011-04-01
Genetic identification of individuals is now commonplace, enabling the application of tagging methods to elusive species or species that cannot be tagged by traditional methods. A key aspect is determining the number of loci required to ensure that different individuals have non-matching multi-locus genotypes. Closely related individuals are of particular concern because of elevated matching probabilities caused by their recent co-ancestry. This issue may be addressed by increasing the number of loci to a level where full siblings (the relatedness category with the highest matching probability) are expected to have non-matching multi-locus genotypes. However, increasing the number of loci to meet this "full-sib criterion" greatly increases the laboratory effort, which in turn may increase the genotyping error rate resulting in an upward-biased mark-recapture estimate of abundance as recaptures are missed due to genotyping errors. We assessed the contribution of false matches from close relatives among 425 maternally related humpback whales, each genotyped at 20 microsatellite loci. We observed a very low (0.5-4%) contribution to falsely matching samples from pairs of first-order relatives (i.e., parent and offspring or full siblings). The main contribution to falsely matching individuals from close relatives originated from second-order relatives (e.g., half siblings), which was estimated at 9%. In our study, the total number of observed matches agreed well with expectations based upon the matching probability estimated for unrelated individuals, suggesting that the full-sib criterion is overly conservative, and would have required a 280% relative increase in effort. We suggest that, under most circumstances, the overall contribution to falsely matching samples from close relatives is likely to be low, and hence applying the full-sib criterion is unnecessary. In those cases where close relatives may present a significant issue, such as unrepresentative sampling, we propose three different genotyping strategies requiring only a modest increase in effort, which will greatly reduce the number of false matches due to the presence of related individuals.
Dynamic Sensor Tasking for Space Situational Awareness via Reinforcement Learning
NASA Astrophysics Data System (ADS)
Linares, R.; Furfaro, R.
2016-09-01
This paper studies the Sensor Management (SM) problem for optical Space Object (SO) tracking. The tasking problem is formulated as a Markov Decision Process (MDP) and solved using Reinforcement Learning (RL). The RL problem is solved using the actor-critic policy gradient approach. The actor provides a policy which is random over actions and given by a parametric probability density function (pdf). The critic evaluates the policy by calculating the estimated total reward or the value function for the problem. The parameters of the policy action pdf are optimized using gradients with respect to the reward function. Both the critic and the actor are modeled using deep neural networks (multi-layer neural networks). The policy neural network takes the current state as input and outputs probabilities for each possible action. This policy is random, and can be evaluated by sampling random actions using the probabilities determined by the policy neural network's outputs. The critic approximates the total reward using a neural network. The estimated total reward is used to approximate the gradient of the policy network with respect to the network parameters. This approach is used to find the non-myopic optimal policy for tasking optical sensors to estimate SO orbits. The reward function is based on reducing the uncertainty for the overall catalog to below a user specified uncertainty threshold. This work uses a 30 km total position error for the uncertainty threshold. This work provides the RL method with a negative reward as long as any SO has a total position error above the uncertainty threshold. This penalizes policies that take longer to achieve the desired accuracy. A positive reward is provided when all SOs are below the catalog uncertainty threshold. An optimal policy is sought that takes actions to achieve the desired catalog uncertainty in minimum time. This work trains the policy in simulation by letting it task a single sensor to "learn" from its performance. The proposed approach for the SM problem is tested in simulation and good performance is found using the actor-critic policy gradient method.
Modulation/demodulation techniques for satellite communications. Part 1: Background
NASA Technical Reports Server (NTRS)
Omura, J. K.; Simon, M. K.
1981-01-01
Basic characteristics of digital data transmission systems described include the physical communication links, the notion of bandwidth, FCC regulations, and performance measurements such as bit rates, bit error probabilities, throughputs, and delays. The error probability performance and spectral characteristics of various modulation/demodulation techniques commonly used or proposed for use in radio and satellite communication links are summarized. Forward error correction with block or convolutional codes is also discussed along with the important coding parameter, channel cutoff rate.
NASA Astrophysics Data System (ADS)
Kaftan, V. I.; Melnikov, A. Yu.
2018-01-01
The results of Global Navigational Satellite System (GNSS) observations in the regions of large earthquakes are analyzed. The characteristics of the Earth's surface deformations before, during, and after the earthquakes are considered. The obtained results demonstrate the presence of anomalous deformations close to the epicenters of the events. Statistical estimates of the anomalous strains and their relationship with measurement errors are obtained. Conclusions are drawn about the probable use of local GNSS networks to assess the risk of the occurrence of strong seismic events.
Empirical prediction intervals improve energy forecasting
Kaack, Lynn H.; Apt, Jay; Morgan, M. Granger; McSharry, Patrick
2017-01-01
Hundreds of organizations and analysts use energy projections, such as those contained in the US Energy Information Administration (EIA)’s Annual Energy Outlook (AEO), for investment and policy decisions. Retrospective analyses of past AEO projections have shown that observed values can differ from the projection by several hundred percent, and thus a thorough treatment of uncertainty is essential. We evaluate the out-of-sample forecasting performance of several empirical density forecasting methods, using the continuous ranked probability score (CRPS). The analysis confirms that a Gaussian density, estimated on past forecasting errors, gives comparatively accurate uncertainty estimates over a variety of energy quantities in the AEO, in particular outperforming scenario projections provided in the AEO. We report probabilistic uncertainties for 18 core quantities of the AEO 2016 projections. Our work frames how to produce, evaluate, and rank probabilistic forecasts in this setting. We propose a log transformation of forecast errors for price projections and a modified nonparametric empirical density forecasting method. Our findings give guidance on how to evaluate and communicate uncertainty in future energy outlooks. PMID:28760997
A probabilistic approach to the drag-based model
NASA Astrophysics Data System (ADS)
Napoletano, Gianluca; Forte, Roberta; Moro, Dario Del; Pietropaolo, Ermanno; Giovannelli, Luca; Berrilli, Francesco
2018-02-01
The forecast of the time of arrival (ToA) of a coronal mass ejection (CME) to Earth is of critical importance for our high-technology society and for any future manned exploration of the Solar System. As critical as the forecast accuracy is the knowledge of its precision, i.e. the error associated to the estimate. We propose a statistical approach for the computation of the ToA using the drag-based model by introducing the probability distributions, rather than exact values, as input parameters, thus allowing the evaluation of the uncertainty on the forecast. We test this approach using a set of CMEs whose transit times are known, and obtain extremely promising results: the average value of the absolute differences between measure and forecast is 9.1h, and half of these residuals are within the estimated errors. These results suggest that this approach deserves further investigation. We are working to realize a real-time implementation which ingests the outputs of automated CME tracking algorithms as inputs to create a database of events useful for a further validation of the approach.
NASA Technical Reports Server (NTRS)
Larson, Kristine M.; Ray, Richard D.; Williams, Simon D. P.
2017-01-01
A standard geodetic GPS receiver and a conventional Aquatrak tide gauge, collocated at Friday Harbor, Washington, are used to assess the quality of 10 years of water levels estimated from GPS sea surface reflections.The GPS results are improved by accounting for (tidal) motion of the reflecting sea surface and for signal propagation delay by the troposphere. The RMS error of individual GPS water level estimates is about 12 cm. Lower water levels are measured slightly more accurately than higher water levels. Forming daily mean sea levels reduces the RMS difference with the tide gauge data to approximately 2 cm. For monthly means, the RMS difference is 1.3 cm. The GPS elevations, of course, can be automatically placed into a well-defined terrestrial reference frame. Ocean tide coefficients, determined from both the GPS and tide gauge data, are in good agreement, with absolute differences below 1 cm for all constituents save K1 and S1. The latter constituent is especially anomalous, probably owing to daily temperature-induced errors in the Aquatrak tide gauge
NASA Astrophysics Data System (ADS)
Behmanesh, Iman; Yousefianmoghadam, Seyedsina; Nozari, Amin; Moaveni, Babak; Stavridis, Andreas
2018-07-01
This paper investigates the application of Hierarchical Bayesian model updating for uncertainty quantification and response prediction of civil structures. In this updating framework, structural parameters of an initial finite element (FE) model (e.g., stiffness or mass) are calibrated by minimizing error functions between the identified modal parameters and the corresponding parameters of the model. These error functions are assumed to have Gaussian probability distributions with unknown parameters to be determined. The estimated parameters of error functions represent the uncertainty of the calibrated model in predicting building's response (modal parameters here). The focus of this paper is to answer whether the quantified model uncertainties using dynamic measurement at building's reference/calibration state can be used to improve the model prediction accuracies at a different structural state, e.g., damaged structure. Also, the effects of prediction error bias on the uncertainty of the predicted values is studied. The test structure considered here is a ten-story concrete building located in Utica, NY. The modal parameters of the building at its reference state are identified from ambient vibration data and used to calibrate parameters of the initial FE model as well as the error functions. Before demolishing the building, six of its exterior walls were removed and ambient vibration measurements were also collected from the structure after the wall removal. These data are not used to calibrate the model; they are only used to assess the predicted results. The model updating framework proposed in this paper is applied to estimate the modal parameters of the building at its reference state as well as two damaged states: moderate damage (removal of four walls) and severe damage (removal of six walls). Good agreement is observed between the model-predicted modal parameters and those identified from vibration tests. Moreover, it is shown that including prediction error bias in the updating process instead of commonly-used zero-mean error function can significantly reduce the prediction uncertainties.
Unmanned aircraft system sense and avoid integrity and continuity
NASA Astrophysics Data System (ADS)
Jamoom, Michael B.
This thesis describes new methods to guarantee safety of sense and avoid (SAA) functions for Unmanned Aircraft Systems (UAS) by evaluating integrity and continuity risks. Previous SAA efforts focused on relative safety metrics, such as risk ratios, comparing the risk of using an SAA system versus not using it. The methods in this thesis evaluate integrity and continuity risks as absolute measures of safety, as is the established practice in commercial aircraft terminal area navigation applications. The main contribution of this thesis is a derivation of a new method, based on a standard intruder relative constant velocity assumption, that uses hazard state estimates and estimate error covariances to establish (1) the integrity risk of the SAA system not detecting imminent loss of '"well clear," which is the time and distance required to maintain safe separation from intruder aircraft, and (2) the probability of false alert, the continuity risk. Another contribution is applying these integrity and continuity risk evaluation methods to set quantifiable and certifiable safety requirements on sensors. A sensitivity analysis uses this methodology to evaluate the impact of sensor errors on integrity and continuity risks. The penultimate contribution is an integrity and continuity risk evaluation where the estimation model is refined to address realistic intruder relative linear accelerations, which goes beyond the current constant velocity standard. The final contribution is an integrity and continuity risk evaluation addressing multiple intruders. This evaluation is a new innovation-based method to determine the risk of mis-associating intruder measurements. A mis-association occurs when the SAA system incorrectly associates a measurement to the wrong intruder, causing large errors in the estimated intruder trajectories. The new methods described in this thesis can help ensure safe encounters between aircraft and enable SAA sensor certification for UAS integration into the National Airspace System.
Generalizing boundaries for triangular designs, and efficacy estimation at extended follow-ups.
Allison, Annabel; Edwards, Tansy; Omollo, Raymond; Alves, Fabiana; Magirr, Dominic; E Alexander, Neal D
2015-11-16
Visceral leishmaniasis (VL) is a parasitic disease transmitted by sandflies and is fatal if left untreated. Phase II trials of new treatment regimens for VL are primarily carried out to evaluate safety and efficacy, while pharmacokinetic data are also important to inform future combination treatment regimens. The efficacy of VL treatments is evaluated at two time points, initial cure, when treatment is completed and definitive cure, commonly 6 months post end of treatment, to allow for slow response to treatment and detection of relapses. This paper investigates a generalization of the triangular design to impose a minimum sample size for pharmacokinetic or other analyses, and methods to estimate efficacy at extended follow-up accounting for the sequential design and changes in cure status during extended follow-up. We provided R functions that generalize the triangular design to impose a minimum sample size before allowing stopping for efficacy. For estimation of efficacy at a second, extended, follow-up time, the performance of a shrinkage estimator (SHE), a probability tree estimator (PTE) and the maximum likelihood estimator (MLE) for estimation was assessed by simulation. The SHE and PTE are viable approaches to estimate an extended follow-up although the SHE performed better than the PTE: the bias and root mean square error were lower and coverage probabilities higher. Generalization of the triangular design is simple to implement for adaptations to meet requirements for pharmacokinetic analyses. Using the simple MLE approach to estimate efficacy at extended follow-up will lead to biased results, generally over-estimating treatment success. The SHE is recommended in trials of two or more treatments. The PTE is an acceptable alternative for one-arm trials or where use of the SHE is not possible due to computational complexity. NCT01067443 , February 2010.
Palmer, Tom M; Holmes, Michael V; Keating, Brendan J; Sheehan, Nuala A
2017-01-01
Abstract Mendelian randomization studies use genotypes as instrumental variables to test for and estimate the causal effects of modifiable risk factors on outcomes. Two-stage residual inclusion (TSRI) estimators have been used when researchers are willing to make parametric assumptions. However, researchers are currently reporting uncorrected or heteroscedasticity-robust standard errors for these estimates. We compared several different forms of the standard error for linear and logistic TSRI estimates in simulations and in real-data examples. Among others, we consider standard errors modified from the approach of Newey (1987), Terza (2016), and bootstrapping. In our simulations Newey, Terza, bootstrap, and corrected 2-stage least squares (in the linear case) standard errors gave the best results in terms of coverage and type I error. In the real-data examples, the Newey standard errors were 0.5% and 2% larger than the unadjusted standard errors for the linear and logistic TSRI estimators, respectively. We show that TSRI estimators with modified standard errors have correct type I error under the null. Researchers should report TSRI estimates with modified standard errors instead of reporting unadjusted or heteroscedasticity-robust standard errors. PMID:29106476
NASA Astrophysics Data System (ADS)
Hüsami Afşar, Mehdi; Unal Şorman, Ali; Tugrul Yilmaz, Mustafa
2016-04-01
Different drought characteristics (e.g. duration, average severity, and average areal extent) often have monotonic relation that increased magnitude of one often follows a similar increase in the magnitude of the other drought characteristic. Hence it is viable to establish a relationship between different drought characteristics with the goal of predicting one using other ones. Copula functions that relate different variables using their joint and conditional cumulative probability distributions are often used to statistically model the drought characteristics. In this study bivariate and trivariate joint probabilities of these characteristics are obtained over Ankara (Turkey) between 1960 and 2013. Copula-based return period estimation of drought characteristics of duration, average severity, and average areal extent show joint probabilities of these characteristics can be satisfactorily achieved. Among different copula families investigated in this study, elliptical family (i.e. including normal and t-student copula functions) resulted in the lowest root mean square error. "This study was supported by TUBITAK fund #114Y676)."
NASA Astrophysics Data System (ADS)
McKague, Darren Shawn
2001-12-01
The statistical properties of clouds and precipitation on a global scale are important to our understanding of climate. Inversion methods exist to retrieve the needed cloud and precipitation properties from satellite data pixel-by-pixel that can then be summarized over large data sets to obtain the desired statistics. These methods can be quite computationally expensive, and typically don't provide errors on the statistics. A new method is developed to directly retrieve probability distributions of parameters from the distribution of measured radiances. The method also provides estimates of the errors on the retrieved distributions. The method can retrieve joint distributions of parameters that allows for the study of the connection between parameters. A forward radiative transfer model creates a mapping from retrieval parameter space to radiance space. A Monte Carlo procedure uses the mapping to transform probability density from the observed radiance histogram to a two- dimensional retrieval property probability distribution function (PDF). An estimate of the uncertainty in the retrieved PDF is calculated from random realizations of the radiance to retrieval parameter PDF transformation given the uncertainty of the observed radiances, the radiance PDF, the forward radiative transfer, the finite number of prior state vectors, and the non-unique mapping to retrieval parameter space. The retrieval method is also applied to the remote sensing of precipitation from SSM/I microwave data. A method of stochastically generating hydrometeor fields based on the fields from a numerical cloud model is used to create the precipitation parameter radiance space transformation. The impact of vertical and horizontal variability within the hydrometeor fields has a significant impact on algorithm performance. Beamfilling factors are computed from the simulated hydrometeor fields. The beamfilling factors vary quite a bit depending upon the horizontal structure of the rain. The algorithm is applied to SSM/I images from the eastern tropical Pacific and is compared to PDFs of rain rate computed using pixel-by-pixel retrievals from Wilheit and from Liu and Curry. Differences exist between the three methods, but good general agreement is seen between the PDF retrieval algorithm and the algorithm of Liu and Curry. (Abstract shortened by UMI.)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jones, B; Miften, M
2014-06-15
Purpose: Cone-beam CT (CBCT) projection images provide anatomical data in real-time over several respiratory cycles, forming a comprehensive picture of tumor movement. We developed a method using these projections to determine the trajectory and dose of highly mobile tumors during each fraction of treatment. Methods: CBCT images of a respiration phantom were acquired, where the trajectory mimicked a lung tumor with high amplitude (2.4 cm) and hysteresis. A template-matching algorithm was used to identify the location of a steel BB in each projection. A Gaussian probability density function for tumor position was calculated which best fit the observed trajectory ofmore » the BB in the imager geometry. Two methods to improve the accuracy of tumor track reconstruction were investigated: first, using respiratory phase information to refine the trajectory estimation, and second, using the Monte Carlo method to sample the estimated Gaussian tumor position distribution. 15 clinically-drawn abdominal/lung CTV volumes were used to evaluate the accuracy of the proposed methods by comparing the known and calculated BB trajectories. Results: With all methods, the mean position of the BB was determined with accuracy better than 0.1 mm, and root-mean-square (RMS) trajectory errors were lower than 5% of marker amplitude. Use of respiratory phase information decreased RMS errors by 30%, and decreased the fraction of large errors (>3 mm) by half. Mean dose to the clinical volumes was calculated with an average error of 0.1% and average absolute error of 0.3%. Dosimetric parameters D90/D95 were determined within 0.5% of maximum dose. Monte-Carlo sampling increased RMS trajectory and dosimetric errors slightly, but prevented over-estimation of dose in trajectories with high noise. Conclusions: Tumor trajectory and dose-of-the-day were accurately calculated using CBCT projections. This technique provides a widely-available method to evaluate highly-mobile tumors, and could facilitate better strategies to mitigate or compensate for motion during SBRT.« less
Waltemeyer, Scott D.
2008-01-01
Estimates of the magnitude and frequency of peak discharges are necessary for the reliable design of bridges, culverts, and open-channel hydraulic analysis, and for flood-hazard mapping in New Mexico and surrounding areas. The U.S. Geological Survey, in cooperation with the New Mexico Department of Transportation, updated estimates of peak-discharge magnitude for gaging stations in the region and updated regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites by use of data collected through 2004 for 293 gaging stations on unregulated streams that have 10 or more years of record. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 140 of the 293 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated the peak discharge by having a recurrence interval of less than 1.4 years in the probability-density function. Within each of the nine regions, logarithms of the maximum peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics by using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then were applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 38 to 93 percent (mean value is 62, and median value is 59) for the 100-year flood. The 1996 investigation standard error of prediction for the flood regions ranged from 41 to 96 percent (mean value is 67, and median value is 68) for the 100-year flood that was analyzed by using generalized least-squares regression analysis. Overall, the equations based on generalized least-squares regression techniques are more reliable than those in the 1996 report because of the increased length of record and improved geographic information system (GIS) method to determine basin and climatic characteristics. Flood-frequency estimates can be made for ungaged sites upstream or downstream from gaging stations by using a method that transfers flood-frequency data at the gaging station to the ungaged site by using a drainage-area ratio adjustment equation. The peak discharge for a given recurrence interval at the gaging station, drainage-area ratio, and the drainage-area exponent from the regional regression equation of the respective region is used to transfer the peak discharge for the recurrence interval to the ungaged site. Maximum observed peak discharge as related to drainage area was determined for New Mexico. Extreme events are commonly used in the design and appraisal of bridge crossings and other structures. Bridge-scour evaluations are commonly made by using the 500-year peak discharge for these appraisals. Peak-discharge data collected at 293 gaging stations and 367 miscellaneous sites were used to develop a maximum peak-discharge relation as an alternative method of estimating peak discharge of an extreme event such as a maximum probable flood.
NASA Technical Reports Server (NTRS)
Rodriguez, G.
1981-01-01
A function space approach to smoothing is used to obtain a set of model error estimates inherent in a reduced-order model. By establishing knowledge of inevitable deficiencies in the truncated model, the error estimates provide a foundation for updating the model and thereby improving system performance. The function space smoothing solution leads to a specification of a method for computation of the model error estimates and development of model error analysis techniques for comparison between actual and estimated errors. The paper summarizes the model error estimation approach as well as an application arising in the area of modeling for spacecraft attitude control.
Model error estimation for distributed systems described by elliptic equations
NASA Technical Reports Server (NTRS)
Rodriguez, G.
1983-01-01
A function space approach is used to develop a theory for estimation of the errors inherent in an elliptic partial differential equation model for a distributed parameter system. By establishing knowledge of the inevitable deficiencies in the model, the error estimates provide a foundation for updating the model. The function space solution leads to a specification of a method for computation of the model error estimates and development of model error analysis techniques for comparison between actual and estimated errors. The paper summarizes the model error estimation approach as well as an application arising in the area of modeling for static shape determination of large flexible systems.
Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions
Barrett, Harrison H.; Dainty, Christopher; Lara, David
2008-01-01
Maximum-likelihood (ML) estimation in wavefront sensing requires careful attention to all noise sources and all factors that influence the sensor data. We present detailed probability density functions for the output of the image detector in a wavefront sensor, conditional not only on wavefront parameters but also on various nuisance parameters. Practical ways of dealing with nuisance parameters are described, and final expressions for likelihoods and Fisher information matrices are derived. The theory is illustrated by discussing Shack–Hartmann sensors, and computational requirements are discussed. Simulation results show that ML estimation can significantly increase the dynamic range of a Shack–Hartmann sensor with four detectors and that it can reduce the residual wavefront error when compared with traditional methods. PMID:17206255
The calculation of average error probability in a digital fibre optical communication system
NASA Astrophysics Data System (ADS)
Rugemalira, R. A. M.
1980-03-01
This paper deals with the problem of determining the average error probability in a digital fibre optical communication system, in the presence of message dependent inhomogeneous non-stationary shot noise, additive Gaussian noise and intersymbol interference. A zero-forcing equalization receiver filter is considered. Three techniques for error rate evaluation are compared. The Chernoff bound and the Gram-Charlier series expansion methods are compared to the characteristic function technique. The latter predicts a higher receiver sensitivity
An Evidential Reasoning-Based CREAM to Human Reliability Analysis in Maritime Accident Process.
Wu, Bing; Yan, Xinping; Wang, Yang; Soares, C Guedes
2017-10-01
This article proposes a modified cognitive reliability and error analysis method (CREAM) for estimating the human error probability in the maritime accident process on the basis of an evidential reasoning approach. This modified CREAM is developed to precisely quantify the linguistic variables of the common performance conditions and to overcome the problem of ignoring the uncertainty caused by incomplete information in the existing CREAM models. Moreover, this article views maritime accident development from the sequential perspective, where a scenario- and barrier-based framework is proposed to describe the maritime accident process. This evidential reasoning-based CREAM approach together with the proposed accident development framework are applied to human reliability analysis of a ship capsizing accident. It will facilitate subjective human reliability analysis in different engineering systems where uncertainty exists in practice. © 2017 Society for Risk Analysis.
NASA Technical Reports Server (NTRS)
Lukyanov, S. S.
1983-01-01
This paper is dedicated to the possible investigation of the utilization of the solar radiation pressure for the spacecraft motion control in the vicinity of collinear libration point of planar restricted ring problem of three bodies. The control is realized by changing the solar sail area at its permanent orientation. In this problem the influence of the trajectory errors and the errors of the execution control is accounted. It is worked out, the estimation method of the solar sail sizes, which are necessary for spacecraft keeping in the vicinity of collinear libration point during the certain time with given probability. The main control parameters were calculated for some examples in case of libration points of the Sun-Earth and Earth-Moon systems.
Cumulative uncertainty in measured streamflow and water quality data for small watersheds
Harmel, R.D.; Cooper, R.J.; Slade, R.M.; Haney, R.L.; Arnold, J.G.
2006-01-01
The scientific community has not established an adequate understanding of the uncertainty inherent in measured water quality data, which is introduced by four procedural categories: streamflow measurement, sample collection, sample preservation/storage, and laboratory analysis. Although previous research has produced valuable information on relative differences in procedures within these categories, little information is available that compares the procedural categories or presents the cumulative uncertainty in resulting water quality data. As a result, quality control emphasis is often misdirected, and data uncertainty is typically either ignored or accounted for with an arbitrary margin of safety. Faced with the need for scientifically defensible estimates of data uncertainty to support water resource management, the objectives of this research were to: (1) compile selected published information on uncertainty related to measured streamflow and water quality data for small watersheds, (2) use a root mean square error propagation method to compare the uncertainty introduced by each procedural category, and (3) use the error propagation method to determine the cumulative probable uncertainty in measured streamflow, sediment, and nutrient data. Best case, typical, and worst case "data quality" scenarios were examined. Averaged across all constituents, the calculated cumulative probable uncertainty (??%) contributed under typical scenarios ranged from 6% to 19% for streamflow measurement, from 4% to 48% for sample collection, from 2% to 16% for sample preservation/storage, and from 5% to 21% for laboratory analysis. Under typical conditions, errors in storm loads ranged from 8% to 104% for dissolved nutrients, from 8% to 110% for total N and P, and from 7% to 53% for TSS. Results indicated that uncertainty can increase substantially under poor measurement conditions and limited quality control effort. This research provides introductory scientific estimates of uncertainty in measured water quality data. The results and procedures presented should also assist modelers in quantifying the "quality"of calibration and evaluation data sets, determining model accuracy goals, and evaluating model performance.
Measurement error affects risk estimates for recruitment to the Hudson River stock of striped bass.
Dunning, Dennis J; Ross, Quentin E; Munch, Stephan B; Ginzburg, Lev R
2002-06-07
We examined the consequences of ignoring the distinction between measurement error and natural variability in an assessment of risk to the Hudson River stock of striped bass posed by entrainment at the Bowline Point, Indian Point, and Roseton power plants. Risk was defined as the probability that recruitment of age-1+ striped bass would decline by 80% or more, relative to the equilibrium value, at least once during the time periods examined (1, 5, 10, and 15 years). Measurement error, estimated using two abundance indices from independent beach seine surveys conducted on the Hudson River, accounted for 50% of the variability in one index and 56% of the variability in the other. If a measurement error of 50% was ignored and all of the variability in abundance was attributed to natural causes, the risk that recruitment of age-1+ striped bass would decline by 80% or more after 15 years was 0.308 at the current level of entrainment mortality (11%). However, the risk decreased almost tenfold (0.032) if a measurement error of 50% was considered. The change in risk attributable to decreasing the entrainment mortality rate from 11 to 0% was very small (0.009) and similar in magnitude to the change in risk associated with an action proposed in Amendment #5 to the Interstate Fishery Management Plan for Atlantic striped bass (0.006)--an increase in the instantaneous fishing mortality rate from 0.33 to 0.4. The proposed increase in fishing mortality was not considered an adverse environmental impact, which suggests that potentially costly efforts to reduce entrainment mortality on the Hudson River stock of striped bass are not warranted.
NASA Astrophysics Data System (ADS)
Haneberg, W. C.
2017-12-01
Remote characterization of new landslides or areas of ongoing movement using differences in high resolution digital elevation models (DEMs) created through time, for example before and after major rains or earthquakes, is an attractive proposition. In the case of large catastrophic landslides, changes may be apparent enough that simple subtraction suffices. In other cases, statistical noise can obscure landslide signatures and place practical limits on detection. In ideal cases on land, GPS surveys of representative areas at the time of DEM creation can quantify the inherent errors. In less-than-ideal terrestrial cases and virtually all submarine cases, it may be impractical or impossible to independently estimate the DEM errors. Examining DEM difference statistics for areas reasonably inferred to have no change, however, can provide insight into the limits of detectability. Data from inferred no-change areas of airborne LiDAR DEM difference maps of the 2014 Oso, Washington landslide and landslide-prone colluvium slopes along the Ohio River valley in northern Kentucky, show that DEM difference maps can have non-zero mean and slope dependent error components consistent with published studies of DEM errors. Statistical thresholds derived from DEM difference error and slope data can help to distinguish between DEM differences that are likely real—and which may indicate landsliding—from those that are likely spurious or irrelevant. This presentation describes and compares two different approaches, one based upon a heuristic assumption about the proportion of the study area likely covered by new landslides and another based upon the amount of change necessary to ensure difference at a specified level of probability.
Okawa, S; Endo, Y; Hoshi, Y; Yamada, Y
2012-01-01
A method to reduce noise for time-domain diffuse optical tomography (DOT) is proposed. Poisson noise which contaminates time-resolved photon counting data is reduced by use of maximum a posteriori estimation. The noise-free data are modeled as a Markov random process, and the measured time-resolved data are assumed as Poisson distributed random variables. The posterior probability of the occurrence of the noise-free data is formulated. By maximizing the probability, the noise-free data are estimated, and the Poisson noise is reduced as a result. The performances of the Poisson noise reduction are demonstrated in some experiments of the image reconstruction of time-domain DOT. In simulations, the proposed method reduces the relative error between the noise-free and noisy data to about one thirtieth, and the reconstructed DOT image was smoothed by the proposed noise reduction. The variance of the reconstructed absorption coefficients decreased by 22% in a phantom experiment. The quality of DOT, which can be applied to breast cancer screening etc., is improved by the proposed noise reduction.
Accounting for imperfect detection in ecology: a quantitative review.
Kellner, Kenneth F; Swihart, Robert K
2014-01-01
Detection in studies of species abundance and distribution is often imperfect. Assuming perfect detection introduces bias into estimation that can weaken inference upon which understanding and policy are based. Despite availability of numerous methods designed to address this assumption, many refereed papers in ecology fail to account for non-detection error. We conducted a quantitative literature review of 537 ecological articles to measure the degree to which studies of different taxa, at various scales, and over time have accounted for imperfect detection. Overall, just 23% of articles accounted for imperfect detection. The probability that an article incorporated imperfect detection increased with time and varied among taxa studied; studies of vertebrates were more likely to incorporate imperfect detection. Among articles that reported detection probability, 70% contained per-survey estimates of detection that were less than 0.5. For articles in which constancy of detection was tested, 86% reported significant variation. We hope that our findings prompt more ecologists to consider carefully the detection process when designing studies and analyzing results, especially for sub-disciplines where incorporation of imperfect detection in study design and analysis so far has been lacking.
Uncertainty Analysis and Parameter Estimation For Nearshore Hydrodynamic Models
NASA Astrophysics Data System (ADS)
Ardani, S.; Kaihatu, J. M.
2012-12-01
Numerical models represent deterministic approaches used for the relevant physical processes in the nearshore. Complexity of the physics of the model and uncertainty involved in the model inputs compel us to apply a stochastic approach to analyze the robustness of the model. The Bayesian inverse problem is one powerful way to estimate the important input model parameters (determined by apriori sensitivity analysis) and can be used for uncertainty analysis of the outputs. Bayesian techniques can be used to find the range of most probable parameters based on the probability of the observed data and the residual errors. In this study, the effect of input data involving lateral (Neumann) boundary conditions, bathymetry and off-shore wave conditions on nearshore numerical models are considered. Monte Carlo simulation is applied to a deterministic numerical model (the Delft3D modeling suite for coupled waves and flow) for the resulting uncertainty analysis of the outputs (wave height, flow velocity, mean sea level and etc.). Uncertainty analysis of outputs is performed by random sampling from the input probability distribution functions and running the model as required until convergence to the consistent results is achieved. The case study used in this analysis is the Duck94 experiment, which was conducted at the U.S. Army Field Research Facility at Duck, North Carolina, USA in the fall of 1994. The joint probability of model parameters relevant for the Duck94 experiments will be found using the Bayesian approach. We will further show that, by using Bayesian techniques to estimate the optimized model parameters as inputs and applying them for uncertainty analysis, we can obtain more consistent results than using the prior information for input data which means that the variation of the uncertain parameter will be decreased and the probability of the observed data will improve as well. Keywords: Monte Carlo Simulation, Delft3D, uncertainty analysis, Bayesian techniques, MCMC
Selecting a restoration technique to minimize OCR error.
Cannon, M; Fugate, M; Hush, D R; Scovel, C
2003-01-01
This paper introduces a learning problem related to the task of converting printed documents to ASCII text files. The goal of the learning procedure is to produce a function that maps documents to restoration techniques in such a way that on average the restored documents have minimum optical character recognition error. We derive a general form for the optimal function and use it to motivate the development of a nonparametric method based on nearest neighbors. We also develop a direct method of solution based on empirical error minimization for which we prove a finite sample bound on estimation error that is independent of distribution. We show that this empirical error minimization problem is an extension of the empirical optimization problem for traditional M-class classification with general loss function and prove computational hardness for this problem. We then derive a simple iterative algorithm called generalized multiclass ratchet (GMR) and prove that it produces an optimal function asymptotically (with probability 1). To obtain the GMR algorithm we introduce a new data map that extends Kesler's construction for the multiclass problem and then apply an algorithm called Ratchet to this mapped data, where Ratchet is a modification of the Pocket algorithm . Finally, we apply these methods to a collection of documents and report on the experimental results.
Effects of preparation time and trial type probability on performance of anti- and pro-saccades.
Pierce, Jordan E; McDowell, Jennifer E
2016-02-01
Cognitive control optimizes responses to relevant task conditions by balancing bottom-up stimulus processing with top-down goal pursuit. It can be investigated using the ocular motor system by contrasting basic prosaccades (look toward a stimulus) with complex antisaccades (look away from a stimulus). Furthermore, the amount of time allotted between trials, the need to switch task sets, and the time allowed to prepare for an upcoming saccade all impact performance. In this study the relative probabilities of anti- and pro-saccades were manipulated across five blocks of interleaved trials, while the inter-trial interval and trial type cue duration were varied across subjects. Results indicated that inter-trial interval had no significant effect on error rates or reaction times (RTs), while a shorter trial type cue led to more antisaccade errors and faster overall RTs. Responses following a shorter cue duration also showed a stronger effect of trial type probability, with more antisaccade errors in blocks with a low antisaccade probability and slower RTs for each saccade task when its trial type was unlikely. A longer cue duration yielded fewer errors and slower RTs, with a larger switch cost for errors compared to a short cue duration. Findings demonstrated that when the trial type cue duration was shorter, visual motor responsiveness was faster and subjects relied upon the implicit trial probability context to improve performance. When the cue duration was longer, increased fixation-related activity may have delayed saccade motor preparation and slowed responses, guiding subjects to respond in a controlled manner regardless of trial type probability. Copyright © 2016 Elsevier B.V. All rights reserved.
Bias in error estimation when using cross-validation for model selection.
Varma, Sudhir; Simon, Richard
2006-02-23
Cross-validation (CV) is an effective method for estimating the prediction error of a classifier. Some recent articles have proposed methods for optimizing classifiers by choosing classifier parameter values that minimize the CV error estimate. We have evaluated the validity of using the CV error estimate of the optimized classifier as an estimate of the true error expected on independent data. We used CV to optimize the classification parameters for two kinds of classifiers; Shrunken Centroids and Support Vector Machines (SVM). Random training datasets were created, with no difference in the distribution of the features between the two classes. Using these "null" datasets, we selected classifier parameter values that minimized the CV error estimate. 10-fold CV was used for Shrunken Centroids while Leave-One-Out-CV (LOOCV) was used for the SVM. Independent test data was created to estimate the true error. With "null" and "non null" (with differential expression between the classes) data, we also tested a nested CV procedure, where an inner CV loop is used to perform the tuning of the parameters while an outer CV is used to compute an estimate of the error. The CV error estimate for the classifier with the optimal parameters was found to be a substantially biased estimate of the true error that the classifier would incur on independent data. Even though there is no real difference between the two classes for the "null" datasets, the CV error estimate for the Shrunken Centroid with the optimal parameters was less than 30% on 18.5% of simulated training data-sets. For SVM with optimal parameters the estimated error rate was less than 30% on 38% of "null" data-sets. Performance of the optimized classifiers on the independent test set was no better than chance. The nested CV procedure reduces the bias considerably and gives an estimate of the error that is very close to that obtained on the independent testing set for both Shrunken Centroids and SVM classifiers for "null" and "non-null" data distributions. We show that using CV to compute an error estimate for a classifier that has itself been tuned using CV gives a significantly biased estimate of the true error. Proper use of CV for estimating true error of a classifier developed using a well defined algorithm requires that all steps of the algorithm, including classifier parameter tuning, be repeated in each CV loop. A nested CV procedure provides an almost unbiased estimate of the true error.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bond, J.W.
1988-01-01
Data-compression codes offer the possibility of improving the thruput of existing communication systems in the near term. This study was undertaken to determine if data-compression codes could be utilized to provide message compression in a channel with up to a 0.10-bit error rate. The data-compression capabilities of codes were investigated by estimating the average number of bits-per-character required to transmit narrative files. The performance of the codes in a channel with errors (a noisy channel) was investigated in terms of the average numbers of characters-decoded-in-error and of characters-printed-in-error-per-bit-error. Results were obtained by encoding four narrative files, which were resident onmore » an IBM-PC and use a 58-character set. The study focused on Huffman codes and suffix/prefix comma-free codes. Other data-compression codes, in particular, block codes and some simple variants of block codes, are briefly discussed to place the study results in context. Comma-free codes were found to have the most-promising data compression because error propagation due to bit errors are limited to a few characters for these codes. A technique was found to identify a suffix/prefix comma-free code giving nearly the same data compressions as a Huffman code with much less error propagation than the Huffman codes. Greater data compression can be achieved through the use of this comma-free code word assignments based on conditioned probabilities of character occurrence.« less
Karnon, Jonathan; Campbell, Fiona; Czoski-Murray, Carolyn
2009-04-01
Medication errors can lead to preventable adverse drug events (pADEs) that have significant cost and health implications. Errors often occur at care interfaces, and various interventions have been devised to reduce medication errors at the point of admission to hospital. The aim of this study is to assess the incremental costs and effects [measured as quality adjusted life years (QALYs)] of a range of such interventions for which evidence of effectiveness exists. A previously published medication errors model was adapted to describe the pathway of errors occurring at admission through to the occurrence of pADEs. The baseline model was populated using literature-based values, and then calibrated to observed outputs. Evidence of effects was derived from a systematic review of interventions aimed at preventing medication error at hospital admission. All five interventions, for which evidence of effectiveness was identified, are estimated to be extremely cost-effective when compared with the baseline scenario. Pharmacist-led reconciliation intervention has the highest expected net benefits, and a probability of being cost-effective of over 60% by a QALY value of pound10 000. The medication errors model provides reasonably strong evidence that some form of intervention to improve medicines reconciliation is a cost-effective use of NHS resources. The variation in the reported effectiveness of the few identified studies of medication error interventions illustrates the need for extreme attention to detail in the development of interventions, but also in their evaluation and may justify the primary evaluation of more than one specification of included interventions.
Lightning Reporting at 45th Weather Squadron: Recent Improvements
NASA Technical Reports Server (NTRS)
Finn, Frank C.; Roeder, William P.; Buchanan, Michael D.; McNamara, Todd M.; McAllenan, Michael; Winters, Katherine A.; Fitzpatrick, Michael E.; Huddleston, Lisa L.
2010-01-01
The 45th Weather Squadron (45 WS) provides daily lightning reports to space launch customers at CCAFS/KSC. These reports are provided to assess the need to inspect the electronics of satellite payloads, space launch vehicles, and ground support equipment for induced current damage from nearby lightning strokes. The 45 WS has made several improvements to the lightning reports during 2008-2009. The 4DLSS, implemented in April 2008, provides all lightning strokes as opposed to just one stroke per flash as done by the previous system. The 45 WS discovered that the peak current was being truncated to the nearest kilo amp in the database used to generate the daily lightning reports, which led to an up to 4% underestimate in the peak current for average lightning. This error was corrected and led to elimination of this underestimate. The 45 WS and their mission partners developed lightning location error ellipses for 99% and 95% location accuracies tailored to each individual stroke and began providing them in the spring of 2009. The new procedure provides the distance from the point of interest to the best location of the stroke (the center of the error ellipse) and the distance to the closest edge of the ellipse. This information is now included in the lightning reports, along with the peak current of the stroke. The initial method of calculating the error ellipses could only be used during normal duty hours, i.e. not during nights, weekends, or holidays. This method was improved later to provide lightning reports in near real-time, 24/7. The calculation of the distance to the closest point on the ellipse was also significantly improved later. Other improvements were also implemented. A new method to calculate the probability of any nearby lightning stroke. being within any radius of any point of interest was developed and is being implemented. This may supersede the use of location error ellipses. The 45 WS is pursuing adding data from nine NLDN sensors into 4DLSS in real-time. This will overcome the problem of 4DLSS missing some of the strong local strokes. This will also improve the location accuracy, reduce the size and eccentricity of the location error ellipses, and reduce the probability of nearby strokes being inside the areas of interest when few of the 4DLSS sensors are used in the stroke solution. This will not reduce 4DLSS performance when most of the 4DLSS sensors are used in the stroke solution. Finally, several possible future improvements were discussed, especially for improving the peak current estimate and the error estimate for peak current, and upgrading the 4DLSS. Some possible approaches for both of these goals were discussed.
Parallel computers - Estimate errors caused by imprecise data
NASA Technical Reports Server (NTRS)
Kreinovich, Vladik; Bernat, Andrew; Villa, Elsa; Mariscal, Yvonne
1991-01-01
A new approach to the problem of estimating errors caused by imprecise data is proposed in the context of software engineering. A software device is used to produce an ideal solution to the problem, when the computer is capable of computing errors of arbitrary programs. The software engineering aspect of this problem is to describe a device for computing the error estimates in software terms and then to provide precise numbers with error estimates to the user. The feasibility of the program capable of computing both some quantity and its error estimate in the range of possible measurement errors is demonstrated.
Alachiotis, Nikolaos; Vogiatzi, Emmanouella; Pavlidis, Pavlos; Stamatakis, Alexandros
2013-01-01
Automated DNA sequencers generate chromatograms that contain raw sequencing data. They also generate data that translates the chromatograms into molecular sequences of A, C, G, T, or N (undetermined) characters. Since chromatogram translation programs frequently introduce errors, a manual inspection of the generated sequence data is required. As sequence numbers and lengths increase, visual inspection and manual correction of chromatograms and corresponding sequences on a per-peak and per-nucleotide basis becomes an error-prone, time-consuming, and tedious process. Here, we introduce ChromatoGate (CG), an open-source software that accelerates and partially automates the inspection of chromatograms and the detection of sequencing errors for bidirectional sequencing runs. To provide users full control over the error correction process, a fully automated error correction algorithm has not been implemented. Initially, the program scans a given multiple sequence alignment (MSA) for potential sequencing errors, assuming that each polymorphic site in the alignment may be attributed to a sequencing error with a certain probability. The guided MSA assembly procedure in ChromatoGate detects chromatogram peaks of all characters in an alignment that lead to polymorphic sites, given a user-defined threshold. The threshold value represents the sensitivity of the sequencing error detection mechanism. After this pre-filtering, the user only needs to inspect a small number of peaks in every chromatogram to correct sequencing errors. Finally, we show that correcting sequencing errors is important, because population genetic and phylogenetic inferences can be misled by MSAs with uncorrected mis-calls. Our experiments indicate that estimates of population mutation rates can be affected two- to three-fold by uncorrected errors. PMID:24688709
Alachiotis, Nikolaos; Vogiatzi, Emmanouella; Pavlidis, Pavlos; Stamatakis, Alexandros
2013-01-01
Automated DNA sequencers generate chromatograms that contain raw sequencing data. They also generate data that translates the chromatograms into molecular sequences of A, C, G, T, or N (undetermined) characters. Since chromatogram translation programs frequently introduce errors, a manual inspection of the generated sequence data is required. As sequence numbers and lengths increase, visual inspection and manual correction of chromatograms and corresponding sequences on a per-peak and per-nucleotide basis becomes an error-prone, time-consuming, and tedious process. Here, we introduce ChromatoGate (CG), an open-source software that accelerates and partially automates the inspection of chromatograms and the detection of sequencing errors for bidirectional sequencing runs. To provide users full control over the error correction process, a fully automated error correction algorithm has not been implemented. Initially, the program scans a given multiple sequence alignment (MSA) for potential sequencing errors, assuming that each polymorphic site in the alignment may be attributed to a sequencing error with a certain probability. The guided MSA assembly procedure in ChromatoGate detects chromatogram peaks of all characters in an alignment that lead to polymorphic sites, given a user-defined threshold. The threshold value represents the sensitivity of the sequencing error detection mechanism. After this pre-filtering, the user only needs to inspect a small number of peaks in every chromatogram to correct sequencing errors. Finally, we show that correcting sequencing errors is important, because population genetic and phylogenetic inferences can be misled by MSAs with uncorrected mis-calls. Our experiments indicate that estimates of population mutation rates can be affected two- to three-fold by uncorrected errors.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lon N. Haney; David I. Gertman
2003-04-01
Beginning in the 1980s a primary focus of human reliability analysis was estimation of human error probabilities. However, detailed qualitative modeling with comprehensive representation of contextual variables often was lacking. This was likely due to the lack of comprehensive error and performance shaping factor taxonomies, and the limited data available on observed error rates and their relationship to specific contextual variables. In the mid 90s Boeing, America West Airlines, NASA Ames Research Center and INEEL partnered in a NASA sponsored Advanced Concepts grant to: assess the state of the art in human error analysis, identify future needs for human errormore » analysis, and develop an approach addressing these needs. Identified needs included the need for a method to identify and prioritize task and contextual characteristics affecting human reliability. Other needs identified included developing comprehensive taxonomies to support detailed qualitative modeling and to structure meaningful data collection efforts across domains. A result was the development of the FRamework Assessing Notorious Contributing Influences for Error (FRANCIE) with a taxonomy for airline maintenance tasks. The assignment of performance shaping factors to generic errors by experts proved to be valuable to qualitative modeling. Performance shaping factors and error types from such detailed approaches can be used to structure error reporting schemes. In a recent NASA Advanced Human Support Technology grant FRANCIE was refined, and two new taxonomies for use on space missions were developed. The development, sharing, and use of error taxonomies, and the refinement of approaches for increased fidelity of qualitative modeling is offered as a means to help direct useful data collection strategies.« less
Fast maximum likelihood estimation of mutation rates using a birth-death process.
Wu, Xiaowei; Zhu, Hongxiao
2015-02-07
Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inference about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application in real experimental data is often hindered by slow computation of likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves substantial improvement on computational speed and is applicable to arbitrarily large number of mutants. In addition, it still retains good accuracy on point estimation. Published by Elsevier Ltd.