Mollenhauer, Robert; Mouser, Joshua B.; Brewer, Shannon K.
2018-01-01
Temporal and spatial variability in streams results in heterogeneous gear capture probability (i.e., the proportion of available individuals identified) that confounds interpretation of data used to monitor fish abundance. We modeled tow-barge electrofishing capture probability at multiple spatial scales for nine Ozark Highland stream fishes. In addition to fish size, we identified seven reach-scale environmental characteristics associated with variable capture probability: stream discharge, water depth, conductivity, water clarity, emergent vegetation, wetted width–depth ratio, and proportion of riffle habitat. The magnitude of the relationship between capture probability and both discharge and depth varied among stream fishes. We also identified lithological characteristics among stream segments as a coarse-scale source of variable capture probability. The resulting capture probability model can be used to adjust catch data and derive reach-scale absolute abundance estimates across a wide range of sampling conditions with effort similar to that used in more traditional fisheries surveys (i.e., catch per unit effort). Adjusting catch data based on variable capture probability improves the comparability of data sets, thus promoting both well-informed conservation and management decisions and advances in stream-fish ecology.
Extended Importance Sampling for Reliability Analysis under Evidence Theory
NASA Astrophysics Data System (ADS)
Yuan, X. K.; Chen, B.; Zhang, B. Q.
2018-05-01
In early engineering practice, the lack of data and information makes uncertainty difficult to deal with. However, evidence theory has been proposed to handle uncertainty with limited information, as an alternative to traditional probability theory. In this contribution, a simulation-based approach called ‘Extended importance sampling’ is proposed, based on evidence theory, to handle problems with epistemic uncertainty. The proposed approach stems from traditional importance sampling for reliability analysis under probability theory and is developed to handle problems with epistemic uncertainty. It first introduces a nominal instrumental probability density function (PDF) for every epistemic uncertainty variable, so that an ‘equivalent’ reliability problem under probability theory is obtained. Then the samples of these variables are generated by importance sampling. Based on these samples, the plausibility and belief (upper and lower bounds of probability) can be estimated. The approach is more efficient than direct Monte Carlo simulation. Numerical and engineering examples are given to illustrate the efficiency and feasibility of the proposed approach.
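As a rough sketch of the idea (not the authors' algorithm), the snippet below draws samples of an epistemic variable from a nominal instrumental PDF and then scores each focal interval of a hypothetical body of evidence: an element counts toward plausibility if any of its sampled points fails, and toward belief only if all of them do. The limit state, focal intervals, and masses are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical limit state: failure when g(x) < 0.
g = lambda x: 2.5 - x

# Epistemic variable described by focal intervals with BPA masses (invented).
focal = [((0.0, 1.5), 0.3), ((1.0, 2.8), 0.5), ((2.0, 3.5), 0.2)]

# Nominal instrumental PDF covering all focal elements.
x = rng.normal(1.75, 1.0, 100_000)

bel = pl = 0.0
for (lo, hi), m in focal:
    inside = x[(x >= lo) & (x <= hi)]   # samples falling in this focal set
    if inside.size == 0:
        continue                        # no information on this element
    fails = g(inside) < 0
    pl += m * fails.any()               # any failing point -> plausibility
    bel += m * fails.all()              # every point failing -> belief
print(f"belief (lower bound) = {bel:.2f}, plausibility (upper bound) = {pl:.2f}")
```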
Calibrating SALT: a sampling scheme to improve estimates of suspended sediment yield
Robert B. Thomas
1986-01-01
Abstract - SALT (Selection At List Time) is a variable probability sampling scheme that provides unbiased estimates of suspended sediment yield and its variance. SALT performs better than standard schemes, which cannot provide valid estimates of variance. Sampling probabilities are based on a sediment rating function which promotes greater sampling intensity during periods of high...
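The unbiasedness claim rests on inverse-probability weighting, which a few lines make concrete. The sketch below (synthetic numbers, not SALT's list-time mechanics) selects periods with probability proportional to a rating-curve auxiliary and estimates total yield with a Hansen-Hurwitz style weighted mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic population: sediment yield (tons) for 1,000 sampling periods.
flux = rng.lognormal(mean=2.0, sigma=1.0, size=1000)

# Auxiliary variable from a hypothetical rating curve, roughly proportional
# to true flux; selection probabilities are proportional to it.
rating = flux * rng.lognormal(0.0, 0.2, size=flux.size)
p = rating / rating.sum()

# PPS sampling with replacement, then inverse-probability weighting:
# E[flux_i / p_i] equals the population total, so the estimate is unbiased.
idx = rng.choice(flux.size, size=50, p=p)
y_hat = np.mean(flux[idx] / p[idx])
print(f"estimated total yield: {y_hat:,.0f}   true total: {flux.sum():,.0f}")
```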
NASA Astrophysics Data System (ADS)
Daneshgaran, Fred; Mondin, Marina; Olia, Khashayar
This paper is focused on the problem of Information Reconciliation (IR) for continuous variable Quantum Key Distribution (QKD). The main problem is quantization and assignment of labels to the samples of the Gaussian variables observed at Alice and Bob. The trouble is that most of the samples, assuming that the Gaussian variable is zero mean, which is de facto the case, tend to have small magnitudes and are easily disturbed by noise. Transmission over longer and longer distances increases the losses, corresponding to a lower effective Signal-to-Noise Ratio (SNR) and exacerbating the problem. Quantization over higher dimensions is advantageous since it allows for fractional bit per sample accuracy, which may be needed at very low SNR conditions whereby the achievable secret key rate is significantly less than one bit per sample. In this paper, we propose to use Permutation Modulation (PM) for quantization of Gaussian vectors potentially containing thousands of samples. PM is applied to the magnitudes of the Gaussian samples and we explore the dependence of the sign error probability on the magnitude of the samples. At very low SNR, we may transmit the entire label of the PM code from Bob to Alice in Reverse Reconciliation (RR) over the public channel. The side information extracted from this label can then be used by Alice to characterize the sign error probability of her individual samples. Forward Error Correction (FEC) coding can be used by Bob on each subset of samples with similar sign error probability to aid Alice in error correction. This can be done for different subsets of samples with similar sign error probabilities, leading to an Unequal Error Protection (UEP) coding paradigm.
Quantum Inference on Bayesian Networks
NASA Astrophysics Data System (ADS)
Yoder, Theodore; Low, Guang Hao; Chuang, Isaac
2014-03-01
Because quantum physics is naturally probabilistic, it seems reasonable to expect physical systems to describe probabilities and their evolution in a natural fashion. Here, we use quantum computation to speed up sampling from a graphical probability model, the Bayesian network. A specialization of this sampling problem is approximate Bayesian inference, where the distribution on query variables is sampled given the values e of evidence variables. Inference is a key part of modern machine learning and artificial intelligence tasks, but is known to be NP-hard. Classically, a single unbiased sample is obtained from a Bayesian network on n variables with at most m parents per node in time O(nmP(e)^{-1}), depending critically on P(e), the probability that the evidence occurs in the first place. However, by implementing a quantum version of rejection sampling, we obtain a square-root speedup, taking O(n2^m P(e)^{-1/2}) time per sample. The speedup is the result of amplitude amplification, which is proving to be broadly applicable in sampling and machine learning tasks. In particular, we provide an explicit and efficient circuit construction that implements the algorithm without the need for oracle access.
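For contrast with the quantum algorithm, the following sketch runs classical rejection sampling on a three-node toy network (conditional probability tables invented); the tally of draws per accepted sample makes the O(1/P(e)) cost visible.

```python
import numpy as np

rng = np.random.default_rng(2)

# Tiny hypothetical network: Rain -> Sprinkler -> WetGrass (illustrative CPTs).
def forward_sample():
    rain = rng.random() < 0.2
    sprinkler = rng.random() < (0.01 if rain else 0.4)
    p_wet = 0.99 if (rain and sprinkler) else 0.9 if (rain or sprinkler) else 0.05
    wet = rng.random() < p_wet
    return rain, sprinkler, wet

# Rejection sampling for P(Rain | WetGrass = True): keep only samples
# consistent with the evidence; cost scales as 1/P(e).
accepted, rain_true, tries = 0, 0, 0
while accepted < 5000:
    tries += 1
    rain, _, wet = forward_sample()
    if wet:                              # evidence check
        accepted += 1
        rain_true += rain
print(f"P(Rain | wet) ~ {rain_true / accepted:.3f}")
print(f"draws per accepted sample ~ {tries / accepted:.1f}  (~1/P(e))")
```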
A country-wide probability sample of public attitudes toward stuttering in Portugal.
Valente, Ana Rita S; St Louis, Kenneth O; Leahy, Margaret; Hall, Andreia; Jesus, Luis M T
2017-06-01
Negative public attitudes toward stuttering have been widely reported, although differences among countries and regions exist. Clear reasons for these differences remain obscure. No published research is available on public attitudes toward stuttering in Portugal, nor on a representative sample that explores stuttering attitudes across an entire country. This study sought to (a) determine the feasibility of a country-wide probability sampling scheme to measure public stuttering attitudes in Portugal using a standard instrument (the Public Opinion Survey of Human Attributes-Stuttering [POSHA-S]) and (b) identify demographic variables that predict Portuguese attitudes. The POSHA-S was translated to European Portuguese through a five-step process. Thereafter, a local administrative office-based, three-stage, cluster, probability sampling scheme was carried out to obtain 311 adult respondents who filled out the questionnaire. The Portuguese population held stuttering attitudes that were generally within the average range of those observed from numerous previous POSHA-S samples. Demographic variables that predicted more versus less positive stuttering attitudes were respondents' age, region of the country, years of school completed, working situation, and number of languages spoken. Non-predicting variables were respondents' sex, marital status, and parental status. A local administrative office-based probability sampling scheme generated a respondent profile similar to census data and indicated that Portuguese attitudes are generally typical. Copyright © 2017 Elsevier Inc. All rights reserved.
Vidal-Martínez, Víctor M; Torres-Irineo, Edgar; Romero, David; Gold-Bouchot, Gerardo; Martínez-Meyer, Enrique; Valdés-Lozano, David; Aguirre-Macedo, M Leopoldina
2015-11-26
Understanding the environmental and anthropogenic factors influencing the probability of occurrence of marine parasitic species is fundamental for determining the circumstances under which they can act as bioindicators of environmental impact. The aim of this study was to determine whether physicochemical variables, polyaromatic hydrocarbons or sewage discharge affect the probability of occurrence of the larval cestode Oncomegas wageneri, which infects the shoal flounder, Syacium gunteri, in the southern Gulf of Mexico. The study area included 162 sampling sites in the southern Gulf of Mexico and covered 288,205 km², where benthic sediments, water and shoal flounder individuals were collected. We used boosted generalised additive models (boosted GAMs) and MaxEnt to examine the potential statistical relationships between the environmental variables (nutrients, contaminants and physicochemical variables from the water and sediments) and the probability of the occurrence of this parasite. The models were calibrated using all of the sampling sites (full area) with and without parasite occurrences (n = 162) and a polygon area that included sampling sites with a depth of 1500 m or less (n = 134). Oncomegas wageneri occurred at 29/162 sampling sites. The boosted GAM for the full area and the polygon area accurately predicted the probability of the occurrence of O. wageneri in the study area. By contrast, poor probabilities of occurrence were obtained with the MaxEnt models for the same areas. The variables with the highest frequencies of appearance in the models (proxies for the explained variability) were the polyaromatic hydrocarbons of high molecular weight (PAHH, 95 %), followed by a combination of nutrients, spatial variables and polyaromatic hydrocarbons of low molecular weight (PAHL, 5 %). The contribution of the PAHH to the variability was explained by the fact that these compounds, together with N and P, are carried by rivers that discharge into the ocean, which enhances the growth of hydrocarbonoclastic bacteria and the productivity and number of the intermediate hosts. Our results suggest that sites with PAHL/PAHH ratio values up to 1.89 promote transmission based on the high values of the prevalence of O. wageneri in the study area. In contrast, PAHL/PAHH ratio values ≥ 1.90 can be considered harmful for the transmission stages of O. wageneri and its hosts (copepods, shrimps and shoal flounders). Overall, the results indicate that the PAHHs affect the probability of occurrence of this helminth parasite in the southern Gulf of Mexico.
Peterson, James T.; Scheerer, Paul D.; Clements, Shaun
2015-01-01
Desert springs are sensitive aquatic ecosystems that pose unique challenges to natural resource managers and researchers. Among the most important of these is the need to accurately quantify population parameters for resident fish, particularly when the species are of special conservation concern. We evaluated the efficiency of baited minnow traps for estimating the abundance of two at-risk species, Foskett Speckled Dace Rhinichthys osculus ssp. and Borax Lake Chub Gila boraxobius, in desert spring systems in southeastern Oregon. We evaluated alternative sample designs using simulation and found that capture–recapture designs with four capture occasions would maximize the accuracy of estimates and minimize fish handling. We implemented the design and estimated capture and recapture probabilities using the Huggins closed-capture estimator. Trap capture probabilities averaged 23% and 26% for Foskett Speckled Dace and Borax Lake Chub, respectively, but differed substantially among sample locations, through time, and nonlinearly with fish body size. Recapture probabilities for Foskett Speckled Dace were, on average, 1.6 times greater than (first) capture probabilities, suggesting “trap-happy” behavior. Comparison of population estimates from the Huggins model with the commonly used Lincoln–Petersen estimator indicated that the latter underestimated Foskett Speckled Dace and Borax Lake Chub population size by 48% and by 20%, respectively. These biases were due to variability in capture and recapture probabilities. Simulation of fish monitoring that included the range of capture and recapture probabilities observed indicated that variability in capture and recapture probabilities in time negatively affected the ability to detect annual decreases by up to 20% in fish population size. Failure to account for variability in capture and recapture probabilities can lead to poor quality data and study inferences. Therefore, we recommend that fishery researchers and managers employ sample designs and estimators that can account for this variability.
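For reference, the Lincoln-Petersen idea reduces to a one-line formula; the sketch below uses Chapman's bias-corrected variant with invented counts. The abstract's point is that heterogeneous capture and recapture probabilities violate this estimator's assumptions, biasing it low here.

```python
# Chapman's bias-corrected Lincoln-Petersen estimator (counts are invented).
def chapman(n1, n2, m2):
    """n1 marked on occasion 1; n2 caught on occasion 2; m2 marked recaptures."""
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1

print(f"abundance estimate: {chapman(n1=120, n2=100, m2=24):.0f}")  # ~488
```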
Latin Hypercube Sampling (LHS) UNIX Library/Standalone
DOE Office of Scientific and Technical Information (OSTI.GOV)
2004-05-13
The LHS UNIX Library/Standalone software provides the capability to draw random samples from over 30 distribution types. It performs the sampling by a stratified sampling method called Latin Hypercube Sampling (LHS). Multiple distributions can be sampled simultaneously, with user-specified correlations amongst the input distributions; LHS UNIX Library/Standalone thus provides a way to generate multi-variate samples. The LHS samples can be generated either as a callable library (e.g., from within the DAKOTA software framework) or as a standalone capability. LHS UNIX Library/Standalone uses the Latin Hypercube Sampling method (LHS) to generate samples. LHS is a constrained Monte Carlo sampling scheme. In LHS, the range of each variable is divided into non-overlapping intervals on the basis of equal probability. A sample is selected at random with respect to the probability density in each interval. If multiple variables are sampled simultaneously, then the n values obtained for each variable are paired in a random manner with the n values of the other variables. In some cases, the pairing is restricted to obtain specified correlations amongst the input variables. Many simulation codes have input parameters that are uncertain and can be specified by a distribution. To perform uncertainty analysis and sensitivity analysis, random values are drawn from the input parameter distributions, and the simulation is run with these values to obtain output values. If this is done repeatedly, with many input samples drawn, one can build up a distribution of the output as well as examine correlations between input and output variables.
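A minimal LHS implementation conveys the stratification-plus-random-pairing scheme described above; this sketch (not the OSTI library's code) leans on scipy distribution objects for the inverse-CDF step.

```python
import numpy as np
from scipy import stats

def latin_hypercube(n_samples, dists, rng=None):
    """Draw LHS samples: one draw per equal-probability stratum per variable,
    with random pairing across variables."""
    rng = np.random.default_rng(rng)
    out = np.empty((n_samples, len(dists)))
    for j, dist in enumerate(dists):
        # one uniform point inside each of n equal-probability intervals
        u = (np.arange(n_samples) + rng.random(n_samples)) / n_samples
        out[:, j] = dist.ppf(rng.permutation(u))  # random pairing across columns
    return out

# Example: a normal and a uniform input, 10 samples.
samples = latin_hypercube(10, [stats.norm(0, 1), stats.uniform(0, 5)], rng=3)
print(samples)
```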
Haynes, Trevor B.; Rosenberger, Amanda E.; Lindberg, Mark S.; Whitman, Matthew; Schmutz, Joel A.
2013-01-01
Studies examining species occurrence often fail to account for false absences in field sampling. We investigate detection probabilities of five gear types for six fish species in a sample of lakes on the North Slope, Alaska. We used an occupancy modeling approach to provide estimates of detection probabilities for each method. Variation in gear- and species-specific detection probability was considerable. For example, detection probabilities for the fyke net ranged from 0.82 (SE = 0.05) for least cisco (Coregonus sardinella) to 0.04 (SE = 0.01) for slimy sculpin (Cottus cognatus). Detection probabilities were also affected by site-specific variables such as depth of the lake, year, day of sampling, and lake connection to a stream. With the exception of the dip net and shore minnow traps, each gear type provided the highest detection probability of at least one species. Results suggest that a multimethod approach may be most effective when attempting to sample the entire fish community of Arctic lakes. Detection probability estimates will be useful for designing optimal fish sampling and monitoring protocols in Arctic lakes.
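The practical upshot, that gears with complementary detection probabilities combine well, follows from a simple independence calculation; the probabilities below are illustrative, not the paper's estimates.

```python
# Probability of detecting a species at an occupied site with a mix of gears,
# assuming independent deployments (detection probabilities are illustrative).
p_gear = {"fyke net": 0.82, "gill net": 0.30, "minnow trap": 0.15}

p_miss = 1.0
for gear, p in p_gear.items():
    p_miss *= (1.0 - p)
print(f"P(detect with all three gears combined) = {1 - p_miss:.3f}")
```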
On the Effects of Signaling Reinforcer Probability and Magnitude in Delayed Matching to Sample
ERIC Educational Resources Information Center
Brown, Glenn S.; White, K. Geoffrey
2005-01-01
Two experiments examined whether postsample signals of reinforcer probability or magnitude affected the accuracy of delayed matching to sample in pigeons. On each trial, red or green choice responses that matched red or green stimuli seen shortly before a variable retention interval were reinforced with wheat access. In Experiment 1, the…
Zimmerman, Tammy M.
2006-01-01
The Lake Erie shoreline in Pennsylvania spans nearly 40 miles and is a valuable recreational resource for Erie County. Nearly 7 miles of the Lake Erie shoreline lies within Presque Isle State Park in Erie, Pa. Concentrations of Escherichia coli (E. coli) bacteria at permitted Presque Isle beaches occasionally exceed the single-sample bathing-water standard, resulting in unsafe swimming conditions and closure of the beaches. E. coli concentrations and other water-quality and environmental data collected at Presque Isle Beach 2 during the 2004 and 2005 recreational seasons were used to develop models using tobit regression analyses to predict E. coli concentrations. All variables statistically related to E. coli concentrations were included in the initial regression analyses, and after several iterations, only those explanatory variables that made the models significantly better at predicting E. coli concentrations were included in the final models. Regression models were developed using data from 2004, 2005, and the combined 2-year dataset. Variables in the 2004 model and the combined 2004-2005 model were log10 turbidity, rain weight, wave height (calculated), and wind direction. Variables in the 2005 model were log10 turbidity and wind direction. Explanatory variables not included in the final models were water temperature, streamflow, wind speed, and current speed; model results indicated these variables did not meet significance criteria at the 95-percent confidence level (probabilities were greater than 0.05). The predicted E. coli concentrations produced by the models were used to develop probabilities that concentrations would exceed the single-sample bathing-water standard for E. coli of 235 colonies per 100 milliliters. Analysis of the exceedence probabilities helped determine a threshold probability for each model, chosen such that the correct number of exceedences and nonexceedences was maximized and the number of false positives and false negatives was minimized. Future samples with computed exceedence probabilities higher than the selected threshold probability, as determined by the model, will likely exceed the E. coli standard and a beach advisory or closing may need to be issued; computed exceedence probabilities lower than the threshold probability will likely indicate the standard will not be exceeded. Additional data collected each year can be used to test and possibly improve the model. This study will aid beach managers in more rapidly determining when waters are not safe for recreational use and, subsequently, when to issue beach advisories or closings.
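The exceedance-probability step can be sketched directly: given a predicted log10 concentration and a residual standard error from the regression (both hypothetical here, and ignoring the tobit censoring details), the probability of exceeding the 235 col/100 mL standard is one normal tail area.

```python
import math
from scipy.stats import norm

# Hypothetical regression output for one day: predicted log10 E. coli
# concentration and root-mean-square error of the model residuals.
log10_pred, rmse = 1.9, 0.45
standard = 235.0    # single-sample bathing-water standard, col/100 mL

# Tail probability on the log10 scale, assuming normal residuals.
p_exceed = norm.sf(math.log10(standard), loc=log10_pred, scale=rmse)
print(f"P(exceed {standard:.0f} col/100 mL) = {p_exceed:.2f}")  # ~0.15
```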
Discrimination of Variable Schedules Is Controlled by Interresponse Times Proximal to Reinforcement
ERIC Educational Resources Information Center
Tanno, Takayuki; Silberberg, Alan; Sakagami, Takayuki
2012-01-01
In Experiment 1, food-deprived rats responded to one of two schedules that were, with equal probability, associated with a sample lever. One schedule was always variable ratio, while the other schedule, depending on the trial within a session, was: (a) a variable-interval schedule; (b) a tandem variable-interval,…
Magoulick, Daniel D.; DiStefano, Robert J.; Imhoff, Emily M.; Nolen, Matthew S.; Wagner, Brian K.
2017-01-01
Crayfish are ecologically important in freshwater systems worldwide and are imperiled in North America and globally. We sought to examine landscape- to local-scale environmental variables related to occupancy and detection probability of a suite of stream-dwelling crayfish species. We used a quantitative kickseine method to sample crayfish presence at 102 perennial stream sites with eight surveys per site. We modeled occupancy (psi) and detection probability (P) as functions of local- and landscape-scale environmental covariates. We developed a set of a priori candidate models for each species and ranked models using (Q)AICc. Detection probabilities and occupancy estimates differed among crayfish species, with Orconectes eupunctus, O. marchandi, and Cambarus hubbsi being relatively rare (psi < 0.20) with moderate (0.46–0.60) to high (0.81) detection probability and O. punctimanus and O. ozarkae being relatively common (psi > 0.60) with high detection probability (0.81). Detection probability was often related to the local habitat variables current velocity, depth, or substrate size. Important environmental variables for crayfish occupancy were species dependent but were mainly landscape variables such as stream order, geology, slope, topography, and land use. Landscape variables strongly influenced crayfish occupancy and should be considered in future studies and conservation plans.
Quantifying seining detection probability for fishes of Great Plains sand‐bed rivers
Mollenhauer, Robert; Logue, Daniel R.; Brewer, Shannon K.
2018-01-01
Species detection error (i.e., imperfect and variable detection probability) is an essential consideration when investigators map distributions and interpret habitat associations. When fish detection error that is due to highly variable instream environments needs to be addressed, sand‐bed streams of the Great Plains represent a unique challenge. We quantified seining detection probability for diminutive Great Plains fishes across a range of sampling conditions in two sand‐bed rivers in Oklahoma. Imperfect detection resulted in underestimates of species occurrence using naïve estimates, particularly for less common fishes. Seining detection probability also varied among fishes and across sampling conditions. We observed a quadratic relationship between water depth and detection probability, in which the exact nature of the relationship was species‐specific and dependent on water clarity. Similarly, the direction of the relationship between water clarity and detection probability was species‐specific and dependent on differences in water depth. The relationship between water temperature and detection probability was also species dependent, where both the magnitude and direction of the relationship varied among fishes. We showed how ignoring detection error confounded an underlying relationship between species occurrence and water depth. Despite imperfect and heterogeneous detection, our results support that determining species absence can be accomplished with two to six spatially replicated seine hauls per 200‐m reach under average sampling conditions; however, required effort would be higher under certain conditions. Detection probability was low for the Arkansas River Shiner Notropis girardi, which is federally listed as threatened, and more than 10 seine hauls per 200‐m reach would be required to assess presence across sampling conditions. Our model allows scientists to estimate sampling effort to confidently assess species occurrence, which maximizes the use of available resources. Increased implementation of approaches that consider detection error promote ecological advancements and conservation and management decisions that are better informed.
A linear programming model for protein inference problem in shotgun proteomics.
Huang, Ting; He, Zengyou
2012-11-15
Assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is an important issue in shotgun proteomics. The objective of protein inference is to find a subset of proteins that are truly present in the sample. Although many methods have been proposed for protein inference, several issues such as peptide degeneracy still remain unsolved. In this article, we present a linear programming model for protein inference. In this model, we use a transformation of the joint probability that each peptide/protein pair is present in the sample as the variable. Then, both the peptide probability and protein probability can be expressed as a formula in terms of the linear combination of these variables. Based on this simple fact, the protein inference problem is formulated as an optimization problem: minimize the number of proteins with non-zero probabilities under the constraint that the difference between the calculated peptide probability and the peptide probability generated from peptide identification algorithms should be less than some threshold. This model addresses the peptide degeneracy issue by forcing some joint probability variables involving degenerate peptides to be zero in a rigorous manner. The corresponding inference algorithm is named as ProteinLP. We test the performance of ProteinLP on six datasets. Experimental results show that our method is competitive with the state-of-the-art protein inference algorithms. The source code of our algorithm is available at: https://sourceforge.net/projects/prolp/. zyhe@dlut.edu.cn. Supplementary data are available at Bioinformatics Online.
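A toy linear program in the spirit of the model (not ProteinLP itself) can be posed with scipy: joint peptide/protein variables must reproduce the reported peptide probabilities to within a tolerance, while the protein scores that dominate them are minimized, so shared (degenerate) peptides are explained by as few proteins as possible. All numbers are invented.

```python
import numpy as np
from scipy.optimize import linprog

# Invented instance: 3 peptides, 2 candidate proteins. pairs[i] lists the
# proteins containing peptide i; q[i] is its reported probability.
pairs = [[0], [0, 1], [1]]        # peptide 1 is degenerate (shared by both)
q = np.array([0.9, 0.8, 0.1])
eps = 0.05                        # allowed mismatch to reported probabilities

xs = [(i, j) for i, js in enumerate(pairs) for j in js]  # joint variables x_ij
nx, nz = len(xs), 2
c = np.concatenate([np.zeros(nx), np.ones(nz)])          # minimize sum of z_j

A_ub, b_ub = [], []
for k, (i, j) in enumerate(xs):   # x_ij <= z_j: protein score covers its pairs
    row = np.zeros(nx + nz); row[k] = 1.0; row[nx + j] = -1.0
    A_ub.append(row); b_ub.append(0.0)
for i in range(len(pairs)):       # |sum_j x_ij - q_i| <= eps for each peptide
    row = np.zeros(nx + nz)
    for k, (ii, _) in enumerate(xs):
        if ii == i:
            row[k] = 1.0
    A_ub.append(row);  b_ub.append(q[i] + eps)
    A_ub.append(-row); b_ub.append(-(q[i] - eps))

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1))
print("protein scores:", res.x[nx:].round(3))   # sparse: protein 0 dominant
```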
Electrofishing capture probability of smallmouth bass in streams
Dauwalter, D.C.; Fisher, W.L.
2007-01-01
Abundance estimation is an integral part of understanding the ecology and advancing the management of fish populations and communities. Mark-recapture and removal methods are commonly used to estimate the abundance of stream fishes. Alternatively, abundance can be estimated by dividing the number of individuals sampled by the probability of capture. We conducted a mark-recapture study and used multiple repeated-measures logistic regression to determine the influence of fish size, sampling procedures, and stream habitat variables on the cumulative capture probability for smallmouth bass Micropterus dolomieu in two eastern Oklahoma streams. The predicted capture probability was used to adjust the number of individuals sampled to obtain abundance estimates. The observed capture probabilities were higher for larger fish and decreased with successive electrofishing passes for larger fish only. Model selection suggested that the number of electrofishing passes, fish length, and mean thalweg depth affected capture probabilities the most; there was little evidence for any effect of electrofishing power density and woody debris density on capture probability. Leave-one-out cross validation showed that the cumulative capture probability model predicts smallmouth bass abundance accurately. © Copyright by the American Fisheries Society 2007.
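The adjustment described, dividing catch by predicted cumulative capture probability, is easy to sketch; the logistic coefficients below are placeholders, not the fitted values from this study.

```python
import math

# Hypothetical fitted logistic model:
# logit(p) = b0 + b_len*length_mm + b_depth*thalweg_depth_m + b_pass*pass_no
b0, b_len, b_depth, b_pass = -2.0, 0.008, -1.2, -0.15

def capture_prob(length_mm, depth_m, n_passes):
    # cumulative probability of capture over successive electrofishing passes
    p_miss = 1.0
    for k in range(1, n_passes + 1):
        eta = b0 + b_len * length_mm + b_depth * depth_m + b_pass * k
        p_miss *= 1.0 - 1.0 / (1.0 + math.exp(-eta))
    return 1.0 - p_miss

# Abundance estimate: catch divided by cumulative capture probability.
catch = 18
p = capture_prob(length_mm=250, depth_m=0.6, n_passes=3)
print(f"p_cumulative = {p:.2f};  N_hat = {catch / p:.0f}")
```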
The use of auxiliary variables in capture-recapture and removal experiments
Pollock, K.H.; Hines, J.E.; Nichols, J.D.
1984-01-01
The dependence of animal capture probabilities on auxiliary variables is an important practical problem which has not been considered in the development of estimation procedures for capture-recapture and removal experiments. In this paper the linear logistic binary regression model is used to relate the probability of capture to continuous auxiliary variables. The auxiliary variables could be environmental quantities such as air or water temperature, or characteristics of individual animals, such as body length or weight. Maximum likelihood estimators of the population parameters are considered for a variety of models which all assume a closed population. Testing between models is also considered. The models can also be used when one auxiliary variable is a measure of the effort expended in obtaining the sample.
Large Deviations: Advanced Probability for Undergrads
ERIC Educational Resources Information Center
Rolls, David A.
2007-01-01
In the branch of probability called "large deviations," rates of convergence (e.g. of the sample mean) are considered. The theory makes use of the moment generating function. So, particularly for sums of independent and identically distributed random variables, the theory can be made accessible to senior undergraduates after a first course in…
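A worked instance of the kind the article has in mind, using the moment generating function of a fair coin, can be stated in two displays (a standard Cramér/Chernoff computation, not taken from the article):

```latex
% Sample mean of i.i.d. Bernoulli(1/2) variables: with M(t) = (1 + e^t)/2,
% the rate function is
\[
  I(a) \;=\; \sup_{t}\,\bigl(ta - \log M(t)\bigr)
        \;=\; a\log(2a) + (1-a)\log\bigl(2(1-a)\bigr), \qquad 0 < a < 1,
\]
% so, for a > 1/2, the probability of a large deviation of the mean decays as
\[
  \Pr\!\left(\tfrac{1}{n}\textstyle\sum_{i=1}^n X_i \ge a\right)
  \;\le\; e^{-nI(a)},
  \qquad\text{e.g. } I(0.75) \approx 0.1308 .
\]
```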
Probability techniques for reliability analysis of composite materials
NASA Technical Reports Server (NTRS)
Wetherhold, Robert C.; Ucci, Anthony M.
1994-01-01
Traditional design approaches for composite materials have employed deterministic criteria for failure analysis. New approaches are required to predict the reliability of composite structures since strengths and stresses may be random variables. This report will examine and compare methods used to evaluate the reliability of composite laminae. The two types of methods that will be evaluated are fast probability integration (FPI) methods and Monte Carlo methods. In these methods, reliability is formulated as the probability that an explicit function of random variables is less than a given constant. Using failure criteria developed for composite materials, a function of design variables can be generated which defines a 'failure surface' in probability space. A number of methods are available to evaluate the integration over the probability space bounded by this surface; this integration delivers the required reliability. The methods which will be evaluated are: the first order, second moment FPI methods; second order, second moment FPI methods; the simple Monte Carlo; and an advanced Monte Carlo technique which utilizes importance sampling. The methods are compared for accuracy, efficiency, and for the conservativism of the reliability estimation. The methodology involved in determining the sensitivity of the reliability estimate to the design variables (strength distributions) and importance factors is also presented.
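The contrast between simple Monte Carlo and importance sampling is quickly demonstrated on a scalar toy problem with a known answer; the limit state and instrumental density below are illustrative choices, not from the report.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 20_000

# Illustrative limit state: failure when g(x) = 4.5 - x < 0 with x ~ N(0,1),
# so the exact failure probability is Phi(-4.5), about 3.4e-6.
g = lambda x: 4.5 - x

# Simple Monte Carlo: at this sample size, usually zero failures are seen.
x = rng.normal(0.0, 1.0, n)
print("simple MC:          ", np.mean(g(x) < 0))

# Importance sampling: sample from a density centered on the failure region
# and correct with the likelihood ratio f(y)/h(y).
h = stats.norm(4.5, 1.0)
y = h.rvs(size=n, random_state=rng)
w = stats.norm.pdf(y) / h.pdf(y)
print("importance sampling:", np.mean(w * (g(y) < 0)))
print("exact:              ", stats.norm.sf(4.5))
```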
Point-Sampling and Line-Sampling Probability Theory, Geometric Implications, Synthesis
L.R. Grosenbaugh
1958-01-01
Foresters concerned with measuring tree populations on definite areas have long employed two well-known methods of representative sampling. In list or enumerative sampling the entire tree population is tallied with a known proportion being randomly selected and measured for volume or other variables. In area sampling all trees on randomly located plots or strips...
On estimating probability of presence from use-availability or presence-background data.
Phillips, Steven J; Elith, Jane
2013-06-01
A fundamental ecological modeling task is to estimate the probability that a species is present in (or uses) a site, conditional on environmental variables. For many species, available data consist of "presence" data (locations where the species [or evidence of it] has been observed), together with "background" data, a random sample of available environmental conditions. Recently published papers disagree on whether probability of presence is identifiable from such presence-background data alone. This paper aims to resolve the disagreement, demonstrating that additional information is required. We defined seven simulated species representing various simple shapes of response to environmental variables (constant, linear, convex, unimodal, S-shaped) and ran five logistic model-fitting methods using 1000 presence samples and 10 000 background samples; the simulations were repeated 100 times. The experiment revealed a stark contrast between two groups of methods: those based on a strong assumption that species' true probability of presence exactly matches a given parametric form had highly variable predictions and much larger RMS error than methods that take population prevalence (the fraction of sites in which the species is present) as an additional parameter. For six species, the former group grossly under- or overestimated probability of presence. The cause was not model structure or choice of link function, because all methods were logistic with linear and, where necessary, quadratic terms. Rather, the experiment demonstrates that an estimate of prevalence is not just helpful, but is necessary (except in special cases) for identifying probability of presence. We therefore advise against use of methods that rely on the strong assumption, due to Lele and Keim (recently advocated by Royle et al.) and Lancaster and Imbens. The methods are fragile, and their strong assumption is unlikely to be true in practice. We emphasize, however, that we are not arguing against standard statistical methods such as logistic regression, generalized linear models, and so forth, none of which requires the strong assumption. If probability of presence is required for a given application, there is no panacea for lack of data. Presence-background data must be augmented with an additional datum, e.g., species' prevalence, to reliably estimate absolute (rather than relative) probability of presence.
Cool, Geneviève; Lebel, Alexandre; Sadiq, Rehan; Rodriguez, Manuel J
2015-12-01
The regional variability of the probability of occurrence of high total trihalomethane (TTHM) levels was assessed using multilevel logistic regression models that incorporate environmental and infrastructure characteristics. The models were structured in a three-level hierarchical configuration: samples (first level), drinking water utilities (DWUs, second level) and natural regions, an ecological hierarchical division from the Quebec ecological framework of reference (third level). They considered six independent variables: precipitation, temperature, source type, seasons, treatment type and pH. The average probability of TTHM concentrations exceeding the targeted threshold was 18.1%. The probability was influenced by seasons, treatment type, precipitations and temperature. The variance at all levels was significant, showing that the probability of TTHM concentrations exceeding the threshold is most likely to be similar if located within the same DWU and within the same natural region. However, most of the variance initially attributed to natural regions was explained by treatment types and clarified by spatial aggregation on treatment types. Nevertheless, even after controlling for treatment type, there was still significant regional variability of the probability of TTHM concentrations exceeding the threshold. Regional variability was particularly important for DWUs using chlorination alone since they lack the appropriate treatment required to reduce the amount of natural organic matter (NOM) in source water prior to disinfection. Results presented herein could be of interest to authorities in identifying regions with specific needs regarding drinking water quality and for epidemiological studies identifying geographical variations in population exposure to disinfection by-products (DBPs).
Metocean design parameter estimation for fixed platform based on copula functions
NASA Astrophysics Data System (ADS)
Zhai, Jinjin; Yin, Qilin; Dong, Sheng
2017-08-01
Considering the dependent relationship among wave height, wind speed, and current velocity, we construct novel trivariate joint probability distributions via Archimedean copula functions. Thirty years of hindcast data on wave height, wind speed, and current velocity in the Bohai Sea are sampled for a case study. Four kinds of distributions, namely, Gumbel distribution, lognormal distribution, Weibull distribution, and Pearson Type III distribution, are candidate models for marginal distributions of wave height, wind speed, and current velocity. The Pearson Type III distribution is selected as the optimal model. Bivariate and trivariate probability distributions of these environmental conditions are established based on four bivariate and trivariate Archimedean copulas, namely, Clayton, Frank, Gumbel-Hougaard, and Ali-Mikhail-Haq copulas. These joint probability models can maximize marginal information and the dependence among the three variables. The design return values of these three variables can be obtained by three methods: univariate probability, conditional probability, and joint probability. The joint return periods of different load combinations are estimated by the proposed models. Platform responses (including base shear, overturning moment, and deck displacement) are further calculated. For the same return period, the design values of wave height, wind speed, and current velocity obtained by the conditional and joint probability models are much smaller than those by univariate probability. Considering the dependence among variables, the multivariate probability distributions provide design parameters close to the actual sea state for ocean platform design.
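Sampling from one of the copulas named above shows how the joint models are exercised; this sketch uses the standard conditional-inversion recipe for the Clayton copula with placeholder Pearson Type III margins (parameters invented, not the fitted Bohai Sea values).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
theta, n = 2.0, 10_000    # Clayton parameter (illustrative); Kendall tau = 0.5

# Conditional-inversion sampling of the bivariate Clayton copula.
u = rng.random(n)
w = rng.random(n)
v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)

# Map uniform margins to physical scales with inverse CDFs; the Pearson
# Type III parameters are placeholders, not fitted Bohai Sea values.
wave_height = stats.pearson3(skew=1.0, loc=2.0, scale=0.8).ppf(u)
wind_speed = stats.pearson3(skew=0.8, loc=12.0, scale=3.0).ppf(v)

tau = stats.kendalltau(wave_height, wind_speed)[0]
print(f"simulated Kendall tau = {tau:.3f}   (theory: theta/(theta+2) = 0.5)")
```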
NASA Astrophysics Data System (ADS)
Yang, P.; Ng, T. L.; Yang, W.
2015-12-01
Effective water resources management depends on the reliable estimation of the uncertainty of drought events. Confidence intervals (CIs) are commonly applied to quantify this uncertainty. A CI seeks to be of the minimal length necessary to cover the true value of the estimated variable with the desired probability. In drought analysis, where two or more variables (e.g., duration and severity) are often used to describe a drought, copulas have been found suitable for representing the joint probability behavior of these variables. However, the comprehensive assessment of the parameter uncertainties of copulas of droughts has been largely ignored, and the few studies that have recognized this issue have not explicitly compared the various methods to produce the best CIs. Thus, the objective of this study is to compare the CIs generated using two widely applied uncertainty estimation methods, bootstrapping and Markov Chain Monte Carlo (MCMC). To achieve this objective, (1) the marginal distributions lognormal, Gamma, and Generalized Extreme Value and the copula functions Clayton, Frank, and Plackett are selected to construct joint probability functions of two drought-related variables; (2) the resulting joint functions are then fitted to 200 sets of simulated realizations of drought events with known distribution and extreme parameters; and (3) from there, using bootstrapping and MCMC, CIs of the parameters are generated and compared. The effect of an informative prior on the CIs generated by MCMC is also evaluated. CIs are produced for different sample sizes (50, 100, and 200) of the simulated drought events for fitting the joint probability functions. Preliminary results assuming lognormal marginal distributions and the Clayton copula function suggest that for cases with small or medium sample sizes (~50-100), MCMC is the superior method if an informative prior exists. Where an informative prior is unavailable, for small sample sizes (~50), both bootstrapping and MCMC yield the same level of performance, and for medium sample sizes (~100), bootstrapping is better. For cases with a large sample size (~200), there is little difference between the CIs generated using bootstrapping and MCMC regardless of whether or not an informative prior exists.
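A compact version of the bootstrapping arm of the comparison (parametric percentile bootstrap on a synthetic sample, with a lognormal margin standing in for the fitted drought model):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Synthetic "observed" drought severities, standing in for real data.
data = rng.lognormal(mean=1.0, sigma=0.5, size=100)

def fit(sample):
    shape, _, scale = stats.lognorm.fit(sample, floc=0)  # sigma, -, exp(mu)
    return shape, scale

s_hat, sc_hat = fit(data)

# Parametric bootstrap: simulate from the fitted model, refit, and take
# percentile limits of the refitted shape parameter.
boot = [fit(stats.lognorm.rvs(s_hat, scale=sc_hat, size=data.size,
                              random_state=rng))[0] for _ in range(500)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"sigma = {s_hat:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```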
Multinomial mixture model with heterogeneous classification probabilities
Holland, M.D.; Gray, B.R.
2011-01-01
Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.
Improved high-dimensional prediction with Random Forests by the use of co-data.
Te Beest, Dennis E; Mes, Steven W; Wilting, Saskia M; Brakenhoff, Ruud H; van de Wiel, Mark A
2017-12-28
Prediction in high dimensional settings is difficult due to the large number of variables relative to the sample size. We demonstrate how auxiliary 'co-data' can be used to improve the performance of a Random Forest in such a setting. Co-data are incorporated in the Random Forest by replacing the uniform sampling probabilities that are used to draw candidate variables by co-data moderated sampling probabilities. Co-data here are defined as any type of information that is available on the variables of the primary data, but that does not use its response labels. These moderated sampling probabilities are learned from the data at hand, in a manner inspired by empirical Bayes. We demonstrate the co-data moderated Random Forest (CoRF) with two examples. In the first example we aim to predict the presence of a lymph node metastasis with gene expression data. We demonstrate how a set of external p-values, a gene signature, and the correlation between gene expression and DNA copy number can improve the predictive performance. In the second example we demonstrate how the prediction of cervical (pre-)cancer with methylation data can be improved by including the location of the probe relative to the known CpG islands, the number of CpG sites targeted by a probe, and a set of p-values from a related study. The proposed method is able to utilize auxiliary co-data to improve the performance of a Random Forest.
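The core mechanism, replacing uniform candidate-variable sampling with co-data moderated probabilities, amounts to a weighted draw at each split. The sketch below shows only that sampling step with a simple fixed transform of simulated p-values; CoRF itself learns the weights empirically.

```python
import numpy as np

rng = np.random.default_rng(7)

p, mtry = 1000, 32     # number of variables; candidates drawn per split

# Co-data, e.g. external p-values for each variable from a related study
# (simulated here); smaller p-value -> higher prior plausibility.
pvals = rng.random(p)

# Co-data moderated weights via a simple monotone transform (illustrative).
weights = -np.log(pvals)
probs = weights / weights.sum()

# At each split, candidate variables would be drawn like this instead of
# uniformly; variables with promising co-data enter the competition more often.
candidates = rng.choice(p, size=mtry, replace=False, p=probs)
print(candidates[:10])
```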
Cowell, Robert G
2018-05-04
Current models for single source and mixture samples, and probabilistic genotyping software based on them used for analysing STR electropherogram data, assume simple probability distributions, such as the gamma distribution, to model the allelic peak height variability given the initial amount of DNA prior to PCR amplification. Here we illustrate how amplicon number distributions, for a model of the process of sample DNA collection and PCR amplification, may be efficiently computed by evaluating probability generating functions using discrete Fourier transforms. Copyright © 2018 Elsevier B.V. All rights reserved.
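The transform trick can be demonstrated end to end with a toy generating function: evaluate the PGF at the N-th roots of unity and invert with an FFT to recover the probability mass function. The binomial PGF below is a placeholder, not the paper's amplification model.

```python
import numpy as np

# Evaluate a PGF at the N-th roots of unity (negative-exponent convention so
# the array matches numpy's forward DFT), then invert with ifft to get the pmf.
q, n, N = 0.3, 40, 64          # N must exceed the support for exact recovery
w = np.exp(-2j * np.pi * np.arange(N) / N)

G = (1.0 - q + q * w) ** n     # toy binomial PGF, G(s) = (1 - q + q s)^n
pmf = np.fft.ifft(G).real      # p_0 ... p_{N-1}

print(pmf[:5].round(6))
print("total probability:", pmf.sum().round(6))   # ~1.0, since G(1) = 1
```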
Bonawitz, Elizabeth; Denison, Stephanie; Griffiths, Thomas L; Gopnik, Alison
2014-10-01
Although probabilistic models of cognitive development have become increasingly prevalent, one challenge is to account for how children might cope with a potentially vast number of possible hypotheses. We propose that children might address this problem by 'sampling' hypotheses from a probability distribution. We discuss empirical results demonstrating signatures of sampling, which offer an explanation for the variability of children's responses. The sampling hypothesis provides an algorithmic account of how children might address computationally intractable problems and suggests a way to make sense of their 'noisy' behavior. Copyright © 2014 Elsevier Ltd. All rights reserved.
Smart, Adam S; Tingley, Reid; Weeks, Andrew R; van Rooyen, Anthony R; McCarthy, Michael A
2015-10-01
Effective management of alien species requires detecting populations in the early stages of invasion. Environmental DNA (eDNA) sampling can detect aquatic species at relatively low densities, but few studies have directly compared detection probabilities of eDNA sampling with those of traditional sampling methods. We compare the ability of a traditional sampling technique (bottle trapping) and eDNA to detect a recently established invader, the smooth newt Lissotriton vulgaris vulgaris, at seven field sites in Melbourne, Australia. Over a four-month period, per-trap detection probabilities ranged from 0.01 to 0.26 among sites where L. v. vulgaris was detected, whereas per-sample eDNA estimates were much higher (0.29-1.0). Detection probabilities of both methods varied temporally (across days and months), but temporal variation appeared to be uncorrelated between methods. Only estimates of spatial variation were strongly correlated across the two sampling techniques. Environmental variables (water depth, rainfall, ambient temperature) were not clearly correlated with detection probabilities estimated via trapping, whereas eDNA detection probabilities were negatively correlated with water depth, possibly reflecting higher eDNA concentrations at lower water levels. Our findings demonstrate that eDNA sampling can be an order of magnitude more sensitive than traditional methods, and illustrate that traditional- and eDNA-based surveys can provide independent information on species distributions when occupancy surveys are conducted over short timescales.
Survival estimates for Florida manatees from the photo-identification of individuals
Langtimm, C.A.; Beck, C.A.; Edwards, H.H.; Fick-Child, K. J.; Ackerman, B.B.; Barton, S.L.; Hartley, W.C.
2004-01-01
We estimated adult survival probabilities for the endangered Florida manatee (Trichechus manatus latirostris) in four regional populations using photo-identification data and open-population capture-recapture statistical models. The mean annual adult survival probability over the most recent 10-yr period of available estimates was as follows: Northwest - 0.956 (SE 0.007), Upper St. Johns River - 0.960 (0.011), Atlantic Coast - 0.937 (0.008), and Southwest - 0.908 (0.019). Estimates of temporal variance independent of sampling error, calculated from the survival estimates, indicated constant survival in the Upper St. Johns River, true temporal variability in the Northwest and Atlantic Coast, and large sampling variability obscuring estimates for the Southwest. Calf and subadult survival probabilities were estimated for the Upper St. Johns River from the only available data for known-aged individuals: 0.810 (95% CI 0.727-0.873) for 1st year calves, 0.915 (0.827-0.960) for 2nd year calves, and 0.969 (0.946-0.982) for manatee 3 yr or older. These estimates of survival probabilities and temporal variance, in conjunction with estimates of reproduction probabilities from photoidentification data can be used to model manatee population dynamics, estimate population growth rates, and provide an integrated measure of regional status.
Chelgren, Nathan D.; Samora, Barbara; Adams, Michael J.; McCreary, Brome
2011-01-01
High variability in abundance, cryptic coloration, and small body size of newly metamorphosed anurans have limited demographic studies of this life-history stage. We used line-transect distance sampling and Bayesian methods to estimate the abundance and spatial distribution of newly metamorphosed Western Toads (Anaxyrus boreas) in terrestrial habitat surrounding a montane lake in central Washington, USA. We completed 154 line-transect surveys from the commencement of metamorphosis (15 September 2009) to the date of first snow accumulation in fall (1 October 2009), and located 543 newly metamorphosed toads. After accounting for variable detection probability associated with the extent of barren habitats, estimates of total surface abundance ranged from a posterior median of 3,880 (95% credible intervals from 2,235 to 12,600) in the first week of sampling to 12,150 (5,543 to 51,670) during the second week of sampling. Numbers of newly metamorphosed toads dropped quickly with increasing distance from the lakeshore in a pattern that differed over the three weeks of the study and contradicted our original hypotheses. Though we hypothesized that the spatial distribution of toads would initially be concentrated near the lake shore and then spread outward from the lake over time, we observed the opposite. Ninety-five percent of individuals occurred within 20, 16, and 15 m of shore during weeks one, two, and three respectively, probably reflecting continued emergence of newly metamorphosed toads from the lake and mortality or burrow use of dispersed individuals. Numbers of toads were highest near the inlet stream of the lake. Distance sampling may provide a useful method for estimating the surface abundance of newly metamorphosed toads and relating their space use to landscape variables despite uncertain and variable probability of detection. We discuss means of improving the precision of estimates of total abundance.
He, Fu-yuan; Deng, Kai-wen; Huang, Sheng; Liu, Wen-long; Shi, Ji-lian
2013-09-01
The paper aims to elucidate and establish a new mathematical model, the total quantum statistical moment standard similarity (TQSMSS), on the basis of the original total quantum statistical moment model, and to illustrate the application of the model to medical theoretical research. The model was established by combining the statistical moment principle with the properties of the normal distribution probability density function, and was then validated and illustrated by the pharmacokinetics of three ingredients in Buyanghuanwu decoction and three data-analytical methods for them, and by analysis of the chromatographic fingerprints of various extracts obtained by dissolving the Buyanghuanwu-decoction extract in solvents of different solubility parameters. The established model consists of five main parameters: (1) the total quantum statistical moment similarity S_T, the overlapped area between the two normal distribution probability density curves obtained by conversion of the two TQSM parameters; (2) the total variability D_T, a confidence limit of the standard normal accumulation probability equal to the absolute difference between the two normal accumulation probabilities integrated at the intersection of their curves; (3) the total variable probability 1-S_T, the standard normal distribution probability within the interval D_T; (4) the total variable probability (1-beta)alpha; and (5) the stable confident probability beta(1-alpha), the correct probabilities for making positive and negative conclusions under confidence coefficient alpha. With the model, we analyzed the TQSMSS similarities of the pharmacokinetics of the three ingredients in Buyanghuanwu decoction and of the three data-analytical methods for them, which were in the range 0.3852-0.9875, illuminating their different pharmacokinetic behaviors; the TQSMSS similarities (S_T) of the chromatographic fingerprints of the various solvent extracts of the Buyanghuanwu-decoction extract were in the range 0.6842-0.9992, showing different constituents among the solvent extracts. The TQSMSS can characterize sample similarity, allowing the correct probability of positive and negative conclusions to be quantified with a power test, whether or not the samples come from the same population under confidence coefficient alpha; it thereby enables analysis at both the macroscopic and microcosmic levels and serves as an important similarity-analysis method for medical theoretical research.
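The first parameter, S_T, is just the overlapped area of two normal density curves, which is straightforward to compute numerically; the means and standard deviations below are illustrative.

```python
import numpy as np
from scipy import stats

# Overlapped area of two normal probability density curves (the S_T idea);
# the means and standard deviations here are illustrative.
def overlap_area(mu1, sd1, mu2, sd2, n=200_001):
    lo = min(mu1 - 6 * sd1, mu2 - 6 * sd2)
    hi = max(mu1 + 6 * sd1, mu2 + 6 * sd2)
    x = np.linspace(lo, hi, n)
    f = np.minimum(stats.norm.pdf(x, mu1, sd1), stats.norm.pdf(x, mu2, sd2))
    return f.sum() * (x[1] - x[0])   # area under the pointwise minimum

print(f"S_T = {overlap_area(0.0, 1.0, 0.8, 1.2):.4f}")  # 1.0 means identical
```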
Quantum probabilistic logic programming
NASA Astrophysics Data System (ADS)
Balu, Radhakrishnan
2015-05-01
We describe a quantum mechanics based logic programming language that supports Horn clauses, random variables, and covariance matrices to express and solve problems in probabilistic logic. The Horn clauses of the language wrap random variables, including infinite-valued ones, to express probability distributions and statistical correlations, a powerful feature to capture relationships between distributions that are not independent. The expressive power of the language is based on a mechanism to implement statistical ensembles and to solve the underlying SAT instances using quantum mechanical machinery. We exploit the fact that classical random variables have quantum decompositions to build the Horn clauses. We establish the semantics of the language in a rigorous fashion by considering an existing probabilistic logic language called PRISM with classical probability measures defined on the Herbrand base and extending it to the quantum context. In the classical case, H-interpretations form the sample space and probability measures defined on them lead to a consistent definition of probabilities for well formed formulae. In the quantum counterpart, we define probability amplitudes on H-interpretations, facilitating model generation and verification via quantum mechanical superpositions and entanglements. We cast the well formed formulae of the language as quantum mechanical observables, thus providing an elegant interpretation for their probabilities. We discuss several examples to combine statistical ensembles and predicates of first order logic to reason with situations involving uncertainty.
Olsen, Lisa D.; Spencer, Tracey A.
2000-01-01
The U.S. Geological Survey (USGS) collected 13 surface-water samples and 3 replicates from 5 sites in the West Branch Canal Creek area at Aberdeen Proving Ground from February through August 1999, as a part of an investigation of ground-water contamination and natural attenuation processes. The samples were analyzed for volatile organic compounds, including trichloroethylene, 1,1,2,2-tetrachloroethane, carbon tetrachloride, and chloroform, which are the four major contaminants that were detected in ground water in the Canal Creek area in earlier USGS studies. Field blanks were collected during the sampling period to assess sample bias. Field replicates were used to assess sample variability, which was expressed as relative percent difference. The mean variability of the surface-water replicate analyses was larger (35.4 percent) than the mean variability of ground-water replicate analyses (14.6 percent) determined for West Branch Canal Creek from 1995 through 1996. The higher variability in surface-water analyses is probably due to heterogeneities in the composition of the surface water rather than differences in sampling or analytical procedures. The most frequently detected volatile organic compound was 1,1,2,2-tetrachloroethane, which was detected in every sample and in two of the replicates. The surface-water contamination is likely the result of cross-media transfer of contaminants from the ground water and sediments along the West Branch Canal Creek. The full extent of surface-water contamination in West Branch Canal Creek and the locations of probable contaminant sources cannot be determined from this limited set of data. Tidal mixing, creek flow patterns, and potential effects of a drought that occurred during the sampling period also complicate the evaluation of surface-water contamination.
Multistage variable probability forest volume inventory. [the Defiance Unit of the Navajo Nation
NASA Technical Reports Server (NTRS)
Anderson, J. E. (Principal Investigator)
1979-01-01
An inventory scheme based on the use of computer-processed LANDSAT MSS data was developed. Output from the inventory scheme provides an estimate of the standing net saw timber volume of a major timber species on a selected forested area of the Navajo Nation. Such estimates are based on the values of parameters currently used for scaled sawlog conversion to mill output. The multistage variable probability sampling scheme appears capable of producing estimates which compare favorably with those produced using conventional techniques. In addition, the reduction in time, manpower, and overall costs lends it to numerous applications.
Optimal estimation for discrete time jump processes
NASA Technical Reports Server (NTRS)
Vaca, M. V.; Tretter, S. A.
1977-01-01
Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are obtained. The approach is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. A general representation for optimum estimates and recursive equations for minimum mean squared error (MMSE) estimates are obtained. MMSE estimates are nonlinear functions of the observations. The problem of estimating the rate of a DTJP when the rate is a random variable with a probability density function of the form c x^K (1-x)^m is considered, and it is shown that the MMSE estimates are linear in this case. This class of density functions explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.
Optimal estimation for discrete time jump processes
NASA Technical Reports Server (NTRS)
Vaca, M. V.; Tretter, S. A.
1978-01-01
Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are derived. The approach used is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. Thus a general representation is obtained for optimum estimates, and recursive equations are derived for minimum mean-squared error (MMSE) estimates. In general, MMSE estimates are nonlinear functions of the observations. The problem is considered of estimating the rate of a DTJP when the rate is a random variable with a beta probability density function and the jump amplitudes are binomially distributed. It is shown that the MMSE estimates are linear. The class of beta density functions is rather rich and explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.
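The linearity claim follows from a standard conjugacy computation, sketched here for the beta-prior, binomially distributed jump case described in the abstract:

```latex
% Beta prior on the rate, lambda ~ Beta(a, b) (the abstract's c x^K (1-x)^m
% with a = K + 1, b = m + 1), and k jumps observed in n trials. Conjugacy gives
\[
  \hat{\lambda}_{\mathrm{MMSE}}
    \;=\; E[\lambda \mid k]
    \;=\; \frac{a+k}{a+b+n}
    \;=\; \underbrace{\frac{a}{a+b+n}}_{\text{intercept}}
          \;+\; \underbrace{\frac{1}{a+b+n}}_{\text{slope}}\, k ,
\]
% an affine function of the observed count k, i.e., a linear MMSE estimate.
```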
Polynomial chaos representation of databases on manifolds
DOE Office of Scientific and Technical Information (OSTI.GOV)
Soize, C., E-mail: christian.soize@univ-paris-est.fr; Ghanem, R., E-mail: ghanem@usc.edu
2017-04-15
Characterizing the polynomial chaos expansion (PCE) of a vector-valued random variable with probability distribution concentrated on a manifold is a relevant problem in data-driven settings. The probability distribution of such random vectors is multimodal in general, leading to potentially very slow convergence of the PCE. In this paper, we build on a recent development for estimating and sampling from probabilities concentrated on a diffusion manifold. The proposed methodology constructs a PCE of the random vector together with an associated generator that samples from the target probability distribution which is estimated from data concentrated in the neighborhood of the manifold. The method is robust and remains efficient for high dimension and large datasets. The resulting polynomial chaos construction on manifolds permits the adaptation of many uncertainty quantification and statistical tools to emerging questions motivated by data-driven queries.
Application of Influence Diagrams in Identifying Soviet Satellite Missions
1990-12-01
Influence diagramming is a method which allows the simple construction of a model to illustrate the interrelationships which exist among variables.
Probability-based nitrate contamination map of groundwater in Kinmen.
Liu, Chen-Wuing; Wang, Yeuh-Bin; Jang, Cheng-Shin
2013-12-01
Groundwater supplies over 50% of drinking water in Kinmen. Approximately 16.8% of groundwater samples in Kinmen exceed the drinking water quality standard (DWQS) for nitrate-N (10 mg/L). Residents who drink highly nitrate-polluted groundwater face a potential health risk. To formulate an effective water quality management plan and assure safe drinking water in Kinmen, a detailed spatial distribution of nitrate-N in groundwater is a prerequisite. The aim of this study is to develop an efficient scheme for evaluating the spatial distribution of nitrate-N in residential well water using a logistic regression (LR) model. A probability-based nitrate-N contamination map of Kinmen is constructed. The LR model predicts the binary occurrence probability of groundwater nitrate-N concentrations exceeding the DWQS from simple measurement variables as independent variables, including sampling season, soil type, water table depth, pH, EC, DO, and Eh. The results reveal that three statistically significant explanatory variables (soil type, pH, and EC) are selected by the forward stepwise LR analysis. The total ratio of correct classification reaches 92.7%. The map shows the highest probability of nitrate-N contamination in the central zone, indicating that groundwater in the central zone should not be used for drinking purposes. Furthermore, a handy EC-pH-probability curve of nitrate-N exceeding the DWQS threshold was developed. This curve can be used for preliminary screening of nitrate-N contamination in Kinmen groundwater. This study recommends that the local agency implement best management practice strategies to control nonpoint nitrogen sources and carry out systematic monitoring of groundwater quality in residential wells in the high nitrate-N contamination zones.
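A hedged sketch of the modeling idea, with entirely synthetic data and invented coefficients standing in for the study's variables (soil type, pH, EC), shows how a fitted LR model yields the exceedance probability used for screening:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for the explanatory variables; coefficients are invented
n = 500
soil = rng.integers(0, 2, n)              # 0/1 indicator for a soil class
pH = rng.normal(7.0, 0.5, n)
EC = rng.lognormal(6.0, 0.4, n)           # electrical conductivity, uS/cm
logit = -3.0 + 1.2 * soil - 0.8 * (pH - 7.0) + 0.002 * (EC - 400.0)
exceed = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))  # NO3-N > 10 mg/L?

X = np.column_stack([soil, pH, EC])
model = LogisticRegression(max_iter=1000).fit(X, exceed)

# Probability-based screening of a new well
new_well = np.array([[1, 6.8, 550.0]])
print("P(exceed DWQS):", model.predict_proba(new_well)[0, 1])
```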
Using known populations of pronghorn to evaluate sampling plans and estimators
Kraft, K.M.; Johnson, D.H.; Samuelson, J.M.; Allen, S.H.
1995-01-01
Although sampling plans and estimators of abundance have good theoretical properties, their performance in real situations is rarely assessed because true population sizes are unknown. We evaluated widely used sampling plans and estimators of population size on 3 known clustered distributions of pronghorn (Antilocapra americana). Our criteria were accuracy of the estimate, coverage of 95% confidence intervals, and cost. Sampling plans were combinations of sampling intensities (16, 33, and 50%), sample selection (simple random sampling without replacement, systematic sampling, and probability proportional to size sampling with replacement), and stratification. We paired sampling plans with suitable estimators (simple, ratio, and probability proportional to size). We used area of the sampling unit as the auxiliary variable for the ratio and probability proportional to size estimators. All estimators were nearly unbiased, but precision was generally low (overall mean coefficient of variation [CV] = 29). Coverage of 95% confidence intervals was only 89% because of the highly skewed distribution of the pronghorn counts and small sample sizes, especially with stratification. Stratification combined with accurate estimates of optimal stratum sample sizes increased precision, reducing the mean CV from 33 without stratification to 25 with stratification; costs increased 23%. Precise results (mean CV = 13) but poor confidence interval coverage (83%) were obtained with simple and ratio estimators when the allocation scheme included all sampling units in the stratum containing most pronghorn. Although areas of the sampling units varied, ratio estimators and probability proportional to size sampling did not increase precision, possibly because of the clumped distribution of pronghorn. Managers should be cautious in using sampling plans and estimators to estimate abundance of aggregated populations.
ERIC Educational Resources Information Center
Gambro, John S.; Switzky, Harvey N.
The objectives of this study are to assess the current environmental knowledge base in a national probability sample of American high school students, and examine the distribution of environmental knowledge across several variables which have been found to be related to environmental knowledge in previous research (e.g. education and gender).…
NASA Astrophysics Data System (ADS)
Hunter, Evelyn M. Irving
1998-12-01
The purpose of this study was to examine the relationship and predictive power of the variables gender, high school GPA, class rank, SAT scores, ACT scores, and socioeconomic status on the graduation rates of minority college students majoring in the sciences at a selected urban university. Data were examined on these variables as they related to minority students majoring in science. The population consisted of 101 minority college students who had majored in the sciences from 1986 to 1996 at an urban university in the southwestern region of Texas. A non-probability sampling procedure, specifically an incidental sampling technique, was used in this study. A profile sheet was developed to record the information regarding the variables. The composite scores from SAT and ACT testing were used in the study. The dichotomous variables gender and socioeconomic status were dummy coded for analysis. For the gender variable, zero (0) indicated male and one (1) indicated female; likewise, zero (0) indicated high SES and one (1) indicated low SES. Two parametric procedures were used to analyze the data: multiple correlation and multiple regression. Multiple correlation is a statistical technique that indicates the relationship between one variable and a combination of two other variables. The variables socioeconomic status and GPA were found to contribute significantly to the graduation rates of minority students majoring in all sciences combined and in chemistry (Hypotheses Two and Four), accounting for 7% and 15% of the respective variance in graduation rates. For Hypotheses One and Three, the predictor variables gender, high school GPA, SAT total scores, class rank, and socioeconomic status did not contribute significantly to the graduation rates of minority students in biology and pharmacy.
High throughput nonparametric probability density estimation.
Farmer, Jenny; Jacobs, Donald
2018-01-01
In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under- and over-fitting the data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.
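The order-statistic idea underlying the scoring function can be illustrated as follows. This is not the authors' scoring function; it only sketches the standard fact that, if a trial CDF F is correct, the sorted values U(k) = F(x(k)) behave like uniform order statistics, whose k-th value follows a Beta(k, n+1-k) distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = np.sort(rng.normal(0.0, 1.0, 200))

def order_stat_z(x, cdf):
    """Scaled residuals of F(x_(k)) against uniform order statistics."""
    n = len(x)
    k = np.arange(1, n + 1)
    u = cdf(x)
    mean = k / (n + 1)                                 # E[U_(k)]
    var = k * (n + 1 - k) / ((n + 1) ** 2 * (n + 2))   # Var[U_(k)]
    return (u - mean) / np.sqrt(var)

# Correct trial CDF vs. a misspecified one
z_good = order_stat_z(x, stats.norm(0.0, 1.0).cdf)
z_bad = order_stat_z(x, stats.norm(0.5, 1.0).cdf)
print("max |z|, correct model: ", np.abs(z_good).max())
print("max |z|, misspecified:  ", np.abs(z_bad).max())
```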
Kent, Peter; Boyle, Eleanor; Keating, Jennifer L; Albert, Hanne B; Hartvigsen, Jan
2017-02-01
To quantify variability in the results of statistical analyses based on contingency tables and discuss the implications for the choice of sample size for studies that derive clinical prediction rules. An analysis of three pre-existing sets of large cohort data (n = 4,062-8,674) was performed. In each data set, repeated random sampling of various sample sizes, from n = 100 up to n = 2,000, was performed 100 times at each sample size and the variability in estimates of sensitivity, specificity, positive and negative likelihood ratios, posttest probabilities, odds ratios, and risk/prevalence ratios for each sample size was calculated. There were very wide, and statistically significant, differences in estimates derived from contingency tables from the same data set when calculated in sample sizes below 400 people, and typically, this variability stabilized in samples of 400-600 people. Although estimates of prevalence also varied significantly in samples below 600 people, that relationship only explains a small component of the variability in these statistical parameters. To reduce sample-specific variability, contingency tables should consist of 400 participants or more when used to derive clinical prediction rules or test their performance. Copyright © 2016 Elsevier Inc. All rights reserved.
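The repeated-subsampling idea can be reproduced on synthetic data. The cohort below and its prevalences are invented; the sketch only illustrates how estimates such as sensitivity stabilize as the subsample grows toward a few hundred participants:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic cohort: binary outcome and binary test result (invented prevalences)
N = 8000
outcome = rng.random(N) < 0.3
test = np.where(outcome, rng.random(N) < 0.7, rng.random(N) < 0.2)

def sensitivity(idx):
    pos = outcome[idx]
    return test[idx][pos].mean()

# Repeated random sampling at each sample size, as in the study's design
for n in (100, 400, 1000):
    est = [sensitivity(rng.choice(N, n, replace=False)) for _ in range(100)]
    print(f"n={n:5d}  sensitivity spread: {np.min(est):.2f}-{np.max(est):.2f}")
```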
Sampling design trade-offs in occupancy studies with imperfect detection: examples and software
Bailey, L.L.; Hines, J.E.; Nichols, J.D.
2007-01-01
Researchers have used occupancy, or probability of occupancy, as a response or state variable in a variety of studies (e.g., habitat modeling), and occupancy is increasingly favored by numerous state, federal, and international agencies engaged in monitoring programs. Recent advances in estimation methods have emphasized that reliable inferences can be made from these types of studies if detection and occupancy probabilities are simultaneously estimated. The need for temporal replication at sampled sites to estimate detection probability creates a trade-off between spatial replication (number of sample sites distributed within the area of interest/inference) and temporal replication (number of repeated surveys at each site). Here, we discuss a suite of questions commonly encountered during the design phase of occupancy studies, and we describe software (program GENPRES) developed to allow investigators to easily explore design trade-offs focused on particularities of their study system and sampling limitations. We illustrate the utility of program GENPRES using an amphibian example from Greater Yellowstone National Park, USA.
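The core spatial-versus-temporal trade-off can be explored analytically without the software: assuming independent surveys with per-visit detection probability p, the chance of detecting occupancy at an occupied site in J visits is p* = 1 - (1 - p)^J. A minimal sketch (the effort budget and p below are invented):

```python
# Probability of detecting an occupied site at least once in J surveys,
# assuming independent surveys with per-visit detection probability p.
def p_star(p: float, J: int) -> float:
    return 1.0 - (1.0 - p) ** J

total_effort = 120  # hypothetical budget of surveys (sites x visits per site)
for J in (2, 3, 4, 6):
    sites = total_effort // J
    print(f"J={J}: {sites:3d} sites, p* = {p_star(0.4, J):.2f}")
```

More visits per site raise p* but shrink the number of sites, which is exactly the trade-off programs like GENPRES let investigators explore for their own system.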
A spatial model of bird abundance as adjusted for detection probability
Gorresen, P.M.; Mcmillan, G.P.; Camp, R.J.; Pratt, T.K.
2009-01-01
Modeling the spatial distribution of animals can be complicated by spatial and temporal effects (i.e. spatial autocorrelation and trends in abundance over time) and other factors such as imperfect detection probabilities and observation-related nuisance variables. Recent advances in modeling have demonstrated various approaches that handle most of these factors but which require a degree of sampling effort (e.g. replication) not available to many field studies. We present a two-step approach that addresses these challenges to spatially model species abundance. Habitat, spatial and temporal variables were handled with a Bayesian approach which facilitated modeling hierarchically structured data. Predicted abundance was subsequently adjusted to account for imperfect detection and the area effectively sampled for each species. We provide examples of our modeling approach for two endemic Hawaiian nectarivorous honeycreepers: 'i'iwi Vestiaria coccinea and 'apapane Himatione sanguinea. © 2009 Ecography.
Estimation and applications of size-biased distributions in forestry
Jeffrey H. Gove
2003-01-01
Size-biased distributions arise naturally in several contexts in forestry and ecology. Simple power relationships (e.g. basal area and diameter at breast height) between variables are one such area of interest arising from a modelling perspective. Another, probability proportional to size (PPS) sampling, is found in the most widely used methods for sampling standing or...
Paterson, Marie; Green, J M; Basson, C J; Ross, F
2002-02-01
There is little information on the probability of assertive behaviour, interpersonal anxiety and self-efficacy in the literature regarding dietitians. The objective of this study was to establish baseline information of these attributes and the factors affecting them. Questionnaires collecting biographical information and self-assessment psychometric scales measuring levels of probability of assertiveness, interpersonal anxiety and self-efficacy were mailed to 350 subjects, who comprised a random sample of dietitians registered with the Health Professions Council of South Africa. Forty-one per cent (n=145) of the sample responded. Self-assessment inventory results were compared to test levels of probability of assertive behaviour, interpersonal anxiety and self-efficacy. The inventory results were compared with the biographical findings to establish statistical relationships between the variables. The hypotheses were formulated before data collection. Dietitians had acceptable levels of probability of assertive behaviour and interpersonal anxiety. The probability of assertive behaviour was significantly lower than the level noted in the literature and was negatively related to interpersonal anxiety and positively related to self-efficacy.
Rand E. Eads; Mark R. Boolootian; Steven C. Hankin [Inventors]
1987-01-01
Abstract - A programmable calculator is connected to a pumping sampler by an interface circuit board. The calculator has a sediment sampling program stored therein and includes a timer to periodically wake up the calculator. Sediment collection is controlled by a Selection At List Time (SALT) scheme in which the probability of taking a sample is proportional to its...
Accounting for randomness in measurement and sampling in studying cancer cell population dynamics.
Ghavami, Siavash; Wolkenhauer, Olaf; Lahouti, Farshad; Ullah, Mukhtar; Linnebacher, Michael
2014-10-01
Knowing the expected temporal evolution of the proportion of different cell types in sample tissues gives an indication about the progression of the disease and its possible response to drugs. Such systems have been modelled using Markov processes. We here consider an experimentally realistic scenario in which transition probabilities are estimated from noisy cell population size measurements. Using aggregated data of FACS measurements, we develop MMSE and ML estimators and formulate two problems to find the minimum number of required samples and measurements to guarantee the accuracy of predicted population sizes. Our numerical results show that the estimated transition probabilities and steady states converge to values that differ widely from the real ones if one uses the standard deterministic approach for noisy measurements. This provides support for our argument that, for the analysis of FACS data, one should consider the observed state as a random variable. The second problem we address concerns the consequences of estimating the probability of a cell being in a particular state from measurements of a small population of cells. We show how the uncertainty arising from small sample sizes can be captured by a distribution for the state probability.
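As a hedged illustration of the estimation step (not the authors' MMSE/ML machinery for noisy aggregated data), the maximum-likelihood estimate of a Markov transition matrix from a noise-free state sequence is simply the row-normalized transition counts; the toy two-state chain below is invented:

```python
import numpy as np

rng = np.random.default_rng(3)

P_true = np.array([[0.9, 0.1],    # hypothetical two-state cell-type chain
                   [0.2, 0.8]])

# Simulate a state sequence and count observed transitions
T, s = 5000, 0
counts = np.zeros((2, 2))
for _ in range(T):
    s_next = rng.choice(2, p=P_true[s])
    counts[s, s_next] += 1
    s = s_next

# Maximum-likelihood estimate: row-normalized transition counts
P_ml = counts / counts.sum(axis=1, keepdims=True)
print(P_ml)
```

With measurement noise added to the counts, this plug-in estimate drifts from P_true, which is the bias the abstract's stochastic treatment of the observed state addresses.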
Dawson-Coates, J A; Chase, J C; Funk, V; Booy, M H; Haines, L R; Falkenberg, C L; Whitaker, D J; Olafson, R W; Pearson, T W
2003-08-01
Atlantic salmon, Salmo salar L., were exposed to Kudoa thyrsites (Myxozoa, Myxosporea)-containing sea water for 15 months, and then harvested and assessed for parasite burden and fillet quality. At harvest, parasites were enumerated in muscle samples from a variety of somatic and opercular sites, and mean counts were determined for each fish. After 6 days storage at 4 °C, fillet quality was determined by visual assessment and by analysis of muscle firmness using a texture analyzer. Fillet quality could best be predicted by determining mean parasite numbers and spore counts in all eight tissue samples (somatic and opercular) or in four fillet samples, as the counts from opercular samples alone showed greater variability and thus decreased reliability. The variability in both plasmodia and spore numbers between tissue samples taken from an individual fish indicated that the parasites were not uniformly distributed in the somatic musculature. Therefore, to best predict the probable level of fillet degradation caused by K. thyrsites infections, multiple samples must be taken from each fish. If this is performed, a mean plasmodia count of 0.3 mm^-2 or a mean spore count of 4.0 × 10^5 g^-1 of tissue are the levels where the probability of severe myoliquefaction becomes a significant risk.
Optimizing liquid effluent monitoring at a large nuclear complex.
Chou, Charissa J; Barnett, D Brent; Johnson, Vernon G; Olson, Phil M
2003-12-01
Effluent monitoring typically requires a large number of analytes and samples during the initial or startup phase of a facility. Once a baseline is established, the analyte list and sampling frequency may be reduced. Although there is a large body of literature relevant to the initial design, few, if any, published papers exist on updating established effluent monitoring programs. This paper statistically evaluates four years of baseline data to optimize the liquid effluent monitoring efficiency of a centralized waste treatment and disposal facility at a large defense nuclear complex. Specific objectives were to: (1) assess temporal variability in analyte concentrations, (2) determine operational factors contributing to waste stream variability, (3) assess the probability of exceeding permit limits, and (4) streamline the sampling and analysis regime. Results indicated that the probability of exceeding permit limits was one in a million under normal facility operating conditions, sampling frequency could be reduced, and several analytes could be eliminated. Furthermore, indicators such as gross alpha and gross beta measurements could be used in lieu of more expensive specific isotopic analyses (radium, cesium-137, and strontium-90) for routine monitoring. Study results were used by the state regulatory agency to modify monitoring requirements for a new discharge permit, resulting in an annual cost savings of US$223,000. This case study demonstrates that statistical evaluation of effluent contaminant variability coupled with process knowledge can help plant managers and regulators streamline analyte lists and sampling frequencies based on detection history and environmental risk.
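The exceedance-probability calculation can be sketched as follows, assuming (as a stand-in for the paper's statistical evaluation) that baseline concentrations for an analyte are adequately described by a fitted lognormal distribution; all numbers are invented:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# Hypothetical baseline concentrations for one analyte (ug/L)
baseline = rng.lognormal(mean=1.0, sigma=0.5, size=200)
limit = 30.0  # hypothetical permit limit

# Fit a lognormal to the baseline data and compute the exceedance probability
shape, loc, scale = stats.lognorm.fit(baseline, floc=0)
p_exceed = stats.lognorm.sf(limit, shape, loc=loc, scale=scale)
print(f"P(concentration > limit) ~ {p_exceed:.2e}")
```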
Aldinger, Kyle R.; Wood, Petra B.
2015-01-01
Detection probability during point counts and its associated variables are important considerations for bird population monitoring and have implications for conservation planning by influencing population estimates. During 2008–2009, we evaluated variables hypothesized to be associated with detection probability, detection latency, and behavioral responses of male Golden-winged Warblers in pastures in the Monongahela National Forest, West Virginia, USA. This is the first study of male Golden-winged Warbler detection probability, detection latency, or behavioral response based on point-count sampling with known territory locations and identities for all males. During 3-min passive point counts, detection probability decreased as distance to a male's territory and time since sunrise increased. During 3-min point counts with playback, detection probability decreased as distance to a male's territory increased, but remained constant as time since sunrise increased. Detection probability was greater when point counts included type 2 compared with type 1 song playback, particularly during the first 2 min of type 2 song playback. Golden-winged Warblers primarily use type 1 songs (often zee bee bee bee with a higher-pitched first note) in intersexual contexts and type 2 songs (strident, rapid stutter ending with a lower-pitched buzzy note) in intrasexual contexts. Distance to a male's territory, ordinal date, and song playback type were associated with the type of behavioral response to song playback. Overall, ~2 min of type 2 song playback may increase the efficacy of point counts for monitoring populations of Golden-winged Warblers by increasing the conspicuousness of males for visual identification and offsetting the consequences of surveying later in the morning. Because playback may interfere with the ability to detect distant males, it is important to follow playback with a period of passive listening. Our results indicate that even in relatively open pasture vegetation, detection probability of male Golden-winged Warblers is imperfect and highly variable.
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
NASA Technical Reports Server (NTRS)
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well-separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described by rate constants. These problems are isomorphic with chemical kinetics problems. Recently, several efficient techniques for this purpose have been developed based on the approach originally proposed by Gillespie. Although the utility of the techniques mentioned above for Bayesian problems has not been determined, further research along these lines is warranted.
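A minimal Metropolis sampler on a bimodal density shows the mode-trapping problem that multicanonical, Wang-Landau, and parallel-tempering methods are designed to overcome; the target below is a toy mixture, not any specific physical system:

```python
import numpy as np

rng = np.random.default_rng(5)

def log_pdf(x):
    # Bimodal target: mixture of two well-separated unit-variance Gaussians
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

x, chain = 4.0, []
for _ in range(20000):
    prop = x + rng.normal(0.0, 1.0)   # small local moves
    if np.log(rng.random()) < log_pdf(prop) - log_pdf(x):
        x = prop
    chain.append(x)

chain = np.array(chain)
print("fraction of samples in the x > 0 mode:", (chain > 0).mean())
# With local proposals the chain rarely crosses between modes, producing the
# apparent non-ergodicity that tempering/multicanonical methods address.
```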
Probabilities and statistics for backscatter estimates obtained by a scatterometer
NASA Technical Reports Server (NTRS)
Pierson, Willard J., Jr.
1989-01-01
Methods for the recovery of winds near the surface of the ocean from measurements of the normalized radar backscattering cross section must recognize and make use of the statistics (i.e., the sampling variability) of the backscatter measurements. Radar backscatter values from a scatterometer are random variables with expected values given by a model. A model relates backscatter to properties of the waves on the ocean, which are in turn generated by the winds in the atmospheric marine boundary layer. The effective wind speed and direction at a known height for a neutrally stratified atmosphere are the values to be recovered from the model. The probability density function for the backscatter values is a normal probability distribution with the notable feature that the variance is a known function of the expected value. The sources of signal variability, the effects of this variability on the wind speed estimation, and criteria for the acceptance or rejection of models are discussed. A modified maximum likelihood method for estimating wind vectors is described. Ways to make corrections for the kinds of errors found for the Seasat SASS model function are described, and applications to a new scatterometer are given.
Brůžek, Jaroslav; Santos, Frédéric; Dutailly, Bruno; Murail, Pascal; Cunha, Eugenia
2017-10-01
A new tool for skeletal sex estimation based on measurements of the human os coxae is presented using skeletons from a metapopulation of identified adult individuals from twelve independent population samples. For reliable sex estimation, a posterior probability greater than 0.95 was considered to be the classification threshold: below this value, estimates are considered indeterminate. By providing free software, we aim to develop an even more disseminated method for sex estimation. Ten metric variables collected from 2,040 ossa coxa of adult subjects of known sex were recorded between 1986 and 2002 (reference sample). To test both the validity and reliability, a target sample consisting of two series of adult ossa coxa of known sex (n = 623) was used. The DSP2 software (Diagnose Sexuelle Probabiliste v2) is based on Linear Discriminant Analysis, and the posterior probabilities are calculated using an R script. For the reference sample, any combination of four dimensions provides a correct sex estimate in at least 99% of cases. The percentage of individuals for whom sex can be estimated depends on the number of dimensions; for all ten variables it is higher than 90%. Those results are confirmed in the target sample. Our posterior probability threshold of 0.95 for sex estimate corresponds to the traditional sectioning point used in osteological studies. DSP2 software is replacing the former version that should not be used anymore. DSP2 is a robust and reliable technique for sexing adult os coxae, and is also user friendly. © 2017 Wiley Periodicals, Inc.
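The decision rule can be sketched with a generic linear discriminant analysis and the stated 0.95 posterior threshold. This is a hedged illustration with invented measurements, not the DSP2 implementation:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)

# Synthetic stand-ins for pelvic measurements (mm); values are invented
n = 300
sex = rng.integers(0, 2, n)                       # 0 = male, 1 = female
X = rng.normal(100.0 + 6.0 * sex[:, None], 4.0, (n, 4))

lda = LinearDiscriminantAnalysis().fit(X, sex)
post = lda.predict_proba(rng.normal(103.0, 5.0, (5, 4)))

# Classify only when the posterior clears 0.95; otherwise indeterminate
for p_m, p_f in post:
    if max(p_m, p_f) >= 0.95:
        print("sex estimate:", "F" if p_f > p_m else "M", f"(p = {max(p_m, p_f):.3f})")
    else:
        print("indeterminate (posterior below 0.95)")
```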
Mark J. Ducey; Jeffrey H. Gove; Harry T. Valentine
2008-01-01
Perpendicular distance sampling (PDS) is a fast probability-proportional-to-size method for inventory of downed wood. However, previous development of PDS had limited the method to estimating only one variable (such as volume per hectare, or surface area per hectare) at a time. Here, we develop a general design-unbiased estimator for PDS. We then show how that...
NASA Astrophysics Data System (ADS)
Massah, Mozhdeh; Kantz, Holger
2016-04-01
As we have one and only one earth and no replicas, climate characteristics are usually computed as time averages from a single time series. For understanding climate variability, it is essential to understand how close a single time average will typically be to an ensemble average. To answer this question, we study large deviation probabilities (LDP) of stochastic processes and characterize them by their dependence on the time window. In contrast to iid variables, for which there exists an analytical expression for the rate function, correlated variables such as auto-regressive (short-memory) and auto-regressive fractionally integrated moving average (long-memory) processes have no analytical LDP. We study the LDP for these processes in order to see how correlation affects this probability in comparison to iid data. Although short-range correlations lead to a simple correction of the sample size, long-range correlations lead to a sub-exponential decay of the LDP and hence to a very slow convergence of time averages. This effect is demonstrated for a 120-year-long time series of daily temperature anomalies measured in Potsdam (Germany).
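A hedged Monte Carlo sketch of the comparison: estimate P(|time average| > epsilon) for iid noise and for an AR(1) (short-memory) process, where the invented parameters only illustrate how correlation inflates large deviation probabilities at a given window length:

```python
import numpy as np

rng = np.random.default_rng(8)

def ldp(series_gen, n, eps=0.3, reps=4000):
    """Estimate P(|mean of n samples| > eps) by Monte Carlo."""
    hits = 0
    for _ in range(reps):
        hits += abs(series_gen(n).mean()) > eps
    return hits / reps

iid = lambda n: rng.normal(0.0, 1.0, n)

def ar1(n, phi=0.8):
    x = np.empty(n)
    x[0] = rng.normal(0.0, 1.0 / np.sqrt(1.0 - phi ** 2))  # stationary start
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0.0, 1.0)
    return x

for n in (50, 200):
    print(f"n={n}: iid {ldp(iid, n):.4f}   AR(1) {ldp(ar1, n):.4f}")
```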
NASA Astrophysics Data System (ADS)
Sadegh, M.; Moftakhari, H.; AghaKouchak, A.
2017-12-01
Many natural hazards are driven by multiple forcing variables, and concurrent or consecutive extreme events significantly increase the risk of infrastructure/system failure. It is a common practice to use univariate analysis based upon a perceived ruling driver to estimate design quantiles and/or return periods of extreme events. A multivariate analysis, however, permits modeling simultaneous occurrence of multiple forcing variables. In this presentation, we introduce the Multi-hazard Assessment and Scenario Toolbox (MhAST) that comprehensively analyzes marginal and joint probability distributions of natural hazards. MhAST also offers a wide range of scenarios of return period and design levels and their likelihoods. The contribution of this study is four-fold: 1. comprehensive analysis of marginal and joint probability of multiple drivers through 17 continuous distributions and 26 copulas, 2. multiple scenario analysis of concurrent extremes based upon the most likely joint occurrence, one ruling variable, and weighted random sampling of joint occurrences with similar exceedance probabilities, 3. weighted average scenario analysis based on an expected event, and 4. uncertainty analysis of the most likely joint occurrence scenario using a Bayesian framework.
Recent progresses in outcome-dependent sampling with failure time data.
Ding, Jieli; Lu, Tsui-Shan; Cai, Jianwen; Zhou, Haibo
2017-01-01
An outcome-dependent sampling (ODS) design is a retrospective sampling scheme where one observes the primary exposure variables with a probability that depends on the observed value of the outcome variable. When the outcome of interest is failure time, the observed data are often censored. By allowing the selection of the supplemental samples to depend on whether the event of interest happens or not, and by oversampling subjects from the most informative regions, an ODS design for time-to-event data can reduce the cost of the study and improve the efficiency. We review recent progress and advances in research on ODS designs with failure time data. This includes research on ODS-related designs such as the case-cohort design, generalized case-cohort design, stratified case-cohort design, general failure-time ODS design, length-biased sampling design, and interval sampling design.
Validating long-term satellite-derived disturbance products: the case of burned areas
NASA Astrophysics Data System (ADS)
Boschetti, L.; Roy, D. P.
2015-12-01
The potential research, policy and management applications of satellite products place a high priority on providing statements about their accuracy. A number of NASA, ESA and EU funded global and continental burned area products have been developed using coarse spatial resolution satellite data, and have the potential to become part of a long-term fire Climate Data Record. These products have usually been validated by comparison with reference burned area maps derived by visual interpretation of Landsat or similar spatial resolution data selected on an ad hoc basis. More rigorously, a design-based validation method should be adopted, characterized by the selection of reference data via probability sampling that can subsequently be used to compute accuracy metrics, taking into account the sampling probability. Design-based techniques have been used for annual land cover and land cover change product validation, but have not been widely used for burned area products, or for the validation of global products that are highly variable in time and space (e.g. snow, floods or other non-permanent phenomena). This has been due to the challenge of designing an appropriate sampling strategy, and to the cost of collecting independent reference data. We propose a tri-dimensional sampling grid that allows for probability sampling of Landsat data in time and in space. To sample the globe in the spatial domain with non-overlapping sampling units, the Thiessen Scene Area (TSA) tessellation of the Landsat WRS path/rows is used. The TSA grid is then combined with the 16-day Landsat acquisition calendar to provide tri-dimensional elements (voxels). This allows the implementation of a sampling design where not only the location but also the time interval of the reference data is explicitly drawn by probability sampling. The proposed sampling design is a stratified random sampling, with two-level stratification of the voxels based on biomes and fire activity (Figure 1). The novel validation approach, used for the validation of the MODIS and forthcoming VIIRS global burned area products, is a general one, and could be used for the validation of other global products that are highly variable in space and time, as is required to assess the accuracy of climate records. The approach is demonstrated using a 1-year dataset of MODIS fire products.
NASA Astrophysics Data System (ADS)
Liu, Zhangjun; Liu, Zenghui
2018-06-01
This paper develops a hybrid approach of spectral representation and random function for simulating stationary stochastic vector processes. In the proposed approach, the high-dimensional random variables included in the original spectral representation (OSR) formula can be effectively reduced to only two elementary random variables by introducing random functions that serve as random constraints. Based on this, a satisfactory simulation accuracy can be guaranteed by selecting a small representative point set of the elementary random variables. The probability information of the stochastic excitations can be fully captured through just several hundred sample functions generated by the proposed approach. Therefore, combined with the probability density evolution method (PDEM), it is possible to implement dynamic response analysis and reliability assessment of engineering structures. For illustrative purposes, a stochastic turbulence wind velocity field acting on a frame-shear-wall structure is simulated by constructing three types of random functions to demonstrate the accuracy and efficiency of the proposed approach. Careful and in-depth studies concerning the probability density evolution analysis of the wind-induced structural response have been conducted to better illustrate the application prospects of the proposed approach. Numerical examples also show that the proposed approach possesses good robustness.
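For reference, the original spectral representation (OSR) baseline that the paper's random-function approach constrains can be sketched as X(t) = sum_k sqrt(2 S(w_k) dw) cos(w_k t + phi_k) with independent uniform phases; the toy spectrum below is invented:

```python
import numpy as np

rng = np.random.default_rng(4)

# Original spectral representation of a scalar stationary Gaussian process:
#   X(t) = sum_k sqrt(2 * S(w_k) * dw) * cos(w_k * t + phi_k), phi_k ~ U(0, 2*pi)
def simulate(S, w_max=4.0, K=256, t=np.linspace(0.0, 60.0, 2048)):
    w = (np.arange(K) + 0.5) * (w_max / K)
    dw = w_max / K
    phi = rng.uniform(0.0, 2.0 * np.pi, K)
    amp = np.sqrt(2.0 * S(w) * dw)
    return np.sum(amp[:, None] * np.cos(np.outer(w, t) + phi[:, None]), axis=0)

S = lambda w: 1.0 / (1.0 + w ** 2)  # toy one-sided power spectral density
x = simulate(S)
print(f"sample variance: {x.var():.3f}  target (integral of S): {np.arctan(4.0):.3f}")
```

The hybrid approach replaces the K independent phases phi_k with deterministic functions of two elementary random variables, which is what makes a small representative point set sufficient.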
Quantitative assessment of mineral resources with an application to petroleum geology
Harff, Jan; Davis, J.C.; Olea, R.A.
1992-01-01
The probability of occurrence of natural resources, such as petroleum deposits, can be assessed by a combination of multivariate statistical and geostatistical techniques. The area of study is partitioned into regions that are as homogeneous as possible internally while simultaneously as distinct as possible. Fisher's discriminant criterion is used to select geological variables that best distinguish productive from nonproductive localities, based on a sample of previously drilled exploratory wells. On the basis of these geological variables, each wildcat well is assigned to the production class (dry or producer in the two-class case) for which the Mahalanobis' distance from the observation to the class centroid is a minimum. Universal kriging is used to interpolate values of the Mahalanobis' distances to all locations not yet drilled. The probability that an undrilled locality belongs to the productive class can be found, using the kriging estimation variances to assess the probability of misclassification. Finally, Bayes' relationship can be used to determine the probability that an undrilled location will be a discovery, regardless of the production class in which it is placed. The method is illustrated with a study of oil prospects in the Lansing/Kansas City interval of western Kansas, using geological variables derived from well logs. © 1992 Oxford University Press.
[Socio-demographic and health factors associated with the institutionalization of dependent people].
Ayuso Gutiérrez, Mercedes; Pozo Rubio, Raúl Del; Escribano Sotos, Francisco
2010-01-01
The analysis of the effect that different variables have on the probability that dependent people are institutionalized is a topic rarely studied in Spain. The aim of this work is to analyze how certain socio-demographic and health factors influence the probability of a dependent person living in a residential care facility. A cross-sectional study was conducted using a representative sample of the dependent population in Cuenca (Spain) in February 2009. We obtained information for people with levels II and III of dependence. A binary logit regression model was estimated to identify those factors related to the institutionalization of dependent people. People aged 65-74 years are six times more likely to be institutionalized than younger people (<65 years old); this probability increases sixteen-fold for individuals aged 95 years or older. The probability of institutionalization of people who live in an urban area is three times that of people who live in a rural area. People who need pharmacological, psychotherapy or rehabilitation treatments have between two and four times higher probability of being institutionalized than those who do not need them. Age, marital status, place of residence, cardiovascular and musculoskeletal diseases, and the need for medical treatment are the principal variables associated with the institutionalization of dependent people.
Secondary outcome analysis for data from an outcome-dependent sampling design.
Pan, Yinghao; Cai, Jianwen; Longnecker, Matthew P; Zhou, Haibo
2018-04-22
An outcome-dependent sampling (ODS) scheme is a cost-effective way to conduct a study. For a study with a continuous primary outcome, an ODS scheme can be implemented where the expensive exposure is only measured on a simple random sample and supplemental samples selected from the two tails of the primary outcome variable. With the tremendous cost invested in collecting the primary exposure information, investigators often would like to use the available data to study the relationship between a secondary outcome and the obtained exposure variable. This is referred to as secondary analysis. Secondary analysis in ODS designs can be tricky, as the ODS sample is not a random sample from the general population. In this article, we use inverse probability weighted and augmented inverse probability weighted estimating equations to analyze the secondary outcome for data obtained from the ODS design. We do not make any parametric assumptions on the primary and secondary outcome and only specify the form of the regression mean models, thus allowing an arbitrary error distribution. Our approach is robust to second- and higher-order moment misspecification. It also leads to more precise estimates of the parameters by effectively using all the available participants. Through simulation studies, we show that the proposed estimator is consistent and asymptotically normal. Data from the Collaborative Perinatal Project are analyzed to illustrate our method. Copyright © 2018 John Wiley & Sons, Ltd.
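A hedged sketch of the inverse-probability-weighted idea (the augmented version is omitted): each ODS participant is weighted by the reciprocal of its known selection probability, and the secondary regression is solved by weighted least squares on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(6)

# Full cohort (the expensive exposure X is measured only on the ODS sample)
N = 20000
X = rng.normal(0.0, 1.0, N)                  # expensive exposure
Y1 = 1.0 + 0.5 * X + rng.normal(0.0, 1.0, N) # primary outcome
Y2 = 0.3 - 0.4 * X + rng.normal(0.0, 1.0, N) # secondary outcome

# ODS: simple random sample plus supplemental draws from the tails of Y1
lo, hi = np.quantile(Y1, [0.1, 0.9])
pi = np.where((Y1 < lo) | (Y1 > hi), 0.5, 0.05)  # known selection probabilities
sel = rng.random(N) < pi

# Inverse-probability-weighted least squares for the secondary model Y2 ~ X
w = 1.0 / pi[sel]
A = np.column_stack([np.ones(sel.sum()), X[sel]])
beta = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * Y2[sel]))
print("IPW estimate of (intercept, slope):", beta)  # true values: 0.3, -0.4
```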
Francy, Donna S.; Gifford, Amie M.; Darner, Robert A.
2003-01-01
Results of studies during the recreational seasons of 2000 and 2001 strengthen the science that supports monitoring of our Nation's beaches. Water and sediment samples were collected and analyzed for concentrations of Escherichia coli (E. coli). Ancillary water-quality and environmental data were collected or compiled to determine their relation to E. coli concentrations. Data were collected at three Lake Erie urban beaches (Edgewater, Villa Angela, and Huntington), two Lake Erie beaches in a less populated area (Mentor Headlands and Fairport Harbor), and one inland-lake beach (Mosquito Lake). The distribution of E. coli in water and sediments within the bathing area, outside the bathing area, and near the swash zone was investigated at the three Lake Erie urban beaches and at Mosquito Lake. (The swash zone is the zone that is alternately covered and exposed by waves.) Lake-bottom sediments from outside the bathing area were not significant deposition areas for E. coli. In contrast, interstitial water and subsurface sediments from near the swash zone were enriched with E. coli. For example, E. coli concentrations were as high as 100,000 colonies per 100 milliliters in some interstitial waters. Although there are no standards for E. coli in swash-zone materials, the high concentrations found at some locations warrant concern for public health. Studies were done at Mosquito Lake to identify sources of fecal contamination to the lake and bathing beach. Escherichia coli concentrations decreased with distance from a suspected source of fecal contamination that is north of the beach but increased at the bathing beach. This evidence indicated that elevated E. coli concentrations at the bathing beach are of local origin rather than from transport of bacteria from sites to the north. Samples collected from the three Lake Erie urban beaches and Mosquito Lake were analyzed to determine whether wastewater indicators could be used as surrogates for E. coli at bathing beaches. None of the concentrations of wastewater indicators of fecal contamination, including 3β-coprostanol and cholesterol, were significantly correlated (α = 0.05) to concentrations of E. coli. Concentrations of the two compounds that were significantly correlated to E. coli were components of coal tar and asphalt, which are not necessarily indicative of fecal contamination. Data were collected to build on an earlier 1997 study to develop and test multiple-linear-regression models to predict E. coli concentrations using water-quality and environmental variables as explanatory variables. The probability of exceeding the single-sample bathing-water standard for E. coli (235 colonies per 100 milliliters) was used as the model output variable. Threshold probabilities for each model were established. Computed probabilities that are less than a threshold probability indicate that bacterial water quality is most likely acceptable. Computed probabilities equal to or above the threshold probability indicate that the water quality is most likely not acceptable and that a water-quality advisory may be needed. Models were developed at each beach, whenever possible, using combinations of 1997, 2000, and (or) 2001 data. The models developed and tested in this study were shown to be beach specific; that is, different explanatory variables were used to predict the probability of exceeding the standard at each beach. At Mentor Headlands and Fairport Harbor, models were not developed because water quality was generally good.
At the three Lake Erie urban beaches, models were developed with variable lists that included the number of birds on the beach at the time of sampling, lake-current direction, wave height, turbidity, streamflow of a nearby river, and rainfall. The models for Huntington explained a larger percentage of the variability in E. coli concentrations than the models for Edgewater and Villa Angela. At Mosquito Lake, a model based on 2000 and 2001 data contained the
Computational methods for efficient structural reliability and reliability sensitivity analysis
NASA Technical Reports Server (NTRS)
Wu, Y.-T.
1993-01-01
This paper presents recent developments in efficient structural reliability analysis methods. The paper proposes an efficient, adaptive importance sampling (AIS) method that can be used to compute reliability and reliability sensitivities. The AIS approach uses a sampling density that is proportional to the joint PDF of the random variables. Starting from an initial approximate failure domain, sampling proceeds adaptively and incrementally with the goal of reaching a sampling domain that is slightly greater than the failure domain to minimize over-sampling in the safe region. Several reliability sensitivity coefficients are proposed that can be computed directly and easily from the above AIS-based failure points. These probability sensitivities can be used for identifying key random variables and for adjusting design to achieve reliability-based objectives. The proposed AIS methodology is demonstrated using a turbine blade reliability analysis problem.
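A minimal fixed-density importance sampling sketch conveys the underlying estimator (the adaptive, incremental part of AIS is omitted); the limit state below is a toy example with a known answer:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

# Failure is defined by the limit state g(x) <= 0; standard normal inputs
g = lambda x: 4.0 - x[:, 0] - x[:, 1]

n = 20000
# Instrumental density centered near the design point (2, 2) of this toy g
h = stats.multivariate_normal(mean=[2.0, 2.0], cov=np.eye(2))
f = stats.multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2))

x = h.rvs(n, random_state=rng)
weights = np.exp(f.logpdf(x) - h.logpdf(x))   # likelihood ratio f/h
pf = np.mean((g(x) <= 0) * weights)
print(f"IS estimate: {pf:.2e}  (exact: {stats.norm.sf(4.0 / np.sqrt(2.0)):.2e})")
```

Direct Monte Carlo would need millions of samples to resolve a failure probability this small; concentrating samples near the failure domain is the efficiency gain the paper builds on.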
Visualizing Time-Varying Distribution Data in EOS Application
NASA Technical Reports Server (NTRS)
Shen, Han-Wei
2004-01-01
In this research, we have developed several novel visualization methods for spatial probability density function data. Our focus has been on 2D spatial datasets, where each pixel is a random variable, and has multiple samples which are the results of experiments on that random variable. We developed novel clustering algorithms as a means to reduce the information contained in these datasets; and investigated different ways of interpreting and clustering the data.
Multinomial logistic regression in workers' health
NASA Astrophysics Data System (ADS)
Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana
2017-11-01
In European countries, namely in Portugal, it is common to hear some people mention that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the Services sector. A representative sample was collected from a Portuguese Services organization by applying an internationally validated survey, whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.
Population variability complicates the accurate detection of climate change responses.
McCain, Christy; Szewczyk, Tim; Bracy Knight, Kevin
2016-06-01
The rush to assess species' responses to anthropogenic climate change (CC) has underestimated the importance of interannual population variability (PV). Researchers assume sampling rigor alone will lead to an accurate detection of response regardless of the underlying population fluctuations of the species under consideration. Using population simulations across a realistic, empirically based gradient in PV, we show that moderate to high PV can lead to opposite and biased conclusions about CC responses. Between pre- and post-CC sampling bouts of modeled populations as in resurvey studies, there is: (i) A 50% probability of erroneously detecting the opposite trend in population abundance change and nearly zero probability of detecting no change. (ii) Across multiple years of sampling, it is nearly impossible to accurately detect any directional shift in population sizes with even moderate PV. (iii) There is up to 50% probability of detecting a population extirpation when the species is present, but in very low natural abundances. (iv) Under scenarios of moderate to high PV across a species' range or at the range edges, there is a bias toward erroneous detection of range shifts or contractions. Essentially, the frequency and magnitude of population peaks and troughs greatly impact the accuracy of our CC response measurements. Species with moderate to high PV (many small vertebrates, invertebrates, and annual plants) may be inaccurate 'canaries in the coal mine' for CC without pertinent demographic analyses and additional repeat sampling. Variation in PV may explain some idiosyncrasies in CC responses detected so far and urgently needs more careful consideration in design and analysis of CC responses. © 2016 John Wiley & Sons Ltd.
The utility of Bayesian predictive probabilities for interim monitoring of clinical trials
Connor, Jason T.; Ayers, Gregory D; Alvarez, JoAnn
2014-01-01
Background: Bayesian predictive probabilities can be used for interim monitoring of clinical trials to estimate the probability of observing a statistically significant treatment effect if the trial were to continue to its predefined maximum sample size. Purpose: We explore settings in which Bayesian predictive probabilities are advantageous for interim monitoring compared to Bayesian posterior probabilities, p-values, conditional power, or group sequential methods. Results: For interim analyses that address prediction hypotheses, such as futility monitoring and efficacy monitoring with lagged outcomes, only predictive probabilities properly account for the amount of data remaining to be observed in a clinical trial and have the flexibility to incorporate additional information via auxiliary variables. Limitations: Computational burdens limit the feasibility of predictive probabilities in many clinical trial settings. The specification of prior distributions brings additional challenges for regulatory approval. Conclusions: The use of Bayesian predictive probabilities enables the choice of logical interim stopping rules that closely align with the clinical decision making process. PMID:24872363
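For a single-arm binary-outcome trial, the predictive probability has a closed beta-binomial form. A hedged sketch, with an invented interim data set and an invented success criterion:

```python
import numpy as np
from scipy import stats

# Single-arm trial: n patients observed, s successes; m patients remain.
# Invented final success criterion: posterior P(p > 0.5) > 0.975, flat prior.
n, s, m = 40, 26, 20

def trial_succeeds(total_s, total_n):
    return stats.beta(total_s + 1, total_n - total_s + 1).sf(0.5) > 0.975

# Predictive distribution of future successes is beta-binomial; enumerate it.
pred = stats.betabinom(m, s + 1, n - s + 1)
pp = sum(pred.pmf(k) for k in range(m + 1) if trial_succeeds(s + k, n + m))
print(f"predictive probability of final trial success: {pp:.3f}")
```

Unlike conditional power at a fixed alternative, this calculation averages over the posterior uncertainty in the response rate, which is the property the abstract highlights.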
REGIONAL LAKE TROPHIC PATTERNS IN THE NORTHEASTERN UNITED STATES: THREE APPROACHES
During the summers of 1991-1994, the Environmental Monitoring and Assessment Program (EMAP) conducted variable probability sampling on 344 lakes throughout the northeastern United States. Trophic state data were analyzed for the Northeast as a whole and for each of its three major...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chatterjee, Samrat; Tipireddy, Ramakrishna; Oster, Matthew R.
Securing cyber-systems on a continual basis against a multitude of adverse events is a challenging undertaking. Game-theoretic approaches, which model actions of strategic decision-makers, are increasingly being applied to address cybersecurity resource allocation challenges. Such game-based models account for multiple player actions and represent cyber attacker payoffs mostly as point utility estimates. Since a cyber-attacker's payoff generation mechanism is largely unknown, appropriate representation and propagation of uncertainty is a critical task. In this paper we expand on prior work and focus on operationalizing the probabilistic uncertainty quantification framework, for a notional cyber system, through: 1) representation of uncertain attacker and system-related modeling variables as probability distributions and mathematical intervals, and 2) exploration of uncertainty propagation techniques including two-phase Monte Carlo sampling and probability bounds analysis.
Viana, Duarte S; Santamaría, Luis; Figuerola, Jordi
2016-02-01
Propagule retention time is a key factor in determining propagule dispersal distance and the shape of "seed shadows". Propagules dispersed by animal vectors are either ingested and retained in the gut until defecation or attached externally to the body until detachment. Retention time is a continuous variable, but it is commonly measured at discrete time points, according to pre-established sampling time-intervals. Although parametric continuous distributions have been widely fitted to these interval-censored data, the performance of different fitting methods has not been evaluated. To investigate the performance of five different fitting methods, we fitted parametric probability distributions to typical discretized retention-time data with known distribution using as data-points either the lower, mid or upper bounds of sampling intervals, as well as the cumulative distribution of observed values (using either maximum likelihood or non-linear least squares for parameter estimation); then compared the estimated and original distributions to assess the accuracy of each method. We also assessed the robustness of these methods to variations in the sampling procedure (sample size and length of sampling time-intervals). Fittings to the cumulative distribution performed better for all types of parametric distributions (lognormal, gamma and Weibull distributions) and were more robust to variations in sample size and sampling time-intervals. These estimated distributions had negligible deviations of up to 0.045 in cumulative probability of retention times (according to the Kolmogorov-Smirnov statistic) in relation to original distributions from which propagule retention time was simulated, supporting the overall accuracy of this fitting method. In contrast, fitting the sampling-interval bounds resulted in greater deviations that ranged from 0.058 to 0.273 in cumulative probability of retention times, which may introduce considerable biases in parameter estimates. We recommend the use of cumulative probability to fit parametric probability distributions to propagule retention time, specifically using maximum likelihood for parameter estimation. Furthermore, the experimental design for an optimal characterization of unimodal propagule retention time should contemplate at least 500 recovered propagules and sampling time-intervals not larger than the time peak of propagule retrieval, except in the tail of the distribution where broader sampling time-intervals may also produce accurate fits.
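One concrete reading of fitting the cumulative distribution by maximum likelihood is the standard interval-censored likelihood, in which each sampling interval contributes its fitted probability mass raised to the observed count. A hedged sketch with invented counts (sized around the recommended 500 recovered propagules), assuming all propagules are recovered by the final sampling time:

```python
import numpy as np
from scipy import optimize, stats

# Propagules retrieved in each sampling interval (hours); counts are invented
edges = np.array([0.0, 2.0, 4.0, 8.0, 12.0, 24.0, 48.0])
counts = np.array([55.0, 160.0, 170.0, 70.0, 35.0, 10.0])

def neg_log_lik(theta):
    mu, log_sigma = theta
    # Probability mass the fitted lognormal assigns to each interval
    F = stats.lognorm.cdf(edges, s=np.exp(log_sigma), scale=np.exp(mu))
    probs = np.clip(np.diff(F), 1e-12, None)
    return -np.sum(counts * np.log(probs))

res = optimize.minimize(neg_log_lik, x0=[1.5, np.log(0.8)], method="Nelder-Mead")
mu, sigma = res.x[0], np.exp(res.x[1])
print(f"fitted lognormal retention time: mu = {mu:.2f}, sigma = {sigma:.2f}")
```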
Improvements in sub-grid, microphysics averages using quadrature based approaches
NASA Astrophysics Data System (ADS)
Chowdhary, K.; Debusschere, B.; Larson, V. E.
2013-12-01
Sub-grid variability in microphysical processes plays a critical role in atmospheric climate models. In order to account for this sub-grid variability, Larson and Schanen (2013) propose placing a probability density function on the sub-grid cloud microphysics quantities, e.g. autoconversion rate, essentially interpreting the cloud microphysics quantities as a random variable in each grid box. Random sampling techniques, e.g. Monte Carlo and Latin Hypercube, can be used to calculate statistics, e.g. averages, on the microphysics quantities, which then feed back into the model dynamics on the coarse scale. We propose an alternate approach using numerical quadrature methods based on deterministic sampling points to compute the statistical moments of microphysics quantities in each grid box. We have performed a preliminary test on the Kessler autoconversion formula, and, upon comparison with Latin Hypercube sampling, our approach shows an increased level of accuracy with a reduction in sample size by almost two orders of magnitude. Application to other microphysics processes is the subject of ongoing research.
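A hedged sketch of the quadrature idea: the grid-box average of a Kessler-type autoconversion rate under an assumed lognormal sub-grid distribution of cloud water is computed with Gauss-Hermite quadrature and checked against plain Monte Carlo. The constants and the lognormal assumption are illustrative, not taken from the paper.

```python
import numpy as np

# Sub-grid cloud water q_c assumed lognormal within the grid box (assumption).
mu, sigma = np.log(1e-3), 0.5
k_auto, q_crit = 1e-3, 5e-4            # illustrative Kessler-type constants

def kessler(qc):
    """Simple threshold autoconversion rate."""
    return k_auto * np.maximum(qc - q_crit, 0.0)

# Gauss-Hermite quadrature for E[f(exp(mu + sigma*x))] with x ~ N(0,1).
# The kink at q_crit limits accuracy, so a moderate number of nodes is used.
nodes, weights = np.polynomial.hermite_e.hermegauss(16)
quad_mean = np.sum(weights * kessler(np.exp(mu + sigma * nodes))) / np.sqrt(2 * np.pi)

# Monte Carlo reference with many samples.
rng = np.random.default_rng(2)
mc_mean = kessler(rng.lognormal(mu, sigma, size=1_000_000)).mean()
print(f"quadrature: {quad_mean:.3e}   Monte Carlo: {mc_mean:.3e}")
```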
Mali, Ivana; Duarte, Adam; Forstner, Michael R J
2018-01-01
Abundance estimates play an important part in the regulatory and conservation decision-making process. It is important to correct monitoring data for imperfect detection when using these data to track spatial and temporal variation in abundance, especially in the case of rare and elusive species. This paper presents the first attempt to estimate abundance of the Rio Grande cooter ( Pseudemys gorzugi ) while explicitly considering the detection process. Specifically, in 2016 we monitored this rare species at two sites along the Black River, New Mexico via traditional baited hoop-net traps and less invasive visual surveys to evaluate the efficacy of these two sampling designs. We fitted the Huggins closed-capture estimator to estimate capture probabilities using the trap data and distance sampling models to estimate detection probabilities using the visual survey data. We found that only the visual survey with the highest number of observed turtles resulted in similar abundance estimates to those estimated using the trap data. However, the estimates of abundance from the remaining visual survey data were highly variable and often underestimated abundance relative to the estimates from the trap data. We suspect this pattern is related to changes in the basking behavior of the species and, thus, the availability of turtles to be detected even though all visual surveys were conducted when environmental conditions were similar. Regardless, we found that riverine habitat conditions limited our ability to properly conduct visual surveys at one site. Collectively, this suggests visual surveys may not be an effective sample design for this species in this river system. When analyzing the trap data, we found capture probabilities to be highly variable across sites and between age classes and that recapture probabilities were much lower than initial capture probabilities, highlighting the importance of accounting for detectability when monitoring this species. Although baited hoop-net traps seem to be an effective sampling design, it is important to note that this method required a relatively high trap effort to reliably estimate abundance. This information will be useful when developing a larger-scale, long-term monitoring program for this species of concern.
Perry, Russell W.; Kirsch, Joseph E.; Hendrix, A. Noble
2016-06-17
Resource managers rely on abundance or density metrics derived from beach seine surveys to make vital decisions that affect fish population dynamics and assemblage structure. However, abundance and density metrics may be biased by imperfect capture and lack of geographic closure during sampling. Currently, there is considerable uncertainty about the capture efficiency of juvenile Chinook salmon (Oncorhynchus tshawytscha) by beach seines. Heterogeneity in capture can occur through unrealistic assumptions of closure and from variation in the probability of capture caused by environmental conditions. We evaluated the assumptions of closure and the influence of environmental conditions on capture efficiency and abundance estimates of Chinook salmon from beach seining within the Sacramento–San Joaquin Delta and the San Francisco Bay. Beach seine capture efficiency was measured using a stratified random sampling design combined with open and closed replicate depletion sampling. A total of 56 samples were collected during the spring of 2014. To assess variability in capture probability and the absolute abundance of juvenile Chinook salmon, beach seine capture efficiency data were fitted to the paired depletion design using modified N-mixture models. These models allowed us to explicitly test the closure assumption and estimate environmental effects on the probability of capture. We determined that our updated method allowing for lack of closure between depletion samples drastically outperformed traditional data analysis that assumes closure among replicate samples. The best-fit model (lowest-valued Akaike Information Criterion model) included the probability of fish being available for capture (relaxed closure assumption), capture probability modeled as a function of water velocity and percent coverage of fine sediment, and abundance modeled as a function of sample area, temperature, and water velocity. Given that beach seining is a ubiquitous sampling technique for many species, our improved sampling design and analysis could provide significant improvements in density and abundance estimation.
The coalescent of a sample from a binary branching process.
Lambert, Amaury
2018-04-25
At time 0, start a time-continuous binary branching process, where particles give birth to a single particle independently (at a possibly time-dependent rate) and die independently (at a possibly time-dependent and age-dependent rate). A particular case is the classical birth-death process. Stop this process at time T>0. It is known that the tree spanned by the N tips alive at time T of the tree thus obtained (called a reduced tree or coalescent tree) is a coalescent point process (CPP), which basically means that the depths of interior nodes are independent and identically distributed (iid). Now select each of the N tips independently with probability y (Bernoulli sample). It is known that the tree generated by the selected tips, which we will call the Bernoulli sampled CPP, is again a CPP. Now instead, select exactly k tips uniformly at random among the N tips (a k-sample). We show that the tree generated by the selected tips is a mixture of Bernoulli sampled CPPs with the same parent CPP, over some explicit distribution of the sampling probability y. An immediate consequence is that the genealogy of a k-sample can be obtained by the realization of k random variables, first the random sampling probability Y and then the k-1 node depths which are iid conditional on Y=y. Copyright © 2018. Published by Elsevier Inc.
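The structure of the result can be explored numerically. In a CPP the coalescence depth between two retained tips is the maximum of the parent node depths skipped between them, so Bernoulli sampling is a one-liner; the exponential depth law below is only an illustrative choice, not the general case.

```python
import numpy as np

rng = np.random.default_rng(3)

# Parent CPP: N tips separated by N-1 iid node depths (law is problem-specific).
N = 1000
H = rng.exponential(1.0, size=N - 1)

# Bernoulli sample of tips with probability y; the node depths of the sampled
# tree are the maxima of the parent depths between consecutive retained tips.
y = 0.1
keep = np.flatnonzero(rng.random(N) < y)
sampled_H = [H[a:b].max() for a, b in zip(keep[:-1], keep[1:])]

# sampled_H should again behave like iid draws (i.e., a CPP) with a new law;
# a k-sample corresponds to mixing over a random sampling probability Y.
print(len(keep), "tips kept;", len(sampled_H), "node depths in the sampled tree")
```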
Spatial Variation of Soil Lead in an Urban Community Garden: Implications for Risk-Based Sampling.
Bugdalski, Lauren; Lemke, Lawrence D; McElmurry, Shawn P
2014-01-01
Soil lead pollution is a recalcitrant problem in urban areas resulting from a combination of historical residential, industrial, and transportation practices. The emergence of urban gardening movements in postindustrial cities necessitates accurate assessment of soil lead levels to ensure safe gardening. In this study, we examined small-scale spatial variability of soil lead within a 15 × 30 m urban garden plot established on two adjacent residential lots located in Detroit, Michigan, USA. Eighty samples collected using a variably spaced sampling grid were analyzed for total, fine fraction (less than 250 μm), and bioaccessible soil lead. Measured concentrations varied at sampling scales of 1-10 m and a hot spot exceeding 400 ppm total soil lead was identified in the northwest portion of the site. An interpolated map of total lead was treated as an exhaustive data set, and random sampling was simulated to generate Monte Carlo distributions and evaluate alternative sampling strategies intended to estimate the average soil lead concentration or detect hot spots. Increasing the number of individual samples decreases the probability of overlooking the hot spot (type II error). However, the practice of compositing and averaging samples decreased the probability of overestimating the mean concentration (type I error) at the expense of increasing the chance for type II error. The results reported here suggest a need to reconsider U.S. Environmental Protection Agency sampling objectives and consequent guidelines for reclaimed city lots where soil lead distributions are expected to be nonuniform. © 2013 Society for Risk Analysis.
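The Monte Carlo evaluation of sampling strategies can be mimicked on a synthetic "exhaustive" lead map; the grid dimensions, background level, and hot-spot shape below are invented for illustration and do not reproduce the Detroit data.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy "exhaustive" lead map (ppm) on a 15 x 30 m plot with one hot spot.
x, yg = np.meshgrid(np.linspace(0, 30, 60), np.linspace(0, 15, 30))
lead = 150 + 300 * np.exp(-((x - 5) ** 2 + (yg - 12) ** 2) / 4)

def p_miss_hotspot(n_samples, n_trials=10_000, action_level=400):
    """Probability that n random samples all miss the hot spot (type II error)."""
    flat = lead.ravel()
    misses = 0
    for _ in range(n_trials):
        idx = rng.integers(0, flat.size, size=n_samples)
        if flat[idx].max() < action_level:
            misses += 1
    return misses / n_trials

for n in (5, 10, 20, 40):
    print(f"{n:2d} samples -> P(miss hot spot) ~ {p_miss_hotspot(n):.3f}")
```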
76 FR 770 - Proposed Information Collection; Comment Request; Monthly Wholesale Trade Survey
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-06
... reduces the time and cost of preparing mailout packages that contain unique variable data, while improving... developing productivity measurements. Estimates produced from the MWTS are based on a probability sample and..., excluding manufacturers' sales branches and offices. Estimated Number of Respondents: 4,500. Estimated Time...
Attitudes Vs. Cognitions: Explaining Long-Term Watergate Effects.
ERIC Educational Resources Information Center
Becker, Lee B.; Towers, Wayne M.
The political scandals known as Watergate provided an unusual opportunity to study the importance of attitudinal and cognitive variables in media research. In order to assess the impact of Watergate during the months preceding the 1974 Congressional elections, 339 personal interviews were conducted during October with a probability sample of…
Rare event simulation in radiation transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kollman, Craig
1993-10-01
This dissertation studies methods for estimating extremely small probabilities by Monte Carlo simulation. Problems in radiation transport typically involve estimating very rare events or the expected value of a random variable which is with overwhelming probability equal to zero. These problems often have high-dimensional state spaces and irregular geometries, so that analytic solutions are not possible. Monte Carlo simulation must be used to estimate the radiation dosage being transported to a particular location. If the area is well shielded, the probability of any one particular particle getting through is very small. Because of the large number of particles involved, even a tiny fraction penetrating the shield may represent an unacceptable level of radiation. It therefore becomes critical to be able to accurately estimate this extremely small probability. Importance sampling is a well-known technique for improving the efficiency of rare-event calculations. Here, a new set of probabilities is used in the simulation runs. The results are multiplied by the likelihood ratio between the true and simulated probabilities so as to keep the estimator unbiased. The variance of the resulting estimator is very sensitive to which new set of transition probabilities is chosen. It is shown that a zero-variance estimator does exist, but that its computation requires exact knowledge of the solution. A simple random walk with an associated killing model for the scatter of neutrons is introduced. Large-deviation results for optimal importance sampling in random walks are extended to the case where killing is present. An adaptive 'learning' algorithm for implementing importance sampling is given for more general Markov chain models of neutron scatter. For finite state spaces this algorithm is shown to give, with probability one, a sequence of estimates converging exponentially fast to the true solution.
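The likelihood-ratio reweighting at the heart of importance sampling is easy to demonstrate on a one-dimensional toy problem: estimating the Gaussian tail probability P(X > 6), hopeless for naive Monte Carlo, by sampling from a mean-shifted density. The shift choice is illustrative and not optimized.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
t = 6.0                       # P(X > 6) for X ~ N(0,1) is about 1e-9

# Sample from the shifted density N(t, 1) and reweight by the likelihood
# ratio phi(z) / phi(z - t) = exp(-t*z + t^2/2) to keep the estimator unbiased.
n = 100_000
z = rng.normal(t, 1.0, size=n)
lr = np.exp(-t * z + 0.5 * t ** 2)
est = np.mean((z > t) * lr)
print(f"IS estimate: {est:.3e}   exact: {norm.sf(t):.3e}")
```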
Hayer, C.-A.; Irwin, E.R.
2008-01-01
We used an information-theoretic approach to examine the variation in detection probabilities for 87 Piedmont and Coastal Plain fishes in relation to instream gravel mining in four Alabama streams of the Mobile River drainage. Biotic and abiotic variables were also included in candidate models. Detection probabilities were heterogeneous across species and varied with habitat type, stream, season, and water quality. Instream gravel mining influenced the variation in detection probabilities for 38% of the species collected, probably because it led to habitat loss and increased sedimentation. Higher detection probabilities were apparent at unmined sites than at mined sites for 78% of the species for which gravel mining was shown to influence detection probabilities, indicating potential negative impacts to these species. Physical and chemical attributes also explained the variation in detection probabilities for many species. These results indicate that anthropogenic impacts can affect detection probabilities for fishes, and such variation should be considered when developing monitoring programs or routine sampling protocols. © Copyright by the American Fisheries Society 2008.
Estimating rare events in biochemical systems using conditional sampling.
Sundar, V S
2017-01-28
The paper focuses on the development of variance reduction strategies to estimate rare events in biochemical systems. Obtaining this probability using brute-force Monte Carlo simulations in conjunction with the stochastic simulation algorithm (Gillespie's method) is computationally prohibitive. To circumvent this, importance sampling tools such as the weighted stochastic simulation algorithm and the doubly weighted stochastic simulation algorithm have been proposed. However, these strategies require an additional step of determining the important region to sample from, which is not straightforward for most problems. In this paper, we apply the subset simulation method, developed as a variance reduction tool in the context of structural engineering, to the problem of rare event estimation in biochemical systems. The main idea is that the rare event probability is expressed as a product of more frequent conditional probabilities. These conditional probabilities are estimated with high accuracy using Monte Carlo simulations, specifically the Markov chain Monte Carlo method with the modified Metropolis-Hastings algorithm. Generating sample realizations of the state vector using the stochastic simulation algorithm is viewed as mapping the discrete-state continuous-time random process to the standard normal random variable vector. This viewpoint opens up the possibility of applying more sophisticated and efficient sampling schemes developed elsewhere to problems in stochastic chemical kinetics. The results obtained using the subset simulation method are compared with existing variance reduction strategies for a few benchmark problems, and a satisfactory improvement in computational time is demonstrated.
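A compact sketch of subset simulation with the modified Metropolis algorithm, applied to a toy limit state (a sum of standard normals exceeding a high threshold) rather than a biochemical network; the level probability p0, proposal scale, and sample sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

def g(x):                          # toy limit-state function; rare when large
    return x.sum(axis=-1)

d, b = 10, 14.0                    # exact P(g > b) = norm.sf(14/sqrt(10)) ~ 5e-6
n, p0 = 1000, 0.1                  # samples per level, conditional probability

x = rng.normal(size=(n, d))
prob = 1.0
while True:
    vals = g(x)
    thresh = np.quantile(vals, 1 - p0)
    if thresh >= b:                         # final level reached
        prob *= (vals >= b).mean()
        break
    prob *= p0                              # each intermediate level contributes p0
    seeds = x[vals >= thresh]
    states = []
    for cur in seeds:                       # modified Metropolis per seed
        for _ in range(int(1 / p0)):
            prop = cur + 0.8 * rng.normal(size=d)
            # accept each component by the N(0,1) density ratio ...
            acc = rng.random(d) < np.exp(-(prop ** 2 - cur ** 2) / 2)
            cand = np.where(acc, prop, cur)
            # ... then reject candidates that leave the conditional level
            if g(cand) >= thresh:
                cur = cand
            states.append(cur)
    x = np.array(states)[:n]
print(f"subset-simulation estimate: {prob:.2e}")
```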
O'Connell, Allan F.; Talancy, Neil W.; Bailey, Larissa L.; Sauer, John R.; Cook, Robert; Gilbert, Andrew T.
2006-01-01
Large-scale, multispecies monitoring programs are widely used to assess changes in wildlife populations but they often assume constant detectability when documenting species occurrence. This assumption is rarely met in practice because animal populations vary across time and space. As a result, detectability of a species can be influenced by a number of physical, biological, or anthropogenic factors (e.g., weather, seasonality, topography, biological rhythms, sampling methods). To evaluate some of these influences, we estimated site occupancy rates using species-specific detection probabilities for meso- and large terrestrial mammal species on Cape Cod, Massachusetts, USA. We used model selection to assess the influence of different sampling methods and major environmental factors on our ability to detect individual species. Remote cameras detected the most species (9), followed by cubby boxes (7) and hair traps (4) over a 13-month period. Estimated site occupancy rates were similar among sampling methods for most species when detection probabilities exceeded 0.15, but we question estimates obtained from methods with detection probabilities between 0.05 and 0.15, and we consider methods with lower probabilities unacceptable for occupancy estimation and inference. Estimated detection probabilities can be used to accommodate variation in sampling methods, which allows for comparison of monitoring programs using different protocols. Vegetation and seasonality produced species-specific differences in detectability and occupancy, but differences were not consistent within or among species, which suggests that our results should be considered in the context of local habitat features and life history traits for the target species. We believe that site occupancy is a useful state variable and suggest that monitoring programs for mammals using occupancy data consider detectability prior to making inferences about species distributions or population change.
Probabilistic confidence for decisions based on uncertain reliability estimates
NASA Astrophysics Data System (ADS)
Reid, Stuart G.
2013-05-01
Reliability assessments are commonly carried out to provide a rational basis for risk-informed decisions concerning the design or maintenance of engineering systems and structures. However, calculated reliabilities and associated probabilities of failure often have significant uncertainties associated with the possible estimation errors relative to the 'true' failure probabilities. For uncertain probabilities of failure, a measure of 'probabilistic confidence' has been proposed to reflect the concern that uncertainty about the true probability of failure could result in a system or structure that is unsafe and could subsequently fail. The paper describes how the concept of probabilistic confidence can be applied to evaluate and appropriately limit the probabilities of failure attributable to particular uncertainties such as design errors that may critically affect the dependability of risk-acceptance decisions. This approach is illustrated with regard to the dependability of structural design processes based on prototype testing with uncertainties attributable to sampling variability.
Highly variable AGN from the XMM-Newton slew survey
NASA Astrophysics Data System (ADS)
Strotjohann, N. L.; Saxton, R. D.; Starling, R. L. C.; Esquej, P.; Read, A. M.; Evans, P. A.; Miniutti, G.
2016-07-01
Aims: We investigate the properties of a variability-selected complete sample of active galactic nuclei (AGN) in order to identify the mechanisms which cause large amplitude X-ray variability on timescales of years. Methods: A complete sample of 24 sources was constructed, from AGN which changed their soft X-ray luminosity by more than one order of magnitude over 5-20 years between ROSAT observations and the XMM-Newton slew survey. Follow-up observations were obtained with the Swift satellite. We analysed the spectra of these AGN at the Swift and XMM observation epochs, where six sources had continued to display extreme variability. Multiwavelength data are used to calculate black hole masses and the relative X-ray brightness αOX. Results: After removal of two probable spurious sources, we find that the sample has global properties which differ little from a non-varying control sample drawn from the wider XMM-slew/ROSAT/Veron sample of all secure AGN detections. A wide range of AGN types are represented in the varying sample. The black hole mass distributions for the varying and non-varying sample are not significantly different. This suggests that long timescale variability is not strongly affected by black hole mass. There is marginal evidence that the variable sources have a lower redshift (2σ) and X-ray luminosity (1.7σ). Apart from two radio-loud sources, the sample sources have normal optical-X-ray ratios (αOX) when at their peak but are X-ray weak during their lowest flux measurements. Conclusions: Drawing on our results and other studies, we are able to identify a variety of variability mechanisms at play: tidal disruption events, jet activity, changes in absorption, thermal emission from the inner accretion disc, and variable accretion disc reflection. Little evidence for strong absorption is seen in the majority of the sample and single-component absorption can be excluded as the mechanism for most sources.
Christensen, H; Mackinnon, A J; Korten, A E; Jorm, A F; Henderson, A S; Jacomb, P; Rodgers, B
1999-09-01
This longitudinal study investigated whether age is associated with increases in interindividual variability across 4 ability domains using a sample of 426 elderly community dwellers followed over 3.5 years. Interindividual variability in change scores increased with age for memory, spatial functioning, and speed but not for crystallized intelligence for the full sample and in a subsample that excluded dementia or probable dementia cases. Hierarchical regression analyses indicated that being female, having weaker muscle strength, and having greater symptoms of illness and greater depression were associated with overall greater variability in cognitive scores. Having a higher level of education was associated with reduced variability. These findings are consistent with the view that there is a greater range of responses at older ages, that certain domains of intelligence are less susceptible to variation than others and that variables other than age affect cognitive performance in later life.
Nonlinear Spatial Inversion Without Monte Carlo Sampling
NASA Astrophysics Data System (ADS)
Curtis, A.; Nawaz, A.
2017-12-01
High-dimensional, nonlinear inverse or inference problems usually have non-unique solutions. The distribution of solutions is described by probability distributions, and these are usually found using Monte Carlo (MC) sampling methods. These take pseudo-random samples of models in parameter space, calculate the probability of each sample given available data and other information, and thus map out high or low probability values of model parameters. However, such methods converge to the solution only as the number of samples tends to infinity; in practice, MC is found to be slow to converge, convergence is not guaranteed to be achieved in finite time, and detection of convergence requires the use of subjective criteria. We propose a method for Bayesian inversion of categorical variables such as geological facies or rock types in spatial problems, which requires no sampling at all. The method uses a 2-D Hidden Markov Model over a grid of cells, where observations represent localized data constraining the model in each cell. The data in our example application are seismic properties such as P- and S-wave impedances or rock density; our model parameters are the hidden states and represent the geological rock types in each cell. The observations at each location are assumed to depend on the facies at that location only, an assumption referred to as 'localized likelihoods'. However, the facies at a location cannot be determined solely by the observation at that location, as it also depends on prior information concerning its correlation with the spatial distribution of facies elsewhere. Such prior information is included in the inversion in the form of a training image which represents a conceptual depiction of the distribution of local geologies that might be expected, but other forms of prior information can be used in the method as desired. The method provides direct (pseudo-analytic) estimates of posterior marginal probability distributions over each variable, so these do not need to be estimated from samples as is required in MC methods. On a 2-D test example the method is shown to outperform previous methods significantly, and at a fraction of the computational cost. In many foreseeable applications there are therefore no serious impediments to extending the method to 3-D spatial models.
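The pseudo-analytic marginals come from dynamic-programming recursions. A one-dimensional chain with made-up localized likelihoods (a simplification of the paper's 2-D Hidden Markov Model, which uses the same ingredients) shows the forward-backward computation.

```python
import numpy as np

# Two facies (0 = shale, 1 = sand); the Markov prior encodes spatial continuity.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])                 # cell-to-cell transition probabilities
pi0 = np.array([0.5, 0.5])

# Localized likelihoods p(obs_i | facies) for a line of six cells (illustrative).
L = np.array([[0.8, 0.2], [0.7, 0.3], [0.4, 0.6],
              [0.2, 0.8], [0.3, 0.7], [0.9, 0.1]])

n, k = L.shape
alpha = np.zeros((n, k))
beta = np.ones((n, k))
alpha[0] = pi0 * L[0]
alpha[0] /= alpha[0].sum()
for i in range(1, n):                      # forward pass
    alpha[i] = (alpha[i - 1] @ P) * L[i]
    alpha[i] /= alpha[i].sum()
for i in range(n - 2, -1, -1):             # backward pass
    beta[i] = P @ (beta[i + 1] * L[i + 1])
    beta[i] /= beta[i].sum()

post = alpha * beta                        # exact posterior marginals per cell
post /= post.sum(axis=1, keepdims=True)
print(np.round(post, 3))
```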
A new variable interval schedule with constant hazard rate and finite time range.
Bugallo, Mehdi; Machado, Armando; Vasconcelos, Marco
2018-05-27
We propose a new variable interval (VI) schedule that achieves constant probability of reinforcement in time while using a bounded range of intervals. By sampling each trial duration from a uniform distribution ranging from 0 to 2T seconds, and then applying a reinforcement rule that depends linearly on trial duration, the schedule alternates reinforced and unreinforced trials, each less than 2T seconds, while preserving a constant hazard function. © 2018 Society for the Experimental Analysis of Behavior.
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
On the objective identification of flood seasons
NASA Astrophysics Data System (ADS)
Cunderlik, Juraj M.; Ouarda, Taha B. M. J.; BobéE, Bernard
2004-01-01
The determination of seasons of high and low probability of flood occurrence is a task with many practical applications in contemporary hydrology and water resources management. Flood seasons are generally identified subjectively by visually assessing the temporal distribution of flood occurrences and, then at a regional scale, verified by comparing the temporal distribution with distributions obtained at hydrologically similar neighboring sites. This approach is subjective, time consuming, and potentially unreliable. The main objective of this study is therefore to introduce a new, objective, and systematic method for the identification of flood seasons. The proposed method tests the significance of flood seasons by comparing the observed variability of flood occurrences with the theoretical flood variability in a nonseasonal model. The method also addresses the uncertainty resulting from sampling variability by quantifying the probability associated with the identified flood seasons. The performance of the method was tested on an extensive number of samples with different record lengths generated from several theoretical models of flood seasonality. The proposed approach was then applied on real data from a large set of sites with different flood regimes across Great Britain. The results show that the method can efficiently identify flood seasons from both theoretical and observed distributions of flood occurrence. The results were used for the determination of the main flood seasonality types in Great Britain.
Andrews, Derek S.; Gudbrandsen, Christina M.; Marquand, Andre F.; Ginestet, Cedric E.; Daly, Eileen M.; Murphy, Clodagh M.; Lai, Meng-Chuan; Lombardo, Michael V.; Ruigrok, Amber N. V.; Bullmore, Edward T.; Suckling, John; Williams, Steven C. R.; Baron-Cohen, Simon; Craig, Michael C.; Murphy, Declan G. M.
2017-01-01
Importance: Autism spectrum disorder (ASD) is 2 to 5 times more common in male individuals than in female individuals. While the male preponderant prevalence of ASD might partially be explained by sex differences in clinical symptoms, etiological models suggest that the biological male phenotype carries a higher intrinsic risk for ASD than the female phenotype. To our knowledge, this hypothesis has never been tested directly, and the neurobiological mechanisms that modulate ASD risk in male individuals and female individuals remain elusive. Objectives: To examine the probability of ASD as a function of normative sex-related phenotypic diversity in brain structure and to identify the patterns of sex-related neuroanatomical variability associated with low or high probability of ASD. Design, Setting, and Participants: This study examined a cross-sectional sample of 98 right-handed, high-functioning adults with ASD and 98 matched neurotypical control individuals aged 18 to 42 years. A multivariate probabilistic classification approach was used to develop a predictive model of biological sex based on cortical thickness measures assessed via magnetic resonance imaging in neurotypical controls. This normative model was subsequently applied to individuals with ASD. The study dates were June 2005 to October 2009, and this analysis was conducted between June 2015 and July 2016. Main Outcomes and Measures: Sample and population ASD probability estimates as a function of normative sex-related diversity in brain structure, as well as neuroanatomical patterns associated with low or high ASD probability in male individuals and female individuals. Results: Among the 98 individuals with ASD, 49 were male and 49 female, with a mean (SD) age of 26.88 (7.18) years. Among the 98 controls, 51 were male and 47 female, with a mean (SD) age of 27.39 (6.44) years. The sample probability of ASD increased significantly with predictive probabilities for the male neuroanatomical brain phenotype. For example, biological female individuals with a more male-typic pattern of brain anatomy were significantly (ie, 3 times) more likely to have ASD than biological female individuals with a characteristically female brain phenotype (P = .72 vs .24, respectively; χ²(1) = 20.26; P < .001; difference in P values, 0.48; 95% CI, 0.29-0.68). This finding translates to an estimated variability in population prevalence from 0.2% to 1.3%, respectively. Moreover, the patterns of neuroanatomical variability carrying low or high ASD probability were sex specific (eg, in inferior temporal regions, where ASD has different neurobiological underpinnings in male individuals and female individuals). Conclusions and Relevance: These findings highlight the need for considering normative sex-related phenotypic diversity when determining an individual's risk for ASD and provide important novel insights into the neurobiological mechanisms mediating sex differences in ASD prevalence. PMID:28196230
Origin and fate of nanoparticles in marine water - Preliminary results.
Graca, Bożena; Zgrundo, Aleksandra; Zakrzewska, Danuta; Rzodkiewicz, Monika; Karczewski, Jakub
2018-05-05
The number, morphology and elemental composition of nanoparticles (<100 nm) in marine water were investigated using Variable Pressure Scanning Electron Microscopy (VP-SEM) and Energy-dispersive X-ray spectroscopy (EDS). Preliminary research conducted in the Baltic Sea showed that the number of nanoparticles in seawater varied from undetectable to 380 × 10² cm⁻³. Wind mixing and density barriers (thermocline) had a significant impact on the abundance and distribution of nanoparticles in water. Many more nanoparticles (mainly nanofibers) were detected in periods of intensive primary production and thermal stratification of water than at the end of the growing season and during periods of strong wind mixing. Temporal and spatial variability of nanoparticles as well as air mass trajectories indicated that the analysed nanofibers were both autochthonous and allochthonous (atmospheric), while the nanospheres were mainly autochthonous. The chemical composition of most of the analysed nanoparticles indicates their autochthonous, natural (biogenic/geogenic) origin. Silica nanofibers (probably the remains of flagellates), nanofibers composed of manganese and iron oxides (probably of microbial origin), and pyrite nanospheres (probably formed in anoxic sediments) were all identified in the samples. Only the asbestos nanofibers, which were also detected, are probably allochthonous and anthropogenic. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Al-Mudhafar, W. J.
2013-12-01
Precise prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships used to estimate properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution and so build an accurate reservoir model for optimal future reservoir performance. In this paper, facies estimation has been done through multinomial logistic regression (MLR) with respect to the well logs and core data in a well in the upper sandstone formation of the South Rumaila oil field. The independent variables are gamma ray, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. First, a robust sequential imputation algorithm has been used to impute the missing data. This algorithm starts from a complete subset of the dataset and sequentially estimates the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to maximize the likelihood and minimize the standard error for the nonlinear relationships between facies and the core and log data. The MLR predicts the probabilities of the different possible facies given the independent variables by constructing a linear predictor function, a set of weights combined with the independent variables through a dot product. A beta distribution of facies has been taken as prior knowledge, and the resulting predicted probability (posterior) has been estimated from the MLR based on Bayes' theorem, which relates the posterior to the conditional probability and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap is carried out to estimate extra-sample prediction error by randomly drawing datasets with replacement from the training data. Each sample has the same size as the original training set, and the procedure can be repeated N times to produce N bootstrap datasets used to re-fit the model, decreasing the squared difference between the estimated and observed categorical variables (facies) and thereby reducing the degree of uncertainty.
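A hedged sketch of the workflow's core step, multinomial logistic regression with a bootstrap assessment of prediction error; the synthetic predictors and class-generating mechanism below are placeholders for the well-log and core measurements.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)

# Synthetic stand-ins for four predictors (e.g., GR, density, Sw, porosity)
# and three facies classes generated from a known linear rule plus noise.
n = 300
X = rng.normal(size=(n, 4))
true_w = rng.normal(size=(4, 3))
facies = (X @ true_w + rng.gumbel(size=(n, 3))).argmax(axis=1)

model = LogisticRegression(max_iter=1000).fit(X, facies)

# Bootstrap: refit on resampled training sets, score on out-of-bag rows.
errs = []
for _ in range(200):
    idx = rng.integers(0, n, size=n)
    m = LogisticRegression(max_iter=1000).fit(X[idx], facies[idx])
    oob = np.setdiff1d(np.arange(n), idx)
    errs.append((m.predict(X[oob]) != facies[oob]).mean())
print(f"bootstrap (out-of-bag) misclassification: {np.mean(errs):.3f}")
```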
Students' Appreciation of Expectation and Variation as a Foundation for Statistical Understanding
ERIC Educational Resources Information Center
Watson, Jane M.; Callingham, Rosemary A.; Kelly, Ben A.
2007-01-01
This study presents the results of a partial credit Rasch analysis of in-depth interview data exploring statistical understanding of 73 school students in 6 contextual settings. The use of Rasch analysis allowed the exploration of a single underlying variable across contexts, which included probability sampling, representation of temperature…
Pigeons' Choices between Fixed-Interval and Random-Interval Schedules: Utility of Variability?
ERIC Educational Resources Information Center
Andrzejewski, Matthew E.; Cardinal, Claudia D.; Field, Douglas P.; Flannery, Barbara A.; Johnson, Michael; Bailey, Kathleen; Hineline, Philip N.
2005-01-01
Pigeons' choosing between fixed-interval and random-interval schedules of reinforcement was investigated in three experiments using a discrete-trial procedure. In all three experiments, the random-interval schedule was generated by sampling a probability distribution at an interval (and in multiples of the interval) equal to that of the…
Tics and Tourette Syndrome in Autism Spectrum Disorders
ERIC Educational Resources Information Center
Canitano, Roberto; Vivanti, Giacomo
2007-01-01
Autism spectrum disorders (ASDs) are more frequently associated with tic disorders than expected by chance. Variable rates of comorbidity have been reported and common genetic and neurobiological factors are probably involved. The aim of this study was to determine the rate of tic disorders in a clinical sample (n = 105) of children and…
Switzer, P.; Harden, J.W.; Mark, R.K.
1988-01-01
A statistical method for estimating rates of soil development in a given region based on calibration from a series of dated soils is used to estimate ages of soils in the same region that are not dated directly. The method is designed specifically to account for sampling procedures and uncertainties that are inherent in soil studies. Soil variation and measurement error, uncertainties in calibration dates and their relation to the age of the soil, and the limited number of dated soils are all considered. Maximum likelihood (ML) is employed to estimate a parametric linear calibration curve, relating soil development to time or age on suitably transformed scales. Soil variation on a geomorphic surface of a certain age is characterized by replicate sampling of soils on each surface; such variation is assumed to have a Gaussian distribution. The age of a geomorphic surface is described by older and younger bounds. This technique allows age uncertainty to be characterized by either a Gaussian distribution or by a triangular distribution using minimum, best-estimate, and maximum ages. The calibration curve is taken to be linear after suitable (in certain cases logarithmic) transformations, if required, of the soil parameter and age variables. Soil variability, measurement error, and departures from linearity are described in a combined fashion using Gaussian distributions with variances particular to each sampled geomorphic surface and the number of sample replicates. Uncertainty in age of a geomorphic surface used for calibration is described using three parameters by one of two methods. In the first method, upper and lower ages are specified together with a coverage probability; this specification is converted to a Gaussian distribution with the appropriate mean and variance. In the second method, "absolute" older and younger ages are specified together with a most probable age; this specification is converted to an asymmetric triangular distribution with mode at the most probable age. The statistical variability of the ML-estimated calibration curve is assessed by a Monte Carlo method in which simulated data sets repeatedly are drawn from the distributional specification; calibration parameters are reestimated for each such simulation in order to assess their statistical variability. Several examples are used for illustration. The age of undated soils in a related setting may be estimated from the soil data using the fitted calibration curve. A second simulation to assess age estimate variability is described and applied to the examples. © 1988 International Association for Mathematical Geology.
Gaiser, Maria Rita; Skorokhod, Alexander; Gransheier, Diana; Weide, Benjamin; Koch, Winfried; Schif, Birgit; Enk, Alexander; Garbe, Claus; Bauer, Jürgen
2017-01-01
The incidence of melanoma, particularly in older patients, has steadily increased over the past few decades. Activating mutations of BRAF, the majority occurring in BRAFV600, are frequently detected in melanoma; however, the prognostic significance remains unclear. This study aimed to define the probability and distribution of BRAFV600 mutations, and the clinico-pathological factors that may affect BRAF mutation status, in patients with advanced melanoma using next-generation sequencing. This was a non-interventional, retrospective study of BRAF mutation testing at two German centers, in Heidelberg and Tübingen. Archival tumor samples from patients with histologically confirmed melanoma (stage IIIB, IIIC, IV) were analyzed using PCR amplification and deep sequencing. Clinical, histological, and mutation data were collected. The statistical influence of patient- and tumor-related characteristics on BRAFV600 mutation status was assessed using multiple logistic regression (MLR) and a prediction profiler. BRAFV600 mutation status was assessed in 453 samples. Mutations were detected in 57.6% of patients (n = 261), with 48.1% (n = 102) at the Heidelberg site and 66.0% (n = 159) at the Tübingen site. The decreasing influence of increasing age on mutation probability was quantified. A main effects MLR model identified age (p = 0.0001), center (p = 0.0004), and melanoma subtype (p = 0.014) as significantly influencing BRAFV600 mutation probability; ultraviolet (UV) exposure showed a statistical trend (p = 0.1419). An interaction model of age versus other variables showed that center (p<0.0001) and melanoma subtype (p = 0.0038) significantly influenced BRAF mutation probability; age had a statistically significant effect only as part of an interaction with both UV exposure (p = 0.0110) and melanoma subtype (p = 0.0134). This exploratory study highlights that testing center, melanoma subtype, and age in combination with UV exposure and melanoma subtype significantly influence BRAFV600 mutation probability in patients with melanoma. Further validation of this model, in terms of reproducibility and broader relevance, is required.
Optimized nested Markov chain Monte Carlo sampling: theory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coe, Joshua D; Shaw, M Sam; Sewell, Thomas D
2009-01-01
Metropolis Monte Carlo sampling of a reference potential is used to build a Markov chain in the isothermal-isobaric ensemble. At the endpoints of the chain, the energy is reevaluated at a different level of approximation (the 'full' energy) and a composite move encompassing all of the intervening steps is accepted on the basis of a modified Metropolis criterion. By manipulating the thermodynamic variables characterizing the reference system we maximize the average acceptance probability of composite moves, lengthening significantly the random walk made between consecutive evaluations of the full energy at a fixed acceptance probability. This provides maximally decorrelated samples of the full potential, thereby lowering the total number required to build ensemble averages of a given variance. The efficiency of the method is illustrated using model potentials appropriate to molecular fluids at high pressure. Implications for ab initio or density functional theory (DFT) treatment are discussed.
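The composite-move acceptance rule can be illustrated on a one-dimensional toy problem: sub-chains run ordinary Metropolis on a cheap reference potential, and the endpoint is accepted against the 'full' potential with the modified criterion. Both potentials and the canonical (rather than isothermal-isobaric) setting are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)

def e_ref(x):                       # cheap reference potential
    return 0.5 * x ** 2

def e_full(x):                      # expensive "full" potential (toy surrogate)
    return 0.5 * x ** 2 + 0.1 * x ** 4

beta, n_sub, n_comp = 1.0, 25, 2000
x, samples = 0.0, []
for _ in range(n_comp):
    # Sub-chain: ordinary Metropolis steps on the reference potential only.
    y = x
    for _ in range(n_sub):
        prop = y + rng.normal(0.0, 1.0)
        if rng.random() < np.exp(-beta * (e_ref(prop) - e_ref(y))):
            y = prop
    # Composite move: correct for the reference/full mismatch at the endpoints.
    log_acc = -beta * ((e_full(y) - e_full(x)) - (e_ref(y) - e_ref(x)))
    if np.log(rng.random()) < log_acc:
        x = y
    samples.append(x)
print("<x^2> under the full potential ~", np.mean(np.square(samples[500:])))
```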
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schmidt, Edward G.; Hemen, Brian; Rogalla, Danielle
We have obtained VR photometry of 282 Cepheid variable star candidates from the northern part of the All Sky Automated Survey (ASAS). These together with data from the ASAS and the Northern Sky Variability Survey (NSVS) were used to redetermine the periods of the stars. We divided the stars into four groups based on location in a plot of mean color, (V-R), versus period. Two of the groups fell within the region of the diagram containing known type II Cepheids and yielded 14 new highly probable type II Cepheids. The properties of the remaining stars in these two groups are discussed, but their nature remains uncertain. Unexplained differences exist between the sample of stars studied here and a previous sample drawn from the NSVS by Akerlof et al. This suggests serious biases in the identification of variables in different surveys.
Approved Methods and Algorithms for DoD Risk-Based Explosives Siting
2009-07-21
Glossary extract, parameters used in determining the probability of hit (Phit) by debris [Table 31, Table 32, Table 33, Eq. (157), Eq. (158)]: CCa, variable, "Actual ... being in the glass hazard area" [Eq. (60), Eq. (78)]; Phit, variable, "Probability of hit", an array value indexed by consequence and mass bin [Eq. (156), Eq. (157)]; Phit(f), variable, "Probability of hit for fatality" [Eq. (157), Eq. (158)]; Phit(maji), variable, "Probability of hit for major injury".
Evaluating impacts using a BACI design, ratios, and a Bayesian approach with a focus on restoration.
Conner, Mary M; Saunders, W Carl; Bouwes, Nicolaas; Jordan, Chris
2015-10-01
Before-after-control-impact (BACI) designs are an effective method to evaluate natural and human-induced perturbations on ecological variables when treatment sites cannot be randomly chosen. While effect sizes of interest can be tested with frequentist methods, using Bayesian Markov chain Monte Carlo (MCMC) sampling methods, probabilities of effect sizes, such as a ≥20% increase in density after restoration, can be directly estimated. Although BACI and Bayesian methods are used widely for assessing natural and human-induced impacts for field experiments, the application of hierarchical Bayesian modeling with MCMC sampling to BACI designs is less common. Here, we combine these approaches and extend the typical presentation of results with an easy to interpret ratio, which provides an answer to the main study question: "How much impact did a management action or natural perturbation have?" As an example of this approach, we evaluate the impact of a restoration project, which implemented beaver dam analogs, on survival and density of juvenile steelhead. Results indicated the probabilities of a ≥30% increase were high for survival and density after the dams were installed, 0.88 and 0.99, respectively, while probabilities for a higher increase of ≥50% were variable, 0.17 and 0.82, respectively. This approach demonstrates a useful extension of Bayesian methods that can easily be generalized to other study designs from simple (e.g., single factor ANOVA, paired t test) to more complicated block designs (e.g., crossover, split-plot). This approach is valuable for estimating the probabilities of restoration impacts or other management actions.
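Once MCMC draws of the before- and after-restoration parameters are in hand, probabilities of effect sizes fall out by simple counting over the ratio. The normal "posterior draws" below are placeholders for output from a fitted hierarchical model, and the numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(9)

# Stand-ins for MCMC draws of mean density (fish/m) before and after
# restoration; in practice these come from the fitted hierarchical model.
before = rng.normal(1.00, 0.10, size=20_000)
after = rng.normal(1.35, 0.12, size=20_000)

ratio = after / before                 # the easy-to-interpret BACI-style ratio
for thresh in (1.2, 1.3, 1.5):
    pct = int(round((thresh - 1) * 100))
    print(f"P(increase >= {pct}%) = {(ratio >= thresh).mean():.2f}")
```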
Shoukri, Mohamed M; Elkum, Nasser; Walter, Stephen D
2006-01-01
Background: In this paper we propose the use of the within-subject coefficient of variation as an index of a measurement's reliability. For continuous variables and based on its maximum likelihood estimation we derive a variance-stabilizing transformation and discuss confidence interval construction within the framework of a one-way random effects model. We investigate sample size requirements for the within-subject coefficient of variation for continuous and binary variables. Methods: We investigate the validity of the approximate normal confidence interval by Monte Carlo simulations. In designing a reliability study, a crucial issue is the balance between the number of subjects to be recruited and the number of repeated measurements per subject. We discuss efficiency of estimation and cost considerations for the optimal allocation of the sample resources. The approach is illustrated by an example on Magnetic Resonance Imaging (MRI). We also discuss the issue of sample size estimation for dichotomous responses with two examples. Results: For the continuous variable, we found that the variance-stabilizing transformation improves the asymptotic coverage probabilities of the confidence interval for the within-subject coefficient of variation. The maximum likelihood estimation and the sample size estimation based on a pre-specified confidence interval width are novel contributions to the literature for the binary variable. Conclusion: Using the sample size formulas, we hope to help clinical epidemiologists and practicing statisticians to efficiently design reliability studies using the within-subject coefficient of variation, whether the variable of interest is continuous or binary. PMID:16686943
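A minimal sketch of the point estimate for the continuous case, assuming a balanced one-way random effects layout: the within-subject variance is estimated by the mean square within, and the WSCV is its square root divided by the grand mean. Sample sizes and variances are illustrative.

```python
import numpy as np

rng = np.random.default_rng(10)

# Replicate measurements: n subjects, k repeats per subject.
n, k = 30, 3
subject_means = rng.normal(100.0, 10.0, size=n)
y = subject_means[:, None] + rng.normal(0.0, 5.0, size=(n, k))

grand_mean = y.mean()
ms_within = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))
wscv = np.sqrt(ms_within) / grand_mean
print(f"within-subject CV ~ {wscv:.3f} (simulated truth: 5/100 = 0.05)")
```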
Probabilistic analysis of preload in the abutment screw of a dental implant complex.
Guda, Teja; Ross, Thomas A; Lang, Lisa A; Millwater, Harry R
2008-09-01
Screw loosening is a problem for a percentage of implants. A probabilistic analysis to determine the cumulative probability distribution of the preload, the probability of obtaining an optimal preload, and the probabilistic sensitivities identifying important variables is lacking. The purpose of this study was to examine the inherent variability of material properties, surface interactions, and applied torque in an implant system to determine the probability of obtaining desired preload values and to identify the significant variables that affect the preload. Using software programs, an abutment screw was subjected to a tightening torque and the preload was determined from finite element (FE) analysis. The FE model was integrated with probabilistic analysis software. Two probabilistic analysis methods (advanced mean value and Monte Carlo sampling) were applied to determine the cumulative distribution function (CDF) of preload. The coefficient of friction, elastic moduli, Poisson's ratios, and applied torque were modeled as random variables and defined by probability distributions. Separate probability distributions were determined for the coefficient of friction in well-lubricated and dry environments. The probabilistic analyses were performed and the cumulative distribution of preload was determined for each environment. A distinct difference was seen between the preload probability distributions generated in a dry environment (normal distribution, mean (SD): 347 (61.9) N) compared to a well-lubricated environment (normal distribution, mean (SD): 616 (92.2) N). The probability of obtaining a preload value within the target range was approximately 54% for the well-lubricated environment and only 0.02% for the dry environment. The preload is predominately affected by the applied torque and coefficient of friction between the screw threads and implant bore at lower and middle values of the preload CDF, and by the applied torque and the elastic modulus of the abutment screw at high values of the preload CDF. Lubrication at the threaded surfaces between the abutment screw and implant bore affects the preload developed in the implant complex. For the well-lubricated surfaces, only approximately 50% of implants will have preload values within the generally accepted range. This probability can be improved by applying a higher torque than normally recommended or a more closely controlled torque than typically achieved. It is also suggested that materials with higher elastic moduli be used in the manufacture of the abutment screw to achieve a higher preload.
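A Monte Carlo version of such an analysis can be sketched with the short-form torque equation T = K d F standing in for the paper's finite-element model; the nut-factor model, the input distributions, and the target preload range are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 100_000

d = 2.0e-3                                        # screw diameter, m
torque = rng.normal(0.32, 0.02, size=n)           # applied torque, N*m
mu = rng.normal(0.12, 0.02, size=n).clip(0.05)    # lubricated friction coeff.
K = 0.16 + 1.0 * mu                               # crude nut-factor model
preload = torque / (K * d)                        # invert T = K * d * F

lo, hi = 500.0, 750.0                             # notional target range, N
p_target = ((preload >= lo) & (preload <= hi)).mean()
print(f"mean preload {preload.mean():.0f} N; "
      f"P(preload in target range) ~ {p_target:.2f}")
```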
Kopp, Blaine S.; Nielsen, Martha; Glisic, Dejan; Neckles, Hilary A.
2009-01-01
This report documents results of pilot tests of a protocol for monitoring estuarine nutrient enrichment for the Vital Signs Monitoring Program of the National Park Service Northeast Coastal and Barrier Network. Data collected from four parks during protocol development in 2003-06 are presented: Gateway National Recreation Area, Colonial National Historic Park, Fire Island National Seashore, and Assateague Island National Seashore. The monitoring approach incorporates several spatial and temporal designs to address questions at a hierarchy of scales. Indicators of estuarine response to nutrient enrichment were sampled using a probability design within park estuaries during a late-summer index period. Monitoring variables consisted of dissolved-oxygen concentration, chlorophyll a concentration, water temperature, salinity, attenuation of downwelling photosynthetically available radiation (PAR), and turbidity. The statistical sampling design allowed the condition of unsampled locations to be inferred from the distribution of data from a set of randomly positioned "probability" stations. A subset of sampling stations was sampled repeatedly during the index period, and stations were not rerandomized in subsequent years. These "trend stations" allowed us to examine temporal variability within the index period, and to improve the sensitivity of the monitoring protocol to detecting change through time. Additionally, one index site in each park was equipped for continuous monitoring throughout the index period. Thus, the protocol includes elements of probabilistic and targeted spatial sampling, and the temporal intensity ranges from snapshot assessments to continuous monitoring.
Dealing with uncertainty in the probability of overtopping of a flood mitigation dam
NASA Astrophysics Data System (ADS)
Michailidi, Eleni Maria; Bacchi, Baldassare
2017-05-01
In recent years, copula multivariate functions have been used to model, probabilistically, the most important variables of flood events: discharge peak, flood volume, and duration. However, in most cases the sampling uncertainty from which small samples suffer is neglected. In this paper, considering a real reservoir controlled by a dam as a case study, we apply a structure-based approach to estimate the probability of reaching specific reservoir levels, taking into account the key components of an event (flood peak, volume, hydrograph shape) and of the reservoir (rating curve, volume-water depth relation). Additionally, we improve information about the peaks from historical data and reports through a Bayesian framework, allowing the incorporation of supplementary knowledge from different sources and its associated error. As shown here, the extra information can result in a very different inferred parameter set, which is reflected in strong variability of the reservoir level associated with a given return period. Most importantly, the sampling uncertainty is accounted for in both cases (single-site and multi-site with historical information scenarios), and Monte Carlo confidence intervals for the maximum water level are calculated. It is shown that water levels for specific return periods overlap in many cases, making risk assessment misleading unless confidence intervals are provided.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romero, Vicente; Bonney, Matthew; Schroeder, Benjamin
When very few samples of a random quantity are available from a source distribution of unknown shape, it is usually not possible to accurately infer the exact distribution from which the data samples come. Under-estimation of important quantities such as response variance and failure probabilities can result. For many engineering purposes, including design and risk analysis, we attempt to avoid under-estimation with a strategy to conservatively estimate (bound) these types of quantities, without being overly conservative, when only a few samples of a random quantity are available from model predictions or replicate experiments. This report examines a class of related sparse-data uncertainty representation and inference approaches that are relatively simple, inexpensive, and effective. Tradeoffs between the methods' conservatism, reliability, and risk versus number of data samples (cost) are quantified with multi-attribute metrics used to assess method performance for conservative estimation of two representative quantities: the central 95% of response, and the 10^-4 probability of exceeding a response threshold in a tail of the distribution. Each method's performance is characterized with 10,000 random trials on a large number of diverse and challenging distributions. The best method and number of samples to use in a given circumstance depends on the uncertainty quantity to be estimated, the PDF character, and the desired reliability of bounding the true value. On the basis of this large data base and study, a strategy is proposed for selecting the method and number of samples for attaining reasonable credibility levels in bounding these types of quantities when sparse samples of random variables or functions are available from experiments or simulations.
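One simple member of this class of approaches, for the tail quantity: an exact one-sided binomial (Clopper-Pearson) upper confidence bound on an exceedance probability computed from a handful of samples. The data, threshold, and confidence level are illustrative, and this is a generic technique rather than necessarily the report's preferred method.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)
samples = rng.lognormal(0.0, 1.0, size=10)      # sparse data of unknown shape

# Conservative upper bound on P(X > threshold) from k exceedances in n
# samples: one-sided Clopper-Pearson (exact binomial) at 95% confidence.
threshold = 8.0
n, k = len(samples), int((samples > threshold).sum())
p_upper = stats.beta.ppf(0.95, k + 1, n - k) if k < n else 1.0
print(f"{k}/{n} exceedances -> 95% upper bound on exceedance prob: {p_upper:.3f}")
```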
Mattfeldt, S.D.; Bailey, L.L.; Grant, E.H.C.
2009-01-01
Monitoring programs have the potential to identify population declines and differentiate among the possible cause(s) of these declines. Recent criticisms regarding the design of monitoring programs have highlighted a failure to clearly state objectives and to address detectability and spatial sampling issues. Here, we incorporate these criticisms to design an efficient monitoring program whose goals are to determine environmental factors which influence the current distribution and measure change in distributions over time for a suite of amphibians. In designing the study we (1) specified a priori factors that may relate to occupancy, extinction, and colonization probabilities and (2) used the data collected (incorporating detectability) to address our scientific questions and adjust our sampling protocols. Our results highlight the role of wetland hydroperiod and other local covariates in the probability of amphibian occupancy. There was a change in overall occupancy probabilities for most species over the first three years of monitoring. Most colonization and extinction estimates were constant over time (years) and space (among wetlands), with one notable exception: local extinction probabilities for Rana clamitans were lower for wetlands with longer hydroperiods. We used information from the target system to generate scenarios of population change and gauge the ability of the current sampling to meet monitoring goals. Our results highlight the limitations of the current sampling design, emphasizing the need for long-term efforts, with periodic re-evaluation of the program in a framework that can inform management decisions.
Constituent loads in small streams: the process and problems of estimating sediment flux
R. B. Thomas
1989-01-01
Constituent loads in small streams are often estimated poorly. This is especially true for discharge-related constituents like sediment, since their flux is highly variable and mainly occurs during infrequent high-flow events. One reason for low-quality estimates is that most prevailing data collection methods ignore sampling probabilities and only partly account for...
Estimation and applications of size-based distributions in forestry
Jeffrey H. Gove
2003-01-01
Size-based distributions arise in several contexts in forestry and ecology. Simple power relationships (e.g., basal area and diameter at breast height) between variables are one such area of interest arising from a modeling perspective. Another, probability proportional to size sampling (PPS), is found in the most widely used methods for sampling standing or dead and...
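For context on PPS estimation, the sketch below applies the standard Hansen-Hurwitz estimator to a synthetic tree population sampled with probability proportional to an auxiliary size variable; the population, size variable, and sample size are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical population of 500 trees
volume = rng.lognormal(mean=0.0, sigma=0.6, size=500)
size = volume * rng.uniform(0.8, 1.25, size=500)   # auxiliary size variable
p = size / size.sum()                              # PPS selection probabilities

n = 30
idx = rng.choice(500, size=n, replace=True, p=p)   # PPS sampling with replacement
hh_total = np.mean(volume[idx] / p[idx])           # Hansen-Hurwitz estimate of the total

print(f"estimated total {hh_total:.1f} vs true total {volume.sum():.1f}")
```

Because the selection probabilities track the size variable, the estimator's variance is small whenever size is a good proxy for the quantity of interest.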
ERIC Educational Resources Information Center
Chu, Yuan-Hsiang
2006-01-01
The study investigated the effects of an educational intervention on the variables of awareness, perception, self-efficacy, and behavioral intentions towards technological literacy among students at a College of Design in Southern Taiwan. Using non-probability sampling, 42 freshmen students from the Department of Product Design participated in the…
Propagating probability distributions of stand variables using sequential Monte Carlo methods
Jeffrey H. Gove
2009-01-01
A general probabilistic approach to stand yield estimation is developed based on sequential Monte Carlo filters, also known as particle filters. The essential steps in the development of the sampling importance resampling (SIR) particle filter are presented. The SIR filter is then applied to simulated and observed data showing how the 'predictor - corrector'...
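The paper's stand-growth model is not reproduced here; the sketch below shows one generic SIR predictor-corrector cycle (propagate particles through an assumed stochastic growth step, weight by an assumed Gaussian observation likelihood, resample), with invented dynamics and numbers.

```python
import numpy as np

rng = np.random.default_rng(2)

n_particles = 5_000
state = rng.normal(100.0, 10.0, n_particles)     # prior particles (e.g., basal area)

# predictor: propagate through an assumed stochastic growth step
state = 1.05 * state + rng.normal(0.0, 2.0, n_particles)

# corrector: importance weights from an assumed Gaussian observation likelihood
y_obs, obs_sd = 108.0, 3.0
w = np.exp(-0.5 * ((y_obs - state) / obs_sd)**2)
w /= w.sum()

# resample to an equally weighted posterior cloud
state = state[rng.choice(n_particles, n_particles, p=w)]
print(f"posterior mean {state.mean():.1f} +/- {state.std():.1f}")
```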
Surveys of fish community status were conducted in summer 1987 in 49 lakes in Subregion 20, the Upper Peninsula of Michigan, as part of Phase II of the Eastern Lake Survey. Lake selection involved a variable probability sampling design. Fish communities were surveyed using gill n...
Reading Achievements of Vietnamese Grade 5 Pupils
ERIC Educational Resources Information Center
Griffin, Patrick; Thanh, Mai Thi
2006-01-01
This article described a national study in Vietnam whereby a probability sample of students was chosen from each of the 61 provinces. A reading test consisting of 60 items was administered. The items were matched to the Vietnam reading and language curriculum for Year 5 students. Using a skills audit of the items, a variable of reading development…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu Xuebing; Wang Ran; Bian Fuyan
2011-09-15
The identification of quasars in the redshift range 2.2 < z < 3 is known to be very inefficient because the optical colors of such quasars are indistinguishable from those of stars. Recent studies have proposed using optical variability or near-infrared (near-IR) colors to improve the identification of the missing quasars in this redshift range. Here we present a case study combining both methods. We select a sample of 70 quasar candidates from variables in Sloan Digital Sky Survey (SDSS) Stripe 82, which are non-ultraviolet excess sources and have UKIDSS near-IR public data. They are clearly separated into two parts on the Y - K/g - z color-color diagram, and 59 of them meet or lie close to a newly proposed Y - K/g - z selection criterion for z < 4 quasars. Of these 59 sources, 44 were previously identified as quasars in SDSS DR7, and 35 of them are quasars at 2.2 < z < 3. We present spectroscopic observations of 14 of the 15 remaining quasar candidates using the Bok 2.3 m telescope and the MMT 6.5 m telescope, and successfully identify all of them as new quasars at z = 2.36-2.88. We also apply this method to a sample of 643 variable quasar candidates with SDSS-UKIDSS nine-band photometric data selected from 1875 new quasar candidates in SDSS Stripe 82 given by Butler and Bloom based on time-series selections, and find that 188 of them are probably new quasars with photometric redshifts at 2.2 < z < 3. Our results indicate that the combination of optical variability and optical/near-IR colors is probably the most efficient way to find 2.2 < z < 3 quasars and is very helpful for constructing a complete quasar sample. We discuss its implications for ongoing and upcoming large optical and near-IR sky surveys.
Repeat migration and disappointment.
Grant, E K; Vanderkamp, J
1986-01-01
This article investigates the determinants of repeat migration among the 44 regions of Canada, using information from a large micro-database which spans the period 1968 to 1971. The explanation of repeat migration probabilities is a difficult task, and this attempt is only partly successful. Many of the explanatory variables are not significant, and the overall explanatory power of the equations is not high. In the area of personal characteristics, the variables related to age, sex, and marital status are generally significant and with expected signs. The distance variable has a strongly positive effect on onward move probabilities. Variables related to prior migration experience have an important impact that differs between return and onward probabilities. In particular, the occurrence of prior moves has a striking effect on the probability of onward migration. The variable representing disappointment, or relative success of the initial move, plays a significant role in explaining repeat migration probabilities. The disappointment variable represents the ratio of actual versus expected wage income in the year after the initial move, and its effect on both types of repeat migration probability is always negative and almost always highly significant. The repeat probabilities diminish after a year's stay in the destination region, but disappointment in the most recent year still has a bearing on the delayed repeat probabilities. While the quantitative impact of the disappointment variable is not large, it is difficult to draw comparisons since similar estimates are not available elsewhere.
Predicting redox conditions in groundwater at a regional scale
Tesoriero, Anthony J.; Terziotti, Silvia; Abrams, Daniel B.
2015-01-01
Defining the oxic-suboxic interface is often critical for determining pathways for nitrate transport in groundwater and to streams at the local scale. Defining this interface on a regional scale is complicated by the spatial variability of reaction rates. The probability of oxic groundwater in the Chesapeake Bay watershed was predicted by relating dissolved O2 concentrations in groundwater samples to indicators of residence time and/or electron donor availability using logistic regression. Variables that describe surficial geology, position in the flow system, and soil drainage were important predictors of oxic water. The probability of encountering oxic groundwater at a 30 m depth and the depth to the bottom of the oxic layer were predicted for the Chesapeake Bay watershed. The influence of depth to the bottom of the oxic layer on stream nitrate concentrations and time lags (i.e., time period between land application of nitrogen and its effect on streams) are illustrated using model simulations for hypothetical basins. Regional maps of the probability of oxic groundwater should prove useful as indicators of groundwater susceptibility and stream susceptibility to contaminant sources derived from groundwater.
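A minimal sketch of this kind of logistic-regression exercise, using synthetic stand-ins for the study's predictors (depth as a proxy for position in the flow system, plus a soil drainage index); the "true" coefficients below are invented:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

n = 1_000
depth = rng.uniform(5, 60, n)                  # m; proxy for flow-system position
drainage = rng.uniform(0, 1, n)                # soil drainage index (1 = well drained)
logit = 2.0 - 0.08 * depth + 1.5 * drainage    # invented 'true' relationship
oxic = rng.uniform(size=n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression().fit(np.column_stack([depth, drainage]), oxic)

# probability of oxic groundwater at 30 m for a well-drained site
print(model.predict_proba([[30.0, 0.9]])[0, 1])
```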
Bayesian network representing system dynamics in risk analysis of nuclear systems
NASA Astrophysics Data System (ADS)
Varuttamaseni, Athi
2011-12-01
A dynamic Bayesian network (DBN) model is used in conjunction with the alternating conditional expectation (ACE) regression method to analyze the risk associated with the loss of feedwater accident coupled with a subsequent initiation of the feed and bleed operation in the Zion-1 nuclear power plant. The use of the DBN allows the joint probability distribution to be factorized, enabling the analysis to be done on many simpler network structures rather than on one complicated structure. The construction of the DBN model assumes conditional independence relations among certain key reactor parameters. The choice of parameters to model is based on considerations of the macroscopic balance statements governing the behavior of the reactor under a quasi-static assumption. The DBN is used to relate the peak clad temperature to a set of independent variables that are known to be important in determining the success of the feed and bleed operation. A simple linear relationship is then used to relate the clad temperature to the core damage probability. To obtain a quantitative relationship among different nodes in the DBN, surrogates of the RELAP5 reactor transient analysis code are used. These surrogates are generated by applying the ACE algorithm to output data obtained from about 50 RELAP5 cases covering a wide range of the selected independent variables. These surrogates allow important safety parameters such as the fuel clad temperature to be expressed as a function of key reactor parameters such as the coolant temperature and pressure together with important independent variables such as the scram delay time. The time-dependent core damage probability is calculated by sampling the independent variables from their probability distributions and propagating the information up through the Bayesian network to give the clad temperature. With the knowledge of the clad temperature and the assumption that the core damage probability has a one-to-one relationship to it, we have calculated the core damage probability as a function of transient time. The use of the DBN model in combination with ACE allows risk analysis to be performed with much less effort than if the analysis were done using the standard techniques.
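A compressed sketch of the sampling-and-propagation step: draw the independent variables from assumed distributions, push them through a stand-in surrogate for the transient-analysis code, and map clad temperature to damage probability with an assumed linear, clipped relationship. Every functional form and coefficient below is a placeholder, not the dissertation's surrogate.

```python
import numpy as np

rng = np.random.default_rng(4)

def clad_temp_surrogate(scram_delay, coolant_temp, pressure):
    """Placeholder surrogate; form and coefficients are invented."""
    return 600 + 14.0 * scram_delay + 0.6 * (coolant_temp - 550) - 3.0 * (pressure - 15)

n = 100_000
scram_delay = rng.uniform(0, 10, n)      # s
coolant_temp = rng.normal(550, 8, n)     # K
pressure = rng.normal(15, 0.5, n)        # MPa

t_clad = clad_temp_surrogate(scram_delay, coolant_temp, pressure)
p_damage = np.clip((t_clad - 650) / 250, 0.0, 1.0)   # assumed linear map, clipped
print(f"mean core-damage probability: {p_damage.mean():.3f}")
```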
NASA Technical Reports Server (NTRS)
Rood, Richard B.; Douglass, Anne R.; Cerniglia, Mark C.; Sparling, Lynn C.; Nielsen, J. Eric
1999-01-01
We present a study of the distribution of ozone in the lowermost stratosphere with the goal of characterizing the observed variability. The air in the lowermost stratosphere is divided into two population groups based on Ertel's potential vorticity at 300 hPa. High (low) potential vorticity at 300 hPa indicates that the tropopause is low (high), and the identification of these two groups is made to account for the dynamic variability. Conditional probability distribution functions are used to define the statistics of the ozone distribution from both observations and a three-dimensional model simulation using winds from the Goddard Earth Observing System Data Assimilation System for transport. Ozone data sets include ozonesonde observations from northern midlatitude stations (1991-96) and midlatitude observations made by the Halogen Occultation Experiment (HALOE) on the Upper Atmosphere Research Satellite (UARS) (1994- 1998). The conditional probability distribution functions are calculated at a series of potential temperature surfaces spanning the domain from the midlatitude tropopause to surfaces higher than the mean tropical tropopause (approximately 380K). The probability distribution functions are similar for the two data sources, despite differences in horizontal and vertical resolution and spatial and temporal sampling. Comparisons with the model demonstrate that the model maintains a mix of air in the lowermost stratosphere similar to the observations. The model also simulates a realistic annual cycle. Results show that during summer, much of the observed variability is explained by the height of the tropopause. During the winter and spring, when the tropopause fluctuations are larger, less of the variability is explained by tropopause height. This suggests that more mixing occurs during these seasons. During all seasons, there is a transition zone near the tropopause that contains air characteristic of both the troposphere and the stratosphere. The relevance of the results to the assessment of the environmental impact of aircraft effluence is also discussed.
A large-scale study of the random variability of a coding sequence: a study on the CFTR gene.
Modiano, Guido; Bombieri, Cristina; Ciminelli, Bianca Maria; Belpinati, Francesca; Giorgi, Silvia; Georges, Marie des; Scotet, Virginie; Pompei, Fiorenza; Ciccacci, Cinzia; Guittard, Caroline; Audrézet, Marie Pierre; Begnini, Angela; Toepfer, Michael; Macek, Milan; Ferec, Claude; Claustres, Mireille; Pignatti, Pier Franco
2005-02-01
Coding single nucleotide substitutions (cSNSs) have been studied on hundreds of genes using small samples (n(g) approximately 100-150 genes). In the present investigation, a large random European population sample (average n(g) approximately 1500) was studied for a single gene, the CFTR (Cystic Fibrosis Transmembrane conductance Regulator). The nonsynonymous (NS) substitutions exhibited, in accordance with previous reports, a mean probability of being polymorphic (q > 0.005), much lower than that of the synonymous (S) substitutions, but they showed a similar rate of subpolymorphic (q < 0.005) variability. This indicates that, in autosomal genes that may have harmful recessive alleles (nonduplicated genes with important functions), genetic drift overwhelms selection in the subpolymorphic range of variability, making disadvantageous alleles behave as neutral. These results imply that the majority of the subpolymorphic nonsynonymous alleles of these genes are selectively negative or even pathogenic.
Conroy, M.J.; Nichols, J.D.
1984-01-01
Several important questions in evolutionary biology and paleobiology involve sources of variation in extinction rates. In all cases of which we are aware, extinction rates have been estimated from data in which the probability that an observation (e.g., a fossil taxon) will occur is related both to extinction rates and to what we term encounter probabilities. Any statistical method for analyzing fossil data should at a minimum permit separate inferences on these two components. We develop a method for estimating taxonomic extinction rates from stratigraphic range data and for testing hypotheses about variability in these rates. We use this method to estimate extinction rates and to test the hypothesis of constant extinction rates for several sets of stratigraphic range data. The results of our tests support the hypothesis that extinction rates varied over the geologic time periods examined. We also present a test that can be used to identify periods of high or low extinction probabilities and provide an example using Phanerozoic invertebrate data. Extinction rates should be analyzed using stochastic models, in which it is recognized that stratigraphic samples are random variates and that sampling is imperfect.
On Fitting a Multivariate Two-Part Latent Growth Model
Xu, Shu; Blozis, Shelley A.; Vandewater, Elizabeth A.
2017-01-01
A 2-part latent growth model can be used to analyze semicontinuous data to simultaneously study change in the probability that an individual engages in a behavior, and if engaged, change in the behavior. This article uses a Monte Carlo (MC) integration algorithm to study the interrelationships between the growth factors of 2 variables measured longitudinally where each variable can follow a 2-part latent growth model. A SAS macro implementing Mplus is developed to estimate the model to take into account the sampling uncertainty of this simulation-based computational approach. A sample of time-use data is used to show how maximum likelihood estimates can be obtained using a rectangular numerical integration method and an MC integration method. PMID:29333054
p-adic stochastic hidden variable model
NASA Astrophysics Data System (ADS)
Khrennikov, Andrew
1998-03-01
We propose a stochastic hidden variable model in which hidden variables have a p-adic probability distribution ρ(λ) and at the same time the conditional probability distributions P(U,λ), U=A,A',B,B', are ordinary probabilities defined on the basis of the Kolmogorov measure-theoretical axiomatics. The frequency definition of p-adic probability is quite similar to the ordinary frequency definition of probability: p-adic frequency probability is defined as the limit of relative frequencies νn, but in the p-adic metric. We study a model with p-adic stochastics at the level of the hidden variable description. But, of course, responses of macroapparatuses have to be described by ordinary stochastics. Thus our model describes a mixture of p-adic stochastics of the microworld and ordinary stochastics of macroapparatuses. In this model probabilities for physical observables are the ordinary probabilities. At the same time Bell's inequality is violated.
Binomial leap methods for simulating stochastic chemical kinetics.
Tian, Tianhai; Burrage, Kevin
2004-12-01
This paper discusses efficient simulation methods for stochastic chemical kinetics. Based on the tau-leap and midpoint tau-leap methods of Gillespie [D. T. Gillespie, J. Chem. Phys. 115, 1716 (2001)], binomial random variables are used in these leap methods rather than Poisson random variables. The motivation for this approach is to improve the efficiency of the Poisson leap methods by using larger stepsizes. Unlike Poisson random variables whose range of sample values is from zero to infinity, binomial random variables have a finite range of sample values. This probabilistic property has been used to restrict possible reaction numbers and to avoid negative molecular numbers in stochastic simulations when larger stepsize is used. In this approach a binomial random variable is defined for a single reaction channel in order to keep the reaction number of this channel below the numbers of molecules that undergo this reaction channel. A sampling technique is also designed for the total reaction number of a reactant species that undergoes two or more reaction channels. Samples for the total reaction number are not greater than the molecular number of this species. In addition, probability properties of the binomial random variables provide stepsize conditions for restricting reaction numbers in a chosen time interval. These stepsize conditions are important properties of robust leap control strategies. Numerical results indicate that the proposed binomial leap methods can be applied to a wide range of chemical reaction systems with very good accuracy and significant improvement on efficiency over existing approaches. (c) 2004 American Institute of Physics.
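The key construction, a binomial reaction count capped by the limiting reactant so molecule numbers cannot go negative, can be sketched for a single bimolecular channel as follows (rate constant, time step, and initial counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)

def binomial_leap_step(n1, n2, n3, c, tau):
    """One binomial tau-leap for S1 + S2 -> S3 with rate constant c."""
    n_max = min(n1, n2)              # the limiting reactant caps the count
    if n_max == 0:
        return n1, n2, n3
    a = c * n1 * n2                  # propensity of the channel
    p = min(a * tau / n_max, 1.0)    # per-candidate firing probability
    k = rng.binomial(n_max, p)       # mean a*tau, but never exceeds n_max
    return n1 - k, n2 - k, n3 + k

state = (100, 80, 0)
for _ in range(20):
    state = binomial_leap_step(*state, c=0.005, tau=0.1)
print(state)
```

The binomial draw has the same mean (a·tau) as the Poisson draw in the original tau-leap, but its finite support is what permits the larger step sizes discussed in the abstract.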
Doidge, James C
2018-02-01
Population-based cohort studies are invaluable to health research because of the breadth of data collection over time, and the representativeness of their samples. However, they are especially prone to missing data, which can compromise the validity of analyses when data are not missing at random. Having many waves of data collection presents opportunity for participants' responsiveness to be observed over time, which may be informative about missing data mechanisms and thus useful as an auxiliary variable. Modern approaches to handling missing data such as multiple imputation and maximum likelihood can be difficult to implement with the large numbers of auxiliary variables and large amounts of non-monotone missing data that occur in cohort studies. Inverse probability-weighting can be easier to implement but conventional wisdom has stated that it cannot be applied to non-monotone missing data. This paper describes two methods of applying inverse probability-weighting to non-monotone missing data, and explores the potential value of including measures of responsiveness in either inverse probability-weighting or multiple imputation. Simulation studies are used to compare methods and demonstrate that responsiveness in longitudinal studies can be used to mitigate bias induced by missing data, even when data are not missing at random.
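A minimal sketch of inverse probability-weighting with a responsiveness score as the auxiliary predictor; the missingness mechanism and outcome model below are invented so the bias correction can be seen end to end:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)

n = 5_000
responsiveness = rng.beta(4, 2, n)                    # past response propensity
y = rng.normal(2.0 + 3.0 * responsiveness, 1.0)       # outcome tied to responsiveness
observed = rng.uniform(size=n) < 0.2 + 0.7 * responsiveness  # response indicator

# model the response probability from the auxiliary variable, then weight
# complete cases by its inverse
ps = LogisticRegression().fit(responsiveness.reshape(-1, 1), observed)
w = 1.0 / ps.predict_proba(responsiveness[observed].reshape(-1, 1))[:, 1]

print(f"true mean {y.mean():.3f}, complete-case {y[observed].mean():.3f}, "
      f"IPW {np.average(y[observed], weights=w):.3f}")
```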
Gender Roles and Acculturation: Relationships With Cancer Screening Among Vietnamese American Women
Nguyen, Anh B.; Clark, Trenette T.; Belgrave, Faye Z.
2017-01-01
The aim of this study was to examine the influence of demographic variables and the interplay between gender roles and acculturation on breast and cervical cancer screening outcomes among Vietnamese American women. Convenience sampling was used to recruit 100 Vietnamese women from the Richmond, VA, metropolitan area. Women were recruited to participate in a larger cancer screening intervention. All participants completed measures on demographic variables, gender roles, acculturation, and cancer screening variables. Findings indicated that traditional masculine gender roles were associated with increased self-efficacy for breast and cervical cancer screening. Higher levels of acculturation were associated with higher probability of having had a Papanicolaou test. In addition, acculturation moderated the relationship between traditional female gender roles and cancer screening variables. For highly acculturated women, higher levels of feminine gender roles predicted higher probability of having had a previous clinical breast exam and higher levels of self-efficacy for cervical cancer screening, while the opposite was true for lower acculturated women. The findings of this study indicate the important roles that sociodemographic variables, gender roles, and acculturation play in affecting health attitudes and behaviors among Vietnamese women. These findings also help to identify a potentially high-risk subgroup and existing gaps that need to be targeted by preventive interventions. PMID:24491129
NASA Technical Reports Server (NTRS)
Anderson, J. E. (Principal Investigator)
1979-01-01
The net board foot volume (Scribner log rule) of the standing Ponderosa pine timber on the Defiance Unit of the Navajo Nation's forested land was estimated using a multistage forest volume inventory scheme with variable sample selection probabilities. The inventory designed to accomplish this task required that both LANDSAT MSS digital data and aircraft-acquired data be used to locate one-acre ground plots, which were subsequently visited by ground teams conducting detailed tree measurements using an optical dendrometer. The dendrometer measurements were then punched on computer input cards and entered in a computer program developed by the U.S. Forest Service. The resulting individual tree volume estimates were then expanded through the use of a statistically defined equation to produce the volume estimate for the entire area, which includes 192,026 acres and is approximately 44% of the total forested area of the Navajo Nation.
NASA Astrophysics Data System (ADS)
Bailey, John I.; Mateo, Mario; White, Russel J.; Shectman, Stephen A.; Crane, Jeffrey D.
2018-04-01
We present multi-epoch high-dispersion optical spectra obtained with the Michigan/Magellan Fibre System of 126 and 125 Sun-like stars in the young clusters NGC 2516 (141 Myr) and NGC 2422 (73 Myr). We determine stellar properties including radial velocity (RV), Teff, [Fe/H], [α/Fe] and the line-of-sight rotation rate, v_r sin(i), from these spectra. Our median RV precision of 80 m s^-1 on individual epochs that span a temporal baseline of 1.1 yr enables us to investigate membership and stellar binarity, and to search for sub-stellar companions. We determine membership probabilities and RV variability probabilities for our sample along with candidate companion orbital periods for a select subset of stars. In NGC 2516, we identified 81 RV members, 27 spectroscopic binaries (17 previously identified as photometric binaries) and 16 other stars that show significant RV variability after accounting for average stellar jitter at the 74 m s^-1 level. In NGC 2422, we identify 57 members, 11 spectroscopic binaries and three other stars that show significant RV variability after accounting for an average jitter of 138 m s^-1. We use Monte Carlo simulations to verify our stellar jitter measurements, determine the proportion of exoplanets and stellar companions to which we are sensitive, and estimate companion-mass limits for our targets. We also report mean cluster metallicity, velocity and velocity dispersion based on our member targets. We identify 58 non-member stars as RV variables, 24 of which have RV amplitudes that imply stellar or brown-dwarf mass companions. Finally, we note the discovery of a separate RV clustering of stars in our NGC 2422 sample.
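One common way to flag RV variables, not necessarily the paper's exact procedure, is a chi-square test about the weighted mean with stellar jitter added in quadrature to the per-epoch errors; a sketch with invented epochs, using the 74 m s^-1 jitter quoted for NGC 2516:

```python
import numpy as np
from scipy import stats

def prob_constant_rv(rv, rv_err, jitter=0.074):
    """rv, rv_err, jitter in km/s; jitter ~74 m/s as quoted for NGC 2516.
    Small returned values suggest genuine RV variability."""
    sigma2 = rv_err**2 + jitter**2                    # jitter in quadrature
    mean_rv = np.average(rv, weights=1.0 / sigma2)    # weighted mean RV
    chi2 = np.sum((rv - mean_rv)**2 / sigma2)
    return stats.chi2.sf(chi2, df=len(rv) - 1)

rv = np.array([15.21, 15.48, 15.02, 15.60])           # invented epochs, km/s
rv_err = np.array([0.08, 0.08, 0.09, 0.08])
print(f"P(constant) = {prob_constant_rv(rv, rv_err):.4f}")
```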
Stream permanence influences crayfish occupancy and abundance in the Ozark Highlands, USA
Yarra, Allyson N.; Magoulick, Daniel D.
2018-01-01
Crayfish use of intermittent streams is especially important to understand in the face of global climate change. We examined the influence of stream permanence and local habitat on crayfish occupancy and species densities in the Ozark Highlands, USA. We sampled in June and July 2014 and 2015. We used a quantitative kick–seine method to sample crayfish presence and abundance at 20 stream sites with 32 surveys/site in the Upper White River drainage, and we measured associated local environmental variables each year. We modeled site occupancy and detection probabilities with the software PRESENCE, and we used multiple linear regressions to identify relationships between crayfish species densities and environmental variables. Occupancy of all crayfish species was related to stream permanence. Faxonius meeki was found exclusively in intermittent streams, whereas Faxonius neglectus and Faxonius luteus had higher occupancy and detection probability in permanent than in intermittent streams, and Faxonius williamsi was associated with intermittent streams. Estimates of detection probability ranged from 0.56 to 1, which is high relative to values found by other investigators. With the exception of F. williamsi, species densities were largely related to stream permanence rather than local habitat. Species densities did not differ by year, but total crayfish densities were significantly lower in 2015 than 2014. Increased precipitation and discharge in 2015 probably led to the lower crayfish densities observed during this year. Our study demonstrates that crayfish distribution and abundance are strongly influenced by stream permanence. Some species, including those of conservation concern (i.e., F. williamsi, F. meeki), appear dependent on intermittent streams, and conservation efforts should include consideration of intermittent streams as an important component of freshwater biodiversity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mitrani, J
Bayesian networks (BN) are an excellent tool for modeling uncertainties in systems with several interdependent variables. A BN is a directed acyclic graph, and consists of a structure, or the set of directional links between variables that depend on other variables, and conditional probabilities (CP) for each variable. In this project, we apply BNs to understand uncertainties in NIF ignition experiments. One can represent various physical properties of National Ignition Facility (NIF) capsule implosions as variables in a BN. A dataset containing simulations of NIF capsule implosions was provided. The dataset was generated from a radiation hydrodynamics code, and it contained 120 simulations of 16 variables. Relevant knowledge about the physics of NIF capsule implosions and greedy search algorithms were used to search for hypothetical structures for a BN. Our preliminary results found 6 links between variables in the dataset. However, we thought there should have been more links between the dataset variables based on the physics of NIF capsule implosions. Important reasons for the paucity of links are the relatively small size of the dataset, and the sampling of the values for dataset variables. Another factor that might have caused the paucity of links is the fact that in the dataset, 20% of the simulations represented successful fusion and 80% did not (simulations of unsuccessful fusion are useful for measuring certain diagnostics), which skewed the distributions of several variables and possibly reduced the number of links. Nevertheless, by illustrating the interdependencies and conditional probabilities of several parameters and diagnostics, an accurate and complete BN built from an appropriate simulation set would provide uncertainty quantification for NIF capsule implosions.
Crock, J.G.; Severson, R.C.; Gough, L.P.
1992-01-01
Recent investigations on the Kenai Peninsula had two major objectives: (1) to establish elemental baseline concentration ranges for native vegetation and soils; and (2) to determine the sampling density required for preparing stable regional geochemical maps for various elements in native plants and soils. These objectives were accomplished using an unbalanced, nested analysis-of-variance (ANOVA) barbell sampling design. Hylocomium splendens (Hedw.) BSG (feather moss, whole plant), Picea glauca (Moench) Voss (white spruce, twigs and needles), and soil horizons (02 and C) were collected and analyzed for major and trace total element concentrations. Using geometric means and geometric deviations, expected baseline ranges for elements were calculated. Results of the ANOVA show that intensive soil or plant sampling is needed to reliably map the geochemistry of the area, due to large local variability. For example, producing reliable element maps of feather moss using a 50 km cell (at 95% probability) would require sampling densities of from 4 samples per cell for Al, Co, Fe, La, Li, and V, to more than 15 samples per cell for Cu, Pb, Se, and Zn.
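A sketch of the geometric-mean baseline calculation; the convention assumed here (expected 95% range from GM/GSD^2 to GM x GSD^2) and the synthetic concentrations are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(12)

# synthetic element concentrations (e.g., ppm Cu in feather moss)
conc = rng.lognormal(mean=np.log(20.0), sigma=np.log(1.8), size=60)

gm = np.exp(np.log(conc).mean())          # geometric mean
gsd = np.exp(np.log(conc).std(ddof=1))    # geometric standard deviation
print(f"GM = {gm:.1f} ppm, expected 95% baseline range: "
      f"{gm / gsd**2:.1f}-{gm * gsd**2:.1f} ppm")
```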
ERIC Educational Resources Information Center
Gambro, John S.; Switzky, Harvey N.
1999-01-01
Student knowledge about environmental issues related to energy and pollution was analyzed in a national probability sample of high school seniors. Parental level of education, quantity of high school science courses, and gender were all significantly related to students' knowledge levels. Bias in favor of males remained even when the number of…
Legenstein, Robert; Maass, Wolfgang
2014-01-01
It has recently been shown that networks of spiking neurons with noise can emulate simple forms of probabilistic inference through “neural sampling”, i.e., by treating spikes as samples from a probability distribution of network states that is encoded in the network. Deficiencies of the existing model are its reliance on single neurons for sampling from each random variable, and the resulting limitation in representing quickly varying probabilistic information. We show that both deficiencies can be overcome by moving to a biologically more realistic encoding of each salient random variable through the stochastic firing activity of an ensemble of neurons. The resulting model demonstrates that networks of spiking neurons with noise can easily track and carry out basic computational operations on rapidly varying probability distributions, such as the odds of getting rewarded for a specific behavior. We demonstrate the viability of this new approach towards neural coding and computation, which makes use of the inherent parallelism of generic neural circuits, by showing that this model can explain experimentally observed firing activity of cortical neurons for a variety of tasks that require rapid temporal integration of sensory information. PMID:25340749
NASA Astrophysics Data System (ADS)
Treloar, W. J.; Taylor, G. E.; Flenley, J. R.
2004-12-01
This is the first of a series of papers on the theme of automated pollen analysis. The automation of pollen analysis could result in numerous advantages for the reconstruction of past environments, making larger data sets practical and offering objectivity and fine-resolution sampling. There are also applications in apiculture and medicine. Previous work on the classification of pollen using texture measures has been successful with small numbers of pollen taxa. However, as the number of pollen taxa to be identified increases, more features may be required to achieve a successful classification. This paper describes the use of simple geometric measures to augment the texture measures. The feasibility of this new approach is tested using scanning electron microscope (SEM) images of 12 taxa of fresh pollen taken from reference material collected on Henderson Island, Polynesia. Pollen images were captured directly from a SEM connected to a PC. A threshold grey-level was set and binary images were then generated. Pollen edges were then located and the boundaries were traced using a chain coding system. A number of simple geometric variables were calculated directly from the chain code of the pollen and a variable selection procedure was used to choose the optimal subset to be used for classification. The efficiency of these variables was tested using a leave-one-out classification procedure. The system successfully split the original 12 taxa sample into five sub-samples containing no more than six pollen taxa each. The further subdivision of echinate pollen types was then attempted with a subset of four pollen taxa. A set of difference codes was constructed for a range of displacements along the chain code. From these difference codes probability variables were calculated. A variable selection procedure was again used to choose the optimal subset of probabilities that may be used for classification. The efficiency of these variables was again tested using a leave-one-out classification procedure. The proportion of correctly classified pollen ranged from 81% to 100% depending on the subset of variables used. The best set of variables had an overall classification rate averaging about 95%. This is comparable with the classification rates from the earlier texture analysis work for other types of pollen.
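The chain-code feature extraction is not reproducible from the abstract, but the evaluation protocol, leave-one-out classification on simple geometric variables, can be sketched with synthetic features (area, perimeter, and a derived circularity) and an assumed nearest-neighbour classifier:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(11)

# synthetic geometric features for four hypothetical taxa
n_per_taxon, taxa = 40, 4
X, y = [], []
for t in range(taxa):
    area = rng.normal(100 + 15 * t, 8, n_per_taxon)
    perimeter = rng.normal(40 + 5 * t, 3, n_per_taxon)
    circularity = 4 * np.pi * area / perimeter**2   # 1.0 for a perfect circle
    X.append(np.column_stack([area, perimeter, circularity]))
    y.append(np.full(n_per_taxon, t))
X, y = np.vstack(X), np.concatenate(y)

clf = KNeighborsClassifier(n_neighbors=3)
acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.0%}")
```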
Zoonoses action plan Salmonella monitoring programme: an investigation of the sampling protocol.
Snary, E L; Munday, D K; Arnold, M E; Cook, A J C
2010-03-01
The Zoonoses Action Plan (ZAP) Salmonella Programme was established by the British Pig Executive to monitor Salmonella prevalence in quality-assured British pigs at slaughter by testing a sample of pigs with a meat juice enzyme-linked immunosorbent assay for antibodies against group B and C(1) Salmonella. Farms were assigned a ZAP level (1 to 3) depending on the monitored prevalence, and ZAP 2 or 3 farms were required to act to reduce the prevalence. The ultimate goal was to reduce the risk of human salmonellosis attributable to British pork. A mathematical model has been developed to describe the ZAP sampling protocol. Results show that the probability of assigning a farm the correct ZAP level was high, except for farms that had a seroprevalence close to the cutoff points between different ZAP levels. Sensitivity analyses identified that the probability of assigning a farm to the correct ZAP level was dependent on the sensitivity and specificity of the test, the number of batches taken to slaughter each quarter, and the number of samples taken per batch. The variability of the predicted seroprevalence was reduced as the number of batches or samples increased and, away from the cutoff points, the probability of being assigned the correct ZAP level increased as the number of batches or samples increased. In summary, the model described here provided invaluable insight into the ZAP sampling protocol. Further work is required to understand the impact of the program for Salmonella infection in British pig farms and therefore on human health.
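The essence of the sampling model can be sketched as a binomial simulation of a farm's observed quarterly seroprevalence against assumed ZAP cut-offs; the cut-offs, batch counts, and samples per batch below are placeholders, and test sensitivity/specificity are ignored for brevity:

```python
import numpy as np

rng = np.random.default_rng(7)

def p_correct_level(prev, batches=6, per_batch=15,
                    cutoffs=(0.50, 0.75), sims=20_000):
    """prev: the farm's long-run probability that a sampled pig tests
    positive. Returns the probability of assignment to the correct band."""
    n = batches * per_batch
    observed = rng.binomial(n, prev, sims) / n       # quarterly seroprevalence
    true_level = np.digitize(prev, cutoffs)
    return np.mean(np.digitize(observed, cutoffs) == true_level)

for prev in (0.30, 0.48, 0.52, 0.90):
    print(f"prev = {prev:.2f}: P(correct ZAP level) = {p_correct_level(prev):.2f}")
```

As in the abstract, assignment is nearly certain away from the cut-offs and degrades close to them, improving as the number of batches or samples per batch grows.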
The first search for variable stars in the open cluster NGC 6253 and its surrounding field
NASA Astrophysics Data System (ADS)
de Marchi, F.; Poretti, E.; Montalto, M.; Desidera, S.; Piotto, G.
2010-01-01
Aims: This work presents the first high-precision variability survey in the field of the intermediate-age, metal-rich open cluster NGC 6253. Clusters of this type are benchmarks for stellar evolution models. Methods: Continuous photometric monitoring of the cluster and its surrounding field was performed over a time span of ten nights using the Wide Field Imager mounted at the ESO-MPI 2.2 m telescope. High-quality time series, each composed of about 800 data points, were obtained for 250,000 stars using the ISIS and DAOPHOT packages. Candidate members were selected by using the colour-magnitude diagrams and period-luminosity-colour relations. Membership probabilities based on the proper motions were also used. The membership of all the variables discovered within a radius of 8′ from the centre is discussed by comparing the incidence of the classes in the cluster direction and in the surrounding field. Results: We discovered 595 variables and characterized most of them, providing their variability classes, periods, and amplitudes. The sample is complete for short periods: we classified 20 pulsating variables, 225 contact systems, 99 eclipsing systems (22 β Lyr type, 59 β Per type, 18 RS CVn type), and 77 rotational variables. The time baseline hampered the precise characterization of 173 variables with periods longer than 4-5 days. Moreover, we found a cataclysmic system undergoing an outburst of about 2.5 mag. We propose a list of 35 variable stars as probable members of NGC 6253.
Nonrecurrence and Bell-like inequalities
NASA Astrophysics Data System (ADS)
Danforth, Douglas G.
2017-12-01
The general class, Λ, of Bell hidden variables is composed of two subclasses ΛR and ΛN such that ΛR ∪ ΛN = Λ and ΛR ∩ ΛN = {}. The class ΛN is very large and contains random variables whose domain is the continuum, the reals. There are uncountably many reals, and every instance of a real random variable is unique. The probability of two instances being equal is zero, exactly zero. ΛN induces sample independence. All correlations are context dependent, but not in the usual sense. There is no "spooky action at a distance". Random variables belonging to ΛN are independent from one experiment to the next. The existence of the class ΛN makes it impossible to derive any of the standard Bell inequalities used to define quantum entanglement.
Public attitudes toward stuttering in Turkey: probability versus convenience sampling.
Ozdemir, R Sertan; St Louis, Kenneth O; Topbaş, Seyhun
2011-12-01
A Turkish translation of the Public Opinion Survey of Human Attributes-Stuttering (POSHA-S) was used to compare probability versus convenience sampling to measure public attitudes toward stuttering. A convenience sample of adults in Eskişehir, Turkey was compared with two replicates of a school-based, probability cluster sampling scheme. The two replicates of the probability sampling scheme yielded similar demographic samples, both of which were different from the convenience sample. Components of subscores on the POSHA-S were significantly different in more than half of the comparisons between convenience and probability samples, indicating important differences in public attitudes. If POSHA-S users intend to generalize to specific geographic areas, results of this study indicate that probability sampling is a better research strategy than convenience sampling. The reader will be able to: (1) discuss the difference between convenience sampling and probability sampling; (2) describe a school-based probability sampling scheme; and (3) describe differences in POSHA-S results from convenience sampling versus probability sampling. Copyright © 2011 Elsevier Inc. All rights reserved.
Phytoplankton Enumeration and Evaluation Experiments
2009-05-01
Ballast water is taken on in one port and deballasted in another. The entrained organisms are generally discharged along with the ballast water. This process has been identified as a vector for the translocation of non-indigenous species (NIS)... the probability that variable rates of cell survival from the irradiation process impacted NRLKW's ability to prepare samples with accurate numbers...
Thomas B. Lynch; Jean Nkouka; Michael M. Huebschmann; James M. Guldin
2003-01-01
A logistic equation is the basis for a model that predicts the probability of obtaining regeneration at specified densities. The density of regeneration (trees/ha) for which an estimate of probability is desired can be specified by means of independent variables in the model. When estimating parameters, the dependent variable is set to 1 if the regeneration density (...
Fram, Miranda S.; Belitz, Kenneth
2011-01-01
We use data from 1626 groundwater samples collected in California, primarily from public drinking water supply wells, to investigate the distribution of perchlorate in deep groundwater under natural conditions. The wells were sampled for the California Groundwater Ambient Monitoring and Assessment Priority Basin Project. We develop a logistic regression model for predicting probabilities of detecting perchlorate at concentrations greater than multiple threshold concentrations as a function of climate (represented by an aridity index) and potential anthropogenic contributions of perchlorate (quantified as an anthropogenic score, AS). AS is a composite categorical variable including terms for nitrate, pesticides, and volatile organic compounds. Incorporating water-quality parameters in AS permits identification of perturbation of natural occurrence patterns by flushing of natural perchlorate salts from unsaturated zones by irrigation recharge as well as addition of perchlorate from industrial and agricultural sources. The data and model results indicate low concentrations (0.1-0.5 μg/L) of perchlorate occur under natural conditions in groundwater across a wide range of climates, beyond the arid to semiarid climates in which they mostly have been previously reported. The probability of detecting perchlorate at concentrations greater than 0.1 μg/L under natural conditions ranges from 50-70% in semiarid to arid regions of California and the Southwestern United States to 5-15% in the wettest regions sampled (the Northern California coast). The probability of concentrations above 1 μg/L under natural conditions is low (generally <3%).
Reward and uncertainty in exploration programs
NASA Technical Reports Server (NTRS)
Kaufman, G. M.; Bradley, P. G.
1971-01-01
A set of variables which are crucial to the economic outcome of petroleum exploration are discussed. These are treated as random variables; the values they assume indicate the number of successes that occur in a drilling program and determine, for a particular discovery, the unit production cost and net economic return if that reservoir is developed. In specifying the joint probability law for those variables, extreme and probably unrealistic assumptions are made. In particular, the different random variables are assumed to be independently distributed. Using postulated probability functions and specified parameters, values are generated for selected random variables, such as reservoir size. From this set of values the economic magnitudes of interest, net return and unit production cost are computed. This constitutes a single trial, and the procedure is repeated many times. The resulting histograms approximate the probability density functions of the variables which describe the economic outcomes of an exploratory drilling program.
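A minimal version of the described trial loop, with every distribution and economic constant invented: draw the number of successes, draw reservoir sizes independently, compute net return, and repeat so the histogram of outcomes approximates its density:

```python
import numpy as np

rng = np.random.default_rng(8)

n_trials, wells, p_success = 50_000, 20, 0.15
price, cost_per_well = 4.0, 30.0          # $/unit produced, $M per well

net_returns = np.empty(n_trials)
for t in range(n_trials):
    hits = rng.binomial(wells, p_success)                  # successes in the program
    sizes = rng.lognormal(mean=2.5, sigma=1.0, size=hits)  # independent reservoir sizes
    net_returns[t] = price * sizes.sum() - cost_per_well * wells

print(f"mean net return {net_returns.mean():.1f} $M, "
      f"P(loss) = {(net_returns < 0).mean():.2f}")
```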
Robustness-Based Design Optimization Under Data Uncertainty
NASA Technical Reports Server (NTRS)
Zaman, Kais; McDonald, Mark; Mahadevan, Sankaran; Green, Lawrence
2010-01-01
This paper proposes formulations and algorithms for design optimization under both aleatory (i.e., natural or physical variability) and epistemic uncertainty (i.e., imprecise probabilistic information), from the perspective of system robustness. The proposed formulations deal with epistemic uncertainty arising from both sparse and interval data without any assumption about the probability distributions of the random variables. A decoupled approach is proposed in this paper to un-nest the robustness-based design from the analysis of non-design epistemic variables to achieve computational efficiency. The proposed methods are illustrated for the upper stage design problem of a two-stage-to-orbit (TSTO) vehicle, where the information on the random design inputs are only available as sparse point and/or interval data. As collecting more data reduces uncertainty but increases cost, the effect of sample size on the optimality and robustness of the solution is also studied. A method is developed to determine the optimal sample size for sparse point data that leads to the solutions of the design problem that are least sensitive to variations in the input random variables.
NASA Astrophysics Data System (ADS)
Kumar, V.; Nayagum, D.; Thornton, S.; Banwart, S.; Schuhmacher, M.; Lerner, D.
2006-12-01
Characterization of uncertainty associated with groundwater quality models is often of critical importance, as for example in cases where environmental models are employed in risk assessment. Insufficient data, inherent variability and estimation errors of environmental model parameters introduce uncertainty into model predictions. However, uncertainty analysis using conventional methods such as standard Monte Carlo sampling (MCS) may not be efficient, or even suitable, for complex, computationally demanding models involving parametric variability and uncertainty of different natures. General MCS, or variants such as Latin Hypercube Sampling (LHS), treats variability and uncertainty as a single random entity, and the generated samples are treated as crisp values, in effect modeling vagueness as randomness. Also, when the models are used as purely predictive tools, uncertainty and variability lead to the need for assessment of the plausible range of model outputs. An improved systematic variability and uncertainty analysis can provide insight into the level of confidence in model estimates, and can aid in assessing how various possible model estimates should be weighed. The present study introduces Fuzzy Latin Hypercube Sampling (FLHS), a hybrid approach for incorporating cognitive and noncognitive uncertainties. Noncognitive uncertainty, such as physical randomness or statistical uncertainty due to limited information, can be described by its own probability density function (PDF), whereas cognitive uncertainty, such as estimation error, can be described by a membership function for its fuzziness and by confidence intervals through α-cuts. An important property of this theory is its ability to merge inexact generated data of the LHS approach to increase the quality of information. The FLHS technique ensures that the entire range of each variable is sampled with proper incorporation of uncertainty and variability. A fuzzified statistical summary of the model results will produce indices of sensitivity and uncertainty that relate the effects of heterogeneity and uncertainty of input variables to model predictions. The feasibility of the method is demonstrated by assessing uncertainty propagation of parameter values in estimating the contamination level of a drinking water supply well due to transport of dissolved phenolics from a contaminated site in the UK.
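For reference, the stratified-sampling core that FLHS builds on, plain Latin Hypercube Sampling, looks like the sketch below; the fuzzy (α-cut) layer the paper adds on top is not reproduced here, and the input distributions are arbitrary:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)

def lhs(n, dists):
    """Latin Hypercube Sample: one point in each of n equal-probability
    strata per input, with strata visited in shuffled order per column."""
    samples = np.empty((n, len(dists)))
    for j, dist in enumerate(dists):
        u = (rng.permutation(n) + rng.uniform(size=n)) / n
        samples[:, j] = dist.ppf(u)
    return samples

x = lhs(100, [stats.norm(10, 2), stats.lognorm(s=0.4)])
print(x.mean(axis=0))
```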
A multi-source probabilistic hazard assessment of tephra dispersal in the Neapolitan area
NASA Astrophysics Data System (ADS)
Sandri, Laura; Costa, Antonio; Selva, Jacopo; Folch, Arnau; Macedonio, Giovanni; Tonini, Roberto
2015-04-01
In this study we present the results obtained from a long-term Probabilistic Hazard Assessment (PHA) of tephra dispersal in the Neapolitan area. Usual PHA for tephra dispersal needs the definition of eruptive scenarios (usually by grouping eruption sizes and possible vent positions into a limited number of classes) with associated probabilities, a meteorological dataset covering a representative time period, and a tephra dispersal model. PHA then results from combining simulations considering different volcanological and meteorological conditions through weights associated with their specific probability of occurrence. However, volcanological parameters (i.e., erupted mass, eruption column height, eruption duration, bulk granulometry, fraction of aggregates) typically encompass a wide range of values. Because of such natural variability, single representative scenarios or size classes cannot be adequately defined using single values for the volcanological inputs. In the present study, we use a method that accounts for this within-size-class variability in the framework of Event Trees. The variability of each parameter is modeled with specific Probability Density Functions, and meteorological and volcanological input values are chosen by using a stratified sampling method. This procedure allows for quantifying hazard without relying on the definition of scenarios, thus avoiding potential biases introduced by selecting single representative scenarios. Embedding this procedure into the Bayesian Event Tree scheme enables tephra fall PHA and the quantification of its epistemic uncertainties. We have applied this scheme to analyze long-term tephra fall PHA from Vesuvius and Campi Flegrei in a multi-source paradigm. We integrate two tephra dispersal models (the analytical HAZMAP and the numerical FALL3D) into BET_VH. The ECMWF reanalysis dataset is used for exploring different meteorological conditions. The results obtained show that PHA accounting for the whole natural variability is consistent with previous probability maps elaborated for Vesuvius and Campi Flegrei on the basis of single representative scenarios, but shows significant differences. In particular, the area characterized by a 300 kg/m2-load exceedance probability larger than 5%, accounting for the whole range of variability (that is, from small violent strombolian to plinian eruptions), is similar to that displayed in the maps based on the medium-magnitude reference eruption, but it is of a smaller extent. This is due to the relatively higher weight of the small-magnitude eruptions considered in this study, but neglected in the reference scenario maps. On the other hand, in our new maps the area characterized by a 300 kg/m2-load exceedance probability larger than 1% is much larger than that of the medium-magnitude reference eruption, due to the contribution of plinian eruptions at lower probabilities, again neglected in the reference scenario maps.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kelly, Brandon C.; Becker, Andrew C.; Sobolewska, Malgosia
2014-06-10
We present the use of continuous-time autoregressive moving average (CARMA) models as a method for estimating the variability features of a light curve, and in particular its power spectral density (PSD). CARMA models fully account for irregular sampling and measurement errors, making them valuable for quantifying variability, forecasting and interpolating light curves, and variability-based classification. We show that the PSD of a CARMA model can be expressed as a sum of Lorentzian functions, which makes them extremely flexible and able to model a broad range of PSDs. We present the likelihood function for light curves sampled from CARMA processes, placing them on a statistically rigorous foundation, and we present a Bayesian method to infer the probability distribution of the PSD given the measured light curve. Because calculation of the likelihood function scales linearly with the number of data points, CARMA modeling scales to current and future massive time-domain data sets. We conclude by applying our CARMA modeling approach to light curves for an X-ray binary, two active galactic nuclei, a long-period variable star, and an RR Lyrae star in order to illustrate their use, applicability, and interpretation.
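The PSD expression underlying the "sum of Lorentzians" remark is the standard CARMA form, a ratio of polynomials in 2πif evaluated on the imaginary axis; a sketch with arbitrary CARMA(2,1) coefficients:

```python
import numpy as np

def carma_psd(freqs, ar_coefs, ma_coefs, sigma=1.0):
    """PSD of a CARMA process: sigma^2 |B(2*pi*i*f)|^2 / |A(2*pi*i*f)|^2,
    with polynomial coefficients given highest power first
    (numpy.polyval convention)."""
    s = 2j * np.pi * freqs
    num = np.abs(np.polyval(ma_coefs, s))**2
    den = np.abs(np.polyval(ar_coefs, s))**2
    return sigma**2 * num / den

# CARMA(2,1) with two real, negative AR roots: the PSD is the sum of two
# zero-centred Lorentzians of different widths. Coefficients are arbitrary.
freqs = np.logspace(-3, 1, 200)
psd = carma_psd(freqs, ar_coefs=[1.0, 0.8, 0.05], ma_coefs=[0.3, 1.0])
print(psd[:3])
```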
Generating intrinsically disordered protein conformational ensembles from a Markov chain
NASA Astrophysics Data System (ADS)
Cukier, Robert I.
2018-03-01
Intrinsically disordered proteins (IDPs) sample a diverse conformational space. They are important to signaling and regulatory pathways in cells. An entropy penalty must be paid when an IDP becomes ordered upon interaction with another protein or a ligand. Thus, the degree of conformational disorder of an IDP is of interest. We create a dichotomic Markov model that can explore entropic features of an IDP. The Markov condition introduces local (neighbor residues in a protein sequence) rotamer dependences that arise from van der Waals and other chemical constraints. A protein sequence of length N is characterized by its (information) entropy and mutual information, MIMC, the latter providing a measure of the dependence among the random variables describing the rotamer probabilities of the residues that comprise the sequence. For a Markov chain, the MIMC is proportional to the pair mutual information MI which depends on the singlet and pair probabilities of neighbor residue rotamer sampling. All 2^N sequence states are generated, along with their probabilities, and contrasted with the probabilities under the assumption of independent residues. An efficient method to generate realizations of the chain is also provided. The chain entropy, MIMC, and state probabilities provide the ingredients to distinguish different scenarios using the terminologies: MoRF (molecular recognition feature), not-MoRF, and not-IDP. A MoRF corresponds to large entropy and large MIMC (strong dependence among the residues' rotamer sampling), a not-MoRF corresponds to large entropy but small MIMC, and not-IDP corresponds to low entropy irrespective of the MIMC. We show that MoRFs are most appropriate as descriptors of IDPs. They provide a reasonable number of high-population states that reflect the dependences between neighbor residues, thus classifying them as IDPs, yet without very large entropy that might lead to a too high entropy penalty.
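A sketch of the kind of quantities involved, for a two-state (dichotomic) stationary Markov chain: sequence entropy from the singlet and neighbor-pair distributions, and total mutual information taken here as the neighbor-pair MI summed along the chain (an assumption consistent with the Markov structure, not necessarily the paper's exact definition). The transition matrix is arbitrary.

```python
import numpy as np

def chain_entropy_and_mi(T, N):
    """Joint entropy and total neighbor-pair MI (bits) for a stationary
    two-state Markov chain of length N with transition matrix T."""
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    pi = pi / pi.sum()                       # stationary singlet probabilities

    def H(p):
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    pair = pi[:, None] * T                   # joint P(x_i, x_{i+1})
    h_cond = H(pair.ravel()) - H(pi)         # H(x_{i+1} | x_i)
    entropy = H(pi) + (N - 1) * h_cond       # joint entropy of the chain
    mi_total = (N - 1) * (2 * H(pi) - H(pair.ravel()))  # summed pair MI
    return entropy, mi_total

T = np.array([[0.8, 0.2],
              [0.3, 0.7]])
print(chain_entropy_and_mi(T, N=10))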
O’Donnell, Katherine M.; Thompson, Frank R.; Semlitsch, Raymond D.
2015-01-01
Detectability of individual animals is highly variable and nearly always < 1; imperfect detection must be accounted for to reliably estimate population sizes and trends. Hierarchical models can simultaneously estimate abundance and effective detection probability, but there are several different mechanisms that cause variation in detectability. Neglecting temporary emigration can lead to biased population estimates because availability and conditional detection probability are confounded. In this study, we extend previous hierarchical binomial mixture models to account for multiple sources of variation in detectability. The state process of the hierarchical model describes ecological mechanisms that generate spatial and temporal patterns in abundance, while the observation model accounts for the imperfect nature of counting individuals due to temporary emigration and false absences. We illustrate our model’s potential advantages, including the allowance of temporary emigration between sampling periods, with a case study of southern red-backed salamanders Plethodon serratus. We fit our model and a standard binomial mixture model to counts of terrestrial salamanders surveyed at 40 sites during 3–5 surveys each spring and fall 2010–2012. Our models generated similar parameter estimates to standard binomial mixture models. Aspect was the best predictor of salamander abundance in our case study; abundance increased as aspect became more northeasterly. Increased time-since-rainfall strongly decreased salamander surface activity (i.e. availability for sampling), while higher amounts of woody cover objects and rocks increased conditional detection probability (i.e. probability of capture, given an animal is exposed to sampling). By explicitly accounting for both components of detectability, we increased congruence between our statistical modeling and our ecological understanding of the system. We stress the importance of choosing survey locations and protocols that maximize species availability and conditional detection probability to increase population parameter estimate reliability. PMID:25775182
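The hierarchical structure described above is easy to see in simulation. A sketch with illustrative values (not the study's estimates) showing how availability and conditional detection jointly thin the observed counts:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative values: site abundance N_i, availability phi (surface
# activity), conditional detection p.
n_sites, n_surveys = 40, 5
lam, phi, p = 20.0, 0.4, 0.6

N = rng.poisson(lam, size=n_sites)                  # latent abundance per site
available = rng.binomial(N[:, None], phi,
                         size=(n_sites, n_surveys)) # temporary emigration
counts = rng.binomial(available, p)                 # imperfect detection of available animals

# Raw counts confound availability and detection: E[count] = N * phi * p,
# which is why a standard binomial mixture model absorbs phi into p.
print("mean count:", counts.mean(), " N*phi*p:", lam * phi * p)
```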
Modelling the spatial distribution of Fasciola hepatica in dairy cattle in Europe.
Ducheyne, Els; Charlier, Johannes; Vercruysse, Jozef; Rinaldi, Laura; Biggeri, Annibale; Demeler, Janina; Brandt, Christina; De Waal, Theo; Selemetas, Nikolaos; Höglund, Johan; Kaba, Jaroslaw; Kowalczyk, Slawomir J; Hendrickx, Guy
2015-03-26
A harmonized sampling approach in combination with spatial modelling is required to update current knowledge of fasciolosis in dairy cattle in Europe. Within the scope of the EU project GLOWORM, samples from 3,359 randomly selected farms in 849 municipalities in Belgium, Germany, Ireland, Poland and Sweden were collected and their infection status assessed using an indirect bulk tank milk (BTM) enzyme-linked immunosorbent assay (ELISA). Dairy farms were considered exposed when the optical density ratio (ODR) exceeded the 0.3 cut-off. Two ensemble-modelling techniques, Random Forests (RF) and Boosted Regression Trees (BRT), were used to obtain the spatial distribution of the probability of exposure to Fasciola hepatica using remotely sensed environmental variables (1-km spatial resolution) and interpolated values from meteorological stations as predictors. The median ODRs amounted to 0.31, 0.12, 0.54, 0.25 and 0.44 for Belgium, Germany, Ireland, Poland and southern Sweden, respectively. Using the 0.3 threshold, 571 municipalities were categorized as positive and 429 as negative. RF predicted the spatial distribution of exposure with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.83 (0.96 for BRT). Both models identified rainfall and temperature as the most important factors for probability of exposure. Areas of high and low exposure were identified by both models, with BRT better at discriminating between low-probability and high-probability exposure; this model may therefore be more useful in practice. Given a harmonized sampling strategy, it should be possible to generate robust spatial models for fasciolosis in dairy cattle in Europe to be used as input for temporal models and for the detection of deviations in baseline probability. Further research is required for model output in areas outside the eco-climatic range investigated.
NASA Technical Reports Server (NTRS)
Brown, A. M.
1998-01-01
Accounting for the statistical variability of the geometry and materials of structures in analysis has been a topic of considerable research for the last 30 years. The determination of quantifiable measures of the statistical probability of a desired response variable, such as natural frequency, maximum displacement, or stress, to replace experience-based "safety factors" has been a primary goal of these studies. There are, however, several problems associated with their satisfactory application to realistic structures, such as bladed disks in turbomachinery. These include the accurate definition of the input random variables (rv's); the large size of the finite element models frequently used to simulate these structures, which makes even a single deterministic analysis expensive; and accurate generation of the cumulative distribution function (CDF) necessary to obtain the probability of the desired response variables. The research presented here applies a methodology called probabilistic dynamic synthesis (PDS) to solve these problems. The PDS method uses dynamic characteristics of substructures measured from modal test as the input rv's, rather than "primitive" rv's such as material or geometric uncertainties. These dynamic characteristics, which are the free-free eigenvalues, eigenvectors, and residual flexibility (RF), are readily measured, and for many substructures a reasonable sample set of these measurements can be obtained. The statistics for these rv's accurately account for the entire random character of the substructure. Using the RF method of component mode synthesis, these dynamic characteristics are used to generate reduced-size sample models of the substructures, which are then coupled to form system models. These sample models are used to obtain the CDF of the response variable either by applying Monte Carlo simulation or by generating data points for use in the response surface reliability method, which can perform the probabilistic analysis with an order of magnitude less computational effort. Both free- and forced-response analyses have been performed, and the results indicate that, while there is considerable room for improvement, the method produces usable and more representative solutions for the design of realistic structures with a substantial savings in computer time.
A Blind Survey for AGN in the Kepler Field through Optical Variability
NASA Astrophysics Data System (ADS)
Olling, Robert; Shaya, E. J.; Mushotzky, R.
2013-01-01
We present an initial analysis of three quarters of Kepler LLC time series for 400 small galaxies. The Kepler LLC data are sampled about twice per hour and allow us to investigate variability on time scales between about one day and one month. The calibrated Kepler LLC light curves still contain many instrumental effects that cannot be removed in a robust manner. Instead, our analysis relies on the similarity of variability measures in the three independent quarters to decide whether a galaxy shows variability or not. We estimate that roughly 15% of our small galaxies show variability at levels exceeding several parts per thousand (mmag) on timescales of days to weeks. However, this estimate is probably uncertain by a factor of two. Our data are more sensitive, by several factors of ten, than extant data sets.
3D radiation belt diffusion model results using new empirical models of whistler chorus and hiss
NASA Astrophysics Data System (ADS)
Cunningham, G.; Chen, Y.; Henderson, M. G.; Reeves, G. D.; Tu, W.
2012-12-01
3D diffusion codes model the energization, radial transport, and pitch angle scattering due to wave-particle interactions. Diffusion codes are powerful but are limited by the lack of knowledge of the spatial and temporal distribution of the waves that drive the interactions for a specific event. We present results from the 3D DREAM model using diffusion coefficients driven by new, activity-dependent, statistical models of chorus and hiss waves. Most 3D codes parameterize the diffusion coefficients or wave amplitudes as functions of magnetic activity indices like Kp, AE, or Dst. These functional representations produce the average value of the wave intensities for a given level of magnetic activity; however, the variability of the wave population at a given activity level is lost with such a representation. Our 3D code makes use of the full sample distributions contained in a set of empirical wave databases (one database for each wave type, including plasmaspheric hiss and lower- and upper-band chorus) that were recently produced by our team using CRRES and THEMIS observations. The wave databases store the full probability distribution of observed wave intensity binned by AE, MLT, MLAT, and L*. In this presentation, we show results that make use of the wave intensity sample probability distributions for lower-band and upper-band chorus by sampling the distributions stochastically during a representative CRRES-era storm. The sampling of the wave intensity probability distributions produces a collection of possible evolutions of the phase space density, which quantifies the uncertainty in the model predictions caused by the uncertainty of the chorus wave amplitudes for a specific event. A significant issue is the determination of an appropriate model for the spatio-temporal correlations of the wave intensities, since the diffusion coefficients are computed as spatio-temporal averages of the waves over MLT, MLAT, and L*. The spatio-temporal correlations cannot be inferred from the wave databases. In this study we use a temporal correlation of ~1 hour for the sampled wave intensities, informed by the observed autocorrelation in the AE index; a spatial correlation length of ~100 km in the two directions perpendicular to the magnetic field; and a spatial correlation length of 5000 km in the direction parallel to the magnetic field, following the work of Santolik et al. (2003), who used multi-spacecraft measurements from Cluster to quantify the correlation length scales for equatorial chorus. We find that, despite the small correlation length scale for chorus, there remains significant variability in the model outcomes driven by variability in the chorus wave intensities.
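A sketch of stochastic sampling of wave intensities with an approximately one-hour temporal correlation, using an AR(1) Gaussian copula; the lognormal quantile function is a placeholder for one bin of the empirical wave database, and the copula construction is an assumption, not necessarily the authors' scheme.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

def sample_wave_intensity(quantile_fn, n_steps, dt_hours=0.1, tau_hours=1.0):
    """Draws from an empirical wave-intensity distribution with ~tau_hours
    AR(1) temporal correlation, via a correlated Gaussian copula. quantile_fn
    is the inverse CDF of the intensity distribution for one activity bin."""
    rho = np.exp(-dt_hours / tau_hours)      # lag-1 correlation for step dt
    z = np.empty(n_steps)
    z[0] = rng.standard_normal()
    for k in range(1, n_steps):
        z[k] = rho * z[k - 1] + np.sqrt(1.0 - rho**2) * rng.standard_normal()
    return quantile_fn(norm.cdf(z))

# Placeholder lognormal inverse CDF standing in for one (AE, MLT, MLAT, L*) bin.
intensity = sample_wave_intensity(lambda u: 10.0 ** (1.0 + 0.5 * norm.ppf(u)), 1000)
```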
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carson, K.S.
The presence of overpopulation or unsustainable population growth may place pressure on the food and water supplies of countries in sensitive areas of the world. Severe air or water pollution may place additional pressure on these resources. These pressures may generate both internal and international conflict in these areas as nations struggle to provide for their citizens. Such conflicts may result in United States intervention, either unilaterally or through the United Nations. Therefore, it is in the interests of the United States to identify potential areas of conflict in order to properly train and allocate forces. The purpose of this research is to forecast the probability of conflict in a nation as a function of its environmental conditions. Probit, logit, and ordered probit models are employed to forecast the probability of a given level of conflict. Data from 95 countries are used to estimate the models. Probability forecasts are generated for these 95 nations. Out-of-sample forecasts are generated for an additional 22 nations. These probabilities are then used to rank nations from highest probability of conflict to lowest. The results indicate that the dependence of a nation's economy on agriculture, the rate of deforestation, and the population density are important variables in forecasting the probability and level of conflict. These results indicate that environmental variables do play a role in generating or exacerbating conflict. It is unclear that the United States military has any direct role in mitigating the environmental conditions that may generate conflict. A more important role for the military is to aid in data gathering to generate better forecasts so that troops are adequately prepared when conflict arises.
Counihan, T.D.; Miller, Allen I.; Parsley, M.J.
1999-01-01
The development of recruitment monitoring programs for age-0 white sturgeons Acipenser transmontanus is complicated by the statistical properties of catch-per-unit-effort (CPUE) data. We found that age-0 CPUE distributions from bottom trawl surveys violated assumptions of statistical procedures based on normal probability theory. Further, no single data transformation uniformly satisfied these assumptions because CPUE distribution properties varied with the sample mean CPUE. Given these analytic problems, we propose that an additional index of age-0 white sturgeon relative abundance, the proportion of positive tows (Ep), be used to estimate sample sizes before conducting age-0 recruitment surveys and to evaluate statistical hypothesis tests comparing the relative abundance of age-0 white sturgeons among years. Monte Carlo simulations indicated that Ep was consistently more precise than the mean CPUE, and because Ep is binomially rather than normally distributed, surveys can be planned and analyzed without violating the assumptions of procedures based on normal probability theory. However, we show that Ep may underestimate changes in relative abundance at high levels and confound our ability to quantify responses to management actions if relative abundance is consistently high. If data suggest that most samples will contain age-0 white sturgeons, estimators of relative abundance other than Ep should be considered. Because Ep may also obscure correlations to climatic and hydrologic variables if high abundance levels are present in time series data, we recommend the mean CPUE be used to describe relations to environmental variables. The use of both Ep and the mean CPUE will facilitate the evaluation of hypothesis tests comparing relative abundance levels and correlations to variables affecting age-0 recruitment. Estimated sample sizes for surveys should therefore be based on detecting predetermined differences in Ep, but the data necessary to calculate the mean CPUE should also be collected.
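A small Monte Carlo comparison in the spirit of the simulations described above, assuming negative-binomial catches with illustrative parameters (not the sturgeon survey data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative overdispersed catch distribution: variance >> mean.
n_tows, n_sims = 50, 5000
mean_catch, k = 2.0, 0.5
p_nb = k / (k + mean_catch)

catches = rng.negative_binomial(k, p_nb, size=(n_sims, n_tows))
mean_cpue = catches.mean(axis=1)   # sample mean CPUE per simulated survey
ep = (catches > 0).mean(axis=1)    # proportion of positive tows

# Relative precision (coefficient of variation) of each index across surveys.
print("CV of mean CPUE:", mean_cpue.std() / mean_cpue.mean())
print("CV of Ep:       ", ep.std() / ep.mean())
```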
New approaches for sampling and modeling native and exotic plant species richness
Chong, G.W.; Reich, R.M.; Kalkhan, M.A.; Stohlgren, T.J.
2001-01-01
We demonstrate new multi-phase, multi-scale approaches for sampling and modeling native and exotic plant species to predict the spread of invasive species and aid in control efforts. Our test site is a 54,000-ha portion of Rocky Mountain National Park, Colorado, USA. This work is based on previous research wherein we developed vegetation sampling techniques to identify hot spots of diversity, important rare habitats, and locations of invasive plant species. Here we demonstrate statistical modeling tools to rapidly assess current patterns of native and exotic plant species to determine which habitats are most vulnerable to invasion by exotic species. We use stepwise multiple regression and modified residual kriging to estimate numbers of native species and exotic species, as well as probability of observing an exotic species in 30 × 30-m cells. Final models accounted for 62% of the variability observed in number of native species, 51% of the variability observed in number of exotic species, and 47% of the variability associated with observing an exotic species. Important independent variables used in developing the models include geographical location, elevation, slope, aspect, and Landsat TM bands 1-7. These models can direct resource managers to areas in need of further inventory, monitoring, and exotic species control efforts.
Seok, Junhee; Seon Kang, Yeong
2015-01-01
Mutual information, a general measure of the relatedness between two random variables, has been actively used in the analysis of biomedical data. The mutual information between two discrete variables is conventionally calculated by their joint probabilities estimated from the frequency of observed samples in each combination of variable categories. However, this conventional approach is no longer efficient for discrete variables with many categories, which can easily be found in large-scale biomedical data such as diagnosis codes, drug compounds, and genotypes. Here, we propose a method to provide stable estimations for the mutual information between discrete variables with many categories. Simulation studies showed that the proposed method reduced the estimation errors 45-fold and improved the correlation coefficients with true values 99-fold, compared with the conventional calculation of mutual information. The proposed method was also demonstrated through a case study of diagnostic data in electronic health records. This method is expected to be useful in the analysis of various biomedical data with discrete variables. PMID:26046461
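For reference, a sketch of the conventional plug-in estimator that the proposed method improves upon; the many-category toy data are hypothetical.

```python
import numpy as np

def plugin_mutual_information(x, y):
    """Conventional plug-in MI estimate (bits) from the joint frequencies of
    two discrete variables; this is the baseline estimator whose error grows
    with the number of categories."""
    x_vals, xi = np.unique(x, return_inverse=True)
    y_vals, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# Many-category toy example: small-sample estimates are badly biased upward.
rng = np.random.default_rng(3)
x = rng.integers(0, 200, size=500)
y = rng.integers(0, 200, size=500)   # independent, so true MI = 0
print("plug-in MI (should be ~0):", plugin_mutual_information(x, y))
```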
Rare Event Simulation in Radiation Transport
NASA Astrophysics Data System (ADS)
Kollman, Craig
This dissertation studies methods for estimating extremely small probabilities by Monte Carlo simulation. Problems in radiation transport typically involve estimating very rare events or the expected value of a random variable which is with overwhelming probability equal to zero. These problems often have high dimensional state spaces and irregular geometries so that analytic solutions are not possible. Monte Carlo simulation must be used to estimate the radiation dosage being transported to a particular location. If the area is well shielded the probability of any one particular particle getting through is very small. Because of the large number of particles involved, even a tiny fraction penetrating the shield may represent an unacceptable level of radiation. It therefore becomes critical to be able to accurately estimate this extremely small probability. Importance sampling is a well known technique for improving the efficiency of rare event calculations. Here, a new set of probabilities is used in the simulation runs. The results are multiplied by the likelihood ratio between the true and simulated probabilities so as to keep our estimator unbiased. The variance of the resulting estimator is very sensitive to which new set of transition probabilities are chosen. It is shown that a zero variance estimator does exist, but that its computation requires exact knowledge of the solution. A simple random walk with an associated killing model for the scatter of neutrons is introduced. Large deviation results for optimal importance sampling in random walks are extended to the case where killing is present. An adaptive "learning" algorithm for implementing importance sampling is given for more general Markov chain models of neutron scatter. For finite state spaces this algorithm is shown to give, with probability one, a sequence of estimates converging exponentially fast to the true solution. In the final chapter, an attempt to generalize this algorithm to a continuous state space is made. This involves partitioning the space into a finite number of cells. There is a tradeoff between additional computation per iteration and variance reduction per iteration that arises in determining the optimal grid size. All versions of this algorithm can be thought of as a compromise between deterministic and Monte Carlo methods, capturing advantages of both techniques.
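A minimal sketch of the unbiased likelihood-ratio construction for a Gaussian tail probability, with a mean-shifted instrumental density; shifting the sampling density to the threshold is a common textbook heuristic, not the dissertation's adaptive algorithm.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)

# Importance sampling for the rare tail probability P(X > t), X ~ N(0, 1).
t, n = 5.0, 100_000
y = rng.normal(loc=t, size=n)                  # draws from the shifted density
weights = norm.pdf(y) / norm.pdf(y, loc=t)     # likelihood ratio keeps the estimator unbiased
estimate = np.mean((y > t) * weights)
print(f"IS estimate: {estimate:.3e}  exact: {norm.sf(t):.3e}")
# Plain Monte Carlo with n = 1e5 would almost surely see zero exceedances.
```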
López, Iago; Alvarez, César; Gil, José L; Revilla, José A
2012-11-30
Data on the 95th and 90th percentiles of bacteriological quality indicators are used to classify bathing waters in Europe, according to the requirements of Directive 2006/7/EC. However, percentile values and consequently, classification of bathing waters depend both on sampling effort and sample-size, which may undermine an appropriate assessment of bathing water classification. To analyse the influence of sampling effort and sample size on water classification, a bootstrap approach was applied to 55 bacteriological quality datasets of several beaches in the Balearic Islands (Spain). Our results show that the probability of failing the regulatory standards of the Directive is high when sample size is low, due to a higher variability in percentile values. In this way, 49% of the bathing waters reaching an "Excellent" classification (95th percentile of Escherichia coli under 250 cfu/100 ml) can fail the "Excellent" regulatory standard due to sampling strategy, when 23 samples per season are considered. This percentage increases to 81% when 4 samples per season are considered. "Good" regulatory standards can also be failed in bathing waters with an "Excellent" classification as a result of these sampling strategies. The variability in percentile values may affect bathing water classification and is critical for the appropriate design and implementation of bathing water Quality Monitoring and Assessment Programs. Hence, variability of percentile values should be taken into account by authorities if an adequate management of these areas is to be achieved. Copyright © 2012 Elsevier Ltd. All rights reserved.
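A sketch of the bootstrap approach, assuming a lognormal bacteriological record and the 250 cfu/100 ml 'Excellent' E. coli standard; the record itself is simulated, not one of the Balearic datasets.

```python
import numpy as np

rng = np.random.default_rng(11)

def prob_fail_excellent(log10_counts, sample_size, threshold=250.0, n_boot=5000):
    """Bootstrap probability that a season of `sample_size` samples yields a
    95th percentile above the 'Excellent' standard, assuming the full record
    characterizes the true water quality."""
    boots = rng.choice(log10_counts, size=(n_boot, sample_size), replace=True)
    p95 = 10.0 ** np.percentile(boots, 95, axis=1)
    return (p95 > threshold).mean()

# Illustrative lognormal bacteriological record for one beach (log10 cfu/100 ml).
record = rng.normal(loc=1.5, scale=0.5, size=200)
for n in (4, 23):
    print(f"n={n}: P(fail 'Excellent') = {prob_fail_excellent(record, n):.2f}")
```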
Advanced reliability methods for structural evaluation
NASA Technical Reports Server (NTRS)
Wirsching, P. H.; Wu, Y.-T.
1985-01-01
Fast probability integration (FPI) methods, which can yield approximate solutions to such general structural reliability problems as the computation of the probabilities of complicated functions of random variables, are known to require one-tenth the computer time of Monte Carlo methods for a probability level of 0.001; lower probabilities yield even more dramatic differences. A strategy is presented in which a computer routine is run k times with selected perturbed values of the variables to obtain k solutions for a response variable Y. An approximating polynomial is fit to the k 'data' sets, and FPI methods are employed for this explicit form.
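A sketch of the run-k-times strategy with a polynomial surrogate; the "expensive" model, its perturbation scheme, and the failure threshold are hypothetical, and plain Monte Carlo on the cheap surrogate stands in for the FPI step.

```python
import numpy as np

rng = np.random.default_rng(9)

def expensive_model(x):
    """Stand-in for the k structural-analysis runs (e.g., a finite element solve)."""
    return 3.0 + 1.5 * x[..., 0] - 0.8 * x[..., 1] + 0.3 * x[..., 0] * x[..., 1]

# Run the routine k times at perturbed input values, then fit a polynomial
# to the k "data" sets.
k = 9
X = rng.normal(size=(k, 2))                  # perturbed values of two rv's
Y = expensive_model(X)

# Design matrix for the approximating polynomial: [1, x1, x2, x1*x2].
A = np.column_stack([np.ones(k), X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)

# Probability of exceeding a response threshold, evaluated on the surrogate.
Xmc = rng.normal(size=(1_000_000, 2))
Ymc = np.column_stack([np.ones(len(Xmc)), Xmc[:, 0], Xmc[:, 1],
                       Xmc[:, 0] * Xmc[:, 1]]) @ coef
print("P(Y > 6) ~", np.mean(Ymc > 6.0))
```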
The late Neandertal supraorbital fossils from Vindija Cave, Croatia: a biased sample?
Ahern, James C M; Lee, Sang-Hee; Hawks, John D
2002-09-01
The late Neandertal sample from Vindija (Croatia) has been described as transitional between the earlier Central European Neandertals from Krapina (Croatia) and modern humans. However, the morphological differences indicating this transition may rather be the result of different sex and/or age compositions of the samples. This study tests the hypothesis that the metric differences between the Krapina and Vindija supraorbital samples are due to sampling bias. We focus upon the supraorbital region because past studies have posited this region as particularly indicative of the Vindija sample's transitional nature. Furthermore, the supraorbital region varies significantly with both age and sex. We analyzed four chords and two derived indices of supraorbital torus form as defined by Smith & Ranyard (1980, Am. J. Phys. Anthrop. 93, pp. 589-610). For each variable, we analyzed relative sample bias of the Krapina and Vindija samples using three sampling methods. In order to test the hypothesis that the Vindija sample contains an over-representation of females and/or young while the Krapina sample is normal or also female/young biased, we determined the probability of drawing a sample of the same size as and with a mean equal to or less than Vindija's from a Krapina-based population. In order to test the hypothesis that the Vindija sample is female/young biased while the Krapina sample is male/old biased, we determined the probability of drawing a sample of the same size as and with a mean equal to or less than Vindija's from a generated population whose mean is halfway between Krapina's and Vindija's. Finally, in order to test the hypothesis that the Vindija sample is normal while the Krapina sample contains an over-representation of males and/or old, we determined the probability of drawing a sample of the same size as and with a mean equal to or greater than Krapina's from a Vindija-based population. Unless we assume that the Vindija sample is female/young biased and the Krapina sample is male/old biased, our results falsify the hypothesis that the metric differences between the Krapina and Vindija samples are due to sample bias.
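The first of the three sampling tests can be written as a short resampling routine; the population and sample values below are illustrative, not the Krapina or Vindija measurements.

```python
import numpy as np

rng = np.random.default_rng(2)

def p_sample_mean_leq(population, n, observed_mean, n_draws=100_000):
    """Probability of drawing a sample of size n from `population` whose
    mean is <= observed_mean (the first sampling test described above)."""
    draws = rng.choice(population, size=(n_draws, n), replace=True)
    return float((draws.mean(axis=1) <= observed_mean).mean())

# Illustrative values only: a 'Krapina-based' torus-chord population versus a
# small 'Vindija-like' sample mean.
krapina_like = rng.normal(loc=14.0, scale=2.0, size=1_000)
print(p_sample_mean_leq(krapina_like, n=6, observed_mean=12.0))
```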
Review of Literature on Probability of Detection for Liquid Penetrant Nondestructive Testing
2011-11-01
increased maintenance costs, or catastrophic failure of safety-critical structure. Knowledge of the reliability achieved by NDT methods, including ... representative components to gather data for statistical analysis, which can be prohibitively expensive. To account for sampling variability inherent in any ... Sioux City and Pensacola. (Those recommendations were discussed in Section 3.4.) Drury et al. report on a factorial experiment aimed at identifying the
A stochastic diffusion process for Lochner's generalized Dirichlet distribution
Bakosi, J.; Ristorcelli, J. R.
2013-10-01
The method of potential solutions of Fokker-Planck equations is used to develop a transport equation for the joint probability of N stochastic variables with Lochner's generalized Dirichlet distribution as its asymptotic solution. Individual samples of a discrete ensemble, obtained from the system of stochastic differential equations equivalent to the Fokker-Planck equation developed here, satisfy a unit-sum constraint at all times and ensure a bounded sample space, similarly to the process developed previously for the Dirichlet distribution. Consequently, the generalized Dirichlet diffusion process may be used to represent realizations of a fluctuating ensemble of N variables subject to a conservation principle. Compared to the Dirichlet distribution and process, the additional parameters of the generalized Dirichlet distribution allow a more general class of physical processes to be modeled with a more general covariance matrix.
Norton, Aaron T.; Allen, Thomas J.; Sims, Charles L.
2010-01-01
Using data from a US national probability sample of self-identified lesbian, gay, and bisexual adults (N = 662), this article reports population parameter estimates for a variety of demographic, psychological, and social variables. Special emphasis is given to information with relevance to public policy and law. Compared with the US adult population, respondents were younger, more highly educated, and less likely to be non-Hispanic White, but differences were observed between gender and sexual orientation groups on all of these variables. Overall, respondents tended to be politically liberal, not highly religious, and supportive of marriage equality for same-sex couples. Women were more likely than men to be in a committed relationship. Virtually all coupled gay men and lesbians had a same-sex partner, whereas the vast majority of coupled bisexuals were in a heterosexual relationship. Compared with bisexuals, gay men and lesbians reported stronger commitment to a sexual-minority identity, greater community identification and involvement, and more extensive disclosure of their sexual orientation to others. Most respondents reported experiencing little or no choice about their sexual orientation. The importance of distinguishing among lesbians, gay men, bisexual women, and bisexual men in behavioral and social research is discussed. PMID:20835383
Olivatti, A M; Boni, T A; Silva-Júnior, N J; Resende, L V; Gouveia, F O; Telles, M P C
2011-01-01
Leporinus friderici, native to the Amazon Basin and popularly known as "piau-três-pintas", has great ecological and economic importance; it is widely fished and consumed throughout much of tropical South America. Knowledge of the genetic diversity of this native species is important to support management and conservation programs. We evaluated microsatellite loci amplification, using heterologous primers, in 31 individuals of L. friderici. These samples were collected from natural populations of the Araguaia River basin, in central Brazil, and the DNA was extracted from samples of muscle tissue. Eight loci were successfully analyzed. Six of them were polymorphic, and the number of alleles ranged from three to 10. Values of expected heterozygosities for these polymorphic loci ranged from 0.488 to 0.795. Exclusion probability (0.983), the identity probability (0.000073), and the mean genetic diversity values were high, showing that these microsatellite markers are suitable for assessing the genetic variability of L. friderici populations. There is a growing interest in studies that evaluate the genetic variability of natural populations for various purposes, such as conservation. Here, we showed that a viable alternative to the costly development of specific primers for fish populations is simply testing for heterologous amplification of microsatellite markers available from research on other species.
Bootstrap investigation of the stability of a Cox regression model.
Altman, D G; Andersen, P K
1989-07-01
We describe a bootstrap investigation of the stability of a Cox proportional hazards regression model resulting from the analysis of a clinical trial of azathioprine versus placebo in patients with primary biliary cirrhosis. We have considered stability to refer both to the choice of variables included in the model and, more importantly, to the predictive ability of the model. In stepwise Cox regression analyses of 100 bootstrap samples using 17 candidate variables, the most frequently selected variables were those selected in the original analysis, and no other important variable was identified. Thus there was no reason to doubt the model obtained in the original analysis. For each patient in the trial, bootstrap confidence intervals were constructed for the estimated probability of surviving two years. It is shown graphically that these intervals are markedly wider than those obtained from the original model.
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
NASA Astrophysics Data System (ADS)
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model represents the relationship between independent variables and a dependent variable. In logistic regression the dependent variable is categorical, and the model is used to calculate odds; when the categories of the dependent variable are ordered, the model is an ordinal logistic regression. The GWOLR model is an ordinal logistic regression model influenced by the geographical location of the observation site. Parameter estimation is needed to determine population values on the basis of a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The results give a local GWOLR model for each village and the probability of each category of the number of dengue fever patients.
NASA Astrophysics Data System (ADS)
Sarker, Subrata; Lemke, Peter; Wiltshire, Karen H.
2018-05-01
Explaining species diversity as a function of ecosystem variability is a long-standing discussion in community-ecology research. Here, we aimed to establish a causal relationship between ecosystem variability and phytoplankton diversity in a shallow-sea ecosystem. We used long-term data on biotic and abiotic factors from Helgoland Roads, along with climate data, to assess the effect of ecosystem variability on phytoplankton diversity. A point cumulative semi-variogram method was used to estimate the long-term ecosystem variability. A Markov chain model was used to estimate the dynamical processes of species, i.e., the probabilities of a species occurring, being absent, and being outcompeted. We identified the 1980s as a period of high ecosystem variability, while the last two decades were comparatively less variable. Ecosystem variability was found to be an important predictor of phytoplankton diversity at Helgoland Roads. High diversity was related to low ecosystem variability because of a non-significant relationship between the probabilities of a species occurring and being absent, a significant negative relationship between the probability of a species occurring and the probability of its being outcompeted by others, and high species occurrence at low ecosystem variability. Using an exceptional marine long-term data set, this study established a causal relationship between ecosystem variability and phytoplankton diversity.
Flower and fruit production of understory shrubs in western Washington and Oregon
Wender, B.; Harrington, C.; Tappeiner, J. C.
2004-01-01
We observed flower and fruit production for nine understory shrub species in western Washington and Oregon and examined the relationships between shrub reproductive output and plant size, plant age, site factors, and overstory density to determine the factors that control flowering or fruiting in understory shrubs. In Washington, 50 or more shrubs or microplots (for rhizomatous species) were sampled for each of eight species. The variables examined were more useful for explaining abundance of flowers or fruit on shrubs than they were for explaining the probability that a shrub would produce flowers or fruit. Plant size was consistently the most useful predictor of flower/fruit abundance in all species; plant age was also a good predictor of abundance and was strongly correlated with plant size. Site variables (e.g., slope) and overstory competition variables (e.g., presence/absence of a canopy gap) also helped explain flower/fruit abundance for some species. At two Oregon sites, the responses of five species to four levels of thinning were observed for 2-4 yr (15 shrubs or microplots per treatment per year). Thinning increased the probability and abundance of flowering/fruiting for two species, had no effect on one species, and responses for two other species were positive but inconsistent between sites or from year to year. We believe reducing overstory density or creating canopy gaps may be useful tools for enhancing shrub size and vigor, thus, increasing the probability and abundance of fruiting in some understory shrub species.
Rain attenuation measurements: Variability and data quality assessment
NASA Technical Reports Server (NTRS)
Crane, Robert K.
1989-01-01
Year to year variations in the cumulative distributions of rain rate or rain attenuation are evident in any of the published measurements for a single propagation path that span a period of several years of observation. These variations must be described by models for the prediction of rain attenuation statistics. Now that a large measurement database has been assembled by the International Radio Consultative Committee, the information needed to assess variability is available. On the basis of 252 sample cumulative distribution functions for the occurrence of attenuation by rain, the expected year to year variation in attenuation at a fixed probability level in the 0.1 to 0.001 percent of a year range is estimated to be 27 percent. The expected deviation from an attenuation model prediction for a single year of observations is estimated to exceed 33 percent when any of the available global rain climate models is employed to estimate the rain rate statistics. The probability distribution for the variation in attenuation or rain rate at a fixed fraction of a year is lognormal. The lognormal behavior of the variate was used to compile the statistics for variability.
Breslow, Norman E.; Lumley, Thomas; Ballantyne, Christie M; Chambless, Lloyd E.; Kulich, Michal
2009-01-01
The case-cohort study involves two-phase sampling: simple random sampling from an infinite super-population at phase one and stratified random sampling from a finite cohort at phase two. Standard analyses of case-cohort data involve solution of inverse probability weighted (IPW) estimating equations, with weights determined by the known phase two sampling fractions. The variance of parameter estimates in (semi)parametric models, including the Cox model, is the sum of two terms: (i) the model based variance of the usual estimates that would be calculated if full data were available for the entire cohort; and (ii) the design based variance from IPW estimation of the unknown cohort total of the efficient influence function (IF) contributions. This second variance component may be reduced by adjusting the sampling weights, either by calibration to known cohort totals of auxiliary variables correlated with the IF contributions or by their estimation using these same auxiliary variables. Both adjustment methods are implemented in the R survey package. We derive the limit laws of coefficients estimated using adjusted weights. The asymptotic results suggest practical methods for construction of auxiliary variables that are evaluated by simulation of case-cohort samples from the National Wilms Tumor Study and by log-linear modeling of case-cohort data from the Atherosclerosis Risk in Communities Study. Although not semiparametric efficient, estimators based on adjusted weights may come close to achieving full efficiency within the class of augmented IPW estimators. PMID:20174455
Spahr, N.E.; Boulger, R.W.
1997-01-01
Quality-control samples provide part of the information needed to estimate the bias and variability that result from sample collection, processing, and analysis. Quality-control samples of surface water collected for the Upper Colorado River National Water-Quality Assessment study unit for water years 1995-96 are presented and analyzed in this report. The types of quality-control samples collected include pre-processing split replicates, concurrent replicates, sequential replicates, post-processing split replicates, and field blanks. Analysis of the pre-processing split replicates, concurrent replicates, sequential replicates, and post-processing split replicates is based on differences between analytical results of the environmental samples and analytical results of the quality-control samples. Results of these comparisons indicate that variability introduced by sample collection, processing, and handling is low and will not affect interpretation of the environmental data. The differences for most water-quality constituents are on the order of plus or minus 1 or 2 lowest rounding units. A lowest rounding unit is equivalent to the magnitude of the least significant figure reported for analytical results. The use of lowest rounding units avoids some of the difficulty in comparing differences between pairs of samples when concentrations span orders of magnitude and provides a measure of the practical significance of the effect of variability. Analysis of field-blank quality-control samples indicates that, with the exception of chloride and silica, no systematic contamination of samples is apparent. Chloride contamination probably was the result of incomplete rinsing of the dilute cleaning solution from the outlet ports of the decaport sample splitter. Silica contamination seems to have been introduced by the blank water. Sampling and processing procedures for water year 1997 have been modified as a result of these analyses.
Probability sampling in legal cases: Kansas cellphone users
NASA Astrophysics Data System (ADS)
Kadane, Joseph B.
2012-10-01
Probability sampling is a standard statistical technique. This article introduces the basic ideas of probability sampling, and shows in detail how probability sampling was used in a particular legal case.
Cramer, Emily
2016-01-01
Abstract Hospital performance reports often include rankings of unit pressure ulcer rates. Differentiating among units on the basis of quality requires reliable measurement. Our objectives were to describe and apply methods for assessing reliability of hospital‐acquired pressure ulcer rates and evaluate a standard signal‐noise reliability measure as an indicator of precision of differentiation among units. Quarterly pressure ulcer data from 8,199 critical care, step‐down, medical, surgical, and medical‐surgical nursing units from 1,299 US hospitals were analyzed. Using beta‐binomial models, we estimated between‐unit variability (signal) and within‐unit variability (noise) in annual unit pressure ulcer rates. Signal‐noise reliability was computed as the ratio of between‐unit variability to the total of between‐ and within‐unit variability. To assess precision of differentiation among units based on ranked pressure ulcer rates, we simulated data to estimate the probabilities of a unit's observed pressure ulcer rate rank in a given sample falling within five and ten percentiles of its true rank, and the probabilities of units with ulcer rates in the highest quartile and highest decile being identified as such. We assessed the signal‐noise measure as an indicator of differentiation precision by computing its correlations with these probabilities. Pressure ulcer rates based on a single year of quarterly or weekly prevalence surveys were too susceptible to noise to allow for precise differentiation among units, and signal‐noise reliability was a poor indicator of precision of differentiation. To ensure precise differentiation on the basis of true differences, alternative methods of assessing reliability should be applied to measures purported to differentiate among providers or units based on quality. © 2016 The Authors. Research in Nursing & Health published by Wiley Periodicals, Inc. PMID:27223598
Gough, L.P.; Jackson, L.L.; Sacklin, J.A.
1988-01-01
Hypogymnia enteromorpha and Usnea spp. were collected in the Little Bald Hills ultramafic region of Redwood National Park, California, to establish element-concentration norms. Baselines are presented for Ba, Ca, Cu, Mn, Ni, P, Sr, V, and Zn for both lichen species; for Li, Mg, and K for H. enteromorpha; and for Al, Ce, Cr, Co, Fe, Na, and Ti for Usnea. Element concentrations of future collections of this same material can be used to monitor possible air quality changes anticipated from mining activities planned nearby. The variability in the element concentrations was partitioned between geographical distance increments and sample preparation and analysis procedures. In general, most of this variability was found in samples less than a few hundreds of meters apart rather than those at about 1 km apart. Therefore, except for Ba and Co, no large geographical element-concentration trends were observed. Samples of both species contained elevated levels of Ni and Mg, which probably reflect the ultramafic terrain over which they occur.
A comparison of exact tests for trend with binary endpoints using Bartholomew's statistic.
Consiglio, J D; Shan, G; Wilding, G E
2014-01-01
Tests for trend are important in a number of scientific fields when trends associated with binary variables are of interest. Implementing the standard Cochran-Armitage trend test requires an arbitrary choice of scores assigned to represent the grouping variable. Bartholomew proposed a test for qualitatively ordered samples using asymptotic critical values, but type I error control can be problematic in finite samples. To our knowledge, use of the exact probability distribution has not been explored, and we study its use in the present paper. Specifically we consider an approach based on conditioning on both sets of marginal totals and three unconditional approaches where only the marginal totals corresponding to the group sample sizes are treated as fixed. While slightly conservative, all four tests are guaranteed to have actual type I error rates below the nominal level. The unconditional tests are found to exhibit far less conservatism than the conditional test and thereby gain a power advantage.
Ding, Aidong Adam; Hsieh, Jin-Jian; Wang, Weijing
2015-01-01
Bivariate survival analysis has wide applications. In the presence of covariates, most literature focuses on studying their effects on the marginal distributions. However covariates can also affect the association between the two variables. In this article we consider the latter issue by proposing a nonstandard local linear estimator for the concordance probability as a function of covariates. Under the Clayton copula, the conditional concordance probability has a simple one-to-one correspondence with the copula parameter for different data structures including those subject to independent or dependent censoring and dependent truncation. The proposed method can be used to study how covariates affect the Clayton association parameter without specifying marginal regression models. Asymptotic properties of the proposed estimators are derived and their finite-sample performances are examined via simulations. Finally, for illustration, we apply the proposed method to analyze a bone marrow transplant data set.
Toribo, S.G.; Gray, B.R.; Liang, S.
2011-01-01
The N-mixture model proposed by Royle in 2004 may be used to approximate the abundance and detection probability of animal species in a given region. In 2006, Royle and Dorazio discussed the advantages of using a Bayesian approach in modelling animal abundance and occurrence using a hierarchical N-mixture model. N-mixture models assume replication on sampling sites, an assumption that may be violated when the site is not closed to changes in abundance during the survey period or when nominal replicates are defined spatially. In this paper, we studied the robustness of a Bayesian approach to fitting the N-mixture model for pseudo-replicated count data. Our simulation results showed that the Bayesian estimates for abundance and detection probability are slightly biased when the actual detection probability is small and are sensitive to the presence of extra variability within local sites.
Two Universality Properties Associated with the Monkey Model of Zipf's Law
NASA Astrophysics Data System (ADS)
Perline, Richard; Perline, Ron
2016-03-01
The distribution of word probabilities in the monkey model of Zipf's law is associated with two universality properties: (1) the power law exponent converges strongly to -1 as the alphabet size increases and the letter probabilities are specified as the spacings from a random division of the unit interval for any distribution with a bounded density function on [0, 1]; and (2) on a logarithmic scale, the version of the model with a finite word length cutoff and unequal letter probabilities is approximately normally distributed in the part of the distribution away from the tails. The first property is proved using a remarkably general limit theorem for the logarithm of sample spacings from Shao and Hahn, and the second property follows from Anscombe's central limit theorem for a random number of i.i.d. random variables. The finite word length model leads to a hybrid Zipf-lognormal mixture distribution closely related to work in other areas.
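A sketch of the monkey model with random-spacing letter probabilities; the alphabet size, space probability, and word-length cutoff are illustrative, and the log-log slope fit is only a crude check of the exponent.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(4)

# Monkey model: letters typed with probabilities q_i, space with probability
# s; a word's probability is the product of its letter probabilities times s.
M, s, max_len = 5, 0.2, 6
# Letter probabilities as spacings from a random division of the unit interval.
q = (1 - s) * np.diff(np.sort(np.r_[0.0, rng.uniform(size=M - 1), 1.0]))

probs = []
for length in range(1, max_len + 1):
    for word in product(range(M), repeat=length):
        probs.append(s * np.prod(q[list(word)]))
probs = np.sort(probs)[::-1]

# Fit the rank-probability power law on a log-log scale.
rank = np.arange(1, len(probs) + 1)
slope = np.polyfit(np.log(rank), np.log(probs), 1)[0]
print("fitted Zipf exponent:", round(slope, 3))   # expect a value near -1
```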
Exploratory reconstructability analysis of accident TBI data
NASA Astrophysics Data System (ADS)
Zwick, Martin; Carney, Nancy; Nettleton, Rosemary
2018-02-01
This paper describes the use of reconstructability analysis to perform a secondary study of traumatic brain injury data from automobile accidents. Neutral searches were done and their results displayed with a hypergraph. Directed searches, using both variable-based and state-based models, were applied to predict performance on two cognitive tests and one neurological test. Very simple state-based models gave large uncertainty reductions for all three DVs and sizeable improvements in percent correct for the two cognitive test DVs which were equally sampled. Conditional probability distributions for these models are easily visualized with simple decision trees. Confounding variables and counter-intuitive findings are also reported.
Estimation of the biserial correlation and its sampling variance for use in meta-analysis.
Jacobs, Perke; Viechtbauer, Wolfgang
2017-06-01
Meta-analyses are often used to synthesize the findings of studies examining the correlational relationship between two continuous variables. When only dichotomous measurements are available for one of the two variables, the biserial correlation coefficient can be used to estimate the product-moment correlation between the two underlying continuous variables. Unlike the point-biserial correlation coefficient, biserial correlation coefficients can therefore be integrated with product-moment correlation coefficients in the same meta-analysis. The present article describes the estimation of the biserial correlation coefficient for meta-analytic purposes and reports simulation results comparing different methods for estimating the coefficient's sampling variance. The findings indicate that commonly employed methods yield inconsistent estimates of the sampling variance across a broad range of research situations. In contrast, consistent estimates can be obtained using two methods that appear to be unknown in the meta-analytic literature. A variance-stabilizing transformation for the biserial correlation coefficient is described that allows for the construction of confidence intervals for individual coefficients with close to nominal coverage probabilities in most of the examined conditions. Copyright © 2016 John Wiley & Sons, Ltd.
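A sketch of the classical point-biserial-to-biserial conversion that underlies the estimator discussed above; the dichotomization threshold and toy data are illustrative.

```python
import numpy as np
from scipy.stats import norm, pearsonr

def biserial_from_pointbiserial(x_dichotomous, y_continuous):
    """Classical conversion r_b = r_pb * sqrt(p*(1-p)) / phi(z_p), where p is
    the proportion in the upper group and phi(z_p) is the standard normal
    density at the quantile dividing the two groups."""
    x = np.asarray(x_dichotomous, dtype=float)
    r_pb = pearsonr(x, y_continuous)[0]
    p = x.mean()
    ordinate = norm.pdf(norm.ppf(p))
    return r_pb * np.sqrt(p * (1 - p)) / ordinate

# Toy check: dichotomize one of two correlated normals and recover rho ~ 0.5.
rng = np.random.default_rng(8)
z = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=2000)
print(biserial_from_pointbiserial(z[:, 0] > 0.3, z[:, 1]))
```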
Navigating complex sample analysis using national survey data.
Saylor, Jennifer; Friedmann, Erika; Lee, Hyeon Joo
2012-01-01
The National Center for Health Statistics conducts the National Health and Nutrition Examination Survey and other national surveys with probability-based complex sample designs. Goals of national surveys are to provide valid data for the population of the United States. Analyses of data from population surveys present unique challenges in the research process but are valuable avenues to study the health of the United States population. The aim of this study was to demonstrate the importance of using complex data analysis techniques for data obtained with a complex multistage sampling design and to provide an example of analysis using the SPSS Complex Samples procedure. Challenges and solutions specific to secondary data analysis of national databases are illustrated using the National Health and Nutrition Examination Survey as the exemplar. Oversampling of small or sensitive groups provides necessary estimates of variability within small groups. Use of weights without complex samples accurately estimates population means and frequency from the sample after accounting for over- or undersampling of specific groups. Weighting alone leads to inappropriate population estimates of variability, because they are computed as if the measures were from the entire population rather than a sample in the data set. The SPSS Complex Samples procedure allows inclusion of all sampling design elements: stratification, clusters, and weights. Use of national data sets allows use of extensive, expensive, and well-documented survey data for exploratory questions but limits analysis to those variables included in the data set. The large sample permits examination of multiple predictors and interactive relationships. Merging data files, availability of data in several waves of surveys, and complex sampling are techniques used to provide a representative sample but present unique challenges. Sophisticated data analysis techniques optimize the use of these data.
Sufficient Statistics for Divergence and the Probability of Misclassification
NASA Technical Reports Server (NTRS)
Quirein, J.
1972-01-01
One particular aspect of the feature selection problem is considered, namely that which results from the transformation x = Bz, where B is a k by n matrix of rank k and k ≤ n. It is shown that, in general, such a transformation results in a loss of information. In terms of the divergence, this is equivalent to the fact that the average divergence computed using the variable x is less than or equal to the average divergence computed using the variable z. A loss of information in terms of the probability of misclassification is shown to be equivalent to the fact that the probability of misclassification computed using variable x is greater than or equal to the probability of misclassification computed using variable z. First, the necessary facts relating k-dimensional and n-dimensional integrals are derived. Then the mentioned results about the divergence and probability of misclassification are derived. Finally, it is shown that if no information is lost (in x = Bz) as measured by the divergence, then no information is lost as measured by the probability of misclassification.
VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS
Huang, Jian; Horowitz, Joel L.; Wei, Fengrong
2010-01-01
We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is “small” relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model, and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method. PMID:21127739
Adriaanse, Marieke A; Evers, Catharine; Verhoeven, Aukje A C; de Ridder, Denise T D
2016-03-01
It is often assumed that there are substantial sex differences in eating behaviour (e.g. women are more likely to be dieters or emotional eaters than men). The present study investigates this assumption in a large representative community sample while incorporating a comprehensive set of psychological eating-related variables. A community sample was employed to: (i) determine sex differences in (un)healthy snack consumption and psychological eating-related variables (e.g. emotional eating, intention to eat healthily); (ii) examine whether sex predicts energy intake from (un)healthy snacks over and above psychological variables; and (iii) investigate the relationship between psychological variables and snack intake for men and women separately. Snack consumption was assessed with a 7d snack diary; the psychological eating-related variables with questionnaires. Participants were members of an Internet survey panel that is based on a true probability sample of households in the Netherlands. Men and women (n 1292; 45 % male), with a mean age of 51·23 (sd 16·78) years and a mean BMI of 25·62 (sd 4·75) kg/m2. Results revealed that women consumed more healthy and less unhealthy snacks than men and they scored higher than men on emotional and restrained eating. Women also more often reported appearance and health-related concerns about their eating behaviour, but men and women did not differ with regard to external eating or their intentions to eat more healthily. The relationships between psychological eating-related variables and snack intake were similar for men and women, indicating that snack intake is predicted by the same variables for men and women. It is concluded that some small sex differences in psychological eating-related variables exist, but based on the present data there is no need for interventions aimed at promoting healthy eating to target different predictors according to sex.
Wu, F Z; Ma, J; Hu, X N; Zeng, L
2015-02-01
The mealybug species Phenacoccus solenopsis (P. solenopsis) has caused much agricultural damage since its recent invasion in China. However, the source of this invasion remains unclear. This study uses molecular methods to clarify the relationships among different populations of P. solenopsis from China, USA, Pakistan, India, and Vietnam to determine the geographic origin of the introduction of this species into China. P. solenopsis samples were collected from 25 different locations in three provinces of Southern China. Samples from the USA, Pakistan, and Vietnam were also obtained. Parts of the mitochondrial genes for cytochrome oxidase I (COI) were sequenced for each sample. Homologous DNA sequences of the samples from the USA and India were downloaded from GenBank. Two haplotypes were found in China. The first was from most samples from the Guangdong, Guangxi, and Hainan populations in the China and Pakistan groups, and the second from a few samples from the Guangdong, Guangxi, and Hainan populations in the China, Pakistan, India, and Vietnam groups. As shown in the maximum likelihood trees constructed using the COI sequences, these samples belonged to two clades. Phylogenetic analysis suggested that most P. solenopsis mealybugs in Southern China are probably closely related to populations in Pakistan. The variation, relationship, expansion, and probable geographic origin of P. solenopsis mealybugs in Southern China are also discussed.
Time-dependent landslide probability mapping
Campbell, Russell H.; Bernknopf, Richard L.; ,
1993-01-01
Case studies where time of failure is known for rainfall-triggered debris flows can be used to estimate the parameters of a hazard model in which the probability of failure is a function of time. As an example, a time-dependent function for the conditional probability of a soil slip is estimated from independent variables representing hillside morphology, approximations of material properties, and the duration and rate of rainfall. If probabilities are calculated in a GIS (geographic information system) environment, the spatial distribution of the result for any given hour can be displayed on a map. Although the probability levels in this example are uncalibrated, the method offers a potential for evaluating different physical models and different earth-science variables by comparing the map distribution of predicted probabilities with inventory maps for different areas and different storms. If linked with spatial and temporal socio-economic variables, this method could be used for short-term risk assessment.
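A sketch of the kind of time-dependent, cell-wise failure probability the abstract describes, with a logistic link and made-up coefficients standing in for parameters that would be fit to dated debris-flow inventories:

```python
import numpy as np

def slip_probability(slope, soil_index, rain_rate, cum_rain, beta):
    """Hourly conditional probability of a soil slip for each map cell:
    logistic in static morphology/material terms and dynamic rainfall terms.
    In practice `beta` would be estimated from case studies with known
    failure times; the values used below are purely illustrative."""
    eta = (beta[0] + beta[1] * slope + beta[2] * soil_index
           + beta[3] * rain_rate + beta[4] * cum_rain)
    return 1.0 / (1.0 + np.exp(-eta))

beta = np.array([-6.0, 0.05, 0.8, 0.12, 0.02])   # invented coefficients
slope = np.array([25.0, 35.0, 45.0])             # degrees, three cells
soil = np.array([0.5, 1.0, 1.5])                 # relative material weakness
rain_rate = np.array([[2, 8, 15, 4]] * 3)        # mm/h, per cell per hour
cum_rain = np.cumsum(rain_rate, axis=1)          # storm-to-date totals

for t in range(4):
    p = slip_probability(slope, soil, rain_rate[:, t], cum_rain[:, t], beta)
    print(f"hour {t}: {np.round(p, 3)}")         # one map layer per hour in a GIS
```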
Feasibility of conducting wetfall chemistry investigations around the Bowen Power Plant
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, N.C.J.; Patrinos, A.A.N.
1979-10-01
The feasibility of expanding the Meteorological Effects of Thermal Energy Releases - Oak Ridge National Laboratory (METER-ORNL) research at Bowen Power Plant, a coal-fired power plant in northwest Georgia, to include wetfall chemistry is evaluated using results of similar studies around other power plants, several atmospheric washout models, analysis of spatial variability in precipitation, and field logistical considerations. An optimal wetfall chemistry network design is proposed, incorporating the inner portion of the existing rain-gauge network and augmented by additional sites to ensure adequate coverage of probable target areas. The predicted sulfate production rate differs by about four orders of magnitude among the models reviewed at a pH of 3. No model can claim superiority over any other model without substantive data verification. The spatial uniformity in rain amount is evaluated using four storms that occurred at the METER-ORNL network. Values of spatial variability ranged from 8 to 31% and decreased as the mean rainfall increased. The field study of wetfall chemistry will require a minimum of 5 persons to operate the approximately 50 collectors covering an area of 740 km². Preliminary wetfall-only samples collected on an event basis showed lower pH and higher electrical conductivity of precipitation collected about 5 km downwind of the power plant relative to samples collected upwind. Wetfall samples collected on a weekly basis using automatic samplers, however, showed variable results, with no consistent pattern. This suggests the need for event sampling to minimize variable rain volume and multiple-source effects often associated with weekly samples.
Shirley, Matthew H.; Dorazio, Robert M.; Abassery, Ekramy; Elhady, Amr A.; Mekki, Mohammed S.; Asran, Hosni H.
2012-01-01
As part of the development of a management program for Nile crocodiles in Lake Nasser, Egypt, we used a dependent double-observer sampling protocol with multiple observers to compute estimates of population size. To analyze the data, we developed a hierarchical model that allowed us to assess variation in detection probabilities among observers and survey dates, as well as account for variation in crocodile abundance among sites and habitats. We conducted surveys from July 2008-June 2009 in 15 areas of Lake Nasser that were representative of 3 main habitat categories. During these surveys, we sampled 1,086 km of lake shore wherein we detected 386 crocodiles. Analysis of the data revealed significant variability in both inter- and intra-observer detection probabilities. Our raw encounter rate was 0.355 crocodiles/km. When we accounted for observer effects and habitat, we estimated a surface population abundance of 2,581 (2,239-2,987, 95% credible intervals) crocodiles in Lake Nasser. Our results underscore the importance of well-trained, experienced monitoring personnel in order to decrease heterogeneity in intra-observer detection probability and to better detect changes in the population based on survey indices. This study will assist the Egyptian government establish a monitoring program as an integral part of future crocodile harvest activities in Lake Nasser
The Contemporary Land Mammals of Egypt (Including Sinai).
1980-08-15
[OCR fragments from the monograph survive here: a figure-list entry on variation in the median lumbar stripe in samples of Poecilictis libyca and in adult male and female Ictonyx striatus erythreae; distributional notes (wadis of Gebel Uweinat and probably the Gilf el Kebir; west bank of the Nile in Upper Egypt); and a pelage description (white base in the lumbar and sacral region; width of tip, subterminal, and basal color bands variable; ear pigmented, covered with whitish hairs).]
Klonner, Günther; Fischer, Stefan; Essl, Franz; Dullinger, Stefan
2016-01-01
The search for traits that make alien species invasive has mostly concentrated on comparing successful invaders and different comparison groups with respect to average trait values. By contrast, little attention has been paid to trait variability among invaders. Here, we combine an analysis of trait differences between invasive and non-invasive species with a comparison of multidimensional trait variability within these two species groups. We collected data on biological and distributional traits for 1402 species of the native, non-woody vascular plant flora of Austria. We then compared the subsets of species recorded and not recorded as invasive aliens anywhere in the world, respectively, first, with respect to the sampled traits using univariate and multiple regression models; and, second, with respect to their multidimensional trait diversity by calculating functional richness and dispersion metrics. Attributes related to competitiveness (strategy type, nitrogen indicator value), habitat use (agricultural and ruderal habitats, occurrence under the montane belt), and propagule pressure (frequency) were most closely associated with invasiveness. However, even the best multiple model, including interactions, only explained a moderate fraction of the differences in invasive success. In addition, multidimensional variability in trait space was even larger among invasive than among non-invasive species. This pronounced variability suggests that invasive success has a considerable idiosyncratic component and is probably highly context specific. We conclude that basing risk assessment protocols on species trait profiles will probably face hardly reducible uncertainties.
Single and simultaneous binary mergers in Wright-Fisher genealogies.
Melfi, Andrew; Viswanath, Divakar
2018-05-01
The Kingman coalescent is a commonly used model in genetics, which is often justified with reference to the Wright-Fisher (WF) model. Current proofs of convergence of WF and other models to the Kingman coalescent assume a constant sample size. However, sample sizes have become quite large in human genetics. Therefore, we develop a convergence theory that allows the sample size to increase with population size. If the haploid population size is N and the sample size is N^(1/3-ε), ε>0, we prove that Wright-Fisher genealogies involve at most a single binary merger in each generation with probability converging to 1 in the limit of large N. Single binary merger or no merger in each generation of the genealogy implies that the Kingman partition distribution is obtained exactly. If the sample size is N^(1/2-ε), Wright-Fisher genealogies may involve simultaneous binary mergers in a single generation but do not involve triple mergers in the large N limit. The asymptotic theory is verified using numerical calculations. Variable population sizes are handled algorithmically. It is found that even distant bottlenecks can increase the probability of triple mergers as well as simultaneous binary mergers in WF genealogies. Copyright © 2018 Elsevier Inc. All rights reserved.
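A quick simulation of the single-generation merger patterns discussed above. The N^(1/3) sample-size regime is taken from the abstract; the population size, trial count, and seed are arbitrary:

```python
import numpy as np

def merger_profile(N, n, rng):
    """Classify the merger pattern when n sampled lineages choose parents
    uniformly at random in a haploid Wright-Fisher population of size N."""
    parents = rng.integers(0, N, size=n)
    _, counts = np.unique(parents, return_counts=True)
    merged = counts[counts >= 2]
    if merged.size == 0:
        return "no_merger"
    if merged.max() >= 3:
        return "triple_or_more"
    return "single_binary" if merged.size == 1 else "simultaneous_binary"

rng = np.random.default_rng(7)
N = 10 ** 6
n = int(N ** (1 / 3))               # sample size in the N^(1/3) regime
tally = {}
for _ in range(10_000):
    kind = merger_profile(N, n, rng)
    tally[kind] = tally.get(kind, 0) + 1
print(n, {k: v / 10_000 for k, v in tally.items()})
# Expect mostly "no_merger", a few single binary mergers, and essentially
# no simultaneous binary or triple mergers at this sample size.
```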
Honest Importance Sampling with Multiple Markov Chains
Tan, Aixin; Doss, Hani; Hobert, James P.
2017-01-01
Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this paper, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general set up, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effects models under different priors. The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection. PMID:28701855
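A minimal self-normalized importance sampling sketch in the iid setting the abstract starts from (the regenerative, multiple-chain machinery is beyond this example); the target, instrumental density, and delta-method standard error below are textbook choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)

# Target pi ~ N(2, 1); instrumental pi1 ~ N(0, 2^2). Estimate E_pi[X^2] = 5.
def log_pi(x):  return -0.5 * (x - 2.0) ** 2        # unnormalized log density
def log_pi1(x): return -(x ** 2) / 8.0              # unnormalized log density

x = rng.normal(0.0, 2.0, size=100_000)              # random sample from pi1
logw = log_pi(x) - log_pi1(x)
w = np.exp(logw - logw.max())
w /= w.sum()                                        # self-normalized weights

f = x ** 2
est = np.sum(w * f)
se = np.sqrt(np.sum((w * (f - est)) ** 2))          # delta-method approximation
print(f"{est:.3f} +/- {1.96 * se:.3f} (truth 5.0)")
```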
INFORMAL CARE AND CAREGIVER’S HEALTH
DO, YOUNG KYUNG; NORTON, EDWARD C.; STEARNS, SALLY C.; VAN HOUTVEN, COURTNEY HAROLD
2014-01-01
This study aims to measure the causal effect of informal caregiving on the health and health care use of women who are caregivers, using instrumental variables. We use data from South Korea, where daughters and daughters-in-law are the prevalent source of caregivers for frail elderly parents and parents-in-law. A key insight of our instrumental variable approach is that having a parent-in-law with functional limitations increases the probability of providing informal care to that parent-in-law, but a parent-in-law’s functional limitation does not directly affect the daughter-in-law’s health. We compare results for the daughter-in-law and daughter samples to check the assumption of the excludability of the instruments for the daughter sample. Our results show that providing informal care has significant adverse effects along multiple dimensions of health for daughter-in-law and daughter caregivers in South Korea. PMID:24753386
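A toy two-stage least squares sketch of the identification idea: simulated data in which an unobserved confounder biases OLS, while a binary instrument (standing in for the parent-in-law's functional limitation) recovers the causal effect. All numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 5_000

# Unobserved frailty U confounds caregiving D and health Y; the instrument Z
# shifts caregiving but affects health only through caregiving (exclusion).
U = rng.normal(size=n)
Z = rng.binomial(1, 0.3, size=n).astype(float)      # parent-in-law limitation
D = (0.8 * Z + 0.5 * U + rng.normal(size=n) > 0.8).astype(float)  # caregiver
Y = -1.0 * D + 0.7 * U + rng.normal(size=n)         # true causal effect: -1.0

X = np.column_stack([np.ones(n), D])
W = np.column_stack([np.ones(n), Z])                # instruments incl. constant

ols = np.linalg.lstsq(X, Y, rcond=None)[0]          # biased by U
Dhat = W @ np.linalg.lstsq(W, D, rcond=None)[0]     # first stage: project D on Z
X2 = np.column_stack([np.ones(n), Dhat])
tsls = np.linalg.lstsq(X2, Y, rcond=None)[0]        # second stage
print("OLS:", ols[1], " 2SLS:", tsls[1], " (truth: -1.0)")
```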
A Stochastic Diffusion Process for the Dirichlet Distribution
Bakosi, J.; Ristorcelli, J. R.
2013-03-01
The method of potential solutions of Fokker-Planck equations is used to develop a transport equation for the joint probability of N coupled stochastic variables with the Dirichlet distribution as its asymptotic solution. To ensure a bounded sample space, a coupled nonlinear diffusion process is required: the Wiener processes in the equivalent system of stochastic differential equations are multiplicative with coefficients dependent on all the stochastic variables. Individual samples of a discrete ensemble, obtained from the stochastic process, satisfy a unit-sum constraint at all times. The process may be used to represent realizations of a fluctuating ensemble of N variables subject to a conservation principle. Similar to the multivariate Wright-Fisher process, whose invariant is also Dirichlet, the univariate case yields a process whose invariant is the beta distribution. As a test of the results, Monte Carlo simulations are used to evolve numerical ensembles toward the invariant Dirichlet distribution.
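A sketch of a Monte Carlo test in this spirit: Euler-Maruyama integration of a coupled Wright-Fisher-type diffusion on the simplex. The b/S/kappa parametrization, and the claim that these particular values target a Dirichlet(1, 1, 3) invariant, are assumptions of this sketch and may not match the paper's exact coefficient conventions:

```python
import numpy as np

rng = np.random.default_rng(5)

# Assumed parametrization: dY_i = (b/2)(S_i Y_K - (1 - S_i) Y_i) dt
#                                 + sqrt(kappa Y_i Y_K) dW_i,
# with Y_K = 1 - sum(Y_i) the conserved remainder.
K, dt, steps, members = 3, 1e-3, 20_000, 2_000
b, S, kappa = 4.0, np.array([0.25, 0.25]), 1.0     # target mean (0.2, 0.2, 0.6)

Y = np.full((members, K - 1), 1.0 / K)             # ensemble starts inside simplex
for _ in range(steps):
    YK = 1.0 - Y.sum(axis=1, keepdims=True)        # the conserved remainder Y_K
    drift = 0.5 * b * (S * YK - (1.0 - S) * Y)
    vol = np.sqrt(np.clip(kappa * Y * YK, 0.0, None))   # multiplicative noise
    Y += drift * dt + vol * np.sqrt(dt) * rng.normal(size=Y.shape)
    np.clip(Y, 1e-9, 1.0, out=Y)                   # numerical guard at boundaries

print("ensemble means:", Y.mean(axis=0), "and", 1.0 - Y.mean(axis=0).sum())
```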
NASA Astrophysics Data System (ADS)
Carmo, Vanda; Santos, Mariana; Menezes, Gui M.; Loureiro, Clara M.; Lambardi, Paolo; Martins, Ana
2013-12-01
Seamounts are common topographic features around the Azores archipelago (NE Atlantic). Recently there has been increasing research effort devoted to the ecology of these ecosystems. In the Azores, the mesozooplankton is poorly studied, particularly in relation to these seafloor elevations. In this study, zooplankton communities in the Condor seamount area (Azores) were investigated during March, July and September 2010. Samples were taken during both day and night with a Bongo net of 200 µm mesh towed obliquely within the first 100 m of the water column. Total abundance, biomass and chlorophyll a concentrations did not vary with sampling site or within the diel cycle, but significant seasonal variation was observed. Moreover, zooplankton community composition showed the same strong seasonal pattern regardless of spatial or daily variability. Despite seasonal differences, the zooplankton community structure remained similar for the duration of this study. Seasonal variability better explained our results than mesoscale spatial variability. Spatial homogeneity is probably related to island proximity and local dynamics over Condor seamount. Zooplankton literature for the region is sparse; therefore a short review of the most important zooplankton studies from the Azores is also presented.
Evaluation of genetic variability in a small, insular population of spruce grouse
O'Connell, A.F.; Rhymer, Judith; Keppie, D.M.; Svenson, K.L.; Paigan, B.J.
2002-01-01
Using microsatellite markers, we determined genetic variability for two populations of spruce grouse in eastern North America, one on a coastal Maine island where breeding habitat is limited and highly fragmented, the other in central New Brunswick (NB), where suitable breeding habitat is generally contiguous across the region. We examined six markers for both populations and all were polymorphic. Although the number of alleles per locus and the proportion of unique alleles were lower in the island population, probably as a result of small sample size, heterozygosity and an inbreeding coefficient (Fis) indicated slightly more variability in the island population. Deviation from Hardy-Weinberg equilibrium also was more evident in loci for the mainland population. Several traits previously documented in the island population (relatively long natal dispersal distances, reproductive success, territoriality, adult survival, and longevity) support the maintenance of heterozygosity, at least in the short term. Sample collection from two small (500 ha), separate areas in NB, and the predicted importance of immigration density to supplement this population, demonstrate the need for behavioral and ecological information when interpreting genetic variation. We discuss the relevance of these issues with respect to genetic variability and viability.
Computer simulation of random variables and vectors with arbitrary probability distribution laws
NASA Technical Reports Server (NTRS)
Bogdan, V. M.
1981-01-01
Assume that there is given an arbitrary n-dimensional probability distribution F. A recursive construction is found for a sequence of functions x_1 = f_1(U_1, ..., U_n), ..., x_n = f_n(U_1, ..., U_n) such that if U_1, ..., U_n are independent random variables having uniform distribution over the open interval (0,1), then the joint distribution of the variables x_1, ..., x_n coincides with the distribution F. Since uniform independent random variables can be well simulated by means of a computer, this result allows one to simulate arbitrary n-random variables if their joint probability distribution is known.
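For n = 2 the recursive (conditional-quantile) construction can be written out directly; the bivariate normal target chosen here is only for illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# n = 2 case: x1 = F1^{-1}(U1) and x2 = F^{-1}_{2|1}(U2 | x1).
# Target F: bivariate normal, standard margins, correlation rho.
rho = 0.8
U = rng.uniform(size=(100_000, 2))                  # independent U(0,1) pairs

x1 = norm.ppf(U[:, 0])                              # marginal quantile of x1
x2 = norm.ppf(U[:, 1], loc=rho * x1,                # conditional quantile of x2
              scale=np.sqrt(1.0 - rho ** 2))

print("sample correlation:", np.corrcoef(x1, x2)[0, 1])   # approximately 0.8
```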
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Jay Dean; Oberkampf, William Louis; Helton, Jon Craig
2004-12-01
Relationships to determine the probability that a weak link (WL)/strong link (SL) safety system will fail to function as intended in a fire environment are investigated. In the systems under study, failure of the WL system before failure of the SL system is intended to render the overall system inoperational and thus prevent the possible occurrence of accidents with potentially serious consequences. Formal developments of the probability that the WL system fails to deactivate the overall system before failure of the SL system (i.e., the probability of loss of assured safety, PLOAS) are presented for several WL/SL configurations: (i) one WL, one SL, (ii) multiple WLs, multiple SLs with failure of any SL before any WL constituting failure of the safety system, (iii) multiple WLs, multiple SLs with failure of all SLs before any WL constituting failure of the safety system, and (iv) multiple WLs, multiple SLs and multiple sublinks in each SL with failure of any sublink constituting failure of the associated SL and failure of all SLs before failure of any WL constituting failure of the safety system. The indicated probabilities derive from time-dependent temperatures in the WL/SL system and variability (i.e., aleatory uncertainty) in the temperatures at which the individual components of this system fail and are formally defined as multidimensional integrals. Numerical procedures based on quadrature (i.e., trapezoidal rule, Simpson's rule) and also on Monte Carlo techniques (i.e., simple random sampling, importance sampling) are described and illustrated for the evaluation of these integrals. Example uncertainty and sensitivity analyses for PLOAS involving the representation of uncertainty (i.e., epistemic uncertainty) with probability theory and also with evidence theory are presented.
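A simple random sampling sketch of PLOAS for configurations (i)-(iii), assuming a monotonically increasing fire temperature (so that ordering failure temperatures is equivalent to ordering failure times) and invented normal failure-temperature distributions:

```python
import numpy as np

rng = np.random.default_rng(4)

def ploas(n_wl, n_sl, mode, trials=200_000):
    """Monte Carlo PLOAS: probability that the stated SL failure condition
    occurs before the first WL failure. Distributions are illustrative."""
    wl = rng.normal(300.0, 30.0, size=(trials, n_wl)).min(axis=1)  # first WL
    sl = rng.normal(450.0, 60.0, size=(trials, n_sl))
    sl = sl.max(axis=1) if mode == "all_sl" else sl.min(axis=1)
    return np.mean(sl <= wl)          # SL condition met before any WL fails

print("(i)   one WL, one SL:      ", ploas(1, 1, "any_sl"))
print("(ii)  2 WL, 2 SL, any SL:  ", ploas(2, 2, "any_sl"))
print("(iii) 2 WL, 2 SL, all SLs: ", ploas(2, 2, "all_sl"))
```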
Maximum predictive power and the superposition principle
NASA Technical Reports Server (NTRS)
Summhammer, Johann
1994-01-01
In quantum physics the direct observables are probabilities of events. We ask how observed probabilities must be combined to achieve what we call maximum predictive power. According to this concept the accuracy of a prediction must only depend on the number of runs whose data serve as input for the prediction. We transform each probability to an associated variable whose uncertainty interval depends only on the amount of data and strictly decreases with it. We find that for a probability which is a function of two other probabilities maximum predictive power is achieved when linearly summing their associated variables and transforming back to a probability. This recovers the quantum mechanical superposition principle.
NASA Astrophysics Data System (ADS)
Selva, Jacopo; Sandri, Laura; Costa, Antonio; Tonini, Roberto; Folch, Arnau; Macedonio, Giovanni
2014-05-01
The intrinsic uncertainty and variability associated with the size of the next eruption strongly affect short- to long-term tephra hazard assessment. Often, emergency plans are established accounting for the effects of one or a few representative scenarios (meant as a specific combination of eruptive size and vent position), selected with subjective criteria. On the other hand, probabilistic hazard assessments (PHA) consistently explore the natural variability of such scenarios. PHA for tephra dispersal needs the definition of eruptive scenarios (usually by grouping possible eruption sizes and vent positions in classes) with associated probabilities, a meteorological dataset covering a representative time period, and a tephra dispersal model. PHA results from combining simulations considering different volcanological and meteorological conditions through a weight given by their specific probability of occurrence. However, volcanological parameters, such as erupted mass, eruption column height and duration, bulk granulometry, and fraction of aggregates, typically encompass a wide range of values. Because of such variability, single representative scenarios or size classes cannot be adequately defined using single values for the volcanological inputs. Here we propose a method that accounts for this within-size-class variability in the framework of Event Trees. The variability of each parameter is modeled with specific Probability Density Functions, and meteorological and volcanological inputs are chosen by using a stratified sampling method. This procedure avoids the bias introduced by selecting single representative scenarios and thus neglecting most of the intrinsic eruptive variability. When considering within-size-class variability, attention must be paid to appropriately weighting events falling within the same size class. While a uniform weight for all the events belonging to a size class is the most straightforward idea, it implies a strong dependence on the thresholds dividing classes: under this choice, the largest event of a size class has a much larger weight than the smallest event of the subsequent size class. In order to overcome this problem, in this study we propose an innovative solution that smoothly links the weight variability within each size class to the variability among the size classes through a common power law and, simultaneously, respects the probability of the different size classes conditional on the occurrence of an eruption; a sketch of this weighting follows the abstract. Embedding this procedure into the Bayesian Event Tree scheme enables tephra fall PHA, quantified through hazard curves and maps that represent readable results applicable in planning risk mitigation actions, and the quantification of its epistemic uncertainties. As examples, we analyze long-term tephra fall PHA at Vesuvius and Campi Flegrei. We integrate two tephra dispersal models (the analytical HAZMAP and the numerical FALL3D) into BET_VH. The ECMWF reanalysis dataset is used for exploring different meteorological conditions. The results obtained clearly show that PHA accounting for the whole natural variability significantly differs from that based on representative scenarios, as in common volcanic hazard practice.
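As referenced above, a sketch of the weighting idea: events are sampled within each size class, weighted by a common power law in erupted mass, and renormalized so that each class retains its event-tree probability. The thresholds, class probabilities, and exponent are all invented for illustration and are not the Vesuvius or Campi Flegrei values:

```python
import numpy as np

rng = np.random.default_rng(13)

edges = np.array([1e9, 1e10, 1e11, 1e12])   # erupted-mass class thresholds (kg)
p_class = np.array([0.6, 0.3, 0.1])         # P(class | eruption), illustrative
alpha = 1.0                                 # common power-law exponent (assumed)

events, weights = [], []
for k in range(len(p_class)):
    # stratified (log-uniform) sampling of event sizes within the class
    m = np.exp(rng.uniform(np.log(edges[k]), np.log(edges[k + 1]), size=200))
    w = m ** -alpha                         # same power law in every class ...
    w *= p_class[k] / w.sum()               # ... while preserving P(class)
    events.append(m)
    weights.append(w)

m, w = np.concatenate(events), np.concatenate(weights)
print("total weight (should be 1):", w.sum())
# Each simulated scenario (an event size paired with a meteorological state)
# then contributes to the hazard curves with weight w instead of a uniform weight.
```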
NASA Astrophysics Data System (ADS)
Carter, Jeffrey R.; Simon, Wayne E.
1990-08-01
Neural networks are trained using Recursive Error Minimization (REM) equations to perform statistical classification. Using REM equations with continuous input variables reduces the required number of training experiences by factors of one to two orders of magnitude over standard back propagation. Replacing the continuous input variables with discrete binary representations reduces the number of connections by a factor proportional to the number of variables, reducing the required number of experiences by another order of magnitude. Undesirable effects of using recurrent experience to train neural networks for statistical classification problems are demonstrated, and nonrecurrent experience is used to avoid these undesirable effects. 1. THE I-4I PROBLEM The statistical classification problem which we address is that of assigning points in d-dimensional space to one of two classes. The first class has a covariance matrix of I (the identity matrix); the covariance matrix of the second class is 4I. For this reason the problem is known as the I-4I problem. Both classes have equal probability of occurrence, and samples from both classes may appear anywhere throughout the d-dimensional space. Most samples near the origin of the coordinate system will be from the first class while most samples away from the origin will be from the second class. Since the two classes completely overlap, it is impossible to have a classifier with zero error. The minimum possible error is known as the Bayes error and
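For this problem the Bayes rule has a closed form: the likelihood ratio depends only on ||x||², and class 2 is chosen when ||x||² exceeds (8/3) d ln 2, where the two densities are equal. A Monte Carlo estimate of the Bayes error follows; the dimension and sample counts are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 8, 200_000

# Equal-prior samples: class 1 ~ N(0, I), class 2 ~ N(0, 4I).
x1 = rng.normal(0.0, 1.0, size=(n, d))
x2 = rng.normal(0.0, 2.0, size=(n, d))

# Densities are equal on the sphere ||x||^2 = (8/3) d ln 2:
# exp(-r^2/2) = 2^{-d} exp(-r^2/8)  =>  r^2 (1/2 - 1/8) = d ln 2.
r2 = (8.0 / 3.0) * d * np.log(2.0)
err1 = np.mean(np.sum(x1 ** 2, axis=1) > r2)   # class 1 called class 2
err2 = np.mean(np.sum(x2 ** 2, axis=1) <= r2)  # class 2 called class 1
print("Bayes error estimate:", 0.5 * (err1 + err2))
```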
Wijeysundera, Duminda N; Austin, Peter C; Hux, Janet E; Beattie, W Scott; Laupacis, Andreas
2009-01-01
Randomized trials generally use "frequentist" statistics based on P-values and 95% confidence intervals. Frequentist methods have limitations that might be overcome, in part, by Bayesian inference. To illustrate these advantages, we re-analyzed randomized trials published in four general medical journals during 2004. We used Medline to identify randomized superiority trials with two parallel arms, individual-level randomization and dichotomous or time-to-event primary outcomes. Studies with P<0.05 in favor of the intervention were deemed "positive"; otherwise, they were "negative." We used several prior distributions and exact conjugate analyses to calculate Bayesian posterior probabilities for clinically relevant effects. Of 88 included studies, 39 were positive using a frequentist analysis. Although the Bayesian posterior probabilities of any benefit (relative risk or hazard ratio<1) were high in positive studies, these probabilities were lower and variable for larger benefits. The positive studies had only moderate probabilities for exceeding the effects that were assumed for calculating the sample size. By comparison, there were moderate probabilities of any benefit in negative studies. Bayesian and frequentist analyses complement each other when interpreting the results of randomized trials. Future reports of randomized trials should include both.
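A sketch of an exact conjugate re-analysis for one hypothetical two-arm trial with a dichotomous outcome, using Beta(0.5, 0.5) priors on each arm's event probability; the counts and effect thresholds are invented:

```python
import numpy as np

rng = np.random.default_rng(8)

# Events / totals in the control and intervention arms (made up).
events_c, n_c = 60, 500
events_t, n_t = 45, 500

# Conjugate beta posteriors for each arm's event probability.
post_c = rng.beta(events_c + 0.5, n_c - events_c + 0.5, size=1_000_000)
post_t = rng.beta(events_t + 0.5, n_t - events_t + 0.5, size=1_000_000)
rr = post_t / post_c                       # posterior draws of the relative risk

print("P(any benefit, RR < 1)      =", np.mean(rr < 1.0))
print("P(larger benefit, RR < 0.8) =", np.mean(rr < 0.8))
```

The contrast between the two printed probabilities mirrors the abstract's finding: high posterior probability of any benefit in "positive" trials, but only moderate probability of the larger effects assumed in the sample-size calculations.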
Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.
Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R
2018-05-26
Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non-ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations, we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design. Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occupancy estimates for some bat species. Using simulations, we show the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study the designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits). Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases spatial extents for statistical inferences. Our results suggest that ignoring the mechanism for how locations were selected for data collection (e.g., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information in addition to the data themselves must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved.
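A stripped-down sketch of the weighting mechanics: a design-weighted (pseudo-) log-likelihood in which each site's contribution is scaled by the inverse of its inclusion probability. For brevity this collapses the occupancy model to a logistic occurrence model with perfect detection, which the paper does not do; all simulated values are invented:

```python
import numpy as np
from scipy.optimize import minimize

def neg_pseudo_loglik(beta, X, y, pi_incl):
    """Negative design-weighted Bernoulli-logit log-likelihood."""
    eta = X @ beta
    ll = y * eta - np.log1p(np.exp(eta))
    return -np.sum(ll / pi_incl)          # inverse-inclusion-probability weights

rng = np.random.default_rng(9)
n = 800
elev = rng.normal(size=n)                             # environmental gradient
psi = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * elev)))      # true occurrence probability
y = rng.binomial(1, psi)

# Unequal-probability design: high-elevation sites sampled more intensively.
pi_incl = np.where(elev > 0, 0.8, 0.2)
sampled = rng.uniform(size=n) < pi_incl
X = np.column_stack([np.ones(n), elev])

fit = minimize(neg_pseudo_loglik, x0=np.zeros(2),
               args=(X[sampled], y[sampled], pi_incl[sampled]))
print("weighted estimates:", fit.x, "(truth: [-0.5, 1.2])")
```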
[Adult mortality differentials in Argentina].
Rofman, R
1994-06-01
Adult mortality differentials in Argentina are estimated and analyzed using data from the National Social Security Administration. The study of adult mortality has attracted little attention in developing countries because of the scarcity of reliable statistics and the greater importance assigned to demographic phenomena traditionally associated with development, such as infant mortality and fertility. A sample of 39,421 records of retired persons surviving as of June 30, 1988, was analyzed by age, sex, region of residence, relative amount of pension, and social security fund of membership prior to the consolidation of the system in 1967. The thirteen former funds were grouped into the five categories of government, commerce, industry, self-employed, and other, which were assumed to be proxies for the activity sector in which the individual spent his active life. The sample is not representative of the Argentine population, since it excludes the lowest and highest socioeconomic strata and overrepresents men and urban residents. It is, however, believed to be adequate for explaining mortality differentials for most of the population covered by the social security system. The study methodology was based on the technique of logistic analysis and on the use of regional model life tables developed by Coale and others. To evaluate the effect of the study variables on the probability of dying, a maximum likelihood regression model was estimated. The model relates the logit of the probability of death between ages 65 and 95 to the available explanatory variables, including their possible interactions. Life tables were constructed by sex, region of residence, previous pension fund, and income. As a test of external consistency, a model including only age and sex as explanatory variables was constructed using the same methodology. The results confirmed consistency between the estimated values and other published estimates. A significant conclusion of the study was that social security data are a satisfactory source for the study of adult mortality, a finding of importance in cases where vital statistics systems are deficient. Mortality differentials by income level and activity sector were significant, representing up to 11.5 years in life expectancy at age 20 and 4.4 years at age 65. Mortality differentials by region were minor, probably due to the nature of the sample. The lowest observed mortality levels were those of own-account workers, independent professionals, and small businessmen.
Sampling, feasibility, and priors in data assimilation
Tu, Xuemin; Morzfeld, Matthias; Miller, Robert N.; ...
2016-03-01
Importance sampling algorithms are discussed in detail, with an emphasis on implicit sampling, and applied to data assimilation via particle filters. Implicit sampling makes it possible to use the data to find high-probability samples at relatively low cost, making the assimilation more efficient. A new analysis of the feasibility of data assimilation is presented, showing in detail why feasibility depends on the Frobenius norm of the covariance matrix of the noise and not on the number of variables. A discussion of the convergence of particular particle filters follows. A major open problem in numerical data assimilation is the determination of appropriate priors; a progress report on recent work on this problem is given. The analysis highlights the need for careful attention both to the data and to the physics in data assimilation problems.
Chen, Connie; Gribble, Matthew O; Bartroff, Jay; Bay, Steven M; Goldstein, Larry
2017-05-01
The United States's Clean Water Act stipulates in section 303(d) that states must identify impaired water bodies for which total maximum daily loads (TMDLs) of pollution inputs into water bodies are developed. Decision-making procedures about how to list, or delist, water bodies as impaired, or not, per Clean Water Act 303(d) differ across states. In states such as California, whether or not a particular monitoring sample suggests that water quality is impaired can be regarded as a binary outcome variable, and California's current regulatory framework invokes a version of the exact binomial test to consolidate evidence across samples and assess whether the overall water body complies with the Clean Water Act. Here, we contrast the performance of California's exact binomial test with one potential alternative, the Sequential Probability Ratio Test (SPRT). The SPRT uses a sequential testing framework, testing samples as they become available and evaluating evidence as it emerges, rather than measuring all the samples and calculating a test statistic at the end of the data collection process. Through simulations and theoretical derivations, we demonstrate that the SPRT on average requires fewer samples to be measured to achieve Type I and Type II error rates comparable to those of the current fixed-sample binomial test. Policymakers might consider efficient alternatives, such as the SPRT, to the current procedure. Copyright © 2017 Elsevier Ltd. All rights reserved.
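A sketch of Wald's SPRT for the binary exceedance data described above; the hypothesized rates and error targets are illustrative, not California's regulatory values:

```python
import numpy as np

def sprt_binomial(samples, p0=0.10, p1=0.25, alpha=0.05, beta=0.20):
    """Wald's SPRT on a stream of 0/1 exceedance indicators.
    H0: exceedance rate p0 (compliant) vs H1: rate p1 (impaired)."""
    A = np.log((1 - beta) / alpha)        # upper (reject H0) boundary
    B = np.log(beta / (1 - alpha))        # lower (accept H0) boundary
    llr = 0.0
    for i, x in enumerate(samples, start=1):
        llr += x * np.log(p1 / p0) + (1 - x) * np.log((1 - p1) / (1 - p0))
        if llr >= A:
            return "impaired (reject H0)", i
        if llr <= B:
            return "not impaired (accept H0)", i
    return "continue sampling", len(samples)

rng = np.random.default_rng(10)
print(sprt_binomial(rng.binomial(1, 0.30, size=50)))   # truly impaired site
print(sprt_binomial(rng.binomial(1, 0.05, size=50)))   # clean site
```

The second element of each result is the number of samples actually needed, which is where the sequential test's average savings over the fixed-sample binomial test show up.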
DOE Office of Scientific and Technical Information (OSTI.GOV)
Conover, W.J.; Cox, D.D.; Martz, H.F.
1997-12-01
When using parametric empirical Bayes estimation methods for estimating the binomial or Poisson parameter, the validity of the assumed beta or gamma conjugate prior distribution is an important diagnostic consideration. Chi-square goodness-of-fit tests of the beta or gamma prior hypothesis are developed for use when the binomial sample sizes or Poisson exposure times vary. Nine examples illustrate the application of the methods, using real data from such diverse applications as the loss of feedwater flow rates in nuclear power plants, the probability of failure to run on demand and the failure rates of the high pressure coolant injection systems at US commercial boiling water reactors, the probability of failure to run on demand of emergency diesel generators in US commercial nuclear power plants, the rate of failure of aircraft air conditioners, baseball batting averages, the probability of testing positive for toxoplasmosis, and the probability of tumors in rats. The tests are easily applied in practice by means of corresponding Mathematica® computer programs which are provided.
Graves, Tabitha A.; Royle, J. Andrew; Kendall, Katherine C.; Beier, Paul; Stetz, Jeffrey B.; Macleod, Amy C.
2012-01-01
Using multiple detection methods can increase the number, kind, and distribution of individuals sampled, which may increase accuracy and precision and reduce cost of population abundance estimates. However, when variables influencing abundance are of interest, if individuals detected via different methods are influenced by the landscape differently, separate analysis of multiple detection methods may be more appropriate. We evaluated the effects of combining two detection methods on the identification of variables important to local abundance using detections of grizzly bears with hair traps (systematic) and bear rubs (opportunistic). We used hierarchical abundance models (N-mixture models) with separate model components for each detection method. If both methods sample the same population, the use of either data set alone should (1) lead to the selection of the same variables as important and (2) provide similar estimates of relative local abundance. We hypothesized that the inclusion of 2 detection methods versus either method alone should (3) yield more support for variables identified in single method analyses (i.e. fewer variables and models with greater weight), and (4) improve precision of covariate estimates for variables selected in both separate and combined analyses because sample size is larger. As expected, joint analysis of both methods increased precision as well as certainty in variable and model selection. However, the single-method analyses identified different variables and the resulting predicted abundances had different spatial distributions. We recommend comparing single-method and jointly modeled results to identify the presence of individual heterogeneity between detection methods in N-mixture models, along with consideration of detection probabilities, correlations among variables, and tolerance to risk of failing to identify variables important to a subset of the population. The benefits of increased precision should be weighed against those risks. The analysis framework presented here will be useful for other species exhibiting heterogeneity by detection method.
Bangalore, Sripal; Gopinath, Devi; Yao, Siu-Sun; Chaudhry, Farooq A
2007-03-01
We sought to evaluate the risk stratification ability and incremental prognostic value of stress echocardiography over historic, clinical, and stress electrocardiographic (ECG) variables, over a wide spectrum of Bayesian pretest probabilities of coronary artery disease (CAD). Stress echocardiography is an established technique for the diagnosis of CAD. However, data on the incremental prognostic value of stress echocardiography over historic, clinical, and stress ECG variables in patients with known or suspected CAD are limited. We evaluated 3259 patients (60 +/- 13 years, 48% men) undergoing stress echocardiography. Patients were grouped into low (<15%), intermediate (15-85%), and high (>85%) pretest CAD likelihood subgroups using standard software. The historical, clinical, stress ECG, and stress echocardiographic variables were recorded for the entire cohort. Follow-up (2.7 +/- 1.1 years) for confirmed myocardial infarction (n = 66) and cardiac death (n = 105) was obtained. For the entire cohort, an ischemic stress echocardiography study confers a 5.0 times higher cardiac event rate than the normal stress echocardiography group (4.0% vs 0.8%/y, P < .0001). Furthermore, a Cox proportional hazard regression model showed incremental prognostic value of stress echocardiography variables over historic, clinical, and stress ECG variables across all pretest probability subgroups (global chi2 increased from 5.1 to 8.5 to 20.1 in the low pretest group, P = .44 and P = .01; from 20.9 to 28.2 to 116 in the intermediate pretest group, P = .47 and P < .0001; and from 17.5 to 36.6 to 61.4 in the high pretest group, P < .0001 for both groups). A normal stress echocardiography study portends a benign prognosis (<1% event rate/y) in all pretest probability subgroups, even in patients with high pretest probability, and yields incremental prognostic value over historic, clinical, and stress ECG variables across all pretest probability subgroups. The best incremental value is, however, in the intermediate pretest probability subgroup.
Ocean time-series near Bermuda: Hydrostation S and the US JGOFS Bermuda Atlantic time-series study
NASA Technical Reports Server (NTRS)
Michaels, Anthony F.; Knap, Anthony H.
1992-01-01
Bermuda is the site of two ocean time-series programs. At Hydrostation S, the ongoing biweekly profiles of temperature, salinity and oxygen now span 37 years. This is one of the longest open-ocean time-series data sets and provides a view of decadal scale variability in ocean processes. In 1988, the U.S. JGOFS Bermuda Atlantic Time-series Study began a wide range of measurements at a frequency of 14-18 cruises each year to understand temporal variability in ocean biogeochemistry. On each cruise, the data range from chemical analyses of discrete water samples to data from electronic packages of hydrographic and optics sensors. In addition, a range of biological and geochemical rate measurements are conducted that integrate over time-periods of minutes to days. This sampling strategy yields a reasonable resolution of the major seasonal patterns and of decadal scale variability. The Sargasso Sea also has a variety of episodic production events on scales of days to weeks and these are only poorly resolved. In addition, there is a substantial amount of mesoscale variability in this region and some of the perceived temporal patterns are caused by the intersection of the biweekly sampling with the natural spatial variability. In the Bermuda time-series programs, we have added a series of additional cruises to begin to assess these other sources of variation and their impacts on the interpretation of the main time-series record. However, the adequate resolution of higher frequency temporal patterns will probably require the introduction of new sampling strategies and some emerging technologies such as biogeochemical moorings and autonomous underwater vehicles.
Robust, Adaptive Radar Detection and Estimation
2015-07-21
Since the cost function is not a convex function in R, we apply a transformation of variables: let X = σ²R⁻¹ and S′ = (1/σ²)S. The revised cost function in … We apply this inverse covariance matrix in computing the SINR as well as the estimator variance. • Rank-Constrained Maximum Likelihood: Our … even as almost all available training samples are corrupted. • Probability of Detection vs. SNR: We apply three test statistics, the normalized matched …
Design and operation of the national home health aide survey: 2007-2008.
Bercovitz, Anita; Moss, Abigail J; Sengupta, Manisha; Harris-Kojetin, Lauren D; Squillace, Marie R; Rosenoff, Emily; Branden, Laura
2010-03-01
This report provides an overview of the National Home Health Aide Survey (NHHAS), the first national probability survey of home health aides. NHHAS was designed to provide national estimates of home health aides who provided assistance in activities of daily living (ADLs) and were directly employed by agencies that provide home health and/or hospice care. This report discusses the need for and objectives of the survey, the design process, the survey methods, and data availability. METHODS NHHAS, a multistage probability sample survey, was conducted as a supplement to the 2007 National Home and Hospice Care Survey (NHHCS). Agencies providing home health and/or hospice care were sampled, and then aides employed by these agencies were sampled and interviewed by telephone. Survey topics included recruitment, training, job history, family life, client relations, work-related injuries, and demographics. NHHAS was virtually identical to the 2004 National Nursing Assistant Survey of certified nursing assistants employed in sampled nursing homes with minor changes to account for differences in workplace environment and responsibilities. RESULTS From September 2007 to April 2008, interviews were completed with 3,416 aides. A public-use data file that contains the interview responses, sampling weights, and design variables is available. The NHHAS overall response rate weighted by the inverse of the probability of selection was 41 percent. This rate is the product of the weighted first-stage agency response rate of 57 percent (i.e., weighted response rate of 59 percent for agency participation in NHHCS times the weighted response rate of 97 percent for agencies participating in NHHCS that also participated in NHHAS) and the weighted second-stage aide response rate of 72 percent to NHHAS.
Barazzetti Barbieri, Cristina; de Souza Sarkis, Jorge Eduardo
2018-07-01
The forensic interpretation of environmental analytical data is usually challenging due to the high geospatial variability of these data. The measurements' uncertainty includes contributions from the sampling and from the sample handling and preparation processes. These contributions are often disregarded in the quality assurance of analytical results. A pollution crime investigation case was used to develop a methodology able to address these uncertainties in two different environmental compartments, freshwater sediments and landfill leachate. The methodology used to estimate the uncertainty was the duplicate method (which replicates predefined steps of the measurement procedure in order to assess its precision), and the parameters used to investigate the pollution were metals (Cr, Cu, Ni, and Zn) in the leachate, the suspected source, and in the sediment, the possible sink. The metal analysis results were compared to statutory limits, and it was demonstrated that Cr and Ni concentrations in sediment samples exceeded the threshold levels at all sites downstream of the pollution sources, considering the expanded uncertainty U of the measurements and a probability of contamination >0.975 at most sites. Cu and Zn concentrations were above the statutory limits at two sites, but the classification was inconclusive considering the uncertainties of the measurements. Metal analyses in leachate revealed that Cr concentrations were above the statutory limits with a probability of contamination >0.975 in all leachate ponds, while the Cu, Ni and Zn probability of contamination was below 0.025. The results demonstrated that the estimation of the sampling uncertainty, which was the dominant component of the combined uncertainty, is required for a comprehensive interpretation of the results of environmental analyses, particularly in forensic cases. Copyright © 2018 Elsevier B.V. All rights reserved.
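A sketch of the variance decomposition behind the duplicate method, assuming a balanced design (two samples per site, two analyses per sample) and invented concentration data:

```python
import numpy as np

# Shape (site, sample, analysis): illustrative Cr concentrations (mg/kg).
x = np.array([
    [[52.1, 51.4], [48.0, 47.2]],
    [[60.3, 61.0], [55.9, 56.8]],
    [[45.2, 44.1], [49.8, 50.5]],
    [[70.4, 69.2], [64.0, 65.1]],
])

# Within-sample spread estimates the analytical variance.
s2_analysis = np.mean(np.var(x, axis=2, ddof=1))

# Spread of sample means within a site estimates sampling variance plus
# half the analytical variance (each mean averages 2 analyses).
sample_means = x.mean(axis=2)
s2_between = np.mean(np.var(sample_means, axis=1, ddof=1))
s2_sampling = max(0.0, s2_between - s2_analysis / 2.0)

u = np.sqrt(s2_sampling + s2_analysis)        # combined standard uncertainty
print("s_sampling =", np.sqrt(s2_sampling).round(2),
      " s_analysis =", np.sqrt(s2_analysis).round(2),
      " expanded U (k=2) =", (2 * u).round(2))
```

As in the abstract, a layout like this typically shows the sampling component dominating the analytical one, which is why it cannot be ignored when comparing results against statutory limits.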
Wolitzky-Taylor, Kate B.; Ruggiero, Kenneth J.; McCart, Michael R.; Smith, Daniel W.; Hanson, Rochelle F.; Resnick, Heidi S.; de Arellano, Michael A.; Saunders, Benjamin E.; Kilpatrick, Dean G.
2011-01-01
We compared the prevalence and correlates of adolescent suicidal ideation and attempts in two nationally representative probability samples of adolescents interviewed in 1995 (National Survey of Adolescents; N =4,023) and 2005 (National Survey of Adolescents-Replication; N =3,614). Participants in both samples completed a telephone survey that assessed major depressive episode (MDE), post-traumatic stress disorder, suicidal ideation and attempts, violence exposure, and substance use. Results demonstrated that the lifetime prevalence of suicidal ideation among adolescents was lower in 2005 than 1995, whereas the prevalence of suicide attempts remained stable. MDE was the strongest predictor of suicidality in both samples. In addition, several demographic, substance use, and violence exposure variables were significantly associated with increased risk of suicidal ideation and attempts in both samples, with female gender, nonexperimental drug use, and direct violence exposure being consistent risk factors in both samples. PMID:20390799
NASA Astrophysics Data System (ADS)
Nerantzaki, Sofia; Papalexiou, Simon Michael
2017-04-01
Precisely identifying the distribution tail of a geophysical variable is difficult, or even impossible. First, the tail is the part of the distribution for which the least empirical information is available; second, a universally accepted definition of the tail does not and cannot exist; and third, a tail may change over time due to long-term changes. Unfortunately, the tail is the most important part of the distribution, as it dictates the estimates of exceedance probabilities or return periods. Fortunately, based on their tail behavior, probability distributions can be generally categorized into two major families, i.e., sub-exponential (heavy-tailed) and hyper-exponential (light-tailed) distributions. This study aims to update the Mean Excess Function (MEF), providing a useful tool to assess which type of tail better describes empirical data. The MEF is based on the mean value of a variable over a threshold and results in a zero-slope regression line when applied to the Exponential distribution. Here, we construct slope confidence intervals for the Exponential distribution as functions of sample size. The validation of the method using Monte Carlo techniques on four theoretical distributions covering major tail cases (Pareto type II, Log-normal, Weibull and Gamma) revealed that it performs well, especially for large samples. Finally, the method is used to investigate the behavior of daily rainfall extremes; thousands of rainfall records were examined, from all over the world and with sample sizes over 100 years, revealing that heavy-tailed distributions describe rainfall extremes more accurately.
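A sketch of the empirical MEF diagnostic: for an exponential sample the mean excess is flat (slope near zero), while a heavy Pareto-type tail yields a clearly increasing line. Thresholds, sample sizes, and distribution parameters are arbitrary:

```python
import numpy as np

def mean_excess(x, n_thresholds=50):
    """Empirical Mean Excess Function: e(u) = mean(x - u | x > u)."""
    us = np.quantile(x, np.linspace(0.0, 0.98, n_thresholds))
    return us, np.array([np.mean(x[x > u] - u) for u in us])

rng = np.random.default_rng(12)
samples = {
    "exponential": rng.exponential(10.0, size=5_000),
    "pareto_type2": 10.0 * (rng.pareto(2.5, size=5_000) + 1.0),
}
for name, x in samples.items():
    u, e = mean_excess(x)
    slope = np.polyfit(u, e, 1)[0]        # compare against the CI for zero slope
    print(f"{name:12s} MEF slope = {slope:+.3f}")
```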
Bivariate categorical data analysis using normal linear conditional multinomial probability model.
Sun, Bingrui; Sutradhar, Brajendra
2015-02-10
Bivariate multinomial data, such as left- and right-eye retinopathy status data, are analyzed either by using a joint bivariate probability model or by exploiting certain odds-ratio-based association models. However, the joint bivariate probability model yields marginal probabilities that are complicated functions of marginal and association parameters for both variables, and the odds-ratio-based association model treats the odds ratios involved in the joint probabilities as 'working' parameters, which are consequently estimated through certain arbitrary 'working' regression models. Moreover, this latter odds-ratio-based model does not provide any easy interpretation of the correlations between the two categorical variables. On the basis of pre-specified marginal probabilities, in this paper, we develop a bivariate normal-type linear conditional multinomial probability model to understand the correlations between two categorical variables. The parameters involved in the model are consistently estimated using the optimal likelihood and generalized quasi-likelihood approaches. The proposed model and the inferences are illustrated through an intensive simulation study as well as an analysis of the well-known Wisconsin Diabetic Retinopathy status data. Copyright © 2014 John Wiley & Sons, Ltd.
Search for anomalous kinematics in tt̄ dilepton events at CDF II.
Acosta, D; Adelman, J; Affolder, T; Akimoto, T; Albrow, M G; Ambrose, D; Amerio, S; Amidei, D; Anastassov, A; Anikeev, K; Annovi, A; Antos, J; Aoki, M; Apollinari, G; Arisawa, T; Arguin, J-F; Artikov, A; Ashmanskas, W; Attal, A; Azfar, F; Azzi-Bacchetta, P; Bacchetta, N; Bachacou, H; Badgett, W; Barbaro-Galtieri, A; Barker, G J; Barnes, V E; Barnett, B A; Baroiant, S; Barone, M; Bauer, G; Bedeschi, F; Behari, S; Belforte, S; Bellettini, G; Bellinger, J; Ben-Haim, E; Benjamin, D; Beretvas, A; Bhatti, A; Binkley, M; Bisello, D; Bishai, M; Blair, R E; Blocker, C; Bloom, K; Blumenfeld, B; Bocci, A; Bodek, A; Bolla, G; Bolshov, A; Booth, P S L; Bortoletto, D; Boudreau, J; Bourov, S; Brau, B; Bromberg, C; Brubaker, E; Budagov, J; Budd, H S; Burkett, K; Busetto, G; Bussey, P; Byrum, K L; Cabrera, S; Campanelli, M; Campbell, M; Canepa, A; Casarsa, M; Carlsmith, D; Carron, S; Carosi, R; Cavalli-Sforza, M; Castro, A; Catastini, P; Cauz, D; Cerri, A; Cerrito, L; Chapman, J; Chen, C; Chen, Y C; Chertok, M; Chiarelli, G; Chlachidze, G; Chlebana, F; Cho, I; Cho, K; Chokheli, D; Chou, J P; Chu, M L; Chuang, S; Chung, J Y; Chung, W-H; Chung, Y S; Ciobanu, C I; Ciocci, M A; Clark, A G; Clark, D; Coca, M; Connolly, A; Convery, M; Conway, J; Cooper, B; Cordelli, M; Cortiana, G; Cranshaw, J; Cuevas, J; Culbertson, R; Currat, C; Cyr, D; Dagenhart, D; Da Ronco, S; D'Auria, S; de Barbaro, P; De Cecco, S; De Lentdecker, G; Dell'Agnello, S; Dell'Orso, M; Demers, S; Demortier, L; Deninno, M; De Pedis, D; Derwent, P F; Dionisi, C; Dittmann, J R; Dörr, C; Doksus, P; Dominguez, A; Donati, S; Donega, M; Donini, J; D'Onofrio, M; Dorigo, T; Drollinger, V; Ebina, K; Eddy, N; Ehlers, J; Ely, R; Erbacher, R; Erdmann, M; Errede, D; Errede, S; Eusebi, R; Fang, H-C; Farrington, S; Fedorko, I; Fedorko, W T; Feild, R G; Feindt, M; Fernandez, J P; Ferretti, C; Field, R D; Flanagan, G; Flaugher, B; Flores-Castillo, L R; Foland, A; Forrester, S; Foster, G W; Franklin, M; Freeman, J C; Fujii, Y; Furic, I; Gajjar, A; Gallas, A; Galyardt, J; Gallinaro, M; Garcia-Sciveres, M; Garfinkel, A F; Gay, C; Gerberich, H; Gerdes, D W; Gerchtein, E; Giagu, S; Giannetti, P; Gibson, A; Gibson, K; Ginsburg, C; Giolo, K; Giordani, M; Giunta, M; Giurgiu, G; Glagolev, V; Glenzinski, D; Gold, M; Goldschmidt, N; Goldstein, D; Goldstein, J; Gomez, G; Gomez-Ceballos, G; Goncharov, M; González, O; Gorelov, I; Goshaw, A T; Gotra, Y; Goulianos, K; Gresele, A; Griffiths, M; Grosso-Pilcher, C; Grundler, U; Guenther, M; Guimaraes da Costa, J; Haber, C; Hahn, K; Hahn, S R; Halkiadakis, E; Hamilton, A; Han, B-Y; Handler, R; Happacher, F; Hara, K; Hare, M; Harr, R F; Harris, R M; Hartmann, F; Hatakeyama, K; Hauser, J; Hays, C; Hayward, H; Heider, E; Heinemann, B; Heinrich, J; Hennecke, M; Herndon, M; Hill, C; Hirschhbuehl, D; Hocker, A; Hoffman, K D; Holloway, A; Hou, S; Houlden, M A; Huffman, B T; Huang, Y; Hughes, R E; Huston, J; Ikado, K; Incandela, J; Introzzi, G; Iori, M; Ishizawa, Y; Issever, C; Ivanov, A; Iwata, Y; Iyutin, B; James, E; Jang, D; Jarrell, J; Jeans, D; Jensen, H; Jeon, E J; Jones, M; Joo, K K; Jun, S Y; Junk, T; Kamon, T; Kang, J; Karagoz Unel, M; Karchin, P E; Kartal, S; Kato, Y; Kemp, Y; Kephart, R; Kerzel, U; Khotilovich, V; Kilminster, B; Kim, D H; Kim, H S; Kim, J E; Kim, M J; Kim, M S; Kim, S B; Kim, S H; Kim, T H; Kim, Y K; King, B T; Kirby, M; Kirsch, L; Klimenko, S; Knuteson, B; Ko, B R; Kobayashi, H; Koehn, P; Kong, D J; Kondo, K; Konigsberg, J; Kordas, K; Korn, A; Korytov, A; Kotelnikov, K; Kotwal, A V; Kovalev, A; Kraus, J; 
Kravchenko, I; Kreymer, A; Kroll, J; Kruse, M; Krutelyov, V; Kuhlmann, S E; Kwang, S; Laasanen, A T; Lai, S; Lami, S; Lammel, S; Lancaster, J; Lancaster, M; Lander, R; Lannon, K; Lath, A; Latino, G; Lauhakangas, R; Lazzizzera, I; Le, Y; Lecci, C; LeCompte, T; Lee, J; Lee, J; Lee, S W; Lefèvre, R; Leonardo, N; Leone, S; Levy, S; Lewis, J D; Li, K; Lin, C; Lin, C S; Lindgren, M; Liss, T M; Lister, A; Litvintsev, D O; Liu, T; Liu, Y; Lockyer, N S; Loginov, A; Loreti, M; Loverre, P; Lu, R-S; Lucchesi, D; Lujan, P; Lukens, P; Lungu, G; Lyons, L; Lys, J; Lysak, R; MacQueen, D; Madrak, R; Maeshima, K; Maksimovic, P; Malferrari, L; Manca, G; Marginean, R; Marino, C; Martin, A; Martin, M; Martin, V; Martínez, M; Maruyama, T; Matsunaga, H; Mattson, M; Mazzanti, P; McFarland, K S; McGivern, D; McIntyre, P M; McNamara, P; NcNulty, R; Mehta, A; Menzemer, S; Menzione, A; Merkel, P; Mesropian, C; Messina, A; Miao, T; Miladinovic, N; Miller, L; Miller, R; Miller, J S; Miquel, R; Miscetti, S; Mitselmakher, G; Miyamoto, A; Miyazaki, Y; Moggi, N; Mohr, B; Moore, R; Morello, M; Movilla Fernandez, P A; Mukherjee, A; Mulhearn, M; Muller, T; Mumford, R; Munar, A; Murat, P; Nachtman, J; Nahn, S; Nakamura, I; Nakano, I; Napier, A; Napora, R; Naumov, D; Necula, V; Niell, F; Nielsen, J; Nelson, C; Nelson, T; Neu, C; Neubauer, M S; Newman-Holmes, C; Nigmanov, T; Nodulman, L; Norniella, O; Oesterberg, K; Ogawa, T; Oh, S H; Oh, Y D; Ohsugi, T; Okusawa, T; Oldeman, R; Orava, R; Orejudos, W; Pagliarone, C; Palencia, E; Paoletti, R; Papadimitriou, V; Pashapour, S; Patrick, J; Pauletta, G; Paulini, M; Pauly, T; Paus, C; Pellett, D; Penzo, A; Phillips, T J; Piacentino, G; Piedra, J; Pitts, K T; Plager, C; Pompos, A; Pondrom, L; Pope, G; Portell, X; Poukhov, O; Prakoshyn, F; Pratt, T; Pronko, A; Proudfoot, J; Ptohos, F; Punzi, G; Rademachker, J; Rahaman, M A; Rakitine, A; Rappoccio, S; Ratnikov, F; Ray, H; Reisert, B; Rekovic, V; Renton, P; Rescigno, M; Rimondi, F; Rinnert, K; Ristori, L; Robertson, W J; Robson, A; Rodrigo, T; Rolli, S; Rosenson, L; Roser, R; Rossin, R; Rott, C; Russ, J; Rusu, V; Ruiz, A; Ryan, D; Saarikko, H; Sabik, S; Safonov, A; St Denis, R; Sakumoto, W K; Salamanna, G; Saltzberg, D; Sanchez, C; Sansoni, A; Santi, L; Sarkar, S; Sato, K; Savard, P; Savoy-Navarro, A; Schlabach, P; Schmidt, E E; Schmidt, M P; Schmitt, M; Scodellaro, L; Scribano, A; Scuri, F; Sedov, A; Seidel, S; Seiya, Y; Semeria, F; Sexton-Kennedy, L; Sfiligoi, I; Shapiro, M D; Shears, T; Shepard, P F; Sherman, D; Shimojima, M; Shochet, M; Shon, Y; Shreyber, I; Sidoti, A; Siegrist, J; Siket, M; Sill, A; Sinervo, P; Sisakyan, A; Skiba, A; Slaughter, A J; Sliwa, K; Smirnov, D; Smith, J R; Snider, F D; Snihur, R; Soha, A; Somalwar, S V; Spalding, J; Spezziga, M; Spiegel, L; Spinella, F; Spiropulu, M; Squillacioti, P; Stadie, H; Stelzer, B; Stelzer-Chilton, O; Strologas, J; Stuart, D; Sukhanov, A; Sumorok, K; Sun, H; Suzuki, T; Taffard, A; Tafirout, R; Takach, S F; Takano, H; Takashima, R; Takeuchi, Y; Takikawa, K; Tanaka, M; Tanaka, R; Tanimoto, N; Tapprogge, S; Tecchio, M; Teng, P K; Terashi, K; Tesarek, R J; Tether, S; Thom, J; Thompson, A S; Thomson, E; Tipton, P; Tiwari, V; Trkaczyk, S; Toback, D; Tollefson, K; Tomura, T; Tonelli, D; Tönnesmann, M; Torre, S; Torretta, D; Tourneur, S; Trischuk, W; Tseng, J; Tsuchiya, R; Tsuno, S; Tsybychev, D; Turini, N; Turner, M; Ukegawa, F; Unverhau, T; Uozumi, S; Usynin, D; Vacavant, L; Vaiciulis, A; Varganov, A; Vataga, E; Vejcik, S; Velev, G; Veszpremi, V; Veramendi, G; Vickey, T; Vidal, R; Vila, I; 
Vilar, R; Vollrath, I; Volobouev, I; von der Mey, M; Wagner, P; Wagner, R G; Wagner, R L; Wagner, W; Wallny, R; Walter, T; Yamashita, T; Yamamoto, K; Wan, Z; Wang, M J; Wang, S M; Warburton, A; Ward, B; Waschke, S; Waters, D; Watts, T; Weber, M; Wester, W C; Whitehouse, B; Wicklund, A B; Wicklund, E; Williams, H H; Wilson, P; Winer, B L; Wittich, P; Wolbers, S; Wolter, M; Worcester, M; Worm, S; Wright, T; Wu, X; Würthwein, F; Wyatt, A; Yagil, A; Yang, C; Yang, U K; Yao, W; Yeh, G P; Yi, K; Yoh, J; Yoon, P; Yorita, K; Yoshida, T; Yu, I; Yu, S; Yu, Z; Yun, J C; Zanello, L; Zanetti, A; Zaw, I; Zetti, F; Zhou, J; Zsenei, A; Zucchelli, S
2005-07-08
We report on a search for anomalous kinematics of tt̄ dilepton events in pp̄ collisions at √s = 1.96 TeV using 193 pb⁻¹ of data collected with the CDF II detector. We developed a new a priori technique designed to isolate the subset of a data sample that reveals the largest deviation from standard model (SM) expectations and to quantify the significance of this departure. In the four-variable space considered, no particular subset shows a significant discrepancy, and we find that the probability of obtaining a data sample less consistent with the SM than what is observed is 1.0%-4.5%.
[Biometric bases: basic concepts of probability calculation].
Dinya, E
1998-04-26
The author gives an outline of the basic concepts of probability theory. The bases of event algebra, the definition of probability, the classical probability model, and the random variable are presented.
The estimation of tree posterior probabilities using conditional clade probability distributions.
Larget, Bret
2013-07-01
In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample.
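A toy sketch of the conditional clade idea for rooted binary trees with string-labeled leaves; Larget's algorithm treats the unrooted case and handles details (canonical orderings, numerical underflow) omitted here, and all function names are ours:

```python
from collections import Counter

def leafset(tree):
    """Set of taxa below a node; trees are nested tuples of string labels."""
    if isinstance(tree, tuple):
        return frozenset().union(*(leafset(t) for t in tree))
    return frozenset([tree])

def splits(tree):
    """Yield (parent clade, canonical child clade) for each internal node."""
    if isinstance(tree, tuple):
        left, right = tree
        l, r = leafset(left), leafset(right)
        yield l | r, min(l, r, key=sorted)
        yield from splits(left)
        yield from splits(right)

def conditional_clade_distribution(trees):
    """Count parent clades and (parent, child) splits over sampled trees."""
    parents, pairs = Counter(), Counter()
    for t in trees:
        for parent, child in splits(t):
            parents[parent] += 1
            pairs[(parent, child)] += 1
    return parents, pairs

def tree_probability(tree, parents, pairs):
    """Estimate P(tree) as a product of conditional clade probabilities."""
    p = 1.0
    for parent, child in splits(tree):
        if parents[parent] == 0:        # clade never observed in the sample
            return 0.0
        p *= pairs[(parent, child)] / parents[parent]
    return p

sample = [(("a", "b"), ("c", "d"))] * 3 + [((("a", "b"), "c"), "d")]
parents, pairs = conditional_clade_distribution(sample)
for t in {sample[0], sample[3]}:
    print(t, tree_probability(t, parents, pairs))
```

Because the estimate multiplies conditional clade probabilities, any tree assembled entirely from sampled clades receives a positive probability, even if that exact topology never appeared in the sample.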
Ordinal probability effect measures for group comparisons in multinomial cumulative link models.
Agresti, Alan; Kateri, Maria
2017-03-01
We consider simple ordinal model-based probability effect measures for comparing distributions of two groups, adjusted for explanatory variables. An "ordinal superiority" measure summarizes the probability that an observation from one distribution falls above an independent observation from the other distribution, adjusted for explanatory variables in a model. The measure applies directly to normal linear models and to a normal latent variable model for ordinal response variables. It equals Φ(β/√2) for the corresponding ordinal model that applies a probit link function to cumulative multinomial probabilities, for standard normal cdf Φ and effect β that is the coefficient of the group indicator variable. For the more general latent variable model for ordinal responses that corresponds to a linear model with other possible error distributions and corresponding link functions for cumulative multinomial probabilities, the ordinal superiority measure equals exp(β)/[1+exp(β)] with the log-log link and equals approximately exp(β/√2)/[1+exp(β/√2)] with the logit link, where β is the group effect. Another ordinal superiority measure generalizes the difference of proportions from binary to ordinal responses. We also present related measures directly for ordinal models for the observed response that need not assume corresponding latent response models. We present confidence intervals for the measures and illustrate with an example. © 2016, The International Biometric Society.
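The probit-link result can be checked with a small simulation, assuming the latent-variable representation the abstract describes; the effect size β and sample size below are arbitrary:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
beta = 0.8                       # group effect on the latent scale
n = 200_000

# Latent responses: standard normal errors, group 1 shifted by beta.
y0 = rng.normal(0.0, 1.0, n)
y1 = beta + rng.normal(0.0, 1.0, n)

# Ordinal superiority: P(Y1* > Y0*) for independent observations.
simulated = np.mean(y1 > y0)
closed_form = norm.cdf(beta / np.sqrt(2))
print(f"simulated={simulated:.4f}  Phi(beta/sqrt(2))={closed_form:.4f}")
```

The logit analogue, exp(β/√2)/[1 + exp(β/√2)], can be checked the same way by drawing logistic rather than normal errors.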
Bakbergenuly, Ilyas; Morgenthaler, Stephan
2016-01-01
We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group-level studies or in meta-analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log-odds and arcsine transformations of the estimated probability p̂, both for single-group studies and in combining results from several groups or studies in meta-analysis. Our simulations confirm that these biases are linear in ρ (the intracluster correlation coefficient) for small values of ρ. These biases do not depend on the sample sizes or the number of studies K in a meta-analysis, and result in abysmal coverage of the combined effect for large K. We also propose a bias correction for the arcsine transformation. Our simulations demonstrate that this bias correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta-analyses of prevalence. PMID:27192062
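The abstract does not fully specify the simulation design; the following sketch assumes a beta-binomial model for the overdispersed counts and shows the arcsine-transformation bias growing with the intracluster correlation ρ:

```python
import numpy as np

rng = np.random.default_rng(3)

def arcsine_bias(p=0.2, rho=0.05, m=50, n_groups=20_000):
    """Bias of the arcsine-transformed estimate under a beta-binomial
    model with intracluster correlation rho (clusters of size m)."""
    a = p * (1 - rho) / rho            # Beta(a, b) gives ICC = 1/(a + b + 1)
    b = (1 - p) * (1 - rho) / rho
    p_i = rng.beta(a, b, n_groups)     # cluster-level probabilities
    x = rng.binomial(m, p_i)           # overdispersed counts
    est = np.arcsin(np.sqrt(x / m))
    return est.mean() - np.arcsin(np.sqrt(p))

for rho in (0.01, 0.05, 0.10):
    print(f"rho={rho:.2f}  bias={arcsine_bias(rho=rho):+.4f}")
```

The bias should grow roughly linearly in ρ over this range, consistent with the paper's finding.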
Parameterizing deep convection using the assumed probability density function method
Storer, R. L.; Griffin, B. M.; Höft, J.; ...
2014-06-11
Due to their coarse horizontal resolution, present-day climate models must parameterize deep convection. This paper presents single-column simulations of deep convection using a probability density function (PDF) parameterization. The PDF parameterization predicts the PDF of subgrid variability of turbulence, clouds, and hydrometeors. That variability is interfaced to a prognostic microphysics scheme using a Monte Carlo sampling method. The PDF parameterization is used to simulate tropical deep convection, the transition from shallow to deep convection over land, and mid-latitude deep convection. These parameterized single-column simulations are compared with 3-D reference simulations. The agreement is satisfactory except when the convective forcing is weak. The same PDF parameterization is also used to simulate shallow cumulus and stratocumulus layers. The PDF method is sufficiently general to adequately simulate these five deep, shallow, and stratiform cloud cases with a single equation set. This raises hopes that it may be possible in the future, with further refinements at coarse time step and grid spacing, to parameterize all cloud types in a large-scale model in a unified way.
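The core idea, interfacing an assumed subgrid PDF to nonlinear microphysics by Monte Carlo sampling, can be illustrated with a toy sketch; the bivariate normal PDF, the Kessler-style autoconversion rate, and every parameter value below are illustrative assumptions, not the scheme's actual closure:

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed subgrid PDF: bivariate normal in vertical velocity w (m/s)
# and total water q_t (g/kg), positively correlated.
mean = np.array([0.0, 8.0])
cov = np.array([[1.0, 0.6],
                [0.6, 1.5]])

def autoconversion(q_t, q_sat=9.0, k=1e-3):
    """Toy nonlinear process rate: active only where q_t exceeds saturation."""
    return k * np.maximum(q_t - q_sat, 0.0) ** 2

samples = rng.multivariate_normal(mean, cov, size=1000)
subgrid_rate = autoconversion(samples[:, 1]).mean()  # Monte Carlo estimate
gridmean_rate = autoconversion(mean[1])              # rate at the grid mean

print(f"grid-mean rate:   {gridmean_rate:.2e}")
print(f"PDF-sampled rate: {subgrid_rate:.2e}")
```

Because the process rate is nonlinear, the rate evaluated at the grid mean (zero here) differs from the mean rate over the PDF, which is the quantity the parameterization needs.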
NASA Astrophysics Data System (ADS)
Siregar, A. F.; Supriana, T.
2018-02-01
Shallots contain many ingredients useful for human life and are used especially as a flavoring in Indonesian dishes. The need for shallots increases as the population grows, and the increased demand caused prices to rise because production in North Sumatera was low. The objectives of this study were to analyze the interest of farmers in shallot farming, the factors that affect this interest, and the response of this interest to each factor. The sample comprised 85 shallot farmers. A binomial logit model was used for the data analysis. The results showed that the factors influencing the interest of farmers in shallot farming are land area, experience, income, support, and trauma. The probability of a farmer engaging in shallot farming increases by 22% if the area of land increases by one acre. The probability is 0.3% higher with support than without, while it is 0.014% higher without trauma than with it.
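A hedged illustration of the kind of binomial logit analysis described; the study's actual data are not available here, so all variables and coefficients below are synthetic stand-ins:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 85

# Synthetic stand-ins for the study's predictors.
land_area = rng.gamma(2.0, 1.0, n)           # acres
support = rng.integers(0, 2, n)              # 1 = receives support
trauma = rng.integers(0, 2, n)               # 1 = past crop failure
lin = -1.0 + 0.9 * land_area + 0.8 * support - 0.7 * trauma
y = rng.binomial(1, 1 / (1 + np.exp(-lin)))  # interest in shallot farming

X = sm.add_constant(np.column_stack([land_area, support, trauma]))
res = sm.Logit(y, X).fit(disp=False)

# Average marginal effects: change in probability per unit change,
# the scale on which the abstract reports its percentages.
print(res.get_margeff().summary())
```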
Additional Samples: Where They Should Be Located
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pilger, G. G., E-mail: jfelipe@ufrgs.br; Costa, J. F. C. L.; Koppe, J. C.
2001-09-15
Information for mine planning needs to be closely spaced compared with the grid used for exploration and resource assessment. The additional samples collected during quasi-mining are usually located in the same pattern as the original diamond-drillhole net, but more closely spaced. This procedure is not the best, in a mathematical sense, for selecting a location: the impact of additional information in reducing uncertainty about the parameter being modeled is not the same everywhere within the deposit, and some locations are more sensitive in reducing the local and global uncertainty than others. This study introduces a methodology to select additional sample locations based on stochastic simulation. The procedure takes into account data variability and spatial location. Multiple equally probable models representing a geological attribute are generated via geostatistical simulation. These models share basically the same histogram and the same variogram obtained from the original data set. At each block belonging to the model, a value is obtained from the n simulations, and their combination allows one to assess local variability. Variability is measured using a proposed uncertainty index, which was used to map zones of high variability. A value extracted from a given simulation is then added to the original data set in a zone identified as erratic on the previous maps. The process of adding samples and simulating is repeated, and the benefit of the additional sample is evaluated in terms of uncertainty reduction, both locally and globally. The procedure proved to be robust and theoretically sound, mapping the zones where additional information is most beneficial. A case study in a coal mine using coal seam thickness illustrates the method.
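A compact sketch of the simulation-based site-selection step, assuming the uncertainty index is a per-block coefficient of variation across realizations (the paper proposes its own index, which is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(6)

# Stack of equally probable realizations of an attribute (n_sims, ny, nx);
# blocks toward the left edge are deliberately more erratic in this toy field.
shape = np.linspace(2.0, 16.0, 20)[None, None, :]
sims = rng.gamma(shape, 1.0, size=(100, 20, 20))

def uncertainty_index(sims):
    """Per-block coefficient of variation across realizations."""
    return sims.std(axis=0) / sims.mean(axis=0)

ui = uncertainty_index(sims)
iy, ix = np.unravel_index(np.argmax(ui), ui.shape)
print(f"most beneficial additional sample at block ({iy}, {ix}); "
      f"CV = {ui[iy, ix]:.2f}")
```

In the full procedure, the chosen value would be appended to the conditioning data, the simulations re-run, and the index recomputed until the benefit of further samples levels off.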
Gould, A Lawrence; Koglin, Joerg; Bain, Raymond P; Pinto, Cathy-Anne; Mitchel, Yale B; Pasternak, Richard C; Sapre, Aditi
2009-08-01
Studies measuring progression of carotid artery intima-media thickness (cIMT) have been used to estimate the effect of lipid-modifying therapies on cardiovascular event risk. The likelihood that future cIMT clinical trials will detect a true treatment effect is estimated by leveraging results from prior studies. The present analyses assess, based on currently published data from prior clinical studies, the impact of between- and within-study variability on the likelihood that ongoing or future cIMT trials will detect the true treatment effect of lipid-modifying therapies. Published data from six contemporary cIMT studies (ASAP, ARBITER 2, RADIANCE 1, RADIANCE 2, ENHANCE, and METEOR), including data from a total of 3563 patients, were examined. Bayesian and frequentist methods were used to assess the impact of between-study variability on the likelihood of detecting true treatment effects on 1-year cIMT progression/regression, and to provide a sample size estimate that would specifically compensate for the effect of between-study variability. In addition to the well-described within-study variability, there is considerable between-study variability associated with the measurement of annualized change in cIMT. Accounting for the additional between-study variability decreases the power of existing study designs. In order to account for the added between-study variability, future cIMT studies would likely require a large increase in sample size to provide a substantial probability (≥90%) of having 90% power to detect a true treatment effect. Limitation: analyses are based on study-level data; future meta-analyses incorporating patient-level data would be useful for confirmation. Due to substantial within- and between-study variability in the measurement of 1-year change in cIMT, as well as uncertainty about progression rates in contemporary populations, future study designs evaluating the effect of new lipid-modifying therapies on atherosclerotic disease progression are likely to be challenged by the large sample sizes needed to demonstrate a true treatment effect.
NASA Astrophysics Data System (ADS)
Phillips, Thomas J.; Gates, W. Lawrence; Arpe, Klaus
1992-12-01
The effects of sampling frequency on the first- and second-moment statistics of selected European Centre for Medium-Range Weather Forecasts (ECMWF) model variables are investigated in a simulation of "perpetual July" with a diurnal cycle included and with surface and atmospheric fields saved at hourly intervals. The shortest characteristic time scales (as determined by the e-folding time of lagged autocorrelation functions) are those of ground heat fluxes and temperatures, precipitation and runoff, convective processes, cloud properties, and atmospheric vertical motion, while the longest time scales are exhibited by soil temperature and moisture, surface pressure, and atmospheric specific humidity, temperature, and wind. The time scales of surface heat and momentum fluxes and of convective processes are substantially shorter over land than over oceans. An appropriate sampling frequency for each model variable is obtained by comparing the estimates of first- and second-moment statistics determined at intervals ranging from 2 to 24 hours with the "best" estimates obtained from hourly sampling. Relatively accurate estimation of first- and second-moment climate statistics (10% errors in means, 20% errors in variances) can be achieved by sampling a model variable at intervals that usually are longer than the bandwidth of its time series but that often are shorter than its characteristic time scale. For the surface variables, sampling at intervals that are nonintegral divisors of a 24-hour day yields relatively more accurate time-mean statistics because of a reduction in errors associated with aliasing of the diurnal cycle and higher-frequency harmonics. The superior estimates of first-moment statistics are accompanied by inferior estimates of the variance of the daily means due to the presence of systematic biases, but these probably can be avoided by defining a different measure of low-frequency variability. Estimates of the intradiurnal variance of accumulated precipitation and surface runoff also are strongly impacted by the length of the storage interval. In light of these results, several alternative strategies for storage of the ECMWF model variables are recommended.
van Walraven, Carl; Austin, Peter C; Manuel, Douglas; Knoll, Greg; Jennings, Allison; Forster, Alan J
2010-12-01
Administrative databases commonly use codes to indicate diagnoses. These codes alone are often inadequate to accurately identify patients with particular conditions. In this study, we determined whether we could quantify the probability that a person has a particular disease (in this case, renal failure) using other routinely collected information available in an administrative data set. This would allow the accurate identification of a disease cohort in an administrative database. We determined whether patients in a randomly selected 100,000 hospitalizations had kidney disease (defined as two or more sequential serum creatinines or the single admission creatinine indicating a calculated glomerular filtration rate less than 60 mL/min/1.73 m²). The independent association of patient- and hospitalization-level variables with renal failure was measured using a multivariate logistic regression model in a random 50% sample of the patients. The model was validated in the remaining patients. Twenty thousand seven hundred thirteen patients had kidney disease (20.7%). A diagnostic code of kidney disease was strongly associated with kidney disease (relative risk: 34.4), but the accuracy of the code was poor (sensitivity: 37.9%; specificity: 98.9%). Twenty-nine patient- and hospitalization-level variables entered the kidney disease model. This model had excellent discrimination (c-statistic: 90.1%) and accurately predicted the probability of true renal failure. The probability threshold that maximized sensitivity and specificity for the identification of true kidney disease was 21.3% (sensitivity: 80.0%; specificity: 82.2%). Multiple variables available in administrative databases can be combined to quantify the probability that a person has a particular disease. This process permits accurate identification of a disease cohort in an administrative database. These methods may be extended to other diagnoses or procedures and could both facilitate and clarify the use of administrative databases for research and quality improvement. Copyright © 2010 Elsevier Inc. All rights reserved.
Charney, Noah D.; Kubel, Jacob E.; Eiseman, Charles S.
2015-01-01
Improving detection rates for elusive species with clumped distributions is often accomplished through adaptive sampling designs. This approach can be extended to include species with temporally variable detection probabilities. By concentrating survey effort in years when the focal species are most abundant or visible, overall detection rates can be improved. This requires either long-term monitoring at a few locations where the species are known to occur or models capable of predicting population trends using climatic and demographic data. For marbled salamanders (Ambystoma opacum) in Massachusetts, we demonstrate that annual variation in detection probability of larvae is regionally correlated. In our data, the difference in survey success between years was far more important than the difference among the three survey methods we employed: diurnal surveys, nocturnal surveys, and dipnet surveys. Based on these data, we simulate future surveys to locate unknown populations under a temporally adaptive sampling framework. In the simulations, when pond dynamics are correlated over the focal region, the temporally adaptive design improved mean survey success by as much as 26% over a non-adaptive sampling design. Employing a temporally adaptive strategy costs very little, is simple, and has the potential to substantially improve the efficient use of scarce conservation funds. PMID:25799224
Meta-analysis with missing study-level sample variance data.
Chowdhry, Amit K; Dworkin, Robert H; McDermott, Michael P
2016-07-30
We consider a study-level meta-analysis with a normally distributed outcome variable and possibly unequal study-level variances, where the object of inference is the difference in means between a treatment and control group. A common complication in such an analysis is missing sample variances for some studies. A frequently used approach is to impute the weighted (by sample size) mean of the observed variances (mean imputation). Another approach is to include only those studies with variances reported (complete case analysis). Both mean imputation and complete case analysis are only valid under the missing-completely-at-random assumption, and even then the inverse variance weights produced are not necessarily optimal. We propose a multiple imputation method employing gamma meta-regression to impute the missing sample variances. Our method takes advantage of study-level covariates that may be used to provide information about the missing data. Through simulation studies, we show that multiple imputation, when the imputation model is correctly specified, is superior to competing methods in terms of confidence interval coverage probability and type I error probability when testing a specified group difference. Finally, we describe a similar approach to handling missing variances in cross-over studies. Copyright © 2016 John Wiley & Sons, Ltd.
More than Just Convenient: The Scientific Merits of Homogeneous Convenience Samples
Jager, Justin; Putnick, Diane L.; Bornstein, Marc H.
2017-01-01
Despite their disadvantaged generalizability relative to probability samples, non-probability convenience samples are the standard within developmental science, and likely will remain so because probability samples are cost-prohibitive and most available probability samples are ill-suited to examine developmental questions. In lieu of focusing on how to eliminate or sharply reduce reliance on convenience samples within developmental science, here we propose how to augment their advantages when it comes to understanding population effects as well as subpopulation differences. Although all convenience samples have less clear generalizability than probability samples, we argue that homogeneous convenience samples have clearer generalizability relative to conventional convenience samples. Therefore, when researchers are limited to convenience samples, they should consider homogeneous convenience samples as a positive alternative to conventional (or heterogeneous) convenience samples. We discuss future directions as well as potential obstacles to expanding the use of homogeneous convenience samples in developmental science. PMID:28475254
A Method for Evaluating Tuning Functions of Single Neurons based on Mutual Information Maximization
NASA Astrophysics Data System (ADS)
Brostek, Lukas; Eggert, Thomas; Ono, Seiji; Mustari, Michael J.; Büttner, Ulrich; Glasauer, Stefan
2011-03-01
We introduce a novel approach for the evaluation of neuronal tuning functions, which can be expressed by the conditional probability of observing a spike given any combination of independent variables. This probability can be estimated from experimentally available data. By maximizing the mutual information between the probability distribution of the spike occurrence and that of the variables, the dependence of the spike on the input variables is maximized as well. We used this method to analyze the dependence of neuronal activity in cortical area MSTd on signals related to movement of the eye and retinal image movement.
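A sketch of the underlying computation, reducing the framework to a plug-in estimate of the mutual information between a binary spike indicator and one binned input variable; the sigmoidal tuning curve and all parameters are invented for illustration:

```python
import numpy as np

def mutual_information(spikes, stimulus, n_bins=20):
    """Plug-in MI (bits) between a binary spike train and a binned variable."""
    spikes = np.asarray(spikes, dtype=int)
    bins = np.quantile(stimulus, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.digitize(stimulus, bins[1:-1]), 0, n_bins - 1)
    joint = np.zeros((n_bins, 2))
    for i, s in zip(idx, spikes):          # joint histogram of (bin, spike)
        joint[i, s] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal over spike state
    ps = joint.sum(axis=0, keepdims=True)  # marginal over stimulus bin
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ ps)[nz])))

# Toy neuron: spike probability depends sigmoidally on eye velocity (deg/s).
rng = np.random.default_rng(7)
v = rng.normal(0.0, 10.0, 50_000)
p_spike = 1 / (1 + np.exp(-(v - 5.0) / 3.0))
spikes = rng.binomial(1, p_spike)
print(f"MI = {mutual_information(spikes, v):.3f} bits")
```

Candidate tuning functions (or combinations of input variables) can then be compared by the MI they achieve, which is the selection principle the abstract describes.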
Decision theory for computing variable and value ordering decisions for scheduling problems
NASA Technical Reports Server (NTRS)
Linden, Theodore A.
1993-01-01
Heuristics that guide search are critical when solving large planning and scheduling problems, but most variable and value ordering heuristics are sensitive to only one feature of the search state. One wants to combine evidence from all features of the search state into a subjective probability that a value choice is best, but there has been no solid semantics for merging evidence when it is conceived in these terms. Instead, variable and value ordering decisions should be viewed as problems in decision theory. This led to two key insights: (1) The fundamental concept that allows heuristic evidence to be merged is the net incremental utility that will be achieved by assigning a value to a variable. Probability distributions about net incremental utility can merge evidence from the utility function, binary constraints, resource constraints, and other problem features. The subjective probability that a value is the best choice is then derived from probability distributions about net incremental utility. (2) The methods used for rumor control in Bayesian Networks are the primary way to prevent cycling in the computation of probable net incremental utility. These insights lead to semantically justifiable ways to compute heuristic variable and value ordering decisions that merge evidence from all available features of the search state.
Influence of internal variability on population exposure to hydroclimatic changes
NASA Astrophysics Data System (ADS)
Mankin, Justin S.; Viviroli, Daniel; Mekonnen, Mesfin M.; Hoekstra, Arjen Y.; Horton, Radley M.; E Smerdon, Jason; Diffenbaugh, Noah S.
2017-04-01
Future freshwater supply, human water demand, and people’s exposure to water stress are subject to multiple sources of uncertainty, including unknown future pathways of fossil fuel and water consumption, and ‘irreducible’ uncertainty arising from internal climate system variability. Such internal variability can conceal forced hydroclimatic changes on multi-decadal timescales and near-continental spatial scales. Using three projections of population growth, a large ensemble from a single Earth system model, and assuming stationary per capita water consumption, we quantify the likelihoods of future population exposure to increased hydroclimatic deficits, which we define as the average duration and magnitude by which evapotranspiration exceeds precipitation in a basin. We calculate that by 2060, ~31%-35% of the global population will be exposed to >50% probability of hydroclimatic deficit increases that exceed existing hydrological storage, with up to 9% of people exposed to >90% probability. However, internal variability, which is an irreducible uncertainty in climate model predictions that is under-sampled in water resource projections, creates substantial uncertainty in predicted exposure: ~86%-91% of people will reside where irreducible uncertainty spans the potential for both increases and decreases in sub-annual water deficits. In one population scenario, changes in exposure to large hydroclimate deficits vary from -3% to +6% of global population, a range arising entirely from internal variability. The uncertainty in risk arising from irreducible uncertainty in the precise pattern of hydroclimatic change, which is typically conflated with other uncertainties in projections, is critical for climate risk management that seeks to optimize adaptations that are robust to the full set of potential real-world outcomes.
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
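A minimal sketch of the LHS propagation described, assuming a logistic model with two explanatory variables; the coefficient means, standard errors, and data-error magnitudes are invented for illustration:

```python
import numpy as np
from scipy.stats import qmc, norm

rng = np.random.default_rng(8)
n_lhs = 1000

# Model error: logistic regression coefficients ~ N(mean, se).
coef_mean = np.array([-2.0, 0.8, 1.2])   # intercept, nitrogen load, well depth
coef_se = np.array([0.3, 0.15, 0.25])

# Data error: uncertainty in the explanatory variables at one location.
x_mean = np.array([1.5, 0.7])
x_se = np.array([0.2, 0.1])

# One Latin hypercube over all five uncertain inputs, mapped to normals.
u = qmc.LatinHypercube(d=5, seed=rng).random(n_lhs)
coefs = norm.ppf(u[:, :3]) * coef_se + coef_mean
xs = norm.ppf(u[:, 3:]) * x_se + x_mean

logit = coefs[:, 0] + (coefs[:, 1:] * xs).sum(axis=1)
p = 1 / (1 + np.exp(-logit))
lo, hi = np.percentile(p, [2.5, 97.5])
print(f"vulnerability probability: median={np.median(p):.2f}, "
      f"95% interval=({lo:.2f}, {hi:.2f})")
```

Repeating this calculation cell by cell across the GIS grid yields the spatially varying prediction intervals the abstract reports.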
Comonotonic bounds on the survival probabilities in the Lee-Carter model for mortality projection
NASA Astrophysics Data System (ADS)
Denuit, Michel; Dhaene, Jan
2007-06-01
In the Lee-Carter framework, future survival probabilities are random variables with an intricate distribution function. In large homogeneous portfolios of life annuities, value-at-risk or conditional tail expectation of the total yearly payout of the company are approximately equal to the corresponding quantities involving random survival probabilities. This paper aims to derive some bounds in the increasing convex (or stop-loss) sense on these random survival probabilities. These bounds are obtained with the help of comonotonic upper and lower bounds on sums of correlated random variables.
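The comonotonic upper bound itself is easy to sketch numerically: the comonotonic sum applies all marginal quantile functions to a single shared uniform variable, which dominates any other dependence structure with the same marginals in the stop-loss sense. The lognormal marginals and retention level below are illustrative assumptions, not the paper's mortality model:

```python
import numpy as np
from scipy.stats import lognorm

rng = np.random.default_rng(9)
n = 100_000

# Two lognormal marginals standing in for correlated payout drivers.
m1 = lognorm(s=0.3)
m2 = lognorm(s=0.5, scale=2.0)

u = rng.uniform(size=n)
s_comonotonic = m1.ppf(u) + m2.ppf(u)   # shared U: comonotonic upper bound
s_independent = (m1.rvs(size=n, random_state=rng)
                 + m2.rvs(size=n, random_state=rng))

def stop_loss(s, d):
    """Stop-loss premium E[(S - d)+]."""
    return np.maximum(s - d, 0.0).mean()

d = 5.0
print(f"stop-loss at d={d}: independent={stop_loss(s_independent, d):.4f}, "
      f"comonotonic={stop_loss(s_comonotonic, d):.4f}")
```

The comonotonic premium exceeds the independent one, illustrating why these bounds give conservative estimates of quantities such as the conditional tail expectation of an annuity portfolio's payout.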
Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard
2007-01-01
Background: Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. Methods: We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. Application: We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. Conclusion: This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy. PMID:17543100
Risk and protective factors of dissocial behavior in a probability sample.
Moral de la Rubia, José; Ortiz Morales, Humberto
2012-07-01
The aims of this study were to identify risk and protective factors for dissocial behavior, keeping in mind that self-reports of dissocial behavior are biased by impression management. A probability sample of adolescents living in two neighborhoods with high rates of gangs and offenses (112 men and 86 women) was collected. The 27-item Dissocial Behavior Scale (ECODI27; Pacheco & Moral, 2010), the Balanced Inventory of Desirable Responding, version 6 (BIDR-6; Paulhus, 1991), the Sensation Seeking Scale, form V (SSS-V; Zuckerman, Eysenck, & Eysenck, 1978), the Parent-Adolescent Communication Scale (PACS; Barnes & Olson, 1982), the 30-item Rathus Assertiveness Schedule (RAS; Rathus, 1973), the Interpersonal Reactivity Index (IRI; Davis, 1983), and a social relationship questionnaire (SRQ) were administered. Binary logistic regression was used for the data analysis. A third of the participants showed dissocial behavior. Belonging to a gang in school (schooled adolescents) or to a gang outside school and work (total sample) and disinhibition were risk factors; being a woman, perspective taking, and open communication with the father were protective factors. School-leaving was a differential aspect. We emphasize the need for intervention on these variables.
King, C; Siegel, M
1999-12-01
This study investigated whether cigarette brands popular among youths are preferentially advertised in magazines with high youth readerships. Using a probit regression model of 1986-1994 data, we estimated the effect of the percentage of youth (ages 12-17) readers in a magazine on the probability of a cigarette brand advertising in that magazine and compared these effects for youth cigarette brands (those smoked by more than 2.5% of 10-15-year-old smokers) and adult cigarette brands. We controlled for the percentages of young adult (ages 18-24), female, black, and Hispanic readers. Holding all other variables constant at their sample means, the probability of an adult brand advertising in a magazine decreased from 0.76 (95% confidence interval [CI], 0.67-0.85) at a youth readership level of 2% (the lowest level of percentage youth readership in the sample magazines) to 0.46 (95% CI, 0.29-0.64) at a youth readership level of 47% (the highest level in the sample magazines). In contrast, the probability of a youth brand advertising in a magazine increased from 0.63 (95% CI, 0.51-0.75) at a youth readership level of 2% to 0.84 (95% CI, 0.72-0.96) at a youth readership level of 47%. It was concluded that, over nearly a decade, cigarette brands popular among youths were more likely than adult brands to advertise in magazines with high youth readerships.
Maximum-entropy probability distributions under Lp-norm constraints
NASA Technical Reports Server (NTRS)
Dolinar, S.
1991-01-01
Continuous probability density functions and discrete probability mass functions are tabulated which maximize the differential entropy or absolute entropy, respectively, among all probability distributions with a given Lp norm (i.e., a given pth absolute moment when p is a finite integer) and unconstrained or constrained value set. Expressions for the maximum entropy are evaluated as functions of the Lp norm. The most interesting results are obtained and plotted for unconstrained (real valued) continuous random variables and for integer valued discrete random variables. The maximum entropy expressions are obtained in closed form for unconstrained continuous random variables, and in this case there is a simple straight line relationship between the maximum differential entropy and the logarithm of the Lp norm. Corresponding expressions for arbitrary discrete and constrained continuous random variables are given parametrically; closed form expressions are available only for special cases. However, simpler alternative bounds on the maximum entropy of integer valued discrete random variables are obtained by applying the differential entropy results to continuous random variables which approximate the integer valued random variables in a natural manner. All the results are presented in an integrated framework that includes continuous and discrete random variables, constraints on the permissible value set, and all possible values of p. Understanding such as this is useful in evaluating the performance of data compression schemes.
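For the unconstrained continuous case, the closed-form maximizer is the generalized Gaussian family; the following is a standard derivation sketch consistent with the abstract's straight-line claim (notation ours, not the report's):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Among densities with fixed $\|X\|_p = (\mathbb{E}|X|^p)^{1/p}$, differential
entropy is maximized by the generalized Gaussian
\[
  f(x) = \frac{1}{2\alpha\,\Gamma(1+1/p)}\,
         \exp\!\bigl(-(|x|/\alpha)^p\bigr),
  \qquad \alpha = p^{1/p}\,\|X\|_p ,
\]
whose entropy is linear in $\log\|X\|_p$ with unit slope:
\[
  h(X) = \log\|X\|_p + \frac{1}{p}
         + \log\!\bigl(2\,p^{1/p}\,\Gamma(1+1/p)\bigr).
\]
\end{document}
```

Setting p = 2 recovers the familiar Gaussian maximum-entropy result under a variance constraint, which is a useful consistency check.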
Navratil, Sarah; Gregory, Ashley; Bauer, Arin; Srinath, Indumathi; Szonyi, Barbara; Nightingale, Kendra; Anciso, Juan; Jun, Mikyoung; Han, Daikwon; Lawhon, Sara; Ivanek, Renata
2014-01-01
The National Resources Information (NRI) databases provide underutilized information on the local farm conditions that may predict microbial contamination of leafy greens at preharvest. Our objective was to identify NRI weather and landscape factors affecting spinach contamination with generic Escherichia coli individually and jointly with farm management and environmental factors. For each of the 955 georeferenced spinach samples (including 63 positive samples) collected between 2010 and 2012 on 12 farms in Colorado and Texas, we extracted variables describing the local weather (ambient temperature, precipitation, and wind speed) and landscape (soil characteristics and proximity to roads and water bodies) from NRI databases. Variables describing farm management and environment were obtained from a survey of the enrolled farms. The variables were evaluated using a mixed-effect logistic regression model with random effects for farm and date. The model identified precipitation as a single NRI predictor of spinach contamination with generic E. coli, indicating that the contamination probability increases with an increasing mean amount of rain (mm) in the past 29 days (odds ratio [OR] = 3.5). The model also identified the farm's hygiene practices as a protective factor (OR = 0.06) and manure application (OR = 52.2) and state (OR = 108.1) as risk factors. In cross-validation, the model showed a solid predictive performance, with an area under the receiver operating characteristic (ROC) curve of 81%. Overall, the findings highlighted the utility of NRI precipitation data in predicting contamination and demonstrated that farm management, environment, and weather factors should be considered jointly in development of good agricultural practices and measures to reduce produce contamination. PMID:24509926
Joint probabilities and quantum cognition
NASA Astrophysics Data System (ADS)
de Barros, J. Acacio
2012-12-01
In this paper we discuss the existence of joint probability distributions for quantumlike response computations in the brain. We do so by focusing on a contextual neural-oscillator model shown to reproduce the main features of behavioral stimulus-response theory. We then exhibit a simple example of contextual random variables not having a joint probability distribution, and describe how such variables can be obtained from neural oscillators, but not from a quantum observable algebra.
NASA Astrophysics Data System (ADS)
Greco, R.; Sorriso-Valvo, M.
2013-09-01
Several authors, following different methodological approaches, have employed logistic regression (LR), a multivariate statistical technique, to assess the spatial probability of landslides, even though its fundamental principles have remained unaltered. This study assesses the influence of some of these methodological choices on the performance of LR through a series of sensitivity analyses developed over a test area of about 300 km2 in Calabria (southern Italy). In particular, four types of sampling (1 - the whole study area; 2 - transects running parallel to the general slope direction of the study area, with a total surface of about 1/3 of the whole study area; 3 - buffers surrounding the phenomena with a 1/1 ratio between stable and unstable areas; 4 - buffers surrounding the phenomena with a 1/2 ratio between stable and unstable areas), two variable coding modes (1 - grouped variables; 2 - binary variables), and two types of elementary land unit (1 - cell units; 2 - slope units) were tested. The results obtained are statistically relevant in all cases (Aroc values > 70%), confirming the soundness of the LR analysis, which maintains high predictive capacity regardless of the features of the input data. For the area under investigation, the best-performing methodological choices are the following: (i) as for sampling, transects produced the best results (0 < P(y) ≤ 93.4%; Aroc = 79.5%); (ii) as for variable coding, binary variables (0 < P(y) ≤ 98.3%; Aroc = 80.7%) provided better performance than grouped variables; (iii) as for the choice of elementary land unit, slope units (0 < P(y) ≤ 100%; Aroc = 84.2%) obtained better results than cell units.
Kalkhan, M.A.; Stohlgren, T.J.
2000-01-01
Land managers need better techniques to assess exotic plant invasions. We used the cross-correlation statistic, IYZ, to test for the presence of spatial cross-correlation between pair-wise combinations of soil characteristics, topographic variables, plant species richness, and cover of vascular plants in a 754 ha study site in Rocky Mountain National Park, Colorado, U.S.A. Using 25 large plots (1000 m2) in five vegetation types, 8 of 12 variables showed significant spatial cross-correlation with at least one other variable, while 6 of 12 variables showed significant spatial auto-correlation. Elevation and slope showed significant spatial cross-correlation with all variables except percent cover of native and exotic species. Percent cover of native species had significant spatial cross-correlations with soil variables, but not with exotic species. This was probably because of the patchy distributions of vegetation types in the study area. At a finer resolution, using data from ten 1 m2 subplots within each of the 1000 m2 plots, all variables showed significant spatial auto- and cross-correlation. Large-plot sampling was more affected by topographic factors than species distribution patterns, while with finer resolution sampling, the opposite was true. However, the statistically and biologically significant spatial correlation of native and exotic species could only be detected with finer resolution sampling. We found exotic plant species invading areas with high native plant richness and cover, and in fertile soils high in nitrogen, silt, and clay. Spatial auto- and cross-correlation statistics, along with the integration of remotely sensed data and geographic information systems, are powerful new tools for evaluating the patterns and distribution of native and exotic plant species in relation to landscape structure.
NASA Astrophysics Data System (ADS)
Alfano, M.; Bisagni, C.
2017-01-01
The objective of the running EU project DESICOS (New Robust DESign Guideline for Imperfection Sensitive COmposite Launcher Structures) is to formulate an improved shell design methodology in order to meet the aerospace industry's demand for lighter structures. Within the project, this article discusses a probability-based methodology developed at Politecnico di Milano. It combines the Stress-Strength Interference Method and the Latin Hypercube Method with the aim of predicting the buckling response of three sandwich composite cylindrical shells, assuming a loading condition of pure compression. The three shells are made of the same material but have different stacking sequences and geometric dimensions; one of them presents three circular cut-outs. Different types of input imperfections, treated as random variables, are taken into account independently and in combination: variability in longitudinal Young's modulus, ply misalignment, geometric imperfections, and boundary imperfections. The methodology enables a first assessment of the structural reliability of the shells through the calculation of a probabilistic buckling factor for a specified level of probability. The factor depends highly on the reliability level, on the number of adopted samples, and on the assumptions made in modeling the input imperfections. The main advantage of the developed procedure is its versatility, as it can be applied to the buckling analysis of laminated composite shells and sandwich composite shells including different types of imperfections.
Nichols, J.D.; Pollock, K.H.
1983-01-01
Capture-recapture models can be used to estimate parameters of interest from paleobiological data when encounter probabilities are unknown and variable over time. These models also permit estimation of sampling variances, and goodness-of-fit tests are available for assessing the fit of data to most models. The authors describe capture-recapture models which should be useful in paleobiological analyses and discuss the assumptions which underlie them. They illustrate these models with examples and discuss aspects of study design.
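For readers unfamiliar with the machinery, the simplest member of this model family is the two-sample Lincoln-Petersen estimator, shown here with the Chapman correction; the numbers are invented and the authors' models are considerably more general:

```python
# Chapman-corrected Lincoln-Petersen abundance estimate from two samples
n1 = 120   # taxa (or individuals) "captured" in the first sample
n2 = 100   # captured in the second sample
m2 = 40    # captured in both

N_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1            # abundance estimate
var_hat = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)
           / ((m2 + 1) ** 2 * (m2 + 2)))              # sampling variance
print(N_hat, var_hat)   # ≈ 297.1 and its estimated variance
```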
On the use of variability time-scales as an early classifier of radio transients and variables
NASA Astrophysics Data System (ADS)
Pietka, M.; Staley, T. D.; Pretorius, M. L.; Fender, R. P.
2017-11-01
We have shown previously that a broad correlation between the peak radio luminosity and the variability time-scales, approximately L ∝ τ^5, exists for variable synchrotron emitting sources and that different classes of astrophysical sources occupy different regions of luminosity and time-scale space. Based on those results, we investigate whether the most basic information available for a newly discovered radio variable or transient - their rise and/or decline rate - can be used to set initial constraints on the class of events from which they originate. We have analysed a sample of ≈800 synchrotron flares, selected from light curves of ≈90 sources observed at 5-8 GHz, representing a wide range of astrophysical phenomena, from flare stars to supermassive black holes. Selection of outbursts from the noisy radio light curves has been done automatically in order to ensure reproducibility of results. The distribution of rise/decline rates for the selected flares is modelled as a Gaussian probability distribution for each class of object, and further convolved with estimated areal density of that class in order to correct for the strong bias in our sample. We show in this way that comparing the measured variability time-scale of a radio transient/variable of unknown origin can provide an early, albeit approximate, classification of the object, and could form part of a suite of measurements used to provide early categorization of such events. Finally, we also discuss the effect scintillating sources will have on our ability to classify events based on their variability time-scales.
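A hedged sketch of the classification step described above: class-conditional Gaussians on the (log) rise rate, weighted by an assumed areal-density prior, yield posterior class probabilities. All class parameters below are invented placeholders, not the fitted values from the paper.

```python
import numpy as np
from scipy.stats import norm

classes = {
    # name: (mean log10 rise rate, std, assumed areal density prior)
    "flare star": (-1.0, 0.6, 0.70),
    "XRB":        ( 0.2, 0.5, 0.25),
    "AGN":        ( 1.1, 0.8, 0.05),
}

def classify(log_rate):
    # unnormalized posterior weight = prior density * Gaussian likelihood
    w = {k: d * norm(mu, sd).pdf(log_rate) for k, (mu, sd, d) in classes.items()}
    z = sum(w.values())
    return {k: v / z for k, v in w.items()}

print(classify(0.4))   # posterior probability for each class
```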
Perron, Marc; Gendron, Chantal; Langevin, Pierre; Leblond, Jean; Roos, Marianne; Roy, Jean-Sébastien
2018-04-02
Low back pain (LBP) encompasses heterogeneous patients unlikely to respond to a unique treatment. Identifying sub-groups of LBP may help to improve treatment outcomes. This is a hypothesis-setting study designed to create a clinical prediction rule (CPR) that will predict favorable outcomes in soldiers with sub-acute and chronic LBP participating in a multi-station exercise program. Military members with LBP participated in a supervised program comprising 7 stations each consisting of exercises of increasing difficulty. Demographic, impairment and disability data were collected at baseline. The modified Oswestry Disability Index (ODI) was administered at baseline and following the 6-week program. An improvement of 50% in the initial ODI score was considered the reference standard to determine a favorable outcome. Univariate associations with favorable outcome were tested using chi-square or paired t-tests. Variables that showed between-group (favorable/unfavorable) differences were entered into a logistic regression after determining the sampling adequacy. Finally, continuous variables were dichotomized and the sensitivity, specificity and positive and negative likelihood ratios were determined for the model and for each variable. A sample of 85 participants was included in analyses. Five variables contributed to prediction of a favorable outcome: no pain in lying down (p = 0.017), no use of antidepressants (p = 0.061), FABQ work score < 22.5 (p = 0.061), fewer than 5 physiotherapy sessions before entering the program (p = 0.144) and less than 6 months' work restriction (p = 0.161). This model yielded a sensitivity of 0.78, specificity of 0.80, LR+ of 3.88, and LR- of 0.28. A 77.5% probability of favorable outcome can be predicted by the presence of more than three of the five variables, while an 80% probability of unfavorable outcome can be expected if only three or fewer variables are present. The use of prognostic factors may guide clinicians in identifying soldiers with LBP most likely to have a favorable outcome. Further validation studies are needed to determine if the variables identified in our study are treatment effect modifiers that can predict success following participation in the multi-station exercise program. ClinicalTrials.gov Identifier: NCT03464877 registered retrospectively on 14 March 2018.
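The reported likelihood ratios follow directly from the sensitivity and specificity quoted above; a quick arithmetic check (the small differences against the published 3.88 and 0.28 come from rounding in the inputs):

```python
sens, spec = 0.78, 0.80
lr_pos = sens / (1 - spec)      # 3.90  (paper reports 3.88 before rounding)
lr_neg = (1 - sens) / spec      # 0.275 (paper reports 0.28)
print(lr_pos, lr_neg)
```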
Determination of riverbank erosion probability using Locally Weighted Logistic Regression
NASA Astrophysics Data System (ADS)
Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos
2015-04-01
Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models for predicting vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially; therefore, a non-stationary regression model is preferred to a stationary equivalent, and Locally Weighted Regression (LWR) is proposed as a suitable choice. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The method can be extended to predict the binary presence or absence of erosion by combining it with logistic regression, a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. a binary response) based on one or more predictor variables; the probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. The combined method, referred to as Locally Weighted Logistic Regression (LWLR), assigns spatial weights to the local independent variables and predicts the presence or absence of erosion for any value of the independent variables. The erosion occurrence probability can be calculated in conjunction with the model deviance for the independent variables tested; the most straightforward goodness-of-fit measure is the G statistic, a simple and effective way to evaluate the efficiency of the logistic regression model and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of riverbank slope, river cross-section width, and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied, as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along riverbanks and can be used to assist in managing erosion and flooding events. Acknowledgements This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers).
The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
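A minimal sketch of the tricubic weighting at the heart of the LWLR scheme described above, using a weighted GLM fit as a stand-in for the authors' estimator; coordinates, bandwidth, and data are invented:

```python
import numpy as np
import statsmodels.api as sm

def tricube(d, h):
    """Tricubic spatial weight: w = (1 - (d/h)^3)^3 for d < h, else 0."""
    u = np.clip(d / h, 0.0, 1.0)
    return (1.0 - u**3) ** 3

rng = np.random.default_rng(2)
coords = rng.uniform(0, 10, size=(30, 2))          # measurement locations
X = sm.add_constant(rng.normal(size=(30, 2)))      # bank slope, section width
y = rng.binomial(1, 0.5, size=30)                  # erosion presence/absence

x0 = np.array([5.0, 5.0])                          # prediction location
w = tricube(np.linalg.norm(coords - x0, axis=1), h=6.0)
local_fit = sm.GLM(y, X, family=sm.families.Binomial(), freq_weights=w).fit()
print(local_fit.predict([1.0, 0.0, 0.0]))          # local erosion probability
```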
Probabilistic inference using linear Gaussian importance sampling for hybrid Bayesian networks
NASA Astrophysics Data System (ADS)
Sun, Wei; Chang, K. C.
2005-05-01
Probabilistic inference for Bayesian networks is in general NP-hard using either exact algorithms or approximate methods. However, for very complex networks, only approximate methods such as stochastic sampling can provide a solution within a given time constraint. Several simulation methods are currently available: logic sampling (the first proposed stochastic method for Bayesian networks), the likelihood weighting algorithm (the most commonly used simulation method because of its simplicity and efficiency), the Markov blanket scoring method, and the importance sampling algorithm. In this paper, we first briefly review and compare these available simulation methods, then we propose an improved importance sampling algorithm, called the linear Gaussian importance sampling algorithm for general hybrid models (LGIS). LGIS is aimed at hybrid Bayesian networks consisting of both discrete and continuous random variables with arbitrary distributions. It uses a linear function and additive Gaussian noise to approximate the true conditional probability distribution of a continuous variable given both its parents and evidence in a Bayesian network. One of the most important features of the newly developed method is that it can adaptively learn the optimal importance function from the previous samples. We test the inference performance of LGIS using a 16-node linear Gaussian model and a 6-node general hybrid model. The performance comparison with other well-known methods such as junction tree (JT) and likelihood weighting (LW) shows that LGIS is very promising.
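As background for the comparison above, here is the basic likelihood-weighting scheme on a two-node linear Gaussian network A -> B with evidence on B; the parameters are invented, and LGIS's contribution is to replace this prior-based proposal with an adaptively learned importance function:

```python
import numpy as np
rng = np.random.default_rng(3)

def likelihood_weighting(b_obs, n=100_000):
    a = rng.normal(0.0, 1.0, size=n)          # sample root A ~ N(0, 1)
    # evidence node B | A ~ N(2A + 1, 0.5^2) is not sampled; it contributes
    # a weight equal to its likelihood at the observed value
    w = np.exp(-0.5 * ((b_obs - (2 * a + 1)) / 0.5) ** 2)
    return np.sum(w * a) / np.sum(w)          # posterior mean E[A | B = b_obs]

print(likelihood_weighting(3.0))   # exact: 8*(b-1)/17 ≈ 0.941 for b = 3
```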
Lago-Ballesteros, Joaquin; Lago-Peñas, Carlos; Rey, Ezequiel
2012-01-01
The aim of this study was to analyse the influence of playing tactics, opponent interaction and situational variables on achieving score-box possessions in professional soccer. The sample consisted of 908 possessions obtained by a team from the Spanish soccer league in 12 matches played during the 2009-2010 season. Multidimensional qualitative data obtained from 12 ordered categorical variables were used. Sampled matches were registered by the AMISCO PRO system. Data were analysed using chi-square analysis and multiple logistic regression analysis. Of 908 possessions, 303 (33.4%) produced score-box possessions, 477 (52.5%) achieved progression and 128 (14.1%) failed to reach any sort of progression. Multiple logistic regression showed that, for the main variable "team possession type", direct attacks and counterattacks were three times more effective than elaborate attacks for producing a score-box possession (P < 0.05). Team possessions originating from the middle zones and played against fewer than six defending players (P < 0.001) registered higher success than those started in the defensive zone against a balanced defence. When the team was drawing or winning, the probability of reaching the score-box decreased by 43 and 53 percent, respectively, compared with the losing situation (P < 0.05). Accounting for opponent interactions and situational variables is critical to evaluate the effectiveness of offensive playing tactics on producing score-box possessions.
Silva-Fernández, Lucía; Pérez-Vicente, Sabina; Martín-Martínez, María Auxiliadora; López-González, Ruth
2015-06-01
To describe the variability in the prescription of non-biologic disease-modifying antirheumatic drugs (nbDMARDs) for the treatment of spondyloarthritis (SpA) in Spain and to explore which factors relating to the disease, patient, physician, and/or center contribute to these variations. A retrospective medical record review was performed using a probabilistic sample of 1168 patients with SpA from 45 centers distributed in 15/19 regions in Spain. The sociodemographic and clinical features and the use of drugs were recorded following a standardized protocol. Logistic regression, with nbDMARDs prescriptions as the dependent variable, was used for bivariable analysis. A multilevel logistic regression model was used to study variability. The probability of receiving an nbDMARD was higher in female patients [OR = 1.548; 95% confidence interval (CI): 1.208-1.984], in those with elevated C-reactive protein (OR = 1.039; 95% CI: 1.012-1.066) and erythrocyte sedimentation rate (OR = 1.012; 95% CI: 1.003-1.021), in those with a higher number of affected peripheral joints (OR = 12.921; 95% CI: 2.911-57.347), and in patients with extra-articular manifestations like dactylitis (OR = 2.997; 95% CI: 1.868-4.809), psoriasis (OR = 2.601; 95% CI: 1.870-3.617), and enthesitis (OR = 1.717; 95% CI: 1.224-2.410). There was a marked variability in the prescription of nbDMARDs for SpA patients, depending on the center (14.3%; variance 0.549; standard error 0.161; median odds ratio 2.366; p < 0.001). After adjusting for patient and center variables, this variability fell to 3.8%. A number of factors affecting variability in clinical practice, and which are independent of disease characteristics, are associated with the probability of SpA patients receiving nbDMARDs in Spain. Copyright © 2015 Elsevier Inc. All rights reserved.
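For reference, the median odds ratio (MOR) quoted above is conventionally derived from the center-level variance on the log-odds scale; this is the standard Larsen-Merlo definition, restated here rather than taken from the abstract:

```latex
\mathrm{MOR} = \exp\!\left(\sqrt{2\sigma^{2}}\;\Phi^{-1}(0.75)\right),
\qquad \Phi^{-1}(0.75) \approx 0.6745
```

where σ² is the between-center variance of the random intercepts; an MOR of 1 would indicate no between-center heterogeneity in prescription.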
Staggs, Vincent S; Cramer, Emily
2016-08-01
Hospital performance reports often include rankings of unit pressure ulcer rates. Differentiating among units on the basis of quality requires reliable measurement. Our objectives were to describe and apply methods for assessing reliability of hospital-acquired pressure ulcer rates and evaluate a standard signal-noise reliability measure as an indicator of precision of differentiation among units. Quarterly pressure ulcer data from 8,199 critical care, step-down, medical, surgical, and medical-surgical nursing units from 1,299 US hospitals were analyzed. Using beta-binomial models, we estimated between-unit variability (signal) and within-unit variability (noise) in annual unit pressure ulcer rates. Signal-noise reliability was computed as the ratio of between-unit variability to the total of between- and within-unit variability. To assess precision of differentiation among units based on ranked pressure ulcer rates, we simulated data to estimate the probabilities of a unit's observed pressure ulcer rate rank in a given sample falling within five and ten percentiles of its true rank, and the probabilities of units with ulcer rates in the highest quartile and highest decile being identified as such. We assessed the signal-noise measure as an indicator of differentiation precision by computing its correlations with these probabilities. Pressure ulcer rates based on a single year of quarterly or weekly prevalence surveys were too susceptible to noise to allow for precise differentiation among units, and signal-noise reliability was a poor indicator of precision of differentiation. To ensure precise differentiation on the basis of true differences, alternative methods of assessing reliability should be applied to measures purported to differentiate among providers or units based on quality. © 2016 The Authors. Research in Nursing & Health published by Wiley Periodicals, Inc. © 2016 The Authors. Research in Nursing & Health published by Wiley Periodicals, Inc.
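A simplified sketch of the signal-noise computation described above, using moment estimates in place of the authors' beta-binomial model; the data are synthetic:

```python
import numpy as np

def signal_noise_reliability(events, patients):
    """events[i], patients[i]: annual pressure ulcers and patient count, unit i."""
    rates = events / patients
    p_bar = events.sum() / patients.sum()
    total_var = rates.var(ddof=1)
    within_var = np.mean(p_bar * (1 - p_bar) / patients)   # binomial noise
    between_var = max(total_var - within_var, 0.0)         # "signal"
    return between_var / (between_var + within_var)

rng = np.random.default_rng(4)
patients = rng.integers(200, 2000, size=50)
true_p = rng.beta(2, 98, size=50)                          # unit-level true rates
events = rng.binomial(patients, true_p)
print(signal_noise_reliability(events.astype(float), patients))
```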
Habbous, Steven; Chu, Karen P.; Lau, Harold; Schorr, Melissa; Belayneh, Mathieos; Ha, Michael N.; Murray, Scott; O’Sullivan, Brian; Huang, Shao Hui; Snow, Stephanie; Parliament, Matthew; Hao, Desiree; Cheung, Winson Y.; Xu, Wei; Liu, Geoffrey
2017-01-01
BACKGROUND: The incidence of oropharyngeal cancer has risen over the past 2 decades. This rise has been attributed to human papillomavirus (HPV), but information on temporal trends in incidence of HPV-associated cancers across Canada is limited. METHODS: We collected social, clinical and demographic characteristics and p16 protein status (p16-positive or p16-negative, using this immunohistochemistry variable as a surrogate marker of HPV status) for 3643 patients with oropharyngeal cancer diagnosed between 2000 and 2012 at comprehensive cancer centres in British Columbia (6 centres), Edmonton, Calgary, Toronto and Halifax. We used receiver operating characteristic curves and multiple imputation to estimate the p16 status for missing values. We chose a best-imputation probability cut point on the basis of accuracy in samples with known p16 status and through an independent relation between p16 status and overall survival. We used logistic and Cox proportional hazard regression. RESULTS: We found no temporal changes in p16-positive status initially, but there was significant selection bias, with p16 testing significantly more likely to be performed in males, lifetime never-smokers, patients with tonsillar or base-of-tongue tumours and those with nodal involvement (p < 0.05 for each variable). We used the following variables associated with p16-positive status for multiple imputation: male sex, tonsillar or base-of-tongue tumours, smaller tumours, nodal involvement, less smoking and lower alcohol consumption (p < 0.05 for each variable). Using sensitivity analyses, we showed that different imputation probability cut points for p16-positive status each identified a rise from 2000 to 2012, with the best-probability cut point identifying an increase from 47.3% in 2000 to 73.7% in 2012 (p < 0.001). INTERPRETATION: Across multiple centres in Canada, there was a steady rise in the proportion of oropharyngeal cancers attributable to HPV from 2000 to 2012. PMID:28808115
Effects of variability in probable maximum precipitation patterns on flood losses
NASA Astrophysics Data System (ADS)
Zischg, Andreas Paul; Felder, Guido; Weingartner, Rolf; Quinn, Niall; Coxon, Gemma; Neal, Jeffrey; Freer, Jim; Bates, Paul
2018-05-01
The assessment of the impacts of extreme floods is important for dealing with residual risk, particularly for critical infrastructure management and for insurance purposes. Thus, modelling of the probable maximum flood (PMF) from probable maximum precipitation (PMP) by coupling hydrological and hydraulic models has gained interest in recent years. Herein, we examine whether the variability in flood losses arising from different precipitation patterns exceeds or falls below selected uncertainty factors in flood loss estimation, and whether the flood losses within a river basin are related to the probable maximum discharge at the basin outlet. We developed a model experiment with an ensemble of probable maximum precipitation scenarios created by Monte Carlo simulations. For each rainfall pattern, we computed the flood losses with a model chain and benchmarked the effects of variability in rainfall distribution against other model uncertainties. The results show that flood losses vary considerably within the river basin and depend on the timing and superimposition of the flood peaks from the basin's sub-catchments. In addition to the flood hazard component, the other components of flood risk, exposure and vulnerability, contribute remarkably to the overall variability. This leads to the conclusion that the estimation of the probable maximum expectable flood losses in a river basin should not be based exclusively on the PMF. Consequently, the basin-specific sensitivities to different precipitation patterns and the spatial organization of the settlements within the river basin need to be considered in the analyses of probable maximum flood losses.
Singer, Donald A.; Kouda, Ryoichi
1991-01-01
The FINDER system employs geometric probability, Bayesian statistics, and the normal probability density function to integrate spatial and frequency information to produce a map of probabilities of target centers. Target centers can be mineral deposits, alteration associated with mineral deposits, or any other target that can be represented by a regular shape on a two-dimensional map. The size, shape, mean, and standard deviation for each variable are characterized in a control area and the results applied by means of FINDER to the study area. The Kushikino deposit consists of groups of quartz-calcite-adularia veins that produced 55 tonnes of gold and 456 tonnes of silver since 1660. Part of a 6 by 10 km area near Kushikino served as a control area. Within the control area, data plotting, contouring, and cluster analysis were used to identify the barren and mineralized populations. Sodium was found to be depleted in an elliptically shaped area 3.1 by 1.6 km, potassium was both depleted and enriched locally in an elliptically shaped area 3.0 by 1.3 km, and sulfur was enriched in an elliptically shaped area 5.8 by 1.6 km. The potassium, sodium, and sulfur contents from 233 surface rock samples were each used in FINDER to produce probability maps for the 12 by 30 km study area which includes Kushikino. High probability areas for each of the individual variables lie over, and are offset up to 4 km eastward from, the main Kushikino veins. In general, high probability areas identified by FINDER are displaced from the main veins and cover not only the host andesite and the dacite-andesite that is about the same age as the Kushikino mineralization, but also younger sedimentary rocks, andesite, and tuff units east and northeast of Kushikino. The maps also display the same patterns observed near Kushikino, but with somewhat lower probabilities, about 1.5 km east of the old gold prospect, Hajima, and in a broad zone 2.5 km east-west and 1 km north-south, centered 2 km west of the old gold prospect, Yaeyama.
Using climate model simulations to assess the current climate risk to maize production
NASA Astrophysics Data System (ADS)
Kent, Chris; Pope, Edward; Thompson, Vikki; Lewis, Kirsty; Scaife, Adam A.; Dunstone, Nick
2017-05-01
The relationship between the climate and agricultural production is of considerable importance to global food security. However, there has been relatively little exploration of climate-variability related yield shocks. The short observational yield record does not adequately sample natural inter-annual variability thereby limiting the accuracy of probability assessments. Focusing on the United States and China, we present an innovative use of initialised ensemble climate simulations and a new agro-climatic indicator, to calculate the risk of severe water stress. Combined, these regions provide 60% of the world’s maize, and therefore, are crucial to global food security. To probe a greater range of inter-annual variability, the indicator is applied to 1400 simulations of the present day climate. The probability of severe water stress in the major maize producing regions is quantified, and in many regions an increased risk is found compared to calculations from observed historical data. Analysis suggests that the present day climate is also capable of producing unprecedented severe water stress conditions. Therefore, adaptation plans and policies based solely on observed events from the recent past may considerably under-estimate the true risk of climate-related maize shocks. The probability of a major impact event occurring simultaneously across both regions—a multi-breadbasket failure—is estimated to be up to 6% per decade and arises from a physically plausible climate state. This novel approach highlights the significance of climate impacts on crop production shocks and provides a platform for considerably improving food security assessments, in the present day or under a changing climate, as well as development of new risk based climate services.
Piñeyro-Nelson, A; Van Heerwaarden, J; Perales, H R; Serratos-Hernández, J A; Rangel, A; Hufford, M B; Gepts, P; Garay-Arroyo, A; Rivera-Bustamante, R; Alvarez-Buylla, E R
2009-02-01
A possible consequence of planting genetically modified organisms (GMOs) in centres of crop origin is unintended gene flow into traditional landraces. In 2001, a study reported the presence of the transgenic 35S promoter in maize landraces sampled in 2000 from the Sierra Juarez of Oaxaca, Mexico. Analysis of a large sample taken from the same region in 2003 and 2004 could not confirm the existence of transgenes, thereby casting doubt on the earlier results. These two studies were based on different sampling and analytical procedures and are thus hard to compare. Here, we present new molecular data for this region that confirm the presence of transgenes in three of 23 localities sampled in 2001. Transgene sequences were not detected in samples taken in 2002 from nine localities, while directed samples taken in 2004 from two of the positive 2001 localities were again found to contain transgenic sequences. These findings suggest the persistence or re-introduction of transgenes up until 2004 in this area. We address variability in recombinant sequence detection by analyzing the consistency of current molecular assays. We also present theoretical results on the limitations of estimating the probability of transgene detection in samples taken from landraces. The inclusion of a limited number of female gametes and, more importantly, aggregated transgene distributions may significantly lower detection probabilities. Our analytical and sampling considerations help explain discrepancies among different detection efforts, including the one presented here, and provide considerations for the establishment of monitoring protocols to detect the presence of transgenes among structured populations of landraces.
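The sampling limitation discussed above can be made concrete with the standard detection-probability formula; the frequency and sample size below are illustrative, not estimates from the study:

```python
def p_detect(freq, n_seeds):
    """P(at least one transgenic seed) under random, non-aggregated mixing."""
    return 1 - (1 - freq) ** n_seeds

print(p_detect(0.001, 1000))   # ≈ 0.63 at frequency 0.1% and 1000 seeds
# Aggregation (transgenes clustered in a few fields or ears) reduces the
# effective number of independent seeds, so the realized detection
# probability can be far below this idealized value.
```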
Gitto, Lara; Noh, Yong-Hwan; Andrés, Antonio Rodríguez
2015-04-16
Depression is a mental health state whose frequency has been increasing in modern societies. It imposes a great burden, because of its strong impact on people's quality of life and happiness. Depression can be reliably diagnosed and treated in primary care: if more people could get effective treatments earlier, the costs related to depression would be reversed. The aim of this study was to examine the influence of socio-economic factors and gender on depressed mood, focusing on Korea. In fact, in spite of the great number of empirical studies carried out for other countries, few epidemiological studies have examined the socio-economic determinants of depression in Korea, and they were either limited to samples of employed women or did not control for individual health status. Moreover, as likely data endogeneity (i.e. the possibility of correlation between the dependent variable and the error term as a result of autocorrelation or simultaneity; in this case, depressed mood due to health factors that, in turn, might be caused by depression) might bias the results, the present study proposes an empirical approach, based on instrumental variables, to deal with this problem. Data for the year 2008 from the Korea National Health and Nutrition Examination Survey (KNHANES) were employed. About seven thousand people (N = 6,751; 43% male and 57% female), aged 19 to 75 years, were included in the sample considered in the analysis. In order to take into account the possible endogeneity of some explanatory variables, two Instrumental Variables Probit (IVP) regressions were estimated; the variables for which instrumental equations were estimated were related to the participation of women in the workforce and to good health, as reported by people in the sample. Explanatory variables were related to age, gender, family factors (such as the number of family members and marital status) and socio-economic factors (such as education, residence in metropolitan areas, and so on). As the results of the Wald test carried out after the estimations did not allow us to reject the null hypothesis of endogeneity, a probit model was run too. Overall, women tend to develop depression more frequently than men. There is an inverse effect of education on depressed mood (a probit marginal effect of -24.6% on the probability of reporting a depressed mood for high school education), while marital status and the number of family members may act as protective factors (-1.0% on the probability of reporting a depressed mood for each additional family member). Depression is significantly associated with socio-economic conditions, such as work and income. Living in metropolitan areas is inversely correlated with depression (a probit marginal effect of -4.1% on the probability of reporting a depressed mood): this could be explained considering that, in rural areas, people rarely have immediate access to high-quality health services. This study outlines the factors that are most likely to impact depression, and applies an IVP model to take into account the potential endogeneity of some of the predictors of depressed mood, such as female participation in the workforce and health status. A probit model was estimated too. Depression is associated with a wide range of socio-economic factors, although the strength and direction of the association can differ by gender.
Prevention approaches to counter depressive symptoms might take into consideration the evidence offered by the present study. © 2015 by Kerman University of Medical Sciences.
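One common way to implement an instrumental-variables probit is the two-stage control-function approach sketched below; this is an illustration of the general idea with invented data and instrument, not the exact IVP estimator used in the study:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 2000
z = rng.normal(size=n)                      # instrument
u = rng.normal(size=n)                      # unobserved confounder
health = 0.8 * z + u + rng.normal(size=n)   # endogenous regressor
latent = -0.5 * health + u + rng.normal(size=n)
depressed = (latent > 0).astype(int)

# Stage 1: regress the endogenous variable on the instrument
stage1 = sm.OLS(health, sm.add_constant(z)).fit()
resid = health - stage1.fittedvalues

# Stage 2: probit including the first-stage residual as a control function
X = sm.add_constant(np.column_stack([health, resid]))
probit = sm.Probit(depressed, X).fit(disp=0)
print(probit.params)   # coefficient on `health` is corrected for endogeneity
```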
NASA Astrophysics Data System (ADS)
Frič, Roman; Papčo, Martin
2017-12-01
Stressing a categorical approach, we continue our study of fuzzified domains of probability, in which classical random events are replaced by measurable fuzzy random events. In operational probability theory (S. Bugajski) classical random variables are replaced by statistical maps (generalized distribution maps induced by random variables) and in fuzzy probability theory (S. Gudder) the central role is played by observables (maps between probability domains). We show that to each of the two generalized probability theories there corresponds a suitable category and the two resulting categories are dually equivalent. Statistical maps and observables become morphisms. A statistical map can send a degenerated (pure) state to a non-degenerated one (a quantum phenomenon) and, dually, an observable can map a crisp random event to a genuine fuzzy random event (a fuzzy phenomenon). The dual equivalence means that the operational probability theory and the fuzzy probability theory coincide and the resulting generalized probability theory has two dual aspects: quantum and fuzzy. We close with some notes on products and coproducts in the dual categories.
Validation of Metrics as Error Predictors
NASA Astrophysics Data System (ADS)
Mendling, Jan
In this chapter, we test the validity of metrics that were defined in the previous chapter for predicting errors in EPC business process models. In Section 5.1, we provide an overview of how the analysis data is generated. Section 5.2 describes the sample of EPCs from practice that we use for the analysis. Here we discuss a disaggregation by the EPC model group and by error as well as a correlation analysis between metrics and error. Based on this sample, we calculate a logistic regression model for predicting error probability with the metrics as input variables in Section 5.3. In Section 5.4, we then test the regression function for an independent sample of EPC models from textbooks as a cross-validation. Section 5.5 summarizes the findings.
Probabilistic Modeling of Aircraft Trajectories for Dynamic Separation Volumes
NASA Technical Reports Server (NTRS)
Lewis, Timothy A.
2016-01-01
With a proliferation of new and unconventional vehicles and operations expected in the future, the ab initio airspace design will require new approaches to trajectory prediction for separation assurance and other air traffic management functions. This paper presents an approach to probabilistic modeling of the trajectory of an aircraft when its intent is unknown. The approach uses a set of feature functions to constrain a maximum entropy probability distribution based on a set of observed aircraft trajectories. This model can be used to sample new aircraft trajectories to form an ensemble reflecting the variability in an aircraft's intent. The model learning process ensures that the variability in this ensemble reflects the behavior observed in the original data set. Computational examples are presented.
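A minimal sketch of the core idea, fitting a maximum-entropy distribution over a discrete set of candidate trajectories so that model feature expectations match those of the observed tracks; the feature values below are random placeholders, not the paper's feature functions:

```python
import numpy as np

rng = np.random.default_rng(6)
F = rng.normal(size=(500, 3))        # feature values of 500 candidate trajectories
target = F[:100].mean(axis=0)        # empirical feature means of observed tracks

theta = np.zeros(3)
for _ in range(500):                 # gradient ascent on the maxent dual
    p = np.exp(F @ theta)
    p /= p.sum()                     # maxent distribution p(x) ∝ exp(θ·f(x))
    theta += 0.5 * (target - F.T @ p)   # gradient: E_data[f] - E_p[f]

samples = rng.choice(len(F), size=10, p=p)   # sample an ensemble of trajectories
```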
Seabloom, William; Seabloom, Mary E; Seabloom, Eric; Barron, Robert; Hendrickson, Sharon
2003-08-01
The study determines the effectiveness of a sexuality-positive adolescent sexual offender treatment program and examines subsequent criminal recidivism in the three outcome groups (completed, withdrawn, referred). The sample consists of 122 adolescent males and their families (491 individuals). Of the demographic variables, only living situation was significant, such that patients living with parents were more likely to graduate. None of the behavioral variables were found to be significant. Of the treatment variables, length of time in the program and participation in the Family Journey Seminar were included in the final model. When they were included in the model, no other treatment variable was significantly related to probability of graduation. There were no arrests or convictions for sex-related crimes in the population of participants that successfully completed the program. This group was also less likely than the other groups to be arrested (p = 0.014) or convicted (p = 0.004) across all crime categories.
Dodd, C.K.; Dorazio, R.M.
2004-01-01
A critical variable in both ecological and conservation field studies is determining how many individuals of a species are present within a defined sampling area. Labor-intensive techniques such as capture-mark-recapture and removal sampling may provide estimates of abundance, but there are many logistical constraints to their widespread application. Many studies on terrestrial and aquatic salamanders use counts as an index of abundance, assuming that detection remains constant while sampling. If this constancy is violated, determination of detection probabilities is critical to the accurate estimation of abundance. Recently, a model was developed that provides a statistical approach that allows abundance and detection to be estimated simultaneously from spatially and temporally replicated counts. We adapted this model to estimate these parameters for salamanders sampled over a six-year period in area-constrained plots in Great Smoky Mountains National Park. Estimates of salamander abundance varied among years, but annual changes in abundance did not vary uniformly among species. Except for one species, abundance estimates were not correlated with site covariates (elevation/soil and water pH, conductivity, air and water temperature). The uncertainty in the estimates was so large as to make correlations ineffectual in predicting which covariates might influence abundance. Detection probabilities also varied among species and sometimes among years for the six species examined. We found such a high degree of variation in our counts and in estimates of detection among species, sites, and years as to cast doubt upon the appropriateness of using count data to monitor population trends using a small number of area-constrained survey plots. Still, the model provided reasonable estimates of abundance that could make it useful in estimating population size from count surveys.
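The model referred to above is the binomial N-mixture model; a compact sketch of its likelihood and fit, with toy counts standing in for the salamander data:

```python
# N-mixture model: abundance N_i ~ Poisson(lambda), counts y_it ~ Binomial(N_i, p)
import numpy as np
from scipy.stats import poisson, binom
from scipy.optimize import minimize

y = np.array([[3, 2, 4], [0, 1, 0], [5, 3, 6]])   # sites x repeat visits
K = 50                                            # truncation for the sum over N

def neg_log_lik(params):
    lam, p = np.exp(params[0]), 1 / (1 + np.exp(-params[1]))
    ll = 0.0
    for counts in y:
        n = np.arange(counts.max(), K + 1)        # feasible abundances at this site
        site_lik = poisson.pmf(n, lam) * np.prod(binom.pmf(counts[:, None], n, p), axis=0)
        ll += np.log(site_lik.sum())
    return -ll

fit = minimize(neg_log_lik, x0=[np.log(3.0), 0.0])
lam_hat, p_hat = np.exp(fit.x[0]), 1 / (1 + np.exp(-fit.x[1]))
print(lam_hat, p_hat)   # estimated abundance rate and detection probability
```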
Adverse childhood events, substance abuse, and measures of affiliation.
Zlotnick, Cheryl; Tam, Tammy; Robertson, Marjorie J
2004-08-01
Adverse childhood events may influence later behaviors, including adulthood substance use and social affiliation. Studies have noted high prevalence rates of adverse childhood experiences and adulthood substance abuse among homeless adults. Using an existing longitudinal, countywide probability sample of 397 homeless adults, we examine the relationship between adverse childhood events and adulthood substance use, and the relationship of these variables to affiliation. Almost 75% of the sample had experienced an adverse childhood event. Path analysis indicated adulthood substance abuse mediated the inverse relationship between adverse childhood events and two measures of adulthood affiliation. Thus, although there is a relationship between adverse childhood events and adulthood substance use, it is adulthood substance use that determines most aspects of affiliation.
Quantum-inspired algorithm for estimating the permanent of positive semidefinite matrices
NASA Astrophysics Data System (ADS)
Chakhmakhchyan, L.; Cerf, N. J.; Garcia-Patron, R.
2017-08-01
We construct a quantum-inspired classical algorithm for computing the permanent of Hermitian positive semidefinite matrices by exploiting a connection between these mathematical structures and the boson sampling model. Specifically, the permanent of a Hermitian positive semidefinite matrix can be expressed in terms of the expected value of a random variable, which stands for a specific photon-counting probability when measuring a linear-optically evolved random multimode coherent state. Our algorithm then approximates the matrix permanent from the corresponding sample mean and is shown to run in polynomial time for various sets of Hermitian positive semidefinite matrices, achieving a precision that improves over known techniques. This work illustrates how quantum optics may benefit algorithm development.
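For context, the exact classical baseline that such estimators are measured against is Ryser's O(2^n n) formula; this is standard textbook code, not the paper's quantum-inspired algorithm:

```python
import itertools
import numpy as np

def permanent_ryser(A):
    """Exact permanent via Ryser's inclusion-exclusion formula."""
    n = A.shape[0]
    total = 0.0
    for r in range(1, n + 1):
        for cols in itertools.combinations(range(n), r):
            prod = np.prod(A[:, list(cols)].sum(axis=1))
            total += (-1) ** r * prod
    return (-1) ** n * total

A = np.array([[1.0, 0.5], [0.5, 1.0]])   # small Hermitian PSD example
print(permanent_ryser(A))                # 1*1 + 0.5*0.5 = 1.25
```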
Continuous-Variable Instantaneous Quantum Computing is Hard to Sample.
Douce, T; Markham, D; Kashefi, E; Diamanti, E; Coudreau, T; Milman, P; van Loock, P; Ferrini, G
2017-02-17
Instantaneous quantum computing is a subuniversal quantum complexity class, whose circuits have proven to be hard to simulate classically in the discrete-variable realm. We extend this proof to the continuous-variable (CV) domain by using squeezed states and homodyne detection, and by exploring the properties of postselected circuits. In order to treat postselection in CVs, we consider finitely resolved homodyne detectors, corresponding to a realistic scheme based on discrete probability distributions of the measurement outcomes. The unavoidable errors stemming from the use of finitely squeezed states are suppressed through a qubit-into-oscillator Gottesman-Kitaev-Preskill encoding of quantum information, which was previously shown to enable fault-tolerant CV quantum computation. Finally, we show that, in order to render postselected computational classes in CVs meaningful, a logarithmic scaling of the squeezing parameter with the circuit size is necessary, translating into a polynomial scaling of the input energy.
A short note on the maximal point-biserial correlation under non-normality.
Cheng, Ying; Liu, Haiyan
2016-11-01
The aim of this paper is to derive the maximal point-biserial correlation under non-normality. Several widely used non-normal distributions are considered, namely the uniform distribution, t-distribution, exponential distribution, and a mixture of two normal distributions. Results show that the maximal point-biserial correlation, depending on the non-normal continuous variable underlying the binary manifest variable, may not be a function of p (the probability that the dichotomous variable takes the value 1), can be symmetric or non-symmetric around p = .5, and may still lie in the range from -1.0 to 1.0. Therefore researchers should exercise caution when they interpret their sample point-biserial correlation coefficients based on popular beliefs that the maximal point-biserial correlation is always smaller than 1, and that the size of the correlation is always further restricted as p deviates from .5. © 2016 The British Psychological Society.
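A quick simulation of the point-biserial correlation for one of the non-normal cases discussed (a uniform underlying variable), dichotomized at several thresholds; this is an illustration of the restriction, not the paper's derivation:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0, 1, size=100_000)          # underlying continuous variable
for thresh in (0.5, 0.8, 0.95):
    d = (x > thresh).astype(float)           # binary manifest variable
    r_pb = np.corrcoef(x, d)[0, 1]           # point-biserial = Pearson r
    # for a uniform variable the maximum is sqrt(3*p*(1-p)), ≈ 0.866 at p = .5
    print(f"p = {1 - thresh:.2f}  r_pb = {r_pb:.3f}")
```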
Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict
NASA Astrophysics Data System (ADS)
Ismail, Mohd Tahir; Alias, Siti Nor Shadila
2014-07-01
For many years, Malaysia has faced drug addiction issues. The most serious is the relapse phenomenon among treated drug addicts (drug addicts who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factors that contribute to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or not (No, coded as 0). The predictors involved are age, age at first taking drugs, family history, education level, family crisis, community support and self-motivation. The total sample size is 200, with data provided by AADK (National Antidrug Agency). The findings of the study revealed that age and self-motivation are statistically significant predictors of relapse.
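A hedged sketch of the kind of model described, with invented data standing in for the AADK records: relapse status regressed on age and a self-motivation score, with odds ratios taken from the fitted coefficients:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 200
age = rng.normal(30, 8, size=n)
motivation = rng.normal(0, 1, size=n)
eta = 0.05 * (age - 30) - 1.0 * motivation         # assumed true log-odds
relapse = rng.binomial(1, 1 / (1 + np.exp(-eta)))  # 1 = relapse, 0 = no relapse

X = sm.add_constant(np.column_stack([age, motivation]))
fit = sm.Logit(relapse, X).fit(disp=0)
print(np.exp(fit.params[1:]))   # odds ratios for age and self-motivation
```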
Whisman, Mark A; Robustelli, Briana L; Sbarra, David A
2016-05-01
Marital disruption (i.e., marital separation, divorce) is associated with a wide range of poor mental and physical health outcomes, including increased risk for all-cause mortality. One biological intermediary that may help explain the association between marital disruption and poor health is accelerated cellular aging. This study examines the association between marital disruption and salivary telomere length in a United States probability sample of adults ≥50 years of age. Participants were 3526 individuals who participated in the 2008 wave of the Health and Retirement Study. Telomere length assays were performed using quantitative real-time polymerase chain reaction (qPCR) on DNA extracted from saliva samples. Health and lifestyle factors, traumatic and stressful life events, and neuroticism were assessed via self-report. Linear regression analyses were conducted to examine the associations between predictor variables and salivary telomere length. Based on their marital status data in the 2006 wave, people who were separated or divorced had shorter salivary telomeres than people who were continuously married or had never been married, and the association between marital disruption and salivary telomere length was not moderated by gender or neuroticism. Furthermore, the association between marital disruption and salivary telomere length remained statistically significant after adjusting for demographic and socioeconomic variables, neuroticism, cigarette use, body mass, traumatic life events, and other stressful life events. Additionally, results revealed that currently married adults with a history of divorce evidenced shorter salivary telomeres than people who were continuously married or never married. Accelerated cellular aging, as indexed by telomere shortening, may be one pathway through which marital disruption is associated with morbidity and mortality. Copyright © 2016 Elsevier Ltd. All rights reserved.
The reliable solution and computation time of variable parameters logistic model
NASA Astrophysics Data System (ADS)
Wang, Pengfei; Pan, Xinnong
2018-05-01
The study investigates the reliable computation time (RCT, termed T_c) by applying a double-precision computation of a variable parameters logistic map (VPLM). Firstly, by using the proposed method, we obtain the reliable solutions for the logistic map. Secondly, we construct 10,000 samples of reliable experiments from a time-dependent non-stationary parameters VPLM and then calculate the mean T_c. The results indicate that, for each different initial value, the T_c values of the VPLM are generally different. However, the mean T_c tends to a constant value when the sample number is large enough. The maximum, minimum, and probability distribution functions of T_c are also obtained, which can help us to identify the robustness of applying nonlinear time series theory to forecasting by using the VPLM output. In addition, the T_c of the fixed-parameter experiments of the logistic map is obtained, and the results suggest that this T_c matches the value predicted by the theoretical formula.
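A minimal sketch of the reliable-computation-time idea: iterate a variable-parameter logistic map in double precision and against a high-precision reference, recording the step at which the two diverge. The parameter sequence and tolerance are invented:

```python
import numpy as np
from mpmath import mp, mpf

mp.dps = 50                                   # 50-digit "reference" precision
r_seq = 3.6 + 0.3 * np.abs(np.sin(0.01 * np.arange(2000)))  # r_n in [3.6, 3.9]

def reliable_time(x0, tol=1e-6):
    xd, xm = x0, mpf(x0)
    for n, r in enumerate(r_seq):
        xd = r * xd * (1 - xd)                # double-precision iterate
        xm = mpf(float(r)) * xm * (1 - xm)    # high-precision reference
        if abs(xd - float(xm)) > tol:
            return n                          # first step beyond tolerance
    return len(r_seq)

times = [reliable_time(x0) for x0 in np.linspace(0.1, 0.9, 20)]
print(np.mean(times))   # mean reliable computation time T_c over initial values
```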
Boosting quantum annealer performance via sample persistence
NASA Astrophysics Data System (ADS)
Karimi, Hamed; Rosenberg, Gili
2017-07-01
We propose a novel method for reducing the number of variables in quadratic unconstrained binary optimization problems, using a quantum annealer (or any sampler) to fix the value of a large portion of the variables to values that have a high probability of being optimal. The resulting problems are usually much easier for the quantum annealer to solve, due to their being smaller and consisting of disconnected components. This approach significantly increases the success rate and number of observations of the best known energy value in samples obtained from the quantum annealer, when compared with calling the quantum annealer without using it, even when using fewer annealing cycles. Use of the method results in a considerable improvement in success metrics even for problems with high-precision couplers and biases, which are more challenging for the quantum annealer to solve. The results are further enhanced by applying the method iteratively and combining it with classical pre-processing. We present results for both Chimera graph-structured problems and embedded problems from a real-world application.
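A minimal sketch of the persistence heuristic described above: keep the low-energy half of the samples, fix every variable whose value is nearly unanimous across them, and re-solve the reduced problem. Sampler output is faked here with random arrays:

```python
import numpy as np

rng = np.random.default_rng(9)
samples = rng.choice([-1, 1], size=(1000, 50))     # spins returned by the sampler
energies = rng.normal(size=1000)                   # matching energy values

elite = samples[energies <= np.median(energies)]   # low-energy subset
mean_spin = elite.mean(axis=0)
fixed = np.abs(mean_spin) >= 0.95                  # persistent variables
assignment = np.sign(mean_spin[fixed])             # values to clamp them to
print(f"fixed {fixed.sum()} of 50 variables; re-solve the remaining ones")
```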
Okland, Bjørn; Skarpaas, Olav; Schroeder, Martin; Magnusson, Christer; Lindelöw, Ake; Thunes, Karl
2010-09-01
The pinewood nematode (PWN) is one of the worst tree-killing exotic pests in East-Asian countries. The first European record of establishment in Portugal in 1999 triggered extensive surveys and contingency plans for eradication in European countries, including immediate removal of large areas of conifer host trees. Using Norway as an example, we applied a simulation model to evaluate the chance of successful eradication of a hypothetical introduction by the current contingency plan in a northern area where wilting symptoms are not expected to occur. Despite a highly variable spread of nematode infestations in space and time, the probability of successful eradication in 20 years was consistently low (mean 0.035, SE 0.02). The low success did not change significantly by varying the biological parameters in sensitivity analyses (SA), probably due to the late detection of infestations by the survey (mean 14.3 years). SA revealed a strong influence of management parameters. However, a high probability of eradication required unrealistic measures: achieving an eradication probability of 0.99 in 20 years required 10,000 survey samples per year and a host tree removal radius of 8,000 m around each detection point. © 2010 Society for Risk Analysis.
A Unifying Probability Example.
ERIC Educational Resources Information Center
Maruszewski, Richard F., Jr.
2002-01-01
Presents an example from probability and statistics that ties together several topics including the mean and variance of a discrete random variable, the binomial distribution and its particular mean and variance, the sum of independent random variables, the mean and variance of the sum, and the central limit theorem. Uses Excel to illustrate these…
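The example translates directly into a few lines (shown in Python rather than the Excel used in the article): a Binomial(n, p) variable is a sum of independent Bernoullis, its mean and variance are np and np(1-p), and the CLT makes the standardized sum approximately normal:

```python
import numpy as np

n, p = 50, 0.3
mean, var = n * p, n * p * (1 - p)        # 15.0 and 10.5

rng = np.random.default_rng(10)
sums = rng.binomial(1, p, size=(100_000, n)).sum(axis=1)   # sums of Bernoullis
print(sums.mean(), sums.var())            # ≈ 15.0, ≈ 10.5
z = (sums - mean) / np.sqrt(var)
print((np.abs(z) < 1.96).mean())          # ≈ 0.95 by the central limit theorem
```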
Çelikkiran, Seyhan; Bozkurt, Hasan; Coşkun, Murat
2015-06-01
The aim of this study was to investigate the prevalence of developmental problems and their relationship with sociodemographic variables in a community sample of young children. Participants included 1000 children (558 males, 442 females, age range 1-48 months, mean 18.4 months, SD 7.8 months). Children were generally referred by their parents for developmental evaluation and consultation in response to a public announcement in a district area in Istanbul, Turkey. An interview form and the Denver Developmental Screening Test II (DDST) were used for sociodemographic data and developmental evaluation. The χ² test and Pearson's correlation test were used for data analysis. Seven hundred forty-one out of 1000 children (74.1%) had normal, 140 (14%) had risky, and 119 (11.9%) had abnormal findings on the DDST results. The probability of abnormal findings on the DDST results was significantly higher in males (p=0.003), the 2-4-year-old group (p<0.05), families with more than one child (p=0.001), consanguineous marriages (p<0.01), low parental educational levels and low household income (p<0.01), and in children without a history of breastfeeding (p=0.000). Immigration status and delivery mode did not have a significant effect on the probability of abnormal findings on the DDST results (p>0.05). Sociodemographic factors have a noteworthy impact on development. Determining these factors is important especially during the first years of life.
Jimsphere wind and turbulence exceedance statistics
NASA Technical Reports Server (NTRS)
Adelfang, S. I.; Court, A.
1972-01-01
Exceedance statistics of winds and gusts observed over Cape Kennedy with Jimsphere balloon sensors are described. Gust profiles containing positive and negative departures from smoothed profiles, in the wavelength ranges 100-2500, 100-1900, 100-860, and 100-460 meters, were computed from 1578 profiles with four 41-weight digital high-pass filters. Extreme values of the square root of gust speed are normally distributed. Monthly and annual exceedance probability distributions of normalized rms gust speeds in three altitude bands (2-7, 6-11, and 9-14 km) are log-normal. The rms gust speeds are largest in the 100-2500 m wavelength band between 9 and 14 km in late winter and early spring. A study of monthly and annual exceedance probabilities and the number of occurrences per kilometer of level crossings with positive slope indicates significant variability with season, altitude, and filter configuration. A decile sampling scheme is tested and an optimum approach is suggested for drawing a relatively small random sample that represents the characteristic extreme wind speeds and shears of a large parent population of Jimsphere wind profiles.
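As a sketch of how such log-normal exceedance probabilities are computed (the gust-speed values below are hypothetical, not the Cape Kennedy data):

```python
import numpy as np
from scipy import stats

# Hypothetical normalized rms gust speeds; the Jimsphere data are not
# reproduced here.
rms = np.array([0.8, 1.1, 0.9, 1.5, 2.0, 1.2, 0.7, 1.8, 1.3, 1.0])
mu, sigma = np.log(rms).mean(), np.log(rms).std(ddof=1)
x = 1.5
# For a log-normal variable, P(X > x) = 1 - Phi((ln x - mu) / sigma).
p_exceed = 1.0 - stats.norm.cdf((np.log(x) - mu) / sigma)
print(f"P(rms gust > {x}) ~= {p_exceed:.3f}")
```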
Epidemiology of major depression in four cities in Mexico.
Slone, Laurie B; Norris, Fran H; Murphy, Arthur D; Baker, Charlene K; Perilla, Julia L; Diaz, Dayna; Rodriguez, Francisco Gutiérrez; Gutiérrez Rodriguez, José de Jesús
2006-01-01
Analyses were conducted to estimate lifetime and current prevalence of major depressive disorder (MDD) for four representative cities of Mexico, to identify variables that influence the probability of MDD, and to further describe depression in Mexican culture. A multistage probability sampling design was used to draw a sample of 2,509 adults in four different regions of Mexico. MDD was assessed according to DSM-IV criteria by using the Composite International Diagnostic Interview collected by trained lay interviewers. The prevalence of MDD in these four cities averaged 12.8% for lifetime and 6.1% for the previous 12 months. MDD was highly comorbid with other mental disorders. Women were more likely to have lifetime MDD than were men. Being divorced, separated, or widowed (compared to married or never married) and having experienced childhood trauma were related to higher lifetime prevalence but not to current prevalence. In addition, age and education level were related to current 12-month MDD. Data on the profile of MDD in urban Mexico are provided. This research expands our understanding of MDD across cultures.
Whisman, Mark A
2016-12-01
Prior research has found that humiliating marital events are associated with depression. Building on this research, the current study investigated the association between one specific humiliating marital event-discovering that one's partner had an affair-and past-year major depressive episode (MDE) in a probability sample of married or cohabiting men and women who were at high risk for depression based on the criterion that they scored below the midpoint on a measure of marital satisfaction (N = 227). Results indicate that (i) women were more likely than men to report discovering their partner had an affair in the prior 12 months; (ii) discovering a partner affair was associated with a higher prevalence of past-year MDE and a lower level of marital adjustment; and (iii) the association between discovering a partner affair and MDE remained statistically significant when holding constant demographic variables and marital adjustment. These results support continued investigation into the impact that finding out about an affair has on the mental health of the person discovering a partner affair. © 2015 Family Process Institute.
Babamoradi, Hamid; van den Berg, Frans; Rinnan, Åsmund
2016-02-18
In Multivariate Statistical Process Control (MSPC), when a fault is expected or detected in the process, contribution plots are essential for operators and optimization engineers in identifying the process variables that were affected by, or might be the cause of, the fault. The traditional way of interpreting a contribution plot is to examine the largest contributing process variables as the most probable faulty ones. This can produce false readings purely due to differences in natural variation, measurement uncertainties, etc. It is more reasonable to compare variable contributions for new process runs with historical results achieved under Normal Operating Conditions, where confidence limits (CLs) for contribution plots estimated from training data are used to judge new production runs. Asymptotic methods cannot provide CLs for contribution plots, leaving re-sampling methods as the only option. We suggest bootstrap re-sampling to build CLs for all contribution plots in online PCA-based MSPC. The new strategy for estimating CLs is compared to the previously reported CLs for contribution plots. An industrial batch process dataset is used to illustrate the concepts. Copyright © 2016 Elsevier B.V. All rights reserved.
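A minimal sketch of percentile-bootstrap CLs for contribution plots, assuming a matrix of training-run contributions under Normal Operating Conditions (the paper's exact bootstrap scheme may differ):

```python
import numpy as np

def bootstrap_contribution_limits(contributions, n_boot=2000, alpha=0.05, seed=0):
    """Upper percentile-bootstrap confidence limit for each variable's
    contribution; `contributions` has shape (n_NOC_runs, n_variables)."""
    rng = np.random.default_rng(seed)
    n_runs = contributions.shape[0]
    boot_means = np.empty((n_boot, contributions.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n_runs, size=n_runs)  # resample runs w/ replacement
        boot_means[b] = contributions[idx].mean(axis=0)
    return np.quantile(boot_means, 1 - alpha, axis=0)  # one limit per variable

# A new run's contributions would be flagged where they exceed these limits.
```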
Genetic variability in captive populations of the stingless bee Tetragonisca angustula.
Santiago, Leandro R; Francisco, Flávio O; Jaffé, Rodolfo; Arias, Maria C
2016-08-01
Low genetic variability has normally been considered a consequence of animal husbandry and a major contributing factor to declining bee populations. Here, we performed a molecular analysis of captive and wild populations of the stingless bee Tetragonisca angustula, one of the most commonly kept species across South America. Microsatellite analyses showed similar genetic variability between wild and captive populations. However, captive populations showed lower mitochondrial genetic variability. Male-mediated gene flow and the transport and division of nests are suggested as the most probable explanations for the observed patterns of genetic structure. We conclude that increasing the number of colonies kept through nest divisions does not negatively affect nuclear genetic variability, which seems to be maintained by small-scale male dispersal and human-mediated nest transport. However, the transport of nests from distant localities should be practiced with caution, given the high genetic differentiation observed between samples from western and eastern areas. The high genetic structure observed is the result of a long-term evolutionary process, and bees from distant localities may represent unique evolutionary lineages.
Lietz, A.C.
2002-01-01
The acoustic Doppler current profiler (ADCP) and acoustic Doppler velocity meter (ADVM) were used to estimate constituent concentrations and loads at a sampling site along the Hendry-Collier County boundary in southwestern Florida. The sampling site is strategically placed within a highly managed canal system that exhibits low and rapidly changing water conditions. With the ADCP and ADVM, flow can be gaged more accurately than by conventional field-data collection methods. An ADVM velocity rating relates the measured velocity determined by the ADCP (dependent variable) to the ADVM velocity (independent variable) by means of regression analysis techniques. The coefficient of determination (R2) for this rating is 0.99 at the sampling site. Concentrations and loads of total phosphorus, total Kjeldahl nitrogen, and total nitrogen (dependent variables) were related to instantaneous discharge, acoustic backscatter, stage, or water temperature (independent variables) recorded at the time of sampling. Only positive discharges were used for this analysis. Discharges less than 100 cubic feet per second generally are considered inaccurate (probably as a result of acoustic ray bending and vertical temperature gradients in the water column). Of the concentration models, only total phosphorus was statistically significant at the 95-percent confidence level (p-value less than 0.05). Total phosphorus had an adjusted R2 of 0.93, indicating that most of the variation in the concentration can be explained by the discharge. All of the load models for total phosphorus, total Kjeldahl nitrogen, and total nitrogen were statistically significant. Most of the variation in load can be explained by the discharge, as reflected in the adjusted R2 for total phosphorus (0.98), total Kjeldahl nitrogen (0.99), and total nitrogen (0.99).
Validation of the MODIS Collection 6 MCD64 Global Burned Area Product
NASA Astrophysics Data System (ADS)
Boschetti, L.; Roy, D. P.; Giglio, L.; Stehman, S. V.; Humber, M. L.; Sathyachandran, S. K.; Zubkova, M.; Melchiorre, A.; Huang, H.; Huo, L. Z.
2017-12-01
The research, policy and management applications of satellite products place a high priority on rigorously assessing their accuracy. A number of NASA, ESA and EU funded global and continental burned area products have been developed using coarse spatial resolution satellite data, and have the potential to become part of a long-term fire Essential Climate Variable. These products have usually been validated by comparison with reference burned area maps derived by visual interpretation of Landsat or similar spatial resolution data selected on an ad hoc basis. More optimally, a design-based validation method should be adopted, characterized by the selection of reference data via probability sampling. Design-based techniques have been used for annual land cover and land cover change product validation, but have not been widely used for burned area products, or for other products that are highly variable in time and space (e.g. snow, floods, other non-permanent phenomena). This has been due to the challenge of designing an appropriate sampling strategy, and to the cost and limited availability of independent reference data. This paper describes the validation procedure adopted for the latest Collection 6 version of the MODIS Global Burned Area product (MCD64, Giglio et al, 2009). We used a tri-dimensional sampling grid that allows for probability sampling of Landsat data in time and in space (Boschetti et al, 2016). To sample the globe in the spatial domain with non-overlapping sampling units, the Thiessen Scene Area (TSA) tessellation of the Landsat WRS path/rows is used. The TSA grid is then combined with the 16-day Landsat acquisition calendar to provide tri-dimensional elements (voxels). This allows the implementation of a sampling design where not only the location but also the time interval of the reference data is explicitly drawn through stratified random sampling. The novel sampling approach was used for the selection of a reference dataset consisting of 700 Landsat 8 image pairs, interpreted according to the CEOS Burned Area Validation Protocol (Boschetti et al., 2009). Standard quantitative burned area product accuracy measures that are important for different types of fire users (Boschetti et al, 2016, Roy and Boschetti, 2009, Boschetti et al, 2004) are computed to characterize the accuracy of the MCD64 product.
Sampling Methods in Cardiovascular Nursing Research: An Overview.
Kandola, Damanpreet; Banner, Davina; O'Keefe-McCarthy, Sheila; Jassal, Debbie
2014-01-01
Cardiovascular nursing research covers a wide array of topics, from health services to psychosocial patient experiences. The selection of specific participant samples is an important part of the research design and process. The sampling strategy employed is of utmost importance to ensure that a representative sample of participants is chosen. There are two main categories of sampling methods: probability and non-probability. Probability sampling is the random selection of elements from the population, where each element of the population has an equal and independent chance of being included in the sample. There are five main types of probability sampling: simple random sampling, systematic sampling, stratified sampling, cluster sampling, and multi-stage sampling. Non-probability sampling methods are those in which elements are chosen through non-random methods for inclusion in the research study, and include convenience sampling, purposive sampling, and snowball sampling. Each approach offers distinct advantages and disadvantages and must be considered critically. In this research column, we provide an introduction to these key sampling techniques and draw on examples from cardiovascular research. Understanding the differences in sampling techniques may aid nurses in effective appraisal of research literature and provide a reference point for nurses who engage in cardiovascular research.
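The probability designs named above are easy to contrast in a few lines; a sketch (the population and strata are artificial):

```python
import random

population = list(range(1, 101))          # element IDs 1..100
random.seed(42)

# Simple random sampling: every element has an equal chance of selection.
srs = random.sample(population, 10)

# Systematic sampling: a random start, then every k-th element.
k = len(population) // 10
start = random.randrange(k)
systematic = population[start::k]

# Stratified sampling: sample within predefined strata (here, two halves).
strata = [population[:50], population[50:]]
stratified = [unit for s in strata for unit in random.sample(s, 5)]

print(srs, systematic, stratified, sep="\n")
```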
Wicki, J; Perneger, TV; Junod, AF; Bounameaux, H; Perrier, A
2000-01-01
PURPOSE: We aimed to develop a simple standardized clinical score to stratify emergency ward patients with clinically suspected pulmonary embolism (PE) into groups with a high, intermediate, or low probability of PE, in order to improve and simplify the diagnostic approach. METHODS: Analysis of a database of 1090 consecutive patients admitted to the emergency ward for suspected PE, in whom the diagnosis of PE was ruled in or out by a standard diagnostic algorithm. Logistic regression was used to identify clinical parameters associated with PE. RESULTS: 296 of 1090 patients (27%) were found to have PE. The optimal estimate of clinical probability was based on eight variables: recent surgery, previous thromboembolic event, older age, hypocapnia, hypoxemia, tachycardia, band atelectasis or elevation of a hemidiaphragm on chest X-ray. A probability score was calculated by adding the points assigned to these variables. A cut-off score of 4 best identified patients with a low probability of PE. 486 patients (49%) had a low clinical probability of PE (score ≤ 4), of whom 50 (10.3%) had proven PE. The prevalence of PE was 38% in the 437 patients with an intermediate probability (score 5-8) and 81% in the 63 patients with a high probability (score ≥ 9). CONCLUSION: This clinical score, based on easily available and objective variables, provides a standardized assessment of the clinical probability of PE. Applying this score to emergency ward patients suspected of PE could allow a more efficient diagnostic process.
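A small helper capturing the reported cut-offs (the per-variable point values are not given in the abstract and so are not reproduced):

```python
def pe_probability_category(score: int) -> str:
    """Stratify a total clinical score with the reported cut-offs
    (<=4 low, 5-8 intermediate, >=9 high)."""
    if score <= 4:
        return "low"
    if score <= 8:
        return "intermediate"
    return "high"

print(pe_probability_category(3))   # 'low'
print(pe_probability_category(10))  # 'high'
```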
Compositional cokriging for mapping the probability risk of groundwater contamination by nitrates.
Pardo-Igúzquiza, Eulogio; Chica-Olmo, Mario; Luque-Espinar, Juan A; Rodríguez-Galiano, Víctor
2015-11-01
Contamination by nitrates is an important cause of groundwater pollution and represents a potential risk to human health. Management decisions must be made using probability maps that assess the potential of the nitrate concentration to exceed regulatory thresholds. However, these maps are obtained from only a small number of sparse monitoring locations where the nitrate concentrations have been measured. It is therefore of great interest to have an efficient methodology for obtaining those probability maps. In this paper, we make use of the fact that the discrete probability density function is a compositional variable. The spatial discrete probability density function is estimated by compositional cokriging. There are several advantages in using this approach: (i) problems of classical indicator cokriging, such as estimates outside the interval (0,1) and order-relation violations, are avoided; (ii) secondary variables (e.g. aquifer parameters) can be included in the estimation of the probability maps; (iii) uncertainty maps of the probability maps can be obtained; (iv) finally, there are modelling advantages, because the variograms and cross-variograms are those of real variables, which do not have the restrictions of indicator variograms and indicator cross-variograms. The methodology was applied to the Vega de Granada aquifer in Southern Spain and the advantages of the compositional cokriging approach were demonstrated. Copyright © 2015 Elsevier B.V. All rights reserved.
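One standard way to treat a discrete probability vector compositionally is a log-ratio transform; the sketch below uses the additive log-ratio (alr), which may differ from the exact transform used in the paper:

```python
import numpy as np

def alr(p, eps=1e-9):
    """Additive log-ratio transform of a discrete probability vector,
    mapping the simplex to unconstrained real space (last part is the
    reference).  Kriging the transformed values and back-transforming
    keeps estimates inside (0, 1) and summing to one."""
    p = np.clip(np.asarray(p, float), eps, None)
    return np.log(p[:-1] / p[-1])

def alr_inv(y):
    z = np.exp(np.append(y, 0.0))
    return z / z.sum()

probs = [0.2, 0.5, 0.3]        # e.g. P(below), P(near), P(above threshold)
print(alr_inv(alr(probs)))     # recovers the original composition
```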
Sampling considerations for disease surveillance in wildlife populations
Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.
2008-01-01
Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.
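A toy simulation of the central point, with an artificial landscape in which disease clusters where animals are easiest to obtain:

```python
import numpy as np

rng = np.random.default_rng(7)
n_animals = 10_000
# Hypothetical landscape: infection is concentrated in accessible areas,
# so convenience samples over-represent it.
accessible = rng.random(n_animals) < 0.3
prevalence = np.where(accessible, 0.10, 0.02)       # clustered, not random
infected = rng.random(n_animals) < prevalence

srs = rng.choice(n_animals, 500, replace=False)             # probability sample
conv = rng.choice(np.flatnonzero(accessible), 500, False)   # convenience sample
print("true prevalence:", infected.mean())
print("SRS estimate:   ", infected[srs].mean())     # approximately unbiased
print("convenience est:", infected[conv].mean())    # biased upward here
```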
Approximation of Failure Probability Using Conditional Sampling
NASA Technical Reports Server (NTRS)
Giesy, Daniel P.; Crespo, Luis G.; Kenney, Sean P.
2008-01-01
In analyzing systems which depend on uncertain parameters, one technique is to partition the uncertain parameter domain into a failure set and its complement, and judge the quality of the system by estimating the probability of failure. If this is done by a sampling technique such as Monte Carlo and the probability of failure is small, accurate approximation can require so many sample points that the computational expense is prohibitive. Previous work of the authors has shown how to bound the failure event by sets of such simple geometry that their probabilities can be calculated analytically. In this paper, it is shown how to make use of these failure bounding sets and conditional sampling within them to substantially reduce the computational burden of approximating failure probability. It is also shown how the use of these sampling techniques improves the confidence intervals for the failure probability estimate for a given number of sample points and how they reduce the number of sample point analyses needed to achieve a given level of confidence.
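A self-contained toy example of the estimator P(F) = P(B) x P(F|B), with a bounding set B whose probability is analytic (the paper's bounding sets are constructed differently):

```python
import numpy as np

rng = np.random.default_rng(3)
c = 0.999
# Failure set F = {u1*u2 > c} for u ~ Uniform([0,1]^2).  Since u1, u2 <= 1,
# F is contained in the box B = {u1 > c, u2 > c}, whose probability is
# analytic: P(B) = (1 - c)**2.  Sampling only inside B gives
# P(F) = P(B) * P(F | B).
p_B = (1.0 - c) ** 2
u = rng.uniform(c, 1.0, size=(100_000, 2))        # samples conditioned on B
p_F = p_B * np.mean(u[:, 0] * u[:, 1] > c)
print("conditional estimate:", p_F)               # ~5.0e-07
print("exact value:         ", 1 - c + c * np.log(c))
# Crude Monte Carlo over the whole square would see on average only one
# failure per ~2 million samples; conditioning spends every sample inside B.
```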
Khan, Hafiz; Saxena, Anshul; Perisetti, Abhilash; Rafiq, Aamrin; Gabbidon, Kemesha; Mende, Sarah; Lyuksyutova, Maria; Quesada, Kandi; Blakely, Summre; Torres, Tiffany; Afesse, Mahlet
2016-12-01
Background: Breast cancer is a worldwide public health concern and is the most prevalent type of cancer in women in the United States. This study concerned the best fit of statistical probability models to survival times for nine state cancer registries: California, Connecticut, Georgia, Hawaii, Iowa, Michigan, New Mexico, Utah, and Washington. Materials and Methods: A probability random sampling method was applied to select and extract the records of 2,000 breast cancer patients from the Surveillance Epidemiology and End Results (SEER) database for each of the nine state cancer registries used in this study. EasyFit software was utilized to identify the best probability models through goodness-of-fit tests, and to estimate parameters for the statistical probability distributions fitted to the survival data. Results: Summary statistics are reported for each of the states for the years 1973 to 2012. Kolmogorov-Smirnov, Anderson-Darling, and Chi-squared goodness-of-fit tests were applied to the survival data, with the best-ranking (smallest) test statistics taken to indicate the best-fit survival model for each state. Conclusions: California, Connecticut, Georgia, Iowa, New Mexico, and Washington followed the Burr probability distribution, while the Dagum probability distribution gave the best fit for Michigan and Utah, and Hawaii followed the Gamma probability distribution. These findings highlight differences between states through selected sociodemographic variables and also demonstrate differences in probability models of breast cancer survival times. The results of this study can be used to guide healthcare providers and researchers in further investigations of social and environmental factors, with the aim of reducing the occurrence of and mortality due to breast cancer. Creative Commons Attribution License.
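A scipy analogue of this distribution-fitting workflow, applied to hypothetical survival times and ranked by the Kolmogorov-Smirnov statistic (smaller is better):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Hypothetical survival times (months); the SEER records are not reproduced.
times = rng.gamma(shape=2.0, scale=30.0, size=2000)

candidates = {"gamma": stats.gamma, "lognorm": stats.lognorm,
              "weibull": stats.weibull_min}
for name, dist in candidates.items():
    params = dist.fit(times)
    ks = stats.kstest(times, dist.cdf, args=params).statistic
    print(f"{name:8s} KS = {ks:.4f}")   # smaller statistic = closer fit
```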
Risk-based water resources planning: Incorporating probabilistic nonstationary climate uncertainties
NASA Astrophysics Data System (ADS)
Borgomeo, Edoardo; Hall, Jim W.; Fung, Fai; Watts, Glenn; Colquhoun, Keith; Lambert, Chris
2014-08-01
We present a risk-based approach for incorporating nonstationary probabilistic climate projections into long-term water resources planning. The proposed methodology uses nonstationary synthetic time series of future climates obtained via a stochastic weather generator based on the UK Climate Projections (UKCP09) to construct a probability distribution of the frequency of water shortages in the future. The UKCP09 projections extend well beyond the range of current hydrological variability, providing the basis for testing the robustness of water resources management plans to future climate-related uncertainties. The nonstationary nature of the projections combined with the stochastic simulation approach allows for extensive sampling of climatic variability conditioned on climate model outputs. The probability of exceeding planned frequencies of water shortages of varying severity (defined as Levels of Service for the water supply utility company) is used as a risk metric for water resources planning. Different sources of uncertainty, including demand-side uncertainties, are considered simultaneously and their impact on the risk metric is evaluated. Supply-side and demand-side management strategies can be compared based on how cost-effective they are at reducing risks to acceptable levels. A case study based on a water supply system in London (UK) is presented to illustrate the methodology. Results indicate an increase in the probability of exceeding the planned Levels of Service across the planning horizon. Under a 1% per annum population growth scenario, the probability of exceeding the planned Levels of Service is as high as 0.5 by 2040. The case study also illustrates how a combination of supply and demand management options may be required to reduce the risk of water shortages.
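The risk metric reduces to counting, across an ensemble of synthetic futures, how often the simulated shortage frequency exceeds the planned Level of Service; a sketch with made-up numbers:

```python
import numpy as np

def shortage_exceedance_probability(annual_shortages, planned_frequency):
    """Fraction of simulated futures in which the frequency of water
    shortages exceeds the planned Level of Service.  `annual_shortages`
    has shape (n_scenarios, n_years): one row per synthetic climate."""
    simulated_frequency = annual_shortages.mean(axis=1)  # per-scenario frequency
    return np.mean(simulated_frequency > planned_frequency)

rng = np.random.default_rng(11)
ensemble = rng.random((1000, 25)) < 0.12     # hypothetical shortage indicators
print(shortage_exceedance_probability(ensemble, planned_frequency=0.10))
```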
NASA Technical Reports Server (NTRS)
Cerniglia, M. C.; Douglass, A. R.; Rood, R. B.; Sparling, L. C.; Nielsen, J. E.
1999-01-01
We present a study of the distribution of ozone in the lowermost stratosphere with the goal of understanding the relative contribution to the observations of air of either distinctly tropospheric or stratospheric origin. The air in the lowermost stratosphere is divided into two population groups based on Ertel's potential vorticity at 300 hPa. High [low] potential vorticity at 300 hPa suggests that the tropopause is low [high], and the identification of the two groups helps to account for dynamic variability. Conditional probability distribution functions are used to define the statistics of the mix from both observations and model simulations. Two data sources are chosen. First, several years of ozonesonde observations are used to exploit the high vertical resolution. Second, observations made by the Halogen Occultation Experiment [HALOE] on the Upper Atmosphere Research Satellite [UARS] are used to understand the impact on the results of the spatial limitations of the ozonesonde network. The conditional probability distribution functions are calculated at a series of potential temperature surfaces spanning the domain from the midlatitude tropopause to surfaces higher than the mean tropical tropopause [about 380 K]. Despite the differences in spatial and temporal sampling, the probability distribution functions are similar for the two data sources. Comparisons with the model demonstrate that the model maintains a mix of air in the lowermost stratosphere similar to the observations. The model also simulates a realistic annual cycle. By using the model, possible mechanisms for the maintenance of the mix of air in the lowermost stratosphere are revealed. The relevance of the results to the assessment of the environmental impact of aircraft effluent is discussed.
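A minimal sketch of the conditioning step, assuming collocated ozone and 300-hPa potential vorticity arrays (the variable names are illustrative):

```python
import numpy as np

def conditional_ozone_pdfs(ozone, pv300, pv_threshold, bins=30):
    """Split ozone observations on a potential-temperature surface into two
    populations by the collocated 300-hPa potential vorticity, and return a
    normalized histogram (conditional PDF estimate) for each group."""
    high = ozone[pv300 >= pv_threshold]    # low tropopause
    low = ozone[pv300 < pv_threshold]      # high tropopause
    edges = np.histogram_bin_edges(ozone, bins=bins)
    pdf_high, _ = np.histogram(high, bins=edges, density=True)
    pdf_low, _ = np.histogram(low, bins=edges, density=True)
    return edges, pdf_high, pdf_low
```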
Assessing Aircraft Supply Air to Recommend Compounds for Timely Warning of Contamination
NASA Astrophysics Data System (ADS)
Fox, Richard B.
Taking aircraft out of service for even one day to correct fume-in-cabin events can cost the industry roughly $630 million per year in lost revenue. This quantitative correlation study investigated relationships between measured concentrations of contaminants in bleed air and the probability of odor detectability. Data were collected from 94 aircraft engine and auxiliary power unit (APU) bleed air tests in an archival data set spanning 1997 to 2011. Initial Pearson correlation analysis found no relationships; it was followed by regression analysis for individual contaminants, which identified significant relationships between concentrations of compounds in bleed air and the probability of odor detectability (p<0.05), as well as between compound concentration and the probability of sensory irritancy detectability. Study results may be useful for establishing early warning levels. Predictive trend monitoring, a method of identifying potential pending failure modes within a mechanical system, may allow scheduled down-time for maintenance as a planned event rather than repair after a mechanical failure, and thereby reduce the operational costs associated with odor-in-cabin events. Twenty compounds (independent variables) were found to be statistically significantly related to the probability of odor detectability (dependent variable 1). Seventeen compounds (independent variables) were found to be statistically significantly related to the probability of sensory irritancy detectability (dependent variable 2). Additional research was recommended to further investigate relationships between concentrations of contaminants and the probability of odor or sensory irritancy detectability for all turbine oil brands. Further research on the implementation of predictive trend monitoring may be warranted to demonstrate how the monitoring process might be applied in flight.
Magruder, J Trent; Blasco-Colmenares, Elena; Crawford, Todd; Alejo, Diane; Conte, John V; Salenger, Rawn; Fonner, Clifford E; Kwon, Christopher C; Bobbitt, Jennifer; Brown, James M; Nelson, Mark G; Horvath, Keith A; Whitman, Glenn R
2017-01-01
Variation in red blood cell (RBC) transfusion practices exists at cardiac surgery centers across the nation. We tested the hypothesis that significant variation in RBC transfusion practices between centers in our state's cardiac surgery quality collaborative remains even after risk adjustment. Using a multi-institutional statewide database created by the Maryland Cardiac Surgery Quality Initiative (MCSQI), we included patient-level data from 8,141 patients undergoing isolated coronary artery bypass (CAB) or aortic valve replacement at 1 of 10 centers. Risk-adjusted multivariable logistic regression models were constructed to predict the need for any intraoperative RBC transfusion, as well as for any postoperative RBC transfusion, with anonymized center number included as a factor variable. Unadjusted intraoperative RBC transfusion probabilities at the 10 centers ranged from 13% to 60%; postoperative RBC transfusion probabilities ranged from 16% to 41%. After risk adjustment with demographic, comorbidity, and operative data, significant intercenter variability was documented (intraoperative probability range, 4%-59%; postoperative probability range, 13%-39%). When patients were stratified by preoperative hematocrit quartile, significant variability in intraoperative transfusion probability was seen in all quartiles (lowest quartile: mean hematocrit value, 30.5% ± 4.1%; probability range, 17%-89%; highest quartile: mean hematocrit value, 44.8% ± 2.5%; probability range, 1%-35%). Significant variation in intercenter RBC transfusion practices exists for both intraoperative and postoperative transfusions, even after risk adjustment, among our state's centers. Variability in intraoperative RBC transfusion persisted across quartiles of preoperative hematocrit values. Copyright © 2017 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
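A hedged sketch of this kind of risk-adjusted model with a center factor, using statsmodels on synthetic data (the column names are hypothetical stand-ins for the registry fields):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the registry (column names are hypothetical).
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "center": rng.integers(1, 11, n),          # anonymized center 1-10
    "age": rng.normal(65, 10, n),
    "preop_hct": rng.normal(38, 5, n),
})
logit_true = -4 + 0.05 * df.age - 0.1 * (df.preop_hct - 38) + 0.2 * df.center
df["intraop_rbc"] = (rng.random(n) < 1 / (1 + np.exp(-logit_true))).astype(int)

# Risk-adjusted logistic model with center as a factor variable: the C(center)
# coefficients capture residual inter-center variation after adjustment.
model = smf.logit("intraop_rbc ~ C(center) + age + preop_hct", data=df).fit()
print(model.params.filter(like="center"))
```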
Pretest variables that improve the predictive value of exercise testing in women.
Lamont, L S; Bobb, J; Blissmer, B; Desai, V
2015-12-01
Graded exercise testing (GXT) is used in coronary artery disease (CAD) prevention and rehabilitation programs. In women, this test has decreased accuracy and predictive value, but few studies have examined the predictors of a verified positive test. The aim of this study was to determine the pretest variables that might enhance the predictive value of the GXT in women. Medical records of 1761 patients referred for GXTs over a 5-year period were screened. Demographic, medical, and exercise test variables were analyzed. The GXTs of 403 women were available for inclusion, and the women were stratified into three groups: positive responders subsequently shown to have CAD (N=28, verified positive [VP]), positive responders not shown to have CAD (N=84, non-verified positive [NVP]), and negative GXT responders (N=291). Both univariate and multivariate stepwise regression analyses were performed on these data. Pretest variables that differentiated between the VP and NVP groups were: older age (65.8 vs. 60.2 yr, P<0.05), greater BMI (30.8 vs. 28.8 kg/m2), diabetes status or elevated fasting glucose (107.4 vs. 95.2 mg/dL, P<0.05), and the use of some cardiovascular medications. The subsequent linear regression analysis emphasized that HDL cholesterol and beta-blocker usage were the most predictive of a positive exercise test in this cohort. The American Heart Association recommends GXTs in women with an intermediate pretest probability of CAD, but only two clinical variables are available prior to testing to make this probability decision: age and quality of chest pain. This study showed that other pretest variables, such as BMI, blood chemistry (glucose and lipoprotein levels), and the use of cardiovascular medications, are useful in clinical decision making. These pretest variables improved the predictive value of the GXTs in our sample.
Paretti, Nicholas; Coes, Alissa L.; Kephart, Christopher M.; Mayo, Justine
2018-03-05
Tumacácori National Historical Park protects the culturally important Mission, San José de Tumacácori, while also managing a portion of the ecologically diverse riparian corridor of the Santa Cruz River. This report describes the methods and quality-assurance procedures used in the collection of water samples for the analysis of Escherichia coli (E. coli), microbial source tracking markers, suspended sediment, water-quality parameters, and turbidity, and in the collection of discharge and stage data; the process for data review and approval is also described. Finally, this report provides a quantitative assessment of the quality of the E. coli, microbial source tracking, and suspended sediment data. The data-quality assessment revealed that bias attributed to field and laboratory contamination was minimal, with E. coli detections in only 3 of 33 field blank samples analyzed. Concentrations in the field blanks were several orders of magnitude lower than environmental concentrations. The microbial source tracking (MST) field blank was below the detection limit for all MST markers analyzed. Laboratory blanks for E. coli at the USGS Arizona Water Science Center and laboratory blanks for MST markers at the USGS Ohio Water Microbiology Laboratory were all below the detection limit. Replicate data for E. coli and suspended sediment indicated that bias was not introduced to the data by combining samples collected using discrete sampling methods with samples collected using automatic sampling methods. The split and sequential E. coli replicate data showed consistent analytical variability, and a single equation was developed to explain the variability of E. coli concentrations. An additional analysis of analytical variability for E. coli indicated analytical variability of around 18 percent relative standard deviation, and no trend was observed in concentration during the processing and analysis of multiple split replicates. Two replicate samples were collected for MST, and individual markers were compared for a base-flow sample and a flood sample. For the markers found in common between the two types of samples, the relative standard deviation for the base-flow sample was more than 3 times greater than for the markers in the flood sample. Sequential suspended-sediment replicates had a relative standard deviation of about 1.3 percent, indicating that environmental and analytical variability was minimal. A holding-time review and laboratory study analysis supported the extended holding times required for this investigation. Most concentrations for flood and base-flow samples were within the theoretical variability specified in the most-probable-number approach, suggesting that extended hold times did not overly influence the final concentrations reported.
van Reenen, Mari; Westerhuis, Johan A; Reinecke, Carolus J; Venter, J Hendrik
2017-02-02
ERp is a variable selection and classification method for metabolomics data. ERp uses minimized classification error rates, based on data from a control and experimental group, to test the null hypothesis of no difference between the distributions of variables over the two groups. If the associated p-values are significant they indicate discriminatory variables (i.e. informative metabolites). The p-values are calculated assuming a common continuous strictly increasing cumulative distribution under the null hypothesis. This assumption is violated when zero-valued observations can occur with positive probability, a characteristic of GC-MS metabolomics data, disqualifying ERp in this context. This paper extends ERp to address two sources of zero-valued observations: (i) zeros reflecting the complete absence of a metabolite from a sample (true zeros); and (ii) zeros reflecting a measurement below the detection limit. This is achieved by allowing the null cumulative distribution function to take the form of a mixture between a jump at zero and a continuous strictly increasing function. The extended ERp approach is referred to as XERp. XERp is no longer non-parametric, but its null distributions depend only on one parameter, the true proportion of zeros. Under the null hypothesis this parameter can be estimated by the proportion of zeros in the available data. XERp is shown to perform well with regard to bias and power. To demonstrate the utility of XERp, it is applied to GC-MS data from a metabolomics study on tuberculosis meningitis in infants and children. We find that XERp is able to provide an informative shortlist of discriminatory variables, while attaining satisfactory classification accuracy for new subjects in a leave-one-out cross-validation context. XERp takes into account the distributional structure of data with a probability mass at zero without requiring any knowledge of the detection limit of the metabolomics platform. XERp is able to identify variables that discriminate between two groups by simultaneously extracting information from the difference in the proportion of zeros and shifts in the distributions of the non-zero observations. XERp uses simple rules to classify new subjects and a weight pair to adjust for unequal sample sizes or sensitivity and specificity requirements.
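The modified null distribution is simple to write down; a sketch, with pi0 estimated by the pooled proportion of zeros as the paper describes:

```python
import numpy as np
from scipy.stats import expon

def mixture_null_cdf(x, pi0, continuous_cdf):
    """Null CDF of the XERp form: a jump of size pi0 at zero plus a
    continuous strictly increasing part for the positive observations."""
    x = np.asarray(x, float)
    return np.where(x < 0, 0.0, pi0 + (1.0 - pi0) * continuous_cdf(x))

# pi0 is estimated under the null by the pooled proportion of zeros:
values = np.array([0.0, 0.0, 1.2, 0.7, 0.0, 2.3, 0.4])
pi0_hat = np.mean(values == 0.0)                      # 3/7 ~= 0.43
# expon here is only a placeholder for the continuous part:
print(mixture_null_cdf([0.0, 1.0], pi0_hat, expon.cdf))
```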
Does private religious activity prolong survival? A six-year follow-up study of 3,851 older adults.
Helm, H M; Hays, J C; Flint, E P; Koenig, H G; Blazer, D G
2000-07-01
Previous studies have linked higher religious attendance and longer survival. In this study, we examine the relationship between survival and private religious activity. A probability sample of elderly community-dwelling adults in North Carolina was assembled in 1986 and followed for 6 years. Level of participation in private religious activities such as prayer, meditation, or Bible study was assessed by self-report at baseline, along with a wide variety of sociodemographic and health variables. The main outcome was time (days) to death or censoring. During a median 6.3-year follow-up period, 1,137 subjects (29.5%) died. Those reporting rarely to never participating in private religious activity had an increased relative hazard of dying compared with more frequent participants, but this hazard did not remain significant for the sample as a whole after adjustment for demographic and health variables. When the sample was divided into activity of daily living (ADL) impaired and unimpaired groups, the effect did not remain significant for the ADL-impaired group after controlling for demographic variables (relative hazard [RH] 1.11, 95% confidence interval [CI] 0.91-1.35). However, the increased hazard remained significant for the ADL-unimpaired group even after controlling for demographic and health variables (RH 1.63, 95% CI 1.20-2.21), and this effect persisted despite controlling for numerous explanatory variables including health practices, social support, and other religious practices (RH 1.47, 95% CI 1.07-2.03). Older adults who participate in private religious activity before the onset of ADL impairment appear to have a survival advantage over those who do not.
Testing the relativistic Doppler boost hypothesis for supermassive black hole binary candidates
NASA Astrophysics Data System (ADS)
Charisi, Maria; Haiman, Zoltán; Schiminovich, David; D'Orazio, Daniel J.
2018-06-01
Supermassive black hole binaries (SMBHBs) should be common in galactic nuclei as a result of frequent galaxy mergers. Recently, a large sample of sub-parsec SMBHB candidates was identified as bright periodically variable quasars in optical surveys. If the observed periodicity corresponds to the redshifted binary orbital period, the inferred orbital velocities are relativistic (v/c ≈ 0.1). The optical and ultraviolet (UV) luminosities are expected to arise from gas bound to the individual BHs, and would be modulated by the relativistic Doppler effect. The optical and UV light curves should vary in tandem with relative amplitudes which depend on the respective spectral slopes. We constructed a control sample of 42 quasars with aperiodic variability, to test whether this Doppler colour signature can be distinguished from intrinsic chromatic variability. We found that the Doppler signature can arise by chance in ~20 per cent (~37 per cent) of quasars in the nUV (fUV) band. These probabilities reflect the limited quality of the control sample and represent upper limits on how frequently quasars mimic the Doppler brightness+colour variations. We performed separate tests on the periodic quasar candidates, and found that for the majority, the Doppler boost hypothesis requires an unusually steep UV spectrum or an unexpectedly large BH mass and orbital velocity. We conclude that at most approximately one-third of these periodic candidates can harbor Doppler-modulated SMBHBs.
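For reference, the standard first-order Doppler-boost relations consistent with the description above, for a source with spectral slope alpha (F_nu ∝ nu^alpha), line-of-sight velocity fraction beta_parallel = v_parallel/c, and A the fractional variability amplitude in a band:

```latex
F_\nu^{\mathrm{obs}} = D^{\,3-\alpha} F_\nu^{\mathrm{em}}, \qquad
D = \bigl[\gamma\,(1-\beta_\parallel)\bigr]^{-1}, \qquad
\frac{\Delta F_\nu}{F_\nu} \simeq (3-\alpha)\,\beta_\parallel,
\qquad\text{so}\qquad
\frac{A_{\mathrm{UV}}}{A_{\mathrm{opt}}}
  = \frac{3-\alpha_{\mathrm{UV}}}{3-\alpha_{\mathrm{opt}}}.
```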
Rizzo, Austin A.; Brown, Donald J.; Welsh, Stuart A.; Thompson, Patricia A.
2017-01-01
Population monitoring is an essential component of endangered species recovery programs. The federally endangered Diamond Darter Crystallaria cincotta is in need of an effective monitoring design to improve our understanding of its distribution and track population trends. Because of their small size, cryptic coloration, and nocturnal behavior, along with limitations associated with current sampling methods, individuals are difficult to detect at known occupied sites. Therefore, research is needed to determine whether survey efforts can be improved by increasing the probability of individual detection. The primary objective of this study was to determine whether there are seasonal and diel patterns in Diamond Darter detectability during population surveys. In addition to temporal factors, we also assessed five habitat variables that might influence individual detection. We used N-mixture models to estimate site abundances and relationships between covariates and individual detectability, and ranked models using Akaike's information criterion. During 2015, three known occupied sites were each sampled 15 times between May and October. The best-supported model included water temperature as a quadratic function influencing individual detectability, with temperatures around 22 C resulting in the highest detection probability. Detection probability when surveying at the optimal temperature was approximately 6% and 7.5% greater than when surveying at 16 C and 29 C, respectively. Time of night and day of year were not strong predictors of Diamond Darter detectability. The results of this study will allow researchers and agencies to maximize detection probability when surveying populations, resulting in greater monitoring efficiency and likely more precise abundance estimates.
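A sketch of a logit-quadratic detection model of the kind described; the coefficients below are hypothetical, chosen only so that detection peaks near 22 C rather than fitted to the Diamond Darter data:

```python
import numpy as np

def detection_probability(temp_c, b0=-2.0, b1=0.35, b2=-0.008):
    """Detection probability as a quadratic function of water temperature
    on the logit scale; peaks at -b1/(2*b2) ~= 22 C with these values."""
    eta = b0 + b1 * temp_c + b2 * temp_c ** 2
    return 1.0 / (1.0 + np.exp(-eta))

for t in (16, 22, 29):
    print(t, round(detection_probability(t), 3))
```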
Garcia-Saenz, A; Napp, S; Lopez, S; Casal, J; Allepuz, A
2015-10-01
The achievement of Officially Tuberculosis Free (OTF) status in regions with low bovine tuberculosis (bTB) herd prevalence, as is the case in North-Eastern Spain (Catalonia), might be a likely option in the medium term. In this context, risk-based approaches could be an alternative to the costly current surveillance strategy. However, before any change in the system may be contemplated, a reliable estimate of the sensitivity of the different surveillance components is needed. In this study, we focused on the slaughterhouse component. The probability of detection of bTB-infected cattle by the slaughterhouses in Catalonia was estimated as the product of three consecutive probabilities: (P1) the probability that a bTB-infected animal arrived at the slaughterhouse presenting Macroscopically Detectable Lesions (MDL); (P2) the probability that MDL were detected by the routine meat inspection process; and (P3) the probability that the veterinary officer suspected bTB and sent the sample for laboratory confirmation. The first probability was obtained from data collected through the bTB eradication program carried out in Catalonia between 2005 and 2008, while the last two were obtained through the expert opinion of the veterinary officers working at the slaughterhouses, who completed a questionnaire administered during 2014. The bTB surveillance sensitivity of the different cattle slaughterhouses in Catalonia obtained in this study was 31.4% (95% CI: 28.6-36.2), and there were important differences among them. The low bTB surveillance sensitivity was mainly related to the low probability that a bTB-infected animal arrived at the slaughterhouse presenting MDL (around 44.8%). The variability of the sensitivity among the different slaughterhouses could be explained by significant associations between some variables included in the survey and P2. For instance, factors such as attendance at training courses, the number of meat technicians, and the speed of the slaughter chain were significantly related to the probability that MDL were detected by the meat inspection procedure carried out in the slaughterhouse. Technical and policy efforts should be focused on the improvement of these factors in order to maximize slaughterhouse sensitivity. Copyright © 2015 Elsevier B.V. All rights reserved.
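The component sensitivity is the product of the three conditional probabilities; an illustrative calculation (P2 and P3 here are hypothetical, chosen only to reproduce the reported order of magnitude):

```python
# Slaughterhouse surveillance sensitivity as P1 * P2 * P3.
p1 = 0.448   # infected animal arrives with MDL (reported in the abstract)
p2 = 0.80    # MDL detected by routine meat inspection (hypothetical)
p3 = 0.875   # inspector suspects bTB and submits the sample (hypothetical)
sensitivity = p1 * p2 * p3
print(f"slaughterhouse component sensitivity ~= {sensitivity:.3f}")  # ~0.314
```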
[Prolonged mechanical ventilation probability model].
Añón, J M; Gómez-Tello, V; González-Higueras, E; Oñoro, J J; Córcoles, V; Quintana, M; López-Martínez, J; Marina, L; Choperena, G; García-Fernández, A M; Martín-Delgado, C; Gordo, F; Díaz-Alersi, R; Montejo, J C; Lorenzo, A García de; Pérez-Arriaga, M; Madero, R
2012-10-01
To design a probability model for prolonged mechanical ventilation (PMV) using variables obtained during the first 24 hours of the start of MV. An observational, prospective, multicenter cohort study. Thirteen Spanish medical-surgical intensive care units. Adult patients requiring mechanical ventilation for more than 24 hours. None. APACHE II, SOFA, demographic data, clinical data, reason for mechanical ventilation, comorbidity, and functional condition. A multivariate risk model was constructed. The model considered a dependent variable with three possible conditions: 1. early mortality; 2. early extubation; and 3. PMV. Of the 1661 included patients, 67.9% (n=1127) were men. Age: 62.1±16.2 years. APACHE II: 20.3±7.5. Total SOFA: 8.4±3.5. The APACHE II and SOFA scores were higher in patients ventilated for 7 or more days (p=0.04 and p=0.0001, respectively). Noninvasive ventilation failure was related to PMV (p=0.005). A multivariate model for the three outcomes described above was generated. The overall accuracy of the model in the training and validation samples was 0.763 (95% CI: 0.729-0.804) and 0.751 (95% CI: 0.672-0.816), respectively. The likelihood ratios (LRs) for early extubation, at a cutoff point of 0.65, in the training sample were LR(+): 2.37 (95% CI: 1.77-3.19) and LR(-): 0.47 (95% CI: 0.41-0.55). The LRs for the early mortality model, at a cutoff point of 0.73, in the training sample were LR(+): 2.64 (95% CI: 2.01-3.4) and LR(-): 0.39 (95% CI: 0.30-0.51). The proposed model could be a helpful tool in decision making. However, because of its moderate accuracy, it should be considered a first approach, and the results should be corroborated by further studies involving larger samples and the use of standardized criteria. Copyright © 2011 Elsevier España, S.L. y SEMICYUC. All rights reserved.
Wiens, David; Kolar, Patrick; Hunt, W. Grainger; Hunt, Teresa; Fuller, Mark R.; Bell, Douglas A.
2018-01-01
We used a broad-scale sampling design to investigate spatial patterns in occupancy and breeding success of territorial pairs of Golden Eagles (Aquila chrysaetos) in the Diablo Range, California, USA, during a period of exceptional drought (2014–2016). We surveyed 138 randomly selected sample sites over 4 occasions each year and identified 199 pairs of eagles, 100 of which were detected in focal sample sites. We then used dynamic multistate modeling to identify relationships between site occupancy and reproduction of Golden Eagles relative to spatial variability in landscape composition and drought conditions. We observed little variability among years in site occupancy (3-yr mean = 0.74), but the estimated annual probability of successful reproduction was relatively low during the study period and declined from 0.39 (± 0.08 SE) to 0.18 (± 0.07 SE). Probabilities of site occupancy and reproduction were substantially greater at sample sites that were occupied by successful breeders in the previous year, indicating the presence of sites that were consistently used by successfully reproducing eagles. We found strong evidence for nonrandom spatial distribution in both occupancy and reproduction: Sites with the greatest potential for occupancy were characterized by rugged terrain conditions with intermediate amounts of grassland interspersed with patches of oak woodland and coniferous forest, whereas successful reproduction was most strongly associated with the amount of precipitation that a site received during the nesting period. Our findings highlight the contribution of consistently occupied and productive breeding sites to overall productivity of the local breeding population, and show that both occupancy and reproduction at these sites were maintained even during a period of exceptional drought. Our approach to quantifying and mapping site quality should be especially useful for the spatial prioritization of compensation measures intended to help offset the impacts of increasing human land use and development on Golden Eagles and their habitats.
Ellis, Ian O.; Green, Andrew R.; Hanka, Rudolf
2008-01-01
Background We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples. Methodology/Principal Findings We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples. Conclusions/Significance Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in clinical studies. PMID:18698346
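The first, non-chance-corrected score is straightforward to compute even with missing data; a sketch (toy grades, three raters, two samples):

```python
from collections import Counter

def consensus_agreement(grades):
    """Per-rater proportion of agreement with each sample's modal (consensus)
    grade; `grades` maps rater -> {sample_id: grade} and may be incomplete."""
    by_sample = {}
    for rater, seen in grades.items():
        for sample, g in seen.items():
            by_sample.setdefault(sample, []).append(g)
    consensus = {s: Counter(v).most_common(1)[0][0] for s, v in by_sample.items()}
    return {rater: sum(g == consensus[s] for s, g in seen.items()) / len(seen)
            for rater, seen in grades.items()}

print(consensus_agreement({"A": {1: 2, 2: 3}, "B": {1: 2, 2: 2}, "C": {1: 1, 2: 3}}))
# {'A': 1.0, 'B': 0.5, 'C': 0.5} -- note the score's bias toward raters who,
# by chance, see an 'easy' set of samples, which the Bayesian model addresses.
```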
Bakbergenuly, Ilyas; Kulinskaya, Elena; Morgenthaler, Stephan
2016-07-01
We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group-level studies or in meta-analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log-odds and arcsine transformations of the estimated probability p̂, both for single-group studies and in combining results from several groups or studies in meta-analysis. Our simulations confirm that these biases are linear in ρ, the intracluster correlation coefficient, for small values of ρ. These biases do not depend on the sample sizes or the number of studies K in a meta-analysis, and result in abysmal coverage of the combined effect for large K. We also propose a bias correction for the arcsine transformation. Our simulations demonstrate that this bias correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta-analyses of prevalence. © 2016 The Authors. Biometrical Journal Published by Wiley-VCH Verlag GmbH & Co. KGaA.
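A quick simulation of the arcsine-transform bias under a beta-binomial model with intracluster correlation rho (the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(9)
p, n, rho = 0.2, 40, 0.1
# Beta(a, b) cluster probabilities with a + b = (1 - rho) / rho give a
# beta-binomial with intracluster correlation rho.
a, b = p * (1 - rho) / rho, (1 - p) * (1 - rho) / rho
x = rng.binomial(n, rng.beta(a, b, size=200_000))
phat = x / n
# Bias of the transformed estimate relative to the transformed true value:
bias = np.mean(np.arcsin(np.sqrt(phat))) - np.arcsin(np.sqrt(p))
print(f"arcsine-transform bias ~= {bias:.4f}")
```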
Patients with schizophrenia activate behavioural intentions facilitated by counterfactual reasoning.
Contreras, Fernando; Albacete, Auria; Tebé, Cristian; Benejam, Bessy; Caño, Agnes; Menchón, José Manuel
2017-01-01
The main variables assessed were: the answer to complete a target task (incorrect or correct), and the percentage gain in reaction time (RT) to complete a target task correctly, depending on whether the prime was a counterfactual or a neutral-control cue. These variables were assessed in 37 patients with schizophrenia and 37 healthy controls. Potential associations with clinical status and socio-demographic characteristics were also explored. When a counterfactual prime was presented, the probability of giving an incorrect answer was lower for the entire sample than when a neutral prime was presented (OR 0.58; 95% CI 0.42 to 0.79), but the schizophrenia patients showed a higher probability than the controls of giving an incorrect answer (OR 3.89; 95% CI 2.0 to 7.6). Both the schizophrenia patients and the controls showed a similar percentage gain in RT to a correct answer, of 8%. Challenging the results of previous research, our findings suggest a normal activation of behavioural intentions facilitated by counterfactual thinking (CFT) in schizophrenia. Nevertheless, the patients showed more difficulty than the controls with the task, adding support to the concept of CFT as a potential new target for consideration in future therapeutic approaches for this illness.
Stochastic transport models for mixing in variable-density turbulence
NASA Astrophysics Data System (ADS)
Bakosi, J.; Ristorcelli, J. R.
2011-11-01
In variable-density (VD) turbulent mixing, where very-different-density materials coexist, the density fluctuations can be an order of magnitude larger than their mean. Density fluctuations are non-negligible in the inertia terms of the Navier-Stokes equation which has both quadratic and cubic nonlinearities. Very different mixing rates of different materials give rise to large differential accelerations and some fundamentally new physics that is not seen in constant-density turbulence. In VD flows material mixing is active in a sense far stronger than that applied in the Boussinesq approximation of buoyantly-driven flows: the mass fraction fluctuations are coupled to each other and to the fluid momentum. Statistical modeling of VD mixing requires accounting for basic constraints that are not important in the small-density-fluctuation passive-scalar-mixing approximation: the unit-sum of mass fractions, bounded sample space, and the highly skewed nature of the probability densities become essential. We derive a transport equation for the joint probability of mass fractions, equivalent to a system of stochastic differential equations, that is consistent with VD mixing in multi-component turbulence and consistently reduces to passive scalar mixing in constant-density flows.
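For reference, the generic correspondence the abstract invokes, between an Ito diffusion for the mass-fraction vector and the transport equation for its joint PDF (the paper's specific drift and diffusion coefficients are not reproduced here):

```latex
% Ito diffusion for the mass-fraction vector Y and the corresponding
% Fokker--Planck (transport) equation for its joint PDF f(y, t);
% repeated indices are summed.  The model's coefficients are built to
% respect the unit-sum and boundedness constraints.
dY_i = a_i(\mathbf{Y},t)\,dt + b_{ij}(\mathbf{Y},t)\,dW_j,
\qquad
\frac{\partial f}{\partial t}
  = -\frac{\partial}{\partial y_i}\bigl[a_i f\bigr]
  + \frac{1}{2}\,\frac{\partial^2}{\partial y_i\,\partial y_j}
    \bigl[(b\,b^{\mathsf{T}})_{ij}\, f\bigr].
```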
Occupancy and abundance of the endangered yellowcheek darter in Arkansas
Magoulick, Daniel D.; Lynch, Dustin T.
2015-01-01
The Yellowcheek Darter (Etheostoma moorei) is a rare fish endemic to the Little Red River watershed in the Boston Mountains of northern Arkansas. Remaining populations of this species are geographically isolated and declining, and the species was listed in 2011 as federally endangered. Populations have declined, in part, due to intense seasonal stream drying and inundation of lower reaches by a reservoir. We used a kick seine sampling approach to examine distribution and abundance of Yellowcheek Darter populations in the Middle Fork and South Fork Little Red River. We used presence data to estimate occupancy rates and detection probability and examined relationships between Yellowcheek Darter density and environmental variables. The species was found at five Middle Fork and South Fork sites where it had previously been present in 2003–2004. Occupancy rates were >0.6 but with wide 95% CI, and where the darters occurred, densities were typical of other Ozark darters but highly variable. Detection probability and density were positively related to current velocity. Given that stream drying has become more extreme over the past 30 years and anthropogenic threats have increased, regular monitoring and active management may be required to reduce extinction risk of Yellowcheek Darter populations.
Cataloging the Praesepe Cluster: Identifying Interlopers and Binary Systems
NASA Astrophysics Data System (ADS)
Lucey, Madeline R.; Gosnell, Natalie M.; Mann, Andrew; Douglas, Stephanie
2018-01-01
We present radial velocity measurements from an ongoing survey of the Praesepe open cluster using the WIYN 3.5m Telescope. Our target stars include 229 early-K to mid-M dwarfs with proper motion memberships that have been observed by the repurposed Kepler mission, K2. With this survey, we will provide a well-constrained membership list of the cluster. By removing interloping stars and determining the cluster binary frequency we can avoid systematic errors in our analysis of the K2 findings and more accurately determine exoplanet properties in the Praesepe cluster. Obtaining accurate exoplanet parameters in open clusters allows us to study the temporal dimension of exoplanet parameter space. We find Praesepe to have a mean radial velocity of 34.09 km/s and a velocity dispersion of 1.13 km/s, which is consistent with previous studies. We derive radial velocity membership probabilities for stars with ≥3 radial velocity measurements and compare against published membership probabilities. We also identify radial velocity variables and potential double-lined spectroscopic binaries. We plan to obtain more observations to determine the radial velocity membership of all the stars in our sample, as well as follow up on radial velocity variables to determine binary orbital solutions.
Wright, Wilson J.; Irvine, Kathryn M.
2017-01-01
We examined data on white pine blister rust (blister rust) collected during the monitoring of whitebark pine trees in the Greater Yellowstone Ecosystem (from 2004-2015). Summaries of repeat observations performed by multiple independent observers are reviewed and discussed. These summaries show variability among observers and the potential for errors being made in blister rust status. Based on this assessment, we utilized occupancy models to analyze blister rust prevalence while explicitly accounting for imperfect detection. Available covariates were used to model both the probability of a tree being infected with blister rust and the probability of an observer detecting the infection. The fitted model provided strong evidence that the probability of blister rust infection increases as tree diameter increases and decreases as site elevation increases. Most importantly, we found evidence of heterogeneity in detection probabilities related to tree size and average slope of a transect. These results suggested that detecting the presence of blister rust was more difficult in larger trees. Also, there was evidence that blister rust was easier to detect on transects located on steeper slopes. Our model accounted for potential impacts of observer experience on blister rust detection probabilities and also showed moderate variability among the different observers in their ability to detect blister rust. Based on these model results, we suggest that multiple-observer sampling continue in future field seasons in order to allow blister rust prevalence estimates to be corrected for imperfect detection. We suggest that the multiple-observer effort be spread out across many transects (instead of concentrated at a few each field season) while keeping the overall proportion of trees with multiple observers at around 5-20%. Estimates of prevalence are confounded with detection unless it is explicitly accounted for in an analysis, and we demonstrate how an occupancy model can be used to account for this source of observation error.
NASA Astrophysics Data System (ADS)
Gronewold, A. D.; Wolpert, R. L.; Reckhow, K. H.
2007-12-01
Most probable number (MPN) and colony-forming-unit (CFU) are two estimates of fecal coliform bacteria concentration commonly used as measures of water quality in United States shellfish harvesting waters. The MPN is the maximum likelihood estimate (or MLE) of the true fecal coliform concentration based on counts of non-sterile tubes in serial dilution of a sample aliquot, indicating bacterial metabolic activity. The CFU is the MLE of the true fecal coliform concentration based on the number of bacteria colonies emerging on a growth plate after inoculation from a sample aliquot. Each estimating procedure has intrinsic variability and is subject to additional uncertainty arising from minor variations in experimental protocol. Several versions of each procedure (using different sized aliquots or different numbers of tubes, for example) are in common use, each with its own levels of probabilistic and experimental error and uncertainty. It has been observed empirically that the MPN procedure is more variable than the CFU procedure, and that MPN estimates are somewhat higher on average than CFU estimates, on split samples from the same water bodies. We construct a probabilistic model that provides a clear theoretical explanation for the observed variability in, and discrepancy between, MPN and CFU measurements. We then explore how this variability and uncertainty might propagate into shellfish harvesting area management decisions through a two-phased modeling strategy. First, we apply our probabilistic model in a simulation-based analysis of future water quality standard violation frequencies under alternative land use scenarios, such as those evaluated under guidelines of the total maximum daily load (TMDL) program. Second, we apply our model to water quality data from shellfish harvesting areas which at present are closed (either conditionally or permanently) to shellfishing, to determine if alternative laboratory analysis procedures might have led to different management decisions. Our research results indicate that the (often large) observed differences between MPN and CFU values for the same water body are well within the ranges predicted by our probabilistic model. Our research also indicates that the probability of violating current water quality guidelines at specified true fecal coliform concentrations depends on the laboratory procedure used. As a result, quality-based management decisions, such as opening or closing a shellfishing area, may also depend on the laboratory procedure used.
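As a concrete illustration of the MPN definition above, the following minimal Python sketch computes the maximum likelihood estimate of concentration from counts of non-sterile tubes in a serial dilution; the dilution volumes and tube counts are hypothetical, not from the study.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Minimal sketch: the MPN is the MLE of concentration c (organisms/mL) given
# counts of positive tubes in a serial dilution. Under Poisson counts, a tube
# receiving volume v is positive with probability 1 - exp(-c*v).
volumes = np.array([10.0, 1.0, 0.1])   # mL of sample per tube at each dilution
n_tubes = np.array([5, 5, 5])          # hypothetical 3-dilution, 5-tube design
positives = np.array([4, 2, 0])        # observed non-sterile tubes

def neg_log_lik(c):
    p = np.clip(1.0 - np.exp(-c * volumes), 1e-12, 1 - 1e-12)
    return -np.sum(positives * np.log(p) + (n_tubes - positives) * np.log(1 - p))

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 100.0), method="bounded")
print(f"MPN ≈ {res.x:.2f} organisms/mL")
```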
Prevalence of urinary incontinence and probable risk factors in a sample of Kurdish women.
Ahmed, Hamdia M; Osman, Vian A; Al-Alaf, Shahla K; Al-Tawil, Namir G
2013-05-01
The most common manifestation of pelvic floor dysfunction is urinary incontinence (UI) which affects 15-50% of adult women depending on the age and risk factors of the population studied. The aim of this study was to determine the probable risk factors associated with UI; the characteristics of women with UI; describe the types of UI, and determine its prevalence. A cross-sectional study was conducted between February and August 2011, in the Maternity Teaching Hospital of the Erbil Governorate, Kurdistan Region, northern Iraq. It included 1,107 women who were accompanying patients admitted to the hospital. A questionnaire designed by the researchers was used for data collection. A chi-square test was used to test the significance of the association between UI and different risk factors. Binary logistic regression was used, considering UI as the dependent variable. The overall prevalence of UI was 51.7%. The prevalence of stress, urgency, and mixed UI was 5.4%, 13.3% and 33%, respectively. There was a significant positive association between UI and menopause, multiparity, diabetes mellitus (DM), chronic cough, constipation, and a history of gynaecological surgery, while a significant negative association was detected between UI and a history of delivery by both vaginal delivery and Caesarean section. A high prevalence of UI was detected in the studied sample, and the most probable risk factors were multiparity, menopausal status, constipation, chronic cough, and DM.
NASA Astrophysics Data System (ADS)
Gromov, Yu Yu; Minin, Yu V.; Ivanova, O. G.; Morozova, O. N.
2018-03-01
Multidimensional discrete probability distributions of independent random variables were derived. Their one-dimensional distributions are widely used in probability theory. Generating functions of these multidimensional distributions were also obtained.
[Determination of wine original regions using information fusion of NIR and MIR spectroscopy].
Xiang, Ling-Li; Li, Meng-Hua; Li, Jing-Mingz; Li, Jun-Hui; Zhang, Lu-Da; Zhao, Long-Lian
2014-10-01
Geographical origins of wine grapes are significant factors affecting wine quality and wine prices. Tasters' evaluation is a good method but has some limitations. It is important to discriminate different wine original regions quickly and accurately. The present paper proposed a method to determine wine original regions based on Bayesian information fusion that fused near-infrared (NIR) transmission spectra and mid-infrared (MIR) ATR spectra of wines. This method improved the determination results by expanding the sources of analysis information. NIR spectra and MIR spectra of 153 wine samples from four different grape-growing regions were collected by near-infrared and mid-infrared Fourier transform spectrometers separately. These four regions are Huailai, Yantai, Gansu and Changli, which are all typical geographical origins for Chinese wines. NIR and MIR discriminant models for wine regions were established using partial least squares discriminant analysis (PLS-DA) based on the NIR spectra and MIR spectra separately. In PLS-DA, the regions of wine samples are represented as groups of binary codes; with four wine regions in this paper, four output nodes stand for the categorical variables. The output node values for each sample in the NIR and MIR models were normalized first. These values stand for the probabilities of each sample belonging to each category, and they served as inputs to the Bayesian discriminant formula as prior probability values. The probabilities were substituted into the Bayesian formula to obtain posterior probabilities, by which the class membership of these samples can be judged. Considering the stability of the PLS-DA models, all the wine samples were divided into calibration sets and validation sets randomly ten times. The results of the NIR and MIR discriminant models of the four wine regions were as follows: the average accuracy rates of the calibration sets were 78.21% (NIR) and 82.57% (MIR), and the average accuracy rates of the validation sets were 82.50% (NIR) and 81.98% (MIR). After using the method proposed in this paper, the accuracy rates of calibration and validation changed to 87.11% and 90.87%, respectively, better determination results than those of either individual spectroscopy. These results suggest that Bayesian information fusion of NIR and MIR spectra is feasible for fast identification of wine original regions.
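The fusion step can be sketched as follows. This minimal Python example uses hypothetical probabilities, a uniform prior, and an assumed conditional independence of the two spectral models; it shows how normalized PLS-DA outputs from NIR and MIR models combine through Bayes' formula into posterior class probabilities.

```python
import numpy as np

# Minimal sketch of Bayesian fusion of two classifiers' class probabilities.
# Four classes = four wine regions; the numbers are hypothetical.
p_nir = np.array([0.50, 0.30, 0.15, 0.05])  # NIR model output, one sample
p_mir = np.array([0.40, 0.45, 0.10, 0.05])  # MIR model output, same sample
prior = np.full(4, 0.25)                    # uniform prior over regions

# Posterior ∝ prior × likelihood-ratio contributions from each model
# (assumes the two spectra are conditionally independent given the region).
post = prior * (p_nir / prior) * (p_mir / prior)
post /= post.sum()
print("fused region probabilities:", post.round(3))
print("assigned region index:", int(post.argmax()))
```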
Estimating the Probability of Elevated Nitrate Concentrations in Ground Water in Washington State
Frans, Lonna M.
2008-01-01
Logistic regression was used to relate anthropogenic (manmade) and natural variables to the occurrence of elevated nitrate concentrations in ground water in Washington State. Variables that were analyzed included well depth, ground-water recharge rate, precipitation, population density, fertilizer application amounts, soil characteristics, hydrogeomorphic regions, and land-use types. Two models were developed: one with and one without the hydrogeomorphic regions variable. The variables in both models that best explained the occurrence of elevated nitrate concentrations (defined as concentrations of nitrite plus nitrate as nitrogen greater than 2 milligrams per liter) were the percentage of agricultural land use within a 4-kilometer radius of a well, population density, precipitation, soil drainage class, and well depth. Based on the relations between these variables and measured nitrate concentrations, logistic regression models were developed to estimate the probability of nitrate concentrations in ground water exceeding 2 milligrams per liter. Maps of Washington State were produced that illustrate these estimated probabilities for wells drilled to 145 feet below land surface (median well depth) and the estimated depth to which wells would need to be drilled to have a 90-percent probability of drawing water with a nitrate concentration less than 2 milligrams per liter. Maps showing the estimated probability of elevated nitrate concentrations indicated that the agricultural regions are most at risk, followed by urban areas. The estimated depths to which wells would need to be drilled to have a 90-percent probability of obtaining water with nitrate concentrations less than 2 milligrams per liter exceeded 1,000 feet in the agricultural regions, whereas wells in urban areas generally would need to be drilled to depths in excess of 400 feet.
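A minimal Python sketch of how such a fitted model converts predictor values into an exceedance probability follows; the coefficients and the exact predictor set are hypothetical placeholders, not the report's fitted values.

```python
import numpy as np

# Minimal sketch of logistic-regression exceedance probability.
# Coefficients below are hypothetical, for illustration only.
beta = {"intercept": -2.0, "pct_ag": 0.03, "pop_density": 0.0004,
        "precip_in": -0.02, "well_depth_ft": -0.004, "poorly_drained": -0.5}

def p_exceed(pct_ag, pop_density, precip_in, well_depth_ft, poorly_drained):
    z = (beta["intercept"] + beta["pct_ag"] * pct_ag
         + beta["pop_density"] * pop_density + beta["precip_in"] * precip_in
         + beta["well_depth_ft"] * well_depth_ft
         + beta["poorly_drained"] * poorly_drained)
    return 1.0 / (1.0 + np.exp(-z))  # P(nitrate > 2 mg/L)

# Example: agricultural setting, median well depth of 145 ft.
print(f"{p_exceed(60, 200, 15, 145, 0):.2f}")
```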
Probability of coincidental similarity among the orbits of small bodies - I. Pairing
NASA Astrophysics Data System (ADS)
Jopek, Tadeusz Jan; Bronikowska, Małgorzata
2017-09-01
The probability of coincidental clustering among orbits of comets, asteroids and meteoroids depends on many factors, such as the size of the orbital sample searched for clusters and the size of the identified group; it is different for groups of 2, 3, 4, … members. Because the probability of coincidental clustering is assessed by numerical simulation, it also depends on the method used to generate the synthetic orbits. We have tested the impact of some of these factors. For a given size of the orbital sample, we have assessed the probability of random pairing among several orbital populations of different sizes. We have found how these probabilities vary with the size of the orbital samples. Finally, keeping the size of the orbital sample fixed, we have shown that the probability of random pairing can be significantly different for orbital samples obtained by different observation techniques. For the user's convenience, we have also obtained several formulae which, for a given size of the orbital sample, can be used to calculate the similarity threshold corresponding to a small value of the probability of coincidental similarity between two orbits.
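The simulation logic can be sketched as follows. This minimal Python example uses a generic Euclidean distance in a 5-D space as a stand-in for an orbital D-criterion, with a hypothetical threshold and a uniform background population, to show how the probability of chance pairing is estimated and how it grows with sample size.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Minimal sketch (not the paper's procedure): estimate the probability that
# at least one pair of synthetic "orbits" is similar purely by chance.
rng = np.random.default_rng(1)

def prob_coincidental_pair(n_orbits, threshold, n_trials=1000):
    hits = 0
    for _ in range(n_trials):
        sample = rng.uniform(size=(n_orbits, 5))   # synthetic "orbits"
        hits += pdist(sample).min() < threshold    # any pair below threshold?
    return hits / n_trials

for n in (50, 100, 200):  # chance pairing grows quickly with sample size
    print(n, prob_coincidental_pair(n, threshold=0.05))
```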
O'Connor, B.L.; Hondzo, Miki; Dobraca, D.; LaPara, T.M.; Finlay, J.A.; Brezonik, P.L.
2006-01-01
The spatial variability of subreach denitrification rates in streams was evaluated with respect to controlling environmental conditions, molecular examination of denitrifying bacteria, and dimensional analysis. Denitrification activities ranged from 0 to 800 ng-N gsed-1 d-1, with large variations observed within short distances (<50 m) along stream reaches. A log-normal probability distribution described the range in denitrification activities and was used to define low (16% of the probability distribution), medium (68%), and high (16%) denitrification potential groups. Denitrifying bacteria were quantified using a competitive polymerase chain reaction (cPCR) technique that amplified the nirK gene that encodes for nitrite reductase. Results showed a range of nirK quantities from 10^3 to 10^7 gene copy numbers gsed-1. A nonparametric statistical test showed no significant difference in nirK quantities among stream reaches, but revealed that samples with a high denitrification potential had significantly higher nirK quantities. Denitrification activity was positively correlated with nirK quantities, with scatter in the data that can be attributed to varying environmental conditions along stream reaches. Dimensional analysis was used to evaluate denitrification activities according to environmental variables that describe fluid-flow properties, nitrate and organic material quantities, and dissolved oxygen flux. Buckingham's pi theorem was used to generate dimensionless groupings, and field data were used to determine scaling parameters. The resulting expressions between dimensionless NO3- flux and dimensionless groupings of environmental variables showed consistent scaling, which indicates that the subreach variability in denitrification rates can be predicted by the controlling physical, chemical, and microbiological conditions. Copyright 2006 by the American Geophysical Union.
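The percentile-based grouping described above can be sketched in Python as follows; the activity values are hypothetical and restricted to positive values so a log-normal distribution can be fitted.

```python
import numpy as np
from scipy import stats

# Minimal sketch: fit a log-normal distribution to denitrification activities
# and label the lower 16%, middle 68%, and upper 16% of the fitted
# distribution as low, medium, and high potential. Values are hypothetical
# (ng-N gsed-1 d-1).
activity = np.array([5, 12, 30, 45, 80, 150, 220, 400, 610, 790], float)

shape, loc, scale = stats.lognorm.fit(activity, floc=0)
lo_cut, hi_cut = stats.lognorm.ppf([0.16, 0.84], shape, loc, scale)

groups = np.where(activity < lo_cut, "low",
                  np.where(activity > hi_cut, "high", "medium"))
print(f"cut points: {lo_cut:.1f}, {hi_cut:.1f}")
print(list(zip(activity, groups)))
```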
NASA Astrophysics Data System (ADS)
Zhu, Xuchao; Cao, Ruixue; Shao, Mingan; Liang, Yin
2018-03-01
Cosmic-ray neutron probes (CRNPs) have footprint radii for measuring soil-water content (SWC). The theoretical radius is much larger at high altitude, such as the northern Tibetan Plateau, than the radius at sea level. The most probable practical radius of CRNPs for the northern Tibetan Plateau, however, is not known due to the lack of SWC data in this hostile environment. We calculated the theoretical footprint of the CRNP based on a recent simulation and analyzed the practical radius of a CRNP for the northern Tibetan Plateau by measuring SWC at 113 sampling locations on 21 measuring occasions to a depth of 30 cm in a 33.5 ha plot in an alpine meadow at 4600 m a.s.l. The temporal variability and spatial heterogeneity of SWC within the footprint were then analyzed. The theoretical footprint radius was between 360 and 420 m after accounting for the influences of air humidity, soil moisture, vegetation and air pressure. A comparison of SWCs measured by the CRNP and a neutron probe from access tubes in circles with different radii conservatively indicated that the most probable experimental footprint radius was >200 m. SWC within the CRNP footprint was moderately variable over both time and space, but the temporal variability was higher. Spatial heterogeneity was weak, but should be considered in future CRNP calibrations. This study provided theoretical and practical bases for the application and promotion of CRNPs in alpine meadows on the Tibetan Plateau.
Statistical surrogate models for prediction of high-consequence climate change.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Constantine, Paul; Field, Richard V., Jr.; Boslough, Mark Bruce Elrick
2011-09-01
In safety engineering, performance metrics are defined using probabilistic risk assessments focused on the low-probability, high-consequence tail of the distribution of possible events, as opposed to best estimates based on central tendencies. We frame the climate change problem and its associated risks in a similar manner. To properly explore the tails of the distribution requires extensive sampling, which is not possible with existing coupled atmospheric models due to the high computational cost of each simulation. We therefore propose the use of specialized statistical surrogate models (SSMs) for the purpose of exploring the probability law of various climate variables of interest. An SSM is different from a deterministic surrogate model in that it represents each climate variable of interest as a space/time random field. The SSM can be calibrated to available spatial and temporal data from existing climate databases, e.g., the Program for Climate Model Diagnosis and Intercomparison (PCMDI), or to a collection of outputs from a General Circulation Model (GCM), e.g., the Community Earth System Model (CESM) and its predecessors. Because of its reduced size and complexity, the realization of a large number of independent model outputs from an SSM becomes computationally straightforward, so that quantifying the risk associated with low-probability, high-consequence climate events becomes feasible. A Bayesian framework is developed to provide quantitative measures of confidence, via Bayesian credible intervals, in the use of the proposed approach to assess these risks.
Utilization of infertility services: how much does money matter?
Farley Ordovensky Staniec, J; Webb, Natalie J
2007-06-01
To estimate the effects of financial access and other individual characteristics on the likelihood that a woman pursues infertility treatment and the choice of treatment type. The 1995 National Survey of Family Growth. We use a binomial logit model to estimate the effects of financial access and individual characteristics on the likelihood that a woman pursues infertility treatment. We then use a multinomial logit model to estimate the differential effects of these variables across treatment types. This study analyzes the subset of 1,210 women who meet the definition of infertile or subfecund from the 1995 National Survey of Family Growth. We find that income, insurance coverage, age, and parity (number of previous births) all significantly affect the probability of seeking infertility treatment; however, the effect of these variables on choice of treatment type varies significantly. Neither income nor insurance influences the probability of seeking advice, a relatively low-cost, low-yield treatment. At the other end of the spectrum, the choice to pursue assisted reproductive technologies (ARTs), a much more expensive but potentially more productive option, is highly influenced by income, but merely having private insurance has no significant effect. In the middle of the spectrum are treatment options such as testing, surgery, and medications, for which financial access increases the probability of selection. Our results illustrate that for the sample of infertile or subfecund women of childbearing age studied, and considering their options, financial access to infertility treatment does matter.
Wagner, Tyler; Deweber, Jefferson T.; Detar, Jason; Kristine, David; Sweka, John A.
2014-01-01
Many potential stressors to aquatic environments operate over large spatial scales, prompting the need to assess and monitor both site-specific and regional dynamics of fish populations. We used hierarchical Bayesian models to evaluate the spatial and temporal variability in density and capture probability of age-1 and older Brook Trout Salvelinus fontinalis from three-pass removal data collected at 291 sites over a 37-year time period (1975–2011) in Pennsylvania streams. There was high between-year variability in density, with annual posterior means ranging from 2.1 to 10.2 fish/100 m2; however, there was no significant long-term linear trend. Brook Trout density was positively correlated with elevation and negatively correlated with percent developed land use in the network catchment. Probability of capture did not vary substantially across sites or years but was negatively correlated with mean stream width. Because of the low spatiotemporal variation in capture probability and a strong correlation between first-pass CPUE (catch/min) and three-pass removal density estimates, the use of an abundance index based on first-pass CPUE could represent a cost-effective alternative to conducting multiple-pass removal sampling for some Brook Trout monitoring and assessment objectives. Single-pass indices may be particularly relevant for monitoring objectives that do not require precise site-specific estimates, such as regional monitoring programs that are designed to detect long-term linear trends in density.
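Although the study used hierarchical Bayesian models, the underlying three-pass removal likelihood can be sketched with a simple Zippin-type maximum likelihood estimator; the catches below are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

# Minimal sketch of a three-pass removal (Zippin-type) estimator: with
# constant capture probability p, the expected catch on pass i is
# N * p * (1 - p)**(i - 1). Catches are hypothetical.
catch = np.array([48, 21, 9])  # fish removed on passes 1..3

def neg_log_lik(p, c=catch):
    q = np.array([p * (1 - p) ** i for i in range(len(c))])  # cell probs
    T = c.sum()
    N = max(int(T), int(np.floor(T / q.sum())))  # profile MLE of N given p
    ll = (gammaln(N + 1) - gammaln(N - T + 1)
          + (c * np.log(q)).sum() + (N - T) * np.log(1 - q.sum()))
    return -ll

res = minimize_scalar(neg_log_lik, bounds=(0.05, 0.95), method="bounded")
p_hat = res.x
N_hat = catch.sum() / (1 - (1 - p_hat) ** 3)  # abundance from fitted p
print(f"capture probability ≈ {p_hat:.2f}, abundance ≈ {N_hat:.0f}")
```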
Huang, Yunong; Wu, Lei
2012-01-01
This article examines the relationships between the two cultural variables of having mianzi in social interactions and Chinese cultural beliefs of adversity and life satisfaction among older people in a coastal city in mainland China. The mediating effect of having mianzi in social interactions on the relationship between Chinese cultural beliefs of adversity and life satisfaction is also examined. The study applies a non-probability sampling and adopts a face-to-face interview approach using a questionnaire composed of close-ended questions. A total of 532 valid questionnaires are obtained. Multiple regression analysis is used to test the hypotheses. Findings indicate that the two cultural variables are associated significantly with life satisfaction, while controlling for socio-demographic variables. The variable of Chinese cultural beliefs of adversity is also indirectly associated with life satisfaction through its effect on having mianzi in social interactions. Older people with higher endorsement of positive Chinese cultural beliefs of adversity and higher degree of having mianzi in social interactions tend to have higher life satisfaction. Professionals working with older people should be sensitive to cultural variables that exert impacts on older people's life satisfaction.
NASA Astrophysics Data System (ADS)
Menéndez, María C.; Piccolo, María C.; Hoffmeyer, Mónica S.
2012-10-01
The short-term dynamics of zooplankton in coastal ecosystems are strongly influenced by physical processes such as tides, riverine runoff and winds. In this study, we investigated the short-term changes of the representative taxa within the mesozooplankton in relation to semidiurnal tidal cycles. We also evaluated the influence of local winds on this short-term variability. Sampling was carried out bimonthly from December 2004 to April 2006 at a fixed point located in the inner zone of the Bahía Blanca Estuary, Argentina. Mesozooplankton samples were taken by pumps during 14-h tidal cycles at 3-h intervals, from surface and bottom. Vertical profiles of temperature and salinity, as well as water samples to determine suspended particulate matter, were acquired on each sampling date. All wind data were obtained from a meteorological station, and water level was recorded with a tide gauge. Holoplankton dominated numerically over meroplankton and the adventitious fraction. Concerning holoplanktonic abundance, the highest values were attained by the calanoid copepods Acartia tonsa and Eurytemora americana. Meroplankton occurred mainly as barnacle larvae, while benthic harpacticoids and Corophium sp. dominated the adventitious component. The semidiurnal tide was the main influence on the variability of A. tonsa. However, noticeable differences in the abundance pattern as a function of wind intensity were detected. Meroplankton abundance did not show a clear variation along the tidal cycle. The distributional pattern of harpacticoids seemed to be mainly modulated by velocity asymmetries in the tidal currents, in the same way as suspended particulate matter. However, the Corophium sp. distribution indicated probable behavioural responses associated with tides. The results obtained show how variable mesozooplankton community structure can be over short time scales in mesotidal temperate estuaries. This variability should be taken into account in any zooplankton monitoring program conducted in temperate systems with a high-tidal regime, both to characterize the community and to register changes in the zooplankton community at a fine temporal scale.
Reward-Dependent Modulation of Movement Variability
Izawa, Jun; Shadmehr, Reza
2015-01-01
Movement variability is often considered an unwanted byproduct of a noisy nervous system. However, variability can signal a form of implicit exploration, indicating that the nervous system is intentionally varying the motor commands in search of actions that yield the greatest success. Here, we investigated the role of the human basal ganglia in controlling reward-dependent motor variability as measured by trial-to-trial changes in performance during a reaching task. We designed an experiment in which the only performance feedback was success or failure and quantified how reach variability was modulated as a function of the probability of reward. In healthy controls, reach variability increased as the probability of reward decreased. Control of variability depended on the history of past rewards, with the largest trial-to-trial changes occurring immediately after an unrewarded trial. In contrast, in participants with Parkinson's disease, a known example of basal ganglia dysfunction, reward was a poor modulator of variability; that is, the patients showed an impaired ability to increase variability in response to decreases in the probability of reward. This was despite the fact that, after rewarded trials, reach variability in the patients was comparable to healthy controls. In summary, we found that movement variability is partially a form of exploration driven by the recent history of rewards. When the function of the human basal ganglia is compromised, the reward-dependent control of movement variability is impaired, particularly affecting the ability to increase variability after unsuccessful outcomes. PMID:25740529
Simulating future uncertainty to guide the selection of survey designs for long-term monitoring
Garman, Steven L.; Schweiger, E. William; Manier, Daniel J.; Gitzen, Robert A.; Millspaugh, Joshua J.; Cooper, Andrew B.; Licht, Daniel S.
2012-01-01
A goal of environmental monitoring is to provide sound information on the status and trends of natural resources (Messer et al. 1991, Theobald et al. 2007, Fancy et al. 2009). When monitoring observations are acquired by measuring a subset of the population of interest, probability sampling as part of a well-constructed survey design provides the most reliable and legally defensible approach to achieve this goal (Cochran 1977, Olsen et al. 1999, Schreuder et al. 2004; see Chapters 2, 5, 6, 7). Previous works have described the fundamentals of sample surveys (e.g. Hansen et al. 1953, Kish 1965). Interest in survey designs and monitoring over the past 15 years has led to extensive evaluations and new developments of sample selection methods (Stevens and Olsen 2004), of strategies for allocating sample units in space and time (Urquhart et al. 1993, Overton and Stehman 1996, Urquhart and Kincaid 1999), and of estimation (Lesser and Overton 1994, Overton and Stehman 1995) and variance properties (Larsen et al. 1995, Stevens and Olsen 2003) of survey designs. Carefully planned, “scientific” (Chapter 5) survey designs have become a standard in contemporary monitoring of natural resources. Based on our experience with the long-term monitoring program of the US National Park Service (NPS; Fancy et al. 2009; Chapters 16, 22), operational survey designs tend to be selected using the following procedures. For a monitoring indicator (i.e. variable or response), a minimum detectable trend requirement is specified, based on the minimum level of change that would result in meaningful change (e.g. degradation). A probability of detecting this trend (statistical power) and an acceptable level of uncertainty (Type I error; see Chapter 2) within a specified time frame (e.g. 10 years) are specified to ensure timely detection. Explicit statements of the minimum detectable trend, the time frame for detecting the minimum trend, power, and acceptable probability of Type I error (α) collectively form the quantitative sampling objective.
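The quantitative sampling objective described above can be checked by simulation. The following minimal Python sketch estimates the power to detect a specified linear trend within a fixed time frame; the trend size, noise level, and alpha are hypothetical placeholders, not values from any NPS program.

```python
import numpy as np
from scipy import stats

# Minimal sketch: simulate a monitoring record with a known trend and noise,
# and estimate the power to detect that trend within n_years at level alpha.
rng = np.random.default_rng(2)

def power(trend_per_yr=-0.02, sd=0.08, n_years=10, alpha=0.10, n_sim=5000):
    years = np.arange(n_years)
    detect = 0
    for _ in range(n_sim):
        y = 1.0 + trend_per_yr * years + rng.normal(0, sd, n_years)
        res = stats.linregress(years, y)
        detect += (res.pvalue < alpha) and (res.slope < 0)  # declining trend
    return detect / n_sim

print(f"power ≈ {power():.2f}")
```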
Juric, I; Salzburger, W; Balmer, O
2017-04-01
The diamondback moth (DBM) (Plutella xylostella) is one of the main pests of brassicaceous crops worldwide and shows resistance against a wide range of synthetic insecticides incurring millions of dollars in control costs every year. The DBM is a prime example of the introduction of an exotic species as a consequence of globalization. In this study we analyzed the genetic population structure of the DBM and two of its parasitic wasps, Diadegma semiclausum and Diadegma fenestrale, based on mitochondrial DNA sequences. We analyzed DBM samples from 13 regions worldwide (n = 278), and samples of the two wasp species from six European and African countries (n = 131), in an attempt to reconstruct the geographic origin and phylogeography of the DBM and its two parasitic wasps. We found high variability in COI sequences in the diamondback moth. Haplotype analysis showed three distinct genetic clusters, one of which could represent a cryptic species. Mismatch analysis confirmed the hypothesized recent spread of diamondback moths in North America, Australia and New Zealand. The highest genetic variability was found in African DBM samples. Our data corroborate prior claims of Africa as the most probable origin of the species but cannot preclude Asia as an alternative. No genetic variability was found in the two Diadegma species. The lack of variability in both wasp species suggests a very recent spread of bottlenecked populations, possibly facilitated by their use as biocontrol agents. Our data thus also contain no signals of host-parasitoid co-evolution.
Methylmercury Modulation in Amazon Rivers Linked to Basin Characteristics and Seasonal Flood-Pulse.
Kasper, Daniele; Forsberg, Bruce R; Amaral, João H F; Py-Daniel, Sarah S; Bastos, Wanderley R; Malm, Olaf
2017-12-19
We investigated the impact of the seasonal inundation of wetlands on methylmercury (MeHg) concentration dynamics in the Amazon river system. We sampled 38 sites along the Solimões/Amazon and Negro rivers and their tributaries during distinct phases of the annual flood-pulse. MeHg dynamics in both basins was contrasted to provide insight into the factors controlling export of MeHg to the Amazon system. The export of MeHg by rivers was substantially higher during high-water in both basins since elevated MeHg concentrations and discharge occurred during this time. MeHg concentration was positively correlated to %flooded area upstream of the sampling site in the Solimões/Amazon Basin with the best correlation obtained using 100 km buffers instead of whole basin areas. The lower correlations obtained with the whole basin apparently reflected variable losses of MeHg exported from upstream wetlands due to demethylation, absorption, deposition, and degradation before reaching the sampling site. A similar correlation between %flooded area and MeHg concentrations was not observed in the Negro Basin probably due to the variable export of MeHg from poorly drained soils that are abundant in this basin but not consistently flooded.
Trocki, Karen; Drabble, Laurie
2008-11-01
Prior research has found heavier drinking and alcohol-related problems to be more prevalent in sexual minority populations, particularly among women. It has been suggested that differences may be explained in part by socializing in bars and other public drinking venues. This study explores gender, sexual orientation and bar patronage in two different samples: respondents from a random digit dial (RDD) probability study of 1,043 households in Northern California and 569 individuals who were surveyed exiting from 25 different bars in the same three counties that constituted the RDD sample. Bar patrons, in most instances, were at much higher risk of excessive consumption and related problems and consequences. On several key variables, women from the bar patron sample exceeded the problem rates of men in the general population. Bisexual women and bisexual men exhibited riskier behavior on many alcohol measures relative to heterosexuals. Measures of heavier drinking and alcohol-related problems were also elevated among lesbians compared to heterosexual women. Two of the bar motive variables, sensation seeking and mood change motives, were particularly predictive of heavier drinking and alcohol-related problems. Social motives did not predict problems.
NASA Astrophysics Data System (ADS)
Gomez, Jose Alfonso; Owens, Phillip N.; Koiter, Alex J.; Lobb, David
2016-04-01
One of the major sources of uncertainty in attributing sediment sources in fingerprinting studies is the uncertainty in determining the concentrations of the elements used in the mixing model, due to the variability of the concentrations of these elements in the source materials (e.g., Kraushaar et al., 2015). The uncertainty in determining the "true" concentration of a given element in each one of the source areas depends on several factors, among them the spatial variability of that element, the sampling procedure and the sampling density. Researchers have limited control over these factors, and usually sampling density tends to be sparse, limited by time and the resources available. Monte Carlo analysis has been used regularly in fingerprinting studies to explore the probable solutions within the measured variability of the elements in the source areas, providing an appraisal of the probability of the different solutions (e.g., Collins et al., 2012). This problem can be considered analogous to the propagation of uncertainty in hydrologic models due to uncertainty in the determination of the values of the model parameters, and there are many examples of Monte Carlo analysis of this uncertainty (e.g., Freeze, 1980; Gómez et al., 2001). Some of these model analyses rely on the simulation of "virtual" situations that were calibrated from parameter values found in the literature, with the purpose of providing insight about the response of the model to different configurations of input parameters. This approach - evaluating the answer for a "virtual" problem whose solution could be known in advance - might be useful in evaluating the propagation of uncertainty in mixing models in sediment fingerprinting studies. In this communication, we present the preliminary results of an ongoing study evaluating the effect of variability of element concentrations in source materials, sampling density, and the number of elements included in the mixing models. For this study a virtual catchment was constructed, composed of three sub-catchments, each 500 x 500 m in size. We assumed that there was no selectivity in sediment detachment or transport. A numerical exercise was performed considering these variables: 1) variability of element concentration: three levels with CVs of 20 %, 50 % and 80 %; 2) sampling density: 10, 25 and 50 "samples" per sub-catchment and element; and 3) number of elements included in the mixing model: two (determined) and five (overdetermined). This resulted in a total of 18 (3 x 3 x 2) possible combinations. The five fingerprinting elements considered in the study were C, N, 40K, Al and Pavail, and their average values, taken from the literature, were: sub-catchment 1: 4.0 %, 0.35 %, 0.50 ppm, 5.0 ppm, 1.42 ppm, respectively; sub-catchment 2: 2.0 %, 0.18 %, 0.20 ppm, 10.0 ppm, 0.20 ppm, respectively; and sub-catchment 3: 1.0 %, 0.06 %, 1.0 ppm, 16.0 ppm, 7.8 ppm, respectively. For each sub-catchment, three maps of the spatial distribution of each element were generated using the random generator of Mejia and Rodriguez-Iturbe (1974), as described in Freeze (1980), using the average value and the three different CVs defined above. Each map for each source area and property was generated on a 100 x 100 square grid, each grid cell being 5 m x 5 m. Maps were randomly generated for each property and source area. In doing so, we did not consider the possibility of cross-correlation among properties. Spatial autocorrelation was assumed to be weak.
The reason for generating the maps was to create a "virtual" situation where all the element concentration values at each point are known. Simultaneously, we arbitrarily determined the percentage of sediment coming from each sub-catchment. These values were 30 %, 10 % and 60 % for sub-catchments 1, 2 and 3, respectively. Using these values, we determined the element concentrations in the sediment. The exercise consisted of creating different sampling strategies in a virtual environment to determine an average value for each of the different maps of element concentration and sub-catchment, under different sampling densities: 200 different average values for the "high" sampling density (average of 50 samples); 400 different average values for the "medium" sampling density (average of 25 samples); and 1,000 different average values for the "low" sampling density (average of 10 samples). All these combinations of possible values of element concentrations in the source areas were solved against the sediment concentrations already determined from the "true" solution, using limSolve (Soetaert et al., 2014) in the R language. The sediment source solutions found for the different situations and values were analyzed in order to: 1) evaluate the uncertainty in the sediment source attribution; and 2) explore strategies to detect the most probable solutions, which might lead to improved methods for constructing more robust mixing models. Preliminary results will be presented and discussed in this communication. Key words: sediment, fingerprinting, uncertainty, variability, mixing model. References: Collins, A.L., Zhang, Y., McChesney, D., Walling, D.E., Haley, S.M., Smith, P. 2012. Sediment source tracing in a lowland agricultural catchment in southern England using a modified procedure combining statistical analysis and numerical modelling. Science of the Total Environment 414: 301-317. Freeze, R.A. 1980. A stochastic-conceptual analysis of rainfall-runoff processes on a hillslope. Water Resources Research 16: 391-408.
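The constrained solve performed here with limSolve can be sketched in Python as follows, using the source concentrations and "true" proportions given above. The sum-to-one constraint is enforced with a heavily weighted row before non-negative least squares, a simple stand-in for limSolve's equality-constrained solver rather than the study's exact implementation.

```python
import numpy as np
from scipy.optimize import nnls

# Minimal sketch of the mixing-model solve: rows of A are element
# concentrations in the three sub-catchment sources (C, N, 40K, Al, Pavail,
# from the text); b is the sediment mixture for proportions (0.3, 0.1, 0.6).
A = np.array([[4.0, 2.0, 1.0],      # C (%)
              [0.35, 0.18, 0.06],   # N (%)
              [0.50, 0.20, 1.0],    # 40K (ppm)
              [5.0, 10.0, 16.0],    # Al (ppm)
              [1.42, 0.20, 7.8]])   # Pavail (ppm)
true_x = np.array([0.3, 0.1, 0.6])
b = A @ true_x

# Scale rows so no element dominates, then append the unit-sum constraint
# as a heavily weighted row and solve with non-negative least squares.
scale = A.mean(axis=1, keepdims=True)
A_s, b_s = A / scale, b / scale.ravel()
w = 100.0
A_c = np.vstack([A_s, w * np.ones(3)])
b_c = np.append(b_s, w)

x_hat, _ = nnls(A_c, b_c)
print("recovered source proportions:", x_hat.round(3))  # ≈ [0.3, 0.1, 0.6]
```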
Tvedt, Christine; Sjetne, Ingeborg Strømseng; Helgeland, Jon; Bukholm, Geir
2014-01-01
Background There is a growing body of evidence for associations between the work environment and patient outcomes. A good work environment may maximise healthcare workers' efforts to avoid failures and to facilitate quality care that is focused on patient safety. Several studies use nurse-reported quality measures, but it is uncertain whether these outcomes are correlated with clinical outcomes. The aim of this study was to determine the correlations between hospital-aggregated, nurse-assessed quality and safety, and estimated probabilities for 30-day survival in and out of hospital. Methods In a multicentre study involving almost all Norwegian hospitals with more than 85 beds (sample size = 30), information about nurses' perceptions of organisational characteristics was collected. Subscales from this survey were used to describe properties of the organisations: quality system, patient safety management, nurse–physician relationship, staffing adequacy, quality of nursing and patient safety. The average scores for these organisational characteristics were aggregated to hospital level and merged with estimated probabilities for 30-day survival in and out of hospital (survival probabilities) from a national database. In this observational, ecological study, the relationships between the organisational characteristics (independent variables) and clinical outcomes (survival probabilities) were examined. Results Survival probabilities were correlated with nurse-assessed quality of nursing. Furthermore, the subjective perception of staffing adequacy was correlated with overall survival. Conclusions This study showed that perceived staffing adequacy and nurses' assessments of quality of nursing were correlated with survival probabilities. It is suggested that the way nurses characterise the microsystems they belong to also reflects the general performance of hospitals. PMID:24728887
Determinants of customer satisfaction with hospitals: a managerial model.
Andaleeb, S S
1998-01-01
States that rapid changes in the environment have exerted significant pressures on hospitals to incorporate patient satisfaction in their strategic stance and quest for market share and long-term viability. This study proposes and tests a five-factor model that explains considerable variation in customer satisfaction with hospitals. These factors include communication with patients, competence of the staff, their demeanour, quality of the facilities, and perceived costs; they also represent strategic concepts that managers can address in their bid to remain competitive. A probability sample was selected and a multiple regression model used to test the hypotheses. The results indicate that all five variables were significant in the model and explained 62 per cent of the variation in the dependent variable. Managerial implications of the proposed model are discussed.
Uncertainty in the profitability of fertilizer management based on various sampling designs.
NASA Astrophysics Data System (ADS)
Muhammed, Shibu; Ben, Marchant; Webster, Richard; Milne, Alice; Dailey, Gordon; Whitmore, Andrew
2016-04-01
Many farmers sample their soil to measure the concentrations of plant nutrients, including phosphorus (P), so as to decide how much fertilizer to apply. Now that fertilizer can be applied at variable rates, farmers want to know whether maps of nutrient concentration made from grid samples or from field subdivisions (zones within their fields) are merited: do such maps lead to greater profit than would a single measurement on a bulked sample for each field when all costs are taken into account? We have examined the merits of grid-based and zone-based sampling strategies over single field-based averages using continuous spatial data on wheat yields at harvest in six fields in southern England and simulated concentrations of P in the soil. Features of the spatial variation in the yields provide predictions about which sampling scheme is likely to be most cost-effective, but there is uncertainty associated with these predictions that must be communicated to farmers. Where variograms of the yield have large variances and long effective ranges, grid sampling and mapping nutrients are likely to be cost-effective. Where effective ranges are short, sampling must be dense to reveal the spatial variation and may be expensive. In these circumstances variable-rate application of fertilizer is likely to be impracticable and almost certainly not cost-effective. We have explored several methods for communicating these results and found that the most effective method was using probability maps that show the likelihood of grid-based and zone-based sampling being more profitable than a field-based estimate.
Uncertainty Quantification of the FUN3D-Predicted NASA CRM Flutter Boundary
NASA Technical Reports Server (NTRS)
Stanford, Bret K.; Massey, Steven J.
2017-01-01
A nonintrusive point collocation method is used to propagate parametric uncertainties of the flexible Common Research Model, a generic transport configuration, through the unsteady aeroelastic CFD solver FUN3D. A range of random input variables are considered, including atmospheric flow variables, structural variables, and inertial (lumped mass) variables. UQ results are explored for a range of output metrics (with a focus on dynamic flutter stability), for both subsonic and transonic Mach numbers, for two different CFD mesh refinements. A particular focus is placed on computing failure probabilities: the probability that the wing will flutter within the flight envelope.
A global logrank test for adaptive treatment strategies based on observational studies.
Li, Zhiguo; Valenstein, Marcia; Pfeiffer, Paul; Ganoczy, Dara
2014-02-28
In studying adaptive treatment strategies, a natural question that is of paramount interest is whether there is any significant difference among all possible treatment strategies. When the outcome variable of interest is time-to-event, we propose an inverse probability weighted logrank test for testing the equivalence of a fixed set of pre-specified adaptive treatment strategies based on data from an observational study. The weights take into account both the possible selection bias in an observational study and the fact that the same subject may be consistent with more than one treatment strategy. The asymptotic distribution of the weighted logrank statistic under the null hypothesis is obtained. We show that, in an observational study where the treatment selection probabilities need to be estimated, the estimation of these probabilities does not have an effect on the asymptotic distribution of the weighted logrank statistic, as long as the estimation of the parameters in the models for these probabilities is n^{1/2}-consistent. Finite sample performance of the test is assessed via a simulation study. The simulations also show that the test can be quite robust to misspecification of the models for the probabilities of treatment selection. The method is applied to analyze data on antidepressant adherence time from an observational database maintained at the Department of Veterans Affairs' Serious Mental Illness Treatment Research and Evaluation Center. Copyright © 2013 John Wiley & Sons, Ltd.
Improving inferences in population studies of rare species that are detected imperfectly
MacKenzie, D.I.; Nichols, J.D.; Sutton, N.; Kawanishi, K.; Bailey, L.L.
2005-01-01
For the vast majority of cases, it is highly unlikely that all the individuals of a population will be encountered during a study. Furthermore, it is unlikely that a constant fraction of the population is encountered over times, locations, or species to be compared. Hence, simple counts usually will not be good indices of population size. We recommend that detection probabilities (the probability of including an individual in a count) be estimated and incorporated into inference procedures. However, most techniques for estimating detection probability require moderate sample sizes, which may not be achievable when studying rare species. In order to improve the reliability of inferences from studies of rare species, we suggest two general approaches that researchers may wish to consider that incorporate the concept of imperfect detectability: (1) borrowing information about detectability or the other quantities of interest from other times, places, or species; and (2) using state variables other than abundance (e.g., species richness and occupancy). We illustrate these suggestions with examples and discuss the relative benefits and drawbacks of each approach.
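The core adjustment argued for above can be sketched in a few lines. The count, detection probability, and standard error below are hypothetical, and the variance expression is a common delta-method approximation rather than the formula of any particular study.

```python
# Minimal sketch: a raw count C estimates N * p, so dividing by an estimated
# detection probability p_hat gives an abundance estimate; a delta-method
# approximation combines binomial sampling error with uncertainty in p_hat.
C = 42            # individuals counted (hypothetical)
p_hat = 0.35      # estimated detection probability (hypothetical)
se_p = 0.06       # its standard error (hypothetical)

N_hat = C / p_hat
var_N = (C * (1 - p_hat)) / p_hat**2 + (C**2 * se_p**2) / p_hat**4
print(f"N_hat ≈ {N_hat:.0f}, SE ≈ {var_N ** 0.5:.1f}")
```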
Rattray, Gordon W.
2014-01-01
Quality-control (QC) samples were collected from 2002 through 2008 by the U.S. Geological Survey, in cooperation with the U.S. Department of Energy, to ensure data robustness by documenting the variability and bias of water-quality data collected at surface-water and groundwater sites at and near the Idaho National Laboratory. QC samples consisted of 139 replicates and 22 blanks (approximately 11 percent of the number of environmental samples collected). Measurements from replicates were used to estimate variability (from field and laboratory procedures and sample heterogeneity), as reproducibility and reliability, of water-quality measurements of radiochemical, inorganic, and organic constituents. Measurements from blanks were used to estimate the potential contamination bias of selected radiochemical and inorganic constituents in water-quality samples, with an emphasis on identifying any cross contamination of samples collected with portable sampling equipment. The reproducibility of water-quality measurements was estimated with calculations of normalized absolute difference for radiochemical constituents and relative standard deviation (RSD) for inorganic and organic constituents. The reliability of water-quality measurements was estimated with pooled RSDs for all constituents. Reproducibility was acceptable for all constituents except dissolved aluminum and total organic carbon. Pooled RSDs were equal to or less than 14 percent for all constituents except for total organic carbon, which had pooled RSDs of 70 percent for the low concentration range and 4.4 percent for the high concentration range. Source-solution and equipment blanks were measured for concentrations of tritium, strontium-90, cesium-137, sodium, chloride, sulfate, and dissolved chromium. Field blanks were measured for the concentration of iodide. No detectable concentrations were measured from the blanks except for strontium-90 in one source solution and one equipment blank collected in September and October 2004, respectively. The detectable concentrations of strontium-90 in the blanks probably were from a small source of strontium-90 contamination or large measurement variability, or both. Order statistics and the binomial probability distribution were used to estimate the magnitude and extent of any potential contamination bias of tritium, strontium-90, cesium-137, sodium, chloride, sulfate, dissolved chromium, and iodide in water-quality samples. These statistical methods indicated that, with (1) 87 percent confidence, contamination bias of cesium-137 and sodium in 60 percent of water-quality samples was less than the minimum detectable concentration or reporting level; (2) 92‒94 percent confidence, contamination bias of tritium, strontium-90, chloride, sulfate, and dissolved chromium in 70 percent of water-quality samples was less than the minimum detectable concentration or reporting level; and (3) 75 percent confidence, contamination bias of iodide in 50 percent of water-quality samples was less than the reporting level for iodide. These results support the conclusion that contamination bias of water-quality samples from sample processing, storage, shipping, and analysis was insignificant and that cross-contamination of perched groundwater samples collected with bailers during 2002–08 was insignificant.
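The binomial order-statistic logic can be sketched as follows: if all n blanks are non-detects, the confidence that at least a proportion p of samples lies below the reporting level is 1 - p^n. The blank counts below are hypothetical, but the outputs are consistent with the confidence levels quoted above.

```python
# Minimal sketch of the binomial/order-statistic confidence calculation.
def confidence(p: float, n_blanks: int) -> float:
    """Confidence that at least proportion p of samples is below the
    reporting level, given n_blanks blanks with no detections."""
    return 1.0 - p ** n_blanks

print(f"{confidence(0.60, 4):.2f}")                            # ~0.87 for p = 0.60
print(f"{confidence(0.70, 7):.2f}, {confidence(0.70, 8):.2f}")  # ~0.92-0.94 for p = 0.70
print(f"{confidence(0.50, 2):.2f}")                            # ~0.75 for p = 0.50
```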
Variability of mercury concentrations in domestic well water, New Jersey Coastal Plain
Szabo, Zoltan; Barringer, Julia L.; Jacobsen, Eric; Smith, Nicholas P; Gallagher, Robert A; Sites, Andrew
2010-01-01
Concentrations of total (unfiltered) mercury (Hg) exceed the Maximum Contaminant Level (2 µg/L) in the acidic water withdrawn by more than 700 domestic wells from the areally extensive unconfined Kirkwood-Cohansey aquifer system. Background concentrations of Hg generally are <0.01 µg/L. The source of the Hg contamination has been hypothesized to arise from Hg of pesticide-application, atmospheric, and geologic origin being mobilized by some component(s) of septic-system effluent or urban leachates in unsewered residential areas. Initial results at many affected wells were not reproducible upon later resampling despite rigorous quality assurance, prompting concerns that the duration of well flushing could affect the Hg concentrations. A cooperative study by the U.S. Geological Survey and the New Jersey Department of Environmental Protection examined variability in Hg results during the flushing of domestic wells. Samples were collected at regular intervals (about 10 minutes) during flushing for eight domestic wells, until stabilization criteria were met for field-measured parameters; the Hg concentrations in the final samples ranged from about 0.0005 to 11 µg/L. Unfiltered Hg concentrations in samples collected during purging varied slightly, but the particulate Hg concentration (unfiltered minus filtered (0.45-micron capsule) concentration) typically was highly variable for each well, with no consistent pattern of increase or decrease in concentration. Surges of particulates probably were associated with pump cycling. Pre-pumping samples from the holding tanks generally had the lowest Hg concentrations among the samples collected at the well that day. Comparing the newly obtained results at each well to results from previous sampling indicated that Hg concentrations in water from the Hg-contaminated areas generally varied more among samples collected on different dates (long-term variations, months to years) than among samples collected on the same day (short-term variations, minutes to hours). The long-term variations likely are caused by changes in local pumping regimes and time-varying capture of slugs of Hg-contaminated water moving along flowpaths.
Is dietary diversity a proxy measurement of nutrient adequacy in Iranian elderly women?
Tavakoli, Sogand; Dorosty-Motlagh, Ahmad Reza; Hoshiar-Rad, Anahita; Eshraghian, Mohamad Reza; Sotoudeh, Gity; Azadbakht, Leila; Karimi, Mehrdad; Jalali-Farahani, Sara
2016-10-01
To investigate whether consumption of more diverse diets would increase the probability of nutrient adequacy among elderly women in Tehran, Iran. This cross-sectional study was conducted on 292 women aged ≥60 years who were randomly selected from 10 public health centers among the 31 centers in the southern area of Tehran; because of practical limitations, we randomly chose these 10 centers. The sample size provided 80% statistical power to test the relationship between the Nutrient Adequacy Ratio (NAR) and Mean Adequacy Ratio (MAR) as dependent variables and the total Dietary Diversity Score (DDS) as an independent variable. Dietary intakes were assessed by two 24-h recall questionnaires. The mean probability of adequacy across 12 nutrients and energy was calculated using the Dietary Reference Intakes (DRIs). The Dietary Diversity Score was defined according to the diet quality index revised (the method of Haines et al.). To investigate the relationship between MAR and DDS, some demographic and socioeconomic variables were examined. The mean ± SD of total dietary diversity was 4.22 ± 1.28 (range 1.07-6.93). The fruit and vegetable groups had the highest (1.27 ± 0.65, range 0-2.0) and the lowest (0.56 ± 0.36, range 0-1.71) diversity scores, respectively. We observed that total DDS had a significant positive correlation with MAR (r = 0.65, P < 0.001). Total DDS was significantly associated with the NAR of all 12 studied nutrients (P < 0.01); the probability of adequacy of vitamin B2 showed the strongest (r = 0.63, P < 0.01) and vitamin B12 the weakest (r = 0.28, P < 0.01) relationship with total DDS. When maximizing sensitivity and specificity, the best cut-off point of DDS for achieving MAR ≥ 1 was 4.5. The results of our study showed that DDS is an appropriate indicator of the probability of nutrient adequacy in Tehranian elderly women. Copyright © 2016 Elsevier Ltd. All rights reserved.
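The adequacy ratios used above can be sketched in Python: NAR is intake divided by the DRI-based requirement (truncated at 1), and MAR is the mean NAR across nutrients and energy. All intakes and requirements below are hypothetical.

```python
import numpy as np

# Minimal sketch of NAR and MAR. Intakes and DRI-based requirements are
# hypothetical placeholders, not study values.
intake = {"protein_g": 55, "vitB2_mg": 0.9, "vitB12_ug": 1.8, "energy_kcal": 1700}
dri    = {"protein_g": 46, "vitB2_mg": 1.1, "vitB12_ug": 2.4, "energy_kcal": 1900}

nar = {k: min(intake[k] / dri[k], 1.0) for k in intake}  # ratio capped at 1
mar = np.mean(list(nar.values()))                        # mean adequacy ratio
print(nar)
print(f"MAR = {mar:.2f}")
```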
NASA Astrophysics Data System (ADS)
Eliaš, Peter; Frič, Roman
2017-12-01
A categorical approach to probability leads to a better understanding of basic notions and constructions in generalized (fuzzy, operational, quantum) probability, where observables—dual notions to generalized random variables (statistical maps)—play a major role. First, to avoid inconsistencies, we introduce three categories, L, S, and P, whose objects and morphisms correspond to basic notions of fuzzy probability theory and operational probability theory, and we describe their relationships. To illustrate the advantages of the categorical approach, we show that two categorical constructions involving observables (related to the representation of generalized random variables via products and to the smearing of sharp observables, respectively) can be described as factorizing a morphism into a composition of two morphisms having desired properties. We close with a remark concerning products.
Rosenfield, G.H.; Fitzpatrick-Lins, K.; Johnson, T.L.
1987-01-01
A cityscape (or any landscape) can be stratified into environmental units using multiple variables of information. For the purposes of sampling building materials, census and land use variables were used to identify similar strata. In the Metropolitan Statistical Area of a cityscape, the census tract is the smallest unit for which census data are summarized and digitized boundaries are available. For purposes of this analysis, census data on total population, total number of housing units, and number of single-unit dwellings were aggregated into variables of persons per square kilometer and proportion of housing units in single-unit dwellings. The Level 2 categories of the U.S. Geological Survey's land use and land cover database were aggregated into variables of proportion of residential land with buildings, proportion of nonresidential land with buildings, and proportion of open land. From these variables, the cityscape was stratified into environmental strata of Urban Central Business District, Urban Livelihood Industrial Commercial, Urban Multi-Family Residential, Urban Single-Family Residential, Non-Urban Suburbanizing, and Non-Urban Rural. The New England region was chosen as a region with commonality of building materials, and a procedure was developed for trial classification of census tracts into one of the strata. Final stratification was performed by discriminant analysis using the trial classification and prior probabilities as weights. The procedure was applied to several cities, and the results were analyzed by correlation analysis against a field sample of building materials. The methodology developed for stratification of a cityscape using multiple variables has application to many other types of environmental studies, including forest inventory, hydrologic unit management, waste disposal, transportation studies, and other urban studies. Multivariate analysis techniques have recently been used for urban stratification in England. © 1987 Annals of Regional Science.
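A minimal sketch of the final stratification step, assuming scikit-learn's linear discriminant analysis as a stand-in for the discriminant procedure; the feature matrix, trial labels, and prior weights below are hypothetical.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)

# Hypothetical census-tract features: persons/km^2, proportion of
# single-unit dwellings, and three land-cover proportions.
X = rng.random((200, 5))

# Trial classification into the six environmental strata (labels 0-5).
y = rng.integers(0, 6, size=200)

# Prior probabilities act as weights in the final stratification step.
priors = np.array([0.05, 0.10, 0.20, 0.40, 0.15, 0.10])

lda = LinearDiscriminantAnalysis(priors=priors).fit(X, y)
strata = lda.predict(X)            # final stratum assignment per tract
posteriors = lda.predict_proba(X)  # membership probabilities per stratum
```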
Gilbert, Peter B; Yu, Xuesong; Rotnitzky, Andrea
2014-03-15
To address the objective in a clinical trial to estimate the mean or mean difference of an expensive endpoint Y, one approach employs a two-phase sampling design, wherein inexpensive auxiliary variables W predictive of Y are measured in everyone, Y is measured in a random sample, and the semiparametric efficient estimator is applied. This approach is made efficient by specifying the phase two selection probabilities as optimal functions of the auxiliary variables and measurement costs. While this approach is familiar to survey samplers, it apparently has seldom been used in clinical trials, and several novel results practicable for clinical trials are developed. We perform simulations to identify settings where the optimal approach significantly improves efficiency compared to approaches in current practice. We provide proofs and R code. The optimality results are developed to design an HIV vaccine trial, with the objective of comparing the mean 'importance-weighted' breadth (Y) of the T-cell response between randomized vaccine groups. The trial collects an auxiliary response (W) highly predictive of Y and measures Y in the optimal subset. We show that the optimal design-estimation approach can confer anywhere between absent and large efficiency gain (up to 24% in the examples) compared to the approach with the same efficient estimator but simple random sampling, where greater variability in the cost-standardized conditional variance of Y given W yields greater efficiency gains. Accurate estimation of E[Y | W] is important for realizing the efficiency gain, which is aided by an ample phase two sample and by using a robust fitting method. Copyright © 2013 John Wiley & Sons, Ltd.
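The optimal selection probabilities are, roughly, of Neyman-allocation form: sample Y more heavily where it is more variable given W relative to its measurement cost. The paper supplies R code; the Python sketch below only illustrates that allocation idea, and the working conditional-SD model and budget fraction are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical auxiliary variable W measured on all trial participants.
w = rng.normal(size=5000)

# Assumed working model for the conditional SD of Y given W; in practice
# this would be estimated from pilot or phase-one data.
sd_y_given_w = 1.0 + 0.5 * np.abs(w)

cost = 1.0              # per-subject cost of measuring Y (constant here)
budget_fraction = 0.25  # fraction of subjects sampled at phase two

# Neyman-type allocation: selection probability proportional to the
# cost-standardized conditional SD, scaled to the available budget.
pi = sd_y_given_w / np.sqrt(cost)
pi *= budget_fraction * len(w) / pi.sum()
pi = np.clip(pi, 0.0, 1.0)

selected = rng.random(len(w)) < pi  # phase-two subsample measuring Y
```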
Graves, T.A.; Kendall, Katherine C.; Royle, J. Andrew; Stetz, J.B.; Macleod, A.C.
2011-01-01
Few studies link habitat to grizzly bear Ursus arctos abundance, and those that do have not accounted for variation in detection or for spatial autocorrelation. We collected and genotyped bear hair in and around Glacier National Park in northwestern Montana during the summer of 2000. We developed a hierarchical Markov chain Monte Carlo model that extends existing occupancy and count models by accounting for (1) spatially explicit variables that we hypothesized might influence abundance; (2) separate sub-models of detection probability for two distinct sampling methods (hair traps and rub trees) targeting different segments of the population; (3) covariates to explain variation in each sub-model of detection; (4) a conditional autoregressive term to account for spatial autocorrelation; and (5) weights to identify the most important variables. Road density and per cent mesic habitat best explained variation in female grizzly bear abundance; spatial autocorrelation was not supported. More female bears were predicted in places with lower road density and more mesic habitat. Detection rates of females increased with rub tree sampling effort. Road density best explained variation in male grizzly bear abundance, and spatial autocorrelation was supported. More male bears were predicted in areas of low road density. Detection rates of males increased with rub tree and hair trap sampling effort and decreased over the sampling period. We provide a new method to (1) incorporate multiple detection methods into hierarchical models of abundance and (2) determine whether spatial autocorrelation should be included in final models. Our results suggest that the influence of landscape variables is consistent between habitat selection and abundance in this system.
Validation of the Six Sigma Z-score for the quality assessment of clinical laboratory timeliness.
Ialongo, Cristiano; Bernardini, Sergio
2018-03-28
The International Federation of Clinical Chemistry and Laboratory Medicine has recently introduced the turnaround time (TAT) as a mandatory quality indicator for the postanalytical phase. The classic TAT indicators, namely the average, the median, the 90th percentile, and the proportion of acceptable tests (PAT), have been in use for almost 40 years and to date represent the mainstay for gauging laboratory timeliness. In this study, we investigated the performance of the Six Sigma Z-score, which was previously introduced as a device for the quantitative assessment of timeliness. A numerical simulation was obtained by modeling an actual TAT data set with the log-logistic probability density function (PDF). Five thousand replicates for each size of the artificial TAT random sample (n = 20, 50, 250 and 1000) were generated, and different laboratory conditions were simulated by manipulating the PDF in order to generate more or less variable data. The Z-score and the classic TAT indicators were assessed for precision (%CV), robustness toward right-tailing (precision at different sample variability), sensitivity and specificity. The Z-score showed sensitivity and specificity comparable to PAT (≈80% with n ≥ 250) but superior precision, which remained within 20% for moderately small samples (n ≥ 50); furthermore, the Z-score was less affected by the value of the cutoff used for setting the acceptable TAT, as well as by the sample variability reflected in the magnitude of right-tailing. The Z-score was a valid indicator of laboratory timeliness and a suitable device for improving as well as maintaining the achieved quality level.
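A minimal sketch of one common way to put timeliness on the sigma scale: simulate log-logistic TATs (scipy's fisk distribution), compute the proportion of acceptable tests, and take its standard-normal quantile as the Z-score. The shape, scale, and 60-minute cutoff are assumptions for illustration, not the study's settings.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Right-tailed TAT distribution; the study used a log-logistic model
# (scipy's fisk distribution, with assumed shape and scale).
tat = stats.fisk.rvs(c=4.0, scale=45.0, size=1000, random_state=rng)

cutoff = 60.0  # acceptable TAT in minutes (assumed target)

# Proportion of acceptable tests (PAT), the classic indicator.
pat = np.mean(tat <= cutoff)

# Sigma-scale Z-score: the standard-normal quantile of the acceptable
# proportion, one common mapping onto the Six Sigma scale.
z = stats.norm.ppf(pat)
print(f"PAT = {pat:.3f}, Z-score = {z:.2f}")
```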
Cantarero, Samuel; Zafra-Gómez, Alberto; Ballesteros, Oscar; Navalón, Alberto; Reis, Marco S; Saraiva, Pedro M; Vílchez, José L
2011-01-01
In this work we present a monitoring study of linear alkylbenzene sulfonates (LAS) and insoluble soap performed on Spanish sewage sludge samples. The work focuses on finding statistical relations between the concentrations of LAS and insoluble soap in sewage sludge and variables related to the wastewater treatment plants, such as water hardness, population and treatment type. In total, 38 samples collected from different Spanish regions were studied. Principal Component Analysis (PCA) was used to reduce the number of response variables. The analysis of variance (ANOVA) and a non-parametric alternative, the Kruskal-Wallis test, were also applied, with relations between the concentration of both analytes and the remaining variables judged by the p-value (the probability of obtaining a test statistic at least as extreme as the one actually observed, assuming that the null hypothesis is true). We also compared the behavior of LAS and insoluble soap. In addition, the mean result obtained for LAS was compared with the limit value proposed by the future Directive entitled "Working Document on Sludge". The mean concentrations obtained for soap and LAS were 26.49 g kg(-1) and 6.15 g kg(-1), respectively. Notably, the LAS mean was significantly higher than the proposed limit value (2.6 g kg(-1)). In addition, LAS and soap concentrations depend largely on water hardness, whereas only the LAS concentration depends on treatment type.
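A small sketch of the statistical workflow described above (PCA on the response variables, then a Kruskal-Wallis test across plant classes); the simulated concentrations and the hardness coding are hypothetical.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)

# Hypothetical stand-in for the 38 sludge samples: LAS and soap
# concentrations (g/kg) plus a coded plant variable.
las = rng.gamma(3.0, 9.0, size=38)
soap = rng.gamma(3.0, 2.0, size=38)
hardness = rng.integers(0, 3, size=38)  # water hardness class (0=soft..2=hard)

# PCA to reduce the response variables, as in the study.
scores = PCA(n_components=1).fit_transform(np.column_stack([las, soap]))

# Kruskal-Wallis test of LAS concentration across water-hardness classes.
groups = [las[hardness == k] for k in range(3)]
stat, p = kruskal(*groups)
print(f"Kruskal-Wallis H = {stat:.2f}, p = {p:.3f}")
```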
NASA Astrophysics Data System (ADS)
Meusinger, H.; Balafkan, N.
2014-08-01
Aims: A tiny fraction of the quasar population shows remarkably weak emission lines. Several hypotheses have been developed, but the weak line quasar (WLQ) phenomenon still remains puzzling. The aim of this study was to create a sizeable sample of WLQs and WLQ-like objects and to evaluate various properties of this sample. Methods: We performed a search for WLQs in the spectroscopic data from the Sloan Digital Sky Survey Data Release 7 based on Kohonen self-organising maps for nearly 10⁵ quasar spectra. The final sample consists of 365 quasars in the redshift range z = 0.6-4.2 (mean z = 1.50 ± 0.45) and includes in particular a subsample of 46 WLQs with equivalent widths W(Mg II) < 11 Å and W(C IV) < 4.8 Å. We compared the luminosities, black hole masses, Eddington ratios, accretion rates, variability, spectral slopes, and radio properties of the WLQs with those of control samples of ordinary quasars. Particular attention was paid to selection effects. Results: The WLQs have, on average, significantly higher luminosities, Eddington ratios, and accretion rates. About half of the excess comes from a selection bias, but an intrinsic excess remains, probably caused primarily by higher accretion rates. The spectral energy distribution shows a bluer continuum at rest-frame wavelengths ≳1500 Å. The variability in the optical and UV is relatively low, even taking the variability-luminosity anti-correlation into account. The percentage of radio-detected quasars and of core-dominant radio sources is significantly higher than for the control sample, whereas the mean radio-loudness is lower. Conclusions: The properties of our WLQ sample can be consistently understood assuming that it consists of a mix of quasars at the beginning of a stage of increased accretion activity and of beamed radio-quiet quasars. The higher luminosities and Eddington ratios in combination with a bluer spectral energy distribution can be explained by hotter continua, i.e. higher accretion rates. If quasar activity consists of subphases with different accretion rates, a change towards a higher rate is probably accompanied by an only slow development of the broad line region. The composite WLQ spectrum can be reasonably matched by the ordinary quasar composite where the continuum has been replaced by that of a hotter disk. A similar effect can be achieved by an additional power-law component in relativistically boosted radio-quiet quasars, which may explain the high percentage of radio quasars. The full catalogue is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/568/A114
New approach to probability estimate of femoral neck fracture by fall (Slovak regression model).
Wendlova, J
2009-01-01
3,216 Slovak women with primary or secondary osteoporosis or osteopenia, aged 20-89 years (mean age 58.9 years, 95% C.I. 58.42-59.38), were examined with a DXA bone densitometer (dual-energy X-ray absorptiometry; GE Prodigy Primo). The values of the following variables were measured for each patient: FSI (femur strength index), T-score total hip left, alpha angle left, theta angle left, and HAL (hip axis length) left; BMI (body mass index) was calculated from the height and weight of the patients. The regression model ranked the independent variables by the intensity of their influence upon the dependent FSI variable as follows: 1. BMI, 2. theta angle, 3. T-score total hip, 4. alpha angle, 5. HAL. The regression model equation, calculated from the variables monitored in the study, enables a doctor in practice to determine the probability magnitude (absolute risk) of a pathological FSI value (FSI < 1) in the femoral neck area, i.e., it allows a probability estimate of a femoral neck fracture by fall for Slovak women. 1. The Slovak regression model differs from regression models published until now in its choice of independent variables and of a dependent variable belonging to the biomechanical variables characterising bone quality. 2. The Slovak regression model avoids the inaccuracies of other models, which cannot precisely define the current and past clinical condition of tested patients (e.g., the length and dose of exposure to risk factors). 3. The Slovak regression model opens the way to a new method of estimating the probability (absolute risk), or the odds, of a femoral neck fracture by fall based upon determination of bone quality. 4. It is assumed that development will proceed by improving the methods for measuring bone quality and determining the probability of fracture by fall (Tab. 6, Fig. 3, Ref. 22). Full Text (Free, PDF) www.bmj.sk.
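The model's use in practice, turning a patient's measurements into an absolute risk of a pathological FSI, can be sketched as a logistic regression. The predictors follow the ranking above, but the coefficients and simulated data below are hypothetical, not the published Slovak model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 3216

# Hypothetical stand-ins for the measured predictors: BMI, theta angle,
# T-score total hip, alpha angle, and HAL (column order as ranked above).
X = rng.normal(size=(n, 5))

# Outcome: pathological femur strength index (FSI < 1), simulated here.
fsi = 1.1 + X @ np.array([-0.30, -0.20, 0.15, -0.05, -0.02])
fsi += rng.normal(0, 0.2, n)
y = (fsi < 1.0).astype(int)

model = LogisticRegression().fit(X, y)

# Absolute risk (probability of FSI < 1) for a new patient's measurements.
new_patient = np.array([[0.5, -0.3, -1.2, 0.1, 0.0]])
risk = model.predict_proba(new_patient)[0, 1]
print(f"estimated probability of pathological FSI: {risk:.2f}")
```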
Early detection of emerging forest disease using dispersal estimation and ecological niche modeling.
Meentemeyer, Ross K; Anacker, Brian L; Mark, Walter; Rizzo, David M
2008-03-01
Distinguishing the manner in which dispersal limitation and niche requirements control the spread of invasive pathogens is important for prediction and early detection of disease outbreaks. Here, we use niche modeling augmented by dispersal estimation to examine the degree to which local habitat conditions vs. force of infection predict invasion of Phytophthora ramorum, the causal agent of the emerging infectious tree disease sudden oak death. We sampled 890 field plots for the presence of P. ramorum over a three-year period (2003-2005) across a range of host and abiotic conditions with variable proximities to known infections in California, USA. We developed and validated generalized linear models of invasion probability to analyze the relative predictive power of 12 niche variables and a negative exponential dispersal kernel estimated by likelihood profiling. Models were developed incrementally each year (2003, 2003-2004, 2003-2005) to examine annual variability in model parameters and to create realistic scenarios for using models to predict future infections and to guide early-detection sampling. Overall, 78 new infections were observed up to 33.5 km from the nearest known site of infection, with slightly increasing rates of prevalence across time windows (2003, 6.5%; 2003-2004, 7.1%; 2003-2005, 9.6%). The pathogen was not detected in many field plots that contained susceptible host vegetation. The generalized linear modeling indicated that the probability of invasion is limited by both dispersal and niche constraints. Probability of invasion was positively related to precipitation and temperature in the wet season and the presence of the inoculum-producing foliar host Umbellularia californica and decreased exponentially with distance to inoculum sources. Models that incorporated niche and dispersal parameters best predicted the locations of new infections, with accuracies ranging from 0.86 to 0.90, suggesting that the modeling approach can be used to forecast locations of disease spread. Application of the combined niche plus dispersal models in a geographic information system predicted the presence of P. ramorum across approximately 8,228 km² of California's 84,785 km² (9.7%) of land area with susceptible host species. This research illustrates how probabilistic modeling can be used to analyze the relative roles of niche and dispersal limitation in controlling the distribution of invasive pathogens.
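A rough sketch of the combined niche-plus-dispersal model: a binomial GLM whose covariates include a force-of-infection term from a negative exponential kernel, with the kernel range profiled on a grid. All data, the covariate set, and the statsmodels-based fitting choice are hypothetical stand-ins for the paper's analysis.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 890

# Hypothetical plot data: wet-season climate, host presence, and distance
# (km) to the nearest known inoculum source.
precip = rng.normal(size=n)
temp = rng.normal(size=n)
host = rng.integers(0, 2, size=n)
dist = rng.exponential(10.0, size=n)
y = rng.integers(0, 2, size=n)  # placeholder presence/absence outcome

def fit_glm(alpha):
    # Negative exponential dispersal kernel with range parameter alpha.
    force = np.exp(-dist / alpha)
    X = sm.add_constant(np.column_stack([precip, temp, host, force]))
    return sm.GLM(y, X, family=sm.families.Binomial()).fit()

# Profile the kernel parameter on a grid, keeping the alpha with the
# highest log-likelihood (a crude analogue of likelihood profiling).
alphas = np.linspace(1.0, 30.0, 30)
lls = [fit_glm(a).llf for a in alphas]
best_alpha = alphas[int(np.argmax(lls))]
print(f"profiled kernel range: {best_alpha:.1f} km")
print(fit_glm(best_alpha).params)
```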
II. MORE THAN JUST CONVENIENT: THE SCIENTIFIC MERITS OF HOMOGENEOUS CONVENIENCE SAMPLES.
Jager, Justin; Putnick, Diane L; Bornstein, Marc H
2017-06-01
Despite their limited generalizability relative to probability samples, nonprobability convenience samples are the standard within developmental science, and they will likely remain so because probability samples are cost-prohibitive and most available probability samples are ill-suited to examining developmental questions. In lieu of focusing on how to eliminate or sharply reduce reliance on convenience samples within developmental science, here we propose how to augment their advantages when it comes to understanding population effects as well as subpopulation differences. Although all convenience samples have less clear generalizability than probability samples, we argue that homogeneous convenience samples have clearer generalizability than conventional convenience samples. Therefore, when researchers are limited to convenience samples, they should consider homogeneous convenience samples as a positive alternative to conventional (or heterogeneous) convenience samples. We discuss future directions as well as potential obstacles to expanding the use of homogeneous convenience samples in developmental science. © 2017 The Society for Research in Child Development, Inc.
X-ray Binaries in the Central Region of M31
NASA Astrophysics Data System (ADS)
Trudolyubov, Sergey P.; Priedhorsky, W. C.; Cordova, F. A.
2006-09-01
We present the results of a systematic survey of X-ray sources in the central region of M31 using data from XMM-Newton observations. The spectral properties and variability of 124 bright X-ray sources were studied in detail. We found that more than 80% of the sources observed in two or more observations show significant variability on time scales of days to years. At least 50% of the sources in our sample are spectrally variable. The fraction of variable sources in our survey is much higher than previously reported from the Chandra survey of M31, and is remarkably close to the fraction of variable sources found in the M31 globular cluster X-ray source population. We present the spectral distribution of M31 X-ray sources, based on spectral fitting with a power-law model. The distribution of the spectral photon index has two main peaks, at 1.8 and 2.3, and shows clear evolution with source luminosity. Based on the similarity of the properties of M31 X-ray sources and their Galactic counterparts, we expect most of the X-ray sources in our sample to be accreting binary systems with neutron star or black hole primaries. Combining the results of the X-ray analysis (X-ray spectra, hardness-luminosity diagrams and variability) with available data at other wavelengths, we explore the possibility of distinguishing between bright neutron star and black hole binary systems, and identify 7% and 25% of the sources in our sample as probable black hole and neutron star candidates, respectively. Finally, we compare the M31 X-ray source population to the source populations of normal galaxies of different morphological types. Support for this work was provided through NASA Grant NAG5-12390. Part of this work was done during a summer workshop ``Revealing Black Holes'' at the Aspen Center for Physics; S. T. is grateful to the Center for their hospitality.
Wormholes and the cosmological constant problem.
NASA Astrophysics Data System (ADS)
Klebanov, I.
The author reviews the cosmological constant problem and the recently proposed wormhole mechanism for its solution. Summation over wormholes in the Euclidean path integral for gravity turns all the coupling parameters into dynamical variables, sampled from a probability distribution. A formal saddle-point analysis results in a distribution with a sharp peak at a cosmological constant of zero, which appears to solve the cosmological constant problem. He discusses the instabilities of the gravitational Euclidean path integral and the difficulties with its interpretation. He presents an alternate formalism for baby universes, based on the "third quantization" of the Wheeler-DeWitt equation. This approach is analyzed in a minisuperspace model for quantum gravity, where it reduces to simple quantum mechanics. Once again, the coupling parameters become dynamical. Unfortunately, the a priori probability distribution for the cosmological constant and other parameters is typically a smooth function, with no sharp peaks.
Determinants of Workplace Injuries and Violence Among Newly Licensed RNs.
Unruh, Lynn; Asi, Yara
2018-06-01
Workplace injuries, such as musculoskeletal injuries, needlestick injuries, and emotional and physical violence, remain an issue in U.S. hospitals. To develop meaningful safety programs, it is important to identify workplace factors that contribute to injuries. This study explored factors that affect injuries in a sample of newly licensed registered nurses (NLRNs) in Florida. Regressions were run on models in which the dependent variable was the degree to which the respondent had experienced needlesticks, work-related musculoskeletal injuries, cuts or lacerations, contusions, verbal violence, physical violence, and other occupational injuries. A higher probability of these injuries was associated with greater length of employment, working evening or night shifts, working overtime, and reporting job difficulties and pressures. A lower probability was associated with working in a teaching hospital and working more hours. Study findings suggest that work environment issues must be addressed for safety programs to be effective.
Stock price analysis of sustainable foreign investment companies in Indonesia
NASA Astrophysics Data System (ADS)
Fachrudin, Khaira Amalia
2018-03-01
The stock price is determined by demand and supply in the stock market. Stock prices react to information. Sustainable investment is investment that considers environmental sustainability and human rights. This study aims to predict the probability of an above-average stock price by including a sustainability index as one of the explanatory variables. The population is all foreign investment companies in Indonesia; the target population, which also served as the sample, is the companies that distribute dividends. The analysis tool is logistic regression. At a 5% significance level, the sustainability index was not found to increase the probability of an above-average stock price. The significant effects are free cash flow and cost of debt. However, the sustainability index does increase the Nagelkerke R square. The implication is that awareness of sustainability still needs to be improved, because the results indicate that investors only consider risk and return.
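Since logistic-regression software does not always report it, the sketch below shows how the Nagelkerke R² mentioned above can be computed from the fitted and null log-likelihoods; the predictors and simulated outcome are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(7)

# Hypothetical firm-level predictors: free cash flow, cost of debt, and a
# sustainability-index dummy; outcome is above-average stock price (0/1).
X = rng.normal(size=(120, 3))
y = (0.8 * X[:, 0] - 0.6 * X[:, 1] + rng.normal(size=120) > 0).astype(int)

n = len(y)
model = LogisticRegression().fit(X, y)

# Log-likelihoods of the fitted model and the intercept-only null model.
ll_model = -log_loss(y, model.predict_proba(X)[:, 1], normalize=False)
p0 = y.mean()
ll_null = n * (p0 * np.log(p0) + (1 - p0) * np.log(1 - p0))

# Cox-Snell R^2, then the Nagelkerke correction rescaling it to [0, 1].
r2_cs = 1 - np.exp(2 * (ll_null - ll_model) / n)
r2_nagelkerke = r2_cs / (1 - np.exp(2 * ll_null / n))
print(f"Nagelkerke R^2 = {r2_nagelkerke:.3f}")
```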
Testing Models for Perceptual Discrimination Using Repeatable Noise
NASA Technical Reports Server (NTRS)
Ahumada, Albert J., Jr.; Null, Cynthia H. (Technical Monitor)
1998-01-01
Adding noise to stimuli to be discriminated allows estimation of observer classification functions based on the correlation between observer responses and relevant features of the noisy stimuli. Examples will be presented of stimulus features that are found in auditory tone detection and visual Vernier acuity. Using the standard signal detection model (Thurstone scaling), we derive formulas to estimate the proportion of the observer's decision variable variance that is controlled by the added noise. One is based on the probability of agreement of the observer with him/herself on trials with the same noise sample. Another is based on the relative performance of the observer and the model. When these do not agree, the model can be rejected. A second derivation gives the probability of agreement of observer and model when the observer follows the model except for internal noise. Agreement significantly less than this amount allows rejection of the model.
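The self-agreement route to the noise-controlled variance can be summarized in one worked equation set. This is an assumption-laden reconstruction (zero-mean Gaussian decision variable, unbiased criterion, internal noise independent of the added external noise), not a quotation of the paper's formulas:

```latex
% Assumed model: decision variable D = E + I, with E the external
% (repeatable) noise contribution and I the internal noise,
% independent zero-mean Gaussians with variances \sigma_E^2, \sigma_I^2.
D = E + I, \qquad \rho = \frac{\sigma_E^2}{\sigma_E^2 + \sigma_I^2}
% Two presentations of the same noise sample share E, so the two decision
% variables are bivariate normal with correlation \rho, and the probability
% that an unbiased observer agrees with him/herself is
P(\text{agree}) = \tfrac{1}{2} + \tfrac{1}{\pi}\arcsin\rho
\quad\Longrightarrow\quad
\hat{\rho} = \sin\bigl(\pi\,(P_{\text{agree}} - \tfrac{1}{2})\bigr).
```

Here ρ is exactly the proportion of decision-variable variance controlled by the added noise, so measuring the probability of self-agreement on repeated noise samples yields an estimate of it.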
Jaccard, James; Dodge, Tonya; Guilamo-Ramos, Vincent
2005-03-01
The present study explores 2 key variables in social metacognition: perceived intelligence and perceived level of knowledge about a specific content domain. The former represents a judgment of one's knowledge at an abstract level, whereas the latter represents a judgment of one's knowledge in a specific content domain. Data from interviews of 8,411 female adolescents from a national sample were analyzed in a 2-wave panel design with a year between assessments. Higher levels of perceived intelligence at Wave 1 were associated with a lower probability of the occurrence of a pregnancy over the ensuing year, independent of actual IQ, self-esteem, and academic aspirations. Higher levels of perceived knowledge about the accurate use of birth control were associated with a higher probability of the occurrence of a pregnancy, independent of actual knowledge about accurate use, perceived intelligence, self-esteem, and academic aspirations.
Krishnamoorthy, K; Oral, Evrim
2017-12-01
Standardized likelihood ratio test (SLRT) for testing the equality of means of several log-normal distributions is proposed. The properties of the SLRT and an available modified likelihood ratio test (MLRT) and a generalized variable (GV) test are evaluated by Monte Carlo simulation and compared. Evaluation studies indicate that the SLRT is accurate even for small samples, whereas the MLRT could be quite liberal for some parameter values, and the GV test is in general conservative and less powerful than the SLRT. Furthermore, a closed-form approximate confidence interval for the common mean of several log-normal distributions is developed using the method of variance estimate recovery, and compared with the generalized confidence interval with respect to coverage probabilities and precision. Simulation studies indicate that the proposed confidence interval is accurate and better than the generalized confidence interval in terms of coverage probabilities. The methods are illustrated using two examples.
Pracht, Etienne E; Bass, Elizabeth
2011-01-01
This paper explores the link between utilization of ambulatory care and the likelihood of rehospitalization for an avoidable reason in veterans served by the Veteran Health Administration (VA). The analysis used administrative data containing healthcare utilization and patient characteristics stored at the national VA data warehouse, the Corporate Franchise Data Center. The study sample consisted of 284 veterans residing in Florida who had been hospitalized at least once for an avoidable reason. A bivariate probit model with instrumental variables was used to estimate the probability of rehospitalization. Veterans who had at least 1 ambulatory care visit per month experienced a significant reduction in the probability of rehospitalization for the same avoidable hospitalization condition. The findings suggest that ambulatory care can serve as an important substitute for more expensive hospitalization for the conditions characterized as avoidable. © 2011 National Association for Healthcare Quality.
Lorber, M.; Johnson, Kevin; Kross, B.; Pinsky, P.; Burmeister, L.; Thurman, M.; Wilkins, A.; Hallberg, G.
1997-01-01
In 1988, the Iowa Department of Natural Resources, along with the University of Iowa conducted the Statewide Rural Well Water Survey, commonly known as SWRL. A total of 686 private rural drinking water wells was selected by use of a probability sample and tested for pesticides and nitrates. Sixty-eight of these wells, the '10% repeat' wells, were additionally sampled in October, 1990 and June, 1991. Starting in November, 1991, the University of Iowa, with sponsorship from the United States Environmental Protection Agency, revisited these wells to begin a study of the temporal variability of atrazine and nitrates in wells. Other wells, which had originally tested positive for atrazine in SWRL but were not in the 10% repeat population, were added to the study population. Temporal sampling for a year-long period began in February of 1992 and concluded in January of 1993. All wells were sampled monthly, one subset was sampled weekly, and a second subset was sampled for 14-day consecutive periods. Two unique aspects of this study were the use of an immunoassay technique to screen for triazines before gas chromatography/mass spectrometry (GC/MS) analysis and quantification of atrazine, and the use of well owners to sample the wells. A total of 1771 samples from 83 wells are in the final data base for this study. This paper reviews the study design, the analytical methodologies, and development of the data base. A companion paper discusses the analysis of the data from this survey.
Nonprobability and probability-based sampling strategies in sexual science.
Catania, Joseph A; Dolcini, M Margaret; Orellana, Roberto; Narayanan, Vasudah
2015-01-01
With few exceptions, much of sexual science builds upon data from opportunistic nonprobability samples of limited generalizability. Although probability-based studies are considered the gold standard in terms of generalizability, they are costly to apply to many of the hard-to-reach populations of interest to sexologists. The present article discusses recent conclusions by sampling experts that have relevance to sexual science that advocates for nonprobability methods. In this regard, we provide an overview of Internet sampling as a useful, cost-efficient, nonprobability sampling method of value to sex researchers conducting modeling work or clinical trials. We also argue that probability-based sampling methods may be more readily applied in sex research with hard-to-reach populations than is typically thought. In this context, we provide three case studies that utilize qualitative and quantitative techniques directed at reducing limitations in applying probability-based sampling to hard-to-reach populations: indigenous Peruvians, African American youth, and urban men who have sex with men (MSM). Recommendations are made with regard to presampling studies, adaptive and disproportionate sampling methods, and strategies that may be utilized in evaluating nonprobability and probability-based sampling methods.
NASA Astrophysics Data System (ADS)
Varouchakis, Emmanouil; Kourgialas, Nektarios; Karatzas, George; Giannakis, Georgios; Lilli, Maria; Nikolaidis, Nikolaos
2014-05-01
Riverbank erosion affects river morphology and the local habitat and results in riparian land loss, damage to property and infrastructure, ultimately weakening flood defences. An important issue concerning riverbank erosion is the identification of the areas vulnerable to erosion, as this allows for predicting changes and assists with stream management and restoration. One way to predict the areas vulnerable to erosion is to determine the erosion probability by identifying the underlying relations between riverbank erosion and the geomorphological and/or hydrological variables that prevent or stimulate erosion. In this work, a statistical model for evaluating the probability of erosion based on a series of independent local variables is developed using logistic regression. The main variables affecting erosion are the vegetation index (stability), the presence or absence of meanders, bank material (classification), stream power, bank height, riverbank slope, riverbed slope, cross-section width and water velocities (Luppi et al. 2009). In statistics, logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g., a binary response) based on one or more continuous or categorical predictor variables. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function; the model thus converts the dependent variable to probability scores and predicts success or failure of a given binary variable (e.g., 1 = "presence of erosion" and 0 = "no erosion") for any value of the independent variables. The regression coefficients are estimated by maximum likelihood. The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested (Atkinson et al. 2003). The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. The aim is to determine the probability of erosion along the Koiliaris riverbanks considering a series of independent geomorphological and/or hydrological variables. Data for the riverbank slope and the river cross-section width are available at ten locations along the river; the riverbank shows indications of erosion at six of the ten locations, while four have remained stable. Based on a recent work, measurements for the two independent variables and data regarding bank stability are available at eight different locations along the river, which were used as validation points for the proposed statistical model. The results show very close agreement between the observed erosion indications and the statistical model, as the probability of erosion was accurately predicted at seven of the eight locations. The next step is to apply the model at more locations along the riverbanks. In November 2013, stakes were inserted at selected locations in order to identify the presence or absence of erosion after the winter period. In April 2014 the presence or absence of erosion will be identified and the model results will be compared to the field data. Our intent is to extend the model by increasing the number of independent variables in order to identify the key factors favouring erosion along the Koiliaris River.
We aim to develop an easy-to-use statistical tool that provides a quantified measure of the erosion probability along the riverbanks, which could consequently be used to prevent erosion and flooding events. References: Atkinson, P. M., German, S. E., Sear, D. A. and Clark, M. J. 2003. Exploring the relations between riverbank erosion and geomorphological controls using geographically weighted logistic regression. Geographical Analysis, 35(1), 58-82. Luppi, L., Rinaldi, M., Teruggi, L. B., Darby, S. E. and Nardi, L. 2009. Monitoring and numerical modelling of riverbank erosion processes: A case study along the Cecina River (central Italy). Earth Surface Processes and Landforms, 34(4), 530-546. Acknowledgements: This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
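A minimal sketch of the logistic-regression step described above, fitted at a handful of surveyed locations; the slope/width values, the reduced two-predictor set, and the statsmodels choice are hypothetical stand-ins for the full variable list.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical observations at surveyed riverbank locations: bank slope
# (degrees) and cross-section width (m), with 1 = erosion, 0 = stable.
slope = np.array([35, 42, 28, 50, 45, 38, 22, 30, 48, 40], dtype=float)
width = np.array([12, 8, 15, 6, 7, 10, 18, 14, 5, 9], dtype=float)
eroded = np.array([1, 1, 0, 1, 1, 1, 0, 0, 1, 0])

X = sm.add_constant(np.column_stack([slope, width]))

# Logistic regression fitted by maximum likelihood, converting the binary
# erosion indicator to a probability score as described above.
fit = sm.Logit(eroded, X).fit(disp=False)

# Predicted erosion probability at a new location (slope 44 deg, width 7 m).
p_new = fit.predict([[1.0, 44.0, 7.0]])[0]
print(f"predicted erosion probability: {p_new:.2f}")
```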
Yoganandan, Narayan; Arun, Mike W.J.; Pintar, Frank A.; Szabo, Aniko
2015-01-01
Objective: Derive optimum injury probability curves to describe human tolerance of the lower leg using parametric survival analysis. Methods: The study re-examined lower leg PMHS data from a large group of specimens. Briefly, axial loading experiments were conducted by impacting the plantar surface of the foot. Both injury and non-injury tests were included in the testing process, identified by pre- and post-test radiographic images and detailed dissection following the impact test. Fractures included injuries to the calcaneus and distal tibia-fibula complex (including pylon), representing severities at Abbreviated Injury Scale (AIS) level 2+. For the statistical analysis, peak force was chosen as the main explanatory variable and age as the covariable. Censoring statuses depended on experimental outcomes. Parameters of the parametric survival model were estimated by maximum likelihood, and the dfbetas statistic was used to identify overly influential samples. The best fit among the Weibull, log-normal and log-logistic distributions was chosen by the Akaike Information Criterion. Plus and minus 95% confidence intervals were obtained for the optimum injury probability distribution, the relative sizes of the intervals were determined at predetermined risk levels, and quality indices were described at each of the selected probability levels. Results: The mean age, stature and weight were 58.2 ± 15.1 years, 1.74 ± 0.08 m and 74.9 ± 13.8 kg, respectively. Excluding all overly influential tests resulted in the tightest confidence intervals. The Weibull distribution was the optimum function compared to the other two distributions. A majority of quality indices were in the good category for this optimum distribution when results were extracted for 25-, 45- and 65-year-old age groups at 5, 25 and 50% risk levels for lower leg fracture. For 25, 45 and 65 years, peak forces were 8.1, 6.5, and 5.1 kN at 5% risk; 9.6, 7.7, and 6.1 kN at 25% risk; and 10.4, 8.3, and 6.6 kN at 50% risk, respectively. Conclusions: This study derived axial-loading-induced injury risk curves based on survival analysis using peak force and specimen age, adopting different censoring schemes, considering overly influential samples in the analysis, and assessing the quality of the distribution at discrete probability levels. Because the procedures used in the present survival analysis are accepted by international automotive communities, the current optimum human injury probability distributions can be used at all risk levels with more confidence in future crashworthiness applications for automotive and other disciplines. PMID:25307381
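A bare-bones sketch of a Weibull survival fit in which peak force plays the role of the "time" axis: injury tests are treated here as left-censored (fracture occurred at or below the recorded peak force) and non-injury tests as right-censored. The data are simulated, the age covariable is omitted, and the censoring convention is an assumption about the paper's scheme rather than a reproduction of it.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

rng = np.random.default_rng(8)

# Hypothetical peak forces (kN) and outcomes: 1 = fracture observed
# (left-censored), 0 = no fracture (right-censored).
force = rng.uniform(3.0, 12.0, size=40)
injury = (force > rng.normal(8.0, 1.5, size=40)).astype(int)

def neg_log_lik(params):
    shape, scale = np.exp(params)  # log-parameterization keeps both positive
    F = weibull_min.cdf(force, shape, scale=scale)
    # Left-censored (injury) tests contribute log F; right-censored
    # (non-injury) tests contribute log (1 - F).
    ll = np.where(injury == 1, np.log(F + 1e-12), np.log(1 - F + 1e-12))
    return -ll.sum()

fit = minimize(neg_log_lik, x0=np.log([5.0, 8.0]), method="Nelder-Mead")
shape, scale = np.exp(fit.x)

# Injury risk curve: e.g., the peak force at a 25% risk of fracture.
force_25 = weibull_min.ppf(0.25, shape, scale=scale)
print(f"shape={shape:.2f}, scale={scale:.2f} kN, 25% risk at {force_25:.1f} kN")
```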
Cohn, Amy M; Zinzow, Heidi M; Resnick, Heidi S; Kilpatrick, Dean G
2013-02-01
Rape tactics, rape incident characteristics, and mental health problems (lifetime depression, PTSD, and substance abuse) were investigated as correlates of eight different reasons for not reporting a rape to police among women who had experienced but did not report a rape (n = 441) within a national telephone household probability sample. Rape tactics (not mutually exclusive) included drug- or alcohol-facilitated or incapacitated rape (DAFR/IR; n = 119) and forcible rape (FR; n = 376). Principal Components Analysis (PCA) was conducted to extract a dominant set of patterns among the eight reasons for not reporting and to reduce the set of dependent variables. PCA results indicated three unique factors: Not Wanting Others to Know, Nonacknowledgment of Rape, and Criminal Justice Concerns. Hierarchical regression analyses showed that DAFR/IR and FR were both positively and significantly associated with Criminal Justice Concerns, whereas DAFR/IR, but not FR, was associated with Nonacknowledgment as a reason for not reporting to police. Neither DAFR/IR nor FR emerged as a significant predictor of Not Wanting Others to Know after controlling for fear of death or injury at the time of the incident. Correlations among variables showed that the Criminal Justice Concerns factor was positively related to lifetime depression and PTSD and the Nonacknowledgment factor was negatively related to lifetime PTSD. Findings suggest prevention programs should educate women about the definition of rape, which may include incapacitation due to alcohol or drugs, to increase acknowledgment and decrease barriers to police reporting.
Coggins, Lewis G; Bacheler, Nathan M; Gwinn, Daniel C
2014-01-01
Occupancy models using incidence data collected repeatedly at sites across the range of a population are increasingly employed to infer patterns and processes influencing population distribution and dynamics. While such work is common in terrestrial systems, fewer examples exist in marine applications. This disparity likely exists because the replicate samples required by these models to account for imperfect detection are often impractical to obtain when surveying aquatic organisms, particularly fishes. We employ simultaneous sampling using fish traps and novel underwater camera observations to generate the requisite replicate samples for occupancy models of red snapper, a reef fish species. Since the replicate samples are collected simultaneously by multiple sampling devices, many typical problems encountered when obtaining replicate observations are avoided. Our results suggest that augmenting traditional fish trap sampling with camera observations not only doubled the probability of detecting red snapper in reef habitats off the Southeast coast of the United States, but supplied the necessary observations to infer factors influencing population distribution and abundance while accounting for imperfect detection. We found that detection probabilities tended to be higher for camera traps than traditional fish traps. Furthermore, camera trap detections were influenced by the current direction and turbidity of the water, indicating that collecting data on these variables is important for future monitoring. These models indicate that the distribution and abundance of this species is more heavily influenced by latitude and depth than by micro-scale reef characteristics lending credence to previous characterizations of red snapper as a reef habitat generalist. This study demonstrates the utility of simultaneous sampling devices, including camera traps, in aquatic environments to inform occupancy models and account for imperfect detection when describing factors influencing fish population distribution and dynamics.
Denison, Stephanie; Trikutam, Pallavi; Xu, Fei
2014-08-01
A rich tradition in developmental psychology explores physical reasoning in infancy. However, no research to date has investigated whether infants can reason about physical objects that behave probabilistically rather than deterministically. Physical events are often quite variable, in that similar-looking objects can be placed in similar contexts with different outcomes. Can infants rapidly acquire probabilistic physical knowledge, such as that some leaves fall and some glasses break, simply by observing the statistical regularity with which objects behave, and apply that knowledge in subsequent reasoning? We taught 11-month-old infants physical constraints on objects and asked them to reason about the probability of different outcomes when objects were drawn from a large distribution. Infants could have reasoned either by using the perceptual similarity between the samples and the larger distributions or by applying physical rules to adjust base rates and estimate the probabilities. Infants learned the physical constraints quickly and used them to estimate probabilities, rather than relying on similarity, a version of the representativeness heuristic. These results indicate that infants can rapidly and flexibly acquire physical knowledge about objects following very brief exposure and apply it in subsequent reasoning. PsycINFO Database Record (c) 2014 APA, all rights reserved.
A predictive Bayesian approach to the design and analysis of bridging studies.
Gould, A Lawrence; Jin, Tian; Zhang, Li Xin; Wang, William W B
2012-09-01
Pharmaceutical product development culminates in confirmatory trials whose evidence for the product's efficacy and safety supports regulatory approval for marketing. Regulatory agencies in countries whose patients were not included in the confirmatory trials often require confirmation of efficacy and safety in their patient populations, which may be accomplished by carrying out bridging studies to establish consistency for local patients of the effects demonstrated by the original trials. This article describes and illustrates an approach for designing and analyzing bridging studies that fully incorporates the information provided by the original trials. The approach determines probability contours or regions of joint predictive intervals for treatment effect and response variability, or endpoints of treatment effect confidence intervals, that are functions of the findings from the original trials, the sample sizes for the bridging studies, and possible deviations from complete consistency with the original trials. The bridging studies are judged consistent with the original trials if their findings fall within the probability contours or regions. Regulatory considerations determine the region definitions and appropriate probability levels. Producer and consumer risks provide a way to assess alternative region and probability choices. [Supplemental materials are available for this article. Go to the Publisher's online edition of the Journal of Biopharmaceutical Statistics for the following free supplemental resource: Appendix 2: R code for Calculations.].
Costa, Marilia G; Barbosa, José C; Yamamoto, Pedro T
2007-01-01
Sequential sampling is characterized by samples of variable size and has the advantage of reducing sampling time and costs compared with fixed-size sampling. To support adequate management of orthezia, sequential sampling plans were developed for orchards under low and high infestation. Data were collected in Matão, SP, in commercial stands of the orange variety 'Pêra Rio' at five, nine and 15 years of age. Twenty samplings were performed over the whole area of each stand by recording the presence or absence of scales on plants, with plots comprising ten plants. After observing that the scale population in all three stands was distributed according to a contagious model, with the Negative Binomial Distribution fitting most samplings, two sequential sampling plans were constructed according to the Sequential Likelihood Ratio Test (SLRT). To construct these plans, an economic threshold of 2% was adopted and the type I and II error probabilities were fixed at alpha = beta = 0.10. Results showed that the maximum numbers of samples expected to be needed to reach a control decision were 172 and 76 for stands with low and high infestation, respectively.
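A small sketch of how the stop lines of such a plan can be computed with Wald's sequential probability ratio test for presence/absence counts; the infestation proportions p0 and p1 bracketing the 2% economic threshold are assumptions for illustration, not the published plan's values.

```python
import numpy as np

alpha = beta = 0.10  # type I and II error probabilities, as in the study
p0, p1 = 0.01, 0.03  # assumed proportions bracketing the 2% threshold

# Wald's SPRT for binomial presence/absence data: both decision lines
# share the slope s and differ only in their intercepts h0 and h1.
k = np.log(p1 * (1 - p0) / (p0 * (1 - p1)))
s = np.log((1 - p0) / (1 - p1)) / k          # common slope
h1 = np.log((1 - beta) / alpha) / k          # intercept of the "treat" line
h0 = np.log(beta / (1 - alpha)) / k          # intercept of the "no treat" line

n = np.arange(1, 201)                        # number of plots sampled so far
treat_line = s * n + h1      # stop and treat if infested count >= this
no_treat_line = s * n + h0   # stop, no treatment needed, if count <= this
print(f"slope = {s:.4f}, h1 = {h1:.2f}, h0 = {h0:.2f}")
```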
Psychosocial variables of sexual satisfaction in Chile.
Barrientos, Jaime E; Páez, Dario
2006-01-01
This study analyzed psychosocial variables of sexual satisfaction in Chile using data from the COSECON survey. Participants were 5,407 subjects (2,244 men and 3,163 women, aged 18-69 years). We used a cross-sectional questionnaire with a national probability sample. Data were collected using a thorough sexual behavior questionnaire consisting of 190 face-to-face questions and 24 self-reported questions. A single item included in the COSECON questionnaire assessed sexual satisfaction. Results showed that high education level, marital status, and high socioeconomic level were associated with sexual satisfaction in women but not in men. The results also showed important gender differences and support the idea that changes in sexuality may be more present in the middle and upper social classes. The proximal variables typically used for measuring sexual satisfaction, such as the frequency of sexual intercourse and orgasm, showed a positive but smaller association with sexual satisfaction. Other important variables related to sexual satisfaction were being in love with the partner and having a steady partner. The results confirmed previous findings and are discussed within the frame of approaches such as exchange, equity, and sexual script theories.
Importance Sampling in the Evaluation and Optimization of Buffered Failure Probability
2015-07-01
Presented at the 12th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP12), Vancouver, Canada, July 12-15, 2015. Marwan M. Harajli, Graduate Student, Dept. of Civil and Environ[...]. "[...] criterion is usually the failure probability. In this paper, we examine the buffered failure probability as an attractive alternative to the failure probability."
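For readers unfamiliar with the quantity, the sketch below estimates the buffered failure probability from Monte Carlo samples via its superquantile characterization: taking failure as g(X) >= 0, bPOF = 1 - alpha*, where the mean of the worst 1 - alpha* fraction of samples equals zero. The Gaussian limit state is a toy assumption.

```python
import numpy as np

rng = np.random.default_rng(9)

# Monte Carlo samples of a limit-state function g(X); failure when g >= 0.
g = rng.normal(loc=-2.0, scale=1.0, size=100_000)

# Ordinary failure probability: the fraction of failing samples.
pof = np.mean(g >= 0)

# Buffered failure probability: 1 - alpha*, where the superquantile
# (the mean of the worst 1 - alpha* fraction of samples) equals zero.
tail = np.sort(g)[::-1]                                 # worst outcomes first
tail_means = np.cumsum(tail) / np.arange(1, len(tail) + 1)
bpof = np.count_nonzero(tail_means >= 0.0) / len(tail)  # tail means decrease
print(f"POF = {pof:.4f}, buffered POF = {bpof:.4f}")
```

By construction the buffered value is at least as large as the ordinary failure probability, which is what makes it a conservative and, notably, convexity-friendly criterion in design optimization.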
Characterisation of indomethacin and nifedipine using variable-temperature solid-state NMR.
Apperley, David C; Forster, Angus H; Fournier, Romain; Harris, Robin K; Hodgkinson, Paul; Lancaster, Robert W; Rades, Thomas
2005-11-01
We have characterised the stable polymorphic forms of two drug molecules, indomethacin (1) and nifedipine (2), by 13C CPMAS NMR and the resonances have been assigned. The signal for the C-Cl carbon of indomethacin has been studied as a function of applied magnetic field, and the observed bandshapes have been simulated. Variable-temperature 1H relaxation measurements of static samples have revealed a T1ρ minimum for indomethacin at 17.8 °C. The associated activation energy is 38 kJ mol(-1). The relevant motion is probably an internal rotation and it is suggested that this involves the C-OCH3 group. Since the two drug compounds are potential candidates for formulation in the amorphous state, we have examined quench-cooled melts in detail by variable-temperature 13C and 1H NMR. There is a change in slope for T1(H) and T1ρ(H) at the glass transition temperature (Tg) for indomethacin, but this occurs a few degrees below Tg for nifedipine, which is perhaps relevant to the lower real-time stability of the amorphous form for the latter compound. Comparison of relaxation time data for the crystalline and amorphous forms of each compound reveals a greater difference for nifedipine than for indomethacin, which again probably relates to real-time stabilities. Recrystallisation of the two drugs has been followed by proton bandshape measurements at higher temperatures. It is shown that, under the conditions of the experiments, recrystallisation of nifedipine can be detected already at 70 °C, whereas this does not occur until 110 °C for indomethacin. The effect of crushing the amorphous samples has been studied by 13C NMR; nifedipine recrystallises but indomethacin does not. The results were supported by DSC, powder XRD, FTIR and solution-state NMR measurements. Copyright (c) 2005 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Jirásek, Jakub; Dolníček, Zdeněk; Matýsek, Dalibor; Urubek, Tomáš
2017-04-01
Barite is a relatively uncommon phase in vein and amygdule mineralizations hosted by igneous rocks of the teschenite association in the Silesian Unit (Western Carpathians). In macroscopically observable sizes, it has been reported from 10 sites situated only in the Czech part of the Silesian Unit. Microscopic barite produced by hydrothermal alteration of the rock matrix and by supergene processes is more abundant. We examined four samples of barite by mineralogical and geochemical methods. Electron microprobe analyses proved pure barites with up to 0.038 apfu Sr and without remarkable internal zonation. Fluid inclusion and sulphur isotope data suggest that multiple sources of fluid components were involved during barite crystallization. Barite contains primary and secondary aqueous all-liquid (L) or, less frequently, two-phase (L+V) aqueous fluid inclusions with variable salinity (0.4-2.9 wt. % NaCl eq.) and homogenization temperatures between 77 and 152 °C. The higher-salinity fluid endmember was probably Cretaceous seawater and the lower-salinity one was probably diagenetic water derived from the surrounding flysch sediments during compaction and thermal alteration of clay minerals. The δ34S values of the barite samples range between -1.0 ‰ and +16.4 ‰ CDT, suggesting participation of two sources of sulphate: one with near-zero δ34S values probably derived from the wall rocks, and another with high δ34S values most probably representing sulphate from Cretaceous seawater. All results underline the role of externally derived fluids during post-magmatic alteration of rock bodies of the teschenite association.
Reconstructing the equilibrium Boltzmann distribution from well-tempered metadynamics.
Bonomi, M; Barducci, A; Parrinello, M
2009-08-01
Metadynamics is a widely used and successful method for reconstructing the free-energy surface of complex systems as a function of a small number of suitably chosen collective variables. This is achieved by biasing the dynamics of the system. The bias acting on the collective variables distorts the probability distribution of the other variables. Here we present a simple reweighting algorithm for recovering the unbiased probability distribution of any variable from a well-tempered metadynamics simulation. We show the efficiency of the reweighting procedure by reconstructing the distribution of the four backbone dihedral angles of alanine dipeptide from two and even one dimensional metadynamics simulation. 2009 Wiley Periodicals, Inc.
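The reweighting idea lends itself to a compact implementation. Below is a minimal Python sketch, assuming you already have, for each trajectory frame, the value of the metadynamics bias evaluated at that frame's collective variables; frames are then weighted by exp(V/kT). The full algorithm of the paper also tracks a time-dependent offset c(t) of the evolving well-tempered bias, which this simplified snapshot reweighting omits.

```python
import numpy as np

def reweighted_histogram(obs, bias, kT, bins=50):
    """Unbiased histogram of 'obs' from a biased (metadynamics) run.

    obs  : per-frame values of the variable whose distribution we want
    bias : per-frame bias potential V(s(t)) in the same units as kT

    Simplified sketch: frames are weighted by exp(V/kT); the full
    well-tempered algorithm also evolves a time-dependent offset c(t),
    omitted here.
    """
    w = np.exp((bias - bias.max()) / kT)  # shift by max for numerical stability
    hist, edges = np.histogram(obs, bins=bins, weights=w, density=True)
    return hist, edges
```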
Scorrano, Gabriele; Lelli, Roberta; Martínez-Labarga, Cristina; Scano, Giuseppina; Contini, Irene; Hafez, Hani S; Rudan, Pavao; Rickards, Olga
2016-01-01
The most abundant of the collagen protein family, type I collagen is encoded by the COL1A2 gene. The COL1A2 restriction fragment length polymorphisms (RFLPs) EcoRI, RsaI and MspI in samples from several different central-eastern Mediterranean populations were analysed and found to be potentially informative anthropogenetic markers. The objective was to define the genetic variability of COL1A2 in the central-eastern Mediterranean and to shed light on its genetic distribution in human groups over a wide geographic area. PCR-RFLP analysis of EcoRI, RsaI and MspI polymorphisms of the COL1A2 gene was performed on oral swab and blood samples from 308 individuals from the central-eastern Mediterranean Basin. The genetic similarities among these groups and other populations described in the literature were investigated through correspondence analysis. Single-marker data and haplotype frequencies seemed to suggest a genetic homogeneity within the European populations, whereas a certain degree of differentiation was noted for the Egyptians and the Turks. The genetic variability in the central-eastern Mediterranean area is probably a result of the geographical barrier of the Mediterranean Sea, which separated European and African populations over time.
Integrating count and detection–nondetection data to model population dynamics
Zipkin, Elise F.; Rossman, Sam; Yackulic, Charles B.; Wiens, David; Thorson, James T.; Davis, Raymond J.; Grant, Evan H. Campbell
2017-01-01
There is increasing need for methods that integrate multiple data types into a single analytical framework as the spatial and temporal scale of ecological research expands. Current work on this topic primarily focuses on combining capture–recapture data from marked individuals with other data types into integrated population models. Yet, studies of species distributions and trends often rely on data from unmarked individuals across broad scales where local abundance and environmental variables may vary. We present a modeling framework for integrating detection–nondetection and count data into a single analysis to estimate population dynamics, abundance, and individual detection probabilities during sampling. Our dynamic population model assumes that site-specific abundance can change over time according to survival of individuals and gains through reproduction and immigration. The observation process for each data type is modeled by assuming that every individual present at a site has an equal probability of being detected during sampling processes. We examine our modeling approach through a series of simulations illustrating the relative value of count vs. detection–nondetection data under a variety of parameter values and survey configurations. We also provide an empirical example of the model by combining long-term detection–nondetection data (1995–2014) with newly collected count data (2015–2016) from a growing population of Barred Owl (Strix varia) in the Pacific Northwest to examine the factors influencing population abundance over time. Our model provides a foundation for incorporating unmarked data within a single framework, even in cases where sampling processes yield different detection probabilities. This approach will be useful for survey design and to researchers interested in incorporating historical or citizen science data into analyses focused on understanding how demographic rates drive population abundance.
Diffusion Processes Satisfying a Conservation Law Constraint
Bakosi, J.; Ristorcelli, J. R.
2014-03-04
We investigate coupled stochastic differential equations governing N non-negative continuous random variables that satisfy a conservation principle. In various fields a conservation law requires that a set of fluctuating variables be non-negative and (if appropriately normalized) sum to one. As a result, any stochastic differential equation model to be realizable must not produce events outside of the allowed sample space. We develop a set of constraints on the drift and diffusion terms of such stochastic models to ensure that both the non-negativity and the unit-sum conservation law constraint are satisfied as the variables evolve in time. We investigate the consequences of the developed constraints on the Fokker-Planck equation, the associated system of stochastic differential equations, and the evolution equations of the first four moments of the probability density function. We show that random variables, satisfying a conservation law constraint, represented by stochastic diffusion processes, must have diffusion terms that are coupled and nonlinear. The set of constraints developed enables the development of statistical representations of fluctuating variables satisfying a conservation law. We exemplify the results with the bivariate beta process and the multivariate Wright-Fisher, Dirichlet, and Lochner’s generalized Dirichlet processes.
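As an illustration of such a constrained diffusion, here is a minimal Euler-Maruyama sketch of the one-dimensional Wright-Fisher process mentioned above: with Y = 1 - X, the pair is non-negative and sums to one, and the diffusion coefficient sqrt(X(1-X)) vanishes at both boundaries, exactly the coupled nonlinear structure the realizability constraints require. The drift parameters a and b are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def wright_fisher_step(x, a, b, dt):
    """One Euler-Maruyama step of dX = (a(1-X) - bX) dt + sqrt(X(1-X)) dW.

    With Y = 1 - X the pair (X, Y) is non-negative and sums to one; the
    diffusion term vanishes at both boundaries. Clipping only mops up
    time-discretization overshoot, not a deficiency of the process itself.
    """
    drift = a * (1.0 - x) - b * x
    diffusion = np.sqrt(max(x * (1.0 - x), 0.0))
    x_next = x + drift * dt + diffusion * np.sqrt(dt) * rng.standard_normal()
    return min(max(x_next, 0.0), 1.0)

x = 0.5
for _ in range(10_000):
    x = wright_fisher_step(x, a=1.5, b=2.0, dt=1e-3)
```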
Immigration, stress, and depressive symptoms in a Mexican-American community.
Golding, J M; Burnam, M A
1990-03-01
This study assessed levels of depressive symptomatology in a household probability sample of Mexico-born (N = 706) and U.S.-born (N = 538) Mexican Americans. We hypothesized that immigration status differences in acculturation, strain, social resources, and social conflict, as well as differences in the associations of these variables with depression, would account for differences in depression between U.S.-born and Mexico-born respondents. U.S.-born Mexican Americans had higher depression scores than those born in Mexico. When cultural and social psychological variables were controlled in a multiple regression analysis, the immigrant status difference persisted. Tests of interaction terms suggested greater vulnerability to the effects of low acculturation and low educational attainment among the U.S.-born relative to those born in Mexico; however, the immigrant status difference persisted after controlling for these interactions. Unmeasured variables such as selective migration of persons with better coping skills, selective return of depressed immigrants, or generational differences in social comparison processes may account for the immigration status difference.
Design tradeoffs in long-term research for stream salamanders
Brand, Adrianne B.; Grant, Evan H. Campbell
2017-01-01
Long-term research programs can benefit from early and periodic evaluation of their ability to meet stated objectives. In particular, consideration of the spatial allocation of effort is key. We sampled 4 species of stream salamanders intensively for 2 years (2010–2011) in the Chesapeake and Ohio Canal National Historical Park, Maryland, USA to evaluate alternative distributions of sampling locations within stream networks, and then evaluated via simulation the ability of multiple survey designs to detect declines in occupancy and to estimate dynamic parameters (colonization, extinction) over 5 years for 2 species. We expected that fine-scale microhabitat variables (e.g., cobble, detritus) would be the strongest determinants of occupancy for each of the 4 species; however, we found greater support for all species for models including variables describing position within the stream network, stream size, or stream microhabitat. A monitoring design focused on headwater sections had greater power to detect changes in occupancy and the dynamic parameters in each of 3 scenarios for the dusky salamander (Desmognathus fuscus) and red salamander (Pseudotriton ruber). Results for transect length were more variable, but across all species and scenarios, 25-m transects are most suitable as a balance between maximizing detection probability and describing colonization and extinction. These results inform sampling design and provide a general framework for setting appropriate goals, effort, and duration in the initial planning stages of research programs on stream salamanders in the eastern United States.
Geochronology and geochemistry of lavas from the 1996 North Gorda Ridge eruption
NASA Astrophysics Data System (ADS)
Rubin, K. H.; Smith, M. C.; Perfit, M. R.; Christie, D. M.; Sacks, L. F.
1998-12-01
Radiometric dating of three North Gorda Ridge lavas by the 210Po-210Pb method confirms that an eruption occurred during a period of increased seismic activity along the ridge during late February/early March 1996. These lavas were collected following detection of enhanced T-phase seismicity, and subsequent ocean bottom photographs documented the existence of a large pillow mound of fresh-appearing lavas. 210Po-210Pb dating of these lavas indicates that an eruption coinciding with this seismicity did occur (within analytical error) and that follow-up efforts to sample the recent lava flows were successful. Compositions of the three confirmed young lavas and eleven other samples of this contiguous "new flow" sequence are distinct from older lavas from this area but are variable at a level outside analytical uncertainty. These intraflow variations cannot easily be related to a single, common parent magma. Compositional variability within the new flow is compared to that of other recently documented individual flow sequences, and this comparison reveals a strong positive correlation of compositional variance with flow volumes spanning a range of >2 orders of magnitude. The geochemical heterogeneity in the North Gorda new flow probably reflects incomplete mixing of magmas generated from a heterogeneous mantle source or from slightly different melting conditions of a single source. The compositional variability, range in sample ages (up to 6 weeks) and range in active seismicity (4 weeks) imply that this relatively large flow was erupted over an interval of several weeks.
Bogaert, Anthony F; Cairney, John
2004-01-01
A birth order and sexual orientation relationship has been demonstrated numerous times in men, but a related variable, parental age (i.e. age of parents when the participant was born), has been less studied and has demonstrated contradictory results. In this research, the relations among birth order, parental age and sexual orientation were examined in a national probability sample of the US (Kessler, 1994; Kessler et al., 1994) and in a Canadian sample of homosexual and heterosexual men closely matched on demographic characteristics (Blanchard & Bogaert, 1996a). In both studies, an interaction between birth order and parental age was observed in men, such that there was positive association between number of older siblings and the likelihood of homosexuality, but this association weakened with increasing parental age. No significant effects were observed for women. The results are discussed in relation to recent theories of the birth order/sexual orientation relationship.
Wohlsen, T; Bates, J; Vesey, G; Robinson, W A; Katouli, M
2006-04-01
To use BioBall cultures as a precise reference standard to evaluate methods for enumeration of Escherichia coli and other coliform bacteria in water samples. Eight methods were evaluated, including membrane filtration, standard plate count (pour and spread plate methods), defined substrate technology methods (Colilert and Colisure), the most probable number method and the Petrifilm disposable plate method. Escherichia coli and Enterobacter aerogenes BioBall cultures containing 30 organisms each were used. All tests were performed using 10 replicates. The mean recovery of both bacteria varied with the different methods employed. The best and most consistent results were obtained with Petrifilm and the pour plate method. Other methods either yielded a low recovery or showed significantly high variability between replicates. The BioBall is a very suitable quality control tool for evaluating the efficiency of methods for bacterial enumeration in water samples.
Under-sampling trajectory design for compressed sensing based DCE-MRI.
Liu, Duan-duan; Liang, Dong; Zhang, Na; Liu, Xin; Zhang, Yuan-ting
2013-01-01
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) needs high temporal and spatial resolution to accurately estimate quantitative parameters and characterize tumor vasculature. Compressed Sensing (CS) has the potential to achieve both. However, the randomness in a CS under-sampling trajectory designed using the traditional variable density (VD) scheme may translate into uncertainty in kinetic parameter estimation when high reduction factors are used. Therefore, accurate parameter estimation using the VD scheme usually needs multiple adjustments of the Probability Density Function (PDF) parameters, and multiple reconstructions even with a fixed PDF, which is inapplicable for DCE-MRI. In this paper, an under-sampling trajectory design which is robust both to changes in the PDF parameters and to the randomness under a fixed PDF is studied. The strategy is to adaptively segment k-space into low- and high-frequency domains, and to apply the VD scheme only in the high-frequency domain. Simulation results demonstrate high accuracy and robustness compared with the VD design.
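A hedged sketch of the underlying idea, not the authors' algorithm: keep a deterministic, fully sampled low-frequency core and spend the remaining sampling budget with a variable-density PDF over the high frequencies. All parameter names (accel, center_frac, decay) are hypothetical.

```python
import numpy as np

def vd_mask(n, accel=4, center_frac=0.08, decay=2.0, seed=0):
    """1D phase-encode under-sampling mask: fully sampled low-frequency
    core plus variable-density random sampling of the high frequencies."""
    rng = np.random.default_rng(seed)
    k = np.abs(np.arange(n) - n // 2) / (n / 2)    # normalized |k|, 0 at center
    mask = k <= center_frac                        # deterministic low-freq core
    pdf = np.clip(1.0 - k, 0.0, None) ** decay     # density decays with |k|
    pdf[mask] = 0.0                                # core already sampled
    budget = max(n // accel - int(mask.sum()), 0)  # random lines remaining
    cand = np.flatnonzero(pdf)
    p = pdf[cand] / pdf[cand].sum()
    idx = rng.choice(cand, size=min(budget, cand.size), replace=False, p=p)
    mask[idx] = True
    return mask
```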
Deformation of the Batestown till of the Lake Michigan lobe, Laurentide ice sheet
Thomason, J.F.; Iverson, N.R.
2009-01-01
Deep, pervasive shear deformation of the bed to high strains (>100) may have been primarily responsible for flow and sediment transport of the Lake Michigan lobe of the Laurentide ice sheet. To test this hypothesis, we sampled at 0.2 m increments a basal till from one advance of the lobe (Batestown till) along vertical profiles and measured fabrics due to both anisotropy of magnetic susceptibility and sand-grain preferred orientation. Unlike past fabric studies, interpretations were guided by results of laboratory experiments in which this till was deformed in simple shear to high strains. Fabric strengths indicate that more than half of the till sampled has a <5% probability of having been sheared to moderate strains (7-30). Secular changes in fabric azimuth over the thickness of the till, probably due to changing ice-flow direction as the lobe receded, indicate that the bed accreted with time and that the depth of deformation of the bed did not exceed a few decimeters. Orientations of principal magnetic susceptibilities show that the state of strain was commonly complex, deviating from bed-parallel simple shear. Deformation is inferred to have been focused in shallow, temporally variable patches during till deposition from ice.
Systematic review and consensus guidelines for environmental sampling of Burkholderia pseudomallei.
Limmathurotsakul, Direk; Dance, David A B; Wuthiekanun, Vanaporn; Kaestli, Mirjam; Mayo, Mark; Warner, Jeffrey; Wagner, David M; Tuanyok, Apichai; Wertheim, Heiman; Yoke Cheng, Tan; Mukhopadhyay, Chiranjay; Puthucheary, Savithiri; Day, Nicholas P J; Steinmetz, Ivo; Currie, Bart J; Peacock, Sharon J
2013-01-01
Burkholderia pseudomallei, a Tier 1 Select Agent and the cause of melioidosis, is a Gram-negative bacillus present in the environment in many tropical countries. Defining the global pattern of B. pseudomallei distribution underpins efforts to prevent infection, and is dependent upon robust environmental sampling methodology. Our objective was to review the literature on the detection of environmental B. pseudomallei, update the risk map for melioidosis, and propose international consensus guidelines for soil sampling. An international working party (Detection of Environmental Burkholderia pseudomallei Working Party (DEBWorP)) was formed during the VIth World Melioidosis Congress in 2010. PubMed (January 1912 to December 2011) was searched using the following MeSH terms: pseudomallei or melioidosis. Bibliographies were hand-searched for secondary references. The reported geographical distribution of B. pseudomallei in the environment was mapped and categorized as definite, probable, or possible. The methodology used for detecting environmental B. pseudomallei was extracted and collated. We found that global coverage was patchy, with a lack of studies in many areas where melioidosis is suspected to occur. The sampling strategies and bacterial identification methods used were highly variable, and not all were robust. We developed consensus guidelines with the goals of reducing the probability of false-negative results, and the provision of affordable and 'low-tech' methodology that is applicable in both developed and developing countries. The proposed consensus guidelines provide the basis for the development of an accurate and comprehensive global map of environmental B. pseudomallei.
Archaeal β diversity patterns under the seafloor along geochemical gradients
NASA Astrophysics Data System (ADS)
Koyano, Hitoshi; Tsubouchi, Taishi; Kishino, Hirohisa; Akutsu, Tatsuya
2014-09-01
Recently, deep drilling into the seafloor has revealed that there are vast sedimentary ecosystems of diverse microorganisms, particularly archaea, in subsurface areas. We investigated the β diversity patterns of archaeal communities in sediment layers under the seafloor and their determinants. This study was accomplished by analyzing large environmental samples of 16S ribosomal RNA gene sequences and various geochemical data collected from a sediment core of 365.3 m, obtained by drilling into the seafloor off the east coast of the Shimokita Peninsula. To extract the maximum amount of information from these environmental samples, we first developed a method for measuring β diversity using sequence data by applying probability theory on a set of strings developed by two of the authors in a previous publication. We introduced an index of β diversity between sequence populations from which the sequence data were sampled. We then constructed an estimator of the β diversity index based on the sequence data and demonstrated that it converges to the β diversity index between sequence populations with probability of 1 as the number of sampled sequences increases. Next, we applied this new method to quantify β diversities between archaeal sequence populations under the seafloor and constructed a quantitative model of the estimated β diversity patterns. Nearly 90% of the variation in the archaeal β diversity was explained by a model that included as variables the differences in the abundances of chlorine, iodine, and carbon between the sediment layers.
Chao, Li-Wei; Szrek, Helena; Peltzer, Karl; Ramlagan, Shandir; Fleming, Peter; Leite, Rui; Magerman, Jesswill; Ngwenya, Godfrey B.; Pereira, Nuno Sousa; Behrman, Jere
2011-01-01
Finding an efficient method for sampling micro- and small-enterprises (MSEs) for research and statistical reporting purposes is a challenge in developing countries, where registries of MSEs are often nonexistent or outdated. This lack of a sampling frame creates an obstacle in finding a representative sample of MSEs. This study uses computer simulations to draw samples from a census of businesses and non-businesses in the Tshwane Municipality of South Africa, using three different sampling methods: the traditional probability sampling method, the compact segment sampling method, and the World Health Organization’s Expanded Programme on Immunization (EPI) sampling method. Three mechanisms by which the methods could differ are tested, the proximity selection of respondents, the at-home selection of respondents, and the use of inaccurate probability weights. The results highlight the importance of revisits and accurate probability weights, but the lesser effect of proximity selection on the samples’ statistical properties. PMID:22582004
Methodology Series Module 5: Sampling Strategies.
Setia, Maninder Singh
2016-01-01
Once the research question and the research design have been finalised, it is important to select the appropriate sample for the study. The method by which the researcher selects the sample is the 'Sampling Method'. There are essentially two types of sampling methods: 1) probability sampling - based on chance events (such as random numbers, flipping a coin etc.); and 2) non-probability sampling - based on the researcher's choice, such as a population that is accessible and available. Some of the non-probability sampling methods are: purposive sampling, convenience sampling, or quota sampling. Random sampling methods (such as a simple random sample or a stratified random sample) are forms of probability sampling. It is important to understand the different sampling methods used in clinical studies and to mention the method clearly in the manuscript. The researcher should not misrepresent the sampling method in the manuscript (such as using the term 'random sample' when the researcher has used a convenience sample). The sampling method will depend on the research question. For instance, the researcher may want to understand an issue in greater detail for one particular population rather than worry about the 'generalizability' of the results. In such a scenario, the researcher may want to use 'purposive sampling' for the study.
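The two probability-sampling designs named above can be contrasted in a few lines of Python; the 'clinic' stratum label is a hypothetical example.

```python
import random

population = [{"id": i, "clinic": "A" if i % 3 else "B"} for i in range(900)]

# Probability sampling 1: simple random sample of 90 subjects.
srs = random.sample(population, k=90)

# Probability sampling 2: stratified random sample; draw 10% from each
# clinic so every stratum is represented in proportion to its size.
strata = {}
for person in population:
    strata.setdefault(person["clinic"], []).append(person)
stratified = [p for group in strata.values()
              for p in random.sample(group, k=len(group) // 10)]
```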
Risk markers for disappearance of pediatric Web resources
Hernández-Borges, Angel A.; Jiménez-Sosa, Alejandro; Torres-Álvarez de Arcaya, Maria L.; Macías-Cervi, Pablo; Gaspar-Guardado, Maria A.; Ruíz-Rabaza, Ana
2005-01-01
Objectives: The authors sought to find out whether certain Webometric indexes of a sample of pediatric Web resources, and some tests based on them, could be helpful predictors of their disappearance. Methods: The authors performed a retrospective study of a sample of 363 pediatric Websites and pages they had followed for 4 years. Main measurements included: number of resources that disappeared, number of inbound links and their annual increment, average daily visits to the resources in the sample, sample compliance with the quality criteria of 3 international organizations, and online time of the Web resources. Results: On average, 11% of the sample disappeared annually. However, 13% of these were available again at the end of follow up. Disappearing and surviving Websites did not show differences in the variables studied. However, surviving Web pages had a higher number of inbound links and higher annual increment in inbound links. Similarly, Web pages that survived showed higher compliance with recognized sets of quality criteria than those that disappeared. A subset of 14 quality criteria whose compliance accounted for 90% of the probability of online permanence was identified. Finally, a progressive increment of inbound links was found to be a marker of good prognosis, showing high specificity and positive predictive value (88% and 94%, respectively). Conclusions: The number of inbound links and annual increment of inbound links could be useful markers of the permanence probability for pediatric Web pages. Strategies that assure the Web editors' awareness of their Web resources' popularity could stimulate them to improve the quality of their Websites. PMID:16059427
Determinants of the use of specialist mental health services by nursing home residents.
Shea, D G; Streit, A; Smyer, M A
1994-01-01
OBJECTIVE. This study examines the effects of resident and facility characteristics on the probability of nursing home residents receiving treatment by mental health professionals. DATA SOURCES/STUDY SETTING. The study uses data from the Institutional Population Component of the 1987 National Medical Expenditure Survey, a secondary data source containing data on 3,350 nursing home residents living in 810 nursing homes as of January 1, 1987. STUDY DESIGN. Andersen's health services use model (1968) is used to estimate a multivariate logistic equation for the effects of independent variables on the probability that a resident has received services from mental health professionals. Important variables include resident race, sex, and age; presence of several behaviors and reported mental illnesses; and facility ownership, facility size, and facility certification. DATA COLLECTION/EXTRACTION METHODS. Data on 188 residents were excluded from the sample because information was missing on several important variables. For some additional variables residents who had missing information were coded as negative responses. This left 3,162 observations for analysis in the logistic regressions. PRINCIPAL FINDINGS. Older residents and residents with more ADL limitations are much less likely than other residents to have received treatment from a mental health professional. Residents with reported depression, schizophrenia, or psychoses, and residents who are agitated or hallucinating are more likely to have received treatment. Residents in government nursing homes, homes run by chains, and homes with low levels of certification are less likely to have received treatment. CONCLUSIONS. Few residents receive treatment from mental health professionals despite need. Older, physically disabled residents need special attention. Care in certain types of facilities requires further study. New regulations mandating treatment for mentally ill residents will demand increased attention from nursing home administrators and mental health professionals. PMID:8005788
Modeling marbled murrelet (Brachyramphus marmoratus) habitat using LiDAR-derived canopy data
Hagar, Joan C.; Eskelson, Bianca N.I.; Haggerty, Patricia K.; Nelson, S. Kim; Vesely, David G.
2014-01-01
LiDAR (Light Detection And Ranging) is an emerging remote-sensing tool that can provide fine-scale data describing vertical complexity of vegetation relevant to species that are responsive to forest structure. We used LiDAR data to estimate occupancy probability for the federally threatened marbled murrelet (Brachyramphus marmoratus) in the Oregon Coast Range of the United States. Our goal was to address the need identified in the Recovery Plan for a more accurate estimate of the availability of nesting habitat by developing occupancy maps based on refined measures of nest-strand structure. We used murrelet occupancy data collected by the Bureau of Land Management Coos Bay District, and canopy metrics calculated from discrete return airborne LiDAR data, to fit a logistic regression model predicting the probability of occupancy. Our final model for stand-level occupancy included distance to coast, and 5 LiDAR-derived variables describing canopy structure. With an area under the curve value (AUC) of 0.74, this model had acceptable discrimination and fair agreement (Cohen's κ = 0.24), especially considering that all sites in our sample were regarded by managers as potential habitat. The LiDAR model provided better discrimination between occupied and unoccupied sites than did a model using variables derived from Gradient Nearest Neighbor maps that were previously reported as important predictors of murrelet occupancy (AUC = 0.64, κ = 0.12). We also evaluated LiDAR metrics at 11 known murrelet nest sites. Two LiDAR-derived variables accurately discriminated nest sites from random sites (average AUC = 0.91). LiDAR provided a means of quantifying 3-dimensional canopy structure with variables that are ecologically relevant to murrelet nesting habitat, and have not been as accurately quantified by other mensuration methods.
NASA Astrophysics Data System (ADS)
Guimarães Nobre, Gabriela; Arnbjerg-Nielsen, Karsten; Rosbjerg, Dan; Madsen, Henrik
2016-04-01
Traditionally, flood risk assessment studies have been carried out from a univariate frequency analysis perspective. However, statistical dependence between hydrological variables, such as extreme rainfall and extreme sea surge, is plausible, since both variables are to some extent driven by common meteorological conditions. Multivariate statistical techniques have the potential to overcome this limitation by combining different sources of flooding in the investigation. The aim of this study was to apply a range of statistical methodologies for analyzing combined extreme hydrological variables that can lead to coastal and urban flooding. The study area is the Elwood Catchment, a highly urbanized catchment located in the city of Port Phillip, Melbourne, Australia. The first part of the investigation dealt with the marginal extreme value distributions. Two approaches to extracting extreme value series were applied (Annual Maximum and Partial Duration Series), and different probability distribution functions were fit to the observed samples. Results obtained by using the Generalized Pareto distribution demonstrate the ability of the Pareto family to model the extreme events. Advancing into multivariate extreme value analysis, we first investigated the asymptotic properties of the extremal dependence. As a weak positive asymptotic dependence between the bivariate extreme pairs was found, the Conditional method proposed by Heffernan and Tawn (2004) was chosen. This approach is suitable for modelling bivariate extreme values which are relatively unlikely to occur together. The results show that the probability of an extreme sea surge occurring during an extreme one-hour precipitation event (or vice versa) can be twice as great as it would be if the events were independent. Therefore, presuming independence between these two variables would result in severe underestimation of the flooding risk in the study area.
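For the marginal analysis, fitting the Generalized Pareto distribution to Partial Duration Series exceedances is straightforward with scipy; this is a sketch under the assumption of independent exceedances (declustering omitted).

```python
import numpy as np
from scipy import stats

def fit_pot_gpd(series, threshold):
    """Fit a Generalized Pareto distribution to the Partial Duration Series
    exceedances over 'threshold' (a sketch; declustering of dependent
    exceedances is omitted)."""
    exceedances = series[series > threshold] - threshold
    shape, _, scale = stats.genpareto.fit(exceedances, floc=0.0)
    return shape, scale, exceedances.size

# Return level for the 1-in-T event, given 'rate' exceedances per year:
# level = threshold + stats.genpareto.ppf(1 - 1/(T*rate), shape, 0, scale)
```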
Dimitrov, S; Detroyer, A; Piroird, C; Gomes, C; Eilstein, J; Pauloin, T; Kuseva, C; Ivanova, H; Popova, I; Karakolev, Y; Ringeissen, S; Mekenyan, O
2016-12-01
When searching for alternative methods to animal testing, confidently rescaling an in vitro result to the corresponding in vivo classification is still a challenging problem. Although one of the most important factors affecting good correlation is sample characteristics, they are very rarely integrated into correlation studies. Usually, in these studies, it is implicitly assumed that both compared values are error-free numbers, which they are not. In this work, we propose a general methodology to analyze and integrate data variability and thus confidence estimation when rescaling from one test to another. The methodology is demonstrated through the case study of rescaling the in vitro Direct Peptide Reactivity Assay (DPRA) reactivity to the in vivo Local Lymph Node Assay (LLNA) skin sensitization potency classifications. In a first step, a comprehensive statistical analysis evaluating the reliability and variability of LLNA and DPRA as such was done. These results allowed us to link the concept of gray zones and confidence probability, which in turn represents a new perspective for a more precise knowledge of the classification of chemicals within their in vivo OR in vitro test. Next, the novelty and practical value of our methodology introducing variability into the threshold optimization between the in vitro AND in vivo test resides in the fact that it attributes a confidence probability to the predicted classification. The methodology, classification and screening approach presented in this study are not restricted to skin sensitization only. They could be helpful also for fate, toxicity and health hazard assessment where plenty of in vitro and in chemico assays and/or QSARs models are available. Copyright © 2016 John Wiley & Sons, Ltd.
Inferring probabilistic stellar rotation periods using Gaussian processes
NASA Astrophysics Data System (ADS)
Angus, Ruth; Morton, Timothy; Aigrain, Suzanne; Foreman-Mackey, Daniel; Rajpaul, Vinesh
2018-02-01
Variability in the light curves of spotted, rotating stars is often non-sinusoidal and quasi-periodic - spots move on the stellar surface and have finite lifetimes, causing stellar flux variations to slowly shift in phase. A strictly periodic sinusoid therefore cannot accurately model a rotationally modulated stellar light curve. Physical models of stellar surfaces have many drawbacks preventing effective inference, such as highly degenerate or high-dimensional parameter spaces. In this work, we test an appropriate effective model: a Gaussian Process with a quasi-periodic covariance kernel function. This highly flexible model allows sampling of the posterior probability density function of the periodic parameter, marginalizing over the other kernel hyperparameters using a Markov Chain Monte Carlo approach. To test the effectiveness of this method, we infer rotation periods from 333 simulated stellar light curves, demonstrating that the Gaussian process method produces periods that are more accurate than both a sine-fitting periodogram and an autocorrelation function method. We also demonstrate that it works well on real data, by inferring rotation periods for 275 Kepler stars with previously measured periods. We provide a table of rotation periods for these and many more, altogether 1102 Kepler objects of interest, and their posterior probability density function samples. Because this method delivers posterior probability density functions, it will enable hierarchical studies involving stellar rotation, particularly those involving population modelling, such as inferring stellar ages, obliquities in exoplanet systems, or characterizing star-planet interactions. The code used to implement this method is available online.
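The covariance function at the heart of this approach is easy to write down. A minimal sketch of a quasi-periodic kernel (a periodic term damped by a squared-exponential envelope) follows; the parameter names are illustrative, and the paper marginalizes over all hyperparameters except the period with MCMC.

```python
import numpy as np

def quasi_periodic_kernel(t1, t2, amp, l_evol, gamma, period):
    """Quasi-periodic covariance: an exponential-sine-squared periodic term
    damped by a squared-exponential envelope, so flux variations can drift
    in phase as spots evolve. Parameter names are illustrative."""
    dt = t1[:, None] - t2[None, :]
    periodic = np.exp(-gamma * np.sin(np.pi * dt / period) ** 2)
    envelope = np.exp(-0.5 * (dt / l_evol) ** 2)
    return amp ** 2 * periodic * envelope

# K = quasi_periodic_kernel(t, t, amp=1.0, l_evol=50.0, gamma=1.0, period=25.0)
# plus a white-noise diagonal before inverting in the GP likelihood.
```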
A Comparative Study of Involvement and Motivation among Casino Gamblers.
Lee, Choong-Ki; Lee, Bongkoo; Bernhard, Bo Jason; Lee, Tae Kyung
2009-09-01
The purpose of this paper is to investigate three different types of gamblers (which we label "non-problem", "some problem", and "probable pathological gamblers") to determine differences in involvement and motivation, as well as differences in demographic and behavioral variables. The analysis takes advantage of a unique opportunity to sample on-site at a major casino in South Korea, and the resulting purposive sample yielded 180 completed questionnaires in each of the three groups, for a total number of 540. Factor analysis, analysis of variance (ANOVA) and Duncan tests, and Chi-square tests are employed to analyze the data collected from the survey. Findings from ANOVA tests indicate that involvement factors of importance/self-expression, pleasure/interest, and centrality derived from the factor analysis were significantly different among these three types of gamblers. The "probable pathological" and "some problem" gamblers were found to have similar degrees of involvement, and higher degrees of involvement than the non-problem gamblers. The tests also reveal that motivational factors of escape, socialization, winning, and exploring scenery were significantly different among these three types of gamblers. When looking at motivations to visit the casino, "probable pathological" gamblers were more likely to seek winning, the "some problem" group appeared to be more likely to seek escape, and the "non-problem" gamblers indicate that their motivations to visit centered around explorations of scenery and culture in the surrounding casino area. The tools for exploring motivations and involvements of gambling provide valuable and discerning information about the entire spectrum of gamblers.
Two-step estimation in ratio-of-mediator-probability weighted causal mediation analysis.
Bein, Edward; Deutsch, Jonah; Hong, Guanglei; Porter, Kristin E; Qin, Xu; Yang, Cheng
2018-04-15
This study investigates appropriate estimation of estimator variability in the context of causal mediation analysis that employs propensity score-based weighting. Such an analysis decomposes the total effect of a treatment on the outcome into an indirect effect transmitted through a focal mediator and a direct effect bypassing the mediator. Ratio-of-mediator-probability weighting estimates these causal effects by adjusting for the confounding impact of a large number of pretreatment covariates through propensity score-based weighting. In step 1, a propensity score model is estimated. In step 2, the causal effects of interest are estimated using weights derived from the prior step's regression coefficient estimates. Statistical inferences obtained from this 2-step estimation procedure are potentially problematic if the estimated standard errors of the causal effect estimates do not reflect the sampling uncertainty in the estimation of the weights. This study extends to ratio-of-mediator-probability weighting analysis a solution to the 2-step estimation problem by stacking the score functions from both steps. We derive the asymptotic variance-covariance matrix for the indirect effect and direct effect 2-step estimators, provide simulation results, and illustrate with an application study. Our simulation results indicate that the sampling uncertainty in the estimated weights should not be ignored. The standard error estimation using the stacking procedure offers a viable alternative to bootstrap standard error estimation. We discuss broad implications of this approach for causal analysis involving propensity score-based weighting. Copyright © 2018 John Wiley & Sons, Ltd.
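As context for where the two steps sit, here is a hedged sketch of step 1 only: computing ratio-of-mediator-probability weights from two fitted mediator models, assuming a binary mediator. The paper's contribution, the stacked-score variance correction in step 2, is not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def rmpw_weights(X, treat, mediator):
    """Step 1 of ratio-of-mediator-probability weighting for a binary
    mediator: fit the mediator model within each treatment arm, then
    weight treated units by P(M | X, T=0) / P(M | X, T=1)."""
    m1 = LogisticRegression(max_iter=1000).fit(X[treat == 1], mediator[treat == 1])
    m0 = LogisticRegression(max_iter=1000).fit(X[treat == 0], mediator[treat == 0])
    p1 = np.where(mediator == 1, m1.predict_proba(X)[:, 1], m1.predict_proba(X)[:, 0])
    p0 = np.where(mediator == 1, m0.predict_proba(X)[:, 1], m0.predict_proba(X)[:, 0])
    w = np.ones(len(treat), dtype=float)
    w[treat == 1] = (p0 / p1)[treat == 1]   # treated units get the ratio weight
    return w
```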
Schulz, Amy J.; Mentz, Graciela; Lachance, Laurie; Zenk, Shannon N.; Johnson, Jonetta; Stokes, Carmen; Mandell, Rebecca
2013-01-01
Objective To examine contributions of observed and perceived neighborhood characteristics in explaining associations between neighborhood poverty and cumulative biological risk (CBR) in an urban community. Methods Multilevel regression analyses were conducted using cross-sectional data from a probability sample survey (n=919), and observational and census data. Dependent variable: CBR. Independent variables: Neighborhood disorder, deterioration and characteristics; perceived neighborhood social environment, physical environment, and neighborhood environment. Covariates: Neighborhood and individual demographics, health-related behaviors. Results Observed and perceived indicators of neighborhood conditions were significantly associated with CBR, after accounting for both neighborhood and individual level socioeconomic indicators. Observed and perceived neighborhood environmental conditions mediated associations between neighborhood poverty and CBR. Conclusions Findings were consistent with the hypothesis that neighborhood conditions associated with economic divestment mediate associations between neighborhood poverty and CBR. PMID:24100238
Mohajeri, Leila; Aziz, Hamidi Abdul; Isa, Mohamed Hasnain; Zahed, Mohammad Ali
2010-02-01
This work studied the bioremediation of weathered crude oil (WCO) in coastal sediment samples using central composite face centered design (CCFD) under response surface methodology (RSM). Initial oil concentration, biomass, nitrogen and phosphorus concentrations were used as independent variables (factors) and oil removal as the dependent variable (response) in a 60-day trial. A statistically significant model for WCO removal was obtained. The coefficient of determination (R(2)=0.9732) and probability value (P<0.0001) demonstrated significance for the regression model. Numerical optimization based on a desirability function was carried out for initial oil concentrations of 2, 16 and 30 g per kg sediment, and 83.13, 78.06 and 69.92 per cent removal was observed, respectively, compared to 77.13, 74.17 and 69.87 per cent removal for un-optimized results.
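The model form behind such an RSM analysis is a second-order polynomial in the factors. A minimal least-squares sketch follows (not the authors' CCFD software output); X is a runs-by-factors design matrix and y the measured response.

```python
import numpy as np

def fit_quadratic_rsm(X, y):
    """Least-squares fit of a second-order response surface: intercept,
    linear, pure quadratic, and two-factor interaction terms, the model
    form underlying a CCFD/RSM analysis."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i, k)]
    A = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta
```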
NASA Astrophysics Data System (ADS)
Alvarez, Diego A.; Uribe, Felipe; Hurtado, Jorge E.
2018-02-01
Random set theory is a general framework which comprises uncertainty in the form of probability boxes, possibility distributions, cumulative distribution functions, Dempster-Shafer structures or intervals; in addition, the dependence between the input variables can be expressed using copulas. In this paper, the lower and upper bounds on the probability of failure are calculated by means of random set theory. In order to accelerate the calculation, a well-known and efficient probability-based reliability method known as subset simulation is employed. This method is especially useful for finding small failure probabilities in both low- and high-dimensional spaces, disjoint failure domains and nonlinear limit state functions. The proposed methodology represents a drastic reduction of the computational labor implied by plain Monte Carlo simulation for problems defined with a mixture of representations for the input variables, while delivering similar results. Numerical examples illustrate the efficiency of the proposed approach.
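To make the lower/upper-bound idea concrete, here is a plain Monte Carlo sketch of propagating a probability box through a limit state function g (failure meaning g <= 0), the baseline that the paper accelerates with subset simulation. Evaluating g only at the interval endpoints assumes g is monotone over each interval; all names are illustrative.

```python
import numpy as np

def pbox_failure_bounds(g, ppf_lo, ppf_hi, n=100_000, seed=0):
    """Monte Carlo bounds on P(failure) when the input is a probability
    box given by two inverse CDFs. Each uniform draw maps to an interval
    [x_lo, x_hi] of the input; endpoint evaluation suffices only if g is
    monotone over the interval (the assumption made in this sketch)."""
    rng = np.random.default_rng(seed)
    alpha = rng.uniform(size=n)
    x_lo, x_hi = ppf_lo(alpha), ppf_hi(alpha)
    g_min = np.minimum(g(x_lo), g(x_hi))
    g_max = np.maximum(g(x_lo), g(x_hi))
    belief = np.mean(g_max <= 0.0)        # fails for every point in the set
    plausibility = np.mean(g_min <= 0.0)  # fails for at least one point
    return belief, plausibility
```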
Cytologic diagnosis: expression of probability by clinical pathologists.
Christopher, Mary M; Hotz, Christine S
2004-01-01
Clinical pathologists use descriptive terms or modifiers to express the probability or likelihood of a cytologic diagnosis. Words are imprecise in meaning, however, and may be used and interpreted differently by pathologists and clinicians. The goals of this study were to 1) assess the frequency of use of 18 modifiers, 2) determine the probability of a positive diagnosis implied by the modifiers, 3) identify preferred modifiers for different levels of probability, 4) ascertain the importance of factors that affect expression of diagnostic certainty, and 5) evaluate differences based on gender, employment, and experience. We surveyed 202 clinical pathologists who were board-certified by the American College of Veterinary Pathologists (Clinical Pathology). Surveys were distributed in October 2001 and returned by e-mail, fax, or surface mail over a 2-month period. Results were analyzed by parametric and nonparametric tests. Survey response rate was 47.5% (n = 96) and primarily included clinical pathologists at veterinary schools (n = 58) and diagnostic laboratories (n = 31). Eleven of 18 terms were used "often" or "sometimes" by ≥50% of respondents. Broad variability was found in the probability assigned to each term, especially those with median values of 75 to 90%. Preferred modifiers for 7 numerical probabilities ranging from 0 to 100% included 68 unique terms; however, a set of 10 terms was used by ≥50% of respondents. Cellularity and quality of the sample, experience of the pathologist, and implications of the diagnosis were the most important factors affecting the expression of probability. Because of wide discrepancy in the implied likelihood of a diagnosis using words, defined terminology and controlled vocabulary may be useful in improving communication and the quality of data in cytology reporting.
Baskerville, Jerry Ray; Herrick, John
2012-02-01
This study focuses on clinically assigned prospective estimated pretest probability and pretest perception of legal risk as independent variables in the ordering of multidetector computed tomographic (MDCT) head scans. Our primary aim is to measure the association between pretest probability of a significant finding and pretest perception of legal risk. Secondarily, we measure the percentage of MDCT scans that physicians would not order if there was no legal risk. This study is a prospective, cross-sectional, descriptive analysis of patients 18 years and older for whom emergency medicine physicians ordered a head MDCT. We collected a sample of 138 patients subjected to head MDCT scans. The prevalence of a significant finding in our population was 6%, yet the pretest probability expectation of a significant finding was 33%. The presumed legal risk was even more dramatic at 54%. These data support the hypothesis that physicians presume the legal risk to be significantly higher than the risk of a significant finding. A total of 21 patients (15%; 95% confidence interval, ±5.9%) would not have been subjected to MDCT if there was no legal risk. Physicians overestimated the probability that the computed tomographic scan would yield a significant result and indicated an even greater perceived medicolegal risk if the scan was not obtained. Physician test-ordering behavior is complex, and our study queries pertinent aspects of MDCT testing. The magnification of legal risk vs the pretest probability of a significant finding is demonstrated. Physicians significantly overestimated pretest probability of a significant finding on head MDCT scans and presumed legal risk. Copyright © 2012 Elsevier Inc. All rights reserved.
Fisher, Charles K; Mehta, Pankaj
2015-06-01
Feature selection, identifying a subset of variables that are relevant for predicting a response, is an important and challenging component of many methods in statistics and machine learning. Feature selection is especially difficult and computationally intensive when the number of variables approaches or exceeds the number of samples, as is often the case for many genomic datasets. Here, we introduce a new approach, the Bayesian Ising Approximation (BIA), to rapidly calculate posterior probabilities for feature relevance in L2 penalized linear regression. In the regime where the regression problem is strongly regularized by the prior, we show that computing the marginal posterior probabilities for features is equivalent to computing the magnetizations of an Ising model with weak couplings. Using a mean field approximation, we show it is possible to rapidly compute the feature selection path described by the posterior probabilities as a function of the L2 penalty. We present simulations and analytical results illustrating the accuracy of the BIA on some simple regression problems. Finally, we demonstrate the applicability of the BIA to high-dimensional regression by analyzing a gene expression dataset with nearly 30 000 features. These results also highlight the impact of correlations between features on Bayesian feature selection. An implementation of the BIA in C++, along with data for reproducing our gene expression analyses, are freely available at http://physics.bu.edu/~pankajm/BIACode. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Making great leaps forward: Accounting for detectability in herpetological field studies
Mazerolle, Marc J.; Bailey, Larissa L.; Kendall, William L.; Royle, J. Andrew; Converse, Sarah J.; Nichols, James D.
2007-01-01
Detecting individuals of amphibian and reptile species can be a daunting task. Detection can be hindered by various factors such as cryptic behavior, color patterns, or observer experience. These factors complicate the estimation of state variables of interest (e.g., abundance, occupancy, species richness) as well as the vital rates that induce changes in these state variables (e.g., survival probabilities for abundance; extinction probabilities for occupancy). Although ad hoc methods (e.g., counts uncorrected for detection, return rates) typically perform poorly in the face of imperfect detection, they continue to be used extensively in various fields, including herpetology. However, formal approaches that estimate and account for the probability of detection, such as capture-mark-recapture (CMR) methods and distance sampling, are available. In this paper, we present classical approaches and recent advances in methods accounting for detectability that are particularly pertinent for herpetological data sets. Through examples, we illustrate the use of several methods, discuss their performance compared to that of ad hoc methods, and we suggest available software to perform these analyses. The methods we discuss control for imperfect detection and reduce bias in estimates of demographic parameters such as population size, survival, or, at other levels of biological organization, species occurrence. Among these methods, recently developed approaches that no longer require marked or resighted individuals should be particularly of interest to field herpetologists. We hope that our effort will encourage practitioners to implement some of the estimation methods presented herein instead of relying on ad hoc methods that make more limiting assumptions.
Red-shouldered hawk occupancy surveys in central Minnesota, USA
Henneman, C.; McLeod, M.A.; Andersen, D.E.
2007-01-01
Forest-dwelling raptors are often difficult to detect because many species occur at low density or are secretive. Broadcasting conspecific vocalizations can increase the probability of detecting forest-dwelling raptors and has been shown to be an effective method for locating raptors and assessing their relative abundance. Recent advances in statistical techniques based on presence-absence data use probabilistic arguments to derive probability of detection when it is <1 and to provide a model and likelihood-based method for estimating proportion of sites occupied. We used these maximum-likelihood models with data from red-shouldered hawk (Buteo lineatus) call-broadcast surveys conducted in central Minnesota, USA, in 1994-1995 and 2004-2005. Our objectives were to obtain estimates of occupancy and detection probability 1) over multiple sampling seasons (yr), 2) incorporating within-season time-specific detection probabilities, 3) with call type and breeding stage included as covariates in models of probability of detection, and 4) with different sampling strategies. We visited individual survey locations 2-9 times per year, and estimates of both probability of detection (range = 0.28-0.54) and site occupancy (range = 0.81-0.97) varied among years. Detection probability was affected by inclusion of a within-season time-specific covariate, call type, and breeding stage. In 2004 and 2005 we used survey results to assess the effect that number of sample locations, double sampling, and discontinued sampling had on parameter estimates. We found that estimates of probability of detection and proportion of sites occupied were similar across different sampling strategies, and we suggest ways to reduce sampling effort in a monitoring program.
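The likelihood-based machinery referenced here is compact for the single-season case. A minimal sketch with constant occupancy (psi) and detection (p) follows; covariates such as call type or breeding stage are omitted for brevity.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, Y):
    """Single-season occupancy likelihood with constant occupancy (psi) and
    detection (p), both on the logit scale; Y is a sites-by-visits 0/1
    detection history matrix."""
    psi = 1.0 / (1.0 + np.exp(-params[0]))
    p = 1.0 / (1.0 + np.exp(-params[1]))
    d = Y.sum(axis=1)                     # detections per site
    K = Y.shape[1]                        # visits per site
    detected = psi * p**d * (1 - p)**(K - d)
    never = psi * (1 - p)**K + (1 - psi)  # occupied-but-missed or unoccupied
    return -np.where(d > 0, np.log(detected), np.log(never)).sum()

# fit = minimize(neg_log_lik, x0=[0.0, 0.0], args=(Y,))
```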
Accuracy and precision of Legionella isolation by US laboratories in the ELITE program pilot study.
Lucas, Claressa E; Taylor, Thomas H; Fields, Barry S
2011-10-01
A pilot study for the Environmental Legionella Isolation Techniques Evaluation (ELITE) Program, a proficiency testing scheme for US laboratories that culture Legionella from environmental samples, was conducted September 1, 2008 through March 31, 2009. Participants (n=20) processed panels consisting of six sample types: pure and mixed positive, pure and mixed negative, pure and mixed variable. The majority (93%) of all samples (n=286) were correctly characterized, with 88.5% of samples positive for Legionella and 100% of negative samples identified correctly. Variable samples were incorrectly identified as negative in 36.9% of reports. For all samples reported positive (n=128), participants underestimated the cfu/ml by a mean of 1.25 logs with standard deviation of 0.78 logs, standard error of 0.07 logs, and a range of 3.57 logs compared to the CDC re-test value. Centering results around the interlaboratory mean yielded a standard deviation of 0.65 logs, standard error of 0.06 logs, and a range of 3.22 logs. Sampling protocol, treatment regimen, culture procedure, and laboratory experience did not significantly affect the accuracy or precision of reported concentrations. Qualitative and quantitative results from the ELITE pilot study were similar to reports from a corresponding proficiency testing scheme available in the European Union, indicating these results are probably valid for most environmental laboratories worldwide. The large enumeration error observed suggests that the need for remediation of a water system should not be determined solely by the concentration of Legionella observed in a sample since that value is likely to underestimate the true level of contamination. Published by Elsevier Ltd.
What Are Probability Surveys used by the National Aquatic Resource Surveys?
The National Aquatic Resource Surveys (NARS) use probability-survey designs to assess the condition of the nation’s waters. In probability surveys (also known as sample-surveys or statistical surveys), sampling sites are selected randomly.
The Use of Variable Q1 Isolation Windows Improves Selectivity in LC-SWATH-MS Acquisition.
Zhang, Ying; Bilbao, Aivett; Bruderer, Tobias; Luban, Jeremy; Strambio-De-Castillia, Caterina; Lisacek, Frédérique; Hopfgartner, Gérard; Varesio, Emmanuel
2015-10-02
As tryptic peptides and metabolites are not equally distributed along the mass range, the probability of cross fragment ion interference is higher in certain windows when fixed Q1 SWATH windows are applied. We evaluated the benefits of utilizing variable Q1 SWATH windows with regards to selectivity improvement. Variable windows based on equalizing the distribution of either the precursor ion population (PIP) or the total ion current (TIC) within each window were generated by an in-house software, swathTUNER. These two variable Q1 SWATH window strategies outperformed, with respect to quantification and identification, the basic approach using a fixed window width (FIX) for proteomic profiling of human monocyte-derived dendritic cells (MDDCs). Thus, 13.8 and 8.4% additional peptide precursors, which resulted in 13.1 and 10.0% more proteins, were confidently identified by SWATH using the strategy PIP and TIC, respectively, in the MDDC proteomic sample. On the basis of the spectral library purity score, some improvement warranted by variable Q1 windows was also observed, albeit to a lesser extent, in the metabolomic profiling of human urine. We show that the novel concept of "scheduled SWATH" proposed here, which incorporates (i) variable isolation windows and (ii) precursor retention time segmentation further improves both peptide and metabolite identifications.
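A minimal sketch of the variable-window idea follows, assuming an equal-frequency split of precursor m/z values (akin to the PIP strategy); swathTUNER's actual algorithm is not reproduced, and the m/z distribution below is synthetic.

```python
# Hedged sketch: choose Q1 window boundaries so each window holds roughly
# the same share of the precursor ion population (equal-frequency split).
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical precursor m/z values, denser at low mass as in tryptic digests.
precursor_mz = 400 + 850 * rng.beta(1.5, 4.0, size=20000)

n_windows = 32
edges = np.quantile(precursor_mz, np.linspace(0.0, 1.0, n_windows + 1))
widths = np.diff(edges)
print(widths.min(), widths.max())  # narrow windows where precursors are dense
```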
[Experimental analysis of some determinants of inductive reasoning].
Ono, K
1989-02-01
Three experiments were conducted from a behavioral perspective to investigate the determinants of inductive reasoning and to compare some methodological differences. The dependent variable used in these experiments was the threshold of confident response (TCR), which was defined as "the minimal sample size required to establish generalization from instances." Experiment 1 examined the effects of population size on inductive reasoning, and the results from 35 college students showed that the TCR varied in proportion to the logarithm of population size. In Experiment 2, 30 subjects showed distinct sensitivity to both prior probability and base-rate. The results from 70 subjects who participated in Experiment 3 showed that the TCR was affected by its consequences (risk condition), and especially, that humans were sensitive to a loss situation. These results demonstrate the sensitivity of humans to statistical variables in inductive reasoning. Furthermore, methodological comparison indicated that the experimentally observed values of TCR were close to, but not as precise as the optimal values predicted by Bayes' model. On the other hand, the subjective TCR estimated by subjects was highly discrepant from the observed TCR. These findings suggest that various aspects of inductive reasoning can be fruitfully investigated not only from subjective estimations such as probability likelihood but also from an objective behavioral perspective.
NASA Astrophysics Data System (ADS)
Wang, Q. J.; Robertson, D. E.; Chiew, F. H. S.
2009-05-01
Seasonal forecasting of streamflows can be highly valuable for water resources management. In this paper, a Bayesian joint probability (BJP) modeling approach for seasonal forecasting of streamflows at multiple sites is presented. A Box-Cox transformed multivariate normal distribution is proposed to model the joint distribution of future streamflows and their predictors such as antecedent streamflows and El Niño-Southern Oscillation indices and other climate indicators. Bayesian inference of model parameters and uncertainties is implemented using Markov chain Monte Carlo sampling, leading to joint probabilistic forecasts of streamflows at multiple sites. The model provides a parametric structure for quantifying relationships between variables, including intersite correlations. The Box-Cox transformed multivariate normal distribution has considerable flexibility for modeling a wide range of predictors and predictands. The Bayesian inference formulated allows the use of data that contain nonconcurrent and missing records. The model flexibility and data-handling ability means that the BJP modeling approach is potentially of wide practical application. The paper also presents a number of statistical measures and graphical methods for verification of probabilistic forecasts of continuous variables. Results for streamflows at three river gauges in the Murrumbidgee River catchment in southeast Australia show that the BJP modeling approach has good forecast quality and that the fitted model is consistent with observed data.
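The sketch below illustrates the core BJP ingredients on toy data: a Box-Cox transform of each margin and a fitted multivariate normal that is sampled to produce an ensemble forecast. It omits the paper's MCMC treatment of parameter uncertainty and the conditioning on observed predictor values; all names and numbers are invented.

```python
# Toy sketch of the BJP ingredients (assumptions mine, not the paper's code).
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

rng = np.random.default_rng(1)
n = 200
predictor = rng.gamma(3.0, 2.0, n)                # e.g., antecedent streamflow
site1 = 0.8 * predictor + rng.gamma(2.0, 1.0, n)  # correlated future flows
site2 = 0.5 * predictor + rng.gamma(2.0, 1.5, n)

# Transform each margin toward normality.
tp, lamp = boxcox(predictor)
t1, lam1 = boxcox(site1)
t2, lam2 = boxcox(site2)

data = np.column_stack([tp, t1, t2])
mu, cov = data.mean(axis=0), np.cov(data, rowvar=False)

# Unconditional joint sample; a full BJP forecast would condition on the
# observed predictor and propagate parameter uncertainty via MCMC.
ens = rng.multivariate_normal(mu, cov, size=1000)
flows_site1 = inv_boxcox(ens[:, 1], lam1)
print(np.percentile(flows_site1, [10, 50, 90]))   # ensemble forecast quantiles
```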
The Integrative Weaning Index in Elderly ICU Subjects.
Azeredo, Leandro M; Nemer, Sérgio N; Barbas, Carmen Sv; Caldeira, Jefferson B; Noé, Rosângela; Guimarães, Bruno L; Caldas, Célia P
2017-03-01
With increasing life expectancy and ICU admission of elderly patients, mechanical ventilation and weaning trials have increased worldwide. We evaluated a cohort of 479 subjects in the ICU. Patients younger than 18 y, tracheostomized, or with neurologic diseases were excluded, leaving 331 subjects. Subjects ≥70 y old were considered elderly, whereas those <70 y old were considered non-elderly. Besides the conventional weaning indexes, we evaluated the performance of the integrative weaning index (IWI). The probability of successful weaning was investigated using relative risk and logistic regression. The Hosmer-Lemeshow goodness-of-fit test was used to assess calibration, and the C statistic was calculated to evaluate the association between predicted probabilities and observed proportions in the logistic regression model. Prevalence of successful weaning in the sample was 83.7%. There was no difference between elderly and non-elderly subjects in mortality (P = .16), days of mechanical ventilation (P = .22), or days of weaning (P = .55). In elderly subjects, the IWI was the only respiratory variable associated with weaning from mechanical ventilation (P < .001). The IWI was thus an independent predictor of weaning in elderly subjects and may inform decisions at this critical moment for this population in intensive care. Copyright © 2017 by Daedalus Enterprises.
Magnetoresistance of carbon nanotube-polypyrrole composite yarns
NASA Astrophysics Data System (ADS)
Ghanbari, R.; Ghorbani, S. R.; Arabi, H.; Foroughi, J.
2018-05-01
Three types of samples (carbon nanotube yarn and carbon nanotube-polypyrrole composite yarns) were investigated by measuring the electrical conductivity as a function of temperature and magnetic field. The conductivity was well explained by the 3D Mott variable range hopping (VRH) law at T < 100 K. Both positive and negative magnetoresistance (MR) were observed with increasing magnetic field. The MR data were analyzed based on a theoretical model. A quadratic positive and negative MR was observed for all three samples. It was found that the localization length decreases with applied magnetic field while the density of states increases. The increase in the density of states raises the number of available energy states for hopping; the probability of hopping between sites separated by shorter distances therefore increases, which reduces the average hopping length.
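The Mott VRH law referenced above, sigma(T) = sigma0 * exp[-(T0/T)^(1/4)] in three dimensions, implies that ln(sigma) is linear in T^(-1/4); a fit of that linearized form on synthetic data is sketched below (the parameter values are illustrative, not the measured ones).

```python
# Sketch of checking 3D Mott VRH: ln(sigma) should be linear in T^(-1/4).
# Synthetic conductivities stand in for the measured data.
import numpy as np

T = np.linspace(10, 100, 50)                     # K, the T < 100 K regime
T0_true, sigma0_true = 5.0e4, 3.0e2              # hypothetical VRH parameters
rng = np.random.default_rng(2)
sigma = sigma0_true * np.exp(-(T0_true / T) ** 0.25) * rng.lognormal(0, 0.02, T.size)

x = T ** -0.25
slope, intercept = np.polyfit(x, np.log(sigma), 1)
T0_est = slope ** 4                              # slope = -T0^(1/4)
sigma0_est = np.exp(intercept)
print(T0_est, sigma0_est)                        # ~T0_true, ~sigma0_true
```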
Moustakas, Aristides; Evans, Matthew R
2015-02-28
Plant survival is a key factor in forest dynamics and survival probabilities often vary across life stages. Studies specifically aimed at assessing tree survival are unusual and so data initially designed for other purposes often need to be used; such data are more likely to contain errors than data collected for this specific purpose. We investigate the survival rates of ten tree species in a dataset designed to monitor growth rates. As some individuals were not included in the census at some time points we use capture-mark-recapture methods both to allow us to account for missing individuals, and to estimate relocation probabilities. Growth rates, size, and light availability were included as covariates in the model predicting survival rates. The study demonstrates that tree mortality is best described as constant between years and size-dependent at early life stages and size independent at later life stages for most species of UK hardwood. We have demonstrated that even with a twenty-year dataset it is possible to discern variability both between individuals and between species. Our work illustrates the potential utility of the method applied here for calculating plant population dynamics parameters in time replicated datasets with small sample sizes and missing individuals without any loss of sample size, and including explanatory covariates.
Analysis of the impact of safeguards criteria
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mullen, M.F.; Reardon, P.T.
As part of the US Program of Technical Assistance to IAEA Safeguards, the Pacific Northwest Laboratory (PNL) was asked to assist in developing and demonstrating a model for assessing the impact of setting criteria for the application of IAEA safeguards. This report presents the results of PNL's work on the task. The report is in three parts. The first explains the technical approach and methodology. The second contains an example application of the methodology. The third presents the conclusions of the study. PNL used the model and computer programs developed as part of Task C.5 (Estimation of Inspection Efforts) of the Program of Technical Assistance. The example application of the methodology involves low-enriched uranium conversion and fuel fabrication facilities. The effects of variations in seven parameters are considered: false alarm probability, goal probability of detection, detection goal quantity, the plant operator's measurement capability, the inspector's variables measurement capability, the inspector's attributes measurement capability, and annual plant throughput. Among the key results and conclusions of the analysis are the following: the variables with the greatest impact on the probability of detection are the inspector's measurement capability, the goal quantity, and the throughput; the variables with the greatest impact on inspection costs are the throughput, the goal quantity, and the goal probability of detection; and there are important interactions between variables. That is, the effects of a given variable often depend on the level or value of some other variable. With the methodology used in this study, these interactions can be quantitatively analyzed, and reasonably good approximate prediction equations can be developed.
Olson, Gail S.; Anthony, Robert G.; Forsman, Eric D.; Ackers, Steven H.; Loschl, Peter J.; Reid, Janice A.; Dugger, Katie M.; Glenn, Elizabeth M.; Ripple, William J.
2005-01-01
Northern spotted owls (Strix occidentalis caurina) have been studied intensively since their listing as a threatened species by the U.S. Fish and Wildlife Service in 1990. Studies of spotted owl site occupancy have used various binary response measures, but most of these studies have made the assumption that detectability is perfect, or at least high and not variable. Further, previous studies did not consider temporal variation in site occupancy. We used relatively new methods for open population modeling of site occupancy that incorporated imperfect and variable detectability of spotted owls and allowed modeling of temporal variation in site occupancy, extinction, and colonization probabilities. We also examined the effects of barred owl (S. varia) presence on these parameters. We used spotted owl survey data from 1990 to 2002 for 3 study areas in Oregon, USA, and we used program MARK to develop and analyze site occupancy models. We found per visit detection probabilities averaged <0.70 and were highly variable among study years and study areas. Site occupancy probabilities for owl pairs declined greatly on 1 study area and slightly on the other 2 areas. For all owls, including singles and pairs, site occupancy was mostly stable through time. Barred owl presence had a negative effect on spotted owl detection probabilities, and it had either a positive effect on local-extinction probabilities or a negative effect on colonization probabilities. We conclude that further analyses of spotted owls must account for imperfect and variable detectability and barred owl presence to properly interpret results. Further, because barred owl presence is increasing within the range of northern spotted owls, we expect to see further declines in the proportion of sites occupied by spotted owls.
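For readers unfamiliar with the occupancy models used in studies like this one, the sketch below writes down a minimal single-season likelihood with constant occupancy psi and per-visit detection p and fits it by maximum likelihood on simulated detection histories; it is a toy stand-in for program MARK, not the authors' analysis.

```python
# Minimal single-season occupancy likelihood (MacKenzie-style toy version).
# y[i, j] = 1 if the species was detected at site i on visit j.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
psi_true, p_true, n_sites, n_visits = 0.9, 0.4, 100, 5
z = rng.random(n_sites) < psi_true                       # latent occupancy
y = (rng.random((n_sites, n_visits)) < p_true) & z[:, None]

def neg_log_lik(theta):
    psi, p = 1 / (1 + np.exp(-theta))                    # logit-scale params
    det = y.sum(axis=1)
    lik_occ = psi * p ** det * (1 - p) ** (n_visits - det)
    lik = lik_occ + (1 - psi) * (det == 0)               # all-zero histories
    return -np.log(lik).sum()

fit = minimize(neg_log_lik, x0=[0.0, 0.0])
print(1 / (1 + np.exp(-fit.x)))                          # ~[psi_true, p_true]
```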
Systematic sampling for suspended sediment
Robert B. Thomas
1991-01-01
Abstract - Because of high costs or complex logistics, scientific populations cannot be measured entirely and must be sampled. Accepted scientific practice holds that sample selection be based on statistical principles to assure objectivity when estimating totals and variances. Probability sampling--obtaining samples with known probabilities--is the only method that...
Spatial and temporal variability in rates of landsliding in seismically active mountain ranges
NASA Astrophysics Data System (ADS)
Parker, R.; Petley, D.; Rosser, N.; Densmore, A.; Gunasekera, R.; Brain, M.
2012-04-01
Where earthquake and precipitation driven disasters occur in steep, mountainous regions, landslides often account for a large proportion of the associated damage and losses. This research addresses spatial and temporal variability in rates of landslide occurrence in seismically active mountain ranges as a step towards developing better regional scale prediction of losses in such events. In the first part of this paper we attempt to explain reductively the variability in spatial rates of landslide occurrence, using data from five major earthquakes. This is achieved by fitting a regression-based conditional probability model to spatial probabilities of landslide occurrence, using as predictor variables proxies for spatial patterns of seismic ground motion and modelled hillslope stability. A combined model for all earthquakes performs well in hindcasting spatial probabilities of landslide occurrence as a function of readily-attainable spatial variables. We present validation of the model and demonstrate the extent to which it may be applied globally to derive landslide probabilities for future earthquakes. In part two we examine the temporal behaviour of rates of landslide occurrence. This is achieved through numerical modelling to simulate the behaviour of a hypothetical landscape. The model landscape is composed of hillslopes that continually weaken, fail and reset in response to temporally-discrete forcing events that represent earthquakes. Hillslopes with different geometries require different amounts of weakening to fail, such that they fail and reset at different temporal rates. Our results suggest that probabilities of landslide occurrence are not temporally constant, but rather vary with time, irrespective of changes in forcing event magnitudes or environmental conditions. Various parameters influencing the magnitude and temporal patterns of this variability are identified, highlighting areas where future research is needed. This model has important implications for landslide hazard and risk analysis in mountain areas as existing techniques usually assume that susceptibility to failure does not change with time.
Eby, Lisa A.; Helmy, Olga; Holsinger, Lisa M.; Young, Michael K.
2014-01-01
Many freshwater fish species are considered vulnerable to stream temperature warming associated with climate change because they are ectothermic, yet there are surprisingly few studies documenting changes in distributions. Streams and rivers in the U.S. Rocky Mountains have been warming for several decades. At the same time these systems have been experiencing an increase in the severity and frequency of wildfires, which often results in habitat changes including increased water temperatures. We resampled 74 sites across a Rocky Mountain watershed 17 to 20 years after initial samples to determine whether there were trends in bull trout occurrence associated with temperature, wildfire, or other habitat variables. We found that site abandonment probabilities (0.36) were significantly higher than colonization probabilities (0.13), which indicated a reduction in the number of occupied sites. Site abandonment probabilities were greater at low elevations with warm temperatures. Other covariates, such as the presence of wildfire, nonnative brook trout, proximity to areas with many adults, and various stream habitat descriptors, were not associated with changes in probability of occupancy. Higher abandonment probabilities at low elevation for bull trout provide initial evidence validating the predictions made by bioclimatic models that bull trout populations will retreat to higher, cooler thermal refuges as water temperatures increase. The geographic breadth of these declines across the region is unknown but the approach of revisiting historical sites using an occupancy framework provides a useful template for additional assessments. PMID:24897341
Kendall, W.L.; Nichols, J.D.; North, P.M.
1995-01-01
The use of the Cormack-Jolly-Seber model under a standard sampling scheme of one sample per time period, when the Jolly-Seber assumption that all emigration is permanent does not hold, leads to the confounding of temporary emigration probabilities with capture probabilities. This biases the estimates of capture probability when temporary emigration is a completely random process, and both capture and survival probabilities when there is a temporary trap response in temporary emigration, or it is Markovian. The use of secondary capture samples over a shorter interval within each period, during which the population is assumed to be closed (Pollock's robust design), provides a second source of information on capture probabilities. This solves the confounding problem, and thus temporary emigration probabilities can be estimated. This process can be accomplished in an ad hoc fashion for completely random temporary emigration and to some extent in the temporary trap response case, but modelling the complete sampling process provides more flexibility and permits direct estimation of variances. For the case of Markovian temporary emigration, a full likelihood is required.
Optimal Sampling to Provide User-Specific Climate Information.
NASA Astrophysics Data System (ADS)
Panturat, Suwanna
The weather-related problems of socio-economic importance selected in this study as representative of three different levels of user groups include: (i) a regional problem concerned with air pollution plumes which lead to acid rain in the north eastern United States, (ii) a state-level problem in the form of winter wheat production in Oklahoma, and (iii) an individual-level problem involving reservoir management given errors in rainfall estimation at Lake Ellsworth, upstream from Lawton, Oklahoma. The study is aimed at designing optimal sampling networks which are based on customer value systems and also at abstracting from data sets the information that is most cost-effective in reducing the climate-sensitive aspects of a given user problem. Three process models are used in this study to interpret climate variability in terms of the variables of importance to the user: (i) the HEFFTER-SAMSON diffusion model as the climate transfer function for acid rain, (ii) the CERES-MAIZE plant process model for winter wheat production, and (iii) the AGEHYD streamflow model selected as "a black box" for reservoir management. A state-of-the-art nonlinear programming (NLP) algorithm for minimizing an objective function is employed to determine the optimal number and location of various sensors. Statistical quantities considered in determining sensor locations include the Bayes risk, the chi-squared value, the probability of Type I error (alpha), the probability of Type II error (beta), and the noncentrality parameter delta^2. Moreover, the number of years required to detect a climate change resulting in a given bushel-per-acre change in mean wheat production is determined; the number of seasons of observations required to reduce the standard deviation of the error variance of ambient sulfur dioxide to less than a certain percent of the mean is found; and finally, the policy of maintaining pre-storm flood pools at selected levels is examined given information from the optimal sampling network as defined by the study.
Jathanna, Devcharan; Karanth, K. Ullas; Kumar, N. Samba; Karanth, Krithi K.; Goswami, Varun R.
2015-01-01
Understanding species distribution patterns has direct ramifications for the conservation of endangered species, such as the Asian elephant Elephas maximus. However, reliable assessment of elephant distribution is handicapped by factors such as the large spatial scales of field studies, the survey expertise required, and the paucity of analytical approaches that explicitly account for confounding observation processes such as imperfect and variable detectability, unequal sampling probability, and spatial dependence among animal detections. We addressed these problems by carrying out ‘detection—non-detection’ surveys of elephant signs across a c. 38,000-km2 landscape in the Western Ghats of Karnataka, India. We analyzed the resulting sign encounter data using a recently developed modeling approach that explicitly addresses variable detectability across space and spatially dependent non-closure of occupancy across sampling replicates. We estimated overall occupancy, a parameter useful for monitoring elephant populations, and examined key ecological and anthropogenic drivers of elephant presence. Our results showed elephants occupied 13,483 km2 (SE = 847 km2), corresponding to 64% of the available 21,167 km2 of elephant habitat in the study landscape, a useful baseline against which to monitor future changes. Replicate-level detection probability ranged between 0.56 and 0.88, and ignoring it would have underestimated elephant distribution by 2116 km2, or 16%. We found that anthropogenic factors predominated over natural habitat attributes in determining elephant occupancy, underscoring the conservation need to regulate them. Human disturbances affected elephant habitat occupancy as well as site-level detectability. Rainfall is not an important limiting factor in this relatively humid bioclimate. Finally, we discuss cost-effective monitoring of Asian elephant populations and the specific spatial scales at which different population parameters can be estimated. We emphasize the need to model the observation and sampling processes that often obscure the ecological process of interest, in this case the relationship between elephants and their habitat. PMID:26207378
Dahl, Alv A; Østby-Deglum, Marie; Oldenburg, Jan; Bremnes, Roy; Dahl, Olav; Klepp, Olbjørn; Wist, Erik; Fosså, Sophie D
2016-10-01
The purpose of this research is to study the prevalence of posttraumatic stress disorder (PTSD) and variables associated with PTSD in Norwegian long-term testicular cancer survivors (TCSs), both cross-sectionally and longitudinally. At a mean of 11 years after diagnosis, 1418 TCSs responded to a mailed questionnaire, and at a mean of 19 years after diagnosis, 1046 of them responded again to a modified questionnaire. Posttraumatic symptoms related to testicular cancer were self-rated with the Impact of Event Scale (IES) at the 11-year study only. An IES total score ≥35 defined Full PTSD, a score of 26-34 identified Partial PTSD, and the combination of Full and Partial PTSD defined Probable PTSD. At the 11-year study, 4.5 % had Full PTSD, 6.4 % had Partial PTSD, and 10.9 % had Probable PTSD. At both studies, socio-demographic variables, somatic health, anxiety/depression, chronic fatigue, and neurotoxic adverse effects were significantly associated with Probable PTSD in bivariate analyses. Probable anxiety disorder, poor self-rated health, and neurotoxicity remained significantly associated with Probable PTSD in multivariate analyses at the 11-year study. In bivariate analyses, Probable PTSD at that time was significantly associated with socio-demographic variables, somatic health, anxiety/depression, chronic fatigue, and neurotoxicity among participants of the 19-year study, but only probable anxiety disorder remained significant in multivariable analysis. In spite of an excellent prognosis, 10.9 % of long-term testicular cancer survivors had Probable PTSD at a mean of 11 years after diagnosis. Probable PTSD was significantly associated with a broad range of problems at that time and was predictive of considerable problems at a mean of 19 years postdiagnosis. Among long-term testicular cancer survivors, 10.9 % have Probable PTSD with many associated problems, and therefore health personnel should explore stress symptoms at follow-up since efficient treatments are available.
Variability of recurrence interval for New Zealand surface-rupturing paleoearthquakes
NASA Astrophysics Data System (ADS)
Nicol, A., , Prof; Robinson, R., Jr.; Van Dissen, R. J.; Harvison, A.
2015-12-01
Recurrence interval (RI) for successive earthquakes on individual faults is recorded by paleoseismic datasets for surface-rupturing earthquakes which, in New Zealand, have magnitudes of >Mw ~6 to 7.2 depending on the thickness of the brittle crust. New Zealand faults examined have mean RI of ~130 to 8500 yrs, with an upper bound censored by the sample duration (<30 kyr) and an inverse relationship to fault slip rate. Frequency histograms, probability density functions (PDFs) and coefficient of variation (CoV = standard deviation/arithmetic mean) values have been used to quantify RI variability for geological and simulated earthquakes on >100 New Zealand active faults. RI for individual faults can vary by more than an order of magnitude. CoV of RI for paleoearthquake data comprising 4-10 events ranges from ~0.2 to 1 with a mean of 0.6±0.2. These values are generally comparable to simulated earthquakes (>100 events per fault) and suggest that RI ranges from quasi-periodic (e.g., ~0.2-0.5) to random (e.g., ~1.0). Comparison of earthquake simulation and paleoearthquake data indicates that the mean and CoV of RI can be strongly influenced by sampling artefacts, including the magnitude of completeness, the dimensionality of spatial sampling, and the duration of the sample period. Despite these sampling issues, RI for the best of the geological data (i.e. >6 events) and earthquake simulations are described by log-normal or Weibull distributions with long recurrence tails (~3 times the mean) and provide a basis for quantifying real RI variability (rather than sampling artefacts). Our analysis indicates that CoV of RI is negatively related to fault slip rate. These data are consistent with the notion that fault interaction and associated stress perturbations arising from slip on larger faults are more likely to advance or retard future slip on smaller faults than vice versa.
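As a worked illustration of the CoV statistic and the distribution fitting described above, the sketch below computes CoV for a short hypothetical RI series and fits a log-normal; the interval values are invented, not the New Zealand data.

```python
# Coefficient of variation of recurrence intervals for one fault, plus a
# log-normal fit, as used to judge quasi-periodic vs random behaviour.
import numpy as np
from scipy import stats

ri_years = np.array([310.0, 450.0, 520.0, 180.0, 640.0, 390.0])  # hypothetical RIs
cov = ri_years.std(ddof=1) / ri_years.mean()
print(f"CoV = {cov:.2f}")   # ~0.2-0.5 quasi-periodic, ~1.0 close to random

shape, loc, scale = stats.lognorm.fit(ri_years, floc=0)
print(stats.lognorm.mean(shape, loc, scale))   # mean of the fitted distribution
```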
Probability Issues in without Replacement Sampling
ERIC Educational Resources Information Center
Joarder, A. H.; Al-Sabah, W. S.
2007-01-01
Sampling without replacement is an important aspect in teaching conditional probabilities in elementary statistics courses. Different methods proposed in different texts for calculating probabilities of events in this context are reviewed and their relative merits and limitations in applications are pinpointed. An alternative representation of…
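One of the standard calculations such articles review is the probability of an event under sampling without replacement, computed either by the multiplication rule or with binomial coefficients; a worked example on a hypothetical urn follows.

```python
# Worked example: probability of drawing 2 red balls in a row, without
# replacement, from an urn with 5 red and 7 blue balls. The sequential
# multiplication rule and the hypergeometric count ratio must agree.
from math import comb

p_sequential = (5 / 12) * (4 / 11)
p_hypergeom = comb(5, 2) / comb(12, 2)
assert abs(p_sequential - p_hypergeom) < 1e-12
print(p_sequential)  # 0.1515...
```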
Rogers, R W; Mewborn, C R
1976-07-01
Three factorial experiments examined the persuasive effects of the noxiousness of a threatened event, its probability of occurrence, and the efficacy of recommended protective measures. A total of 176 students participated in separate studies on the topics of cigarette smoking, driving safety, and venereal disease. The results disclosed that increments in the efficacy variable increased intentions to adopt the recommended practices. Interaction effects revealed that when the preventive practices were effective, increments in the noxiousness and probability variables facilitated attitude change; however, when the preventive practices were ineffective, increments in noxiousness and probability either had no effect or a deleterious effect, respectively. These interaction effects were discussed in terms of a defensive avoidance hypothesis, the crucial component of which was an inability to ward off the danger. Furthermore, the effect of the emotion of fear upon intentions was found to be mediated by the cognitive appraisal of the severity of the threat. Finally, similarities with and extensions of previous studies were reviewed.
REGULATION OF GEOGRAPHIC VARIABILITY IN HAPLOID:DIPLOID RATIOS OF BIPHASIC SEAWEED LIFE CYCLES(1).
da Silva Vieira, Vasco Manuel Nobre de Carvalho; Santos, Rui Orlando Pimenta
2012-08-01
The relative abundance of haploid and diploid individuals (H:D) in isomorphic marine algal biphasic cycles varies spatially, but only if vital rates of haploid and diploid phases vary differently with environmental conditions (i.e. conditional differentiation between phases). Vital rates of isomorphic phases in particular environments may be determined by subtle morphological or physiological differences. Herein, we test numerically how geographic variability in H:D is regulated by conditional differentiation between isomorphic life phases and the type of life strategy of populations (i.e. life cycles dominated by reproduction, survival or growth). Simulation conditions were selected using available data on H:D spatial variability in seaweeds. Conditional differentiation between ploidy phases had a small effect on the H:D variability for species with life strategies that invest either in fertility or in growth. Conversely, species with life strategies that invest mainly in survival, exhibited high variability in H:D through a conditional differentiation in stasis (the probability of staying in the same size class), breakage (the probability of changing to a smaller size class) or growth (the probability of changing to a bigger size class). These results were consistent with observed geographic variability in H:D of natural marine algae populations. © 2012 Phycological Society of America.
NASA Astrophysics Data System (ADS)
Nanus, L.; Campbell, D. H.; Williams, M. W.
2004-12-01
Acidification of high-elevation lakes in the Western United States is of concern because of the storage and release of pollutants in snowmelt runoff combined with steep topography, granitic bedrock, and limited soils and biota. Land use managers have limited resources for sampling and thus need direction on how best to design monitoring programs. We evaluated the sensitivity of 400 lakes in Grand Teton (GRTE) and Yellowstone (YELL) National Parks to acidification from atmospheric deposition of nitrogen and sulfur based on statistical relations between acid-neutralizing capacity (ANC) concentrations and basin characteristics to aid in the design of a long-term monitoring plan for Outstanding Natural Resource Waters. ANC concentrations that were measured at 52 lakes in GRTE and 23 lakes in YELL during synoptic surveys were used to calibrate the statistical models. Basin-characteristic information was derived from Geographic Information System data sets. The explanatory variables that were considered included bedrock type, basin slope, basin aspect, basin elevation, lake area, basin area, inorganic nitrogen (N) deposition, sulfate deposition, hydrogen ion deposition, basin precipitation, soil type, and vegetation type. A logistic regression model was developed and applied to lake basins greater than 1 hectare (ha) in GRTE (n=106) and YELL (n=294). For GRTE, 36 percent of lakes had a greater than 60-percent probability of having ANC concentrations less than 100 microequivalents per liter, and 14 percent of lakes had a greater than 80-percent probability of having ANC concentrations less than 100 microequivalents per liter. The elevation of the lake outlet and the area of the basin with northeast aspects were determined to be statistically significant and were used as the explanatory variables in the multivariate logistic regression model. For YELL, results indicated that 13 percent of lakes had a greater than 60-percent probability of having ANC concentrations less than 100 microequivalents per liter, and 9 percent of lakes had a greater than 80-percent probability of having ANC concentrations less than 100 microequivalents per liter. Only the elevation of the lake outlet was determined to be statistically significant and was used as the explanatory variable in the multivariate logistic regression model. The lakes that exceeded 80-percent probability of having an ANC concentration less than 100 microequivalents per liter, and therefore had the greatest sensitivity to acidification from atmospheric deposition, are located at elevations greater than 2,810 meters (m) in GRTE, and greater than 2,655 m in YELL.
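A minimal sketch of the kind of logistic model described above, regressing a binary sensitivity indicator (ANC < 100 microequivalents per liter) on outlet elevation; the data are synthetic stand-ins for the GRTE/YELL surveys, and the 2,810-m query point merely echoes the reported threshold.

```python
# Hedged sketch of a lake-sensitivity logistic regression on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
elevation_m = rng.uniform(2000, 3400, 75)
p_true = 1 / (1 + np.exp(-(elevation_m - 2800) / 150))  # higher lakes more sensitive
low_anc = rng.random(75) < p_true                        # ANC < 100 ueq/L indicator

x = ((elevation_m - 2700.0) / 100.0).reshape(-1, 1)      # scaled for stable fitting
model = LogisticRegression().fit(x, low_anc)
prob = model.predict_proba([[(2810.0 - 2700.0) / 100.0]])[0, 1]
print(f"P(ANC < 100) for a lake at 2,810 m: {prob:.2f}")
```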
Voracek, Martin; Tran, Ulrich S; Formann, Anton K
2008-02-01
Subjective estimates and associated confidence ratings for the solutions of some classic occupancy problems were studied in samples of 721 psychology undergraduates, 39 casino visitors, and 34 casino employees. On tasks varying the classic birthday problem, i.e., the probability P for any coincidence among N individuals sharing the same birthday, clear majorities of respondents markedly overestimated N, given P, and markedly underestimated P, given N. Respondents did notably better on tasks varying the birthmate problem, i.e., P for the specific coincidence among N individuals of having a birthday today. Psychology students and women did better on both task types, but were less confident about their estimates than casino visitors or personnel and men. Several further person variables, such as indicators of topical knowledge and familiarity, were associated with better and more confident performance on birthday problems, but not on birthmate problems. Likewise, higher confidence ratings were related to subjective estimates that were closer to the solutions of birthday problems, but not of birthmate problems. Implications of and possible explanations for these findings, study limitations, directions for further inquiry, and the real-world relevance of ameliorating misconceptions of probability are discussed.
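Both task families have closed-form solutions, computed below (ignoring leap years): the birthday problem (any shared birthday among N people) and the birthmate problem (someone among N sharing a fixed date such as today).

```python
# Exact solutions to the two task types the respondents faced.
def p_any_shared(n: int) -> float:
    """Birthday problem: P that at least two of n people share a birthday."""
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (365 - k) / 365
    return 1.0 - p_distinct

def p_birthmate(n: int) -> float:
    """Birthmate problem: P that at least one of n people has a fixed date."""
    return 1.0 - (364 / 365) ** n

print(p_any_shared(23))   # ~0.507: N = 23 already gives P > 1/2
print(p_birthmate(23))    # ~0.061: the specific coincidence stays rare
```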
Modeling uncertainty in producing natural gas from tight sands
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chermak, J.M.; Dahl, C.A.; Patrick, R.H
1995-12-31
Since accurate geologic, petroleum engineering, and economic information are essential ingredients in making profitable production decisions for natural gas, we combine these ingredients in a dynamic framework to model natural gas reservoir production decisions. We begin with the certainty case before proceeding to consider how uncertainty might be incorporated in the decision process. Our production model uses dynamic optimal control to combine economic information with geological constraints to develop optimal production decisions. To incorporate uncertainty into the model, we develop probability distributions on geologic properties for the population of tight gas sand wells and perform a Monte Carlo study to select a sample of wells. Geological production factors, completion factors, and financial information are combined into the hybrid economic-petroleum reservoir engineering model to determine the optimal production profile, initial gas stock, and net present value (NPV) for an individual well. To model the probability of the production abandonment decision, the NPV data is converted to a binary dependent variable. A logit model is used to model this decision as a function of the above geological and economic data to give probability relationships. Additional ways to incorporate uncertainty into the decision process include confidence intervals and utility theory.
Assessing performance and validating finite element simulations using probabilistic knowledge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dolin, Ronald M.; Rodriguez, E. A.
Two probabilistic approaches for assessing performance are presented. The first approach assesses probability of failure by simultaneously modeling all likely events. The probability that each event causes failure, along with the event's likelihood of occurrence, contributes to the overall probability of failure. The second assessment method is based on stochastic sampling using an influence diagram. Latin-hypercube sampling is used to stochastically assess events. The overall probability of failure is taken as the maximum probability of failure of all the events. The Likelihood of Occurrence simulation suggests failure does not occur, while the Stochastic Sampling approach predicts failure. The Likelihood of Occurrence results are used to validate finite element predictions.
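The stochastic-sampling ingredient named above, Latin-hypercube sampling, can be sketched in a few lines: each input dimension is split into n equal-probability strata and exactly one draw is taken per stratum. This hand-rolled version is illustrative only, not the authors' implementation.

```python
# Minimal Latin hypercube sampler: one draw per equal-probability bin per
# dimension, giving better input coverage than plain Monte Carlo.
import numpy as np

def latin_hypercube(n_samples: int, n_dims: int, rng) -> np.ndarray:
    # Row i gets a point in stratum [i/n, (i+1)/n) for every dimension...
    u = (rng.random((n_samples, n_dims)) + np.arange(n_samples)[:, None]) / n_samples
    for d in range(n_dims):
        rng.shuffle(u[:, d])   # ...then strata are decoupled across dimensions
    return u                   # uniform [0, 1); map through inverse CDFs as needed

rng = np.random.default_rng(5)
sample = latin_hypercube(10, 2, rng)
print(np.sort(np.floor(sample[:, 0] * 10)))  # exactly one point per bin: 0..9
```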
Mammography facilities are accessible, so why is utilization so low?
Mobley, Lee R; Kuo, Tzy-Mey May; Clayton, Laurel J; Evans, W Douglas
2009-08-01
This study examines new socio-ecological variables reflecting community context as predictors of mammography use. The conceptual model is a hybrid of traditional health-behavioral and socio-ecological constructs with an emphasis on spatial interaction among women and their environments, differentiating between several levels of influence for community context. Multilevel probability models of mammography use are estimated. The study sample includes 70,129 women with traditional Medicare fee-for-service coverage for inpatient and outpatient services, drawn from the SEER-Medicare linked data. The study population lives in heterogeneous California, where mammography facilities are dense but utilization rates are low. Several contextual effects have large significant impacts on the probability of mammography use. Women living in areas with higher proportions of elderly in poverty are 33% less likely to use mammography. However, dually eligible women living in these poor areas are 2% more likely to use mammography than those without extra assistance living in these areas. Living in areas with higher commuter intensity, higher violent crime rates, greater land use mix (urbanicity), or more segregated Hispanic communities exhibit -14%, -1%, -6%, and -3% (lower) probability of use, respectively. Women living in segregated American Indian communities or in communities where more elderly women live alone exhibit 16% and 12% (higher) probability of use, respectively. Minority women living in more segregated communities by their minority are more likely to use mammography, suggesting social support, but this is significant for Native Americans only. Women with disability as their original reason for entitlement are found 40% more likely to use mammography when they reside in communities with high commuter intensity, suggesting greater ease of transportation for them in these environments. Socio-ecological variables reflecting community context are important predictors of mammography use in insured elderly populations, often with larger magnitudes of effect than personal characteristics such as race or ethnicity (-3% to -7%), age (-2%), recent address change (-7%), disability (-5%) or dual eligibility status (-1%). Better understanding of community factors can enhance cancer control efforts.
Farmer, Nicholas A.; Karnauskas, Mandy
2013-01-01
There is broad interest in the development of efficient marine protected areas (MPAs) to reduce bycatch and end overfishing of speckled hind (Epinephelus drummondhayi) and warsaw grouper (Hyporthodus nigritus) in the Atlantic Ocean off the southeastern U.S. We assimilated decades of data from many fishery-dependent, fishery-independent, and anecdotal sources to describe the spatial distribution of these data limited stocks. A spatial classification model was developed to categorize depth-grids based on the distribution of speckled hind and warsaw grouper point observations and identified benthic habitats. Logistic regression analysis was used to develop a quantitative model to predict the spatial distribution of speckled hind and warsaw grouper as a function of depth, latitude, and habitat. Models, controlling for sampling gear effects, were selected based on AIC and 10-fold cross validation. The best-fitting model for warsaw grouper included latitude and depth to explain 10.8% of the variability in probability of detection, with a false prediction rate of 28–33%. The best-fitting model for speckled hind, per cross-validation, included latitude and depth to explain 36.8% of the variability in probability of detection, with a false prediction rate of 25–27%. The best-fitting speckled hind model, per AIC, also included habitat, but had false prediction rates up to 36%. Speckled hind and warsaw grouper habitats followed a shelf-edge hardbottom ridge from North Carolina to southeast Florida, with speckled hind more common to the north and warsaw grouper more common to the south. The proportion of habitat classifications and model-estimated stock contained within established and proposed MPAs was computed. Existing MPAs covered 10% of probable shelf-edge habitats for speckled hind and warsaw grouper, protecting 3–8% of speckled hind and 8% of warsaw grouper stocks. Proposed MPAs could add 24% more probable shelf-edge habitat, and protect an additional 14–29% of speckled hind and 20% of warsaw grouper stocks. PMID:24260126
DOE Office of Scientific and Technical Information (OSTI.GOV)
Masci, Frank J.; Grillmair, Carl J.; Cutri, Roc M.
2014-07-01
We describe a methodology to classify periodic variable stars identified using photometric time-series measurements constructed from the Wide-field Infrared Survey Explorer (WISE) full-mission single-exposure Source Databases. This will assist in the future construction of a WISE Variable Source Database that assigns variables to specific science classes as constrained by the WISE observing cadence with statistically meaningful classification probabilities. We have analyzed the WISE light curves of 8273 variable stars identified in previous optical variability surveys (MACHO, GCVS, and ASAS) and show that Fourier decomposition techniques can be extended into the mid-IR to assist with their classification. Combined with other periodic light-curve features, this sample is then used to train a machine-learned classifier based on the random forest (RF) method. Consistent with previous classification studies of variable stars in general, the RF machine-learned classifier is superior to other methods in terms of accuracy, robustness against outliers, and relative immunity to features that carry little or redundant class information. For the three most common classes identified by WISE: Algols, RR Lyrae, and W Ursae Majoris type variables, we obtain classification efficiencies of 80.7%, 82.7%, and 84.5% respectively using cross-validation analyses, with 95% confidence intervals of approximately ±2%. These accuracies are achieved at purity (or reliability) levels of 88.5%, 96.2%, and 87.8% respectively, similar to that achieved in previous automated classification studies of periodic variable stars.
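The classification stage can be sketched as follows, assuming per-star features (period, amplitude, Fourier coefficients) have already been extracted; the features, labels, and class separations below are synthetic, and the WISE pipeline itself is not reproduced.

```python
# Sketch of a random-forest classifier over per-star light-curve features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n = 900
X = rng.normal(size=(n, 4))        # e.g., log period, amplitude, phi21, phi31
y = rng.integers(0, 3, n)          # 0=Algol, 1=RR Lyrae, 2=W UMa (illustrative)
X[y == 1, 0] += 2.0                # give the classes some separation

clf = RandomForestClassifier(n_estimators=300, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())   # cross-validated efficiency
probs = clf.fit(X, y).predict_proba(X[:3])       # class membership probabilities
print(probs.round(2))
```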
Duarte, Adam; Adams, Michael J.; Peterson, James T.
2018-01-01
Monitoring animal populations is central to wildlife and fisheries management, and the use of N-mixture models toward these efforts has markedly increased in recent years. Nevertheless, relatively little work has evaluated estimator performance when basic assumptions are violated. Moreover, diagnostics to identify when bias in parameter estimates from N-mixture models is likely is largely unexplored. We simulated count data sets using 837 combinations of detection probability, number of sample units, number of survey occasions, and type and extent of heterogeneity in abundance or detectability. We fit Poisson N-mixture models to these data, quantified the bias associated with each combination, and evaluated if the parametric bootstrap goodness-of-fit (GOF) test can be used to indicate bias in parameter estimates. We also explored if assumption violations can be diagnosed prior to fitting N-mixture models. In doing so, we propose a new model diagnostic, which we term the quasi-coefficient of variation (QCV). N-mixture models performed well when assumptions were met and detection probabilities were moderate (i.e., ≥0.3), and the performance of the estimator improved with increasing survey occasions and sample units. However, the magnitude of bias in estimated mean abundance with even slight amounts of unmodeled heterogeneity was substantial. The parametric bootstrap GOF test did not perform well as a diagnostic for bias in parameter estimates when detectability and sample sizes were low. The results indicate the QCV is useful to diagnose potential bias and that potential bias associated with unidirectional trends in abundance or detectability can be diagnosed using Poisson regression. This study represents the most thorough assessment to date of assumption violations and diagnostics when fitting N-mixture models using the most commonly implemented error distribution. Unbiased estimates of population state variables are needed to properly inform management decision making. Therefore, we also discuss alternative approaches to yield unbiased estimates of population state variables using similar data types, and we stress that there is no substitute for an effective sample design that is grounded upon well-defined management objectives.
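A toy version of the Poisson N-mixture likelihood being evaluated follows, fit by maximum likelihood with the latent abundance marginalized over a truncated support; the simulation settings are invented and the QCV diagnostic itself is not reproduced here.

```python
# Toy Poisson N-mixture model (Royle-style): counts y[i, j] at site i on
# visit j arise from latent abundance N_i ~ Poisson(lambda) thinned by
# detection probability p.
import numpy as np
from scipy.stats import poisson, binom
from scipy.optimize import minimize

rng = np.random.default_rng(7)
lam_true, p_true, n_sites, n_visits = 5.0, 0.4, 150, 4
N = rng.poisson(lam_true, n_sites)
y = rng.binomial(N[:, None], p_true, (n_sites, n_visits))

def neg_log_lik(theta, n_max=60):
    lam, p = np.exp(theta[0]), 1 / (1 + np.exp(-theta[1]))
    ns = np.arange(n_max + 1)
    prior = poisson.pmf(ns, lam)                        # P(N = n)
    lik = np.zeros(n_sites)
    for i in range(n_sites):
        cond = binom.pmf(y[i][:, None], ns[None, :], p).prod(axis=0)
        lik[i] = (prior * cond).sum()                   # marginalize latent N
    return -np.log(lik).sum()

fit = minimize(neg_log_lik, x0=[1.0, 0.0], method="Nelder-Mead")
print(np.exp(fit.x[0]), 1 / (1 + np.exp(-fit.x[1])))    # ~lam_true, ~p_true
```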
Spatial variability of heavy metals in the coastal soils under long-term reclamation
NASA Astrophysics Data System (ADS)
Wang, Lin; Coles, Neil A.; Wu, Chunfa; Wu, Jiaping
2014-12-01
The coastal plain of Cixi City, China, has experienced over 1000 years of reclamation. With the rapid development of agriculture and industry after reclamation, successive inputs into agricultural soils have drastically modified the soil environment. To determine the spatial distribution of heavy metals and to evaluate the influence of anthropogenic activities, a total of 329 top soil samples were taken along a transect on the coastal plain. The samples, collected across 11 sea dikes, were selected using a nested sampling methodology. Total Cu, Fe, Mn, Ni, Pb, and Zn concentrations, as well as their diethylenetriamine penta-acetic acid (DTPA) extractable (available) concentrations, were determined. Results indicated that, except for Zn concentrations, there was neither heavy metal pollution nor mineral deficiency in the soils. Heavy metals exhibited considerable spatial variability, obvious spatial dependence, and close interrelationships on the reclaimed land. For most metals, reclamation history was the main influencing factor. Metal concentrations generally showed discontinuities around the position of sea dikes, and sectors with longer reclamation histories tended to have higher metal concentrations than the recently reclaimed sectors. As for Cu and Zn total concentrations, stochastic factors, like industrial waste discharge, fertilization, and pesticide application, probably led to the high nugget effect and altered this relationship. The 6th and 10th zones generally had the highest total metal concentrations, due to the concentration of household appliance manufacturers in these reclaimed areas. The first two zones were characterized by high available metal concentrations, probably due to alternating flooding and emergence, low pH values, and high organic matter contents in these paddy field soils. From the 3rd to 7th zones, with the same land use history and soil type, metal concentrations, especially available concentrations, showed homogeneity. The nested sampling method adopted demonstrated that a 500-m interval was sufficient to capture the spatial variation of the metals. These results are useful in evaluating the variation in the environmental quality of soils under long-term reclamation and in formulating plans for future reclamation projects.
Servetto, Natalia; Sahade, Ricardo
2016-01-01
The pennatulid Malacobelemnon daytoni is one of the dominant species in Potter Cove, Antarctica. Its abundance and range of distribution have increased in recent years, probably related to climate-change-mediated alterations of environmental factors. This work is the second part of a study dealing with the reproductive ecology of Malacobelemnon daytoni, and aims to assess its reproductive seasonality over a two-year period. Sampling was carried out every month during 2009-2010 and samples were examined by histological analysis. Gametogenesis exhibited a seasonal pattern evidenced by the maturity stage index (MSI) and the number of mature oocytes and cysts throughout the year. Immature oocytes and spermatocytes were present year-round, but maturation was seasonal, and it appears that more than one spawning per year was possible. These spawnings could be more closely linked with suspended particulate matter (SPM) (probably made available via resuspension events) than with primary production pulses. This idea reinforces the hypothesis that winter is not so stressful, in energy terms, in Potter Cove, which seems to depend on energy sources other than local phytoplankton production. There was not strong inter-annual variability between the reproductive characteristics analyzed in 2009 and 2010; the only variable that differed was the size of oocytes (larger in 2009), suggesting different energy availability in each year, related to a higher concentration of SPM in 2009 (although the difference was not significant). Malacobelemnon daytoni could be the first reported Antarctic suspension-feeder species that presents a reproductive cycle with more than one spawning event per year. This strategy would help to explain the success of this species in the Potter Cove ecosystem and in heavily ice-impacted areas. PMID:27732608
Relationship between physician and industry in Aragon (Spain).
Lobo, Elena; Rabanaque, M José; Carrera, Patricia; Abad, José M; Moliner, Javier
2012-01-01
To describe the relationship between industry and physicians and to analyze the physician characteristics associated with the probability of receiving benefits from industry in Aragon (Spain). We carried out an observational, cross-sectional study in which Aragonese physicians (north-east region in Spain) from public and private settings completed an anonymous questionnaire on a web page between June and November 2008. Visits/month with industry, samples, gifts, reimbursements, and payments were used as dependent variables in the regression analyses. Year of medical license, specialty, work setting, time spent on direct care, articles read/month, and being a resident's tutor were used as independent variables. A total of 659 questionnaires were considered valid for the analysis. Overall, 87% (n=573) of the respondents reported they had received some benefit in the previous year and 90.1% (n=593) reported having held meetings with industry representatives monthly. Non-clinical specialists received fewer gifts (odds ratio [OR]=0.38; 95% confidence interval [95%CI]: 0.18-0.77), reimbursements (OR=0.14; 95%CI: 0.06-0.35) and payments (OR=0.30; 95%CI: 0.13-0.74) than their clinical colleagues. The probability of receiving reimbursements (OR=0.37; 95%CI: 0.15-0.89) and payments (OR=0.39; 95%CI: 0.20-0.77) was lower in primary care physicians. This study, performed in a sample of physicians from a southern European region, demonstrates differences in the intensity of the physician-industry relationship depending on physician specialty and work setting. These results provide important information for improving transparency and for future research on the appropriateness and efficiency of prescription in Spain and other countries with similar health systems. Copyright © 2011 SESPAS. Published by Elsevier España, S.L. All rights reserved.
NASA Astrophysics Data System (ADS)
Tesoriero, A. J.; Terziotti, S.
2014-12-01
Nitrate trends in streams often do not match expectations based on recent nitrogen source loadings to the land surface. Groundwater discharge with long travel times has been suggested as the likely cause for these observations. The fate of nitrate in groundwater depends to a large extent on the occurrence of denitrification along flow paths. Because denitrification in groundwater is inhibited when dissolved oxygen (DO) concentrations are high, defining the oxic-suboxic interface has been critical in determining pathways for nitrate transport in groundwater and to streams at the local scale. Predicting redox conditions on a regional scale is complicated by the spatial variability of reaction rates. In this study, logistic regression and boosted classification tree analysis were used to predict the probability of oxic water in groundwater in the Chesapeake Bay watershed. The probability of oxic water (DO > 2 mg/L) was predicted by relating DO concentrations in over 3,000 groundwater samples to indicators of residence time and/or electron donor availability. Variables that describe position in the flow system (e.g., depth to top of the open interval), soil drainage and surficial geology were the most important predictors of oxic water. Logistic regression and boosted classification tree analysis correctly predicted the presence or absence of oxic conditions in over 75 % of the samples in both training and validation data sets. Predictions of the percentages of oxic wells in deciles of risk were very accurate (r2>0.9) in both the training and validation data sets. Depth to the bottom of the oxic layer was predicted and is being used to estimate the effect that groundwater denitrification has on stream nitrate concentrations and the time lag between the application of nitrogen at the land surface and its effect on streams.
NASA Astrophysics Data System (ADS)
Pereira, M. F.; Ribeiro, C.; Vilallonga, F.; Chichorro, M.; Drost, K.; Silva, J. B.; Albardeiro, L.; Hofmann, M.; Linnemann, U.
2014-07-01
This study combines geochemical and geochronological data in order to decipher the provenance of Carboniferous turbidites from the South Portuguese Zone (SW Iberia). Major and trace elements of 25 samples of graywackes and mudstones from the Mértola (Visean), Mira (Serpukhovian), and Brejeira (Moscovian) Formations were analyzed, and 363 U-Pb ages were obtained on detrital zircons from five samples of graywackes from the Mira and Brejeira Formations using LA-ICPMS. The results indicate that turbiditic sedimentation during the Carboniferous was marked by variability in the sources, involving the denudation of different crustal blocks and a break in synorogenic volcanism. The Visean is characterized by the accumulation of immature turbidites (Mértola Formation and the base of the Mira Formation) inherited from a terrane with intermediate to mafic source rocks. These source rocks were probably formed in relation to Devonian magmatic arcs poorly influenced by sedimentary recycling, as indicated by the almost total absence of pre-Devonian zircons typical of the Gondwana and/or Laurussia basements. The presence of Carboniferous grains in Visean turbidites indicates that volcanism was active at this time. Later, Serpukhovian to Moscovian turbiditic sedimentation (Mira and Brejeira Formations) included sedimentary detritus derived from felsic mature source rocks situated far from active magmatism. The abundance of Precambrian and Paleozoic zircons reveals strong recycling of the Gondwana and/or Laurussia basements. A peri-Gondwanan provenance is indicated by zircon populations with Neoproterozoic (Cadomian-Avalonian and Pan-African zircon-forming events), Paleoproterozoic, and Archean ages. The presence of late Ordovician and Silurian detrital zircons in Brejeira turbidites, which have no correspondence in the Gondwana basement of SW Iberia, indicates Laurussia as their most probable source.
Patients with schizophrenia activate behavioural intentions facilitated by counterfactual reasoning
Tebé, Cristian; Benejam, Bessy; Caño, Agnes; Menchón, José Manuel
2017-01-01
Previous research has associated schizophrenia with an inability to activate behavioural intentions facilitated by counterfactual thinking (CFT) as a step to improving performance. Consequently, these findings suggest that rehabilitation strategies will be entirely ineffective. To extend previous research, we evaluated the influence of CFT on the activation of behavioural intentions using a novel sequential priming paradigm in the largest sample of subjects explored to date. Method: The main variables assessed were the answer given to a target task (correct or incorrect) and the percentage gain in reaction time (RT) to complete a target task correctly, depending on whether the prime was a counterfactual or a neutral-control cue. These variables were assessed in 37 patients with schizophrenia and 37 healthy controls. Potential associations with clinical status and socio-demographic characteristics were also explored. Results: When a counterfactual prime was presented, the probability of giving an incorrect answer was lower for the entire sample than when a neutral prime was presented (OR 0.58; 95% CI 0.42 to 0.79), but the schizophrenia patients showed a higher probability than the controls of giving an incorrect answer (OR 3.89; 95% CI 2.0 to 7.6). Both the schizophrenia patients and the controls showed a similar percentage gain in RT to a correct answer of 8%. Conclusions: Challenging the results of previous research, our findings suggest a normal activation of behavioural intentions facilitated by CFT in schizophrenia. Nevertheless, the patients showed more difficulty than the controls with the task, adding support to the concept of CFT as a potential new target for consideration in future therapeutic approaches for this illness. PMID:28586400
Analysis of blocking probability for OFDM-based variable bandwidth optical network
NASA Astrophysics Data System (ADS)
Gong, Lei; Zhang, Jie; Zhao, Yongli; Lin, Xuefeng; Wu, Yuyao; Gu, Wanyi
2011-12-01
Orthogonal Frequency Division Multiplexing (OFDM) has recently been proposed as a modulation technique for optical networks. Because of its good spectral efficiency, flexibility, and tolerance to impairments, optical OFDM is much more flexible than traditional WDM systems, enabling elastic bandwidth transmission, and is widely seen as the future direction of optical networking. In OFDM-based optical networks, analysis of the blocking probability is of great significance for network assessment. Most current research on WDM networks assumes a fixed bandwidth; to accommodate future traffic and the rapid development of optical networks, our study addresses variable-bandwidth OFDM-based optical networks. Building on existing theory and algorithms, we apply mathematical analysis and theoretical derivation to study the blocking probability of variable-bandwidth optical networks and then build a model for the blocking probability.
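As a point of reference for the kind of quantity being modeled, the sketch below evaluates the classical Erlang-B formula, a standard baseline for connection-blocking probability under fixed bandwidth. The paper's variable-bandwidth OFDM analysis is more general; this snippet is only an illustrative baseline, not the authors' model.

```python
# A minimal baseline sketch: the classical Erlang-B recursion for blocking
# probability with offered load `a` (Erlangs) and `c` servers (here standing
# in for spectrum slots).
def erlang_b(a: float, c: int) -> float:
    b = 1.0                            # B(a, 0) = 1
    for k in range(1, c + 1):
        b = a * b / (k + a * b)        # stable recursive form of Erlang B
    return b

# Blocking probability for 50 Erlangs offered to 60 spectrum slots:
print(erlang_b(50.0, 60))              # ~0.02
```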
Detecting Anomalies in Process Control Networks
NASA Astrophysics Data System (ADS)
Rrushi, Julian; Kang, Kyoung-Don
This paper presents the estimation-inspection algorithm, a statistical algorithm for anomaly detection in process control networks. The algorithm determines if the payload of a network packet that is about to be processed by a control system is normal or abnormal based on the effect that the packet will have on a variable stored in control system memory. The estimation part of the algorithm uses logistic regression integrated with maximum likelihood estimation in an inductive machine learning process to estimate a series of statistical parameters; these parameters are used in conjunction with logistic regression formulas to form a probability mass function for each variable stored in control system memory. The inspection part of the algorithm uses the probability mass functions to estimate the normalcy probability of a specific value that a network packet writes to a variable. Experimental results demonstrate that the algorithm is very effective at detecting anomalies in process control networks.
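A simplified sketch of the inspection step follows: build a probability mass function for a control-system memory variable from historical (normal) writes and flag packets whose written value has low normalcy probability. The paper estimates these PMFs with logistic regression and maximum likelihood; plain empirical frequencies, the example values, and the threshold below are simplifying assumptions used here for brevity.

```python
# Simplified sketch of the inspection step: estimate a probability mass
# function for a control-system variable from historical (normal) traffic,
# then flag a packet whose write value has low normalcy probability.
from collections import Counter

history = [16, 16, 17, 18, 16, 17, 16, 18, 17, 16, 16, 17]  # past setpoint writes
pmf = {v: c / len(history) for v, c in Counter(history).items()}

def normalcy_probability(value, pmf, floor=1e-6):
    return pmf.get(value, floor)   # unseen values get a tiny floor probability

threshold = 0.05
for packet_value in (17, 99):
    p = normalcy_probability(packet_value, pmf)
    print(packet_value, p, "ANOMALOUS" if p < threshold else "normal")
```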
Variables Affecting Probability of Detection in Bolt Hole Eddy Current Inspection
NASA Astrophysics Data System (ADS)
Lemire, H.; Krause, T. W.; Bunn, M.; Butcher, D. J.
2009-03-01
Physical variables affecting probability of detection (POD) in a bolt-hole eddy current inspection were examined. The POD study involved simulated bolt holes in 7075-T6 aluminum coupons representative of wing areas on CC-130 and CP-140 aircraft. The data were obtained from 24 inspectors who inspected 468 coupons, containing a subset of coupons with 45 electric discharge machined notches and 72 laboratory grown fatigue cracks located at the inner surface corner of the bi-layer structures. A comparison of physical features of cracks and notches in light of skin depth effects and probe geometry was used to identify length rather than depth as the significant variable producing signal variation. Probability of detection based on length produced similar results for the two discontinuity types, except at lengths less than 0.4 mm, where POD for cracks was found to be higher than that of notches.
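For readers unfamiliar with POD modeling, the following hedged sketch fits the conventional log-logistic hit/miss POD model as a function of flaw length and extracts a90 (the length detected with 90% probability). The data are synthetic, and the model choice is the standard one from the POD literature, not necessarily the exact analysis used in this study.

```python
# Hedged sketch: a hit/miss POD curve as a logistic function of log flaw
# length, with a90 solved from the fitted coefficients. Data are synthetic.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
length_mm = rng.uniform(0.1, 2.0, 500)                  # flaw lengths
true_pod = 1 / (1 + np.exp(-(np.log(length_mm) - np.log(0.4)) / 0.25))
hit = (rng.random(500) < true_pod).astype(int)          # detected or missed

X = sm.add_constant(np.log(length_mm))
fit = sm.Logit(hit, X).fit(disp=0)
b0, b1 = fit.params
# a90: solve logit(0.9) = b0 + b1 * log(a)  =>  log(a) = (ln 9 - b0) / b1
a90 = np.exp((np.log(9) - b0) / b1)
print(f"a90 (90% POD length) = {a90:.2f} mm")
```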
PROBABILITY SAMPLING AND POPULATION INFERENCE IN MONITORING PROGRAMS
A fundamental difference between probability sampling and conventional statistics is that "sampling" deals with real, tangible populations, whereas "conventional statistics" usually deals with hypothetical populations that have no real-world realization. The focus here is on real ...
Depression in Mongolian women over the first 2 months after childbirth: prevalence and risk factors.
Pollock, J I; Manaseki-Holland, S; Patel, V
2009-07-01
Social, political and economic changes in Mongolia have followed post-Soviet style government policies and contributed to both increased liberalisation and reduced security in employment and family finances. This is the first study to attempt to assess the prevalence of depression in a population of Mongolian women in the post-partum period and to assess risk factors, including financial position, associated with the condition. A total of 1044 women who had delivered healthy babies in Ulaanbaatar between October and December 2002 were screened for depression using the WHO Self Reporting Questionnaire between 5 and 9 weeks post-partum. Further details on the mother, her family and her social and economic circumstances were simultaneously collected. Analysis of risk factors for probable depression was undertaken using multiple logistic regression techniques. The prevalence of depression was 9.1% (95% CLs 7.5%-11.1%). Variables significantly and independently associated with risk of probable maternal depression included economic factors, the mother being subject to physical abuse, dissatisfaction with the pregnancy, concern about her baby's behaviour, and her own health problems. The sample was drawn from a population of mothers all of whom had healthy, full-term babies of normal birth weight. Clinical confirmation of diagnosis was not established. Mongolian women with young infants in Ulaanbaatar probably experience depression at rates comparable with other cultures. Factors associated with probable depression were dominated by health, relationships and financial position.
Rothmann, Mark
2005-01-01
When testing the equality of means from two different populations, a t-test or large-sample normal test tends to be performed. For these tests, when the sample size or design for the second sample is dependent on the results of the first sample, the type I error probability is altered for each specific possibility in the null hypothesis. We will examine the impact on the type I error probabilities for two confidence interval procedures and for procedures using test statistics when the design for the second sample or experiment is dependent on the results from the first sample or experiment (or series of experiments). Ways of controlling a desired maximum type I error probability or a desired type I error rate will be discussed. Results are applied to the setting of noninferiority comparisons in active controlled trials where the use of a placebo is unethical.
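The sketch below illustrates the phenomenon with a small Monte Carlo experiment: a naive two-sample z-test whose second sample size is chosen after looking at the first sample. The adaptive rule is an invented illustration, not the paper's procedure; the point is only that the empirical type I error can drift from the nominal 0.05.

```python
# Monte Carlo sketch: a data-dependent second-sample design alters the
# type I error of a naive pooled z-test. The adaptive rule is illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, reps, rejections = 0.05, 20000, 0
for _ in range(reps):
    x1 = rng.normal(0, 1, 50)                  # H0 true: both means are 0
    n2 = 200 if abs(x1.mean()) > 0.1 else 50   # design depends on first sample
    x2 = rng.normal(0, 1, n2)
    z = (x1.mean() - x2.mean()) / np.sqrt(1/50 + 1/n2)
    rejections += abs(z) > stats.norm.ppf(1 - alpha / 2)

# May deviate from the nominal 0.05 because n2 depends on the first sample:
print("empirical type I error:", rejections / reps)
```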
Sampling in epidemiological research: issues, hazards and pitfalls.
Tyrer, Stephen; Heyman, Bob
2016-04-01
Surveys of people's opinions are fraught with difficulties. It is easier to obtain information from those who respond to text messages or to emails than to attempt to obtain a representative sample. Samples of the population that are selected non-randomly in this way are termed convenience samples as they are easy to recruit. This introduces a sampling bias. Such non-probability samples have merit in many situations, but an epidemiological enquiry is of little value unless a random sample is obtained. If a sufficient number of those randomly selected actually complete a survey, the results are likely to be representative of the population. This editorial describes probability and non-probability sampling methods and illustrates the difficulties and suggested solutions in performing accurate epidemiological research. PMID:27087985
A survey for red variables in the LMC - II
NASA Astrophysics Data System (ADS)
Reid, Neill; Glass, I. S.; Catchpole, R. M.
1988-05-01
Infrared photometry of a sample of 126 variables drawn from a 16 sq deg area of the northern LMC is presented. Most of these stars were previously unknown, and the majority prove to be long-period red-giant variables. Most of the latter stars fall within two groups in the (K(0), log P) diagram, the lower-luminosity ones being Miras which obey a definite period-luminosity relation. The use of the latter stars as distance estimators is discussed. The (M(bol), P) diagram is compared with the theoretical tracks calculated by Wood, Bessell & Fox (1983), and it is found that the distribution of stars is probably consistent with a lull in star formation in the LMC from about 10^9 to 2 x 10^8 yr ago, although this conclusion depends strongly on the luminosity at which stars of different initial mass enter the thermally pulsing AGB.
Sharma, R; Synkewecz, C; Raggio, T; Mattison, D R
1994-11-01
A probability sample survey of high-risk inner-city women with a live birth in the last 3 years shows that maternal medical risks and health behaviors during pregnancy are important intermediate variables influencing preterm delivery and birthweight. Women who developed two or more medical risks had about three-and-a-half times the risk of preterm delivery and two-and-a-half times the risk of low birthweight compared with those without such risks. Women with prior fetal loss had a twofold increase in the risk of preterm delivery and low birthweight. Unintended pregnancy resulted in a one-and-a-half-fold to twofold increase in preterm delivery and low birthweight, respectively. Inadequate gestational weight gain increased the risk of preterm delivery by about 50%. Smoking during pregnancy raised the risk of low birthweight slightly more than one-and-a-half times.
Survival and selection of migrating salmon from capture-recapture models with individual traits
Zabel, R.W.; Wagner, T.; Congleton, J.L.; Smith, S.G.; Williams, J.G.
2005-01-01
Capture-recapture studies are powerful tools for studying animal population dynamics, providing information on population abundance, survival rates, population growth rates, and selection for phenotypic traits. In these studies, the probability of observing a tagged individual reflects both the probability of the individual surviving to the time of recapture and the probability of recapturing an animal, given that it is alive. If both of these probabilities are related to the same phenotypic trait, it can be difficult to distinguish effects on survival probabilities from effects on recapture probabilities. However, when animals are individually tagged and have multiple opportunities for recapture, we can properly partition observed trait-related variability into survival and recapture components. We present an overview of capture-recapture models that incorporate individual variability and develop methods to incorporate results from these models into estimates of population survival and selection for phenotypic traits. We conducted a series of simulations to understand the performance of these estimators and to assess the consequences of ignoring individual variability when it exists. In addition, we analyzed a large data set of > 153 000 juvenile chinook salmon (Oncorhynchus tshawytscha) and steelhead (O. mykiss) of known length that were PIT-tagged during their seaward migration. Both our simulations and the case study indicated that the ability to precisely estimate selection for phenotypic traits was greatly compromised when differential recapture probabilities were ignored. Estimates of population survival, however, were far more robust. In the chinook salmon and steelhead study, we consistently found that smaller fish had a greater probability of recapture. We also uncovered length-related survival relationships in over half of the release group/river segment combinations that we observed, but we found both positive and negative relationships between length and survival probability. These results have important implications for the management of salmonid populations. © 2005 by the Ecological Society of America.
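The confounding described above can be seen in a toy simulation: when survival increases with length but recapture decreases with it, the raw probability of ever seeing a tagged fish again orders the size classes incorrectly. All parameter values below are invented for illustration.

```python
# Toy simulation of the survival/recapture confounding: the raw proportion
# of tagged fish ever seen again mixes the two length effects.
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
length = rng.normal(0, 1, n)                        # standardized fish length
p_surv = 1 / (1 + np.exp(-(0.5 + 0.3 * length)))    # longer fish survive better
p_recap = 1 / (1 + np.exp(-(-1.0 - 0.5 * length)))  # smaller fish recaptured more
alive = rng.random(n) < p_surv
seen = alive & (rng.random(n) < p_recap)

# The naive "seen again" rate decreases with length even though survival rises:
small, large = length < 0, length >= 0
print("P(seen | small):", seen[small].mean(), " P(seen | large):", seen[large].mean())
print("true P(survive | small):", p_surv[small].mean(),
      " true P(survive | large):", p_surv[large].mean())
```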
Ramey, Andy M.; Schmutz, Joel A.; Fleskes, Joseph P.; Yabsley, Michael J.
2013-01-01
Information on the molecular detection of hematozoa from different tissue types and multiple years would be useful to inform sample collection efforts and interpret results of meta-analyses or investigations spanning multiple seasons. In this study, we tested blood and muscle tissue collected from northern pintails (Anas acuta) during autumn and winter of different years to evaluate prevalence and genetic diversity of Leucocytozoon, Haemoproteus, and Plasmodium infections in this abundant waterfowl species of the Central Valley of California. We first compared results for paired blood and wing muscle samples to assess the utility of different tissue types for molecular investigations of haemosporidian parasites. Second, we explored inter-annual variability of hematozoa infection in Central Valley northern pintails and investigated possible effects of age, sex, and sub-region of sample collection on estimated parasite detection probability and prevalence. We found limited evidence for differences between tissue types in detection probability and prevalence of Leucocytozoon, Haemoproteus, and Plasmodium parasites, which supports the utility of both sample types for obtaining information on hematozoan infections. However, we detected 11 haemosporidian mtDNA cyt b haplotypes in blood samples vs. six in wing muscle tissue collected during the same sample year, suggesting an advantage to using blood samples for investigations of genetic diversity. Estimated prevalence of Leucocytozoon parasites was greater during 2006–2007 as compared to 2011–2012, and four unique haemosporidian mtDNA cyt b haplotypes were detected in the former sample year but not in the latter. Seven of 15 mtDNA cyt b haplotypes detected in northern pintails had 100% identity with previously reported hematozoa lineages detected in waterfowl (Haemoproteus and Leucocytozoon) or other avian taxa (Plasmodium), providing support for lack of host specificity for some parasite lineages.
Analysis of TPA Pulsed-Laser-Induced Single-Event Latchup Sensitive-Area
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Peng; Sternberg, Andrew L.; Kozub, John A.
2017-12-07
Two-photon absorption (TPA) testing is employed to analyze the laser-induced latchup sensitive-volume (SV) of a specially designed test structure. This method takes into account the existence of an onset region in which the probability of triggering latchup transitions from zero to one as the laser pulse energy increases. This variability is attributed to pulse-to-pulse variability, uncertainty in measurement of the pulse energy, and variation in local carrier density and temperature. For each spatial position, the latchup probability associated with a given energy is calculated from multiple pulses. The latchup probability data are well-described by a Weibull distribution. The results show that the area between p-n-p-n cell structures is more sensitive than the p+ and n+ source areas, and locations far from the well contacts are more sensitive than those near the contact region. The transition from low probability of latchup to high probability is more abrupt near the source contacts than it is for the surrounding areas.
An operational system of fire danger rating over Mediterranean Europe
NASA Astrophysics Data System (ADS)
Pinto, Miguel M.; DaCamara, Carlos C.; Trigo, Isabel F.; Trigo, Ricardo M.
2017-04-01
A methodology is presented to assess fire danger based on the probability of exceedance of prescribed thresholds of daily released energy. The procedure is developed and tested over Mediterranean Europe, defined by the latitude circles of 35 and 45°N and the meridians of 10°W and 27.5°E, for the period 2010-2016. The procedure involves estimating so-called static and daily probabilities of exceedance. For a given point, the static probability is estimated by the ratio of the number of daily fire occurrences releasing energy above a given threshold to the total number of occurrences inside a cell centred at the point. The daily probability of exceedance, which takes into account meteorological factors by means of the Canadian Fire Weather Index (FWI), is in turn estimated based on a Generalized Pareto distribution with the static probability and FWI as covariates of the scale parameter. The rationale of the procedure is that small fires, assessed by the static probability, have a weak dependence on weather, whereas larger fires depend strongly on concurrent meteorological conditions. Observed frequencies of exceedance over the study area for the period 2010-2016 match the probabilities estimated by the developed models for static and daily exceedance. Some (small) variability is nevertheless found between years, suggesting that the method could be refined in future work by using a larger sample to further increase its robustness. The developed methodology has the advantage of evaluating fire danger with the same criteria over the whole study area, making it a good basis for harmonizing fire danger forecasts and forest management studies. Research was performed within the framework of the EUMETSAT Satellite Application Facility for Land Surface Analysis (LSA SAF). Part of the methods developed and results obtained form the basis of a platform, supported by The Navigator Company, that currently provides information on meteorological fire danger in Portugal for a wide range of users.
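A hedged sketch of the daily-probability component described above: exceedances follow a Generalized Pareto distribution whose scale parameter is a log-linear function of FWI and the static probability. The link function and all coefficients below are illustrative assumptions, not the fitted model.

```python
# Sketch: daily probability that released energy exceeds a threshold, using
# a Generalized Pareto distribution with a covariate-driven scale parameter.
import numpy as np
from scipy.stats import genpareto

def daily_prob_exceed(energy, fwi, static_prob,
                      shape=0.2, b0=-1.0, b1=0.05, b2=2.0):
    scale = np.exp(b0 + b1 * fwi + b2 * static_prob)   # log-linear scale model
    return genpareto.sf(energy, c=shape, scale=scale)  # P(released energy > x)

# Same energy threshold, low vs high fire-weather danger:
print(daily_prob_exceed(5.0, fwi=5.0, static_prob=0.1))
print(daily_prob_exceed(5.0, fwi=40.0, static_prob=0.1))
```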
Methods for fitting a parametric probability distribution to most probable number data.
Williams, Michael S; Ebel, Eric D
2012-07-02
Every year hundreds of thousands, if not millions, of samples are collected and analyzed to assess microbial contamination in food and water. The concentration of pathogenic organisms at the end of the production process is low for most commodities, so a highly sensitive screening test is used to determine whether the organism of interest is present in a sample. In some applications, samples that test positive are subjected to quantitation. The most probable number (MPN) technique is a common method to quantify the level of contamination in a sample because it is able to provide estimates at low concentrations. This technique uses a series of dilution count experiments to derive estimates of the concentration of the microorganism of interest. An application for these data is food-safety risk assessment, where the MPN concentration estimates can be fitted to a parametric distribution to summarize the range of potential exposures to the contaminant. Many different methods (e.g., substitution methods, maximum likelihood and regression on order statistics) have been proposed to fit microbial contamination data to a distribution, but the development of these methods rarely considers how the MPN technique influences the choice of distribution function and fitting method. An often overlooked aspect when applying these methods is whether the data represent actual measurements of the average concentration of microorganisms per milliliter or real-valued estimates of the average concentration, as is the case with MPN data. In this study, we propose two methods for fitting MPN data to a probability distribution. The first method uses a maximum likelihood estimator that takes average concentration values as the data inputs. The second is a Bayesian latent variable method that uses the counts of the number of positive tubes at each dilution to estimate the parameters of the contamination distribution. The performance of the two fitting methods is compared for two data sets that represent Salmonella and Campylobacter concentrations on chicken carcasses. The results demonstrate a bias in the maximum likelihood estimator that increases with reductions in average concentration. The Bayesian method provided unbiased estimates of the concentration distribution parameters for all data sets. We provide computer code for the Bayesian fitting method. Published by Elsevier B.V.
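For context, the MPN estimate referred to above is itself a maximum likelihood estimate from a dilution series; the sketch below computes it for a classic 3-tube design, assuming each tube is positive with probability 1 - exp(-concentration × volume). The volumes and the observed 3-1-0 pattern are illustrative.

```python
# Minimal MPN sketch: with `tubes` tubes per dilution, each inoculated with
# volume `vols` (mL), a tube is positive with probability 1 - exp(-conc*vol).
# The MPN is the concentration maximizing the binomial likelihood.
import numpy as np
from scipy.optimize import minimize_scalar

vols = np.array([1.0, 0.1, 0.01])     # mL inoculated at each dilution
tubes = np.array([3, 3, 3])           # tubes per dilution
positive = np.array([3, 1, 0])        # observed positive tubes

def neg_log_lik(log_conc):
    p = 1 - np.exp(-np.exp(log_conc) * vols)          # P(tube positive)
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.sum(positive * np.log(p) + (tubes - positive) * np.log(1 - p))

res = minimize_scalar(neg_log_lik, bounds=(-5, 10), method="bounded")
print("MPN =", np.exp(res.x), "organisms per mL")  # ~4.3 for this 3-1-0 pattern
```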
Variability of FUV Emission Line in Classical T Tauri Stars as a Diagnostic for Disc Accretion
NASA Astrophysics Data System (ADS)
Ramkumar, B.; Johns-Krull, C. M.
2005-12-01
We present our results of FUV emission-line variability studies of four classical T Tauri stars. We used the IUE Final Archive spectra of pre-main-sequence stars to analyze the sample of four stars BP Tau, DR Tau, RU Lup, and RY Tau, where each of these low-resolution (~6 Å) spectra was observed in the IUE short-wavelength band pass (1150–1980 Å). Given the broad time line of multiple observations available from the IUE Final Archive, an intrinsic variability study has been possible with this sample. Our results indicate that the transition-region lines Si IV and C IV, produced near the accretion shocks at ~10^5 K, are strongly correlated in all four stars except DR Tau. We also observe a strong correlation between C IV and He II in our entire sample, with a correlation coefficient of 0.549 (false-alarm probability = 7.9 x 10^-2) or higher. In addition, He II correlates with the molecular hydrogen (1503 Å) line in all but RU Lup. If the He II lines are produced by X-ray ionization, then the observed molecular hydrogen emission is indeed controlled by X-ray ionization, and therefore He II could serve as an X-ray proxy for future studies. Our correlation results also strengthen the conclusion that C IV is a good predictor of Si IV and that the two lines have a common origin, i.e., in accretion shocks in the star-formation process.
PG 1553+113: Five Years Of Observations With Magic
Aleksić, J.
2012-03-05
We present the results of five years (2005-2009) of MAGIC observations of the BL Lac object PG 1553+113 at very high energies (VHEs; E > 100 GeV). Power-law fits of the individual years are compatible with a steady mean photon index Γ = 4.27 ± 0.14. In the last three years of data, the flux level above 150 GeV shows a clear variability (probability of constant flux < 0.001%). The flux variations are modest, lying in the range from 4% to 11% of the Crab Nebula flux. Simultaneous optical data also show only modest variability that seems to be correlated with VHE gamma-ray variability. We also performed a temporal analysis of (all available) simultaneous Fermi/Large Area Telescope data of PG 1553+113 above 1 GeV, which reveals hints of variability in the 2008-2009 sample. Finally, we present a combination of the mean spectrum measured at VHEs with archival data available for other wavelengths. The mean spectral energy distribution can be modeled with a one-zone synchrotron self-Compton model, which gives the main physical parameters governing the VHE emission in the blazar jet.
Systematic Review and Consensus Guidelines for Environmental Sampling of Burkholderia pseudomallei
Limmathurotsakul, Direk; Dance, David A. B.; Wuthiekanun, Vanaporn; Kaestli, Mirjam; Mayo, Mark; Warner, Jeffrey; Wagner, David M.; Tuanyok, Apichai; Wertheim, Heiman; Yoke Cheng, Tan; Mukhopadhyay, Chiranjay; Puthucheary, Savithiri; Day, Nicholas P. J.; Steinmetz, Ivo; Currie, Bart J.; Peacock, Sharon J.
2013-01-01
Background Burkholderia pseudomallei, a Tier 1 Select Agent and the cause of melioidosis, is a Gram-negative bacillus present in the environment in many tropical countries. Defining the global pattern of B. pseudomallei distribution underpins efforts to prevent infection, and is dependent upon robust environmental sampling methodology. Our objective was to review the literature on the detection of environmental B. pseudomallei, update the risk map for melioidosis, and propose international consensus guidelines for soil sampling. Methods/Principal Findings An international working party (Detection of Environmental Burkholderia pseudomallei Working Party (DEBWorP)) was formed during the VIth World Melioidosis Congress in 2010. PubMed (January 1912 to December 2011) was searched using the following MeSH terms: pseudomallei or melioidosis. Bibliographies were hand-searched for secondary references. The reported geographical distribution of B. pseudomallei in the environment was mapped and categorized as definite, probable, or possible. The methodology used for detecting environmental B. pseudomallei was extracted and collated. We found that global coverage was patchy, with a lack of studies in many areas where melioidosis is suspected to occur. The sampling strategies and bacterial identification methods used were highly variable, and not all were robust. We developed consensus guidelines with the goals of reducing the probability of false-negative results, and the provision of affordable and ‘low-tech’ methodology that is applicable in both developed and developing countries. Conclusions/Significance The proposed consensus guidelines provide the basis for the development of an accurate and comprehensive global map of environmental B. pseudomallei. PMID:23556010
Janssen, Ian
2016-11-01
The primary objective was to use isotemporal substitution models to estimate whether replacing time spent in sedentary video games (SVGs) and active outdoor play (AOP) with active video games (AVGs) would be associated with changes in youth's mental health. A representative sample of 20,122 Canadian youth in Grades 6-10 was studied. The exposure variables were average hours/day spent playing AVGs, SVGs, and AOP. The outcomes consisted of a negative and internalizing mental health indicator (emotional problems), a positive and internalizing mental health indicator (life satisfaction), and a positive and externalizing mental health indicator (prosocial behavior). Isotemporal substitution models estimated the extent to which replacing time spent in SVGs and AOP with an equivalent amount of time in AVGs had on the mental health indicators. Replacing 1 hour/day of SVGs with 1 hour/day of AVGs was associated with a 6% (95% confidence interval: 3%-9%) reduced probability of high emotional problems, a 4% (2%-7%) increased probability of high life satisfaction, and a 13% (9%-16%) increased probability of high prosocial behavior. Replacing 1 hour/day of AOP with 1 hour/day of AVGs was associated with a 7% (3%-11%) increased probability of high emotional problems, a 3% (1%-5%) reduced probability of high life satisfaction, and a 6% (2%-9%) reduced probability of high prosocial behavior. Replacing SVGs with AVGs was associated with more preferable mental health indicators. Conversely, replacing AOP with AVGs was associated with more deleterious mental health indicators. Copyright © 2016 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
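A hedged sketch of the isotemporal substitution idea: regress the outcome on the activity of interest plus total activity time while omitting the displaced activity, so the coefficient on AVG time reads as AVG replacing SVG hour-for-hour. Variable names, the data-generating model, and effect sizes are invented; this is not the study's dataset or exact specification.

```python
# Isotemporal substitution sketch: include total time and all activities
# except the one being displaced; the AVG coefficient is then the effect of
# one hour of AVG replacing one hour of the omitted activity (SVG).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 5000
df = pd.DataFrame({
    "avg_h": rng.uniform(0, 3, n),   # active video games, hours/day
    "svg_h": rng.uniform(0, 4, n),   # sedentary video games, hours/day
    "aop_h": rng.uniform(0, 3, n),   # active outdoor play, hours/day
})
df["total_h"] = df[["avg_h", "svg_h", "aop_h"]].sum(axis=1)
logit_p = -1.0 + 0.15 * df.avg_h + 0.25 * df.aop_h - 0.10 * df.svg_h
df["high_prosocial"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

m = smf.logit("high_prosocial ~ avg_h + aop_h + total_h", data=df).fit(disp=0)
print(m.params["avg_h"])  # log-odds change when AVG replaces SVG hour-for-hour
```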
Durand, Casey P
2013-01-01
Statistical interactions are a common component of data analysis across a broad range of scientific disciplines. However, the statistical power to detect interactions is often undesirably low. One solution is to elevate the Type 1 error rate so that important interactions are not missed in a low power situation. To date, no study has quantified the effects of this practice on power in a linear regression model. A Monte Carlo simulation study was performed. A continuous dependent variable was specified, along with three types of interactions: continuous variable by continuous variable; continuous by dichotomous; and dichotomous by dichotomous. For each of the three scenarios, the interaction effect sizes, sample sizes, and Type 1 error rate were varied, resulting in a total of 240 unique simulations. In general, power to detect the interaction effect was either so low or so high at α = 0.05 that raising the Type 1 error rate only served to increase the probability of including a spurious interaction in the model. A small number of scenarios were identified in which an elevated Type 1 error rate may be justified. Routinely elevating Type 1 error rate when testing interaction effects is not an advisable practice. Researchers are best served by positing interaction effects a priori and accounting for them when conducting sample size calculations.
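The flavor of such a simulation can be reproduced in a few lines: estimate power for a continuous-by-dichotomous interaction in linear regression at α = 0.05 and α = 0.10. The sample size, effect sizes, and number of replicates below are illustrative choices, not the study's 240 scenarios.

```python
# Monte Carlo power for a continuous-by-dichotomous interaction in OLS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

def power(n=200, beta_int=0.15, reps=2000, alpha=0.05):
    hits = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        g = rng.integers(0, 2, n)
        y = 0.3 * x + 0.3 * g + beta_int * x * g + rng.normal(size=n)
        X = sm.add_constant(np.column_stack([x, g, x * g]))
        p = sm.OLS(y, X).fit().pvalues[3]   # p-value of the interaction term
        hits += p < alpha
    return hits / reps

print("alpha=0.05:", power(alpha=0.05))
print("alpha=0.10:", power(alpha=0.10))  # the elevated rate buys modest power
```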
Costs of measuring leaf area index of corn
NASA Technical Reports Server (NTRS)
Daughtry, C. S. T.; Hollinger, S. E.
1984-01-01
The magnitude of plant-to-plant variability of leaf area of corn plants selected from uniform plots was examined, and four representative methods for measuring leaf area index (LAI) were evaluated. The number of plants required and the relative costs for each sampling method were calculated to detect 10, 20, and 50% differences in LAI using 0.05 and 0.01 tests of significance and a 90% probability of success (beta = 0.1). The natural variability of leaf area per corn plant was nearly 10%. Additional variability or experimental error may be introduced by the measurement technique employed and by nonuniformity within the plot. Direct measurement of leaf area with an electronic area meter had the lowest CV and required the fewest plants to be sampled, but it required approximately the same amount of time as the leaf area/weight ratio method to detect comparable differences. Indirect methods based on measurements of length and width of leaves required more plants but less total time than the direct method. Unless the coefficients for converting length and width to area are verified frequently, the indirect methods may be biased. When true differences in LAI among treatments exceed 50% of the mean, all four methods are equal. The method of choice depends on the resources available, the differences to be detected, and what additional information, such as leaf weight or stalk weight, is also desired.
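The sample-size logic above can be sketched with the usual normal-approximation formula for detecting a relative difference d between two means given a coefficient of variation CV. This is a generic textbook calculation offered for intuition, not the authors' exact cost model.

```python
# Plants per group needed to detect a relative difference d between two
# means with a two-sided test, given coefficient of variation CV
# (normal approximation: n = 2 * ((z_{1-a/2} + z_{power}) * CV / d)^2).
from scipy.stats import norm

def n_per_group(cv, d, alpha=0.05, power=0.90):
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 2 * (z * cv / d) ** 2

# CV of 10%, roughly the natural plant-to-plant variability reported above:
for d in (0.10, 0.20, 0.50):
    print(f"detect {d:.0%} difference: ~{n_per_group(0.10, d):.0f} plants/group")
```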
Therapeutic Progression in Abused Women Following a Drug-Addiction Treatment Program.
Fernández-Montalvo, Javier; López-Goñi, José J; Arteaga, Alfonso; Cacho, Raúl; Azanza, Paula
2015-06-30
This study explored the prevalence of victims of abuse and the therapeutic progression among women who sought treatment for drug addiction. A sample of 180 addicted Spanish women was assessed. Information was collected on the patients' lifetime history of abuse (psychological, physical, and/or sexual), socio-demographic factors, consumption variables, and psychological symptoms. Of the total sample, 74.4% (n = 134) of the addicted women had been victims of abuse. Psychological abuse affected 66.1% (n = 119) of the patients, followed by physical abuse (51.7%; n = 93) and sexual abuse (31.7%; n = 57). Compared with patients who had not been abused, the addicted women with histories of victimization scored significantly higher on several scales of the European version of the Addiction Severity Index (EuropASI) and on psychological variables. Specifically, physical abuse and sexual abuse were related to higher levels of severity of addiction. Regarding therapeutic progression, the highest rate of dropout was observed among victims of sexual abuse (63.5%; n = 33), followed by victims of physical abuse (48.9%; n = 23). Multivariate analysis showed that the medical and family areas of the EuropASI, as well as violence problems and suicidal ideation, were the main variables related to physical and/or sexual abuse. Moreover, women without a history of abuse and with fewer family problems presented a higher probability of treatment completion. The implications of these results for further research and clinical practice are discussed. © The Author(s) 2015.
Variable stars in Local Group Galaxies - II. Sculptor dSph
NASA Astrophysics Data System (ADS)
Martínez-Vázquez, C. E.; Stetson, P. B.; Monelli, M.; Bernard, E. J.; Fiorentino, G.; Gallart, C.; Bono, G.; Cassisi, S.; Dall'Ora, M.; Ferraro, I.; Iannicola, G.; Walker, A. R.
2016-11-01
We present the identification of 634 variable stars in the Milky Way dwarf spheroidal (dSph) satellite Sculptor based on archival ground-based optical observations spanning ~24 yr and covering ~2.5 deg². We employed the same methodologies as the `Homogeneous Photometry' series published by Stetson. In particular, we have identified and characterized one of the largest (536) RR Lyrae samples so far in a Milky Way dSph satellite. We have also detected four Anomalous Cepheids, 23 SX Phoenicis stars, five eclipsing binaries, three field variable stars, and three peculiar variable stars located above the horizontal branch - near the locus of BL Herculis - that we are unable to classify properly. Additionally, we identify 37 long-period variables plus 23 probable variable stars, for which the current data do not allow us to determine the period. We report positions and finding charts for all the variable stars, and basic properties (period, amplitude, mean magnitude) and light curves for 574 of them. We discuss the properties of the RR Lyrae stars in the Bailey diagram, which supports the coexistence of subpopulations with different chemical compositions. We estimate the mean mass of the Anomalous Cepheids (~1.5 M⊙) and SX Phoenicis stars (~1 M⊙). We discuss in detail the nature of the former. The connections between the properties of the different families of variable stars are discussed in the context of the star formation history of the Sculptor dSph galaxy.
High and variable mortality of leatherback turtles reveal possible anthropogenic impacts.
Santidrián Tomillo, P; Robinson, N J; Sanz-Aguilar, A; Spotila, J R; Paladino, F V; Tavecchia, G
2017-08-01
The number of nesting leatherback turtles (Dermochelys coriacea) in the eastern Pacific Ocean has declined dramatically since the late 1980s. This decline has been attributed to egg poaching and interactions with fisheries. However, it is not clear how much of the decline should also be ascribed to variability in the physical characteristics of the ocean. We used data on individually marked turtles that nest at Playa Grande, Costa Rica, to address whether climatic variability affects survival and inter-breeding intervals. Because some turtles might nest undetected, we used capture-recapture models to model survival probability while accounting for imperfect detection. In addition, as the probability of reproduction is constrained by past nesting events, we formulated a new parameterization to estimate inter-breeding intervals and contrast hypotheses on the role of climatic covariates on reproductive frequency. Average annual survival for the period 1993-2011 was low (0.78) and varied over time, ranging from 0.49 to 0.99 with a negative temporal trend mainly due to the high mortality values registered after 2004. Survival probability was not associated with the Multivariate ENSO Index of the South Pacific Ocean (MEI), but this index explained 24% of the temporal variability in the reproductive frequency. The probability of a turtle permanently leaving after the first encounter was 26%. This high proportion of transients might be associated with a high mortality cost of the first reproduction or with long-distance nesting dispersal after the first nesting season. Although the current data do not allow these two hypotheses to be separated, the low encounter rate at other locations and the high investment in reproduction support the first hypothesis. The low and variable annual survival probability has contributed greatly to the decline of this leatherback population. The lack of correlation between survival probability and the most important climatic driver of oceanic processes in the Pacific rules out a climate-related decline and points to anthropogenic sources of mortality as the main causes of the observed population decline. © 2017 by the Ecological Society of America.
Exploring cluster Monte Carlo updates with Boltzmann machines
NASA Astrophysics Data System (ADS)
Wang, Lei
2017-11-01
Boltzmann machines are physics informed generative models with broad applications in machine learning. They model the probability distribution of an input data set with latent variables and generate new samples accordingly. Applying the Boltzmann machines back to physics, they are ideal recommender systems to accelerate the Monte Carlo simulation of physical systems due to their flexibility and effectiveness. More intriguingly, we show that the generative sampling of the Boltzmann machines can even give different cluster Monte Carlo algorithms. The latent representation of the Boltzmann machines can be designed to mediate complex interactions and identify clusters of the physical system. We demonstrate these findings with concrete examples of the classical Ising model with and without four-spin plaquette interactions. In the future, automatic searches in the algorithm space parametrized by Boltzmann machines may discover more innovative Monte Carlo updates.
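As a minimal illustration of the generative sampling mentioned above, the sketch below runs block Gibbs sampling in a small restricted Boltzmann machine with random weights; in the physics application the weights would be trained on spin configurations, so this is only a structural demonstration.

```python
# Block Gibbs sampling in a tiny restricted Boltzmann machine: alternate
# sampling the hidden layer given the visible layer, and vice versa.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

n_v, n_h = 16, 8                       # visible spins, latent units
W = rng.normal(0, 0.3, (n_v, n_h))     # random weights (untrained)
a, b = np.zeros(n_v), np.zeros(n_h)    # visible and hidden biases

v = (rng.random(n_v) < 0.5).astype(float)
for _ in range(1000):                  # alternating block updates
    h = (rng.random(n_h) < sigmoid(v @ W + b)).astype(float)
    v = (rng.random(n_v) < sigmoid(W @ h + a)).astype(float)
print(v)                               # one sample from the model distribution
```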
A Dynamic Bayesian Network Model for the Production and Inventory Control
NASA Astrophysics Data System (ADS)
Shin, Ji-Sun; Takazaki, Noriyuki; Lee, Tae-Hong; Kim, Jin-Il; Lee, Hee-Hyol
In general, production quantities and deliveries of goods vary randomly, and the total stock therefore also varies randomly. This paper deals with production and inventory control using a Dynamic Bayesian Network. A Bayesian network is a probabilistic model that represents the qualitative dependences between two or more random variables by a graph structure, and indicates the quantitative relations between individual variables by conditional probabilities. The probability distribution of the total stock is calculated through the propagation of probability on the network. Moreover, an adjustment rule for the production quantities that keeps the probabilities of the total stock staying above a lower limit and below a ceiling at certain values is shown.
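The propagation step can be illustrated numerically in a simplified form: treating daily production and deliveries as independent discrete random variables, the distribution of the stock change is their convolution. The distributions below are invented, and a full dynamic Bayesian network would additionally condition on parent variables.

```python
# Simplified propagation: the stock change (production minus deliveries) of
# two independent discrete random variables follows from a convolution;
# shifting by the current stock gives tomorrow's stock distribution.
import numpy as np

prod = np.array([0.2, 0.5, 0.3])        # P(produce 0, 1, 2 units)
deliv = np.array([0.3, 0.4, 0.3])       # P(deliver 0, 1, 2 units)

change = np.convolve(prod, deliv[::-1]) # distribution of production - delivery
offsets = np.arange(-2, 3)              # support of the change: -2 .. +2
stock_now = 5
for k, p in zip(stock_now + offsets, change):
    print(f"P(stock tomorrow = {k}) = {p:.3f}")
```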
DOE Office of Scientific and Technical Information (OSTI.GOV)
Laycock, Silas; Cappallo, Rigel; Williams, Benjamin F.
We have monitored the Cassiopeia dwarf galaxy (IC 10) in a series of 10 Chandra ACIS-S observations to capture its variable and transient X-ray source population, which is expected to be dominated by High Mass X-ray Binaries (HMXBs). We present a sample of 21 X-ray sources that are variable between observations at the 3 σ level, from a catalog of 110 unique point sources. We find four transients (flux variability ratio greater than 10) and a further eight objects with ratios >5. The observations span the years 2003–2010 and reach a limiting luminosity of >10^35 erg s^-1, providing sensitivity to X-ray binaries in IC 10 as well as flare stars in the foreground Milky Way. The nature of the variable sources is investigated from light curves, X-ray spectra, energy quantiles, and optical counterparts. The purpose of this study is to discover the composition of the X-ray binary population in a young starburst environment. IC 10 provides a sharp contrast in stellar population age (<10 My) when compared to the Magellanic Clouds (40–200 My) where most of the known HMXBs reside. We find 10 strong HMXB candidates, 2 probable background Active Galactic Nuclei, 4 foreground flare-stars or active binaries, and 5 not yet classifiable sources. Complete classification of the sample requires optical spectroscopy for radial velocity analysis and deeper X-ray observations to obtain higher S/N spectra and search for pulsations. A catalog and supporting data set are provided.
Riparian vegetation structure under desertification scenarios
NASA Astrophysics Data System (ADS)
Rosário Fernandes, M.; Segurado, Pedro; Jauch, Eduardo; Ferreira, M. Teresa
2015-04-01
Riparian areas are responsible for many ecological and ecosystem services, including the filtering function, that are considered crucial to the preservation of water quality and social benefits. The main goal of this study is to quantify and understand riparian variability under desertification scenario(s) and to identify the optimal riparian indicators for water scarcity and droughts (WS&D), thereby improving river basin management. This study was performed in the Iberian Tâmega basin, using riparian woody patches mapped by visual interpretation of Google Earth imagery along 130 sampling units of 250 m long river stretches. Eight riparian structural indicators, related to the lateral dimension, weighted area, and shape complexity of riparian patches, were calculated using the Patch Analyst extension for ArcGIS 10. A set of 29 hydrological, climatic, and hydrogeomorphological variables were computed by a water modelling system (MOHID), using monthly meteorological data between 2008 and 2014. Land-use classes were also calculated, in a 250 m buffer surrounding each sampling unit, using a classification system based on Corine Land Cover. Boosted Regression Trees identified mean width (MW) as the optimal riparian indicator for water scarcity and drought, followed by the weighted class area (WCA) (classification accuracy = 0.79 and 0.69, respectively). Average flow and Strahler number were consistently selected by all boosted models as the most important explanatory variables. However, a combined effect of hydrogeomorphology and land use can explain the high variability found in riparian width, mainly in the Tâmega tributaries. Riparian patches are larger towards the Tâmega river mouth, although with lower shape complexity, probably related to more continuous and almost monospecific stands. Climatic, hydrological, and land-use scenarios, singly and combined, were used to quantify the riparian variability in response to these changes and to assess the loss of riparian functions such as nutrient incorporation and sediment flux alterations.
Salisbury, Margaret L; Xia, Meng; Murray, Susan; Bartholmai, Brian J; Kazerooni, Ella A; Meldrum, Catherine A; Martinez, Fernando J; Flaherty, Kevin R
2016-09-01
Idiopathic pulmonary fibrosis (IPF) can be diagnosed confidently and non-invasively when clinical and computed tomography (CT) criteria are met. Many patients do not meet these criteria due to the absence of CT honeycombing. We investigated predictors of IPF and combinations of predictors allowing accurate diagnosis in individuals without honeycombing. We utilized prospectively collected clinical and CT data from patients enrolled in the Lung Tissue Research Consortium. Included patients had no honeycombing, no connective tissue disease, underwent diagnostic lung biopsy, and had a CT pattern consistent with fibrosing ILD (n = 200). Logistic regression identified clinical and CT variables predictive of IPF. The probability of IPF was assessed at various cut-points of important clinical and CT variables. A multivariable model adjusted for age and gender found that increasingly extensive reticular densities predicted IPF (OR 2.93, 95% CI 1.55-5.56, p = 0.001), while increasing ground glass densities predicted a diagnosis other than IPF (OR 0.55, 95% CI 0.34-0.89, p = 0.02). The model-based probability of IPF was 80% or greater in patients with age at least 60 years and extent of reticular density one-third or more of total lung volume; for patients meeting or exceeding these clinical thresholds the specificity for IPF is 96% (95% CI 91-100%), with 21 of 134 (16%) biopsies avoided. In patients with suspected fibrotic ILD and absence of CT honeycombing, the extents of reticular and ground glass densities predict a diagnosis of IPF. The probability of IPF exceeds 80% in subjects over age 60 years with one-third of the total lung having reticular densities. Copyright © 2016 Elsevier Ltd. All rights reserved.
Analyses of flood-flow frequency for selected gaging stations in South Dakota
Benson, R.D.; Hoffman, E.B.; Wipf, V.J.
1985-01-01
Analyses of flood flow frequency were made for 111 continuous-record gaging stations in South Dakota with 10 or more years of record. The analyses were developed using the log-Pearson Type III procedure recommended by the U.S. Water Resources Council. The procedure characterizes flood occurrence at a single site as a sequence of annual peak flows. The magnitudes of the annual peak flows are assumed to be independent random variables following a log-Pearson Type III probability distribution, which defines the probability that any single annual peak flow will exceed a specified discharge. By considering only annual peak flows, the flood-frequency analysis becomes the estimation of the log-Pearson annual-probability curve using the record of annual peak flows at the site. The recorded data are divided into two classes: systematic and historic. The systematic record includes all annual peak flows determined in the process of conducting a systematic gaging program at a site. In this program, the annual peak flow is determined for each and every year of the program. The systematic record is intended to constitute an unbiased and representative sample of the population of all possible annual peak flows at the site. In contrast to the systematic record, the historic record consists of annual peak flows that would not have been determined except for evidence indicating their unusual magnitude. Flood information acquired from historical sources almost invariably refers to floods of noteworthy, and hence extraordinary, size. Although historic records form a biased and unrepresentative sample, they can be used to supplement the systematic record. (Author's abstract)
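A hedged sketch of the log-Pearson Type III computation described above: fit by method of moments on the log10 annual peaks and read off the 1% annual-exceedance-probability quantile (the 100-year flood). The peaks are synthetic, and the full Bulletin 17 procedure (e.g., regional skew weighting, historic-record adjustments) is omitted.

```python
# Log-Pearson Type III flood frequency, method-of-moments on log10 peaks.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
peaks = rng.lognormal(mean=6.0, sigma=0.7, size=40)   # synthetic annual peaks

logs = np.log10(peaks)
m, s, g = logs.mean(), logs.std(ddof=1), stats.skew(logs, bias=False)

aep = 0.01                                # 1% annual exceedance probability
k = stats.pearson3.ppf(1 - aep, skew=g)   # standardized frequency factor K
q100 = 10 ** (m + k * s)
print(f"estimated 100-year peak flow = {q100:,.0f}")
```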
A Comparative Study of Involvement and Motivation among Casino Gamblers
Lee, Choong-Ki; Lee, BongKoo; Bernhard, Bo Jason
2009-01-01
Objective: The purpose of this paper is to investigate three different types of gamblers (which we label "non-problem", "some problem", and "probable pathological" gamblers) to determine differences in involvement and motivation, as well as differences in demographic and behavioral variables. Methods: The analysis takes advantage of a unique opportunity to sample on-site at a major casino in South Korea, and the resulting purposive sample yielded 180 completed questionnaires in each of the three groups, for a total of 540. Factor analysis, analysis of variance (ANOVA) and Duncan tests, and Chi-square tests were employed to analyze the data collected from the survey. Results: Findings from the ANOVA tests indicate that the involvement factors of importance/self-expression, pleasure/interest, and centrality derived from the factor analysis were significantly different among these three types of gamblers. The "probable pathological" and "some problem" gamblers were found to have similar degrees of involvement, and higher degrees of involvement than the non-problem gamblers. The tests also reveal that the motivational factors of escape, socialization, winning, and exploring scenery were significantly different among these three types of gamblers. When looking at motivations to visit the casino, "probable pathological" gamblers were more likely to seek winning, the "some problem" group appeared more likely to seek escape, and the "non-problem" gamblers indicated that their motivations to visit centered on exploration of scenery and culture in the surrounding casino area. Conclusion: The tools for exploring motivations and involvements of gambling provide valuable and discerning information about the entire spectrum of gamblers. PMID:20046388
NASA Technical Reports Server (NTRS)
Sankararaman, Shankar; Goebel, Kai
2013-01-01
This paper investigates the use of the inverse first-order reliability method (inverse-FORM) to quantify the uncertainty in the remaining useful life (RUL) of aerospace components. The prediction of remaining useful life is an integral part of system health prognosis, and directly helps in online health monitoring and decision-making. However, the prediction of remaining useful life is affected by several sources of uncertainty, and therefore it is necessary to quantify the uncertainty in the remaining useful life prediction. While system parameter uncertainty and physical variability can be easily included in inverse-FORM, this paper extends the methodology to include: (1) future loading uncertainty; (2) process noise; and (3) uncertainty in the state estimate. The inverse-FORM method has been used in this paper to (1) quickly obtain probability bounds on the remaining useful life prediction; and (2) calculate the entire probability distribution of the remaining useful life prediction, and the results are verified against Monte Carlo sampling. The proposed methodology is illustrated using a numerical example.
Weathering the storm: hurricanes and birth outcomes.
Currie, Janet; Rossin-Slater, Maya
2013-05-01
A growing literature suggests that stressful events in pregnancy can have negative effects on birth outcomes. Some of the estimates in this literature may be affected by small samples, omitted variables, endogenous mobility in response to disasters, and errors in the measurement of gestation, as well as by a mechanical correlation between longer gestation and the probability of having been exposed. We use millions of individual birth records to examine the effects of exposure to hurricanes during pregnancy, and the sensitivity of the estimates to these econometric problems. We find that exposure to a hurricane during pregnancy increases the probability of abnormal conditions of the newborn such as being on a ventilator more than 30 minutes and meconium aspiration syndrome (MAS). Although we are able to reproduce previous estimates of effects on birth weight and gestation, our results suggest that measured effects of stressful events on these outcomes are sensitive to specification and it is preferable to use more sensitive indicators of newborn health. Copyright © 2013 Elsevier B.V. All rights reserved. PMID:23500506
NASA Technical Reports Server (NTRS)
Ahumada, Albert J., Jr.; Null, Cynthia H. (Technical Monitor)
1998-01-01
Adding noise to stimuli to be discriminated allows estimation of observer classification functions based on the correlation between observer responses and relevant features of the noisy stimuli. Examples will be presented of stimulus features that are found in auditory tone detection and visual vernier acuity. Using the standard signal detection model (Thurstone scaling), we derive formulas to estimate the proportion of the observer's decision-variable variance that is controlled by the added noise. One is based on the probability of agreement of the observer with him/herself on trials with the same noise sample. Another is based on the relative performance of the observer and the model. When these do not agree, the model can be rejected. A second derivation gives the probability of agreement of observer and model when the observer follows the model except for internal noise. Agreement significantly less than this amount allows rejection of the model.
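The first formula mentioned above can be evaluated numerically under the stated model: with the decision variable equal to an external-noise effect plus internal noise, the probability of agreeing with oneself on a repeated noise sample links directly to the fraction of decision-variable variance carried by the added noise. The criterion placement and integration grid are implementation choices in this sketch.

```python
# Self-agreement probability under a Thurstone-style model: decision
# variable = external-noise effect E + internal noise I. For a repeated
# identical noise sample, the observer agrees with probability
# p(E)^2 + (1 - p(E))^2, where p(E) = P("yes" | E); integrate over E.
import numpy as np
from scipy.stats import norm

def p_agree(var_external, var_internal, criterion=0.0):
    e = np.linspace(-6, 6, 4001) * np.sqrt(var_external)
    w = norm.pdf(e, scale=np.sqrt(var_external))
    p = norm.sf(criterion - e, scale=np.sqrt(var_internal))  # P("yes" | E=e)
    agree = p**2 + (1 - p)**2
    return np.trapz(agree * w, e)

for frac in (0.3, 0.7, 0.9):   # fraction of decision variance from added noise
    print(frac, p_agree(frac, 1 - frac))
```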
Huang, Biao; Zhao, Yongcun
2014-01-01
Estimating standard-exceeding probabilities of toxic metals in soil is crucial for environmental evaluation. Because soil pH and land use types have strong effects on the bioavailability of trace metals in soil, they were taken into account by some environmental protection agencies in making composite soil environmental quality standards (SEQSs) that contain multiple metal thresholds under different pH and land use conditions. This study proposed a method for estimating the standard-exceeding probability map of soil cadmium using a composite SEQS. The spatial variability and uncertainty of soil pH and site-specific land use type were incorporated through simulated realizations by sequential Gaussian simulation. A case study was conducted using a sample data set from a 150 km² area in Wuhan City and the composite SEQS for cadmium, recently set by the State Environmental Protection Administration of China. The method may be useful for evaluating the pollution risks of trace metals in soil with composite SEQSs. PMID:24672364
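The core of the proposed mapping can be sketched as follows: given joint realizations of Cd and pH at a location (e.g., from sequential Gaussian simulation), the exceedance probability is the fraction of realizations in which Cd exceeds the pH-dependent threshold. The threshold values below are invented stand-ins for the composite SEQS, and the realizations are simulated directly rather than geostatistically.

```python
# Exceedance probability from per-location realizations of Cd and pH.
import numpy as np

def threshold_cd(ph):
    # illustrative composite standard: stricter limit in acidic soils (mg/kg)
    return np.where(ph < 6.5, 0.30, np.where(ph < 7.5, 0.45, 0.60))

rng = np.random.default_rng(11)
n_real = 500                                   # realizations per location
cd = rng.lognormal(np.log(0.35), 0.4, n_real)  # simulated Cd, one location
ph = rng.normal(6.8, 0.5, n_real)              # simulated pH, same location

prob_exceed = np.mean(cd > threshold_cd(ph))
print("P(Cd exceeds standard) =", prob_exceed)
```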
Case−Control Study of Risk Factors for Meningococcal Disease in Chile
Matute, Isabel; González, Claudia; Delgado, Iris; Poffald, Lucy; Pedroni, Elena; Alfaro, Tania; Hirmas, Macarena; Nájera, Manuel; Gormaz, Ana; López, Darío; Loayza, Sergio; Ferreccio, Catterina; Gallegos, Doris; Fuentes, Rodrigo; Vial, Pablo; Aguilera, Ximena
2017-01-01
An outbreak of meningococcal disease, with a case-fatality rate of 30% and caused predominantly by serogroup W of Neisseria meningitidis, began in Chile in 2012. This outbreak required a case-control study to assess determinants and risk factors for infection. We identified confirmed cases during January 2012-March 2013 and selected controls by random sampling of the population, matched for age and sex, resulting in 135 case-patients and 618 controls. Sociodemographic variables, habits, and previous illnesses were studied. Analyses yielded adjusted odds ratios as estimators of the probability of disease development. Results indicated that conditions of social vulnerability, such as low income and overcrowding, as well as familial history of this disease and clinical histories, especially chronic diseases and hospitalization for respiratory conditions, increased the probability of illness. Findings should contribute to direction of intersectoral public policies toward a highly vulnerable social group to enable them to improve their living conditions and health. PMID:28628448
Risk forewarning model for rice grain Cd pollution based on Bayes theory.
Wu, Bo; Guo, Shuhai; Zhang, Lingyan; Li, Fengmei
2018-03-15
Cadmium (Cd) pollution of rice grain caused by Cd-contaminated soils is a common problem in southwest and central south China. In this study, utilizing the advantages of the Bayes classification statistical method, we established a risk forewarning model for rice grain Cd pollution and put forward two parameters: the prior probability factor and the data variability factor. Sensitivity analysis of the model parameters illustrated that sample size and standard deviation influenced the accuracy and applicable range of the model. Model accuracy was further improved through self-renewal, in which posterior data were added to the prior data. The method can be used to predict the risk probability of rice grain Cd pollution under similar soil, tillage, and rice-variety conditions. The Bayes approach thus represents a feasible method for risk forewarning of heavy-metal pollution of agricultural products caused by contaminated soils.
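A minimal sketch of the Bayes classification step, assuming Gaussian class-conditional densities with illustrative parameters (the prior probability factor enters as the class prior and the data variability factor as the class standard deviations; none of the numbers below are from the study):

```python
from scipy.stats import norm

def pollution_risk(x, prior_polluted, mu, sd):
    """Posterior probability that rice grain from a field exceeds the Cd
    limit, given a soil predictor x, Gaussian class-conditional densities,
    and a prior probability factor. All parameters here are illustrative."""
    like_p = norm.pdf(x, mu["polluted"], sd["polluted"])
    like_c = norm.pdf(x, mu["clean"], sd["clean"])
    return (like_p * prior_polluted
            / (like_p * prior_polluted + like_c * (1.0 - prior_polluted)))

mu = {"polluted": 0.8, "clean": 0.3}    # hypothetical soil-Cd class means (mg/kg)
sd = {"polluted": 0.25, "clean": 0.15}  # hypothetical data variability factors
print(pollution_risk(0.6, prior_polluted=0.4, mu=mu, sd=sd))  # ~0.68
```

In this framing, the paper's self-renewal step corresponds to refitting the class parameters and the prior as newly observed (posterior) samples are appended to the prior data.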
Dickey, C; Santella, R M; Hattis, D; Tang, D; Hsu, Y; Cooper, T; Young, T L; Perera, F P
1997-10-01
Biomarkers such as DNA adducts have significant potential to improve quantitative risk assessment by characterizing individual differences in the metabolism of genotoxins and DNA repair, accounting for some of the factors that could affect interindividual variation in cancer risk. Inherent uncertainty in laboratory measurements and within-person variability of DNA adduct levels over time are putatively unrelated to cancer risk and should be subtracted from observed variation to better estimate interindividual variability of response to carcinogen exposure. A total of 41 volunteers, both smokers and nonsmokers, were asked to provide a peripheral blood sample every 3 weeks for several months to specifically assess intraindividual variability of polycyclic aromatic hydrocarbon (PAH)-DNA adduct levels. The intraindividual variance in PAH-DNA adduct levels, together with measurement uncertainty (laboratory variability and unaccounted-for differences in exposure), constituted roughly 30% of the overall variance. An estimated 70% of the total variance was contributed by interindividual variability and is probably representative of the true biologic variability of response to carcinogenic exposure in lymphocytes. The estimated interindividual variability in DNA damage, after subtracting intraindividual variability and measurement uncertainty, was 24-fold. Interindividual variance was higher (52-fold) in persons who constitutively lack the glutathione S-transferase M1 (GSTM1) gene, which is important in the detoxification pathway of PAH. Risk assessment models that do not consider the variability of susceptibility to DNA damage following carcinogen exposure may underestimate risks to the general population, especially for those people who are most vulnerable.
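The variance partition described above can be sketched as a one-way random-effects decomposition of repeated log-adduct measurements. This assumes lognormal adduct levels and balanced repeats, and is only a schematic of the fuller analysis in the paper; the data below are simulated, not the study's:

```python
import numpy as np

def variance_components(log_adducts):
    """log_adducts: dict person -> array of that person's repeated log-adduct
    measurements. Splits total variance into a within-person component
    (intraindividual variability plus measurement uncertainty) and a
    between-person (interindividual) component."""
    groups = list(log_adducts.values())
    within = np.mean([np.var(g, ddof=1) for g in groups])
    total = np.var(np.concatenate(groups), ddof=1)
    between = max(total - within, 0.0)
    return within, between

# Simulated demo: 41 persons, 6 repeats each; each person has a stable mean.
rng = np.random.default_rng(2)
data = {p: rng.normal(rng.normal(0.0, 0.8), 0.4, size=6) for p in range(41)}
w, b = variance_components(data)
# A lognormal fold-range (97.5th over 2.5th percentile) for interindividual
# variability can be approximated as exp(2 * 1.96 * sqrt(between)).
print(f"within = {w:.2f}, between = {b:.2f}, "
      f"fold-range ~ {np.exp(2 * 1.96 * np.sqrt(b)):.0f}")
```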
A brief introduction to probability.
Di Paola, Gioacchino; Bertani, Alessandro; De Monte, Lavinia; Tuzzolino, Fabio
2018-02-01
The theory of probability has been debated for centuries: as early as the 1600s, French mathematicians used the rules of probability to place and win bets. The understanding of probability has since evolved significantly, and probability is now an essential tool for statistics. In this paper, the basic theoretical principles of probability will be reviewed, with the aim of facilitating the comprehension of statistical inference. After a brief general introduction to probability, we will review the concept of the "probability distribution," a function providing the probabilities of occurrence of the different possible outcomes of a categorical or continuous variable. Specific attention will be focused on the normal distribution, which is the most relevant distribution applied in statistical analysis.
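As a small worked example of using a probability distribution (an illustration added here, not part of the paper), the familiar two-sigma rule for the normal distribution:

```python
from scipy.stats import norm

# P(a < X < b) for X ~ N(mu, sigma) is an area under the normal density.
mu, sigma = 100, 15
p = norm.cdf(130, mu, sigma) - norm.cdf(70, mu, sigma)
print(f"P(70 < X < 130) = {p:.4f}")  # ~0.9545: the two-sigma rule
```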