Planning Community-Based Assessments of HIV Educational Intervention Programs in Sub-Saharan Africa
ERIC Educational Resources Information Center
Kelcey, Ben; Shen, Zuchao
2017-01-01
A key consideration in planning studies of community-based HIV education programs is identifying a sample size large enough to ensure a reasonable probability of detecting program effects if they exist. Sufficient sample sizes for community- or group-based designs are proportional to the correlation or similarity of individuals within communities.…
Red-shouldered hawk occupancy surveys in central Minnesota, USA
Henneman, C.; McLeod, M.A.; Andersen, D.E.
2007-01-01
Forest-dwelling raptors are often difficult to detect because many species occur at low density or are secretive. Broadcasting conspecific vocalizations can increase the probability of detecting forest-dwelling raptors and has been shown to be an effective method for locating raptors and assessing their relative abundance. Recent advances in statistical techniques based on presence-absence data use probabilistic arguments to derive probability of detection when it is <1 and to provide a model and likelihood-based method for estimating proportion of sites occupied. We used these maximum-likelihood models with data from red-shouldered hawk (Buteo lineatus) call-broadcast surveys conducted in central Minnesota, USA, in 1994-1995 and 2004-2005. Our objectives were to obtain estimates of occupancy and detection probability 1) over multiple sampling seasons (yr), 2) incorporating within-season time-specific detection probabilities, 3) with call type and breeding stage included as covariates in models of probability of detection, and 4) with different sampling strategies. We visited individual survey locations 2-9 times per year, and estimates of both probability of detection (range = 0.28-0.54) and site occupancy (range = 0.81-0.97) varied among years. Detection probability was affected by inclusion of a within-season time-specific covariate, call type, and breeding stage. In 2004 and 2005 we used survey results to assess the effect that number of sample locations, double sampling, and discontinued sampling had on parameter estimates. We found that estimates of probability of detection and proportion of sites occupied were similar across different sampling strategies, and we suggest ways to reduce sampling effort in a monitoring program.
A National Survey of Tobacco Cessation Programs for Youths
Curry, Susan J.; Emery, Sherry; Sporer, Amy K.; Mermelstein, Robin; Flay, Brian R.; Berbaum, Michael; Warnecke, Richard B.; Johnson, Timothy; Mowery, Paul; Parsons, Jennifer; Harmon, Lori; Hund, Lisa; Wells, Henry
2007-01-01
Objectives. We collected data on a national sample of existing community-based tobacco cessation programs for youths to understand their prevalence and overall characteristics. Methods. We employed a 2-stage sampling design with US counties as the first-stage probability sampling units. We then used snowball sampling in selected counties to identify administrators of tobacco cessation programs for youths. We collected data on cessation programs when programs were identified. Results. We profiled 591 programs in 408 counties. Programs were more numerous in urban counties; fewer programs were found in low-income counties. State-level measures of smoking prevalence and tobacco control expenditures were not associated with program availability. Most programs were multisession, school-based group programs serving 50 or fewer youths per year. Program content included cognitive-behavioral components found in adult programs along with content specific to adolescence. The median annual budget was $2000. Few programs (9%) reported only mandatory enrollment, 35% reported mixed mandatory and voluntary enrollment, and 56% reported only voluntary enrollment. Conclusions. There is considerable homogeneity among community-based tobacco cessation programs for youths. Programs are least prevalent in the types of communities for which national data show increases in youths’ smoking prevalence. PMID:17138932
Statistical computation of tolerance limits
NASA Technical Reports Server (NTRS)
Wheeler, J. T.
1993-01-01
Based on a new theory, two computer codes were developed specifically to calculate the exact statistical tolerance limits for normal distributions with unknown means and variances for the one-sided and two-sided cases for the tolerance factor, k. The quantity k is defined equivalently in terms of the noncentral t-distribution by the probability equation. Two of the four mathematical methods employ the theory developed for the numerical simulation. Several algorithms for numerically integrating and iteratively root-solving the working equations are written to augment the program simulation. The program codes generate tables of k values associated with varying values of the proportion and sample size for each given probability, to show the accuracy obtained for small sample sizes.
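A minimal sketch of the one-sided calculation the abstract refers to, assuming the standard noncentral-t formulation and using SciPy; the sample size, proportion, and confidence level are illustrative, not output from the NASA codes.

```python
# Hedged sketch: exact one-sided normal tolerance factor k via the
# noncentral t-distribution. Variable names (n, P, gamma) are illustrative.
from math import sqrt
from scipy.stats import norm, nct

def one_sided_k(n, P, gamma):
    """k such that xbar + k*s covers proportion P with confidence gamma."""
    delta = norm.ppf(P) * sqrt(n)          # noncentrality parameter
    return nct.ppf(gamma, df=n - 1, nc=delta) / sqrt(n)

# Example: n = 10 samples, 95 % coverage with 95 % confidence
print(round(one_sided_k(10, 0.95, 0.95), 3))   # ~2.911, matching standard tables
```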
Probabilistic methods for rotordynamics analysis
NASA Technical Reports Server (NTRS)
Wu, Y.-T.; Torng, T. Y.; Millwater, H. R.; Fossum, A. F.; Rheinfurth, M. H.
1991-01-01
This paper summarizes the development of the methods and a computer program to compute the probability of instability of dynamic systems that can be represented by a system of second-order ordinary linear differential equations. Two instability criteria based upon the eigenvalues or Routh-Hurwitz test functions are investigated. Computational methods based on a fast probability integration concept and an efficient adaptive importance sampling method are proposed to perform efficient probabilistic analysis. A numerical example is provided to demonstrate the methods.
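A plain Monte Carlo sketch of the instability-probability idea (not the fast probability integration or adaptive importance sampling methods the paper develops); the second-order system and the parameter distributions are illustrative.

```python
# Sample uncertain parameters of a second-order system and apply the
# eigenvalue instability criterion: unstable if any eigenvalue of the
# state matrix has a positive real part. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def unstable(m, c, k):
    # State matrix of m*x'' + c*x' + k*x = 0
    A = np.array([[0.0, 1.0], [-k / m, -c / m]])
    return np.max(np.linalg.eigvals(A).real) > 0.0

n_sim = 20_000
c = rng.normal(0.05, 0.10, n_sim)   # uncertain damping (can go negative)
k = rng.normal(4.0, 0.5, n_sim)     # uncertain stiffness
p_instab = np.mean([unstable(1.0, ci, ki) for ci, ki in zip(c, k)])
print(f"Estimated probability of instability: {p_instab:.4f}")
```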
ELIPGRID-PC: A PC program for calculating hot spot probabilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davidson, J.R.
1994-10-01
ELIPGRID-PC, a new personal computer program, has been developed to provide easy access to Singer's 1972 ELIPGRID algorithm for hot-spot detection probabilities. Three features of the program are the ability to determine: (1) the grid size required for specified conditions, (2) the smallest hot spot that can be sampled with a given probability, and (3) the approximate grid size resulting from specified conditions and sampling cost. ELIPGRID-PC also provides probability of hit versus cost data for graphing with spreadsheets or graphics software. The program has been successfully tested using Singer's published ELIPGRID results. An apparent error in the original ELIPGRID code has been uncovered and an appropriate modification incorporated into the new program.
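A brute-force Monte Carlo sketch of the quantity ELIPGRID computes analytically: the probability that a square sampling grid hits an elliptical hot spot of given size with random position and orientation. This is not Singer's algorithm; grid spacing and ellipse axes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def hit_probability(G, a, b, n_sim=50_000):
    """P(a square grid of spacing G hits an ellipse with semi-axes a, b)."""
    hits = 0
    for _ in range(n_sim):
        cx, cy = rng.uniform(0, G, size=2)      # hot-spot centre within one grid cell
        theta = rng.uniform(0, np.pi)           # random orientation
        # Nearby grid nodes, transformed into the ellipse frame.
        gx, gy = np.meshgrid(np.arange(-2, 3) * G, np.arange(-2, 3) * G)
        dx, dy = gx - cx, gy - cy
        u = dx * np.cos(theta) + dy * np.sin(theta)
        v = -dx * np.sin(theta) + dy * np.cos(theta)
        if np.any((u / a) ** 2 + (v / b) ** 2 <= 1.0):
            hits += 1
    return hits / n_sim

print(hit_probability(G=10.0, a=6.0, b=3.0))    # hit probability for one scenario
```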
Chen, Peichen; Liu, Shih-Chia; Liu, Hung-I; Chen, Tse-Wei
2011-01-01
For quarantine sampling, it is of fundamental importance to determine the probability of finding an infestation when a specified number of units are inspected. In general, current sampling procedures assume 100% probability (perfect) of detecting a pest if it is present within a unit. Ideally, a nematode extraction method should remove all stages of all species with 100% efficiency regardless of season, temperature, or other environmental conditions; in practice however, no method approaches these criteria. In this study we determined the probability of detecting nematode infestations for quarantine sampling with imperfect extraction efficacy. Also, the required sample and the risk involved in detecting nematode infestations with imperfect extraction efficacy are presented. Moreover, we developed a computer program to calculate confidence levels for different scenarios with varying proportions of infestation and efficacy of detection. In addition, a case study, presenting the extraction efficacy of the modified Baermann's Funnel method on Aphelenchoides besseyi, is used to exemplify the use of our program to calculate the probability of detecting nematode infestations in quarantine sampling with imperfect extraction efficacy. The result has important implications for quarantine programs and highlights the need for a very large number of samples if perfect extraction efficacy is not achieved in such programs. We believe that the results of the study will be useful for the determination of realistic goals in the implementation of quarantine sampling. PMID:22791911
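A minimal sketch of the core calculation, assuming independently inspected units (a simplification of the paper's model): detection probability and required sample size under imperfect extraction efficacy.

```python
from math import log, ceil

def p_detect(n, infest_prop, efficacy):
    """Probability of finding >=1 infested unit in n inspected units."""
    return 1.0 - (1.0 - infest_prop * efficacy) ** n

def n_required(confidence, infest_prop, efficacy):
    """Sample size needed to reach the stated detection confidence."""
    return ceil(log(1.0 - confidence) / log(1.0 - infest_prop * efficacy))

print(p_detect(300, 0.01, 0.6))      # ~0.84 with 60 % extraction efficacy
print(n_required(0.95, 0.01, 0.6))   # 498 units (vs. 299 with perfect efficacy)
```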
Emery, Sherry; Lee, Jungwha; Curry, Susan J; Johnson, Tim; Sporer, Amy K; Mermelstein, Robin; Flay, Brian; Warnecke, Richard
2010-02-01
Surveys of community-based programs are difficult to conduct when there is virtually no information about the number or locations of the programs of interest. This article describes the methodology used by the Helping Young Smokers Quit (HYSQ) initiative to identify and profile community-based youth smoking cessation programs in the absence of a defined sample frame. We developed a two-stage sampling design, with counties as the first-stage probability sampling units. The second stage used snowball sampling to saturation, to identify individuals who administered youth smoking cessation programs across three economic sectors in each county. Multivariate analyses modeled the relationship between program screening, eligibility, and response rates and economic sector and stratification criteria. Cumulative logit models analyzed the relationship between the number of contacts in a county and the number of programs screened, eligible, or profiled in a county. The snowball process yielded 9,983 unique and traceable contacts. Urban and high-income counties yielded significantly more screened program administrators; urban counties produced significantly more eligible programs, but there was no significant association between the county characteristics and program response rate. There is a positive relationship between the number of informants initially located and the number of programs screened, eligible, and profiled in a county. Our strategy to identify youth tobacco cessation programs could be used to create a sample frame for other nonprofit organizations that are difficult to identify due to a lack of existing directories, lists, or other traditional sample frames.
Romer, Jeremy D.; Gitelman, Alix I.; Clements, Shaun; Schreck, Carl B.
2015-01-01
A number of researchers have attempted to estimate salmonid smolt survival during outmigration through an estuary. However, it is currently unclear how the design of such studies influences the accuracy and precision of survival estimates. In this simulation study we consider four patterns of smolt survival probability in the estuary, and test the performance of several different sampling strategies for estimating estuarine survival assuming perfect detection. The four survival probability patterns each incorporate a systematic component (constant, linearly increasing, increasing and then decreasing, and two pulses) and a random component to reflect daily fluctuations in survival probability. Generally, spreading sampling effort (tagging) across the season resulted in more accurate estimates of survival. All sampling designs in this simulation tended to under-estimate the variation in the survival estimates because seasonal and daily variation in survival probability are not incorporated in the estimation procedure. This under-estimation results in poorer performance of estimates from larger samples. Thus, tagging more fish may not result in better estimates of survival if important components of variation are not accounted for. The results of our simulation incorporate survival probabilities and run distribution data from previous studies to help illustrate the tradeoffs among sampling strategies in terms of the number of tags needed and distribution of tagging effort. This information will assist researchers in developing improved monitoring programs and encourage discussion regarding issues that should be addressed prior to implementation of any telemetry-based monitoring plan. We believe implementation of an effective estuary survival monitoring program will strengthen the robustness of life cycle models used in recovery plans by providing missing data on where and how much mortality occurs in the riverine and estuarine portions of smolt migration. These data could result in better informed management decisions and assist in guidance for more effective estuarine restoration projects.
Azzolina, Nicholas A; Small, Mitchell J; Nakles, David V; Glazewski, Kyle A; Peck, Wesley D; Gorecki, Charles D; Bromhal, Grant S; Dilmore, Robert M
2015-01-20
This work uses probabilistic methods to simulate a hypothetical geologic CO2 storage site in a depleted oil and gas field, where the large number of legacy wells would make it cost-prohibitive to sample all wells for all measurements as part of the postinjection site care. Deep well leakage potential scores were assigned to the wells using a random subsample of 100 wells from a detailed study of 826 legacy wells that penetrate the basal Cambrian formation on the U.S. side of the U.S./Canadian border. Analytical solutions and Monte Carlo simulations were used to quantify the statistical power of selecting a leaking well. Power curves were developed as a function of (1) the number of leaking wells within the Area of Review; (2) the sampling design (random or judgmental, choosing first the wells with the highest deep leakage potential scores); (3) the number of wells included in the monitoring sampling plan; and (4) the relationship between a well’s leakage potential score and its relative probability of leakage. Cases where the deep well leakage potential scores are fully or partially informative of the relative leakage probability are compared to a noninformative base case in which leakage is equiprobable across all wells in the Area of Review. The results show that accurate prior knowledge about the probability of well leakage adds measurable value to the ability to detect a leaking well during the monitoring program, and that the loss in detection ability due to imperfect knowledge of the leakage probability can be quantified. This work underscores the importance of a data-driven, risk-based monitoring program that incorporates uncertainty quantification into long-term monitoring sampling plans at geologic CO2 storage sites.
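A hedged Monte Carlo sketch of the power comparison described above, with made-up scores and an assumed proportional relationship between score and leakage probability (the "fully informative" case); the well counts are illustrative, not the basal Cambrian data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_wells, n_leaking, n_monitored, n_sim = 100, 3, 10, 20_000

scores = rng.uniform(0, 1, n_wells)            # deep well leakage potential scores
leak_weights = scores / scores.sum()           # informative case: leakage prob proportional to score

def detection_power(judgmental):
    hits = 0
    for _ in range(n_sim):
        leakers = rng.choice(n_wells, n_leaking, replace=False, p=leak_weights)
        if judgmental:
            sampled = np.argsort(scores)[-n_monitored:]   # highest-score wells first
        else:
            sampled = rng.choice(n_wells, n_monitored, replace=False)
        hits += len(np.intersect1d(leakers, sampled)) > 0
    return hits / n_sim

print("random:    ", detection_power(False))
print("judgmental:", detection_power(True))
```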
Quantum probabilistic logic programming
NASA Astrophysics Data System (ADS)
Balu, Radhakrishnan
2015-05-01
We describe a quantum mechanics based logic programming language that supports Horn clauses, random variables, and covariance matrices to express and solve problems in probabilistic logic. The Horn clauses of the language wrap random variables, including infinite-valued ones, to express probability distributions and statistical correlations, a powerful feature to capture relationships between distributions that are not independent. The expressive power of the language is based on a mechanism to implement statistical ensembles and to solve the underlying SAT instances using quantum mechanical machinery. We exploit the fact that classical random variables have quantum decompositions to build the Horn clauses. We establish the semantics of the language in a rigorous fashion by considering an existing probabilistic logic language called PRISM with classical probability measures defined on the Herbrand base and extending it to the quantum context. In the classical case, H-interpretations form the sample space and probability measures defined on them lead to a consistent definition of probabilities for well-formed formulae. In the quantum counterpart, we define probability amplitudes on H-interpretations, facilitating model generation and verification via quantum mechanical superpositions and entanglements. We cast the well-formed formulae of the language as quantum mechanical observables, thus providing an elegant interpretation for their probabilities. We discuss several examples to combine statistical ensembles and predicates of first order logic to reason with situations involving uncertainty.
Exact and Approximate Probabilistic Symbolic Execution
NASA Technical Reports Server (NTRS)
Luckow, Kasper; Pasareanu, Corina S.; Dwyer, Matthew B.; Filieri, Antonio; Visser, Willem
2014-01-01
Probabilistic software analysis seeks to quantify the likelihood of reaching a target event under uncertain environments. Recent approaches compute probabilities of execution paths using symbolic execution, but do not support nondeterminism. Nondeterminism arises naturally when no suitable probabilistic model can capture a program behavior, e.g., for multithreading or distributed systems. In this work, we propose a technique, based on symbolic execution, to synthesize schedulers that resolve nondeterminism to maximize the probability of reaching a target event. To scale to large systems, we also introduce approximate algorithms to search for good schedulers, speeding up established random sampling and reinforcement learning results through the quantification of path probabilities based on symbolic execution. We implemented the techniques in Symbolic PathFinder and evaluated them on nondeterministic Java programs. We show that our algorithms significantly improve upon a state-of-the-art statistical model checking algorithm, originally developed for Markov Decision Processes.
Peng, Xiang; King, Irwin
2008-01-01
The Biased Minimax Probability Machine (BMPM) constructs a classifier which deals with imbalanced learning tasks. It provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of means and covariance matrices of the classes from the training data samples, and achieves promising performance. In this paper, we develop a novel yet critical extension training algorithm for BMPM that is based on Second-Order Cone Programming (SOCP). Moreover, we apply the biased classification model to medical diagnosis problems to demonstrate its usefulness. By removing some crucial assumptions in the original solution to this model, we make the new method more accurate and robust. We outline the theoretical derivation of the biased classification model and reformulate it into an SOCP problem which can be efficiently solved with a global optimum guarantee. We evaluate our proposed SOCP-based BMPM (BMPMSOCP) scheme in comparison with traditional solutions on medical diagnosis tasks where the objectives are to focus on improving the sensitivity (the accuracy of the more important class, say "ill" samples) instead of the overall accuracy of the classification. Empirical results show that our method is more effective and robust in handling imbalanced classification problems than traditional classification approaches and the original Fractional Programming-based BMPM (BMPMFP).
ERIC Educational Resources Information Center
Baker, Amy J. L.; Ashare, Caryn; Charvat, Benjamin J.
2009-01-01
Fifty-three adolescent girls residing in community-based group-living child welfare programs were administered a standardized measure (SASS-2) in order to assess probability of a substance use/dependency disorder in this highly vulnerable population. Findings revealed that one third of the sample, and one half of the nonpregnant/parenting girls,…
PROBABILITY SAMPLING AND POPULATION INFERENCE IN MONITORING PROGRAMS
A fundamental difference between probability sampling and conventional statistics is that "sampling" deals with real, tangible populations, whereas "conventional statistics" usually deals with hypothetical populations that have no real-world realization. The focus here is on real ...
Diefenbach, D.R.; Rosenberry, C.S.; Boyd, Robert C.
2004-01-01
Surveillance programs for Chronic Wasting Disease (CWD) in free-ranging cervids often use a standard of being able to detect 1% prevalence when determining minimum sample sizes. However, 1% prevalence may represent >10,000 infected animals in a population of 1 million, and most wildlife managers would prefer to detect the presence of CWD when far fewer infected animals exist. We wanted to detect the presence of CWD in white-tailed deer (Odocoileus virginianus) in Pennsylvania when the disease was present in only 1 of 21 wildlife management units (WMUs) statewide. We used computer simulation to estimate the probability of detecting CWD based on a sampling design to detect the presence of CWD at 0.1% and 1.0% prevalence (23-76 and 225-762 infected deer, respectively) using tissue samples collected from hunter-killed deer. The probability of detection at 0.1% prevalence was <30% with sample sizes of ≤6,000 deer, and the probability of detection at 1.0% prevalence was 46-72% with statewide sample sizes of 2,000-6,000 deer. We believe that testing of hunter-killed deer is an essential part of any surveillance program for CWD, but our results demonstrated the importance of a multifaceted surveillance approach for CWD detection rather than sole reliance on testing hunter-killed deer.
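A back-of-the-envelope sketch of why statewide sampling has low power when the disease is confined to 1 of 21 WMUs; the prevalence, sample size, and unit count come from the abstract, while the equal allocation of samples across units is a simplifying assumption, not the paper's simulation design.

```python
def p_detect(n_samples, prevalence):
    """Naive binomial: P(at least one positive) among n randomly sampled deer."""
    return 1.0 - (1.0 - prevalence) ** n_samples

n_statewide, units, local_prev = 6_000, 21, 0.001
n_in_infected_unit = n_statewide / units               # ~286 samples land in the right WMU
print(round(p_detect(n_in_infected_unit, local_prev), 2))   # ~0.25, i.e. <30 %
```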
Stan : A Probabilistic Programming Language
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carpenter, Bob; Gelman, Andrew; Hoffman, Matthew D.
Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can also be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.
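A hedged usage sketch of calling Stan from Python through the pystan interface named above; it assumes the pystan 2.x API and uses a toy normal model, not an example from the paper.

```python
import pystan

model_code = """
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);   // log probability accumulates imperatively
}
"""

sm = pystan.StanModel(model_code=model_code)
fit = sm.sampling(data={"N": 5, "y": [1.2, 0.4, -0.3, 2.1, 0.9]},
                  iter=2000, chains=4)      # NUTS (adaptive HMC) by default
print(fit)                                  # posterior summaries and diagnostics
```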
INDICATORS OF ECOLOGICAL STRESS AND THEIR EXTENT IN THE POPULATION OF NORTHEASTERN LAKES
The Environmental Monitoring and Assessment Program (EMAP) surveyed 345 northeastern lakes, during 1991-1996, in the first regional-scale survey to use a probability-based sampling design to collect biological assemblage data along with a broad range of physical and chemical indi...
Public attitudes toward stuttering in Turkey: probability versus convenience sampling.
Ozdemir, R Sertan; St Louis, Kenneth O; Topbaş, Seyhun
2011-12-01
A Turkish translation of the Public Opinion Survey of Human Attributes-Stuttering (POSHA-S) was used to compare probability versus convenience sampling to measure public attitudes toward stuttering. A convenience sample of adults in Eskişehir, Turkey was compared with two replicates of a school-based, probability cluster sampling scheme. The two replicates of the probability sampling scheme yielded similar demographic samples, both of which were different from the convenience sample. Components of subscores on the POSHA-S were significantly different in more than half of the comparisons between convenience and probability samples, indicating important differences in public attitudes. If POSHA-S users intend to generalize to specific geographic areas, results of this study indicate that probability sampling is a better research strategy than convenience sampling. The reader will be able to: (1) discuss the difference between convenience sampling and probability sampling; (2) describe a school-based probability sampling scheme; and (3) describe differences in POSHA-S results from convenience sampling versus probability sampling. Copyright © 2011 Elsevier Inc. All rights reserved.
Technology Development Risk Assessment for Space Transportation Systems
NASA Technical Reports Server (NTRS)
Mathias, Donovan L.; Godsell, Aga M.; Go, Susie
2006-01-01
A new approach for assessing development risk associated with technology development projects is presented. The method represents technology evolution in terms of sector-specific discrete development stages. A Monte Carlo simulation is used to generate development probability distributions based on statistical models of the discrete transitions. Development risk is derived from the resulting probability distributions and specific program requirements. Two sample cases are discussed to illustrate the approach, a single rocket engine development and a three-technology space transportation portfolio.
Planning and processing multistage samples with a computer program, MUST.
John W. Hazard; Larry E. Stewart
1974-01-01
A computer program was written to handle multistage sampling designs in insect populations. It is, however, general enough to be used for any population where the number of stages does not exceed three. The program handles three types of sampling situations, all of which assume equal probability sampling. Option 1 takes estimates of sample variances, costs, and either...
Sampling designs matching species biology produce accurate and affordable abundance indices
Farley, Sean; Russell, Gareth J.; Butler, Matthew J.; Selinger, Jeff
2013-01-01
Wildlife biologists often use grid-based designs to sample animals and generate abundance estimates. Although sampling in grids is theoretically sound, in application, the method can be logistically difficult and expensive when sampling elusive species inhabiting extensive areas. These factors make it challenging to sample animals and meet the statistical assumption of all individuals having an equal probability of capture. Violating this assumption biases results. Does an alternative exist? Perhaps sampling only where resources attract animals (i.e., targeted sampling) would provide accurate abundance estimates more efficiently and affordably. However, biases from this approach would also arise if individuals have an unequal probability of capture, especially if some failed to visit the sampling area. Since most biological programs are resource limited, and acquiring abundance data drives many conservation and management applications, it becomes imperative to identify economical and informative sampling designs. Therefore, we evaluated abundance estimates generated from grid and targeted sampling designs using simulations based on geographic positioning system (GPS) data from 42 Alaskan brown bears (Ursus arctos). Migratory salmon drew brown bears from the wider landscape, concentrating them at anadromous streams. This provided a scenario for testing the targeted approach. Grid and targeted sampling varied by trap amount, location (traps placed randomly, systematically or by expert opinion), and whether traps were stationary or moved between capture sessions. We began by identifying when to sample, and if bears had equal probability of capture. We compared abundance estimates against seven criteria: bias, precision, accuracy, effort, plus encounter rates, and probabilities of capture and recapture. One grid (49 km2 cells) and one targeted configuration provided the most accurate results. Both placed traps by expert opinion and moved traps between capture sessions, which raised capture probabilities. The grid design was least biased (−10.5%), but imprecise (CV 21.2%), and used most effort (16,100 trap-nights). The targeted configuration was more biased (−17.3%), but most precise (CV 12.3%), with least effort (7,000 trap-nights). Targeted sampling generated encounter rates four times higher, and capture and recapture probabilities 11% and 60% higher than grid sampling, in a sampling frame 88% smaller. Bears had unequal probability of capture with both sampling designs, partly because some bears never had traps available to sample them. Hence, grid and targeted sampling generated abundance indices, not estimates. Overall, targeted sampling provided the most accurate and affordable design to index abundance. Targeted sampling may offer an alternative method to index the abundance of other species inhabiting expansive and inaccessible landscapes elsewhere, provided they are attracted to resource concentrations. PMID:24392290
Effects of sampling conditions on DNA-based estimates of American black bear abundance
Laufenberg, Jared S.; Van Manen, Frank T.; Clark, Joseph D.
2013-01-01
DNA-based capture-mark-recapture techniques are commonly used to estimate American black bear (Ursus americanus) population abundance (N). Although the technique is well established, many questions remain regarding study design. In particular, relationships among N, capture probability of heterogeneity mixtures A and B (pA and pB, respectively, or p, collectively), the proportion of each mixture (π), number of capture occasions (k), and probability of obtaining reliable estimates of N are not fully understood. We investigated these relationships using 1) an empirical dataset of DNA samples for which true N was unknown and 2) simulated datasets with known properties that represented a broader array of sampling conditions. For the empirical data analysis, we used the full closed population with heterogeneity data type in Program MARK to estimate N for a black bear population in Great Smoky Mountains National Park, Tennessee. We systematically reduced the number of those samples used in the analysis to evaluate the effect that changes in capture probabilities may have on parameter estimates. Model-averaged N for females and males were 161 (95% CI = 114–272) and 100 (95% CI = 74–167), respectively (pooled N = 261, 95% CI = 192–419), and the average weekly p was 0.09 for females and 0.12 for males. When we reduced the number of samples of the empirical data, support for heterogeneity models decreased. For the simulation analysis, we generated capture data with individual heterogeneity covering a range of sampling conditions commonly encountered in DNA-based capture-mark-recapture studies and examined the relationships between those conditions and accuracy (i.e., probability of obtaining an estimated N that is within 20% of true N), coverage (i.e., probability that 95% confidence interval includes true N), and precision (i.e., probability of obtaining a coefficient of variation ≤20%) of estimates using logistic regression. The capture probability for the larger of 2 mixture proportions of the population (i.e., pA or pB, depending on the value of π) was most important for predicting accuracy and precision, whereas capture probabilities of both mixture proportions (pA and pB) were important to explain variation in coverage. Based on sampling conditions similar to parameter estimates from the empirical dataset (pA = 0.30, pB = 0.05, N = 250, π = 0.15, and k = 10), predicted accuracy and precision were low (60% and 53%, respectively), whereas coverage was high (94%). Increasing pB, the capture probability for the predominate but most difficult to capture proportion of the population, was most effective to improve accuracy under those conditions. However, manipulation of other parameters may be more effective under different conditions. In general, the probabilities of obtaining accurate and precise estimates were best when p≥ 0.2. Our regression models can be used by managers to evaluate specific sampling scenarios and guide development of sampling frameworks or to assess reliability of DNA-based capture-mark-recapture studies.
One of the Environmental Monitoring and Assessment Program's first projects was a survey of 345 lakes in the eight states of the Northeast, during summers of 1991-1996. This survey was the first regional-scale attempt to use a probability-based sampling design to collect biolog...
Sampling design trade-offs in occupancy studies with imperfect detection: examples and software
Bailey, L.L.; Hines, J.E.; Nichols, J.D.
2007-01-01
Researchers have used occupancy, or probability of occupancy, as a response or state variable in a variety of studies (e.g., habitat modeling), and occupancy is increasingly favored by numerous state, federal, and international agencies engaged in monitoring programs. Recent advances in estimation methods have emphasized that reliable inferences can be made from these types of studies if detection and occupancy probabilities are simultaneously estimated. The need for temporal replication at sampled sites to estimate detection probability creates a trade-off between spatial replication (number of sample sites distributed within the area of interest/inference) and temporal replication (number of repeated surveys at each site). Here, we discuss a suite of questions commonly encountered during the design phase of occupancy studies, and we describe software (program GENPRES) developed to allow investigators to easily explore design trade-offs focused on particularities of their study system and sampling limitations. We illustrate the utility of program GENPRES using an amphibian example from Greater Yellowstone National Park, USA.
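A minimal sketch of the kind of trade-off program GENPRES explores: for a fixed budget of site-visits, the cumulative probability of detecting an occupying species rises with repeat visits while the number of sites falls. The effort budget and per-visit detection probability are illustrative.

```python
def p_detected_at_occupied_site(p_visit, k_visits):
    """P(species detected at least once at an occupied site in k visits)."""
    return 1.0 - (1.0 - p_visit) ** k_visits

total_effort, p = 120, 0.3        # e.g. 120 site-visits, per-visit detection 0.3
for k in (2, 3, 4, 6):
    sites = total_effort // k
    print(f"{sites} sites x {k} visits: p* = {p_detected_at_occupied_site(p, k):.2f}")
```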
A linear programming model for protein inference problem in shotgun proteomics.
Huang, Ting; He, Zengyou
2012-11-15
Assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is an important issue in shotgun proteomics. The objective of protein inference is to find a subset of proteins that are truly present in the sample. Although many methods have been proposed for protein inference, several issues such as peptide degeneracy still remain unsolved. In this article, we present a linear programming model for protein inference. In this model, we use a transformation of the joint probability that each peptide/protein pair is present in the sample as the variable. Then, both the peptide probability and protein probability can be expressed as a formula in terms of the linear combination of these variables. Based on this simple fact, the protein inference problem is formulated as an optimization problem: minimize the number of proteins with non-zero probabilities under the constraint that the difference between the calculated peptide probability and the peptide probability generated from peptide identification algorithms should be less than some threshold. This model addresses the peptide degeneracy issue by forcing some joint probability variables involving degenerate peptides to be zero in a rigorous manner. The corresponding inference algorithm is named as ProteinLP. We test the performance of ProteinLP on six datasets. Experimental results show that our method is competitive with the state-of-the-art protein inference algorithms. The source code of our algorithm is available at: https://sourceforge.net/projects/prolp/. zyhe@dlut.edu.cn. Supplementary data are available at Bioinformatics Online.
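A heavily simplified sketch of the linear-programming idea (not the exact ProteinLP formulation): joint peptide/protein variables bounded by protein scores, peptide probabilities constrained to lie within a threshold of their observed values, and the total protein score minimized. The peptide-protein map, probabilities, and threshold are toy data.

```python
import numpy as np
from scipy.optimize import linprog

peptide_parents = [[0], [0, 1], [1], [1, 2]]       # peptide -> candidate proteins
q, eps, n_prot = [0.9, 0.8, 0.7, 0.2], 0.05, 3     # observed peptide probabilities

pairs = [(i, j) for i, parents in enumerate(peptide_parents) for j in parents]
n_pair = len(pairs)
n_var = n_pair + n_prot                            # variables: [y_ij ..., z_j ...]

c = np.r_[np.zeros(n_pair), np.ones(n_prot)]       # minimise sum of protein scores
A_ub, b_ub = [], []
for k, (i, j) in enumerate(pairs):                 # y_ij - z_j <= 0
    row = np.zeros(n_var); row[k] = 1.0; row[n_pair + j] = -1.0
    A_ub.append(row); b_ub.append(0.0)
for i, qi in enumerate(q):                         # |sum_j y_ij - q_i| <= eps
    row = np.zeros(n_var)
    for k, (pi, _) in enumerate(pairs):
        if pi == i:
            row[k] = 1.0
    A_ub.append(row);  b_ub.append(qi + eps)
    A_ub.append(-row); b_ub.append(-(qi - eps))

res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=[(0, 1)] * n_var)
print(res.x[n_pair:])      # protein scores; the degenerate-only protein is driven to ~0
```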
Mattfeldt, S.D.; Bailey, L.L.; Grant, E.H.C.
2009-01-01
Monitoring programs have the potential to identify population declines and differentiate among the possible cause(s) of these declines. Recent criticisms regarding the design of monitoring programs have highlighted a failure to clearly state objectives and to address detectability and spatial sampling issues. Here, we incorporate these criticisms to design an efficient monitoring program whose goals are to determine environmental factors which influence the current distribution and measure change in distributions over time for a suite of amphibians. In designing the study we (1) specified a priori factors that may relate to occupancy, extinction, and colonization probabilities and (2) used the data collected (incorporating detectability) to address our scientific questions and adjust our sampling protocols. Our results highlight the role of wetland hydroperiod and other local covariates in the probability of amphibian occupancy. There was a change in overall occupancy probabilities for most species over the first three years of monitoring. Most colonization and extinction estimates were constant over time (years) and space (among wetlands), with one notable exception: local extinction probabilities for Rana clamitans were lower for wetlands with longer hydroperiods. We used information from the target system to generate scenarios of population change and gauge the ability of the current sampling to meet monitoring goals. Our results highlight the limitations of the current sampling design, emphasizing the need for long-term efforts, with periodic re-evaluation of the program in a framework that can inform management decisions.
On the importance of incorporating sampling weights in ...
Occupancy models are used extensively to assess wildlife-habitat associations and to predict species distributions across large geographic regions. Occupancy models were developed as a tool to properly account for imperfect detection of a species. Current guidelines on survey design requirements for occupancy models focus on the number of sample units and the pattern of revisits to a sample unit within a season. We focus on the sampling design or how the sample units are selected in geographic space (e.g., stratified, simple random, unequal probability, etc). In a probability design, each sample unit has a sample weight which quantifies the number of sample units it represents in the finite (oftentimes areal) sampling frame. We demonstrate the importance of including sampling weights in occupancy model estimation when the design is not a simple random sample or equal probability design. We assume a finite areal sampling frame as proposed for a national bat monitoring program. We compare several unequal and equal probability designs and varying sampling intensity within a simulation study. We found the traditional single season occupancy model produced biased estimates of occupancy and lower confidence interval coverage rates compared to occupancy models that accounted for the sampling design. We also discuss how our findings inform the analyses proposed for the nascent North American Bat Monitoring Program and other collaborative synthesis efforts that propose h
Nonprobability and probability-based sampling strategies in sexual science.
Catania, Joseph A; Dolcini, M Margaret; Orellana, Roberto; Narayanan, Vasudah
2015-01-01
With few exceptions, much of sexual science builds upon data from opportunistic nonprobability samples of limited generalizability. Although probability-based studies are considered the gold standard in terms of generalizability, they are costly to apply to many of the hard-to-reach populations of interest to sexologists. The present article discusses recent conclusions by sampling experts that have relevance to sexual science that advocates for nonprobability methods. In this regard, we provide an overview of Internet sampling as a useful, cost-efficient, nonprobability sampling method of value to sex researchers conducting modeling work or clinical trials. We also argue that probability-based sampling methods may be more readily applied in sex research with hard-to-reach populations than is typically thought. In this context, we provide three case studies that utilize qualitative and quantitative techniques directed at reducing limitations in applying probability-based sampling to hard-to-reach populations: indigenous Peruvians, African American youth, and urban men who have sex with men (MSM). Recommendations are made with regard to presampling studies, adaptive and disproportionate sampling methods, and strategies that may be utilized in evaluating nonprobability and probability-based sampling methods.
THREE-PEE SAMPLING THEORY and program 'THRP' for computer generation of selection criteria
L. R. Grosenbaugh
1965-01-01
Theory necessary for sampling with probability proportional to prediction ('three-pee,' or '3P,' sampling) is first developed and then exemplified by numerical comparisons of several estimators. Program 'THRP' for computer generation of appropriate 3P-sample-selection criteria is described, and convenient random integer dispensers are...
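A minimal sketch of the 3P selection rule that THRP generates criteria for: a tree is measured when a random integer drawn from 1..K does not exceed its predicted value, so inclusion probability is proportional to prediction. Predictions and the choice of K are illustrative.

```python
import random
random.seed(42)

predictions = [12, 35, 8, 50, 22, 5, 40, 18]   # e.g. ocular volume estimates (KPI)
K = sum(predictions) // 3                      # tuned so the expected sample is ~3 trees

sample = [i for i, kpi in enumerate(predictions)
          if random.randint(1, K) <= kpi]      # select with probability kpi / K
print("trees selected for measurement:", sample)
```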
PROBABILITY SURVEYS, CONDITIONAL PROBABILITIES AND ECOLOGICAL RISK ASSESSMENT
We show that probability-based environmental resource monitoring programs, such as the U.S. Environmental Protection Agency's (U.S. EPA) Environmental Monitoring and Assessment Program, and conditional probability analysis can serve as a basis for estimating ecological risk over ...
PROBABILITY SURVEYS, CONDITIONAL PROBABILITIES, AND ECOLOGICAL RISK ASSESSMENT
We show that probability-based environmental resource monitoring programs, such as U.S. Environmental Protection Agency's (U.S. EPA) Environmental Monitoring and Assessment Program (EMAP) can be analyzed with a conditional probability analysis (CPA) to conduct quantitative probabi...
Ji, Yuan; Wang, Sue-Jane
2013-01-01
The 3 + 3 design is the most common choice among clinicians for phase I dose-escalation oncology trials. In recent reviews, more than 95% of phase I trials have been based on the 3 + 3 design. Given that it is intuitive and its implementation does not require a computer program, clinicians can conduct 3 + 3 dose escalations in practice with virtually no logistic cost, and trial protocols based on the 3 + 3 design pass institutional review board and biostatistics reviews quickly. However, the performance of the 3 + 3 design has rarely been compared with model-based designs in simulation studies with matched sample sizes. In the vast majority of statistical literature, the 3 + 3 design has been shown to be inferior in identifying true maximum-tolerated doses (MTDs), although the sample size required by the 3 + 3 design is often orders-of-magnitude smaller than model-based designs. In this article, through comparative simulation studies with matched sample sizes, we demonstrate that the 3 + 3 design has higher risks of exposing patients to toxic doses above the MTD than the modified toxicity probability interval (mTPI) design, a newly developed adaptive method. In addition, compared with the mTPI design, the 3 + 3 design does not yield higher probabilities in identifying the correct MTD, even when the sample size is matched. Given that the mTPI design is equally transparent, costless to implement with free software, and more flexible in practical situations, we highly encourage its adoption in early dose-escalation studies whenever the 3 + 3 design is also considered. We provide free software to allow direct comparisons of the 3 + 3 design with other model-based designs in simulation studies with matched sample sizes. PMID:23569307
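A hedged sketch of one common variant of the 3 + 3 escalation rule discussed above (institutional implementations differ in details); the true toxicity probabilities and the number of simulated trials are illustrative.

```python
import random
random.seed(0)

def three_plus_three(true_tox):                # true_tox: DLT probability per dose level
    dose, mtd = 0, None
    while 0 <= dose < len(true_tox):
        dlts = sum(random.random() < true_tox[dose] for _ in range(3))
        if dlts == 1:                          # 1/3 DLT: expand the cohort to 6
            dlts += sum(random.random() < true_tox[dose] for _ in range(3))
            if dlts == 1:                      # 1/6 DLT: escalate
                dose += 1
                continue
        if dlts == 0:                          # 0/3 DLT: escalate
            dose += 1
        else:                                  # >=2 DLTs: MTD is the dose below
            mtd = dose - 1
            break
    if mtd is not None and mtd < 0:
        return None                            # even the lowest dose was too toxic
    return mtd                                 # None if escalation ran past all doses

# How often does the rule stop at dose index 2 (true DLT rate 30 %)?
picks = [three_plus_three([0.05, 0.15, 0.30, 0.50]) for _ in range(10_000)]
print(sum(p == 2 for p in picks) / len(picks))
```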
Shirley, Matthew H.; Dorazio, Robert M.; Abassery, Ekramy; Elhady, Amr A.; Mekki, Mohammed S.; Asran, Hosni H.
2012-01-01
As part of the development of a management program for Nile crocodiles in Lake Nasser, Egypt, we used a dependent double-observer sampling protocol with multiple observers to compute estimates of population size. To analyze the data, we developed a hierarchical model that allowed us to assess variation in detection probabilities among observers and survey dates, as well as account for variation in crocodile abundance among sites and habitats. We conducted surveys from July 2008-June 2009 in 15 areas of Lake Nasser that were representative of 3 main habitat categories. During these surveys, we sampled 1,086 km of lake shore wherein we detected 386 crocodiles. Analysis of the data revealed significant variability in both inter- and intra-observer detection probabilities. Our raw encounter rate was 0.355 crocodiles/km. When we accounted for observer effects and habitat, we estimated a surface population abundance of 2,581 (2,239-2,987, 95% credible intervals) crocodiles in Lake Nasser. Our results underscore the importance of well-trained, experienced monitoring personnel in order to decrease heterogeneity in intra-observer detection probability and to better detect changes in the population based on survey indices. This study will assist the Egyptian government in establishing a monitoring program as an integral part of future crocodile harvest activities in Lake Nasser.
Multiple data sources improve DNA-based mark-recapture population estimates of grizzly bears.
Boulanger, John; Kendall, Katherine C; Stetz, Jeffrey B; Roon, David A; Waits, Lisette P; Paetkau, David
2008-04-01
A fundamental challenge to estimating population size with mark-recapture methods is heterogeneous capture probabilities and subsequent bias of population estimates. Confronting this problem usually requires substantial sampling effort that can be difficult to achieve for some species, such as carnivores. We developed a methodology that uses two data sources to deal with heterogeneity and applied this to DNA mark-recapture data from grizzly bears (Ursus arctos). We improved population estimates by incorporating additional DNA "captures" of grizzly bears obtained by collecting hair from unbaited bear rub trees concurrently with baited, grid-based, hair snag sampling. We consider a Lincoln-Petersen estimator with hair snag captures as the initial session and rub tree captures as the recapture session and develop an estimator in program MARK that treats hair snag and rub tree samples as successive sessions. Using empirical data from a large-scale project in the greater Glacier National Park, Montana, USA, area and simulation modeling we evaluate these methods and compare the results to hair-snag-only estimates. Empirical results indicate that, compared with hair-snag-only data, the joint hair-snag-rub-tree methods produce similar but more precise estimates if capture and recapture rates are reasonably high for both methods. Simulation results suggest that estimators are potentially affected by correlation of capture probabilities between sample types in the presence of heterogeneity. Overall, closed population Huggins-Pledger estimators showed the highest precision and were most robust to sparse data, heterogeneity, and capture probability correlation among sampling types. Results also indicate that these estimators can be used when a segment of the population has zero capture probability for one of the methods. We propose that this general methodology may be useful for other species in which mark-recapture data are available from multiple sources.
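A minimal sketch of the two-source idea described above, using the Chapman small-sample form of the Lincoln-Petersen estimator with hair-snag captures as session 1 and rub-tree captures as session 2; the counts are illustrative, not the Glacier data.

```python
def chapman_estimate(n1, n2, m2):
    """n1: caught in session 1, n2: caught in session 2, m2: caught in both."""
    n_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
    var = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)) / ((m2 + 1) ** 2 * (m2 + 2))
    return n_hat, var ** 0.5

n_hat, se = chapman_estimate(n1=120, n2=75, m2=30)
print(f"N-hat = {n_hat:.0f} (SE {se:.0f})")
```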
O'Connell, Allan F.; Talancy, Neil W.; Bailey, Larissa L.; Sauer, John R.; Cook, Robert; Gilbert, Andrew T.
2006-01-01
Large-scale, multispecies monitoring programs are widely used to assess changes in wildlife populations but they often assume constant detectability when documenting species occurrence. This assumption is rarely met in practice because animal populations vary across time and space. As a result, detectability of a species can be influenced by a number of physical, biological, or anthropogenic factors (e.g., weather, seasonality, topography, biological rhythms, sampling methods). To evaluate some of these influences, we estimated site occupancy rates using species-specific detection probabilities for meso- and large terrestrial mammal species on Cape Cod, Massachusetts, USA. We used model selection to assess the influence of different sampling methods and major environmental factors on our ability to detect individual species. Remote cameras detected the most species (9), followed by cubby boxes (7) and hair traps (4) over a 13-month period. Estimated site occupancy rates were similar among sampling methods for most species when detection probabilities exceeded 0.15, but we question estimates obtained from methods with detection probabilities between 0.05 and 0.15, and we consider methods with lower probabilities unacceptable for occupancy estimation and inference. Estimated detection probabilities can be used to accommodate variation in sampling methods, which allows for comparison of monitoring programs using different protocols. Vegetation and seasonality produced species-specific differences in detectability and occupancy, but differences were not consistent within or among species, which suggests that our results should be considered in the context of local habitat features and life history traits for the target species. We believe that site occupancy is a useful state variable and suggest that monitoring programs for mammals using occupancy data consider detectability prior to making inferences about species distributions or population change.
Monte Carlo tests of the ELIPGRID-PC algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davidson, J.R.
1995-04-01
The standard tool for calculating the probability of detecting pockets of contamination called hot spots has been the ELIPGRID computer code of Singer and Wickman. The ELIPGRID-PC program has recently made this algorithm available for an IBM® PC. However, no known independent validation of the ELIPGRID algorithm exists. This document describes a Monte Carlo simulation-based validation of a modified version of the ELIPGRID-PC code. The modified ELIPGRID-PC code is shown to match Monte Carlo-calculated hot-spot detection probabilities to within ±0.5% for 319 out of 320 test cases. The one exception, a very thin elliptical hot spot located within a rectangular sampling grid, differed from the Monte Carlo-calculated probability by about 1%. These results provide confidence in the ability of the modified ELIPGRID-PC code to accurately predict hot-spot detection probabilities within an acceptable range of error.
Site occupancy models with heterogeneous detection probabilities
Royle, J. Andrew
2006-01-01
Models for estimating the probability of occurrence of a species in the presence of imperfect detection are important in many ecological disciplines. In these "site occupancy" models, the possibility of heterogeneity in detection probabilities among sites must be considered because variation in abundance (and other factors) among sampled sites induces variation in detection probability (p). In this article, I develop occurrence probability models that allow for heterogeneous detection probabilities by considering several common classes of mixture distributions for p. For any mixing distribution, the likelihood has the general form of a zero-inflated binomial mixture for which inference based upon integrated likelihood is straightforward. A recent paper by Link (2003, Biometrics 59, 1123-1130) demonstrates that in closed population models used for estimating population size, different classes of mixture distributions are indistinguishable from data, yet can produce very different inferences about population size. I demonstrate that this problem can also arise in models for estimating site occupancy in the presence of heterogeneous detection probabilities. The implications of this are discussed in the context of an application to avian survey data and the development of animal monitoring programs.
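A minimal sketch of the zero-inflated binomial mixture likelihood, assuming a Beta mixing distribution for p (one of the mixture classes considered); the integral over p then reduces to a beta-binomial term. The detection histories are illustrative.

```python
import numpy as np
from scipy.stats import betabinom

def neg_log_lik(params, y, k):
    psi, a, b = params                       # occupancy prob and Beta(a, b) for p
    site_lik = psi * betabinom.pmf(y, k, a, b) + (1.0 - psi) * (y == 0)
    return -np.sum(np.log(site_lik))

y = np.array([0, 0, 2, 0, 1, 3, 0, 0, 4, 1])   # detections per site over k visits
print(neg_log_lik((0.6, 2.0, 5.0), y, k=5))
```

Minimizing this function over (psi, a, b), for example with scipy.optimize.minimize, would give the heterogeneous-detection analogue of the usual occupancy fit.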
Extended Importance Sampling for Reliability Analysis under Evidence Theory
NASA Astrophysics Data System (ADS)
Yuan, X. K.; Chen, B.; Zhang, B. Q.
2018-05-01
In early engineering practice, the lack of data and information makes uncertainty difficult to deal with. However, evidence theory has been proposed to handle uncertainty with limited information as an alternative way to traditional probability theory. In this contribution, a simulation-based approach, called ‘Extended importance sampling’, is proposed based on evidence theory to handle problems with epistemic uncertainty. The proposed approach stems from the traditional importance sampling for reliability analysis under probability theory, and is developed to handle the problem with epistemic uncertainty. It first introduces a nominal instrumental probability density function (PDF) for every epistemic uncertainty variable, and thus an ‘equivalent’ reliability problem under probability theory is obtained. Then the samples of these variables are generated in a way of importance sampling. Based on these samples, the plausibility and belief (upper and lower bounds of probability) can be estimated. It is more efficient than direct Monte Carlo simulation. Numerical and engineering examples are given to illustrate the efficiency and feasibility of the proposed approach.
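A minimal importance-sampling sketch of the probabilistic building block the proposed method extends: estimating a small failure probability by sampling from an instrumental density shifted toward the failure region. The limit-state function and densities are illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def g(x):                      # limit-state function; failure when g < 0
    return 4.0 - x

f = norm(loc=0.0, scale=1.0)   # nominal density of X
h = norm(loc=4.0, scale=1.0)   # instrumental density centred near the failure region

x = h.rvs(size=100_000, random_state=rng)
weights = f.pdf(x) / h.pdf(x)  # importance weights
p_fail = np.mean((g(x) < 0) * weights)
print(p_fail, "vs exact", 1 - f.cdf(4.0))
```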
Petrology of lunar rocks and implication to lunar evolution
NASA Technical Reports Server (NTRS)
Ridley, W. I.
1976-01-01
Recent advances in lunar petrology, based on studies of lunar rock samples available through the Apollo program, are reviewed. Samples of bedrock from both maria and terra have been collected where micrometeorite impact penetrated the regolith and brought bedrock to the surface, but no in situ cores have been taken. Lunar petrogenesis and lunar thermal history supported by studies of the rock sample are discussed and a tentative evolutionary scenario is constructed. Mare basalts, terra assemblages of breccias, soils, rocks, and regolith are subjected to elemental analysis, mineralogical analysis, trace content analysis, with studies of texture, ages and isotopic composition. Probable sources of mare basalts are indicated.
NASA Technical Reports Server (NTRS)
Munoz, E. F.; Silverman, M. P.
1979-01-01
A single-step most-probable-number method for determining the number of fecal coliform bacteria present in sewage treatment plant effluents is discussed. A single growth medium based on that of Reasoner et al. (1976) and consisting of 5.0 gr. proteose peptone, 3.0 gr. yeast extract, 10.0 gr. lactose, 7.5 gr. NaCl, 0.2 gr. sodium lauryl sulfate, and 0.1 gr. sodium desoxycholate per liter is used. The pH is adjusted to 6.5, and samples are incubated at 44.5 deg C. Bacterial growth is detected either by measuring the increase with time in the electrical impedance ratio between the inoculated sample vial and an uninoculated reference vial or by visual examination for turbidity. Results obtained by the single-step method for chlorinated and unchlorinated effluent samples are in excellent agreement with those obtained by the standard method. It is suggested that in automated treatment plants impedance ratio data could be automatically matched by computer programs with the appropriate dilution factors and most-probable-number tables already in the computer memory, with the corresponding result displayed as fecal coliforms per 100 ml of effluent.
LaMotte, A.E.; Greene, E.A.
2007-01-01
Spatial relations between land use and groundwater quality in the watershed adjacent to Assateague Island National Seashore, Maryland and Virginia, USA, were analyzed by the use of two spatial models. One model used a logit analysis and the other was based on geostatistics. The models were developed and compared on the basis of existing concentrations of nitrate as nitrogen in samples from 529 domestic wells. The models were applied to produce spatial probability maps that show areas in the watershed where concentrations of nitrate in groundwater are likely to exceed a predetermined management threshold value. Maps of the watershed generated by logistic regression and probability kriging analysis showing where the probability of nitrate concentrations would exceed 3 mg/L (>0.50) compared favorably. Logistic regression was less dependent on the spatial distribution of sampled wells, and identified an additional high probability area within the watershed that was missed by probability kriging. The spatial probability maps could be used to determine the natural or anthropogenic factors that best explain the occurrence and distribution of elevated concentrations of nitrate (or other constituents) in shallow groundwater. This information can be used by local land-use planners, ecologists, and managers to protect water supplies and identify land-use planning solutions and monitoring programs in vulnerable areas. © 2006 Springer-Verlag.
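A hedged sketch of the logit component: modeling the probability that nitrate exceeds 3 mg/L from explanatory variables and mapping the predicted exceedance probabilities. The predictors and data here are synthetic placeholders, not the Assateague well data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 500
X = np.column_stack([rng.uniform(0, 100, n),     # e.g. % agricultural land use
                     rng.uniform(0, 10, n)])     # e.g. depth to water table (m)
logit = -3.0 + 0.04 * X[:, 0] + 0.1 * X[:, 1]    # synthetic data-generating model
exceeds = rng.random(n) < 1 / (1 + np.exp(-logit))   # 1 if nitrate > 3 mg/L

model = LogisticRegression().fit(X, exceeds)
p_exceed = model.predict_proba(X)[:, 1]               # mapped as exceedance probability
print("wells with P(exceed) > 0.50:", int((p_exceed > 0.5).sum()))
```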
FUNSTAT and statistical image representations
NASA Technical Reports Server (NTRS)
Parzen, E.
1983-01-01
General ideas of functional statistical inference for the analysis of one and two samples, univariate and bivariate, are outlined. The ONESAM program is applied to analyze the univariate probability distributions of multi-spectral image data.
Sloma, Michael F.; Mathews, David H.
2016-01-01
RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. PMID:27852924
Quality-control materials in the USDA National Food and Nutrient Analysis Program (NFNAP).
Phillips, Katherine M; Patterson, Kristine Y; Rasor, Amy S; Exler, Jacob; Haytowitz, David B; Holden, Joanne M; Pehrsson, Pamela R
2006-03-01
The US Department of Agriculture (USDA) Nutrient Data Laboratory (NDL) develops and maintains the USDA National Nutrient Databank System (NDBS). Data are released from the NDBS for scientific and public use through the USDA National Nutrient Database for Standard Reference (SR) ( http://www.ars.usda.gov/ba/bhnrc/ndl ). In 1997 the NDL initiated the National Food and Nutrient Analysis Program (NFNAP) to update and expand its food-composition data. The program included: 1) nationwide probability-based sampling of foods; 2) central processing and archiving of food samples; 3) analysis of food components at commercial, government, and university laboratories; 4) incorporation of new analytical data into the NDBS; and 5) dissemination of these data to the scientific community. A key feature and strength of the NFNAP was a rigorous quality-control program that enabled independent verification of the accuracy and precision of analytical results. Custom-made food-control composites and/or commercially available certified reference materials were sent to the laboratories, blinded, with the samples. Data for these materials were essential for ongoing monitoring of analytical work, identifying and resolving suspected analytical problems, and ensuring the accuracy and precision of results for the NFNAP food samples.
On evaluating clustering procedures for use in classification
NASA Technical Reports Server (NTRS)
Pore, M. D.; Moritz, T. E.; Register, D. T.; Yao, S. S.; Eppler, W. G. (Principal Investigator)
1979-01-01
The problem of evaluating clustering algorithms and their respective computer programs for use in a preprocessing step for classification is addressed. In clustering for classification the probability of correct classification is suggested as the ultimate measure of accuracy on training data. A means of implementing this criterion and a measure of cluster purity are discussed. Examples are given. A procedure for cluster labeling that is based on cluster purity and sample size is presented.
Gordon, Allegra R; Conron, Kerith J; Calzo, Jerel P; White, Matthew T; Reisner, Sari L; Austin, S Bryn
2018-04-01
Young people may experience school-based violence and bullying victimization related to their gender expression, independent of sexual orientation identity. However, the associations between gender expression and bullying and violence have not been examined in racially and ethnically diverse population-based samples of high school students. This study includes 5469 students (13-18 years) from the 2013 Youth Risk Behavior Surveys conducted in 4 urban school districts. Respondents were 51% Hispanic/Latino, 21% black/African American, 14% white. Generalized additive models were used to examine the functional form of relationships between self-reported gender expression (range: 1 = Most gender conforming, 7 = Most gender nonconforming) and 5 indicators of violence and bullying victimization. We estimated predicted probabilities across gender expression by sex, adjusting for sexual orientation identity and potential confounders. Statistically significant quadratic associations indicated that girls and boys at the most gender conforming and nonconforming ends of the scale had elevated probabilities of fighting and fighting-related injury, compared to those in the middle of the scale (p < .05). There was a significant linear relationship between gender expression and bullying victimization; every unit increase in gender nonconformity was associated with 15% greater odds of experiencing bullying (p < .0001). School-based victimization is associated with conformity and nonconformity to gender norms. School violence prevention programs should include gender diversity education. © 2018, American School Health Association.
The Minnesota Children's Pesticide Exposure Study is a probability-based sample of 102 children 3-13 years old who were monitored for commonly used pesticides. During the summer of 1997, first-morning-void urine samples (1-3 per child) were obtained for 88% of study children a...
CENTENNIAL MOUNTAINS WILDERNESS STUDY AREA, MONTANA AND IDAHO.
Witkind, Irving J.; Ridenour, James
1984-01-01
A mineral survey conducted within the Centennial Mountains Wilderness study area in Montana and Idaho showed large areas of probable and substantiated resource potential for phosphate. Byproducts that may be derived from processing the phosphate include vanadium, chromium, uranium, silver, fluorine, and the rare earths, lanthanum and yttrium. Results of a geochemical sampling program suggest that there is little promise for the occurrence of base and precious metals in the area. Although the area contains other nonmetallic deposits, such as coal, building stone, and pumiceous ash they are not considered as mineral resources. There is a probable resource potential for oil and gas and significant amounts may underlie the area around the Peet Creek and Odell Creek anticlines.
Gavett, Brandon E
2015-03-01
The base rates of abnormal test scores in cognitively normal samples have been a focus of recent research. The goal of the current study is to illustrate how Bayes' theorem uses these base rates--along with the same base rates in cognitively impaired samples and prevalence rates of cognitive impairment--to yield probability values that are more useful for making judgments about the absence or presence of cognitive impairment. Correlation matrices, means, and standard deviations were obtained from the Wechsler Memory Scale--4th Edition (WMS-IV) Technical and Interpretive Manual and used in Monte Carlo simulations to estimate the base rates of abnormal test scores in the standardization and special groups (mixed clinical) samples. Bayes' theorem was applied to these estimates to identify probabilities of normal cognition based on the number of abnormal test scores observed. Abnormal scores were common in the standardization sample (65.4% scoring below a scaled score of 7 on at least one subtest) and more common in the mixed clinical sample (85.6% scoring below a scaled score of 7 on at least one subtest). Probabilities varied according to the number of abnormal test scores, base rates of normal cognition, and cutoff scores. The results suggest that interpretation of base rates obtained from cognitively healthy samples must also account for data from cognitively impaired samples. Bayes' theorem can help neuropsychologists answer questions about the probability that an individual examinee is cognitively healthy based on the number of abnormal test scores observed.
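A minimal sketch of the Bayes'-theorem step described above; the base rates for at least one abnormal score (65.4% in the standardization sample, 85.6% in the mixed clinical sample) are taken from the abstract, while the 10% prevalence of impairment is an assumed value for illustration.

```python
# Minimal sketch: probability of normal cognition given at least one abnormal score.
def prob_normal_given_abnormal(p_abn_given_normal, p_abn_given_impaired, prevalence_impaired):
    """P(cognitively normal | at least one abnormal score) via Bayes' theorem."""
    p_normal = 1.0 - prevalence_impaired
    numerator = p_abn_given_normal * p_normal
    denominator = numerator + p_abn_given_impaired * prevalence_impaired
    return numerator / denominator

# Base rates from the abstract; the 10% prevalence of impairment is an assumption.
print(round(prob_normal_given_abnormal(0.654, 0.856, 0.10), 3))
```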
Systematic sampling for suspended sediment
Robert B. Thomas
1991-01-01
Because of high costs or complex logistics, scientific populations cannot be measured entirely and must be sampled. Accepted scientific practice holds that sample selection be based on statistical principles to assure objectivity when estimating totals and variances. Probability sampling--obtaining samples with known probabilities--is the only method that...
Methodology Series Module 5: Sampling Strategies
Setia, Maninder Singh
2016-01-01
Once the research question and the research design have been finalised, it is important to select the appropriate sample for the study. The method by which the researcher selects the sample is the ‘Sampling Method’. There are essentially two types of sampling methods: 1) probability sampling – based on chance events (such as random numbers, flipping a coin, etc.); and 2) non-probability sampling – based on the researcher's choice or a population that is accessible and available. Some of the non-probability sampling methods are: purposive sampling, convenience sampling, or quota sampling. A random sampling method (such as a simple random sample or a stratified random sample) is a form of probability sampling. It is important to understand the different sampling methods used in clinical studies and to mention the method clearly in the manuscript. The researcher should not misrepresent the sampling method in the manuscript (such as using the term ‘random sample’ when the researcher has used a convenience sample). The sampling method will depend on the research question. For instance, the researcher may want to understand an issue in greater detail for one particular population rather than worry about the ‘generalizability’ of the results. In such a scenario, the researcher may want to use ‘purposive sampling’ for the study. PMID:27688438
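A minimal sketch contrasting the two families of methods named above, using a hypothetical patient list: probability sampling (simple random and stratified random) versus a non-probability convenience sample.

```python
import random

random.seed(1)
patients = [{"id": i, "clinic": "A" if i % 3 else "B"} for i in range(1, 301)]

# Probability sampling: every patient has a known, non-zero chance of selection.
simple_random = random.sample(patients, k=30)
stratified = [p for clinic in ("A", "B")
              for p in random.sample([q for q in patients if q["clinic"] == clinic], k=15)]

# Non-probability (convenience) sampling: whoever is easiest to reach, for example the
# first 30 patients attending clinic A; such a sample should not be called 'random'.
convenience = [p for p in patients if p["clinic"] == "A"][:30]

print(len(simple_random), len(stratified), len(convenience))
```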
NASA Astrophysics Data System (ADS)
Dioguardi, Fabio; Mele, Daniela
2018-03-01
This paper presents PYFLOW_2.0, a hazard tool for the calculation of the impact parameters of dilute pyroclastic density currents (DPDCs). DPDCs represent the dilute turbulent type of gravity flows that occur during explosive volcanic eruptions; their hazard is the result of their mobility and the capability to laterally impact buildings and infrastructures and to transport variable amounts of volcanic ash along the path. Starting from data coming from the analysis of deposits formed by DPDCs, PYFLOW_2.0 calculates the flow properties (e.g., velocity, bulk density, thickness) and impact parameters (dynamic pressure, deposition time) at the location of the sampled outcrop. Given the inherent uncertainties related to sampling, laboratory analyses, and modeling assumptions, the program provides ranges of variations and probability density functions of the impact parameters rather than single specific values; from these functions, the user can interrogate the program to obtain the value of the computed impact parameter at any specified exceedance probability. In this paper, the sedimentological models implemented in PYFLOW_2.0 are presented, program functionalities are briefly introduced, and two application examples are discussed so as to show the capabilities of the software in quantifying the impact of the analyzed DPDCs in terms of dynamic pressure, volcanic ash concentration, and residence time in the atmosphere. The software and user's manual are made available as a downloadable electronic supplement.
An Asymptotically-Optimal Sampling-Based Algorithm for Bi-directional Motion Planning
Starek, Joseph A.; Gomez, Javier V.; Schmerling, Edward; Janson, Lucas; Moreno, Luis; Pavone, Marco
2015-01-01
Bi-directional search is a widely used strategy to increase the success and convergence rates of sampling-based motion planning algorithms. Yet, few results are available that merge both bi-directional search and asymptotic optimality into existing optimal planners, such as PRM*, RRT*, and FMT*. The objective of this paper is to fill this gap. Specifically, this paper presents a bi-directional, sampling-based, asymptotically-optimal algorithm named Bi-directional FMT* (BFMT*) that extends the Fast Marching Tree (FMT*) algorithm to bidirectional search while preserving its key properties, chiefly lazy search and asymptotic optimality through convergence in probability. BFMT* performs a two-source, lazy dynamic programming recursion over a set of randomly-drawn samples, correspondingly generating two search trees: one in cost-to-come space from the initial configuration and another in cost-to-go space from the goal configuration. Numerical experiments illustrate the advantages of BFMT* over its unidirectional counterpart, as well as a number of other state-of-the-art planners. PMID:27004130
DOE Office of Scientific and Technical Information (OSTI.GOV)
Butler, Troy; Wildey, Timothy
2018-01-01
In this study, we develop a procedure to utilize error estimates for samples of a surrogate model to compute robust upper and lower bounds on estimates of probabilities of events. We show that these error estimates can also be used in an adaptive algorithm to simultaneously reduce the computational cost and increase the accuracy in estimating probabilities of events using computationally expensive high-fidelity models. Specifically, we introduce the notion of reliability of a sample of a surrogate model, and we prove that utilizing the surrogate model for the reliable samples and the high-fidelity model for the unreliable samples gives precisely the same estimate of the probability of the output event as would be obtained by evaluation of the original model for each sample. The adaptive algorithm uses the additional evaluations of the high-fidelity model for the unreliable samples to locally improve the surrogate model near the limit state, which significantly reduces the number of high-fidelity model evaluations as the limit state is resolved. Numerical results based on a recently developed adjoint-based approach for estimating the error in samples of a surrogate are provided to demonstrate (1) the robustness of the bounds on the probability of an event, and (2) that the adaptive enhancement algorithm provides a more accurate estimate of the probability of the QoI event than standard response surface approximation methods at a lower computational cost.
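The reliability idea can be sketched as follows; this is a toy illustration under assumed surrogate, error-bound, and high-fidelity functions, not the authors' adjoint-based error estimator.

```python
# Minimal sketch: use a cheap surrogate plus a per-sample error bound to decide which
# samples are "reliable"; only unreliable samples (those whose error bound could flip the
# event indicator) are re-evaluated with the expensive model. All functions are hypothetical.
import numpy as np

rng = np.random.default_rng(7)

def high_fidelity(x):           # expensive model (hypothetical)
    return np.sin(3 * x) + 0.3 * x**2

def surrogate(x):               # cheap approximation (hypothetical)
    return np.sin(3 * x) + 0.3 * x**2 + 0.05 * np.cos(20 * x)

def error_bound(x):             # assumed computable error estimate for the surrogate
    return np.full_like(x, 0.05)

threshold = 0.5                 # event: model output exceeds the threshold
x = rng.uniform(-1, 1, 10_000)  # Monte Carlo samples of the input

s = surrogate(x)
reliable = np.abs(s - threshold) > error_bound(x)   # error cannot change the classification
event = np.empty_like(s, dtype=bool)
event[reliable] = s[reliable] > threshold
event[~reliable] = high_fidelity(x[~reliable]) > threshold  # few expensive evaluations

print("P(event) ~", event.mean(), "| high-fidelity calls:", int((~reliable).sum()))
```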
HABITAT ASSESSMENT USING A RANDOM PROBABILITY BASED SAMPLING DESIGN: ESCAMBIA RIVER DELTA, FLORIDA
Smith, Lisa M., Darrin D. Dantin and Steve Jordan. In press. Habitat Assessment Using a Random Probability Based Sampling Design: Escambia River Delta, Florida (Abstract). To be presented at the SWS/GERS Fall Joint Society Meeting: Communication and Collaboration: Coastal Systems...
Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.
Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R
2018-05-26
Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non-ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations, we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design. Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occupancy estimates for some bat species. Using simulations we show the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study the designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits). Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases spatial extents for statistical inferences. Our results suggest that ignoring the mechanism for how locations were selected for data collection (e.g., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information in addition to the data themselves must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved.
Space shuttle solid rocket booster recovery system definition, volume 1
NASA Technical Reports Server (NTRS)
1973-01-01
The performance requirements, preliminary designs, and development program plans for an airborne recovery system for the space shuttle solid rocket booster are discussed. The analyses performed during the study phase of the program are presented. The basic considerations which established the system configuration are defined. A Monte Carlo statistical technique using random sampling of the probability distribution for the critical water impact parameters was used to determine the failure probability of each solid rocket booster component as functions of impact velocity and component strength capability.
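A minimal sketch of the Monte Carlo step described above, with hypothetical distributions for impact velocity and component strength and an assumed load model.

```python
# Minimal sketch: estimate a component failure probability by random sampling of the
# water-impact velocity and component strength. All parameter values are hypothetical.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

impact_velocity = rng.normal(loc=25.0, scale=4.0, size=n)           # m/s, hypothetical
strength = rng.lognormal(mean=np.log(900.0), sigma=0.15, size=n)     # kPa, hypothetical
impact_load = 1.2 * impact_velocity**2                               # simple load model (assumed)

failure_probability = np.mean(impact_load > strength)
print(f"Estimated component failure probability: {failure_probability:.4f}")
```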
On the Analysis of Case-Control Studies in Cluster-correlated Data Settings.
Haneuse, Sebastien; Rivera-Rodriguez, Claudia
2018-01-01
In resource-limited settings, long-term evaluation of national antiretroviral treatment (ART) programs often relies on aggregated data, the analysis of which may be subject to ecological bias. As researchers and policy makers consider evaluating individual-level outcomes such as treatment adherence or mortality, the well-known case-control design is appealing in that it provides efficiency gains over random sampling. In the context that motivates this article, valid estimation and inference requires acknowledging any clustering, although, to our knowledge, no statistical methods have been published for the analysis of case-control data for which the underlying population exhibits clustering. Furthermore, in the specific context of an ongoing collaboration in Malawi, rather than performing case-control sampling across all clinics, case-control sampling within clinics has been suggested as a more practical strategy. To our knowledge, although similar outcome-dependent sampling schemes have been described in the literature, a case-control design specific to correlated data settings is new. In this article, we describe this design, discuss balanced versus unbalanced sampling techniques, and provide a general approach to analyzing case-control studies in cluster-correlated settings based on inverse probability-weighted generalized estimating equations. Inference is based on a robust sandwich estimator with correlation parameters estimated to ensure appropriate accounting of the outcome-dependent sampling scheme. We conduct comprehensive simulations, based in part on real data on a sample of N = 78,155 program registrants in Malawi between 2005 and 2007, to evaluate small-sample operating characteristics and potential trade-offs associated with standard case-control sampling or when case-control sampling is performed within clusters.
Computer program determines exact two-sided tolerance limits for normal distributions
NASA Technical Reports Server (NTRS)
Friedman, H. A.; Webb, S. R.
1968-01-01
Computer program determines by numerical integration the exact statistical two-sided tolerance limits, when the proportion between the limits is at least a specified number. The program is limited to situations in which the underlying probability distribution for the population sampled is the normal distribution with unknown mean and variance.
Spatially balanced survey designs for natural resources
Ecological resource monitoring programs typically require the use of a probability survey design to select locations or entities to be physically sampled in the field. The ecological resource of interest, the target population, occurs over a spatial domain and the sample selecte...
ERIC Educational Resources Information Center
Berry, Kenneth J.; And Others
1977-01-01
A FORTRAN program, GAMMA, computes Goodman and Kruskal's coefficient of ordinal association, gamma, and Somer's coefficient. The program also provides associated standard errors, standard scores, and probability values. (Author/JKS)
Calibrating SALT: a sampling scheme to improve estimates of suspended sediment yield
Robert B. Thomas
1986-01-01
SALT (Selection At List Time) is a variable probability sampling scheme that provides unbiased estimates of suspended sediment yield and its variance. SALT performs better than standard schemes for estimating variance. Sampling probabilities are based on a sediment rating function which promotes greater sampling intensity during periods of high...
Burkness, Eric C; Hutchison, W D
2009-10-01
Populations of cabbage looper, Trichoplusia ni (Lepidoptera: Noctuidae), were sampled in experimental plots and commercial fields of cabbage (Brassica spp.) in Minnesota during 1998-1999 as part of a larger effort to implement an integrated pest management program. Using a resampling approach and Wald's sequential probability ratio test, sampling plans with different sampling parameters were evaluated using independent presence/absence and enumerative data. Evaluations and comparisons of the different sampling plans were made based on the operating characteristic and average sample number functions generated for each plan and through the use of a decision probability matrix. Values for upper and lower decision boundaries, sequential error rates (alpha, beta), and tally threshold were modified to determine parameter influence on the operating characteristic and average sample number functions. The following parameters resulted in the most desirable operating characteristic and average sample number functions: action threshold of 0.1 proportion of plants infested, tally threshold of 1, alpha = beta = 0.1, upper boundary of 0.15, lower boundary of 0.05, and resampling with replacement. We found that sampling parameters can be modified and evaluated using resampling software to achieve desirable operating characteristic and average sample number functions. Moreover, management of T. ni by using binomial sequential sampling should provide a good balance between cost and reliability by minimizing sample size and maintaining a high level of correct decisions (>95%) to treat or not treat.
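A minimal sketch of Wald's sequential probability ratio test for binomial presence/absence counts, using the boundary and error-rate values quoted above (lower boundary 0.05, upper boundary 0.15, alpha = beta = 0.1); the per-plant infestation sequence is hypothetical.

```python
import math

p0, p1, alpha, beta = 0.05, 0.15, 0.10, 0.10
denom = math.log((p1 * (1 - p0)) / (p0 * (1 - p1)))
slope = math.log((1 - p0) / (1 - p1)) / denom
h_treat = math.log((1 - beta) / alpha) / denom
h_no_treat = math.log((1 - alpha) / beta) / denom

def sprt(infested_sequence):
    """Return ('treat' | 'no treatment' | 'continue sampling', plants examined)."""
    cumulative = 0
    for n, infested in enumerate(infested_sequence, start=1):
        cumulative += infested
        if cumulative >= slope * n + h_treat:        # cross the upper decision boundary
            return "treat", n
        if cumulative <= slope * n - h_no_treat:     # cross the lower decision boundary
            return "no treatment", n
    return "continue sampling", len(infested_sequence)

# Hypothetical field walk: 1 = plant infested with T. ni larvae, 0 = clean.
print(sprt([0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1]))
```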
Faster computation of exact RNA shape probabilities.
Janssen, Stefan; Giegerich, Robert
2010-03-01
Abstract shape analysis allows efficient computation of a representative sample of low-energy foldings of an RNA molecule. More comprehensive information is obtained by computing shape probabilities, accumulating the Boltzmann probabilities of all structures within each abstract shape. Such information is superior to free energies because it is independent of sequence length and base composition. However, up to this point, computation of shape probabilities evaluates all shapes simultaneously and comes with a computation cost which is exponential in the length of the sequence. We devise an approach called RapidShapes that computes the shapes above a specified probability threshold T by generating a list of promising shapes and constructing specialized folding programs for each shape to compute its share of Boltzmann probability. This aims at a heuristic improvement of runtime, while still computing exact probability values. Evaluating this approach and several substrategies, we find that only a small proportion of shapes have to be actually computed. For an RNA sequence of length 400, this leads, depending on the threshold, to a 10- to 138-fold speed-up compared with the previous complete method. Thus, probabilistic shape analysis has become feasible in medium-scale applications, such as the screening of RNA transcripts in a bacterial genome. RapidShapes is available via http://bibiserv.cebitec.uni-bielefeld.de/rnashapes
VARIANCE ESTIMATION FOR SPATIALLY BALANCED SAMPLES OF ENVIRONMENTAL RESOURCES
The spatial distribution of a natural resource is an important consideration in designing an efficient survey or monitoring program for the resource. We review a unified strategy for designing probability samples of discrete, finite resource populations, such as lakes within som...
The purpose of this manuscript is to describe the practical strategies developed for the implementation of the Minnesota Children's Pesticide Exposure Study (MNCPES), which is one of the first probability-based samples of multi-pathway and multi-pesticide exposures in children....
Forestry inventory based on multistage sampling with probability proportional to size
NASA Technical Reports Server (NTRS)
Lee, D. C. L.; Hernandez, P., Jr.; Shimabukuro, Y. E.
1983-01-01
A multistage sampling technique, with probability proportional to size, is developed for a forest volume inventory using remote sensing data. LANDSAT data, panchromatic aerial photographs, and field data are collected. Based on age and homogeneity, pine and eucalyptus classes are identified. Selection of tertiary sampling units is made through aerial photographs to minimize field work. The sampling errors for eucalyptus and pine ranged from 8.34 to 21.89 percent and from 7.18 to 8.60 percent, respectively.
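A minimal sketch of one building block of this design, selection with probability proportional to size (PPS); the stand list and sizes are hypothetical, and selection is with replacement via the cumulative-total method.

```python
import random

random.seed(3)
stands = {"stand_1": 120.0, "stand_2": 45.0, "stand_3": 300.0, "stand_4": 80.0}  # hectares

def pps_with_replacement(sizes, n_draws):
    """Draw units with probability proportional to size (cumulative-total method)."""
    units, weights = zip(*sizes.items())
    return random.choices(units, weights=weights, k=n_draws)  # selection prob = size / total

sample = pps_with_replacement(stands, n_draws=3)
print(sample)
```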
Public Attitudes toward Stuttering in Turkey: Probability versus Convenience Sampling
ERIC Educational Resources Information Center
Ozdemir, R. Sertan; St. Louis, Kenneth O.; Topbas, Seyhun
2011-01-01
Purpose: A Turkish translation of the "Public Opinion Survey of Human Attributes-Stuttering" ("POSHA-S") was used to compare probability versus convenience sampling to measure public attitudes toward stuttering. Method: A convenience sample of adults in Eskisehir, Turkey was compared with two replicates of a school-based,…
Royle, J. Andrew; Chandler, Richard B.; Yackulic, Charles; Nichols, James D.
2012-01-01
1. Understanding the factors affecting species occurrence is a pre-eminent focus of applied ecological research. However, direct information about species occurrence is lacking for many species. Instead, researchers sometimes have to rely on so-called presence-only data (i.e. when no direct information about absences is available), which often results from opportunistic, unstructured sampling. MAXENT is a widely used software program designed to model and map species distribution using presence-only data. 2. We provide a critical review of MAXENT as applied to species distribution modelling and discuss how it can lead to inferential errors. A chief concern is that MAXENT produces a number of poorly defined indices that are not directly related to the actual parameter of interest – the probability of occurrence (ψ). This focus on an index was motivated by the belief that it is not possible to estimate ψ from presence-only data; however, we demonstrate that ψ is identifiable using conventional likelihood methods under the assumptions of random sampling and constant probability of species detection. 3. The model is implemented in a convenient r package which we use to apply the model to simulated data and data from the North American Breeding Bird Survey. We demonstrate that MAXENT produces extreme under-predictions when compared to estimates produced by logistic regression which uses the full (presence/absence) data set. We note that MAXENT predictions are extremely sensitive to specification of the background prevalence, which is not objectively estimated using the MAXENT method. 4. As with MAXENT, formal model-based inference requires a random sample of presence locations. Many presence-only data sets, such as those based on museum records and herbarium collections, may not satisfy this assumption. However, when sampling is random, we believe that inference should be based on formal methods that facilitate inference about interpretable ecological quantities instead of vaguely defined indices.
Fatigue crack growth model RANDOM2 user manual, appendix 1
NASA Technical Reports Server (NTRS)
Boyce, Lola; Lovelace, Thomas B.
1989-01-01
The FORTRAN program RANDOM2 is documented. RANDOM2 is based on fracture mechanics using a probabilistic fatigue crack growth model. It predicts the random lifetime of an engine component to reach a given crack size. Included in this user manual are details regarding the theoretical background of RANDOM2, input data, instructions and a sample problem illustrating the use of RANDOM2. Appendix A gives information on the physical quantities, their symbols, FORTRAN names, and both SI and U.S. Customary units. Appendix B includes photocopies of the actual computer printout corresponding to the sample problem. Appendices C and D detail the IMSL, Ver. 10(1), subroutines and functions called by RANDOM2 and a SAS/GRAPH(2) program that can be used to plot both the probability density function (p.d.f.) and the cumulative distribution function (c.d.f.).
Using GIS to generate spatially balanced random survey designs for natural resource applications.
Theobald, David M; Stevens, Don L; White, Denis; Urquhart, N Scott; Olsen, Anthony R; Norman, John B
2007-07-01
Sampling of a population is frequently required to understand trends and patterns in natural resource management because financial and time constraints preclude a complete census. A rigorous probability-based survey design specifies where to sample so that inferences from the sample apply to the entire population. Probability survey designs should be used in natural resource and environmental management situations because they provide the mathematical foundation for statistical inference. Development of long-term monitoring designs demand survey designs that achieve statistical rigor and are efficient but remain flexible to inevitable logistical or practical constraints during field data collection. Here we describe an approach to probability-based survey design, called the Reversed Randomized Quadrant-Recursive Raster, based on the concept of spatially balanced sampling and implemented in a geographic information system. This provides environmental managers a practical tool to generate flexible and efficient survey designs for natural resource applications. Factors commonly used to modify sampling intensity, such as categories, gradients, or accessibility, can be readily incorporated into the spatially balanced sample design.
To assess the ecological condition of streams and rivers in Oregon, we sampled 146 sites in summer 1997 as part of the U.S. EPA's Environmental Monitoring and Assessment Program. Sample reaches were selected using a systematic, randomized sample design from the blue-line n...
This statistical summary reports data from the Environmental Monitoring and Assessment Program (EMAP) Western Pilot (EMAP-W). EMAP-W was a sample survey (or probability survey, often simply called 'random') of streams and rivers in 12 states of the western U.S. (Arizona, Californ...
Nurse Family Partnership: Comparing Costs per Family in Randomized Trials Versus Scale-Up.
Miller, Ted R; Hendrie, Delia
2015-12-01
The literature that addresses cost differences between randomized trials and full-scale replications is quite sparse. This paper examines how costs differed among three randomized trials and six statewide scale-ups of Nurse Family Partnership (NFP) intensive home visitation to low-income first-time mothers. A literature review provided data on pertinent trials. At our request, six well-established programs reported their total expenditures. We adjusted the costs to national prices based on mean hourly wages for registered nurses and then inflated them to 2010 dollars. A centralized data system provided utilization. Replications had fewer home visits per family than trials (25 vs. 31, p = .05), lower costs per client ($8860 vs. $12,398, p = .01), and lower costs per visit ($354 vs. $400, p = .30). Sample size limited the significance of these differences. In this type of labor-intensive program, costs probably were lower in scale-up than in randomized trials. Key cost drivers were attrition and the stable caseload size possible in an ongoing program. Our estimates reveal a wide variation in cost per visit across six state programs, which suggests that those planning replications should not expect a simple rule to guide cost estimations for scale-ups. Nevertheless, NFP replications probably achieved some economies of scale.
Ellison, Laura E.; Lukacs, Paul M.
2014-01-01
Concern for migratory tree-roosting bats in North America has grown because of possible population declines from wind energy development. This concern has driven interest in estimating population-level changes. Mark-recapture methodology is one possible analytical framework for assessing bat population changes, but sample size requirements to produce reliable estimates have not been estimated. To illustrate the sample sizes necessary for a mark-recapture-based monitoring program we conducted power analyses using a statistical model that allows reencounters of live and dead marked individuals. We ran 1,000 simulations for each of five broad sample size categories in a Burnham joint model, and then compared the proportion of simulations in which 95% confidence intervals overlapped between and among years for a 4-year study. Additionally, we conducted sensitivity analyses of sample size to various capture probabilities and recovery probabilities. More than 50,000 individuals per year would need to be captured and released to accurately determine 10% and 15% declines in annual survival. To detect more dramatic declines of 33% or 50% in survival over four years, sample sizes of 25,000 or 10,000 per year, respectively, would be sufficient. Sensitivity analyses reveal that increasing recovery of dead marked individuals may be more valuable than increasing capture probability of marked individuals. Because of the extraordinary effort that would be required, we advise caution should such a mark-recapture effort be initiated because of the difficulty in attaining reliable estimates. We make recommendations for what techniques show the most promise for mark-recapture studies of bats because some techniques violate the assumptions of mark-recapture methodology when used to mark bats.
Assessing performance and validating finite element simulations using probabilistic knowledge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dolin, Ronald M.; Rodriguez, E. A.
Two probabilistic approaches for assessing performance are presented. The first approach assesses probability of failure by simultaneously modeling all likely events. The probability each event causes failure along with the event's likelihood of occurrence contribute to the overall probability of failure. The second assessment method is based on stochastic sampling using an influence diagram. Latin-hypercube sampling is used to stochastically assess events. The overall probability of failure is taken as the maximum probability of failure of all the events. The Likelihood of Occurrence simulation suggests failure does not occur while the Stochastic Sampling approach predicts failure. The Likelihood of Occurrence results are used to validate finite element predictions.
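A minimal sketch of the Latin-hypercube sampling step mentioned above: one draw per equal-probability stratum of each input, with strata randomly paired across inputs. The variables and ranges are hypothetical.

```python
import numpy as np

def latin_hypercube(n_samples, n_vars, rng):
    """Return an (n_samples, n_vars) array of Latin-hypercube points on [0, 1)."""
    cut = (np.arange(n_samples) + rng.random((n_vars, n_samples))) / n_samples  # one point per stratum
    for row in cut:
        rng.shuffle(row)            # decouple the strata pairings across variables
    return cut.T

rng = np.random.default_rng(0)
unit_samples = latin_hypercube(n_samples=10, n_vars=2, rng=rng)
load = 50 + 20 * unit_samples[:, 0]       # map to hypothetical physical ranges
strength = 60 + 30 * unit_samples[:, 1]
print("P(failure) estimate:", np.mean(load > strength))
```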
Sampling considerations for disease surveillance in wildlife populations
Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.
2008-01-01
Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.
Simulating future uncertainty to guide the selection of survey designs for long-term monitoring
Garman, Steven L.; Schweiger, E. William; Manier, Daniel J.; Gitzen, Robert A.; Millspaugh, Joshua J.; Cooper, Andrew B.; Licht, Daniel S.
2012-01-01
A goal of environmental monitoring is to provide sound information on the status and trends of natural resources (Messer et al. 1991, Theobald et al. 2007, Fancy et al. 2009). When monitoring observations are acquired by measuring a subset of the population of interest, probability sampling as part of a well-constructed survey design provides the most reliable and legally defensible approach to achieve this goal (Cochran 1977, Olsen et al. 1999, Schreuder et al. 2004; see Chapters 2, 5, 6, 7). Previous works have described the fundamentals of sample surveys (e.g. Hansen et al. 1953, Kish 1965). Interest in survey designs and monitoring over the past 15 years has led to extensive evaluations and new developments of sample selection methods (Stevens and Olsen 2004), of strategies for allocating sample units in space and time (Urquhart et al. 1993, Overton and Stehman 1996, Urquhart and Kincaid 1999), and of estimation (Lesser and Overton 1994, Overton and Stehman 1995) and variance properties (Larsen et al. 1995, Stevens and Olsen 2003) of survey designs. Carefully planned, “scientific” (Chapter 5) survey designs have become a standard in contemporary monitoring of natural resources. Based on our experience with the long-term monitoring program of the US National Park Service (NPS; Fancy et al. 2009; Chapters 16, 22), operational survey designs tend to be selected using the following procedures. For a monitoring indicator (i.e. variable or response), a minimum detectable trend requirement is specified, based on the minimum level of change that would result in meaningful change (e.g. degradation). A probability of detecting this trend (statistical power) and an acceptable level of uncertainty (Type I error; see Chapter 2) within a specified time frame (e.g. 10 years) are specified to ensure timely detection. Explicit statements of the minimum detectable trend, the time frame for detecting the minimum trend, power, and acceptable probability of Type I error (α) collectively form the quantitative sampling objective.
System Risk Balancing Profiles: Software Component
NASA Technical Reports Server (NTRS)
Kelly, John C.; Sigal, Burton C.; Gindorf, Tom
2000-01-01
The Software QA / V&V guide will be reviewed and updated based on feedback from NASA organizations and others with a vested interest in this area. Hardware, EEE Parts, Reliability, and Systems Safety are a sample of the future guides that will be developed. Cost Estimates, Lessons Learned, Probability of Failure and PACTS (Prevention, Avoidance, Control or Test) are needed to provide a more complete risk management strategy. This approach to risk management is designed to help balance the resources and program content for risk reduction for NASA's changing environment.
Disentangling sampling and ecological explanations underlying species-area relationships
Cam, E.; Nichols, J.D.; Hines, J.E.; Sauer, J.R.; Alpizar-Jara, R.; Flather, C.H.
2002-01-01
We used a probabilistic approach to address the influence of sampling artifacts on the form of species-area relationships (SARs). We developed a model in which the increase in observed species richness is a function of sampling effort exclusively. We assumed that effort depends on area sampled, and we generated species-area curves under that model. These curves can be realistic looking. We then generated SARs from avian data, comparing SARs based on counts with those based on richness estimates. We used an approach to estimation of species richness that accounts for species detection probability and, hence, for variation in sampling effort. The slopes of SARs based on counts are steeper than those of curves based on estimates of richness, indicating that the former partly reflect failure to account for species detection probability. SARs based on estimates reflect ecological processes exclusively, not sampling processes. This approach permits investigation of ecologically relevant hypotheses. The slope of SARs is not influenced by the slope of the relationship between habitat diversity and area. In situations in which not all of the species are detected during sampling sessions, approaches to estimation of species richness integrating species detection probability should be used to investigate the rate of increase in species richness with area.
Tanadini, Lorenzo G; Schmidt, Benedikt R
2011-01-01
Monitoring is an integral part of species conservation. Monitoring programs must take imperfect detection of species into account in order to be reliable. Theory suggests that detection probability may be determined by population size but this relationship has not yet been assessed empirically. Population size is particularly important because it may induce heterogeneity in detection probability and thereby cause bias in estimates of biodiversity. We used a site occupancy model to analyse data from a volunteer-based amphibian monitoring program to assess how well different variables explain variation in detection probability. An index to population size best explained detection probabilities for four out of six species (to avoid circular reasoning, we used the count of individuals at a previous site visit as an index to current population size). The relationship between the population index and detection probability was positive. Commonly used weather variables best explained detection probabilities for two out of six species. Estimates of site occupancy probabilities differed depending on whether the population index was or was not used to model detection probability. The relationship between the population index and detectability has implications for the design of monitoring and species conservation. Most importantly, because many small populations are likely to be overlooked, monitoring programs should be designed in such a way that small populations are not overlooked. The results also imply that methods cannot be standardized in such a way that detection probabilities are constant. As we have shown here, one can easily account for variation in population size in the analysis of data from long-term monitoring programs by using counts of individuals from surveys at the same site in previous years. Accounting for variation in population size is important because it can affect the results of long-term monitoring programs and ultimately the conservation of imperiled species.
Modulation Based on Probability Density Functions
NASA Technical Reports Server (NTRS)
Williams, Glenn L.
2009-01-01
A proposed method of modulating a sinusoidal carrier signal to convey digital information involves the use of histograms representing probability density functions (PDFs) that characterize samples of the signal waveform. The method is based partly on the observation that when a waveform is sampled (whether by analog or digital means) over a time interval at least as long as one half cycle of the waveform, the samples can be sorted by frequency of occurrence, thereby constructing a histogram representing a PDF of the waveform during that time interval.
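A minimal sketch of the sampling-and-sorting step described above: samples of a sinusoid taken over a full cycle (more than the half-cycle minimum) are binned into a histogram, which approximates the waveform's arcsine-shaped PDF. Sample rate and bin count are arbitrary choices.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 2_000, endpoint=False)   # one full cycle of a 1 Hz carrier
samples = np.sin(2 * np.pi * t)

# Sort samples by value into bins; with density=True the histogram approximates the PDF.
counts, edges = np.histogram(samples, bins=10, range=(-1.0, 1.0), density=True)
for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"[{lo:+.1f}, {hi:+.1f}): {c:.2f}")   # density piles up near the +/-1 extremes
```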
USDA-ARS's Scientific Manuscript database
Small, coded, pill-sized tracers embedded in grain are proposed as a method for grain traceability. A sampling process for a grain traceability system was designed and investigated by applying probability statistics using a science-based sampling approach to collect an adequate number of tracers fo...
Klimstra, J.D.; O'Connell, A.F.; Pistrang, M.J.; Lewis, L.M.; Herrig, J.A.; Sauer, J.R.
2007-01-01
Science-based monitoring of biological resources is important for a greater understanding of ecological systems and for assessment of the target population using theoretic-based management approaches. When selecting variables to monitor, managers first need to carefully consider their objectives, the geographic and temporal scale at which they will operate, and the effort needed to implement the program. Generally, monitoring can be divided into two categories: index and inferential. Although index monitoring is usually easier to implement, analysis of index data requires strong assumptions about consistency in detection rates over time and space, and parameters are often biased when detectability and spatial variation are not accounted for. In most cases, individuals are not always available for detection during sampling periods, and the entire area of interest cannot be sampled. Conversely, inferential monitoring is more rigorous because it is based on nearly unbiased estimators of spatial distribution. Thus, we recommend that detectability and spatial variation be considered for all monitoring programs that intend to make inferences about the target population or the area of interest. Application of these techniques is especially important for the monitoring of Threatened and Endangered (T&E) species because it is critical to determine with some level of certainty whether population size is increasing or decreasing. Use of estimation-based methods and probability sampling will reduce many of the biases inherently associated with index data and provide meaningful information with respect to changes that occur in target populations. We incorporated inferential monitoring into protocols for T&E species spanning a wide range of taxa on the Cherokee National Forest in the Southern Appalachian Mountains. We review the various approaches employed for different taxa and discuss design issues, sampling strategies, data analysis, and the details of estimating detectability using site occupancy. These techniques provide a science-based approach for monitoring and can be of value to all resource managers responsible for management of T&E species.
Ranganathan, Anjana; Dougherty, Meredith; Waite, David
2013-01-01
Objective: This study examined the impact of palliative home nursing care on rates of hospital 30-day readmissions. Methods: The electronic health record-based retrospective cohort study was performed within home care and palliative home care programs. Participants were home care patients discharged from one of three urban teaching hospitals. Outcome measures were propensity score matched rates of hospital readmissions within 30 days of hospital discharge. Results: Of 406 palliative home care patients, matches were identified for 392 (96%). Of 15,709 home care patients, 890 were used at least once as a match for palliative care patients, for a total final sample of 1282. Using the matched sample we calculated the average treatment effect for treated patients. In this sample, palliative care patients had a 30-day readmission probability of 9.1% compared to a probability of 17.4% in the home care group (mean ATT: 8.3%; 95% confidence interval [CI] 8.0%–8.6%). This effect persisted after adjustment for visit frequency. Conclusions: Palliative home care may offer benefits to health systems by allowing patients to remain at home and thereby avoiding 30-day rehospitalizations. PMID:24007348
Acceptability of smartphone application-based HIV prevention among young men who have sex with men.
Holloway, Ian W; Rice, Eric; Gibbs, Jeremy; Winetrobe, Hailey; Dunlap, Shannon; Rhoades, Harmony
2014-02-01
Young men who have sex with men (YMSM) are increasingly using mobile smartphone applications ("apps"), such as Grindr, to meet sex partners. A probability sample of 195 Grindr-using YMSM in Southern California were administered an anonymous online survey to assess patterns of and motivations for Grindr use in order to inform development and tailoring of smartphone-based HIV prevention for YMSM. The number one reason for using Grindr (29 %) was to meet "hook ups." Among those participants who used both Grindr and online dating sites, a statistically significantly greater percentage used online dating sites for "hook ups" (42 %) compared to Grindr (30 %). Seventy percent of YMSM expressed a willingness to participate in a smartphone app-based HIV prevention program. Development and testing of smartphone apps for HIV prevention delivery has the potential to engage YMSM in HIV prevention programming, which can be tailored based on use patterns and motivations for use.
Ash, A; Schwartz, M; Payne, S M; Restuccia, J D
1990-11-01
Medical record review is increasing in importance as the need to identify and monitor utilization and quality of care problems grows. To conserve resources, reviews are usually performed on a subset of cases. If judgment is used to identify subgroups for review, this raises the following questions: How should subgroups be determined, particularly since the locus of problems can change over time? What standard of comparison should be used in interpreting rates of problems found in subgroups? How can population problem rates be estimated from observed subgroup rates? How can the bias be avoided that arises because reviewers know that selected cases are suspected of having problems? How can changes in problem rates over time be interpreted when evaluating intervention programs? Simple random sampling, an alternative to subgroup review, overcomes the problems implied by these questions but is inefficient. The Self-Adapting Focused Review System (SAFRS), introduced and described here, provides an adaptive approach to record selection that is based upon model-weighted probability sampling. It retains the desirable inferential properties of random sampling while allowing reviews to be concentrated on cases currently thought most likely to be problematic. Model development and evaluation are illustrated using hospital data to predict inappropriate admissions.
Nongpiur, Monisha E; Haaland, Benjamin A; Perera, Shamira A; Friedman, David S; He, Mingguang; Sakata, Lisandro M; Baskaran, Mani; Aung, Tin
2014-01-01
To develop a score along with an estimated probability of disease for detecting angle closure based on anterior segment optical coherence tomography (AS OCT) imaging. Cross-sectional study. A total of 2047 subjects 50 years of age and older were recruited from a community polyclinic in Singapore. All subjects underwent standardized ocular examination including gonioscopy and imaging by AS OCT (Carl Zeiss Meditec). Customized software (Zhongshan Angle Assessment Program) was used to measure AS OCT parameters. Complete data were available for 1368 subjects. Data from the right eyes were used for analysis. A stepwise logistic regression model with Akaike information criterion was used to generate a score that then was converted to an estimated probability of the presence of gonioscopic angle closure, defined as the inability to visualize the posterior trabecular meshwork for at least 180 degrees on nonindentation gonioscopy. Of the 1368 subjects, 295 (21.6%) had gonioscopic angle closure. The angle closure score was calculated from the shifted linear combination of the AS OCT parameters. The score can be converted to an estimated probability of having angle closure using the relationship: estimated probability = e(score)/(1 + e(score)), where e is the natural exponential. The score performed well in a second independent sample of 178 angle-closure subjects and 301 normal controls, with an area under the receiver operating characteristic curve of 0.94. A score derived from a single AS OCT image, coupled with an estimated probability, provides an objective platform for detection of angle closure. Copyright © 2014 Elsevier Inc. All rights reserved.
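The abstract states the conversion from score to probability explicitly; a minimal sketch of that logistic transformation is shown below. The example scores are hypothetical, and the fitted AS OCT coefficients that produce a real score are not reproduced here.

```python
import math

def angle_closure_probability(score: float) -> float:
    """Convert a logistic score to an estimated probability of angle closure.
    Algebraically identical to e^score / (1 + e^score), as stated in the abstract,
    but written in a numerically stable form."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical example scores only: the actual score is a shifted linear combination
# of AS OCT parameters with coefficients fitted by the authors (not reproduced here).
for score in (-2.0, 0.0, 1.5):
    print(score, round(angle_closure_probability(score), 3))
```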
A Monte Carlo study of Weibull reliability analysis for space shuttle main engine components
NASA Technical Reports Server (NTRS)
Abernethy, K.
1986-01-01
The incorporation of a number of additional capabilities into an existing Weibull analysis computer program and the results of a Monte Carlo simulation study to evaluate the usefulness of Weibull methods with samples containing very few failures and extensive censoring are discussed. Since the censoring mechanism inherent in the Space Shuttle Main Engine (SSME) data is hard to analyze, it was decided to use a random censoring model, generating censoring times from a uniform probability distribution. Some of the statistical techniques and computer programs that are used in the SSME Weibull analysis are described. The previously documented methods were supplemented by adding computer calculations of approximate confidence intervals (using iterative methods) for several parameters of interest. These calculations are based on a likelihood ratio statistic which is asymptotically a chi-squared statistic with one degree of freedom. The assumptions built into the computer simulations, the simulation program, and the techniques used in it are also described. Simulation results are tabulated for various combinations of Weibull shape parameters and the numbers of failures in the samples.
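A minimal sketch of the random-censoring setup described above: failure times drawn from a Weibull distribution, censoring times from a uniform distribution, and the observed time taken as the smaller of the two. The shape, scale, and censoring bound are illustrative assumptions, not SSME values.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative parameters (assumptions, not SSME values)
shape, scale = 1.5, 1000.0   # Weibull shape and scale
censor_upper = 1500.0        # upper bound of the uniform censoring distribution
n = 20                       # small sample, as in the study design

failure_times = scale * rng.weibull(shape, size=n)
censor_times = rng.uniform(0.0, censor_upper, size=n)

observed_times = np.minimum(failure_times, censor_times)
failed = failure_times <= censor_times   # True -> failure observed, False -> censored

print(f"{failed.sum()} failures, {n - failed.sum()} censored observations")
```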
Shao, Jing-Yuan; Qu, Hai-Bin; Gong, Xing-Chu
2018-05-01
In this work, two algorithms for design space calculation (the overlapping method and the probability-based method) were compared using data collected from the extraction process of Codonopsis Radix as an example. In the probability-based method, experimental error was simulated to calculate the probability of reaching the standard. The effects of several parameters on the calculated design space were studied, including simulation number, step length, and the acceptable probability threshold. For the extraction process of Codonopsis Radix, 10,000 simulations and a calculation step length of 0.02 yielded a satisfactory design space. In general, the overlapping method is easy to understand and can be implemented in several kinds of commercial software without writing code, but it does not indicate the reliability of the process evaluation indexes when operating within the design space. The probability-based method is computationally more complex, but it quantifies the reliability with which the process indexes reach the standard at the acceptable probability threshold. In addition, the probability-based method produces no abrupt probability change at the edge of the design space. Therefore, the probability-based method is recommended for design space calculation. Copyright © by the Chinese Pharmaceutical Association.
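A hedged sketch of the probability-based criterion on a made-up response-surface model: at each grid point (step length 0.02), experimental error is simulated around the predicted response and the point is kept in the design space if the estimated probability of meeting the standard exceeds the acceptable threshold. The model, error level, standard, and threshold are all placeholders, not the Codonopsis Radix results.

```python
import numpy as np

rng = np.random.default_rng(0)

def predicted_yield(time_h, solvent_ratio):
    # Hypothetical response-surface model standing in for the fitted extraction model.
    return 50 + 8 * time_h + 5 * solvent_ratio - 1.5 * time_h * solvent_ratio

def prob_meeting_standard(time_h, solvent_ratio, standard=60.0,
                          error_sd=3.0, n_sim=10_000):
    """Estimate P(yield >= standard) by simulating experimental error."""
    sims = predicted_yield(time_h, solvent_ratio) + rng.normal(0, error_sd, n_sim)
    return np.mean(sims >= standard)

# Scan a normalized factor grid with step length 0.02 and keep the points whose
# probability of meeting the standard exceeds an acceptable threshold of 0.90.
grid = np.arange(0.0, 1.0 + 1e-9, 0.02)
design_space = [(t, s) for t in grid for s in grid
                if prob_meeting_standard(1 + 2 * t, 1 + 2 * s) >= 0.90]
print(f"{len(design_space)} grid points fall inside the design space")
```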
Rothmann, Mark
2005-01-01
When testing the equality of means from two different populations, a t-test or a large-sample normal test is typically performed. For these tests, when the sample size or design for the second sample is dependent on the results of the first sample, the type I error probability is altered for each specific possibility in the null hypothesis. We will examine the impact on the type I error probabilities for two confidence interval procedures and procedures using test statistics when the design for the second sample or experiment is dependent on the results from the first sample or experiment (or series of experiments). Ways of controlling a desired maximum type I error probability or a desired type I error rate will be discussed. Results are applied to the setting of noninferiority comparisons in active controlled trials where the use of a placebo is unethical.
Acar, Elif F; Sun, Lei
2013-06-01
Motivated by genetic association studies of SNPs with genotype uncertainty, we propose a generalization of the Kruskal-Wallis test that incorporates group uncertainty when comparing k samples. The extended test statistic is based on probability-weighted rank-sums and follows an asymptotic chi-square distribution with k - 1 degrees of freedom under the null hypothesis. Simulation studies confirm the validity and robustness of the proposed test in finite samples. Application to a genome-wide association study of type 1 diabetic complications further demonstrates the utilities of this generalized Kruskal-Wallis test for studies with group uncertainty. The method has been implemented as an open-source R program, GKW. © 2013, The International Biometric Society.
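As a sketch of the probability-weighted rank-sum idea, the function below forms rank sums weighted by group-membership probabilities and plugs them into a Kruskal-Wallis-style chi-square statistic with k - 1 degrees of freedom; with 0/1 membership it reduces to the ordinary Kruskal-Wallis H (no tie correction). This is one plausible form, not necessarily the exact GKW statistic; the authors' R program should be used for real analyses.

```python
import numpy as np
from scipy import stats

def weighted_kruskal_wallis(y, probs):
    """y: (N,) responses; probs: (N, k) group-membership probabilities (rows sum to 1).
    Returns a chi-square-style statistic and p-value with k - 1 df (a sketch, not the
    exact GKW form)."""
    y = np.asarray(y, float)
    probs = np.asarray(probs, float)
    N, k = probs.shape
    r = stats.rankdata(y)                      # mid-ranks of the pooled sample
    n_eff = probs.sum(axis=0)                  # effective group sizes
    R = probs.T @ r                            # probability-weighted rank sums
    H = 12.0 / (N * (N + 1)) * np.sum((R - n_eff * (N + 1) / 2.0) ** 2 / n_eff)
    return H, stats.chi2.sf(H, df=k - 1)

# With 0/1 membership probabilities this reproduces the usual Kruskal-Wallis H.
y = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]
probs = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
print(weighted_kruskal_wallis(y, probs))
```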
A combinatorial perspective of the protein inference problem.
Yang, Chao; He, Zengyou; Yu, Weichuan
2013-01-01
In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to accurately and efficiently identify proteins. Many methods have been proposed to facilitate the identification of proteins from peptide identification results. However, the relationship between protein identification and peptide identification has not been thoroughly explained before. In this paper, we devote ourselves to a combinatorial perspective of the protein inference problem. We employ combinatorial mathematics to calculate the conditional protein probabilities (protein probability means the probability that a protein is correctly identified) under three assumptions, which lead to a lower bound, an upper bound, and an empirical estimation of protein probabilities, respectively. The combinatorial perspective enables us to obtain an analytical expression for protein inference. Our method achieves comparable results with ProteinProphet in a more efficient manner in experiments on two data sets of standard protein mixtures and two data sets of real samples. Based on our model, we study the impact of unique peptides and degenerate peptides (degenerate peptides are peptides shared by at least two proteins) on protein probabilities. Meanwhile, we also study the relationship between our model and ProteinProphet. We name our program ProteinInfer. Its Java source code, our supplementary document and experimental results are available at: http://bioinformatics.ust.hk/proteininfer.
Nielson, Ryan M.; Gray, Brian R.; McDonald, Lyman L.; Heglund, Patricia J.
2011-01-01
Estimation of site occupancy rates when detection probabilities are <1 is well established in wildlife science. Data from multiple visits to a sample of sites are used to estimate detection probabilities and the proportion of sites occupied by focal species. In this article we describe how site occupancy methods can be applied to estimate occupancy rates of plants and other sessile organisms. We illustrate this approach and the pitfalls of ignoring incomplete detection using spatial data for 2 aquatic vascular plants collected under the Upper Mississippi River's Long Term Resource Monitoring Program (LTRMP). Site occupancy models considered include: a naïve model that ignores incomplete detection, a simple site occupancy model assuming a constant occupancy rate and a constant probability of detection across sites, several models that allow site occupancy rates and probabilities of detection to vary with habitat characteristics, and mixture models that allow for unexplained variation in detection probabilities. We used information theoretic methods to rank competing models and bootstrapping to evaluate the goodness-of-fit of the final models. Results of our analysis confirm that ignoring incomplete detection can result in biased estimates of occupancy rates. Estimates of site occupancy rates for 2 aquatic plant species were 19–36% higher compared to naive estimates that ignored probabilities of detection <1. Simulations indicate that final models have little bias when 50 or more sites are sampled, and little gains in precision could be expected for sample sizes >300. We recommend applying site occupancy methods for monitoring presence of aquatic species.
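The simple constant-rate occupancy model described above can be written down in a few lines. The sketch below uses illustrative detection histories (not the LTRMP data) and maximizes the likelihood in which a site with at least one detection contributes psi * p^d * (1 - p)^(K - d), while an all-zero history contributes psi * (1 - p)^K + (1 - psi).

```python
import numpy as np
from scipy.optimize import minimize

# Each row is one site's detection history over repeat visits (1 = detected).
histories = np.array([
    [1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0],
    [1, 1, 1], [0, 0, 0], [1, 0, 0], [0, 0, 0],
])

def neg_log_lik(params):
    # Logit parameterization keeps psi and p inside (0, 1).
    psi, p = 1.0 / (1.0 + np.exp(-params))
    K = histories.shape[1]
    det = histories.sum(axis=1)
    # Sites with >=1 detection are occupied; all-zero sites are either occupied but
    # missed on every visit, or unoccupied.
    lik = np.where(
        det > 0,
        psi * p**det * (1 - p)**(K - det),
        psi * (1 - p)**K + (1 - psi),
    )
    return -np.sum(np.log(lik))

fit = minimize(neg_log_lik, x0=[0.0, 0.0])
psi_hat, p_hat = 1.0 / (1.0 + np.exp(-fit.x))
print(f"occupancy = {psi_hat:.2f}, detection probability = {p_hat:.2f}")
```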
Framing From Experience: Cognitive Processes and Predictions of Risky Choice.
Gonzalez, Cleotilde; Mehlhorn, Katja
2016-07-01
A framing bias shows risk aversion in problems framed as "gains" and risk seeking in problems framed as "losses," even when the problems are objectively equivalent and probabilities and outcome values are explicitly provided. We test this framing bias in situations where decision makers rely on their own experience, sampling the problem's options (safe and risky) and seeing the outcomes before making a choice. In Experiment 1, we replicate the framing bias in description-based decisions and find risk indifference in gains and losses in experience-based decisions. Predictions of an Instance-Based Learning model suggest that objective probabilities as well as the number of samples taken are factors that contribute to the lack of a framing effect. We test these two factors in Experiment 2 and find no framing effect when only a few samples are taken; when large samples are taken, the framing effect appears regardless of the objective probability values. Implications of behavioral results and cognitive modeling are discussed. Copyright © 2015 Cognitive Science Society, Inc.
A dynamic programming approach to estimate the capacity value of energy storage
Sioshansi, Ramteen; Madaeni, Seyed Hossein; Denholm, Paul
2013-09-17
Here, we present a method to estimate the capacity value of storage. Our method uses a dynamic program to model the effect of power system outages on the operation and state of charge of storage in subsequent periods. We combine the optimized dispatch from the dynamic program with estimated system loss of load probabilities to compute a probability distribution for the state of charge of storage in each period. This probability distribution can be used as a forced outage rate for storage in standard reliability-based capacity value estimation methods. Our proposed method has the advantage over existing approximations that it explicitly captures the effect of system shortage events on the state of charge of storage in subsequent periods. We also use a numerical case study, based on five utility systems in the U.S., to demonstrate our technique and compare it to existing approximation methods.
Ibáñez, R.; Félez-Sánchez, M.; Godínez, J. M.; Guardià, C.; Caballero, E.; Juve, R.; Combalia, N.; Bellosillo, B.; Cuevas, D.; Moreno-Crespi, J.; Pons, L.; Autonell, J.; Gutierrez, C.; Ordi, J.; de Sanjosé, S.
2014-01-01
In Catalonia, a screening protocol for cervical cancer, including human papillomavirus (HPV) DNA testing using the Digene Hybrid Capture 2 (HC2) assay, was implemented in 2006. In order to monitor interlaboratory reproducibility, a proficiency testing (PT) survey of the HPV samples was launched in 2008. The aim of this study was to explore the repeatability of the HC2 assay's performance. Participating laboratories provided 20 samples annually, 5 randomly chosen samples from each of the following relative light unit (RLU) intervals: <0.5, 0.5 to 0.99, 1 to 9.99, and ≥10. Kappa statistics were used to determine the agreement levels between the original and the PT readings. The nature and origin of the discrepant results were calculated by bootstrapping. A total of 946 specimens were retested. The kappa values were 0.91 for positive/negative categorical classification and 0.79 for the four RLU intervals studied. Sample retesting yielded systematically lower RLU values than the original test (P < 0.005), independently of the time elapsed between the two determinations (median, 53 days), possibly due to freeze-thaw cycles. The probability for a sample to show clinically discrepant results upon retesting was a function of the RLU value; samples with RLU values in the 0.5 to 5 interval showed 10.80% probability to yield discrepant results (95% confidence interval [CI], 7.86 to 14.33) compared to 0.85% probability for samples outside this interval (95% CI, 0.17 to 1.69). Globally, the HC2 assay shows high interlaboratory concordance. We have identified differential confidence thresholds and suggested the guidelines for interlaboratory PT in the future, as analytical quality assessment of HPV DNA detection remains a central component of the screening program for cervical cancer prevention. PMID:24574284
Learning Probabilities in Computer Engineering by Using a Competency- and Problem-Based Approach
ERIC Educational Resources Information Center
Khoumsi, Ahmed; Hadjou, Brahim
2005-01-01
Our department has redesigned its electrical and computer engineering programs by adopting a learning methodology based on competence development, problem solving, and the realization of design projects. In this article, we show how this pedagogical approach has been successfully used for learning probabilities and their application to computer…
Coggins, L.G.; Pine, William E.; Walters, C.J.; Martell, S.J.D.
2006-01-01
We present a new model to estimate capture probabilities, survival, abundance, and recruitment using traditional Jolly-Seber capture-recapture methods within a standard fisheries virtual population analysis framework. This approach compares the numbers of marked and unmarked fish at age captured in each year of sampling with predictions based on estimated vulnerabilities and abundance in a likelihood function. Recruitment to the earliest age at which fish can be tagged is estimated by using a virtual population analysis method to back-calculate the expected numbers of unmarked fish at risk of capture. By using information from both marked and unmarked animals in a standard fisheries age structure framework, this approach is well suited to the sparse data situations common in long-term capture-recapture programs with variable sampling effort. Copyright by the American Fisheries Society 2006.
Schwarcz, Sandra; Spindler, Hilary; Scheer, Susan; Valleroy, Linda; Lansky, Amy
2007-07-01
Convenience samples are used to determine HIV-related behaviors among men who have sex with men (MSM) without measuring the extent to which the results are representative of the broader MSM population. We compared results from a cross-sectional survey of MSM recruited from gay bars between June and October 2001 to a random digit dial telephone survey conducted between June 2002 and January 2003. The men in the probability sample were older, better educated, and had higher incomes than men in the convenience sample; the convenience sample enrolled more employed men and men of color. Substance use around the time of sex was higher in the convenience sample, but other sexual behaviors were similar. HIV testing was common among men in both samples. Periodic validation, through comparison of data collected by different sampling methods, may be useful when relying on survey data for program and policy development.
A predictive model of hospitalization risk among disabled medicaid enrollees.
McAna, John F; Crawford, Albert G; Novinger, Benjamin W; Sidorov, Jaan; Din, Franklin M; Maio, Vittorio; Louis, Daniel Z; Goldfarb, Neil I
2013-05-01
To identify Medicaid patients, based on 1 year of administrative data, who were at high risk of admission to a hospital in the next year, and who were most likely to benefit from outreach and targeted interventions. Observational cohort study for predictive modeling. Claims, enrollment, and eligibility data for 2007 from a state Medicaid program were used to provide the independent variables for a logistic regression model to predict inpatient stays in 2008 for fully covered, continuously enrolled, disabled members. The model was developed using a 50% random sample from the state and was validated against the other 50%. Further validation was carried out by applying the parameters from the model to data from a second state's disabled Medicaid population. The strongest predictors in the model developed from the first 50% sample were over age 65 years, inpatient stay(s) in 2007, and higher Charlson Comorbidity Index scores. The areas under the receiver operating characteristic curve for the model based on the 50% state sample and its application to the 2 other samples ranged from 0.79 to 0.81. Models developed independently for all 3 samples were as high as 0.86. The results show a consistent trend of more accurate prediction of hospitalization with increasing risk score. This is a fairly robust method for targeting Medicaid members with a high probability of future avoidable hospitalizations for possible case management or other interventions. Comparison with a second state's Medicaid program provides additional evidence for the usefulness of the model.
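As a rough illustration of the workflow described above (logistic regression on prior-year administrative predictors, a 50/50 development/validation split, and AUC-based validation), here is a sketch on simulated data; the predictor names and coefficients are invented stand-ins for the study's variables, not the Medicaid data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n = 5000

# Synthetic stand-ins for the administrative predictors named in the abstract.
age_over_65 = rng.binomial(1, 0.2, n)
prior_inpatient_stay = rng.binomial(1, 0.3, n)
charlson_index = rng.poisson(1.0, n)

# Simulated hospitalization outcome driven by the three predictors.
logit = -2.5 + 1.0 * age_over_65 + 1.2 * prior_inpatient_stay + 0.5 * charlson_index
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
X = np.column_stack([age_over_65, prior_inpatient_stay, charlson_index])

# 50% development sample / 50% validation sample, as in the study design.
X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.5, random_state=0)
model = LogisticRegression().fit(X_dev, y_dev)
auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
print(f"validation AUC = {auc:.2f}")
```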
Rand E. Eads; Mark R. Boolootian; Steven C. Hankin [Inventors]
1987-01-01
A programmable calculator is connected to a pumping sampler by an interface circuit board. The calculator has a sediment sampling program stored therein and includes a timer to periodically wake up the calculator. Sediment collection is controlled by a Selection At List Time (SALT) scheme in which the probability of taking a sample is proportional to its...
DOE Office of Scientific and Technical Information (OSTI.GOV)
La Russa, D
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10^7 cm^-3. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
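Below is a hedged numeric sketch of the population-averaged Poisson TCP calculation with 24-point Gauss-Legendre quadrature over a normally distributed α, as mentioned in the abstract. The linear-quadratic survival form, tumour volume, and parameter values are illustrative assumptions, and the proliferation terms (Td, Tk) are omitted.

```python
import numpy as np

def tcp_single(alpha, dose_eqd2, clonogen_density=1e7, volume_cm3=30.0, alpha_beta=10.0):
    """Poisson TCP for a single radiosensitivity alpha (LQ model, no proliferation).
    The clonogen density matches the abstract; the tumour volume is an assumption."""
    n_clonogens = clonogen_density * volume_cm3
    d = 2.0  # EQD2 is delivered in 2 Gy fractions
    surviving_fraction = np.exp(-alpha * dose_eqd2 * (1 + d / alpha_beta))
    return np.exp(-n_clonogens * surviving_fraction)

def tcp_population(dose_eqd2, alpha_mean=0.3, alpha_sd=0.07, n_points=24):
    """Average TCP over a normal alpha distribution using 24-point Gauss-Legendre quadrature."""
    lo, hi = alpha_mean - 4 * alpha_sd, alpha_mean + 4 * alpha_sd
    x, w = np.polynomial.legendre.leggauss(n_points)
    alpha = 0.5 * (hi - lo) * x + 0.5 * (hi + lo)          # map [-1, 1] -> [lo, hi]
    pdf = np.exp(-0.5 * ((alpha - alpha_mean) / alpha_sd) ** 2) / (alpha_sd * np.sqrt(2 * np.pi))
    return 0.5 * (hi - lo) * np.sum(w * tcp_single(alpha, dose_eqd2) * pdf)

for dose in (40, 60, 80, 100):
    print(dose, round(tcp_population(dose), 3))
```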
Sample Size Determination for Rasch Model Tests
ERIC Educational Resources Information Center
Draxler, Clemens
2010-01-01
This paper is concerned with supplementing statistical tests for the Rasch model so that additionally to the probability of the error of the first kind (Type I probability) the probability of the error of the second kind (Type II probability) can be controlled at a predetermined level by basing the test on the appropriate number of observations.…
ERIC Educational Resources Information Center
Hamilton, Jacqueline
2009-01-01
An experimental study was conducted to investigate the effects of an Employee Wellness Program on physiological risk factors, job satisfaction, and monetary savings in a South Texas University. The non-probability sample consisted of 31 employees from lower income level positions. The employees were randomly assigned to the treatment group which…
Deterministic multidimensional nonuniform gap sampling.
Worley, Bradley; Powers, Robert
2015-12-01
Born from empirical observations in nonuniformly sampled multidimensional NMR data relating to gaps between sampled points, the Poisson-gap sampling method has enjoyed widespread use in biomolecular NMR. While the majority of nonuniform sampling schemes are fully randomly drawn from probability densities that vary over a Nyquist grid, the Poisson-gap scheme employs constrained random deviates to minimize the gaps between sampled grid points. We describe a deterministic gap sampling method, based on the average behavior of Poisson-gap sampling, which performs comparably to its random counterpart with the additional benefit of completely deterministic behavior. We also introduce a general algorithm for multidimensional nonuniform sampling based on a gap equation, and apply it to yield a deterministic sampling scheme that combines burst-mode sampling features with those of Poisson-gap schemes. Finally, we derive a relationship between stochastic gap equations and the expectation value of their sampling probability densities. Copyright © 2015 Elsevier Inc. All rights reserved.
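To make the gap-sampling idea concrete, a simplified random Poisson-gap sampler is sketched below: gap sizes are drawn from a Poisson distribution whose mean follows a sine-weighted profile so that early grid points are sampled densely, and the scale is adjusted until the schedule hits the requested sample count. The profile and adjustment rule here are illustrative; the paper's contribution is a deterministic scheme built on the average behavior of such a sampler.

```python
import numpy as np

def poisson_gap_schedule(grid_size=256, n_samples=64, seed=1, max_tries=2000):
    """Simplified 1D Poisson-gap sampler: gap sizes follow a Poisson distribution
    with a sinusoidally increasing mean, so early grid points are sampled densely.
    The scale is adjusted until a draw with exactly n_samples points is found."""
    rng = np.random.default_rng(seed)
    scale = grid_size / n_samples
    best = None
    for _ in range(max_tries):
        points, i = [], 0
        while i < grid_size:
            points.append(i)
            lam = scale * np.sin((i / grid_size) * np.pi / 2)  # mean gap grows along the grid
            i += 1 + rng.poisson(lam)
        if len(points) == n_samples:
            return np.array(points)
        if best is None or abs(len(points) - n_samples) < abs(len(best) - n_samples):
            best = points
        scale *= len(points) / n_samples  # too many points -> larger gaps, and vice versa
    return np.array(best)

schedule = poisson_gap_schedule()
print(len(schedule), schedule[:10])
```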
Schloesser, J.T.; Paukert, Craig P.; Doyle, W.J.; Hill, Tracy D.; Steffensen, K.D.; Travnichek, Vincent H.
2012-01-01
Occupancy modeling was used to determine (1) if detection probabilities (p) for 7 regionally imperiled Missouri River fishes (Scaphirhynchus albus, Scaphirhynchus platorynchus, Cycleptus elongatus, Sander canadensis, Macrhybopsis aestivalis, Macrhybopsis gelida, and Macrhybopsis meeki) differed among gear types (i.e. stationary gill nets, drifted trammel nets, and otter trawls), and (2) how detection probabilities were affected by habitat (i.e. pool, bar, and open water), longitudinal position (five 189 to 367 rkm long segments), sampling year (2003 to 2006), and season (July 1 to October 30 and October 31 to June 30). Adult, large-bodied fishes were best detected with gill nets (p: 0.02–0.74), but most juvenile large-bodied and all small-bodied species were best detected with otter trawls (p: 0.02–0.58). Trammel nets may be a redundant sampling gear for imperiled fishes in the lower Missouri River because most species had greater detection probabilities with gill nets or otter trawls. Detection probabilities varied with river segment for S. platorynchus, C. elongatus, and all small-bodied fishes, suggesting that changes in habitat influenced gear efficiency or abundance changes among river segments. Detection probabilities varied by habitat for adult S. albus and S. canadensis, year for juvenile S. albus, C. elongatus, and S. canadensis, and season for adult S. albus. Concentrating sampling effort on gears with the greatest detection probabilities may increase species detections to better monitor a population's response to environmental change and the effects of management actions on large-river fishes.
Electron microprobe analysis program for biological specimens: BIOMAP
NASA Technical Reports Server (NTRS)
Edwards, B. F.
1972-01-01
BIOMAP is a Univac 1108 compatible program which facilitates the electron probe microanalysis of biological specimens. Input data are X-ray intensity data from biological samples, the X-ray intensity and composition data from a standard sample and the electron probe operating parameters. Outputs are estimates of the weight percentages of the analyzed elements, the distribution of these estimates for sets of red blood cells and the probabilities for correlation between elemental concentrations. An optional feature statistically estimates the X-ray intensity and residual background of a principal standard relative to a series of standards.
ALIENS IN WESTERN STREAM ECOSYSTEMS
The USEPA's Environmental Monitoring and Assessment Program conducted a five year probability sample of permanent mapped streams in 12 western US states. The study design enables us to determine the extent of selected riparian invasive plants, alien aquatic vertebrates, and some ...
SURE reliability analysis: Program and mathematics
NASA Technical Reports Server (NTRS)
Butler, Ricky W.; White, Allan L.
1988-01-01
The SURE program is a new reliability analysis tool for ultrareliable computer system architectures. The computational methods on which the program is based provide an efficient means for computing accurate upper and lower bounds for the death state probabilities of a large class of semi-Markov models. Once a semi-Markov model is described using a simple input language, the SURE program automatically computes the upper and lower bounds on the probability of system failure. A parameter of the model can be specified as a variable over a range of values directing the SURE program to perform a sensitivity analysis automatically. This feature, along with the speed of the program, makes it especially useful as a design tool.
The sampling design for the National Children's Study (NCS) calls for a population-based, multi-stage, clustered household sampling approach (visit our website for more information on the NCS: www.nationalchildrensstudy.gov). The full sample is designed to be representative of ...
Xiao, Chuan-Le; Chen, Xiao-Zhou; Du, Yang-Li; Sun, Xuesong; Zhang, Gong; He, Qing-Yu
2013-01-04
Mass spectrometry has become one of the most important technologies in proteomic analysis. Tandem mass spectrometry (LC-MS/MS) is a major tool for the analysis of peptide mixtures from protein samples. The key step of MS data processing is the identification of peptides from experimental spectra by searching public sequence databases. Although a number of algorithms to identify peptides from MS/MS data have already been proposed, e.g. Sequest, OMSSA, X!Tandem, Mascot, etc., they are mainly based on statistical models considering only peak-matches between experimental and theoretical spectra, but not peak intensity information. Moreover, different algorithms gave different results from the same MS data, implying their probable incompleteness and questionable reproducibility. We developed a novel peptide identification algorithm, ProVerB, based on a binomial probability distribution model of protein tandem mass spectrometry combined with a new scoring function, making full use of peak intensity information and, thus, enhancing the ability of identification. Compared with Mascot, Sequest, and SQID, ProVerB identified significantly more peptides from LC-MS/MS data sets than the current algorithms at 1% False Discovery Rate (FDR) and provided more confident peptide identifications. ProVerB is also compatible with various platforms and experimental data sets, showing its robustness and versatility. The open-source program ProVerB is available at http://bioinformatics.jnu.edu.cn/software/proverb/.
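A minimal sketch of the underlying binomial idea (not ProVerB's actual scoring function): score a peptide-spectrum match by the chance probability of matching at least k of n theoretical fragment peaks. The random-match probability is a placeholder and intensity weighting is omitted.

```python
import math
from scipy.stats import binom

def binomial_match_score(n_theoretical, n_matched, p_random=0.04):
    """-log10 probability of matching at least n_matched of n_theoretical fragment
    peaks purely by chance (binomial survival function); higher = more confident."""
    p_value = binom.sf(n_matched - 1, n_theoretical, p_random)
    return -math.log10(p_value)

# Hypothetical peptide-spectrum match: 24 theoretical fragments, 9 matched peaks.
print(round(binomial_match_score(24, 9), 2))
```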
An automated approach to the design of decision tree classifiers
NASA Technical Reports Server (NTRS)
Argentiero, P.; Chin, P.; Beaudet, P.
1980-01-01
The classification of large dimensional data sets arising from the merging of remote sensing data with more traditional forms of ancillary data is considered. Decision tree classification, a popular approach to the problem, is characterized by the property that samples are subjected to a sequence of decision rules before they are assigned to a unique class. An automated technique for effective decision tree design which relies only on a priori statistics is presented. This procedure utilizes a set of two-dimensional canonical transforms and Bayes table look-up decision rules. An optimal design at each node is derived based on the associated decision table. A procedure for computing the global probability of correct classification is also provided. An example is given in which class statistics obtained from an actual LANDSAT scene are used as input to the program. The resulting decision tree design has an associated probability of correct classification of .76 compared to the theoretically optimum .79 probability of correct classification associated with a full dimensional Bayes classifier. Recommendations for future research are included.
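As a simplified illustration of computing the global probability of correct classification for a Bayes rule from class statistics, the sketch below uses a one-dimensional, two-class Gaussian case; the paper works with canonically transformed two-dimensional tables and LANDSAT class statistics, which are not reproduced here.

```python
import numpy as np
from scipy.stats import norm

# Illustrative class statistics (priors, means, standard deviations) -- placeholders,
# not the LANDSAT class statistics used in the paper.
priors = np.array([0.5, 0.5])
means = np.array([0.0, 2.0])
sds = np.array([1.0, 1.0])

# Evaluate class-conditional densities on a fine grid and apply the Bayes rule.
xs = np.linspace(-6.0, 8.0, 20001)
dx = xs[1] - xs[0]
dens = norm.pdf(xs[None, :], means[:, None], sds[:, None])   # shape (2, n_grid)
decisions = np.argmax(priors[:, None] * dens, axis=0)

# Global probability of correct classification: sum_c prior_c * P(decide c | class c).
p_correct = sum(priors[c] * dens[c, decisions == c].sum() * dx for c in range(len(priors)))
print(f"P(correct classification) = {p_correct:.3f}")   # ~0.84 for these statistics
```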
Hartwell, T D; Steele, P; French, M T; Potter, F J; Rodman, N F; Zarkin, G A
1996-06-01
Employee assistance programs (EAPs) are job-based programs designed to identify and assist troubled employees. This study determines the prevalence, cost, and characteristics of these programs in the United States by worksite size, industry, and census region. A stratified national probability sample of more than 6400 private, nonagricultural US worksites with 50 or more full-time employees was contacted with a computer-assisted telephone interviewing protocol. More than 3200 worksites responded and were eligible, with a response rate of 90%. Approximately 33% of all private, nonagricultural worksites with 50 or more full-time employees currently offer EAP services to their employees, an 8.9% increase over 1985. These programs are more likely to be found in larger worksites and in the communications/utilities/transportation industries. The most popular model is an external provider, and the median annual cost per eligible employee for internal and external programs was $21.83 and $18.09, respectively. EAPs are becoming a more prevalent point of access to health care for workers with personal problems such as substance abuse, family problems, or emotional distress.
Sampling guidelines for oral fluid-based surveys of group-housed animals.
Rotolo, Marisa L; Sun, Yaxuan; Wang, Chong; Giménez-Lirola, Luis; Baum, David H; Gauger, Phillip C; Harmon, Karen M; Hoogland, Marlin; Main, Rodger; Zimmerman, Jeffrey J
2017-09-01
Formulas and software for calculating sample size for surveys based on individual animal samples are readily available. However, sample size formulas are not available for oral fluids and other aggregate samples that are increasingly used in production settings. Therefore, the objective of this study was to develop sampling guidelines for oral fluid-based porcine reproductive and respiratory syndrome virus (PRRSV) surveys in commercial swine farms. Oral fluid samples were collected in 9 weekly samplings from all pens in 3 barns on one production site beginning shortly after placement of weaned pigs. Samples (n=972) were tested by real-time reverse-transcription PCR (RT-rtPCR) and the binary results analyzed using a piecewise exponential survival model for interval-censored, time-to-event data with misclassification. Thereafter, simulation studies were used to study the barn-level probability of PRRSV detection as a function of sample size, sample allocation (simple random sampling vs fixed spatial sampling), assay diagnostic sensitivity and specificity, and pen-level prevalence. These studies provided estimates of the probability of detection by sample size and within-barn prevalence. Detection using fixed spatial sampling was as good as, or better than, simple random sampling. Sampling multiple barns on a site increased the probability of detection with the number of barns sampled. These results are relevant to PRRSV control or elimination projects at the herd, regional, or national levels, but the results are also broadly applicable to contagious pathogens of swine for which oral fluid tests of equivalent performance are available. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
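A hedged sketch of the kind of simulation described: for an assumed pen-level prevalence, draw which pens are positive, sample n pens by simple random sampling, apply assay sensitivity and specificity, and estimate the barn-level probability of at least one positive test. Pen counts and parameter values are illustrative, not the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(2024)

def barn_detection_probability(n_pens=48, n_sampled=6, prevalence=0.10,
                               sensitivity=0.95, specificity=0.99, n_sim=20_000):
    """Estimate the probability that at least one sampled oral fluid tests positive."""
    detected = 0
    for _ in range(n_sim):
        positive_pens = rng.random(n_pens) < prevalence
        sampled = rng.choice(n_pens, size=n_sampled, replace=False)  # simple random sampling
        p_test_pos = np.where(positive_pens[sampled], sensitivity, 1 - specificity)
        if (rng.random(n_sampled) < p_test_pos).any():
            detected += 1
    return detected / n_sim

for n in (2, 4, 6, 10):
    print(n, round(barn_detection_probability(n_sampled=n), 3))
```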
Coleman, Laci S.; Ford, W. Mark; Dobony, Christopher A.; Britzke, Eric R.
2014-01-01
Concomitant with the emergence and spread of white-nose syndrome (WNS) and precipitous decline of many bat species in North America, natural resource managers need modified and/or new techniques for bat inventory and monitoring that provide robust occupancy estimates. We used Anabat acoustic detectors to determine the most efficient passive acoustic sampling design for optimizing detection probabilities of multiple bat species in a WNS-impacted environment in New York, USA. Our sampling protocol included: six acoustic stations deployed for the entire duration of monitoring as well as a 4 x 4 grid and five transects of 5-10 acoustic units that were deployed for 6-8 night sample durations surveyed during the summers of 2011-2012. We used Program PRESENCE to determine detection probability and site occupancy estimates. Overall, the grid produced the highest detection probabilities for most species because it contained the most detectors and intercepted the greatest spatial area. However, big brown bats (Eptesicus fuscus) and species not impacted by WNS were detected easily regardless of sampling array. Endangered Indiana (Myotis sodalis) and little brown (Myotis lucifugus) and tri-colored bats (Perimyotis subflavus) showed declines in detection probabilities over our study, potentially indicative of continued WNS-associated declines. Identification of species presence through efficient methodologies is vital for future conservation efforts as bat populations decline further due to WNS and other factors.
ERIC Educational Resources Information Center
Bowen, Michelle; Laurion, Suzanne
A study documented, using a telephone survey, the incidence rates of sexual harassment of mass communication interns, and compared those rates to student and professional rates. A probability sample of 44 male and 52 female mass communications professionals was generated using several random sampling techniques from among professionals who work in…
Program adherence and effectiveness of a commercial nutrition program: the metabolic balance study.
Meffert, Cornelia; Gerdes, Nikolaus
2010-01-01
Objective. To assess the effectiveness of a commercial nutrition program in improving weight, blood lipids, and health-related quality of life (HRQOL). Methods. Prospective observational study with follow-up after 1, 3, 6, and 12 months with data from questionnaires and blood samples. Subjects. After 12 months, we had data from 524 subjects (60.6% of the initial sample). 84.1% of the subjects were women. The average BMI at baseline was 30.3 (SD = 5.7). Results. After 12 months, the average weight loss was 6.8 kg (SD = 7.1 kg). Program adherence declined over time but was still high after 12 months and showed a positive linear correlation with weight loss. Relevant blood parameters as well as HRQOL improved significantly. Conclusion. After 12 months, nearly two thirds of the sample had achieved a >5% reduction of their initial weight. The high degree of program adherence is probably due to personal counseling and individually designed nutrition plans provided by the program.
Norms governing urban African American adolescents’ sexual and substance-using behavior
Dolcini, M. Margaret; Catania, Joseph A.; Harper, Gary W.; Watson, Susan E.; Ellen, Jonathan M.; Towner, Senna L.
2013-01-01
Using a probability-based neighborhood sample of urban African American youth and a sample of their close friends (N = 202), we conducted a one-year longitudinal study to examine key questions regarding sexual and drug using norms. The results provide validation of social norms governing sexual behavior, condom use, and substance use among friendship groups. These norms had strong to moderate homogeneity; and both normative strength and homogeneity were relatively stable over a one-year period independent of changes in group membership. The data further suggest that sex and substance using norms may operate as a normative set. Similar to studies of adults, we identified three distinct “norm-based” social strata in our sample. Together, our findings suggest that the norms investigated are valid targets for health promotion efforts, and such efforts may benefit from tailoring programs to the normative sets that make up the different social strata in a given adolescent community. PMID:23072891
NASA Astrophysics Data System (ADS)
Gurov, V. V.
2017-01-01
Software tools for educational purposes, such as e-lessons and computer-based testing systems, have a number of distinctive features from the point of view of reliability. Chief among them are the need to ensure a sufficiently high probability of faultless operation for a specified time and the impossibility of rapid recovery by replacing a tool with a similar running program during class. The article considers the peculiarities of evaluating software reliability in contrast to assessments of hardware reliability, and states the basic requirements for the reliability of software used to conduct practical and laboratory classes in the form of computer-based training programs. A mathematical tool based on Markov chains is presented that uses the graph of interactions among software modules to determine how thoroughly a training program has been debugged and whether it is ready for use in the educational process.
Exploration and Adoption of Evidence-based Practice by US Child Welfare Agencies.
Horwitz, Sarah McCue; Hurlburt, Michael S; Goldhaber-Fiebert, Jeremy D; Palinkas, Lawrence A; Rolls-Reutz, Jennifer; Zhang, Jinjin; Fisher, Emily; Landsverk, John
2014-04-01
To examine the extent to which child welfare agencies adopt new practices and to determine the barriers to and facilitators of adoption of new practices. Data came from telephone interviews with the directors of the 92 public child welfare agencies that constituted the probability sample for the first National Survey of Child and Adolescent Well-being (NSCAW I). In a semi-structured 40-minute interview administered by a trained Research Associate, agency directors were asked about agency demographics, knowledge of evidence-based practices, use of technical assistance, and actual use of evidence-based practices. Of the 92 agencies, 83 (90%) agreed to be interviewed. Agencies reported that the majority of staff had a BA degree (53.45%) and that they either paid for (52.6%) or provided (80.7%) continuing education. Although agencies routinely collect standardized child outcomes (90%), they much less frequently collect measures of child functioning (30.9%). Almost all agencies (94%) had started a new program or practice, but only 24.8% were evidence-based, and strategies used to explore new programs or practices usually involved local or state contracts. Factors that were associated with program success included internal support for the innovation (27.3%) and an existing evidence base (23.5%). Directors of child welfare agencies frequently institute new programs or practices, but these are not often evidence-based. Because virtually all agencies provide some continuing education, adding discussions of evidence-based programs/practices may spur adoption. Reliance on local and state colleagues to explore new programs and practices suggests that developing well-informed social networks may be a way to increase the spread of evidence-based practices.
Use of probability analysis to establish routine bioassay screening levels
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carbaugh, E.H.; Sula, M.J.; McFadden, K.M.
1990-09-01
Probability analysis was used by the Hanford Internal Dosimetry Program to establish bioassay screening levels for tritium and uranium in urine. Background environmental levels of these two radionuclides are generally detectable by the highly sensitive urine analysis procedures routinely used at Hanford. Establishing screening levels requires balancing the impact of false detection with the consequence of potentially undetectable occupational dose. To establish the screening levels, tritium and uranium analyses were performed on urine samples collected from workers exposed only to environmental sources. All samples were collected at home using a simulated 12-hour protocol for tritium and a simulated 24-hour collection protocol for uranium. Results of the analyses of these samples were ranked according to tritium concentration or total sample uranium. The cumulative percentile was calculated and plotted using log-probability coordinates. Geometric means and screening levels corresponding to various percentiles were estimated by graphical interpolation and standard calculations. The potential annual internal dose associated with a screening level was calculated. Screening levels were selected corresponding to the 99.9th percentile, implying that, on the average, 1 out of 1000 samples collected from an unexposed worker population would be expected to exceed the screening level. 4 refs., 2 figs.
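A minimal sketch of the percentile approach on synthetic data: fit a lognormal (equivalent to a straight line on log-probability coordinates) to unexposed-worker results and take the 99.9th percentile as the screening level. The units and distribution parameters are invented for illustration, not Hanford values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Synthetic tritium-in-urine results for workers exposed only to environmental sources.
results = rng.lognormal(mean=np.log(20.0), sigma=0.6, size=300)   # illustrative units

log_results = np.log(results)
geometric_mean = np.exp(log_results.mean())
geometric_sd = np.exp(log_results.std(ddof=1))

# 99.9th percentile of the fitted lognormal: on average 1 in 1000 unexposed-worker
# samples would be expected to exceed this screening level.
z_999 = stats.norm.ppf(0.999)
screening_level = np.exp(log_results.mean() + z_999 * log_results.std(ddof=1))

print(f"GM = {geometric_mean:.1f}, GSD = {geometric_sd:.2f}, "
      f"99.9th percentile screening level = {screening_level:.1f}")
```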
Lake Superior Phytoplankton Characterization from the 2006 Probability Based Survey
We conducted a late summer probability based survey of Lake Superior in 2006 which consisted of 52 sites stratified across 3 depth zones. As part of this effort, we collected composite phytoplankton samples from the epilimnion and the fluorescence maxima (Fmax) at 29 of the site...
Optimizing liquid effluent monitoring at a large nuclear complex.
Chou, Charissa J; Barnett, D Brent; Johnson, Vernon G; Olson, Phil M
2003-12-01
Effluent monitoring typically requires a large number of analytes and samples during the initial or startup phase of a facility. Once a baseline is established, the analyte list and sampling frequency may be reduced. Although there is a large body of literature relevant to the initial design, few, if any, published papers exist on updating established effluent monitoring programs. This paper statistically evaluates four years of baseline data to optimize the liquid effluent monitoring efficiency of a centralized waste treatment and disposal facility at a large defense nuclear complex. Specific objectives were to: (1) assess temporal variability in analyte concentrations, (2) determine operational factors contributing to waste stream variability, (3) assess the probability of exceeding permit limits, and (4) streamline the sampling and analysis regime. Results indicated that the probability of exceeding permit limits was one in a million under normal facility operating conditions, sampling frequency could be reduced, and several analytes could be eliminated. Furthermore, indicators such as gross alpha and gross beta measurements could be used in lieu of more expensive specific isotopic analyses (radium, cesium-137, and strontium-90) for routine monitoring. Study results were used by the state regulatory agency to modify monitoring requirements for a new discharge permit, resulting in an annual cost savings of US $223,000. This case study demonstrates that statistical evaluation of effluent contaminant variability coupled with process knowledge can help plant managers and regulators streamline analyte lists and sampling frequencies based on detection history and environmental risk.
Accounting for Incomplete Species Detection in Fish Community Monitoring
DOE Office of Scientific and Technical Information (OSTI.GOV)
McManamay, Ryan A; Orth, Dr. Donald J; Jager, Yetta
2013-01-01
Riverine fish assemblages are heterogeneous and very difficult to characterize with a one-size-fits-all approach to sampling. Furthermore, detecting changes in fish assemblages over time requires accounting for variation in sampling designs. We present a modeling approach that permits heterogeneous sampling by accounting for site and sampling covariates (including method) in a model-based framework for estimation (versus a sampling-based framework). We snorkeled during three surveys and electrofished during a single survey in a suite of delineated habitats stratified by reach types. We developed single-species occupancy models to determine covariates influencing patch occupancy and species detection probabilities, whereas community occupancy models estimated species richness in light of incomplete detections. For most species, information-theoretic criteria showed higher support for models that included patch size and reach as covariates of occupancy. In addition, models including patch size and sampling method as covariates of detection probabilities also had higher support. Detection probability estimates for snorkeling surveys were higher for larger non-benthic species whereas electrofishing was more effective at detecting smaller benthic species. The number of sites and sampling occasions required to accurately estimate occupancy varied among fish species. For rare benthic species, our results suggested that a higher number of occasions, and especially the addition of electrofishing, may be required to improve detection probabilities and obtain accurate occupancy estimates. Community models suggested that richness was 41% higher than the number of species actually observed and the addition of an electrofishing survey increased estimated richness by 13%. These results can be useful to future fish assemblage monitoring efforts by informing sampling designs, such as site selection (e.g. stratifying based on patch size) and determining effort required (e.g. number of sites versus occasions).
Spreadsheet-Based Program for Simulating Atomic Emission Spectra
ERIC Educational Resources Information Center
Flannigan, David J.
2014-01-01
A simple Excel spreadsheet-based program for simulating atomic emission spectra from the properties of neutral atoms (e.g., energies and statistical weights of the electronic states, electronic partition functions, transition probabilities, etc.) is described. The contents of the spreadsheet (i.e., input parameters, formulas for calculating…
ERIC Educational Resources Information Center
Longbotham, Pamela J.
2012-01-01
The study examined the impact of participation in an optional flexible year program (OFYP) on academic achievement. The ex post facto study employed an explanatory sequential mixed methods design. The non-probability sample consisted of 163 fifth grade students in an OFYP district and 137 5th graders in a 180-day instructional year school…
Evidence-based psychosocial treatments for child and adolescent depression.
David-Ferdon, Corinne; Kaslow, Nadine J
2008-01-01
The evidence-base of psychosocial treatment outcome studies for depressed youth conducted since 1998 is examined. All studies for depressed children meet Nathan and Gorman's (2002) criteria for Type 2 studies whereas the adolescent protocols meet criteria for both Type 1 and Type 2 studies. Based on the Task Force on the Promotion and Dissemination of Psychological Procedures guidelines, the cognitive-behavioral therapy (CBT) based specific programs of Penn Prevention Program, Self-Control Therapy, and Coping with Depression-Adolescent are probably efficacious. Interpersonal Therapy-Adolescent, which falls under the theoretical category of interpersonal therapy (IPT), also is a probably efficacious treatment. CBT provided through the modalities of child group only and child group plus parent components are well-established intervention approaches for depressed children. For adolescents, two modalities are well-established (CBT adolescent only group, IPT individual), and three are probably efficacious (CBT adolescent group plus parent component, CBT individual, CBT individual plus parent/family component). From the broad theoretical level, CBT has well-established efficacy and behavior therapy meets criteria for a probably efficacious intervention for childhood depression. For adolescent depression, both CBT and IPT have well-established efficacy. Future research directions and best practices are offered.
Woodworth, M.T.; Connor, B.F.
2001-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-165 (trace constituents), M-158 (major constituents), N-69 (nutrient constituents), N-70 (nutrient constituents), P-36 (low ionic-strength constituents), and Hg-32 (mercury) -- that were distributed in April 2001 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 73 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
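The "most probable value" in these interlaboratory reports is determined using nonparametric statistics. As a generic illustration (not the USGS procedure itself), the sketch below takes the median across laboratories together with a distribution-free confidence interval based on order statistics; the reported concentrations are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical reported concentrations for one analyte from participating laboratories.
lab_results = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.0, 10.4, 11.0, 9.7, 10.3])

most_probable_value = np.median(lab_results)

# Conservative rank-based (distribution-free) confidence interval for the median:
# choose the order statistics whose binomial tail probabilities stay within 2.5%.
n = len(lab_results)
k = int(stats.binom.ppf(0.025, n, 0.5))     # k = 2 for n = 10
ordered = np.sort(lab_results)
ci = (ordered[k - 1], ordered[n - k])       # (2nd smallest, 9th smallest) here

print(f"most probable value = {most_probable_value}, conservative 95% CI = {ci}")
```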
Woodworth, M.T.; Conner, B.F.
2002-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-169 (trace constituents), M-162 (major constituents), N-73 (nutrient constituents), N-74 (nutrient constituents), P-38 (low ionic-strength constituents), and Hg-34 (mercury) -- that were distributed in March 2002 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 93 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Woodworth, Mark T.; Connor, Brooke F.
2003-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-171 (trace constituents), M-164 (major constituents), N-75 (nutrient constituents), N-76 (nutrient constituents), P-39 (low ionic-strength constituents), and Hg-35 (mercury) -- that were distributed in September 2002 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 102 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Woodworth, Mark T.; Connor, Brooke F.
2002-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-167 (trace constituents), M-160 (major constituents), N-71 (nutrient constituents), N-72 (nutrient constituents), P-37 (low ionic-strength constituents), and Hg-33 (mercury) -- that were distributed in September 2001 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 98 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Farrar, Jerry W.; Copen, Ashley M.
2000-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-161 (trace constituents), M-154 (major constituents), N-65 (nutrient constituents), N-66 nutrient constituents), P-34 (low ionic strength constituents), and Hg-30 (mercury) -- that were distributed in March 2000 to 144 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 132 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Farrar, T.W.
2000-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-159 (trace constituents), M-152 (major constituents), N-63 (nutrient constituents), N-64 (nutrient constituents), P-33 (low ionic strength constituents), and Hg-29 (mercury) -- that were distributed in October 1999 to 149 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 131 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Woodworth, Mark T.; Connor, Brooke F.
2003-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-173 (trace constituents), M-166 (major constituents), N-77 (nutrient constituents), N-78 (nutrient constituents), P-40 (low ionic-strength constituents), and Hg-36 (mercury) -- that were distributed in March 2003 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 110 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Connor, B.F.; Currier, J.P.; Woodworth, M.T.
2001-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-163 (trace constituents), M-156 (major constituents), N-67 (nutrient constituents), N-68 (nutrient constituents), P-35 (low ionic strength constituents), and Hg-31 (mercury) -- that were distributed in October 2000 to 126 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 122 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
On incomplete sampling under birth-death models and connections to the sampling-based coalescent.
Stadler, Tanja
2009-11-07
The constant rate birth-death process is used as a stochastic model for many biological systems, for example phylogenies or disease transmission. As the biological data are usually not fully available, it is crucial to understand the effect of incomplete sampling. In this paper, we analyze the constant rate birth-death process with incomplete sampling. We derive the density of the bifurcation events for trees on n leaves which evolved under this birth-death-sampling process. This density is used for calculating prior distributions in Bayesian inference programs and for efficiently simulating trees. We show that the birth-death-sampling process can be interpreted as a birth-death process with reduced rates and complete sampling. This shows that joint inference of birth rate, death rate and sampling probability is not possible. The birth-death-sampling process is compared to the sampling-based population genetics model, the coalescent. It is shown that despite many similarities between these two models, the distribution of bifurcation times remains different even in the case of very large population sizes. We illustrate these findings on a hepatitis C virus dataset from Egypt. We show that the transmission time estimates are significantly different; the widely used Gamma statistic even changes its sign from negative to positive when switching from the coalescent to the birth-death process.
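The non-identifiability claim above can be made concrete with the rate transformation usually associated with this result: a birth-death process with rates (birth, death) sampled with probability rho is indistinguishable from a completely sampled process with reduced rates rho*birth and death - birth*(1 - rho). The exact form of that transformation is taken here as an assumption for illustration only; a minimal sketch in Python:

```python
# Sketch of the rate transformation behind the non-identifiability result.
# The exact form (lam_eff = rho*lam, mu_eff = mu - lam*(1 - rho)) is an
# assumption taken from the birth-death-sampling literature, not from this abstract.

def effective_rates(lam, mu, rho):
    """Map a (birth, death, sampling) triple to an equivalent complete-sampling pair."""
    lam_eff = rho * lam
    mu_eff = mu - lam * (1.0 - rho)
    return lam_eff, mu_eff

# Two different (birth, death, sampling) settings map to the same reduced rates,
# so the three parameters cannot be inferred jointly from the reconstructed tree.
print(effective_rates(2.0, 1.0, 0.5))   # -> (1.0, 0.0)
print(effective_rates(1.0, 0.0, 1.0))   # -> (1.0, 0.0): indistinguishable from the first
```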
Janssen, Stefan; Schudoma, Christian; Steger, Gerhard; Giegerich, Robert
2011-11-03
Many bioinformatics tools for RNA secondary structure analysis are based on a thermodynamic model of RNA folding. They predict a single, "optimal" structure by free energy minimization, enumerate near-optimal structures, compute base pair probabilities and dot plots, report representative structures of different abstract shapes, or compute Boltzmann probabilities of structures and shapes. Although all programs refer to the same physical model, they implement it with considerable variation for different tasks, and little is known about the effects of heuristic assumptions and model simplifications used by the programs on the outcome of the analysis. We extract four different models of the thermodynamic folding space which underlie the programs RNAFOLD, RNASHAPES, and RNASUBOPT. Their differences lie within the details of the energy model and the granularity of the folding space. We implement probabilistic shape analysis for all models, and introduce the shape probability shift as a robust measure of model similarity. Using four data sets derived from experimentally solved structures, we provide a quantitative evaluation of the model differences. We find that search space granularity affects the computed shape probabilities less than the over- or underapproximation of free energy by a simplified energy model. Still, the approximations perform similarly enough to implementations of the full model to justify their continued use in settings where computational constraints call for simpler algorithms. As a side observation, we note that the rarely used level 2 shapes, which predict the complete arrangement of helices, multiloops, internal loops and bulges, include the "true" shape in a rather small number of predicted high probability shapes. This calls for an investigation of new strategies to extract high probability members from the (very large) level 2 shape space of an RNA sequence. We provide implementations of all four models, written in a declarative style that makes them easy to modify. Based on our study, future work on thermodynamic RNA folding can make an informed choice of model using our empirical data, and can take our implementations as a starting point for further program development.
[Computer diagnosis of traumatic impact by hepatic lesion].
Kimbar, V I; Sevankeev, V V
2007-01-01
A method of computer-assisted diagnosis of traumatic impact based on liver damage (the HEPAR-test program) is described. The program calculates diagnostic coefficients using Bayes' probability method with Wald's recognition procedure.
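The abstract gives no implementation detail, but diagnostic-coefficient schemes of this kind are often realized as a Wald-style sequential test in which each observed sign contributes 10*log10 of a likelihood ratio and the running sum is compared with decision thresholds. The sketch below is a hypothetical illustration of that general idea; all probabilities, thresholds, and the decision rule are assumptions, not the HEPAR-test algorithm.

```python
import math

# Hypothetical sketch of a sequential Bayes/Wald scheme: each sign contributes a
# diagnostic coefficient DC = 10*log10(P(sign|A)/P(sign|B)), and the coefficients
# are summed until a Wald decision threshold is crossed. Numbers are illustrative.

def diagnostic_coefficient(p_sign_given_A, p_sign_given_B):
    return 10.0 * math.log10(p_sign_given_A / p_sign_given_B)

def wald_decision(signs, alpha=0.1, beta=0.1):
    upper = 10.0 * math.log10((1 - beta) / alpha)    # accept hypothesis A
    lower = 10.0 * math.log10(beta / (1 - alpha))    # accept hypothesis B
    total = 0.0
    for p_a, p_b in signs:
        total += diagnostic_coefficient(p_a, p_b)
        if total >= upper:
            return "A", total
        if total <= lower:
            return "B", total
    return "undecided", total

# Example: three observed signs with assumed conditional probabilities.
print(wald_decision([(0.8, 0.3), (0.6, 0.4), (0.7, 0.2)]))
```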
The SURE reliability analysis program
NASA Technical Reports Server (NTRS)
Butler, R. W.
1986-01-01
The SURE program is a new reliability tool for ultrareliable computer system architectures. The program is based on computational methods recently developed for the NASA Langley Research Center. These methods provide an efficient means for computing accurate upper and lower bounds for the death state probabilities of a large class of semi-Markov models. Once a semi-Markov model is described using a simple input language, the SURE program automatically computes the upper and lower bounds on the probability of system failure. A parameter of the model can be specified as a variable over a range of values directing the SURE program to perform a sensitivity analysis automatically. This feature, along with the speed of the program, makes it especially useful as a design tool.
ERIC Educational Resources Information Center
Herek, Gregory M.
2009-01-01
Using survey responses collected via the Internet from a U.S. national probability sample of gay, lesbian, and bisexual adults (N = 662), this article reports prevalence estimates of criminal victimization and related experiences based on the target's sexual orientation. Approximately 20% of respondents reported having experienced a person or…
Tornado damage risk assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reinhold, T.A.; Ellingwood, B.
1982-09-01
Several proposed models were evaluated for predicting tornado wind speed probabilities at nuclear plant sites as part of a program to develop statistical data on tornadoes needed for probability-based load combination analysis. A unified model was developed which synthesized the desired aspects of tornado occurrence and damage potential. The sensitivity of wind speed probability estimates to various tornado modeling assumptions is examined, and the probability distributions of tornado wind speed that are needed for load combination studies are presented.
Probability of identifying different salmonella serotypes in poultry samples
USDA-ARS?s Scientific Manuscript database
Recent work has called attention to the unequal competitive abilities of different Salmonella serotypes in standard broth culture and plating media. Such serotypes include Enteritidis and Typhimurium that are specifically targeted in some regulatory and certification programs because they cause a l...
A Gibbs sampler for Bayesian analysis of site-occupancy data
Dorazio, Robert M.; Rodriguez, Daniel Taylor
2012-01-01
1. A Bayesian analysis of site-occupancy data containing covariates of species occurrence and species detection probabilities is usually completed using Markov chain Monte Carlo methods in conjunction with software programs that can implement those methods for any statistical model, not just site-occupancy models. Although these software programs are quite flexible, considerable experience is often required to specify a model and to initialize the Markov chain so that summaries of the posterior distribution can be estimated efficiently and accurately. 2. As an alternative to these programs, we develop a Gibbs sampler for Bayesian analysis of site-occupancy data that include covariates of species occurrence and species detection probabilities. This Gibbs sampler is based on a class of site-occupancy models in which probabilities of species occurrence and detection are specified as probit-regression functions of site- and survey-specific covariate measurements. 3. To illustrate the Gibbs sampler, we analyse site-occupancy data of the blue hawker, Aeshna cyanea (Odonata, Aeshnidae), a common dragonfly species in Switzerland. Our analysis includes a comparison of results based on Bayesian and classical (non-Bayesian) methods of inference. We also provide code (based on the R software program) for conducting Bayesian and classical analyses of site-occupancy data.
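For readers unfamiliar with the idea behind such samplers, the sketch below implements a deliberately simplified version: constant occupancy and detection probabilities with conjugate Beta priors, rather than the probit-regression model with covariates developed in the paper. It is meant only to show the alternating updates of the latent occupancy states and the two probabilities.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate detection histories: n_sites sites visited n_surveys times each,
# with constant occupancy psi and detection p (a simplification of the
# covariate/probit model described in the abstract).
n_sites, n_surveys, psi_true, p_true = 200, 4, 0.6, 0.4
z_true = rng.binomial(1, psi_true, n_sites)
y = rng.binomial(1, p_true, (n_sites, n_surveys)) * z_true[:, None]

def gibbs_occupancy(y, n_iter=2000, a=1.0, b=1.0):
    n_sites, n_surveys = y.shape
    detected = y.sum(axis=1) > 0
    psi, p = 0.5, 0.5
    draws = []
    for _ in range(n_iter):
        # 1. Update latent occupancy z for sites with no detections
        #    (sites with at least one detection are certainly occupied).
        prob_occ = psi * (1 - p) ** n_surveys
        prob_z1 = prob_occ / (prob_occ + (1 - psi))
        z = np.where(detected, 1.0, rng.binomial(1, prob_z1, n_sites))
        # 2. Conjugate Beta updates for psi and p given z.
        psi = rng.beta(a + z.sum(), b + n_sites - z.sum())
        occupied = z == 1
        n_trials = occupied.sum() * n_surveys
        n_dets = y[occupied].sum()
        p = rng.beta(a + n_dets, b + n_trials - n_dets)
        draws.append((psi, p))
    return np.array(draws)

draws = gibbs_occupancy(y)
print("posterior means (psi, p):", draws[500:].mean(axis=0))
```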
ERIC Educational Resources Information Center
Kustos, Paul Nicholas
2010-01-01
Student difficulty in the study of probability arises in intuitively-based misconceptions derived from heuristics. One such heuristic, the one of note for this research study, is that of representativeness, in which an individual informally assesses the probability of an event based on the degree to which the event is similar to the sample from…
A country-wide probability sample of public attitudes toward stuttering in Portugal.
Valente, Ana Rita S; St Louis, Kenneth O; Leahy, Margaret; Hall, Andreia; Jesus, Luis M T
2017-06-01
Negative public attitudes toward stuttering have been widely reported, although differences among countries and regions exist. Clear reasons for these differences remain obscure. Published research on public attitudes toward stuttering in Portugal is unavailable, as is any study that explores stuttering attitudes with a representative sample of an entire country. This study sought to (a) determine the feasibility of a country-wide probability sampling scheme to measure public stuttering attitudes in Portugal using a standard instrument (the Public Opinion Survey of Human Attributes-Stuttering [POSHA-S]) and (b) identify demographic variables that predict Portuguese attitudes. The POSHA-S was translated to European Portuguese through a five-step process. Thereafter, a local administrative office-based, three-stage, cluster, probability sampling scheme was carried out to obtain 311 adult respondents who filled out the questionnaire. The Portuguese population held stuttering attitudes that were generally within the average range of those observed from numerous previous POSHA-S samples. Demographic variables that predicted more versus less positive stuttering attitudes were respondents' age, region of the country, years of school completed, working situation, and number of languages spoken. Non-predicting variables were respondents' sex, marital status, and parental status. A local administrative office-based, probability sampling scheme generated a respondent profile similar to census data and indicated that Portuguese attitudes are generally typical. Copyright © 2017 Elsevier Inc. All rights reserved.
We conducted a probability-based sampling of Lake Superior in 2006 and compared the zooplankton biomass estimate with laser optical plankton counter (LOPC) predictions. The net survey consisted of 52 sites stratified across three depth zones (0-30, 30-150, >150 m). The LOPC tow...
Patel, Deepak; Lambert, Estelle V; da Silva, Roseanne; Greyling, Mike; Kolbe-Alexander, Tracy; Noach, Adam; Conradie, Jaco; Nossel, Craig; Borresen, Jill; Gaziano, Thomas
2011-01-01
A retrospective, longitudinal study examined changes in participation in fitness-related activities and hospital claims over 5 years amongst members of an incentivized health promotion program offered by a private health insurer. The design was a 3-year retrospective observational analysis measuring gym visits and participation in documented fitness-related activities, probability of hospital admission, and associated costs of admission. The setting was a South African private health plan, Discovery Health, and the Vitality health promotion program. Participants were 304,054 adult members of the Discovery medical plan: 192,467 who registered for the health promotion program and 111,587 who were not on the program. Members were incentivized for fitness-related activities on the basis of the frequency of gym visits. Outcome measures were changes in electronically documented gym visits and registered participation in fitness-related activities over 3 years, and measures of association between changes in participation (years 1-3) and subsequent probability and costs of hospital admission (years 4-5). Hospital admissions and associated costs are based on claims extracted from the health insurer database. The probability of a claim was modeled by using linear logistic regression, and costs of claims were examined by using general linear models. Propensity scores were estimated and included age, gender, registration for chronic disease benefits, plan type, and the presence of a claim during the transition period; these were used as covariates in the final model. There was a significant decrease in the prevalence of inactive members (76% to 68%) over 5 years. Members who remained highly active (years 1-3) had a lower probability (p < .05) of hospital admission in years 4 to 5 (20.7%) compared with those who remained inactive (22.2%). The odds of admission were 13% lower for two additional gym visits per week (odds ratio, .87; 95% confidence interval [CI], .801-.949). We observed an increase in fitness-related activities over time amongst members of this incentive-based health promotion program, which was associated with a lower probability of hospital admission and lower hospital costs in the subsequent 2 years. Copyright © 2011 by American Journal of Health Promotion, Inc.
Mohammadkhani, Parvaneh; Khanipour, Hamid; Azadmehr, Hedieh; Mobramm, Ardeshir; Naseri, Esmaeil
2015-01-01
The aim of this study was to evaluate suicide probability in Iranian males with substance abuse or dependence disorder and to investigate the predictors of suicide probability based on trait mindfulness, reasons for living and severity of general psychiatric symptoms. Participants were 324 individuals with substance abuse or dependence in an outpatient setting and prison. The Reasons for Living Questionnaire, Mindfulness Attention Awareness Scale and Suicide Probability Scale were used as instruments. The sample was selected based on a convenience sampling method. Data were analyzed using SPSS and AMOS. The lifetime prevalence of suicide attempt was 35% in the outpatient setting and 42% in the prison setting. Suicide probability in the prison setting was significantly higher than in the outpatient setting (p<0.001). The severity of general symptoms strongly correlated with suicide probability. Trait mindfulness, not reasons for living beliefs, had a mediating effect in the relationship between the severity of general symptoms and suicide probability. Fear of social disapproval, survival and coping beliefs and child-related concerns significantly predicted suicide probability (p<0.001). It could be suggested that trait mindfulness was more effective in preventing suicide probability than beliefs about reasons for living in individuals with substance abuse or dependence disorders. The severity of general symptoms should be regarded as an important risk factor for suicide probability.
Landsat for practical forest type mapping - A test case
NASA Technical Reports Server (NTRS)
Bryant, E.; Dodge, A. G., Jr.; Warren, S. D.
1980-01-01
Computer classified Landsat maps are compared with a recent conventional inventory of forest lands in northern Maine. Over the 196,000 hectare area mapped, estimates of the areas of softwood, mixed wood and hardwood forest obtained by a supervised classification of the Landsat data and a standard inventory based on aerial photointerpretation, probability proportional to prediction, field sampling and a standard forest measurement program are found to agree to within 5%. The cost of the Landsat maps is estimated to be $0.065/hectare. It is concluded that satellite techniques are worth developing for forest inventories, although they are not yet refined enough to be incorporated into current practical inventories.
Fatigue strength reduction model: RANDOM3 and RANDOM4 user manual, appendix 2
NASA Technical Reports Server (NTRS)
Boyce, Lola; Lovelace, Thomas B.
1989-01-01
The FORTRAN programs RANDOM3 and RANDOM4 are documented. They are based on fatigue strength reduction, using a probabilistic constitutive model. They predict the random lifetime of an engine component to reach a given fatigue strength. Included in this user manual are details regarding the theoretical backgrounds of RANDOM3 and RANDOM4. Appendix A gives information on the physical quantities, their symbols, FORTRAN names, and both SI and U.S. Customary units. Appendices B and C include photocopies of the actual computer printout corresponding to the sample problems. Appendices D and E detail the IMSL, Version 10(1), subroutines and functions called by RANDOM3 and RANDOM4, and the SAS/GRAPH(2) programs that can be used to plot both the probability density functions (p.d.f.) and the cumulative distribution functions (c.d.f.).
Smart, Adam S; Tingley, Reid; Weeks, Andrew R; van Rooyen, Anthony R; McCarthy, Michael A
2015-10-01
Effective management of alien species requires detecting populations in the early stages of invasion. Environmental DNA (eDNA) sampling can detect aquatic species at relatively low densities, but few studies have directly compared detection probabilities of eDNA sampling with those of traditional sampling methods. We compare the ability of a traditional sampling technique (bottle trapping) and eDNA to detect a recently established invader, the smooth newt Lissotriton vulgaris vulgaris, at seven field sites in Melbourne, Australia. Over a four-month period, per-trap detection probabilities ranged from 0.01 to 0.26 among sites where L. v. vulgaris was detected, whereas per-sample eDNA estimates were much higher (0.29-1.0). Detection probabilities of both methods varied temporally (across days and months), but temporal variation appeared to be uncorrelated between methods. Only estimates of spatial variation were strongly correlated across the two sampling techniques. Environmental variables (water depth, rainfall, ambient temperature) were not clearly correlated with detection probabilities estimated via trapping, whereas eDNA detection probabilities were negatively correlated with water depth, possibly reflecting higher eDNA concentrations at lower water levels. Our findings demonstrate that eDNA sampling can be an order of magnitude more sensitive than traditional methods, and illustrate that traditional- and eDNA-based surveys can provide independent information on species distributions when occupancy surveys are conducted over short timescales.
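To put the quoted per-visit probabilities in perspective, the standard independence assumption gives a cumulative detection probability of 1 - (1 - p)^k after k visits or samples; the short calculation below (an illustration, not part of the study) converts the low and high ends of the reported ranges into the effort needed for 95% cumulative detection.

```python
import math

def surveys_needed(p_per_visit, target=0.95):
    """Number of independent visits/samples needed so that the cumulative
    detection probability 1 - (1 - p)**k reaches the target."""
    return math.ceil(math.log(1 - target) / math.log(1 - p_per_visit))

# Per-trap vs per-sample detection probabilities quoted in the abstract.
for label, p in [("trap, low end", 0.01), ("trap, high end", 0.26), ("eDNA, low end", 0.29)]:
    print(label, p, "->", surveys_needed(p), "visits for 95% cumulative detection")
```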
Average probability that a "cold hit" in a DNA database search results in an erroneous attribution.
Song, Yun S; Patil, Anand; Murphy, Erin E; Slatkin, Montgomery
2009-01-01
We consider a hypothetical series of cases in which the DNA profile of a crime-scene sample is found to match a known profile in a DNA database (i.e., a "cold hit"), resulting in the identification of a suspect based only on genetic evidence. We show that the average probability that there is another person in the population whose profile matches the crime-scene sample but who is not in the database is approximately 2(N - d)p(A), where N is the number of individuals in the population, d is the number of profiles in the database, and p(A) is the average match probability (AMP) for the population. The AMP is estimated by computing the average of the probabilities that two individuals in the population have the same profile. We show further that if a priori each individual in the population is equally likely to have left the crime-scene sample, then the average probability that the database search attributes the crime-scene sample to a wrong person is (N - d)p(A).
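Both approximations in the abstract are simple functions of the population size N, the database size d, and the average match probability p(A); the snippet below just evaluates them for illustrative inputs (the numbers are placeholders, not values from the paper).

```python
def cold_hit_probabilities(N, d, amp):
    """Approximations quoted in the abstract:
    - probability that someone outside the database also matches: ~2*(N - d)*amp
    - probability that the search attributes the sample to the wrong person
      (equal prior probabilities): ~(N - d)*amp
    """
    return 2 * (N - d) * amp, (N - d) * amp

# Illustrative numbers only: population of 10 million, database of 1 million
# profiles, average match probability 1e-9.
another_match, wrong_attribution = cold_hit_probabilities(10_000_000, 1_000_000, 1e-9)
print(another_match, wrong_attribution)   # 0.018, 0.009
```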
Assefa, Yibeltal; Worku, Alemayehu; Wouters, Edwin; Koole, Olivier; Haile Mariam, Damen; Van Damme, Wim
2012-01-01
Patient retention in care is a critical challenge for antiretroviral treatment programs. This is mainly because retention in care is related to adherence to treatment and patient survival. It is therefore imperative that health facilities and programs measure patient retention in care. However, the currently available tools for measuring retention in care, such as the Kaplan-Meier method, have many practical limitations. The objective of this study was to develop simplified tools for measuring retention in care. Retrospective cohort data were collected from patient registers in nine health facilities in Ethiopia. Retention in care was the primary outcome for the study. Tools were developed to measure "current retention" in care during a specific period of time for a specific "ART-age group" and "cohort retention" in care among patients who were followed for the last "Y" number of years on ART. "Probability of retention" based on the tool for "cohort retention" in care was compared with "probability of retention" based on Kaplan-Meier. We found that the new tools make it possible to measure "current retention" and "cohort retention" in care. We also found that the tools were easy to use and did not require advanced statistical skills. Both "current retention" and "cohort retention" are lower among patients in the first two "ART-age groups" and "ART-age cohorts" than in subsequent "ART-age groups" and "ART-age cohorts". The "probability of retention" based on the new tools was found to be similar to the "probability of retention" based on Kaplan-Meier. The simplified tools for "current retention" and "cohort retention" will enable practitioners and program managers to measure and monitor rates of retention in care easily and appropriately. We therefore recommend that health facilities and programs start to use these tools in their efforts to improve retention in care and patient outcomes.
User's Manual for Program PeakFQ, Annual Flood-Frequency Analysis Using Bulletin 17B Guidelines
Flynn, Kathleen M.; Kirby, William H.; Hummel, Paul R.
2006-01-01
Estimates of flood flows having given recurrence intervals or probabilities of exceedance are needed for design of hydraulic structures and floodplain management. Program PeakFQ provides estimates of instantaneous annual-maximum peak flows having recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years (annual-exceedance probabilities of 0.50, 0.20, 0.10, 0.04, 0.02, 0.01, 0.005, and 0.002, respectively). As implemented in program PeakFQ, the Pearson Type III frequency distribution is fit to the logarithms of instantaneous annual peak flows following Bulletin 17B guidelines of the Interagency Advisory Committee on Water Data. The parameters of the Pearson Type III frequency curve are estimated by the logarithmic sample moments (mean, standard deviation, and coefficient of skewness), with adjustments for low outliers, high outliers, historic peaks, and generalized skew. This documentation provides an overview of the computational procedures in program PeakFQ, provides a description of the program menus, and provides an example of the output from the program.
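The core computation, stripped of the Bulletin 17B adjustments for low outliers, high outliers, historic peaks, and generalized skew, is a method-of-moments fit of the Pearson Type III distribution to log10 annual peaks. The sketch below is a minimal approximation of that step, not a reimplementation of PeakFQ.

```python
import numpy as np
from scipy import stats

def log_pearson3_quantiles(peaks, aep=(0.5, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, 0.002)):
    """Fit a Pearson Type III distribution to log10 annual peaks by sample
    moments and return flow estimates for the given annual exceedance
    probabilities. Omits the outlier, historic-peak, and generalized-skew
    adjustments required by Bulletin 17B."""
    logq = np.log10(np.asarray(peaks, dtype=float))
    mean, std = logq.mean(), logq.std(ddof=1)
    skew = stats.skew(logq, bias=False)
    dist = stats.pearson3(skew, loc=mean, scale=std)
    return {p: 10 ** dist.ppf(1 - p) for p in aep}

# Synthetic example record of annual peak flows (cfs), for illustration only.
rng = np.random.default_rng(0)
peaks = 10 ** rng.normal(3.0, 0.25, size=40)
for p, q in log_pearson3_quantiles(peaks).items():
    print(f"AEP {p:>6}: {q:,.0f}")
```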
Sedinger, J.S.; Chelgren, N.D.
2007-01-01
We examined the relationship between mass late in the first summer and survival and return to the natal breeding colony for 12 cohorts (1986-1997) of female Black Brant (Branta bernicla nigricans). We used Cormack-Jolly-Seber methods and the program MARK to analyze capture-recapture data. Models included two kinds of residuals from regressions of mass on days after peak of hatch when goslings were measured; one based on the entire sample (12 cohorts) and the other based only on individuals in the same cohort. Some models contained date of peak of hatch (a group covariate related to lateness of nesting in that year) and mean cohort residual mass. Finally, models allowed survival to vary among cohorts. The best model of encounter probability included an effect of residual mass on encounter probability and allowed encounter probability to vary among age classes and across years. All competitive models contained an effect of one of the estimates of residual mass; relatively larger goslings survived their first year at higher rates. Goslings in cohorts from later years in the analysis tended to have lower first-year survival, after controlling for residual mass, which reflected the generally smaller mean masses for these cohorts but was potentially also a result of population-density effects additional to those on growth. Variation among cohorts in mean mass accounted for 56% of variation among cohorts in first-year survival. Encounter probabilities, which were correlated with breeding probability, increased with relative mass, which suggests that larger goslings not only survived at higher rates but also bred at higher rates. Although our findings support the well-established linkage between gosling mass and fitness, they suggest that additional environmental factors also influence first-year survival.
Wong, Linda; Hill, Beth L; Hunsberger, Benjamin C; Bagwell, C Bruce; Curtis, Adam D; Davis, Bruce H
2015-01-01
Leuko64™ (Trillium Diagnostics) is a flow cytometric assay that measures neutrophil CD64 expression and serves as an in vitro indicator of infection/sepsis or the presence of a systemic acute inflammatory response. Leuko64 assay currently utilizes QuantiCALC, a semiautomated software that employs cluster algorithms to define cell populations. The software reduces subjective gating decisions, resulting in interanalyst variability of <5%. We evaluated a completely automated approach to measuring neutrophil CD64 expression using GemStone™ (Verity Software House) and probability state modeling (PSM). Four hundred and fifty-seven human blood samples were processed using the Leuko64 assay. Samples were analyzed on four different flow cytometer models: BD FACSCanto II, BD FACScan, BC Gallios/Navios, and BC FC500. A probability state model was designed to identify calibration beads and three leukocyte subpopulations based on differences in intensity levels of several parameters. PSM automatically calculates CD64 index values for each cell population using equations programmed into the model. GemStone software uses PSM that requires no operator intervention, thus totally automating data analysis and internal quality control flagging. Expert analysis with the predicate method (QuantiCALC) was performed. Interanalyst precision was evaluated for both methods of data analysis. PSM with GemStone correlates well with the expert manual analysis, r(2) = 0.99675 for the neutrophil CD64 index values with no intermethod bias detected. The average interanalyst imprecision for the QuantiCALC method was 1.06% (range 0.00-7.94%), which was reduced to 0.00% with the GemStone PSM. The operator-to-operator agreement in GemStone was a perfect correlation, r(2) = 1.000. Automated quantification of CD64 index values produced results that strongly correlate with expert analysis using a standard gate-based data analysis method. PSM successfully evaluated flow cytometric data generated by multiple instruments across multiple lots of the Leuko64 kit in all 457 cases. The probability-based method provides greater objectivity, higher data analysis speed, and allows for greater precision for in vitro diagnostic flow cytometric assays. © 2015 International Clinical Cytometry Society.
McCarthy, Peter M.
2006-01-01
The Yellowstone River is very important in a variety of ways to the residents of southeastern Montana; however, it is especially vulnerable to spilled contaminants. In 2004, the U.S. Geological Survey, in cooperation with Montana Department of Environmental Quality, initiated a study to develop a computer program to rapidly estimate instream travel times and concentrations of a potential contaminant in the Yellowstone River using regression equations developed in 1999 by the U.S. Geological Survey. The purpose of this report is to describe these equations and their limitations, describe the development of a computer program to apply the equations to the Yellowstone River, and provide detailed instructions on how to use the program. This program is available online at [http://pubs.water.usgs.gov/sir2006-5057/includes/ytot.xls]. The regression equations provide estimates of instream travel times and concentrations in rivers where little or no contaminant-transport data are available. Equations were developed and presented for the most probable flow velocity and the maximum probable flow velocity. These velocity estimates can then be used to calculate instream travel times and concentrations of a potential contaminant. The computer program was developed so estimation equations for instream travel times and concentrations can be solved quickly for sites along the Yellowstone River between Corwin Springs and Sidney, Montana. The basic types of data needed to run the program are spill data, streamflow data, and data for locations of interest along the Yellowstone River. Data output from the program includes spill location, river mileage at specified locations, instantaneous discharge, mean-annual discharge, drainage area, and channel slope. Travel times and concentrations are provided for estimates of the most probable velocity of the peak concentration and the maximum probable velocity of the peak concentration. Verification of estimates of instream travel times and concentrations for the Yellowstone River requires information about the flow velocity throughout the 520 mi of river in the study area. Dye-tracer studies would provide the best data about flow velocities and would provide the best verification of instream travel times and concentrations estimated from this computer program; however, data from such studies does not currently (2006) exist and new studies would be expensive and time-consuming. An alternative approach used in this study for verification of instream travel times is based on the use of flood-wave velocities determined from recorded streamflow hydrographs at selected mainstem streamflow-gaging stations along the Yellowstone River. The ratios of flood-wave velocity to the most probable velocity for the base flow estimated from the computer program are within the accepted range of 2.5 to 4.0 and indicate that flow velocities estimated from the computer program are reasonable for the Yellowstone River. The ratios of flood-wave velocity to the maximum probable velocity are within a range of 1.9 to 2.8 and indicate that the maximum probable flow velocities estimated from the computer program, which corresponds to the shortest travel times and maximum probable concentrations, are conservative and reasonable for the Yellowstone River.
Hayer, C.-A.; Irwin, E.R.
2008-01-01
We used an information-theoretic approach to examine the variation in detection probabilities for 87 Piedmont and Coastal Plain fishes in relation to instream gravel mining in four Alabama streams of the Mobile River drainage. Biotic and abiotic variables were also included in candidate models. Detection probabilities were heterogeneous across species and varied with habitat type, stream, season, and water quality. Instream gravel mining influenced the variation in detection probabilities for 38% of the species collected, probably because it led to habitat loss and increased sedimentation. Higher detection probabilities were apparent at unmined sites than at mined sites for 78% of the species for which gravel mining was shown to influence detection probabilities, indicating potential negative impacts to these species. Physical and chemical attributes also explained the variation in detection probabilities for many species. These results indicate that anthropogenic impacts can affect detection probabilities for fishes, and such variation should be considered when developing monitoring programs or routine sampling protocols. © Copyright by the American Fisheries Society 2008.
Koneff, M.D.; Royle, J. Andrew; Forsell, D.J.; Wortham, J.S.; Boomer, G.S.; Perry, M.C.
2005-01-01
Survey design for wintering scoters (Melanitta sp.) and other sea ducks that occur in offshore waters is challenging because these species have large ranges, are subject to distributional shifts among years and within a season, and can occur in aggregations. Interest in winter sea duck population abundance surveys has grown in recent years. This interest stems from concern over the population status of some sea ducks, limitations of extant breeding waterfowl survey programs in North America and logistical challenges and costs of conducting surveys in northern breeding regions, high winter area philopatry in some species and potential conservation implications, and increasing concern over offshore development and other threats to sea duck wintering habitats. The efficiency and practicality of statistically rigorous monitoring strategies for mobile, aggregated wintering sea duck populations have not been sufficiently investigated. This study evaluated a 2-phase adaptive stratified strip transect sampling plan to estimate wintering population size of scoters, long-tailed ducks (Clangula hyemalis), and other sea ducks and provide information on distribution. The sampling plan results in an optimal allocation of a fixed sampling effort among offshore strata in the U.S. mid-Atlantic coast region. Phase 1 transect selection probabilities were based on historic distribution and abundance data, while Phase 2 selection probabilities were based on observations made during Phase 1 flights. Distance sampling methods were used to estimate detection rates. Environmental variables thought to affect detection rates were recorded during the survey and post-stratification and covariate modeling were investigated to reduce the effect of heterogeneity on detection estimation. We assessed cost-precision tradeoffs under a number of fixed-cost sampling scenarios using Monte Carlo simulation. We discuss advantages and limitations of this sampling design for estimating wintering sea duck abundance and mapping distribution and suggest improvements for future surveys.
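The optimal-allocation step can be illustrated with the textbook Neyman rule, which assigns transects to strata in proportion to stratum size times stratum standard deviation; the actual survey weighted strata using historic distribution and abundance data, so the rule and numbers below are only an assumed stand-in.

```python
import numpy as np

def neyman_allocation(total_transects, stratum_sizes, stratum_sds):
    """Allocate a fixed number of transects among strata in proportion to
    N_h * S_h (Neyman allocation). Illustrative only; rounding may shift the
    total by a transect or two."""
    weights = np.asarray(stratum_sizes, dtype=float) * np.asarray(stratum_sds, dtype=float)
    raw = total_transects * weights / weights.sum()
    return np.round(raw).astype(int)

# Hypothetical offshore strata: area (km^2) and SD of historic scoter counts.
print(neyman_allocation(60, stratum_sizes=[1200, 800, 400], stratum_sds=[15, 40, 5]))
```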
Gao, Xueping; Liu, Yinzhu; Sun, Bowen
2018-06-05
The risk of water shortage caused by uncertainties, such as frequent drought, varied precipitation, multiple water resources, and different water demands, brings new challenges to water transfer projects. Uncertainties exist for both transferred water and local surface water; therefore, the relationship between them should be thoroughly studied to prevent water shortage. For more effective water management, an uncertainty-based water shortage risk assessment model (UWSRAM) is developed to study the combined effect of multiple water resources and analyze the shortage degree under uncertainty. The UWSRAM combines copula-based Monte Carlo stochastic simulation and the chance-constrained programming-stochastic multiobjective optimization model, using the Lunan water-receiving area in China as an example. Statistical copula functions are employed to estimate the joint probability of available transferred water and local surface water and to sample from the multivariate probability distribution; these samples are used as inputs for the optimization model. The approach reveals the distribution of water shortage, emphasizes the importance of improving and updating transferred-water and local surface water management, and examines their combined influence on water shortage risk assessment. The possible available water and shortages can be calculated by applying the UWSRAM, along with the corresponding allocation measures under different water availability levels and violating probabilities. The UWSRAM is valuable for understanding the overall multi-water-resource situation and the degree of water shortage, adapting to the uncertainty surrounding water resources, establishing effective water resource planning policies for managers and achieving sustainable development.
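A minimal sketch of the copula-based Monte Carlo step is given below: the two supplies are coupled through a Gaussian copula, transformed to assumed marginals, and compared against a fixed demand. The marginal distributions, correlation, and demand are illustrative assumptions, not parameters of the Lunan case study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def shortage_probability(n=100_000, rho=0.5, demand=130.0):
    """Monte Carlo estimate of P(transferred + local water < demand) with the
    two supplies coupled by a Gaussian copula. All parameters are placeholders."""
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    u = stats.norm.cdf(z)                                       # copula (uniform) margins
    transfer = stats.lognorm(s=0.3, scale=60.0).ppf(u[:, 0])    # transferred water
    local = stats.gamma(a=4.0, scale=20.0).ppf(u[:, 1])         # local surface water
    supply = transfer + local
    return np.mean(supply < demand), np.mean(np.maximum(demand - supply, 0.0))

p_short, expected_deficit = shortage_probability()
print(f"shortage probability ~ {p_short:.3f}, expected deficit ~ {expected_deficit:.1f}")
```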
A computer program for uncertainty analysis integrating regression and Bayesian methods
Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary
2014-01-01
This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, which all can account for prior information: (1) linear confidence intervals which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization, (2) nonlinear confidence intervals which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization, and (3) MCMC Bayesian credible intervals which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.
Johnson, Emmanuel Janagan
2017-01-01
The purpose of this study is to explore the impact of domestic violence on the economic condition of the families. This cross-sectional study utilized a non-probability sampling procedure (purposive sampling) that included 30 women who had sought services from the Coalition Against Domestic Violence Agency. Data were collected using a questionnaire comprising 21 questions. The questions sought information on socioeconomic conditions and on the impact of domestic violence on the women's financial position. The study revealed that most of the domestic violence victims were young. Recommendations for future research include identifying the major causes of family disorganization and breakdown arising from domestic violence and other associated factors, while emphasizing the importance of family-based programs that minimize the impact.
Ennis, Erin J; Foley, Joe P
2016-07-15
A stochastic approach was utilized to estimate the probability of a successful isocratic or gradient separation in conventional chromatography for numbers of sample components, peak capacities, and saturation factors ranging from 2 to 30, 20-300, and 0.017-1, respectively. The stochastic probabilities were obtained under conditions of (i) constant peak width ("gradient" conditions) and (ii) peak width increasing linearly with time ("isocratic/constant N" conditions). The isocratic and gradient probabilities obtained stochastically were compared with the probabilities predicted by Martin et al. [Anal. Chem., 58 (1986) 2200-2207] and Davis and Stoll [J. Chromatogr. A, (2014) 128-142]; for a given number of components and peak capacity the same trend is always observed: probability obtained with the isocratic stochastic approach
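The constant-peak-width ("gradient") case of such a stochastic calculation can be sketched by placing component retention times uniformly at random and asking whether every adjacent pair is separated by at least one peak width; the Monte Carlo below is a simplified stand-in for the approach described, with the success criterion taken as an assumption.

```python
import numpy as np

rng = np.random.default_rng(7)

def prob_separation(n_components, peak_capacity, n_trials=20_000):
    """Monte Carlo sketch of the constant-peak-width ('gradient') case:
    component retention times are uniform on [0, 1] and the separation
    succeeds if every adjacent pair is at least 1/peak_capacity apart."""
    width = 1.0 / peak_capacity
    t = np.sort(rng.random((n_trials, n_components)), axis=1)
    gaps_ok = np.all(np.diff(t, axis=1) >= width, axis=1)
    return gaps_ok.mean()

for n in (5, 10, 20, 30):
    print(n, "components, peak capacity 100:", prob_separation(n, 100))
```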
Use of Internet panels to conduct surveys.
Hays, Ron D; Liu, Honghu; Kapteyn, Arie
2015-09-01
The use of Internet panels to collect survey data is increasing because it is cost-effective, enables access to large and diverse samples quickly, takes less time than traditional methods to obtain data for analysis, and the standardization of the data collection process makes studies easy to replicate. A variety of probability-based panels have been created, including Telepanel/CentERpanel, Knowledge Networks (now GFK KnowledgePanel), the American Life Panel, the Longitudinal Internet Studies for the Social Sciences panel, and the Understanding America Study panel. Despite the advantage of having a known denominator (sampling frame), the probability-based Internet panels often have low recruitment participation rates, and some have argued that there is little practical difference between opting out of a probability sample and opting into a nonprobability (convenience) Internet panel. This article provides an overview of both probability-based and convenience panels, discussing potential benefits and cautions for each method, and summarizing the approaches used to weight panel respondents in order to better represent the underlying population. Challenges of using Internet panel data are discussed, including false answers, careless responses, giving the same answer repeatedly, getting multiple surveys from the same respondent, and panelists being members of multiple panels. More is to be learned about Internet panels generally and about Web-based data collection, as well as how to evaluate data collected using mobile devices and social-media platforms.
CUMBIN - CUMULATIVE BINOMIAL PROGRAMS
NASA Technical Reports Server (NTRS)
Bowerman, P. N.
1994-01-01
The cumulative binomial program, CUMBIN, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CUMBIN, NEWTONP (NPO-17556), and CROSSER (NPO-17557), can be used independently of one another. CUMBIN can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CUMBIN calculates the probability that a system of n components has at least k operating if the probability that any one is operating is p and the components are independent. Equivalently, this is the reliability of a k-out-of-n system having independent components with common reliability p. CUMBIN can evaluate the incomplete beta distribution for two positive integer arguments. CUMBIN can also evaluate the cumulative F distribution and the negative binomial distribution, and can determine the sample size in a test design. CUMBIN is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. The CUMBIN program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CUMBIN was developed in 1988.
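The central quantity CUMBIN computes is the upper tail of a binomial distribution; the few lines below (in Python, not the C source of CUMBIN) reproduce the k-out-of-n reliability and cross-check it against SciPy's survival function.

```python
from math import comb
from scipy import stats

def k_out_of_n_reliability(k, n, p):
    """Probability that at least k of n independent components are operating,
    each with reliability p: the binomial upper tail computed by CUMBIN."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Cross-check against SciPy's survival function (P[X >= k] = sf(k - 1)).
print(k_out_of_n_reliability(3, 5, 0.9))   # 0.99144
print(stats.binom.sf(2, 5, 0.9))           # same value
```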
A risk assessment method for multi-site damage
NASA Astrophysics Data System (ADS)
Millwater, Harry Russell, Jr.
This research focused on developing probabilistic methods suitable for computing small probabilities of failure, e.g., 10^-6, of structures subject to multi-site damage (MSD). MSD is defined as the simultaneous development of fatigue cracks at multiple sites in the same structural element such that the fatigue cracks may coalesce to form one large crack. MSD is modeled as an array of collinear cracks with random initial crack lengths with the centers of the initial cracks spaced uniformly apart. The data used were chosen to be representative of aluminum structures. The structure is considered failed whenever any two adjacent cracks link up. A fatigue computer model is developed that can accurately and efficiently grow a collinear array of arbitrary length cracks from initial size until failure. An algorithm is developed to compute the stress intensity factors of all cracks considering all interaction effects. The probability of failure of two to 100 cracks is studied. Lower bounds on the probability of failure are developed based upon the probability of the largest crack exceeding a critical crack size. The critical crack size is based on the initial crack size that will grow across the ligament when the neighboring crack has zero length. The probability is evaluated using extreme value theory. An upper bound is based on the probability of the maximum sum of initial cracks being greater than a critical crack size. A weakest link sampling approach is developed that can accurately and efficiently compute small probabilities of failure. This methodology is based on predicting the weakest link, i.e., the two cracks to link up first, for a realization of initial crack sizes, and computing the cycles-to-failure using these two cracks. Criteria to determine the weakest link are discussed. Probability results using the weakest link sampling method are compared to Monte Carlo-based benchmark results. The results indicate that very small probabilities can be computed accurately in a few minutes using a Hewlett-Packard workstation.
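The lower bound described above reduces to an extreme-value calculation: the probability that the largest of n independent initial cracks exceeds the critical size is 1 - F(a_crit)^n. The sketch below evaluates that expression and checks it by simulation, using an assumed lognormal initial-crack-size distribution with made-up parameters.

```python
import numpy as np
from scipy import stats

def largest_crack_exceedance(n_cracks, a_crit, median=0.005, sigma=0.5, n_sim=50_000):
    """Probability that the largest of n_cracks i.i.d. initial cracks exceeds the
    critical size a_crit. The lognormal distribution and its parameters (inches)
    are illustrative assumptions, not values from the dissertation."""
    dist = stats.lognorm(s=sigma, scale=median)
    analytic = 1.0 - dist.cdf(a_crit) ** n_cracks            # extreme-value form
    rng = np.random.default_rng(3)
    sims = dist.rvs(size=(n_sim, n_cracks), random_state=rng)
    monte_carlo = np.mean(sims.max(axis=1) > a_crit)
    return analytic, monte_carlo

print(largest_crack_exceedance(n_cracks=50, a_crit=0.02))
```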
45 CFR 286.260 - May Tribes use sampling and electronic filing?
Code of Federal Regulations, 2011 CFR
2011-10-01
... method” means a probability sampling method in which every sampling unit has a known, non-zero chance to... quarterly reports electronically, based on format specifications that we will provide. Tribes who do not...
Mars Exploration Rovers Landing Dispersion Analysis
NASA Technical Reports Server (NTRS)
Knocke, Philip C.; Wawrzyniak, Geoffrey G.; Kennedy, Brian M.; Desai, Prasun N.; Parker, Timothy J.; Golombek, Matthew P.; Duxbury, Thomas C.; Kass, David M.
2004-01-01
Landing dispersion estimates for the Mars Exploration Rover missions were key elements in the site targeting process and in the evaluation of landing risk. This paper addresses the process and results of the landing dispersion analyses performed for both Spirit and Opportunity. The several contributors to landing dispersions (navigation and atmospheric uncertainties, spacecraft modeling, winds, and margins) are discussed, as are the analysis tools used. JPL's MarsLS program, a MATLAB-based landing dispersion visualization and statistical analysis tool, was used to calculate the probability of landing within hazardous areas. By convolving this with the probability of landing within flight system limits (in-spec landing) for each hazard area, a single overall measure of landing risk was calculated for each landing ellipse. In-spec probability contours were also generated, allowing a more synoptic view of site risks, illustrating the sensitivity to changes in landing location, and quantifying the possible consequences of anomalies such as incomplete maneuvers. Data and products required to support these analyses are described, including the landing footprints calculated by NASA Langley's POST program and JPL's AEPL program, cartographically registered base maps and hazard maps, and flight system estimates of in-spec landing probabilities for each hazard terrain type. Various factors encountered during operations, including evolving navigation estimates and changing atmospheric models, are discussed and final landing points are compared with approach estimates.
Hartwell, T D; Steele, P; French, M T; Potter, F J; Rodman, N F; Zarkin, G A
1996-01-01
OBJECTIVES: Employee assistance programs (EAPs) are job-based programs designed to identify and assist troubled employees. This study determines the prevalence, cost, and characteristics of these programs in the United States by worksite size, industry, and census region. METHODS: A stratified national probability sample of more than 6400 private, nonagricultural US worksites with 50 or more full-time employees was contacted with a computer-assisted telephone interviewing protocol. More than 3200 worksites responded and were eligible, with a response rate of 90%. RESULTS: Approximately 33% of all private, nonagricultural worksites with 50 or more full-time employees currently offer EAP services to their employees, an 8.9% increase over 1985. These programs are more likely to be found in larger worksites and in the communications/utilities/transportation industries. The most popular model is an external provider, and the median annual cost per eligible employee for internal and external programs was $21.83 and $18.09, respectively. CONCLUSIONS: EAPs are becoming a more prevalent point of access to health care for workers with personal problems such as substance abuse, family problems, or emotional distress. PMID:8659653
Multinomial mixture model with heterogeneous classification probabilities
Holland, M.D.; Gray, B.R.
2011-01-01
Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct-classification probabilities when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.
NASA Astrophysics Data System (ADS)
Sonnenfeld, Alessandro; Chan, James H. H.; Shu, Yiping; More, Anupreeta; Oguri, Masamune; Suyu, Sherry H.; Wong, Kenneth C.; Lee, Chien-Hsiu; Coupon, Jean; Yonehara, Atsunori; Bolton, Adam S.; Jaelani, Anton T.; Tanaka, Masayuki; Miyazaki, Satoshi; Komiyama, Yutaka
2018-01-01
The Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP) is an excellent survey for the search for strong lenses, thanks to its area, image quality, and depth. We use three different methods to look for lenses among 43,000 luminous red galaxies from the Baryon Oscillation Spectroscopic Survey (BOSS) sample with photometry from the S16A internal data release of the HSC-SSP. The first method is a newly developed algorithm, named YATTALENS, which looks for arc-like features around massive galaxies and then estimates the likelihood of an object being a lens by performing a lens model fit. The second method, CHITAH, is a modeling-based algorithm originally developed to look for lensed quasars. The third method makes use of spectroscopic data to look for emission lines from objects at a different redshift from that of the main galaxy. We find 15 definite lenses, 36 highly probable lenses, and 282 possible lenses. Among the three methods, YATTALENS, which was developed specifically for this study, performs best in terms of both completeness and purity. Nevertheless, five highly probable lenses were missed by YATTALENS but found by the other two methods, indicating that the three methods are highly complementary. Based on these numbers, we expect to find ~300 definite or probable lenses by the end of the HSC-SSP.
Counihan, T.D.; Miller, Allen I.; Parsley, M.J.
1999-01-01
The development of recruitment monitoring programs for age-0 white sturgeons Acipenser transmontanus is complicated by the statistical properties of catch-per-unit-effort (CPUE) data. We found that age-0 CPUE distributions from bottom trawl surveys violated assumptions of statistical procedures based on normal probability theory. Further, no single data transformation uniformly satisfied these assumptions because CPUE distribution properties varied with the sample mean CPUE. Given these analytic problems, we propose that an additional index of age-0 white sturgeon relative abundance, the proportion of positive tows (Ep), be used to estimate sample sizes before conducting age-0 recruitment surveys and to evaluate statistical hypothesis tests comparing the relative abundance of age-0 white sturgeons among years. Monte Carlo simulations indicated that Ep was consistently more precise than mean CPUE, and because Ep is binomially rather than normally distributed, surveys can be planned and analyzed without violating the assumptions of procedures based on normal probability theory. However, we show that Ep may underestimate changes in relative abundance at high levels and confound our ability to quantify responses to management actions if relative abundance is consistently high. If data suggest that most samples will contain age-0 white sturgeons, estimators of relative abundance other than Ep should be considered. Because Ep may also obscure correlations to climatic and hydrologic variables if high abundance levels are present in time series data, we recommend that mean CPUE be used to describe relations to environmental variables. The use of both Ep and mean CPUE will facilitate the evaluation of hypothesis tests comparing relative abundance levels and correlations to variables affecting age-0 recruitment. Estimated sample sizes for surveys should therefore be based on detecting predetermined differences in Ep, but data necessary to calculate mean CPUE should also be collected.
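The precision comparison can be reproduced in miniature by simulating overdispersed catches and comparing the relative variability of the two indices; the negative binomial catch model and its parameters below are assumptions chosen only to mimic aggregated catches.

```python
import numpy as np

rng = np.random.default_rng(11)

def compare_indices(n_tows=52, mean_catch=1.5, dispersion=0.5, n_surveys=5_000):
    """Simulate negative binomial catches per tow (an assumed model for
    aggregated age-0 sturgeon catches) and compare the coefficient of
    variation of mean CPUE with that of Ep, the proportion of positive tows."""
    p = dispersion / (dispersion + mean_catch)          # NB parameterization (n, p)
    catches = rng.negative_binomial(dispersion, p, size=(n_surveys, n_tows))
    mean_cpue = catches.mean(axis=1)
    ep = (catches > 0).mean(axis=1)
    cv = lambda x: x.std(ddof=1) / x.mean()
    return cv(mean_cpue), cv(ep)

print("CV of mean CPUE vs CV of Ep:", compare_indices())
```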
DOE Office of Scientific and Technical Information (OSTI.GOV)
VanderNoot, Victoria A.; Haroldsen, Brent L.; Renzi, Ronald F.
2010-03-01
In a multiyear research agreement with Tenix Investments Pty. Ltd., Sandia has been developing field deployable technologies for detection of biotoxins in water supply systems. The unattended water sensor or UWS employs microfluidic chip based gel electrophoresis for monitoring biological analytes in a small integrated sensor platform. This instrument collects, prepares, and analyzes water samples in an automated manner. Sample analysis is done using the μChemLab™ analysis module. This report uses analysis results of two datasets collected using the UWS to estimate performance of the device. The first dataset is made up of samples containing ricin at varying concentrations and is used for assessing instrument response and detection probability. The second dataset is comprised of analyses of water samples collected at a water utility which are used to assess the false positive probability. The analyses of the two sets are used to estimate the Receiver Operating Characteristic or ROC curves for the device at one set of operational and detection algorithm parameters. For these parameters and based on a statistical estimate, the ricin probability of detection is about 0.9 at a concentration of 5 nM for a false positive probability of 1 x 10^-6.
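Estimating a point on the ROC curve from the two datasets amounts to setting a detection threshold from the blank (utility-water) scores and scoring the ricin-spiked samples against it. The sketch below uses simulated placeholder score distributions and an empirical threshold at a 10^-3 false-positive rate; reaching the 10^-6 operating point quoted above would require a parametric model of the blank-score tail rather than this purely empirical cut.

```python
import numpy as np

rng = np.random.default_rng(5)

def roc_point(blank_scores, spiked_scores, false_positive_rate=1e-3):
    """Set the detection threshold at the (1 - FPR) quantile of blank-sample
    scores and report the fraction of spiked samples exceeding it."""
    threshold = np.quantile(blank_scores, 1.0 - false_positive_rate)
    prob_detection = np.mean(spiked_scores > threshold)
    return threshold, prob_detection

# Placeholder score distributions standing in for the two UWS datasets.
blanks = rng.normal(0.0, 1.0, 100_000)   # utility water samples (no toxin)
spiked = rng.normal(4.5, 1.2, 5_000)     # samples spiked with ricin
print(roc_point(blanks, spiked, false_positive_rate=1e-3))
```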
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, X; Gao, H; Schuemann, J
2015-06-15
Purpose: The Monte Carlo (MC) method is a gold standard for dose calculation in radiotherapy. However, it is not a priori clear how many particles need to be simulated to achieve a given dose accuracy. Prior error estimates and stopping criteria are not well established for MC. This work aims to fill this gap. Methods: Due to the statistical nature of MC, our approach is based on the one-sample t-test. We design the prior error estimate method based on the t-test, and then use this t-test based error estimate to develop a simulation stopping criterion. The three major components are as follows. First, the source particles are randomized in energy, space and angle, so that the dose deposition from a particle to the voxel is independent and identically distributed (i.i.d.). Second, a sample under consideration in the t-test is the mean value of dose deposition to the voxel by a sufficiently large number of source particles. Then, according to the central limit theorem, the sample as the mean value of i.i.d. variables is normally distributed with expectation equal to the true deposited dose. Third, the t-test is performed with the null hypothesis that the difference between the sample expectation (the same as the true deposited dose) and the on-the-fly calculated mean sample dose from MC is larger than a given error threshold; in addition, users have the freedom to specify the confidence probability and region of interest in the t-test based stopping criterion. Results: The method is validated for proton dose calculation. The difference between the MC result based on the t-test prior error estimate and the statistical result obtained by repeating numerous MC simulations is within 1%. Conclusion: The t-test based prior error estimate and stopping criterion are developed for MC and validated for proton dose calculation. Xiang Hong and Hao Gao were partially supported by the NSFC (#11405105), the 973 Program (#2015CB856000) and the Shanghai Pujiang Talent Program (#14PJ1404500).
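A single-voxel simplification of such a stopping rule is sketched below: samples of dose per source particle are accumulated in batches, and simulation stops once the t-based confidence-interval half-width on the mean dose falls below the error threshold at the chosen confidence level. The dose-per-particle model and all numbers are placeholders, not the authors' implementation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def run_until_converged(sample_batch, error_threshold, confidence=0.95,
                        batch_size=10_000, max_batches=1_000):
    """Accumulate per-voxel dose samples in batches and stop when the t-based
    confidence-interval half-width on the mean dose drops below the error
    threshold. A single-voxel simplification of the criterion."""
    doses = np.array([])
    for _ in range(max_batches):
        doses = np.concatenate([doses, sample_batch(batch_size)])
        n = doses.size
        half_width = stats.t.ppf(0.5 + confidence / 2, n - 1) * doses.std(ddof=1) / np.sqrt(n)
        if half_width < error_threshold:
            return doses.mean(), half_width, n
    return doses.mean(), half_width, n

# Placeholder "dose per source particle" model for one voxel: most particles
# deposit nothing, a few deposit an exponentially distributed dose.
simulate = lambda n: rng.exponential(1.0, n) * (rng.random(n) < 0.02)
print(run_until_converged(simulate, error_threshold=5e-4))
```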
The National Children's Study: Recruitment Outcomes Using the Provider-Based Recruitment Approach.
Hale, Daniel E; Wyatt, Sharon B; Buka, Stephen; Cherry, Debra; Cislo, Kendall K; Dudley, Donald J; McElfish, Pearl Anna; Norman, Gwendolyn S; Reynolds, Simone A; Siega-Riz, Anna Maria; Wadlinger, Sandra; Walker, Cheryl K; Robbins, James M
2016-06-01
In 2009, the National Children's Study (NCS) Vanguard Study tested the feasibility of household-based recruitment and participant enrollment using a birth-rate probability sample. In 2010, the NCS Program Office launched 3 additional recruitment approaches. We tested whether provider-based recruitment could improve recruitment outcomes compared with household-based recruitment. The NCS aimed to recruit 18- to 49-year-old women who were pregnant or at risk for becoming pregnant and who lived in designated geographic segments within primary sampling units, generally counties. Using provider-based recruitment, 10 study centers engaged providers to enroll eligible participants at their practice. Recruitment models used different levels of provider engagement (full, intermediate, information-only). The percentage of eligible women per county ranged from 1.5% to 57.3%. Across the centers, 3371 potential participants were approached for screening, 3459 (92%) were screened and 1479 were eligible (43%). Of those, 1181 (80.0%) gave consent and 1008 (94%) were retained until delivery. Recruited participants were generally representative of the county population. Provider-based recruitment was successful in recruiting NCS participants. Challenges included time-intensity of engaging the clinical practices, differential willingness of providers to participate, and necessary reliance on providers for participant identification. The vast majority of practices cooperated to some degree. Recruitment from obstetric practices is an effective means of obtaining a representative sample. Copyright © 2016 by the American Academy of Pediatrics.
Schillaci, Michael A; Schillaci, Mario E
2009-02-01
The use of small sample sizes in human and primate evolutionary research is commonplace. Estimating how well small samples represent the underlying population, however, is not commonplace. Because the accuracy of determinations of taxonomy, phylogeny, and evolutionary process is dependent upon how well the study sample represents the population of interest, characterizing the uncertainty, or potential error, associated with analyses of small sample sizes is essential. We present a method for estimating the probability that the sample mean is within a desired fraction of the standard deviation of the true mean using small (n < 10) or very small (n ≤ 5) sample sizes. This method can be used by researchers to determine post hoc the probability that their sample is a meaningful approximation of the population parameter. We tested the method using a large craniometric data set commonly used by researchers in the field. Given our results, we suggest that sample estimates of the population mean can be reasonable and meaningful even when based on small, and perhaps even very small, sample sizes.
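Under a normality assumption this probability has a closed form through the t distribution, since (x̄ − μ)/(s/√n) follows a t distribution with n − 1 degrees of freedom, so P(|x̄ − μ| ≤ k·s) = P(|T| ≤ k·√n). The sketch below evaluates this quantity; it is one way to obtain such post hoc probabilities and is not necessarily identical to the authors' formulation.

```python
import numpy as np
from scipy import stats

def prob_mean_within(n, k):
    """Probability that the mean of n normal observations lies within k sample
    standard deviations of the true mean: since (x̄ - μ)/(s/√n) ~ t(n-1),
    P(|x̄ - μ| <= k·s) = P(|T| <= k·√n)."""
    return 2 * stats.t.cdf(k * np.sqrt(n), df=n - 1) - 1

# With a very small sample of n = 5, the chance that x̄ falls within 0.5 s of μ:
print(round(prob_mean_within(5, 0.5), 2))   # ≈ 0.67
```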
Astrelin, A V; Sokolov, M V; Behnisch, T; Reymann, K G; Voronin, L L
1997-04-25
A statistical approach to analysis of amplitude fluctuations of postsynaptic responses is described. This includes (1) using a L1-metric in the space of distribution functions for minimisation with application of linear programming methods to decompose amplitude distributions into a convolution of Gaussian and discrete distributions; (2) deconvolution of the resulting discrete distribution with determination of the release probabilities and the quantal amplitude for cases with a small number (< 5) of discrete components. The methods were tested against simulated data over a range of sample sizes and signal-to-noise ratios which mimicked those observed in physiological experiments. In computer simulation experiments, comparisons were made with other methods of 'unconstrained' (generalized) and constrained reconstruction of discrete components from convolutions. The simulation results provided additional criteria for improving the solutions to overcome 'over-fitting phenomena' and to constrain the number of components with small probabilities. Application of the programme to recordings from hippocampal neurones demonstrated its usefulness for the analysis of amplitude distributions of postsynaptic responses.
ERIC Educational Resources Information Center
Coyle-Rogers, Patricia G.; Rogers, George E.
A study determined whether there are any differences in the adaptive competency acquisition between technology education teachers who have completed a school district add-on alternative certification process and technology education teachers who completed a traditional baccalaureate degree certification program. Non-probability sampling was used…
Probability surveys of stream and river resources (hereafter referred to as streams) provide reliable estimates of stream condition when the areas for the estimates have sufficient number of sample sites. Monitoring programs are frequently asked to provide estimates for areas th...
Evaluation of seven aquatic sampling methods for amphibians and other aquatic fauna
Gunzburger, M.S.
2007-01-01
To design effective and efficient research and monitoring programs, researchers must have a thorough understanding of the capabilities and limitations of their sampling methods. Few direct comparative studies exist for aquatic sampling methods for amphibians. The objective of this study was to simultaneously employ seven aquatic sampling methods in 10 wetlands to compare amphibian species richness and number of individuals detected with each method. Four sampling methods allowed counts of individuals (metal dipnet, D-frame dipnet, box trap, crayfish trap), whereas the other three methods allowed detection of species (visual encounter, aural, and froglogger). Amphibian species richness was greatest with froglogger, box trap, and aural samples. For anuran species, the sampling methods by which each life stage was detected were related to relative length of larval and breeding periods and tadpole size. Detection probability of amphibians varied across sampling methods. Box trap sampling resulted in the most precise amphibian count, but the precision of all four count-based methods was low (coefficient of variation > 145 for all methods). The efficacy of the four count sampling methods at sampling fish and aquatic invertebrates was also analyzed because these predatory taxa are known to be important predictors of amphibian habitat distribution. Species richness and counts were similar for fish with the four methods, whereas invertebrate species richness and counts were greatest in box traps. An effective wetland amphibian monitoring program in the southeastern United States should include multiple sampling methods to obtain the most accurate assessment of species community composition at each site. The combined use of frogloggers, crayfish traps, and dipnets may be the most efficient and effective amphibian monitoring protocol. © 2007 Brill Academic Publishers.
Teaching Probability to Pre-Service Teachers with Argumentation Based Science Learning Approach
ERIC Educational Resources Information Center
Can, Ömer Sinan; Isleyen, Tevfik
2016-01-01
The aim of this study is to explore the effects of the argumentation based science learning (ABSL) approach on the teaching probability to pre-service teachers. The sample of the study included 41 students studying at the Department of Elementary School Mathematics Education in a public university during the 2014-2015 academic years. The study is…
FASP, an analytic resource appraisal program for petroleum play analysis
Crovelli, R.A.; Balay, R.H.
1986-01-01
An analytic probabilistic methodology for resource appraisal of undiscovered oil and gas resources in play analysis is presented in a FORTRAN program termed FASP. This play-analysis methodology is a geostochastic system for petroleum resource appraisal in explored as well as frontier areas. An established geologic model considers both the uncertainty of the presence of the assessed hydrocarbon and its amount if present. The program FASP produces resource estimates of crude oil, nonassociated gas, dissolved gas, and gas for a geologic play in terms of probability distributions. The analytic method is based upon conditional probability theory and many laws of expectation and variance. © 1986.
Using effort information with change-in-ratio data for population estimation
Udevitz, Mark S.; Pollock, Kenneth H.
1995-01-01
Most change-in-ratio (CIR) methods for estimating fish and wildlife population sizes have been based only on assumptions about how encounter probabilities vary among population subclasses. When information on sampling effort is available, it is also possible to derive CIR estimators based on assumptions about how encounter probabilities vary over time. This paper presents a generalization of previous CIR models that allows explicit consideration of a range of assumptions about the variation of encounter probabilities among subclasses and over time. Explicit estimators are derived under this model for specific sets of assumptions about the encounter probabilities. Numerical methods are presented for obtaining estimators under the full range of possible assumptions. Likelihood ratio tests for these assumptions are described. Emphasis is on obtaining estimators based on assumptions about variation of encounter probabilities over time.
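For orientation, the classical two-sample change-in-ratio (Paulik-Robson) estimator that the generalized model extends can be written in a few lines; the effort-based and time-varying extensions described in the abstract are not reproduced here.

```python
def cir_estimate(p1, p2, removed_x, removed_y):
    """Classical two-sample change-in-ratio estimate of pre-removal population
    size.  p1 and p2 are the observed proportions of subclass x before and after
    a known removal of removed_x x-type and removed_y y-type animals; equal
    encounter probabilities for the two subclasses are assumed on each occasion."""
    total_removed = removed_x + removed_y
    return (removed_x - total_removed * p2) / (p1 - p2)

# Example: the x-proportion drops from 0.60 to 0.40 after removing 300 x and 50 y
print(cir_estimate(0.60, 0.40, 300, 50))   # 800 animals before removal
```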
Benndorf, Matthias; Neubauer, Jakob; Langer, Mathias; Kotter, Elmar
2017-03-01
In the diagnostic process of primary bone tumors, patient age, tumor localization and to a lesser extent sex affect the differential diagnosis. We therefore aim to develop a pretest probability calculator for primary malignant bone tumors based on population data taking these variables into account. We access the SEER (Surveillance, Epidemiology and End Results Program of the National Cancer Institute, 2015 release) database and analyze data of all primary malignant bone tumors diagnosed between 1973 and 2012. We record age at diagnosis, tumor localization according to the International Classification of Diseases (ICD-O-3) and sex. We take relative probability of the single tumor entity as a surrogate parameter for unadjusted pretest probability. We build a probabilistic (naïve Bayes) classifier to calculate pretest probabilities adjusted for age, tumor localization and sex. We analyze data from 12,931 patients (647 chondroblastic osteosarcomas, 3659 chondrosarcomas, 1080 chordomas, 185 dedifferentiated chondrosarcomas, 2006 Ewing's sarcomas, 281 fibroblastic osteosarcomas, 129 fibrosarcomas, 291 fibrous malignant histiocytomas, 289 malignant giant cell tumors, 238 myxoid chondrosarcomas, 3730 osteosarcomas, 252 parosteal osteosarcomas, 144 telangiectatic osteosarcomas). We make our probability calculator accessible at http://ebm-radiology.com/bayesbone/index.html. We provide exhaustive tables for age and localization data. Results from tenfold cross-validation show that in 79.8% of cases the pretest probability is correctly raised. Our approach employs population data to calculate relative pretest probabilities for primary malignant bone tumors. The calculator is not diagnostic in nature. However, resulting probabilities might serve as an initial evaluation of probabilities of tumors on the differential diagnosis list.
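A naïve Bayes pretest calculation of this kind multiplies an entity's relative frequency (the unadjusted pretest probability) by per-variable conditional frequencies under an independence assumption and renormalizes. The sketch below illustrates the arithmetic; the data structure, smoothing constant, and toy counts are invented for illustration and are not taken from SEER.

```python
def pretest_probabilities(counts, age_bin, site, sex):
    """Naïve Bayes pretest probabilities over tumor entities, assuming age bin,
    localization, and sex are conditionally independent given the entity.
    `counts` maps entity -> {'n': total cases, 'age': {...}, 'site': {...},
    'sex': {...}} built from a registry (all names here are illustrative)."""
    total = sum(c["n"] for c in counts.values())
    post = {}
    for entity, c in counts.items():
        prior = c["n"] / total                                # relative frequency of the entity
        likelihood = (c["age"].get(age_bin, 0.5) / c["n"] *   # 0.5 is a crude smoother
                      c["site"].get(site, 0.5) / c["n"] *
                      c["sex"].get(sex, 0.5) / c["n"])
        post[entity] = prior * likelihood
    z = sum(post.values())
    return {k: v / z for k, v in post.items()}                # adjusted pretest probabilities

# Toy counts for two entities (numbers are made up)
toy = {
    "osteosarcoma":   {"n": 100, "age": {"10-19": 55}, "site": {"long bones": 80}, "sex": {"M": 56}},
    "chondrosarcoma": {"n": 100, "age": {"10-19": 5},  "site": {"long bones": 45}, "sex": {"M": 55}},
}
print(pretest_probabilities(toy, "10-19", "long bones", "M"))
```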
Probability of illness definition for the Skylab flight crew health stabilization program
NASA Technical Reports Server (NTRS)
1974-01-01
Management and analysis of crew and environmental microbiological data from SMEAT and Skylab are discussed. Samples were collected from ten different body sites on each SMEAT and Skylab crew-member on approximately 50 occasions and since several different organisms could be isolated from each sample, several thousand lab reports were generated. These lab reports were coded and entered in a computer file and from the file various tabular summaries were constructed.
Rice, Eric; Winetrobe, Hailey; Holloway, Ian W.; Montoya, Jorge; Plant, Aaron; Kordic, Timothy
2014-01-01
Online partner seeking is associated with sexual risk behavior among young adults (specifically men who have sex with men), but this association has yet to be explored among a probability sample of adolescents. Moreover, cell phone internet access and sexual risk taking online and offline have not been explored. A probability sample (N = 1,831) of Los Angeles Unified School District high school students was collected in 2011. Logistic regression models assessed relationships between specific sexual risk behaviors (online sexual solicitation, seeking partners online, sex with internet-met partners, condom use) and frequency of internet use, internet access points, and demographics. Students with cell phone internet access were more likely to report being solicited online for sex, being sexually active, and having sex with an internet-met partner. Bisexual-identifying students reported higher rates of being approached online for sex, being sexually active, and not using condoms at last sex. Gay, lesbian, and questioning (GLQ) students were more likely to report online partner seeking and unprotected sex at last sex with an internet-met partner. Additionally, having sex with an internet-met partner was associated with being male, online sexual solicitation, and online partner seeking. Internet- and school-based sexual health programs should incorporate safety messages regarding online sexual solicitation, seeking sex partners online, and engaging in safer sex practices with all partners. Programs must target adolescents of all sexual identities, as adolescents may not yet be “out,” and bisexual and GLQ adolescents are more likely to engage in risky sex behaviors. PMID:25344027
Farrar, Jerry W.; Long, H. Keith
1996-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for 6 standard reference samples--T-137 (trace constituents), M-136 (major constituents), N-47 (nutrient constituents), N-48 (nutrient constituents), P-25 (low ionic strength constituents), and Hg-21 (mercury)--that were distributed in October 1995 to 149 laboratories registered in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 136 of the laboratories were evaluated with respect to: overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
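The reports state only that the most probable value was determined with nonparametric statistics; the median, paired with the F-pseudosigma as a robust spread, is one common choice for interlaboratory consensus values and is sketched below as an assumption rather than as the Survey's documented procedure.

```python
import numpy as np

def most_probable_value(reported):
    """Robust consensus value for one analyte across laboratories: the median as
    the most probable value and the F-pseudosigma (interquartile range / 1.349)
    as a nonparametric estimate of spread."""
    x = np.asarray(reported, dtype=float)
    mpv = np.median(x)
    q1, q3 = np.percentile(x, [25, 75])
    return mpv, (q3 - q1) / 1.349

# Hypothetical reported concentrations (mg/L) for one analyte
print(most_probable_value([4.1, 4.3, 4.2, 4.0, 4.6, 3.9, 4.2, 5.1]))
```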
Long, H. Keith; Farrar, Jerry W.
1994-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for five standard reference samples--T-129 (trace constituents), M-130 (major constituents), N-42 (nutrients), P-22 (low ionic strength), Hg-18(mercury),--that were distributed in April 1994 to 157 laboratories registered in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 133 of the laboratories were evaluated with respect to: overall laboratory performance and relative laboratory performance for each analyte in the five reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the five standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Long, H.K.; Farrar, J.W.
1993-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for seven standard reference samples--T-123 (trace constituents), T-125 (trace constituents), M-126 (major constituents), N-38 (nutrients), N-39 (Nutrients), P-20 (precipitation-low ionic strength), and Hg-16 (mercury)--that were distributed in April 1993 to 175 laboratories registered in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 131 of the laboratories were evaluated with respect to: overall laboratory performance and relative laboratory performance for each analyte in the 7 reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the seven standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Farrar, Jerry W.; Chleboun, Kimberly M.
1999-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for 8 standard reference samples -- T-157 (trace constituents), M-150 (major constituents), N-61 (nutrient constituents), N-62 (nutrient constituents), P-32 (low ionic strength constituents), GWT-5 (ground-water trace constituents), GWM-4 (ground-water major constituents), and Hg-28 (mercury) -- that were distributed in March 1999 to 120 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 111 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the eight reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the 8 standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Kundeti, Vamsi; Rajasekaran, Sanguthevar
2012-06-01
Efficient tile sets for self-assembling rectilinear shapes are of critical importance in algorithmic self assembly. A lower bound on the tile complexity of any deterministic self assembly system for an n × n square is [Formula: see text] (inferred from the Kolmogorov complexity). Deterministic self assembly systems with an optimal tile complexity have been designed for squares and related shapes in the past. However, designing [Formula: see text] unique tiles specific to a shape is still an intensive task in the laboratory. On the other hand, copies of a tile can be made rapidly using PCR (polymerase chain reaction) experiments. This led to the study of self assembly on tile concentration programming models. We present two major results in this paper on the concentration programming model. First, we show how to self assemble rectangles with a fixed aspect ratio (α:β), with high probability, using Θ(α + β) tiles. This result is much stronger than the existing results by Kao et al. (Randomized self-assembly for approximate shapes, LNCS, vol 5125. Springer, Heidelberg, 2008) and Doty (Randomized self-assembly for exact shapes. In: proceedings of the 50th annual IEEE symposium on foundations of computer science (FOCS), IEEE, Atlanta. pp 85-94, 2009), which can only self-assemble squares and rely on tiles which perform binary arithmetic. On the other hand, our result is based on a technique called staircase sampling. This technique eliminates the need for sub-tiles which perform binary arithmetic, reduces the constant in the asymptotic bound, and eliminates the need for approximate frames (Kao et al. Randomized self-assembly for approximate shapes, LNCS, vol 5125. Springer, Heidelberg, 2008). Our second result applies staircase sampling on the equimolar concentration programming model (The tile complexity of linear assemblies. In: proceedings of the 36th international colloquium automata, languages and programming: Part I on ICALP '09, Springer-Verlag, pp 235-253, 2009), to self assemble rectangles (of fixed aspect ratio) with high probability. The tile complexity of our algorithm is Θ(log(n)) and is optimal on the probabilistic tile assembly model (PTAM), with n being an upper bound on the dimensions of a rectangle.
Adams, Vanessa M.; Pressey, Robert L.; Stoeckl, Natalie
2014-01-01
The need to integrate social and economic factors into conservation planning has become a focus of academic discussions and has important practical implications for the implementation of conservation areas, both private and public. We conducted a survey in the Daly Catchment, Northern Territory, to inform the design and implementation of a stewardship payment program. We used a choice model to estimate the likely level of participation in two legal arrangements - conservation covenants and management agreements - based on payment level and proportion of properties required to be managed. We then spatially predicted landholders’ probability of participating at the resolution of individual properties and incorporated these predictions into conservation planning software to examine the potential for the stewardship program to meet conservation objectives. We found that the properties that were least costly, per unit area, to manage were also the least likely to participate. This highlights a tension between planning for a cost-effective program and planning for a program that targets properties with the highest probability of participation. PMID:24892520
Sheets, Erin S.; Craighead, Linda Wilcoxon; Brosse, Alisha L.; Hauser, Monika; Madsen, Joshua W.; Craighead, W. Edward
2012-01-01
Background: Among the most serious sequelae to an initial episode of Major Depressive Disorder (MDD) during adolescence is the significant increase in the probability of recurrence. This study reports on an integrated CBT/IPT program, provided in a group format, that was developed to decrease the rate of MDD recurrence in emerging adults. Methods: Participants were 89 young adults who were not depressed at study entry but had experienced MDD during adolescence. Participants were assigned to a CBT/IPT prevention program or to an assessment only control condition and were followed through the first 2 years of college. Results: Risk for MDD recurrence was reduced more than 50% for the prevention program participants compared to assessment only controls. The intervention also conferred beneficial effects on academic performance for those students who completed the majority of the group sessions. Limitations: The study included a self-selected sample of emerging adults who were aware of their history of depression. Due to the small sample size, it will be important to evaluate similar interventions in adequately-powered trials to determine if this is a replicable finding. Conclusions: With 51% of the assessment only participants experiencing an MDD recurrence during the first 2 years of college, these findings support the need for programs designed to prevent MDD recurrence in young adults. The current program, based on IPT and CBT principles, appears to reduce the rate of MDD recurrence among previously depressed emerging adults. PMID:23021821
Identification of key ancestors of modern germplasm in a breeding program of maize.
Technow, F; Schrag, T A; Schipprack, W; Melchinger, A E
2014-12-01
Probabilities of gene origin computed from the genomic kinships matrix can accurately identify key ancestors of modern germplasms. Identifying the key ancestors of modern plant breeding populations can provide valuable insights into the history of a breeding program and provide reference genomes for next generation whole genome sequencing. In an animal breeding context, a method was developed that employs probabilities of gene origin, computed from the pedigree-based additive kinship matrix, for identifying key ancestors. Because reliable and complete pedigree information is often not available in plant breeding, we replaced the additive kinship matrix with the genomic kinship matrix. As a proof-of-concept, we applied this approach to simulated data sets with known ancestries. The relative contribution of the ancestral lines to later generations could be determined with high accuracy, with and without selection. Our method was subsequently used for identifying the key ancestors of the modern Dent germplasm of the public maize breeding program of the University of Hohenheim. We found that the modern germplasm can be traced back to six or seven key ancestors, with one or two of them having a disproportionately large contribution. These results largely corroborated conjectures based on early records of the breeding program. We conclude that probabilities of gene origin computed from the genomic kinships matrix can be used for identifying key ancestors in breeding programs and estimating the proportion of genes contributed by them.
Nichols, J.D.; Sauer, J.R.; Hines, J.E.; Boulinier, T.; Pollock, K.H.; Therres, Glenn D.
2001-01-01
Although many ecological monitoring programs are now in place, the use of resulting data to draw inferences about changes in biodiversity is problematic. The difficulty arises because of the inability to count all animals present in any sampled area. This inability results not only in underestimation of species richness but also in potentially misleading comparisons of species richness over time and space. We recommend the use of probabilistic estimators for estimating species richness and related parameters (e.g., rate of change in species richness, local extinction probability, local turnover, local colonization) when animal detection probabilities are <1. We illustrate these methods using data from the North American Breeding Bird Survey obtained along survey routes in Maryland. We also introduce software to implement these estimation methods.
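As a concrete example of a probabilistic richness estimator for detection probabilities below 1, the first-order jackknife estimator is sketched below from a species-by-occasion detection matrix; it is one simple member of this family and not necessarily the estimator implemented in the authors' software.

```python
import numpy as np

def jackknife_richness(detections):
    """First-order jackknife estimate of species richness from a 0/1
    species-by-occasion detection matrix: S_obs + Q1 * (K - 1) / K, where Q1 is
    the number of species detected on exactly one occasion and K the number of
    occasions."""
    X = np.asarray(detections)
    occasions = X.shape[1]
    per_species = X.sum(axis=1)
    s_obs = np.count_nonzero(per_species)
    q1 = np.count_nonzero(per_species == 1)
    return s_obs + q1 * (occasions - 1) / occasions

# 5 species surveyed on 4 occasions; two species were detected only once
hist = np.array([[1, 1, 0, 1],
                 [0, 1, 0, 0],
                 [1, 0, 1, 1],
                 [0, 0, 1, 0],
                 [1, 1, 1, 1]])
print(jackknife_richness(hist))   # 5 + 2 * 3/4 = 6.5
```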
Sampling design for long-term regional trends in marine rocky intertidal communities
Irvine, Gail V.; Shelley, Alice
2013-01-01
Probability-based designs reduce bias and allow inference of results to the pool of sites from which they were chosen. We developed and tested probability-based designs for monitoring marine rocky intertidal assemblages at Glacier Bay National Park and Preserve (GLBA), Alaska. A multilevel design was used that varied in scale and inference. The levels included aerial surveys, extensive sampling of 25 sites, and more intensive sampling of 6 sites. Aerial surveys of a subset of intertidal habitat indicated that the original target habitat of bedrock-dominated sites with slope ≤30° was rare. This unexpected finding illustrated one value of probability-based surveys and led to a shift in the target habitat type to include steeper, more mixed rocky habitat. Subsequently, we evaluated the statistical power of different sampling methods and sampling strategies to detect changes in the abundances of the predominant sessile intertidal taxa: barnacles Balanomorpha, the mussel Mytilus trossulus, and the rockweed Fucus distichus subsp. evanescens. There was greatest power to detect trends in Mytilus and lesser power for barnacles and Fucus. Because of its greater power, the extensive, coarse-grained sampling scheme was adopted in subsequent years over the intensive, fine-grained scheme. The sampling attributes that had the largest effects on power included sampling of “vertical” line transects (vs. horizontal line transects or quadrats) and increasing the number of sites. We also evaluated the power of several management-set parameters. Given equal sampling effort, sampling more sites fewer times had greater power. The information gained through intertidal monitoring is likely to be useful in assessing changes due to climate, including ocean acidification; invasive species; trampling effects; and oil spills.
Meador, M.R.; McIntyre, J.P.; Pollock, K.H.
2003-01-01
Two-pass backpack electrofishing data collected as part of the U.S. Geological Survey's National Water-Quality Assessment Program were analyzed to assess the efficacy of single-pass backpack electrofishing. A two-capture removal model was used to estimate, within 10 river basins across the United States, proportional fish species richness from one-pass electrofishing and probabilities of detection for individual fish species. Mean estimated species richness from first-pass sampling (ps1) ranged from 80.7% to 100% of estimated total species richness for each river basin, based on at least seven samples per basin. However, ps1 values for individual sites ranged from 40% to 100% of estimated total species richness. Additional species unique to the second pass were collected in 50.3% of the samples. Of these, cyprinids and centrarchids were collected most frequently. Proportional fish species richness estimated for the first pass increased significantly with decreasing stream width for 1 of the 10 river basins. When used to calculate probabilities of detection of individual fish species, the removal model failed 48% of the time because the number of individuals of a species was greater in the second pass than in the first pass. Single-pass backpack electrofishing data alone may make it difficult to determine whether characterized fish community structure data are real or spurious. The two-pass removal model can be used to assess the effectiveness of sampling species richness with a single electrofishing pass. However, the two-pass removal model may have limited utility to determine probabilities of detection of individual species and, thus, limit the ability to assess the effectiveness of single-pass sampling to characterize species relative abundances. Multiple-pass (at least three passes) backpack electrofishing at a large number of sites may not be cost-effective as part of a standardized sampling protocol for large-geographic-scale studies. However, multiple-pass electrofishing at some sites may be necessary to better evaluate the adequacy of single-pass electrofishing and to help make meaningful interpretations of fish community structure.
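The two-capture removal model referred to above has simple closed-form estimators; a sketch follows, including the failure condition noted in the abstract (a second-pass catch at least as large as the first).

```python
def two_pass_removal(c1, c2):
    """Two-pass removal estimates of capture probability and abundance
    (Seber-Le Cren form): p = 1 - C2/C1 and N = C1**2 / (C1 - C2).
    Undefined whenever the second-pass catch equals or exceeds the first."""
    if c2 >= c1:
        raise ValueError("removal estimator undefined when C2 >= C1")
    p_hat = 1 - c2 / c1
    n_hat = c1 ** 2 / (c1 - c2)
    return p_hat, n_hat

print(two_pass_removal(60, 20))   # p ≈ 0.67, N = 90
```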
Analyses of flood-flow frequency for selected gaging stations in South Dakota
Benson, R.D.; Hoffman, E.B.; Wipf, V.J.
1985-01-01
Analyses of flood flow frequency were made for 111 continuous-record gaging stations in South Dakota with 10 or more years of record. The analyses were developed using the log-Pearson Type III procedure recommended by the U.S. Water Resources Council. The procedure characterizes flood occurrence at a single site as a sequence of annual peak flows. The magnitudes of the annual peak flows are assumed to be independent random variables following a log-Pearson Type III probability distribution, which defines the probability that any single annual peak flow will exceed a specified discharge. By considering only annual peak flows, the flood-frequency analysis becomes the estimation of the log-Pearson annual-probability curve using the record of annual peak flows at the site. The recorded data are divided into two classes: systematic and historic. The systematic record includes all annual peak flows determined in the process of conducting a systematic gaging program at a site. In this program, the annual peak flow is determined for each and every year of the program. The systematic record is intended to constitute an unbiased and representative sample of the population of all possible annual peak flows at the site. In contrast to the systematic record, the historic record consists of annual peak flows that would not have been determined except for evidence indicating their unusual magnitude. Flood information acquired from historical sources almost invariably refers to floods of noteworthy, and hence extraordinary, size. Although historic records form a biased and unrepresentative sample, they can be used to supplement the systematic record. (Author's abstract)
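In outline, a log-Pearson Type III fit to a systematic record takes logarithms of the annual peaks, computes their mean, standard deviation, and skew, and reads quantiles from the fitted distribution. The sketch below follows that outline with method-of-moments estimates and a hypothetical peak-flow record; it omits the historical-record weighting and regional skew adjustments used in practice.

```python
import numpy as np
from scipy import stats

def lp3_flood_quantile(annual_peaks, exceed_prob=0.01):
    """Fit a log-Pearson Type III distribution to annual peak flows by the
    method of moments on log10 discharges and return the flow with the given
    annual exceedance probability (station skew only, no adjustments)."""
    logq = np.log10(np.asarray(annual_peaks, dtype=float))
    mean, std = logq.mean(), logq.std(ddof=1)
    skew = stats.skew(logq, bias=False)
    log_quantile = stats.pearson3.ppf(1 - exceed_prob, skew, loc=mean, scale=std)
    return 10 ** log_quantile

# Hypothetical record of annual peak flows (cubic feet per second)
peaks = [1200, 950, 3100, 780, 1600, 2200, 4100, 1300, 870, 2600,
         1900, 1100, 5200, 1450, 990, 3300, 760, 1750, 2900, 1250]
print(lp3_flood_quantile(peaks))   # estimated 1%-chance (100-year) peak flow
```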
NASA Astrophysics Data System (ADS)
Walton, A. W.; Walker, J. R.
2015-12-01
Project Hotspot's 1821m coring operation at Mountain Home Air Force Base, Idaho (MHC), sought to examine interaction of hotspot magmas with continental crust and evaluate geothermal resources. Subsurface temperature increased at a gradient of 76˚/km. Alteration was uniform and not intense over the upper part of the core and at the bottom, but differed markedly in an anomalous zone (AZ) from 1700 to 1800m. The MHC core contains diatomite, basalt lava and minor hyaloclastite. Olivine (Ol) in lavas is more-or-less altered to iddingsite. Plagioclase (Plag) has altered to smectite along cleavage planes and fractures except in the AZ, where it is intensely altered to corrensite. Clinopyroxene (CPX, pinkish in thin section) is little altered, as are apatite and opaque minerals (probably ilmenite with magnetite or pyrite in different samples). Interstitial material is converted to smectite or, in the AZ, to corrensite. Phyllosilicate lines vesicles, and calcite, zeolite and phyllosilicate fill them. Pore-lining phillipsite is common shallow in the core, with vesicle-filling analcime and heulandite at greater depth. A fibrous zeolite, probably stilbite, is also present. Hyaloclasts are altered to concentrically layered masses of smectite. MHC hyaloclastites do not display the microbial traces and palagonite ("gel-palagonite") alteration common in Hawaii Scientific Drilling Project #2 (HSDP) samples. HSDP samples do contain pore-lining phillipsite, but pore fillings are chabazite. Calcite is absent in HSDP hyaloclastites. Neither Ol nor Plag were altered in HSDP hyaloclastites. HSPD glasses are less silicic and Ti-rich than MHC lavas, containing Ol rather than CPX as a dominant mafic. However the differences in alteration of hyaloclastites probably reflect either the fact that the HSDP core was collected at temperatures equivalent to those at the top of the MHC-2 core or HSDP samples were from beds that were in modified marine pore water, rather than continental waters.
Long, H. Keith; Farrar, Jerry W.
1995-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for 7 standard reference samples--T-131 (trace constituents), T-133 (trace constituents), M-132 (major constituents), N-43 (nutrients), N-44 (nutrients), P-23 (low ionic strength), and Hg-19 (mercury). The samples were distributed in October 1994 to 131 laboratories registered in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 121 of the laboratories were evaluated with respect to: overall laboratory performance and relative laboratory performance for each analyte in the seven reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the seven standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
NASA Astrophysics Data System (ADS)
Khodabakhshi, M.; Jafarpour, B.
2013-12-01
Characterization of complex geologic patterns that create preferential flow paths in certain reservoir systems requires higher-order geostatistical modeling techniques. Multipoint statistics (MPS) provides a flexible grid-based approach for simulating such complex geologic patterns from a conceptual prior model known as a training image (TI). In this approach, a stationary TI that encodes the higher-order spatial statistics of the expected geologic patterns is used to represent the shape and connectivity of the underlying lithofacies. While MPS is quite powerful for describing complex geologic facies connectivity, the nonlinear and complex relation between the flow data and facies distribution makes flow data conditioning quite challenging. We propose an adaptive technique for conditioning facies simulation from a prior TI to nonlinear flow data. Non-adaptive strategies for conditioning facies simulation to flow data can involves many forward flow model solutions that can be computationally very demanding. To improve the conditioning efficiency, we develop an adaptive sampling approach through a data feedback mechanism based on the sampling history. In this approach, after a short period of sampling burn-in time where unconditional samples are generated and passed through an acceptance/rejection test, an ensemble of accepted samples is identified and used to generate a facies probability map. This facies probability map contains the common features of the accepted samples and provides conditioning information about facies occurrence in each grid block, which is used to guide the conditional facies simulation process. As the sampling progresses, the initial probability map is updated according to the collective information about the facies distribution in the chain of accepted samples to increase the acceptance rate and efficiency of the conditioning. This conditioning process can be viewed as an optimization approach where each new sample is proposed based on the sampling history to improve the data mismatch objective function. We extend the application of this adaptive conditioning approach to the case where multiple training images are proposed to describe the geologic scenario in a given formation. We discuss the advantages and limitations of the proposed adaptive conditioning scheme and use numerical experiments from fluvial channel formations to demonstrate its applicability and performance compared to non-adaptive conditioning techniques.
Estimation of density of mongooses with capture-recapture and distance sampling
Corn, J.L.; Conroy, M.J.
1998-01-01
We captured mongooses (Herpestes javanicus) in live traps arranged in trapping webs in Antigua, West Indies, and used capture-recapture and distance sampling to estimate density. Distance estimation and program DISTANCE were used to provide estimates of density from the trapping-web data. Mean density based on trapping webs was 9.5 mongooses/ha (range, 5.9-10.2/ha); estimates had coefficients of variation ranging from 29.82-31.58% (X̄ = 30.46%). Mark-recapture models were used to estimate abundance, which was converted to density using estimates of effective trap area. Tests of model assumptions provided by CAPTURE indicated pronounced heterogeneity in capture probabilities and some indication of behavioral response and variation over time. Mean estimated density was 1.80 mongooses/ha (range, 1.37-2.15/ha) with estimated coefficients of variation of 4.68-11.92% (X̄ = 7.46%). Estimates of density based on mark-recapture data depended heavily on assumptions about animal home ranges; variances of densities also may be underestimated, leading to unrealistically narrow confidence intervals. Estimates based on trap webs require fewer assumptions, and estimated variances may be a more realistic representation of sampling variation. Because trap webs are established easily and provide adequate data for estimation in a few sample occasions, the method should be efficient and reliable for estimating densities of mongooses.
Pritt, Jeremy J.; DuFour, Mark R.; Mayer, Christine M.; Roseman, Edward F.; DeBruyne, Robin L.
2014-01-01
Larval fish are frequently sampled in coastal tributaries to determine factors affecting recruitment, evaluate spawning success, and estimate production from spawning habitats. Imperfect detection of larvae is common, because larval fish are small and unevenly distributed in space and time, and coastal tributaries are often large and heterogeneous. We estimated detection probabilities of larval fish from several taxa in the Maumee and Detroit rivers, the two largest tributaries of Lake Erie. We then demonstrated how accounting for imperfect detection influenced (1) the probability of observing taxa as present relative to sampling effort and (2) abundance indices for larval fish of two Detroit River species. We found that detection probabilities ranged from 0.09 to 0.91 but were always less than 1.0, indicating that imperfect detection is common among taxa and between systems. In general, taxa with high fecundities, small larval length at hatching, and no nesting behaviors had the highest detection probabilities. Also, detection probabilities were higher in the Maumee River than in the Detroit River. Accounting for imperfect detection produced up to fourfold increases in abundance indices for Lake Whitefish Coregonus clupeaformis and Gizzard Shad Dorosoma cepedianum. The effect of accounting for imperfect detection in abundance indices was greatest during periods of low abundance for both species. Detection information can be used to determine the appropriate level of sampling effort for larval fishes and may improve management and conservation decisions based on larval fish data.
Rodhouse, Thomas J.; Ormsbee, Patricia C.; Irvine, Kathryn M.; Vierling, Lee A.; Szewczak, Joseph M.; Vierling, Kerri T.
2012-01-01
Despite its common status, M. lucifugus was only detected during ∼50% of the surveys in occupied sample units. The overall naïve estimate for the proportion of the study region occupied by the species was 0.69, but after accounting for imperfect detection, this increased to ∼0.90. Our models provide evidence of an association between NPP and forest cover and M. lucifugus distribution, with implications for the projected effects of accelerated climate change in the region, which include net aridification as snowpack and stream flows decline. Annual turnover, the probability that an occupied sample unit was a newly occupied one, was estimated to be low (∼0.04–0.14), resulting in a flat estimated trend with relatively high precision (SD = 0.04). We mapped the variation in predicted occurrence probabilities and corresponding prediction uncertainty along the productivity gradient. Our results provide a much needed baseline against which future anticipated declines in M. lucifugus occurrence can be measured. The dynamic distribution modeling approach has broad applicability to regional bat monitoring efforts now underway in several countries and we suggest ways to improve and expand our grid-based monitoring program to gain robust insights into bat population status and trend across large portions of North America.
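A bare-bones version of the occupancy machinery behind naïve-versus-adjusted comparisons like the one above is sketched below: a single-season model with constant occupancy (psi) and detection probability (p), fit by maximum likelihood to a sites-by-visits detection matrix. The detection histories are hypothetical, and the dynamic multi-season structure used in the study is not included.

```python
import numpy as np
from scipy.optimize import minimize

def fit_occupancy(det_hist):
    """Single-season occupancy model with constant psi and p.  For a site with
    d detections in k visits the likelihood is psi * p**d * (1-p)**(k-d) if
    d > 0, and psi * (1-p)**k + (1 - psi) if the species was never detected."""
    X = np.asarray(det_hist)
    k = X.shape[1]
    d = X.sum(axis=1)

    def negloglik(theta):
        psi, p = 1 / (1 + np.exp(-theta))          # logit-scale parameters
        site_lik = np.where(d > 0,
                            psi * p ** d * (1 - p) ** (k - d),
                            psi * (1 - p) ** k + (1 - psi))
        return -np.sum(np.log(site_lik))

    res = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
    return tuple(1 / (1 + np.exp(-res.x)))          # (psi_hat, p_hat)

# 6 sites, 4 visits each; the naïve occupancy estimate would be 4/6
hist = np.array([[1, 0, 1, 0], [0, 0, 0, 0], [0, 1, 0, 0],
                 [1, 1, 1, 0], [0, 0, 0, 0], [0, 0, 1, 1]])
psi_hat, p_hat = fit_occupancy(hist)
```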
NASA Astrophysics Data System (ADS)
Gürbüz, Ramazan
2010-09-01
The purpose of this study is to investigate and compare the effects of activity-based and traditional instructions on students' conceptual development of certain probability concepts. The study was conducted using a pretest-posttest control group design with 80 seventh graders. A developed 'Conceptual Development Test' comprising 12 open-ended questions was administered on both groups of students before and after the intervention. The data were analysed using analysis of covariance, with the pretest as covariate. The results revealed that activity-based instruction (ABI) outperformed the traditional counterpart in the development of probability concepts. Furthermore, ABI was found to contribute students' conceptual development of the concept of 'Probability of an Event' the most, whereas to the concept of 'Sample Space' the least. As a consequence, it can be deduced that the designed instructional process was effective in the instruction of probability concepts.
Reading Activities of American Adults.
ERIC Educational Resources Information Center
Sharon, Amiel T.
A reading activities survey as part of the Targeted Research and Development Reading Program was done by interviewing 3,504 adults, aged 16 years or older, selected by area probability sampling. Among the preliminary findings was that the most frequent type of reading is newspaper reading. Seven out of 10 people read or look at a newspaper during…
Duncan C. Lutes; Robert E. Keane; John F. Caratti; Carl H. Key; Nathan C. Benson
2006-01-01
This is probably the most critical phase of FIREMON sampling because this plot ID must be unique across all plots that will be entered in the FIREMON database. The plot identifier is made up of three parts: Registration Code, Project Code, and Plot Number.The FIREMON Analysis Tools program will allow summarization and comparison of plots only if...
Anomalous diameter distribution shifts estimated from FIA inventories through time
Francis A. Roesch; Paul C. Van Deusen
2010-01-01
In the past decade, the United States Department of Agriculture Forest Service's Forest Inventory and Analysis Program (FIA) has replaced regionally autonomous, periodic, state-wide forest inventories using various probability proportional to tree size sampling designs with a nationally consistent annual forest inventory design utilizing systematically spaced clusters...
Program SimAssem: software for simulating species assemblages and estimating species richness
Gordon C. Reese; Kenneth R. Wilson; Curtis H. Flather
2013-01-01
1. Species richness, the number of species in a defined area, is the most frequently used biodiversity measure. Despite its intuitive appeal and conceptual simplicity, species richness is often difficult to quantify, even in well surveyed areas, because of sampling limitations such as survey effort and species detection probability....
Perceived risk associated with ecstasy use: a latent class analysis approach
Martins, SS; Carlson, RG; Alexandre, PK; Falck, RS
2011-01-01
This study aims to define categories of perceived health problems among ecstasy users based on observed clustering of their perceptions of ecstasy-related health problems. Data from a community sample of ecstasy users (n=402) aged 18 to 30, in Ohio, was used in this study. Data was analyzed via Latent Class Analysis (LCA) and Regression. This study identified five different subgroups of ecstasy users based on their perceptions of health problems they associated with their ecstasy use. Almost one third of the sample (28.9%) belonged to a class with “low level of perceived problems” (Class 4). About one fourth (25.6%) of the sample (Class 2), had high probabilities of “perceiving problems on sexual-related items”, but generally low or moderate probabilities of perceiving problems in other areas. Roughly one-fifth of the sample (21.1%, Class 1) had moderate probabilities of perceiving ecstasy health-related problems in all areas. A small proportion of respondents (11.9%, Class 5) had high probabilities of reporting “perceived memory and cognitive problems, and of perceiving “ecstasy related-problems in all areas” (12.4%, Class 3). A large proportion of ecstasy users perceive either low or moderate risk associated with their ecstasy use. It is important to further investigate whether lower levels of risk perception are associated with persistence of ecstasy use. PMID:21296504
Hamilton, Craig S; Kruse, Regina; Sansoni, Linda; Barkhofen, Sonja; Silberhorn, Christine; Jex, Igor
2017-10-27
Boson sampling has emerged as a tool to explore the advantages of quantum over classical computers as it does not require universal control over the quantum system, which favors current photonic experimental platforms. Here, we introduce Gaussian Boson sampling, a classically hard-to-solve problem that uses squeezed states as a nonclassical resource. We relate the probability to measure specific photon patterns from a general Gaussian state in the Fock basis to a matrix function called the Hafnian, which answers the last remaining question of sampling from Gaussian states. Based on this result, we design Gaussian Boson sampling, a #P hard problem, using squeezed states. This demonstrates that Boson sampling from Gaussian states is possible, with significant advantages in the photon generation probability, compared to existing protocols.
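Since photon-pattern probabilities in Gaussian Boson sampling are expressed through the Hafnian, a direct (exponential-time) Hafnian evaluation for small matrices is sketched below as an illustration of the matrix function itself, not of an efficient sampling algorithm.

```python
import numpy as np

def hafnian(a):
    """Hafnian of a symmetric matrix of even dimension, computed by direct
    recursion over perfect matchings: pair index 0 with every other index and
    recurse on the remaining submatrix.  Exponential time, small matrices only."""
    a = np.asarray(a)
    n = a.shape[0]
    if n == 0:
        return 1.0
    total = 0.0
    for j in range(1, n):
        keep = [k for k in range(1, n) if k != j]
        total += a[0, j] * hafnian(a[np.ix_(keep, keep)])
    return total

# For a 2x2 matrix the Hafnian is simply the off-diagonal entry
print(hafnian(np.array([[0.0, 0.7], [0.7, 0.0]])))   # 0.7
```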
High throughput nonparametric probability density estimation.
Farmer, Jenny; Jacobs, Donald
2018-01-01
In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.
1984-10-01
contamination resulting from previous waste disposal practices at Hancock Field ... Recommend measures to mitigate adverse impacts at identified ... best well to use in judging water quality impacts caused by the disposal activities. Slug tests (Hvorslev, 1951) were performed at each of the four ... impact future samplings because this water will probably become mixed in the aquifer before the next sample round and if some remains near the well
NASA Astrophysics Data System (ADS)
Eilert, Tobias; Beckers, Maximilian; Drechsler, Florian; Michaelis, Jens
2017-10-01
The analysis tool and software package Fast-NPS can be used to analyse smFRET data to obtain quantitative structural information about macromolecules in their natural environment. In the algorithm a Bayesian model gives rise to a multivariate probability distribution describing the uncertainty of the structure determination. Since Fast-NPS aims to be an easy-to-use general-purpose analysis tool for a large variety of smFRET networks, we established an MCMC based sampling engine that approximates the target distribution and requires no parameter specification by the user at all. For an efficient local exploration we automatically adapt the multivariate proposal kernel according to the shape of the target distribution. In order to handle multimodality, the sampler is equipped with a parallel tempering scheme that is fully adaptive with respect to temperature spacing and number of chains. Since the molecular surrounding of a dye molecule affects its spatial mobility and thus the smFRET efficiency, we introduce dye models which can be selected for every dye molecule individually. These models allow the user to represent the smFRET network in great detail leading to an increased localisation precision. Finally, a tool to validate the chosen model combination is provided. Programme Files doi:http://dx.doi.org/10.17632/7ztzj63r68.1 Licencing provisions: Apache-2.0 Programming language: GUI in MATLAB (The MathWorks) and the core sampling engine in C++ Nature of problem: Sampling of highly diverse multivariate probability distributions in order to solve for macromolecular structures from smFRET data. Solution method: MCMC algorithm with fully adaptive proposal kernel and parallel tempering scheme.
O'Brien, Kathryn; Edwards, Adrian; Hood, Kerenza; Butler, Christopher C
2013-02-01
Urinary tract infection (UTI) in children may be associated with long-term complications that could be prevented by prompt treatment. To determine the prevalence of UTI in acutely ill children ≤ 5 years presenting in general practice and to explore patterns of presenting symptoms and urine sampling strategies. Prospective observational study with systematic urine sampling, in general practices in Wales, UK. In total, 1003 children were recruited from 13 general practices between March 2008 and July 2010. The prevalence of UTI was determined and multivariable analysis performed to determine the probability of UTI. Out of 597 (60.0%) children who provided urine samples within 2 days, the prevalence of UTI was 5.9% (95% confidence interval [CI] = 4.3% to 8.0%) overall, 7.3% in those < 3 years and 3.2% in 3-5 year olds. Neither a history of fever nor the absence of an alternative source of infection was associated with UTI (P = 0.64; P = 0.69, respectively). The probability of UTI in children aged ≥3 years without increased urinary frequency or dysuria was 2%. The probability of UTI was ≥5% in all other groups. Urine sampling based purely on GP suspicion would have missed 80% of UTIs, while a sampling strategy based on current guidelines would have missed 50%. Approximately 6% of acutely unwell children presenting to UK general practice met the criteria for a laboratory diagnosis of UTI. This higher than previously recognised prior probability of UTI warrants raised awareness of the condition and suggests clinicians should lower their threshold for urine sampling in young children. The absence of fever or presence of an alternative source of infection, as emphasised in current guidelines, may not rule out UTI in young children with adequate certainty.
Oyeflaten, Irene; Lie, Stein Atle; Ihlebæk, Camilla M; Eriksen, Hege R
2012-09-06
Return to work (RTW) after long-term sick leave can be a long-lasting process in which the individual may shift between work and receiving different social security benefits, as well as between part-time and full-time work. This is a challenge in the assessment of RTW outcomes after rehabilitation interventions. The aim of this study was to analyse the probability of RTW, and the probabilities of transitions between different benefits, during a 4-year follow-up after participation in a work-related rehabilitation program. The sample consisted of 584 patients (66% females), mean age 44 years (sd = 9.3). Mean duration on various types of sick leave benefits at entry to the rehabilitation program was 9.3 months (sd = 3.4). The patients had mental (47%), musculoskeletal (46%), or other diagnoses (7%). Official national register data over a 4-year follow-up period were analysed. Extended statistical tools for multistate models were used to calculate transition probabilities between the following eight states: working, partial sick leave, full-time sick leave, medical rehabilitation, vocational rehabilitation, and partial, permanent, and time-limited disability pension. During the follow-up there was an increased probability of working, a decreased probability of being on sick leave, and an increased probability of being on disability pension. The probability of RTW was not related to the work and benefit status at departure from the rehabilitation clinic. The patients had an average of 3.7 (range 0-18) transitions between work and the different benefits. The process of RTW or of receiving disability pension was complex and may take several years, with multiple transitions between work and different benefits. Access to reliable register data and the use of a multistate RTW model make it possible to describe the developmental nature and the different levels of the recovery and disability process.
2012-01-01
Background Return to work (RTW) after long-term sick leave can be a long-lasting process in which the individual may shift between work and receiving different social security benefits, as well as between part-time and full-time work. This is a challenge in the assessment of RTW outcomes after rehabilitation interventions. The aim of this study was to analyse the probability of RTW, and the probabilities of transitions between different benefits, during a 4-year follow-up after participation in a work-related rehabilitation program. Methods The sample consisted of 584 patients (66% females), mean age 44 years (sd = 9.3). Mean duration on various types of sick leave benefits at entry to the rehabilitation program was 9.3 months (sd = 3.4). The patients had mental (47%), musculoskeletal (46%), or other diagnoses (7%). Official national register data over a 4-year follow-up period were analysed. Extended statistical tools for multistate models were used to calculate transition probabilities between the following eight states: working, partial sick leave, full-time sick leave, medical rehabilitation, vocational rehabilitation, and partial, permanent, and time-limited disability pension. Results During the follow-up there was an increased probability of working, a decreased probability of being on sick leave, and an increased probability of being on disability pension. The probability of RTW was not related to the work and benefit status at departure from the rehabilitation clinic. The patients had an average of 3.7 (range 0–18) transitions between work and the different benefits. Conclusions The process of RTW or of receiving disability pension was complex and may take several years, with multiple transitions between work and different benefits. Access to reliable register data and the use of a multistate RTW model make it possible to describe the developmental nature and the different levels of the recovery and disability process. PMID:22954254
Aitken, C G
1999-07-01
It is thought that, in a consignment of discrete units, a certain proportion of the units contain illegal material. A sample of the consignment is to be inspected. Various methods for the determination of the sample size are compared. The consignment will be considered as a random sample from some super-population of units, a certain proportion of which contain drugs. For large consignments, a probability distribution, known as the beta distribution, for the proportion of the consignment which contains illegal material is obtained. This distribution is based on prior beliefs about the proportion. Under certain specific conditions the beta distribution gives the same numerical results as an approach based on the binomial distribution. The binomial distribution provides a probability for the number of units in a sample which contain illegal material, conditional on knowing the proportion of the consignment which contains illegal material. This is in contrast to the beta distribution which provides probabilities for the proportion of a consignment which contains illegal material, conditional on knowing the number of units in the sample which contain illegal material. The interpretation when the beta distribution is used is much more intuitively satisfactory. It is also much more flexible in its ability to cater for prior beliefs which may vary given the different circumstances of different crimes. For small consignments, a distribution, known as the beta-binomial distribution, for the number of units in the consignment which are found to contain illegal material, is obtained, based on prior beliefs about the number of units in the consignment which are thought to contain illegal material. As with the beta and binomial distributions for large samples, it is shown that, in certain specific conditions, the beta-binomial and hypergeometric distributions give the same numerical results. However, the beta-binomial distribution, as with the beta distribution, has a more intuitively satisfactory interpretation and greater flexibility. The beta and the beta-binomial distributions provide methods for the determination of the minimum sample size to be taken from a consignment in order to satisfy a certain criterion. The criterion requires the specification of a proportion and a probability.
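A minimal sketch of the beta-distribution sample-size criterion described above: assuming a Beta(a, b) prior on the proportion and that every inspected unit turns out to contain illegal material, the code finds the smallest sample size for which the posterior probability that the proportion exceeds a specified value reaches a specified probability. The prior and criterion values are illustrative, not those of any particular case.

    from scipy.stats import beta

    def min_sample_size(theta0=0.5, target_prob=0.95, a=1.0, b=1.0, n_max=1000):
        """Smallest number of inspected units n such that, if every inspected unit
        contains illegal material, the posterior probability that the consignment
        proportion exceeds theta0 is at least target_prob.

        Prior on the proportion: Beta(a, b); posterior after n positives in n
        inspected units: Beta(a + n, b)."""
        for n in range(1, n_max + 1):
            post_prob = 1.0 - beta.cdf(theta0, a + n, b)
            if post_prob >= target_prob:
                return n
        return None

    # e.g. uniform prior, require P(proportion > 0.5) >= 0.95
    print(min_sample_size())  # -> 4 with these illustrative settings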
Probability sampling in legal cases: Kansas cellphone users
NASA Astrophysics Data System (ADS)
Kadane, Joseph B.
2012-10-01
Probability sampling is a standard statistical technique. This article introduces the basic ideas of probability sampling, and shows in detail how probability sampling was used in a particular legal case.
Estimating the probability of arsenic occurrence in domestic wells in the United States
NASA Astrophysics Data System (ADS)
Ayotte, J.; Medalie, L.; Qi, S.; Backer, L. F.; Nolan, B. T.
2016-12-01
Approximately 43 million people (about 14 percent of the U.S. population) rely on privately owned domestic wells as their source of drinking water. Unlike public water systems, which are regulated by the Safe Drinking Water Act, there is no comprehensive national program to ensure that the water from domestic wells is routinely tested and that it is safe to drink. A study published in 2009 from the National Water-Quality Assessment Program of the U.S. Geological Survey assessed water-quality conditions from 2,100 domestic wells within 48 states and reported that more than one in five (23 percent) of the sampled wells contained one or more contaminants at a concentration greater than a human-health benchmark. In addition, there are many activities, such as resource extraction, climate change-induced drought, and changes in land use patterns, that could potentially affect the quality of the groundwater source for domestic wells. The Health Studies Branch (HSB) of the National Center for Environmental Health, Centers for Disease Control and Prevention, created a Clean Water for Health Program to help address domestic well concerns. The goals of this program are to identify emerging public health issues associated with using domestic wells for drinking water and to develop plans to address these issues. As part of this effort, HSB, in cooperation with the U.S. Geological Survey, has created probability models to estimate the probability of arsenic occurring at various concentrations in domestic wells in the U.S. We will present preliminary results of the project, including estimates of the population supplied by domestic wells that is likely to have arsenic greater than 10 micrograms per liter; nationwide, we estimate this to be just over 2 million people. We will describe logistic regression model results showing the probability that arsenic in domestic wells exceeds 10 micrograms per liter (the Maximum Contaminant Level for public supply wells), based on data for arsenic concentrations in domestic wells across the U.S., as well as the use of county-level data on domestic-well use to estimate the affected population. Similar work has been done by public health professionals on a state and regional basis.
Cowell, Robert G
2018-05-04
Current models for single source and mixture samples, and probabilistic genotyping software based on them used for analysing STR electropherogram data, assume simple probability distributions, such as the gamma distribution, to model the allelic peak height variability given the initial amount of DNA prior to PCR amplification. Here we illustrate how amplicon number distributions, for a model of the process of sample DNA collection and PCR amplification, may be efficiently computed by evaluating probability generating functions using discrete Fourier transforms. Copyright © 2018 Elsevier B.V. All rights reserved.
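The following Python sketch illustrates the core numerical idea of recovering a discrete distribution from its probability generating function by evaluating the PGF at roots of unity and inverting with a discrete Fourier transform; a Poisson PGF stands in here for the amplicon-number model, which is not reproduced.

    import numpy as np

    def pmf_from_pgf(pgf, n_points=64):
        """Recover P(X = k) for k = 0..n_points-1 from a probability generating
        function G(s) = E[s^X] by evaluating G at the n_points-th roots of unity
        and applying an inverse discrete Fourier transform."""
        idx = np.arange(n_points)
        roots = np.exp(-2j * np.pi * idx / n_points)   # s_m = exp(-2*pi*i*m/N)
        values = pgf(roots)                            # G(s_m) = sum_k p_k s_m^k, i.e. a DFT of the pmf
        return np.fft.ifft(values).real                # invert the DFT to get p_k

    # Example: Poisson(3) has PGF G(s) = exp(3*(s - 1))
    pmf = pmf_from_pgf(lambda s: np.exp(3.0 * (s - 1.0)))
    print(pmf[:5])  # close to the Poisson(3) pmf, up to aliasing of the truncated tail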
Probability-based hazard avoidance guidance for planetary landing
NASA Astrophysics Data System (ADS)
Yuan, Xu; Yu, Zhengshi; Cui, Pingyuan; Xu, Rui; Zhu, Shengying; Cao, Menglong; Luan, Enjie
2018-03-01
Future landing and sample return missions on planets and small bodies will seek landing sites with high scientific value, which may be located in hazardous terrains. Autonomous landing in such hazardous terrains and highly uncertain planetary environments is particularly challenging. Onboard hazard avoidance ability is indispensable, and the algorithms must be robust to uncertainties. In this paper, a novel probability-based hazard avoidance guidance method is developed for landing in hazardous terrains on planets or small bodies. By regarding the lander state as probabilistic, the proposed guidance algorithm exploits information on the uncertainty of lander position and calculates the probability of collision with each hazard. The collision probability serves as an accurate safety index, which quantifies the impact of uncertainties on the lander safety. Based on the collision probability evaluation, the state uncertainty of the lander is explicitly taken into account in the derivation of the hazard avoidance guidance law, which contributes to enhancing the robustness to the uncertain dynamics of planetary landing. The proposed probability-based method derives fully analytic expressions and does not require off-line trajectory generation. Therefore, it is appropriate for real-time implementation. The performance of the probability-based guidance law is investigated via a set of simulations, and the effectiveness and robustness under uncertainties are demonstrated.
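As a rough illustration of turning position uncertainty into a collision probability with a hazard, the sketch below estimates, by Monte Carlo, the probability that a Gaussian-distributed lander position falls inside a circular hazard. The paper derives analytic expressions suitable for onboard use, which this simple sampling stand-in does not reproduce, and the numbers are hypothetical.

    import numpy as np

    def collision_probability(mean_xy, cov_xy, hazard_center, hazard_radius,
                              n_draws=100_000, seed=0):
        """Monte Carlo estimate of P(lander position falls inside a circular hazard),
        treating the horizontal lander position as a 2-D Gaussian random variable."""
        rng = np.random.default_rng(seed)
        samples = rng.multivariate_normal(mean_xy, cov_xy, size=n_draws)
        dist = np.linalg.norm(samples - np.asarray(hazard_center), axis=1)
        return np.mean(dist <= hazard_radius)

    # Hypothetical numbers: 5 m navigation uncertainty, hazard of radius 3 m, 8 m away
    p = collision_probability([0.0, 0.0], np.diag([25.0, 25.0]), [8.0, 0.0], 3.0)
    print(f"collision probability ~ {p:.3f}")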
NASA Technical Reports Server (NTRS)
Johnson, J. R. (Principal Investigator)
1974-01-01
The author has identified the following significant results. The broad scale vegetation classification was developed for a 3,200 sq mile area in southeastern Arizona. The 31 vegetation types were derived from association tables which contained information taken at about 500 ground sites. The classification provided an information base that was suitable for use with small scale photography. A procedure was developed and tested for objectively comparing photo images. The procedure consisted of two parts, image groupability testing and image complexity testing. The Apollo and ERTS photos were compared for relative suitability as first stage stratification bases in two stage proportional probability sampling. High altitude photography was used in common at the second stage.
Design-based Sample and Probability Law-Assumed Sample: Their Role in Scientific Investigation.
ERIC Educational Resources Information Center
Ojeda, Mario Miguel; Sahai, Hardeo
2002-01-01
Discusses some key statistical concepts in probabilistic and non-probabilistic sampling to provide an overview for understanding the inference process. Suggests a statistical model constituting the basis of statistical inference and provides a brief review of the finite population descriptive inference and a quota sampling inferential theory.…
Review of sampling hard-to-reach and hidden populations for HIV surveillance.
Magnani, Robert; Sabin, Keith; Saidel, Tobi; Heckathorn, Douglas
2005-05-01
Adequate surveillance of hard-to-reach and 'hidden' subpopulations is crucial to containing the HIV epidemic in low prevalence settings and in slowing the rate of transmission in high prevalence settings. For a variety of reasons, however, conventional facility and survey-based surveillance data collection strategies are ineffective for a number of key subpopulations, particularly those whose behaviors are illegal or illicit. This paper critically reviews alternative sampling strategies for undertaking behavioral or biological surveillance surveys of such groups. Non-probability sampling approaches such as facility-based sentinel surveillance and snowball sampling are the simplest to carry out, but are subject to a high risk of sampling/selection bias. Most of the probability sampling methods considered are limited in that they are adequate only under certain circumstances and for some groups. One relatively new method, respondent-driven sampling, an adaptation of chain-referral sampling, appears to be the most promising for general applications. However, as its applicability to HIV surveillance in resource-poor settings has yet to be established, further field trials are needed before a firm conclusion can be reached.
Tools for Basic Statistical Analysis
NASA Technical Reports Server (NTRS)
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will curvefit data to the linear equation y=f(x) and will do an ANOVA to check its significance.
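The "Normal Distribution Estimates" calculation described above has a one-line equivalent in Python, shown here only as a point of reference for the spreadsheet functionality; the mean, standard deviation, and probability are placeholders.

    from scipy.stats import norm

    mean, sd = 120.0, 15.0                    # hypothetical sample mean and standard deviation
    p = 0.95                                  # cumulative probability of interest
    value = norm.ppf(p, loc=mean, scale=sd)   # value with 95% of the distribution below it
    print(round(value, 2))                    # ~144.67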
Eaton, Mitchell J.; Hughes, Phillip T.; Hines, James E.; Nichols, James D.
2014-01-01
Metapopulation ecology is a field that is richer in theory than in empirical results. Many existing empirical studies use an incidence function approach based on spatial patterns and key assumptions about extinction and colonization rates. Here we recast these assumptions as hypotheses to be tested using 18 years of historic detection survey data combined with four years of data from a new monitoring program for the Lower Keys marsh rabbit. We developed a new model to estimate probabilities of local extinction and colonization in the presence of nondetection, while accounting for estimated occupancy levels of neighboring patches. We used model selection to identify important drivers of population turnover and estimate the effective neighborhood size for this system. Several key relationships related to patch size and isolation that are often assumed in metapopulation models were supported: patch size was negatively related to the probability of extinction and positively related to colonization, and estimated occupancy of neighboring patches was positively related to colonization and negatively related to extinction probabilities. This latter relationship suggested the existence of rescue effects. In our study system, we inferred that coastal patches experienced higher probabilities of extinction and colonization than interior patches. Interior patches exhibited higher occupancy probabilities and may serve as refugia, permitting colonization of coastal patches following disturbances such as hurricanes and storm surges. Our modeling approach should be useful for incorporating neighbor occupancy into future metapopulation analyses and in dealing with other historic occupancy surveys that may not include the recommended levels of sampling replication.
Richard, David; Speck, Thomas
2018-03-28
We investigate the kinetics and the free energy landscape of the crystallization of hard spheres from a supersaturated metastable liquid through direct simulations and forward flux sampling. In this first paper, we describe and test two different ways to reconstruct the free energy barriers from the sampled steady-state probability distribution of cluster sizes without sampling the equilibrium distribution. The first method is based on mean first passage times, and the second method is based on splitting probabilities. We verify both methods for a single particle moving in a double-well potential. For the nucleation of hard spheres, these methods allow us to probe a wide range of supersaturations and to reconstruct the kinetics and the free energy landscape from the same simulation. Results are consistent with the scaling predicted by classical nucleation theory, although a quantitative fit requires a rather large effective interfacial tension.
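For the single-particle double-well test case mentioned above, the splitting probability has a standard closed form for one-dimensional overdamped dynamics, which the sketch below evaluates numerically. This is the textbook committor expression, not the authors' forward-flux-sampling machinery, and the potential and temperature are illustrative.

    import numpy as np

    def splitting_probability(U, a, b, x, beta=1.0, n=2000):
        """Probability that an overdamped Brownian particle started at x reaches b
        before a, for a 1-D potential U(y):
            q(x) = int_a^x exp(beta*U) dy / int_a^b exp(beta*U) dy."""
        grid = np.linspace(a, b, n)
        w = np.exp(beta * U(grid))
        cum = np.cumsum((w[1:] + w[:-1]) * 0.5 * np.diff(grid))   # trapezoidal integral from a
        cum = np.concatenate(([0.0], cum))
        return np.interp(x, grid, cum) / cum[-1]

    # Double-well potential with minima near y = -1 and y = +1
    U = lambda y: (y**2 - 1.0)**2
    print(splitting_probability(U, a=-1.0, b=1.0, x=0.0))   # ~0.5 by symmetry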
NASA Astrophysics Data System (ADS)
Richard, David; Speck, Thomas
2018-03-01
We investigate the kinetics and the free energy landscape of the crystallization of hard spheres from a supersaturated metastable liquid through direct simulations and forward flux sampling. In this first paper, we describe and test two different ways to reconstruct the free energy barriers from the sampled steady-state probability distribution of cluster sizes without sampling the equilibrium distribution. The first method is based on mean first passage times, and the second method is based on splitting probabilities. We verify both methods for a single particle moving in a double-well potential. For the nucleation of hard spheres, these methods allow us to probe a wide range of supersaturations and to reconstruct the kinetics and the free energy landscape from the same simulation. Results are consistent with the scaling predicted by classical nucleation theory, although a quantitative fit requires a rather large effective interfacial tension.
Ye, Qing; Pan, Hao; Liu, Changhua
2015-01-01
This research proposes a novel framework for simultaneous failure diagnosis of a final drive, comprising feature extraction, training of paired diagnostic models, generation of a decision threshold, and recognition of simultaneous failure modes. In the feature extraction module, wavelet packet transform and fuzzy entropy are adopted to reduce noise interference and extract representative features of each failure mode. Single-failure samples are used to construct probability classifiers based on paired sparse Bayesian extreme learning machines, which are trained only on single failure modes and inherit the high generalization and sparsity of the sparse Bayesian learning approach. To generate the optimal decision threshold that converts the probability outputs of the classifiers into final simultaneous failure modes, this research uses samples containing both single and simultaneous failure modes together with a grid search method, which is superior to traditional techniques for global optimization. Compared with other frequently used diagnostic approaches based on support vector machines and probabilistic neural networks, experimental results based on the F1-measure verify that the diagnostic accuracy and efficiency of the proposed framework, which are crucial for simultaneous failure diagnosis, are superior to those of existing approaches. PMID:25722717
Zhou, Hanzhi; Elliott, Michael R; Raghunathan, Trivellore E
2016-06-01
Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in "Delta-V," a key crash severity measure.
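The weighted, finite-population Bayesian bootstrap can be caricatured in a few lines: Dirichlet draws supply Bayesian-bootstrap uncertainty in the resampling probabilities, and case weights tilt those probabilities toward under-sampled units before a synthetic population is drawn. The sketch below is a deliberately simplified single-stage version under that assumption, not the two-stage procedure developed in the article, and all data and weights are hypothetical.

    import numpy as np

    def weighted_bayesian_bootstrap_population(sample, case_weights, pop_size, rng=None):
        """Draw one synthetic population of size pop_size from a weighted sample:
        Dirichlet draws provide Bayesian-bootstrap uncertainty in the resampling
        probabilities, and case weights (inverse selection probabilities) tilt
        those probabilities toward under-sampled units."""
        rng = rng or np.random.default_rng()
        sample = np.asarray(sample)
        g = rng.dirichlet(np.ones(sample.size))          # Bayesian bootstrap weights
        p = g * np.asarray(case_weights, dtype=float)    # tilt by case weights
        p /= p.sum()
        return rng.choice(sample, size=pop_size, replace=True, p=p)

    # Hypothetical use: units with weight 4 were sampled with 1/4 the probability
    y = np.array([2.1, 3.4, 5.0, 7.2])
    w = np.array([1.0, 1.0, 4.0, 4.0])
    synthetic = weighted_bayesian_bootstrap_population(y, w, pop_size=1000)
    print(synthetic.mean())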
Zhou, Hanzhi; Elliott, Michael R.; Raghunathan, Trivellore E.
2017-01-01
Multistage sampling is often employed in survey samples for cost and convenience. However, accounting for clustering features when generating datasets for multiple imputation is a nontrivial task, particularly when, as is often the case, cluster sampling is accompanied by unequal probabilities of selection, necessitating case weights. Thus, multiple imputation often ignores complex sample designs and assumes simple random sampling when generating imputations, even though failing to account for complex sample design features is known to yield biased estimates and confidence intervals that have incorrect nominal coverage. In this article, we extend a recently developed, weighted, finite-population Bayesian bootstrap procedure to generate synthetic populations conditional on complex sample design data that can be treated as simple random samples at the imputation stage, obviating the need to directly model design features for imputation. We develop two forms of this method: one where the probabilities of selection are known at the first and second stages of the design, and the other, more common in public use files, where only the final weight based on the product of the two probabilities is known. We show that this method has advantages in terms of bias, mean square error, and coverage properties over methods where sample designs are ignored, with little loss in efficiency, even when compared with correct fully parametric models. An application is made using the National Automotive Sampling System Crashworthiness Data System, a multistage, unequal probability sample of U.S. passenger vehicle crashes, which suffers from a substantial amount of missing data in “Delta-V,” a key crash severity measure. PMID:29226161
ERIC Educational Resources Information Center
O'Reilly, Fran E.; And Others
This study analyzes major federal programs designed to provide services to children, from birth through age 2, who have developmental delays or a high probability of developmental delays, and their families. Programs were selected based on their provision of education or health-related services to infants and toddlers with handicaps and their…
2014-01-01
Background The World Health Organization recommends African children receive two doses of measles containing vaccine (MCV) through routine programs or supplemental immunization activities (SIA). Moreover, children have an additional opportunity to receive MCV through outbreak response immunization (ORI) mass campaigns in certain contexts. Here, we present the results of MCV coverage by dose estimated through surveys conducted after outbreak response in diverse settings in Sub-Saharan Africa. Methods We included 24 household-based surveys conducted in six countries after a non-selective mass vaccination campaign. In the majority (22/24), the survey sample was selected using probability proportional to size cluster-based sampling. Others used Lot Quality Assurance Sampling. Results In total, data were collected on 60,895 children from 2005 to 2011. Routine coverage varied between countries (>95% in Malawi and Kirundo province (Burundi) while <35% in N’Djamena (Chad) in 2005), within a country and over time. SIA coverage was <75% in most settings. ORI coverage ranged from >95% in Malawi to 71.4% [95% CI: 68.9-73.8] in N’Djamena (Chad) in 2005. In five sites, >5% of children remained unvaccinated after several opportunities. Conversely, in Malawi and DRC, over half of the children eligible for the last SIA received a third dose of MCV. Conclusions Control pre-elimination targets were still not reached, contributing to the occurrence of repeated measles outbreak in the Sub-Saharan African countries reported here. Although children receiving a dose of MCV through outbreak response benefit from the intervention, ensuring that programs effectively target hard to reach children remains the cornerstone of measles control. PMID:24559281
Probability of Corporal Punishment: Lack of Resources and Vulnerable Students
ERIC Educational Resources Information Center
Han, Seunghee
2011-01-01
The author examined corporal punishment practices in the United States based on data from 362 public school principals in settings where corporal punishment is available. Results from multiple regression analyses show that schools with multiple student violence prevention programs and teacher training programs were less likely to use corporal…
Zoonoses action plan Salmonella monitoring programme: an investigation of the sampling protocol.
Snary, E L; Munday, D K; Arnold, M E; Cook, A J C
2010-03-01
The Zoonoses Action Plan (ZAP) Salmonella Programme was established by the British Pig Executive to monitor Salmonella prevalence in quality-assured British pigs at slaughter by testing a sample of pigs with a meat juice enzyme-linked immunosorbent assay for antibodies against group B and C(1) Salmonella. Farms were assigned a ZAP level (1 to 3) depending on the monitored prevalence, and ZAP 2 or 3 farms were required to act to reduce the prevalence. The ultimate goal was to reduce the risk of human salmonellosis attributable to British pork. A mathematical model has been developed to describe the ZAP sampling protocol. Results show that the probability of assigning a farm the correct ZAP level was high, except for farms that had a seroprevalence close to the cutoff points between different ZAP levels. Sensitivity analyses identified that the probability of assigning a farm to the correct ZAP level was dependent on the sensitivity and specificity of the test, the number of batches taken to slaughter each quarter, and the number of samples taken per batch. The variability of the predicted seroprevalence was reduced as the number of batches or samples increased and, away from the cutoff points, the probability of being assigned the correct ZAP level increased as the number of batches or samples increased. In summary, the model described here provided invaluable insight into the ZAP sampling protocol. Further work is required to understand the impact of the program for Salmonella infection in British pig farms and therefore on human health.
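The core calculation behind statements like "the probability of assigning a farm the correct ZAP level" can be sketched with a simple binomial model: the apparent seroprevalence follows from the true prevalence and the test's sensitivity and specificity, and the probability that the observed count of positives falls in the correct band follows from the binomial distribution. The cut-off points, test characteristics, and sample size below are illustrative placeholders, not the programme's actual values, and the published model accounts for batch structure that this sketch ignores.

    from scipy.stats import binom

    def prob_correct_level(true_prev, sens, spec, n_samples, level_cutoffs=(0.10, 0.25)):
        """Probability that a farm is assigned the correct ZAP-style level when the
        observed count of test-positive samples is compared against two cut-off
        points expressed as prevalence thresholds."""
        p_app = true_prev * sens + (1.0 - true_prev) * (1.0 - spec)   # apparent prevalence
        lo, hi = (round(c * n_samples) for c in level_cutoffs)
        if true_prev < level_cutoffs[0]:          # truly level 1
            return binom.cdf(lo - 1, n_samples, p_app)
        elif true_prev < level_cutoffs[1]:        # truly level 2
            return binom.cdf(hi - 1, n_samples, p_app) - binom.cdf(lo - 1, n_samples, p_app)
        return 1.0 - binom.cdf(hi - 1, n_samples, p_app)   # truly level 3

    print(prob_correct_level(true_prev=0.05, sens=0.9, spec=0.98, n_samples=60))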
ERIC Educational Resources Information Center
Kann, L.; Grunbaum, J.; McKenna, M. L.; Wechsler, H.; Galuska, D. A.
2005-01-01
School Health Profiles is conducted biennially to assess characteristics of school health programs. State and local departments of education and health select either all public secondary schools within their jurisdictions or a systematic, equal-probability sample of public secondary schools to participate in School Health Profiles. At each school,…
ERIC Educational Resources Information Center
Chu, Yuan-Hsiang
2006-01-01
The study investigated the effects of an educational intervention on the variables of awareness, perception, self-efficacy, and behavioral intentions towards technological literacy among students at a College of Design in Southern Taiwan. Using non-probability sampling, 42 freshmen students from the Department of Product Design participated in the…
Mollenhauer, Robert; Mouser, Joshua B.; Brewer, Shannon K.
2018-01-01
Temporal and spatial variability in streams results in heterogeneous gear capture probability (i.e., the proportion of available individuals identified) that confounds interpretation of data used to monitor fish abundance. We modeled tow-barge electrofishing capture probability at multiple spatial scales for nine Ozark Highland stream fishes. In addition to fish size, we identified seven reach-scale environmental characteristics associated with variable capture probability: stream discharge, water depth, conductivity, water clarity, emergent vegetation, wetted width–depth ratio, and proportion of riffle habitat. The magnitude of the relationship between capture probability and both discharge and depth varied among stream fishes. We also identified lithological characteristics among stream segments as a coarse-scale source of variable capture probability. The resulting capture probability model can be used to adjust catch data and derive reach-scale absolute abundance estimates across a wide range of sampling conditions with effort similar to that used in more traditional fisheries surveys (i.e., catch per unit effort). Adjusting catch data based on variable capture probability improves the comparability of data sets, thus promoting both well-informed conservation and management decisions and advances in stream-fish ecology.
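The adjustment described above amounts to dividing catch by a modeled capture probability. A minimal sketch, assuming a logit-linear capture model with placeholder coefficients (not the fitted values from the study):

    import numpy as np
    from scipy.special import expit

    def adjusted_abundance(catch, covariates, coefs):
        """Convert raw catch to a reach-scale absolute abundance estimate by
        dividing by a modeled capture probability; the logistic coefficients
        here are placeholders, not values reported in the study."""
        p_capture = expit(np.dot(covariates, coefs))   # logit-linear capture probability
        return catch / p_capture, p_capture

    # Hypothetical reach: intercept, standardized discharge, standardized depth
    nhat, p = adjusted_abundance(catch=42, covariates=[1.0, -0.5, 0.8], coefs=[0.2, -0.6, -0.4])
    print(round(p, 2), round(nhat, 1))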
Aurora, R. Nisha; Putcha, Nirupama; Swartz, Rachel; Punjabi, Naresh M.
2016-01-01
Background Obstructive sleep apnea is a prevalent yet underdiagnosed condition associated with cardiovascular morbidity and mortality. Home sleep testing offers an efficient means for diagnosing obstructive sleep apnea but has primarily been deployed in clinical samples with a high pretest probability. The current study sought to assess if obstructive sleep apnea can be diagnosed with home sleep testing in a non-referred sample without involvement of a sleep medicine specialist. Methods A study of community-based adults with untreated obstructive sleep apnea was undertaken. Misclassification of disease severity based on home sleep testing with and without involvement of a sleep medicine specialist was assessed, and agreement was characterized using scatter plots, Pearson's correlation coefficient, Bland-Altman analysis, and the kappa statistic. Analyses were also conducted to assess whether any observed differences varied as a function of pretest probability of obstructive sleep apnea or subjective sleepiness. Results The sample consisted of 191 subjects with over half (56.5%) having obstructive sleep apnea. Without involvement of a sleep medicine specialist, obstructive sleep apnea was not identified in only 5.8% of the sample. Analyses comparing the categorical assessment of disease severity with and without a sleep medicine specialist showed that in total, 32 subjects (16.8%) were misclassified. Agreement in the disease severity with and without a sleep medicine specialist was not influenced by the pretest probability or daytime sleep tendency. Conclusion Obstructive sleep apnea can be reliably identified with home sleep testing in a non-referred sample irrespective of the pretest probability of the disease. PMID:26968467
Kelder, S. H.; Pérez, A.; Day, R. S.; Benoit, J.; Frankowski, R. F.; Walker, J. L.; Lee, E. S.
2016-01-01
Although national and state estimates of child obesity are available, data at these levels are insufficient to monitor the effects of local obesity prevention initiatives. The purpose of this study was to examine regional changes in the prevalence of obesity due to state-wide policies and programs among children in grades 4, 8, and 11 in Texas Health Service Regions (HSR) between 2000–2002 and 2004–2005, and in nine selected counties in 2004–2005. A cross-sectional, probability-based sample of 23,190 Texas students in grades 4, 8, and 11 was weighed and measured to obtain body mass index (BMI). Obesity was defined as a BMI greater than the 95th percentile for age and sex using CDC growth charts. Child obesity prevalence significantly decreased between 2000–2002 and 2004–2005 for 4th grade students in the El Paso HSR (−7.0%, p=0.005). A leveling off in the prevalence of obesity was noted for all other regions for grades 4, 8, and 11. County-level data supported the statistically significant decreases noted in the El Paso region. The reduction in child obesity levels observed in the El Paso area is one of the few examples of effective programs and policies documented by a population-wide survey: in this region, a local foundation funded extensive regional implementation of community programs for obesity prevention, including an evidence-based elementary school-based health promotion program, adult nutrition and physical activity programs, and a radio and television advertising campaign. Results emphasize the need for sustained school, community, and policy efforts, and show that these efforts can result in decreases in child obesity at the population level. PMID:19798066
Tucker, Joseph D.; Chakraborty, Hrishikesh; Cohen, Myron S.; Chen, Xiang-Sheng
2016-01-01
Background Syphilis is prevalent among men who have sex with men (MSM) in China. Syphilis partner notification (PN) programs targeting MSM have been considered one of the effective strategies for prevention and control of the infection in this population. We examined willingness and preferences for PN among MSM to measure feasibility and optimize uptake. Methods Participation in a syphilis PN program was measured using a factorial survey from the perspectives of both the index patient and the partner. Respondents were recruited from April-July 2011 using convenience sampling at two sites—an MSM sexually transmitted disease (STD) clinic and an MSM community-based organization (CBO). Respondents first evaluated three factorial survey vignettes to measure probability of participation and then completed an anonymous sociodemographic questionnaire. A two-level mixed linear model was fitted for the factorial survey analysis. Results Among 372 respondents with mean age (± SD) 28.5 (± 6.0) years, most were single (82.0%) and closeted gay men (66.7%). The Internet was the most frequent place to search for sex. Few (31.2%) had legal names for casual partners, but most had instant messenger (86.5%) and mobile phone numbers (77.7%). The mean probability of participation in a syphilis PN program was 64.5% (± 32.4%) for index patients and 63.7% (± 32.6%) for partners. Referral of the partner to a private clinic or MSM CBO for follow-up decreased participation compared to the local Center for Disease Control and Prevention (CDC) or public STD clinic. Conclusions Enhanced PN services may be feasible among MSM in South China. Internet and mobile phone PN may contact partners untraceable by traditional PN. Referral of partners to the local CDC or public STD clinic may maximize PN participation. PMID:27462724
Environmental DNA (eDNA) Detection Probability Is Influenced by Seasonal Activity of Organisms.
de Souza, Lesley S; Godwin, James C; Renshaw, Mark A; Larson, Eric
2016-01-01
Environmental DNA (eDNA) holds great promise for conservation applications like the monitoring of invasive or imperiled species, yet this emerging technique requires ongoing testing in order to determine the contexts over which it is effective. For example, little research to date has evaluated how seasonality of organism behavior or activity may influence detection probability of eDNA. We applied eDNA to survey for two highly imperiled species endemic to the upper Black Warrior River basin in Alabama, US: the Black Warrior Waterdog (Necturus alabamensis) and the Flattened Musk Turtle (Sternotherus depressus). Importantly, these species have contrasting patterns of seasonal activity, with N. alabamensis more active in the cool season (October-April) and S. depressus more active in the warm season (May-September). We surveyed sites historically occupied by these species across cool and warm seasons over two years with replicated eDNA water samples, which were analyzed in the laboratory using species-specific quantitative PCR (qPCR) assays. We then used occupancy estimation with detection probability modeling to evaluate both the effects of landscape attributes on organism presence and the effect of season of sampling on eDNA detection probability. Importantly, we found that season strongly affected eDNA detection probability for both species, with N. alabamensis having higher eDNA detection probabilities during the cool season and S. depressus having higher eDNA detection probabilities during the warm season. These results illustrate the influence of organismal behavior or activity on eDNA detection in the environment and identify an important role for basic natural history in designing eDNA monitoring programs.
Environmental DNA (eDNA) Detection Probability Is Influenced by Seasonal Activity of Organisms
de Souza, Lesley S.; Godwin, James C.; Renshaw, Mark A.; Larson, Eric
2016-01-01
Environmental DNA (eDNA) holds great promise for conservation applications like the monitoring of invasive or imperiled species, yet this emerging technique requires ongoing testing in order to determine the contexts over which it is effective. For example, little research to date has evaluated how seasonality of organism behavior or activity may influence detection probability of eDNA. We applied eDNA to survey for two highly imperiled species endemic to the upper Black Warrior River basin in Alabama, US: the Black Warrior Waterdog (Necturus alabamensis) and the Flattened Musk Turtle (Sternotherus depressus). Importantly, these species have contrasting patterns of seasonal activity, with N. alabamensis more active in the cool season (October-April) and S. depressus more active in the warm season (May-September). We surveyed sites historically occupied by these species across cool and warm seasons over two years with replicated eDNA water samples, which were analyzed in the laboratory using species-specific quantitative PCR (qPCR) assays. We then used occupancy estimation with detection probability modeling to evaluate both the effects of landscape attributes on organism presence and the effect of season of sampling on eDNA detection probability. Importantly, we found that season strongly affected eDNA detection probability for both species, with N. alabamensis having higher eDNA detection probabilities during the cool season and S. depressus having higher eDNA detection probabilities during the warm season. These results illustrate the influence of organismal behavior or activity on eDNA detection in the environment and identify an important role for basic natural history in designing eDNA monitoring programs. PMID:27776150
Natural environment application for NASP-X-30 design and mission planning
NASA Technical Reports Server (NTRS)
Johnson, D. L.; Hill, C. K.; Brown, S. C.; Batts, G. W.
1993-01-01
The NASA/MSFC Mission Analysis Program has recently been utilized in various National Aero-Space Plane (NASP) mission and operational planning scenarios. This paper focuses on presenting various atmospheric constraint statistics for assumed NASP mission phases, using established natural environment design, parametric, and threshold values. Probabilities of no-go are calculated using atmospheric parameters such as temperature, humidity, density altitude, peak/steady-state winds, cloud cover/ceiling, thunderstorms, and precipitation. The program, although developed to evaluate test or operational missions after flight constraints have been established, can provide valuable information in the design phase of the NASP X-30 program. When the design values are input as flight constraints, the Mission Analysis Program returns the probability of no-go, or launch delay, by hour and by month. This output tells the X-30 program manager whether the design values are stringent enough to meet the required test flight schedules.
Adaptive Sampling-Based Information Collection for Wireless Body Area Networks.
Xu, Xiaobin; Zhao, Fang; Wang, Wendong; Tian, Hui
2016-08-31
To collect important health information, WBAN applications typically sense data at a high frequency. However, limited by the quality of the wireless link, the uploading of sensed data has an upper frequency. To reduce upload frequency, most of the existing WBAN data collection approaches collect data with a tolerable error. These approaches can guarantee precision of the collected data, but they are not able to ensure that the upload frequency is within the upper frequency. Some traditional sampling-based approaches can control upload frequency directly; however, they usually have a high loss of information. Since the core task of WBAN applications is to collect health information, this paper aims to collect optimized information under the limitation of upload frequency. The importance of sensed data is defined according to information theory for the first time. Information-aware adaptive sampling is proposed to collect uniformly distributed data. Then we propose Adaptive Sampling-based Information Collection (ASIC), which consists of two algorithms. An adaptive sampling probability algorithm is proposed to compute sampling probabilities of different sensed values. A multiple uniform sampling algorithm provides uniform samplings for values in different intervals. Experiments based on a real dataset show that the proposed approach has higher performance in terms of data coverage and information quantity. The parameter analysis shows the optimized parameter settings, and the discussion shows the underlying reason for the high performance of the proposed approach.
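A simplified illustration of the information-aware sampling idea, assuming that a reading's importance is its self-information (-log of the empirical frequency of its value interval) and that sampling probabilities are rescaled to respect an upload budget; this is not the ASIC algorithms themselves, and all parameter values are placeholders.

    import numpy as np

    def information_aware_probs(values, bins=10, target_fraction=0.2):
        """Assign each sensed value a sampling probability proportional to its
        self-information -log(frequency), then rescale so the expected fraction
        of uploaded readings matches the upload-frequency budget."""
        counts, edges = np.histogram(values, bins=bins)
        freq = counts / counts.sum()
        info = -np.log(np.clip(freq, 1e-12, None))           # rare intervals are more informative
        which = np.clip(np.digitize(values, edges[1:-1]), 0, bins - 1)
        p = info[which]
        p *= target_fraction * len(values) / p.sum()          # enforce the expected upload budget
        return np.clip(p, 0.0, 1.0)

    rng = np.random.default_rng(1)
    readings = rng.normal(75, 5, size=1000)                   # e.g. a heart-rate-like sensor stream
    p = information_aware_probs(readings)
    uploaded = readings[rng.random(1000) < p]
    print(len(uploaded))                                      # roughly 200 of 1000 readings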
Adaptive Sampling-Based Information Collection for Wireless Body Area Networks
Xu, Xiaobin; Zhao, Fang; Wang, Wendong; Tian, Hui
2016-01-01
To collect important health information, WBAN applications typically sense data at a high frequency. However, limited by the quality of the wireless link, the uploading of sensed data has an upper frequency. To reduce upload frequency, most of the existing WBAN data collection approaches collect data with a tolerable error. These approaches can guarantee precision of the collected data, but they are not able to ensure that the upload frequency is within the upper frequency. Some traditional sampling-based approaches can control upload frequency directly; however, they usually have a high loss of information. Since the core task of WBAN applications is to collect health information, this paper aims to collect optimized information under the limitation of upload frequency. The importance of sensed data is defined according to information theory for the first time. Information-aware adaptive sampling is proposed to collect uniformly distributed data. Then we propose Adaptive Sampling-based Information Collection (ASIC), which consists of two algorithms. An adaptive sampling probability algorithm is proposed to compute sampling probabilities of different sensed values. A multiple uniform sampling algorithm provides uniform samplings for values in different intervals. Experiments based on a real dataset show that the proposed approach has higher performance in terms of data coverage and information quantity. The parameter analysis shows the optimized parameter settings, and the discussion shows the underlying reason for the high performance of the proposed approach. PMID:27589758
Williams, M S; Ebel, E D; Cao, Y
2013-01-01
The fitting of statistical distributions to microbial sampling data is a common application in quantitative microbiology and risk assessment. An underlying assumption of most fitting techniques is that data are collected with simple random sampling, which is often not the case. This study develops a weighted maximum likelihood estimation framework that is appropriate for microbiological samples that are collected with unequal probabilities of selection. Two examples, based on the collection of food samples during processing, are provided to demonstrate the method and highlight the magnitude of biases in the maximum likelihood estimator when data are inappropriately treated as a simple random sample. Failure to properly weight samples to account for how data are collected can introduce substantial biases into inferences drawn from the data. The proposed methodology will reduce or eliminate an important source of bias in inferences drawn from the analysis of microbial data. This will also make comparisons between studies and the combination of results from different studies more reliable, which is important for risk assessment applications. © 2012 No claim to US Government works.
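A minimal sketch of weighted maximum likelihood for unequal-probability microbial samples, assuming lognormally distributed concentrations and weights equal to inverse inclusion probabilities; the distributional choice, data, and selection probabilities are illustrative, not those of the study.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import lognorm

    def weighted_mle_lognormal(x, inclusion_probs):
        """Fit a lognormal to concentrations collected with unequal selection
        probabilities by maximizing a weighted log-likelihood, with weights
        equal to the inverse inclusion probabilities."""
        w = 1.0 / np.asarray(inclusion_probs)
        logx = np.log(np.asarray(x))

        def neg_wll(params):
            mu, log_sigma = params
            sigma = np.exp(log_sigma)
            ll = lognorm.logpdf(x, s=sigma, scale=np.exp(mu))
            return -np.sum(w * ll)

        start = [logx.mean(), np.log(logx.std() + 1e-6)]
        res = minimize(neg_wll, start, method="Nelder-Mead")
        return res.x[0], np.exp(res.x[1])      # mu and sigma on the log scale

    # Hypothetical data: high-concentration samples were twice as likely to be selected
    rng = np.random.default_rng(2)
    x = rng.lognormal(mean=1.0, sigma=0.8, size=200)
    pi = np.where(x > np.median(x), 0.02, 0.01)
    print(weighted_mle_lognormal(x, pi))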
Drug use prevention: factors associated with program implementation in Brazilian urban schools.
Pereira, Ana Paula Dias; Sanchez, Zila M
2018-03-07
Schools are learning environments that contribute to the construction of personal values, beliefs, habits, and lifestyles, and they provide convenient settings for the implementation of drug use prevention programs targeting adolescents, the population group at highest risk of initiating drug use. The objective of the present study was to investigate the prevalence of, and factors associated with, the implementation of drug use prevention programs in Brazilian public and private middle and high urban schools. This population-based cross-sectional survey was conducted in 2014 with a probability sample of 1151 school administrators stratified by the 5 Brazilian administrative divisions. A close-ended, self-reported online questionnaire was used. Logistic regression analysis was used to identify factors associated with implementing drug use prevention programs in schools. A total of 51.1% of the schools had adopted drug use prevention programs. The factors associated with program implementation were as follows: belonging to the public school network; having a library; development of activities targeting sexuality; development of "Health at School Program" activities; offering extracurricular activities; and having an administrator who participated in training courses on drugs. The adoption of drug use prevention practices in Brazilian schools could be expanded through greater orchestration of schools, including specialized training of administrators and teachers, expansion of the School Health Program, and concomitant development of the schools' structural and curricular attributes.
Lognormal Approximations of Fault Tree Uncertainty Distributions.
El-Shanawany, Ashraf Ben; Ardron, Keith H; Walker, Simon P
2018-01-26
Fault trees are used in reliability modeling to create logical models of fault combinations that can lead to undesirable events. The output of a fault tree analysis (the top event probability) is expressed in terms of the failure probabilities of basic events that are input to the model. Typically, the basic event probabilities are not known exactly, but are modeled as probability distributions: therefore, the top event probability is also represented as an uncertainty distribution. Monte Carlo methods are generally used for evaluating the uncertainty distribution, but such calculations are computationally intensive and do not readily reveal the dominant contributors to the uncertainty. In this article, a closed-form approximation for the fault tree top event uncertainty distribution is developed, which is applicable when the uncertainties in the basic events of the model are lognormally distributed. The results of the approximate method are compared with results from two sampling-based methods: namely, the Monte Carlo method and the Wilks method based on order statistics. It is shown that the closed-form expression can provide a reasonable approximation to results obtained by Monte Carlo sampling, without incurring the computational expense. The Wilks method is found to be a useful means of providing an upper bound for the percentiles of the uncertainty distribution while being computationally inexpensive compared with full Monte Carlo sampling. The lognormal approximation method and Wilks's method appear attractive, practical alternatives for the evaluation of uncertainty in the output of fault trees and similar multilinear models. © 2018 Society for Risk Analysis.
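The comparison described above can be reproduced in miniature for a toy fault tree: basic event probabilities are drawn from lognormal distributions specified by medians and error factors, the rare-event top event probability is propagated, and the Monte Carlo 95th percentile is set alongside a Wilks 95/95 bound obtained from 59 runs. The tree structure and parameter values below are hypothetical.

    import numpy as np

    def top_event_samples(n_draws, medians, error_factors, seed=0):
        """Monte Carlo draws of the top event probability for a toy fault tree
        (basic events A, B, C): TOP = A OR (B AND C), rare-event approximation
        p_top ~ p_A + p_B * p_C. Basic event probabilities are lognormal with
        the given medians and 95th-percentile error factors."""
        rng = np.random.default_rng(seed)
        sigmas = np.log(np.asarray(error_factors)) / 1.645    # EF = exp(1.645 * sigma)
        p = np.exp(rng.normal(np.log(medians), sigmas, size=(n_draws, len(medians))))
        return p[:, 0] + p[:, 1] * p[:, 2]

    draws = top_event_samples(100_000, medians=[1e-4, 1e-2, 1e-3], error_factors=[3, 3, 10])
    print(np.percentile(draws, 95))            # Monte Carlo 95th percentile

    # Wilks one-sided 95/95 bound: with 59 runs, the largest value bounds the
    # 95th percentile with 95% confidence, since 1 - 0.95**59 > 0.95
    wilks_draws = top_event_samples(59, medians=[1e-4, 1e-2, 1e-3], error_factors=[3, 3, 10], seed=1)
    print(wilks_draws.max())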
Olsen, Chris; Wang, Chong; Christopher-Hennings, Jane; Doolittle, Kent; Harmon, Karen M; Abate, Sarah; Kittawornrat, Apisit; Lizano, Sergio; Main, Rodger; Nelson, Eric A; Otterson, Tracy; Panyasing, Yaowalak; Rademacher, Chris; Rauh, Rolf; Shah, Rohan; Zimmerman, Jeffrey
2013-05-01
Pen-based oral fluid sampling has proven to be an efficient method for surveillance of infectious diseases in swine populations. To better interpret diagnostic results, the performance of oral fluid assays (antibody- and nucleic acid-based) must be established for pen-based oral fluid samples. Therefore, the objective of the current study was to determine the probability of detecting Porcine reproductive and respiratory syndrome virus (PRRSV) infection in pen-based oral fluid samples from pens of known PRRSV prevalence. In 1 commercial swine barn, 25 pens were assigned to 1 of 5 levels of PRRSV prevalence (0%, 4%, 12%, 20%, or 36%) by placing a fixed number (0, 1, 3, 5, or 9) of PRRSV-positive pigs (14 days post PRRSV modified live virus vaccination) in each pen. Prior to placement of the vaccinated pigs, 1 oral fluid sample was collected from each pen. Thereafter, 5 oral fluid samples were collected from each pen, for a total of 150 samples. To confirm individual pig PRRSV status, serum samples from the PRRSV-negative pigs (n = 535) and the PRRSV vaccinated pigs (n = 90) were tested for PRRSV antibodies and PRRSV RNA. The 150 pen-based oral fluid samples were assayed for PRRSV antibody and PRRSV RNA at 6 laboratories. Among the 100 samples from pens containing ≥1 positive pig (≥4% prevalence) and tested at the 6 laboratories, the mean positivity was 62% for PRRSV RNA and 61% for PRRSV antibody. These results support the use of pen-based oral fluid sampling for PRRSV surveillance in commercial pig populations.
NASA Astrophysics Data System (ADS)
Giovanis, D. G.; Shields, M. D.
2018-07-01
This paper addresses uncertainty quantification (UQ) for problems where scalar (or low-dimensional vector) response quantities are insufficient and, instead, full-field (very high-dimensional) responses are of interest. To do so, an adaptive stochastic simulation-based methodology is introduced that refines the probability space based on Grassmann manifold variations. The proposed method has a multi-element character discretizing the probability space into simplex elements using a Delaunay triangulation. For every simplex, the high-dimensional solutions corresponding to its vertices (sample points) are projected onto the Grassmann manifold. The pairwise distances between these points are calculated using appropriately defined metrics and the elements with large total distance are sub-sampled and refined. As a result, regions of the probability space that produce significant changes in the full-field solution are accurately resolved. An added benefit is that an approximation of the solution within each element can be obtained by interpolation on the Grassmann manifold. The method is applied to study the probability of shear band formation in a bulk metallic glass using the shear transformation zone theory.
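The Grassmann-manifold distances that drive the refinement can be computed from principal angles between the dominant subspaces of two full-field solutions. The sketch below shows only that distance computation, with random matrices standing in for solution snapshots, and omits the Delaunay triangulation and sub-sampling steps of the method.

    import numpy as np
    from scipy.linalg import subspace_angles

    def grassmann_distance(snapshot_a, snapshot_b, rank=5):
        """Geodesic distance on the Grassmann manifold between the dominant
        subspaces of two full-field solutions, computed from principal angles
        between their leading left singular vectors."""
        Ua = np.linalg.svd(snapshot_a, full_matrices=False)[0][:, :rank]
        Ub = np.linalg.svd(snapshot_b, full_matrices=False)[0][:, :rank]
        theta = subspace_angles(Ua, Ub)          # principal angles between the subspaces
        return np.linalg.norm(theta)             # geodesic (arc-length) distance

    # Two hypothetical full-field responses at neighbouring sample points
    rng = np.random.default_rng(3)
    d1 = grassmann_distance(rng.normal(size=(200, 20)), rng.normal(size=(200, 20)))
    print(d1)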
Validating long-term satellite-derived disturbance products: the case of burned areas
NASA Astrophysics Data System (ADS)
Boschetti, L.; Roy, D. P.
2015-12-01
The potential research, policy, and management applications of satellite products place a high priority on providing statements about their accuracy. A number of NASA, ESA, and EU funded global and continental burned area products have been developed using coarse spatial resolution satellite data, and have the potential to become part of a long-term fire Climate Data Record. These products have usually been validated by comparison with reference burned area maps derived by visual interpretation of Landsat or similar spatial resolution data selected on an ad hoc basis. More optimally, a design-based validation method should be adopted, characterized by the selection of reference data via a probability sampling that can subsequently be used to compute accuracy metrics, taking into account the sampling probability. Design-based techniques have been used for annual land cover and land cover change product validation, but have not been widely used for burned area products, or for the validation of global products that are highly variable in time and space (e.g., snow, floods, or other non-permanent phenomena). This has been due to the challenge of designing an appropriate sampling strategy, and to the cost of collecting independent reference data. We propose a tri-dimensional sampling grid that allows for probability sampling of Landsat data in time and in space. To sample the globe in the spatial domain with non-overlapping sampling units, the Thiessen Scene Area (TSA) tessellation of the Landsat WRS path/rows is used. The TSA grid is then combined with the 16-day Landsat acquisition calendar to provide tri-dimensional elements (voxels). This allows the implementation of a sampling design in which not only the location but also the time interval of the reference data is explicitly drawn by probability sampling. The proposed sampling design is a stratified random sampling, with two-level stratification of the voxels based on biomes and fire activity. This validation approach, used for the validation of the MODIS and forthcoming VIIRS global burned area products, is a general one: it could be used for the validation of other global products that are highly variable in space and time, and it is required to assess the accuracy of climate records. The approach is demonstrated using a 1-year dataset of MODIS fire products.
Farrar, Jerry W.
1999-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for seven standard reference samples -- T-155 (trace constituents), M-148 (major constituents), N-59 (nutrient constituents), N-60 (nutrient constituents), P-31 (low ionic strength constituents), GWT-4 (ground-water trace constituents), and Hg-27 (mercury) -- which were distributed in September 1998 to 162 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 136 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the seven reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the seven standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
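The report does not spell out which nonparametric estimator is used for the most probable value; the median, paired with a robust spread measure such as the median absolute deviation, is a common choice in interlaboratory studies. The sketch below uses made-up laboratory results to show how such an estimate and a simple robust outlier screen could look.

```python
import numpy as np

# Hypothetical results (mg/L) reported by participating laboratories for one analyte.
reported = np.array([2.41, 2.44, 2.39, 2.47, 2.40, 2.43, 3.10, 2.42, 2.38, 2.45])

most_probable_value = np.median(reported)                 # robust central value
mad = np.median(np.abs(reported - most_probable_value))   # robust spread
robust_sigma = 1.4826 * mad                               # ~ standard deviation if data were normal

# A robust z-score flags the outlying laboratory (the 3.10 result above).
z = (reported - most_probable_value) / robust_sigma
print(most_probable_value, robust_sigma, np.where(np.abs(z) > 3)[0])
```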
Long, H.K.; Farrar, J.W.
1994-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for eight standard reference samples--T-127 (trace constituents), M-128 (major constituents), N-40 (nutrients), N-41 (nutrients), P-21 (low ionic strength), Hg-17 (mercury), AMW-3 (acid mine water), and WW-1 (whole water)--that were distributed in October 1993 to 158 laboratories registered in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 145 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the eight reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the eight standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
Beyond the swab: ecosystem sampling to understand the persistence of an amphibian pathogen.
Mosher, Brittany A; Huyvaert, Kathryn P; Bailey, Larissa L
2018-06-02
Understanding the ecosystem-level persistence of pathogens is essential for predicting and measuring host-pathogen dynamics. However, this process is often masked, in part due to a reliance on host-based pathogen detection methods. The amphibian pathogens Batrachochytrium dendrobatidis (Bd) and B. salamandrivorans (Bsal) are pathogens of global conservation concern. Despite having free-living life stages, little is known about the distribution and persistence of these pathogens outside of their amphibian hosts. We combine historic amphibian monitoring data with contemporary host- and environment-based pathogen detection data to obtain estimates of Bd occurrence independent of amphibian host distributions. We also evaluate differences in filter- and swab-based detection probability and assess inferential differences arising from using different decision criteria used to classify samples as positive or negative. Water filtration-based detection probabilities were lower than those from swabs but were > 10%, and swab-based detection probabilities varied seasonally, declining in the early fall. The decision criterion used to classify samples as positive or negative was important; using a more liberal criterion yielded higher estimates of Bd occurrence than when a conservative criterion was used. Different covariates were important when using the liberal or conservative criterion in modeling Bd detection. We found evidence of long-term Bd persistence for several years after an amphibian host species of conservation concern, the boreal toad (Anaxyrus boreas boreas), was last detected. Our work provides evidence of long-term Bd persistence in the ecosystem, and underscores the importance of environmental samples for understanding and mitigating disease-related threats to amphibian biodiversity.
Bayesian data analysis tools for atomic physics
NASA Astrophysics Data System (ADS)
Trassinelli, Martino
2017-10-01
We present an introduction to some concepts of Bayesian data analysis in the context of atomic physics. Starting from basic rules of probability, we present Bayes' theorem and its applications. In particular, we discuss how to calculate simple and joint probability distributions and the Bayesian evidence, a model-dependent quantity that allows probabilities to be assigned to different hypotheses from the analysis of the same data set. To give some practical examples, these methods are applied to two concrete cases. In the first example, the presence or absence of a satellite line in an atomic spectrum is investigated. In the second example, we determine the most probable model among a set of possible profiles from the analysis of a statistically poor spectrum. We also show how to calculate the probability distribution of the main spectral component without having to determine the spectrum modeling uniquely. For these two studies, we implement the program Nested_fit to calculate the different probability distributions and other related quantities. Nested_fit is a Fortran90/Python code developed over recent years for the analysis of atomic spectra. As indicated by its name, it is based on the nested sampling algorithm, which is presented in detail together with the program itself.
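This is not the Nested_fit code itself, only a small sketch of the model-comparison step the abstract describes: turning log-evidences for two hypotheses (say, satellite line absent versus present) into posterior model probabilities and a Bayes factor. The log-evidence values and the equal prior odds are assumptions.

```python
import numpy as np

# Hypothetical log-evidences ln Z_i = ln p(data | model_i) returned by nested sampling
log_Z = {"no_satellite_line": -1523.7, "with_satellite_line": -1520.9}
log_prior = {m: np.log(0.5) for m in log_Z}   # equal prior model probabilities

# Posterior model probabilities via a numerically stable log-sum-exp normalisation
log_post_unnorm = {m: log_Z[m] + log_prior[m] for m in log_Z}
norm = np.logaddexp.reduce(list(log_post_unnorm.values()))
posterior = {m: np.exp(v - norm) for m, v in log_post_unnorm.items()}

bayes_factor = np.exp(log_Z["with_satellite_line"] - log_Z["no_satellite_line"])
print(posterior, bayes_factor)
```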
A Proper Motions Study of the Globular Cluster NGC 3201
NASA Astrophysics Data System (ADS)
Sariya, Devesh P.; Jiang, Ing-Guey; Yadav, R. K. S.
2017-03-01
With a high heliocentric radial velocity, a retrograde orbit, and a suspected extragalactic origin, NGC 3201 is an interesting globular cluster for kinematical studies. Our purpose is to calculate the relative proper motions (PMs) and membership probabilities for stars in a wide region of the globular cluster NGC 3201. PM-based membership probabilities are used to isolate the cluster sample from the field stars. The membership catalog will help address the question of chemical inhomogeneity in the cluster. Archival CCD data taken with a wide-field imager (WFI) mounted on the ESO 2.2 m telescope are reduced using the high-precision astrometric software developed by Anderson et al. for WFI images. The epoch gap between the two observational runs is ˜14.3 years. To standardize the BVI photometry, Stetson's secondary standard stars are used. The CCD data with an epoch gap of ˜14.3 years enable us to decontaminate the cluster stars from field stars efficiently. The median precision of the PMs is better than ˜0.8 mas yr-1 for stars with V < 18 mag, increasing to ˜1.5 mas yr-1 for stars with 18 < V < 20 mag. Kinematic membership probabilities are calculated using PMs for stars brighter than V ˜ 20 mag. An electronic catalog of positions, relative PMs, BVI magnitudes, and membership probabilities in the ˜19.7 × 17 arcmin2 region of NGC 3201 is presented. We use our membership catalog to identify probable cluster members among the known variables and X-ray sources in the direction of NGC 3201. Based on observations with the MPG/ESO 2.2 m and ESO/VLT telescopes, located at La Silla and Paranal Observatory, Chile, under DDT programs 164.O-0561(F), 093.A-9028(A), and the archive material.
Manganese recycling in the United States in 1998
Jones, Thomas S.
2003-01-01
This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-163 (trace constituents), M-156 (major constituents), N-67 (nutrient constituents), N-68 (nutrient constituents), P-35 (low ionic strength constituents), and Hg-31 (mercury) -- that were distributed in October 2000 to 126 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 122 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
NASA Astrophysics Data System (ADS)
Miller, Jacob; Sanders, Stephen; Miyake, Akimasa
2017-12-01
While quantum speed-up in solving certain decision problems by a fault-tolerant universal quantum computer has been promised, a timely research question is how far the resource requirements can be reduced to demonstrate a provable advantage in quantum devices without demanding quantum error correction, which is crucial for prolonging the coherence time of qubits. We propose a model device made of locally interacting multiple qubits, designed such that simultaneous single-qubit measurements on it can output probability distributions whose average-case sampling is classically intractable, under assumptions similar to those for the sampling of noninteracting bosons and instantaneous quantum circuits. Notably, in contrast to these previous unitary-based realizations, our measurement-based implementation has two distinctive features. (i) Our implementation involves no adaptation of measurement bases, so that output probability distributions are generated in constant time, independent of the system size. Thus, it could in principle be implemented without quantum error correction. (ii) Verifying the classical intractability of our sampling is done by changing the Pauli measurement bases only at certain output qubits. Our use of random commuting quantum circuits in place of computationally universal circuits allows a unique unification of sampling and verification, so that both require the same physical resources, in contrast to the more demanding verification protocols seen elsewhere in the literature.
Piecewise SALT sampling for estimating suspended sediment yields
Robert B. Thomas
1989-01-01
A probability sampling method called SALT (Selection At List Time) has been developed for collecting and summarizing data on delivery of suspended sediment in rivers. It is based on sampling and estimating yield using a suspended-sediment rating curve for high discharges and simple random sampling for low flows. The method gives unbiased estimates of total yield and...
Crovelli, R.A.; Balay, R.H.
1991-01-01
A general risk-analysis method was developed for petroleum-resource assessment and other applications. The triangular probability distribution is used as a model with an analytic aggregation methodology based on probability theory rather than Monte-Carlo simulation. Among the advantages of the analytic method are its computational speed and flexibility, and the saving of time and cost on a microcomputer. The input into the model consists of a set of components (e.g. geologic provinces) and, for each component, three potential resource estimates: minimum, most likely (mode), and maximum. Assuming a triangular probability distribution, the mean, standard deviation, and seven fractiles (F100, F95, F75, F50, F25, F5, and F0) are computed for each component, where for example, the probability of more than F95 is equal to 0.95. The components are aggregated by combining the means, standard deviations, and respective fractiles under three possible siutations (1) perfect positive correlation, (2) complete independence, and (3) any degree of dependence between these two polar situations. A package of computer programs named the TRIAGG system was written in the Turbo Pascal 4.0 language for performing the analytic probabilistic methodology. The system consists of a program for processing triangular probability distribution assessments and aggregations, and a separate aggregation routine for aggregating aggregations. The user's documentation and program diskette of the TRIAGG system are available from USGS Open File Services. TRIAGG requires an IBM-PC/XT/AT compatible microcomputer with 256kbyte of main memory, MS-DOS 3.1 or later, either two diskette drives or a fixed disk, and a 132 column printer. A graphics adapter and color display are optional. ?? 1991.
Todd Trench, Elaine C.
2004-01-01
A time-series analysis approach developed by the U.S. Geological Survey was used to analyze trends in total phosphorus and evaluate optimal sampling designs for future trend detection, using long-term data for two water-quality monitoring stations on the Quinebaug River in eastern Connecticut. Trend-analysis results for selected periods of record during 1971–2001 indicate that concentrations of total phosphorus in the Quinebaug River have varied over time, but have decreased significantly since the 1970s and 1980s. Total phosphorus concentrations at both stations increased in the late 1990s and early 2000s, but were still substantially lower than historical levels. Drainage areas for both stations are primarily forested, but water quality at both stations is affected by point discharges from municipal wastewater-treatment facilities. Various designs with sampling frequencies ranging from 4 to 11 samples per year were compared to the trend-detection power of the monthly (12-sample) design to determine the most efficient configuration of months to sample for a given annual sampling frequency. Results from this evaluation indicate that the current (2004) 8-sample schedule for the two Quinebaug stations, with monthly sampling from May to September and bimonthly sampling for the remainder of the year, is not the most efficient 8-sample design for future detection of trends in total phosphorus. Optimal sampling schedules for the two stations differ, but in both cases, trend-detection power generally is greater among 8-sample designs that include monthly sampling in fall and winter. Sampling designs with fewer than 8 samples per year generally provide a low level of probability for detection of trends in total phosphorus. Managers may determine an acceptable level of probability for trend detection within the context of the multiple objectives of the state's water-quality management program and the scientific understanding of the watersheds in question. Managers may identify a threshold of probability for trend detection that is high enough to justify the agency's investment in the water-quality sampling program. Results from an analysis of optimal sampling designs can provide an important component of information for the decision-making process in which sampling schedules are periodically reviewed and revised. Results from the study described in this report and previous studies indicate that optimal sampling schedules for trend detection may differ substantially for different stations and constituents. A more comprehensive statewide evaluation of sampling schedules for key stations and constituents could provide useful information for any redesign of the schedule for water-quality monitoring in the Quinebaug River Basin and elsewhere in the state.
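The USGS time-series trend procedure is not reproduced here; the following Monte Carlo sketch only illustrates, with synthetic data and an ordinary least-squares trend test, how the detection power of two hypothetical 8-sample schedules might be compared. All parameter values (trend, seasonality, noise, record length) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def detection_power(months, n_years=15, trend=-0.02, seasonal_amp=0.3,
                    noise_sd=0.25, n_rep=1000):
    """Fraction of simulated records in which a linear downward trend is detected
    when sampling only in the given calendar months (1-12)."""
    detected = 0
    for _ in range(n_rep):
        t, y = [], []
        for year in range(n_years):
            for m in months:
                time = year + (m - 0.5) / 12.0
                value = (trend * time
                         + seasonal_amp * np.sin(2.0 * np.pi * m / 12.0)
                         + rng.normal(0.0, noise_sd))
                t.append(time)
                y.append(value)
        t, y = np.array(t), np.array(y)
        X = np.column_stack([np.ones_like(t), t])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        se = np.sqrt(resid @ resid / (len(y) - 2) / np.sum((t - t.mean()) ** 2))
        # two-sided test of the slope, normal approximation to the t distribution
        if abs(beta[1] / se) > 1.96:
            detected += 1
    return detected / n_rep

design_a = [1, 3, 5, 6, 7, 8, 9, 11]   # monthly May-Sep plus bimonthly sampling
design_b = [1, 2, 3, 5, 7, 9, 11, 12]  # 8 samples weighted toward fall and winter
print(detection_power(design_a), detection_power(design_b))
```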
ERIC Educational Resources Information Center
Gordon, Allegra R.; Conron, Kerith J.; Calzo, Jerel P.; White, Matthew T.; Reisner, Sari L.; Austin, S. Bryn
2018-01-01
Background: Young people may experience school-based violence and bullying victimization related to their gender expression, independent of sexual orientation identity. However, the associations between gender expression and bullying and violence have not been examined in racially and ethnically diverse population-based samples of high school…
Nonlinear Demodulation and Channel Coding in EBPSK Scheme
Chen, Xianqing; Wu, Lenan
2012-01-01
The extended binary phase shift keying (EBPSK) is an efficient modulation technique, and a special impacting filter (SIF) is used in its demodulator to improve the bit error rate (BER) performance. However, the conventional threshold decision cannot achieve the optimum performance, and the SIF brings more difficulty in obtaining the posterior probability for LDPC decoding. In this paper, we concentrate not only on reducing the BER of demodulation, but also on providing accurate posterior probability estimates (PPEs). A new approach for the nonlinear demodulation based on the support vector machine (SVM) classifier is introduced. The SVM method which selects only a few sampling points from the filter output was used for getting PPEs. The simulation results show that the accurate posterior probability can be obtained with this method and the BER performance can be improved significantly by applying LDPC codes. Moreover, we analyzed the effect of getting the posterior probability with different methods and different sampling rates. We show that there are more advantages of the SVM method under bad condition and it is less sensitive to the sampling rate than other methods. Thus, SVM is an effective method for EBPSK demodulation and getting posterior probability for LDPC decoding. PMID:23213281
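The EBPSK demodulator itself is not available here; the sketch below only shows the generic scikit-learn mechanism for obtaining posterior probability estimates from an SVM classifier (Platt-style calibration) on synthetic two-class samples standing in for the filter output, which is the kind of PPE the abstract refers to.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for sampled filter outputs: class 1 ("bit = 1") has a small
# positive offset on a few sample points, class 0 does not.
n, d = 2000, 8
y = rng.integers(0, 2, size=n)
X = rng.normal(0.0, 1.0, size=(n, d))
X[y == 1, :3] += 0.8

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# probability=True enables Platt-scaled posterior probability estimates.
clf = SVC(kernel="rbf", C=1.0, probability=True, random_state=0)
clf.fit(X_train, y_train)

posteriors = clf.predict_proba(X_test)   # columns ordered as clf.classes_
print(clf.classes_, posteriors[:5], clf.score(X_test, y_test))
```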
Ryan, K; Williams, D Gareth; Balding, David J
2016-11-01
Many DNA profiles recovered from crime scene samples are of a quality that does not allow them to be searched against, nor entered into, databases. We propose a method for the comparison of profiles arising from two DNA samples, one or both of which can have multiple donors and be affected by low DNA template or degraded DNA. We compute likelihood ratios to evaluate the hypothesis that the two samples have a common DNA donor, and hypotheses specifying the relatedness of two donors. Our method uses a probability distribution for the genotype of the donor of interest in each sample. This distribution can be obtained from a statistical model, or we can exploit the ability of trained human experts to assess genotype probabilities, thus extracting much information that would be discarded by standard interpretation rules. Our method is compatible with established methods in simple settings, but is more widely applicable and can make better use of information than many current methods for the analysis of mixed-source, low-template DNA profiles. It can accommodate uncertainty arising from relatedness instead of or in addition to uncertainty arising from noisy genotyping. We describe a computer program GPMDNA, available under an open source licence, to calculate LRs using the method presented in this paper. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Probabilistic Design of a Mars Sample Return Earth Entry Vehicle Thermal Protection System
NASA Technical Reports Server (NTRS)
Dec, John A.; Mitcheltree, Robert A.
2002-01-01
The driving requirement for design of a Mars Sample Return mission is to assure containment of the returned samples. Designing to, and demonstrating compliance with, such a requirement requires physics based tools that establish the relationship between engineer's sizing margins and probabilities of failure. The traditional method of determining margins on ablative thermal protection systems, while conservative, provides little insight into the actual probability of an over-temperature during flight. The objective of this paper is to describe a new methodology for establishing margins on sizing the thermal protection system (TPS). Results of this Monte Carlo approach are compared with traditional methods.
Integrated system for gathering, processing, and reporting data relating to site contamination
Long, D.D.; Goldberg, M.S.; Baker, L.A.
1997-11-11
An integrated screening system comprises an intrusive sampling subsystem, a field mobile laboratory subsystem, a computer assisted design/geographical information subsystem, and a telecommunication linkup subsystem, all integrated to provide synergistically improved data relating to the extent of site soil/groundwater contamination. According to the present invention, data samples related to the soil, groundwater or other contamination of the subsurface material are gathered and analyzed to measure contaminants. Based on the location of origin of the samples in three-dimensional space, the analyzed data are transmitted to a location display. The data from analyzing samples and the data from the locating the origin are managed to project the next probable sample location. The next probable sample location is then forwarded for use as a guide in the placement of ensuing sample location, whereby the number of samples needed to accurately characterize the site is minimized. 10 figs.
NASA Technical Reports Server (NTRS)
Jasperson, W. H.; Nastrom, G. D.; Davis, R. E.; Holdeman, J. D.
1984-01-01
Summary studies are presented for the entire cloud observation archive from the NASA Global Atmospheric Sampling Program (GASP). Studies are also presented for GASP particle concentration data gathered concurrently with the cloud observations. Clouds were encountered on about 15 percent of the data samples overall, but the probability of cloud encounter is shown to vary significantly with altitude, latitude, and distance from the tropopause. Several meteorological circulation features are apparent in the latitudinal distribution of cloud cover, and the cloud encounter statistics are shown to be consistent with the classical mid-latitude cyclone model. Observations of clouds spaced more closely than 90 minutes are shown to be statistically dependent. The statistics for cloud and particle encounter are utilized to estimate the frequency of cloud encounter on long-range airline routes, and to assess the probability and extent of laminar flow loss due to cloud or particle encounter by aircraft utilizing laminar flow control (LFC). It is shown that the probability of extended cloud encounter is too low, of itself, to make LFC impractical.
NASA Astrophysics Data System (ADS)
Sari, Dwi Ivayana; Budayasa, I. Ketut; Juniati, Dwi
2017-08-01
The formulation of mathematical learning goals is now oriented not only toward cognitive products but also toward cognitive processes, including probabilistic thinking. Probabilistic thinking is needed by students to make decisions, and elementary school students are required to develop it as a foundation for learning probability at higher levels. A framework of students' probabilistic thinking had been developed using the SOLO taxonomy, consisting of prestructural, unistructural, multistructural and relational probabilistic thinking. This study aimed to analyze probability task completion based on this taxonomy of probabilistic thinking. The subjects were two fifth-grade students, a boy and a girl, selected by administering a test of mathematical ability and choosing students with high mathematical ability. The subjects were given probability tasks covering sample space, probability of an event and probability comparison. The data analysis consisted of categorization, reduction, interpretation and conclusion, with data credibility established through time triangulation. The results showed that the boy's probabilistic thinking in completing the probability tasks was at the multistructural level, while the girl's was at the unistructural level, indicating that the boy's level of probabilistic thinking was higher than the girl's. The results could contribute to curriculum developers in formulating probability learning goals for elementary school students. Indeed, teachers could teach probability with regard to gender differences.
Kraschnewski, Jennifer L; Keyserling, Thomas C; Bangdiwala, Shrikant I; Gizlice, Ziya; Garcia, Beverly A; Johnston, Larry F; Gustafson, Alison; Petrovic, Lindsay; Glasgow, Russell E; Samuel-Hodge, Carmen D
2010-01-01
Studies of type 2 translation, the adaption of evidence-based interventions to real-world settings, should include representative study sites and staff to improve external validity. Sites for such studies are, however, often selected by convenience sampling, which limits generalizability. We used an optimized probability sampling protocol to select an unbiased, representative sample of study sites to prepare for a randomized trial of a weight loss intervention. We invited North Carolina health departments within 200 miles of the research center to participate (N = 81). Of the 43 health departments that were eligible, 30 were interested in participating. To select a representative and feasible sample of 6 health departments that met inclusion criteria, we generated all combinations of 6 from the 30 health departments that were eligible and interested. From the subset of combinations that met inclusion criteria, we selected 1 at random. Of 593,775 possible combinations of 6 counties, 15,177 (3%) met inclusion criteria. Sites in the selected subset were similar to all eligible sites in terms of health department characteristics and county demographics. Optimized probability sampling improved generalizability by ensuring an unbiased and representative sample of study sites.
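A minimal sketch of the selection step described above (enumerate all 6-of-30 combinations, keep those meeting inclusion criteria, pick one at random) follows; the meets_criteria rule and the department attributes are placeholders, not the study's actual criteria.

```python
import itertools
import math
import random

random.seed(2024)

# Hypothetical eligible-and-interested health departments with two invented
# attributes used by a placeholder inclusion rule.
departments = [{"id": i, "urban": i % 3 == 0, "enrolment": 40 + 5 * (i % 7)}
               for i in range(30)]

def meets_criteria(combo):
    """Placeholder rule: at least 2 urban sites and total enrolment of 300 or more."""
    return (sum(d["urban"] for d in combo) >= 2
            and sum(d["enrolment"] for d in combo) >= 300)

print(math.comb(30, 6))   # 593775 possible combinations of 6 from 30

eligible_combos = [c for c in itertools.combinations(departments, 6)
                   if meets_criteria(c)]
chosen = random.choice(eligible_combos)
print(len(eligible_combos), sorted(d["id"] for d in chosen))
```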
LEGEND, a LEO-to-GEO Environment Debris Model
NASA Technical Reports Server (NTRS)
Liou, Jer Chyi; Hall, Doyle T.
2013-01-01
LEGEND (LEO-to-GEO Environment Debris model) is a three-dimensional orbital debris evolutionary model that is capable of simulating the historical and future debris populations in the near-Earth environment. The historical component in LEGEND adopts a deterministic approach to mimic the known historical populations. Launched rocket bodies, spacecraft, and mission-related debris (rings, bolts, etc.) are added to the simulated environment. Known historical breakup events are reproduced, and fragments down to 1 mm in size are created. The LEGEND future projection component adopts a Monte Carlo approach and uses an innovative pair-wise collision probability evaluation algorithm to simulate the future breakups and the growth of the debris populations. This algorithm is based on a new "random sampling in time" approach that preserves characteristics of the traditional approach and captures the rapidly changing nature of the orbital debris environment. LEGEND is a Fortran 90-based numerical simulation program. It operates in a UNIX/Linux environment.
Memory-efficient dynamic programming backtrace and pairwise local sequence alignment.
Newberg, Lee A
2008-08-15
A backtrace through a dynamic programming algorithm's intermediate results, whether in search of an optimal path, to sample paths according to an implied probability distribution, or as the second stage of a forward-backward algorithm, is a task of fundamental importance in computational biology. When there is insufficient space to store all intermediate results in high-speed memory (e.g. cache), existing approaches store selected stages of the computation and recompute missing values from these checkpoints on an as-needed basis. Here we present an optimal checkpointing strategy, and demonstrate its utility with pairwise local sequence alignment of sequences of length 10,000. Sample C++ code for optimal backtrace is available in the Supplementary Materials. Supplementary data are available at Bioinformatics online.
Jung, Minsoo
2015-01-01
When there is no sampling frame within a certain group, or when the group is concerned that making its population public would bring social stigma, we say the population is hidden. Such populations are difficult to survey because the response rate is low and members are not entirely honest in their responses when probability sampling is used. The only alternative known to address the problems of earlier methods such as snowball sampling is respondent-driven sampling (RDS), developed by Heckathorn and his colleagues. RDS is based on a Markov chain and uses the social network information of the respondents. This characteristic allows for probability sampling when surveying a hidden population. We verified through computer simulation whether RDS can be used on a hidden population of cancer survivors. According to the simulation results, the dependence of the RDS chain-referral sample on the initial seeds tends to diminish as the sample gets bigger, and the sample stabilizes as the waves progress. This shows that the final sample can be essentially independent of the initial seeds if a sufficient sample size is secured, even if the initial seeds were selected through convenience sampling. Thus, RDS can be considered an alternative that improves upon both key informant sampling and ethnographic surveys, and it should be applied to a variety of domestic cases as well.
Determinants of Workplace Injuries and Violence Among Newly Licensed RNs.
Unruh, Lynn; Asi, Yara
2018-06-01
Workplace injuries, such as musculoskeletal injuries, needlestick injuries, and emotional and physical violence, remain an issue in U.S. hospitals. To develop meaningful safety programs, it is important to identify workplace factors that contribute to injuries. This study explored factors that affect injuries in a sample of newly licensed registered nurses (NLRNs) in Florida. Regressions were run on models in which the dependent variable was the degree to which the respondent had experienced needlesticks, work-related musculoskeletal injuries, cuts or lacerations, contusions, verbal violence, physical violence, and other occupational injuries. A higher probability of these injuries was associated with greater length of employment, working evening or night shifts, working overtime, and reporting job difficulties and pressures. A lower probability was associated with working in a teaching hospital and working more hours. Study findings suggest that work environment issues must be addressed for safety programs to be effective.
State-wide monitoring based on probability survey designs requires a spatially explicit representation of all streams and rivers of interest within a state, i.e., a sample frame. The sample frame should be the best available map representation of the resource. Many stream progr...
Ethnic Group Bias in Intelligence Test Items.
ERIC Educational Resources Information Center
Scheuneman, Janice
In previous studies of ethnic group bias in intelligence test items, the question of bias has been confounded with ability differences between the ethnic group samples compared. The present study is based on a conditional probability model in which an unbiased item is defined as one where the probability of a correct response to an item is the…
Ximenes, Ricardo Arraes de Alencar; Pereira, Leila Maria Beltrão; Martelli, Celina Maria Turchi; Merchán-Hamann, Edgar; Stein, Airton Tetelbom; Figueiredo, Gerusa Maria; Braga, Maria Cynthia; Montarroyos, Ulisses Ramos; Brasil, Leila Melo; Turchi, Marília Dalva; Fonseca, José Carlos Ferraz da; Lima, Maria Luiza Carvalho de; Alencar, Luis Cláudio Arraes de; Costa, Marcelo; Coral, Gabriela; Moreira, Regina Celia; Cardoso, Maria Regina Alves
2010-09-01
A population-based survey to provide information on the prevalence of hepatitis viral infection and the pattern of risk factors was carried out in the urban population of all Brazilian state capitals and the Federal District, between 2005 and 2009. This paper describes the design and methodology of the study, which involved a population aged 5 to 19 for hepatitis A and 10 to 69 for hepatitis B and C. Interviews and blood samples were obtained through household visits. The sample was selected using stratified multi-stage cluster sampling and was drawn with equal probability from each domain of study (region and age-group). Nationwide, 19,280 households and ~31,000 residents were selected. The study is large enough to detect a prevalence of viral infection of around 0.1% and to assess risk factors within each region. The methodology seems to be a viable way of differentiating between distinct epidemiological patterns of hepatitis A, B and C. These data will be of value for the evaluation of vaccination policies and for the design of control program strategies.
Selecting supplier combination based on fuzzy multicriteria analysis
NASA Astrophysics Data System (ADS)
Han, Zhi-Qiu; Luo, Xin-Xing; Chen, Xiao-Hong; Yang, Wu-E.
2015-07-01
Existing multicriteria analysis (MCA) methods are probably ineffective in selecting a supplier combination. Thus, an MCA-based fuzzy 0-1 programming method is introduced. The programming relates to a simple MCA matrix that is used to select a single supplier. By solving the programming, the most feasible combination of suppliers is selected. Importantly, this result differs from selecting suppliers one by one according to a single-selection order, which is used to rank sole suppliers in existing MCA methods. An example highlights such difference and illustrates the proposed method.
Using quantum principles to develop independent continuing nursing education programs.
Zurlinden, Jeffrey; Pepsnik, Dawn
2013-01-01
Innovations in health care call for fresh approaches to continuing nursing education that support lateral relationships, teamwork, and collaboration. To foster this transformation, we devised the following education principles: Everyone teaches, everyone learns; embrace probability; information is dynamic; and trust professionals to practice professionally. These principles guided the development of seven independent, practice-specific, evidence-based continuing nursing education programs totaling 21.5 contact hours for casual-status nurses who practiced as childbirth educators. The programs were popular, promoted teamwork, and increased communication about evidence-based practice.
NASA Astrophysics Data System (ADS)
Bare, William D.
2000-07-01
An argument is presented which suggests that the commonly seen calculus-based derivations of Beer's law may not be adequately useful to students and may in fact contribute to widely held misconceptions about the interaction of light with absorbing samples. For this reason, an alternative derivation of Beer's law based on a corpuscular model and the laws of probability is presented. Unlike many previously reported derivations, that presented here does not require the use of calculus, nor does it require the assumption of absorption properties in an infinitesimally thin film. The corpuscular-probability model and its accompanying derivation of Beer's law are believed to comprise a more pedagogically effective presentation than those presented previously.
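Not the article's derivation, but a quick numerical check of the corpuscular picture it advocates: if a photon has n independent chances along its path of being absorbed, each with probability p, the transmitted fraction is (1 - p)^n, which approaches the exponential Beer-Lambert attenuation, and no calculus or infinitesimal films are needed. All numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

n_photons = 100_000
n_encounters = 300       # independent chances for a photon to be absorbed along the path
p_absorb = 0.005         # absorption probability per encounter (assumed)

# Each photon survives an encounter with probability (1 - p); simulate all encounters.
survived = rng.random((n_photons, n_encounters)) > p_absorb
transmitted_fraction = survived.all(axis=1).mean()

corpuscular = (1.0 - p_absorb) ** n_encounters     # (1 - p)^n from the probability model
beer_lambert = np.exp(-p_absorb * n_encounters)    # exponential limit for small p

# Absorbance A = -log10(T) then grows linearly with the number of encounters (path length).
print(transmitted_fraction, corpuscular, beer_lambert)
```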
Cost-Effectiveness and Cost-Utility of Internet-Based Computer Tailoring for Smoking Cessation
Evers, Silvia MAA; de Vries, Hein; Hoving, Ciska
2013-01-01
Background Although effective smoking cessation interventions exist, information is limited about their cost-effectiveness and cost-utility. Objective To assess the cost-effectiveness and cost-utility of an Internet-based multiple computer-tailored smoking cessation program and tailored counseling by practice nurses working in Dutch general practices compared with an Internet-based multiple computer-tailored program only and care as usual. Methods The economic evaluation was embedded in a randomized controlled trial, for which 91 practice nurses recruited 414 eligible smokers. Smokers were randomized to receive multiple tailoring and counseling (n=163), multiple tailoring only (n=132), or usual care (n=119). Self-reported cost and quality of life were assessed during a 12-month follow-up period. Prolonged abstinence and 24-hour and 7-day point prevalence abstinence were assessed at 12-month follow-up. The trial-based economic evaluation was conducted from a societal perspective. Uncertainty was accounted for by bootstrapping (1000 times) and sensitivity analyses. Results No significant differences were found between the intervention arms with regard to baseline characteristics or effects on abstinence, quality of life, and addiction level. However, participants in the multiple tailoring and counseling group reported significantly more annual health care–related costs than participants in the usual care group. Cost-effectiveness analysis, using prolonged abstinence as the outcome measure, showed that the multiple computer-tailored program alone had the highest probability of being cost-effective. Compared with usual care, in this group €5100 had to be paid for each additional abstinent participant. With regard to cost-utility analyses, using quality of life as the outcome measure, usual care was probably most efficient. Conclusions To our knowledge, this was the first study to determine the cost-effectiveness and cost-utility of an Internet-based smoking cessation program with and without counseling by a practice nurse. Although the Internet-based multiple computer-tailored program seemed to be the most cost-effective treatment, the cost-utility was probably highest for care as usual. However, to ease the interpretation of cost-effectiveness results, future research should aim at identifying an acceptable cutoff point for the willingness to pay per abstinent participant. PMID:23491820
An Expert-System Engine With Operative Probabilities
NASA Technical Reports Server (NTRS)
Orlando, N. E.; Palmer, M. T.; Wallace, R. S.
1986-01-01
Program enables proof-of-concept tests of expert systems under development. AESOP is rule-based inference engine for expert system, which makes decisions about particular situation given user-supplied hypotheses, rules, and answers to questions drawn from rules. If knowledge base containing hypotheses and rules governing environment is available to AESOP, almost any situation within that environment can be resolved by answering questions asked by AESOP. Questions answered with YES, NO, MAYBE, DON'T KNOW, DON'T CARE, or with probability factor ranging from 0 to 10. AESOP written in Franz LISP for interactive execution.
European Scientific Notes, Volume 35, Number 11
1981-11-30
Probabilistic confidence for decisions based on uncertain reliability estimates
NASA Astrophysics Data System (ADS)
Reid, Stuart G.
2013-05-01
Reliability assessments are commonly carried out to provide a rational basis for risk-informed decisions concerning the design or maintenance of engineering systems and structures. However, calculated reliabilities and associated probabilities of failure often have significant uncertainties associated with the possible estimation errors relative to the 'true' failure probabilities. For uncertain probabilities of failure, a measure of 'probabilistic confidence' has been proposed to reflect the concern that uncertainty about the true probability of failure could result in a system or structure that is unsafe and could subsequently fail. The paper describes how the concept of probabilistic confidence can be applied to evaluate and appropriately limit the probabilities of failure attributable to particular uncertainties such as design errors that may critically affect the dependability of risk-acceptance decisions. This approach is illustrated with regard to the dependability of structural design processes based on prototype testing with uncertainties attributable to sampling variability.
Lorz, C; Fürst, C; Galic, Z; Matijasic, D; Podrazky, V; Potocic, N; Simoncic, P; Strauch, M; Vacik, H; Makeschin, F
2010-12-01
We assessed the probability of three major natural hazards--windthrow, drought, and forest fire--for Central and South-Eastern European forests which are major threats for the provision of forest goods and ecosystem services. In addition, we analyzed spatial distribution and implications for a future oriented management of forested landscapes. For estimating the probability of windthrow, we used rooting depth and average wind speed. Probabilities of drought and fire were calculated from climatic and total water balance during growing season. As an approximation to climate change scenarios, we used a simplified approach with a general increase of pET by 20%. Monitoring data from the pan-European forests crown condition program and observed burnt areas and hot spots from the European Forest Fire Information System were used to test the plausibility of probability maps. Regions with high probabilities of natural hazard are identified and management strategies to minimize probability of natural hazards are discussed. We suggest future research should focus on (i) estimating probabilities using process based models (including sensitivity analysis), (ii) defining probability in terms of economic loss, (iii) including biotic hazards, (iv) using more detailed data sets on natural hazards, forest inventories and climate change scenarios, and (v) developing a framework of adaptive risk management.
NASA Astrophysics Data System (ADS)
Goldbery, R.; Tehori, O.
SEDPAK provides a comprehensive software package for operation of a settling tube and sand analyzer (2-0.063 mm) and includes data-processing programs for statistical and graphic output of results. The programs are menu-driven and written in APPLESOFT BASIC, conforming with APPLE 3.3 DOS. Data storage and retrieval from disc is an important feature of SEDPAK. Additional features of SEDPAK include condensation of raw settling data via standard size-calibration curves to yield statistical grain-size parameters, plots of grain-size frequency distributions, and cumulative log/probability curves. The program also has a module for processing grain-size frequency data from sieved samples. A further feature of SEDPAK is the option for automatic data processing and graphic output of a sequential or nonsequential array of samples on one side of a disc.
Rajasekaran, Sanguthevar
2013-01-01
Designing efficient tile sets for self-assembling rectilinear shapes is of critical importance in algorithmic self-assembly. A lower bound on the tile complexity of any deterministic self-assembly system for an n × n square is Ω(log(n)/log(log(n))) (inferred from Kolmogorov complexity). Deterministic self-assembly systems with optimal tile complexity have been designed for squares and related shapes in the past. However, designing Θ(log(n)/log(log(n))) unique tiles specific to a shape is still an intensive task in the laboratory. On the other hand, copies of a tile can be made rapidly using PCR (polymerase chain reaction) experiments. This led to the study of self-assembly in tile concentration programming models. We present two major results in this paper on the concentration programming model. First we show how to self-assemble rectangles with a fixed aspect ratio (α:β), with high probability, using Θ(α + β) tiles. This result is much stronger than the existing results by Kao et al. (Randomized self-assembly for approximate shapes, LNCS, vol 5125. Springer, Heidelberg, 2008) and Doty (Randomized self-assembly for exact shapes. In: proceedings of the 50th annual IEEE symposium on foundations of computer science (FOCS), IEEE, Atlanta. pp 85–94, 2009)—which can only self-assemble squares and rely on tiles that perform binary arithmetic. Our result, in contrast, is based on a technique called staircase sampling. This technique eliminates the need for sub-tiles that perform binary arithmetic, reduces the constant in the asymptotic bound, and eliminates the need for approximate frames (Kao et al. Randomized self-assembly for approximate shapes, LNCS, vol 5125. Springer, Heidelberg, 2008). Our second result applies staircase sampling to the equimolar concentration programming model (The tile complexity of linear assemblies. In: proceedings of the 36th international colloquium automata, languages and programming: Part I on ICALP '09, Springer-Verlag, pp 235–253, 2009), to self-assemble rectangles (of fixed aspect ratio) with high probability. The tile complexity of our algorithm is Θ(log(n)) and is optimal on the probabilistic tile assembly model (PTAM)—n being an upper bound on the dimensions of a rectangle. PMID:24311993
The estimation of tree posterior probabilities using conditional clade probability distributions.
Larget, Bret
2013-07-01
In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample.
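A toy sketch of the idea (not Larget's software): estimate a tree's probability as the product, over its clades, of empirical conditional clade (split) probabilities learned from a posterior sample. The hand-encoded "trees" below are invented purely for illustration and carry only one non-trivial split each.

```python
from collections import Counter, defaultdict

# Each "tree" maps a parent clade (frozenset of taxa) to its split into two child
# clades; only non-trivial parents are listed. These posterior samples are invented.
def tree(splits):
    return {frozenset(p): (frozenset(l), frozenset(r)) for p, (l, r) in splits.items()}

t1 = tree({("A", "B", "C", "D"): (("A", "B"), ("C", "D"))})
t2 = tree({("A", "B", "C", "D"): (("A", "B"), ("C", "D"))})
t3 = tree({("A", "B", "C", "D"): (("A", "C"), ("B", "D"))})
sample = [t1, t2, t3]

# Count how often each parent clade appears and how often it is split a given way.
parent_counts = Counter()
split_counts = defaultdict(Counter)
for t in sample:
    for parent, split in t.items():
        parent_counts[parent] += 1
        split_counts[parent][frozenset(split)] += 1

def conditional_clade_probability(t):
    """Product over parent clades of P(split | parent) estimated from the sample."""
    prob = 1.0
    for parent, split in t.items():
        prob *= split_counts[parent][frozenset(split)] / parent_counts[parent]
    return prob

for name, t in [("t1", t1), ("t3", t3)]:
    print(name, conditional_clade_probability(t))
```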
Statistical Symbolic Execution with Informed Sampling
NASA Technical Reports Server (NTRS)
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space, and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths, leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that informed sampling obtains more precise results and converges faster than a purely statistical analysis, and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can still give meaningful results under the same time and memory limits.
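A schematic of the estimation step described above, not the Symbolic PathFinder implementation: the exactly analysed probability mass of the pruned paths is combined with a Beta-posterior estimate for the sampled remainder. All numbers are invented.

```python
# Suppose an exact analysis of the high-probability (pruned) paths found that they
# carry 70% of the total path probability, of which 0.12 reaches the target event.
mass_pruned = 0.70
target_mass_pruned = 0.12

# Monte Carlo sampling of the remaining paths: k of n sampled paths hit the target.
n_samples, k_hits = 500, 35

# Beta(1, 1) prior on the conditional hit probability within the unpruned mass.
alpha, beta = 1 + k_hits, 1 + (n_samples - k_hits)
posterior_mean = alpha / (alpha + beta)
posterior_var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

# Total probability of reaching the target = exact part + estimated remainder.
estimate = target_mass_pruned + (1.0 - mass_pruned) * posterior_mean
std_error = (1.0 - mass_pruned) * posterior_var ** 0.5
print(estimate, std_error)
```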
O’Brien, Kathryn; Edwards, Adrian; Hood, Kerenza; Butler, Christopher C
2013-01-01
Background Urinary tract infection (UTI) in children may be associated with long-term complications that could be prevented by prompt treatment. Aim To determine the prevalence of UTI in acutely ill children ≤ 5 years presenting in general practice and to explore patterns of presenting symptoms and urine sampling strategies. Design and setting Prospective observational study with systematic urine sampling, in general practices in Wales, UK. Method In total, 1003 children were recruited from 13 general practices between March 2008 and July 2010. The prevalence of UTI was determined and multivariable analysis performed to determine the probability of UTI. Result Out of 597 (60.0%) children who provided urine samples within 2 days, the prevalence of UTI was 5.9% (95% confidence interval [CI] = 4.3% to 8.0%) overall, 7.3% in those < 3 years and 3.2% in 3–5 year olds. Neither a history of fever nor the absence of an alternative source of infection was associated with UTI (P = 0.64; P = 0.69, respectively). The probability of UTI in children aged ≥3 years without increased urinary frequency or dysuria was 2%. The probability of UTI was ≥5% in all other groups. Urine sampling based purely on GP suspicion would have missed 80% of UTIs, while a sampling strategy based on current guidelines would have missed 50%. Conclusion Approximately 6% of acutely unwell children presenting to UK general practice met the criteria for a laboratory diagnosis of UTI. This higher than previously recognised prior probability of UTI warrants raised awareness of the condition and suggests clinicians should lower their threshold for urine sampling in young children. The absence of fever or presence of an alternative source of infection, as emphasised in current guidelines, may not rule out UTI in young children with adequate certainty. PMID:23561695
UQ for Decision Making: How (at least five) Kinds of Probability Might Come Into Play
NASA Astrophysics Data System (ADS)
Smith, L. A.
2013-12-01
In 1959 IJ Good published the discussion "Kinds of Probability" in Science. Good identified (at least) five kinds. The need for (at least) a sixth kind of probability when quantifying uncertainty in the context of climate science is discussed. This discussion brings out the differences between weather-like forecasting tasks and climate-like tasks, with a focus on the effective use both of science and of modelling in support of decision making. Good also introduced the idea of a "Dynamic probability", a probability one expects to change without any additional empirical evidence; the probabilities assigned by a chess-playing program when it is only halfway through its analysis being an example. This case is contrasted with the case of "Mature probabilities", where a forecast algorithm (or model) has converged on its asymptotic probabilities and the question hinges on whether or not those probabilities are expected to change significantly before the event in question occurs, even in the absence of new empirical evidence. If so, then how might one report and deploy such immature probabilities rationally in scientific support of decision-making? Mature probability is suggested as a useful sixth kind; although Good would doubtless argue that we can get by with just one, effective communication with decision makers may be enhanced by speaking as if the others existed. This again highlights the distinction between weather-like contexts and climate-like contexts. In the former one has access to a relevant climatology (a relevant, arguably informative distribution prior to any model simulations); in the latter that information is not available, although one can fall back on the scientific basis upon which the model itself rests and estimate the probability that the model output is in fact misinformative. This subjective "probability of a big surprise" is one way to communicate the probability that the information on which the model-based probability is conditioned actually holds in practice. It is argued that no model-based climate-like probability forecast is complete without a quantitative estimate of its own irrelevance, and that the clear identification of model-based probability forecasts as mature or immature is a critical element for maintaining the credibility of science-based decision support, and can shape uncertainty quantification more widely.
Papini, Paolo; Faustini, Annunziata; Manganello, Rosa; Borzacchi, Giancarlo; Spera, Domenico; Perucci, Carlo A
2005-01-01
To determine the frequency of sampling in small water distribution systems (<5,000 inhabitants) and compare the results according to different hypotheses about the bacteria distribution, we carried out two sampling programs to monitor the water distribution system in a town in Central Italy between July and September 1992; the assumption of a Poisson distribution implied 4 water samples, whereas the assumption of a negative binomial distribution implied 21 samples. Coliform organisms were used as indicators of water safety. The network consisted of two pipe rings and two wells fed by the same water source. The number of summer customers varied considerably, from 3,000 to 20,000. The mean density was 2.33 coliforms/100 ml (sd= 5.29) for 21 samples and 3 coliforms/100 ml (sd= 6) for four samples. However, the hypothesis of homogeneity was rejected (p-value <0.001), and the probability of a type II error under the assumption of heterogeneity was higher with 4 samples (beta= 0.24) than with 21 (beta= 0.05). For this small network, determining the sample size according to the heterogeneity hypothesis strengthens the statement that the water is drinkable, compared with the homogeneity assumption.
Metocean design parameter estimation for fixed platform based on copula functions
NASA Astrophysics Data System (ADS)
Zhai, Jinjin; Yin, Qilin; Dong, Sheng
2017-08-01
Considering the dependence among wave height, wind speed, and current velocity, we construct novel trivariate joint probability distributions via Archimedean copula functions. Thirty years of hindcast wave height, wind speed, and current velocity data for the Bohai Sea are sampled for a case study. Four kinds of distributions, namely, the Gumbel distribution, lognormal distribution, Weibull distribution, and Pearson Type III distribution, are candidate models for the marginal distributions of wave height, wind speed, and current velocity. The Pearson Type III distribution is selected as the optimal model. Bivariate and trivariate probability distributions of these environmental conditions are established based on four bivariate and trivariate Archimedean copulas, namely, the Clayton, Frank, Gumbel-Hougaard, and Ali-Mikhail-Haq copulas. These joint probability models can maximize marginal information and the dependence among the three variables. The design return values of the three variables can be obtained by three methods: univariate probability, conditional probability, and joint probability. The joint return periods of different load combinations are estimated by the proposed models. Platform responses (including base shear, overturning moment, and deck displacement) are further calculated. For the same return period, the design values of wave height, wind speed, and current velocity obtained by the conditional and joint probability models are much smaller than those obtained by univariate probability. By accounting for the dependence among variables, the multivariate probability distributions provide design parameters closer to the actual sea state for ocean platform design.
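The paper's fitted models are not reproduced here; as a generic sketch, the bivariate Clayton copula below couples two assumed gamma marginals (simple stand-ins for fitted Pearson Type III laws) and returns the joint exceedance probability and the corresponding "AND" joint return period for a chosen wave-height/wind-speed pair. The dependence parameter and event frequency are assumptions.

```python
from scipy import stats

# Assumed marginals (gamma distributions as stand-ins for fitted Pearson Type III laws).
wave = stats.gamma(a=2.5, scale=1.2)    # significant wave height (m)
wind = stats.gamma(a=4.0, scale=3.0)    # wind speed (m/s)

def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta)."""
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

theta = 1.8            # assumed dependence parameter
h, w = 6.0, 22.0       # candidate design pair: 6 m waves with 22 m/s wind

u, v = wave.cdf(h), wind.cdf(w)
joint_cdf = clayton_cdf(u, v, theta)
p_and = 1.0 - u - v + joint_cdf          # P(H > h and W > w) by inclusion-exclusion

events_per_year = 1.0                    # assumed: one design sea state per year
return_period_and = 1.0 / (events_per_year * p_and)
print(u, v, p_and, return_period_and)
```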
Faith, Daniel P
2008-12-01
New species conservation strategies, including the EDGE of Existence (EDGE) program, have expanded threatened species assessments by integrating information about species' phylogenetic distinctiveness. Distinctiveness has been measured through simple scores that assign shared credit among species for evolutionary heritage represented by the deeper phylogenetic branches. A species with a high score combined with a high extinction probability receives high priority for conservation efforts. Simple hypothetical scenarios for phylogenetic trees and extinction probabilities demonstrate how such scoring approaches can provide inefficient priorities for conservation. An existing probabilistic framework derived from the phylogenetic diversity measure (PD) properly captures the idea of shared responsibility for the persistence of evolutionary history. It avoids static scores, takes into account the status of close relatives through their extinction probabilities, and allows for the necessary updating of priorities in light of changes in species threat status. A hypothetical phylogenetic tree illustrates how changes in extinction probabilities of one or more species translate into changes in expected PD. The probabilistic PD framework provided a range of strategies that moved beyond expected PD to better consider worst-case PD losses. In another example, risk aversion gave higher priority to a conservation program that provided a smaller, but less risky, gain in expected PD. The EDGE program could continue to promote a list of top species conservation priorities through application of probabilistic PD and simple estimates of current extinction probability. The list might be a dynamic one, with all the priority scores updated as extinction probabilities change. Results of recent studies suggest that estimation of extinction probabilities derived from the red list criteria linked to changes in species range sizes may provide estimated probabilities for many different species. Probabilistic PD provides a framework for single-species assessment that is well-integrated with a broader measurement of impacts on PD owing to climate change and other factors.
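The central quantity in the probabilistic PD framework discussed above, expected phylogenetic diversity under species-specific extinction probabilities, can be written down in a few lines. The sketch below uses an invented four-branch tree and made-up extinction probabilities; it simply sums branch lengths weighted by the probability that at least one descendant of each branch survives.

```python
# Hedged sketch: expected phylogenetic diversity (PD) under species extinction
# probabilities, computed as the sum over branches of branch length times the
# probability that at least one descendant species survives. The tiny tree and
# the extinction probabilities are invented for illustration.
import math

# Each branch: (length, set of descendant species).
branches = [
    (4.0, {"A", "B"}),     # deep branch shared by A and B
    (1.0, {"A"}),
    (1.0, {"B"}),
    (5.0, {"C"}),          # long branch leading to a phylogenetically distinct species
]
p_extinct = {"A": 0.8, "B": 0.2, "C": 0.6}

def expected_pd(branches, p_extinct):
    total = 0.0
    for length, descendants in branches:
        p_all_lost = math.prod(p_extinct[s] for s in descendants)
        total += length * (1.0 - p_all_lost)   # branch is retained if any descendant survives
    return total

print(f"expected PD = {expected_pd(branches, p_extinct):.2f}")
```

Updating any entry of `p_extinct` and recomputing illustrates how changes in a single species' threat status propagate into expected PD, which is the dynamic updating the abstract argues for.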
Detecting truly clonal alterations from multi-region profiling of tumours
Werner, Benjamin; Traulsen, Arne; Sottoriva, Andrea; Dingli, David
2017-01-01
Modern cancer therapies aim at targeting tumour-specific alterations, such as mutations or neo-antigens, and maximal treatment efficacy requires that targeted alterations are present in all tumour cells. Currently, treatment decisions are based on one or a few samples per tumour, creating uncertainty on whether alterations found in those samples are actually present in all tumour cells. The probability of classifying clonal versus sub-clonal alterations from multi-region profiling of tumours depends on the earliest phylogenetic branching event during tumour growth. By analysing 181 samples from 10 renal carcinoma and 11 colorectal cancers we demonstrate that the information gain from additional sampling falls onto a simple universal curve. We found that in colorectal cancers, 30% of alterations identified as clonal with one biopsy proved sub-clonal when 8 samples were considered. The probability to overestimate clonal alterations fell below 1% in 7/11 patients with 8 samples per tumour. In renal cell carcinoma, 8 samples reduced the list of clonal alterations by 40% with respect to a single biopsy. The probability to overestimate clonal alterations remained as high as 92% in 7/10 renal cancer patients. Furthermore, treatment was associated with more unbalanced tumour phylogenetic trees, suggesting the need of denser sampling of tumours at relapse. PMID:28344344
Detecting truly clonal alterations from multi-region profiling of tumours
NASA Astrophysics Data System (ADS)
Werner, Benjamin; Traulsen, Arne; Sottoriva, Andrea; Dingli, David
2017-03-01
Modern cancer therapies aim at targeting tumour-specific alterations, such as mutations or neo-antigens, and maximal treatment efficacy requires that targeted alterations are present in all tumour cells. Currently, treatment decisions are based on one or a few samples per tumour, creating uncertainty on whether alterations found in those samples are actually present in all tumour cells. The probability of classifying clonal versus sub-clonal alterations from multi-region profiling of tumours depends on the earliest phylogenetic branching event during tumour growth. By analysing 181 samples from 10 renal carcinoma and 11 colorectal cancers we demonstrate that the information gain from additional sampling falls onto a simple universal curve. We found that in colorectal cancers, 30% of alterations identified as clonal with one biopsy proved sub-clonal when 8 samples were considered. The probability to overestimate clonal alterations fell below 1% in 7/11 patients with 8 samples per tumour. In renal cell carcinoma, 8 samples reduced the list of clonal alterations by 40% with respect to a single biopsy. The probability to overestimate clonal alterations remained as high as 92% in 7/10 renal cancer patients. Furthermore, treatment was associated with more unbalanced tumour phylogenetic trees, suggesting the need of denser sampling of tumours at relapse.
POF-Darts: Geometric adaptive sampling for probability of failure
Ebeida, Mohamed S.; Mitchell, Scott A.; Swiler, Laura P.; ...
2016-06-18
We introduce a novel technique, POF-Darts, to estimate the Probability Of Failure based on random disk-packing in the uncertain parameter space. POF-Darts uses hyperplane sampling to explore the unexplored part of the uncertain space. We use the function evaluation at a sample point to determine whether it belongs to failure or non-failure regions, and surround it with a protection sphere region to avoid clustering. We decompose the domain into Voronoi cells around the function evaluations as seeds and choose the radius of the protection sphere depending on the local Lipschitz continuity. As sampling proceeds, regions not covered by spheres will shrink, improving the estimation accuracy. After exhausting the function evaluation budget, we build a surrogate model using the function evaluations associated with the sample points and estimate the probability of failure by exhaustive sampling of that surrogate. In comparison to other similar methods, our algorithm has the advantages of decoupling the sampling step from the surrogate construction one, the ability to reach target POF values with fewer samples, and the capability of estimating the number and locations of disconnected failure regions, not just the POF value. Furthermore, we present various examples to demonstrate the efficiency of our novel approach.
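The final step described above, building a surrogate from a limited evaluation budget and then estimating the probability of failure by exhaustive sampling of that surrogate, can be illustrated with a small sketch. The snippet below assumes a toy limit-state function and a Gaussian-process surrogate; it is not the disk-packing/Voronoi machinery of POF-Darts itself.

```python
# Hedged sketch of the surrogate-based POF estimate described above: fit a
# surrogate to a limited budget of "expensive" limit-state evaluations, then
# estimate the probability of failure by exhaustively sampling the cheap
# surrogate. The toy limit state, surrogate choice, and settings are assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)

def limit_state(x):
    """Toy 2-D limit state: failure when g(x) < 0."""
    return 3.0 - x[:, 0] ** 2 - 0.5 * x[:, 1]

# 1. Spend a small evaluation budget on the "expensive" function.
X_train = rng.uniform(-3, 3, size=(60, 2))
g_train = limit_state(X_train)

# 2. Build a surrogate from those evaluations.
surrogate = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
surrogate.fit(X_train, g_train)

# 3. Exhaustively sample the surrogate to estimate the POF.
X_mc = rng.uniform(-3, 3, size=(200_000, 2))
pof_surrogate = np.mean(surrogate.predict(X_mc) < 0.0)
pof_true = np.mean(limit_state(X_mc) < 0.0)   # available here only because the toy g is cheap
print(f"POF (surrogate) ~ {pof_surrogate:.4f}   POF (direct) ~ {pof_true:.4f}")
```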
Guan, Li; Hao, Bibo; Cheng, Qijin; Yip, Paul SF
2015-01-01
Background: Traditional offline assessment of suicide probability is time consuming, and it is difficult to convince at-risk individuals to participate. Identifying individuals with high suicide probability through online social media has an advantage in its efficiency and potential to reach hidden individuals, yet little research has focused on this specific field. Objective: The objective of this study was to apply two classification models, Simple Logistic Regression (SLR) and Random Forest (RF), to examine the feasibility and effectiveness of identifying microblog users in China with high suicide probability through profile and linguistic features extracted from Internet-based data. Methods: Nine hundred and nine Chinese microblog users completed an Internet survey, and those scoring one SD above the mean of the total Suicide Probability Scale (SPS) score, as well as one SD above the mean in each of the four subscale scores in the participant sample, were labeled as high-risk individuals, respectively. Profile and linguistic features were fed into two machine learning algorithms (SLR and RF) to train models that aim to identify high-risk individuals in general suicide probability and in its four dimensions. Models were trained and then tested by 5-fold cross-validation, in which both the training set and the test set were generated under the stratified random sampling rule from the whole sample. Three classic performance metrics (Precision, Recall, F1 measure) and a specifically defined metric, "Screening Efficiency", were adopted to evaluate model effectiveness. Results: Classification performance was generally matched between SLR and RF. Given the best performance of the classification models, we were able to retrieve over 70% of the labeled high-risk individuals in overall suicide probability as well as in the four dimensions. Screening Efficiency of most models varied from 1/4 to 1/2. Precision of the models was generally below 30%. Conclusions: Individuals in China with high suicide probability are recognizable from profile and text-based information on microblogs. Although there is still much room to improve the performance of the classification models, this study may shed light on preliminary screening of at-risk individuals via machine learning algorithms, which can work side by side with expert scrutiny to increase efficiency in large-scale surveillance of suicide probability from online social media. PMID:26543921
Guan, Li; Hao, Bibo; Cheng, Qijin; Yip, Paul Sf; Zhu, Tingshao
2015-01-01
Traditional offline assessment of suicide probability is time consuming, and it is difficult to convince at-risk individuals to participate. Identifying individuals with high suicide probability through online social media has an advantage in its efficiency and potential to reach hidden individuals, yet little research has focused on this specific field. The objective of this study was to apply two classification models, Simple Logistic Regression (SLR) and Random Forest (RF), to examine the feasibility and effectiveness of identifying microblog users in China with high suicide probability through profile and linguistic features extracted from Internet-based data. Nine hundred and nine Chinese microblog users completed an Internet survey, and those scoring one SD above the mean of the total Suicide Probability Scale (SPS) score, as well as one SD above the mean in each of the four subscale scores in the participant sample, were labeled as high-risk individuals, respectively. Profile and linguistic features were fed into two machine learning algorithms (SLR and RF) to train models that aim to identify high-risk individuals in general suicide probability and in its four dimensions. Models were trained and then tested by 5-fold cross-validation, in which both the training set and the test set were generated under the stratified random sampling rule from the whole sample. Three classic performance metrics (Precision, Recall, F1 measure) and a specifically defined metric, "Screening Efficiency", were adopted to evaluate model effectiveness. Classification performance was generally matched between SLR and RF. Given the best performance of the classification models, we were able to retrieve over 70% of the labeled high-risk individuals in overall suicide probability as well as in the four dimensions. Screening Efficiency of most models varied from 1/4 to 1/2. Precision of the models was generally below 30%. Individuals in China with high suicide probability are recognizable from profile and text-based information on microblogs. Although there is still much room to improve the performance of the classification models, this study may shed light on preliminary screening of at-risk individuals via machine learning algorithms, which can work side by side with expert scrutiny to increase efficiency in large-scale surveillance of suicide probability from online social media.
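The modelling pipeline described in the two records above is a standard supervised-classification setup. The sketch below assumes synthetic features in place of the study's profile and linguistic features and uses ordinary logistic regression as a stand-in for "Simple Logistic Regression"; it reproduces the structure (two classifiers, stratified 5-fold cross-validation, precision/recall/F1), not the study's data or results.

```python
# Hedged sketch of the described pipeline: train logistic regression and random
# forest classifiers to flag "high risk" users under stratified 5-fold
# cross-validation and report precision, recall, and F1. Synthetic features stand
# in for the study's profile and linguistic features; the label would come from
# Suicide Probability Scale scores more than one SD above the sample mean.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=909, n_features=20, n_informative=8,
                           weights=[0.85, 0.15], random_state=0)   # ~15% "high risk" class

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "SLR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=300, random_state=0),
}
for name, model in models.items():
    res = cross_validate(model, X, y, cv=cv, scoring=("precision", "recall", "f1"))
    print(f"{name}: precision={res['test_precision'].mean():.2f} "
          f"recall={res['test_recall'].mean():.2f} f1={res['test_f1'].mean():.2f}")
```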
Validation of Statistical Sampling Algorithms in Visual Sample Plan (VSP): Summary Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nuffer, Lisa L; Sego, Landon H.; Wilson, John E.
2009-02-18
The U.S. Department of Homeland Security, Office of Technology Development (OTD) contracted with a set of U.S. Department of Energy national laboratories, including the Pacific Northwest National Laboratory (PNNL), to write a Remediation Guidance for Major Airports After a Chemical Attack. The report identifies key activities and issues that should be considered by a typical major airport following an incident involving release of a toxic chemical agent. Four experimental tasks were identified that would require further research in order to supplement the Remediation Guidance. One of the tasks, Task 4, OTD Chemical Remediation Statistical Sampling Design Validation, dealt with statistical sampling algorithm validation. This report documents the results of the sampling design validation conducted for Task 4. In 2005, the Government Accountability Office (GAO) performed a review of the past U.S. responses to Anthrax terrorist cases. Part of the motivation for this PNNL report was a major GAO finding that there was a lack of validated sampling strategies in the U.S. response to Anthrax cases. The report (GAO 2005) recommended that probability-based methods be used for sampling design in order to address confidence in the results, particularly when all sample results showed no remaining contamination. The GAO also expressed a desire that the methods be validated, which is the main purpose of this PNNL report. The objective of this study was to validate probability-based statistical sampling designs and the algorithms pertinent to within-building sampling that allow the user to prescribe or evaluate confidence levels of conclusions based on data collected as guided by the statistical sampling designs. Specifically, the designs found in the Visual Sample Plan (VSP) software were evaluated. VSP was used to calculate the number of samples and the sample location for a variety of sampling plans applied to an actual release site. Most of the sampling designs validated are probability based, meaning samples are located randomly (or on a randomly placed grid) so no bias enters into the placement of samples, and the number of samples is calculated such that IF the amount and spatial extent of contamination exceeds levels of concern, at least one of the samples would be taken from a contaminated area, at least X% of the time. Hence, "validation" of the statistical sampling algorithms is defined herein to mean ensuring that the "X%" (confidence) is actually met.
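The kind of probability-based sample-size calculation described above (enough samples that, if contamination covers a given fraction of the area, at least one sample hits it with stated confidence) follows from a simple relation. The sketch below illustrates that general formula; it is not VSP's actual implementation, and the fractions and confidence level are illustrative.

```python
# Hedged sketch: if a fraction p of the area is contaminated and samples are
# placed at random, P(at least one sample hits) = 1 - (1 - p)^n. Solving for n
# gives the smallest number of samples that achieves confidence `conf`. This
# illustrates the generic calculation, not VSP's code.
import math

def n_samples(p, conf):
    """Smallest n with 1 - (1 - p)^n >= conf."""
    return math.ceil(math.log(1.0 - conf) / math.log(1.0 - p))

for p in (0.01, 0.05, 0.10):
    print(f"contaminated fraction {p:.0%}: n = {n_samples(p, 0.95)} samples for 95% confidence")
```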
Tenhagen, B A; Hille, A; Schmidt, A; Heuwieser, W
2005-02-01
It was the objective of this study to analyse shedding patterns and somatic cell counts in cows and quarters infected with Prototheca spp. and to evaluate two approaches to identify infected animals by somatic cell count (SCC) or by bacteriological analysis of pooled milk samples. Five lactating dairy cows, chronically infected with Prototheca spp. in at least one quarter were studied over 11 weeks to 13 months. Quarter milk samples and a pooled milk sample from 4 quarters were collected aseptically from all quarters of the cows on a weekly basis. Culture results of quarter milk and pooled samples were compared using cross tabulation. SCC of quarter milk samples and of pooled samples were related to the probability of detection in the infected quarters and cows, respectively. Shedding of Prototheca spp. was continuous in 2 of 8 quarters. In the other quarters negative samples were obtained sporadically or over a longer period (1 quarter). Overall, Prototheca spp. were isolated from 83.6% of quarter milk samples and 77.0% of pooled milk samples of infected quarters and cows. Somatic cell counts were higher in those samples from infected quarters that contained the algae than in negative samples (p < 0.0001). The same applied for composite samples from infected cows. Positive samples had higher SCC than negative samples. However, Prototheca spp. were also isolated from quarter milk and pooled samples with physiological SCC (i.e. < 10(5)/ml). Infected quarters that were dried off did not develop acute mastitis. However, drying off had no effect on the infection, i.e. samples collected at calving or 8 weeks after dry off still contained Prototheca spp. Results indicate that pre-selection of cows to be sampled for Prototheca spp. by SCC and the use of composite samples are probably inadequate in attempts to eradicate the disease. However, due to intermittent shedding of the algae in some cows, single herd sampling using quarter milk samples probably also fails to detect all infected cases. Therefore, continuous monitoring of problem cows with clinical mastitis or increased SCC in herds during eradication programs is recommended.
Geospatial techniques for developing a sampling frame of watersheds across a region
Gresswell, Robert E.; Bateman, Douglas S.; Lienkaemper, George; Guy, T.J.
2004-01-01
Current land-management decisions that affect the persistence of native salmonids are often influenced by studies of individual sites that are selected based on judgment and convenience. Although this approach is useful for some purposes, extrapolating results to areas that were not sampled is statistically inappropriate because the sampling design is usually biased. Therefore, in recent investigations of coastal cutthroat trout (Oncorhynchus clarki clarki) located above natural barriers to anadromous salmonids, we used a methodology for extending the statistical scope of inference. The purpose of this paper is to apply geospatial tools to identify a population of watersheds and develop a probability-based sampling design for coastal cutthroat trout in western Oregon, USA. The population of mid-size watersheds (500-5800 ha) west of the Cascade Range divide was derived from watershed delineations based on digital elevation models. Because a database with locations of isolated populations of coastal cutthroat trout did not exist, a sampling frame of isolated watersheds containing cutthroat trout had to be developed. After the sampling frame of watersheds was established, isolated watersheds with coastal cutthroat trout were stratified by ecoregion and erosion potential based on dominant bedrock lithology (i.e., sedimentary and igneous). A stratified random sample of 60 watersheds was selected with proportional allocation in each stratum. By comparing watershed drainage areas of streams in the general population to those in the sampling frame and the resulting sample (n = 60), we were able to evaluate how representative the subset of watersheds was in relation to the population of watersheds. Geospatial tools provided a relatively inexpensive means to generate the information necessary to develop a statistically robust, probability-based sampling design.
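Stratified random sampling with proportional allocation, as used above, splits the total sample across strata in proportion to stratum size. The sketch below assumes invented stratum counts (the ecoregion-by-lithology frame of the study is not reproduced here) and simply shows the allocation step.

```python
# Hedged sketch: proportional allocation of a stratified random sample across
# ecoregion x lithology strata, followed by simple random selection within each
# stratum. The stratum sizes are invented for illustration.
import random

random.seed(7)

frame = {                      # stratum -> number of candidate watersheds (assumed)
    ("Coast Range", "sedimentary"): 210,
    ("Coast Range", "igneous"): 90,
    ("Cascades", "sedimentary"): 60,
    ("Cascades", "igneous"): 140,
}
n_total = 60
N = sum(frame.values())

sample = {}
for stratum, N_h in frame.items():
    n_h = round(n_total * N_h / N)                    # proportional allocation: n_h = n * N_h / N
    sample[stratum] = random.sample(range(N_h), n_h)  # SRS of watershed indices within the stratum
    print(f"{stratum}: N_h={N_h:3d} -> n_h={n_h}")
```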
Turner, Cameron R.; Miller, Derryl J.; Coyne, Kathryn J.; Corush, Joel
2014-01-01
Indirect, non-invasive detection of rare aquatic macrofauna using aqueous environmental DNA (eDNA) is a relatively new approach to population and biodiversity monitoring. As such, the sensitivity of monitoring results to different methods of eDNA capture, extraction, and detection is being investigated in many ecosystems and species. One of the first and largest conservation programs with eDNA-based monitoring as a central instrument focuses on Asian bigheaded carp (Hypophthalmichthys spp.), an invasive fish spreading toward the Laurentian Great Lakes. However, the standard eDNA methods of this program have not advanced since their development in 2010. We developed new, quantitative, and more cost-effective methods and tested them against the standard protocols. In laboratory testing, our new quantitative PCR (qPCR) assay for bigheaded carp eDNA was one to two orders of magnitude more sensitive than the existing endpoint PCR assays. When applied to eDNA samples from an experimental pond containing bigheaded carp, the qPCR assay produced a detection probability of 94.8% compared to 4.2% for the endpoint PCR assays. Also, the eDNA capture and extraction method we adapted from aquatic microbiology yielded five times more bigheaded carp eDNA from the experimental pond than the standard method, at a per sample cost over forty times lower. Our new, more sensitive assay provides a quantitative tool for eDNA-based monitoring of bigheaded carp, and the higher-yielding eDNA capture and extraction method we describe can be used for eDNA-based monitoring of any aquatic species. PMID:25474207
Turner, Cameron R; Miller, Derryl J; Coyne, Kathryn J; Corush, Joel
2014-01-01
Indirect, non-invasive detection of rare aquatic macrofauna using aqueous environmental DNA (eDNA) is a relatively new approach to population and biodiversity monitoring. As such, the sensitivity of monitoring results to different methods of eDNA capture, extraction, and detection is being investigated in many ecosystems and species. One of the first and largest conservation programs with eDNA-based monitoring as a central instrument focuses on Asian bigheaded carp (Hypophthalmichthys spp.), an invasive fish spreading toward the Laurentian Great Lakes. However, the standard eDNA methods of this program have not advanced since their development in 2010. We developed new, quantitative, and more cost-effective methods and tested them against the standard protocols. In laboratory testing, our new quantitative PCR (qPCR) assay for bigheaded carp eDNA was one to two orders of magnitude more sensitive than the existing endpoint PCR assays. When applied to eDNA samples from an experimental pond containing bigheaded carp, the qPCR assay produced a detection probability of 94.8% compared to 4.2% for the endpoint PCR assays. Also, the eDNA capture and extraction method we adapted from aquatic microbiology yielded five times more bigheaded carp eDNA from the experimental pond than the standard method, at a per sample cost over forty times lower. Our new, more sensitive assay provides a quantitative tool for eDNA-based monitoring of bigheaded carp, and the higher-yielding eDNA capture and extraction method we describe can be used for eDNA-based monitoring of any aquatic species.
More than Just Convenient: The Scientific Merits of Homogeneous Convenience Samples
Jager, Justin; Putnick, Diane L.; Bornstein, Marc H.
2017-01-01
Despite their disadvantaged generalizability relative to probability samples, non-probability convenience samples are the standard within developmental science, and likely will remain so because probability samples are cost-prohibitive and most available probability samples are ill-suited to examine developmental questions. In lieu of focusing on how to eliminate or sharply reduce reliance on convenience samples within developmental science, here we propose how to augment their advantages when it comes to understanding population effects as well as subpopulation differences. Although all convenience samples have less clear generalizability than probability samples, we argue that homogeneous convenience samples have clearer generalizability relative to conventional convenience samples. Therefore, when researchers are limited to convenience samples, they should consider homogeneous convenience samples as a positive alternative to conventional (or heterogeneous) convenience samples. We discuss future directions as well as potential obstacles to expanding the use of homogeneous convenience samples in developmental science. PMID:28475254
A new estimator of the discovery probability.
Favaro, Stefano; Lijoi, Antonio; Prünster, Igor
2012-12-01
Species sampling problems have a long history in ecological and biological studies, and a number of issues, including the evaluation of species richness, the design of sampling experiments, and the estimation of rare species variety, are to be addressed. Such inferential problems have also recently emerged in genomic applications; however, they exhibit some peculiar features that make them more challenging: specifically, one has to deal with very large populations (genomic libraries) containing a huge number of distinct species (genes), and only a small portion of the library has been sampled (sequenced). These aspects motivate the Bayesian nonparametric approach we undertake, since it allows us to achieve the degree of flexibility typically needed in this framework. Based on an observed sample of size n, focus will be on prediction of a key aspect of the outcome from an additional sample of size m, namely, the so-called discovery probability. In particular, conditionally on an observed basic sample of size n, we derive a novel estimator of the probability of detecting, at the (n+m+1)th observation, species that have been observed with any given frequency in the enlarged sample of size n+m. Such an estimator admits a closed-form expression that can be exactly evaluated. The result we obtain allows us to quantify both the rate at which rare species are detected and the achieved sample coverage of abundant species, as m increases. Natural applications are represented by the estimation of the probability of discovering rare genes within genomic libraries, and the results are illustrated by means of two expressed sequence tags datasets. © 2012, The International Biometric Society.
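For orientation, the quantity at stake here, the discovery probability, has a classical frequentist baseline. The sketch below computes the Good-Turing estimate of the chance that the next observation is a previously unseen species; it is only a reference point, not the Bayesian nonparametric estimator derived in the paper, and the toy gene list is invented.

```python
# Hedged sketch: the classical Good-Turing estimate of the discovery probability,
# i.e. the chance that the (n+1)-th observation is a previously unseen species,
# shown as a simple baseline for the quantity discussed above.
from collections import Counter

def good_turing_discovery_probability(sample):
    """P(next observation is a new species) ~ (# species seen exactly once) / n."""
    n = len(sample)
    freqs = Counter(sample)                       # species -> observed frequency
    singletons = sum(1 for f in freqs.values() if f == 1)
    return singletons / n

ests = ["gene_a", "gene_b", "gene_a", "gene_c", "gene_d", "gene_a", "gene_e"]
print(f"estimated discovery probability: {good_turing_discovery_probability(ests):.2f}")
```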
Zeller, K.A.; Nijhawan, S.; Salom-Perez, R.; Potosme, S.H.; Hines, J.E.
2011-01-01
Corridors are critical elements in the long-term conservation of wide-ranging species like the jaguar (Panthera onca). Jaguar corridors across the range of the species were initially identified using a GIS-based least-cost corridor model. However, due to inherent errors in remotely sensed data and model uncertainties, these corridors warrant field verification before conservation efforts can begin. We developed a novel corridor assessment protocol based on interview data and site occupancy modeling. We divided our pilot study area, in southeastern Nicaragua, into 71 sampling units of 6 × 6 km and conducted 160 structured interviews with local residents. Interviews were designed to collect data on jaguar and seven prey species so that detection/non-detection matrices could be constructed for each sampling unit. Jaguars were reportedly detected in 57% of the sampling units and had a detection probability of 28%. With the exception of white-lipped peccary, prey species were reportedly detected in 82-100% of the sampling units. Though the use of interview data may violate some assumptions of the occupancy modeling approach for determining 'proportion of area occupied', we countered these shortcomings through study design and interpreting the occupancy parameter, psi, as 'probability of habitat used'. Probability of habitat use was modeled for each target species using single state or multistate models. A combination of the estimated probabilities of habitat use for jaguar and prey was selected to identify the final jaguar corridor. This protocol provides an efficient field methodology for identifying corridors for easily-identifiable species, across large study areas comprised of unprotected, private lands. © 2010 Elsevier Ltd.
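The single-season occupancy model underlying estimates like these has a compact likelihood: a site that was never detected is either occupied-but-missed or truly unoccupied. The sketch below fits constant psi and p by maximum likelihood to simulated detection histories; it assumes invented true values and ignores covariates and the multistate extensions mentioned above.

```python
# Hedged sketch: maximum-likelihood estimation of a single-season occupancy model
# with constant occupancy (psi) and detection probability (p). Detection
# histories are simulated, standing in for the interview-based matrices above.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(3)
K, n_sites, psi_true, p_true = 4, 71, 0.6, 0.3
occupied = rng.random(n_sites) < psi_true
detections = rng.binomial(K, p_true * occupied)          # number of detections per site

def neg_log_lik(params):
    psi, p = expit(params)                               # keep parameters in (0, 1)
    lik_detected = psi * p ** detections * (1 - p) ** (K - detections)
    lik_missed = psi * (1 - p) ** K + (1 - psi)          # never detected: occupied-and-missed or absent
    lik = np.where(detections > 0, lik_detected, lik_missed)
    return -np.sum(np.log(lik))

fit = minimize(neg_log_lik, x0=[0.0, 0.0], method="Nelder-Mead")
psi_hat, p_hat = expit(fit.x)
print(f"psi_hat={psi_hat:.2f}  p_hat={p_hat:.2f}")
```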
Zhao, Ruiying; Chen, Songchao; Zhou, Yue; Jin, Bin; Li, Yan
2018-01-01
Assessing heavy metal pollution and delineating pollution are the bases for evaluating pollution and determining a cost-effective remediation plan. Most existing studies are based on the spatial distribution of pollutants but ignore related uncertainty. In this study, eight heavy-metal concentrations (Cr, Pb, Cd, Hg, Zn, Cu, Ni, and Zn) were collected at 1040 sampling sites in a coastal industrial city in the Yangtze River Delta, China. The single pollution index (PI) and Nemerow integrated pollution index (NIPI) were calculated for every surface sample (0–20 cm) to assess the degree of heavy metal pollution. Ordinary kriging (OK) was used to map the spatial distribution of heavy metals content and NIPI. Then, we delineated composite heavy metal contamination based on the uncertainty produced by indicator kriging (IK). The results showed that mean values of all PIs and NIPIs were at safe levels. Heavy metals were most accumulated in the central portion of the study area. Based on IK, the spatial probability of composite heavy metal pollution was computed. The probability of composite contamination in the central core urban area was highest. A probability of 0.6 was found as the optimum probability threshold to delineate polluted areas from unpolluted areas for integrative heavy metal contamination. Results of pollution delineation based on uncertainty showed the proportion of false negative error areas was 6.34%, while the proportion of false positive error areas was 0.86%. The accuracy of the classification was 92.80%. This indicated the method we developed is a valuable tool for delineating heavy metal pollution. PMID:29642623
Hu, Bifeng; Zhao, Ruiying; Chen, Songchao; Zhou, Yue; Jin, Bin; Li, Yan; Shi, Zhou
2018-04-10
Assessing heavy metal pollution and delineating pollution are the bases for evaluating pollution and determining a cost-effective remediation plan. Most existing studies are based on the spatial distribution of pollutants but ignore related uncertainty. In this study, eight heavy-metal concentrations (Cr, Pb, Cd, Hg, Zn, Cu, Ni, and Zn) were collected at 1040 sampling sites in a coastal industrial city in the Yangtze River Delta, China. The single pollution index (PI) and Nemerow integrated pollution index (NIPI) were calculated for every surface sample (0-20 cm) to assess the degree of heavy metal pollution. Ordinary kriging (OK) was used to map the spatial distribution of heavy metals content and NIPI. Then, we delineated composite heavy metal contamination based on the uncertainty produced by indicator kriging (IK). The results showed that mean values of all PIs and NIPIs were at safe levels. Heavy metals were most accumulated in the central portion of the study area. Based on IK, the spatial probability of composite heavy metal pollution was computed. The probability of composite contamination in the central core urban area was highest. A probability of 0.6 was found as the optimum probability threshold to delineate polluted areas from unpolluted areas for integrative heavy metal contamination. Results of pollution delineation based on uncertainty showed the proportion of false negative error areas was 6.34%, while the proportion of false positive error areas was 0.86%. The accuracy of the classification was 92.80%. This indicated the method we developed is a valuable tool for delineating heavy metal pollution.
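The delineation step described in the two records above amounts to thresholding an exceedance-probability surface at 0.6 and scoring the resulting classification. The sketch below uses a simulated probability surface and simulated true pollution status in place of the indicator-kriging output and validation data, and simply computes the false negative area, false positive area, and accuracy.

```python
# Hedged sketch: delineate "polluted" cells by thresholding an exceedance-
# probability surface at 0.6 and score the classification. The probability
# surface and the true status are simulated stand-ins for the kriging results.
import numpy as np

rng = np.random.default_rng(5)
n_cells = 10_000
true_polluted = rng.random(n_cells) < 0.15
# Noisy probability surface: higher where truly polluted.
prob_exceed = np.clip(true_polluted * 0.7 + rng.normal(0.2, 0.15, n_cells), 0, 1)

threshold = 0.6
flagged = prob_exceed >= threshold

false_neg = np.mean(true_polluted & ~flagged)      # polluted but not delineated
false_pos = np.mean(~true_polluted & flagged)      # delineated but actually clean
accuracy = np.mean(flagged == true_polluted)
print(f"false negative area: {false_neg:.2%}  false positive area: {false_pos:.2%}  accuracy: {accuracy:.2%}")
```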
Climate sensitivity estimated from temperature reconstructions of the Last Glacial Maximum
NASA Astrophysics Data System (ADS)
Schmittner, A.; Urban, N.; Shakun, J. D.; Mahowald, N. M.; Clark, P. U.; Bartlein, P. J.; Mix, A. C.; Rosell-Melé, A.
2011-12-01
In 1959 IJ Good published the discussion "Kinds of Probability" in Science. Good identified (at least) five kinds. The need for (at least) a sixth kind of probability when quantifying uncertainty in the context of climate science is discussed. This discussion brings out the differences between weather-like forecasting tasks and climate-like tasks, with a focus on the effective use both of science and of modelling in support of decision making. Good also introduced the idea of a "Dynamic probability", a probability one expects to change without any additional empirical evidence; the probabilities assigned by a chess-playing program when it is only halfway through its analysis being an example. This case is contrasted with the case of "Mature probabilities", where a forecast algorithm (or model) has converged on its asymptotic probabilities and the question hinges on whether or not those probabilities are expected to change significantly before the event in question occurs, even in the absence of new empirical evidence. If so, then how might one rationally report and deploy such immature probabilities in scientific support of decision-making? Mature probability is suggested as a useful sixth kind; although Good would doubtless argue that we can get by with just one, effective communication with decision makers may be enhanced by speaking as if the others existed. This again highlights the distinction between weather-like contexts and climate-like contexts. In the former context one has access to a relevant climatology (a relevant, arguably informative distribution prior to any model simulations); in the latter context that information is not available, although one can fall back on the scientific basis upon which the model itself rests and estimate the probability that the model output is in fact misinformative. This subjective "probability of a big surprise" is one way to communicate the probability of model-based information holding in practice, that is, the probability that the information the model-based probability is conditioned on holds. It is argued that no model-based climate-like probability forecast is complete without a quantitative estimate of its own irrelevance, and that the clear identification of model-based probability forecasts as mature or immature is a critical element for maintaining the credibility of science-based decision support, and can shape uncertainty quantification more widely.
NASA Astrophysics Data System (ADS)
Jones, M. I.; Brahm, R.; Wittenmyer, R. A.; Drass, H.; Jenkins, J. S.; Melo, C. H. F.; Vos, J.; Rojo, P.
2017-06-01
We report the discovery of a substellar companion around the giant star HIP 67537. Based on precision radial velocity measurements from CHIRON and FEROS high-resolution spectroscopic data, we derived the following orbital elements for HIP 67537 b: mb sin i = 11.1 (+0.4/-1.1) MJup, a = 4.9 (+0.14/-0.13) AU, and e = 0.59 (+0.05/-0.02). Considering random inclination angles, this object has ≳65% probability to be above the theoretical deuterium-burning limit, thus it is one of the few known objects in the planet to brown-dwarf (BD) transition region. In addition, we analyzed the Hipparcos astrometric data of this star, from which we derived a minimum inclination angle for the companion of 2 deg. This value corresponds to an upper mass limit of 0.3 M⊙, therefore the probability that HIP 67537 b is stellar in nature is ≲7%. The large mass of the host star and the high orbital eccentricity make HIP 67537 b a very interesting and rare substellar object. This is the second candidate companion in the brown dwarf desert detected in the sample of intermediate-mass stars targeted by the EXoPlanets aRound Evolved StarS (EXPRESS) radial velocity program, which corresponds to a detection fraction of f = +2.0-0.5 %. This value is larger than the fraction observed in solar-type stars, providing new observational evidence of an enhanced formation efficiency of massive substellar companions in massive disks. Finally, we speculate about different formation channels for this object. Based on observations collected at La Silla - Paranal Observatory under programs ID's 085.C-0557, 087.C.0476, 089.C-0524, 090.C-0345 and through the Chilean Telescope Time under programs ID's CN-12A-073, CN-12B-047, CN-13A-111, CN-2013B-51, CN-2014A-52, CN-15A-48, CN-15B-25 and CN-16A-13.
Making a Difference in Science Education: The Impact of Undergraduate Research Programs
Eagan, M. Kevin; Hurtado, Sylvia; Chang, Mitchell J.; Garcia, Gina A.; Herrera, Felisha A.; Garibay, Juan C.
2014-01-01
To increase the numbers of underrepresented racial minority students in science, technology, engineering, and mathematics (STEM), federal and private agencies have allocated significant funding to undergraduate research programs, which have been shown to strengthen students' intentions of enrolling in graduate or professional school. Analyzing a longitudinal sample of 4,152 aspiring STEM majors who completed the 2004 Freshman Survey and 2008 College Senior Survey, this study utilizes multinomial hierarchical generalized linear modeling (HGLM) and propensity score matching techniques to examine how participation in undergraduate research affects STEM students' intentions to enroll in STEM and non-STEM graduate and professional programs. Findings indicate that participation in an undergraduate research program significantly improved students' probability of indicating plans to enroll in a STEM graduate program. PMID:25190821
Applying the Hájek Approach in Formula-Based Variance Estimation. Research Report. ETS RR-17-24
ERIC Educational Resources Information Center
Qian, Jiahe
2017-01-01
The variance formula derived for a two-stage sampling design without replacement employs the joint inclusion probabilities in the first-stage selection of clusters. One of the difficulties encountered in data analysis is the lack of information about such joint inclusion probabilities. One way to solve this issue is by applying Hájek's…
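When the joint inclusion probabilities named above are not available, one commonly cited Hájek-type approximation rebuilds them from the first-order inclusion probabilities alone. The sketch below illustrates that approximation for a handful of invented inclusion probabilities; it is a generic illustration, not the report's own derivation, and in practice the normalizing sum runs over all population units.

```python
# Hedged sketch: a Hajek-type approximation to joint inclusion probabilities when
# only the first-order probabilities pi_i are known:
#   pi_ij ~= pi_i * pi_j * [1 - (1 - pi_i)(1 - pi_j) / d],  d = sum_k pi_k (1 - pi_k).
# The pi values are invented, and the sum defining d should be taken over the
# whole population of clusters.
import numpy as np

pi = np.array([0.12, 0.30, 0.25, 0.40, 0.18])         # first-stage inclusion probabilities (assumed)
d = np.sum(pi * (1.0 - pi))

pi_joint = np.outer(pi, pi) * (1.0 - np.outer(1.0 - pi, 1.0 - pi) / d)
np.fill_diagonal(pi_joint, pi)                        # pi_ii is just pi_i
print(np.round(pi_joint, 4))
```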
Wong, Frances Kam Yuet; So, Ching; Ng, Alina Yee Man; Lam, Po-Tin; Ng, Jeffrey Sheung Ching; Ng, Nancy Hiu Yim; Chau, June; Sham, Michael Mau Kwong
2018-02-01
Studies have shown positive clinical outcomes of specialist palliative care for end-stage heart failure patients, but cost-effectiveness evaluation is lacking. To examine the cost-effectiveness of a transitional home-based palliative care program for patients with end-stage heart failure patients as compared to the customary palliative care service. A cost-effectiveness analysis was conducted alongside a randomized controlled trial (Trial number: NCT02086305). The costs included pre-program training, intervention, and hospital use. Quality of life was measured using SF-6D. The study took place in three hospitals in Hong Kong. The inclusion criteria were meeting clinical indicators for end-stage heart failure patients including clinician-judged last year of life, discharged to home within the service area, and palliative care referral accepted. A total of 84 subjects (study = 43, control = 41) were recruited. When the study group was compared to the control group, the net incremental quality-adjusted life years gain was 0.0012 (28 days)/0.0077 (84 days) and the net incremental costs per case was -HK$7935 (28 days)/-HK$26,084 (84 days). The probability of being cost-effective was 85% (28 days)/100% (84 days) based on the cost-effectiveness thresholds recommended both by National Institute for Health and Clinical Excellence (£20,000/quality-adjusted life years) and World Health Organization (Hong Kong gross domestic product/capita in 2015, HK$328117). Results suggest that a transitional home-based palliative care program is more cost-effective than customary palliative care service. Limitations of the study include small sample size, study confined to one city, clinic consultation costs, and societal costs including patient costs and unpaid care-giving costs were not included.
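The "probability of being cost-effective" reported above is conventionally obtained by comparing replicate estimates of incremental cost and incremental QALYs against a willingness-to-pay threshold via the net monetary benefit. The sketch below assumes made-up replicate distributions loosely centred on the 28-day figures quoted in the abstract; it illustrates the calculation, not the trial's analysis.

```python
# Hedged sketch: probability of cost-effectiveness from bootstrap-style replicates
# of incremental cost and incremental QALYs, using the net monetary benefit
# NMB = lambda * dQALY - dCost at a willingness-to-pay threshold. The replicate
# distributions are illustrative assumptions, not the trial data; the threshold
# is the Hong Kong GDP/capita value quoted above.
import numpy as np

rng = np.random.default_rng(8)
wtp = 328_117                                            # HK$ per QALY

d_qaly = rng.normal(loc=0.0012, scale=0.0015, size=10_000)   # assumed replicate spread
d_cost = rng.normal(loc=-7_935, scale=6_000, size=10_000)    # assumed replicate spread

nmb = wtp * d_qaly - d_cost
print(f"P(cost-effective at HK${wtp:,}/QALY) ~ {np.mean(nmb > 0):.2f}")
```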
Comparing the Effects of Group and Home-based Physical Activity on Mental Health in the Elderly.
Mortazavi, Seyede Salehe; Shati, Mohsen; Ardebili, Hassan Eftekhar; Mohammad, Kazem; Beni, Reza Dorali; Keshteli, A H
2013-11-01
The present study focuses on comparing the effects of home-based (HB) and group-based (GB) physical activity on mental health in a sample of older adults in Shahr-e-kord. In this quasi-experimental study, a twice-weekly physical activity program for 2 months was provided either individually at home or in a group format for 181 people who were divided into two groups (HB and GB). The outcome, mental health, was measured with the 28-item General Health Questionnaire (GHQ-28). Mental health status improved after participation in the physical activity program. The decrease in GHQ-28 total score in GB group, 3 months after intervention, was 3.61 ± 2.28 (P < 0.001). In HB group, this reduction was 1.20 ± 2.32 during the same period (P < 0.001). The difference of these "before-after differences" between the two groups in the GHQ-28 and all its subscales was statistically significant (P < 0.001). Also, the effects of GB physical activity on mental health compared with HB physical activity, adjusted for related baseline variables, were significant. These findings reveal the probable effects of GB rather than HB physical activity on mental health among the elderly.
Sethi, Suresh; Linden, Daniel; Wenburg, John; Lewis, Cara; Lemons, Patrick R.; Fuller, Angela K.; Hare, Matthew P.
2016-01-01
Error-tolerant likelihood-based match calling presents a promising technique to accurately identify recapture events in genetic mark–recapture studies by combining probabilities of latent genotypes and probabilities of observed genotypes, which may contain genotyping errors. Combined with clustering algorithms to group samples into sets of recaptures based upon pairwise match calls, these tools can be used to reconstruct accurate capture histories for mark–recapture modelling. Here, we assess the performance of a recently introduced error-tolerant likelihood-based match-calling model and sample clustering algorithm for genetic mark–recapture studies. We assessed both biallelic (i.e. single nucleotide polymorphisms; SNP) and multiallelic (i.e. microsatellite; MSAT) markers using a combination of simulation analyses and case study data on Pacific walrus (Odobenus rosmarus divergens) and fishers (Pekania pennanti). A novel two-stage clustering approach is demonstrated for genetic mark–recapture applications. First, repeat captures within a sampling occasion are identified. Subsequently, recaptures across sampling occasions are identified. The likelihood-based matching protocol performed well in simulation trials, demonstrating utility for use in a wide range of genetic mark–recapture studies. Moderately sized SNP (64+) and MSAT (10–15) panels produced accurate match calls for recaptures and accurate non-match calls for samples from closely related individuals in the face of low to moderate genotyping error. Furthermore, matching performance remained stable or increased as the number of genetic markers increased, genotyping error notwithstanding.
Mining Rare Events Data for Assessing Customer Attrition Risk
NASA Astrophysics Data System (ADS)
Au, Tom; Chin, Meei-Ling Ivy; Ma, Guangqin
Customer attrition refers to the phenomenon whereby a customer leaves a service provider. As competition intensifies, preventing customers from leaving is a major challenge to many businesses such as telecom service providers. Research has shown that retaining existing customers is more profitable than acquiring new customers, due primarily to savings on acquisition costs, the higher volume of service consumption, and customer referrals. For a large enterprise whose customer base consists of tens of millions of service subscribers, events such as switching to competitors or canceling services are often large in absolute number but rare in percentage terms, far less than 5%. Based on a simple random sample, popular statistical procedures, such as logistic regression, tree-based methods, and neural networks, can sharply underestimate the probability of rare events and often result in a null model (no significant predictors). To improve efficiency and accuracy of event probability estimation, a case-based data collection technique is then considered. A case-based sample is formed by taking all available events and a small, but representative, fraction of nonevents from a dataset of interest. In this article we present a consistent prior correction method for event probability estimation and demonstrate the performance of the above data collection techniques in predicting customer attrition with actual telecommunications data.
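Case-based sampling deliberately over-represents events, so the fitted logistic intercept must be shifted back to the population scale. The sketch below shows a prior correction of that kind (in the spirit of King and Zeng's correction for rare-events logistic regression); the intercept, population rate, and sample rate are illustrative assumptions, not values from the article.

```python
# Hedged sketch: prior correction of the logistic-regression intercept after
# case-based sampling (all events plus a fraction of non-events). tau is the true
# population event rate and y_bar the event rate in the case-based sample; all
# numbers are illustrative.
import math

def corrected_intercept(beta0_hat, tau, y_bar):
    """Shift the intercept fitted on the case-based sample back to the population scale."""
    return beta0_hat - math.log(((1.0 - tau) / tau) * (y_bar / (1.0 - y_bar)))

beta0_hat = -0.20     # intercept estimated on the case-based sample (assumed)
tau = 0.02            # true attrition rate in the customer base (assumed)
y_bar = 0.50          # event rate in the case-based training sample (assumed)
print(f"population-scale intercept: {corrected_intercept(beta0_hat, tau, y_bar):.3f}")
```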
Castro-Ríos, Angélica; Reyes-Morales, Hortensia; Pérez-Cuevas, Ricardo
2008-01-01
To evaluate the impact of a continuing medical education program for family doctors on improving the prescription of hypoglycemic drugs. An observational study was conducted with two comparison groups (with and without the program) and before-after periods. The unit of analysis was the visit. The period of evaluation comprised six months before and six months after implementing the program. The outcome variable was the appropriateness of prescription, based on two criteria: appropriate selection and proper indication of the drug. Logistic regression models and the double-differences technique were used to analyze the information. Models were adjusted for independent variables related to the patient, the visit, and the PCC; the most relevant were sex, obesity, conditions other than diabetes, number of visits in the analyzed period, number of drugs prescribed, size of the PCC, and period. The program increased the probability of appropriate prescription by 0.6% and the probability of an appropriate choice of hypoglycemic drug in obese patients by 11%.
United States Air Force High School Apprenticeship Program: 1989 Program Management Report. Volume 2
1989-12-01
error determination of a root, and the Gaussian probability function. I found this flowcharting exposure to be an asset while writing more... writing simple programs based upon flowcharts. This skill was further enhanced when my mentor taught me how to take a flowchart (or program) written in... software that teaches Ada to beginners. Though the first part of Ada-Tutr was review, the package proved to be very helpful in assisting me to write more
DATA MANAGEMENT SYSTEM FOR MOBILE SATELLITE PROPAGATION DATA
NASA Technical Reports Server (NTRS)
Kantak, A. V.
1994-01-01
The "Data Management System for Mobile Satellite Propogation" package is a collection of FORTRAN programs and UNIX shell scripts designed to handle the huge amounts of data resulting from Mobile Satellite propogation experiments. These experiments are designed to assist in defining channels for mobile satellite systems. By understanding multipath fading characteristics of the channel, doppler effects, and blockage due to manmade objects as well as natural surroundings, characterization of the channel can be realized. Propogation experiments, then, are performed using a prototype of the system simulating the ultimate product environment. After the data from these experiments is generated, the researcher must access this data with a minimum of effort and to derive some standard results. The programs included in this package manipulate the data files generated by the NASA/JPL Mobile Satellite propogation experiment on an interactive basis. In the experiment, a transmitter operating at 869 MHz was carried to an altitude of 32Km by a stratospheric balloon. A vehicle within the line-of-sight of the transmitter was then driven around, splitting the incoming signal into I and Q channels, and sampling the resulting signal strength at 1000 samples per second. The data was collected at various antenna elavation angles and different times of day generating the ancillary data for the experiment. This package contains a program to convert the binary format of the data generated into standard ASCII format suitable for use with a wide variety of machine architectures. Also included is a UNIX shell-script designed to parse this ASCII file into those records of data that match the researcher's desired values for the ancillary data parameters. In addition, four FORTRAN programs are included to obtain standard quantities from the data. Quantities such as probability of signal level greater than or equal to a specified signal level, probability density of the signal levels, frequency of fade duration, and Fourier Transforms of the sampled data can be generated from the propogation experiment data. All programs in this package are written in either FORTRAN 77 or UNIX shell-scripts. The package does not include test data. The programs were developed in 1987 for use with a UNIX operating system on a DEC MicroVAX computer.
Cao, Youfang; Liang, Jie
2013-01-01
Critical events that occur rarely in biological processes are of great importance, but are challenging to study using Monte Carlo simulation. By introducing biases to reaction selection and reaction rates, weighted stochastic simulation algorithms based on importance sampling allow rare events to be sampled more effectively. However, existing methods do not address the important issue of barrier crossing, which often arises from multistable networks and systems with complex probability landscape. In addition, the proliferation of parameters and the associated computing cost pose significant problems. Here we introduce a general theoretical framework for obtaining optimized biases in sampling individual reactions for estimating probabilities of rare events. We further describe a practical algorithm called adaptively biased sequential importance sampling (ABSIS) method for efficient probability estimation. By adopting a look-ahead strategy and by enumerating short paths from the current state, we estimate the reaction-specific and state-specific forward and backward moving probabilities of the system, which are then used to bias reaction selections. The ABSIS algorithm can automatically detect barrier-crossing regions, and can adjust bias adaptively at different steps of the sampling process, with bias determined by the outcome of exhaustively generated short paths. In addition, there are only two bias parameters to be determined, regardless of the number of the reactions and the complexity of the network. We have applied the ABSIS method to four biochemical networks: the birth-death process, the reversible isomerization, the bistable Schlögl model, and the enzymatic futile cycle model. For comparison, we have also applied the finite buffer discrete chemical master equation (dCME) method recently developed to obtain exact numerical solutions of the underlying discrete chemical master equations of these problems. This allows us to assess sampling results objectively by comparing simulation results with true answers. Overall, ABSIS can accurately and efficiently estimate rare event probabilities for all examples, often with smaller variance than other importance sampling algorithms. The ABSIS method is general and can be applied to study rare events of other stochastic networks with complex probability landscape. PMID:23862966
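The general idea that ABSIS builds on, sampling from a biased distribution that visits the rare region more often and reweighting each draw by a likelihood ratio, can be illustrated on a toy problem. The sketch below estimates a Gaussian tail probability by importance sampling; it stands in for a rare biochemical event and is not the ABSIS algorithm itself, whose biases are reaction- and state-specific.

```python
# Hedged sketch of plain importance sampling for a rare-event probability:
# estimate P(X > 6) for a standard normal by proposing from N(6, 1) and
# reweighting by the likelihood ratio p(x)/q(x). Not the ABSIS algorithm.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
threshold, n = 6.0, 100_000

# Plain Monte Carlo almost never sees the event.
plain = np.mean(rng.standard_normal(n) > threshold)

# Importance sampling: propose from N(threshold, 1), reweight by p(x)/q(x).
x = rng.normal(loc=threshold, scale=1.0, size=n)
weights = stats.norm.pdf(x) / stats.norm.pdf(x, loc=threshold)
is_estimate = np.mean(weights * (x > threshold))

print(f"plain MC: {plain:.2e}   importance sampling: {is_estimate:.2e}   exact: {stats.norm.sf(threshold):.2e}")
```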
NASA Astrophysics Data System (ADS)
Cao, Youfang; Liang, Jie
2013-07-01
Critical events that occur rarely in biological processes are of great importance, but are challenging to study using Monte Carlo simulation. By introducing biases to reaction selection and reaction rates, weighted stochastic simulation algorithms based on importance sampling allow rare events to be sampled more effectively. However, existing methods do not address the important issue of barrier crossing, which often arises from multistable networks and systems with complex probability landscape. In addition, the proliferation of parameters and the associated computing cost pose significant problems. Here we introduce a general theoretical framework for obtaining optimized biases in sampling individual reactions for estimating probabilities of rare events. We further describe a practical algorithm called adaptively biased sequential importance sampling (ABSIS) method for efficient probability estimation. By adopting a look-ahead strategy and by enumerating short paths from the current state, we estimate the reaction-specific and state-specific forward and backward moving probabilities of the system, which are then used to bias reaction selections. The ABSIS algorithm can automatically detect barrier-crossing regions, and can adjust bias adaptively at different steps of the sampling process, with bias determined by the outcome of exhaustively generated short paths. In addition, there are only two bias parameters to be determined, regardless of the number of the reactions and the complexity of the network. We have applied the ABSIS method to four biochemical networks: the birth-death process, the reversible isomerization, the bistable Schlögl model, and the enzymatic futile cycle model. For comparison, we have also applied the finite buffer discrete chemical master equation (dCME) method recently developed to obtain exact numerical solutions of the underlying discrete chemical master equations of these problems. This allows us to assess sampling results objectively by comparing simulation results with true answers. Overall, ABSIS can accurately and efficiently estimate rare event probabilities for all examples, often with smaller variance than other importance sampling algorithms. The ABSIS method is general and can be applied to study rare events of other stochastic networks with complex probability landscape.
Cao, Youfang; Liang, Jie
2013-07-14
Critical events that occur rarely in biological processes are of great importance, but are challenging to study using Monte Carlo simulation. By introducing biases to reaction selection and reaction rates, weighted stochastic simulation algorithms based on importance sampling allow rare events to be sampled more effectively. However, existing methods do not address the important issue of barrier crossing, which often arises from multistable networks and systems with complex probability landscape. In addition, the proliferation of parameters and the associated computing cost pose significant problems. Here we introduce a general theoretical framework for obtaining optimized biases in sampling individual reactions for estimating probabilities of rare events. We further describe a practical algorithm called adaptively biased sequential importance sampling (ABSIS) method for efficient probability estimation. By adopting a look-ahead strategy and by enumerating short paths from the current state, we estimate the reaction-specific and state-specific forward and backward moving probabilities of the system, which are then used to bias reaction selections. The ABSIS algorithm can automatically detect barrier-crossing regions, and can adjust bias adaptively at different steps of the sampling process, with bias determined by the outcome of exhaustively generated short paths. In addition, there are only two bias parameters to be determined, regardless of the number of the reactions and the complexity of the network. We have applied the ABSIS method to four biochemical networks: the birth-death process, the reversible isomerization, the bistable Schlögl model, and the enzymatic futile cycle model. For comparison, we have also applied the finite buffer discrete chemical master equation (dCME) method recently developed to obtain exact numerical solutions of the underlying discrete chemical master equations of these problems. This allows us to assess sampling results objectively by comparing simulation results with true answers. Overall, ABSIS can accurately and efficiently estimate rare event probabilities for all examples, often with smaller variance than other importance sampling algorithms. The ABSIS method is general and can be applied to study rare events of other stochastic networks with complex probability landscape.
Parke, Tom; Marchenko, Olga; Anisimov, Vladimir; Ivanova, Anastasia; Jennison, Christopher; Perevozskaya, Inna; Song, Guochen
2017-01-01
Designing an oncology clinical program is more challenging than designing a single study. The standard approaches have proven not very successful during the last decade; the failure rate of Phase 2 and Phase 3 trials in oncology remains high. Improving a development strategy by applying innovative statistical methods is one of the major objectives of a drug development process. The oncology sub-team on Adaptive Program under the Drug Information Association Adaptive Design Scientific Working Group (DIA ADSWG) evaluated hypothetical oncology programs with two competing treatments and published the work in the Therapeutic Innovation and Regulatory Science journal in January 2014. Five oncology development programs, based on different Phase 2 designs (including adaptive designs) and a standard two-parallel-arm Phase 3 design, were simulated and compared in terms of the probability of clinical program success and expected net present value (eNPV). In this article, we consider eight Phase 2/Phase 3 development programs based on selected combinations of five Phase 2 study designs and three Phase 3 study designs. We again used the probability of program success and eNPV to compare simulated programs. Among the development strategies considered, the eNPV showed robust improvement with each successive strategy, the highest being for a three-arm response-adaptive randomization design in Phase 2 combined with a group sequential design with five analyses in Phase 3.
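The probability of program success used above is the chance that a Phase 2 "go" decision is followed by a successful Phase 3 trial, which is naturally estimated by simulation. The sketch below assumes a much simpler program than those studied (one illustrative effect size, fixed sample sizes, a single go/no-go bar, a one-sided z-test); it shows the structure of such a Monte Carlo, not the DIA ADSWG designs.

```python
# Hedged sketch: Monte Carlo estimate of the probability of program success for a
# simple Phase 2 -> Phase 3 strategy. All effect sizes, sample sizes, and decision
# rules are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_sim, true_effect = 100_000, 0.25          # standardized treatment effect (assumed)
n2, n3 = 60, 300                            # per-arm sample sizes for Phase 2 / Phase 3 (assumed)

# Phase 2: estimate the effect, "go" to Phase 3 if the estimate exceeds 0.15.
est2 = rng.normal(true_effect, np.sqrt(2.0 / n2), size=n_sim)
go = est2 > 0.15

# Phase 3: one-sided z-test at alpha = 0.025, counted only after a "go".
est3 = rng.normal(true_effect, np.sqrt(2.0 / n3), size=n_sim)
z = est3 / np.sqrt(2.0 / n3)
success = go & (z > stats.norm.ppf(0.975))

print(f"P(Phase 2 go)      ~ {go.mean():.2f}")
print(f"P(program success) ~ {success.mean():.2f}")
```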
USING GIS TO GENERATE SPATIALLY-BALANCED RANDOM SURVEY DESIGNS FOR NATURAL RESOURCE APPLICATIONS
Sampling of a population is frequently required to understand trends and patterns in natural resource management because financial and time constraints preclude a complete census. A rigorous probability-based survey design specifies where to sample so that inferences from the sam...
Quantum probability ranking principle for ligand-based virtual screening.
Al-Dabbagh, Mohammed Mumtaz; Salim, Naomie; Himmat, Mubarak; Ahmed, Ali; Saeed, Faisal
2017-04-01
Chemical libraries contain thousands of compounds that need screening, which increases the need for computational methods that can rank or prioritize compounds. Virtual screening tools are widely used to improve the cost effectiveness of lead discovery programs by ranking compound databases in decreasing probability of biological activity, following the probability ranking principle (PRP). In this paper, we developed a novel ranking approach for molecular compounds inspired by quantum mechanics, called the quantum probability ranking principle (QPRP). The QPRP criterion draws an analogy between a physical quantum experiment and the process of ranking molecular structures represented by 2D fingerprints in ligand-based virtual screening (LBVS). QPRP applies quantum concepts at three levels: first, at the representation level, it establishes a new framework of molecular representation by connecting molecular compounds with a mathematical quantum space; second, it estimates the similarity between library compounds and reference structures using a quantum-based similarity searching method; and finally, it ranks the molecules using the QPRP approach. Simulated virtual screening experiments with MDL Drug Data Report (MDDR) data sets showed that QPRP outperformed the classical probability ranking principle (PRP) for ranking chemical compounds.
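For orientation, the classical PRP baseline amounts to ranking database compounds by a score treated as a monotone surrogate for the probability of activity; a minimal sketch using Tanimoto similarity on binary 2D fingerprints follows (the random fingerprints are an illustrative assumption, and this does not implement the quantum QPRP criterion).

```python
import numpy as np

rng = np.random.default_rng(2)

def tanimoto(a, b):
    """Tanimoto coefficient between two binary fingerprint vectors."""
    both = np.logical_and(a, b).sum()
    either = np.logical_or(a, b).sum()
    return both / either if either else 0.0

# Hypothetical binary 2D fingerprints: one reference and a small database
n_bits, n_db = 1024, 1000
reference = rng.random(n_bits) < 0.1
database = rng.random((n_db, n_bits)) < 0.1

# Classical probability ranking principle: rank in decreasing order of the
# similarity score, used as a monotone surrogate for probability of activity.
scores = np.array([tanimoto(reference, fp) for fp in database])
ranking = np.argsort(-scores)
print("top-5 database indices:", ranking[:5], "scores:", np.round(scores[ranking[:5]], 3))
```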
A new measure of child vocal reciprocity in children with autism spectrum disorder.
Harbison, Amy L; Woynaroski, Tiffany G; Tapp, Jon; Wade, Joshua W; Warlaumont, Anne S; Yoder, Paul J
2018-06-01
Children's vocal development occurs in the context of reciprocal exchanges with a communication partner who models "speechlike" productions. We propose a new measure of child vocal reciprocity, which we define as the degree to which an adult vocal response increases the probability of an immediately following child vocal response. Vocal reciprocity is likely to be associated with the speechlikeness of vocal communication in young children with autism spectrum disorder (ASD). Two studies were conducted to test the utility of the new measure. The first used simulated vocal samples with randomly sequenced child and adult vocalizations to test the accuracy of the proposed index of child vocal reciprocity. The second was an empirical study of 21 children with ASD who were preverbal or in the early stages of language development. Daylong vocal samples collected in the natural environment were computer analyzed to derive the proposed index of child vocal reciprocity, which was highly stable when derived from two daylong vocal samples and was associated with speechlikeness of vocal communication. This association was significant even when controlling for chance probability of child vocalizations to adult vocal responses, probability of adult vocalizations, or probability of child vocalizations. A valid measure of children's vocal reciprocity might eventually improve our ability to predict which children are on track to develop useful speech and/or are most likely to respond to language intervention. A link to a free, publicly-available software program to derive the new measure of child vocal reciprocity is provided. Autism Res 2018, 11: 903-915. © 2018 International Society for Autism Research, Wiley Periodicals, Inc. Children and adults often engage in back-and-forth vocal exchanges. The extent to which they do so is believed to support children's early speech and language development. Two studies tested a new measure of child vocal reciprocity using computer-generated and real-life vocal samples of young children with autism collected in natural settings. The results provide initial evidence of accuracy, test-retest reliability, and validity of the new measure of child vocal reciprocity. A sound measure of children's vocal reciprocity might improve our ability to predict which children are on track to develop useful speech and/or are most likely to respond to language intervention. A free, publicly-available software program and manuals are provided. © 2018 International Society for Autism Research, Wiley Periodicals, Inc.
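A minimal sketch of the type of computation implied by the definition above: the probability of a child vocalization immediately after an adult vocalization, compared with the chance probability of a child vocalization; the event coding and the simple difference index are illustrative assumptions, not the authors' exact algorithm.

```python
# Sequence of coded vocal events in time order:
#   'C' = child vocalization, 'A' = adult vocalization (treated here as a response)
events = list("CACCACACCCACAACCAC")  # toy example

n = len(events)
# probability that a child vocalization immediately follows an adult vocalization
after_adult = [events[i + 1] == 'C' for i in range(n - 1) if events[i] == 'A']
p_child_after_adult = sum(after_adult) / len(after_adult)

# chance (marginal) probability of a child vocalization
p_child = events.count('C') / n

reciprocity_index = p_child_after_adult - p_child
print(f"P(child | after adult) = {p_child_after_adult:.2f}, "
      f"P(child) = {p_child:.2f}, index = {reciprocity_index:+.2f}")
```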
Multi-scale occupancy estimation and modelling using multiple detection methods
Nichols, James D.; Bailey, Larissa L.; O'Connell, Allan F.; Talancy, Neil W.; Grant, Evan H. Campbell; Gilbert, Andrew T.; Annand, Elizabeth M.; Husband, Thomas P.; Hines, James E.
2008-01-01
Occupancy estimation and modelling based on detection–nondetection data provide an effective way of exploring change in a species’ distribution across time and space in cases where the species is not always detected with certainty. Today, many monitoring programmes target multiple species, or life stages within a species, requiring the use of multiple detection methods. When multiple methods or devices are used at the same sample sites, animals can be detected by more than one method. We develop occupancy models for multiple detection methods that permit simultaneous use of data from all methods for inference about method-specific detection probabilities. Moreover, the approach permits estimation of occupancy at two spatial scales: the larger scale corresponds to species’ use of a sample unit, whereas the smaller scale corresponds to presence of the species at the local sample station or site. We apply the models to data collected on two different vertebrate species: striped skunks Mephitis mephitis and red salamanders Pseudotriton ruber. For striped skunks, large-scale occupancy estimates were consistent between two sampling seasons. Small-scale occupancy probabilities were slightly lower in the late winter/spring when skunks tend to conserve energy, and movements are limited to males in search of females for breeding. There was strong evidence of method-specific detection probabilities for skunks. As anticipated, large- and small-scale occupancy areas completely overlapped for red salamanders. The analyses provided weak evidence of method-specific detection probabilities for this species. Synthesis and applications. Increasingly, many studies are utilizing multiple detection methods at sampling locations. The modelling approach presented here makes efficient use of detections from multiple methods to estimate occupancy probabilities at two spatial scales and to compare detection probabilities associated with different detection methods. The models can be viewed as another variation of Pollock's robust design and may be applicable to a wide variety of scenarios where species occur in an area but are not always near the sampled locations. The estimation approach is likely to be especially useful in multispecies conservation programmes by providing efficient estimates using multiple detection devices and by providing device-specific detection probability estimates for use in survey design.
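A minimal sketch of a single-scale occupancy likelihood with method-specific detection probabilities, fitted by maximizing the likelihood numerically; the toy detection histories are assumptions, and the paper's full model adds a second (local-availability) scale not shown here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Toy detection data: rows = sites, columns = survey occasions,
# method[j] gives which detection method was used on occasion j.
y = np.array([[1, 0, 1, 0],
              [0, 0, 0, 0],
              [0, 1, 0, 0],
              [1, 1, 0, 1],
              [0, 0, 0, 0]])
method = np.array([0, 1, 0, 1])   # two methods, indexed 0 and 1

def negloglik(params):
    psi = expit(params[0])                  # occupancy probability
    p = expit(params[1:])[method]           # method-specific detection probabilities
    lik_detected = psi * np.prod(p**y * (1 - p)**(1 - y), axis=1)
    never_detected = (y.sum(axis=1) == 0)
    lik = lik_detected + (1 - psi) * never_detected
    return -np.sum(np.log(lik))

fit = minimize(negloglik, x0=np.zeros(3), method="Nelder-Mead")
psi_hat, p_hat = expit(fit.x[0]), expit(fit.x[1:])
print(f"psi = {psi_hat:.2f}, p_method0 = {p_hat[0]:.2f}, p_method1 = {p_hat[1]:.2f}")
```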
The late Neandertal supraorbital fossils from Vindija Cave, Croatia: a biased sample?
Ahern, James C M; Lee, Sang-Hee; Hawks, John D
2002-09-01
The late Neandertal sample from Vindija (Croatia) has been described as transitional between the earlier Central European Neandertals from Krapina (Croatia) and modern humans. However, the morphological differences indicating this transition may rather be the result of different sex and/or age compositions between the samples. This study tests the hypothesis that the metric differences between the Krapina and Vindija supraorbital samples are due to sampling bias. We focus upon the supraorbital region because past studies have posited this region as particularly indicative of the Vindija sample's transitional nature. Furthermore, the supraorbital region varies significantly with both age and sex. We analyzed four chords and two derived indices of supraorbital torus form as defined by Smith & Ranyard (1980, Am. J. Phys. Anthrop. 93, pp. 589-610). For each variable, we analyzed relative sample bias of the Krapina and Vindija samples using three sampling methods. In order to test the hypothesis that the Vindija sample contains an over-representation of females and/or young while the Krapina sample is normal or also female/young biased, we determined the probability of drawing a sample of the same size as and with a mean equal to or less than Vindija's from a Krapina-based population. In order to test the hypothesis that the Vindija sample is female/young biased while the Krapina sample is male/old biased, we determined the probability of drawing a sample of the same size as and with a mean equal to or less than Vindija's from a generated population whose mean is halfway between Krapina's and Vindija's. Finally, in order to test the hypothesis that the Vindija sample is normal while the Krapina sample contains an over-representation of males and/or old, we determined the probability of drawing a sample of the same size as and with a mean equal to or greater than Krapina's from a Vindija-based population. Unless we assume that the Vindija sample is female/young and the Krapina sample is male/old biased, our results falsify the hypothesis that the metric differences between the Krapina and Vindija samples are due to sample bias.
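A minimal sketch of the resampling logic for the first hypothesis test: the probability of drawing a sample of Vindija's size from a Krapina-based (here, normal) population with a mean at or below the Vindija mean; all summary statistics are placeholders, not the published measurements.

```python
import numpy as np

rng = np.random.default_rng(3)

# Placeholder summary statistics for one supraorbital chord (not the real data)
krapina_mean, krapina_sd, n_krapina = 18.0, 2.0, 12
vindija_mean, n_vindija = 16.5, 5

# Monte Carlo: draw many samples of size n_vindija from a Krapina-based
# (normal) population and count how often the sample mean is <= Vindija's mean.
n_sims = 100_000
sample_means = rng.normal(krapina_mean, krapina_sd,
                          size=(n_sims, n_vindija)).mean(axis=1)
p_value = np.mean(sample_means <= vindija_mean)
print(f"P(sample mean <= Vindija mean | Krapina population) = {p_value:.4f}")
```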
Coalescence computations for large samples drawn from populations of time-varying sizes
Polanski, Andrzej; Szczesna, Agnieszka; Garbulowski, Mateusz; Kimmel, Marek
2017-01-01
We present new results concerning probability distributions of times in the coalescence tree and expected allele frequencies for the coalescent with large sample sizes. The obtained results are based on computational methodologies, which involve combining coalescence time scale changes with techniques of integral transformations and using analytical formulae for infinite products. We show applications of the proposed methodologies for computing probability distributions of times in the coalescence tree and their limits, for evaluation of accuracy of approximate expressions for times in the coalescence tree and expected allele frequencies, and for the analysis of a large human mitochondrial DNA dataset. PMID:28170404
The exact probability distribution of the rank product statistics for replicated experiments.
Eisinga, Rob; Breitling, Rainer; Heskes, Tom
2013-03-18
The rank product method is a widely accepted technique for detecting differentially regulated genes in replicated microarray experiments. To approximate the sampling distribution of the rank product statistic, the original publication proposed a permutation approach, whereas recently an alternative approximation based on the continuous gamma distribution was suggested. However, both approximations are imperfect for estimating small tail probabilities. In this paper we relate the rank product statistic to number theory and provide a derivation of its exact probability distribution and the true tail probabilities. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
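A minimal sketch of the rank product statistic together with a permutation approximation to its tail probability (the paper derives the exact distribution instead); the toy expression matrix is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy data: fold-change values for n_genes genes in k replicated experiments
n_genes, k = 200, 4
data = rng.normal(0, 1, size=(n_genes, k))
data[0] += 2.0                      # make gene 0 consistently up-regulated

# Rank within each replicate (rank 1 = largest fold change), then take the product
ranks = (-data).argsort(axis=0).argsort(axis=0) + 1
rank_product = ranks.prod(axis=1)

# Permutation approximation of the tail probability for gene 0: under the null,
# a single gene's rank in each replicate is uniform on 1..n_genes
observed = rank_product[0]
n_perm = 20_000
perm_rp = np.array([rng.integers(1, n_genes + 1, size=k).prod()
                    for _ in range(n_perm)])
p_value = np.mean(perm_rp <= observed)
print(f"rank product of gene 0: {observed}, permutation p ~= {p_value:.4g}")
```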
NASA Astrophysics Data System (ADS)
Sergeenko, N. P.
2017-11-01
An adequate statistical method should be developed in order to predict probabilistically the range of ionospheric parameters. This problem is solved in this paper. The time series of the critical frequency of the F2 layer, foF2(t), were subjected to statistical processing. For the obtained samples {δfoF2}, statistical distributions and invariants up to the fourth order are calculated. The analysis shows that the distributions differ from the Gaussian law during disturbances. At sufficiently small probability levels, there are arbitrarily large deviations from the model of the normal process. Therefore, an attempt is made to describe the statistical samples {δfoF2} with a Poisson-based model. For the studied samples, an exponential characteristic function is selected under the assumption that the time series are a superposition of deterministic and random processes. Using the Fourier transform, the characteristic function is transformed into a nonholomorphic, excess-asymmetric probability-density function. The statistical distributions of the samples {δfoF2} calculated for the disturbed periods are compared with the obtained model distribution function. According to Kolmogorov's criterion, the probabilities of the coincidence of the a posteriori distributions with the theoretical ones are P ≈ 0.7-0.9. This analysis supports the applicability of a model based on the Poisson random process for the statistical description of the variations {δfoF2} and for probabilistic estimates of their range during heliogeophysical disturbances.
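A minimal sketch of the Kolmogorov-type comparison described, testing a sample of disturbances against fitted candidate distributions; the synthetic Laplace sample stands in for {δfoF2} and is not the paper's Poisson-based model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Synthetic stand-in for a sample of ionospheric disturbances {delta foF2}
sample = rng.laplace(loc=0.0, scale=1.0, size=500)   # heavier-tailed than Gaussian

# Compare the empirical distribution with two candidate models via the
# Kolmogorov-Smirnov criterion. Note: fitting the parameters from the same
# sample makes the nominal p-values optimistic (Lilliefors-type issue).
for name, dist in [("normal", stats.norm), ("laplace", stats.laplace)]:
    params = dist.fit(sample)
    ks = stats.kstest(sample, dist.cdf, args=params)
    print(f"{name:8s}  D = {ks.statistic:.3f}  P = {ks.pvalue:.3f}")
```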
Challenges of DNA-based mark-recapture studies of American black bears
Settlage, K.E.; Van Manen, F.T.; Clark, J.D.; King, T.L.
2008-01-01
We explored whether genetic sampling would be feasible to provide a region-wide population estimate for American black bears (Ursus americanus) in the southern Appalachians, USA. Specifically, we determined whether adequate capture probabilities (p >0.20) and population estimates with a low coefficient of variation (CV <20%) could be achieved given typical agency budget and personnel constraints. We extracted DNA from hair collected from baited barbed-wire enclosures sampled over a 10-week period on 2 study areas: a high-density black bear population in a portion of Great Smoky Mountains National Park and a lower density population on National Forest lands in North Carolina, South Carolina, and Georgia. We identified individual bears by their unique genotypes obtained from 9 microsatellite loci. We sampled 129 and 60 different bears in the National Park and National Forest study areas, respectively, and applied closed mark–recapture models to estimate population abundance. Capture probabilities and precision of the population estimates were acceptable only for sampling scenarios for which we pooled weekly sampling periods. We detected capture heterogeneity biases, probably because of inadequate spatial coverage by the hair-trapping grid. The logistical challenges of establishing and checking a sufficiently high density of hair traps make DNA-based estimates of black bears impractical for the southern Appalachian region. Alternatives are to estimate population size for smaller areas, estimate population growth rates or survival using mark–recapture methods, or use independent marking and recapturing techniques to reduce capture heterogeneity.
Detection of image structures using the Fisher information and the Rao metric.
Maybank, Stephen J
2004-12-01
In many detection problems, the structures to be detected are parameterized by the points of a parameter space. If the conditional probability density function for the measurements is known, then detection can be achieved by sampling the parameter space at a finite number of points and checking each point to see if the corresponding structure is supported by the data. The number of samples and the distances between neighboring samples are calculated using the Rao metric on the parameter space. The Rao metric is obtained from the Fisher information which is, in turn, obtained from the conditional probability density function. An upper bound is obtained for the probability of a false detection. The calculations are simplified in the low noise case by making an asymptotic approximation to the Fisher information. An application to line detection is described. Expressions are obtained for the asymptotic approximation to the Fisher information, the volume of the parameter space, and the number of samples. The time complexity for line detection is estimated. An experimental comparison is made with a Hough transform-based method for detecting lines.
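A minimal sketch of the sampling idea for a one-dimensional location parameter observed in Gaussian noise, where the Fisher information is n/sigma^2 and neighbouring parameter samples are placed a fixed Rao distance apart; the noise level, parameter range, and target spacing are illustrative assumptions.

```python
import math

sigma, n = 0.05, 30        # noise std of each measurement, number of measurements
theta_lo, theta_hi = 0.0, 1.0
d0 = 1.0                   # desired Rao distance between neighbouring samples

fisher_info = n / sigma**2                  # I(theta) for a Gaussian location parameter
step = d0 / math.sqrt(fisher_info)          # parameter spacing giving Rao distance d0
n_samples = math.ceil((theta_hi - theta_lo) / step)
samples = [theta_lo + i * step for i in range(n_samples + 1)]
print(f"Fisher information: {fisher_info:.0f}, spacing: {step:.4f}, "
      f"number of samples: {len(samples)}")
```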
Biodegradation Probability Program (BIODEG)
The Biodegradation Probability Program (BIODEG) calculates the probability that a chemical under aerobic conditions with mixed cultures of microorganisms will biodegrade rapidly or slowly. It uses fragment constants developed using multiple linear and non-linear regressions and d...
ERIC Educational Resources Information Center
Saito, Rebecca N.
2006-01-01
Most people would probably agree that participation in quality youth programs and neighborhood-based, informal relationships and opportunities is a good thing for young people. The problem is that not nearly enough children and youth are engaged in these growth-enhancing opportunities. What can educators learn from young people about designing…
Exact and Monte carlo resampling procedures for the Wilcoxon-Mann-Whitney and Kruskal-Wallis tests.
Berry, K J; Mielke, P W
2000-12-01
Exact and Monte Carlo resampling FORTRAN programs are described for the Wilcoxon-Mann-Whitney rank sum test and the Kruskal-Wallis one-way analysis of variance for ranks test. The program algorithms compensate for tied values and do not depend on asymptotic approximations for probability values, unlike most algorithms contained in PC-based statistical software packages.
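A minimal Python analogue (not the FORTRAN programs described) of a Monte Carlo resampling Wilcoxon-Mann-Whitney test that uses mid-ranks for tied values; the two small samples are illustrative.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(6)

x = [1.2, 3.4, 2.2, 4.1, 2.9]          # group 1 (illustrative data)
y = [2.0, 5.6, 4.4, 3.9, 6.1, 5.0]     # group 2

pooled = np.array(x + y)
n_x = len(x)
ranks = rankdata(pooled)               # mid-ranks handle tied values
observed = ranks[:n_x].sum()           # rank sum of group 1
expected = ranks.sum() * n_x / len(pooled)   # null expectation of the rank sum

# Monte Carlo resampling: permute group labels and recompute the rank sum
n_resamples = 100_000
count = 0
for _ in range(n_resamples):
    perm = rng.permutation(ranks)
    if abs(perm[:n_x].sum() - expected) >= abs(observed - expected):
        count += 1
print(f"two-sided resampling p-value: {count / n_resamples:.4f}")
```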
McGlynn, Natalie; Kirsh, Victoria A.; Cotterchio, Michelle; Harris, M. Anne; Nadalin, Victoria; Kreiger, Nancy
2015-01-01
Background/Objectives It has been suggested that the association between shift work and chronic disease is mediated by an increase in obesity. However, investigations of the relationship between shift work and obesity reveal mixed findings. Using a recently developed exposure assessment tool, this study examined the association between shift work and obesity among Canadian women from two studies: a cohort of university alumni, and a population-based study. Methods Self-administered questionnaire data were used from healthy, currently employed females in a population-based study, the Ontario Women’s Diet and Health case-control study (n = 1611 controls), and from a subset of university alumni from the Canadian Study of Diet, Lifestyle, and Health (n = 1097) cohort study. Overweight was defined as BMI≥25 to <30, and obesity as BMI≥30. Reported occupation was converted to occupational codes and linked to a probability of shift work value derived from Survey of Labour and Income Dynamics data. Regular evenings, nights, or rotating work comprised shift work. Polytomous logistic regression estimated the association of probability of shift work, categorized as near nil, low, medium, and high probability of shift work, with overweight and obesity, controlling for detected confounders. Results In the population-based sample, high probability of shift work was associated with obesity (reference = near nil probability of shift work, OR: 1.88, 95% CI: 1.01–3.51, p = 0.047). In the alumni cohort, no significant association was detected between shift work and overweight or obesity. Conclusions As these analyses found a positive association between high probability of shift work exposure and obesity in a population-based sample, but not in an alumni cohort, it is suggested that the relationship between shift work and obesity is complex, and may be particularly susceptible to occupational and education-related factors within a given population. PMID:26376050
McGlynn, Natalie; Kirsh, Victoria A; Cotterchio, Michelle; Harris, M Anne; Nadalin, Victoria; Kreiger, Nancy
2015-01-01
It has been suggested that the association between shift work and chronic disease is mediated by an increase in obesity. However, investigations of the relationship between shift work and obesity reveal mixed findings. Using a recently developed exposure assessment tool, this study examined the association between shift work and obesity among Canadian women from two studies: a cohort of university alumni, and a population-based study. Self-administered questionnaire data were used from healthy, currently employed females in a population-based study, the Ontario Women's Diet and Health case-control study (n = 1611 controls), and from a subset of university alumni from the Canadian Study of Diet, Lifestyle, and Health (n = 1097) cohort study. Overweight was defined as BMI≥25 to <30, and obesity as BMI≥30. Reported occupation was converted to occupational codes and linked to a probability of shift work value derived from Survey of Labour and Income Dynamics data. Regular evenings, nights, or rotating work comprised shift work. Polytomous logistic regression estimated the association of probability of shift work, categorized as near nil, low, medium, and high probability of shift work, with overweight and obesity, controlling for detected confounders. In the population-based sample, high probability of shift work was associated with obesity (reference = near nil probability of shift work, OR: 1.88, 95% CI: 1.01-3.51, p = 0.047). In the alumni cohort, no significant association was detected between shift work and overweight or obesity. As these analyses found a positive association between high probability of shift work exposure and obesity in a population-based sample, but not in an alumni cohort, it is suggested that the relationship between shift work and obesity is complex, and may be particularly susceptible to occupational and education-related factors within a given population.
Confidence Intervals for Proportion Estimates in Complex Samples. Research Report. ETS RR-06-21
ERIC Educational Resources Information Center
Oranje, Andreas
2006-01-01
Confidence intervals are an important tool to indicate uncertainty of estimates and to give an idea of probable values of an estimate if a different sample from the population was drawn or a different sample of measures was used. Standard symmetric confidence intervals for proportion estimates based on a normal approximation can yield bounds…
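A minimal sketch contrasting a Wald and a Wilson interval for a proportion, with a design-effect adjustment of the effective sample size as a crude stand-in for complex-sample variance; the estimate, sample size, and design effect are illustrative assumptions, not the report's procedure.

```python
import math

p_hat, n, deff = 0.15, 400, 2.1       # illustrative estimate, sample size, design effect
z = 1.96
n_eff = n / deff                       # effective sample size under the complex design

# Standard symmetric (Wald) interval using the effective sample size
se = math.sqrt(p_hat * (1 - p_hat) / n_eff)
wald = (p_hat - z * se, p_hat + z * se)

# Wilson score interval using the effective sample size
denom = 1 + z**2 / n_eff
centre = (p_hat + z**2 / (2 * n_eff)) / denom
half = z * math.sqrt(p_hat * (1 - p_hat) / n_eff + z**2 / (4 * n_eff**2)) / denom
wilson = (centre - half, centre + half)

print(f"Wald:   ({wald[0]:.3f}, {wald[1]:.3f})")
print(f"Wilson: ({wilson[0]:.3f}, {wilson[1]:.3f})")
```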
Zhang, Hang; Maloney, Laurence T.
2012-01-01
In decision from experience, the source of probability information affects how probability is distorted in the decision task. Understanding how and why probability is distorted is a key issue in understanding the peculiar character of experience-based decision. We consider how probability information is used not just in decision-making but also in a wide variety of cognitive, perceptual, and motor tasks. Very similar patterns of distortion of probability/frequency information have been found in visual frequency estimation, frequency estimation based on memory, signal detection theory, and in the use of probability information in decision-making under risk and uncertainty. We show that distortion of probability in all cases is well captured as linear transformations of the log odds of frequency and/or probability, a model with a slope parameter, and an intercept parameter. We then consider how task and experience influence these two parameters and the resulting distortion of probability. We review how the probability distortions change in systematic ways with task and report three experiments on frequency distortion where the distortions change systematically in the same task. We found that the slope of frequency distortions decreases with the sample size, which is echoed by findings in decision from experience. We review previous models of the representation of uncertainty and find that none can account for the empirical findings. PMID:22294978
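A minimal sketch of the linear-in-log-odds distortion model described above: probabilities are mapped through a linear transformation of their log odds with a slope and an intercept; the parameter values are illustrative.

```python
import numpy as np

def distort(p, slope, intercept):
    """Linear-in-log-odds probability distortion: lo' = slope * lo + intercept."""
    lo = np.log(p / (1 - p))
    lo_d = slope * lo + intercept
    return 1 / (1 + np.exp(-lo_d))

# With slope < 1, small probabilities are overestimated and large ones underestimated
p = np.array([0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 0.99])
print(np.round(distort(p, slope=0.6, intercept=0.4), 3))   # illustrative parameters
```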
ERIC Educational Resources Information Center
King, James M.; And Others
The materials described here represent the conversion of a highly popular student workbook "Sets, Probability and Statistics: The Mathematics of Life Insurance" into a computer program. The program is designed to familiarize students with the concepts of sets, probability, and statistics, and to provide practice using real life examples. It also…
Building from within: pastoral insights into community resources and assets.
Ford, Cassandra D
2013-01-01
To explore perceptions of community pastors regarding the extent of community resources and assets in a rural, Southern, African American community. Utilizing a qualitative, descriptive design, interviews were conducted with six African American pastors. Interviews were conducted using a semi-structured interview guide based on an assets-oriented approach. Pastors discussed various resources and assets present within the community that may be considered as support for program development. Key themes included: (1) community strengths, (2) community support, and (3) resources for a healthy lifestyle. The church was identified, throughout the interviews, as a primary source of strength and support for community members. In this study of African American pastors, various perceptions of community resources were identified. Findings indicate that the sampled rural, Southern, African American community has a wealth of resources and assets, but additional resources related to health promotion are still necessary to produce optimal results. Specific programs to prevent chronic conditions such as cardiovascular disease can provide an effective means for addressing related health disparities. Programs implemented through churches can reach large numbers of individuals in the community and provide an important source of sustainable efforts to improve the health of African Americans. © 2013 Wiley Periodicals, Inc.
Bivariate normal, conditional and rectangular probabilities: A computer program with applications
NASA Technical Reports Server (NTRS)
Swaroop, R.; Brownlow, J. D.; Ashwworth, G. R.; Winter, W. R.
1980-01-01
Some results for the bivariate normal distribution analysis are presented. Computer programs for conditional normal probabilities, marginal probabilities, as well as joint probabilities for rectangular regions are given; routines for computing fractile points and distribution functions are also presented. Some examples from a closed circuit television experiment are included.
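A minimal modern sketch (SciPy rather than the original program) of the quantities described: a rectangular joint probability and a conditional probability for a bivariate normal; the means, standard deviations, correlation, and rectangle are illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

mu = np.array([0.0, 1.0])
sd = np.array([1.0, 2.0])
rho = 0.5
cov = np.array([[sd[0]**2, rho * sd[0] * sd[1]],
                [rho * sd[0] * sd[1], sd[1]**2]])
bvn = multivariate_normal(mean=mu, cov=cov)

# Joint probability over the rectangle a1 < X < b1, a2 < Y < b2,
# obtained by inclusion-exclusion on the joint CDF
a1, b1, a2, b2 = -1.0, 1.0, 0.0, 3.0
rect = (bvn.cdf([b1, b2]) - bvn.cdf([a1, b2])
        - bvn.cdf([b1, a2]) + bvn.cdf([a1, a2]))

# Conditional distribution of Y given X = x is normal with these parameters
x = 0.5
cond_mean = mu[1] + rho * sd[1] / sd[0] * (x - mu[0])
cond_sd = sd[1] * np.sqrt(1 - rho**2)
p_cond = norm.cdf(b2, cond_mean, cond_sd) - norm.cdf(a2, cond_mean, cond_sd)

print(f"rectangular probability: {rect:.4f}")
print(f"P({a2} < Y < {b2} | X = {x}) = {p_cond:.4f}")
```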
Hahm, Myung-Il; Park, Eun-Cheol; Choi, Kui Son; Lee, Hoo-Yeon; Park, Jae-Hyun; Park, Sohee
2011-02-01
Although national-level organized cancer screening programs have reduced barriers to screening for people of low socioeconomic status, barriers to early screening remain. Our aim was to determine the diffusion pattern and identify the factors associated with early participation in stomach and breast cancer screening programs. The study population was derived from the Korean National Cancer Screening Survey, conducted in 2007. A stratified random sample of people aged 40 years and older from a nationwide population-based database was gathered in Korea (n=1,517) in 2007. Time of participation in early screening was defined as the number of years that had elapsed between the participant's 30th birthday and the age at first screening. Significant differences were observed in the probability of adopting stomach and breast cancer screening in relation to education, household income, and job level. Results from Cox's proportional hazard model indicated that higher household income was significantly associated with an increased probability of adopting stomach cancer screening earlier (p<0.05), and people with high household incomes were more likely to adopt breast cancer screening earlier than were those with incomes under US$1,500 per month (p<0.01). When considered at a significance level of 0.1, we found that the most highly educated women were more likely than the least educated to be screened early. Despite organized governmental screening programs, there are still inequalities in the early adoption of cancer screening. The results of this study also suggest that inequalities in early adoption may affect participation in regular screening. Copyright © 2010 Elsevier Ltd. All rights reserved.
Cost and detection rate of glaucoma screening with imaging devices in a primary care center
Anton, Alfonso; Fallon, Monica; Cots, Francesc; Sebastian, María A; Morilla-Grasa, Antonio; Mojal, Sergi; Castells, Xavier
2017-01-01
Purpose To analyze the cost and detection rate of a screening program for detecting glaucoma with imaging devices. Materials and methods In this cross-sectional study, a glaucoma screening program was applied in a population-based sample randomly selected from a population of 23,527. Screening targeted the population at risk of glaucoma. Examinations included optic disk tomography (Heidelberg retina tomograph [HRT]), nerve fiber analysis, and tonometry. Subjects who met at least 2 of 3 endpoints (HRT outside normal limits, nerve fiber index ≥30, or tonometry ≥21 mmHg) were referred for glaucoma consultation. The currently established (“conventional”) detection method was evaluated by recording data from primary care and ophthalmic consultations in the same population. The direct costs of screening and conventional detection were calculated by adding the unit costs generated during the diagnostic process. The detection rate of new glaucoma cases was assessed. Results The screening program evaluated 414 subjects; 32 cases were referred for glaucoma consultation, 7 had glaucoma, and 10 had probable glaucoma. The current detection method assessed 677 glaucoma suspects in the population, of whom 29 were diagnosed with glaucoma or probable glaucoma. Glaucoma screening and the conventional detection method had detection rates of 4.1% and 3.1%, respectively, and the cost per case detected was 1,410€ and 1,435€, respectively. The cost of screening 1 million inhabitants would be 5.1 million euros and would allow the detection of 4,715 new cases. Conclusion The proposed screening method directed at the population at risk allows a detection rate of 4.1% and a cost of 1,410€ per case detected. PMID:28243057
NASA Technical Reports Server (NTRS)
Jasperson, W. H.; Nastron, G. D.; Davis, R. E.; Holdeman, J. D.
1984-01-01
Summary studies are presented for the entire cloud observation archive from the NASA Global Atmospheric Sampling Program (GASP). Studies are also presented for GASP particle-concentration data gathered concurrently with the cloud observations. Cloud encounters are shown on about 15 percent of the data samples overall, but the probability of cloud encounter is shown to vary significantly with altitude, latitude, and distance from the tropopause. Several meteorological circulation features are apparent in the latitudinal distribution of cloud cover, and the cloud-encounter statistics are shown to be consistent with the classical mid-latitude cyclone model. Observations of clouds spaced more closely than 90 minutes are shown to be statistically dependent. The statistics for cloud and particle encounter are utilized to estimate the frequency of cloud encounter on long-range airline routes, and to assess the probability and extent of laminar flow loss due to cloud or particle encounter by aircraft utilizing laminar flow control (LFC). It is shown that the probability of extended cloud encounter is too low, of itself, to make LFC impractical. This report is presented in two volumes. Volume I contains the narrative, analysis, and conclusions. Volume II contains five supporting appendixes.
NASA Astrophysics Data System (ADS)
Wang, C.; Rubin, Y.
2014-12-01
The spatial distribution of the compression modulus Es, an important geotechnical parameter, contributes considerably to the understanding of the underlying geological processes and to the adequate assessment of the effects of Es on the differential settlement of large continuous structure foundations. Such analyses should be derived using an assimilating approach that combines in-situ static cone penetration tests (CPT) with borehole experiments. To achieve this, the Es distribution of a silty clay stratum in region A of the China Expo Center (Shanghai) is studied using the Bayesian maximum entropy method. This method rigorously and efficiently integrates geotechnical investigations of different precision and multiple sources of uncertainty. Individual CPT soundings were modeled as probability density curves using maximum entropy theory. A spatial prior multivariate probability density function (PDF) and a likelihood PDF relating the CPT positions to the potential value at the prediction point were built from the borehole experiments; then, after numerical integration over the CPT probability density curves, the posterior probability density curve at the prediction point was calculated within the Bayesian reverse interpolation framework. The results were compared between Gaussian sequential stochastic simulation and the Bayesian method, and the differences between treating individual CPT soundings as normal distributions and as simulated probability density curves based on maximum entropy theory are also discussed. It is shown that the characterization of the spatial distribution of Es can be improved by properly incorporating CPT sampling variation into the interpolation process, and that more informative estimates are generated by considering CPT uncertainty at the estimation points. The calculations illustrate the significance of stochastic characterization of Es within a stratum and identify limitations associated with inadequate geostatistical interpolation techniques. These characterization results provide a multi-precision information assimilation approach for other geotechnical parameters.
Surveying Europe's Only Cave-Dwelling Chordate Species (Proteus anguinus) Using Environmental DNA.
Vörös, Judit; Márton, Orsolya; Schmidt, Benedikt R; Gál, Júlia Tünde; Jelić, Dušan
2017-01-01
In surveillance of subterranean fauna, especially in the case of rare or elusive aquatic species, traditional techniques used for epigean species are often not feasible. We developed a non-invasive survey method based on environmental DNA (eDNA) to detect the presence of the red-listed cave-dwelling amphibian, Proteus anguinus, in the caves of the Dinaric Karst. We tested the method in fifteen caves in Croatia, from which the species was previously recorded or expected to occur. We successfully confirmed the presence of P. anguinus from ten caves and detected the species for the first time in five others. Using a hierarchical occupancy model we compared the availability and detection probability of eDNA of two water sampling methods, filtration and precipitation. The statistical analysis showed that both availability and detection probability depended on the method and estimates for both probabilities were higher using filter samples than for precipitation samples. Combining reliable field and laboratory methods with robust statistical modeling will give the best estimates of species occurrence.
Norström, Madelaine; Jonsson, Malin E; Åkerstedt, Johan; Whist, Anne Cathrine; Kristoffersen, Anja Bråthen; Sviland, Ståle; Hopp, Petter; Wahlström, Helene
2014-09-01
Disease caused by Bovine virus diarrhoea virus (BVDV) is notifiable in Norway. An eradication programme started in 1992. The number of herds with restrictions decreased from 2950 in 1994 to zero at the end of 2006. From 2007, the aim of the programme has been surveillance in order to document freedom from the infection. To estimate the probability of freedom from BVDV infection in the Norwegian cattle population by the end of 2011, a scenario tree model of the surveillance program during the years 2007-2011 was used. Three surveillance system components (SSCs) were included in the model: dairy, beef suckler sampled at farms (2007-2010) and beef suckler sampled at slaughterhouses (2011). The design prevalence was set to 0.2% at herd level and to 30% at within-herd level for the whole cattle population. The median probability of freedom from BVDV in Norway at the end of 2011 was 0.996; (0.995-0.997, credibility interval). The results from the scenario tree model support that the Norwegian cattle population is free from BVDV. The highest estimate of the annual sensitivity for the beef suckling SSCs originated from the surveillance at the slaughterhouses in 2011. The change to sampling at the slaughterhouse level further increased the sensitivity of the surveillance. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
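A minimal sketch of the core scenario-tree bookkeeping: each year the probability of freedom is discounted for possible introduction and then updated, given negative surveillance, using that year's surveillance-system sensitivity at the design prevalence; the prior, sensitivities, and introduction risk below are illustrative, not the Norwegian programme's values.

```python
# Annual surveillance-system sensitivities (probability of detecting infection
# if present at the design prevalence) -- illustrative values
sse_by_year = [0.90, 0.85, 0.95, 0.97, 0.99]
p_free = 0.50          # non-informative starting prior
p_intro = 0.01         # assumed annual probability of introduction

for year, sse in enumerate(sse_by_year, start=2007):
    # discount for possible introduction since the last assessment
    p_inf = (1 - p_free) + p_intro - (1 - p_free) * p_intro
    p_free = 1 - p_inf
    # Bayes update given that surveillance found no infected herds
    p_free = p_free / (p_free + (1 - p_free) * (1 - sse))
    print(f"{year}: P(free) = {p_free:.4f}")
```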
Messner, Michael J; Berger, Philip; Javier, Julie
2017-06-01
Public water systems (PWSs) in the United States generate total coliform (TC) and Escherichia coli (EC) monitoring data, as required by the Total Coliform Rule (TCR). We analyzed data generated in 2011 by approximately 38,000 small (serving fewer than 4101 individuals) undisinfected public water systems (PWSs). We used statistical modeling to characterize a distribution of TC detection probabilities for each of nine groupings of PWSs based on system type (community, non-transient non-community, and transient non-community) and population served (less than 101, 101-1000 and 1001-4100 people). We found that among PWS types sampled in 2011, on average, undisinfected transient PWSs test positive for TC 4.3% of the time as compared with 3% for undisinfected non-transient PWSs and 2.5% for undisinfected community PWSs. Within each type of PWS, the smaller systems have higher median TC detection than the larger systems. All TC-positive samples were assayed for EC. Among TC-positive samples from small undisinfected PWSs, EC is detected in about 5% of samples, regardless of PWS type or size. We evaluated the upper tail of the TC detection probability distributions and found that significant percentages of some system types have high TC detection probabilities. For example, assuming the systems providing data are nationally-representative, then 5.0% of the ∼50,000 small undisinfected transient PWSs in the U.S. have TC detection probabilities of 20% or more. Communities with such high TC detection probabilities may have elevated risk of acute gastrointestinal (AGI) illness - perhaps as great or greater than the attributable risk to drinking water (6-22%) calculated for 14 Wisconsin community PWSs with much lower TC detection probabilities (about 2.3%, Borchardt et al., 2012). Published by Elsevier GmbH.
Statistical Inference in Hidden Markov Models Using k-Segment Constraints
Titsias, Michalis K.; Holmes, Christopher C.; Yau, Christopher
2016-01-01
Hidden Markov models (HMMs) are one of the most widely used statistical methods for analyzing sequence data. However, the reporting of output from HMMs has largely been restricted to the presentation of the most-probable (MAP) hidden state sequence, found via the Viterbi algorithm, or the sequence of most probable marginals using the forward–backward algorithm. In this article, we expand the amount of information we could obtain from the posterior distribution of an HMM by introducing linear-time dynamic programming recursions that, conditional on a user-specified constraint in the number of segments, allow us to (i) find MAP sequences, (ii) compute posterior probabilities, and (iii) simulate sample paths. We collectively call these recursions k-segment algorithms and illustrate their utility using simulated and real examples. We also highlight the prospective and retrospective use of k-segment constraints for fitting HMMs or exploring existing model fits. Supplementary materials for this article are available online. PMID:27226674
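For background, a minimal Viterbi sketch computing the unconstrained MAP state sequence for a toy two-state HMM (parameters are illustrative); the paper's k-segment recursions add a constraint on the number of segments that is not shown here.

```python
import numpy as np

# Toy 2-state HMM (illustrative parameters, not a fitted model)
pi = np.array([0.6, 0.4])                    # initial state probabilities
A = np.array([[0.9, 0.1], [0.2, 0.8]])       # transition matrix
B = np.array([[0.7, 0.3], [0.1, 0.9]])       # emission probabilities for symbols 0/1
obs = [0, 0, 1, 1, 1, 0, 1, 1]

# Viterbi recursion in log space: delta[t, s] = best log-probability of any
# state path ending in state s at time t
T, S = len(obs), len(pi)
delta = np.full((T, S), -np.inf)
back = np.zeros((T, S), dtype=int)
delta[0] = np.log(pi) + np.log(B[:, obs[0]])
for t in range(1, T):
    for s in range(S):
        cand = delta[t - 1] + np.log(A[:, s])
        back[t, s] = np.argmax(cand)
        delta[t, s] = cand[back[t, s]] + np.log(B[s, obs[t]])

# Backtrack the most probable (MAP) hidden state sequence
path = [int(np.argmax(delta[-1]))]
for t in range(T - 1, 0, -1):
    path.append(int(back[t, path[-1]]))
path.reverse()
print("MAP state sequence:", path)
```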
Jung, R.E.; Royle, J. Andrew; Sauer, J.R.; Addison, C.; Rau, R.D.; Shirk, J.L.; Whissel, J.C.
2005-01-01
Stream salamanders in the family Plethodontidae constitute a large biomass in and near headwater streams in the eastern United States and are promising indicators of stream ecosystem health. Many studies of stream salamanders have relied on population indices based on counts rather than population estimates based on techniques such as capture-recapture and removal. Application of estimation procedures allows the calculation of detection probabilities (the proportion of total animals present that are detected during a survey) and their associated sampling error, and may be essential for determining salamander population sizes and trends. In 1999, we conducted capture-recapture and removal population estimation methods for Desmognathus salamanders at six streams in Shenandoah National Park, Virginia, USA. Removal sampling appeared more efficient and detection probabilities from removal data were higher than those from capture-recapture. During 2001-2004, we used removal estimation at eight streams in the park to assess the usefulness of this technique for long-term monitoring of stream salamanders. Removal detection probabilities ranged from 0.39 to 0.96 for Desmognathus, 0.27 to 0.89 for Eurycea and 0.27 to 0.75 for northern spring (Gyrinophilus porphyriticus) and northern red (Pseudotriton ruber) salamanders across stream transects. Detection probabilities did not differ across years for Desmognathus and Eurycea, but did differ among streams for Desmognathus. Population estimates of Desmognathus decreased between 2001-2002 and 2003-2004 which may be related to changes in stream flow conditions. Removal-based procedures may be a feasible approach for population estimation of salamanders, but field methods should be designed to meet the assumptions of the sampling procedures. New approaches to estimating stream salamander populations are discussed.
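A minimal sketch of the classic two-pass removal estimator underlying removal-based population estimation; the counts are illustrative, and the study itself used likelihood-based removal models over repeated passes.

```python
# Two-pass removal estimator (Zippin/Seber form): animals caught and removed
c1, c2 = 46, 19        # illustrative first- and second-pass counts

p_hat = 1 - c2 / c1                # per-pass detection (capture) probability
N_hat = c1**2 / (c1 - c2)          # estimated population size on the transect
print(f"detection probability ~ {p_hat:.2f}, population estimate ~ {N_hat:.1f}")
```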
... use a complex, stratified, multistage probability cluster sampling design. NHANES data collection is based on a nationally ... conjunction with the 2012 NHANES and the survey design was based on the design for NHANES, with ...
Predictions of malaria vector distribution in Belize based on multispectral satellite data.
Roberts, D R; Paris, J F; Manguin, S; Harbach, R E; Woodruff, R; Rejmankova, E; Polanco, J; Wullschleger, B; Legters, L J
1996-03-01
Use of multispectral satellite data to predict arthropod-borne disease trouble spots is dependent on clear understandings of environmental factors that determine the presence of disease vectors. A blind test of remote sensing-based predictions for the spatial distribution of a malaria vector, Anopheles pseudopunctipennis, was conducted as a follow-up to two years of studies on vector-environmental relationships in Belize. Four of eight sites that were predicted to be high probability locations for presence of An. pseudopunctipennis were positive and all low probability sites (0 of 12) were negative. The absence of An. pseudopunctipennis at four high probability locations probably reflects the low densities that seem to characterize field populations of this species, i.e., the population densities were below the threshold of our sampling effort. Another important malaria vector, An. darlingi, was also present at all high probability sites and absent at all low probability sites. Anopheles darlingi, like An. pseudopunctipennis, is a riverine species. Prior to these collections at ecologically defined locations, this species was last detected in Belize in 1946.
NESTOR: A Computer-Based Medical Diagnostic Aid That Integrates Causal and Probabilistic Knowledge.
Cooper, Gregory F.
1984-11-01
...individual conditional probabilities between one cause node and its effect node, but less common to know a joint conditional probability between a... (Report documentation fragment: Department of Computer Science, Stanford University, Stanford, CA 94305, USA; ONR contract N00014-81-K-0004.)
Scheid, Anika; Nebel, Markus E
2012-07-09
Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case - without sacrificing much of the accuracy of the results. Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms.
2012-01-01
Background Over the past years, statistical and Bayesian approaches have become increasingly appreciated to address the long-standing problem of computational RNA structure prediction. Recently, a novel probabilistic method for the prediction of RNA secondary structures from a single sequence has been studied which is based on generating statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a sophisticated (traditional or length-dependent) stochastic context-free grammar (SCFG) that mirrors the standard thermodynamic model applied in modern physics-based prediction algorithms. Specifically, that grammar represents an exact probabilistic counterpart to the energy model underlying the Sfold software, which employs a sampling extension of the partition function (PF) approach to produce statistically representative subsets of the Boltzmann-weighted ensemble. Although both sampling approaches have the same worst-case time and space complexities, it has been indicated that they differ in performance (both with respect to prediction accuracy and quality of generated samples), where neither of these two competing approaches generally outperforms the other. Results In this work, we will consider the SCFG based approach in order to perform an analysis on how the quality of generated sample sets and the corresponding prediction accuracy changes when different degrees of disturbances are incorporated into the needed sampling probabilities. This is motivated by the fact that if the results prove to be resistant to large errors on the distinct sampling probabilities (compared to the exact ones), then it will be an indication that these probabilities do not need to be computed exactly, but it may be sufficient and more efficient to approximate them. Thus, it might then be possible to decrease the worst-case time requirements of such an SCFG based sampling method without significant accuracy losses. If, on the other hand, the quality of sampled structures can be observed to strongly react to slight disturbances, there is little hope for improving the complexity by heuristic procedures. We hence provide a reliable test for the hypothesis that a heuristic method could be implemented to improve the time scaling of RNA secondary structure prediction in the worst-case – without sacrificing much of the accuracy of the results. Conclusions Our experiments indicate that absolute errors generally lead to the generation of useless sample sets, whereas relative errors seem to have only small negative impact on both the predictive accuracy and the overall quality of resulting structure samples. Based on these observations, we present some useful ideas for developing a time-reduced sampling method guaranteeing an acceptable predictive accuracy. We also discuss some inherent drawbacks that arise in the context of approximation. The key results of this paper are crucial for the design of an efficient and competitive heuristic prediction method based on the increasingly accepted and attractive statistical sampling approach. This has indeed been indicated by the construction of prototype algorithms. PMID:22776037
Detection of Invasive Mosquito Vectors Using Environmental DNA (eDNA) from Water Samples
Schneider, Judith; Valentini, Alice; Dejean, Tony; Montarsi, Fabrizio; Taberlet, Pierre
2016-01-01
Repeated introductions and spread of invasive mosquito species (IMS) have been recorded on a large scale these last decades worldwide. In this context, members of the mosquito genus Aedes can present serious risks to public health as they have or may develop vector competence for various viral diseases. While the Tiger mosquito (Aedes albopictus) is a well-known vector for e.g. dengue and chikungunya viruses, the Asian bush mosquito (Ae. j. japonicus) and Ae. koreicus have shown vector competence in the field and the laboratory for a number of viruses including dengue, West Nile fever and Japanese encephalitis. Early detection and identification is therefore crucial for successful eradication or control strategies. Traditional specific identification and monitoring of different and/or cryptic life stages of the invasive Aedes species based on morphological grounds may lead to misidentifications, and are problematic when extensive surveillance is needed. In this study, we developed, tested and applied an environmental DNA (eDNA) approach for the detection of three IMS, based on water samples collected in the field in several European countries. We compared real-time quantitative PCR (qPCR) assays specific for these three species and an eDNA metabarcoding approach with traditional sampling, and discussed the advantages and limitations of these methods. Detection probabilities for eDNA-based approaches were in most of the specific comparisons higher than for traditional survey and the results were congruent between both molecular methods, confirming the reliability and efficiency of alternative eDNA-based techniques for the early and unambiguous detection and surveillance of invasive mosquito vectors. The ease of water sampling procedures in the eDNA approach tested here allows the development of large-scale monitoring and surveillance programs of IMS, especially using citizen science projects. PMID:27626642
Detection of Invasive Mosquito Vectors Using Environmental DNA (eDNA) from Water Samples.
Schneider, Judith; Valentini, Alice; Dejean, Tony; Montarsi, Fabrizio; Taberlet, Pierre; Glaizot, Olivier; Fumagalli, Luca
2016-01-01
Repeated introductions and spread of invasive mosquito species (IMS) have been recorded on a large scale these last decades worldwide. In this context, members of the mosquito genus Aedes can present serious risks to public health as they have or may develop vector competence for various viral diseases. While the Tiger mosquito (Aedes albopictus) is a well-known vector for e.g. dengue and chikungunya viruses, the Asian bush mosquito (Ae. j. japonicus) and Ae. koreicus have shown vector competence in the field and the laboratory for a number of viruses including dengue, West Nile fever and Japanese encephalitis. Early detection and identification is therefore crucial for successful eradication or control strategies. Traditional specific identification and monitoring of different and/or cryptic life stages of the invasive Aedes species based on morphological grounds may lead to misidentifications, and are problematic when extensive surveillance is needed. In this study, we developed, tested and applied an environmental DNA (eDNA) approach for the detection of three IMS, based on water samples collected in the field in several European countries. We compared real-time quantitative PCR (qPCR) assays specific for these three species and an eDNA metabarcoding approach with traditional sampling, and discussed the advantages and limitations of these methods. Detection probabilities for eDNA-based approaches were in most of the specific comparisons higher than for traditional survey and the results were congruent between both molecular methods, confirming the reliability and efficiency of alternative eDNA-based techniques for the early and unambiguous detection and surveillance of invasive mosquito vectors. The ease of water sampling procedures in the eDNA approach tested here allows the development of large-scale monitoring and surveillance programs of IMS, especially using citizen science projects.
Use and interpretation of logistic regression in habitat-selection studies
Keating, Kim A.; Cherry, Steve
2004-01-01
Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in use-availability studies. Although promising, this model fails to converge to a unique solution in some important situations. Further work is needed to obtain a robust method that is broadly applicable to use-availability studies.
ERIC Educational Resources Information Center
Gans, Herbert J.
To collect data on how to make television a more effective learning instrument outside of the classroom, a standard probability sample with quotas consisting of 200 adults and 200 adolescents living in New York City was interviewed to study how people use TV, their attitudes toward various types of programing, and their viewing preferences.…
The role of predictive uncertainty in the operational management of reservoirs
NASA Astrophysics Data System (ADS)
Todini, E.
2014-09-01
The present work deals with the operational management of multi-purpose reservoirs, whose optimisation-based rules are derived, in the planning phase, via deterministic (linear and nonlinear programming, dynamic programming, etc.) or via stochastic (generally stochastic dynamic programming) approaches. In operation, the resulting deterministic or stochastic optimised operating rules are then triggered based on inflow predictions. In order to fully benefit from predictions, one must avoid using them as direct inputs to the reservoirs, but rather assess the "predictive knowledge" in terms of a predictive probability density to be operationally used in the decision making process for the estimation of expected benefits and/or expected losses. Using a theoretical and extremely simplified case, it will be shown why directly using model forecasts instead of the full predictive density leads to less robust reservoir management decisions. Moreover, the effectiveness and the tangible benefits for using the entire predictive probability density instead of the model predicted values will be demonstrated on the basis of the Lake Como management system, operational since 1997, as well as on the basis of a case study on the lake of Aswan.
Woldegebriel, Michael; Vivó-Truyols, Gabriel
2016-10-04
A novel method for compound identification in liquid chromatography-high resolution mass spectrometry (LC-HRMS) is proposed. The method, based on Bayesian statistics, accommodates all possible uncertainties involved, from instrumentation up to data analysis, in a single model yielding the probability of the compound of interest being present or absent in the sample. This approach differs from classical methods in two ways. First, it is probabilistic (instead of deterministic); hence, it computes the probability that the compound is (or is not) present in a sample. Second, it answers the hypothesis "the compound is present", as opposed to answering the question "the compound feature is present". This second difference implies a shift in the way data analysis is tackled, since the probability of interfering compounds (i.e., isomers and isobaric compounds) is also taken into account.
High-Frequency Replanning Under Uncertainty Using Parallel Sampling-Based Motion Planning
Sun, Wen; Patil, Sachin; Alterovitz, Ron
2015-01-01
As sampling-based motion planners become faster, they can be re-executed more frequently by a robot during task execution to react to uncertainty in robot motion, obstacle motion, sensing noise, and uncertainty in the robot’s kinematic model. We investigate and analyze high-frequency replanning (HFR), where, during each period, fast sampling-based motion planners are executed in parallel as the robot simultaneously executes the first action of the best motion plan from the previous period. We consider discrete-time systems with stochastic nonlinear (but linearizable) dynamics and observation models with noise drawn from zero mean Gaussian distributions. The objective is to maximize the probability of success (i.e., avoid collision with obstacles and reach the goal) or to minimize path length subject to a lower bound on the probability of success. We show that, as parallel computation power increases, HFR offers asymptotic optimality for these objectives during each period for goal-oriented problems. We then demonstrate the effectiveness of HFR for holonomic and nonholonomic robots including car-like vehicles and steerable medical needles. PMID:26279645
Quantitative methods to direct exploration based on hydrogeologic information
Graettinger, A.J.; Lee, J.; Reeves, H.W.; Dethan, D.
2006-01-01
Quantitatively Directed Exploration (QDE) approaches based on information such as model sensitivity, input data covariance and model output covariance are presented. Seven approaches for directing exploration are developed, applied, and evaluated on a synthetic hydrogeologic site. The QDE approaches evaluate input information uncertainty, subsurface model sensitivity and, most importantly, output covariance to identify the next location to sample. Spatial input parameter values and covariances are calculated with the multivariate conditional probability calculation from a limited number of samples. A variogram structure is used during data extrapolation to describe the spatial continuity, or correlation, of subsurface information. Model sensitivity can be determined by perturbing input data and evaluating output response or, as in this work, sensitivities can be programmed directly into an analysis model. Output covariance is calculated by the First-Order Second Moment (FOSM) method, which combines the covariance of input information with model sensitivity. A groundwater flow example, modeled in MODFLOW-2000, is chosen to demonstrate the seven QDE approaches. MODFLOW-2000 is used to obtain the piezometric head and the model sensitivity simultaneously. The seven QDE approaches are evaluated based on the accuracy of the modeled piezometric head after information from a QDE sample is added. For the synthetic site used in this study, the QDE approach that identifies the location of hydraulic conductivity that contributes the most to the overall piezometric head variance proved to be the best method to quantitatively direct exploration.
Kocher, David C; Apostoaei, A Iulian; Henshaw, Russell W; Hoffman, F Owen; Schubauer-Berigan, Mary K; Stancescu, Daniel O; Thomas, Brian A; Trabalka, John R; Gilbert, Ethel S; Land, Charles E
2008-07-01
The Interactive RadioEpidemiological Program (IREP) is a Web-based, interactive computer code that is used to estimate the probability that a given cancer in an individual was induced by given exposures to ionizing radiation. IREP was developed by a Working Group of the National Cancer Institute and Centers for Disease Control and Prevention, and was adopted and modified by the National Institute for Occupational Safety and Health (NIOSH) for use in adjudicating claims for compensation for cancer under the Energy Employees Occupational Illness Compensation Program Act of 2000. In this paper, the quantity calculated in IREP is referred to as "probability of causation/assigned share" (PC/AS). PC/AS for a given cancer in an individual is calculated on the basis of an estimate of the excess relative risk (ERR) associated with given radiation exposures and the relationship PC/AS = ERR/(ERR + 1). IREP accounts for uncertainties in calculating probability distributions of ERR and PC/AS. An accounting of uncertainty is necessary when decisions about granting claims for compensation for cancer are made on the basis of an estimate of the upper 99% credibility limit of PC/AS to give claimants the "benefit of the doubt." This paper discusses models and methods incorporated in IREP to estimate ERR and PC/AS. Approaches to accounting for uncertainty are emphasized, and limitations of IREP are discussed. Although IREP is intended to provide unbiased estimates of ERR and PC/AS and their uncertainties to represent the current state of knowledge, there are situations described in this paper in which NIOSH, as a matter of policy, makes assumptions that give a higher estimate of the upper 99% credibility limit of PC/AS than other plausible alternatives and, thus, are more favorable to claimants.
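The central relationship above, PC/AS = ERR/(ERR + 1), and the propagation of ERR uncertainty into PC/AS can be illustrated with a minimal Monte Carlo sketch. The lognormal ERR distribution and its parameters below are hypothetical stand-ins; IREP's actual risk models and uncertainty treatment are far more detailed.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical lognormal uncertainty in the excess relative risk (ERR) for one
    # claimant; IREP's dose-response and uncertainty models are far richer.
    err = rng.lognormal(mean=np.log(0.5), sigma=0.8, size=100_000)

    # Probability of causation / assigned share for each realization of ERR.
    pc_as = err / (err + 1.0)

    # Claims adjudication uses the upper 99% credibility limit of PC/AS.
    upper_99 = np.percentile(pc_as, 99)
    print(f"median PC/AS = {np.median(pc_as):.3f}, upper 99% limit = {upper_99:.3f}")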
Extreme Mean and Its Applications
NASA Technical Reports Server (NTRS)
Swaroop, R.; Brownlow, J. D.
1979-01-01
Extreme value statistics obtained from normally distributed data are considered. An extreme mean is defined as the mean of the p-th probability truncated normal distribution. An unbiased estimate of this extreme mean and its large-sample distribution are derived. The distribution of this estimate, even for very large samples, is found to be nonnormal. Further, as the sample size increases, the variance of the unbiased estimate converges to the Cramer-Rao lower bound. The computer program used to obtain the density and distribution functions of the standardized unbiased estimate, and the confidence intervals of the extreme mean for any data, is included for ready application. An example is included to demonstrate the usefulness of the extreme mean in applications.
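As a rough illustration of the quantity described above, the sketch below takes the extreme mean to be the mean of a standard normal distribution truncated at its p-th quantile (an assumption about the report's exact definition) and checks the closed-form value against a simulated sample; the choice p = 0.95 is arbitrary.

    import numpy as np
    from scipy.stats import norm, truncnorm

    p = 0.95                       # truncation probability (illustrative choice)
    z_p = norm.ppf(p)              # truncation point of the standard normal

    # Closed-form mean of the standard normal truncated to its upper tail:
    # E[Z | Z > z_p] = phi(z_p) / (1 - p).
    extreme_mean_exact = norm.pdf(z_p) / (1.0 - p)

    # Sample-based check with scipy's truncated normal (upper bound at +infinity).
    rng = np.random.default_rng(0)
    sample = truncnorm.rvs(a=z_p, b=np.inf, size=200_000, random_state=rng)
    print(extreme_mean_exact, sample.mean())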
Sampling Methods in Cardiovascular Nursing Research: An Overview.
Kandola, Damanpreet; Banner, Davina; O'Keefe-McCarthy, Sheila; Jassal, Debbie
2014-01-01
Cardiovascular nursing research covers a wide array of topics from health services to psychosocial patient experiences. The selection of specific participant samples is an important part of the research design and process. The sampling strategy employed is of utmost importance to ensure that a representative sample of participants is chosen. There are two main categories of sampling methods: probability and non-probability. Probability sampling is the random selection of elements from the population, where each element of the population has an equal and independent chance of being included in the sample. There are five main types of probability sampling, including simple random sampling, systematic sampling, stratified sampling, cluster sampling, and multi-stage sampling. Non-probability sampling methods are those in which elements are chosen through non-random methods for inclusion into the research study and include convenience sampling, purposive sampling, and snowball sampling. Each approach offers distinct advantages and disadvantages and must be considered critically. In this research column, we provide an introduction to these key sampling techniques and draw on examples from cardiovascular research. Understanding the differences in sampling techniques may aid nurses in effective appraisal of research literature and provide a reference point for nurses who engage in cardiovascular research.
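A minimal sketch of three of the probability designs named above (simple random, systematic, and stratified sampling), using a hypothetical frame of 1,000 units; the stratum labels and allocation details are illustrative only.

    import numpy as np

    rng = np.random.default_rng(42)
    N, n = 1000, 50
    population = np.arange(N)                      # sampling frame of unit IDs
    strata = rng.integers(0, 4, size=N)            # hypothetical stratum labels

    # Simple random sampling: every unit has the same inclusion probability n/N.
    srs = rng.choice(population, size=n, replace=False)

    # Systematic sampling: random start, then every k-th unit.
    k = N // n
    start = rng.integers(0, k)
    systematic = population[start::k][:n]

    # Stratified sampling: proportional allocation within each stratum.
    stratified = np.concatenate([
        rng.choice(population[strata == s],
                   size=max(1, int(round(n * np.mean(strata == s)))),
                   replace=False)
        for s in np.unique(strata)
    ])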
Shwartz, Michael; Peköz, Erol A; Burgess, James F; Christiansen, Cindy L; Rosen, Amy K; Berlowitz, Dan
2014-12-01
Two approaches are commonly used for identifying high-performing facilities on a performance measure: one, that the facility is in a top quantile (eg, quintile or quartile); and two, that a confidence interval is below (or above) the average of the measure for all facilities. This type of yes/no designation often does not do well in distinguishing high-performing from average-performing facilities. To illustrate an alternative continuous-valued metric for profiling facilities--the probability a facility is in a top quantile--and show the implications of using this metric for profiling and pay-for-performance. We created a composite measure of quality from fiscal year 2007 data based on 28 quality indicators from 112 Veterans Health Administration nursing homes. A Bayesian hierarchical multivariate normal-binomial model was used to estimate shrunken rates of the 28 quality indicators, which were combined into a composite measure using opportunity-based weights. Rates were estimated using Markov Chain Monte Carlo methods as implemented in WinBUGS. The probability metric was calculated from the simulation replications. Our probability metric allowed better discrimination of high performers than the point or interval estimate of the composite score. In a pay-for-performance program, a smaller top quantile (eg, a quintile) resulted in more resources being allocated to the highest performers, whereas a larger top quantile (eg, being above the median) distinguished less among high performers and allocated more resources to average performers. The probability metric has potential but needs to be evaluated by stakeholders in different types of delivery systems.
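The probability metric described above can be computed directly from posterior simulation replications. In the sketch below, random numbers stand in for the WinBUGS draws of each facility's composite score, and the top quintile is recomputed within each replication; all numbers are hypothetical.

    import numpy as np

    rng = np.random.default_rng(7)
    n_facilities, n_draws = 112, 4000

    # Stand-in for MCMC replications of each facility's composite quality score;
    # in the paper these come from a Bayesian hierarchical model fit in WinBUGS.
    draws = rng.normal(loc=rng.normal(0, 1, n_facilities), scale=0.5,
                       size=(n_draws, n_facilities))

    # For each replication, flag facilities whose score falls in the top quintile.
    cutoffs = np.quantile(draws, 0.8, axis=1, keepdims=True)
    in_top_quintile = draws >= cutoffs

    # Probability metric: share of replications in which a facility is a top performer.
    p_top = in_top_quintile.mean(axis=0)
    print(np.round(np.sort(p_top)[-5:], 3))   # five highest probabilities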
Invited commentary: recruiting for epidemiologic studies using social media.
Allsworth, Jenifer E
2015-05-15
Social media-based recruitment for epidemiologic studies has the potential to expand the demographic and geographic reach of investigators and identify potential participants more cost-effectively than traditional approaches. In fact, social media are particularly appealing for their ability to engage traditionally "hard-to-reach" populations, including young adults and low-income populations. Despite their great promise as a tool for epidemiologists, social media-based recruitment approaches do not currently compare favorably with gold-standard probability-based sampling approaches. Sparse data on the demographic characteristics of social media users, patterns of social media use, and appropriate sampling frames limit our ability to implement probability-based sampling strategies. In a well-conducted study, Harris et al. (Am J Epidemiol. 2015;181(10):737-746) examined the cost-effectiveness of social media-based recruitment (advertisements and promotion) in the Contraceptive Use, Pregnancy Intention, and Decisions (CUPID) Study, a cohort study of 3,799 young adult Australian women, and the approximate representativeness of the CUPID cohort. Implications for social media-based recruitment strategies for cohort assembly, data accuracy, implementation, and human subjects concerns are discussed.
Approximation of Failure Probability Using Conditional Sampling
NASA Technical Reports Server (NTRS)
Giesy. Daniel P.; Crespo, Luis G.; Kenney, Sean P.
2008-01-01
In analyzing systems which depend on uncertain parameters, one technique is to partition the uncertain parameter domain into a failure set and its complement, and judge the quality of the system by estimating the probability of failure. If this is done by a sampling technique such as Monte Carlo and the probability of failure is small, accurate approximation can require so many sample points that the computational expense is prohibitive. Previous work of the authors has shown how to bound the failure event by sets of such simple geometry that their probabilities can be calculated analytically. In this paper, it is shown how to make use of these failure bounding sets and conditional sampling within them to substantially reduce the computational burden of approximating failure probability. It is also shown how the use of these sampling techniques improves the confidence intervals for the failure probability estimate for a given number of sample points and how they reduce the number of sample point analyses needed to achieve a given level of confidence.
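The gain from conditional sampling inside an analytically tractable bounding set can be seen on a toy problem: two parameters uniform on the unit square, failure when x1 + x2 > 1.9, and a bounding corner box [0.9, 1]^2 whose probability (0.01) is known exactly. This is only an illustration of the general idea, not the authors' bounding construction.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 100_000

    # Toy problem: parameters uniform on the unit square, failure when x1 + x2 > 1.9.
    # The failure set lies entirely inside the corner box B = [0.9, 1]^2, so
    # P(F) = P(B) * P(F | B), with P(B) = 0.01 known analytically.
    p_box = 0.01

    # Conditional sampling: draw only inside the bounding box B.
    x = rng.uniform(0.9, 1.0, size=(n, 2))
    p_fail_given_box = np.mean(x.sum(axis=1) > 1.9)
    print("conditional estimate:", p_box * p_fail_given_box)   # true value is 0.005

    # Plain Monte Carlo over the whole square wastes almost all of its samples.
    y = rng.uniform(0.0, 1.0, size=(n, 2))
    print("crude estimate:      ", np.mean(y.sum(axis=1) > 1.9))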
ERIC Educational Resources Information Center
Jones, Jennifer; Ramirez, Rafael Roberto; Davies, Mark; Canino, Glorisa; Goodwin, Renee D.
2008-01-01
This study examined rates and correlates of suicidal behavior among youth on the island of Puerto Rico. Data were drawn from two probability samples, one clinical (n = 736) and one community-based sample (n = 1,896), of youth ages 12 to 17. Consistent with previous studies in U.S. mainland adolescent populations, our results demonstrate that most…
ERIC Educational Resources Information Center
George, Goldy C.; Hoelscher, Deanna M.; Nicklas, Theresa A.; Kelder, Steven H.
2009-01-01
Objective: To examine diet- and body size-related attitudes and behaviors associated with supplement use in a representative sample of fourth-grade students in Texas. Design: Cross-sectional data from the School Physical Activity and Nutrition study, a probability-based sample of schoolchildren. Children completed a questionnaire that assessed…
Ecological Condition of Streams in Eastern and Southern Nevada: EPA R-EMAP Muddy-Virgin River Project
The report presents data collected during a one year study period beginning in May of 2000. Sampling sites were selected using a probability-based design (as opposed to subjectively selected sites) using the USEPA River Reach File version 3 (RF3). About 37 sites were sampled. ...
Sample design effects in landscape genetics
Oyler-McCance, Sara J.; Fedy, Bradley C.; Landguth, Erin L.
2012-01-01
An important research gap in landscape genetics is the impact of different field sampling designs on the ability to detect the effects of landscape pattern on gene flow. We evaluated how five different sampling regimes (random, linear, systematic, cluster, and single study site) affected the probability of correctly identifying the generating landscape process of population structure. Sampling regimes were chosen to represent a suite of designs common in field studies. We used genetic data generated from a spatially-explicit, individual-based program and simulated gene flow in a continuous population across a landscape with gradual spatial changes in resistance to movement. Additionally, we evaluated the sampling regimes using realistic and obtainable numbers of loci (10 and 20), numbers of alleles per locus (5 and 10), numbers of individuals sampled (10-300), and generational times after the landscape was introduced (20 and 400). For a simulated continuously distributed species, we found that random, linear, and systematic sampling regimes performed well with high sample sizes (>200), levels of polymorphism (10 alleles per locus), and number of molecular markers (20). The cluster and single study site sampling regimes were not able to correctly identify the generating process under any conditions and, thus, are not advisable strategies for scenarios similar to our simulations. Our research emphasizes the importance of sampling data at ecologically appropriate spatial and temporal scales and suggests careful consideration for sampling near landscape components that are likely to most influence the genetic structure of the species. In addition, simulating sampling designs a priori could help guide field data collection efforts.
The new car assessment program: does it predict the relative safety of vehicles in actual crashes?
Nirula, Ram; Mock, Charles N; Nathens, Avery B; Grossman, David C
2004-10-01
Federal motor vehicle safety standards are based on crash test dummy analyses that estimate the relative risk of traumatic brain injury (TBI) and severe thoracic injury (STI) by quantifying head (Head Injury Criterion [HIC]) and chest (Chest Gravity Score [CGS]) acceleration. The New Car Assessment Program (NCAP) combines these probabilities to yield the vehicle's five-star rating. The validity of the NCAP system as it relates to an actual motor vehicle crash (MVC) remains undetermined. We therefore sought to determine whether HIC and CGS accurately predict TBI and STI in actual crashes, and compared the NCAP five-star rating system to the rates of TBI and/or STI in actual MVCs. We analyzed frontal crashes with restrained drivers from the 1994 to 1998 National Automotive Sampling System. The relationship of HIC and CGS to the probabilities of TBI and STI derived from crash tests were respectively compared with the HIC-TBI and CGS-STI risk relationships observed in actual crashes while controlling for covariates. Receiver operating characteristic curves determined the sensitivity and specificity of HIC and CGS as predictors of TBI and STI, respectively. Estimates of the likelihood of TBI and/or STI (in actual MVCs) were compared with the expected probabilities of TBI and STI (determined by crash test analysis), as they relate to NCAP ratings. The crash tests overestimate TBI likelihood at HIC scores >800 and underestimate it at scores <500. STI likelihood is overestimated when CGS exceeds 40 g. Receiver operating characteristic curves demonstrated poor sensitivity and specificity of HIC and CGS in predicting injury. The actual MVC injury probability estimates did not vary between vehicles of different NCAP rating. HIC and CGS are poor predictors of TBI and STI in actual MVCs. The NCAP five-star rating system is unable to differentiate vehicles of varying crashworthiness in actual MVCs. More sensitive parameters need to be developed and incorporated into vehicle crash safety testing to provide consumers and automotive manufacturers with useful tools with which to measure vehicle safety.
Representation of complex probabilities and complex Gibbs sampling
NASA Astrophysics Data System (ADS)
Salcedo, Lorenzo Luis
2018-03-01
Complex weights appear in physics that are beyond a straightforward importance-sampling treatment, as required in Monte Carlo calculations. This is the well-known sign problem. The complex Langevin approach amounts to effectively constructing a positive distribution on the complexified manifold that reproduces the expectation values of the observables through their analytical extension. Here we discuss the direct construction of such positive distributions, paying attention to their localization on the complexified manifold. Explicit localized representations are obtained for complex probabilities defined on Abelian and non-Abelian groups. The viability and performance of a complex version of the heat bath method, based on such representations, is analyzed.
(I Can't Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research.
van Rijnsoever, Frank J
2017-01-01
I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: "random chance," which is based on probability sampling, "minimal information," which yields at least one new code per sampling step, and "maximum information," which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario.
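The "random chance" scenario described above can be reproduced with a small simulation: codes are assigned unequal occurrence probabilities, sources are sampled at random, and saturation is declared once every code has been observed at least once. The code probabilities below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(11)
    n_codes, n_runs = 30, 2000

    # Hypothetical occurrence probabilities: a few common codes, many rare ones.
    p_code = np.clip(rng.beta(1, 6, n_codes), 0.02, None)

    sizes = []
    for _ in range(n_runs):
        seen = np.zeros(n_codes, dtype=bool)
        n_sources = 0
        while not seen.all():                        # sample until saturation
            n_sources += 1
            seen |= rng.random(n_codes) < p_code     # codes held by this source
        sizes.append(n_sources)

    print("median sample size to saturation:", int(np.median(sizes)))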
Fractional Gaussian model in global optimization
NASA Astrophysics Data System (ADS)
Dimri, V. P.; Srivastava, R. P.
2009-12-01
The Earth system is inherently non-linear, and it can be characterized well if we incorporate non-linearity in the formulation and solution of the problem. A general tool often used for characterization of the Earth system is inversion. Traditionally, inverse problems are solved using least-squares-based inversion by linearizing the formulation. The initial model in such inversion schemes is often assumed to follow a posterior Gaussian probability distribution. It is now well established that most of the physical properties of the Earth follow a power law (fractal distribution). Thus, selecting the initial model from a power-law probability distribution will provide a more realistic solution. We present a new method which can draw samples of the posterior probability density function very efficiently using fractal-based statistics. The application of the method has been demonstrated by inverting band-limited seismic data with well control. We used a fractal-based probability density function, which uses the mean, variance and Hurst coefficient of the model space, to draw the initial model. This initial model is then used in a global optimization inversion scheme. Inversion results using initial models generated by our method give higher-resolution estimates of the model parameters than the hitherto-used gradient-based linear inversion method.
Tests for senescent decline in annual survival probabilities of common pochards, Aythya ferina
Nichols, J.D.; Hines, J.E.; Blums, P.
1997-01-01
Senescent decline in survival probabilities of animals is a topic about which much has been written but little is known. Here, we present formal tests of senescence hypotheses, using 1373 recaptures from 8877 duckling (age 0) and 504 yearling Common Pochards (Aythya ferina) banded at a Latvian study site, 1975-1992. The tests are based on capture-recapture models that explicitly incorporate sampling probabilities that, themselves, may exhibit timeand age-specific variation. The tests provided no evidence of senescent decline in survival probabilities for this species. Power of the most useful test was low for gradual declines in annual survival probability with age, but good for steeper declines. We recommend use of this type of capture-recapture modeling and analysis for other investigations of senescence in animal survival rates.
Stochastic optimal operation of reservoirs based on copula functions
NASA Astrophysics Data System (ADS)
Lei, Xiao-hui; Tan, Qiao-feng; Wang, Xu; Wang, Hao; Wen, Xin; Wang, Chao; Zhang, Jing-wen
2018-02-01
Stochastic dynamic programming (SDP) has been widely used to derive operating policies for reservoirs considering streamflow uncertainties. In SDP, there is a need to calculate the transition probability matrix more accurately and efficiently in order to improve the economic benefit of reservoir operation. In this study, we proposed a stochastic optimization model for hydropower generation reservoirs, in which 1) the transition probability matrix was calculated based on copula functions; and 2) the value function of the last period was calculated by stepwise iteration. Firstly, the marginal distribution of stochastic inflow in each period was built and the joint distributions of adjacent periods were obtained using the three members of the Archimedean copulas, based on which the conditional probability formula was derived. Then, the value in the last period was calculated by a simple recursive equation with the proposed stepwise iteration method and the value function was fitted with a linear regression model. These improvements were incorporated into the classic SDP and applied to the case study in Ertan reservoir, China. The results show that the transition probability matrix can be more easily and accurately obtained by the proposed copula function based method than conventional methods based on the observed or synthetic streamflow series, and the reservoir operation benefit can also be increased.
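The copula step can be sketched with a Clayton copula, one member of the Archimedean family mentioned above. After transforming inflows of adjacent periods to uniform margins, the probability of moving from inflow bin i to bin j is a rectangle probability of the copula divided by the width of bin i; the parameter value below is arbitrary, and the real model also involves fitted marginal distributions.

    import numpy as np

    def clayton_cdf(u, v, theta):
        """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0."""
        if u <= 0.0 or v <= 0.0:
            return 0.0
        return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

    def transition_matrix(theta, n_bins=5):
        """P(next-period inflow in bin j | current inflow in bin i) on uniform margins."""
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        P = np.zeros((n_bins, n_bins))
        for i in range(n_bins):
            u0, u1 = edges[i], edges[i + 1]
            for j in range(n_bins):
                v0, v1 = edges[j], edges[j + 1]
                rect = (clayton_cdf(u1, v1, theta) - clayton_cdf(u0, v1, theta)
                        - clayton_cdf(u1, v0, theta) + clayton_cdf(u0, v0, theta))
                P[i, j] = rect / (u1 - u0)        # divide by P(current-period bin)
        return P

    print(np.round(transition_matrix(theta=2.0), 3))   # each row sums to 1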
Bovine origin Staphylococcus aureus: A new zoonotic agent?
Rao, Relangi Tulasi; Jayakumar, Kannan; Kumar, Pavitra
2017-10-01
The study aimed to assess the nature of animal-origin Staphylococcus aureus strains. The study has zoonotic importance and aimed to compare virulence between two different hosts, i.e., strains of bovine and ovine origin. Conventional polymerase chain reaction-based methods were used for the characterization of S. aureus strains, and a chick embryo model was employed for the assessment of the virulence capacity of strains. All statistical tests were carried out in the R program, version 3.0.4. After initial screening and molecular characterization, the prevalence of S. aureus was found to be 42.62% in bovine-origin samples and 28.35% in ovine-origin samples. Meanwhile, the prevalence of methicillin-resistant S. aureus was found to be meager in both hosts: only 6.8% of isolates tested positive for methicillin resistance. Biofilm formation was quantified and the variation compared between the hosts; a Welch two-sample t-test was statistically significant (t=2.3179, df=28.103, p=0.02795). The chicken embryo model proved effective for testing the pathogenicity of the strains. The study helped to conclude that healthy bovines can act as S. aureus reservoirs, that bovine-origin S. aureus strains are more virulent than ovine-origin strains, and that bovine-origin strains have a high probability of becoming zoonotic pathogens. Further gene knock-out studies may be conducted to confirm the zoonotic potential of the bovine-origin strains.
Simple and flexible SAS and SPSS programs for analyzing lag-sequential categorical data.
O'Connor, B P
1999-11-01
This paper describes simple and flexible programs for analyzing lag-sequential categorical data, using SAS and SPSS. The programs read a stream of codes and produce a variety of lag-sequential statistics, including transitional frequencies, expected transitional frequencies, transitional probabilities, adjusted residuals, z values, Yule's Q values, likelihood ratio tests of stationarity across time and homogeneity across groups or segments, transformed kappas for unidirectional dependence, bidirectional dependence, parallel and nonparallel dominance, and significance levels based on both parametric and randomization tests.
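The core lag-1 computations listed above can be sketched in a few lines: tally transitions between consecutive codes, form expected frequencies under independence, and compute adjusted residuals. The Haberman-style residual used below is an assumption about the exact formula the SAS/SPSS programs implement, and the code stream is hypothetical.

    import numpy as np

    codes = list("ABACABBCAACBABCA")              # hypothetical stream of event codes
    labels = sorted(set(codes))
    idx = {c: k for k, c in enumerate(labels)}

    # Lag-1 transitional frequencies.
    obs = np.zeros((len(labels), len(labels)))
    for a, b in zip(codes[:-1], codes[1:]):
        obs[idx[a], idx[b]] += 1

    n = obs.sum()
    row, col = obs.sum(1, keepdims=True), obs.sum(0, keepdims=True)
    expected = row * col / n                      # expected under independence
    trans_prob = obs / row                        # transitional probabilities

    # Adjusted residuals (Haberman): (O - E) / sqrt(E * (1 - row/n) * (1 - col/n))
    adj_resid = (obs - expected) / np.sqrt(expected * (1 - row / n) * (1 - col / n))
    print(np.round(trans_prob, 2))
    print(np.round(adj_resid, 2))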
The extension of the thermal-vacuum test optimization program to multiple flights
NASA Technical Reports Server (NTRS)
Williams, R. E.; Byrd, J.
1981-01-01
The thermal vacuum test optimization model developed to provide an approach to the optimization of a test program based on prediction of flight performance with a single flight option in mind is extended to consider reflight as in space shuttle missions. The concept of 'utility', developed under the name of 'availability', is used to follow performance through the various options encountered when the capabilities of reflight and retrievability of space shuttle are available. Also, a 'lost value' model is modified to produce a measure of the probability of a mission's success, achieving a desired utility using a minimal cost test strategy. The resulting matrix of probabilities and their associated costs provides a means for project management to evaluate various test and reflight strategies.
Change-in-ratio estimators for populations with more than two subclasses
Udevitz, Mark S.; Pollock, Kenneth H.
1991-01-01
Change-in-ratio methods have been developed to estimate the size of populations with two or three population subclasses. Most of these methods require the often unreasonable assumption of equal sampling probabilities for individuals in all subclasses. This paper presents new models based on the weaker assumption that ratios of sampling probabilities are constant over time for populations with three or more subclasses. Estimation under these models requires that a value be assumed for one of these ratios when there are two samples. Explicit expressions are given for the maximum likelihood estimators under models for two samples with three or more subclasses and for three samples with two subclasses. A numerical method using readily available statistical software is described for obtaining the estimators and their standard errors under all of the models. Likelihood ratio tests that can be used in model selection are discussed. Emphasis is on the two-sample, three-subclass models for which Monte-Carlo simulation results and an illustrative example are presented.
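Under the equal-sampling-probability assumption that the paper relaxes, the classical two-sample, two-subclass change-in-ratio estimator follows from closed-population algebra: if p1 and p2 are the x-subclass proportions observed before and after known removals Rx (of subclass x) and R (total), then N1 = (Rx - p2*R)/(p1 - p2). A small sketch with hypothetical harvest numbers:

    def change_in_ratio(p1, p2, r_x, r_total):
        """Classical two-subclass change-in-ratio estimate of pre-removal population size.

        p1, p2  : estimated proportions of subclass x before and after removals
        r_x     : known removals of subclass x
        r_total : total known removals (both subclasses)
        Assumes a closed population and equal sampling probabilities per subclass.
        """
        n1 = (r_x - p2 * r_total) / (p1 - p2)
        return n1, p1 * n1            # total size and subclass-x size before removal

    # Hypothetical harvest example: proportion of males drops from 0.40 to 0.25
    # after 300 males and 100 females are removed.
    n1, x1 = change_in_ratio(p1=0.40, p2=0.25, r_x=300, r_total=400)
    print(round(n1), round(x1))       # about 1333 animals, of which about 533 males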
Probability of coincidental similarity among the orbits of small bodies - I. Pairing
NASA Astrophysics Data System (ADS)
Jopek, Tadeusz Jan; Bronikowska, Małgorzata
2017-09-01
The probability of coincidental clustering among orbits of comets, asteroids and meteoroids depends on many factors, such as the size of the orbital sample searched for clusters and the size of the identified group; it is different for groups of 2, 3, 4, … members. The probability of coincidental clustering is assessed by numerical simulation and therefore also depends on the method used to generate the synthetic orbits. We have tested the impact of some of these factors. For a given size of the orbital sample, we have assessed the probability of random pairing among several orbital populations of different sizes, and we have found how these probabilities vary with the size of the orbital samples. Finally, keeping the size of the orbital sample fixed, we have shown that the probability of random pairing can be significantly different for orbital samples obtained by different observation techniques. For the user's convenience, we have also obtained several formulae which, for a given size of the orbital sample, can be used to calculate the similarity threshold corresponding to a small value of the probability of coincidental similarity between two orbits.
Doubravsky, Karel; Dohnal, Mirko
2015-01-01
Complex decision-making tasks of different natures, e.g. in economics, safety engineering, ecology and biology, are based on vague, sparse, partially inconsistent and subjective knowledge. Moreover, decision-making economists/engineers are usually not willing to invest too much time into the study of complex formal theories. They require decisions which can be (re)checked by human-like common-sense reasoning. One important problem related to realistic decision-making tasks is incomplete data sets required by the chosen decision-making algorithm. This paper presents a relatively simple algorithm by which some missing input information items (III) can be generated, using mainly decision tree topologies, and integrated into incomplete data sets. The algorithm is based on an easy-to-understand heuristic, e.g. that a longer decision tree sub-path is less probable. This heuristic can solve decision problems under total ignorance, i.e. when the decision tree topology is the only information available. In practice, however, isolated information items, e.g. some vaguely known probabilities (e.g. fuzzy probabilities), are usually available, meaning that a realistic problem is analysed under partial ignorance. The proposed algorithm reconciles topology-related heuristics and additional fuzzy sets using fuzzy linear programming. The case study, represented by a tree with six lotteries and one fuzzy probability, is presented in detail. PMID:26158662
Mannila, H.; Koivisto, M.; Perola, M.; Varilo, T.; Hennah, W.; Ekelund, J.; Lukk, M.; Peltonen, L.; Ukkonen, E.
2003-01-01
We describe a new probabilistic method for finding haplotype blocks that is based on the use of the minimum description length (MDL) principle. We give a rigorous definition of the quality of a segmentation of a genomic region into blocks and describe a dynamic programming algorithm for finding the optimal segmentation with respect to this measure. We also describe a method for finding the probability of a block boundary for each pair of adjacent markers: this gives a tool for evaluating the significance of each block boundary. We have applied the method to the published data of Daly and colleagues. The results expose some problems that exist in the current methods for the evaluation of the significance of predicted block boundaries. Our method, MDL block finder, can be used to compare block borders in different sample sets, and we demonstrate this by applying the MDL-based method to define the block structure in chromosomes from population isolates. PMID:12761696
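The optimal-segmentation step can be sketched generically: given a cost for any candidate block (in the paper, an MDL description length for the markers it spans), dynamic programming finds the minimum-cost segmentation in O(m^2) block evaluations. The block cost below is a toy stand-in, not the authors' MDL code length.

    import numpy as np

    def optimal_segmentation(m, block_cost):
        """Minimum-cost segmentation of markers 0..m-1 into contiguous blocks.

        block_cost(i, j) must return the cost of one block covering markers i..j
        (inclusive); in the MDL setting this would be the block's description length.
        """
        best = np.full(m + 1, np.inf)
        best[0] = 0.0
        back = np.zeros(m + 1, dtype=int)
        for j in range(1, m + 1):
            for i in range(j):                      # last block covers markers i..j-1
                c = best[i] + block_cost(i, j - 1)
                if c < best[j]:
                    best[j], back[j] = c, i
        # Recover block boundaries by walking the back-pointers.
        blocks, j = [], m
        while j > 0:
            blocks.append((back[j], j - 1))
            j = back[j]
        return best[m], blocks[::-1]

    # Toy stand-in cost: a fixed per-block penalty plus a length-dependent term.
    cost, blocks = optimal_segmentation(20, lambda i, j: 2.0 + 0.3 * (j - i + 1) ** 2)
    print(cost, blocks)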
Tucker, Joan S; Ryan, Gery W; Golinelli, Daniela; Ewing, Brett; Wenzel, Suzanne L; Kennedy, David P; Green, Harold D; Zhou, Annie
2012-08-01
This study used an event-based approach to understand condom use in a probability sample of 309 homeless youth recruited from service and street sites in Los Angeles County. Condom use was significantly less likely when hard drug use preceded sex, the relationship was serious, the partners talked about "pulling out", or sex occurred in a non-private place (and marginally less likely when heavier drinking preceded sex, or the partnership was monogamous or abusive). Condom use was significantly more likely when the youth held positive condom attitudes or were concerned about pregnancy, the partners talked about condom use, and the partners met up by chance. This study extends previous work by simultaneously examining a broad range of individual, relationship, and contextual factors that may play a role in condom use. Results identify a number of actionable targets for programs aimed at reducing HIV/STI transmission and pregnancy risk among homeless youth.
The “Genetic Program”: Behind the Genesis of an Influential Metaphor
Peluffo, Alexandre E.
2015-01-01
The metaphor of the “genetic program,” indicating the genome as a set of instructions required to build a phenotype, has been very influential in biology despite various criticisms over the years. This metaphor, first published in 1961, is thought to have been invented independently in two different articles, one by Ernst Mayr and the other by François Jacob and Jacques Monod. Here, after a detailed analysis of what both parties meant by “genetic program,” I show, using unpublished archives, the strong resemblance between the ideas of Mayr and Monod and suggest that their idea of genetic program probably shares a common origin. I explore the possibility that the two men met before 1961 and also exchanged their ideas through common friends and colleagues in the field of molecular biology. Based on unpublished correspondence of Jacob and Monod, I highlight the important events that influenced the preparation of their influential paper, which introduced the concept of the genetic program. Finally, I suggest that the genetic program metaphor may have preceded both papers and that it was probably used informally before 1961. PMID:26170444
The CBT Advisor: An Expert System Program for Making Decisions about CBT.
ERIC Educational Resources Information Center
Kearsley, Greg
1985-01-01
Discusses structure, credibility, and use of the Computer Based Training (CBT) Advisor, an expert system designed to help managers make judgements about course selection, system selection, cost/benefits, development effort, and probable success of CBT projects. (MBR)
Brůžek, Jaroslav; Santos, Frédéric; Dutailly, Bruno; Murail, Pascal; Cunha, Eugenia
2017-10-01
A new tool for skeletal sex estimation based on measurements of the human os coxae is presented, using skeletons from a metapopulation of identified adult individuals from twelve independent population samples. For reliable sex estimation, a posterior probability greater than 0.95 was considered to be the classification threshold: below this value, estimates are considered indeterminate. By providing free software, we aim to develop an even more widely disseminated method for sex estimation. Ten metric variables collected from 2,040 ossa coxae of adult subjects of known sex were recorded between 1986 and 2002 (reference sample). To test both validity and reliability, a target sample consisting of two series of adult ossa coxae of known sex (n = 623) was used. The DSP2 software (Diagnose Sexuelle Probabiliste v2) is based on Linear Discriminant Analysis, and the posterior probabilities are calculated using an R script. For the reference sample, any combination of four dimensions provides a correct sex estimate in at least 99% of cases. The percentage of individuals for whom sex can be estimated depends on the number of dimensions; for all ten variables it is higher than 90%. Those results are confirmed in the target sample. Our posterior probability threshold of 0.95 for sex estimation corresponds to the traditional sectioning point used in osteological studies. DSP2 replaces the former version, which should no longer be used. DSP2 is a robust and reliable technique for sexing the adult os coxae, and is also user friendly.
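The decision rule above (report a sex estimate only when the posterior probability reaches 0.95, otherwise indeterminate) can be sketched with a linear discriminant analysis on simulated measurements. The four variables, their distributions, and the use of scikit-learn are stand-ins; DSP2 itself is an R script fitted to the reference sample of ossa coxae.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(5)
    n = 500

    # Hypothetical stand-in for four coxal measurements per individual (mm),
    # with a modest male-female offset; the real reference sample has ten variables.
    sex = rng.integers(0, 2, n)                    # 0 = female, 1 = male
    X = rng.normal(loc=100 + 5 * sex[:, None], scale=6, size=(n, 4))

    lda = LinearDiscriminantAnalysis().fit(X, sex)
    posterior = lda.predict_proba(X)               # columns: P(female), P(male)

    # DSP-style rule: estimate sex only when one posterior probability is >= 0.95.
    top = posterior.max(axis=1)
    estimate = np.where(top >= 0.95, posterior.argmax(axis=1), -1)  # -1 = indeterminate
    print("indeterminate fraction:", np.mean(estimate == -1).round(2))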
Haynes, Trevor B.; Rosenberger, Amanda E.; Lindberg, Mark S.; Whitman, Matthew; Schmutz, Joel A.
2013-01-01
Studies examining species occurrence often fail to account for false absences in field sampling. We investigate detection probabilities of five gear types for six fish species in a sample of lakes on the North Slope, Alaska. We used an occupancy modeling approach to provide estimates of detection probabilities for each method. Variation in gear- and species-specific detection probability was considerable. For example, detection probabilities for the fyke net ranged from 0.82 (SE = 0.05) for least cisco (Coregonus sardinella) to 0.04 (SE = 0.01) for slimy sculpin (Cottus cognatus). Detection probabilities were also affected by site-specific variables such as depth of the lake, year, day of sampling, and lake connection to a stream. With the exception of the dip net and shore minnow traps, each gear type provided the highest detection probability of at least one species. Results suggest that a multimethod approach may be most effective when attempting to sample the entire fish community of Arctic lakes. Detection probability estimates will be useful for designing optimal fish sampling and monitoring protocols in Arctic lakes.
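The occupancy-model machinery behind these detection probabilities can be sketched for the simplest case: a single-season model with constant occupancy psi and detection probability p and no covariates, fitted by maximum likelihood to simulated detection histories. The paper's models are richer (gear-, species-, and site-specific effects), so this is only a schematic.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit

    rng = np.random.default_rng(2)
    n_sites, K, psi_true, p_true = 200, 4, 0.6, 0.3

    # Simulate detection/non-detection histories.
    occupied = rng.random(n_sites) < psi_true
    detections = (rng.random((n_sites, K)) < p_true) & occupied[:, None]
    d = detections.sum(axis=1)

    def negloglik(theta):
        psi, p = expit(theta)                      # constrain parameters to (0, 1)
        # Site likelihood: psi * p^d * (1-p)^(K-d) if detected, else
        # psi * (1-p)^K + (1 - psi).
        lik = np.where(d > 0,
                       psi * p ** d * (1 - p) ** (K - d),
                       psi * (1 - p) ** K + (1 - psi))
        return -np.sum(np.log(lik + 1e-300))

    fit = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
    print("psi_hat, p_hat =", np.round(expit(fit.x), 2))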
Mali, Ivana; Duarte, Adam; Forstner, Michael R J
2018-01-01
Abundance estimates play an important part in the regulatory and conservation decision-making process. It is important to correct monitoring data for imperfect detection when using these data to track spatial and temporal variation in abundance, especially in the case of rare and elusive species. This paper presents the first attempt to estimate abundance of the Rio Grande cooter (Pseudemys gorzugi) while explicitly considering the detection process. Specifically, in 2016 we monitored this rare species at two sites along the Black River, New Mexico, via traditional baited hoop-net traps and less invasive visual surveys to evaluate the efficacy of these two sampling designs. We fitted the Huggins closed-capture estimator to estimate capture probabilities using the trap data and distance sampling models to estimate detection probabilities using the visual survey data. We found that only the visual survey with the highest number of observed turtles resulted in similar abundance estimates to those estimated using the trap data. However, the estimates of abundance from the remaining visual survey data were highly variable and often underestimated abundance relative to the estimates from the trap data. We suspect this pattern is related to changes in the basking behavior of the species and, thus, the availability of turtles to be detected even though all visual surveys were conducted when environmental conditions were similar. Regardless, we found that riverine habitat conditions limited our ability to properly conduct visual surveys at one site. Collectively, this suggests visual surveys may not be an effective sample design for this species in this river system. When analyzing the trap data, we found capture probabilities to be highly variable across sites and between age classes and that recapture probabilities were much lower than initial capture probabilities, highlighting the importance of accounting for detectability when monitoring this species. Although baited hoop-net traps seem to be an effective sampling design, it is important to note that this method required a relatively high trap effort to reliably estimate abundance. This information will be useful when developing a larger-scale, long-term monitoring program for this species of concern.
Steidl, Robert J.; Conway, Courtney J.; Litt, Andrea R.
2013-01-01
Standardized protocols for surveying secretive marsh birds have been implemented across North America, but the efficacy of surveys to detect population trends has not been evaluated. We used survey data collected from populations of marsh birds across North America and simulations to explore how characteristics of bird populations (proportion of survey stations occupied, abundance at occupied stations, and detection probability) and aspects of sampling effort (numbers of survey routes, stations/route, and surveys/station/year) affect statistical power to detect trends in abundance of marsh bird populations. In general, the proportion of survey stations along a route occupied by a species had a greater relative effect on power to detect trends than did the number of birds detected per survey at occupied stations. Uncertainty introduced by imperfect detection during surveys reduced power to detect trends considerably, but across the range of detection probabilities for most species of marsh birds, variation in detection probability had only a minor influence on power. For species that occupy a relatively high proportion of survey stations (0.20), have relatively high abundances at occupied stations (2.0 birds/station), and have high detection probability (0.50), ≥40 routes with 10 survey stations per route surveyed 3 times per year would provide an 80% chance of detecting a 3% annual decrease in abundance after 20 years of surveys. Under the same assumptions but for species that are less common, ≥100 routes would be needed to achieve the same power. Our results can help inform the design of programs to monitor trends in abundance of marsh bird populations, especially with regards to the amount of sampling effort necessary to meet programmatic goals.
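A power calculation of the kind described above can be approximated by simulation: generate yearly counts from the assumed occupancy, abundance, and detection probability, impose a 3% annual decline, and test for a negative log-linear trend. The Poisson counts and the single survey per station per year below are simplifications of the authors' framework, with hypothetical defaults.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(9)

    def power(n_routes=40, n_stations=10, occupancy=0.2, abundance=2.0,
              p_detect=0.5, decline=0.03, years=20, n_sims=500, alpha=0.05):
        """Probability of detecting an annual decline with a log-linear trend test."""
        hits = 0
        for _ in range(n_sims):
            occupied = rng.random(n_routes * n_stations) < occupancy
            mean0 = occupied * abundance * p_detect          # expected count per station
            totals = np.array([
                rng.poisson(mean0 * (1 - decline) ** t).sum() for t in range(years)
            ])
            slope, _, _, pval, _ = stats.linregress(np.arange(years),
                                                    np.log(totals + 0.5))
            hits += (pval < alpha) and (slope < 0)
        return hits / n_sims

    print(power())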
Spiegelhalter, D J; Freedman, L S
1986-01-01
The 'textbook' approach to determining sample size in a clinical trial has some fundamental weaknesses which we discuss. We describe a new predictive method which takes account of prior clinical opinion about the treatment difference. The method adopts the point of clinical equivalence (determined by interviewing the clinical participants) as the null hypothesis. Decision rules at the end of the study are based on whether the interval estimate of the treatment difference (classical or Bayesian) includes the null hypothesis. The prior distribution is used to predict the probabilities of making the decisions to use one or other treatment or to reserve final judgement. It is recommended that sample size be chosen to control the predicted probability of the last of these decisions. An example is given from a multi-centre trial of superficial bladder cancer.
Modeling Compound Flood Hazards in Coastal Embayments
NASA Astrophysics Data System (ADS)
Moftakhari, H.; Schubert, J. E.; AghaKouchak, A.; Luke, A.; Matthew, R.; Sanders, B. F.
2017-12-01
Coastal cities around the world are built on lowland topography adjacent to coastal embayments and river estuaries, where multiple factors (e.g. sea level rise and river flooding) contribute to increasing flood hazards. Quantitative risk assessment is required for administration of flood insurance programs and the design of cost-effective flood risk reduction measures. This demands a characterization of extreme water levels such as 100- and 500-year return period events. Furthermore, hydrodynamic flood models are routinely used to characterize localized flood level intensities (i.e., local depth and velocity) based on boundary forcing sampled from extreme value distributions. For example, extreme flood discharges in the U.S. are estimated from measured flood peaks using the Log-Pearson Type III distribution. However, configuring hydrodynamic models for coastal embayments is challenging because of compound extreme flood events: events caused by a combination of extreme sea levels, extreme river discharges, and possibly other factors such as extreme waves and precipitation causing pluvial flooding in urban developments. Here, we present an approach for flood risk assessment that coordinates multivariate extreme analysis with hydrodynamic modeling of coastal embayments. First, we evaluate the significance of the correlation structure between terrestrial freshwater inflow and oceanic variables; second, this correlation structure is described using copula functions in the unit joint probability domain; and third, we choose a series of compound design scenarios for hydrodynamic modeling based on their occurrence likelihood. The design scenarios include the most likely compound event (with the highest joint probability density), the preferred marginal scenario, and reproduced time series of ensembles based on Monte Carlo sampling of the bivariate hazard domain. The comparison between the resulting extreme water dynamics under the compound hazard scenarios explained above provides insight into the strengths and weaknesses of each approach and helps modelers choose the appropriate scenario that best fits the needs of their project. The proposed risk assessment approach can help flood hazard modeling practitioners achieve a more reliable estimate of risk, by cautiously reducing the dimensionality of the hazard analysis.
Wang, Yunpeng; Thompson, Wesley K.; Schork, Andrew J.; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S.; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M.; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R.; Djurovic, Srdjan; O’Donovan, Michael; Visscher, Peter M.; Andreassen, Ole A.; Dale, Anders M.
2016-01-01
Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic (“z-score”) of the SNP. We use multiple logistic regression on z-scores to combine the auxiliary information and derive a “relative enrichment score” for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences of the scale-mixture model with the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores, even when both are genome-wide significant (p < 5x10^-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560
DOE Office of Scientific and Technical Information (OSTI.GOV)
Conover, W.J.; Cox, D.D.; Martz, H.F.
1997-12-01
When using parametric empirical Bayes estimation methods for estimating the binomial or Poisson parameter, the validity of the assumed beta or gamma conjugate prior distribution is an important diagnostic consideration. Chi-square goodness-of-fit tests of the beta or gamma prior hypothesis are developed for use when the binomial sample sizes or Poisson exposure times vary. Nine examples illustrate the application of the methods, using real data from such diverse applications as the loss of feedwater flow rates in nuclear power plants, the probability of failure to run on demand and the failure rates of the high pressure coolant injection systems at US commercial boiling water reactors, the probability of failure to run on demand of emergency diesel generators in US commercial nuclear power plants, the rate of failure of aircraft air conditioners, baseball batting averages, the probability of testing positive for toxoplasmosis, and the probability of tumors in rats. The tests are easily applied in practice by means of corresponding Mathematica® computer programs, which are provided.
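As a rough illustration of the setting (not the authors' exact test), the sketch below fits a Beta(a, b) prior to binomial failure-on-demand data with varying sample sizes by maximizing the marginal beta-binomial likelihood, then performs a simple Pearson chi-square comparison of binned raw rates against their fitted beta-binomial expectations; the data, bin edges, and degrees-of-freedom correction are all illustrative assumptions.

```python
# Illustrative sketch: synthetic failure-on-demand data, heuristic bins and df.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)

n = rng.integers(20, 120, size=40)            # demands observed at each "plant"
p_true = rng.beta(2.0, 38.0, size=n.size)     # plant-specific failure probabilities
x = rng.binomial(n, p_true)                   # observed failures

# Parametric empirical Bayes: fit the Beta(a, b) prior by maximizing the
# marginal (beta-binomial) likelihood over all plants.
def neg_loglik(theta):
    a, b = np.exp(theta)                      # enforce positivity
    return -np.sum(stats.betabinom.logpmf(x, n, a, b))

res = optimize.minimize(neg_loglik, x0=np.log([1.0, 10.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(res.x)

# Chi-square check of the fitted prior: bin the raw rates x_i/n_i and compare
# observed bin counts with expectations from the fitted beta-binomial for each n_i.
edges = np.array([0.0, 0.02, 0.05, 0.10, 1.0 + 1e-9])  # bins chosen so expected counts are not tiny
observed, _ = np.histogram(x / n, bins=edges)
expected = np.zeros(len(edges) - 1)
for ni in n:
    k = np.arange(ni + 1)
    pmf = stats.betabinom.pmf(k, ni, a_hat, b_hat)
    cell = np.digitize(k / ni, edges) - 1
    for j in range(expected.size):
        expected[j] += pmf[cell == j].sum()

chi2_stat = np.sum((observed - expected) ** 2 / expected)
df = observed.size - 1 - 2                    # heuristic: two estimated prior parameters
print(f"a={a_hat:.2f}, b={b_hat:.2f}, chi2={chi2_stat:.2f}, "
      f"approx p={stats.chi2.sf(chi2_stat, df):.3f}")
```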
A probabilistic safety analysis of incidents in nuclear research reactors.
Lopes, Valdir Maciel; Agostinho Angelo Sordi, Gian Maria; Moralles, Mauricio; Filho, Tufic Madi
2012-06-01
This work aims to evaluate the potential risks of incidents in nuclear research reactors. For its development, two databases of the International Atomic Energy Agency (IAEA) were used: the Research Reactor Data Base (RRDB) and the Incident Report System for Research Reactor (IRSRR). For this study, probabilistic safety analysis (PSA) was used. To obtain the PSA probability estimates, the theory and equations in IAEA-TECDOC-636 were used. A specific program to analyse the probabilities was developed within Scilab 5.1.1 for two distributions, Fisher and chi-square, both at a confidence level of 90%. Using the Sordi equations, the maximum admissible doses were obtained for comparison with the risk limits established by the International Commission on Radiological Protection (ICRP). All results achieved with this probability analysis led to the conclusion that the incidents which occurred had radiation doses within the stochastic effects reference interval established by ICRP-64.
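For readers unfamiliar with the chi-square machinery referred to above, the snippet below shows the standard chi-square confidence bounds for a Poisson incident occurrence rate at the 90% level, as commonly computed in PSA work; the event count and exposure are hypothetical, not taken from the IAEA databases.

```python
# Sketch of standard chi-square confidence bounds for an incident occurrence
# rate (events per reactor-year); the numbers below are illustrative only.
from scipy.stats import chi2

def poisson_rate_bounds(n_events, exposure, conf=0.90):
    """Two-sided chi-square bounds for a Poisson rate lambda = n / T."""
    alpha = 1.0 - conf
    lower = chi2.ppf(alpha / 2, 2 * n_events) / (2 * exposure) if n_events > 0 else 0.0
    upper = chi2.ppf(1 - alpha / 2, 2 * (n_events + 1)) / (2 * exposure)
    return lower, upper

n_events, reactor_years = 3, 450.0          # hypothetical incident record
lo, hi = poisson_rate_bounds(n_events, reactor_years)
print(f"point estimate: {n_events / reactor_years:.2e} /yr, "
      f"90% bounds: [{lo:.2e}, {hi:.2e}] /yr")
```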
Generalized Wishart Mixtures for Unsupervised Classification of PolSAR Data
NASA Astrophysics Data System (ADS)
Li, Lan; Chen, Erxue; Li, Zengyuan
2013-01-01
This paper presents an unsupervised clustering algorithm based upon the expectation maximization (EM) algorithm for finite mixture modelling, using the complex Wishart probability density function (PDF) for the class likelihoods. The mixture model makes it possible to represent heterogeneous thematic classes that are not well fitted by a unimodal Wishart distribution. To make the computation fast and robust, we use the recently proposed generalized gamma distribution (GΓD) of the single-polarization intensity data to form the initial partition. We then use the Wishart probability density function of the corresponding sample covariance matrix to calculate the posterior class probabilities for each pixel. The posterior class probabilities are used as the prior probability estimates of each class and as weights for all class parameter updates. The proposed method is evaluated and compared with the Wishart H-Alpha-A classification. Preliminary results show that the proposed method has better performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, J. D.; Oberkampf, William Louis; Helton, Jon Craig
2006-10-01
Evidence theory provides an alternative to probability theory for the representation of epistemic uncertainty in model predictions that derives from epistemic uncertainty in model inputs, where the descriptor epistemic is used to indicate uncertainty that derives from a lack of knowledge with respect to the appropriate values to use for various inputs to the model. The potential benefit, and hence appeal, of evidence theory is that it allows a less restrictive specification of uncertainty than is possible within the axiomatic structure on which probability theory is based. Unfortunately, the propagation of an evidence theory representation for uncertainty through a model is more computationally demanding than the propagation of a probabilistic representation for uncertainty, with this difficulty constituting a serious obstacle to the use of evidence theory in the representation of uncertainty in predictions obtained from computationally intensive models. This presentation describes and illustrates a sampling-based computational strategy for the representation of epistemic uncertainty in model predictions with evidence theory. Preliminary trials indicate that the presented strategy can be used to propagate uncertainty representations based on evidence theory in analysis situations where naive sampling-based (i.e., unsophisticated Monte Carlo) procedures are impracticable due to computational cost.
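A minimal sketch of the idea under assumed interval-valued inputs: each epistemic input is described by focal elements (intervals) with basic probability assignments (BPAs), the model is sampled naively inside each joint focal cell, and belief and plausibility of an output condition are accumulated from the cells that must, or merely can, satisfy it. The model, intervals, and BPAs below are invented for illustration and are not from the presentation.

```python
# Hedged sketch of sampling-based propagation of a Dempster-Shafer (evidence
# theory) input description; all focal elements and BPAs are assumptions.
import numpy as np

rng = np.random.default_rng(2)

def model(x1, x2):
    """Stand-in for an expensive simulation code."""
    return np.exp(0.8 * x1) + 0.5 * x2**2

# Focal elements (intervals) and BPAs for each epistemic input.
x1_focal = [((0.0, 0.5), 0.4), ((0.3, 1.0), 0.6)]
x2_focal = [((1.0, 2.0), 0.7), ((1.5, 3.0), 0.3)]

threshold = 4.0
belief = plausibility = 0.0
for (a1, b1), m1 in x1_focal:
    for (a2, b2), m2 in x2_focal:
        m = m1 * m2                       # product BPA for the joint focal cell
        # Sample inside the cell to bracket the model response (naive Monte Carlo).
        s1 = rng.uniform(a1, b1, 2000)
        s2 = rng.uniform(a2, b2, 2000)
        y = model(s1, s2)
        if y.min() > threshold:           # the whole cell exceeds the threshold
            belief += m
        if y.max() > threshold:           # some part of the cell can exceed it
            plausibility += m

print(f"Bel(y > {threshold}) ~ {belief:.2f}, Pl(y > {threshold}) ~ {plausibility:.2f}")
```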
Moving on From Representativeness: Testing the Utility of the Global Drug Survey
Barratt, Monica J; Ferris, Jason A; Zahnow, Renee; Palamar, Joseph J; Maier, Larissa J; Winstock, Adam R
2017-01-01
A decline in response rates in traditional household surveys, combined with increased internet coverage and decreased research budgets, has resulted in increased attractiveness of web survey research designs based on purposive and voluntary opt-in sampling strategies. In the study of hidden or stigmatised behaviours, such as cannabis use, web survey methods are increasingly common. However, opt-in web surveys are often heavily criticised due to their lack of sampling frame and unknown representativeness. In this article, we outline the current state of the debate about the relevance of pursuing representativeness, the state of probability sampling methods, and the utility of non-probability, web survey methods especially for accessing hidden or minority populations. Our article has two aims: (1) to present a comprehensive description of the methodology we use at Global Drug Survey (GDS), an annual cross-sectional web survey and (2) to compare the age and sex distributions of cannabis users who voluntarily completed (a) a household survey or (b) a large web-based purposive survey (GDS), across three countries: Australia, the United States, and Switzerland. We find that within each set of country comparisons, the demographic distributions among recent cannabis users are broadly similar, demonstrating that the age and sex distributions of those who volunteer to be surveyed are not vastly different between these non-probability and probability methods. We conclude that opt-in web surveys of hard-to-reach populations are an efficient way of gaining in-depth understanding of stigmatised behaviours and are appropriate, as long as they are not used to estimate drug use prevalence of the general population. PMID:28924351
The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival
Kordjazi, Ziya; Frusher, Stewart; Buxton, Colin; Gardner, Caleb; Bird, Tomas
2016-01-01
Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters) were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day) and four levels of the number of sampling-days (2, 4, 6 and 7 days). The most parsimonious Cormack-Jolly-Seber (CJS) model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery. PMID:26990561
Method for predicting peptide detection in mass spectrometry
Kangas, Lars [West Richland, WA]; Smith, Richard D [Richland, WA]; Petritis, Konstantinos [Richland, WA]
2010-07-13
A method of predicting whether a peptide present in a biological sample will be detected by analysis with a mass spectrometer. The method uses at least one mass spectrometer to perform repeated analysis of a sample containing peptides from proteins with known amino acids. The method then generates a data set of peptides identified as contained within the sample by the repeated analysis. The method then calculates the probability that a specific peptide in the data set was detected in the repeated analysis. The method then creates a plurality of vectors, where each vector has a plurality of dimensions, and each dimension represents a property of one or more of the amino acids present in each peptide and adjacent peptides in the data set. Using these vectors, the method then generates an algorithm from the plurality of vectors and the calculated probabilities that specific peptides in the data set were detected in the repeated analysis. The algorithm is thus capable of calculating the probability that a hypothetical peptide represented as a vector will be detected by a mass spectrometry based proteomic platform, given that the peptide is present in a sample introduced into a mass spectrometer.
NASA Technical Reports Server (NTRS)
Ryan, Robert S.; Townsend, John S.
1993-01-01
The prospective improvement of probabilistic methods for space program analysis/design entails the further development of theories, codes, and tools which match specific areas of application, the drawing of lessons from previous uses of probability and statistics data bases, the enlargement of data bases (especially in the field of structural failures), and the education of engineers and managers on the advantages of these methods. An evaluation is presently made of the current limitations of probabilistic engineering methods. Recommendations are made for specific applications.
Effect of ambient temperature storage on potable water coliform population estimations.
Standridge, J H; Delfino, J J
1983-01-01
The effect of the length of time between sampling potable water and performing coliform analyses has been a long-standing controversial issue in environmental microbiology. The issue is of practical importance since reducing the sample-to-analysis time may substantially increase costs for water analysis programs. Randomly selected samples (from those routinely collected throughout the State of Wisconsin) were analyzed for total coliforms after being held at room temperature (20 +/- 2 degrees C) for 24 and 48 h. Differences in results for the two holding times were compared with differences predicted by probability calculations. The study showed that storage of the potable water for up to 48 h had little effect on the public health significance of most samples containing more than two coliforms per 100 ml. PMID:6651296
Statistical surrogate models for prediction of high-consequence climate change.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Constantine, Paul; Field, Richard V., Jr.; Boslough, Mark Bruce Elrick
2011-09-01
In safety engineering, performance metrics are defined using probabilistic risk assessments focused on the low-probability, high-consequence tail of the distribution of possible events, as opposed to best estimates based on central tendencies. We frame the climate change problem and its associated risks in a similar manner. To properly explore the tails of the distribution requires extensive sampling, which is not possible with existing coupled atmospheric models due to the high computational cost of each simulation. We therefore propose the use of specialized statistical surrogate models (SSMs) for the purpose of exploring the probability law of various climate variables of interest. An SSM differs from a deterministic surrogate model in that it represents each climate variable of interest as a space/time random field. The SSM can be calibrated to available spatial and temporal data from existing climate databases, e.g., the Program for Climate Model Diagnosis and Intercomparison (PCMDI), or to a collection of outputs from a General Circulation Model (GCM), e.g., the Community Earth System Model (CESM) and its predecessors. Because of its reduced size and complexity, the realization of a large number of independent model outputs from an SSM becomes computationally straightforward, so that quantifying the risk associated with low-probability, high-consequence climate events becomes feasible. A Bayesian framework is developed to provide quantitative measures of confidence, via Bayesian credible intervals, in the use of the proposed approach to assess these risks.
Wagner, Tyler; Jefferson T. Deweber,; Jason Detar,; Kristine, David; John A. Sweka,
2014-01-01
Many potential stressors to aquatic environments operate over large spatial scales, prompting the need to assess and monitor both site-specific and regional dynamics of fish populations. We used hierarchical Bayesian models to evaluate the spatial and temporal variability in density and capture probability of age-1 and older Brook Trout Salvelinus fontinalis from three-pass removal data collected at 291 sites over a 37-year time period (1975–2011) in Pennsylvania streams. There was high between-year variability in density, with annual posterior means ranging from 2.1 to 10.2 fish/100 m2; however, there was no significant long-term linear trend. Brook Trout density was positively correlated with elevation and negatively correlated with percent developed land use in the network catchment. Probability of capture did not vary substantially across sites or years but was negatively correlated with mean stream width. Because of the low spatiotemporal variation in capture probability and a strong correlation between first-pass CPUE (catch/min) and three-pass removal density estimates, the use of an abundance index based on first-pass CPUE could represent a cost-effective alternative to conducting multiple-pass removal sampling for some Brook Trout monitoring and assessment objectives. Single-pass indices may be particularly relevant for monitoring objectives that do not require precise site-specific estimates, such as regional monitoring programs that are designed to detect long-term linear trends in density.
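As a simplified, non-hierarchical companion to the abstract above, the sketch below profiles a binomial removal likelihood over candidate abundances for a single site with three passes, returning abundance and constant capture probability estimates together with a first-pass CPUE index; the catches and effort are illustrative, and the study's hierarchical Bayesian model is not reproduced here.

```python
# Illustrative sketch: single-site, constant-p three-pass removal estimator
# (not the hierarchical Bayesian model of the study). Catches are invented.
import numpy as np
from scipy.stats import binom

catches = np.array([46, 21, 9])               # removals on passes 1-3
removed_before = np.concatenate([[0], np.cumsum(catches)[:-1]])

def profile(N):
    """Removal log-likelihood at the conditional MLE of p, for candidate abundance N."""
    exposed = N - removed_before              # fish still present at each pass
    p_hat = catches.sum() / exposed.sum()
    return binom.logpmf(catches, exposed, p_hat).sum(), p_hat

candidates = range(catches.sum(), catches.sum() + 400)
loglik, p_hat, N_hat = max((*profile(N), N) for N in candidates)
print(f"N_hat = {N_hat}, p_hat = {p_hat:.2f}")

# First-pass CPUE index for comparison with the removal-based estimate,
# as discussed in the abstract; the effort value here is notional.
effort_minutes = 35.0
print(f"first-pass CPUE = {catches[0] / effort_minutes:.2f} fish/min")
```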
Thermodynamic modeling using BINGO-ANTIDOTE: A new strategy to investigate metamorphic rocks
NASA Astrophysics Data System (ADS)
Lanari, Pierre; Duesterhoeft, Erik
2016-04-01
BINGO-ANTIDOTE is a new program combining the achievements of the two petrological software packages XMAPTOOLS [1] and THERIAK-DOMINO [2]. XMAPTOOLS provides information about compositional zoning in minerals and the local bulk composition of domains at the thin-section scale. THERIAK-DOMINO calculates equilibrium phase assemblages from a given bulk rock composition, temperature T, and pressure P. BINGO-ANTIDOTE can essentially be described as an inverse THERIAK-DOMINO, because it uses the information provided by XMAPTOOLS to calculate the probable P-T equilibrium conditions of metamorphic rocks. Consequently, the program combines the strengths of forward Gibbs free energy minimization models with the intuitive output of inverse thermobarometry models. To obtain the "best" P-T equilibrium conditions of a metamorphic rock sample, and thus to estimate the degree of agreement between the observed and calculated mineral assemblage, it is critical to define a reliable scoring strategy. BINGO uses the Theriak_D add-on [3] (Duesterhoeft and de Capitani, 2013) and is a flexible model scorer with 3+1 evaluation criteria. These criteria are the statistical agreement between the observed and calculated mineral assemblage, mineral proportions (vol%), and mineral compositions (mol). Additionally, a total likelihood, combining the first three criteria, allows the user to evaluate the most probable equilibrium P-T condition. ANTIDOTE is an interactive user interface that displays the 3+1 evaluation criteria as probability P-T maps. It can be used with or without XMAPTOOLS. As a stand-alone program, it accepts macroscopic observations from the user (i.e., mineral names and proportions), which ANTIDOTE converts into a readable BINGO input. In this manner, BINGO-ANTIDOTE opens up thermodynamics to students and to users with only a basic knowledge of phase diagrams and thermodynamic modeling techniques. This presentation introduces BINGO-ANTIDOTE and includes typical examples of its functionality, such as the determination of P-T conditions of high-grade rocks. BINGO-ANTIDOTE is still under development and will soon be freely available online. References: [1] Lanari P., Vidal O., De Andrade V., Dubacq B., Lewin E., Grosch E. G. and Schwartz S. (2013) XMapTools: a MATLAB©-based program for electron microprobe X-ray image processing and geothermobarometry. Comput. Geosci. 62, 227-240. [2] de Capitani C. and Petrakakis K. (2010) The computation of equilibrium assemblage diagrams with Theriak/Domino software. Am. Mineral. 95, 1006-1016. [3] Duesterhoeft E. and de Capitani C. (2013) Theriak_D: An add-on to implement equilibrium computations in geodynamic models. Geochem. Geophys. Geosyst. 14, 4962-4967.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chatterjee, Samrat; Tipireddy, Ramakrishna; Oster, Matthew R.
Securing cyber-systems on a continual basis against a multitude of adverse events is a challenging undertaking. Game-theoretic approaches, which model actions of strategic decision-makers, are increasingly being applied to address cybersecurity resource allocation challenges. Such game-based models account for multiple player actions and represent cyber attacker payoffs mostly as point utility estimates. Since a cyber-attacker’s payoff generation mechanism is largely unknown, appropriate representation and propagation of uncertainty is a critical task. In this paper we expand on prior work and focus on operationalizing the probabilistic uncertainty quantification framework, for a notional cyber system, through: 1) representation of uncertain attacker and system-related modeling variables as probability distributions and mathematical intervals, and 2) exploration of uncertainty propagation techniques including two-phase Monte Carlo sampling and probability bounds analysis.
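A hedged sketch of the two-phase Monte Carlo idea mentioned above: the outer loop draws epistemic quantities known only as intervals (here, an attacker-success probability), the inner loop draws aleatory variability, and the spread of the inner-loop exceedance probabilities across the outer loop summarizes epistemic uncertainty. The attack/loss model and all numbers are notional, not the paper's notional cyber system.

```python
# Illustrative two-phase (nested) Monte Carlo sketch; all quantities are notional.
import numpy as np

rng = np.random.default_rng(7)

N_EPISTEMIC, N_ALEATORY = 200, 2000
p_success_interval = (0.05, 0.30)  # epistemic: success probability known only as an interval
loss_scale = 50.0                  # aleatory loss per successful attack (notional)
threshold = 100.0

exceedance = np.empty(N_EPISTEMIC)
for i in range(N_EPISTEMIC):
    # Outer (epistemic) draw: one admissible value of the unknown probability.
    p = rng.uniform(*p_success_interval)
    # Inner (aleatory) draws: successful attacks out of 10 attempts, and losses.
    successes = rng.binomial(10, p, size=N_ALEATORY)
    losses = successes * rng.exponential(loss_scale, size=N_ALEATORY)
    exceedance[i] = np.mean(losses > threshold)

# The spread across the outer loop is a crude probability-bounds-style envelope
# for the aleatory exceedance probability.
print(f"P(loss > {threshold}): min {exceedance.min():.3f}, "
      f"median {np.median(exceedance):.3f}, max {exceedance.max():.3f}")
```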
NASA Astrophysics Data System (ADS)
Xia, Xintao; Wang, Zhongyu
2008-10-01
For some statistical methods of analyzing the stability of a system, it is difficult to resolve the problems of an unknown probability distribution and a small sample. Therefore, a novel method is proposed in this paper to resolve these problems. This method is independent of the probability distribution and is useful for small-sample systems. After rearrangement of the original data series, the order difference and two polynomial membership functions are introduced to estimate the true value, the lower bound, and the upper bound of the system using fuzzy-set theory. The empirical distribution function is then investigated to ensure a confidence level above 95%, and the degree of similarity is presented to evaluate the stability of the system. Computer simulation cases investigate stable systems with various probability distributions, unstable systems with linear and periodic systematic errors, and some mixed systems. The proposed method of system stability analysis is thereby validated.
Salmeron, Patricia A; Christian, Becky J
2016-01-01
The purpose of this project was to determine if a bullying educational program for school nurses and certified nursing assistants/health technicians (CNAs/HTs) would increase knowledge of bullying, probability of reporting a bully, and probability of assisting a bullied victim. This educational program and evaluation employed a retrospective, post-then-pre-test design. Instruments used included a 17-item demographic questionnaire and the 12-item Reduced Aggression/Victimization Scale Bullying Assessment Tool (BAT), a 5-point Likert scale designed to assess school nurses’ and CNAs’/HTs’ understanding of bullying, the probability of reporting bullies, and the probability of assisting bullied victims before and after the educational presentation. Findings of this educational evaluation program indicated that the majority of school nurses and CNAs/HTs had an increased understanding of bullying, a higher probability of reporting a bully, and a higher probability of assisting a bullied victim after the presentation.
Generalized probabilistic scale space for image restoration.
Wong, Alexander; Mishra, Akshaya K
2010-10-01
A novel generalized sampling-based probabilistic scale space theory is proposed for image restoration. We explore extending the definition of scale space to better account for both noise and observation models, which is important for producing accurately restored images. A new class of scale-space realizations based on sampling and probability theory is introduced to realize this extended definition in the context of image restoration. Experimental results using 2-D images show that generalized sampling-based probabilistic scale-space theory can be used to produce more accurate restored images when compared with state-of-the-art scale-space formulations, particularly under situations characterized by low signal-to-noise ratios and image degradation.
Estimating the breeding population of long-billed curlew in the United States
Stanley, T.R.; Skagen, S.K.
2007-01-01
Determining population size and long-term trends in population size for species of high concern is a priority of international, national, and regional conservation plans. Long-billed curlews (Numenius americanus) are a species of special concern in North America due to apparent declines in their population. Because long-billed curlews are not adequately monitored by existing programs, we undertook a 2-year study with the goals of 1) determining present long-billed curlew distribution and breeding population size in the United States and 2) providing recommendations for a long-term long-billed curlew monitoring protocol. We selected a stratified random sample of survey routes in 16 western states for sampling in 2004 and 2005, and we analyzed count data from these routes to estimate detection probabilities and abundance. In addition, we evaluated habitat along roadsides to determine how well roadsides represented habitat throughout the sampling units. We estimated there were 164,515 (SE = 42,047) breeding long-billed curlews in 2004, and 109,533 (SE = 31,060) breeding individuals in 2005. These estimates far exceed currently accepted estimates based on expert opinion. We found that habitat along roadsides was representative of long-billed curlew habitat in general. We make recommendations for improving sampling methodology, and we present power curves to provide guidance on minimum sample sizes required to detect trends in abundance.
Wang, Alice; McMahan, Lanakila; Rutstein, Shea; Stauber, Christine; Reyes, Jorge; Sobsey, Mark D.
2017-01-01
The Joint Monitoring Program relies on household surveys to classify access to improved water sources instead of measuring microbiological quality. The aim of this research was to pilot a novel test for Escherichia coli quantification of household drinking water in the 2011 Demographic and Health Survey (DHS) in Peru. In the Compartment Bag Test (CBT), a 100-mL water sample is supplemented with chromogenic medium to support the growth of E. coli, poured into a bag with compartments, and incubated. A color change indicates E. coli growth, and the concentration of E. coli/100 mL is estimated as a most probable number. Triplicate water samples from 704 households were collected; one sample was analyzed in the field using the CBT, another replicate sample using the CBT was analyzed by reference laboratories, and one sample using membrane filtration (MF) was analyzed by reference laboratories. There were no statistically significant differences in E. coli concentrations between the field and laboratory CBT results, or when compared with MF results. These results suggest that the CBT for E. coli is an effective method to quantify fecal bacteria in household drinking water. The CBT can be incorporated into DHS and other national household surveys as a direct measure of drinking water safety based on microbial quality to better document access to safe drinking water. PMID:28500818
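A minimal sketch of the most-probable-number calculation that underlies compartment-style presence/absence tests such as the CBT: with compartment volumes v_j and positive/negative outcomes, the MPN is the concentration that maximizes a Poisson presence/absence likelihood. The volumes and outcomes below are illustrative assumptions, not the published CBT specification.

```python
# Hedged MPN sketch: compartment volumes and outcomes are illustrative only.
import numpy as np
from scipy.optimize import minimize_scalar

volumes = np.array([1.0, 3.0, 10.0, 30.0, 56.0])   # mL per compartment (assumed)
positive = np.array([1, 1, 1, 0, 1])               # observed color change (assumed)

def neg_loglik(log_c):
    c = np.exp(log_c)                               # E. coli per mL
    p_pos = 1.0 - np.exp(-c * volumes)              # P(compartment turns positive)
    p_pos = np.clip(p_pos, 1e-12, 1 - 1e-12)
    return -np.sum(positive * np.log(p_pos) + (1 - positive) * np.log(1 - p_pos))

res = minimize_scalar(neg_loglik, bounds=(-10, 5), method="bounded")
mpn_per_100ml = np.exp(res.x) * 100.0
print(f"MPN ~ {mpn_per_100ml:.1f} E. coli per 100 mL")
```

Note that the maximum likelihood estimate is finite only when at least one compartment is negative and at least one is positive; all-negative results are reported as below the detection limit.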
Introduction to Sample Size Choice for Confidence Intervals Based on "t" Statistics
ERIC Educational Resources Information Center
Liu, Xiaofeng Steven; Loudermilk, Brandon; Simpson, Thomas
2014-01-01
Sample size can be chosen to achieve a specified width in a confidence interval. The probability of obtaining a narrow width given that the confidence interval includes the population parameter is defined as the power of the confidence interval, a concept unfamiliar to many practitioners. This article shows how to utilize the Statistical Analysis…
Multistage variable probability forest volume inventory. [the Defiance Unit of the Navajo Nation
NASA Technical Reports Server (NTRS)
Anderson, J. E. (Principal Investigator)
1979-01-01
An inventory scheme based on the use of computer-processed LANDSAT MSS data was developed. Output from the inventory scheme provides an estimate of the standing net saw timber volume of a major timber species on a selected forested area of the Navajo Nation. Such estimates are based on the values of parameters currently used for scaled sawlog conversion to mill output. The multistage variable probability sampling appears capable of producing estimates that compare favorably with those produced using conventional techniques. In addition, the reduction in time, manpower, and overall costs lends it to numerous applications.
Aerial survey methodology for bison population estimation in Yellowstone National Park
Hess, Steven C.
2002-01-01
I developed aerial survey methods for statistically rigorous bison population estimation in Yellowstone National Park to support sound resource management decisions and to understand bison ecology. Survey protocols, data recording procedures, a geographic framework, and seasonal stratifications were based on field observations from February 1998-September 2000. The reliability of this framework and strata was tested with long-term data from 1970-1997. I simulated different sample survey designs and compared them to high-effort censuses of well-defined large areas to evaluate effort, precision, and bias. Sample survey designs require much effort and extensive information on the current spatial distribution of bison and therefore do not offer any substantial reduction in time and effort over censuses. I conducted concurrent ground surveys, or 'double sampling', to estimate detection probability during aerial surveys. Group size distribution and habitat strongly affected detection probability. In winter, 75% of the groups and 92% of individual bison were detected on average from aircraft, while in summer, 79% of groups and 97% of individual bison were detected. I also used photography to quantify the bias in counting large groups of bison and found that undercounting increased with group size and could reach 15%. I compared survey conditions between seasons and identified optimal time windows for conducting surveys in both winter and summer. These windows account for the habitats and total area bison occupy, and group size distribution. Bison became increasingly scattered over the Yellowstone region in smaller groups and increasingly occupied unfavorable habitats as winter progressed. Therefore, the best conditions for winter surveys occur early in the season (Dec-Jan). In summer, bison were most spatially aggregated and occurred in the largest groups by early August. Low variability between surveys and high detection probability provide population estimates with an overall coefficient of variation of approximately 8% and have high power for detecting trends in population change. I demonstrated how population estimates from winter and summer can be integrated into a comprehensive monitoring program to estimate annual growth rates, overall winter mortality, and an index of calf production, requiring about 30 hours of flight per year.
Mesías-García, Marta; Guerra-Hernández, Eduardo; García-Villanova, Belén
2010-05-26
The presence of ascorbic acid (AA), vitamin C (AA + dehydroascorbic acid (DHAA)) and furfural as potential precursors of furan in commercial fruit and vegetable jarred baby food was studied. Hydroxymethylfurfural (HMF) was also determined and used, together with furfural levels, as markers of thermal damage. AA, calculated DHAA and vitamin C values ranged between 22.4 and 103, 2.9 and 13.8, and 32.1 and 113.2 mg/100 g, respectively, in fruit-based baby food. However, no trace of AA was found in the vegetable-based baby food samples tested, probably because these samples are not enriched in vitamin C and the content of this vitamin in fresh vegetables is destroyed during processing. Furfural values ranged from not detected to 236 microg/100 g, being higher in vegetable samples than in fruit samples possibly because of greater AA degradation favored by a higher pH in the vegetable samples. HMF values (range: not detected-959 microg/100 g), however, were higher in the fruit samples, probably due to greater carbohydrate content degradation and as a consequence of the Maillard reaction, favored by a lower pH in these samples. According to these results, HMF would be the optimum indicator of thermal treatment for fruits, and furfural for vegetables. The higher furfural content of vegetable baby food could be considered an index of greater AA degradation and, therefore, the furan content might be higher in this kind of sample than in fruit-based baby food.
Wong, Melissa R; McKelvey, Wendy; Ito, Kazuhiko; Schiff, Corinne; Jacobson, J Bryan; Kass, Daniel
2015-03-01
We evaluated the impact of the New York City restaurant letter-grading program on restaurant hygiene, food safety practices, and public awareness. We analyzed data from 43,448 restaurants inspected between 2007 and 2013 to measure changes in inspection score and violation citations since program launch in July 2010. We used binomial regression to assess probability of scoring 0 to 13 points (A-range score). Two population-based random-digit-dial telephone surveys assessed public perceptions of the program. After we controlled for repeated restaurant observations, season of inspection, and chain restaurant status, the probability of scoring 0 to 13 points on an unannounced inspection increased 35% (95% confidence interval [CI]=31%, 40%) 3 years after compared with 3 years before grading. There were notable improvements in compliance with some specific requirements, including having a certified kitchen manager on site and being pest-free. More than 91% (95% CI=88%, 94%) of New Yorkers approved of the program and 88% (95% CI=85%, 92%) considered grades in dining decisions in 2012. Restaurant letter grading in New York City has resulted in improved sanitary conditions on unannounced inspection, suggesting that the program is an effective regulatory tool.
The National Human Exposure Assessment Survey (NHEXAS) is a federal interagency research effort coordinated by the Environmental Protection Agency (EPA), Office of Research and Development (ORD). Phase I consists of demonstration/scoping studies using probability-based sampling ...
Normal probability plots with confidence.
Chantarangsi, Wanpen; Liu, Wei; Bretz, Frank; Kiatsupaibul, Seksan; Hayter, Anthony J; Wan, Fang
2015-01-01
Normal probability plots are widely used as a statistical tool for assessing whether an observed simple random sample is drawn from a normally distributed population. The users, however, have to judge subjectively, if no objective rule is provided, whether the plotted points fall close to a straight line. In this paper, we focus on how a normal probability plot can be augmented by intervals for all the points so that, if the population distribution is normal, then all the points should fall into the corresponding intervals simultaneously with probability 1-α. These simultaneous 1-α probability intervals therefore provide an objective means of judging whether the plotted points fall close to the straight line: the plotted points fall close to the straight line if and only if all the points fall into the corresponding intervals. The powers of several normal-probability-plot-based (graphical) tests and the most popular nongraphical Anderson-Darling and Shapiro-Wilk tests are compared by simulation. Based on this comparison, recommendations are given in Section 3 on which graphical tests should be used in what circumstances. An example is provided to illustrate the methods. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
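A simulation-calibrated variant of this idea can be sketched as follows (this is not the authors' exact construction): pointwise quantile levels for the standardized order statistics are tightened until, under normality, all n points fall inside their intervals simultaneously with probability about 1 − α. The sample size, number of simulations, and calibration rule below are illustrative choices.

```python
# Hedged sketch of a simulation-calibrated simultaneous envelope for a Q-Q plot.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, alpha, B = 30, 0.05, 5000

# Reference simulations: standardized, sorted normal samples of size n.
sims = rng.standard_normal((B, n))
sims = (sims - sims.mean(axis=1, keepdims=True)) / sims.std(axis=1, ddof=1, keepdims=True)
sims.sort(axis=1)

def envelope(pointwise_alpha):
    lo = np.quantile(sims, pointwise_alpha / 2, axis=0)
    hi = np.quantile(sims, 1 - pointwise_alpha / 2, axis=0)
    inside = np.all((sims >= lo) & (sims <= hi), axis=1).mean()
    return lo, hi, inside

# Shrink the pointwise level until simultaneous coverage reaches 1 - alpha.
pw = alpha
lo, hi, cover = envelope(pw)
while cover < 1 - alpha and pw > 1e-4:
    pw *= 0.8
    lo, hi, cover = envelope(pw)

# Apply to an observed sample: standardize, sort, and check against the envelope.
x = rng.standard_normal(n) * 2.0 + 10.0          # illustrative data
z = np.sort((x - x.mean()) / x.std(ddof=1))
theoretical = stats.norm.ppf((np.arange(1, n + 1) - 0.5) / n)
print("all points inside simultaneous envelope:",
      bool(np.all((z >= lo) & (z <= hi))))
# Plotting z (with lo and hi) against `theoretical` gives the usual Q-Q display.
```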
Code of Federal Regulations, 2010 CFR
2010-04-01
... beginning study under the programs for which applications are made. (b) Candidates for the fellowship... studies under the program. (c) Candidates shall submit evidence of acceptance, or probable acceptance, for study in programs that will enhance their contributions to their employers. Evidence of probable...
Estimating site occupancy rates when detection probabilities are less than one
MacKenzie, D.I.; Nichols, J.D.; Lachman, G.B.; Droege, S.; Royle, J. Andrew; Langtimm, C.A.
2002-01-01
Nondetection of a species at a site does not imply that the species is absent unless the probability of detection is 1. We propose a model and likelihood-based method for estimating site occupancy rates when detection probabilities are <1 (the method performs well provided detection probabilities are not too low, e.g., >0.3). We estimated site occupancy rates for two anuran species at 32 wetland sites in Maryland, USA, from data collected during 2000 as part of an amphibian monitoring program, Frogwatch USA. Site occupancy rates were estimated as 0.49 for American toads (Bufo americanus), a 44% increase over the proportion of sites at which they were actually observed, and as 0.85 for spring peepers (Pseudacris crucifer), slightly above the observed proportion of 0.83.
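A minimal sketch of the constant-occupancy, constant-detection version of this likelihood, fitted to simulated detection histories by maximum likelihood, is given below; the numbers of sites and visits are illustrative, and the Frogwatch data are not used.

```python
# Illustrative single-season occupancy sketch (constant psi and p); data simulated.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(5)
S, T = 32, 5                                   # sites and repeat visits (illustrative)
psi_true, p_true = 0.7, 0.4
occupied = rng.random(S) < psi_true
y = (rng.random((S, T)) < p_true) & occupied[:, None]   # detection histories

def neg_loglik(theta):
    psi, p = expit(theta)                      # keep both parameters in (0, 1)
    d = y.sum(axis=1)
    detected = d > 0
    ll_det = np.log(psi) + d[detected] * np.log(p) + (T - d[detected]) * np.log(1 - p)
    ll_never = (~detected).sum() * np.log(psi * (1 - p) ** T + (1 - psi))
    return -(ll_det.sum() + ll_never)

res = minimize(neg_loglik, x0=np.zeros(2), method="Nelder-Mead")
psi_hat, p_hat = expit(res.x)
naive = (y.sum(axis=1) > 0).mean()             # proportion of sites with a detection
print(f"psi_hat = {psi_hat:.2f} (naive occupancy {naive:.2f}), p_hat = {p_hat:.2f}")
```

Adding site or survey covariates on psi and p, as the paper describes, turns each constant parameter into a logistic-regression-style term in the same likelihood.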
Simulation modeling of population viability for the leopard darter (Percidae: Percina pantherina)
Williams, L.R.; Echelle, A.A.; Toepfer, C.S.; Williams, M.G.; Fisher, W.L.
1999-01-01
We used the computer program RAMAS to perform a population viability analysis for the leopard darter, Percina pantherina. This percid fish is a threatened species confined to five isolated rivers in the Ouachita Mountains of Oklahoma and Arkansas. A base model created from life history data indicated a 6% probability that the leopard darter would go extinct in 50 years. We performed sensitivity analyses to determine the effects of initial population size, variation in age structure, variation in severity and probability of catastrophe, and migration rate. Catastrophe (modeled as the probability and severity of drought) and migration had the greatest effects on persistence. Results of these simulations have implications for management of this species.
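A toy stand-in for this kind of analysis (not the RAMAS model itself) is sketched below: a stochastic projection with lognormal environmental variation in growth and an annual drought catastrophe, from which the 50-year quasi-extinction probability is estimated by Monte Carlo. All demographic values are placeholders rather than leopard darter life history data.

```python
# Illustrative population viability sketch; every parameter is a placeholder.
import numpy as np

rng = np.random.default_rng(11)

def simulate(n0=500, years=50, mean_r=0.02, sd_r=0.15,
             p_catastrophe=0.10, catastrophe_survival=0.5, reps=10_000):
    extinct = 0
    for _ in range(reps):
        n = float(n0)
        for _ in range(years):
            r = rng.normal(mean_r, sd_r)          # environmental stochasticity
            n *= np.exp(r)
            if rng.random() < p_catastrophe:      # drought year
                n *= catastrophe_survival
            if n < 2:                             # quasi-extinction threshold
                extinct += 1
                break
    return extinct / reps

print(f"P(extinction within 50 yr) ~ {simulate():.3f}")
```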
He, Hua; McDermott, Michael P.
2012-01-01
Sensitivity and specificity are common measures of the accuracy of a diagnostic test. The usual estimators of these quantities are unbiased if data on the diagnostic test result and the true disease status are obtained from all subjects in an appropriately selected sample. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the result of the diagnostic test and other characteristics of the subjects. Estimators of sensitivity and specificity based on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias under the assumption that the missing data on disease status are missing at random (MAR), that is, the probability of missingness depends on the true (missing) disease status only through the test result and observed covariate information. When some of the covariates are continuous, or the number of covariates is relatively large, the existing methods require parametric models for the probability of disease or the probability of verification (given the test result and covariates), and hence are subject to model misspecification. We propose a new method for correcting verification bias based on the propensity score, defined as the predicted probability of verification given the test result and observed covariates. This is estimated separately for those with positive and negative test results. The new method classifies the verified sample into several subsamples that have homogeneous propensity scores and allows correction for verification bias. Simulation studies demonstrate that the new estimators are more robust to model misspecification than existing methods, but still perform well when the models for the probability of disease and probability of verification are correctly specified. PMID:21856650
Public Health Dental Hygienists in Massachusetts: A Qualitative Study.
Rainchuso, Lori; Salisbury, Helen
2017-06-01
Purpose: The aim of this qualitative, phenomenological study was to explore the attitudes and perceptions of public health dental hygienists on providing preventive care to underserved populations in Massachusetts. Methods: Non-probability purposive sampling was used for initial participant recruitment, and snowball sampling occurred thereafter. Data collection occurred through semi-structured interviews. Qualitative analysis was conducted using Pitney and Parker's eight-step CREATIVE process. Results: Data saturation occurred with 10 participants (n=10), one-third of the public health dental hygienists who are practicing in Massachusetts. The majority of practice settings included school-based programs (70%), while programs for children with special needs (10%) were the least common. Two major themes emerged from the data; (a) the opportunity to be an oral health change agent and (b) barriers to practice. Six subcategories emerged from the data and are reviewed within the context of their associated themes. Additionally, career satisfaction emerged as an unintended theme, and was reported as the driving force for the majority of participants. Conclusion: This study revealed a better understanding of the public health dental hygiene workforce model in Massachusetts. Public health dental hygienists in Massachusetts perceive themselves as change agents within the health care profession, and although barriers to practice are plentiful, these oral health care professionals are committed to improving access to dental care. Copyright © 2017 The American Dental Hygienists’ Association.
Zhao, Wenle; Weng, Yanqiu; Wu, Qi; Palesch, Yuko
2012-01-01
To evaluate the performance of randomization designs under various parameter settings and trial sample sizes, and identify optimal designs with respect to both treatment imbalance and allocation randomness, we evaluate 260 design scenarios from 14 randomization designs under 15 sample sizes ranging from 10 to 300, using three measures for imbalance and three measures for randomness. The maximum absolute imbalance and the correct guess (CG) probability are selected to assess the trade-off performance of each randomization design. As measured by the maximum absolute imbalance and the CG probability, we found that the performances of the 14 randomization designs lie in a closed region with the upper boundary (worst case) given by Efron's biased coin design (BCD) and the lower boundary (best case) from Soares and Wu's big stick design (BSD). Designs close to the lower boundary provide a smaller imbalance and a higher randomness than designs close to the upper boundary. Our research suggested that optimization of randomization design is possible based on quantified evaluation of imbalance and randomness. Based on the maximum imbalance and CG probability, the BSD, Chen's biased coin design with imbalance tolerance method, and Chen's Ehrenfest urn design perform better than the popularly used permuted block design, Efron's BCD, and Wei's urn design. Copyright © 2011 John Wiley & Sons, Ltd.
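The kind of simulation behind such comparisons can be sketched as follows, for Efron's BCD (p = 2/3) versus the big stick design with imbalance tolerance b = 3, scored by mean maximum absolute imbalance and the expected correct-guess probability under a guess-the-smaller-arm strategy. The trial size, tolerance, and replication count are illustrative, not the 260 scenarios of the paper.

```python
# Hedged sketch: compare Efron's BCD and the big stick design (BSD) on
# maximum absolute imbalance and correct-guess (CG) probability.
import numpy as np

rng = np.random.default_rng(42)

def simulate(design, n=100, reps=5000, p_bcd=2 / 3, b=3):
    max_imb = np.empty(reps)
    correct = np.empty(reps)
    for r in range(reps):
        diff, guesses, worst = 0, 0.0, 0      # diff = treatment count - control count
        for _ in range(n):
            if design == "BSD" and abs(diff) >= b:
                p_treat = 0.0 if diff > 0 else 1.0        # forced assignment
            elif design == "BCD" and diff != 0:
                p_treat = 1 - p_bcd if diff > 0 else p_bcd
            else:
                p_treat = 0.5
            # Expected correctness of guessing the under-represented arm.
            if diff == 0:
                guesses += 0.5
            else:
                guesses += (1 - p_treat) if diff > 0 else p_treat
            diff += 1 if rng.random() < p_treat else -1
            worst = max(worst, abs(diff))
        max_imb[r] = worst
        correct[r] = guesses / n
    return max_imb.mean(), correct.mean()

for design in ("BCD", "BSD"):
    imb, cg = simulate(design)
    print(f"{design}: mean max |imbalance| = {imb:.1f}, correct-guess prob = {cg:.3f}")
```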
Investigating prior probabilities in a multiple hypothesis test for use in space domain awareness
NASA Astrophysics Data System (ADS)
Hardy, Tyler J.; Cain, Stephen C.
2016-05-01
The goal of this research effort is to improve Space Domain Awareness (SDA) capabilities of current telescope systems through improved detection algorithms. Ground-based optical SDA telescopes are often spatially under-sampled, or aliased. This fact negatively impacts the detection performance of traditionally proposed binary and correlation-based detection algorithms. A Multiple Hypothesis Test (MHT) algorithm has been previously developed to mitigate the effects of spatial aliasing. This is done by testing potential Resident Space Objects (RSOs) against several sub-pixel shifted Point Spread Functions (PSFs). A MHT has been shown to increase detection performance for the same false alarm rate. In this paper, the assumption of a priori probability used in a MHT algorithm is investigated. First, an analysis of the pixel decision space is completed to determine alternate hypothesis prior probabilities. These probabilities are then implemented into a MHT algorithm, and the algorithm is then tested against previous MHT algorithms using simulated RSO data. Results are reported with Receiver Operating Characteristic (ROC) curves and probability of detection, Pd, analysis.
Aurora, R Nisha; Putcha, Nirupama; Swartz, Rachel; Punjabi, Naresh M
2016-07-01
Obstructive sleep apnea is a prevalent yet underdiagnosed condition associated with cardiovascular morbidity and mortality. Home sleep testing offers an efficient means for diagnosing obstructive sleep apnea but has been deployed primarily in clinical samples with a high pretest probability. The present study sought to assess whether obstructive sleep apnea can be diagnosed with home sleep testing in a nonreferred sample without involvement of a sleep medicine specialist. A study of community-based adults with untreated obstructive sleep apnea was undertaken. Misclassification of disease severity according to home sleep testing with and without involvement of a sleep medicine specialist was assessed, and agreement was characterized using scatter plots, Pearson's correlation coefficient, Bland-Altman analysis, and the κ statistic. Analyses were also conducted to assess whether any observed differences varied as a function of pretest probability of obstructive sleep apnea or subjective sleepiness. The sample consisted of 191 subjects, with more than half (56.5%) having obstructive sleep apnea. Without involvement of a sleep medicine specialist, obstructive sleep apnea was not identified in only 5.8% of the sample. Analyses comparing the categorical assessment of disease severity with and without a sleep medicine specialist showed that in total, 32 subjects (16.8%) were misclassified. Agreement in the disease severity with and without a sleep medicine specialist was not influenced by the pretest probability or daytime sleep tendency. Obstructive sleep apnea can be reliably identified with home sleep testing in a nonreferred sample, irrespective of the pretest probability of the disease. Copyright © 2016 Elsevier Inc. All rights reserved.
Constructed-Response Matching to Sample and Spelling Instruction.
ERIC Educational Resources Information Center
Dube, William V.; And Others
1991-01-01
This paper describes a computer-based spelling program grounded in programed instructional techniques and using constructed-response matching-to-sample procedures. Following use of the program, two mentally retarded men successfully spelled previously misspelled words. (JDD)
Surveying Europe’s Only Cave-Dwelling Chordate Species (Proteus anguinus) Using Environmental DNA
Márton, Orsolya; Schmidt, Benedikt R.; Gál, Júlia Tünde; Jelić, Dušan
2017-01-01
In surveillance of subterranean fauna, especially in the case of rare or elusive aquatic species, traditional techniques used for epigean species are often not feasible. We developed a non-invasive survey method based on environmental DNA (eDNA) to detect the presence of the red-listed cave-dwelling amphibian, Proteus anguinus, in the caves of the Dinaric Karst. We tested the method in fifteen caves in Croatia, from which the species was previously recorded or expected to occur. We successfully confirmed the presence of P. anguinus from ten caves and detected the species for the first time in five others. Using a hierarchical occupancy model we compared the availability and detection probability of eDNA of two water sampling methods, filtration and precipitation. The statistical analysis showed that both availability and detection probability depended on the method and estimates for both probabilities were higher using filter samples than for precipitation samples. Combining reliable field and laboratory methods with robust statistical modeling will give the best estimates of species occurrence. PMID:28129383
II. MORE THAN JUST CONVENIENT: THE SCIENTIFIC MERITS OF HOMOGENEOUS CONVENIENCE SAMPLES.
Jager, Justin; Putnick, Diane L; Bornstein, Marc H
2017-06-01
Despite their disadvantaged generalizability relative to probability samples, nonprobability convenience samples are the standard within developmental science, and likely will remain so because probability samples are cost-prohibitive and most available probability samples are ill-suited to examine developmental questions. In lieu of focusing on how to eliminate or sharply reduce reliance on convenience samples within developmental science, here we propose how to augment their advantages when it comes to understanding population effects as well as subpopulation differences. Although all convenience samples have less clear generalizability than probability samples, we argue that homogeneous convenience samples have clearer generalizability relative to conventional convenience samples. Therefore, when researchers are limited to convenience samples, they should consider homogeneous convenience samples as a positive alternative to conventional (or heterogeneous) convenience samples. We discuss future directions as well as potential obstacles to expanding the use of homogeneous convenience samples in developmental science. © 2017 The Society for Research in Child Development, Inc.
Probabilistic approach to lysozyme crystal nucleation kinetics.
Dimitrov, Ivaylo L; Hodzhaoglu, Feyzim V; Koleva, Dobryana P
2015-09-01
Nucleation of lysozyme crystals in quiescent solutions at a regime of progressive nucleation is investigated under an optical microscope at conditions of constant supersaturation. A method based on the stochastic nature of crystal nucleation and using discrete time sampling of small solution volumes for the presence or absence of detectable crystals is developed. It allows probabilities for crystal detection to be experimentally estimated. One hundred single samplings were used for each probability determination for 18 time intervals and six lysozyme concentrations. Fitting of a particular probability function to experimentally obtained data made possible the direct evaluation of stationary rates for lysozyme crystal nucleation, the time for growth of supernuclei to a detectable size and probability distribution of nucleation times. Obtained stationary nucleation rates were then used for the calculation of other nucleation parameters, such as the kinetic nucleation factor, nucleus size, work for nucleus formation and effective specific surface energy of the nucleus. The experimental method itself is simple and adaptable and can be used for crystal nucleation studies of arbitrary soluble substances with known solubility at particular solution conditions.
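A hedged sketch of the fitting step implied above: the fraction of sample volumes containing a detectable crystal at time t is modeled as P(t) = 1 − exp(−JV(t − t_g)) for t > t_g, and a nonlinear fit to detection fractions recovers the product JV (stationary nucleation rate times volume) and the growth delay t_g. The data below are simulated, not the lysozyme measurements, and dividing the fitted JV by the known sample volume would give J itself.

```python
# Illustrative fit of a crystal-detection probability curve; data are synthetic.
import numpy as np
from scipy.optimize import curve_fit

def detection_prob(t, JV, t_g):
    """Probability that at least one supernucleus has formed and grown by time t."""
    return np.where(t > t_g, 1.0 - np.exp(-JV * (t - t_g)), 0.0)

rng = np.random.default_rng(9)
t_obs = np.linspace(10, 180, 18)                     # sampling times (minutes)
true_JV, true_tg, n_samples = 0.02, 25.0, 100        # assumed "true" values
frac = rng.binomial(n_samples, detection_prob(t_obs, true_JV, true_tg)) / n_samples

popt, _ = curve_fit(detection_prob, t_obs, frac, p0=[0.01, 10.0], bounds=(0, np.inf))
JV_hat, tg_hat = popt
print(f"J*V ~ {JV_hat:.3f} per min, growth delay t_g ~ {tg_hat:.1f} min")
```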
Efficiency of MY09/11 consensus PCR in the detection of multiple HPV infections.
Şahiner, Fatih; Kubar, Ayhan; Gümral, Ramazan; Ardıç, Medine; Yiğit, Nuri; Şener, Kenan; Dede, Murat; Yapar, Mehmet
2014-09-01
Human papillomavirus (HPV) DNA testing has become an important component of cervical cancer screening programs. In this study, we aimed to evaluate the efficiency of MY09/11 consensus polymerase chain reaction (PCR) for the detection of multiple HPV infections. For this purpose, MY09/11 PCR was compared to an original TaqMan-based type-specific real-time PCR assay, which can detect 20 different HPV types. Of the 654 samples, 34.1% (223/654) were HPV DNA positive according to at least one method. The relative sensitivities of MY09/11 PCR and type-specific PCR were 80.7% (180/223) and 97.8% (218/223), respectively. In all, 352 different HPV isolates (66 low-risk and 286 high-risk or probable high-risk types) were identified in 218 samples, but 5 samples, which were positive by consensus PCR only, could not be genotyped. The distribution of the 286 high-risk or probable high-risk HPVs were as follows: 24.5% HPV-16, 8.4% HPV-52, 7.7% HPV-51, 6.3% HPV-39, 6.3% HPV-82, 5.6% HPV-35, 5.6% HPV-58, 5.6% HPV-66, 5.2% HPV-18, 5.2% HPV-68, and 19.6% the other 8 types. A single HPV type was detected in 57.3% (125/218) of the genotyped samples, and multiple HPV types were found in the remaining 42.7% (93/218). The false-negative rates of MY09/11 PCR were found to be 17.4% in single infections, 23.3% in multiple infections, and 34.6% in multiple infections that contained 3 or more HPV types, with the condition that the low-risk types HPV-6 and HPV-11 be considered as a monotype. These data suggest that broad-range PCR assays may lead to significant data loss and that type-specific PCR assays can provide accurate and reliable results during cervical cancer screening. Copyright © 2014 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Gronewold, A. D.; Wolpert, R. L.; Reckhow, K. H.
2007-12-01
Most probable number (MPN) and colony-forming-unit (CFU) are two estimates of fecal coliform bacteria concentration commonly used as measures of water quality in United States shellfish harvesting waters. The MPN is the maximum likelihood estimate (or MLE) of the true fecal coliform concentration based on counts of non-sterile tubes in serial dilution of a sample aliquot, indicating bacterial metabolic activity. The CFU is the MLE of the true fecal coliform concentration based on the number of bacteria colonies emerging on a growth plate after inoculation from a sample aliquot. Each estimating procedure has intrinsic variability and is subject to additional uncertainty arising from minor variations in experimental protocol. Several versions of each procedure (using different sized aliquots or different numbers of tubes, for example) are in common use, each with its own levels of probabilistic and experimental error and uncertainty. It has been observed empirically that the MPN procedure is more variable than the CFU procedure, and that MPN estimates are somewhat higher on average than CFU estimates, on split samples from the same water bodies. We construct a probabilistic model that provides a clear theoretical explanation for the observed variability in, and discrepancy between, MPN and CFU measurements. We then explore how this variability and uncertainty might propagate into shellfish harvesting area management decisions through a two-phased modeling strategy. First, we apply our probabilistic model in a simulation-based analysis of future water quality standard violation frequencies under alternative land use scenarios, such as those evaluated under guidelines of the total maximum daily load (TMDL) program. Second, we apply our model to water quality data from shellfish harvesting areas which at present are closed (either conditionally or permanently) to shellfishing, to determine if alternative laboratory analysis procedures might have led to different management decisions. Our research results indicate that the (often large) observed differences between MPN and CFU values for the same water body are well within the ranges predicted by our probabilistic model. Our research also indicates that the probability of violating current water quality guidelines at specified true fecal coliform concentrations depends on the laboratory procedure used. As a result, quality-based management decisions, such as opening or closing a shellfishing area, may also depend on the laboratory procedure used.
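The variability argument can be illustrated with a small simulation (assumptions: a 5-tube, 3-dilution MPN design and a 100-mL membrane-filtration count, neither taken from the abstract): for a fixed true concentration, MPN estimates recomputed from simulated tube outcomes show a larger coefficient of variation than simulated CFU counts.

```python
# Hedged simulation sketch of MPN versus CFU sampling variability; the tube
# volumes, dilution design, and plate volume are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(13)
true_c = 0.14                                   # organisms per mL (14 per 100 mL)
volumes = np.repeat([10.0, 1.0, 0.1], 5)        # 5 tubes at each of 3 dilutions

def mpn_mle(positive):
    """MPN as the MLE of concentration from presence/absence tube outcomes."""
    def nll(log_c):
        p = np.clip(1 - np.exp(-np.exp(log_c) * volumes), 1e-12, 1 - 1e-12)
        return -np.sum(positive * np.log(p) + (1 - positive) * np.log(1 - p))
    res = minimize_scalar(nll, bounds=(-12, 5), method="bounded")
    return np.exp(res.x) * 100.0                # per 100 mL

mpn, cfu = [], []
for _ in range(2000):
    tubes = rng.random(volumes.size) < 1 - np.exp(-true_c * volumes)
    if tubes.all() or not tubes.any():
        continue                                # MLE not finite; skip this replicate
    mpn.append(mpn_mle(tubes.astype(float)))
    cfu.append(rng.poisson(true_c * 100.0))     # colonies on a 100-mL filter

mpn, cfu = np.array(mpn), np.array(cfu, dtype=float)
print(f"MPN: mean {mpn.mean():.1f}/100 mL, CV {mpn.std() / mpn.mean():.2f}")
print(f"CFU: mean {cfu.mean():.1f}/100 mL, CV {cfu.std() / cfu.mean():.2f}")
```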
Siddiqui, Niyamat A.; Rabidas, Vidya N.; Sinha, Sanjay K.; Verma, Rakesh B.; Pandey, Krishna; Singh, Vijay P.; Ranjan, Alok; Topno, Roshan K.; Lal, Chandra S.; Kumar, Vijay; Sahoo, Ganesh C.; Sridhar, Srikantaih; Pandey, Arvind; Das, Pradeep
2016-01-01
Background Visceral Leishmaniasis, commonly known as kala-azar, is widely prevalent in Bihar. The National Kala-azar Control Program has applied house-to-house survey approach several times for estimating Kala-azar incidence in the past. However, this approach includes huge logistics and operational cost, as occurrence of kala-azar is clustered in nature. The present study aims to compare efficiency, cost and feasibility of snowball sampling approach to house-to-house survey approach in capturing kala-azar cases in two endemic districts of Bihar, India. Methodology/Principal findings A community based cross-sectional study was conducted in two highly endemic Primary Health Centre (PHC) areas, each from two endemic districts of Bihar, India. Snowball technique (used to locate potential subjects with help of key informants where subjects are hard to locate) and house-to-house survey technique were applied to detect all the new cases of Kala-azar during a defined reference period of one year i.e. June, 2010 to May, 2011. The study covered a total of 105,035 households with 537,153 populations. Out of total 561 cases and 17 deaths probably due to kala-azar, identified by the study, snowball sampling approach captured only 221 cases and 13 deaths, whereas 489 cases and 17 deaths were detected by house-to-house survey approach. Higher value of McNemar’s χ² statistics (64; p<0.0001) for house-to-house survey approach than snowball sampling and relative difference (>1) indicates that most of the kala-azar cases missed by snowball sampling were captured by house-to-house approach with 13% of omission. Conclusion/Significance Snowball sampling was not found sensitive enough as it captured only about 50% of VL cases. However, it captured about 77% of the deaths probably due to kala-azar and was found more cost-effective than house-to-house approach. Standardization of snowball approach with improved procedure, training and logistics may enhance the sensitivity of snowball sampling and its application in national Kala-azar elimination programme as cost-effective approach for estimation of kala-azar burden. PMID:27681709
Alqahtani, Jobran M; Asaad, Ahmed M; Ahmed, Essam M; Qureshi, Mohamed A
2015-01-01
The aim was to investigate the bacteriological quality of drinking water and to explore factors associated with public knowledge about drinking water quality in the Najran region, Saudi Arabia. This was a cross-sectional descriptive study. A total of 160 water samples were collected. Total coliforms, fecal coliforms, and fecal streptococci were counted using the most probable number (MPN) method. The bacterial genes lacZ and uidA, specific to total coliforms and Escherichia coli, respectively, were detected using multiplex polymerase chain reaction. Interviews were conducted with 1200 residents using a questionnaire. Total coliforms were detected in 8 (20%) of 40 samples from wells, 13 (32.5%) of 40 samples from tankers, and 55 (68.8%) of 80 samples from roof tanks. Twenty (25%) and 8 (10%) samples from roof tanks were positive for E. coli and Streptococcus faecalis, respectively. Of the 1200 residents participating in the study, 10%, 45.5%, and 44.5% claimed that they depended on municipal water, bottled water, and well water, respectively. The majority (95.5%) reported the use of roof water tanks as a source of water supply in their homes. Most respondents (80%) believed that drinking water could transmit disease; however, only 25% of them had participated in educational programs on the effects of polluted water on health. Our results could help health authorities establish a proper regular monitoring program and a sustained, continuous assessment of well water quality. In addition, this study highlights the importance of awareness and educational programs for residents on the effects of polluted water on public health.
NASA Technical Reports Server (NTRS)
Hunter, H. E.
1972-01-01
The Avco Data Analysis and Prediction Techniques (ADAPT) were employed to determine laws capable of detecting failures in a heat plant up to three days in advance of the occurrence of the failure. The projected performance of the algorithms was a detection probability of 90% with a false alarm rate on the order of 1 per year, at a sampling rate of 1 per day, with each detection followed by 3 hourly samplings. This performance was verified on 173 independent test cases. The program also demonstrated diagnostic algorithms and the ability to predict the time of failure to within approximately plus or minus 8 hours up to three days in advance of the failure. The ADAPT programs produce simple algorithms that offer the unique possibility of a relatively low-cost updating procedure. The algorithms were implemented on general-purpose computers at Kennedy Space Flight Center and tested against current data.
Monte Carlo simulation of a photodisintegration of ³H experiment in Geant4
NASA Astrophysics Data System (ADS)
Gray, Isaiah
2013-10-01
An upcoming experiment involving photodisintegration of ³H at the High Intensity Gamma-Ray Source facility at Duke University has been simulated in the software package Geant4. CAD models of silicon detectors and wire chambers were imported from Autodesk Inventor using the program FastRad and the Geant4 GDML importer. Sensitive detectors were associated with the appropriate logical volumes in the exported GDML file so that changes in detector geometry will be easily manifested in the simulation. Probability distribution functions for the energy and direction of outgoing protons were generated using numerical tables from previous theory, and energies and directions were sampled from these distributions using a rejection sampling algorithm. The simulation will be a useful tool to optimize detector geometry, estimate background rates, and test data analysis algorithms. This work was supported by the Triangle Universities Nuclear Laboratory REU program at Duke University.
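The rejection-sampling step described above can be illustrated with a short sketch that draws values from a tabulated, interpolated density. The energy grid and density shape below are invented for illustration; they are not the theory tables used in the experiment.

```python
# A minimal sketch of rejection sampling from a tabulated probability density,
# as one might do for outgoing-proton energies interpolated from theory tables.
import numpy as np

energy_grid = np.linspace(0.0, 10.0, 101)                 # MeV (hypothetical grid)
pdf_table = np.exp(-(energy_grid - 4.0) ** 2 / 2.0)       # unnormalized density (hypothetical)
pdf_max = pdf_table.max()

rng = np.random.default_rng(42)

def sample_energy(n_samples):
    """Draw energies by accept/reject against the tabulated (interpolated) density."""
    out = []
    while len(out) < n_samples:
        e = rng.uniform(energy_grid[0], energy_grid[-1])   # propose uniformly over the grid range
        u = rng.uniform(0.0, pdf_max)                      # uniform height for the accept test
        if u <= np.interp(e, energy_grid, pdf_table):      # accept with probability f(e) / pdf_max
            out.append(e)
    return np.array(out)

energies = sample_energy(10_000)
print(f"mean sampled energy = {energies.mean():.2f} MeV")
```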
NASA Technical Reports Server (NTRS)
Nastrom, G. D.; Jasperson, W. H.
1983-01-01
Temperature data obtained by the Global Atmospheric Sampling Program (GASP) during the period March 1975 to July 1979 are compiled to form flight summaries of static air temperature and a geographic temperature climatology. The flight summaries include the height and location of the coldest observed temperature and the mean flight level, temperature, and standard deviation of temperature for each flight as well as for flight segments. These summaries are ordered by route and month. The temperature climatology was computed from all statistically independent temperature data for each flight. The grid used consists of 5 deg latitude, 30 deg longitude, and 2000 feet vertical resolution from FL270 to FL430 for each month of the year. The number of statistically independent observations, their mean, standard deviation, and the empirical 98th, 50th, 16th, 2nd, and 0.3rd probability percentiles are presented.
Optimizing bulk milk dioxin monitoring based on costs and effectiveness.
Lascano-Alcoser, V H; Velthuis, A G J; van der Fels-Klerx, H J; Hoogenboom, L A P; Oude Lansink, A G J M
2013-07-01
Dioxins are environmental pollutants, potentially present in milk products, which have negative consequences for human health and for the firms and farms involved in the dairy chain. Dioxin monitoring in feed and food has been implemented to detect their presence and estimate their levels in food chains. However, the costs and effectiveness of such programs have not been evaluated. In this study, the costs and effectiveness of bulk milk dioxin monitoring in milk trucks were estimated to optimize the sampling and pooling monitoring strategies aimed at detecting at least 1 contaminated dairy farm out of 20,000 at a target dioxin concentration level. Incidents of different sizes, in terms of the number of contaminated farms and the dioxin concentrations involved, were simulated. A combined testing strategy, consisting of screening and confirmatory methods, was assumed, as well as testing of pooled samples. Two optimization models were built using linear programming. The first model aimed to minimize monitoring costs subject to a minimum required effectiveness of finding an incident, whereas the second model aimed to maximize the effectiveness for a given monitoring budget. Our results show that a high level of effectiveness is possible, but at high costs. Given specific assumptions, monitoring with 95% effectiveness to detect an incident of 1 contaminated farm at a dioxin concentration of 2 pg of toxic equivalents/g of fat [European Commission's (EC) action level] costs €2.6 million per month. At the same level of effectiveness, a 73% cost reduction is possible when aiming to detect an incident in which 2 farms are contaminated at a dioxin concentration of 3 pg of toxic equivalents/g of fat (EC maximum level). With a fixed budget of €40,000 per month, the probability of detecting an incident with a single contaminated farm at a dioxin concentration equal to the EC action level is 4.4%. This probability almost doubled (8.0%) when aiming to detect the same incident but with a dioxin concentration equal to the EC maximum level. This study shows that the effectiveness of finding an incident depends not only on the ratio at which collected truck samples are mixed into a pooled sample for testing (which determines the concentration that can be detected), but also on the number of truck samples collected. In conclusion, the optimal cost-effective monitoring strategy depends on the number of contaminated farms and the concentration targeted for detection. The models and study results offer quantitative support to risk managers of food industries and food safety authorities. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
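To make the sampling-effort side of the trade-off concrete, the toy sketch below searches for the smallest monthly number of truck samples that reaches a required detection probability under strongly simplified assumptions (a single contaminated farm, random truck selection, and a screening test that always detects the target concentration when the contaminated farm's truck is tested). The farm counts, pooling ratio, and cost per test are hypothetical; this grid search stands in for, but is not, the paper's linear programming models.

```python
# Simplified, hypothetical sketch of the cost-vs-effectiveness trade-off in pooled
# bulk-milk monitoring (not the paper's models or cost figures).
n_farms = 20_000
farms_per_truck = 50          # hypothetical pooling ratio (farms per milk truck)
cost_per_pooled_test = 400.0  # hypothetical euros per screening test
required_effectiveness = 0.95

n_trucks_total = n_farms // farms_per_truck

def detection_probability(trucks_sampled):
    # Probability that the single contaminated farm's truck is among those sampled,
    # assuming the pooled test detects the target concentration whenever it is included.
    return min(1.0, trucks_sampled / n_trucks_total)

# Smallest sampling effort meeting the effectiveness target, and its monthly cost.
for trucks_sampled in range(1, n_trucks_total + 1):
    if detection_probability(trucks_sampled) >= required_effectiveness:
        print(f"trucks sampled per month: {trucks_sampled}, "
              f"monthly cost ≈ €{trucks_sampled * cost_per_pooled_test:,.0f}, "
              f"effectiveness = {detection_probability(trucks_sampled):.2f}")
        break
```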
Musella, Vincenzo; Rinaldi, Laura; Lagazio, Corrado; Cringoli, Giuseppe; Biggeri, Annibale; Catelan, Dolores
2014-09-15
Model-based geostatistics and Bayesian approaches are appropriate in the context of veterinary epidemiology when point data have been collected by valid study designs. The aim is to predict a continuous infection risk surface. Little work has been done on the use of predictive infection probabilities at the farm unit level. In this paper we show how to use the predictive infection probability and related uncertainty from a Bayesian kriging model to draw informative samples from the 8,794 geo-referenced sheep farms of the Campania region (southern Italy). Parasitological data come from a first cross-sectional survey carried out to study the spatial distribution of selected helminths in sheep farms. Grid sampling was performed to select the farms for coprological examinations. Faecal samples were collected from 121 sheep farms, and the presence of 21 different helminths was investigated using the FLOTAC technique. The 21 responses are very different in terms of geographical distribution and prevalence of infection. The observed prevalences range from 0.83% to 96.69%. The distributions of the posterior predictive probabilities for all 21 parasites are very heterogeneous. We show how the results of the Bayesian kriging model can be used to plan a second-wave survey. Several alternatives can be chosen depending on the purposes of the second survey: weighting by the posterior predictive probabilities, by their uncertainty, or by a combination of both. The proposed Bayesian kriging model is simple, and the proposed sampling strategy represents a useful tool for targeting infection control treatments and surveillance campaigns. It is easily extendable to other fields of research. Copyright © 2014 Elsevier B.V. All rights reserved.
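One way to operationalize the second-wave sampling alternatives mentioned above is to draw farms with inclusion weights built from the kriging output. The sketch below uses simulated predictive probabilities and uncertainties as placeholders; it is not the authors' posterior output, and the product of probability and uncertainty is only one possible way to combine the two criteria.

```python
# Minimal sketch of drawing a second-wave sample of farms with weights built from
# (simulated, placeholder) Bayesian-kriging output.
import numpy as np

rng = np.random.default_rng(1)
n_farms = 8794
pred_prob = rng.beta(2, 5, size=n_farms)        # posterior predictive infection probability (fake)
pred_sd = rng.uniform(0.05, 0.3, size=n_farms)  # posterior uncertainty (fake)

def draw_sample(weights, size=120):
    """Sample farm indices without replacement with probability proportional to `weights`."""
    p = weights / weights.sum()
    return rng.choice(n_farms, size=size, replace=False, p=p)

sample_by_risk = draw_sample(pred_prob)             # target farms most likely infected
sample_by_uncertainty = draw_sample(pred_sd)        # target farms we know least about
sample_combined = draw_sample(pred_prob * pred_sd)  # one way to combine both criteria
print(len(sample_by_risk), len(sample_by_uncertainty), len(sample_combined))
```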
ERIC Educational Resources Information Center
Puma, Michael J.; Ellis, Richard
Part of a study of program management procedures in the campus-based and Basic Educational Opportunity Grant programs reports on the design of the site visit component of the study and the results of the student survey, both in terms of the yield obtained and the quality of the data. Chapter 2 describes the design of sampling methodology employed…
NASA Technical Reports Server (NTRS)
1972-01-01
The Accident Model Document is one of three documents of the Preliminary Safety Analysis Report (PSAR) - Reactor System as applied to a Space Base Program. Potential terrestrial nuclear hazards involving the zirconium hydride reactor-Brayton power module are identified for all phases of the Space Base program. The accidents/events that give rise to the hazards are defined, and abort sequence trees are developed to determine the sequence of events leading to each hazard and the associated probabilities of occurrence. Source terms are calculated to determine the magnitude of the hazards. These data are used in the mission accident analysis to determine the most probable and significant accidents/events in each mission phase. The only significant hazards during the prelaunch and launch ascent phases of the mission are those which arise from criticality accidents. Fission product inventories during this time period were found to be very low due to very limited low-power acceptance testing.
2013-10-29
… based on contextual information, 3) develop vision-based techniques for learning of contextual information, and detection and identification of … that takes into account many possible contexts. The probability distributions of these contexts will be learned from existing databases on common sense …
Simulation-Based Model Checking for Nondeterministic Systems and Rare Events
2016-03-24
… year, we have investigated AO* search and Monte Carlo Tree Search algorithms to complement and enhance CMU's SMCMDP. … tree, so we can use it to find the probability of reachability for a property in PRISM's Probabilistic LTL. By finding the maximum probability of … savings, particularly when handling very large models. The Monte Carlo sampling process in SMCMDP can take a long time to …
Reece, Michael; Herbenick, Debby; Schick, Vanessa; Sanders, Stephanie A; Dodge, Brian; Fortenberry, J Dennis
2010-10-01
To provide a foundation for those who deliver sexual health services and programs to men in the United States, population-based data describing men's sexual behaviors and their correlates are needed. The purpose of this study was to assess, in a national probability survey of men aged 18-94 years, the occurrence and frequency of sexual behaviors and their associations with relationship status and health status. A national probability sample of 2,522 men aged 18 to 94 completed a cross-sectional survey about their sexual behaviors, relationship status, and health. Measures included relationship status; health status; experience of solo masturbation, partnered masturbation, giving oral sex, receiving oral sex, vaginal intercourse, and anal intercourse in the past 90 days; and frequency of solo masturbation, vaginal intercourse, and anal intercourse in the past year. Masturbation, oral sex, and vaginal intercourse are prevalent among men throughout most of their adult life, with both occurrence and frequency varying with age and as functions of relationship type and physical health status. Masturbation is prevalent and frequent across various stages of life and for both those with and without a relational partner, with fewer men in fair to poor health reporting recent masturbation. Patterns of giving oral sex to a female partner were similar to those for receiving oral sex. Vaginal intercourse in the past 90 days was more prevalent among men in their late 20s and 30s than in the other age groups, although it was reported by approximately 50% of men in the sixth and seventh decades of life. Anal intercourse and sexual interactions with other men were less common than all other sexual behaviors. Contemporary men in the United States engage in diverse solo and partnered sexual activities; however, sexual behavior is less common and more infrequent among older age cohorts. © 2010 International Society for Sexual Medicine.
Liu, Jing; Li, Yongping; Huang, Guohe; Fu, Haiyan; Zhang, Junlong; Cheng, Guanhui
2017-06-01
In this study, a multi-level-factorial risk-inference-based possibilistic-probabilistic programming (MRPP) method is proposed for supporting water quality management under multiple uncertainties. The MRPP method can handle uncertainties expressed as fuzzy-random-boundary intervals, probability distributions, and interval numbers, and analyze the effects of uncertainties as well as their interactions on modeling outputs. It is applied to plan water quality management in the Xiangxihe watershed. Results reveal that a lower probability of satisfying the objective function (θ) as well as a higher probability of violating environmental constraints (q_i) would correspond to a higher system benefit with an increased risk of violating system feasibility. Chemical plants are the major contributors to biological oxygen demand (BOD) and total phosphorus (TP) discharges; total nitrogen (TN) would be mainly discharged by crop farming. It is also discovered that optimistic decision makers should pay more attention to the interactions between chemical plant and water supply, while decision makers who possess a risk-averse attitude would focus on the interactive effect of q_i and benefit of water supply. The findings can help enhance the model's applicability and identify a suitable water quality management policy for environmental sustainability according to the practical situations.
Sambo, Maganga; Johnson, Paul C. D.; Hotopp, Karen; Changalucha, Joel; Cleaveland, Sarah; Kazwala, Rudovick; Lembo, Tiziana; Lugelo, Ahmed; Lushasi, Kennedy; Maziku, Mathew; Mbunda, Eberhard; Mtema, Zacharia; Sikana, Lwitiko; Townsend, Sunny E.; Hampson, Katie
2017-01-01
Rabies can be eliminated by achieving comprehensive coverage of 70% of domestic dogs during annual mass vaccination campaigns. Estimates of vaccination coverage are, therefore, required to evaluate and manage mass dog vaccination programs; however, there is no specific guidance for the most accurate and efficient methods for estimating coverage in different settings. Here, we compare post-vaccination transects, school-based surveys, and household surveys across 28 districts in southeast Tanzania and Pemba island covering rural, urban, coastal and inland settings, and a range of different livelihoods and religious backgrounds. These approaches were explored in detail in a single district in northwest Tanzania (Serengeti), where their performance was compared with a complete dog population census that also recorded dog vaccination status. Post-vaccination transects involved counting marked (vaccinated) and unmarked (unvaccinated) dogs immediately after campaigns in 2,155 villages (24,721 dogs counted). School-based surveys were administered to 8,587 primary school pupils, each representing a unique household, in 119 randomly selected schools approximately 2 months after campaigns. Household surveys were conducted in 160 randomly selected villages (4,488 households) in July/August 2011. Costs to implement these coverage assessments were $12.01, $66.12, and $155.70 per village for post-vaccination transects, school-based, and household surveys, respectively. Simulations were performed to assess the effect of sampling on the precision of coverage estimation. The sampling effort required to obtain reasonably precise estimates of coverage from household surveys is generally very high and probably prohibitively expensive for routine monitoring across large areas, particularly in communities with high human to dog ratios. School-based surveys partially overcame sampling constraints; however, they were also costly as a means of obtaining reasonably precise estimates of coverage. Post-vaccination transects provided precise and timely estimates of community-level coverage that could be used to troubleshoot the performance of campaigns across large areas. However, transects typically overestimated coverage by around 10%, which therefore needs consideration when evaluating the impacts of campaigns. We discuss the advantages and disadvantages of these different methods and make recommendations for how vaccination campaigns can be better monitored and managed at different stages of rabies control and elimination programs. PMID:28352630
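The role of simulation in judging precision can be illustrated with a deliberately crude sketch that treats a household survey as simple binomial sampling of vaccination status. The true coverage, sample sizes, and one-dog-per-household assumption are all hypothetical simplifications, not the study's simulation design.

```python
# Rough sketch: relating household-survey sample size to the precision of a
# vaccination-coverage estimate under a crude binomial sampling assumption.
import numpy as np

rng = np.random.default_rng(0)
true_coverage = 0.70      # assumed true proportion of dogs vaccinated (hypothetical)
n_replicates = 5000

for households in (50, 200, 1000):
    # simplification: one dog per surveyed household, simple random sampling
    estimates = rng.binomial(households, true_coverage, size=n_replicates) / households
    half_width = 1.96 * estimates.std()
    print(f"{households:5d} households: coverage estimate roughly ±{half_width:.3f}")
```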
Estimating rates of local species extinction, colonization and turnover in animal communities
Nichols, James D.; Boulinier, T.; Hines, J.E.; Pollock, K.H.; Sauer, J.R.
1998-01-01
Species richness has been identified as a useful state variable for conservation and management purposes. Changes in richness over time provide a basis for predicting and evaluating community responses to management, to natural disturbance, and to changes in factors such as community composition (e.g., the removal of a keystone species). Probabilistic capture-recapture models have been used recently to estimate species richness from species count and presence-absence data. These models do not require the common assumption that all species are detected in sampling efforts. We extend this approach to the development of estimators useful for studying the vital rates responsible for changes in animal communities over time; rates of local species extinction, turnover, and colonization. Our approach to estimation is based on capture-recapture models for closed animal populations that permit heterogeneity in detection probabilities among the different species in the sampled community. We have developed a computer program, COMDYN, to compute many of these estimators and associated bootstrap variances. Analyses using data from the North American Breeding Bird Survey (BBS) suggested that the estimators performed reasonably well. We recommend estimators based on probabilistic modeling for future work on community responses to management efforts as well as on basic questions about community dynamics.
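To give a flavor of richness estimation that does not assume every species is detected, the sketch below applies a first-order jackknife to a simulated species-by-occasion detection matrix. This is only one simple heterogeneity-tolerant estimator, shown for illustration; it is not necessarily the estimator set implemented in COMDYN, and the detection matrix is invented rather than BBS data.

```python
# Sketch of a closed-population richness estimator that tolerates unequal
# detectability among species: the first-order jackknife.
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical detection history: rows = species, columns = sampling occasions.
detections = rng.binomial(1, 0.3, size=(45, 10))
detections = detections[detections.sum(axis=1) > 0]   # keep species detected at least once

k = detections.shape[1]                                # number of occasions
s_obs = detections.shape[0]                            # observed species count
f1 = np.sum(detections.sum(axis=1) == 1)               # species detected on exactly one occasion

s_jack1 = s_obs + f1 * (k - 1) / k                     # first-order jackknife estimate
print(f"observed species = {s_obs}, estimated richness = {s_jack1:.1f}")
```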
The National Human Exposure Assessment Survey (NHEXAS) is a federal interagency research effort coordinated by the Environmental Protection Agency (EPA), Office of Research and Development (ORD). Phase I consists of demonstration/scoping studies using probability-based sampling d...
NASA Technical Reports Server (NTRS)
Ng, Hok K.; Grabbe, Shon; Mukherjee, Avijit
2010-01-01
The optimization of traffic flows in congested airspace with varying convective weather is a challenging problem. One approach is to generate shortest routes between origins and destinations while meeting airspace capacity constraint in the presence of uncertainties, such as weather and airspace demand. This study focuses on development of an optimal flight path search algorithm that optimizes national airspace system throughput and efficiency in the presence of uncertainties. The algorithm is based on dynamic programming and utilizes the predicted probability that an aircraft will deviate around convective weather. It is shown that the running time of the algorithm increases linearly with the total number of links between all stages. The optimal routes minimize a combination of fuel cost and expected cost of route deviation due to convective weather. They are considered as alternatives to the set of coded departure routes which are predefined by FAA to reroute pre-departure flights around weather or air traffic constraints. A formula, which calculates predicted probability of deviation from a given flight path, is also derived. The predicted probability of deviation is calculated for all path candidates. Routes with the best probability are selected as optimal. The predicted probability of deviation serves as a computable measure of reliability in pre-departure rerouting. The algorithm can also be extended to automatically adjust its design parameters to satisfy the desired level of reliability.
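The backward dynamic-programming recursion over a staged route network can be sketched in a few lines: each link's cost is its fuel cost plus its deviation probability times a deviation penalty, and the cheapest expected-cost route is recovered from the cost-to-go table. The network, costs, probabilities, and penalty below are invented, and the paper's formula for the predicted probability of deviation is not reproduced here.

```python
# Toy dynamic-programming sketch: route choice through a staged network where each
# link carries fuel cost plus an expected weather-deviation cost.
INF = float("inf")

# links[stage] maps (from_node, to_node) -> (fuel_cost, deviation_probability); all hypothetical.
links = [
    {("ORIG", "A"): (100.0, 0.05), ("ORIG", "B"): (90.0, 0.30)},
    {("A", "DEST"): (120.0, 0.10), ("B", "DEST"): (110.0, 0.40)},
]
DEVIATION_PENALTY = 200.0   # hypothetical extra cost if a reroute around weather is needed

def link_cost(fuel, p_dev):
    return fuel + p_dev * DEVIATION_PENALTY

# Backward recursion: expected cost-to-go from each node to DEST.
cost_to_go = {"DEST": 0.0}
best_next = {}
for stage in reversed(links):
    for (u, v), (fuel, p_dev) in stage.items():
        candidate = link_cost(fuel, p_dev) + cost_to_go.get(v, INF)
        if candidate < cost_to_go.get(u, INF):
            cost_to_go[u] = candidate
            best_next[u] = v

# Recover the optimal route from the origin.
route, node = ["ORIG"], "ORIG"
while node != "DEST":
    node = best_next[node]
    route.append(node)
print(" -> ".join(route), f"expected cost = {cost_to_go['ORIG']:.1f}")
```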
The persisting effect of maternal mood in pregnancy on childhood psychopathology.
O'Donnell, Kieran J; Glover, Vivette; Barker, Edward D; O'Connor, Thomas G
2014-05-01
Developmental or fetal programming has emerged as a major model for understanding the early and persisting effects of prenatal exposures on the health and development of the child and adult. We leverage the power of a 14-year prospective study to examine the persisting effects of prenatal anxiety, a key candidate in the developmental programming model, on symptoms of behavioral and emotional problems across five occasions of measurement from age 4 to 13 years. The study is based on the Avon Longitudinal Study of Parents and Children cohort, a prospective, longitudinal study of a large community sample in the west of England (n = 7,944). Potential confounders included psychosocial and obstetric risk, postnatal maternal mood, paternal pre- and postnatal mood, and parenting. Results indicated that maternal prenatal anxiety predicted persistently higher behavioral and emotional symptoms across childhood with no diminishment of effect into adolescence. Elevated prenatal anxiety (top 15%) was associated with a twofold increase in risk of a probable child mental disorder, 12.31% compared with 6.83%, after allowing for confounders. Results were similar with prenatal depression. These analyses provide some of the strongest evidence to date that prenatal maternal mood has a direct and persisting effect on her child's psychiatric symptoms and support an in utero programming hypothesis.
Janson, Lucas; Schmerling, Edward; Clark, Ashley; Pavone, Marco
2015-01-01
In this paper we present a novel probabilistic sampling-based motion planning algorithm called the Fast Marching Tree algorithm (FMT*). The algorithm is specifically aimed at solving complex motion planning problems in high-dimensional configuration spaces. This algorithm is proven to be asymptotically optimal and is shown to converge to an optimal solution faster than its state-of-the-art counterparts, chiefly PRM* and RRT*. The FMT* algorithm performs a “lazy” dynamic programming recursion on a predetermined number of probabilistically-drawn samples to grow a tree of paths, which moves steadily outward in cost-to-arrive space. As such, this algorithm combines features of both single-query algorithms (chiefly RRT) and multiple-query algorithms (chiefly PRM), and is reminiscent of the Fast Marching Method for the solution of Eikonal equations. As a departure from previous analysis approaches that are based on the notion of almost sure convergence, the FMT* algorithm is analyzed under the notion of convergence in probability: the extra mathematical flexibility of this approach allows for convergence rate bounds—the first in the field of optimal sampling-based motion planning. Specifically, for a certain selection of tuning parameters and configuration spaces, we obtain a convergence rate bound of order O(n^(−1/d+ρ)), where n is the number of sampled points, d is the dimension of the configuration space, and ρ is an arbitrarily small constant. We go on to demonstrate asymptotic optimality for a number of variations on FMT*, namely when the configuration space is sampled non-uniformly, when the cost is not arc length, and when connections are made based on the number of nearest neighbors instead of a fixed connection radius. Numerical experiments over a range of dimensions and obstacle configurations confirm our theoretical and heuristic arguments by showing that FMT*, for a given execution time, returns substantially better solutions than either PRM* or RRT*, especially in high-dimensional configuration spaces and in scenarios where collision-checking is expensive. PMID:27003958
Importance Sampling in the Evaluation and Optimization of Buffered Failure Probability
2015-07-01
12th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP12), Vancouver, Canada, July 12-15, 2015. Importance Sampling in the Evaluation and Optimization of Buffered Failure Probability. Marwan M. Harajli, Graduate Student, Dept. of Civil and Environ… … criterion is usually the failure probability. In this paper, we examine the buffered failure probability as an attractive alternative to the failure …
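Independently of the buffered variant discussed in the report, the basic importance-sampling idea for rare failure events can be sketched as follows: sample from a proposal shifted toward the failure region and reweight by the density ratio. The limit-state function, threshold, and proposal below are generic illustrations, not the report's example.

```python
# Generic sketch: estimating a small failure probability P(g(X) < 0) by importance
# sampling with a proposal shifted toward the failure region.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 50_000

def limit_state(x):
    return 4.5 - x            # "failure" when a standard-normal load exceeds 4.5

# Crude Monte Carlo (standard normal) versus importance sampling (normal shifted to 4.5).
x_mc = rng.standard_normal(n)
p_mc = np.mean(limit_state(x_mc) < 0)

x_is = rng.normal(loc=4.5, scale=1.0, size=n)
weights = stats.norm.pdf(x_is) / stats.norm.pdf(x_is, loc=4.5, scale=1.0)
p_is = np.mean((limit_state(x_is) < 0) * weights)

print(f"crude MC estimate: {p_mc:.2e}   importance sampling: {p_is:.2e} "
      f"(exact: {stats.norm.sf(4.5):.2e})")
```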
Gjinovci, Bahri; Idrizovic, Kemal; Uljevic, Ognjen; Sekulic, Damir
2017-01-01
There is an evident lack of studies on the effectiveness of plyometric- and skill-based conditioning in volleyball. This study aimed to evaluate the effects of 12 weeks of plyometric- and volleyball-skill-based training on specific conditioning abilities in female volleyball players. The sample included 41 high-level female volleyball players (21.8 ± 2.1 years of age; 1.76 ± 0.06 m; 60.8 ± 7.0 kg), who participated in a plyometric (n = 21) or skill-based conditioning program (n = 20). Both programs were performed twice per week. Participants were tested on body height, body mass (BM), countermovement jump (CMJ), standing broad jump (SBJ), medicine ball throw (MBT), and 20-m sprint (S20M). All tests were assessed at the study baseline (pre-) and at the end of the 12-week programs (post-testing). Two-way ANOVA for repeated measurements showed significant (p<0.05) “Group x Time” effects for all variables but body height. The plyometric group significantly reduced body mass (trivial effect size [ES] difference; 1% average pre- to post-measurement change) and improved performance in the S20M (moderate ES; 8%), MBT (very large ES; 25%), CMJ (large ES; 27%), and SBJ (moderate ES; 8%). Players involved in skill-based conditioning significantly improved the CMJ (large ES; 18%), SBJ (small ES; 3%), and MBT (large ES; 9%). The changes that occurred between pre- and post-testing were more inter-correlated in the plyometric group. Although both training modalities induced positive changes in jumping and throwing capacities, plyometric training was found to be more effective than skill-based conditioning in improving the conditioning capacities of female senior volleyball players. Future studies should evaluate differential program effects in less experienced and younger players. Key points Plyometric- and skill-based conditioning resulted in improvements in jumping and throwing capacities, but plyometric training additionally induced positive changes in anthropometrics and sprint capacity. The changes induced by plyometric training were larger in magnitude than those achieved by skill-based conditioning. The higher intensity, together with the possibility of more accurate adjustment of training load in plyometric training, is probably the most important determinant of this differential influence. It is likely that the skill-based conditioning program did not result in changes of higher magnitude because of the players’ familiarity with volleyball-related skills. PMID:29238253
Probability-Based Inference in Cognitive Diagnosis
1994-02-01
… of variables in the student model. In Siegler's study, this corresponds to determining how a child with a given set of strategies at her disposal … programs are commercially available to carry out the number-crunching aspect. We used Andersen, Jensen, Olesen, and Jensen's (1989) HUGIN program and Noetic … studying how they are typically acquired (e.g., in mechanics, Clement, 1982; in ratio and proportional reasoning, Karplus, Pulos, & Stage, 1983), and …
Pearson, Kristen Nicole; Kendall, William L.; Winkelman, Dana L.; Persons, William R.
2016-01-01
A key component of many monitoring programs for special status species involves capture and handling of individuals as part of capture-recapture efforts for tracking population health and demography. Minimizing negative impacts from sampling, such as through reduced handling, helps prevent negative impacts on species from monitoring efforts. Using simulation analyses, we found that long-term population monitoring techniques requiring physical capture (i.e., hoop-net sampling) can be reduced and supplemented with passive detections (i.e., PIT tag antenna array detections) without negatively affecting estimates of adult humpback chub (HBC; Gila cypha) survival (S) and skipped spawning probabilities (γ′ = probability that a spawner transitions to a skipped spawner, γ″ = probability that a skipped spawner remains a skipped spawner). Based on our estimate of the array’s in situ detection efficiency (0.42), the estimability of such demographic parameters would improve over hoop-netting alone. In addition, the array provides insight into HBC population dynamics and movement patterns outside of traditional sampling periods. However, given the current timing of sampling efforts, spawner abundance estimates were negatively biased when hoop-netting was reduced, suggesting not all spawning HBC are present during the current sampling events. Despite this, our findings demonstrate that PIT tag antenna arrays, even with moderate potential detectability, may allow for reduced handling of special status species while also offering potentially more efficient monitoring strategies, especially if ideal timing of sampling can be determined.
Henne, Melinda B; Stegmann, Barbara J; Neithardt, Adrienne B; Catherino, William H; Armstrong, Alicia Y; Kao, Tzu-Cheg; Segars, James H
2008-01-01
To predict the cost of a delivery following assisted reproductive technologies (ART). Cost analysis based on retrospective chart analysis. University-based ART program. Women aged ≥26 and
De Boni, Raquel; do Nascimento Silva, Pedro Luis; Bastos, Francisco Inácio; Pechansky, Flavio; de Vasconcellos, Mauricio Teixeira Leite
2012-01-01
Drinking alcoholic beverages in places such as bars and clubs may be associated with harmful consequences such as violence and impaired driving. However, methods for obtaining probabilistic samples of drivers who drink at these places remain a challenge – since there is no a priori information on this mobile population – and must be continually improved. This paper describes the procedures adopted in the selection of a population-based sample of drivers who drank at alcohol selling outlets in Porto Alegre, Brazil, which we used to estimate the prevalence of intention to drive under the influence of alcohol. The sampling strategy comprises a stratified three-stage cluster sampling: 1) census enumeration areas (CEA) were stratified by alcohol outlets (AO) density and sampled with probability proportional to the number of AOs in each CEA; 2) combinations of outlets and shifts (COS) were stratified by prevalence of alcohol-related traffic crashes and sampled with probability proportional to their squared duration in hours; and, 3) drivers who drank at the selected COS were stratified by their intention to drive and sampled using inverse sampling. Sample weights were calibrated using a post-stratification estimator. 3,118 individuals were approached and 683 drivers interviewed, leading to an estimate that 56.3% (SE = 3,5%) of the drivers intended to drive after drinking in less than one hour after the interview. Prevalence was also estimated by sex and broad age groups. The combined use of stratification and inverse sampling enabled a good trade-off between resource and time allocation, while preserving the ability to generalize the findings. The current strategy can be viewed as a step forward in the efforts to improve surveys and estimation for hard-to-reach, mobile populations. PMID:22514620
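The first-stage selection with probability proportional to outlet counts can be sketched as follows. The frame is simulated rather than the Porto Alegre census, and the without-replacement draw is only an approximation to a strict πps design, but it shows how the base design weights that are later calibrated arise.

```python
# Sketch of first-stage PPS selection of census enumeration areas (CEAs) by
# alcohol-outlet counts, with approximate inclusion probabilities and base weights.
import numpy as np

rng = np.random.default_rng(3)
n_ceas = 2000
outlets = rng.poisson(3, size=n_ceas) + 1               # alcohol outlets per CEA (hypothetical)

n_selected = 40
selection_prob = n_selected * outlets / outlets.sum()   # approximate PPS inclusion probability
selected = rng.choice(n_ceas, size=n_selected, replace=False,
                      p=outlets / outlets.sum())        # approximation to a strict PPS draw
design_weights = 1.0 / selection_prob[selected]         # base weights, to be calibrated later

print(f"selected {len(selected)} CEAs; weight range "
      f"{design_weights.min():.1f}-{design_weights.max():.1f}")
```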
Garcia-Saenz, A; Napp, S; Lopez, S; Casal, J; Allepuz, A
2015-10-01
The achievement of the Officially Tuberculosis Free (OTF) status in regions with low bovine Tuberculosis (bTB) herd prevalence, as is the case of North-Eastern Spain (Catalonia), might be a likely option in the medium term. In this context, risk-based approaches could be an alternative surveillance strategy to the costly current strategy. However, before any change in the system may be contemplated, a reliable estimate of the sensitivity of the different surveillance components is needed. In this study, we focused on the slaughterhouse component. The probability of detection of bTB-infected cattle by the slaughterhouses in Catalonia was estimated as the product of three consecutive probabilities: (P1) the probability that a bTB-infected animal arrived at the slaughterhouse presenting Macroscopically Detectable Lesions (MDL); (P2) the probability that MDL were detected by the routine meat inspection process; and (P3) the probability that the veterinary officer suspected bTB and sent the sample for laboratory confirmation. The first probability was obtained from data collected through the bTB eradication program carried out in Catalonia between 2005 and 2008, while the last two were obtained through the expert opinion of the veterinary officers working at the slaughterhouses, who completed a questionnaire administered during 2014. The bTB surveillance sensitivity of the different cattle slaughterhouses in Catalonia obtained in this study was 31.4% (CI 95%: 28.6-36.2), and there were important differences among them. The low bTB surveillance sensitivity was mainly related to the low probability that a bTB-infected animal arrived at the slaughterhouse presenting MDL (around 44.8%). The variability of the sensitivity among the different slaughterhouses could be explained by significant associations between some variables included in the survey and P2. For instance, factors like attendance at training courses, the number of meat technicians, and the speed of the slaughter chain were significantly related to the probability that an MDL was detected by the meat inspection procedure carried out in the slaughterhouse. Technical and policy efforts should be focused on the improvement of these factors in order to maximize slaughterhouse sensitivity. Copyright © 2015 Elsevier B.V. All rights reserved.
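The overall surveillance sensitivity described above is simply the product P1 × P2 × P3. The sketch below propagates uncertainty through that product with a crude Monte Carlo, using beta distributions whose parameters are illustrative stand-ins; they are not fitted to the eradication-program data or the expert-opinion survey.

```python
# Minimal sketch: slaughterhouse surveillance sensitivity as the product of three
# component probabilities, with a crude Monte Carlo uncertainty band.
import numpy as np

rng = np.random.default_rng(11)
n_draws = 100_000

p1 = rng.beta(45, 55, n_draws)   # P(bTB-infected animal arrives with detectable lesions) - illustrative
p2 = rng.beta(80, 20, n_draws)   # P(lesions found at routine meat inspection) - illustrative
p3 = rng.beta(85, 15, n_draws)   # P(inspector suspects bTB and submits the sample) - illustrative

sensitivity = p1 * p2 * p3
lo, mid, hi = np.percentile(sensitivity, [2.5, 50, 97.5])
print(f"surveillance sensitivity ≈ {mid:.1%} (95% interval {lo:.1%}-{hi:.1%})")
```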
Van den Brom, R; Santman-Berends, I; Luttikholt, S; Moll, L; Van Engelen, E; Vellema, P
2015-06-01
In the period from 2005 to 2009, Coxiella burnetii was a cause of abortion waves at 28 dairy goat farms and 2 dairy sheep farms in the Netherlands. Two years after the first abortion waves, a large human Q fever outbreak started mainly in the same region, and aborting small ruminants were regarded as most probable source. To distinguish between infected and noninfected herds, a surveillance program started in October 2009, based on PCR testing of bulk tank milk (BTM) samples, which had never been described before. The aim of this study was to analyze the effectiveness of this surveillance program and to evaluate both the effect of culling of pregnant dairy goats on positive farms and of vaccination on BTM results. Bulk tank milk samples were tested for C. burnetii DNA using a real-time PCR, and results were analyzed in relation to vaccination, culling, and notifiable (officially reported to government) C. burnetii abortion records. In spring and autumn, BTM samples were also tested for antibodies using an ELISA, and results were evaluated in relation to the compulsory vaccination campaign. Between October 2009 and April 2014, 1,660 (5.6%) out of 29,875 BTM samples from 401 dairy goat farms tested positive for C. burnetii DNA. The percentage of positive samples dropped from 20.5% in 2009 to 0.3% in 2014. In a multivariable model, significantly higher odds of being PCR positive in the BTM surveillance program were found in farms of which all pregnant dairy goats were culled. Additionally, the risk for C. burnetii BTM PCR positivity significantly decreased after multiple vaccinations. Bulk tank milk ELISA results were significantly higher after vaccination than before. The ELISA results were higher after multiple vaccinations compared with a single vaccination, and ELISA results on officially declared infected farms were significantly higher compared with noninfected farms. In conclusion, BTM surveillance is an effective and useful tool to detect C. burnetii shedding dairy goat herds and to monitor a Q fever outbreak, and thus the effect of implemented measures. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Violence and PTSD in Mexico: gender and regional differences.
Baker, Charlene K; Norris, Fran H; Diaz, Dayna M V; Perilla, Julia L; Murphy, Arthur D; Hill, Elizabeth G
2005-07-01
We examined the lifetime prevalence of violence in Mexico and how different characteristics of the violent event affect the probability of meeting criteria for lifetime post-traumatic stress disorder (PTSD). We interviewed a probability sample of 2,509 adults from 4 cities in Mexico (Oaxaca, Guadalajara, Hermosillo, Mérida) using the Composite International Diagnostic Interview (CIDI). Lifetime prevalence of violence was 34%. Men reported more single-experience, recurrent, physical, adolescent, adulthood, and stranger violence; women more sexual, childhood, family, and intimate partner violence. Prevalence was generally higher in Guadalajara, though the impact was greater in Oaxaca compared to other cities. Of those exposed, 11.5% met DSM-IV criteria for PTSD. Probabilities were highest after sexual and intimate partner violence, higher for women than men, and higher in Oaxaca than other cities. It is important to consider the characteristics and the context of violence in order to develop effective prevention and intervention programs to reduce the exposure to and impact of violence.
Using NetCloak to develop server-side Web-based experiments without writing CGI programs.
Wolfe, Christopher R; Reyna, Valerie F
2002-05-01
Server-side experiments use the Web server, rather than the participant's browser, to handle tasks such as random assignment, eliminating inconsistencies with JAVA and other client-side applications. Heretofore, experimenters wishing to create server-side experiments have had to write programs to create common gateway interface (CGI) scripts in programming languages such as Perl and C++. NetCloak uses simple, HTML-like commands to create CGIs. We used NetCloak to implement an experiment on probability estimation. Measurements of time on task and participants' IP addresses assisted quality control. Without prior training, in less than 1 month, we were able to use NetCloak to design and create a Web-based experiment and to help graduate students create three Web-based experiments of their own.
Time as a dimension of the sample design in national-scale forest inventories
Francis Roesch; Paul Van Deusen
2013-01-01
Historically, the goal of forest inventories has been to determine the extent of the timber resource. Predictions of how the resource was changing were made by comparing differences between successive inventories. The general view of the associated sample design was with selection probabilities based on land area observed at a discrete point in time. Time was not...
(I Can’t Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research
2017-01-01
I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: “random chance,” which is based on probability sampling, “minimal information,” which yields at least one new code per sampling step, and “maximum information,” which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario. PMID:28746358
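The "random chance" scenario can be reproduced in miniature: simulate information sources that each carry a random subset of codes and count how many sources must be sampled before every code has appeared at least once. The number of codes and their occurrence probabilities below are arbitrary choices, not the paper's simulation settings.

```python
# Compact sketch of the "random chance" scenario: sample sources at random until
# every code in a hypothetical population has been observed at least once.
import numpy as np

rng = np.random.default_rng(5)
n_codes = 30
code_prob = rng.uniform(0.05, 0.6, size=n_codes)   # probability each source holds each code

def sample_size_to_saturation():
    seen = np.zeros(n_codes, dtype=bool)
    n_sources = 0
    while not seen.all():
        source = rng.random(n_codes) < code_prob    # codes this source happens to mention
        seen |= source
        n_sources += 1
    return n_sources

sizes = [sample_size_to_saturation() for _ in range(2000)]
print(f"median sample size to saturation: {np.median(sizes):.0f} "
      f"(90th percentile: {np.percentile(sizes, 90):.0f})")
```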
DNA Identification of Skeletal Remains from World War II Mass Graves Uncovered in Slovenia
Marjanović, Damir; Durmić-Pašić, Adaleta; Bakal, Narcisa; Haverić, Sanin; Kalamujić, Belma; Kovačević, Lejla; Ramić, Jasmin; Pojskić, Naris; Škaro, Vedrana; Projić, Petar; Bajrović, Kasim; Hadžiselimović, Rifat; Drobnič, Katja; Huffine, Ed; Davoren, Jon; Primorac, Dragan
2007-01-01
Aim To present the joint effort of three institutions in the identification of human remains from the World War II found in two mass graves in the area of Škofja Loka, Slovenia. Methods The remains of 27 individuals were found in two small and closely located mass graves. The DNA was isolated from bone and teeth samples using either standard phenol/chloroform alcohol extraction or optimized Qiagen DNA extraction procedure. Some recovered samples required the employment of additional DNA purification methods, such as N-buthanol treatment. QuantifilerTM Human DNA Quantification Kit was used for DNA quantification. PowerPlex 16 kit was used to simultaneously amplify 15 short tandem repeat (STR) loci. Matching probabilities were estimated using the DNA View program. Results Out of all processed samples, 15 remains were fully profiled at all 15 STR loci. The other 12 profiles were partial. The least successful profile included 13 loci. Also, 69 referent samples (buccal swabs) from potential living relatives were collected and profiled. Comparison of victims' profile against referent samples database resulted in 4 strong matches. In addition, 5 other profiles were matched to certain referent samples with lower probability. Conclusion Our results show that more than 6 decades after the end of the World War II, DNA analysis may significantly contribute to the identification of the remains from that period. Additional analysis of Y-STRs and mitochondrial DNA (mtDNA) markers will be performed in the second phase of the identification project. PMID:17696306
The RBANS Effort Index: base rates in geriatric samples.
Duff, Kevin; Spering, Cynthia C; O'Bryant, Sid E; Beglinger, Leigh J; Moser, David J; Bayless, John D; Culp, Kennith R; Mold, James W; Adams, Russell L; Scott, James G
2011-01-01
The Effort Index (EI) of the RBANS was developed to assist clinicians in discriminating patients who demonstrate good effort from those with poor effort. However, there are concerns that older adults might be unfairly penalized by this index, which uses uncorrected raw scores. Using five independent samples of geriatric patients with a broad range of cognitive functioning (e.g., cognitively intact, nursing home residents, probable Alzheimer's disease), base rates of failure on the EI were calculated. In cognitively intact and mildly impaired samples, few older individuals were classified as demonstrating poor effort (e.g., 3% in cognitively intact). However, in the more severely impaired geriatric patients, over one third had EI scores that fell above suggested cutoff scores (e.g., 37% in nursing home residents, 33% in probable Alzheimer's disease). In the cognitively intact sample, older and less educated patients were more likely to have scores suggestive of poor effort. Education effects were observed in three of the four clinical samples. Overall cognitive functioning was significantly correlated with EI scores, with poorer cognition being associated with greater suspicion of low effort. The current results suggest that age, education, and level of cognitive functioning should be taken into consideration when interpreting EI results and that significant caution is warranted when examining EI scores in elders suspected of having dementia.
Langtimm, C.A.; O'Shea, T.J.; Pradel, R.; Beck, C.A.
1998-01-01
The population dynamics of large, long-lived mammals are particularly sensitive to changes in adult survival. Understanding factors affecting survival patterns is therefore critical for developing and testing theories of population dynamics and for developing management strategies aimed at preventing declines or extinction in such taxa. Few studies have used modern analytical approaches for analyzing variation and testing hypotheses about survival probabilities in large mammals. This paper reports a detailed analysis of annual adult survival in the Florida manatee (Trichechus manatus latirostris), an endangered marine mammal, based on a mark-recapture approach. Natural and boat-inflicted scars distinctively 'marked' individual manatees that were cataloged in a computer-based photographic system. Photo-documented resightings provided 'recaptures.' Using open population models, annual adult-survival probabilities were estimated for manatees observed in winter in three areas of Florida: Blue Spring, Crystal River, and the Atlantic coast. After using goodness-of-fit tests in Program RELEASE to search for violations of the assumptions of mark-recapture analysis, survival and sighting probabilities were modeled under several different biological hypotheses with Program SURGE. Estimates of mean annual probability of sighting varied from 0.948 for Blue Spring to 0.737 for Crystal River and 0.507 for the Atlantic coast. At Crystal River and Blue Spring, annual survival probabilities were best estimated as constant over the study period at 0.96 (95% CI = 0.951-0.975 and 0.900-0.985, respectively). On the Atlantic coast, where manatees are impacted more by human activities, annual survival probabilities had a significantly lower mean estimate of 0.91 (95% CI = 0.887-0.926) and varied unpredictably over the study period. For each study area, survival did not differ between sexes and was independent of relative adult age. The high, constant adult-survival probabilities estimated for manatees in the Blue Spring and Crystal River areas were consistent with current mammalian life history theory and other empirical data available for large, long-lived mammals. Adult survival probabilities in these areas appeared high enough to maintain growing populations if other traits such as reproductive rates and juvenile survival were also sufficiently high; the lower and more variable survival rates on the Atlantic coast are cause for concern.
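A minimal constant-parameter Cormack-Jolly-Seber likelihood conveys how survival and sighting probabilities are separated in this kind of analysis. The sketch below is a deliberate simplification with invented photo-ID histories; it is not the Program RELEASE/SURGE model set used in the study.

```python
# Minimal constant-parameter Cormack-Jolly-Seber sketch: maximum likelihood estimates
# of annual survival (phi) and sighting probability (p) from "capture" histories.
import numpy as np
from scipy.optimize import minimize

histories = ["10101", "11010", "10001"]   # 1 = photo-documented that winter (hypothetical)

def neg_log_lik(params):
    phi, p = np.clip(1 / (1 + np.exp(-params)), 1e-9, 1 - 1e-9)  # logit scale keeps (0, 1)
    total = 0.0
    for h in histories:
        h = [int(c) for c in h]
        first, last = h.index(1), len(h) - 1 - h[::-1].index(1)
        # probability of the observed sighting pattern between first and last detection
        for t in range(first + 1, last + 1):
            total += np.log(phi) + np.log(p if h[t] else 1 - p)
        # probability of never being seen after the last detection (chi term)
        chi = 1.0
        for _ in range(len(h) - 1 - last):
            chi = (1 - phi) + phi * (1 - p) * chi
        total += np.log(chi)
    return -total

fit = minimize(neg_log_lik, x0=[0.0, 0.0], method="Nelder-Mead")
phi_hat, p_hat = 1 / (1 + np.exp(-fit.x))
print(f"estimated survival = {phi_hat:.2f}, sighting probability = {p_hat:.2f}")
```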
USE OF NATURAL WATERS AS U. S. GEOLOGICAL SURVEY REFERENCE SAMPLES.
Janzer, Victor J.
1985-01-01
The U. S. Geological Survey conducts research and collects hydrologic data relating to the Nation's water resources. Seven types of natural matrix reference water samples are prepared for use in the Survey's quality assurance program. These include samples containing major constituents, trace metals, nutrients, herbicides, insecticides, trace metals in a water and suspended-sediment mixture, and precipitation (snowmelt). To prepare these reference samples, natural water is collected in plastic drums and the sediment is allowed to settle. The water is then filtered, selected constituents are added, and if necessary the water is acidified and sterilized by ultraviolet irradiation before bottling in plastic or glass. These reference samples are distributed twice yearly to more than 100 laboratories for chemical analysis. The most probable values for each constituent are determined by evaluating the data submitted by the laboratories using statistical techniques recommended by ASTM.
Willems, Sjw; Schat, A; van Noorden, M S; Fiocco, M
2018-02-01
Censored data make survival analysis more complicated because exact event times are not observed. Statistical methodology developed to account for censored observations assumes that patients' withdrawal from a study is independent of the event of interest. However, in practice, some covariates might be associated to both lifetime and censoring mechanism, inducing dependent censoring. In this case, standard survival techniques, like Kaplan-Meier estimator, give biased results. The inverse probability censoring weighted estimator was developed to correct for bias due to dependent censoring. In this article, we explore the use of inverse probability censoring weighting methodology and describe why it is effective in removing the bias. Since implementing this method is highly time consuming and requires programming and mathematical skills, we propose a user friendly algorithm in R. Applications to a toy example and to a medical data set illustrate how the algorithm works. A simulation study was carried out to investigate the performance of the inverse probability censoring weighted estimators in situations where dependent censoring is present in the data. In the simulation process, different sample sizes, strengths of the censoring model, and percentages of censored individuals were chosen. Results show that in each scenario inverse probability censoring weighting reduces the bias induced in the traditional Kaplan-Meier approach where dependent censoring is ignored.
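The core of the IPCW correction can be sketched without any survival library: estimate the censoring survival function G, then up-weight each observed event by 1/G(t−). The marginal (covariate-free) version below only illustrates the weighting idea; the article's R algorithm addresses covariate-dependent censoring, in which case G would come from a model of censoring given covariates. The simulated data are placeholders.

```python
# Small numpy sketch of inverse-probability-of-censoring weighting (IPCW).
import numpy as np

rng = np.random.default_rng(2)
n = 500
event_time = rng.exponential(10.0, n)
censor_time = rng.exponential(15.0, n)
time = np.minimum(event_time, censor_time)
event = (event_time <= censor_time).astype(int)     # 1 = event observed, 0 = censored

def km_survival(times, indicator):
    """Kaplan-Meier survival curve treating `indicator` == 1 as the event."""
    order = np.argsort(times)
    t, d = times[order], indicator[order]
    at_risk = np.arange(len(t), 0, -1)
    steps = np.where(d == 1, 1.0 - 1.0 / at_risk, 1.0)
    return t, np.cumprod(steps)

# Censoring distribution G(t): treat censorings as the "events".
tc, G = km_survival(time, 1 - event)

def G_minus(t):
    """G evaluated just before t (left-continuous lookup)."""
    idx = np.searchsorted(tc, t, side="left") - 1
    return G[idx] if idx >= 0 else 1.0

# IPCW estimate of P(T <= t): weighted proportion of observed events by time t.
def ipcw_cdf(t):
    w = np.array([event[i] / max(G_minus(time[i]), 1e-8) for i in range(n)])
    return np.mean(w * (time <= t))

print(f"IPCW estimate of P(event by t=10): {ipcw_cdf(10.0):.3f} "
      f"(true value {1 - np.exp(-1.0):.3f})")
```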
The Power of Probability: Poster/Teaching Guide for Grades 6-8. Expect the Unexpected with Math[R]
ERIC Educational Resources Information Center
Actuarial Foundation, 2013
2013-01-01
"The Power of Probability" is a new math program aligned with the National Council of Teachers of Mathematics (NCTM) and Common Core State Standards, which gives students opportunities to practice their skills and knowledge of the mathematics of probability. Developed by The Actuarial Foundation, the program's lessons and worksheets motivate…
Chao, Li-Wei; Szrek, Helena; Peltzer, Karl; Ramlagan, Shandir; Fleming, Peter; Leite, Rui; Magerman, Jesswill; Ngwenya, Godfrey B.; Pereira, Nuno Sousa; Behrman, Jere
2011-01-01
Finding an efficient method for sampling micro- and small-enterprises (MSEs) for research and statistical reporting purposes is a challenge in developing countries, where registries of MSEs are often nonexistent or outdated. This lack of a sampling frame creates an obstacle in finding a representative sample of MSEs. This study uses computer simulations to draw samples from a census of businesses and non-businesses in the Tshwane Municipality of South Africa, using three different sampling methods: the traditional probability sampling method, the compact segment sampling method, and the World Health Organization’s Expanded Programme on Immunization (EPI) sampling method. Three mechanisms by which the methods could differ are tested, the proximity selection of respondents, the at-home selection of respondents, and the use of inaccurate probability weights. The results highlight the importance of revisits and accurate probability weights, but the lesser effect of proximity selection on the samples’ statistical properties. PMID:22582004
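One of the mechanisms tested above, the use of inaccurate (or ignored) probability weights, can be illustrated with a toy simulation in which visible enterprises are over-sampled. The population, revenue distributions, and inclusion probabilities below are fabricated for illustration and are unrelated to the Tshwane census used in the study.

```python
# Toy illustration: how ignoring the sampling design's inclusion probabilities
# distorts an estimate when visible enterprises are over-sampled.
import numpy as np

rng = np.random.default_rng(8)

# Two strata of MSEs: many small home-based businesses and a few larger visible ones.
revenue = np.concatenate([rng.gamma(2.0, 500.0, 9000),    # small enterprises (hypothetical)
                          rng.gamma(2.0, 5000.0, 1000)])  # larger enterprises (hypothetical)
visible = np.concatenate([np.zeros(9000, dtype=bool), np.ones(1000, dtype=bool)])

# A survey that over-samples visible businesses (higher inclusion probability).
incl_prob = np.where(visible, 0.30, 0.02)
sampled = rng.random(revenue.size) < incl_prob

naive_mean = revenue[sampled].mean()                        # ignores the design
weighted_mean = np.average(revenue[sampled], weights=1.0 / incl_prob[sampled])
print(f"true mean {revenue.mean():.0f}, unweighted {naive_mean:.0f}, "
      f"design-weighted {weighted_mean:.0f}")
```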