Sample records for binomial probability distribution

  1. Zero-truncated negative binomial - Erlang distribution

    NASA Astrophysics Data System (ADS)

    Bodhisuwan, Winai; Pudprommarat, Chookait; Bodhisuwan, Rujira; Saothayanun, Luckhana

    2017-11-01

    The zero-truncated negative binomial-Erlang distribution is introduced. It is developed from the negative binomial-Erlang distribution. In this work, the probability mass function is derived and some properties are included. The parameters of the zero-truncated negative binomial-Erlang distribution are estimated by maximum likelihood estimation. Finally, the proposed distribution is applied to real data on methamphetamine counts from Bangkok, Thailand. Based on the results, the zero-truncated negative binomial-Erlang distribution provides a better fit than the zero-truncated Poisson, zero-truncated negative binomial, zero-truncated generalized negative binomial and zero-truncated Poisson-Lindley distributions for these data.
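
    Zero truncation itself is mechanical: the base pmf is renormalized over the positive support. Below is a minimal sketch, using an ordinary negative binomial as a stand-in base distribution (an assumption of ours; the paper's base is the negative binomial-Erlang mixture, whose pmf is not reproduced here).

    ```python
    # Zero-truncation of a discrete distribution: P(X = x | X > 0).
    # The negative binomial base here is a stand-in, not the paper's model.
    from scipy import stats

    def zero_truncated_pmf(base_pmf, x):
        """P(X = x)/(1 - P(X = 0)) for x >= 1, and 0 otherwise."""
        if x < 1:
            return 0.0
        return base_pmf(x) / (1.0 - base_pmf(0))

    base = stats.nbinom(5, 0.6)   # size=5, prob=0.6 (arbitrary example values)
    print(zero_truncated_pmf(base.pmf, 1))
    print(sum(zero_truncated_pmf(base.pmf, x) for x in range(1, 200)))  # ~1.0
    ```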

  2. Use of the negative binomial-truncated Poisson distribution in thunderstorm prediction

    NASA Technical Reports Server (NTRS)

    Cohen, A. C.

    1971-01-01

    A probability model is presented for the distribution of thunderstorms over a small area given that thunderstorm events (1 or more thunderstorms) are occurring over a larger area. The model incorporates the negative binomial and truncated Poisson distributions. Probability tables for Cape Kennedy for spring, summer, and fall months and seasons are presented. The computer program used to compute these probabilities is appended.

  3. Sampling--how big a sample?

    PubMed

    Aitken, C G

    1999-07-01

    It is thought that, in a consignment of discrete units, a certain proportion of the units contain illegal material. A sample of the consignment is to be inspected. Various methods for the determination of the sample size are compared. The consignment will be considered as a random sample from some super-population of units, a certain proportion of which contain drugs. For large consignments, a probability distribution, known as the beta distribution, for the proportion of the consignment which contains illegal material is obtained. This distribution is based on prior beliefs about the proportion. Under certain specific conditions the beta distribution gives the same numerical results as an approach based on the binomial distribution. The binomial distribution provides a probability for the number of units in a sample which contain illegal material, conditional on knowing the proportion of the consignment which contains illegal material. This is in contrast to the beta distribution which provides probabilities for the proportion of a consignment which contains illegal material, conditional on knowing the number of units in the sample which contain illegal material. The interpretation when the beta distribution is used is much more intuitively satisfactory. It is also much more flexible in its ability to cater for prior beliefs which may vary given the different circumstances of different crimes. For small consignments, a distribution, known as the beta-binomial distribution, for the number of units in the consignment which are found to contain illegal material, is obtained, based on prior beliefs about the number of units in the consignment which are thought to contain illegal material. As with the beta and binomial distributions for large samples, it is shown that, in certain specific conditions, the beta-binomial and hypergeometric distributions give the same numerical results. However, the beta-binomial distribution, as with the beta distribution, has a more intuitively satisfactory interpretation and greater flexibility. The beta and the beta-binomial distributions provide methods for the determination of the minimum sample size to be taken from a consignment in order to satisfy a certain criterion. The criterion requires the specification of a proportion and a probability.
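
    Under a uniform Beta(1, 1) prior, the criterion described above (specify a proportion and a probability) reduces to a short posterior computation. The sketch below assumes the common scenario in which every inspected unit turns out to contain drugs; the numbers are illustrative, not the paper's tables.

    ```python
    # Minimum sample size under a Beta(1, 1) prior, assuming all n inspected
    # units are positive, so the posterior for the proportion is Beta(n+1, 1).
    from scipy import stats

    def min_sample_size(theta, prob, max_n=10_000):
        """Smallest n with P(proportion > theta) >= prob."""
        for n in range(1, max_n):
            if stats.beta.sf(theta, n + 1, 1) >= prob:
                return n
        raise ValueError("criterion not met within max_n")

    # e.g. be 95% sure that more than half the consignment contains drugs:
    print(min_sample_size(theta=0.5, prob=0.95))   # -> 4, since 1 - 0.5**5 > 0.95
    ```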

  4. C-5A Cargo Deck Low-Frequency Vibration Environment

    DTIC Science & Technology

    1975-02-01

    [OCR fragments recovered from the report:] contents entries for Sample Vibration Calculations (1. Normal Distribution; 2. Binomial Distribution), Conclusions, and References; a calculation for the binomial distribution (vertical acceleration, right rear cargo deck); and, from the introduction, the observation that acceleration peaks recorded through the end of taxi could be used directly to compile the probability of occurrence of specific values of acceleration using the binomial distribution.

  5. Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach

    PubMed Central

    Chen, Yong; Hong, Chuan; Ning, Yang; Su, Xiao

    2018-01-01

    When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity should be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach, and has several attractive features compared to the existing models such as bivariate generalized linear mixed model (Chu and Cole, 2006) and Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having closed-form expression of likelihood function, and no constraints on the correlation parameter. More importantly, since the marginal beta-binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration. PMID:26303591

  6. Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach.

    PubMed

    Chen, Yong; Hong, Chuan; Ning, Yang; Su, Xiao

    2016-01-15

    When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity should be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with the existing models such as bivariate generalized linear mixed model (Chu and Cole, 2006) and Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having closed-form expression of likelihood function, and no constraints on the correlation parameter. More importantly, because the marginal beta-binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration. Copyright © 2015 John Wiley & Sons, Ltd.

  7. The Difference Calculus and the Negative Binomial Distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowman, Kimiko o; Shenton, LR

    2007-01-01

    In a previous paper we state the dominant term in the third central moment of the maximum likelihood estimator k̂ of the parameter k in the negative binomial probability function, where the probability generating function is (p + 1 − pt)^(−k). A partial sum of the series Σ 1/(k + x)^3 is involved, where x is a negative binomial random variate. In expectation this sum can only be found numerically using the computer. Here we give a simple definite integral on (0, 1) for the generalized case. This means that we now have a valid expression for √β₁₁(k) and √β₁₁(p). In addition we use the finite difference operator Δ, and E = 1 + Δ, to set up formulas for low order moments. Other examples of the operators are quoted relating to the orthogonal set of polynomials associated with the negative binomial probability function used as a weight function.

  8. The Detection of Signals in Impulsive Noise.

    DTIC Science & Technology

    1983-06-01

    [OCR fragment recovered from the report:] ...has a symmetric distribution, sgn(x_i) will be −1 with probability 1/2 and +1 with probability 1/2. Considering the sum of observations as a binomial...

  9. CUMBIN - CUMULATIVE BINOMIAL PROGRAMS

    NASA Technical Reports Server (NTRS)

    Bowerman, P. N.

    1994-01-01

    The cumulative binomial program, CUMBIN, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CUMBIN, NEWTONP (NPO-17556), and CROSSER (NPO-17557), can be used independently of one another. CUMBIN can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CUMBIN calculates the probability that a system of n components has at least k operating, if the probability that any one is operating is p and the components are independent. Equivalently, this is the reliability of a k-out-of-n system having independent components with common reliability p. CUMBIN can evaluate the incomplete beta distribution for two positive integer arguments. CUMBIN can also evaluate the cumulative F distribution and the negative binomial distribution, and can determine the sample size in a test design. CUMBIN is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. The CUMBIN program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CUMBIN was developed in 1988.
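
    The core computation CUMBIN performs, the reliability of a k-out-of-n system of independent components with common reliability p, is a cumulative binomial tail. A scipy-based sketch of that calculation (ours, not the original C code):

    ```python
    # k-out-of-n system reliability via the cumulative binomial.
    from scipy import stats

    def k_out_of_n_reliability(k, n, p):
        """P(at least k of n independent components work), component reliability p."""
        return stats.binom.sf(k - 1, n, p)

    print(k_out_of_n_reliability(2, 3, 0.9))   # 2-out-of-3 system: 0.972
    ```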

  10. Extended Poisson process modelling and analysis of grouped binary data.

    PubMed

    Faddy, Malcolm J; Smith, David M

    2012-05-01

    A simple extension of the Poisson process results in binomially distributed counts of events in a time interval. A further extension generalises this to probability distributions under- or over-dispersed relative to the binomial distribution. Substantial levels of under-dispersion are possible with this modelling, but only modest levels of over-dispersion - up to Poisson-like variation. Although simple analytical expressions for the moments of these probability distributions are not available, approximate expressions for the mean and variance are derived, and used to re-parameterise the models. The modelling is applied in the analysis of two published data sets, one showing under-dispersion and the other over-dispersion. More appropriate assessment of the precision of estimated parameters and reliable model checking diagnostics follow from this more general modelling of these data sets. © 2012 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Smisc - A collection of miscellaneous functions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Landon Sego, PNNL

    2015-08-31

    A collection of functions for statistical computing and data manipulation. These include routines for rapidly aggregating heterogeneous matrices, manipulating file names, loading R objects, sourcing multiple R files, formatting datetimes, multi-core parallel computing, stream editing, specialized plotting, etc. Selected contents:
    - Smisc-package: a collection of miscellaneous functions
    - allMissing: identifies missing rows or columns in a data frame or matrix
    - as.numericSilent: silent wrapper for coercing a vector to numeric
    - comboList: produces all possible combinations of a set of linear model predictors
    - cumMax: computes the maximum of a vector up to the current index
    - cumsumNA: computes the cumulative sum of a vector without propagating NAs
    - d2binom, p2binom: probability functions for the sum of two independent binomials
    - dataIn: a flexible way to import data into R
    - dbb, pbb, qbb, rbb: the beta-binomial distribution
    - df2list: row-wise conversion of a data frame to a list
    - dfplapply: parallelized single-row processing of a data frame
    - dframeEquiv: examines the equivalence of two data frames or matrices
    - dkbinom, pkbinom: probability functions for the sum of k independent binomials
    - factor2character: converts all factor variables in a data frame to character variables
    - findDepMat: identifies linearly dependent rows or columns in a matrix
    - formatDT: converts date or datetime strings into alternate formats
    - getExtension, getPath, grabLast: filename manipulations (remove or extract the extension or path)
    - ifelse1: non-vectorized version of ifelse
    - integ: simple numerical integration routine
    - interactionPlot: two-way interaction plot with error bars
    - linearMap: linear mapping of a numerical vector or scalar
    - list2df: converts a list to a data frame
    - loadObject: loads and returns the object(s) in an ".Rdata" file
    - more: displays the contents of a file in the R terminal
    - movAvg2: calculates the moving average using a 2-sided window
    - openDevice: opens a graphics device based on the filename extension
    - padZero: pads a vector of numbers with zeros
    - parseJob: parses a collection of elements into (almost) equal-sized groups
    - pcbinom: a continuous version of the binomial cdf
    - plapply: simple parallelization of lapply
    - plotFun: plots one or more functions on a single plot
    - PowerData: an example of power data
    - pvar: prints the name and value of one or more objects
    - and numerous others (space limits reporting).

  12. Orchestrating Semiotic Leaps from Tacit to Cultural Quantitative Reasoning--The Case of Anticipating Experimental Outcomes of a Quasi-Binomial Random Generator

    ERIC Educational Resources Information Center

    Abrahamson, Dor

    2009-01-01

    This article reports on a case study from a design-based research project that investigated how students make sense of the disciplinary tools they are taught to use, and specifically, what personal, interpersonal, and material resources support this process. The probability topic of binomial distribution was selected due to robust documentation of…

  13. A Random Variable Transformation Process.

    ERIC Educational Resources Information Center

    Scheuermann, Larry

    1989-01-01

    Provides a short BASIC program, RANVAR, which generates random variates for various theoretical probability distributions. The seven variates include: uniform, exponential, normal, binomial, Poisson, Pascal, and triangular. (MVL)

  14. Distribution pattern of public transport passenger in Yogyakarta, Indonesia

    NASA Astrophysics Data System (ADS)

    Narendra, Alfa; Malkhamah, Siti; Sopha, Bertha Maya

    2018-03-01

    The arrival and departure distribution patterns of Trans Jogja bus passengers are fundamental models for simulation. The purpose of this paper is to build models of passenger flows. This research used passenger data from January to May 2014. No policy changing the operation system has since affected the nature of this pattern: the roads, buses, land uses, schedule, and people are relatively unchanged. The data were categorized by direction, day, and location, and each category was fitted to several well-known discrete distributions. The fitted distributions were compared by their AIC and BIC values, the chosen model being the one with the smallest values of both, and the negative binomial distribution was found to fit best. Probability mass function (PMF) plots of the categorical models were compared to draw a generic model from the categorical negative binomial distribution models. The accepted generic negative binomial distribution has a size parameter of 0.7064 and a mu of 1.4504. The minimum and maximum passenger counts of the distribution are 0 and 41.
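
    For readers wanting to reproduce this kind of model selection, the sketch below fits a negative binomial by maximum likelihood and scores it with AIC and BIC; the counts are synthetic, generated from the reported parameters (size 0.7064, mu 1.4504), since the original passenger data are not available here.

    ```python
    # Maximum-likelihood fit of a negative binomial, scored with AIC/BIC.
    import numpy as np
    from scipy import stats, optimize

    rng = np.random.default_rng(1)
    r, mu = 0.7064, 1.4504                        # reported size and mu
    counts = stats.nbinom.rvs(r, r / (r + mu), size=500, random_state=rng)

    def neg_loglik(params):
        s, m = np.exp(params)                     # log-parameterised for positivity
        return -stats.nbinom.logpmf(counts, s, s / (s + m)).sum()

    res = optimize.minimize(neg_loglik, x0=[0.0, 0.0], method="Nelder-Mead")
    k, n = 2, len(counts)
    aic = 2 * k + 2 * res.fun
    bic = k * np.log(n) + 2 * res.fun
    print(np.exp(res.x), aic, bic)                # fitted (size, mu), AIC, BIC
    ```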

  15. Distribution of apparent activation energy counterparts during thermo - And thermo-oxidative degradation of Aronia melanocarpa (black chokeberry).

    PubMed

    Janković, Bojan; Marinović-Cincović, Milena; Janković, Marija

    2017-09-01

    Kinetics of degradation for Aronia melanocarpa fresh fruits in argon and air atmospheres were investigated. The investigation was based on probability distributions of apparent activation energy counterparts (ε_a). Isoconversional analysis results indicated that the degradation process in an inert atmosphere was governed by decomposition reactions of esterified compounds. Also, based on the same kinetic approach, it was assumed that in an air atmosphere the primary compound in degradation pathways could be anthocyanins, which undergo rapid chemical reactions. A new model of reactivity demonstrated that, under inert atmospheres, expectation values for ε_a occurred at levels of statistical probability. These values corresponded to decomposition processes in which polyphenolic compounds might be involved. The ε_a values obeyed the laws of the binomial distribution. It was established that, for thermo-oxidative degradation, the Poisson distribution represented a very successful approximation for ε_a values where there was additional mechanistic complexity and the binomial distribution was no longer valid. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Enumerative and binomial sequential sampling plans for the multicolored Asian lady beetle (Coleoptera: Coccinellidae) in wine grapes.

    PubMed

    Galvan, T L; Burkness, E C; Hutchison, W D

    2007-06-01

    To develop a practical integrated pest management (IPM) system for the multicolored Asian lady beetle, Harmonia axyridis (Pallas) (Coleoptera: Coccinellidae), in wine grapes, we assessed the spatial distribution of H. axyridis and developed eight sampling plans to estimate adult density or infestation level in grape clusters. We used 49 data sets collected from commercial vineyards in 2004 and 2005, in Minnesota and Wisconsin. Enumerative plans were developed using two precision levels (0.10 and 0.25); the six binomial plans reflected six unique action thresholds (3, 7, 12, 18, 22, and 31% of cluster samples infested with at least one H. axyridis). The spatial distribution of H. axyridis in wine grapes was aggregated, independent of cultivar and year, but it was more randomly distributed as mean density declined. The average sample number (ASN) for each sampling plan was determined using resampling software. For research purposes, an enumerative plan with a precision level of 0.10 (SE/X) resulted in a mean ASN of 546 clusters. For IPM applications, the enumerative plan with a precision level of 0.25 resulted in a mean ASN of 180 clusters. In contrast, the binomial plans resulted in much lower ASNs and provided high probabilities of arriving at correct "treat or no-treat" decisions, making these plans more efficient for IPM applications. For a tally threshold of one adult per cluster, the operating characteristic curves for the six action thresholds provided binomial sequential sampling plans with mean ASNs of only 19-26 clusters, and probabilities of making correct decisions between 83 and 96%. The benefits of the binomial sampling plans are discussed within the context of improving IPM programs for wine grapes.

  17. A Unifying Probability Example.

    ERIC Educational Resources Information Center

    Maruszewski, Richard F., Jr.

    2002-01-01

    Presents an example from probability and statistics that ties together several topics including the mean and variance of a discrete random variable, the binomial distribution and its particular mean and variance, the sum of independent random variables, the mean and variance of the sum, and the central limit theorem. Uses Excel to illustrate these…

  18. Studying the Binomial Distribution Using LabVIEW

    ERIC Educational Resources Information Center

    George, Danielle J.; Hammer, Nathan I.

    2015-01-01

    This undergraduate physical chemistry laboratory exercise introduces students to the study of probability distributions both experimentally and using computer simulations. Students perform the classic coin toss experiment individually and then pool all of their data together to study the effect of experimental sample size on the binomial…

  19. A chi-square goodness-of-fit test for non-identically distributed random variables: with application to empirical Bayes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Conover, W.J.; Cox, D.D.; Martz, H.F.

    1997-12-01

    When using parametric empirical Bayes estimation methods for estimating the binomial or Poisson parameter, the validity of the assumed beta or gamma conjugate prior distribution is an important diagnostic consideration. Chi-square goodness-of-fit tests of the beta or gamma prior hypothesis are developed for use when the binomial sample sizes or Poisson exposure times vary. Nine examples illustrate the application of the methods, using real data from such diverse applications as the loss of feedwater flow rates in nuclear power plants, the probability of failure to run on demand and the failure rates of the high pressure coolant injection systems at US commercial boiling water reactors, the probability of failure to run on demand of emergency diesel generators in US commercial nuclear power plants, the rate of failure of aircraft air conditioners, baseball batting averages, the probability of testing positive for toxoplasmosis, and the probability of tumors in rats. The tests are easily applied in practice by means of corresponding Mathematica® computer programs which are provided.

  20. Football goal distributions and extremal statistics

    NASA Astrophysics Data System (ADS)

    Greenhough, J.; Birch, P. C.; Chapman, S. C.; Rowlands, G.

    2002-12-01

    We analyse the distributions of the number of goals scored by home teams, away teams, and the total scored in the match, in domestic football games from 169 countries between 1999 and 2001. The probability density functions (PDFs) of goals scored are too heavy-tailed to be fitted over their entire ranges by Poisson or negative binomial distributions which would be expected for uncorrelated processes. Log-normal distributions cannot include zero scores and here we find that the PDFs are consistent with those arising from extremal statistics. In addition, we show that it is sufficient to model English top division and FA Cup matches in the seasons of 1970/71-2000/01 on Poisson or negative binomial distributions, as reported in analyses of earlier seasons, and that these are not consistent with extremal statistics.

  1. A comparison of LMC and SDL complexity measures on binomial distributions

    NASA Astrophysics Data System (ADS)

    Piqueira, José Roberto C.

    2016-02-01

    The concept of complexity has been widely discussed in the last forty years, with contributions coming from all areas of human knowledge, including Philosophy, Linguistics, History, Biology, Physics, Chemistry and many others, with mathematicians trying to give a rigorous view of it. In this sense, thermodynamics meets information theory and, by using the entropy definition, López-Ruiz, Mancini and Calbet proposed a definition for complexity that is referred to as the LMC measure. Shiner, Davison and Landsberg, by slightly changing the LMC definition, proposed the SDL measure, and both LMC and SDL are satisfactory measures of complexity for a lot of problems. Here, SDL and LMC measures are applied to the case of a binomial probability distribution, trying to clarify how the length of the data set implies complexity and how the success probability of the repeated trials determines how complex the whole set is.

  2. Confidence Intervals for True Scores Using the Skew-Normal Distribution

    ERIC Educational Resources Information Center

    Garcia-Perez, Miguel A.

    2010-01-01

    A recent comparative analysis of alternative interval estimation approaches and procedures has shown that confidence intervals (CIs) for true raw scores determined with the Score method--which uses the normal approximation to the binomial distribution--have actual coverage probabilities that are closest to their nominal level. It has also recently…

  3. NEWTONP - CUMULATIVE BINOMIAL PROGRAMS

    NASA Technical Reports Server (NTRS)

    Bowerman, P. N.

    1994-01-01

    The cumulative binomial program, NEWTONP, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, NEWTONP, CUMBIN (NPO-17555), and CROSSER (NPO-17557), can be used independently of one another. NEWTONP can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. NEWTONP calculates the probability p required to yield a given system reliability V for a k-out-of-n system. It can also be used to determine the Clopper-Pearson confidence limits (either one-sided or two-sided) for the parameter p of a Bernoulli distribution. NEWTONP can determine Bayesian probability limits for a proportion (if the beta prior has positive integer parameters). It can determine the percentiles of incomplete beta distributions with positive integer parameters. It can also determine the percentiles of F distributions and the median plotting positions in probability plotting. NEWTONP is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. NEWTONP is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. It also lists the number of iterations of Newton's method required to calculate the answer within the given error. The NEWTONP program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. NEWTONP was developed in 1988.
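
    The root-finding task NEWTONP performs, solving the cumulative binomial for the component probability p that yields a target system reliability V, can be sketched with a bracketing root finder in place of the program's Newton iteration (a substitution of ours; the answers coincide):

    ```python
    # Solve for p such that P(at least k of n components work) = V.
    from scipy import stats, optimize

    def required_p(k, n, V):
        f = lambda p: stats.binom.sf(k - 1, n, p) - V
        return optimize.brentq(f, 1e-12, 1 - 1e-12)   # sf is monotone in p

    print(required_p(2, 3, 0.972))   # ~0.9: a 2-out-of-3 system with p = 0.9
    ```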

  4. A binomial stochastic kinetic approach to the Michaelis-Menten mechanism

    NASA Astrophysics Data System (ADS)

    Lente, Gábor

    2013-05-01

    This Letter presents a new method that gives an analytical approximation of the exact solution of the stochastic Michaelis-Menten mechanism without computationally demanding matrix operations. The method is based on solving the deterministic rate equations and then using the results as guiding variables of calculating probability values using binomial distributions. This principle can be generalized to a number of different kinetic schemes and is expected to be very useful in the evaluation of measurements focusing on the catalytic activity of one or a few individual enzyme molecules.
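
    A minimal sketch of the guiding idea as described: integrate the deterministic Michaelis-Menten rate equations, then use the deterministic converted fraction as the success probability of a binomial over the substrate molecules. Rate constants and counts below are illustrative assumptions, not taken from the Letter.

    ```python
    # Deterministic MM solution used as the guiding variable of a binomial.
    import numpy as np
    from scipy import stats
    from scipy.integrate import solve_ivp

    k1, km1, k2 = 1.0, 0.5, 0.3      # assumed rate constants
    e0, s0 = 1.0, 10.0               # enzyme and substrate "concentrations"
    n0 = 10                          # substrate molecule count

    def rhs(t, y):
        s, c = y                     # substrate, enzyme-substrate complex
        return [-k1 * s * (e0 - c) + km1 * c,
                k1 * s * (e0 - c) - (km1 + k2) * c]

    sol = solve_ivp(rhs, (0.0, 5.0), [s0, 0.0], t_eval=[5.0])
    s, c = sol.y[:, -1]
    frac_product = (s0 - s - c) / s0             # deterministic converted fraction
    probs = stats.binom.pmf(np.arange(n0 + 1), n0, frac_product)
    print(probs)                                  # approx P(k product molecules)
    ```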

  5. Modeling the distribution of colonial species to improve estimation of plankton concentration in ballast water

    NASA Astrophysics Data System (ADS)

    Rajakaruna, Harshana; VandenByllaardt, Julie; Kydd, Jocelyn; Bailey, Sarah

    2018-03-01

    The International Maritime Organization (IMO) has set limits on allowable plankton concentrations in ballast water discharge to minimize aquatic invasions globally. Previous guidance on ballast water sampling and compliance decision thresholds was based on the assumption that probability distributions of plankton are Poisson when spatially homogenous, or negative binomial when heterogeneous. We propose a hierarchical probability model, which incorporates distributions at the level of particles (i.e., discrete individuals plus colonies per unit volume) and also within particles (i.e., individuals per particle) to estimate the average plankton concentration in ballast water. We examined the performance of the models using data for plankton in the size class ≥ 10 μm and < 50 μm, collected from five different depths of a ballast tank of a commercial ship in three independent surveys. We show that the data fit the negative binomial and the hierarchical probability models equally well, with both models performing better than the Poisson model at the scale of our sampling. The hierarchical probability model, which accounts for both the individuals and the colonies in a sample, reduces the uncertainty associated with the concentration estimation, and improves the power to reject a ship's compliance when the ship does not truly comply with the standard. We show examples of how to test ballast water compliance using the above models.

  6. Negative Binomial Process Count and Mixture Modeling.

    PubMed

    Zhou, Mingyuan; Carin, Lawrence

    2015-02-01

    The seemingly disjoint problems of count and mixture modeling are united under the negative binomial (NB) process. A gamma process is employed to model the rate measure of a Poisson process, whose normalization provides a random probability measure for mixture modeling and whose marginalization leads to an NB process for count modeling. A draw from the NB process consists of a Poisson distributed finite number of distinct atoms, each of which is associated with a logarithmic distributed number of data samples. We reveal relationships between various count- and mixture-modeling distributions and construct a Poisson-logarithmic bivariate distribution that connects the NB and Chinese restaurant table distributions. Fundamental properties of the models are developed, and we derive efficient Bayesian inference. It is shown that with augmentation and normalization, the NB process and gamma-NB process can be reduced to the Dirichlet process and hierarchical Dirichlet process, respectively. These relationships highlight theoretical, structural, and computational advantages of the NB process. A variety of NB processes, including the beta-geometric, beta-NB, marked-beta-NB, marked-gamma-NB and zero-inflated-NB processes, with distinct sharing mechanisms, are also constructed. These models are applied to topic modeling, with connections made to existing algorithms under Poisson factor analysis. Example results show the importance of inferring both the NB dispersion and probability parameters.

  7. Void probability as a function of the void's shape and scale-invariant models

    NASA Technical Reports Server (NTRS)

    Elizalde, E.; Gaztanaga, E.

    1991-01-01

    The dependence of counts in cells on the shape of the cell for the large scale galaxy distribution is studied. A very concrete prediction can be made concerning the void distribution for scale invariant models. The prediction is tested on a sample of the CfA catalog, and good agreement is found. It is observed that the probability of a cell to be occupied is bigger for some elongated cells. A phenomenological scale invariant model for the observed distribution of the counts in cells, an extension of the negative binomial distribution, is presented in order to illustrate how this dependence can be quantitatively determined. An original, intuitive derivation of this model is presented.

  8. Void probability as a function of the void's shape and scale-invariant models. [in studies of spacial galactic distribution

    NASA Technical Reports Server (NTRS)

    Elizalde, E.; Gaztanaga, E.

    1992-01-01

    The dependence of counts in cells on the shape of the cell for the large scale galaxy distribution is studied. A very concrete prediction can be made concerning the void distribution for scale invariant models. The prediction is tested on a sample of the CfA catalog, and good agreement is found. It is observed that the probability of a cell to be occupied is bigger for some elongated cells. A phenomenological scale invariant model for the observed distribution of the counts in cells, an extension of the negative binomial distribution, is presented in order to illustrate how this dependence can be quantitatively determined. An original, intuitive derivation of this model is presented.

  9. Simulation of flight maneuver-load distributions by utilizing stationary, non-Gaussian random load histories

    NASA Technical Reports Server (NTRS)

    Leybold, H. A.

    1971-01-01

    Random numbers were generated with the aid of a digital computer and transformed such that the probability density function of a discrete random load history composed of these random numbers had one of the following non-Gaussian distributions: Poisson, binomial, log-normal, Weibull, and exponential. The resulting random load histories were analyzed to determine their peak statistics and were compared with cumulative peak maneuver-load distributions for fighter and transport aircraft in flight.

  10. A Statistical Treatment of Bioassay Pour Fractions

    NASA Technical Reports Server (NTRS)

    Barengoltz, Jack; Hughes, David W.

    2014-01-01

    The binomial probability distribution is used to treat the statistics of a microbiological sample that is split into two parts, with only one part evaluated for spore count. One wishes to estimate the total number of spores in the sample based on the counts obtained from the part that is evaluated (pour fraction). Formally, the binomial distribution is recharacterized as a function of the observed counts (successes), with the total number (trials) an unknown. The pour fraction is the probability of success per spore (trial). This distribution must be renormalized in terms of the total number. Finally, the new renormalized distribution is integrated and mathematically inverted to yield the maximum estimate of the total number as a function of a desired level of confidence ( P(
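
    The renormalization-and-inversion step described above can be sketched numerically: treat the binomial probability of the observed count k at pour fraction f as a likelihood over the unknown total N, normalize over N >= k, and accumulate to the desired confidence level. This is our reading of the procedure, not the paper's exact formulae.

    ```python
    # Upper bound on the total spore count N given k observed at pour fraction f.
    import numpy as np
    from scipy import stats

    def upper_total(k, f, confidence=0.95, n_max=10_000):
        ns = np.arange(k, n_max)
        w = stats.binom.pmf(k, ns, f)        # likelihood of k counts for each total N
        cdf = np.cumsum(w / w.sum())         # renormalised over the unknown total
        return int(ns[np.searchsorted(cdf, confidence)])

    print(upper_total(k=5, f=0.5))           # 95% upper estimate of the total
    ```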

  11. Selecting the right statistical model for analysis of insect count data by using information theoretic measures.

    PubMed

    Sileshi, G

    2006-10-01

    Researchers and regulatory agencies often make statistical inferences from insect count data using modelling approaches that assume homogeneous variance. Such models do not allow for formal appraisal of variability which in its different forms is the subject of interest in ecology. Therefore, the objectives of this paper were to (i) compare models suitable for handling variance heterogeneity and (ii) select optimal models to ensure valid statistical inferences from insect count data. The log-normal, standard Poisson, Poisson corrected for overdispersion, zero-inflated Poisson, the negative binomial distribution and zero-inflated negative binomial models were compared using six count datasets on foliage-dwelling insects and five families of soil-dwelling insects. Akaike's and Schwarz Bayesian information criteria were used for comparing the various models. Over 50% of the counts were zeros even in locally abundant species such as Ootheca bennigseni Weise, Mesoplatys ochroptera Stål and Diaecoderus spp. The Poisson model after correction for overdispersion and the standard negative binomial distribution model provided better description of the probability distribution of seven out of the 11 insects than the log-normal, standard Poisson, zero-inflated Poisson or zero-inflated negative binomial models. It is concluded that excess zeros and variance heterogeneity are common data phenomena in insect counts. If not properly modelled, these properties can invalidate the normal distribution assumptions resulting in biased estimation of ecological effects and jeopardizing the integrity of the scientific inferences. Therefore, it is recommended that statistical models appropriate for handling these data properties be selected using objective criteria to ensure efficient statistical inference.

  12. Distinguishing between Binomial, Hypergeometric and Negative Binomial Distributions

    ERIC Educational Resources Information Center

    Wroughton, Jacqueline; Cole, Tarah

    2013-01-01

    Recognizing the differences between three discrete distributions (Binomial, Hypergeometric and Negative Binomial) can be challenging for students. We present an activity designed to help students differentiate among these distributions. In addition, we present assessment results in the form of pre- and post-tests that were designed to assess the…

  13. Library Book Circulation and the Beta-Binomial Distribution.

    ERIC Educational Resources Information Center

    Gelman, E.; Sichel, H. S.

    1987-01-01

    Argues that library book circulation is a binomial rather than a Poisson process, and that individual book popularities are continuous beta distributions. Three examples demonstrate the superiority of beta over negative binomial distribution, and it is suggested that a bivariate-binomial process would be helpful in predicting future book…

  14. Use of the binomial distribution to predict impairment: application in a nonclinical sample.

    PubMed

    Axelrod, Bradley N; Wall, Jacqueline R; Estes, Bradley W

    2008-01-01

    A mathematical model based on the binomial theory was developed to illustrate when abnormal score variations occur by chance in a multitest battery (Ingraham & Aiken, 1996). It has been successfully used as a comparison for obtained test scores in clinical samples, but not in nonclinical samples. In the current study, this model has been applied to demographically corrected scores on the Halstead-Reitan Neuropsychological Test Battery, obtained from a sample of 94 nonclinical college students. Results found that 15% of the sample had impairments suggested by the Halstead Impairment Index, using criteria established by Reitan and Wolfson (1993). In addition, one-half of the sample obtained impaired scores on one or two tests. These results were compared to that predicted by the binomial model and found to be consistent. The model therefore serves as a useful resource for clinicians considering the probability of impaired test performance.
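
    The underlying binomial computation is simple enough to show directly; the test count and per-test abnormality rate below are illustrative assumptions, not the battery's actual values.

    ```python
    # Chance probability of "impaired" scores in a multitest battery under the
    # binomial model: 10 independent tests, 10% abnormality rate per test.
    from scipy import stats

    n_tests, p_abnormal = 10, 0.10
    p_one_or_two = (stats.binom.pmf(1, n_tests, p_abnormal)
                    + stats.binom.pmf(2, n_tests, p_abnormal))
    p_at_least_one = stats.binom.sf(0, n_tests, p_abnormal)
    print(p_one_or_two, p_at_least_one)   # ~0.58 and ~0.65: low scores are common
    ```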

  15. A Classroom Note on the Binomial and Poisson Distributions: Biomedical Examples for Use in Teaching Introductory Statistics

    ERIC Educational Resources Information Center

    Holland, Bart K.

    2006-01-01

    A generally-educated individual should have some insight into how decisions are made in the very wide range of fields that employ statistical and probabilistic reasoning. Also, students of introductory probability and statistics are often best motivated by specific applications rather than by theory and mathematical development, because most…

  16. Analysis of multiple tank car releases in train accidents.

    PubMed

    Liu, Xiang; Liu, Chang; Hong, Yili

    2017-10-01

    There are annually over two million carloads of hazardous materials transported by rail in the United States. The American railroads use large blocks of tank cars to transport petroleum crude oil and other flammable liquids from production to consumption sites. Being different from roadway transport of hazardous materials, a train accident can potentially result in the derailment and release of multiple tank cars, which may result in significant consequences. The prior literature predominantly assumes that the occurrence of multiple tank car releases in a train accident is a series of independent Bernoulli processes, and thus uses the binomial distribution to estimate the total number of tank car releases given the number of tank cars derailing or damaged. This paper shows that the traditional binomial model can incorrectly estimate multiple tank car release probability by magnitudes in certain circumstances, thereby significantly affecting railroad safety and risk analysis. To bridge this knowledge gap, this paper proposes a novel, alternative Correlated Binomial (CB) model that accounts for the possible correlations of multiple tank car releases in the same train. We test three distinct correlation structures in the CB model, and find that they all outperform the conventional binomial model based on empirical tank car accident data. The analysis shows that considering tank car release correlations would result in a significantly improved fit of the empirical data than otherwise. Consequently, it is prudent to consider alternative modeling techniques when analyzing the probability of multiple tank car releases in railroad accidents. Copyright © 2017 Elsevier Ltd. All rights reserved.
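
    The effect the paper describes, correlation inflating multiple-release probabilities relative to the independence assumption, can be illustrated with a beta-binomial standing in for a correlated binomial; the paper's three CB correlation structures are not reproduced here, and the parameters below are assumptions.

    ```python
    # Independent vs correlated release counts among derailed tank cars.
    from scipy import stats

    n_derailed, p_release = 10, 0.2
    rho = 0.3                                  # assumed within-train correlation
    a = p_release * (1 - rho) / rho            # beta-binomial parameters matching
    b = (1 - p_release) * (1 - rho) / rho      # mean p and intra-class corr. rho

    indep = stats.binom(n_derailed, p_release)
    corr = stats.betabinom(n_derailed, a, b)
    print(indep.sf(4), corr.sf(4))   # P(more than 4 of 10 release): corr >> indep
    ```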

  17. Spatial distribution and sequential sampling plans for Tuta absoluta (Lepidoptera: Gelechiidae) in greenhouse tomato crops.

    PubMed

    Cocco, Arturo; Serra, Giuseppe; Lentini, Andrea; Deliperi, Salvatore; Delrio, Gavino

    2015-09-01

    The within- and between-plant distribution of the tomato leafminer, Tuta absoluta (Meyrick), was investigated in order to define action thresholds based on leaf infestation and to propose enumerative and binomial sequential sampling plans for pest management applications in protected crops. The pest spatial distribution was aggregated between plants, and median leaves were the most suitable sample to evaluate the pest density. Action thresholds of 36 and 48%, 43 and 56% and 60 and 73% infested leaves, corresponding to economic thresholds of 1 and 3% damaged fruits, were defined for tomato cultivars with big, medium and small fruits respectively. Green's method was a more suitable enumerative sampling plan as it required a lower sampling effort. Binomial sampling plans needed lower average sample sizes than enumerative plans to make a treatment decision, with probabilities of error of <0.10. The enumerative sampling plan required 87 or 343 leaves to estimate the population density in extensive or intensive ecological studies respectively. Binomial plans would be more practical and efficient for control purposes, needing average sample sizes of 17, 20 and 14 leaves to take a pest management decision in order to avoid fruit damage higher than 1% in cultivars with big, medium and small fruits respectively. © 2014 Society of Chemical Industry.

  18. Site occupancy models with heterogeneous detection probabilities

    USGS Publications Warehouse

    Royle, J. Andrew

    2006-01-01

    Models for estimating the probability of occurrence of a species in the presence of imperfect detection are important in many ecological disciplines. In these "site occupancy" models, the possibility of heterogeneity in detection probabilities among sites must be considered because variation in abundance (and other factors) among sampled sites induces variation in detection probability (p). In this article, I develop occurrence probability models that allow for heterogeneous detection probabilities by considering several common classes of mixture distributions for p. For any mixing distribution, the likelihood has the general form of a zero-inflated binomial mixture for which inference based upon integrated likelihood is straightforward. A recent paper by Link (2003, Biometrics 59, 1123-1130) demonstrates that in closed population models used for estimating population size, different classes of mixture distributions are indistinguishable from data, yet can produce very different inferences about population size. I demonstrate that this problem can also arise in models for estimating site occupancy in the presence of heterogeneous detection probabilities. The implications of this are discussed in the context of an application to avian survey data and the development of animal monitoring programs.

  19. Generalization of multifractal theory within quantum calculus

    NASA Astrophysics Data System (ADS)

    Olemskoi, A.; Shuda, I.; Borisyuk, V.

    2010-03-01

    On the basis of the deformed series in quantum calculus, we generalize the partition function and the mass exponent of a multifractal, as well as the average of a random variable distributed over a self-similar set. For the partition function, such expansion is shown to be determined by binomial-type combinations of the Tsallis entropies related to manifold deformations, while the mass exponent expansion generalizes the known relation τ_q = D_q(q − 1). We find the equation for the set of averages related to ordinary, escort, and generalized probabilities in terms of the deformed expansion as well. Multifractals related to the Cantor binomial set, exchange currency series, and porous-surface condensates are considered as examples.

  20. Accounting for non-independent detection when estimating abundance of organisms with a Bayesian approach

    USGS Publications Warehouse

    Martin, Julien; Royle, J. Andrew; MacKenzie, Darryl I.; Edwards, Holly H.; Kery, Marc; Gardner, Beth

    2011-01-01

    Summary:

    1. Binomial mixture models use repeated count data to estimate abundance. They are becoming increasingly popular because they provide a simple and cost-effective way to account for imperfect detection. However, these models assume that individuals are detected independently of each other. This assumption may often be violated in the field. For instance, manatees (Trichechus manatus latirostris) may surface in turbid water (i.e. become available for detection during aerial surveys) in a correlated manner (i.e. in groups). However, correlated behaviour, affecting the non-independence of individual detections, may also be relevant in other systems (e.g. correlated patterns of singing in birds and amphibians).

    2. We extend binomial mixture models to account for correlated behaviour and therefore to account for non-independent detection of individuals. We simulated correlated behaviour using beta-binomial random variables. Our approach can be used to simultaneously estimate abundance, detection probability and a correlation parameter.

    3. Fitting binomial mixture models to data that followed a beta-binomial distribution resulted in an overestimation of abundance even for moderate levels of correlation. In contrast, the beta-binomial mixture model performed considerably better in our simulation scenarios. We also present a goodness-of-fit procedure to evaluate the fit of beta-binomial mixture models.

    4. We illustrate our approach by fitting both binomial and beta-binomial mixture models to aerial survey data of manatees in Florida. We found that the binomial mixture model did not fit the data, whereas there was no evidence of lack of fit for the beta-binomial mixture model. This example helps illustrate the importance of using simulations and assessing goodness-of-fit when analysing ecological data with N-mixture models. Indeed, both the simulations and the goodness-of-fit procedure highlighted the limitations of the standard binomial mixture model for aerial manatee surveys.

    5. Overestimation of abundance by binomial mixture models owing to non-independent detections is problematic for ecological studies, but also for conservation. For example, in the case of endangered species, it could lead to inappropriate management decisions, such as downlisting. These issues will be increasingly relevant as more ecologists apply flexible N-mixture models to ecological data.
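
    The mechanism behind point 3 is overdispersion: for the same mean detection probability, beta-binomial counts have a larger variance than binomial counts, which a plain binomial mixture model can only absorb by inflating abundance. A simulation sketch with assumed values:

    ```python
    # Overdispersion of correlated (beta-binomial) detections vs independent ones.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    N, p, rho = 50, 0.4, 0.2                   # true abundance, detection, correlation
    a = p * (1 - rho) / rho
    b = (1 - p) * (1 - rho) / rho

    y_indep = stats.binom.rvs(N, p, size=10_000, random_state=rng)
    y_corr = stats.betabinom.rvs(N, a, b, size=10_000, random_state=rng)
    print(y_indep.mean(), y_indep.var())       # ~20, ~12
    print(y_corr.mean(), y_corr.var())         # ~20, but variance ~130
    ```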

  1. Binomial probability distribution model-based protein identification algorithm for tandem mass spectrometry utilizing peak intensity information.

    PubMed

    Xiao, Chuan-Le; Chen, Xiao-Zhou; Du, Yang-Li; Sun, Xuesong; Zhang, Gong; He, Qing-Yu

    2013-01-04

    Mass spectrometry has become one of the most important technologies in proteomic analysis. Tandem mass spectrometry (LC-MS/MS) is a major tool for the analysis of peptide mixtures from protein samples. The key step of MS data processing is the identification of peptides from experimental spectra by searching public sequence databases. Although a number of algorithms to identify peptides from MS/MS data have been already proposed, e.g. Sequest, OMSSA, X!Tandem, Mascot, etc., they are mainly based on statistical models considering only peak-matches between experimental and theoretical spectra, but not peak intensity information. Moreover, different algorithms gave different results from the same MS data, implying their probable incompleteness and questionable reproducibility. We developed a novel peptide identification algorithm, ProVerB, based on a binomial probability distribution model of protein tandem mass spectrometry combined with a new scoring function, making full use of peak intensity information and, thus, enhancing the ability of identification. Compared with Mascot, Sequest, and SQID, ProVerB identified significantly more peptides from LC-MS/MS data sets than the current algorithms at 1% False Discovery Rate (FDR) and provided more confident peptide identifications. ProVerB is also compatible with various platforms and experimental data sets, showing its robustness and versatility. The open-source program ProVerB is available at http://bioinformatics.jnu.edu.cn/software/proverb/ .
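
    The flavor of binomial peptide-spectrum scoring can be shown generically: if a random fragment peak matches the theoretical spectrum with probability p_match, the chance of k or more matches among n peaks is a binomial tail, and its negative log makes a score. This is a generic sketch, not ProVerB's actual scoring function.

    ```python
    # Generic binomial peak-match score for a peptide-spectrum match.
    import numpy as np
    from scipy import stats

    def match_score(k_matched, n_peaks, p_match):
        pval = stats.binom.sf(k_matched - 1, n_peaks, p_match)  # P(>= k matches)
        return -np.log10(pval)

    print(match_score(12, 40, 0.1))   # 12 of 40 peaks matched at p_match = 0.1
    ```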

  2. Comparison of multiplicity distributions to the negative binomial distribution in muon-proton scattering

    NASA Astrophysics Data System (ADS)

    Arneodo, M.; Arvidson, A.; Aubert, J. J.; Badełek, B.; Beaufays, J.; Bee, C. P.; Benchouk, C.; Berghoff, G.; Bird, I.; Blum, D.; Böhm, E.; de Bouard, X.; Brasse, F. W.; Braun, H.; Broll, C.; Brown, S.; Brück, H.; Calen, H.; Chima, J. S.; Ciborowski, J.; Clifft, R.; Coignet, G.; Combley, F.; Coughlan, J.; D'Agostini, G.; Dahlgren, S.; Dengler, F.; Derado, I.; Dreyer, T.; Drees, J.; Düren, M.; Eckardt, V.; Edwards, A.; Edwards, M.; Ernst, T.; Eszes, G.; Favier, J.; Ferrero, M. I.; Figiel, J.; Flauger, W.; Foster, J.; Ftáčnik, J.; Gabathuler, E.; Gajewski, J.; Gamet, R.; Gayler, J.; Geddes, N.; Grafström, P.; Grard, F.; Haas, J.; Hagberg, E.; Hasert, F. J.; Hayman, P.; Heusse, P.; Jaffré, M.; Jachołkowska, A.; Janata, F.; Jancsó, G.; Johnson, A. S.; Kabuss, E. M.; Kellner, G.; Korbel, V.; Krüger, J.; Kullander, S.; Landgraf, U.; Lanske, D.; Loken, J.; Long, K.; Maire, M.; Malecki, P.; Manz, A.; Maselli, S.; Mohr, W.; Montanet, F.; Montgomery, H. E.; Nagy, E.; Nassalski, J.; Norton, P. R.; Oakham, F. G.; Osborne, A. M.; Pascaud, C.; Pawlik, B.; Payre, P.; Peroni, C.; Peschel, H.; Pessard, H.; Pettinghale, J.; Pietrzyk, B.; Pietrzyk, U.; Pönsgen, B.; Pötsch, M.; Renton, P.; Ribarics, P.; Rith, K.; Rondio, E.; Sandacz, A.; Scheer, M.; Schlagböhmer, A.; Schiemann, H.; Schmitz, N.; Schneegans, M.; Schneider, A.; Scholz, M.; Schröder, T.; Schultze, K.; Sloan, T.; Stier, H. E.; Studt, M.; Taylor, G. N.; Thénard, J. M.; Thompson, J. C.; de La Torre, A.; Toth, J.; Urban, L.; Urban, L.; Wallucks, W.; Whalley, M.; Wheeler, S.; Williams, W. S. C.; Wimpenny, S. J.; Windmolders, R.; Wolf, G.

    1987-09-01

    The multiplicity distributions of charged hadrons produced in the deep inelastic muon-proton scattering at 280 GeV are analysed in various rapidity intervals, as a function of the total hadronic centre of mass energy W ranging from 4 to 20 GeV. Multiplicity distributions for the backward and forward hemispheres are also analysed separately. The data can be well parameterized by binomial distributions, extending their range of applicability to the case of lepton-proton scattering. The energy and the rapidity dependence of the parameters is presented and a smooth transition from the negative binomial distribution via Poissonian to the ordinary binomial is observed.

  3. On the p, q-binomial distribution and the Ising model

    NASA Astrophysics Data System (ADS)

    Lundow, P. H.; Rosengren, A.

    2010-08-01

    We employ p, q-binomial coefficients, a generalisation of the binomial coefficients, to describe the magnetisation distributions of the Ising model. For the complete graph this distribution corresponds exactly to the limit case p = q. We apply our investigation to the simple d-dimensional lattices for d = 1, 2, 3, 4, 5 and fit p, q-binomial distributions to our data, some of which are exact but most are sampled. For d = 1 and d = 5, the magnetisation distributions are remarkably well-fitted by p, q-binomial distributions. For d = 4 we are only slightly less successful, while for d = 2, 3 we see some deviations (with exceptions!) between the p, q-binomial and the Ising distribution. However, at certain temperatures near T_c the statistical moments of the fitted distribution agree with the moments of the sampled data within the precision of sampling. We begin the paper by giving results of the behaviour of the p, q-distribution and its moment growth exponents given a certain parameterisation of p, q. Since the moment exponents are known for the Ising model (or at least approximately for d = 3) we can predict how p, q should behave and compare this to our measured p, q. The results speak in favour of the p, q-binomial distribution's correctness regarding its general behaviour in comparison to the Ising model. The full extent to which they correctly model the Ising distribution, however, is not settled.

  4. The Binomial Model in Fluctuation Analysis of Quantal Neurotransmitter Release

    PubMed Central

    Quastel, D. M. J.

    1997-01-01

    The mathematics of the binomial model for quantal neurotransmitter release is considered in general terms, to explore what information might be extractable from statistical aspects of data. For an array of N statistically independent release sites, each with a release probability p, the compound binomial always pertains, with ⟨m⟩ = N⟨p⟩, p′ ≡ 1 − var(m)/⟨m⟩ = ⟨p⟩(1 + cv_p²) and n′ ≡ ⟨m⟩/p′ = N/(1 + cv_p²), where m is the output/stimulus and cv_p² is var(p)/⟨p⟩². Unless n′ is invariant with ambient conditions or stimulation paradigms, the simple binomial (cv_p = 0) is untenable and n′ is neither N nor the number of "active" sites or sites with a quantum available. At each site p = p_o p_A, where p_o is the output probability if a site is "eligible" or "filled" despite previous quantal discharge, and p_A (eligibility probability) depends at least on the replenishment rate, p_o, and interstimulus time. Assuming stochastic replenishment, a simple algorithm allows calculation of the full statistical composition of outputs for any hypothetical combinations of p_o's and refill rates, for any stimulation paradigm and spontaneous release. A rise in n′ (reduced cv_p) tends to occur whenever p_o varies widely between sites, with a raised stimulation frequency or factors tending to increase p_o's. Unlike ⟨m⟩ and var(m) at equilibrium, outputs early in trains of stimuli, and covariances, potentially provide information about whether changes in ⟨m⟩ reflect change in N or in ⟨p⟩. Formulae are derived for variance and third moments of postsynaptic responses, which depend on the quantal mix in the signals. A new, easily computed function, the area product, gives noise-unbiased variance of a series of synaptic signals and its peristimulus time distribution, which is modified by the unit channel composition of quantal responses and if the signals reflect mixed responses from synapses with different quantal time course. PMID:9017200
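
    The compound-binomial relations quoted above are easy to check numerically: simulate heterogeneous site probabilities p_i, estimate p′ and n′ from the output moments, and compare with ⟨p⟩(1 + cv_p²) and N/(1 + cv_p²). The site probabilities below are arbitrary choices of ours.

    ```python
    # Numerical check of p' = <p>(1 + cv_p^2) and n' = N/(1 + cv_p^2).
    import numpy as np

    rng = np.random.default_rng(3)
    p_sites = np.array([0.05, 0.2, 0.4, 0.7, 0.9])          # assumed release probs
    N = len(p_sites)
    m = (rng.random((200_000, N)) < p_sites).sum(axis=1)    # quanta per stimulus

    cv2 = p_sites.var() / p_sites.mean() ** 2
    p_prime = 1 - m.var() / m.mean()
    n_prime = m.mean() / p_prime
    print(p_prime, p_sites.mean() * (1 + cv2))    # agree
    print(n_prime, N / (1 + cv2))                 # agree; note n' < N
    ```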

  5. Football fever: goal distributions and non-Gaussian statistics

    NASA Astrophysics Data System (ADS)

    Bittner, E.; Nußbaumer, A.; Janke, W.; Weigel, M.

    2009-02-01

    Analyzing football score data with statistical techniques, we investigate how the not purely random, but highly co-operative nature of the game is reflected in averaged properties such as the probability distributions of scored goals for the home and away teams. As it turns out, especially the tails of the distributions are not well described by the Poissonian or binomial model resulting from the assumption of uncorrelated random events. Instead, a good effective description of the data is provided by less basic distributions such as the negative binomial one or the probability densities of extreme value statistics. To understand this behavior from a microscopic point of view, however, no waiting time problem or extremal process need be invoked. Instead, modifying the Bernoulli random process underlying the Poissonian model to include a simple component of self-affirmation seems to describe the data surprisingly well and allows us to understand the observed deviation from Gaussian statistics. The phenomenological distributions used before can be understood as special cases within this framework. We analyzed historical football score data from many leagues in Europe as well as from international tournaments, including data from all past tournaments of the “FIFA World Cup” series, and found the proposed models to be applicable rather universally. In particular, here we analyze the results of the German women’s premier football league and consider the two separate German men’s premier leagues in the East and West during the Cold War era as well as the unified league after 1990 to see how scoring in football and the component of self-affirmation depend on cultural and political circumstances.
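
    A toy simulation along these lines (our own construction; the feedback rule and parameter values are assumptions, not the authors' fitted model) shows how a small self-affirmation term fattens the tail relative to a plain binomial model:

        # Toy Bernoulli scoring process with self-affirmation: each goal slightly
        # raises the per-minute scoring probability, producing overdispersion.
        import numpy as np

        rng = np.random.default_rng(1)
        n_games, n_steps, p0, kappa = 100_000, 90, 0.015, 0.25  # assumed values

        goals = np.zeros(n_games, dtype=int)
        for _ in range(n_steps):
            # current scoring probability grows with goals already scored
            p = p0 * (1 + kappa) ** goals
            goals += rng.random(n_games) < np.minimum(p, 1.0)

        print(f"mean={goals.mean():.2f}, var={goals.var():.2f}  (var > mean: overdispersed)")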

  6. Novel formulation of the ℳ model through the Generalized-K distribution for atmospheric optical channels.

    PubMed

    Garrido-Balsells, José María; Jurado-Navas, Antonio; Paris, José Francisco; Castillo-Vazquez, Miguel; Puerta-Notario, Antonio

    2015-03-09

    In this paper, a novel and deeper physical interpretation of the recently published Málaga or ℳ statistical distribution is provided. This distribution, which has gained wide acceptance in the scientific community, models the optical irradiance scintillation induced by atmospheric turbulence. Here, the analytical expressions previously published are modified in order to express them as a mixture of the known Generalized-K and discrete Binomial and Negative Binomial distributions. In particular, the probability density function (pdf) of the ℳ model is now obtained as a linear combination of Generalized-K pdfs, in which the coefficients depend directly on the parameters of the ℳ distribution. In this way, the Málaga model can be physically interpreted as a superposition of different optical sub-channels, each of them described by the corresponding Generalized-K fading model and weighted by the ℳ-dependent coefficients. The expressions proposed here are simpler than the equations of the original ℳ model and are validated by means of numerical simulations, by generating ℳ-distributed random sequences and their associated histograms. This novel interpretation of the Málaga statistical distribution provides a valuable tool for analyzing the performance of atmospheric optical channels under every turbulence condition.

  7. Time analysis of volcanic activity on Io by means of plasma observations

    NASA Technical Reports Server (NTRS)

    Mekler, Y.; Eviatar, A.

    1980-01-01

    A model of Io volcanism in which the probability of activity obeys a binomial distribution is presented. Observed values of the electron density obtained over a 3-year period by ground-based spectroscopy are fitted to such a distribution. The best fit is found for a total number of 15 volcanoes with a probability of individual activity at any time of 0.143. The Pioneer 10 ultraviolet observations are reinterpreted as emissions of sulfur and oxygen ions and are found to be consistent with a plasma much less dense than that observed by the Voyager spacecraft. Late 1978 and the first half of 1979 are shown to be periods of anomalous volcanicity. Rapid variations in electron density are related to enhanced radial diffusion.
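
    The fitted model is easy to interrogate (a sketch using SciPy; only N = 15 and p = 0.143 are taken from the abstract):

        # Number of simultaneously active volcanoes under the fitted binomial model.
        from scipy.stats import binom

        N, p = 15, 0.143
        print("expected number active:", N * p)               # about 2.1
        for k in range(6):
            print(f"P({k} active) = {binom.pmf(k, N, p):.3f}")
        print("P(>= 5 active) =", round(binom.sf(4, N, p), 4))  # anomalous activity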

  8. Assessment of some important factors affecting the singing-ground survey

    USGS Publications Warehouse

    Tautin, J.

    1982-01-01

    A brief history of the procedures used to analyze singing-ground survey data is outlined. Some weaknesses associated with the analytical procedures are discussed, and preliminary results of efforts to improve the procedures are presented. The most significant finding to date is that counts made by new observers need not be omitted when calculating an index of the woodcock population. Also, the distribution of woodcock heard singing, with respect to time after sunset, affirms the appropriateness of recommended starting times for counting woodcock. Woodcock count data fit the negative binomial probability distribution.

  9. On the robustness of the q-Gaussian family

    NASA Astrophysics Data System (ADS)

    Sicuro, Gabriele; Tempesta, Piergiulio; Rodríguez, Antonio; Tsallis, Constantino

    2015-12-01

    We introduce three deformations, called α-, β- and γ-deformation respectively, of an N-body probabilistic model, first proposed by Rodríguez et al. (2008), having q-Gaussians as N → ∞ limiting probability distributions. The proposed α- and β-deformations are asymptotically scale-invariant, whereas the γ-deformation is not. We prove that, for both α- and β-deformations, the resulting deformed triangles still have q-Gaussians as limiting distributions, with a value of q independent of (dependent on) the deformation parameter in the α-case (β-case). In contrast, the γ-case, where we have used the celebrated Q-numbers and the Gauss binomial coefficients, yields other limiting probability distribution functions, outside the q-Gaussian family. These results suggest that scale-invariance might play an important role regarding the robustness of the q-Gaussian family.

  10. Distribution-free Inference of Zero-inflated Binomial Data for Longitudinal Studies.

    PubMed

    He, H; Wang, W J; Hu, J; Gallop, R; Crits-Christoph, P; Xia, Y L

    2015-10-01

    Count responses with structural zeros are very common in medical and psychosocial research, especially in alcohol and HIV research, and the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models are widely used for modeling such outcomes. However, because alcohol drinking outcomes such as days of drinking are counts within a given period, their distributions are bounded above by an upper limit (the total days in the period) and thus, in the presence of structural zeros, inherently follow a binomial or zero-inflated binomial (ZIB) distribution rather than a Poisson or ZIP distribution. In this paper, we develop a new semiparametric approach for modeling ZIB-like count responses for cross-sectional as well as longitudinal data. We illustrate this approach with both simulated and real study data.
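
    For reference, a minimal sketch of the ZIB mass function described here (our notation: π for the structural-zero probability; the n, p, and π values are illustrative):

        # Zero-inflated binomial pmf:
        #   P(Y = 0) = pi + (1 - pi) * (1 - p)^n
        #   P(Y = k) = (1 - pi) * C(n, k) p^k (1 - p)^(n - k),  k = 1..n
        from scipy.stats import binom

        def zib_pmf(k, n, p, pi):
            """Zero-inflated binomial probability mass at k."""
            base = (1 - pi) * binom.pmf(k, n, p)
            return base + (pi if k == 0 else 0.0)

        n, p, pi = 30, 0.4, 0.35    # hypothetical: 30-day period, 35% structural zeros
        print(zib_pmf(0, n, p, pi))                              # inflated zero mass
        print(sum(zib_pmf(k, n, p, pi) for k in range(n + 1)))   # sums to 1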

  11. The magnetisation distribution of the Ising model - a new approach

    NASA Astrophysics Data System (ADS)

    Lundow, Per Hakan; Rosengren, Anders

    2010-03-01

    A completely new approach to the Ising model in 1 to 5 dimensions is developed. We employ a generalisation of the binomial coefficients to describe the magnetisation distributions of the Ising model. For the complete graph this distribution is exact. For simple lattices of dimensions d = 1 and d = 5 the magnetisation distributions are remarkably well-fitted by the generalised binomial distributions. For d = 4 we are only slightly less successful, while for d = 2, 3 we see some deviations (with exceptions!) between the generalised binomial and the Ising distribution. The results speak in favour of the generalised binomial distribution's correctness regarding its general behaviour in comparison to the Ising model. A theoretical analysis of the distribution's moments also lends support to their being correct asymptotically, including the logarithmic corrections in d = 4. The full extent to which they correctly model the Ising distribution, and for which graph families, is not settled though.

  12. Inference for binomial probability based on dependent Bernoulli random variables with applications to meta‐analysis and group level studies

    PubMed Central

    Bakbergenuly, Ilyas; Morgenthaler, Stephan

    2016-01-01

    We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group-level studies or in meta-analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log-odds and arcsine transformations of the estimated probability p̂, both for single-group studies and in combining results from several groups or studies in meta-analysis. Our simulations confirm that these biases are linear in ρ, for small values of ρ, the intracluster correlation coefficient. These biases do not depend on the sample sizes or the number of studies K in a meta-analysis and result in abysmal coverage of the combined effect for large K. We also propose bias-correction for the arcsine transformation. Our simulations demonstrate that this bias-correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta-analyses of prevalence. PMID:27192062

  13. Optimizing Probability of Detection Point Estimate Demonstration

    NASA Technical Reports Server (NTRS)

    Koshti, Ajay M.

    2017-01-01

    Probability of detection (POD) analysis is used in assessing reliably detectable flaw size in nondestructive evaluation (NDE). MIL-HDBK-1823 and the associated mh1823 POD software give the most common methods of POD analysis. These NDE methods are used to detect real flaws such as cracks and crack-like flaws, and a reliably detectable crack size is required for safe-life analysis of fracture-critical parts. The paper provides discussion on optimizing probability of detection (POD) demonstration experiments using the point estimate method, which is used by NASA for qualifying special NDE procedures. The point estimate method uses the binomial distribution for the probability density. Normally, a set of 29 flaws of the same size (within some tolerance) is used in the demonstration. The optimization is performed to provide an acceptable value for the probability of passing the demonstration (PPD) and an acceptable value for the probability of false calls (POF) while keeping the flaw sizes in the set as small as possible.
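
    The binomial arithmetic behind the standard 29-flaw demonstration is compact enough to sketch (the all-29-detected pass criterion is the usual convention, assumed here):

        # If true POD were only 0.90, detecting all 29 flaws has probability
        # 0.90^29 < 0.05, so a pass supports 90% POD at 95% confidence.
        n = 29
        print(0.90 ** n)                  # ~0.047 -> the 90/95 statement

        # Probability of passing the demonstration (PPD) versus the true POD:
        for pod in (0.90, 0.95, 0.98):
            print(f"true POD {pod:.2f}: PPD = {pod ** n:.3f}")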

  14. Analysis of generalized negative binomial distributions attached to hyperbolic Landau levels

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chhaiba, Hassan, E-mail: chhaiba.hassan@gmail.com; Demni, Nizar, E-mail: nizar.demni@univ-rennes1.fr; Mouayn, Zouhair, E-mail: mouayn@fstbm.ac.ma

    2016-07-15

    To each hyperbolic Landau level of the Poincaré disc is attached a generalized negative binomial distribution. In this paper, we compute the moment generating function of this distribution and supply its atomic decomposition as a perturbation of the negative binomial distribution by a finitely supported measure. Using the Mandel parameter, we also discuss the nonclassical nature of the associated coherent states. Next, we derive a Lévy-Khintchine-type representation of its characteristic function when the latter does not vanish and deduce that it is quasi-infinitely divisible except for the lowest hyperbolic Landau level corresponding to the negative binomial distribution. By considering the total variation of the obtained quasi-Lévy measure, we introduce a new infinitely divisible distribution for which we derive the characteristic function.

  15. Photon Counting Data Analysis: Application of the Maximum Likelihood and Related Methods for the Determination of Lifetimes in Mixtures of Rose Bengal and Rhodamine B

    DOE PAGES

    Santra, Kalyan; Smith, Emily A.; Petrich, Jacob W.; ...

    2016-12-12

    It is often convenient to know the minimum amount of data needed in order to obtain a result of desired accuracy and precision. It is a necessity in the case of subdiffraction-limited microscopies, such as stimulated emission depletion (STED) microscopy, owing to the limited sample volumes and the extreme sensitivity of the samples to photobleaching and photodamage. We present a detailed comparison of probability-based techniques (the maximum likelihood method and methods based on the binomial and the Poisson distributions) with residual minimization-based techniques for retrieving the fluorescence decay parameters for various two-fluorophore mixtures, as a function of the total number of photon counts, in time-correlated, single-photon counting experiments. The probability-based techniques proved to be the most robust (insensitive to initial values) in retrieving the target parameters and, in fact, performed equivalently to 2-3 significant figures. This is to be expected, as we demonstrate that the three methods are fundamentally related. Furthermore, methods based on the Poisson and binomial distributions have the desirable feature of providing a bin-by-bin analysis of a single fluorescence decay trace, which thus permits statistics to be acquired using only the one trace for not only the mean and median values of the fluorescence decay parameters but also for the associated standard deviations. Lastly, these probability-based methods lend themselves well to the analysis of the sparse data sets that are encountered in subdiffraction-limited microscopies.

  16. A Lower Bound to the Probability of Choosing the Optimal Passing Score for a Mastery Test When There is an External Criterion [and] Estimating the Parameters of the Beta-Binomial Distribution.

    ERIC Educational Resources Information Center

    Wilcox, Rand R.

    A mastery test is frequently described as follows: an examinee responds to n dichotomously scored test items. Depending upon the examinee's observed (number correct) score, a mastery decision is made and the examinee is advanced to the next level of instruction. Otherwise, a nonmastery decision is made and the examinee is given remedial work. This…

  17. Choosing a Transformation in Analyses of Insect Counts from Contagious Distributions with Low Means

    Treesearch

    W.D. Pepper; S.J. Zarnoch; G.L. DeBarr; P. de Groot; C.D. Tangren

    1997-01-01

    Guidelines based on computer simulation are suggested for choosing a transformation of insect counts from negative binomial distributions with low mean counts and high levels of contagion. Typical values and ranges of negative binomial model parameters were determined by fitting the model to data from 19 entomological field studies. Random sampling of negative binomial...

  18. A Monte Carlo Risk Analysis of Life Cycle Cost Prediction.

    DTIC Science & Technology

    1975-09-01

    process which occurs with each FLU failure. With this in mind there is no alternative other than the binomial distribution. ... With all of ... Weibull distribution of failures as selected by user. For each failure of the ith FLU, the model then samples from the binomial distribution to determine ... which is sampled from the binomial. Neither of the two conditions for normality are met, i.e., that RTS be close to .5 and the number of samples close

  19. Inference for binomial probability based on dependent Bernoulli random variables with applications to meta-analysis and group level studies.

    PubMed

    Bakbergenuly, Ilyas; Kulinskaya, Elena; Morgenthaler, Stephan

    2016-07-01

    We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group-level studies or in meta-analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log-odds and arcsine transformations of the estimated probability p̂, both for single-group studies and in combining results from several groups or studies in meta-analysis. Our simulations confirm that these biases are linear in ρ, for small values of ρ, the intracluster correlation coefficient. These biases do not depend on the sample sizes or the number of studies K in a meta-analysis and result in abysmal coverage of the combined effect for large K. We also propose bias-correction for the arcsine transformation. Our simulations demonstrate that this bias-correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta-analyses of prevalence. © 2016 The Authors. Biometrical Journal Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  20. WHAMII - An enumeration and insertion procedure with binomial bounds for the stochastic time-constrained traveling salesman problem

    NASA Technical Reports Server (NTRS)

    Dahl, Roy W.; Keating, Karen; Salamone, Daryl J.; Levy, Laurence; Nag, Barindra; Sanborn, Joan A.

    1987-01-01

    This paper presents an algorithm (WHAMII) designed to solve the Artificial Intelligence Design Challenge at the 1987 AIAA Guidance, Navigation and Control Conference. The problem under consideration is a stochastic generalization of the traveling salesman problem in which travel costs can incur a penalty with a given probability. The variability in travel costs leads to a probability constraint with respect to violating the budget allocation. Given the small size of the problem (eleven cities), an approach is considered that combines partial tour enumeration with a heuristic city insertion procedure. For computational efficiency during both the enumeration and insertion procedures, precalculated binomial probabilities are used to determine an upper bound on the actual probability of violating the budget constraint for each tour. The actual probability is calculated for the final best tour, and additional insertions are attempted until the actual probability exceeds the bound.

  1. The Binomial Distribution in Shooting

    ERIC Educational Resources Information Center

    Chalikias, Miltiadis S.

    2009-01-01

    The binomial distribution is used to predict the winner of the 49th International Shooting Sport Federation World Championship in double trap shooting held in 2006 in Zagreb, Croatia. The outcome of the competition was definitely unexpected.

  2. Partitioning Detectability Components in Populations Subject to Within-Season Temporary Emigration Using Binomial Mixture Models

    PubMed Central

    O’Donnell, Katherine M.; Thompson, Frank R.; Semlitsch, Raymond D.

    2015-01-01

    Detectability of individual animals is highly variable and nearly always < 1; imperfect detection must be accounted for to reliably estimate population sizes and trends. Hierarchical models can simultaneously estimate abundance and effective detection probability, but there are several different mechanisms that cause variation in detectability. Neglecting temporary emigration can lead to biased population estimates because availability and conditional detection probability are confounded. In this study, we extend previous hierarchical binomial mixture models to account for multiple sources of variation in detectability. The state process of the hierarchical model describes ecological mechanisms that generate spatial and temporal patterns in abundance, while the observation model accounts for the imperfect nature of counting individuals due to temporary emigration and false absences. We illustrate our model’s potential advantages, including the allowance of temporary emigration between sampling periods, with a case study of southern red-backed salamanders Plethodon serratus. We fit our model and a standard binomial mixture model to counts of terrestrial salamanders surveyed at 40 sites during 3–5 surveys each spring and fall 2010–2012. Our models generated similar parameter estimates to standard binomial mixture models. Aspect was the best predictor of salamander abundance in our case study; abundance increased as aspect became more northeasterly. Increased time-since-rainfall strongly decreased salamander surface activity (i.e. availability for sampling), while higher amounts of woody cover objects and rocks increased conditional detection probability (i.e. probability of capture, given an animal is exposed to sampling). By explicitly accounting for both components of detectability, we increased congruence between our statistical modeling and our ecological understanding of the system. We stress the importance of choosing survey locations and protocols that maximize species availability and conditional detection probability to increase population parameter estimate reliability. PMID:25775182

  3. Using beta binomials to estimate classification uncertainty for ensemble models.

    PubMed

    Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin

    2014-01-01

    Quantitative structure-activity relationship (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprised of logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the number of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.

  4. Estimation of aquifer scale proportion using equal area grids: assessment of regional scale groundwater quality

    USGS Publications Warehouse

    Belitz, Kenneth; Jurgens, Bryant C.; Landon, Matthew K.; Fram, Miranda S.; Johnson, Tyler D.

    2010-01-01

    The proportion of an aquifer with constituent concentrations above a specified threshold (high concentrations) is taken as a nondimensional measure of regional scale water quality. If computed on the basis of area, it can be referred to as the aquifer scale proportion. A spatially unbiased estimate of aquifer scale proportion and a confidence interval for that estimate are obtained through the use of equal area grids and the binomial distribution. Traditionally, the confidence interval for a binomial proportion is computed using either the standard interval or the exact interval. Research from the statistics literature has shown that the standard interval should not be used and that the exact interval is overly conservative. On the basis of coverage probability and interval width, the Jeffreys interval is preferred. If more than one sample per cell is available, cell declustering is used to estimate the aquifer scale proportion, and Kish's design effect may be useful for estimating an effective number of samples. The binomial distribution is also used to quantify the adequacy of a grid with a given number of cells for identifying a small target, defined as a constituent that is present at high concentrations in a small proportion of the aquifer. Case studies illustrate a consistency between approaches that use one well per grid cell and many wells per cell. The methods presented in this paper provide a quantitative basis for designing a sampling program and for utilizing existing data.
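
    A minimal sketch of the preferred Jeffreys interval (our code; the 7-of-50 cell counts are hypothetical), using the Beta(x + 1/2, n - x + 1/2) quantiles with the usual endpoint adjustments:

        # Jeffreys interval for a binomial proportion: x "high" cells out of n.
        from scipy.stats import beta

        def jeffreys_interval(x, n, conf=0.90):
            lo = beta.ppf((1 - conf) / 2, x + 0.5, n - x + 0.5) if x > 0 else 0.0
            hi = beta.ppf(1 - (1 - conf) / 2, x + 0.5, n - x + 0.5) if x < n else 1.0
            return lo, hi

        # e.g. 7 of 50 equal-area cells with high concentrations (hypothetical data):
        lo, hi = jeffreys_interval(7, 50)
        print(f"aquifer scale proportion ~ {7/50:.2f}, 90% CI ({lo:.3f}, {hi:.3f})")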

  5. Optimal estimation for discrete time jump processes

    NASA Technical Reports Server (NTRS)

    Vaca, M. V.; Tretter, S. A.

    1978-01-01

    Optimum estimates of nonobservable random variables or random processes which influence the rate functions of a discrete time jump process (DTJP) are derived. The approach used is based on the a posteriori probability of a nonobservable event expressed in terms of the a priori probability of that event and of the sample function probability of the DTJP. Thus a general representation is obtained for optimum estimates, and recursive equations are derived for minimum mean-squared error (MMSE) estimates. In general, MMSE estimates are nonlinear functions of the observations. The problem of estimating the rate of a DTJP is considered for the case in which the rate is a random variable with a beta probability density function and the jump amplitudes are binomially distributed. It is shown that the MMSE estimates are linear. The class of beta density functions is rather rich, which explains why there are insignificant differences between optimum unconstrained and linear MMSE estimates in a variety of problems.

  6. Phase transition and information cascade in a voting model

    NASA Astrophysics Data System (ADS)

    Hisakado, M.; Mori, S.

    2010-08-01

    In this paper, we introduce a voting model that is similar to a Keynesian beauty contest and analyse it from a mathematical point of view. There are two types of voters—copycat and independent—and two candidates. Our voting model is a binomial distribution (independent voters) doped in a beta binomial distribution (copycat voters). We find that the phase transition in this system is at the upper limit of t, where t is the time (or the number of votes). Our model contains three phases. If copycats constitute a majority or even half of the total voters, the voting rate converges more slowly than it would in a binomial distribution. If independents constitute the majority of voters, the voting rate converges at the same rate as it would in a binomial distribution. We also study why it is difficult to estimate the outcome of a Keynesian beauty contest when there is an information cascade.
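
    A toy version of such a doped model (our own construction with assumed parameters; the copycats here follow a Pólya urn, whose totals are beta-binomially distributed) illustrates the slow convergence of the vote share when copycats dominate:

        # Independents vote A with fixed probability; copycats imitate the tally.
        import numpy as np

        rng = np.random.default_rng(3)

        def election(n_votes, frac_copycat, p_ind=0.5, a=1, b=1):
            votes_a, votes_total = a, a + b      # urn seeded with a, b pseudo-votes
            count_a = 0
            for _ in range(n_votes):
                if rng.random() < frac_copycat:  # copycat: follow current tally
                    vote = rng.random() < votes_a / votes_total
                else:                            # independent: fixed preference
                    vote = rng.random() < p_ind
                count_a += vote
                votes_a += vote
                votes_total += 1
            return count_a / n_votes

        shares = [election(1000, 0.8) for _ in range(200)]
        print(f"mean share {np.mean(shares):.3f}, sd {np.std(shares):.3f}")  # wide spread from herding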

  7. Estimating the Parameters of the Beta-Binomial Distribution.

    ERIC Educational Resources Information Center

    Wilcox, Rand R.

    1979-01-01

    For some situations the beta-binomial distribution might be used to describe the marginal distribution of test scores for a particular population of examinees. Several different methods of approximating the maximum likelihood estimate were investigated, and it was found that the Newton-Raphson method should be used when it yields admissible…

  8. Reliability of environmental sampling culture results using the negative binomial intraclass correlation coefficient.

    PubMed

    Aly, Sharif S; Zhao, Jianyang; Li, Ben; Jiang, Jiming

    2014-01-01

    The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative measures obtained from different sources. Overdispersed data are traditionally transformed so that a linear mixed model (LMM) based ICC can be estimated; a common transformation is the natural logarithm. The reliability of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp. paratuberculosis using natural-logarithm-transformed culture results. Recently, the negative binomial ICC was defined based on a generalized linear mixed model for negative binomial distributed data. The current study reports on the negative binomial ICC estimate, which includes fixed effects, using culture results of environmental samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed better performance of the new negative binomial ICC compared to the LMM-based ICC, even when the negative binomial data were logarithm- or square-root-transformed. A second comparison targeting a wider range of ICC values showed that the mean of the estimated ICC closely approximated the true ICC.

  9. I Remember You: Independence and the Binomial Model

    ERIC Educational Resources Information Center

    Levine, Douglas W.; Rockhill, Beverly

    2006-01-01

    We focus on the problem of ignoring statistical independence. A binomial experiment is used to determine whether judges could match, based on looks alone, dogs to their owners. The experimental design introduces dependencies such that the probability of a given judge correctly matching a dog and an owner changes from trial to trial. We show how…

  10. Simulation on Poisson and negative binomial models of count road accident modeling

    NASA Astrophysics Data System (ADS)

    Sapuan, M. S.; Razali, A. M.; Zamzuri, Z. H.; Ibrahim, K.

    2016-11-01

    Accident count data have often been shown to exhibit overdispersion, and the data may also contain excess zero counts. A simulation study was conducted to create scenarios in which accidents happen at a T-junction, with the dependent variable of the generated data assumed to follow a given distribution, namely the Poisson or the negative binomial, for sample sizes from n=30 to n=500. The study objective was accomplished by fitting a Poisson regression, a negative binomial regression and a hurdle negative binomial model to the simulated data. The model fits were compared, and the results show that, for each sample size, not every model fits the data well even when the data were generated from its own distribution, especially when the sample size is large. Furthermore, larger sample sizes produced more zero accident counts in the dataset.

  11. Bernoulli, Darwin, and Sagan: the probability of life on other planets

    NASA Astrophysics Data System (ADS)

    Rossmo, D. Kim

    2017-04-01

    The recent discovery that billions of planets in the Milky Way Galaxy may be in circumstellar habitable zones has renewed speculation over the possibility of extraterrestrial life. The Drake equation is a probabilistic framework for estimating the number of technologically advanced civilizations in our Galaxy; however, many of the equation's component probabilities are either unknown or have large error intervals. In this paper, a different method of examining this question is explored, one that replaces the various Drake factors with a single estimate of the probability of life existing on Earth. This relationship can be described by the binomial distribution if the presence of life on a given number of planets is equated to successes in a Bernoulli trial. The question of exoplanet life may then be reformulated as follows: given the probability of one or more independent successes for a given number of trials, what is the probability of two or more successes? Some of the implications of this approach for finding life on exoplanets are discussed.
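
    The reformulated question reduces to two lines of binomial algebra (a sketch; the planet count and the assumed P(X >= 1) below are placeholders, not estimates from the paper):

        # If n planets each develop life independently with probability p, and
        # P(at least one success) is taken as given, what is P(two or more)?
        n = 1_000_000_000        # hypothetical number of trials (planets)
        q1 = 0.5                 # assumed P(X >= 1), for illustration only

        # invert q1 = 1 - (1 - p)^n for the per-planet probability p
        p = 1 - (1 - q1) ** (1 / n)

        # P(X >= 2) = 1 - (1-p)^n - n*p*(1-p)^(n-1)
        p0 = (1 - p) ** n
        p1 = n * p * (1 - p) ** (n - 1)
        print(f"P(X >= 2 | P(X >= 1) = {q1}) = {1 - p0 - p1:.4f}")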

  12. Modelling road accident blackspots data with the discrete generalized Pareto distribution.

    PubMed

    Prieto, Faustino; Gómez-Déniz, Emilio; Sarabia, José María

    2014-10-01

    This study shows how road traffic network events, in particular road accidents at blackspots, can be modelled with simple probabilistic distributions. We considered the number of crashes and the number of fatalities on Spanish blackspots in the period 2003-2007, from the Spanish General Directorate of Traffic (DGT). We modelled those datasets, respectively, with the discrete generalized Pareto distribution (a discrete parametric model with three parameters) and with the discrete Lomax distribution (a discrete parametric model with two parameters, and a particular case of the previous model). To that end, we analyzed the basic properties of both parametric models: cumulative distribution, survival, probability mass, quantile and hazard functions, genesis and rth-order moments; applied two estimation methods for their parameters: the μ and (μ+1) frequency method and the maximum likelihood method; used two goodness-of-fit tests: the Chi-square test and the discrete Kolmogorov-Smirnov test based on bootstrap resampling; and compared them with the classical negative binomial distribution in terms of absolute probabilities and in models including covariates. We found that these probabilistic models can be useful for describing the road accident blackspot datasets analyzed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. [The reentrant binomial model of nuclear anomaly growth in rhabdomyosarcoma RA-23 cell populations under increasing doses of sparsely ionizing radiation].

    PubMed

    Alekseeva, N P; Alekseev, A O; Vakhtin, Iu B; Kravtsov, V Iu; Kuzovatov, S N; Skorikova, T I

    2008-01-01

    Distributions of nuclear morphology anomalies in transplantable rhabdomyosarcoma RA-23 cell populations were investigated under the effect of ionizing radiation from 0 to 45 Gy. Internuclear bridges, nuclear protrusions and dumbbell-shaped nuclei were taken as morphological anomalies, and empirical distributions of the number of anomalies per 100 nuclei were used. An adequate model, the reentrant binomial distribution, was found: the sum of binomial random variables with a binomially distributed number of summands has such a distribution. The averages of these random variables were named, accordingly, the internal and external average reentrant components, and their maximum likelihood estimates were obtained. The statistical properties of these estimates were investigated by means of statistical modeling. It was found that, with equally significant correlation between the radiation dose and the average number of nuclear anomalies, in cell populations two to three cell cycles after irradiation in vivo the dose correlates significantly with the internal average reentrant component, whereas in remote descendants of cell transplants irradiated in vitro it correlates with the external one.
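
    The reentrant construct itself is easy to simulate (our sketch with illustrative parameters), using the fact that a sum of K iid Binomial(n2, p2) variables is Binomial(n2*K, p2):

        # Sum of a binomial number of binomial random variables.
        import numpy as np

        rng = np.random.default_rng(4)
        n1, p1, n2, p2 = 10, 0.3, 8, 0.2            # assumed values

        K = rng.binomial(n1, p1, size=100_000)      # number of summands per draw
        X = rng.binomial(n2 * K, p2)                # sum of K iid Binomial(n2, p2)

        # mean equals (n1*p1) * (n2*p2): external times internal component
        print(X.mean(), n1 * p1 * n2 * p2)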

  14. Optimizing probability of detection point estimate demonstration

    NASA Astrophysics Data System (ADS)

    Koshti, Ajay M.

    2017-04-01

    The paper provides discussion on optimizing probability of detection (POD) demonstration experiments using the point estimate method. The optimization is performed to provide an acceptable value for the probability of passing the demonstration (PPD) and an acceptable value for the probability of false calls (POF) while keeping the flaw sizes in the set as small as possible. The point estimate method is used by NASA for qualifying special NDE procedures and uses the binomial distribution for the probability density. Normally, a set of 29 flaws of the same size (within some tolerance) is used in the demonstration. Traditionally, the largest flaw size in the set is considered to be a conservative estimate of the flaw size with minimum 90% probability and 95% confidence, denoted as α90/95PE. The paper investigates the relationship between the range of flaw sizes and α90, i.e. the 90% probability flaw size, to provide a desired PPD. The range of flaw sizes is expressed as a proportion of the standard deviation of the probability density distribution, as is the difference between the median or average of the 29 flaws and α90. In general, it is concluded that, if probability of detection increases with flaw size, the average of the 29 flaw sizes will always be larger than or equal to α90 and is an acceptable measure of α90/95PE. If the NDE technique has sufficient sensitivity and signal-to-noise ratio, the 29-flaw set can be optimized to meet requirements on minimum PPD, maximum allowable POF, flaw size tolerance about the mean flaw size, and flaw size detectability. The paper provides a procedure for optimizing flaw sizes in the point estimate demonstration flaw set.

  15. Binomial Baseball.

    ERIC Educational Resources Information Center

    Levin, Eugene M.

    1981-01-01

    Student access to programmable calculators and computer terminals, coupled with a familiarity with baseball, provides opportunities to enhance students' understanding of the binomial distribution and other aspects of analysis. (MP)

  16. Visualizing and Understanding Probability and Statistics: Graphical Simulations Using Excel

    ERIC Educational Resources Information Center

    Gordon, Sheldon P.; Gordon, Florence S.

    2009-01-01

    The authors describe a collection of dynamic interactive simulations for teaching and learning most of the important ideas and techniques of introductory statistics and probability. The modules cover such topics as randomness, simulations of probability experiments such as coin flipping, dice rolling and general binomial experiments, a simulation…

  17. A Three-Parameter Generalisation of the Beta-Binomial Distribution with Applications

    DTIC Science & Technology

    1987-07-01

    Rust, R.T. and Klompmaker, J.E. (1981). Improving the estimation procedure for the beta binomial t.v. exposure model. Journal of Marketing Research, 18, 442-448. Sabavala, D.J. and Morrison, D.G. (1977). Television show loyalty: a beta-binomial model using recall data. Journal of Advertising

  18. Maximizing Statistical Power When Verifying Probabilistic Forecasts of Hydrometeorological Events

    NASA Astrophysics Data System (ADS)

    DeChant, C. M.; Moradkhani, H.

    2014-12-01

    Hydrometeorological events (i.e. floods, droughts, precipitation) are increasingly being forecasted probabilistically, owing to the uncertainties in the underlying causes of the phenomenon. In these forecasts, the probability of the event, over some lead time, is estimated based on some model simulations or predictive indicators. By issuing probabilistic forecasts, agencies may communicate the uncertainty in the event occurring. Assuming that the assigned probability of the event is correct, which is referred to as a reliable forecast, the end user may perform some risk management based on the potential damages resulting from the event. Alternatively, an unreliable forecast may give false impressions of the actual risk, leading to improper decision making when protecting resources from extreme events. Due to this requisite for reliable forecasts to perform effective risk management, this study takes a renewed look at reliability assessment in event forecasts. Illustrative experiments will be presented, showing deficiencies in the commonly available approaches (Brier Score, Reliability Diagram). Overall, it is shown that the conventional reliability assessment techniques do not maximize the ability to distinguish between a reliable and unreliable forecast. In this regard, a theoretical formulation of the probabilistic event forecast verification framework will be presented. From this analysis, hypothesis testing with the Poisson-Binomial distribution is the most exact model available for the verification framework, and therefore maximizes one's ability to distinguish between a reliable and unreliable forecast. Application of this verification system was also examined within a real forecasting case study, highlighting the additional statistical power provided with the use of the Poisson-Binomial distribution.
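
    The Poisson-binomial pmf needed for such a test is exactly computable by sequential convolution (a minimal sketch; the forecast probabilities are made up):

        # Exact pmf of the number of events among independent Bernoulli(p_i) trials.
        import numpy as np

        def poisson_binomial_pmf(probs):
            """Exact pmf of the sum of independent Bernoulli(p_i) variables."""
            pmf = np.array([1.0])
            for p in probs:
                pmf = np.convolve(pmf, [1 - p, p])   # add one Bernoulli at a time
            return pmf

        probs = [0.1, 0.35, 0.6, 0.8]        # hypothetical forecast probabilities
        pmf = poisson_binomial_pmf(probs)
        print(pmf, pmf.sum())                # pmf over 0..4 events, sums to 1
        # a hypothesis test then asks how surprising the observed event count is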

  19. Using the β-binomial distribution to characterize forest health

    Treesearch

    S.J. Zarnoch; R.L. Anderson; R.M. Sheffield

    1995-01-01

    The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
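
    A minimal maximum-likelihood fitting sketch in this spirit (our own code on simulated data, not the authors' procedure; the plot size n and true parameters are assumed):

        # Fit a beta-binomial by maximising the log-likelihood over (a, b).
        import numpy as np
        from scipy.optimize import minimize
        from scipy.stats import betabinom

        rng = np.random.default_rng(2)
        n = 20                                    # trees assessed per plot (assumed)
        y = betabinom.rvs(n, 2.0, 5.0, size=200, random_state=rng)

        def nll(params):
            a, b = np.exp(params)                 # optimise on log scale: a, b > 0
            return -betabinom.logpmf(y, n, a, b).sum()

        res = minimize(nll, x0=[0.0, 0.0], method="Nelder-Mead")
        print("fitted (a, b):", np.exp(res.x))    # should be near the true (2, 5)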

  20. Pricing American Asian options with higher moments in the underlying distribution

    NASA Astrophysics Data System (ADS)

    Lo, Keng-Hsin; Wang, Kehluh; Hsu, Ming-Feng

    2009-01-01

    We develop a modified Edgeworth binomial model with higher moment consideration for pricing American Asian options. With a lognormal underlying distribution for benchmark comparison, our algorithm is as precise as that of Chalasani et al. [P. Chalasani, S. Jha, F. Egriboyun, A. Varikooty, A refined binomial lattice for pricing American Asian options, Rev. Derivatives Res. 3 (1) (1999) 85-105] as the number of time steps increases. If the underlying distribution displays negative skewness and leptokurtosis, as often observed for stock index returns, our estimates can work better than those in Chalasani et al. and are very similar to the benchmarks in Hull and White [J. Hull, A. White, Efficient procedures for valuing European and American path-dependent options, J. Derivatives 1 (Fall) (1993) 21-31]. The numerical analysis shows that our modified Edgeworth binomial model can value American Asian options with greater accuracy and speed given higher moments in their underlying distribution.
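
    For context, the plain CRR binomial lattice that such Edgeworth-adjusted schemes refine can be sketched as follows (a standard textbook American put pricer, not the authors' algorithm; the parameters are illustrative):

        # Cox-Ross-Rubinstein lattice with early exercise at every node.
        import numpy as np

        def american_put_crr(S0, K, r, sigma, T, steps):
            dt = T / steps
            u = np.exp(sigma * np.sqrt(dt)); d = 1 / u
            q = (np.exp(r * dt) - d) / (u - d)      # risk-neutral up probability
            disc = np.exp(-r * dt)
            S = S0 * u ** np.arange(steps, -1, -1) * d ** np.arange(0, steps + 1)
            V = np.maximum(K - S, 0.0)              # terminal payoffs
            for i in range(steps - 1, -1, -1):
                S = S[:-1] * d                      # stock prices at step i
                V = np.maximum(K - S, disc * (q * V[:-1] + (1 - q) * V[1:]))
            return V[0]

        print(round(american_put_crr(100, 100, 0.05, 0.2, 1.0, 500), 4))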

  1. Spatial distribution of single-nucleotide polymorphisms related to fungicide resistance and implications for sampling.

    PubMed

    Van der Heyden, H; Dutilleul, P; Brodeur, L; Carisse, O

    2014-06-01

    Spatial distribution of single-nucleotide polymorphisms (SNPs) related to fungicide resistance was studied for Botrytis cinerea populations in vineyards and for B. squamosa populations in onion fields. Heterogeneity in this distribution was characterized by performing geostatistical analyses based on semivariograms and through the fitting of discrete probability distributions. Two SNPs known to be responsible for boscalid resistance (H272R and H272Y), both located on the B subunit of the succinate dehydrogenase gene, and one SNP known to be responsible for dicarboximide resistance (I365S) were chosen for B. cinerea in grape. For B. squamosa in onion, one SNP responsible for dicarboximide resistance (I365S homologous) was chosen. One onion field was sampled in 2009 and another one was sampled in 2010 for B. squamosa, and two vineyards were sampled in 2011 for B. cinerea, for a total of four sampled sites. Cluster sampling was carried out on a 10-by-10 grid, each of the 100 nodes being the center of a 10-by-10-m quadrat. In each quadrat, 10 samples were collected and analyzed by restriction fragment length polymorphism polymerase chain reaction (PCR) or allele-specific PCR. Mean SNP incidence varied from 16 to 68%, with an overall mean incidence of 43%. In the geostatistical analyses, omnidirectional variograms showed spatial autocorrelation characterized by ranges of 21 to 1 m. Various levels of anisotropy were detected, however, with variograms computed in four directions (at 0°, 45°, 90°, and 135° from the within-row direction used as reference), indicating that spatial autocorrelation was prevalent or characterized by a longer range in one direction. For all eight data sets, the β-binomial distribution was found to fit the data better than the binomial distribution. This indicates local aggregation of fungicide resistance among sampling units, as supported by estimates of the parameter θ of the β-binomial distribution of 0.09 to 0.23 (overall median value = 0.20). On the basis of the observed spatial distribution patterns of SNP incidence, sampling curves were computed for different levels of reliability, emphasizing the importance of sample size for the detection of mutation incidence below the risk threshold for control failure.

  2. Topics in Bayesian Hierarchical Modeling and its Monte Carlo Computations

    NASA Astrophysics Data System (ADS)

    Tak, Hyung Suk

    The first chapter addresses a Beta-Binomial-Logit model that is a Beta-Binomial conjugate hierarchical model with covariate information incorporated via a logistic regression. Various researchers in the literature have unknowingly used improper posterior distributions or have given incorrect statements about posterior propriety because checking posterior propriety can be challenging due to the complicated functional form of a Beta-Binomial-Logit model. We derive data-dependent necessary and sufficient conditions for posterior propriety within a class of hyper-prior distributions that encompass those used in previous studies. Frequency coverage properties of several hyper-prior distributions are also investigated to see when and whether Bayesian interval estimates of random effects meet their nominal confidence levels. The second chapter deals with a time delay estimation problem in astrophysics. When the gravitational field of an intervening galaxy between a quasar and the Earth is strong enough to split light into two or more images, the time delay is defined as the difference between their travel times. The time delay can be used to constrain cosmological parameters and can be inferred from the time series of brightness data of each image. To estimate the time delay, we construct a Gaussian hierarchical model based on a state-space representation for irregularly observed time series generated by a latent continuous-time Ornstein-Uhlenbeck process. Our Bayesian approach jointly infers model parameters via a Gibbs sampler. We also introduce a profile likelihood of the time delay as an approximation of its marginal posterior distribution. The last chapter specifies a repelling-attracting Metropolis algorithm, a new Markov chain Monte Carlo method to explore multi-modal distributions in a simple and fast manner. This algorithm is essentially a Metropolis-Hastings algorithm with a proposal that consists of a downhill move in density that aims to make local modes repelling, followed by an uphill move in density that aims to make local modes attracting. The downhill move is achieved via a reciprocal Metropolis ratio so that the algorithm prefers downward movement. The uphill move does the opposite using the standard Metropolis ratio which prefers upward movement. This down-up movement in density increases the probability of a proposed move to a different mode.

  3. [Distribution of individuals by spontaneous frequencies of lymphocytes with micronuclei. Particularity and consequences].

    PubMed

    Serebrianyĭ, A M; Akleev, A V; Aleshchenko, A V; Antoshchina, M M; Kudriashova, O V; Riabchenko, N I; Semenova, L P; Pelevina, I I

    2011-01-01

    Using the cytokinesis-block micronucleus (MN) assay with cytochalasin B, the mean frequency of blood lymphocytes with MN was determined in 76 Moscow inhabitants, 35 people from Obninsk and 122 from the Chelyabinsk region. In contrast to the distribution of individuals by spontaneous frequency of cells with aberrations, which was shown to be binomial (Kusnetzov et al., 1980), the distribution of individuals by spontaneous frequency of cells with MN in all three cohorts can be regarded as log-normal (χ² test). The distribution of individuals in the pooled cohorts (Moscow and Obninsk inhabitants) and in the single cohort of all examined subjects must with high reliability be regarded as log-normal (0.70 and 0.86, respectively), but it cannot be regarded as Poisson, binomial or normal. Given that a log-normal distribution of children by spontaneous frequency of lymphocytes with MN was also observed in a survey of 473 children from different kindergartens in Moscow, we conclude that log-normality is a regularity inherent in this type of damage to the lymphocyte genome. By contrast, the distribution of individuals by the frequency of lymphocytes with MN induced by in vitro irradiation must in most cases be regarded as normal. This distributional character indicates that the appearance of damage (genomic instability) in a single lymphocyte increases the probability of damage appearing in other lymphocytes. We propose that damaged stem-cell lymphocyte progenitors exchange information with undamaged cells, a bystander-effect-like process. It can also be supposed that transmission of damage to daughter cells occurs at the time of stem cell division.

  4. Logit and probit model in toll sensitivity analysis of Solo-Ngawi, Kartasura-Palang Joglo segment based on Willingness to Pay (WTP)

    NASA Astrophysics Data System (ADS)

    Handayani, Dewi; Cahyaning Putri, Hera; Mahmudah, AMH

    2017-12-01

    The Solo-Ngawi toll road project is part of the mega project of the Trans Java toll road development initiated by the government and is still under construction. PT Solo Ngawi Jaya (SNJ), the Solo-Ngawi toll management company, needs to determine a toll fare that is in accordance with the business plan. The determination of appropriate toll rates will affect progress in regional economic sustainability and reduce traffic congestion; such policy instruments are crucial for achieving environmentally sustainable transport. Therefore, the objective of this research is to determine the toll fare sensitivity of the Solo-Ngawi toll road based on Willingness To Pay (WTP). Primary data were obtained by distributing stated-preference questionnaires to four-wheeled vehicle users on the Kartasura-Palang Joglo arterial road segment. The data were then analysed with logit and probit models. Based on the analysis, it is found that the binomial logit model is more sensitive to fare changes in the amount of WTP than the probit model under the same travel conditions. The range of tariff change against values of WTP in the binomial logit model is 20% greater than the range of values in the probit model. On the other hand, the probability results of the binomial logit model and the binary probit model show no significant difference (less than 1%).
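
    The two choice models being compared differ only in the link function (a sketch; the utility coefficients and fare levels below are made-up illustrations, not the fitted values):

        # Logit uses the logistic CDF, probit the normal CDF, on the same utility.
        import numpy as np
        from scipy.stats import norm

        a, b = 4.0, -0.0008                 # hypothetical intercept and fare coefficient
        fares = np.linspace(0, 10_000, 5)   # hypothetical fare levels

        u = a + b * fares
        p_logit = 1 / (1 + np.exp(-u))
        p_probit = norm.cdf(u / 1.6)        # /1.6 roughly aligns probit scale to logit

        for f, pl, pp in zip(fares, p_logit, p_probit):
            print(f"fare {f:>7.0f}: logit {pl:.3f}, probit {pp:.3f}")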

  5. Probabilistic assessment of precipitation-triggered landslides using historical records of landslide occurence, Seattle, Washington

    USGS Publications Warehouse

    Coe, J.A.; Michael, J.A.; Crovelli, R.A.; Savage, W.Z.; Laprade, W.T.; Nashem, W.D.

    2004-01-01

    Ninety years of historical landslide records were used as input to the Poisson and binomial probability models. Results from these models show that, for precipitation-triggered landslides, approximately 9 percent of the area of Seattle has annual exceedance probabilities of 1 percent or greater. Application of the Poisson model for estimating the future occurrence of individual landslides results in a worst-case scenario map, with a maximum annual exceedance probability of 25 percent on a hillslope near Duwamish Head in West Seattle. Application of the binomial model for estimating the future occurrence of a year with one or more landslides results in a map with a maximum annual exceedance probability of 17 percent (also near Duwamish Head). Slope and geology both play a role in localizing the occurrence of landslides in Seattle. A positive correlation exists between slope and mean exceedance probability, with probability tending to increase as slope increases. Sixty-four percent of all historical landslide locations are within 150 m (500 ft, horizontal distance) of the Esperance Sand/Lawton Clay contact, but within this zone, no positive or negative correlation exists between exceedance probability and distance to the contact.
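
    Both exceedance estimates follow directly from a historical record (a sketch of the standard forms; the counts below are placeholders, not the Seattle data):

        # Annual exceedance probabilities from a T-year landslide record at a site.
        import math

        T, m, k = 90, 12, 9            # hypothetical: years, landslides, years with >= 1

        lam = m / T                    # Poisson rate, landslides per year
        p_poisson = 1 - math.exp(-lam) # P(one or more individual landslides next year)

        p_binomial = k / T             # P(next year has one or more landslides)
        print(f"Poisson: {p_poisson:.3f}, binomial: {p_binomial:.3f}")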

  6. A brief history of numbers and statistics with cytometric applications.

    PubMed

    Watson, J V

    2001-02-15

    A brief history of numbers and statistics traces the development of numbers from prehistory to the completion of our current system of numeration with the introduction of the decimal fraction by Viète, Stevin, Bürgi, and Galileo at the turn of the 16th century. This was followed by the development of what we now know as probability theory by Pascal, Fermat, and Huygens in the mid-17th century, which arose in connection with questions in gambling with dice and can be regarded as the origin of statistics. The three main probability distributions on which statistics depend were introduced and/or formalized between the mid-17th and early 19th centuries: the binomial distribution by Pascal; the normal distribution by de Moivre, Gauss, and Laplace; and the Poisson distribution by Poisson. The formal discipline of statistics commenced with the works of Pearson, Yule, and Gosset at the turn of the 19th century, when the first statistical tests were introduced. Elementary descriptions of the statistical tests most likely to be used in conjunction with cytometric data are given, and it is shown how these can be applied to the analysis of difficult immunofluorescence distributions when there is overlap between the labeled and unlabeled cell populations. Copyright 2001 Wiley-Liss, Inc.

  7. Assessment of NDE Reliability Data

    NASA Technical Reports Server (NTRS)

    Yee, B. G. W.; Chang, F. H.; Couchman, J. C.; Lemon, G. H.; Packman, P. F.

    1976-01-01

    Twenty sets of relevant Nondestructive Evaluation (NDE) reliability data have been identified, collected, compiled, and categorized. A criterion for the selection of data for statistical analysis considerations has been formulated. A model to grade the quality and validity of the data sets has been developed. Data input formats, which record the pertinent parameters of the defect/specimen and inspection procedures, have been formulated for each NDE method. A comprehensive computer program has been written to calculate the probability of flaw detection at several confidence levels by the binomial distribution. This program also selects the desired data sets for pooling and tests the statistical pooling criteria before calculating the composite detection reliability. Probability of detection curves at 95 and 50 percent confidence levels have been plotted for individual sets of relevant data as well as for several sets of merged data with common sets of NDE parameters.

  8. Mixture models for estimating the size of a closed population when capture rates vary among individuals

    USGS Publications Warehouse

    Dorazio, R.M.; Royle, J. Andrew

    2003-01-01

    We develop a parameterization of the beta-binomial mixture that provides sensible inferences about the size of a closed population when probabilities of capture or detection vary among individuals. Three classes of mixture models (beta-binomial, logistic-normal, and latent-class) are fitted to recaptures of snowshoe hares for estimating abundance and to counts of bird species for estimating species richness. In both sets of data, rates of detection appear to vary more among individuals (animals or species) than among sampling occasions or locations. The estimates of population size and species richness are sensitive to model-specific assumptions about the latent distribution of individual rates of detection. We demonstrate using simulation experiments that conventional diagnostics for assessing model adequacy, such as deviance, cannot be relied on for selecting classes of mixture models that produce valid inferences about population size. Prior knowledge about sources of individual heterogeneity in detection rates, if available, should be used to help select among classes of mixture models that are to be used for inference.

  9. The Sequential Probability Ratio Test: An efficient alternative to exact binomial testing for Clean Water Act 303(d) evaluation.

    PubMed

    Chen, Connie; Gribble, Matthew O; Bartroff, Jay; Bay, Steven M; Goldstein, Larry

    2017-05-01

    The United States' Clean Water Act stipulates in section 303(d) that states must identify impaired water bodies for which total maximum daily loads (TMDLs) of pollution inputs into water bodies are developed. Decision-making procedures about how to list, or delist, water bodies as impaired, or not, per Clean Water Act 303(d) differ across states. In states such as California, whether or not a particular monitoring sample suggests that water quality is impaired can be regarded as a binary outcome variable, and California's current regulatory framework invokes a version of the exact binomial test to consolidate evidence across samples and assess whether the overall water body complies with the Clean Water Act. Here, we contrast the performance of California's exact binomial test with one potential alternative, the Sequential Probability Ratio Test (SPRT). The SPRT uses a sequential testing framework, testing samples as they become available and evaluating evidence as it emerges, rather than measuring all the samples and calculating a test statistic at the end of the data collection process. Through simulations and theoretical derivations, we demonstrate that the SPRT on average requires fewer samples to be measured to achieve Type I and Type II error rates comparable to those of the current fixed-sample binomial test. Policymakers might consider efficient alternatives to the current procedure, such as the SPRT. Copyright © 2017 Elsevier Ltd. All rights reserved.
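
    As a rough illustration of the sequential idea (not the exact procedure or parameters evaluated in the paper), a Wald SPRT for a binary exceedance indicator can be sketched as follows; p0, p1, and the error rates are placeholders.

```python
# Hedged sketch of Wald's SPRT for binary water-quality exceedances.
import math

def sprt(samples, p0=0.10, p1=0.25, alpha=0.05, beta_err=0.05):
    """Test H0: p = p0 vs H1: p = p1; return (decision, samples used)."""
    upper = math.log((1 - beta_err) / alpha)  # cross above: accept H1
    lower = math.log(beta_err / (1 - alpha))  # cross below: accept H0
    llr = 0.0
    for n, x in enumerate(samples, start=1):  # x = 1 if the sample exceeds the limit
        llr += x * math.log(p1 / p0) + (1 - x) * math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1 (impaired)", n
        if llr <= lower:
            return "accept H0 (compliant)", n
    return "continue sampling", len(samples)

# Twenty clean samples: the test already stops (accepts H0) at sample 17.
print(sprt([0] * 20))
```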

  10. Application of the Hyper-Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes.

    PubMed

    Khazraee, S Hadi; Sáez-Castillo, Antonio Jose; Geedipally, Srinivas Reddy; Lord, Dominique

    2015-05-01

    The hyper-Poisson distribution can handle both over- and underdispersion, and its generalized linear model formulation allows the dispersion of the distribution to be observation-specific and dependent on model covariates. This study's objective is to examine the potential applicability of a newly proposed generalized linear model framework for the hyper-Poisson distribution in analyzing motor vehicle crash count data. The hyper-Poisson generalized linear model was first fitted to intersection crash data from Toronto, characterized by overdispersion, and then to crash data from railway-highway crossings in Korea, characterized by underdispersion. The results of this study are promising. When fitted to the Toronto data set, the goodness-of-fit measures indicated that the hyper-Poisson model with a variable dispersion parameter provided a statistical fit as good as the traditional negative binomial model. The hyper-Poisson model was also successful in handling the underdispersed data from Korea; the model performed as well as the gamma probability model and the Conway-Maxwell-Poisson model previously developed for the same data set. The advantages of the hyper-Poisson model studied in this article are noteworthy. Unlike the negative binomial model, which has difficulties in handling underdispersed data, the hyper-Poisson model can handle both over- and underdispersed crash data. Although not a major issue for the Conway-Maxwell-Poisson model, the effect of each variable on the expected mean of crashes is easily interpretable in the case of this new model. © 2014 Society for Risk Analysis.
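
    For reference, the hyper-Poisson pmf underlying the GLM can be written P(Y = y) = lambda^y / ((gamma)_y 1F1(1; gamma; lambda)), where (gamma)_y is the Pochhammer symbol; gamma = 1 recovers the Poisson, gamma > 1 gives overdispersion, and gamma < 1 underdispersion. A small sketch with illustrative parameter values (not the fitted models from the article):

```python
# Hedged sketch of the hyper-Poisson pmf and its dispersion behaviour.
import numpy as np
from scipy.special import hyp1f1, poch

def hyper_poisson_pmf(y, lam, gamma):
    # P(Y = y) = lam**y / ((gamma)_y * 1F1(1; gamma; lam))
    return lam ** y / (poch(gamma, y) * hyp1f1(1.0, gamma, lam))

ys = np.arange(40)  # truncation is adequate for these parameter values
for gamma in (0.5, 1.0, 2.0):
    p = hyper_poisson_pmf(ys, 2.0, gamma)
    mean = (ys * p).sum()
    var = ((ys - mean) ** 2 * p).sum()
    print(gamma, round(mean, 3), round(var / mean, 3))  # var/mean crosses 1 at gamma = 1
```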

  11. Identifiability in N-mixture models: a large-scale screening test with bird data.

    PubMed

    Kéry, Marc

    2018-02-01

    Binomial N-mixture models have proven very useful in ecology, conservation, and monitoring: they allow estimation and modeling of abundance separately from detection probability using simple counts. Recently, doubts about parameter identifiability have been voiced. I conducted a large-scale screening test with 137 bird data sets from 2,037 sites. I found virtually no identifiability problems for Poisson and zero-inflated Poisson (ZIP) binomial N-mixture models, but negative-binomial (NB) models had problems in 25% of all data sets. The corresponding multinomial N-mixture models had no problems. Parameter estimates under Poisson and ZIP binomial and multinomial N-mixture models were extremely similar. Identifiability problems became a little more frequent with smaller sample sizes (267 and 50 sites), but were unaffected by whether the models did or did not include covariates. Hence, binomial N-mixture model parameters with Poisson and ZIP mixtures typically appeared identifiable. In contrast, NB mixtures were often unidentifiable, which is worrying since these were often selected by Akaike's information criterion. Identifiability of binomial N-mixture models should always be checked. If problems are found, simpler models, integrated models that combine different observation models, or the use of external information via informative priors or penalized likelihoods may help. © 2017 by the Ecological Society of America.
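
    A minimal sketch of the binomial N-mixture likelihood for a single site with a Poisson abundance mixture, to make the model class concrete; the rate lam, detection probability p, counts, and truncation bound K below are illustrative.

```python
# Hedged sketch: likelihood of repeated counts at one site under a
# Poisson-binomial N-mixture model, marginalizing abundance N up to K.
import numpy as np
from scipy.stats import binom, poisson

def site_likelihood(counts, lam, p, K=200):
    """P(counts) = sum_N Poisson(N | lam) * prod_t Binomial(y_t | N, p)."""
    counts = np.asarray(counts)
    Ns = np.arange(counts.max(), K + 1)      # abundances consistent with the data
    prior = poisson.pmf(Ns, lam)             # mixing distribution on N
    detection = np.prod(binom.pmf(counts[:, None], Ns[None, :], p), axis=0)
    return float(np.sum(prior * detection))

# Three repeated counts at one site:
print(site_likelihood([3, 5, 4], lam=10.0, p=0.4))
```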

  12. Remote sensing of earth terrain

    NASA Technical Reports Server (NTRS)

    Kong, J. A.

    1988-01-01

    Two monographs and 85 journal and conference papers on remote sensing of earth terrain have been published, sponsored by NASA Contract NAG5-270. A multivariate K-distribution is proposed to model the statistics of fully polarimetric data from earth terrain with polarizations HH, HV, VH, and VV. In this approach, correlated polarizations of radar signals, as characterized by a covariance matrix, are treated as the sum of N n-dimensional random vectors; N obeys the negative binomial distribution with a parameter alpha and mean bar N. Subsequently, an n-dimensional K-distribution, with either zero or non-zero mean, is developed in the limit of infinite bar N or illuminated area. The probability density function (PDF) of the K-distributed vector normalized by its Euclidean norm is independent of the parameter alpha and is the same as that derived from a zero-mean Gaussian-distributed random vector. The above model is well supported by experimental data provided by MIT Lincoln Laboratory and the Jet Propulsion Laboratory in the form of polarimetric measurements.

  13. The Rainbow Spectrum of RNA Secondary Structures.

    PubMed

    Li, Thomas J X; Reidys, Christian M

    2018-06-01

    In this paper, we analyze the length spectrum of rainbows in RNA secondary structures. A rainbow in a secondary structure is a maximal arc with respect to the partial order induced by nesting. We show that there is a significant gap in this length spectrum. We shall prove that there asymptotically almost surely exists a unique longest rainbow of length at least [Formula: see text] and that with high probability any other rainbow has finite length. We show that the distribution of the length of the longest rainbow converges to a discrete limit law and that, for finite k, the distribution of rainbows of length k becomes, for large n, a negative binomial distribution. We then put the results of this paper into context, comparing the analytical results with those observed in RNA minimum free energy structures and biological RNA structures, and we relate our findings to the sparsification of folding algorithms.

  14. Understanding Poisson regression.

    PubMed

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
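
    A brief sketch of the comparison the article motivates, on simulated overdispersed counts (the ENSPIRE data are not reproduced here): fit Poisson and negative binomial regressions and compare AIC.

```python
# Hedged sketch: Poisson vs. negative binomial regression on simulated
# overdispersed count data, using statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 500
x = rng.normal(size=n)
X = sm.add_constant(x)
mu = np.exp(0.5 + 0.8 * x)
# Gamma-Poisson mixing produces negative binomial (overdispersed) counts.
y = rng.poisson(mu * rng.gamma(shape=2.0, scale=0.5, size=n))

poisson_fit = sm.Poisson(y, X).fit(disp=False)
negbin_fit = sm.NegativeBinomial(y, X).fit(disp=False)

print("Poisson AIC:", round(poisson_fit.aic, 1))
print("NegBin  AIC:", round(negbin_fit.aic, 1))  # lower AIC expected here
```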

  15. Binomial test statistics using Psi functions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowman, Kimiko O.

    2007-01-01

    For the negative binomial model (probability generating function (p + 1 - pt)^(-k)), a logarithmic derivative is the psi function difference ψ(k + x) - ψ(k); this and its derivatives lead to a test statistic to decide on the validity of a specified model. The test statistic is computed from a data base, so a comparison between theory and application is available. Note that the test function is not dominated by outliers. Applications to (i) Fisher's tick data, (ii) accidents data, and (iii) Weldon's dice data are included.
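
    The psi-function difference arises because d/dk log[Gamma(k + x)/Gamma(k)] = ψ(k + x) - ψ(k), which is the contribution of an observation x to the derivative of the NB log-likelihood with respect to k. A small numerical check (illustrative counts, not Bowman's actual test statistic):

```python
# Hedged sketch: verify psi(k + x) - psi(k) as the k-derivative of
# log Gamma(k + x) - log Gamma(k), the term appearing in the NB likelihood.
import numpy as np
from scipy.special import digamma, gammaln

x = np.array([0, 1, 1, 2, 3, 5, 8])  # illustrative counts
k, h = 1.5, 1e-6

analytic = digamma(k + x) - digamma(k)
numeric = ((gammaln(k + h + x) - gammaln(k + h))
           - (gammaln(k + x) - gammaln(k))) / h
print(np.allclose(analytic, numeric, atol=1e-4))  # True
```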

  16. CROSSER - CUMULATIVE BINOMIAL PROGRAMS

    NASA Technical Reports Server (NTRS)

    Bowerman, P. N.

    1994-01-01

    The cumulative binomial program, CROSSER, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CROSSER, CUMBIN (NPO-17555), and NEWTONP (NPO-17556), can be used independently of one another. CROSSER can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CROSSER calculates the point at which the reliability of a k-out-of-n system equals the common reliability of the n components. It is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. It also lists the number of iterations of Newton's method required to calculate the answer within the given error. The CROSSER program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CROSSER was developed in 1988.
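
    A sketch of the crossing-point calculation CROSSER performs, reimplemented here with Newton's method (this is not the original C source; the starting point and tolerance are illustrative):

```python
# Hedged sketch: find p where a k-out-of-n system's reliability equals the
# common component reliability p, i.e. solve P(X >= k) = p for X ~ Bin(n, p).
from math import comb

def crosser(n: int, k: int, p0: float = 0.5, tol: float = 1e-12):
    p = p0
    for iteration in range(1, 101):
        f = sum(comb(n, i) * p**i * (1 - p)**(n - i)
                for i in range(k, n + 1)) - p
        # Closed form: d/dp P(X >= k) = n * C(n-1, k-1) * p^(k-1) * (1-p)^(n-k)
        fp = n * comb(n - 1, k - 1) * p**(k - 1) * (1 - p)**(n - k) - 1
        step = f / fp
        p -= step
        if abs(step) < tol:
            return p, iteration
    raise RuntimeError("Newton iteration did not converge")

# For a 3-out-of-4 system the interior crossing is (1 + sqrt(13))/6 ~ 0.7676.
print(crosser(4, 3))
```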

  17. Fitting statistical distributions to sea duck count data: implications for survey design and abundance estimation

    USGS Publications Warehouse

    Zipkin, Elise F.; Leirness, Jeffery B.; Kinlan, Brian P.; O'Connell, Allan F.; Silverman, Emily D.

    2014-01-01

    Determining appropriate statistical distributions for modeling animal count data is important for accurate estimation of abundance, distribution, and trends. In the case of sea ducks along the U.S. Atlantic coast, managers want to estimate local and regional abundance to detect and track population declines, to define areas of high and low use, and to predict the impact of future habitat change on populations. In this paper, we used a modified marked point process to model survey data that recorded flock sizes of Common eiders, Long-tailed ducks, and Black, Surf, and White-winged scoters. The data come from an experimental aerial survey, conducted by the United States Fish & Wildlife Service (USFWS) Division of Migratory Bird Management, during which east-west transects were flown along the Atlantic Coast from Maine to Florida during the winters of 2009–2011. To model the number of flocks per transect (the points), we compared the fit of four statistical distributions (zero-inflated Poisson, zero-inflated geometric, zero-inflated negative binomial and negative binomial) to data on the number of species-specific sea duck flocks that were recorded for each transect flown. To model the flock sizes (the marks), we compared the fit of flock size data for each species to seven statistical distributions: positive Poisson, positive negative binomial, positive geometric, logarithmic, discretized lognormal, zeta and Yule–Simon. Akaike’s Information Criterion and Vuong’s closeness tests indicated that the negative binomial and discretized lognormal were the best distributions for all species for the points and marks, respectively. These findings have important implications for estimating sea duck abundances as the discretized lognormal is a more skewed distribution than the Poisson and negative binomial, which are frequently used to model avian counts; the lognormal is also less heavy-tailed than the power law distributions (e.g., zeta and Yule–Simon), which are becoming increasingly popular for group size modeling. Choosing appropriate statistical distributions for modeling flock size data is fundamental to accurately estimating population summaries, determining required survey effort, and assessing and propagating uncertainty through decision-making processes.

  18. On Models for Binomial Data with Random Numbers of Trials

    PubMed Central

    Comulada, W. Scott; Weiss, Robert E.

    2010-01-01

    A binomial outcome is a count s of the number of successes out of the total number of independent trials n = s + f, where f is a count of the failures. In many studies, n is a random variable rather than being fixed by design. Joint modeling of (s, f) can provide additional insight into the science and into the probability π of success that cannot be directly incorporated by the logistic regression model. Observations where n = 0 are excluded from the binomial analysis yet may be important to understanding how π is influenced by covariates. Correlation between s and f may exist and be of direct interest. We propose Bayesian multivariate Poisson models for the bivariate response (s, f), correlated through random effects. We extend our models to the analysis of longitudinal and multivariate longitudinal binomial outcomes. Our methodology was motivated by two disparate examples, one from teratology and one from an HIV tertiary intervention study. PMID:17688514

  19. Automated segmentation of linear time-frequency representations of marine-mammal sounds.

    PubMed

    Dadouchi, Florian; Gervaise, Cedric; Ioana, Cornel; Huillery, Julien; Mars, Jérôme I

    2013-09-01

    Many marine mammals produce highly nonlinear frequency modulations. Determining the time-frequency support of these sounds offers various applications, which include recognition, localization, and density estimation. This study introduces an automated spectrogram segmentation method with few tuning parameters that is based on a theoretical probabilistic framework. In the first step, the background noise in the spectrogram is fitted with a chi-squared distribution and thresholded using a Neyman-Pearson approach. In the second step, the number of false detections in time-frequency regions is modeled as a binomial distribution, and then, through a Neyman-Pearson strategy, the time-frequency bins are gathered into regions of interest. The proposed method is validated on real data consisting of long sequences of whistles from common dolphins, collected in the Bay of Biscay (France). The proposed method is also compared with two alternative approaches: the first is smoothing and thresholding of the spectrogram; the second is thresholding of the spectrogram followed by the use of morphological operators to gather the time-frequency bins and to remove false positives. The proposed method is shown to increase the probability of detection for the same probability of false alarm.
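
    A toy sketch of the first step, thresholding chi-squared-distributed noise bins at a fixed false-alarm probability in the Neyman-Pearson sense (degrees of freedom, noise power, and false-alarm rate are illustrative):

```python
# Hedged sketch: per-bin Neyman-Pearson threshold for chi-squared noise.
import numpy as np
from scipy.stats import chi2

pfa = 1e-3          # tolerated false-alarm probability per bin
df = 2              # squared magnitude of a complex Gaussian STFT bin
noise_power = 1.0

threshold = 0.5 * noise_power * chi2.ppf(1 - pfa, df)

# Empirical check on synthetic noise-only bins:
rng = np.random.default_rng(3)
bins = 0.5 * noise_power * rng.chisquare(df, size=1_000_000)
print(round(threshold, 3), (bins > threshold).mean())  # achieved rate ~ 1e-3
```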

  20. CUMPOIS - CUMULATIVE POISSON DISTRIBUTION PROGRAM

    NASA Technical Reports Server (NTRS)

    Bowerman, P. N.

    1994-01-01

    The Cumulative Poisson distribution program, CUMPOIS, is one of two programs which make calculations involving cumulative Poisson distributions. Both programs, CUMPOIS (NPO-17714) and NEWTPOIS (NPO-17715), can be used independently of one another. CUMPOIS determines the approximate cumulative binomial distribution, evaluates the cumulative distribution function (cdf) for gamma distributions with integer shape parameters, and evaluates the cdf for chi-square distributions with even degrees of freedom. It can be used by statisticians and others concerned with probabilities of independent events occurring over specific units of time, area, or volume. CUMPOIS calculates the probability that n or fewer events (i.e., the cumulative probability) will occur within any unit when the expected number of events is given as lambda. Normally, this probability is calculated by a direct summation, from i=0 to n, of terms involving the exponential function, lambda, and inverse factorials. This approach, however, eventually fails due to underflow for sufficiently large values of n. Additionally, when the exponential term is moved outside of the summation for simplification purposes, there is a risk that the terms remaining within the summation, and the summation itself, will overflow for certain values of i and lambda. CUMPOIS eliminates these possibilities by multiplying an additional exponential factor into the summation terms and the partial sum whenever overflow/underflow situations threaten. The reciprocal of this term is then multiplied into the completed sum, giving the cumulative probability. The CUMPOIS program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly on most C compilers. The program format is interactive, accepting lambda and n as inputs. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CUMPOIS was developed in 1988.
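
    A modern equivalent of the overflow/underflow guard is to accumulate the Poisson terms in log space; the following sketch (not the original C implementation) computes the same cumulative probability stably.

```python
# Hedged sketch: cumulative Poisson probability via log-space summation,
# avoiding the underflow/overflow that defeats naive term-by-term summation.
import numpy as np
from scipy.special import gammaln, logsumexp

def cum_poisson(n: int, lam: float) -> float:
    """P(X <= n) for X ~ Poisson(lam)."""
    i = np.arange(n + 1)
    log_terms = -lam + i * np.log(lam) - gammaln(i + 1)  # log of e^-lam lam^i / i!
    return float(np.exp(logsumexp(log_terms)))

# Values like these would underflow exp(-lam) in a naive implementation:
print(cum_poisson(900, 1000.0))
```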

  1. Statistical methods for the beta-binomial model in teratology.

    PubMed Central

    Yamamoto, E; Yanagimoto, T

    1994-01-01

    The beta-binomial model is widely used for analyzing teratological data involving littermates. Recent developments in statistical analyses of teratological data are briefly reviewed with emphasis on the model. For statistical inference on the parameters of the beta-binomial distribution, separation of the likelihood leads to reduced biases of estimators and improved accuracy of the empirical significance levels of tests. Separate inference on the parameters can be conducted in a unified way. PMID:8187716

  2. Estimating abundance while accounting for rarity, correlated behavior, and other sources of variation in counts

    USGS Publications Warehouse

    Dorazio, Robert M.; Martin, Julien; Edwards, Holly H.

    2013-01-01

    The class of N-mixture models allows abundance to be estimated from repeated, point count surveys while adjusting for imperfect detection of individuals. We developed an extension of N-mixture models to account for two commonly observed phenomena in point count surveys: rarity and lack of independence induced by unmeasurable sources of variation in the detectability of individuals. Rarity increases the number of locations with zero detections in excess of those expected under simple models of abundance (e.g., Poisson or negative binomial). Correlated behavior of individuals and other phenomena, though difficult to measure, increases the variation in detection probabilities among surveys. Our extension of N-mixture models includes a hurdle model of abundance and a beta-binomial model of detectability that accounts for additional (extra-binomial) sources of variation in detections among surveys. As an illustration, we fit this model to repeated point counts of the West Indian manatee, which was observed in a pilot study using aerial surveys. Our extension of N-mixture models provides increased flexibility. The effects of different sets of covariates may be estimated for the probability of occurrence of a species, for its mean abundance at occupied locations, and for its detectability.

  3. Estimating abundance while accounting for rarity, correlated behavior, and other sources of variation in counts.

    PubMed

    Dorazio, Robert M; Martin, Julien; Edwards, Holly H

    2013-07-01

    The class of N-mixture models allows abundance to be estimated from repeated, point count surveys while adjusting for imperfect detection of individuals. We developed an extension of N-mixture models to account for two commonly observed phenomena in point count surveys: rarity and lack of independence induced by unmeasurable sources of variation in the detectability of individuals. Rarity increases the number of locations with zero detections in excess of those expected under simple models of abundance (e.g., Poisson or negative binomial). Correlated behavior of individuals and other phenomena, though difficult to measure, increases the variation in detection probabilities among surveys. Our extension of N-mixture models includes a hurdle model of abundance and a beta-binomial model of detectability that accounts for additional (extra-binomial) sources of variation in detections among surveys. As an illustration, we fit this model to repeated point counts of the West Indian manatee, which was observed in a pilot study using aerial surveys. Our extension of N-mixture models provides increased flexibility. The effects of different sets of covariates may be estimated for the probability of occurrence of a species, for its mean abundance at occupied locations, and for its detectability.

  4. A Bayesian method for inferring transmission chains in a partially observed epidemic.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marzouk, Youssef M.; Ray, Jaideep

    2008-10-01

    We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.

  5. Improving removal-based estimates of abundance by sampling a population of spatially distinct subpopulations

    USGS Publications Warehouse

    Dorazio, R.M.; Jelks, H.L.; Jordan, F.

    2005-01-01

    A statistical modeling framework is described for estimating the abundances of spatially distinct subpopulations of animals surveyed using removal sampling. To illustrate this framework, hierarchical models are developed using the Poisson and negative-binomial distributions to model variation in abundance among subpopulations and using the beta distribution to model variation in capture probabilities. These models are fitted to the removal counts observed in a survey of a federally endangered fish species. The resulting estimates of abundance have similar or better precision than those computed using the conventional approach of analyzing the removal counts of each subpopulation separately. Extension of the hierarchical models to include spatial covariates of abundance is straightforward and may be used to identify important features of an animal's habitat or to predict the abundance of animals at unsampled locations.

  6. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded.

    PubMed

    Nakagawa, Shinichi; Johnson, Paul C D; Schielzeth, Holger

    2017-09-01

    The coefficient of determination R2 quantifies the proportion of variance explained by a statistical model and is an important summary statistic of biological interest. However, estimating R2 for generalized linear mixed models (GLMMs) remains challenging. We have previously introduced a version of R2 that we called R2GLMM for Poisson and binomial GLMMs, but not for other distributional families. Similarly, we earlier discussed how to estimate intra-class correlation coefficients (ICCs) using Poisson and binomial GLMMs. In this paper, we generalize our methods to all other non-Gaussian distributions, in particular to negative binomial and gamma distributions that are commonly used for modelling biological data. While expanding our approach, we highlight two useful concepts for biologists, Jensen's inequality and the delta method, both of which help us in understanding the properties of GLMMs. Jensen's inequality has important implications for biologically meaningful interpretation of GLMMs, whereas the delta method allows a general derivation of the variance associated with non-Gaussian distributions. We also discuss some special considerations for binomial GLMMs with binary or proportion data. We illustrate the implementation of our extension with worked examples from the field of ecology and evolution in the R environment. However, our method can be used across disciplines and regardless of statistical environment. © 2017 The Author(s).

  7. Application of binomial and multinomial probability statistics to the sampling design process of a global grain tracing and recall system

    USDA-ARS?s Scientific Manuscript database

    Small, coded, pill-sized tracers embedded in grain are proposed as a method for grain traceability. A sampling process for a grain traceability system was designed and investigated by applying probability statistics using a science-based sampling approach to collect an adequate number of tracers fo...

  8. Points on the Path to Probability.

    ERIC Educational Resources Information Center

    Kiernan, James F.

    2001-01-01

    Presents the problem of points and the development of the binomial triangle, or Pascal's triangle. Examines various attempts to solve this problem to give students insight into the nature of mathematical discovery. (KHR)

  9. Distribution pattern of phthirapterans infesting certain common Indian birds.

    PubMed

    Saxena, A K; Kumar, Sandeep; Gupta, Nidhi; Mitra, J D; Ali, S A; Srivastava, Roshni

    2007-08-01

    The prevalence and frequency distribution patterns of 10 phthirapteran species infesting house sparrows, Indian parakeets, common mynas, and white-breasted kingfishers were recorded in the district of Rampur, India, during 2004-05. The sample mean abundances, mean intensities, ranges of infestation, variance-to-mean ratios, values of the exponent of the negative binomial distribution, and the indices of discrepancy were also computed. Frequency distribution patterns of all phthirapteran species were skewed, but the observed frequencies did not correspond to the negative binomial distribution. Adult-nymph ratios varied among species from 1:0.53 to 1:1.25. Sex ratios of the different phthirapteran species ranged from 1:1.10 to 1:1.65 and were female-biased.

  10. Spatiotemporal hurdle models for zero-inflated count data: Exploring trends in emergency department visits.

    PubMed

    Neelon, Brian; Chang, Howard H; Ling, Qiang; Hastings, Nicole S

    2016-12-01

    Motivated by a study exploring spatiotemporal trends in emergency department use, we develop a class of two-part hurdle models for the analysis of zero-inflated areal count data. The models consist of two components-one for the probability of any emergency department use and one for the number of emergency department visits given use. Through a hierarchical structure, the models incorporate both patient- and region-level predictors, as well as spatially and temporally correlated random effects for each model component. The random effects are assigned multivariate conditionally autoregressive priors, which induce dependence between the components and provide spatial and temporal smoothing across adjacent spatial units and time periods, resulting in improved inferences. To accommodate potential overdispersion, we consider a range of parametric specifications for the positive counts, including truncated negative binomial and generalized Poisson distributions. We adopt a Bayesian inferential approach, and posterior computation is handled conveniently within standard Bayesian software. Our results indicate that the negative binomial and generalized Poisson hurdle models vastly outperform the Poisson hurdle model, demonstrating that overdispersed hurdle models provide a useful approach to analyzing zero-inflated spatiotemporal data. © The Author(s) 2014.

  11. Discrimination of numerical proportions: A comparison of binomial and Gaussian models.

    PubMed

    Raidvee, Aire; Lember, Jüri; Allik, Jüri

    2017-01-01

    Observers discriminated the numerical proportion of two sets of elements (N = 9, 13, 33, and 65) that differed either by color or orientation. According to the standard Thurstonian approach, the accuracy of proportion discrimination is determined by irreducible noise in the nervous system that stochastically transforms the number of presented visual elements onto a continuum of psychological states representing numerosity. As an alternative to this customary approach, we propose a Thurstonian-binomial model, which assumes discrete perceptual states, each of which is associated with a certain visual element. It is shown that the probability β with which each visual element can be noticed and registered by the perceptual system can explain data of numerical proportion discrimination at least as well as the continuous Thurstonian-Gaussian model, and better, if the greater parsimony of the Thurstonian-binomial model is taken into account using AIC model selection. We conclude that the Gaussian and binomial models represent two different fundamental principles (internal noise vs. using only a fraction of the available information), both of which are plausible descriptions of visual perception.

  12. Binomial leap methods for simulating stochastic chemical kinetics.

    PubMed

    Tian, Tianhai; Burrage, Kevin

    2004-12-01

    This paper discusses efficient simulation methods for stochastic chemical kinetics. Based on the tau-leap and midpoint tau-leap methods of Gillespie [D. T. Gillespie, J. Chem. Phys. 115, 1716 (2001)], binomial random variables are used in these leap methods rather than Poisson random variables. The motivation for this approach is to improve the efficiency of the Poisson leap methods by using larger stepsizes. Unlike Poisson random variables, whose range of sample values is from zero to infinity, binomial random variables have a finite range of sample values. This probabilistic property has been used to restrict possible reaction numbers and to avoid negative molecular numbers in stochastic simulations when larger stepsizes are used. In this approach a binomial random variable is defined for a single reaction channel in order to keep the reaction number of this channel below the numbers of molecules that undergo this reaction channel. A sampling technique is also designed for the total reaction number of a reactant species that undergoes two or more reaction channels. Samples for the total reaction number are not greater than the molecular number of this species. In addition, probability properties of the binomial random variables provide stepsize conditions for restricting reaction numbers in a chosen time interval. These stepsize conditions are important properties of robust leap control strategies. Numerical results indicate that the proposed binomial leap methods can be applied to a wide range of chemical reaction systems with very good accuracy and significant improvement on efficiency over existing approaches. (c) 2004 American Institute of Physics.
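
    A single-channel illustration of the key idea, simplified from the multi-channel method in the paper: for a decay reaction S -> 0 with propensity c*x, drawing the firing count from a binomial bounded by the current copy number can never produce a negative molecule count, unlike an unbounded Poisson draw. The rate constant and stepsize are illustrative.

```python
# Hedged sketch: binomial tau-leap for a single decay channel S -> 0.
import numpy as np

rng = np.random.default_rng(0)

def binomial_leap_decay(x0=1000, c=0.1, tau=0.5, t_end=30.0):
    t, x, path = 0.0, x0, [(0.0, x0)]
    while t < t_end and x > 0:
        k = rng.binomial(x, min(1.0, c * tau))  # firings can never exceed x
        x -= k
        t += tau
        path.append((t, x))
    return path

for t, x in binomial_leap_decay()[::12]:
    print(f"t={t:5.1f}  S={x}")
```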

  13. A methodology to design heuristics for model selection based on the characteristics of data: Application to investigate when the Negative Binomial Lindley (NB-L) is preferred over the Negative Binomial (NB).

    PubMed

    Shirazi, Mohammadali; Dhavala, Soma Sekhar; Lord, Dominique; Geedipally, Srinivas Reddy

    2017-10-01

    Safety analysts usually use post-modeling methods, such as Goodness-of-Fit statistics or the Likelihood Ratio Test, to decide between two or more competing distributions or models. Such metrics require all competing distributions to be fitted to the data before any comparisons can be made. Given the continuous growth in the number of newly introduced statistical distributions, choosing the best one using such post-modeling methods is not a trivial task, in addition to all the theoretical or numerical issues the analyst may face during the analysis. Furthermore, and most importantly, these measures or tests do not provide any intuition into why a specific distribution (or model) is preferred over another (Goodness-of-Logic). This paper addresses these issues by proposing a methodology to design heuristics for model selection based on the characteristics of the data, in terms of descriptive summary statistics, before fitting the models. The proposed methodology employs two analytic tools: (1) Monte Carlo simulations and (2) machine learning classifiers, to design easy heuristics to predict the label of the 'most-likely-true' distribution for analyzing data. The proposed methodology was applied to investigate when the recently introduced Negative Binomial Lindley (NB-L) distribution is preferred over the Negative Binomial (NB) distribution. Heuristics were designed to select the 'most-likely-true' distribution between these two distributions, given a set of prescribed summary statistics of the data. The proposed heuristics were successfully compared against classical tests for several real or observed datasets. Not only are they easy to use and free of any post-modeling inputs, but they also give the analyst useful information about why the NB-L is preferred over the NB, or vice versa, when modeling data. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Dispersion and sampling of adult Dermacentor andersoni in rangeland in Western North America.

    PubMed

    Rochon, K; Scoles, G A; Lysyk, T J

    2012-03-01

    A fixed precision sampling plan was developed for off-host populations of adult Rocky Mountain wood tick, Dermacentor andersoni (Stiles) based on data collected by dragging at 13 locations in Alberta, Canada; Washington; and Oregon. In total, 222 site-date combinations were sampled. Each site-date combination was considered a sample, and each sample ranged in size from 86 to 250 10 m2 quadrats. Analysis of simulated quadrats ranging in size from 10 to 50 m2 indicated that the most precise sample unit was the 10 m2 quadrat. Samples taken when abundance < 0.04 ticks per 10 m2 were more likely to not depart significantly from statistical randomness than samples taken when abundance was greater. Data were grouped into ten abundance classes and assessed for fit to the Poisson and negative binomial distributions. The Poisson distribution fit only data in abundance classes < 0.02 ticks per 10 m2, while the negative binomial distribution fit data from all abundance classes. A negative binomial distribution with common k = 0.3742 fit data in eight of the 10 abundance classes. Both the Taylor and Iwao mean-variance relationships were fit and used to predict sample sizes for a fixed level of precision. Sample sizes predicted using the Taylor model tended to underestimate actual sample sizes, while sample sizes estimated using the Iwao model tended to overestimate actual sample sizes. Using a negative binomial with common k provided estimates of required sample sizes closest to empirically calculated sample sizes.
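
    To make the fixed-precision idea concrete, the Taylor mean-variance relationship s^2 = a*m^b converts directly into a required sample size n = a*m^(b-2)/D^2 for a target precision D (standard error divided by the mean). The coefficients below are illustrative, not the values fitted to the tick data.

```python
# Hedged sketch: fixed-precision sample size from Taylor's power law.
import math

def taylor_sample_size(mean_density: float, a: float = 2.0, b: float = 1.4,
                       precision: float = 0.25) -> int:
    """Quadrats needed so that SE/mean <= precision, given s^2 = a * m^b."""
    return math.ceil(a * mean_density ** (b - 2) / precision ** 2)

for m in (0.05, 0.2, 1.0):
    print(m, taylor_sample_size(m))  # sparser populations demand more quadrats
```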

  15. Interrelationships Between Receiver/Relative Operating Characteristics Display, Binomial, Logit, and Bayes' Rule Probability of Detection Methodologies

    NASA Technical Reports Server (NTRS)

    Generazio, Edward R.

    2014-01-01

    Unknown risks are introduced into failure critical systems when probability of detection (POD) capabilities are accepted without a complete understanding of the statistical method applied and the interpretation of the statistical results. The presence of this risk in the nondestructive evaluation (NDE) community is revealed in common statements about POD. These statements are often interpreted in a variety of ways and therefore, the very existence of the statements identifies the need for a more comprehensive understanding of POD methodologies. Statistical methodologies have data requirements to be met, procedures to be followed, and requirements for validation or demonstration of adequacy of the POD estimates. Risks are further enhanced due to the wide range of statistical methodologies used for determining the POD capability. Receiver/Relative Operating Characteristics (ROC) Display, simple binomial, logistic regression, and Bayes' rule POD methodologies are widely used in determining POD capability. This work focuses on Hit-Miss data to reveal the framework of the interrelationships between Receiver/Relative Operating Characteristics Display, simple binomial, logistic regression, and Bayes' Rule methodologies as they are applied to POD. Knowledge of these interrelationships leads to an intuitive and global understanding of the statistical data, procedural and validation requirements for establishing credible POD estimates.

  16. Use of negative binomial distribution to describe the presence of Anisakis in Thyrsites atun.

    PubMed

    Peña-Rehbein, Patricio; De los Ríos-Escalante, Patricio

    2012-01-01

    Nematodes of the genus Anisakis have marine fishes as intermediate hosts. One of these hosts is Thyrsites atun, an important fishery resource in Chile between 38 and 41° S. This paper describes the frequency and number of Anisakis nematodes in the internal organs of Thyrsites atun. An analysis based on spatial distribution models showed that the parasites tend to be clustered. The variation in the number of parasites per host could be described by the negative binomial distribution. The maximum observed number was nine parasites per host. The environmental and zoonotic aspects of the study are also discussed.

  17. A Financial Market Model Incorporating Herd Behaviour.

    PubMed

    Wray, Christopher M; Bishop, Steven R

    2016-01-01

    Herd behaviour in financial markets is a recurring phenomenon that exacerbates asset price volatility, and is considered a possible contributor to market fragility. While numerous studies investigate herd behaviour in financial markets, it is often considered without reference to the pricing of financial instruments or other market dynamics. Here, a trader interaction model based upon informational cascades in the presence of information thresholds is used to construct a new model of asset price returns that allows for both quiescent and herd-like regimes. Agent interaction is modelled using a stochastic pulse-coupled network, parametrised by information thresholds and a network coupling probability. Agents may possess either one or two information thresholds that, in each case, determine the number of distinct states an agent may occupy before trading takes place. In the case where agents possess two thresholds (labelled as the finite state-space model, corresponding to agents' accumulating information over a bounded state-space), and where coupling strength is maximal, an asymptotic expression for the cascade-size probability is derived and shown to follow a power law when a critical value of network coupling probability is attained. For a range of model parameters, a mixture of negative binomial distributions is used to approximate the cascade-size distribution. This approximation is subsequently used to express the volatility of model price returns in terms of the model parameter which controls the network coupling probability. In the case where agents possess a single pulse-coupling threshold (labelled as the semi-infinite state-space model corresponding to agents' accumulating information over an unbounded state-space), numerical evidence is presented that demonstrates volatility clustering and long-memory patterns in the volatility of asset returns. Finally, output from the model is compared to both the distribution of historical stock returns and the market price of an equity index option.

  18. Seasonal changes in spatial patterns of two annual plants in the Chihuahuan Desert, USA

    USGS Publications Warehouse

    Yin, Z.-Y.; Guo, Q.; Ren, H.; Peng, S.-L.

    2005-01-01

    Spatial pattern of a biotic population may change over time as its component individuals grow or die out, but whether this is the case for desert annual plants is largely unknown. Here we examined seasonal changes in spatial patterns of two annuals, Eriogonum abertianum and Haplopappus gracilis, in initial (winter) and final (summer) densities. The density was measured as the number of individuals from 384 permanent quadrats (each 0.5 m × 0.5 m) in the Chihuahuan Desert near Portal, Arizona, USA. We used three probability distributions (binomial, Poisson, and negative binomial or NB) that represent three basic spatial patterns (regular, random, and clumped) to fit the observed frequency distributions of densities of the two annuals. Both species showed clear clumped patterns as characterized by the NB and had similar inverse J-shaped frequency distribution curves in two density categories. Also, both species displayed a reduced degree of aggregation from winter to summer after the spring drought (massive die-off), as indicated by the increased k-parameter of the NB and decreased values of another NB parameter p, variance/mean ratio, Lloyd’s Index of Patchiness, and David and Moore’s Index of Clumping. Further, we hypothesized that while the NB (i.e., Poisson-logarithmic) well fits the distribution of individuals per quadrat, its components, the Poisson and logarithmic, may describe the distributions of clumps per quadrat and of individuals per clump, respectively. We thus obtained the means and variances for (1) individuals per quadrat, (2) clumps per quadrat, and (3) individuals per clump. The results showed that the decrease of the density from winter to summer for each plant resulted from the decrease of individuals per clump, rather than from the decrease of clumps per quadrat. The great similarities between the two annuals indicate that our observed temporal changes in spatial patterns may be common among desert annual plants.
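
    The Poisson-logarithmic reading of the negative binomial can be checked by simulation: if clumps per quadrat are Poisson with rate -k*log(1 - theta) and individuals per clump follow a logarithmic(theta) distribution, the quadrat totals are negative binomial. A sketch with illustrative parameters (not the fitted values from the study):

```python
# Hedged sketch: negative binomial as a Poisson number of logarithmic clumps.
import numpy as np
from scipy.stats import logser, nbinom

rng = np.random.default_rng(1)
k, theta = 0.8, 0.7                   # NB shape and logarithmic parameter
lam = -k * np.log(1 - theta)          # clump rate that makes the compound NB

m = rng.poisson(lam, 20_000)          # clumps per quadrat
totals = np.array([logser.rvs(theta, size=mi, random_state=rng).sum()
                   if mi else 0 for mi in m])  # individuals per quadrat

# Simulated frequencies vs. nbinom(k, 1 - theta) pmf:
for x in range(5):
    print(x, round((totals == x).mean(), 4), round(nbinom.pmf(x, k, 1 - theta), 4))
```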

  19. A statistical treatment of bioassay pour fractions

    NASA Astrophysics Data System (ADS)

    Barengoltz, Jack; Hughes, David

    A bioassay is a method for estimating the number of bacterial spores on a spacecraft surface for the purpose of demonstrating compliance with planetary protection (PP) requirements (Ref. 1). The details of the process may be seen in the appropriate PP document (e.g., for NASA, Ref. 2). In general, the surface is mechanically sampled with a damp sterile swab or wipe. The process is completed by colony formation in a growth medium in a plate (Petri dish); the colonies are counted.
    Consider, for simplicity, a set of samples from randomly selected, known areas of one spacecraft surface. One may calculate the mean and standard deviation of the bioburden density, which is the ratio of counts to area sampled. The standard deviation represents an estimate of the place-to-place variation of the true bioburden density, commingled with the precision of the individual sample counts. The accuracy of individual sample results depends on the equipment used, the collection method, and the culturing method. One aspect that greatly influences the result is the pour fraction: the quantity of fluid added to the plates divided by the total fluid used in extracting spores from the sampling equipment.
    In analyzing a single sample's counts with respect to the pour fraction, one seeks to answer the question: if a certain number of spores are counted with a known pour fraction, what is the probability that there are an additional number of spores in the part of the rinse not poured? This is given for specific values by the binomial distribution density, where detection (of culturable spores) is success and the probability of success is the pour fraction. A special summation over the binomial distribution, equivalent to adding over all possible values of the true total number of spores, is performed. This distribution, when normalized, will almost yield the desired quantity: the probability that the additional number of spores does not exceed a certain value. Of course, for a desired value of uncertainty, one must invert the calculation. However, this probability is correct only in the case where all values of the true number of spores greater than or equal to the adjusted count are equally probable. This is not realistic, of course, but the result can only overestimate the uncertainty, so it remains useful. In probability terms, one has the conditional probability given any true total number of spores; therefore one must multiply it by the probability of each possible true count before the summation.
    If the counts for a sample set (of which this is one sample) are available, one may use the calculated variance and the normal probability distribution. In this approach, one assumes a normal distribution and neglects the contribution from spatial variation. The former is a common assumption; the latter can only add to the conservatism (overestimate the number of spores at some level of confidence). A more straightforward approach is to assume a Poisson probability distribution for the measured total sample-set counts, and use the product of the number of samples and the mean number of counts per sample as the mean of the Poisson distribution. It is necessary to set the total count to 1 in the Poisson distribution when the actual total count is zero.
    Finally, even when the planetary protection requirements for spore burden refer only to the mean values, they require an adjustment for pour fraction and method efficiency (a PP specification based on independent data). The adjusted mean values are a 50/50 proposition (i.e., the probability of the true total counts in the sample set exceeding the estimate is 0.50). However, this is highly unconservative when the total counts are zero, since no adjustment to the mean values then occurs for either pour fraction or efficiency. The recommended approach is once again to set the total counts to 1, but now applied to the mean values; one may then apply the corrections to the revised counts. It can be shown by the methods developed in this work that this change is usually conservative enough to increase the level of confidence in the estimate to 0.5.
    1. NASA. (2005) Planetary protection provisions for robotic extraterrestrial missions. NPR 8020.12C, April 2005, National Aeronautics and Space Administration, Washington, DC.
    2. NASA. (2010) Handbook for the Microbiological Examination of Space Hardware, NASA-HDBK-6022, National Aeronautics and Space Administration, Washington, DC.
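
    Under the flat-prior reading described above, the number of spores left in the unpoured fluid follows a negative binomial distribution given the counted spores, so an upper bound at a chosen confidence level is a single quantile call. A sketch with illustrative values:

```python
# Hedged sketch: upper bound on spores in the unpoured rinse fraction.
from scipy.stats import nbinom

def unpoured_upper_bound(counted: int, pour_fraction: float,
                         conf: float = 0.95) -> int:
    """Smallest m with P(unpoured spores <= m | counted) >= conf."""
    # Flat prior + binomial sampling => unpoured ~ NB(counted + 1, pour_fraction).
    return int(nbinom.ppf(conf, counted + 1, pour_fraction))

# Even zero counted spores leave room for undetected ones at an 80% pour fraction:
for c in (0, 1, 5):
    print(c, unpoured_upper_bound(c, 0.8))
```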

  20. Assessment of NDE reliability data

    NASA Technical Reports Server (NTRS)

    Yee, B. G. W.; Couchman, J. C.; Chang, F. H.; Packman, D. F.

    1975-01-01

    Twenty sets of relevant nondestructive test (NDT) reliability data were identified, collected, compiled, and categorized. A criterion for the selection of data for statistical analysis considerations was formulated, and a model to grade the quality and validity of the data sets was developed. Data input formats, which record the pertinent parameters of the defect/specimen and inspection procedures, were formulated for each NDE method. A comprehensive computer program was written and debugged to calculate the probability of flaw detection at several confidence limits by the binomial distribution. This program also selects the desired data sets for pooling and tests the statistical pooling criteria before calculating the composite detection reliability. An example of the calculated reliability of crack detection in bolt holes by an automatic eddy current method is presented.

  1. Single- and multiple-pulse noncoherent detection statistics associated with partially developed speckle.

    PubMed

    Osche, G R

    2000-08-20

    Single- and multiple-pulse detection statistics are presented for aperture-averaged direct detection optical receivers operating against partially developed speckle fields. A partially developed speckle field arises when the probability density function of the received intensity does not follow negative exponential statistics. The case of interest here is the target surface that exhibits diffuse as well as specular components in the scattered radiation. An approximate expression is derived for the integrated intensity at the aperture, which leads to single- and multiple-pulse discrete probability density functions for the case of a Poisson signal in Poisson noise with an additive coherent component. In the absence of noise, the single-pulse discrete density function is shown to reduce to a generalized negative binomial distribution. The radar concept of integration loss is discussed in the context of direct detection optical systems where it is shown that, given an appropriate set of system parameters, multiple-pulse processing can be more efficient than single-pulse processing over a finite range of the integration parameter n.

  2. [Monitoring microbiological safety of small systems of water distribution. Comparison of two sampling programs in a town in central Italy].

    PubMed

    Papini, Paolo; Faustini, Annunziata; Manganello, Rosa; Borzacchi, Giancarlo; Spera, Domenico; Perucci, Carlo A

    2005-01-01

    To determine the frequency of sampling in small water distribution systems (<5,000 inhabitants) and to compare the results under different hypotheses about the distribution of bacteria. We carried out two sampling programs to monitor the water distribution system of a town in Central Italy between July and September 1992; the assumption of a Poisson distribution implied 4 water samples, while the assumption of a negative binomial distribution implied 21 samples. Coliform organisms were used as indicators of water safety. The network consisted of two pipe rings and two wells fed by the same water source. The number of summer customers varied considerably, from 3,000 to 20,000. The mean density was 2.33 coliforms/100 ml (sd = 5.29) for 21 samples and 3 coliforms/100 ml (sd = 6) for four samples. However, the hypothesis of homogeneity was rejected (p-value < 0.001), and the probability of a type II error under the assumption of heterogeneity was higher with 4 samples (beta = 0.24) than with 21 (beta = 0.05). For this small network, determining the sample size according to the heterogeneity hypothesis gives stronger support to the statement that the water is drinkable than does the homogeneity assumption.

  3. Estimation of Multinomial Probabilities.

    DTIC Science & Technology

    1978-11-01

    Johnson (1971) and Alam (1978) have shown that the maximum likelihood estimator is admissible with respect to the quadratic loss; related results on estimation under quadratic loss are due to Steinhaus (1957) and Trybula.
    Reference: Johnson, B. McK. (1971). On admissible estimators for certain fixed sample binomial populations. Ann. Math. Statist. 42, 1579-1587.

  4. Accident prediction model for public highway-rail grade crossings.

    PubMed

    Lu, Pan; Tolliver, Denver

    2016-05-01

    Considerable research has focused on roadway accident frequency analysis, but relatively little research has examined safety evaluation at highway-rail grade crossings. Highway-rail grade crossings are critical spatial locations of utmost importance for transportation safety because traffic crashes at highway-rail grade crossings are often catastrophic, with serious consequences. The Poisson regression model has been employed to analyze vehicle accident frequency as a good starting point for many years. The most commonly applied variations of the Poisson model include the negative binomial and zero-inflated Poisson. These models are used to deal with common crash data issues such as over-dispersion (sample variance is larger than the sample mean) and preponderance of zeros (low sample mean and small sample size). On rare occasions, traffic crash data have been shown to be under-dispersed (sample variance is smaller than the sample mean), and traditional distributions such as the Poisson or negative binomial cannot handle under-dispersion well. The objective of this study is to investigate and compare various alternative highway-rail grade crossing accident frequency models that can handle the under-dispersion issue. The contributions of the paper are two-fold: (1) application of probability models to deal with under-dispersion issues and (2) insights regarding vehicle crashes at public highway-rail grade crossings. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Testing the anisotropy in the angular distribution of Fermi/GBM gamma-ray bursts

    NASA Astrophysics Data System (ADS)

    Tarnopolski, M.

    2017-12-01

    Gamma-ray bursts (GRBs) were confirmed to be of extragalactic origin due to their isotropic angular distribution, combined with the fact that they exhibited an intensity distribution that deviated strongly from the -3/2 power law. This finding was later confirmed with the first redshift, equal to at least z = 0.835, measured for GRB970508. Despite this result, the data from CGRO/BATSE and Swift/BAT indicate that long GRBs are indeed distributed isotropically, but the distribution of short GRBs is anisotropic. Fermi/GBM has detected 1669 GRBs to date, and their sky distribution is examined in this paper. A number of statistical tests are applied: nearest-neighbour analysis, fractal dimension, dipole and quadrupole moments of the distribution function decomposed into spherical harmonics, a binomial test, and the two-point angular correlation function. Monte Carlo benchmark testing of each test is performed in order to evaluate its reliability. It is found that short GRBs are distributed anisotropically in the sky, and long ones have an isotropic distribution. The probability that these results are not a chance occurrence is equal to at least 99.98 per cent and 30.68 per cent for short and long GRBs, respectively. The cosmological context of this finding and its relation to large-scale structures is discussed.

  6. [Hepatitis B case grouping serological study among six chinese families in Almeria, Spain].

    PubMed

    Barroso García, Pilar; Lucerna Méndez, M Angeles; Adrián Monforte, Estrella; Parrón Carreño, Tesifón

    2004-01-01

    Following the detection of two members of six Chinese families who tested positive for the hepatitis B virus, a study of those living in these families was begun for the purpose of determining the spread of the infection within the family environment of the detected cases. This was a descriptive study of 24 members of six Chinese families; the variables recorded were age, sex, serological diagnosis, risk factors, and healthcare-related attitude, drawn from clinical records, serological data, an epidemiological survey, and immunization cards. A family focus was employed and the genogram was used. The binomial distribution was used to calculate the probability of occurrence of the process under study. A total of 14 males (58.3%) and 10 females (41.7%), ranging from 1 to 54 years of age, were studied. The age group with the largest number of subjects was the 21-30 age group (37.5%). Twelve chronic hepatitis B infections were recorded (50%). No relationship was found with the risk factors examined in the epidemiological survey. The probability of this number of chronic hepatitis cases occurring was 0.066 x 10(-6). It was concluded that the observed prevalence of infection was probably due to intra-family transmission. Given the low probability of occurrence of a process of this type, the case grouping found is considered high.

  7. Using exceedance probabilities to detect anomalies in routinely recorded animal health data, with particular reference to foot-and-mouth disease in Viet Nam.

    PubMed

    Richards, K K; Hazelton, M L; Stevenson, M A; Lockhart, C Y; Pinto, J; Nguyen, L

    2014-10-01

    The widespread availability of computer hardware and software for recording and storing disease event information means that, in theory, we have the necessary information to carry out detailed analyses of factors influencing the spatial distribution of disease in animal populations. However, the reliability of such analyses depends on data quality, with anomalous records having the potential to introduce significant bias and lead to inappropriate decision making. In this paper we promote the use of exceedance probabilities as a tool for detecting anomalies when applying hierarchical spatio-temporal models to animal health data. We illustrate this methodology through a case study of data on outbreaks of foot-and-mouth disease (FMD) in Viet Nam for the period 2006-2008. A flexible binomial logistic regression was employed to model the number of FMD-infected communes within each province of the country. Standard analyses of the residuals from this model failed to identify problems, but exceedance probabilities identified provinces in which the number of reported FMD outbreaks was unexpectedly low. This finding is interesting given that these provinces are on major cattle movement pathways through Viet Nam. Copyright © 2014 Elsevier Ltd. All rights reserved.
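
    As a minimal illustration of the exceedance-probability idea (with fabricated posterior draws, not the paper's FMD model), one can compute, for each province, the posterior mass above or below a chosen threshold directly from MCMC output:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Fabricated posterior draws (5000 MCMC samples x 4 provinces) for the
    # probability that a commune in each province reports an FMD outbreak.
    posterior = rng.beta([2, 5, 1, 8], [50, 50, 50, 50], size=(5000, 4))

    # Exceedance probabilities: posterior mass above/below the thresholds.
    high = (posterior > 0.10).mean(axis=0)   # unexpectedly HIGH outbreak risk
    low = (posterior < 0.01).mean(axis=0)    # unexpectedly LOW reporting

    for i in range(posterior.shape[1]):
        print(f"province {i}: P(theta > 0.10) = {high[i]:.3f}, "
              f"P(theta < 0.01) = {low[i]:.3f}")
    ```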

  8. Marginalized zero-inflated negative binomial regression with application to dental caries

    PubMed Central

    Preisser, John S.; Das, Kalyan; Long, D. Leann; Divaris, Kimon

    2015-01-01

    The zero-inflated negative binomial regression model (ZINB) is often employed in diverse fields such as dentistry, health care utilization, highway safety, and medicine to examine relationships between exposures of interest and overdispersed count outcomes exhibiting many zeros. The regression coefficients of ZINB have latent class interpretations for a susceptible subpopulation at risk for the disease/condition under study with counts generated from a negative binomial distribution and for a non-susceptible subpopulation that provides only zero counts. The ZINB parameters, however, are not well-suited for estimating overall exposure effects, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. In this paper, a marginalized zero-inflated negative binomial regression (MZINB) model for independent responses is proposed to model the population marginal mean count directly, providing straightforward inference for overall exposure effects based on maximum likelihood estimation. Through simulation studies, the finite sample performance of MZINB is compared to marginalized zero-inflated Poisson, Poisson, and negative binomial regression. The MZINB model is applied in the evaluation of a school-based fluoride mouthrinse program on dental caries in 677 children. PMID:26568034

  9. Joint Analysis of Binomial and Continuous Traits with a Recursive Model: A Case Study Using Mortality and Litter Size of Pigs

    PubMed Central

    Varona, Luis; Sorensen, Daniel

    2014-01-01

    This work presents a model for the joint analysis of a binomial and a Gaussian trait using a recursive parametrization that leads to a computationally efficient implementation. The model is illustrated in an analysis of mortality and litter size in two breeds of Danish pigs, Landrace and Yorkshire. Available evidence suggests that mortality of piglets increased partly as a result of successful selection for total number of piglets born. In recent years there has been a need to decrease the incidence of mortality in pig-breeding programs. We report estimates of genetic variation at the level of the logit of the probability of mortality and quantify how it is affected by the size of the litter. Several models for mortality are considered and the best fits are obtained by postulating linear and cubic relationships between the logit of the probability of mortality and litter size, for Landrace and Yorkshire, respectively. An interpretation of how the presence of genetic variation affects the probability of mortality in the population is provided and we discuss and quantify the prospects of selecting for reduced mortality, without affecting litter size. PMID:24414548

  10. Introducing Perception and Modelling of Spatial Randomness in Classroom

    ERIC Educational Resources Information Center

    De Nóbrega, José Renato

    2017-01-01

    A strategy to facilitate understanding of spatial randomness is described, using student activities developed in sequence: looking at spatial patterns, simulating approximate spatial randomness using a grid of equally-likely squares, using binomial probabilities for approximations and predictions and then comparing with given Poisson…

  11. Type I error probability spending for post-market drug and vaccine safety surveillance with binomial data.

    PubMed

    Silva, Ivair R

    2018-01-15

    Type I error probability spending functions are commonly used for designing sequential analysis of binomial data in clinical trials, and their use is also quickly emerging for near-continuous sequential analysis in post-market drug and vaccine safety surveillance. It is well known that, for clinical trials, it is still important to minimize the sample size when the null hypothesis is not rejected; in post-market safety surveillance, this is not the priority. There, especially when the surveillance involves identification of potential signals, the meaningful statistical performance measure to be minimized is the expected sample size when the null hypothesis is rejected. The present paper shows that, instead of the convex Type I error spending shape conventionally used in clinical trials, a concave shape is more indicated for post-market drug and vaccine safety surveillance. This is shown for both continuous and group sequential analysis. Copyright © 2017 John Wiley & Sons, Ltd.
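
    One common way to make the convex-versus-concave contrast concrete is the power family of spending functions, alpha(t) = alpha * t^rho, used here purely as an illustration (the paper's own functions may differ). With rho > 1 the shape is convex and spends little Type I error early; with rho < 1 it is concave and spends error aggressively early:

    ```python
    import numpy as np

    def power_spending(t, alpha=0.05, rho=2.0):
        """Power-family spending function: cumulative Type I error
        alpha * t**rho at information fraction t in (0, 1]."""
        return alpha * np.asarray(t, dtype=float) ** rho

    t = np.linspace(0.1, 1.0, 10)       # ten equally spaced looks
    print("convex  (rho=2.0):", np.round(power_spending(t, rho=2.0), 4))
    print("concave (rho=0.5):", np.round(power_spending(t, rho=0.5), 4))
    ```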

  12. The analysis of incontinence episodes and other count data in patients with overactive bladder by Poisson and negative binomial regression.

    PubMed

    Martina, R; Kay, R; van Maanen, R; Ridder, A

    2015-01-01

    Clinical studies in overactive bladder have traditionally used analysis of covariance or nonparametric methods to analyse the number of incontinence episodes and other count data. It is known that if the underlying distributional assumptions of a particular parametric method do not hold, an alternative parametric method may be more efficient than a nonparametric one, which makes no assumptions regarding the underlying distribution of the data. Therefore, there are advantages in using methods based on the Poisson distribution or extensions of that method, which incorporate specific features that provide a modelling framework for count data. One challenge with count data is overdispersion, but methods are available that can account for this through the introduction of random effect terms in the modelling, and it is this modelling framework that leads to the negative binomial distribution. These models can also provide clinicians with a clearer and more appropriate interpretation of treatment effects in terms of rate ratios. In this paper, the previously used parametric and non-parametric approaches are contrasted with those based on Poisson regression and various extensions in trials evaluating solifenacin and mirabegron in patients with overactive bladder. In these applications, negative binomial models are seen to fit the data well. Copyright © 2014 John Wiley & Sons, Ltd.
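
    A hedged sketch of the rate-ratio interpretation, using simulated counts and the negative binomial GLM in statsmodels (illustrative values only, not the solifenacin or mirabegron trial data):

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)

    # Simulated incontinence-episode counts for two treatment arms.
    n = 300
    treat = rng.integers(0, 2, n)
    mu = np.exp(1.5 - 0.4 * treat)                  # true rate ratio = exp(-0.4)
    y = rng.negative_binomial(2, 2 / (2 + mu))      # NB counts with mean mu

    X = sm.add_constant(treat)
    fit = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
    print("estimated rate ratio:", np.exp(fit.params[1]))  # ~ exp(-0.4) = 0.67
    ```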

  13. Lotka's Law and Institutional Productivity.

    ERIC Educational Resources Information Center

    Kumar, Suresh; Sharma, Praveen; Garg, K. C.

    1998-01-01

    Examines the applicability of Lotka's Law, negative binomial distribution, and lognormal distribution for institutional productivity in the same way as it is to authors and their productivity. Results indicate that none of the distributions are applicable for institutional productivity in engineering sciences. (Author/LRW)

  14. Binomial Test Method for Determining Probability of Detection Capability for Fracture Critical Applications

    NASA Technical Reports Server (NTRS)

    Generazio, Edward R.

    2011-01-01

    The capability of an inspection system is established by application of various methodologies to determine the probability of detection (POD). One accepted metric of an adequate inspection system is that, for a minimum flaw size and all greater flaw sizes, there is 0.90 probability of detection with 95% confidence (90/95 POD). Directed design of experiments for probability of detection (DOEPOD) has been developed to provide an efficient and accurate methodology that yields estimates of POD and confidence bounds for both Hit-Miss and signal amplitude testing, where signal amplitudes are reduced to Hit-Miss by using a signal threshold. Directed DOEPOD uses a nonparametric approach for the analysis of inspection data that does not require any assumptions about the particular functional form of a POD function. The DOEPOD procedure identifies, for a given sample set, whether or not the minimum requirement of 0.90 probability of detection with 95% confidence is demonstrated for a minimum flaw size and for all greater flaw sizes (90/95 POD). The DOEPOD procedures are sequentially executed in order to minimize the number of samples needed to demonstrate that there is a 90/95 POD lower confidence bound at a given flaw size and that the POD is monotonic for flaw sizes exceeding that 90/95 POD flaw size. The conservativeness of the DOEPOD methodology results is discussed. Validated guidelines for binomial estimation of POD for fracture critical inspection are established.
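
    The binomial arithmetic behind the 90/95 POD criterion is compact enough to show directly: if all n trials at a flaw size are hits, the 95% lower confidence bound on POD reaches 0.90 once 0.90^n <= 0.05. This is a generic binomial result (the familiar 29-of-29 demonstration), not the DOEPOD procedure itself:

    ```python
    import math

    # Smallest n such that, with n hits in n trials, the 95% lower confidence
    # bound on POD is at least 0.90: solve target_pod**n <= 1 - confidence.
    target_pod, confidence = 0.90, 0.95
    n = math.ceil(math.log(1 - confidence) / math.log(target_pod))
    print(n)  # -> 29, the classic 29-of-29 demonstration
    ```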

  15. On measures of association among genetic variables

    PubMed Central

    Gianola, Daniel; Manfredi, Eduardo; Simianer, Henner

    2012-01-01

    Summary Systems involving many variables are important in population and quantitative genetics, for example, in multi-trait prediction of breeding values and in exploration of multi-locus associations. We studied departures of the joint distribution of sets of genetic variables from independence. New measures of association based on notions of statistical distance between distributions are presented. These are more general than correlations, which are pairwise measures, and lack a clear interpretation beyond the bivariate normal distribution. Our measures are based on logarithmic (Kullback-Leibler) and on relative ‘distances’ between distributions. Indexes of association are developed and illustrated for quantitative genetics settings in which the joint distribution of the variables is either multivariate normal or multivariate-t, and we show how the indexes can be used to study linkage disequilibrium in a two-locus system with multiple alleles and present applications to systems of correlated beta distributions. Two multivariate beta and multivariate beta-binomial processes are examined, and new distributions are introduced: the GMS-Sarmanov multivariate beta and its beta-binomial counterpart. PMID:22742500

  16. Bayesian inference for disease prevalence using negative binomial group testing

    PubMed Central

    Pritchard, Nicholas A.; Tebbs, Joshua M.

    2011-01-01

    Group testing, also known as pooled testing, and inverse sampling are both widely used methods of data collection when the goal is to estimate a small proportion. Taking a Bayesian approach, we consider the new problem of estimating disease prevalence from group testing when inverse (negative binomial) sampling is used. Using different distributions to incorporate prior knowledge of disease incidence and different loss functions, we derive closed form expressions for posterior distributions and resulting point and credible interval estimators. We then evaluate our new estimators, on Bayesian and classical grounds, and apply our methods to a West Nile Virus data set. PMID:21259308

  17. Bayesian sample size calculations in phase II clinical trials using a mixture of informative priors.

    PubMed

    Gajewski, Byron J; Mayo, Matthew S

    2006-08-15

    A number of researchers have discussed phase II clinical trials from a Bayesian perspective. A recent article by Mayo and Gajewski focuses on sample size calculations, which they determine by specifying an informative prior distribution and then calculating a posterior probability that the true response will exceed a prespecified target. In this article, we extend these sample size calculations to include a mixture of informative prior distributions. The mixture comes from several sources of information. For example, consider information from two (or more) clinicians: the first clinician is pessimistic about the drug and the second clinician is optimistic. We tabulate the results for sample size design using the fact that the simple mixture of Betas is a conjugate family for the Beta-Binomial model. We discuss the theoretical framework for these types of Bayesian designs and show that the Bayesian designs in this paper approximate this theoretical framework. Copyright 2006 John Wiley & Sons, Ltd.
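
    The conjugacy of a Beta mixture under binomial sampling is easy to sketch: each component updates as usual, and the mixture weights are rescaled by each component's marginal likelihood of the data. The illustration below uses invented prior parameters, not those of Mayo and Gajewski:

    ```python
    import numpy as np
    from scipy.special import betaln

    # Two clinicians' priors on the response rate: pessimistic and optimistic.
    a = np.array([2.0, 8.0])   # Beta shape parameters (illustrative)
    b = np.array([8.0, 2.0])
    w = np.array([0.5, 0.5])   # prior mixture weights

    x, n = 7, 15               # observed responses out of n patients

    # Conjugate update: components update as usual; weights are rescaled
    # by each component's marginal likelihood of the data.
    log_m = betaln(a + x, b + n - x) - betaln(a, b)
    w_post = w * np.exp(log_m - log_m.max())
    w_post /= w_post.sum()

    a_post, b_post = a + x, b + n - x
    post_mean = np.sum(w_post * a_post / (a_post + b_post))
    print("posterior weights:", np.round(w_post, 3))
    print("posterior mean response rate:", round(post_mean, 3))
    ```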

  18. Making sense of sparse rating data in collaborative filtering via topographic organization of user preference patterns.

    PubMed

    Polcicová, Gabriela; Tino, Peter

    2004-01-01

    We introduce topographic versions of two latent class models (LCM) for collaborative filtering. Latent classes are topologically organized on a square grid. Topographic organization of latent classes makes orientation in rating/preference patterns captured by the latent classes easier and more systematic. The variation in film rating patterns is modelled by multinomial and binomial distributions with varying independence assumptions. In the first stage of topographic LCM construction, self-organizing maps with a neural field organized according to the LCM topology are employed. We apply our system to a large collection of user ratings for films. The system can provide useful visualization plots unveiling user preference patterns buried in the data, without losing its potential to be a good recommender model. It appears that the multinomial distribution is most adequate if the model is regularized by tight grid topologies. Since we deal with probabilistic models of the data, we can readily use tools from probability and information theories to interpret and visualize information extracted by our system.

  19. Improved confidence intervals when the sample is counted an integer times longer than the blank.

    PubMed

    Potter, William Edward; Strzelczyk, Jadwiga Jodi

    2011-05-01

    Past computer solutions for confidence intervals in paired counting are extended to the case where the ratio of the sample count time to the blank count time is taken to be an integer, IRR. Previously, confidence intervals have been named Neyman-Pearson confidence intervals; more correctly they should have been named Neyman confidence intervals or simply confidence intervals. The technique utilized mimics a technique used by Pearson and Hartley to tabulate confidence intervals for the expected value of the discrete Poisson and Binomial distributions. The blank count and the contribution of the sample to the gross count are assumed to be Poisson distributed. The expected value of the blank count, in the sample count time, is assumed known. The net count, OC, is taken to be the gross count minus the product of IRR with the blank count. The probability density function (PDF) for the net count can be determined in a straightforward manner.

  20. The spatial distribution of fixed mutations within genes coding for proteins

    NASA Technical Reports Server (NTRS)

    Holmquist, R.; Goodman, M.; Conroy, T.; Czelusniak, J.

    1983-01-01

    An examination has been conducted of the extensive amino acid sequence data now available for five protein families - the alpha crystallin A chain, myoglobin, alpha and beta hemoglobin, and the cytochromes c - with the goal of estimating the true spatial distribution of base substitutions within genes that code for proteins. In every case the commonly used Poisson density failed to even approximate the experimental pattern of base substitution. For the 87 species of beta hemoglobin examined, for example, the probability that the observed results were from a Poisson process was a minuscule 10^-44. Analogous results were obtained for the other functional families. All the data were reasonably, but not perfectly, described by the negative binomial density. In particular, most of the data were described by one of the very simple limiting forms of this density, the geometric density. The implications of this for evolutionary inference are discussed. It is evident that most estimates of total base substitutions between genes are badly in need of revision.

  1. Critical Values for Lawshe's Content Validity Ratio: Revisiting the Original Methods of Calculation

    ERIC Educational Resources Information Center

    Ayre, Colin; Scally, Andrew John

    2014-01-01

    The content validity ratio originally proposed by Lawshe is widely used to quantify content validity and yet methods used to calculate the original critical values were never reported. Methods for original calculation of critical values are suggested along with tables of exact binomial probabilities.
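
    The exact-binomial logic behind such critical values can be reproduced in a few lines: under the null hypothesis each of N panellists rates an item "essential" with probability 0.5, and the critical CVR comes from the smallest count whose one-tailed tail probability is at most 0.05. The sketch below follows that logic (an illustration of the method, not the authors' published table):

    ```python
    from scipy.stats import binom

    def cvr_critical(n_panel, alpha=0.05):
        """Smallest CVR judged non-chance at level alpha: find the smallest
        number of 'essential' ratings n_e with P(X >= n_e) <= alpha under
        Binomial(n_panel, 0.5), then CVR = (n_e - N/2) / (N/2)."""
        for n_e in range(n_panel + 1):
            if binom.sf(n_e - 1, n_panel, 0.5) <= alpha:
                return n_e, 2 * n_e / n_panel - 1
        return None

    for panel in (5, 8, 10, 15, 20, 40):
        n_e, cvr = cvr_critical(panel)
        print(f"N={panel:2d}: need n_e={n_e:2d}, critical CVR={cvr:.3f}")
    ```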

  2. An Exercise to Introduce Power

    ERIC Educational Resources Information Center

    Seier, Edith; Liu, Yali

    2013-01-01

    In introductory statistics courses, the concept of power is usually presented in the context of testing hypotheses about the population mean. We instead propose an exercise that uses a binomial probability table to introduce the idea of power in the context of testing a population proportion. (Contains 2 tables, and 2 figures.)
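
    In the same spirit, a few lines of code can replace the binomial table: fix a rejection region from the null distribution, then evaluate its probability under alternative values of the proportion (illustrative numbers, not the exercise's own):

    ```python
    from scipy.stats import binom

    # Test H0: p = 0.5 against H1: p > 0.5 with n = 20, rejecting if X >= c.
    n, p0 = 20, 0.5
    c = binom.isf(0.05, n, p0) + 1     # smallest c with P(X >= c | p0) <= 0.05
    alpha = binom.sf(c - 1, n, p0)     # attained size of the test

    for p1 in (0.6, 0.7, 0.8):
        power = binom.sf(c - 1, n, p1)  # same tail, read at the alternative
        print(f"c={int(c)}, alpha={alpha:.4f}, power at p={p1}: {power:.3f}")
    ```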

  3. Spatial Distribution of Adult Anthonomus grandis Boheman (Coleoptera: Curculionidae) and Damage to Cotton Flower Buds Due to Feeding and Oviposition.

    PubMed

    Grigolli, J F J; Souza, L A; Fernandes, M G; Busoli, A C

    2017-08-01

    The cotton boll weevil Anthonomus grandis Boheman (Coleoptera: Curculionidae) is the main pest in cotton crops around the world, directly affecting cotton production. In order to establish a sequential sampling plan, it is crucial to understand the spatial distribution of the pest population and the damage it causes to the crop through the different developmental stages of cotton plants. Therefore, this study aimed to investigate the spatial distribution of adults in the cultivation area and their oviposition and feeding behavior throughout the development of the cotton plants. The experiment was conducted in Maracaju, Mato Grosso do Sul, Brazil, in the 2012/2013 and 2013/2014 growing seasons, in an area of 10,000 m², planted with the cotton cultivar FM 993. The experimental area was divided into 100 plots of 100 m² (10 × 10 m) each, and five plants per plot were sampled weekly throughout the crop cycle; the number of flower buds with feeding and oviposition punctures and the number of adult A. grandis were recorded on these plants. After determining the aggregation indices (variance/mean ratio, Morisita's index, exponent k of the negative binomial distribution, and Green's coefficient) and adjusting the frequencies observed in the field to the distribution of frequencies (Poisson, negative binomial, and positive binomial) using the chi-squared test, it was observed that flower buds with punctures derived from feeding, oviposition, and feeding + oviposition showed an aggregated distribution in the cultivation area until 85 days after emergence and a random distribution after this stage. The adults of A. grandis presented a random distribution in the cultivation area.
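
    The moment-based aggregation indices named above are straightforward to compute. A small sketch with invented per-plot counts, where k is estimated by the method of moments, k = m²/(s² − m):

    ```python
    import numpy as np

    # Invented counts of punctured flower buds per plot (one sampling date).
    counts = np.array([0, 3, 0, 7, 1, 0, 9, 2, 0, 5, 0, 4])

    m, s2 = counts.mean(), counts.var(ddof=1)
    vmr = s2 / m                                     # >1 suggests aggregation
    k = m**2 / (s2 - m) if s2 > m else float("inf")  # negative binomial k (moments)
    N = counts.sum()
    morisita = len(counts) * np.sum(counts * (counts - 1)) / (N * (N - 1))

    print(f"variance/mean = {vmr:.2f}, k = {k:.2f}, Morisita = {morisita:.2f}")
    ```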

  4. A Financial Market Model Incorporating Herd Behaviour

    PubMed Central

    2016-01-01

    Herd behaviour in financial markets is a recurring phenomenon that exacerbates asset price volatility, and is considered a possible contributor to market fragility. While numerous studies investigate herd behaviour in financial markets, it is often considered without reference to the pricing of financial instruments or other market dynamics. Here, a trader interaction model based upon informational cascades in the presence of information thresholds is used to construct a new model of asset price returns that allows for both quiescent and herd-like regimes. Agent interaction is modelled using a stochastic pulse-coupled network, parametrised by information thresholds and a network coupling probability. Agents may possess either one or two information thresholds that, in each case, determine the number of distinct states an agent may occupy before trading takes place. In the case where agents possess two thresholds (labelled as the finite state-space model, corresponding to agents’ accumulating information over a bounded state-space), and where coupling strength is maximal, an asymptotic expression for the cascade-size probability is derived and shown to follow a power law when a critical value of network coupling probability is attained. For a range of model parameters, a mixture of negative binomial distributions is used to approximate the cascade-size distribution. This approximation is subsequently used to express the volatility of model price returns in terms of the model parameter which controls the network coupling probability. In the case where agents possess a single pulse-coupling threshold (labelled as the semi-infinite state-space model corresponding to agents’ accumulating information over an unbounded state-space), numerical evidence is presented that demonstrates volatility clustering and long-memory patterns in the volatility of asset returns. Finally, output from the model is compared to both the distribution of historical stock returns and the market price of an equity index option. PMID:27007236

  5. An In-Depth Analysis of the Chung-Lu Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Winlaw, M.; DeSterck, H.; Sanders, G.

    2015-10-28

    In the classic Erdős-Rényi random graph model [5] each edge is chosen with uniform probability and the degree distribution is binomial, limiting the number of graphs that can be modeled using the Erdős-Rényi framework [10]. The Chung-Lu model [1, 2, 3] is an extension of the Erdős-Rényi model that allows for more general degree distributions. The probability of each edge is no longer uniform and is a function of a user-supplied degree sequence, which by design is the expected degree sequence of the model. This property makes it an easy model to work with theoretically, and since the Chung-Lu model is a special case of a random graph model with a given degree sequence, many of its properties are well known and have been studied extensively [2, 3, 13, 8, 9]. It is also an attractive null model for many real-world networks, particularly those with power-law degree distributions, and it is sometimes used as a benchmark for comparison with other graph generators despite some of its limitations [12, 11]. We know, for example, that the average clustering coefficient is too low relative to most real-world networks. As well, measures of affinity are also too low relative to most real-world networks of interest. However, despite these limitations or perhaps because of them, the Chung-Lu model provides a basis for comparing new graph models.
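
    A minimal generator for the Chung-Lu model, assuming the usual edge rule P(i ~ j) = min(w_i·w_j / Σw, 1) for a user-supplied expected degree sequence w (a sketch, not the paper's code):

    ```python
    import numpy as np

    def chung_lu(weights, rng=None):
        """Sample a Chung-Lu graph: edge (i, j) present independently with
        probability min(w_i * w_j / sum(w), 1), so node i's expected degree
        is approximately w_i."""
        rng = rng or np.random.default_rng()
        w = np.asarray(weights, dtype=float)
        p = np.minimum(np.outer(w, w) / w.sum(), 1.0)
        upper = np.triu(rng.random(p.shape) < p, k=1)  # each pair sampled once
        return upper | upper.T                          # symmetric, no self-loops

    w = 10.0 * np.arange(1, 101) ** -0.5   # rough power-law expected degrees
    A = chung_lu(w, np.random.default_rng(0))
    print("realized mean degree:", A.sum(axis=1).mean())
    print("target mean degree:  ", w.mean())
    ```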

  6. Linear approximations of global behaviors in nonlinear systems with moderate or strong noise

    NASA Astrophysics Data System (ADS)

    Liang, Junhao; Din, Anwarud; Zhou, Tianshou

    2018-03-01

    While many physical or chemical systems can be modeled by nonlinear Langevin equations (LEs), dynamical analysis of these systems is challenging in the cases of moderate and strong noise. Here we develop a linear approximation scheme, which can transform an often intractable LE into a linear set of binomial moment equations (BMEs). This scheme provides a feasible way to capture nonlinear behaviors in the sense of probability distribution and is effective even when the noise is moderate or strong. Based on BMEs, we further develop a noise reduction technique, which can effectively handle tough cases where traditional small-noise theories are inapplicable. The overall method not only provides an approximation-based paradigm for analysis of the local and global behaviors of nonlinear noisy systems but also has a wide range of applications.

  7. Shot-noise evidence of fractional quasiparticle creation in a local fractional quantum Hall state.

    PubMed

    Hashisaka, Masayuki; Ota, Tomoaki; Muraki, Koji; Fujisawa, Toshimasa

    2015-02-06

    We experimentally identify fractional quasiparticle creation in a tunneling process through a local fractional quantum Hall (FQH) state. The local FQH state is prepared in a low-density region near a quantum point contact in an integer quantum Hall (IQH) system. Shot-noise measurements reveal a clear transition from elementary-charge tunneling at low bias to fractional-charge tunneling at high bias. The fractional shot noise is proportional to T1(1 - T1) over a wide range of T1, where T1 is the transmission probability of the IQH edge channel. This binomial distribution indicates that fractional quasiparticles emerge from the IQH state to be transmitted through the local FQH state. The study of this tunneling process enables us to elucidate the dynamics of Laughlin quasiparticles in FQH systems.

  8. Temporal acceleration of spatially distributed kinetic Monte Carlo simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chatterjee, Abhijit; Vlachos, Dionisios G.

    The computational intensity of kinetic Monte Carlo (KMC) simulation is a major impediment in simulating large length and time scales. In recent work, an approximate method for KMC simulation of spatially uniform systems, termed the binomial τ-leap method, was introduced [A. Chatterjee, D.G. Vlachos, M.A. Katsoulakis, Binomial distribution based τ-leap accelerated stochastic simulation, J. Chem. Phys. 122 (2005) 024112], where molecular bundles instead of individual processes are executed over coarse-grained time increments. This temporal coarse-graining can lead to significant computational savings but its generalization to spatial lattice KMC simulation has not been realized yet. Here we extend the binomial τ-leap method to lattice KMC simulations by combining it with spatially adaptive coarse-graining. Absolute stability and computational speed-up analyses for spatial systems, along with simulations, provide insights into the conditions where accuracy and substantial acceleration of the new spatio-temporal coarse-graining method are ensured. Model systems demonstrate that the r-time increment criterion of Chatterjee et al. obeys the absolute stability limit for values of r up to near 1.
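
    The core binomial τ-leap idea can be illustrated on the simplest possible system, a pure decay reaction A → ∅ (a toy sketch; the paper's spatially adaptive lattice version is considerably more involved). Sampling firings from a binomial, rather than a Poisson, guarantees the population never goes negative:

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    def binomial_tau_leap_decay(n0, c, tau, t_end):
        """Binomial tau-leaping for A -> 0 at rate c: each leap fires
        k ~ Binomial(N, 1 - exp(-c*tau)) reactions, so N never goes negative."""
        t, n = 0.0, n0
        while t < t_end and n > 0:
            n -= rng.binomial(n, 1.0 - np.exp(-c * tau))
            t += tau
        return n

    n_final = binomial_tau_leap_decay(n0=1000, c=0.5, tau=0.1, t_end=4.0)
    print(n_final)  # mean decay law gives n0 * exp(-c * t_end) ~ 135
    ```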

  9. Non-stochastic sampling error in quantal analyses for Campylobacter species on poultry products

    USDA-ARS?s Scientific Manuscript database

    Using primers and fluorescent probes specific for the most common foodborne Campylobacter species (C. jejuni = Cj and C. coli = Cc), we developed a multiplex, most probable number (MPN) assay using quantitative PCR (qPCR) as the determinant for binomial detection: number of p positives out of n = 6 ...

  10. Mental health status and healthcare utilization among community dwelling older adults.

    PubMed

    Adepoju, Omolola; Lin, Szu-Hsuan; Mileski, Michael; Kruse, Clemens Scott; Mask, Andrew

    2018-04-27

    Shifts in mental health utilization patterns are necessary to allow for meaningful access to care for vulnerable populations. There have been long-standing issues in how mental health care is provided, which has limited how efficacious that care is for those seeking it. To assess the relationship between mental health status and healthcare utilization among adults ≥65 years. A negative binomial regression model was used to assess the relationship between mental health status and healthcare utilization related to office-based physician visits, while a two-part model, consisting of logistic regression and negative binomial regression, was used to separately model emergency visits and inpatient services. The receipt of care in office-based settings was marginally higher for subjects with mental health difficulties. Both probabilities and counts of inpatient hospitalizations were similar across mental health categories. The count of ER visits was similar across mental health categories; however, the probability of having an emergency department visit was marginally higher for older adults who reported mental health difficulties in 2012. These findings are encouraging and lend promise to the recent initiatives on addressing gaps in mental healthcare services.

  11. Funnel plot control limits to identify poorly performing healthcare providers when there is uncertainty in the value of the benchmark.

    PubMed

    Manktelow, Bradley N; Seaton, Sarah E; Evans, T Alun

    2016-12-01

    There is an increasing use of statistical methods, such as funnel plots, to identify poorly performing healthcare providers. Funnel plots comprise the construction of control limits around a benchmark; providers with outcomes falling outside the limits are investigated as potential outliers. The benchmark is usually estimated from observed data, but uncertainty in this estimate is usually ignored when constructing control limits. In this paper, the use of funnel plots in the presence of uncertainty in the value of the benchmark is reviewed for outcomes from a binomial distribution. Two methods to derive the control limits are shown: (i) prediction intervals; (ii) tolerance intervals. Tolerance intervals formally include the uncertainty in the value of the benchmark while prediction intervals do not. The probability properties of 95% control limits derived using each method were investigated through hypothesised scenarios. Neither prediction intervals nor tolerance intervals produce funnel plot control limits that satisfy the nominal probability characteristics when there is uncertainty in the value of the benchmark. This is not necessarily to say that funnel plots have no role to play in healthcare, but without the development of intervals satisfying the nominal probability characteristics they must be interpreted with care. © The Author(s) 2014.
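
    For reference, exact binomial prediction-type control limits around a fixed benchmark p0 can be sketched as below; the paper's point is that treating an estimated benchmark as fixed in this way ignores its uncertainty:

    ```python
    import numpy as np
    from scipy.stats import binom

    p0 = 0.08                                       # benchmark, treated as known
    n = np.array([50, 100, 250, 500, 1000, 2000])   # provider volumes

    # Exact binomial 95% limits on the observed proportion at each volume.
    lower = binom.ppf(0.025, n, p0) / n
    upper = binom.ppf(0.975, n, p0) / n
    for row in zip(n, lower, upper):
        print("n = %4d: limits (%.3f, %.3f)" % row)
    ```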

  12. Estimating relative risks for common outcome using PROC NLP.

    PubMed

    Yu, Binbing; Wang, Zhuoqiao

    2008-05-01

    In cross-sectional or cohort studies with binary outcomes, it is biologically interpretable and of interest to estimate the relative risk or prevalence ratio, especially when the response rates are not rare. Several methods have been used to estimate the relative risk, among which the log-binomial models yield the maximum likelihood estimate (MLE) of the parameters. Because of restrictions on the parameter space, the log-binomial models often run into convergence problems. Some remedies, e.g., the Poisson and Cox regressions, have been proposed. However, these methods may give out-of-bound predicted response probabilities. In this paper, a new computation method using the SAS Nonlinear Programming (NLP) procedure is proposed to find the MLEs. The proposed NLP method was compared to the COPY method, a modified method to fit the log-binomial model. Issues in the implementation are discussed. For illustration, both methods were applied to data on the prevalence of microalbuminuria (micro-protein leakage into urine) for kidney disease patients from the Diabetes Control and Complications Trial. The sample SAS macro for calculating relative risk is provided in the appendix.
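
    An analogue of this constrained-maximization idea in Python (a sketch with simulated data, standing in for the SAS PROC NLP implementation) restricts Xβ ≤ 0 so that every fitted probability stays in (0, 1]:

    ```python
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(3)

    # Simulated cross-sectional data with a common binary outcome.
    n = 500
    x = rng.binomial(1, 0.5, n)                   # binary exposure
    X = np.column_stack([np.ones(n), x])          # design matrix with intercept
    y = rng.binomial(1, np.exp(-1.2 + 0.5 * x))   # log link; true RR = exp(0.5)

    def negloglik(beta):
        eta = np.minimum(X @ beta, -1e-9)         # keep fitted probabilities < 1
        return -np.sum(y * eta + (1 - y) * np.log1p(-np.exp(eta)))

    # Constrained maximization: require X @ beta <= 0 so that exp(X @ beta)
    # stays in (0, 1], mirroring the PROC NLP approach.
    cons = {"type": "ineq", "fun": lambda beta: -(X @ beta)}
    fit = minimize(negloglik, x0=np.array([-1.0, 0.0]),
                   method="SLSQP", constraints=[cons])
    print("log-RR:", fit.x[1], "  RR:", np.exp(fit.x[1]))
    ```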

  13. Selecting a distributional assumption for modelling relative densities of benthic macroinvertebrates

    USGS Publications Warehouse

    Gray, B.R.

    2005-01-01

    The selection of a distributional assumption suitable for modelling macroinvertebrate density data is typically challenging. Macroinvertebrate data often exhibit substantially larger variances than expected under a standard count assumption, that of the Poisson distribution. Such overdispersion may derive from multiple sources, including heterogeneity of habitat (historically and spatially), differing life histories for organisms collected within a single collection in space and time, and autocorrelation. Taken to extreme, heterogeneity of habitat may be argued to explain the frequent large proportions of zero observations in macroinvertebrate data. Sampling locations may consist of habitats defined qualitatively as either suitable or unsuitable. The former category may yield random or stochastic zeroes and the latter structural zeroes. Heterogeneity among counts may be accommodated by treating the count mean itself as a random variable, while extra zeroes may be accommodated using zero-modified count assumptions, including zero-inflated and two-stage (or hurdle) approaches. These and linear assumptions (following log- and square root-transformations) were evaluated using 9 years of mayfly density data from a 52 km, ninth-order reach of the Upper Mississippi River (n = 959). The data exhibited substantial overdispersion relative to that expected under a Poisson assumption (i.e. variance:mean ratio = 23 ≫ 1), and 43% of the sampling locations yielded zero mayflies. Based on the Akaike Information Criterion (AIC), count models were improved most by treating the count mean as a random variable (via a Poisson-gamma distributional assumption) and secondarily by zero modification (improvements in AIC values of 9184 units and 47-48 units, respectively). Zeroes were underestimated by the Poisson, log-transform and square root-transform models (by 61%, 24% and 32%, respectively), slightly by the standard negative binomial model (7%), but not by the zero-modified models (0%). However, the zero-modified Poisson models underestimated small counts (1 ≤ y ≤ 4) and overestimated intermediate counts (7 ≤ y ≤ 23). Counts greater than zero were estimated well by zero-modified negative binomial models, while counts greater than one were also estimated well by the standard negative binomial model. Based on AIC and percent zero estimation criteria, the two-stage and zero-inflated models performed similarly. The above inferences were largely confirmed when the models were used to predict values from a separate, evaluation data set (n = 110). An exception was that, using the evaluation data set, the standard negative binomial model appeared superior to its zero-modified counterparts using the AIC (but not percent zero criteria). This and other evidence suggest that a negative binomial distributional assumption should be routinely considered when modelling benthic macroinvertebrate data from low flow environments. Whether negative binomial models should themselves be routinely examined for extra zeroes requires, from a statistical perspective, more investigation. However, this question may best be answered by ecological arguments that may be specific to the sampled species and locations. © 2004 Elsevier B.V. All rights reserved.

  14. FluBreaks: early epidemic detection from Google flu trends.

    PubMed

    Pervaiz, Fahad; Pervaiz, Mansoor; Abdur Rehman, Nabeel; Saif, Umar

    2012-10-04

    The Google Flu Trends service was launched in 2008 to track changes in the volume of online search queries related to flu-like symptoms. Over the last few years, the trend data produced by this service has shown a consistent relationship with the actual number of flu reports collected by the US Centers for Disease Control and Prevention (CDC), often identifying increases in flu cases weeks in advance of CDC records. However, contrary to popular belief, Google Flu Trends is not an early epidemic detection system. Instead, it is designed as a baseline indicator of the trend, or changes, in the number of disease cases. To evaluate whether these trends can be used as a basis for an early warning system for epidemics. We present the first detailed algorithmic analysis of how Google Flu Trends can be used as a basis for building a fully automated system for early warning of epidemics in advance of methods used by the CDC. Based on our work, we present a novel early epidemic detection system, called FluBreaks (dritte.org/flubreaks), based on Google Flu Trends data. We compared the accuracy and practicality of three types of algorithms: normal distribution algorithms, Poisson distribution algorithms, and negative binomial distribution algorithms. We explored the relative merits of these methods, and related our findings to changes in Internet penetration and population size for the regions in Google Flu Trends providing data. Across our performance metrics of percentage true-positives (RTP), percentage false-positives (RFP), percentage overlap (OT), and percentage early alarms (EA), Poisson- and negative binomial-based algorithms performed better in all except RFP. Poisson-based algorithms had average values of 99%, 28%, 71%, and 76% for RTP, RFP, OT, and EA, respectively, whereas negative binomial-based algorithms had average values of 97.8%, 17.8%, 60%, and 55% for RTP, RFP, OT, and EA, respectively. Moreover, the EA was also affected by the region's population size. Regions with larger populations (regions 4 and 6) had higher values of EA than region 10 (which had the smallest population) for negative binomial- and Poisson-based algorithms. The difference was 12.5% and 13.5% on average in negative binomial- and Poisson-based algorithms, respectively. We present the first detailed comparative analysis of popular early epidemic detection algorithms on Google Flu Trends data. We note that realizing this opportunity requires moving beyond the cumulative sum and historical limits method-based normal distribution approaches, traditionally employed by the CDC, to negative binomial- and Poisson-based algorithms to deal with potentially noisy search query data from regions with varying population and Internet penetrations. Based on our work, we have developed FluBreaks, an early warning system for flu epidemics using Google Flu Trends.

  15. Evolution of a Modified Binomial Random Graph by Agglomeration

    NASA Astrophysics Data System (ADS)

    Kang, Mihyun; Pachon, Angelica; Rodríguez, Pablo M.

    2018-02-01

    In the classical Erdős-Rényi random graph G(n, p) there are n vertices and each of the possible edges is independently present with probability p. The random graph G(n, p) is homogeneous in the sense that all vertices have the same characteristics. On the other hand, numerous real-world networks are inhomogeneous in this respect. Such an inhomogeneity of vertices may influence the connection probability between pairs of vertices. The purpose of this paper is to propose a new inhomogeneous random graph model which is obtained in a constructive way from the Erdős-Rényi random graph G(n, p). Given a configuration of n vertices arranged in N subsets of vertices (we call each subset a super-vertex), we define a random graph with N super-vertices by letting two super-vertices be connected if and only if there is at least one edge between them in G(n, p). Our main result concerns the threshold for connectedness. We also analyze the phase transition for the emergence of the giant component and the degree distribution. Even though our model begins with G(n, p), it assumes the existence of some community structure encoded in the configuration. Furthermore, under certain conditions it exhibits a power law degree distribution. Both properties are important for real-world applications.

  16. A Stochastic Model For Extracting Sediment Delivery Timescales From Sediment Budgets

    NASA Astrophysics Data System (ADS)

    Pizzuto, J. E.; Benthem, A.; Karwan, D. L.; Keeler, J. J.; Skalak, K.

    2015-12-01

    Watershed managers need to quantify sediment storage and delivery timescales to understand the time required for best management practices to improve downstream water quality. To address this need, we route sediment downstream using a random walk through a series of valley compartments spaced at 1 km intervals. The probability of storage within each compartment, q, is specified from a sediment budget and is defined as the ratio of the volume deposited to the annual sediment flux. Within each compartment, the probability of sediment moving directly downstream without being stored is p = 1 - q. If sediment is stored within a compartment, its "resting time" is specified by a stochastic exponential waiting time distribution with a mean of 10 years. After a particle's waiting time is over, it moves downstream to the next compartment by fluvial transport. Over a distance of n compartments, a sediment particle may be stored from 0 to n times, with the probability of each outcome (store or not store) specified by the binomial distribution. We assign q = 0.02, a stream velocity of 0.5 m/s, an event "intermittency" of 0.01, and assume a balanced sediment budget. Travel time probability density functions have a steep peak at the shortest times, representing rapid transport in the channel of the fraction of sediment that moves downstream without being stored. However, the probability of moving downstream n km without storage is p^n (0.90 for 5 km, 0.36 for 50 km, 0.006 for 250 km), so travel times are increasingly dominated by storage with increasing distance. Median travel times for 5, 50, and 250 km are 0.03, 4.4, and 46.5 years. After a distance of approximately 2/q, or 100 km (2/0.02/km), the median travel time is determined by storage timescales, and active fluvial transport is irrelevant. Our model extracts travel time statistics from sediment budgets, and can be cast as a differential equation and solved numerically for more complex systems.
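
    The random-walk routing is simple to reproduce by Monte Carlo; the sketch below uses the abstract's parameter values and should roughly recover the quoted medians (0.03, 4.4, and 46.5 years):

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    q, mean_wait = 0.02, 10.0             # storage prob. per km; resting time (yr)
    velocity, intermittency = 0.5, 0.01   # m/s; fraction of time transport is active
    SECONDS_PER_YEAR = 3.156e7

    def travel_times(distance_km, n=100_000):
        # In-channel transport time (years), slowed by event intermittency.
        transport = distance_km * 1000 / velocity / intermittency / SECONDS_PER_YEAR
        # Number of storage events over distance_km compartments is binomial.
        stops = rng.binomial(distance_km, q, size=n)
        # Total resting time: sum of `stops` exponential waits = one gamma draw.
        waits = rng.gamma(np.where(stops > 0, stops, 1), mean_wait)
        waits[stops == 0] = 0.0
        return transport + waits

    for d in (5, 50, 250):
        t = travel_times(d)
        print(f"{d:3d} km: median {np.median(t):6.2f} yr, "
              f"P(no storage) = {(1 - q) ** d:.3f}")
    ```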

  17. Collective Human Mobility Pattern from Taxi Trips in Urban Area

    PubMed Central

    Peng, Chengbin; Jin, Xiaogang; Wong, Ka-Chun; Shi, Meixia; Liò, Pietro

    2012-01-01

    We analyze the passengers' traffic pattern for 1.58 million taxi trips in Shanghai, China. By employing non-negative matrix factorization and optimization methods, we find that people travel on workdays mainly for three purposes: commuting between home and workplace, traveling from workplace to workplace, and others such as leisure activities. Therefore, traffic flow in one area or between any pair of locations can be approximated by a linear combination of three basis flows, corresponding to the three purposes respectively. We name the coefficients in the linear combination traffic powers, each of which indicates the strength of each basis flow. The traffic powers on different days are typically different even for the same location, due to the uncertainty of human motion. Therefore, we provide a probability distribution function for the relative deviation of the traffic power. This distribution function is expressed in terms of a series of functions for normalized binomial distributions. It can be well explained by statistical theories and is verified by empirical data. These findings are applicable in predicting road traffic, tracing traffic patterns and diagnosing traffic-related abnormal events. These results can also be used to infer land use in urban areas quite parsimoniously. PMID:22529917

  18. Ecological and pest-management implications of sex differences in scarab landing patterns on grape vines

    PubMed Central

    Boyer, Stéphane; Lefort, Marie-Caroline; Nboyine, Jerry; Wratten, Steve D.

    2017-01-01

    Background Melolonthinae beetles, comprising different white grub species, are a globally-distributed pest group. Their larvae feed on roots of several crop and forestry species, and adults can cause severe defoliation. In New Zealand, the endemic scarab pest Costelytra zealandica (White) causes severe defoliation on different horticultural crops, including grape vines (Vitis vinifera). Understanding flight and landing behaviours of this pest can help inform pest management decisions. Methods Adult beetles were counted and then removed from 96 grape vine plants from 21:30 until 23:00 h, every day from October 26 until December 2, during 2014 and 2015. Also, adults were removed from the grape vine foliage at dusk 5, 10, 15, 20 and 25 min after flight started in 2015. Statistical analyses were performed using generalised linear models with a beta-binomial distribution to analyse proportions and with a negative binomial distribution for beetle abundance. Results By analysing C. zealandica sex ratios during its entire flight season, it is clear that the proportion of males is higher at the beginning of the season, gradually declining towards its end. When adults were successively removed from the grape vines at 5-min intervals after flight activity began, the mean proportion of males ranged from 6–28%. The male proportion suggests males were attracted to females that had already landed on grape vines, probably through pheromone release. Discussion The seasonal and daily changes in adult C. zealandica sex ratio throughout its flight season are presented for the first time. Although seasonal changes in sex ratio have been reported for other melolonthines, changes during their daily flight activity have not been analysed so far. Sex-ratio changes can have important consequences for the management of this pest species, and possibly for other melolonthines, as it has been previously suggested that C. zealandica females land on plants that produce a silhouette against the sky. Therefore, long-term management might evaluate the effect of different plant heights and architecture on female melolonthine landing patterns, with consequences for male distribution, and subsequently overall damage within horticultural areas. PMID:28462026

  19. Ecological and pest-management implications of sex differences in scarab landing patterns on grape vines.

    PubMed

    González-Chang, Mauricio; Boyer, Stéphane; Lefort, Marie-Caroline; Nboyine, Jerry; Wratten, Steve D

    2017-01-01

    Melolonthinae beetles, comprising different white grub species, are a globally-distributed pest group. Their larvae feed on roots of several crop and forestry species, and adults can cause severe defoliation. In New Zealand, the endemic scarab pest Costelytra zealandica (White) causes severe defoliation on different horticultural crops, including grape vines (Vitis vinifera). Understanding flight and landing behaviours of this pest can help inform pest management decisions. Adult beetles were counted and then removed from 96 grape vine plants from 21:30 until 23:00 h, every day from October 26 until December 2, during 2014 and 2015. Also, adults were removed from the grape vine foliage at dusk 5, 10, 15, 20 and 25 min after flight started in 2015. Statistical analyses were performed using generalised linear models with a beta-binomial distribution to analyse proportions and with a negative binomial distribution for beetle abundance. By analysing C. zealandica sex ratios during its entire flight season, it is clear that the proportion of males is higher at the beginning of the season, gradually declining towards its end. When adults were successively removed from the grape vines at 5-min intervals after flight activity began, the mean proportion of males ranged from 6-28%. The male proportion suggests males were attracted to females that had already landed on grape vines, probably through pheromone release. The seasonal and daily changes in adult C. zealandica sex ratio throughout its flight season are presented for the first time. Although seasonal changes in sex ratio have been reported for other melolonthines, changes during their daily flight activity have not been analysed so far. Sex-ratio changes can have important consequences for the management of this pest species, and possibly for other melolonthines, as it has been previously suggested that C. zealandica females land on plants that produce a silhouette against the sky. Therefore, long-term management might evaluate the effect of different plant heights and architecture on female melolonthine landing patterns, with consequences for male distribution, and subsequently overall damage within horticultural areas.

  20. Ecology of nonnative Siberian prawn (Palaemon modestus) in the lower Snake River, Washington, USA

    USGS Publications Warehouse

    Erhardt, John M.; Tiffan, Kenneth F.

    2016-01-01

    We assessed the abundance, distribution, and ecology of the nonnative Siberian prawn Palaemon modestus in the lower Snake River, Washington, USA. Analysis of prawn passage abundance at three Snake River dams showed that populations are growing at exponential rates, especially at Little Goose Dam where over 464,000 prawns were collected in 2015. Monthly beam trawling during 2011–2013 provided information on prawn abundance and distribution in Lower Granite and Little Goose Reservoirs. Zero-inflated regression predicted that the probability of prawn presence increased with decreasing water velocity and increasing depth. Negative binomial models predicted higher catch rates of prawns in deeper water and in closer proximity to dams. Temporally, prawn densities decreased slightly in the summer, likely due to the mortality of older individuals, and then increased in autumn and winter with the emergence and recruitment of young of the year. Seasonal length frequencies showed that distinct juvenile and adult size classes exist throughout the year, suggesting prawns live from 1 to 2 years and may be able to reproduce multiple times during their life. Most juvenile prawns become reproductive adults in 1 year, and peak reproduction occurs from late July through October. Mean fecundity (189 eggs) and reproductive output (11.9 %) are similar to that in their native range. The current use of deep habitats by prawns likely makes them unavailable to most predators in the reservoirs. The distribution and role of Siberian prawns in the lower Snake River food web will probably continue to change as the population grows and warrants continued monitoring and investigation.

  1. Enumerative and binomial sampling plans for citrus mealybug (Homoptera: pseudococcidae) in citrus groves.

    PubMed

    Martínez-Ferrer, María Teresa; Ripollés, José Luís; Garcia-Marí, Ferran

    2006-06-01

    The spatial distribution of the citrus mealybug, Planococcus citri (Risso) (Homoptera: Pseudococcidae), was studied in citrus groves in northeastern Spain. Constant precision sampling plans were designed for all developmental stages of citrus mealybug under the fruit calyx, for late stages on fruit, and for females on trunks and main branches; more than 66, 286, and 101 data sets, respectively, were collected from nine commercial fields during 1992-1998. Dispersion parameters were determined using Taylor's power law, giving aggregated spatial patterns for citrus mealybug populations in the three locations of the tree sampled. A significant relationship between the number of insects per organ and the percentage of occupied organs was established using either Wilson and Room's binomial model or Kono and Sugino's empirical formula. Constant precision (E = 0.25) sampling plans (i.e., enumerative plans) for estimating mean densities were developed using Green's equation and the two binomial models. For making management decisions, enumerative counts may be less labor-intensive than binomial sampling. Therefore, we recommend enumerative sampling plans for use in an integrated pest management program in citrus. Required sample sizes for the range of population densities near current management thresholds, in the three plant locations (calyx, fruit, and trunk), were 50, 110-330, and 30, respectively. Binomial sampling, especially the empirical model, required a higher sample size to achieve equivalent levels of precision.
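
    As a sketch of how Taylor's power law (s² = a·m^b) feeds a fixed-precision, enumerative sample-size rule of the kind used here: with relative precision E = SE(mean)/mean, the required count is n = a·m^(b−2)/E². The Taylor coefficients below are invented, not the paper's estimates:

    ```python
    import numpy as np

    def required_sample_size(m, a, b, E=0.25):
        """Sample size for relative precision E from Taylor's power law
        s**2 = a * m**b: solving E = sqrt(a * m**b / n) / m gives
        n = a * m**(b - 2) / E**2."""
        return np.ceil(a * np.asarray(m, dtype=float) ** (b - 2) / E**2)

    # Invented Taylor coefficients for an aggregated pest (b > 1).
    for m in (0.5, 1.0, 2.0, 5.0):
        print(f"mean density {m}: n = {required_sample_size(m, a=2.5, b=1.4):.0f}")
    ```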

  2. Estimating relative risks in multicenter studies with a small number of centers - which methods to use? A simulation study.

    PubMed

    Pedroza, Claudia; Truong, Van Thi Thanh

    2017-11-02

    Analyses of multicenter studies often need to account for center clustering to ensure valid inference. For binary outcomes, it is particularly challenging to properly adjust for center when the number of centers or total sample size is small, or when there are few events per center. Our objective was to evaluate the performance of generalized estimating equation (GEE) log-binomial and Poisson models, generalized linear mixed models (GLMMs) assuming binomial and Poisson distributions, and a Bayesian binomial GLMM to account for center effect in these scenarios. We conducted a simulation study with few centers (≤30) and 50 or fewer subjects per center, using both a randomized controlled trial and an observational study design to estimate relative risk. We compared the GEE and GLMM models with a log-binomial model without adjustment for clustering in terms of bias, root mean square error (RMSE), and coverage. For the Bayesian GLMM, we used informative neutral priors that are skeptical of large treatment effects that are almost never observed in studies of medical interventions. All frequentist methods exhibited little bias, and the RMSE was very similar across the models. The binomial GLMM had poor convergence rates, ranging from 27% to 85%, but performed well otherwise. The results show that both GEE models need to use small sample corrections for robust SEs to achieve proper coverage of 95% CIs. The Bayesian GLMM had similar convergence rates but resulted in slightly more biased estimates for the smallest sample sizes. However, it had the smallest RMSE and good coverage across all scenarios. These results were very similar for both study designs. For the analyses of multicenter studies with a binary outcome and few centers, we recommend adjustment for center with either a GEE log-binomial or Poisson model with appropriate small sample corrections or a Bayesian binomial GLMM with informative priors.

  3. Co-Infestation and Spatial Distribution of Bactrocera carambolae and Anastrepha spp. (Diptera: Tephritidae) in Common Guava in the Eastern Amazon

    PubMed Central

    Deus, E. G.; Godoy, W. A. C.; Sousa, M. S. M.; Lopes, G. N.; Jesus-Barros, C. R.; Silva, J. G.; Adaime, R.

    2016-01-01

    Field infestation and spatial distribution of introduced Bactrocera carambolae Drew and Hancock and native species of Anastrepha in common guavas [Psidium guajava (L.)] were investigated in the eastern Amazon. Fruit sampling was carried out in the municipalities of Calçoene and Oiapoque in the state of Amapá, Brazil. The frequency distribution of larvae in fruit was fitted to the negative binomial distribution. Anastrepha striata was more abundant in both sampled areas in comparison to Anastrepha fraterculus (Wiedemann) and B. carambolae. The frequency distribution analysis of adults revealed an aggregated pattern for B. carambolae as well as for A. fraterculus and Anastrepha striata Schiner, described by the negative binomial distribution. Although the populations of Anastrepha spp. may have suffered some impact due to the presence of B. carambolae, the results are still not robust enough to indicate effective reduction in the abundance of Anastrepha spp. caused by B. carambolae in a general sense. The high degree of aggregation observed for both species suggests interspecific co-occurrence with the simultaneous presence of both species in the analysed fruit. Moreover, a significant fraction of uninfested guavas also indicated absence of competitive displacement. PMID:27638949

  4. Levels and sociodemographic correlates of accelerometer-based physical activity in Irish children: a cross-sectional study.

    PubMed

    Li, Xia; Kearney, Patricia M; Keane, Eimear; Harrington, Janas M; Fitzgerald, Anthony P

    2017-06-01

    The aim of this study was to explore levels and sociodemographic correlates of physical activity (PA) over 1 week using accelerometer data. Accelerometer data was collected over 1 week from 1075 8-11-year-old children in the cross-sectional Cork Children's Lifestyle Study. Threshold values were used to categorise activity intensity as sedentary, light, moderate or vigorous. Questionnaires collected data on demographic factors. Smoothed curves were used to display minute by minute variations. Binomial regression was used to identify factors correlated with the probability of meeting WHO 60 min moderate to vigorous PA guidelines. Overall, 830 children (mean (SD) age: 9.9(0.7) years, 56.3% boys) were included. From the binomial multiple regression analysis, boys were found more likely to meet guidelines (probability ratio 1.17, 95% CI 1.06 to 1.28) than girls. Older children were less likely to meet guidelines than younger children (probability ratio 0.91, CI 0.87 to 0.95). Normal weight children were more likely than overweight and obese children to meet guidelines (probability ratio 1.25, CI 1.16 to 1.34). Children in urban areas were more likely to meet guidelines than those in rural areas (probability ratio 1.19, CI 1.07 to 1.33). Longer daylight length days were associated with greater probability of meeting guidelines compared to shorter daylight length days. PA levels differed by individual factors including age, gender and weight status as well as by environmental factors including residence and daylight length. Less than one-quarter of children (26.8% boys, 16.2% girls) meet guidelines. Effective intervention policies are urgently needed to increase PA. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.

  5. Demonstrating the Safety and Reliability of a New System or Spacecraft: Incorporating Analyses and Reviews of the Design and Processing in Determining the Number of Tests to be Conducted

    NASA Technical Reports Server (NTRS)

    Vesely, William E.; Colon, Alfredo E.

    2010-01-01

    Design safety/reliability is associated with the probability that no failure-causing faults exist in a design. Confidence in the absence of failure-causing faults is increased by performing tests with no failures. Reliability-growth testing requirements are based on initial assurance and fault-detection probability. Binomial tables generally prescribe more tests than reliability-growth requirements do; because the reliability-growth requirements are grounded in reliability principles and factors, they should be used.
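
    The zero-fail binomial relation behind such test counts is C = 1 - R**n: n consecutive failure-free tests demonstrate reliability R with confidence C. A short sketch with illustrative values (not the paper's numbers):

    ```python
    import math

    def tests_required(R, C):
        """Zero-fail binomial test count: smallest n with 1 - R**n >= C."""
        return math.ceil(math.log(1 - C) / math.log(R))

    print(tests_required(0.95, 0.90))  # 45 failure-free tests for R=0.95 at 90% confidence
    ```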

  6. Modeling number of claims and prediction of total claim amount

    NASA Astrophysics Data System (ADS)

    Acar, Aslıhan Şentürk; Karabey, Uǧur

    2017-07-01

    In this study we focus on the annual number of claims in a private health insurance data set belonging to a local insurance company in Turkey. In addition to the Poisson and negative binomial models, zero-inflated Poisson and zero-inflated negative binomial models are used to model the number of claims in order to account for excess zeros. To investigate the impact of different distributional assumptions for the number of claims on the prediction of the total claim amount, the predictive performances of the candidate models are compared using root mean square error (RMSE) and mean absolute error (MAE) criteria.
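
    A hedged sketch of such a comparison on simulated zero-inflated counts (a stand-in for the insurer's data), assuming statsmodels' Poisson, NegativeBinomial, and ZeroInflatedPoisson classes:

    ```python
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.discrete.count_model import ZeroInflatedPoisson

    rng = np.random.default_rng(0)
    n = 1000
    x = rng.normal(size=n)
    X = sm.add_constant(x)
    # Simulated claim counts with 40% structural zeros (illustrative only).
    y = np.where(rng.random(n) < 0.4, 0, rng.poisson(np.exp(0.3 + 0.5 * x)))

    fits = {"Poisson": sm.Poisson(y, X).fit(disp=0),
            "NegBin": sm.NegativeBinomial(y, X).fit(disp=0),
            "ZIP": ZeroInflatedPoisson(y, X).fit(disp=0)}
    for name, res in fits.items():
        mu = res.predict(X)
        print(name,
              "RMSE=%.3f" % np.sqrt(np.mean((y - mu) ** 2)),
              "MAE=%.3f" % np.mean(np.abs(y - mu)))
    ```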

  7. Multiple electron processes of He and Ne by proton impact

    NASA Astrophysics Data System (ADS)

    Terekhin, Pavel Nikolaevich; Montenegro, Pablo; Quinto, Michele; Monti, Juan; Fojon, Omar; Rivarola, Roberto

    2016-05-01

    A detailed investigation of multiple electron processes (single and multiple ionization, single capture, transfer-ionization) of He and Ne is presented for proton impact at intermediate and high collision energies. Exclusive absolute cross sections for these processes have been obtained by calculating transition probabilities in the independent electron and independent event models as a function of impact parameter, in the framework of the continuum distorted wave-eikonal initial state theory. A binomial analysis is employed to calculate exclusive probabilities. The comparison with available theoretical and experimental results shows that exclusive probabilities are needed for a reliable description of the experimental data. The developed approach can be used to obtain the input database for modeling multiple electron processes of charged particles passing through matter.
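
    In the independent-electron model, the binomial analysis referred to above builds the exclusive probability of ionizing exactly q of N equivalent electrons from a single-electron probability p(b). A minimal illustration (values chosen arbitrarily):

    ```python
    from math import comb

    def exclusive_prob(p, N, q):
        """P(exactly q of N equivalent electrons undergo the process),
        given single-electron probability p at impact parameter b."""
        return comb(N, q) * p**q * (1 - p)**(N - q)

    # He (N=2) with p(b) = 0.3: single vs double ionization
    print(exclusive_prob(0.3, 2, 1), exclusive_prob(0.3, 2, 2))  # 0.42 0.09
    ```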

  8. QMRA for Drinking Water: 2. The Effect of Pathogen Clustering in Single-Hit Dose-Response Models.

    PubMed

    Nilsen, Vegard; Wyller, John

    2016-01-01

    Spatial and/or temporal clustering of pathogens will invalidate the commonly used assumption of Poisson-distributed pathogen counts (doses) in quantitative microbial risk assessment. In this work, the theoretically predicted effect of spatial clustering in conventional "single-hit" dose-response models is investigated by employing the stuttering Poisson distribution, a very general family of count distributions that naturally models pathogen clustering and contains the Poisson and negative binomial distributions as special cases. The analysis is facilitated by formulating the dose-response models in terms of probability generating functions. It is shown formally that the theoretical single-hit risk obtained with a stuttering Poisson distribution is lower than that obtained with a Poisson distribution, assuming identical mean doses. A similar result holds for mixed Poisson distributions. Numerical examples indicate that the theoretical single-hit risk is fairly insensitive to moderate clustering, though the effect tends to be more pronounced for low mean doses. Furthermore, using Jensen's inequality, an upper bound on risk is derived that tends to better approximate the exact theoretical single-hit risk for highly overdispersed dose distributions. The bound holds with any dose distribution (characterized by its mean and zero inflation index) and any conditional dose-response model that is concave in the dose variable. Its application is exemplified with published data from Norovirus feeding trials, for which some of the administered doses were prepared from an inoculum of aggregated viruses. The potential implications of clustering for dose-response assessment as well as practical risk characterization are discussed. © 2016 Society for Risk Analysis.
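
    The PGF formulation makes the clustering effect easy to check numerically: the single-hit risk is 1 - G(1 - r), with G the PGF of the dose distribution and r the per-pathogen hit probability. For matched mean doses, a negative binomial (one stuttering-Poisson special case) gives lower risk than a Poisson, consistent with the result above. The values below are illustrative only.

    ```python
    import numpy as np

    def risk_poisson(mu, r):
        # Poisson PGF: G(s) = exp(mu*(s - 1)); risk = 1 - G(1 - r)
        return 1 - np.exp(-mu * r)

    def risk_negbin(mu, r, k):
        # Negative binomial PGF (mean mu, dispersion k): G(s) = (1 + mu*(1-s)/k)**(-k)
        return 1 - (1 + mu * r / k) ** (-k)

    mu, r = 10.0, 0.05   # assumed mean dose and per-pathogen hit probability
    print(risk_poisson(mu, r), risk_negbin(mu, r, k=0.5))  # 0.393 > 0.293
    ```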

  9. Exploring the effects of roadway characteristics on the frequency and severity of head-on crashes: case studies from Malaysian federal roads.

    PubMed

    Hosseinpour, Mehdi; Yahaya, Ahmad Shukri; Sadullah, Ahmad Farhan

    2014-01-01

    Head-on crashes are among the most severe collision types and of great concern to road safety authorities, which justifies efforts to reduce both the frequency and severity of this collision type. To this end, it is necessary to first identify the factors associated with crash occurrence. This can be done by developing crash prediction models that relate crash outcomes to a set of contributing factors. This study identifies the factors affecting both the frequency and severity of head-on crashes that occurred on 448 segments of five federal roads in Malaysia. Data on road characteristics and crash history were collected on the study segments during a 4-year period between 2007 and 2010. The frequency of head-on crashes was fitted by developing and comparing seven count-data models including Poisson, standard negative binomial (NB), random-effect negative binomial, hurdle Poisson, hurdle negative binomial, zero-inflated Poisson, and zero-inflated negative binomial models. To model crash severity, a random-effect generalized ordered probit model (REGOPM) was used, given that a head-on crash had occurred. With respect to crash frequency, the random-effect negative binomial (RENB) model was found to outperform the other models according to goodness-of-fit measures. Based on the results of the model, the variables horizontal curvature, terrain type, heavy-vehicle traffic, and access points were found to be positively related to the frequency of head-on crashes, while posted speed limit and shoulder width decreased the crash frequency. With regard to crash severity, the results of the REGOPM showed that horizontal curvature, paved shoulder width, terrain type, and side friction were associated with more severe crashes, whereas land use, access points, and presence of a median reduced the probability of severe crashes. Based on the results of this study, some potential countermeasures were proposed to minimize the risk of head-on crashes. Copyright © 2013 Elsevier Ltd. All rights reserved.

  10. Bycatch, bait, anglers, and roads: quantifying vector activity and propagule introduction risk across lake ecosystems.

    PubMed

    Drake, D Andrew R; Mandrak, Nicholas E

    2014-06-01

    Long implicated in the invasion process, live-bait anglers are highly mobile species vectors with frequent overland transport of fishes. To test hypotheses about the role of anglers in propagule transport, we developed a social-ecological model quantifying the opportunity for species transport beyond the invaded range resulting from bycatch during commercial bait operations, incidental transport, and release to lake ecosystems by anglers. We combined a gravity model with a stochastic, agent-based simulation, representing a 1-yr iteration of live-bait angling and the dynamics of propagule transport at fine spatiotemporal scales (i.e., probability of introducing n propagules per lake per year). A baseline scenario involving round goby (Neogobius melanostomus) indicated that most angling trips were benign; irrespective of lake visitation, anglers failed to purchase and transport propagules (benign trips, median probability P = 0.99912). However, given the large number of probability trials (4.2 million live-bait angling events per year), even the rarest sequence of events (uptake, movement, and deposition of propagules) is anticipated to occur. Risky trips (modal P = 0.00088, approximately 1 in 1136 trips) were sufficient to introduce a substantial number of propagules (modal values, Poisson model = 3715 propagules among 1288 lakes per year; zero-inflated negative binomial model = 6722 propagules among 1292 lakes per year). Two patterns of lake-specific introduction risk emerged. Large lakes supporting substantial angling activity experienced propagule pressure likely to surpass demographic barriers to establishment (top 2.5% of lakes with modal outcomes of five to 76 propagules per year; 303 high-risk lakes with three or more propagules per year). Small or remote lakes were less likely to receive propagules; however, most risk distributions were leptokurtic with a long right tail, indicating the rare occurrence of high propagule loads to most waterbodies. Infestation simulations indicated that the number of high-risk waterbodies could be as great as 1318 (zero-inflated negative binomial), whereas a 90% reduction in bycatch from baseline would reduce the modal number of high-risk lakes to zero. Results indicate that the combination of invasive bycatch and live-bait anglers warrants management concern as a species vector, but that risk is confined to a subset of individuals and recipient sites that may be effectively managed with targeted strategies.

  11. A Methodology for Quantifying Certain Design Requirements During the Design Phase

    NASA Technical Reports Server (NTRS)

    Adams, Timothy; Rhodes, Russel

    2005-01-01

    A methodology for developing and balancing quantitative design requirements for safety, reliability, and maintainability has been proposed. Conceived as the basis of a more rational approach to the design of spacecraft, the methodology would also be applicable to the design of automobiles, washing machines, television receivers, or almost any other commercial product. Heretofore, it has been common practice to start by determining the requirements for reliability of elements of a spacecraft or other system to ensure a given design life for the system. Next, safety requirements are determined by assessing the total reliability of the system and adding redundant components and subsystems necessary to attain safety goals. As thus described, common practice leaves the maintainability burden to fall to chance; therefore, there is no control of recurring costs or of the responsiveness of the system. The means that have been used in assessing maintainability have been oriented toward determining the logistical sparing of components so that the components are available when needed. The process established for developing and balancing quantitative requirements for safety (S), reliability (R), and maintainability (M) derives and integrates NASA's top-level safety requirements and the controls needed to obtain program key objectives for safety and recurring cost (see figure). Being quantitative, the process conveniently uses common mathematical models. Even though the process is shown as being worked from the top down, it can also be worked from the bottom up. This process uses three math models: (1) the binomial distribution (greater-than-or-equal-to case), (2) reliability for a series system, and (3) the Poisson distribution (less-than-or-equal-to case). The zero-fail case for the binomial distribution approximates the commonly known exponential distribution or "constant failure rate" distribution. Either model can be used. The binomial distribution was selected for modeling flexibility because it conveniently addresses both the zero-fail and failure cases. The failure case is typically used for unmanned spacecraft, as it is for missiles.
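
    A compact numerical illustration of the three models named above, with assumed values: the binomial zero-fail case, its constant-failure-rate (exponential) approximation, a series-system reliability, and the Poisson less-than-or-equal-to case.

    ```python
    import math
    from scipy import stats

    R, n = 0.99, 20                    # per-trial success probability, trials

    p_zero_fail = R**n                 # (1) binomial, zero-fail case
    p_exponential = math.exp(-n * (1 - R))        # constant-failure-rate approximation
    p_series = 0.999 * 0.995 * 0.99    # (2) series system: product of element reliabilities
    p_le_one = stats.poisson.cdf(1, n * (1 - R))  # (3) Poisson, at most 1 failure

    print(p_zero_fail, p_exponential, p_series, p_le_one)  # 0.818 0.819 0.984 0.983
    ```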

  12. Broad distribution spectrum from Gaussian to power law appears in stochastic variations in RNA-seq data.

    PubMed

    Awazu, Akinori; Tanabe, Takahiro; Kamitani, Mari; Tezuka, Ayumi; Nagano, Atsushi J

    2018-05-29

    Gene expression levels exhibit stochastic variations among genetically identical organisms under the same environmental conditions. In many recent transcriptome analyses based on RNA sequencing (RNA-seq), variations in gene expression levels among replicates were assumed to follow a negative binomial distribution, although the physiological basis of this assumption remains unclear. In this study, RNA-seq data were obtained from Arabidopsis thaliana under eight conditions (21-27 replicates), and the characteristics of gene-dependent empirical probability density function (ePDF) profiles of gene expression levels were analyzed. For A. thaliana and Saccharomyces cerevisiae, various types of ePDFs of gene expression levels were obtained, classified as Gaussian, power-law-like with a long tail, or intermediate. These ePDF profiles were well fitted with a Gauss-power mixing distribution function derived from a simple model of a stochastic transcriptional network containing a feedback loop. The fitting function suggested that gene expression levels with long-tailed ePDFs would be strongly influenced by feedback regulation. Furthermore, the features of gene expression levels are correlated with their functions, with the levels of essential genes tending to follow a Gaussian-like ePDF while those of genes encoding nucleic acid-binding proteins and transcription factors exhibit long-tailed ePDFs.

  13. Application of a hurdle negative binomial count data model to demand for bass fishing in the southeastern United States.

    PubMed

    Bilgic, Abdulbaki; Florkowski, Wojciech J

    2007-06-01

    This paper identifies factors that influence the demand for a bass fishing trip taken in the southeastern United States using a hurdle negative binomial count data model. The probability of fishing for a bass is estimated in the first stage and the fishing trip frequency is estimated in the second stage for individuals reporting bass fishing trips in the Southeast. The applied approach allows the decomposition of the effects of factors responsible for the decision to take a trip and the trip number. Calculated partial and total elasticities indicate a highly inelastic demand for the number of fishing trips as trip costs increase. However, the demand can be expected to increase if anglers experience a success measured by the number of caught fish or their size. Benefit estimates based on alternative estimation methods differ substantially, suggesting the need for testing each modeling approach applied in empirical studies.

  14. A binomial modeling approach for upscaling colloid transport under unfavorable conditions: Emergent prediction of extended tailing

    NASA Astrophysics Data System (ADS)

    Hilpert, Markus; Rasmuson, Anna; Johnson, William P.

    2017-07-01

    Colloid transport in saturated porous media is significantly influenced by colloidal interactions with grain surfaces. Near-surface fluid domain colloids experience relatively low fluid drag and relatively strong colloidal forces that slow their downgradient translation relative to colloids in bulk fluid. Near-surface fluid domain colloids may reenter the bulk fluid via diffusion (nanoparticles) or expulsion at rear flow stagnation zones, they may immobilize (attach) via primary minimum interactions, or they may move along a grain-to-grain contact to the near-surface fluid domain of an adjacent grain. We introduce a simple model that accounts for all possible permutations of mass transfer within a dual pore and grain network. The primary phenomena thereby represented in the model are mass transfer of colloids between the bulk and near-surface fluid domains and immobilization. Colloid movement is described by a Markov chain, i.e., a sequence of trials in a 1-D network of unit cells, which contain a pore and a grain. Using combinatorial analysis, which utilizes the binomial coefficient, we derive the residence time distribution, i.e., an inventory of the discrete colloid travel times through the network and their probabilities of occurrence. To parameterize the network model, we performed mechanistic pore-scale simulations in a single unit cell that determined the likelihoods and timescales associated with the above colloid mass transfer processes. We found that intergrain transport of colloids in the near-surface fluid domain can cause extended tailing, which has traditionally been attributed to hydrodynamic dispersion emanating from flow tortuosity of solute trajectories.
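
    The binomial travel-time inventory can be sketched directly: if a colloid takes k slow (near-surface) steps out of N unit cells, each independently with probability p, the travel-time distribution is a binomial mixture whose large-k tail yields the extended tailing described above. All parameter values below are assumed.

    ```python
    from math import comb
    import numpy as np

    N = 50                       # unit cells in the 1-D network (assumed)
    p_ns = 0.2                   # per-cell probability of a near-surface step (assumed)
    t_bulk, t_ns = 1.0, 25.0     # per-cell residence times, arbitrary units

    ks = np.arange(N + 1)
    probs = np.array([comb(N, k) * p_ns**k * (1 - p_ns)**(N - k) for k in ks])
    times = ks * t_ns + (N - ks) * t_bulk

    print(probs[ks >= 20].sum())  # small but nonzero mass at very long travel times
    ```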

  15. Mixing in High Schmidt Number Turbulent Jets

    DTIC Science & Technology

    1991-01-01

    the higher Sc jet is less well mixed; the difference is less pronounced at higher Re. Flame length estimates imply either an increase in entrainment... [Remainder of this record is table-of-contents and nomenclature residue; recoverable symbols: Lf = flame length; N = number of trials (Eq. 3.1); p = probability of a binomial event (Eq. 3.1); p = exponent in fits of the variance behavior with Re.]

  16. Introductory Statistics in the Garden

    ERIC Educational Resources Information Center

    Wagaman, John C.

    2017-01-01

    This article describes four semesters of introductory statistics courses that incorporate service learning and gardening into the curriculum with applications of the binomial distribution, least squares regression and hypothesis testing. The activities span multiple semesters and are iterative in nature.

  17. Inferring subunit stoichiometry from single molecule photobleaching

    PubMed Central

    2013-01-01

    Single molecule photobleaching is a powerful tool for determining the stoichiometry of protein complexes. By attaching fluorophores to proteins of interest, the number of associated subunits in a complex can be deduced by imaging single molecules and counting fluorophore photobleaching steps. Because some bleaching steps might be unobserved, the ensemble of steps will be binomially distributed. In this work, it is shown that inferring the true composition of a complex from such data is nontrivial because binomially distributed observations present an ill-posed inference problem. That is, a unique and optimal estimate of the relevant parameters cannot be extracted from the observations. Because of this, a method has not been firmly established to quantify confidence when using this technique. This paper presents a general inference model for interpreting such data and provides methods for accurately estimating parameter confidence. The formalization and methods presented here provide a rigorous analytical basis for this pervasive experimental tool. PMID:23712552
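
    The ill-posedness is easy to see numerically: observed step counts k ~ Binomial(n, p), with both the subunit number n and the detection probability p unknown, produce a near-flat likelihood ridge along constant n*p. A toy simulation (not the paper's inference model):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    data = rng.binomial(4, 0.7, size=200)   # simulated truth: n=4 subunits, p=0.7

    def loglik(n, p):
        return stats.binom.logpmf(data, n, p).sum()

    # (n=4, p=0.7) and (n=5, p=0.56) share the same mean n*p = 2.8
    print(loglik(4, 0.7), loglik(5, 0.56))  # similar log-likelihoods: ill-posed
    ```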

  18. Interpreting carnivore scent-station surveys

    USGS Publications Warehouse

    Sargeant, G.A.; Johnson, D.H.; Berg, W.E.

    1998-01-01

    The scent-station survey method has been widely used to estimate trends in carnivore abundance. However, statistical properties of scent-station data are poorly understood, and the relation between scent-station indices and carnivore abundance has not been adequately evaluated. We assessed properties of scent-station indices by analyzing data collected in Minnesota during 1986-93. Visits to stations separated by <2 km were correlated for all species because individual carnivores sometimes visited several stations in succession. Thus, visits to stations had an intractable statistical distribution. Dichotomizing results for lines of 10 stations (0 or ≥1 visits) produced binomially distributed data that were robust to multiple visits by individuals. We abandoned 2-way comparisons among years in favor of tests for population trend, which are less susceptible to bias, and analyzed results separately for biogeographic sections of Minnesota because trends differed among sections. Before drawing inferences about carnivore population trends, we reevaluated published validation experiments. Results implicated low statistical power and confounding as possible explanations for equivocal or conflicting results of validation efforts. Long-term trends in visitation rates probably reflect real changes in populations, but poor spatial and temporal resolution, susceptibility to confounding, and low statistical power limit the usefulness of this survey method.

  19. [Sequential sampling plans to Orthezia praelonga Douglas (Hemiptera: Sternorrhyncha, Ortheziidae) in citrus].

    PubMed

    Costa, Marilia G; Barbosa, José C; Yamamoto, Pedro T

    2007-01-01

    Sequential sampling is characterized by samples of variable size, and has the advantage of reducing sampling time and costs compared to fixed-size sampling. To support adequate management of orthezia, sequential sampling plans were developed for orchards under low and high infestation. Data were collected in Matão, SP, in commercial stands of the orange variety 'Pêra Rio', at five, nine and 15 years of age. Twenty samplings were performed in the whole area of each stand by observing the presence or absence of scales on plants, with each plot comprising ten plants. After observing that in all three stands the scale population was distributed according to the contagious model, fitting the negative binomial distribution in most samplings, two sequential sampling plans were constructed according to the Sequential Likelihood Ratio Test (SLRT). To construct these plans an economic threshold of 2% was adopted and the type I and II error probabilities were fixed at alpha = beta = 0.10. Results showed that the maximum numbers of samples expected to determine the need for control were 172 and 76 for stands with low and high infestation, respectively.
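
    A generic Wald-style sequential test for presence/absence sampling can be sketched as below; the infestation proportions and error rates are placeholders, not the plan parameters derived in the study.

    ```python
    import numpy as np

    p0, p1 = 0.02, 0.08        # proportions infested under H0 / H1 (assumed)
    alpha = beta = 0.10
    lo, hi = np.log(beta / (1 - alpha)), np.log((1 - beta) / alpha)

    def sprt(observations):
        """observations: 1 if scales present on the sampled plot, else 0."""
        llr = 0.0
        for n, x in enumerate(observations, 1):
            llr += x * np.log(p1 / p0) + (1 - x) * np.log((1 - p1) / (1 - p0))
            if llr <= lo:
                return n, "accept H0: below threshold"
            if llr >= hi:
                return n, "reject H0: control needed"
        return len(observations), "continue sampling"

    print(sprt([0, 0, 1, 0, 1, 1, 1]))  # decision reached after 5 samples
    ```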

  20. Non-normal Distributions Commonly Used in Health, Education, and Social Sciences: A Systematic Review

    PubMed Central

    Bono, Roser; Blanca, María J.; Arnau, Jaume; Gómez-Benito, Juana

    2017-01-01

    Statistical analysis is crucial for research and the choice of analytical technique should take into account the specific distribution of data. Although the data obtained from health, educational, and social sciences research are often not normally distributed, there are very few studies detailing which distributions are most likely to represent data in these disciplines. The aim of this systematic review was to determine the frequency of appearance of the most common non-normal distributions in the health, educational, and social sciences. The search was carried out in the Web of Science database, from which we retrieved the abstracts of papers published between 2010 and 2015. The selection was made on the basis of the title and the abstract, and was performed independently by two reviewers. The inter-rater reliability for article selection was high (Cohen’s kappa = 0.84), and agreement regarding the type of distribution reached 96.5%. A total of 262 abstracts were included in the final review. The distribution of the response variable was reported in 231 of these abstracts, while in the remaining 31 it was merely stated that the distribution was non-normal. In terms of their frequency of appearance, the most-common non-normal distributions can be ranked in descending order as follows: gamma, negative binomial, multinomial, binomial, lognormal, and exponential. In addition to identifying the distributions most commonly used in empirical studies these results will help researchers to decide which distributions should be included in simulation studies examining statistical procedures. PMID:28959227

  1. Buffon, Georges Louis Leclerc Comte de (1707-88)

    NASA Astrophysics Data System (ADS)

    Murdin, P.

    2000-11-01

    French naturalist. Discovered the binomial theorem and worked on probability theory. In astronomy he suggested that the Earth might have been created by the collision of a comet with the Sun. Based on the cooling rate of iron, he calculated in Théorie de la Terre that the age of the Earth was 75 000 years. This estimate, so much larger than the official 6000 years, was condemned by the Catholic C...

  2. Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment.

    PubMed

    Gierliński, Marek; Cole, Christian; Schofield, Pietà; Schurch, Nicholas J; Sherstnev, Alexander; Singh, Vijender; Wrobel, Nicola; Gharbi, Karim; Simpson, Gordon; Owen-Hughes, Tom; Blaxter, Mark; Barton, Geoffrey J

    2015-11-15

    High-throughput RNA sequencing (RNA-seq) is now the standard method to determine differential gene expression. Identifying differentially expressed genes crucially depends on estimates of read-count variability. These estimates are typically based on statistical models such as the negative binomial distribution, which is employed by the tools edgeR, DESeq and cuffdiff. Until now, the validity of these models has usually been tested on either low-replicate RNA-seq data or simulations. A 48-replicate RNA-seq experiment in yeast was performed and data tested against theoretical models. The observed gene read counts were consistent with both log-normal and negative binomial distributions, while the mean-variance relation followed the line of constant dispersion parameter of ∼0.01. The high-replicate data also allowed for strict quality control and screening of 'bad' replicates, which can drastically affect the gene read-count distribution. RNA-seq data have been submitted to ENA archive with project ID PRJEB5348. g.j.barton@dundee.ac.uk. © The Author 2015. Published by Oxford University Press.
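
    The quoted dispersion of ∼0.01 pins down the NB2 mean-variance relation, var = mu + alpha*mu**2; a two-line check shows how overdispersion relative to Poisson grows with expression level.

    ```python
    import numpy as np

    alpha = 0.01                                   # dispersion, cf. the ~0.01 above
    mu = np.array([10.0, 100.0, 1000.0, 10000.0])  # mean read counts
    var = mu + alpha * mu**2
    print(var / mu)   # overdispersion factors: 1.1, 2, 11, 101
    ```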

  3. Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment

    PubMed Central

    Cole, Christian; Schofield, Pietà; Schurch, Nicholas J.; Sherstnev, Alexander; Singh, Vijender; Wrobel, Nicola; Gharbi, Karim; Simpson, Gordon; Owen-Hughes, Tom; Blaxter, Mark; Barton, Geoffrey J.

    2015-01-01

    Motivation: High-throughput RNA sequencing (RNA-seq) is now the standard method to determine differential gene expression. Identifying differentially expressed genes crucially depends on estimates of read-count variability. These estimates are typically based on statistical models such as the negative binomial distribution, which is employed by the tools edgeR, DESeq and cuffdiff. Until now, the validity of these models has usually been tested on either low-replicate RNA-seq data or simulations. Results: A 48-replicate RNA-seq experiment in yeast was performed and data tested against theoretical models. The observed gene read counts were consistent with both log-normal and negative binomial distributions, while the mean-variance relation followed the line of constant dispersion parameter of ∼0.01. The high-replicate data also allowed for strict quality control and screening of ‘bad’ replicates, which can drastically affect the gene read-count distribution. Availability and implementation: RNA-seq data have been submitted to ENA archive with project ID PRJEB5348. Contact: g.j.barton@dundee.ac.uk PMID:26206307

  4. Technical and biological variance structure in mRNA-Seq data: life in the real world

    PubMed Central

    2012-01-01

    Background mRNA expression data from next generation sequencing platforms is obtained in the form of counts per gene or exon. Counts have classically been assumed to follow a Poisson distribution in which the variance is equal to the mean. The negative binomial distribution, which allows for over-dispersion, i.e., for the variance to be greater than the mean, is commonly used to model count data as well. Results In mRNA-Seq data from 25 subjects, we found technical variation to generally follow a Poisson distribution as has been reported previously, and biological variability was over-dispersed relative to the Poisson model. The mean-variance relationship across all genes was quadratic, in keeping with a negative binomial (NB) distribution. Over-dispersed Poisson and NB distributional assumptions demonstrated marked improvements in goodness-of-fit (GOF) over the standard Poisson model assumptions, but with evidence of over-fitting in some genes. Modeling of experimental effects improved GOF for high variance genes but increased the over-fitting problem. Conclusions These findings will guide the development of analytical strategies for accurate modeling of variance structure in these data and for sample size determination, which in turn will aid in the identification of true biological signals that inform our understanding of biological systems. PMID:22769017

  5. Behavioral Analysis of Visitors to a Medical Institution's Website Using Markov Chain Monte Carlo Methods.

    PubMed

    Suzuki, Teppei; Tani, Yuji; Ogasawara, Katsuhiko

    2016-07-25

    Consistent with the "attention, interest, desire, memory, action" (AIDMA) model of consumer behavior, patients collect information about available medical institutions using the Internet to select information for their particular needs. Studies of consumer behavior may be found in areas other than medical institution websites. Such research uses Web access logs for visitor search behavior. At this time, research applying the patient searching behavior model to medical institution website visitors is lacking. We have developed a hospital website search behavior model using a Bayesian approach to clarify the behavior of medical institution website visitors and determine the probability of their visits, classified by search keyword. We used the website data access log of a clinic of internal medicine and gastroenterology in the Sapporo suburbs, collecting data from January 1 through June 30, 2011. The contents of the 6 website pages included the following: home, news, content introduction for medical examinations, mammography screening, holiday person-on-duty information, and other. The search keywords we identified as best expressing website visitor needs were listed as the top 4 headings from the access log: clinic name, clinic name + regional name, clinic name + medical examination, and mammography screening. Using the search keywords as the explanatory variable, we built a binomial probit model that allows inspection of the contents of each response variable. Using this model, we determined a beta value and generated a posterior distribution. We performed the simulation using Markov Chain Monte Carlo methods with a noninformative prior distribution for this model and determined the visit probability classified by keyword for each category. In the case of the keyword "clinic name," the visit probability to the website, repeated visit to the website, and contents page for medical examination was positive. In the case of the keyword "clinic name + regional name," the probability for a repeated visit to the website and the mammography screening page was negative. In the case of the keyword "clinic name + medical examination," the visit probability to the website was positive, and the visit probability to the information page was negative. When visitors referred to the keywords "mammography screening," the visit probability to the mammography screening page was positive (95% highest posterior density interval = 3.38-26.66). Further analysis for not only the clinic website but also various other medical institution websites is necessary to build a general inspection model for medical institution websites; we want to consider this in future research. Additionally, we hope to use the results obtained in this study as a prior distribution for future work to conduct higher-precision analysis.

  6. Behavioral Analysis of Visitors to a Medical Institution’s Website Using Markov Chain Monte Carlo Methods

    PubMed Central

    Tani, Yuji

    2016-01-01

    Background Consistent with the “attention, interest, desire, memory, action” (AIDMA) model of consumer behavior, patients collect information about available medical institutions using the Internet to select information for their particular needs. Studies of consumer behavior may be found in areas other than medical institution websites. Such research uses Web access logs for visitor search behavior. At this time, research applying the patient searching behavior model to medical institution website visitors is lacking. Objective We have developed a hospital website search behavior model using a Bayesian approach to clarify the behavior of medical institution website visitors and determine the probability of their visits, classified by search keyword. Methods We used the website data access log of a clinic of internal medicine and gastroenterology in the Sapporo suburbs, collecting data from January 1 through June 30, 2011. The contents of the 6 website pages included the following: home, news, content introduction for medical examinations, mammography screening, holiday person-on-duty information, and other. The search keywords we identified as best expressing website visitor needs were listed as the top 4 headings from the access log: clinic name, clinic name + regional name, clinic name + medical examination, and mammography screening. Using the search keywords as the explanatory variable, we built a binomial probit model that allows inspection of the contents of each response variable. Using this model, we determined a beta value and generated a posterior distribution. We performed the simulation using Markov Chain Monte Carlo methods with a noninformative prior distribution for this model and determined the visit probability classified by keyword for each category. Results In the case of the keyword “clinic name,” the visit probability to the website, repeated visit to the website, and contents page for medical examination was positive. In the case of the keyword “clinic name + regional name,” the probability for a repeated visit to the website and the mammography screening page was negative. In the case of the keyword “clinic name + medical examination,” the visit probability to the website was positive, and the visit probability to the information page was negative. When visitors referred to the keywords “mammography screening,” the visit probability to the mammography screening page was positive (95% highest posterior density interval = 3.38-26.66). Conclusions Further analysis for not only the clinic website but also various other medical institution websites is necessary to build a general inspection model for medical institution websites; we want to consider this in future research. Additionally, we hope to use the results obtained in this study as a prior distribution for future work to conduct higher-precision analysis. PMID:27457537

  7. Do parent–child acculturation gaps affect early adolescent Latino alcohol use? A study of the probability and extent of use

    PubMed Central

    2013-01-01

    The literature has been mixed regarding how parent–child relationships are affected by the acculturation process and how this process relates to alcohol use among Latino youth. The mixed results may be due to at least two factors: First, staggered migration, in which one or both parents arrive in the new country and then send for the children, may lead to faster acculturation in parents than in children for some families. Second, acculturation may have different effects depending on which aspects of alcohol use are being examined. This study addresses the first factor by testing for a curvilinear trend in the acculturation-alcohol use relationship and the second by modeling past-year alcohol use as a zero-inflated negative binomial distribution. Additionally, this study examined the unique and mediation effects of parent–child acculturation discrepancies (gap), mother involvement in children’s schooling, father involvement in children’s schooling, and effective parenting on youth alcohol use during the last 12 months, measured as the probability of using and the extent of use. Direct paths from parent–child acculturation discrepancy to alcohol use, and mediated paths through mother involvement, father involvement, and effective parenting were also tested. Only father involvement fully mediated the path from parent–child acculturation discrepancies to the probability of alcohol use. None of the variables examined mediated the path from parent–child acculturation discrepancies to the extent of alcohol use. Effective parenting was unrelated to acculturation discrepancies; however, it maintained a significant direct effect on the probability of youth alcohol use and the extent of use after controlling for mother and father involvement. Implications for prevention strategies are discussed. PMID:23347822

  8. Accounting for Selection Bias in Studies of Acute Cardiac Events.

    PubMed

    Banack, Hailey R; Harper, Sam; Kaufman, Jay S

    2018-06-01

    In cardiovascular research, pre-hospital mortality represents an important potential source of selection bias. Inverse probability of censoring weights are a method to account for this source of bias. The objective of this article is to examine and correct for the influence of selection bias due to pre-hospital mortality on the relationship between cardiovascular risk factors and all-cause mortality after an acute cardiac event. The relationship between the number of cardiovascular disease (CVD) risk factors (0-5; smoking status, diabetes, hypertension, dyslipidemia, and obesity) and all-cause mortality was examined using data from the Atherosclerosis Risk in Communities (ARIC) study. To illustrate the magnitude of selection bias, estimates from an unweighted generalized linear model with a log link and binomial distribution were compared with estimates from an inverse-probability-of-censoring-weighted model. In unweighted multivariable analyses the estimated risk ratio for mortality ranged from 1.09 (95% confidence interval [CI], 0.98-1.21) for 1 CVD risk factor to 1.95 (95% CI, 1.41-2.68) for 5 CVD risk factors. In the inverse-probability-of-censoring-weighted analyses, the risk ratios ranged from 1.14 (95% CI, 0.94-1.39) to 4.23 (95% CI, 2.69-6.66). Estimates from the weighted model were substantially greater than unweighted, adjusted estimates across all risk factor categories. This shows the magnitude of selection bias due to pre-hospital mortality and its effect on estimates of the effect of CVD risk factors on mortality. Moreover, the results highlight the utility of using this method to address a common form of bias in cardiovascular research. Copyright © 2018 Canadian Cardiovascular Society. Published by Elsevier Inc. All rights reserved.
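
    A hedged sketch of the weighting scheme on simulated data (not the ARIC data): model selection (surviving to hospital) given the exposure, weight selected subjects by the inverse of their fitted selection probability, then fit a weighted log-binomial model for mortality. Assumes statsmodels; all coefficients are invented.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 2000
    risk_factors = rng.integers(0, 6, n)             # count of CVD risk factors
    p_select = 1 / (1 + np.exp(-(2.0 - 0.3 * risk_factors)))  # selection depends on exposure
    selected = rng.binomial(1, p_select)
    death = rng.binomial(1, 0.1 + 0.05 * risk_factors)

    X = sm.add_constant(risk_factors)
    w = 1 / sm.GLM(selected, X, family=sm.families.Binomial()).fit().predict(X)

    m = selected == 1                                # analysis restricted to the selected
    fit = sm.GLM(death[m], X[m],
                 family=sm.families.Binomial(link=sm.families.links.Log()),
                 freq_weights=w[m]).fit()
    print(np.exp(fit.params))                        # weighted risk ratios
    ```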

  9. An analytical framework for estimating aquatic species density from environmental DNA

    USGS Publications Warehouse

    Chambert, Thierry; Pilliod, David S.; Goldberg, Caren S.; Doi, Hideyuki; Takahara, Teruhiko

    2018-01-01

    Environmental DNA (eDNA) analysis of water samples is on the brink of becoming a standard monitoring method for aquatic species. This method has improved detection rates over conventional survey methods and thus has demonstrated effectiveness for estimation of site occupancy and species distribution. The frontier of eDNA applications, however, is to infer species density. Building upon previous studies, we present and assess a modeling approach that aims at inferring animal density from eDNA. The modeling combines eDNA and animal count data from a subset of sites to estimate species density (and associated uncertainties) at other sites where only eDNA data are available. As a proof of concept, we first perform a cross-validation study using experimental data on carp in mesocosms. In these data, fish densities are known without error, which allows us to test the performance of the method with known data. We then evaluate the model using field data from a study on a stream salamander species to assess the potential of this method to work in natural settings, where density can never be known with absolute certainty. Two alternative distributions (Normal and Negative Binomial) to model variability in eDNA concentration data are assessed. Assessment based on the proof of concept data (carp) revealed that the Negative Binomial model provided much more accurate estimates than the model based on a Normal distribution, likely because eDNA data tend to be overdispersed. Greater imprecision was found when we applied the method to the field data, but the Negative Binomial model still provided useful density estimates. We call for further model development in this direction, as well as further research targeted at sampling design optimization. It will be important to assess these approaches on a broad range of study systems.

  10. Fitting Cure Rate Model to Breast Cancer Data of Cancer Research Center.

    PubMed

    Baghestani, Ahmad Reza; Zayeri, Farid; Akbari, Mohammad Esmaeil; Shojaee, Leyla; Khadembashi, Naghmeh; Shahmirzalou, Parviz

    2015-01-01

    The Cox PH model is one of the most significant statistical models in studying the survival of patients. But in the case of patients with long-term survival, it may not be the most appropriate. In such cases, a cure rate model seems more suitable. The purpose of this study was to determine clinical factors associated with the cure rate of patients with breast cancer. In order to find factors affecting the cure rate (response), a non-mixed cure rate model with a negative binomial distribution for the latent variable was used. Variables selected were cancer recurrence, status for HER2, estrogen receptor (ER) and progesterone receptor (PR), size of tumor, grade of cancer, stage of cancer, type of surgery, age at the time of diagnosis and number of removed positive lymph nodes. All analyses were performed using PROC MCMC processes in the SAS 9.2 program. The mean (SD) age of patients was 48.9 (11.1) years. For these patients, 1-, 5- and 10-year survival rates were 95, 79 and 50 percent, respectively. All of the mentioned variables affected the cure fraction. The Kaplan-Meier curve supported the use of a cure model. Unlike the other variables, ER and PR positivity increased the probability of cure in patients. In the present study, the Weibull distribution was used for analysing survival times. Assessing model fit with other distributions, such as the log-normal and log-logistic, and with other distributions for the latent variable, is recommended.

  11. Statistical inference involving binomial and negative binomial parameters.

    PubMed

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2009-05-01

    Statistical inference about two binomial parameters implies that they are both estimated by binomial sampling. There are occasions in which one aims at testing the equality of two binomial parameters before and after the occurrence of the first success along a sequence of Bernoulli trials. In these cases, the binomial parameter before the first success is estimated by negative binomial sampling whereas that after the first success is estimated by binomial sampling, and both estimates are related. This paper derives statistical tools to test two hypotheses, namely, that both binomial parameters equal some specified value and that both parameters are equal though unknown. Simulation studies are used to show that in small samples both tests are accurate in keeping the nominal Type-I error rates, and also to determine sample size requirements to detect large, medium, and small effects with adequate power. Additional simulations also show that the tests are sufficiently robust to certain violations of their assumptions.

  12. Modeling avian abundance from replicated counts using binomial mixture models

    USGS Publications Warehouse

    Kery, Marc; Royle, J. Andrew; Schmid, Hans

    2005-01-01

    Abundance estimation in ecology is usually accomplished by capture–recapture, removal, or distance sampling methods. These may be hard to implement at large spatial scales. In contrast, binomial mixture models enable abundance estimation without individual identification, based simply on temporally and spatially replicated counts. Here, we evaluate mixture models using data from the national breeding bird monitoring program in Switzerland, where some 250 1-km² quadrats are surveyed using the territory mapping method three times during each breeding season. We chose eight species with contrasting distribution (wide–narrow), abundance (high–low), and detectability (easy–difficult). Abundance was modeled as a random effect with a Poisson or negative binomial distribution, with mean affected by forest cover, elevation, and route length. Detectability was a logit-linear function of survey date, survey date-by-elevation, and sampling effort (time per transect unit). Resulting covariate effects and parameter estimates were consistent with expectations. Detectability per territory (for three surveys) ranged from 0.66 to 0.94 (mean 0.84) for easy species, and from 0.16 to 0.83 (mean 0.53) for difficult species, depended on survey effort for two easy and all four difficult species, and changed seasonally for three easy and three difficult species. Abundance was positively related to route length in three high-abundance and one low-abundance (one easy and three difficult) species, and increased with forest cover in five forest species, decreased for two nonforest species, and was unaffected for a generalist species. Abundance estimates under the most parsimonious mixture models were between 1.1 and 8.9 (median 1.8) times greater than estimates based on territory mapping; hence, three surveys were insufficient to detect all territories for each species. We conclude that binomial mixture models are an important new approach for estimating abundance corrected for detectability when only repeated-count data are available. Future developments envisioned include estimation of trend, occupancy, and total regional abundance.
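
    The binomial mixture (N-mixture) likelihood can be written down directly: counts y[i, j] ~ Binomial(N_i, p) with N_i ~ Poisson(lambda), marginalizing the unobserved N_i up to a bound K. A minimal sketch on simulated data with assumed truth values:

    ```python
    import numpy as np
    from scipy import stats
    from scipy.optimize import minimize

    rng = np.random.default_rng(5)
    nsites, nvisits, lam_true, p_true = 100, 3, 4.0, 0.5
    N = rng.poisson(lam_true, nsites)
    y = rng.binomial(N[:, None], p_true, (nsites, nvisits))

    K = 50                                    # truncation bound for the sum over N
    ks = np.arange(K + 1)

    def negloglik(theta):
        lam, p = np.exp(theta[0]), 1 / (1 + np.exp(-theta[1]))
        prior = stats.poisson.pmf(ks, lam)    # P(N = k)
        ll = 0.0
        for i in range(nsites):
            like = prior * np.prod(stats.binom.pmf(y[i][:, None], ks, p), axis=0)
            ll += np.log(like.sum() + 1e-300)
        return -ll

    res = minimize(negloglik, x0=[np.log(3.0), 0.0], method="Nelder-Mead")
    print(np.exp(res.x[0]), 1 / (1 + np.exp(-res.x[1])))  # estimates of lambda, p
    ```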

  13. Distribution of chewing lice upon the polygynous peacock Pavo cristatus.

    PubMed

    Stewart, I R; Clark, F; Petrie, M

    1996-04-01

    An opportunistic survey of louse distribution upon the peacock Pavo cristatus was undertaken following a cull of 23 birds from an English zoo. After complete skin and feather dissolution, 2 species of lice were retrieved, Goniodes pavonis and Amyrsidea minuta. The distribution of both louse species could be described by a negative binomial model. The significance of this is discussed in relation to transmission dynamics of lice in the atypical avian mating system found in the peacock, which involves no male parental care.

  14. Making sense of the noise: The effect of hydrology on silver carp eDNA detection in the Chicago area waterway system.

    PubMed

    Song, Jeffery W; Small, Mitchell J; Casman, Elizabeth A

    2017-12-15

    Environmental DNA (eDNA) sampling is an emerging tool for monitoring the spread of aquatic invasive species. One confounding factor when interpreting eDNA sampling evidence is that eDNA can be present in the water in the absence of living target organisms, originating from sources such as excreta, dead tissue, boats, or sewage effluent. In the Chicago Area Waterway System (CAWS), electric fish dispersal barriers were built to prevent non-native Asian carp species from invading Lake Michigan, and yet Asian carp eDNA has been detected above the barriers sporadically since 2009. In this paper the influence of stream flow characteristics on the probability of invasive Asian carp eDNA detection in the CAWS from 2009 to 2012 was examined. In the CAWS, the direction of stream flow is mostly away from Lake Michigan, though there are infrequent reversals in flow direction towards Lake Michigan during dry spells. We find that the flow reversal volume into the Lake has a statistically significant positive relationship with eDNA detection probability, while other covariates, like gage height, precipitation, season, water temperature, dissolved oxygen concentration, pH and chlorophyll concentration, do not. This suggests that stream flow direction is highly influential on eDNA detection in the CAWS and should be considered when interpreting eDNA evidence. We also find that the beta-binomial regression model provides a stronger fit for eDNA detection probability compared to a binomial regression model. This paper provides a statistical modeling framework for interpreting eDNA sampling evidence and for evaluating covariates influencing eDNA detection. Copyright © 2017 Elsevier B.V. All rights reserved.
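
    The reason the beta-binomial wins here is extra-binomial variation in detections across field replicates. A short comparison of the two pmfs at a matched mean detection probability; the replicate count, mean, and intra-sample correlation are assumed.

    ```python
    import numpy as np
    from scipy import stats

    n, p, rho = 8, 0.3, 0.25          # replicates, mean detection prob, correlation
    a = p * (1 - rho) / rho           # beta parameters matching mean p
    b = (1 - p) * (1 - rho) / rho

    k = np.arange(n + 1)
    print(stats.binom.pmf(k, n, p).round(3))
    print(stats.betabinom.pmf(k, n, a, b).round(3))  # heavier mass at 0 and n
    ```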

  15. Sleep Disruption Medical Intervention Forecasting (SDMIF) Module for the Integrated Medical Model

    NASA Technical Reports Server (NTRS)

    Lewandowski, Beth; Brooker, John; Mallis, Melissa; Hursh, Steve; Caldwell, Lynn; Myers, Jerry

    2011-01-01

    The NASA Integrated Medical Model (IMM) assesses the risk, including likelihood and impact of occurrence, of all credible in-flight medical conditions. Fatigue due to sleep disruption is a condition that could lead to operational errors, potentially resulting in loss of mission or crew. Pharmacological consumables are mitigation strategies used to manage the risks associated with sleep deficits. The likelihood of medical intervention due to sleep disruption was estimated with a well validated sleep model and a Monte Carlo computer simulation in an effort to optimize the quantity of consumables. METHODS: The key components of the model are the mission parameter program, the calculation of sleep intensity and the diagnosis and decision module. The mission parameter program was used to create simulated daily sleep/wake schedules for an ISS increment. The hypothetical schedules included critical events such as dockings and extravehicular activities and included actual sleep time and sleep quality. The schedules were used as inputs to the Sleep, Activity, Fatigue and Task Effectiveness (SAFTE) Model (IBR Inc., Baltimore MD), which calculated sleep intensity. Sleep data from an ISS study was used to relate calculated sleep intensity to the probability of sleep medication use, using a generalized linear model for binomial regression. A human yes/no decision process using a binomial random number was also factored into sleep medication use probability. RESULTS: These probability calculations were repeated 5000 times resulting in an estimate of the most likely amount of sleep aids used during an ISS mission and a 95% confidence interval. CONCLUSIONS: These results were transferred to the parent IMM for further weighting and integration with other medical conditions, to help inform operational decisions. This model is a potential planning tool for ensuring adequate sleep during sleep disrupted periods of a mission.
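
    A toy stand-in for the Monte Carlo step described above: map a hypothetical sleep-intensity trace to nightly medication-use probabilities with a logistic (binomial-regression-style) curve, draw yes/no decisions, and repeat 5000 times to obtain a most-likely consumable quantity and a 95% interval. All parameters are invented.

    ```python
    import numpy as np

    rng = np.random.default_rng(11)
    ndays, nsims = 180, 5000
    intensity = rng.uniform(0.6, 1.0, ndays)                   # SAFTE-like sleep intensity
    p_use = 1 / (1 + np.exp(-(-3.0 + 2.5 * (1 - intensity))))  # nightly use probability

    totals = rng.binomial(1, p_use, (nsims, ndays)).sum(axis=1)
    print(np.median(totals), np.percentile(totals, [2.5, 97.5]))
    ```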

  16. Parameter Estimation for the Dirichlet-Multinomial Distribution Using Supplementary Beta-Binomial Data.

    DTIC Science & Technology

    1987-07-01

    Abstract not available; the record contains only reference-list fragments and OCR residue. Recoverable citations: "...multinomial distribution as a magazine exposure model," Journal of Marketing Research 21, 100-106; Lehmann, E.L. (1983), Theory of Point Estimation, John Wiley; Journal of Marketing Research 21, 89-99.

  17. Logistic quantile regression provides improved estimates for bounded avian counts: a case study of California Spotted Owl fledgling production

    Treesearch

    Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane

    2017-01-01

    Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...

  18. Yes, the GIGP Really Does Work--And Is Workable!

    ERIC Educational Resources Information Center

    Burrell, Quentin L.; Fenton, Michael R.

    1993-01-01

    Discusses the generalized inverse Gaussian-Poisson (GIGP) process for informetric modeling. Negative binomial distribution is discussed, construction of the GIGP process is explained, zero-truncated GIGP is considered, and applications of the process with journals, library circulation statistics, and database index terms are described. (50…

  19. Sampling Error in Relation to Cyst Nematode Population Density Estimation in Small Field Plots.

    PubMed

    Župunski, Vesna; Jevtić, Radivoje; Jokić, Vesna Spasić; Župunski, Ljubica; Lalošević, Mirjana; Ćirić, Mihajlo; Ćurčić, Živko

    2017-06-01

    Cyst nematodes are serious plant-parasitic pests that can cause severe yield losses and extensive damage. Since there is still very little information about the error of population density estimation in small field plots, this study contributes to the broad issue of population density assessment. It was shown that there was no significant difference between cyst counts of five or seven bulk samples taken per each 1-m² plot, if the average cyst count per examined plot exceeds 75 cysts per 100 g of soil. Goodness of fit of the data to a probability distribution, tested with the χ² test, confirmed a negative binomial distribution of cyst counts for 21 out of 23 plots. The recommended measure of sampling precision of 17%, expressed through the coefficient of variation (cv), was achieved if plots of 1 m² contaminated with more than 90 cysts per 100 g of soil were sampled with 10-core bulk samples taken in five repetitions. If plots were contaminated with less than 75 cysts per 100 g of soil, 10-core bulk samples taken in seven repetitions gave cv higher than 23%. This study indicates that more attention should be paid to estimation of sampling error in experimental field plots to ensure more reliable estimation of population density of cyst nematodes.

  20. Defect formation in LaGa(Mg,Ni)O3-δ : A statistical thermodynamic analysis validated by mixed conductivity and magnetic susceptibility measurements

    NASA Astrophysics Data System (ADS)

    Naumovich, E. N.; Kharton, V. V.; Yaremchenko, A. A.; Patrakeev, M. V.; Kellerman, D. G.; Logvinovich, D. I.; Kozhevnikov, V. L.

    2006-08-01

    A statistical thermodynamic approach for analyzing defect thermodynamics in strongly nonideal solid solutions was proposed and validated by a case study focused on the oxygen intercalation processes in mixed-conducting LaGa0.65Mg0.15Ni0.20O3-δ perovskite. The oxygen nonstoichiometry of Ni-doped lanthanum gallate, measured by coulometric titration and thermogravimetric analysis at 923-1223 K in the oxygen partial pressure range 5×10-5 to 0.9 atm, indicates the coexistence of the Ni2+, Ni3+, and Ni4+ oxidation states. The formation of tetravalent nickel was also confirmed by the magnetic susceptibility data at 77-600 K, and by the analysis of p-type electronic conductivity and the Seebeck coefficient as functions of the oxygen pressure at 1023-1223 K. The oxygen thermodynamics and the partial ionic and hole conductivities are strongly affected by point-defect interactions, primarily the Coulombic repulsion between oxygen vacancies and/or electron holes and the vacancy association with Mg2+ cations. These factors can be analyzed by introducing the defect interaction energy in the concentration-dependent part of the defect chemical potentials expressed by the discrete Fermi-Dirac distribution, and taking into account the probabilities of local configurations calculated via binomial distributions.

  1. Simulating high spatial resolution high severity burned area in Sierra Nevada forests for California Spotted Owl habitat climate change risk assessment and management.

    NASA Astrophysics Data System (ADS)

    Keyser, A.; Westerling, A. L.; Jones, G.; Peery, M. Z.

    2017-12-01

    Sierra Nevada forests have experienced an increase in very large fires with significant areas of high burn severity, such as the Rim (2013) and King (2014) fires, that have impacted habitat of endangered species such as the California spotted owl. In order to support land manager forest management planning and risk assessment activities, we used historical wildfire histories from the Monitoring Trends in Burn Severity project and gridded hydroclimate and land surface characteristics data to develop statistical models to simulate the frequency, location and extent of high severity burned area in Sierra Nevada forest wildfires as functions of climate and land surface characteristics. We define high severity here as BA90 area: the area comprising patches with ninety percent or more basal area killed within a larger fire. We developed a system of statistical models to characterize the probability of large fire occurrence, the probability of significant BA90 area present given a large fire, and the total extent of BA90 area in a fire on a 1/16 degree lat/lon grid over the Sierra Nevada. Repeated draws from binomial and generalized Pareto distributions using these probabilities generated a library of simulated histories of high severity fire for a range of near-term (50-year) future climate and fuels management scenarios. Fuels management scenarios were provided by USFS Region 5. Simulated BA90 area was then downscaled to 30 m resolution using a statistical model we developed using Random Forest techniques to estimate the probability of adjacent 30 m pixels burning with ninety percent basal kill as a function of fire size and vegetation and topographic features. The result is a library of simulated high-resolution maps of BA90 burned areas for a range of climate and fuels management scenarios with which we estimated conditional probabilities of owl nesting sites being impacted by high severity wildfire.
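
    The simulation step can be sketched as repeated binomial draws for large-fire occurrence and for the presence of significant BA90 area, followed by a generalized Pareto draw for its extent; every probability and distribution parameter below is assumed, not taken from the study.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    ncells, nyears = 400, 50
    p_fire, p_ba90 = 0.02, 0.6        # illustrative per-cell, per-year probabilities

    fires = rng.binomial(1, p_fire, (nyears, ncells))
    has_ba90 = fires * rng.binomial(1, p_ba90, (nyears, ncells))
    ba90 = has_ba90 * stats.genpareto.rvs(c=0.3, scale=100.0,
                                          size=(nyears, ncells), random_state=rng)
    print(ba90.sum(axis=1).mean())    # mean annual BA90 area, arbitrary units
    ```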

  2. Linking parasite populations in hosts to parasite populations in space through Taylor's law and the negative binomial distribution

    PubMed Central

    Poulin, Robert; Lagrue, Clément

    2017-01-01

    The spatial distribution of individuals of any species is a basic concern of ecology. The spatial distribution of parasites matters to control and conservation of parasites that affect human and nonhuman populations. This paper develops a quantitative theory to predict the spatial distribution of parasites based on the distribution of parasites in hosts and the spatial distribution of hosts. Four models are tested against observations of metazoan hosts and their parasites in littoral zones of four lakes in Otago, New Zealand. These models differ in two dichotomous assumptions, constituting a 2 × 2 theoretical design. One assumption specifies whether the variance function of the number of parasites per host individual is described by Taylor's law (TL) or the negative binomial distribution (NBD). The other assumption specifies whether the numbers of parasite individuals within each host in a square meter of habitat are independent or perfectly correlated among host individuals. We find empirically that the variance–mean relationship of the numbers of parasites per square meter is very well described by TL but is not well described by NBD. Two models that posit perfect correlation of the parasite loads of hosts in a square meter of habitat approximate observations much better than two models that posit independence of parasite loads of hosts in a square meter, regardless of whether the variance–mean relationship of parasites per host individual obeys TL or NBD. We infer that high local interhost correlations in parasite load strongly influence the spatial distribution of parasites. Local hotspots could influence control and conservation of parasites. PMID:27994156
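
    The two candidate variance functions can be written down directly; a brief sketch with hypothetical parameters shows how differently they scale with the mean.

        # Taylor's law: V = a * M**b.  Negative binomial with fixed k: V = M + M**2/k.
        # Parameter values are illustrative only, not the study's estimates.
        import numpy as np

        a, b = 1.5, 1.8   # hypothetical TL coefficients
        k = 0.7           # hypothetical NBD aggregation parameter

        for m in np.logspace(-1, 2, 7):   # mean parasites per unit
            print(f"mean={m:8.3f}  TL var={a * m**b:10.3f}  NBD var={m + m**2 / k:10.3f}")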

  3. Probability of lek collapse is lower inside sage-grouse Core Areas: Effectiveness of conservation policy for a landscape species.

    PubMed

    Spence, Emma Suzuki; Beck, Jeffrey L; Gregory, Andrew J

    2017-01-01

    Greater sage-grouse (Centrocercus urophasianus) occupy sagebrush (Artemisia spp.) habitats in 11 western states and 2 Canadian provinces. In September 2015, the U.S. Fish and Wildlife Service announced the listing status for sage-grouse had changed from warranted but precluded to not warranted. The primary reason cited for this change of status was that the enactment of new regulatory mechanisms was sufficient to protect sage-grouse populations. One such plan is the 2008 Wyoming Sage Grouse Executive Order (SGEO), enacted by Governor Freudenthal. The SGEO identifies "Core Areas" that are to be protected by keeping them relatively free from further energy development and limiting other forms of anthropogenic disturbances near active sage-grouse leks. Using the Wyoming Game and Fish Department's sage-grouse lek count database and the Wyoming Oil and Gas Conservation Commission database of oil and gas well locations, we investigated the effectiveness of Wyoming's Core Areas, specifically: 1) how well Core Areas encompass the distribution of sage-grouse in Wyoming, 2) whether Core Area leks have a reduced probability of lek collapse, and 3) what, if any, edge effects the intensification of oil and gas development adjacent to Core Areas may be having on Core Area populations. Core Areas contained 77% of male sage-grouse attending leks and 64% of active leks. Using Bayesian binomial probability analysis, we found an average 10.9% probability of lek collapse in Core Areas and an average 20.4% probability of lek collapse outside Core Areas. Using linear regression, we found development density outside Core Areas was related to the probability of lek collapse inside Core Areas. Specifically, probability of collapse among leks >4.83 km from inside Core Area boundaries was significantly related to well density within 1.61 km (1-mi) and 4.83 km (3-mi) outside of Core Area boundaries. Collectively, these data suggest that the Wyoming Core Area Strategy has benefited sage-grouse and sage-grouse habitat conservation; however, additional guidelines limiting development densities adjacent to Core Areas may be necessary to effectively protect Core Area populations.
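
    A hedged sketch of a Bayesian binomial analysis of this kind: with a uniform Beta(1, 1) prior, x collapsed leks out of n give a Beta(1 + x, 1 + n - x) posterior for the collapse probability. The counts below are hypothetical, chosen only to land near the reported averages.

        # Beta posterior for a binomial collapse probability; counts are invented.
        from scipy.stats import beta

        for label, x, n in [("inside Core Areas", 22, 200),
                            ("outside Core Areas", 41, 200)]:
            post = beta(1 + x, 1 + n - x)
            lo, hi = post.ppf([0.025, 0.975])
            print(f"{label}: posterior mean {post.mean():.3f} "
                  f"(95% CrI {lo:.3f}-{hi:.3f})")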

  4. Observer error structure in bull trout redd counts in Montana streams: Implications for inference on true redd numbers

    USGS Publications Warehouse

    Muhlfeld, Clint C.; Taper, Mark L.; Staples, David F.; Shepard, Bradley B.

    2006-01-01

    Despite the widespread use of redd counts to monitor trends in salmonid populations, few studies have evaluated the uncertainties in observed counts. We assessed the variability in redd counts for migratory bull trout Salvelinus confluentus among experienced observers in Lion and Goat creeks, which are tributaries to the Swan River, Montana. We documented substantially lower observer variability in bull trout redd counts than did previous studies. Observer counts ranged from 78% to 107% of our best estimates of true redd numbers in Lion Creek and from 90% to 130% of our best estimates in Goat Creek. Observers made both errors of omission and errors of false identification, and we modeled this combination by use of a binomial probability of detection and a Poisson count distribution of false identifications. Redd detection probabilities were high (mean = 83%) and exhibited no significant variation among observers (SD = 8%). We applied this error structure to annual redd counts in the Swan River basin (1982–2004) to correct for observer error and thus derived more accurate estimates of redd numbers and associated confidence intervals. Our results indicate that bias in redd counts can be reduced if experienced observers are used to conduct annual redd counts. Future studies should assess both sources of observer error to increase the validity of using redd counts for inferring true redd numbers in different basins. This information will help fisheries biologists to more precisely monitor population trends, identify recovery and extinction thresholds for conservation and recovery programs, ascertain and predict how management actions influence distribution and abundance, and examine effects of recovery and restoration activities.
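
    The error structure described above is easy to simulate; in this hedged sketch the detection probability echoes the reported 83% mean, while the true redd number and false-identification rate are assumptions.

        # Observed count = Binomial(true redds, p_detect) + Poisson(false IDs).
        import numpy as np

        rng = np.random.default_rng(1)
        true_redds = 120     # assumed true number of redds
        p_detect = 0.83      # mean detection probability reported in the study
        lam_false = 4.0      # mean false identifications per survey (assumption)

        counts = (rng.binomial(true_redds, p_detect, size=10_000)
                  + rng.poisson(lam_false, size=10_000))
        print(f"observed mean {counts.mean():.1f}, "
              f"ratio to truth {counts.mean() / true_redds:.2f}")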

  5. Indexing the relative abundance of age-0 white sturgeons in an impoundment of the lower Columbia River from highly skewed trawling data

    USGS Publications Warehouse

    Counihan, T.D.; Miller, Allen I.; Parsley, M.J.

    1999-01-01

    The development of recruitment monitoring programs for age-0 white sturgeons Acipenser transmontanus is complicated by the statistical properties of catch-per-unit-effort (CPUE) data. We found that age-0 CPUE distributions from bottom trawl surveys violated assumptions of statistical procedures based on normal probability theory. Further, no single data transformation uniformly satisfied these assumptions because CPUE distribution properties varied with the sample mean (μCPUE). Given these analytic problems, we propose that an additional index of age-0 white sturgeon relative abundance, the proportion of positive tows (Ep), be used to estimate sample sizes before conducting age-0 recruitment surveys and to evaluate statistical hypothesis tests comparing the relative abundance of age-0 white sturgeons among years. Monte Carlo simulations indicated that Ep was consistently more precise than μCPUE, and because Ep is binomially rather than normally distributed, surveys can be planned and analyzed without violating the assumptions of procedures based on normal probability theory. However, we show that Ep may underestimate changes in relative abundance at high levels and confound our ability to quantify responses to management actions if relative abundance is consistently high. If data suggest that most samples will contain age-0 white sturgeons, estimators of relative abundance other than Ep should be considered. Because Ep may also obscure correlations to climatic and hydrologic variables if high abundance levels are present in time series data, we recommend μCPUE be used to describe relations to environmental variables. The use of both Ep and μCPUE will facilitate the evaluation of hypothesis tests comparing relative abundance levels and correlations to variables affecting age-0 recruitment. Estimated sample sizes for surveys should therefore be based on detecting predetermined differences in Ep, but data necessary to calculate μCPUE should also be collected.
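
    Because Ep is a binomial proportion, survey planning reduces to standard binomial sample-size calculations. A minimal sketch under assumed values of the anticipated proportion and target precision:

        # Normal-approximation sample size for estimating a binomial proportion.
        import math
        from scipy.stats import norm

        p = 0.4   # anticipated proportion of positive tows (assumption)
        d = 0.1   # desired half-width of a 95% confidence interval (assumption)
        z = norm.ppf(0.975)
        n = math.ceil(z**2 * p * (1 - p) / d**2)
        print(f"required number of tows: {n}")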

  6. Implementing reduced-risk integrated pest management in fresh-market cabbage: influence of sampling parameters, and validation of binomial sequential sampling plans for the cabbage looper (Lepidoptera: Noctuidae).

    PubMed

    Burkness, Eric C; Hutchison, W D

    2009-10-01

    Populations of cabbage looper, Trichoplusia ni (Lepidoptera: Noctuidae), were sampled in experimental plots and commercial fields of cabbage (Brassica spp.) in Minnesota during 1998-1999 as part of a larger effort to implement an integrated pest management program. Using a resampling approach and Wald's sequential probability ratio test, sampling plans with different sampling parameters were evaluated using independent presence/absence and enumerative data. Evaluations and comparisons of the different sampling plans were made based on the operating characteristic and average sample number functions generated for each plan and through the use of a decision probability matrix. Values for upper and lower decision boundaries, sequential error rates (alpha, beta), and tally threshold were modified to determine parameter influence on the operating characteristic and average sample number functions. The following parameters resulted in the most desirable operating characteristic and average sample number functions: action threshold of 0.1 proportion of plants infested, tally threshold of 1, alpha = beta = 0.1, upper boundary of 0.15, lower boundary of 0.05, and resampling with replacement. We found that sampling parameters can be modified and evaluated using resampling software to achieve desirable operating characteristic and average sample number functions. Moreover, management of T. ni by using binomial sequential sampling should provide a good balance between cost and reliability by minimizing sample size and maintaining a high level of correct decisions (>95%) to treat or not treat.
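
    A hedged sketch of a binomial (presence/absence) sequential plan built on Wald's sequential probability ratio test, using the boundary parameters named above (lower boundary 0.05, upper boundary 0.15, alpha = beta = 0.1); this is a generic SPRT, not the study's resampling software.

        # Wald's SPRT for the proportion of infested plants.
        import math

        p0, p1, alpha, beta_ = 0.05, 0.15, 0.10, 0.10
        low = math.log(beta_ / (1 - alpha))    # "do not treat" boundary
        high = math.log((1 - beta_) / alpha)   # "treat" boundary

        def sprt(samples):
            """samples: iterable of 0/1 plant infestation indicators."""
            llr, n = 0.0, 0
            for n, x in enumerate(samples, start=1):
                llr += (x * math.log(p1 / p0)
                        + (1 - x) * math.log((1 - p1) / (1 - p0)))
                if llr <= low:
                    return "do not treat", n
                if llr >= high:
                    return "treat", n
            return "continue sampling", n

        print(sprt([0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]))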

  7. Temporal Evolution of Non-equilibrium Gamma’ Precipitates in a Rapidly Quenched Nickel Base Superalloy (Preprint)

    DTIC Science & Technology

    2014-04-01

    with the binomial distribution for a particular dataset. This technique is more commonly known as the Langer, Bar-on and Miller (LBM) method [22,23] ... Using the LBM method, the frequency distribution plot for a dataset corresponding to a phase-separated system, exhibiting a split peak ... estimated parameters (namely μ1, μ2, σ, fγ' and fγ) obtained from the LBM plots in Fig. 5 are summarized in Table 3. The EWQ sample does not exhibit any

  8. Estimation of the cure rate in Iranian breast cancer patients.

    PubMed

    Rahimzadeh, Mitra; Baghestani, Ahmad Reza; Gohari, Mahmood Reza; Pourhoseingholi, Mohamad Amin

    2014-01-01

    Although the Cox proportional hazards model is the popular approach in survival analysis for investigating significant risk factors of cancer patient survival, it is not appropriate in the case of long-term disease-free survival. Recently, cure rate models have been introduced to distinguish between clinical determinants of cure and variables associated with the time to the event of interest. The aim of this study was to use a cure rate model to determine the clinical factors associated with cure rates of patients with breast cancer (BC). This prospective cohort study covered 305 patients with BC, admitted to Shahid Faiazbakhsh Hospital, Tehran, during 2006 to 2008 and followed until April 2012. Cases of patient death were confirmed by telephone contact. For data analysis, non-mixture cure rate models with Poisson and negative binomial distributions were employed. All analyses were carried out using a macro developed in WinBUGS. The deviance information criterion (DIC) was employed to find the best model. The overall 1-year, 3-year and 5-year relative survival rates were 97%, 89% and 74%. Metastasis and stage of BC were the significant factors, but age was significant only in the negative binomial model. The DIC also showed that the negative binomial model had a better fit. This study indicated that metastasis and stage of BC were identified as the clinical criteria for cure rates. There are limited studies on BC survival which employed these cure rate models to identify the clinical factors associated with cure. These models are better than the Cox model in the case of long-term survival.

  9. The Gumbel hypothesis test for left censored observations using regional earthquake records as an example

    NASA Astrophysics Data System (ADS)

    Thompson, E. M.; Hewlett, J. B.; Baise, L. G.; Vogel, R. M.

    2011-01-01

    Annual maximum (AM) time series are incomplete (i.e., censored) when no events are included above the assumed censoring threshold (i.e., magnitude of completeness). We introduce a distributional hypothesis test for left-censored Gumbel observations based on the probability plot correlation coefficient (PPCC). Critical values of the PPCC hypothesis test statistic are computed from Monte Carlo simulations and are a function of sample size, censoring level, and significance level. When applied to a global catalog of earthquake observations, the left-censored Gumbel PPCC tests are unable to reject the Gumbel hypothesis for 45 of 46 seismic regions. We apply four different field significance tests for combining individual tests into a collective hypothesis test. None of the field significance tests are able to reject the global hypothesis that AM earthquake magnitudes arise from a Gumbel distribution. Because the field significance levels are not conclusive, we also compute the likelihood that these field significance tests are unable to reject the Gumbel model when the samples arise from a more complex distributional alternative. A power study documents that the censored Gumbel PPCC test is unable to reject some important and viable Generalized Extreme Value (GEV) alternatives. Thus, we cannot rule out the possibility that the global AM earthquake time series could arise from a GEV distribution with a finite upper bound, also known as a reverse Weibull distribution. Our power study also indicates that the binomial and uniform field significance tests are substantially more powerful than the more commonly used Bonferroni and false discovery rate multiple comparison procedures.
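
    For the uncensored case, the Gumbel PPCC statistic is just the correlation between the ordered sample and Gumbel quantiles at plotting positions; the study's censored variant restricts the comparison to observations above the censoring threshold. A minimal sketch with simulated magnitudes:

        # Gumbel PPCC on a complete (uncensored) sample, using Gringorten
        # plotting positions; critical values would come from Monte Carlo.
        import numpy as np
        from scipy.stats import gumbel_r, pearsonr

        rng = np.random.default_rng(2)
        x = np.sort(gumbel_r.rvs(loc=6.0, scale=0.4, size=50, random_state=rng))

        n = len(x)
        pp = (np.arange(1, n + 1) - 0.44) / (n + 0.12)  # Gringorten positions
        r, _ = pearsonr(x, gumbel_r.ppf(pp))
        print(f"Gumbel PPCC = {r:.4f}")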

  10. Determinants of the geographic distribution of Puumala virus and Lyme borreliosis infections in Belgium.

    PubMed

    Linard, Catherine; Lamarque, Pénélope; Heyman, Paul; Ducoffre, Geneviève; Luyasu, Victor; Tersago, Katrien; Vanwambeke, Sophie O; Lambin, Eric F

    2007-05-02

    Vector-borne and zoonotic diseases generally display clear spatial patterns due to different space-dependent factors. Land cover and land use influence disease transmission by controlling both the spatial distribution of vectors or hosts, and the probability of contact with susceptible human populations. The objective of this study was to combine environmental and socio-economic factors to explain the spatial distribution of two emerging human diseases in Belgium, Puumala virus (PUUV) and Lyme borreliosis. Municipalities were taken as units of analysis. Negative binomial regressions including a correction for spatial endogeneity show that the spatial distribution of PUUV and Lyme borreliosis infections are associated with a combination of factors linked to the vector and host populations, to human behaviours, and to landscape attributes. Both diseases are associated with the presence of forests, which are the preferred habitat for vector or host populations. The PUUV infection risk is higher in remote forest areas, where the level of urbanisation is low, and among low-income populations. The Lyme borreliosis transmission risk is higher in mixed landscapes with forests and spatially dispersed houses, mostly in wealthy peri-urban areas. The spatial dependence resulting from a combination of endogenous and exogenous processes could be accounted for in the model on PUUV but not for Lyme borreliosis. A large part of the spatial variation in disease risk can be explained by environmental and socio-economic factors. The two diseases not only are most prevalent in different regions but also affect different groups of people. Combining these two criteria may increase the efficiency of information campaigns through appropriate targeting.

  11. Effects of dynamical grouping on cooperation in N-person evolutionary snowdrift game

    NASA Astrophysics Data System (ADS)

    Ji, M.; Xu, C.; Hui, P. M.

    2011-09-01

    A population typically consists of agents that continually distribute themselves into different groups at different times. This dynamic grouping has recently been shown to be essential in explaining many features observed in human activities including social, economic, and military activities. We study the effects of dynamic grouping on the level of cooperation in a modified evolutionary N-person snowdrift game. Due to the formation of dynamical groups, the competition takes place in groups of different sizes at different times and players of different strategies are mixed by the grouping dynamics. It is found that the level of cooperation is greatly enhanced by the dynamic grouping of agents, when compared with a static population of the same size. As the parameter β, which characterizes the relative importance of the reward and cost, increases, the fraction of cooperative players fC increases and it is possible to achieve a fully cooperative state. Analytically, we present a dynamical equation that incorporates the effects of the competing game and the group size distribution. The distribution of cooperators in different groups is assumed to be binomial, which is confirmed by simulations. Results from the analytic equation are in good agreement with numerical results from simulations. We also present detailed simulation results of fC over the parameter space spanned by the probabilities of group coalescence νm and group fragmentation νp in the grouping dynamics. A high νm and a low νp promote cooperation, and a favorable reward characterized by a high β would lead to a fully cooperative state.
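
    The binomial assumption for the distribution of cooperators across groups can be checked with a quick simulation: randomly partition a well-mixed population with cooperator fraction fC into groups of size N and compare the per-group counts with Binomial(N, fC). All values are illustrative.

        # Sanity check of the binomial mixing assumption (illustrative values).
        import numpy as np
        from scipy.stats import binom

        rng = np.random.default_rng(3)
        fC, N, n_groups = 0.35, 8, 50_000

        strategies = np.zeros(N * n_groups, dtype=bool)
        strategies[: int(fC * N * n_groups)] = True
        rng.shuffle(strategies)                     # well-mixed population
        coop = strategies.reshape(n_groups, N).sum(axis=1)

        emp = np.bincount(coop, minlength=N + 1) / n_groups
        for k in range(N + 1):
            print(f"k={k}: simulated {emp[k]:.4f}   binomial {binom.pmf(k, N, fC):.4f}")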

  12. Structure and plasticity potential of neural networks in the cerebral cortex

    NASA Astrophysics Data System (ADS)

    Fares, Tarec Edmond

    In this thesis, we first described a theoretical framework for the analysis of spine remodeling plasticity. We provided a quantitative description of two models of spine remodeling in which the presence of a bouton is either required or not for the formation of a new synapse. We derived expressions for the density of potential synapses in the neuropil, the connectivity fraction, which is the ratio of actual to potential synapses, and the number of structurally different circuits attainable with spine remodeling. We calculated these parameters in mouse occipital cortex, rat CA1, monkey V1, and human temporal cortex. We found that on average a dendritic spine can choose among 4-7 potential targets in rodents and 10-20 potential targets in primates. The neuropil's potential for structural circuit remodeling is highest in rat CA1 (7.1-8.6 bits/μm³) and lowest in monkey V1 (1.3-1.5 bits/μm³). We also evaluated the lower bound of neuron selectivity in the choice of synaptic partners. Post-synaptic excitatory neurons in rodents make synaptic contacts with more than 21-30% of pre-synaptic axons encountered with new spine growth. Primate neurons appear to be more selective, making synaptic connections with more than 7-15% of encountered axons. We next studied the role neuron morphology plays in defining synaptic connectivity. It is clear that only pairs of neurons with closely positioned axonal and dendritic branches can be synaptically coupled. For excitatory neurons in the cerebral cortex, such axo-dendritic oppositions, or potential synapses, must be bridged by dendritic spines to form synaptic connections. To explore the rules by which synaptic connections are formed within the constraints imposed by neuron morphology, we compared the distributions of the numbers of actual and potential synapses between pre- and post-synaptic neurons forming different laminar projections in rat barrel cortex. Quantitative comparison explicitly ruled out the hypothesis that individual synapses between neurons are formed independently of each other. Instead, the data are consistent with a cooperative scheme of synapse formation, where multiple-synaptic connections between neurons are stabilized, while neurons that do not establish a critical number of synapses are not likely to remain synaptically coupled. In the above two projects, analysis of potential synapse numbers played an important role in shaping our understanding of connectivity and structural plasticity. In the third part of this thesis, we shift our attention to the study of the distribution of potential synapse numbers. This distribution is dependent on the details of neuron morphology and it defines synaptic connectivity patterns attainable with spine remodeling. To better understand how the distribution of potential synapse numbers is influenced by the overlap and the shapes of axonal and dendritic arbors, we first analyzed uniform disconnected arbors generated in silico. The resulting distributions are well described by binomial functions. We used a dataset of neurons reconstructed in 3D and generated the potential synapse distributions for neurons of different classes. Quantitative analysis showed that the binomial distribution is a good fit to this data as well. All distributions considered clustered into two categories, inhibitory to inhibitory and excitatory to excitatory projections. We showed that the distributions of potential synapse numbers are universally described by a family of single-parameter (p) binomial functions, where p = 0.08 for the inhibitory and p = 0.19 for the excitatory projections. In the last part of this thesis an attempt is made to incorporate some of the biological constraints we considered thus far into an artificial neural network model. It became clear that several features of synaptic connectivity are ubiquitous among different cortical networks: (1) neural networks are predominately excitatory, containing roughly 80% excitatory neurons and synapses; (2) neural networks are only sparsely interconnected, with probabilities of finding connected neurons always less than 50%, even for neighboring cells; (3) the distribution of connection strengths has a slow, non-exponential decay. In the attempt to understand the advantage of such network architecture for learning and memory, we analyzed the associative memory capacity of a biologically constrained perceptron-like neural network model. The artificial neural network we consider consists of robust excitatory and inhibitory McCulloch and Pitts neurons with a constant firing threshold. Our theoretical results show that the capacity for associative memory storage in such networks increases with an addition of a small fraction of inhibitory neurons, while the connection probability remains below 50%. (Abstract shortened by UMI.)

  13. How Interesting Is a Cricket Match?

    ERIC Educational Resources Information Center

    Glaister, P.

    2006-01-01

    Even for those passionate about both cricket and maths, each can have their dull moments. This article brings together the sometimes-dry binomial distribution with a problem of cricket matches where the result of the series has already been decided, and so are "dead". It is hoped that readers will become more interested in at least one…

  14. Learner-Controlled Scaffolding Linked to Open-Ended Problems in a Digital Learning Environment

    ERIC Educational Resources Information Center

    Edson, Alden Jack

    2017-01-01

    This exploratory study reports on how students activated learner-controlled scaffolding and navigated through sequences of connected problems in a digital learning environment. A design experiment was completed to (re)design, iteratively develop, test, and evaluate a digital version of an instructional unit focusing on binomial distributions and…

  15. Simplified pupal surveys of Aedes aegypti (L.) for entomologic surveillance and dengue control.

    PubMed

    Barrera, Roberto

    2009-07-01

    Pupal surveys of Aedes aegypti (L.) are useful indicators of risk for dengue transmission, although sample sizes for reliable estimations can be large. This study explores two methods for making pupal surveys more practical yet reliable, using data from 10 pupal surveys conducted in Puerto Rico during 2004-2008. The number of pupae per person for each sampling followed a negative binomial distribution, thus showing aggregation. One method found a common aggregation parameter (k) for the negative binomial distribution, a finding that enabled the application of a sequential sampling method requiring few samples to determine whether the number of pupae/person was above a vector density threshold for dengue transmission. A second approach used the finding that the mean number of pupae/person is correlated with the proportion of pupa-infested households and calculated equivalent threshold proportions of pupa-positive households. A sequential sampling program was also developed for this method to determine whether observed proportions of infested households were above threshold levels. These methods can be used to validate entomological thresholds for dengue transmission.
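
    The second approach rests on a standard negative binomial identity: with a common aggregation parameter k, the probability that a sampling unit is pupa-positive is P = 1 - (1 + m/k)^(-k), where m is the mean count per unit. A sketch with an assumed k:

        # Converting a mean-based threshold into an equivalent proportion-positive
        # threshold under a negative binomial with common k (k is an assumption).
        k = 0.5
        for m in (0.1, 0.5, 1.0, 2.0):
            p_pos = 1 - (1 + m / k) ** (-k)
            print(f"mean = {m:4.1f} -> expected proportion positive = {p_pos:.3f}")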

  16. Development of tiger habitat suitability model using geospatial tools - a case study in Achankmar Wildlife Sanctuary (AMWLS), Chhattisgarh, India.

    PubMed

    Singh, R; Joshi, P K; Kumar, M; Dash, P P; Joshi, B D

    2009-08-01

    Geospatial tools supported by an ancillary geo-database and extensive fieldwork regarding the distribution of tiger and its prey in Achankmar Wildlife Sanctuary (AMWLS) were used to build a tiger habitat suitability model. This consists of a quantitative geographical information system (GIS)-based approach using field parameters and spatial thematic information. The estimates of tiger sightings, its prey sightings and predicted distribution, with the assistance of contextual environmental data including terrain, road network, settlement and drainage surfaces, were used to develop the model. Eight variables in the dataset, viz., forest cover type, forest cover density, slope, aspect, altitude, and distance from road, settlement and drainage, were seen as suitable proxies and were used as independent variables in the analysis. Principal component analysis and binomial multiple logistic regression were used for statistical treatment of the habitat parameters collected in the field and of the independent variables, respectively. The assessment showed a strong expert agreement between the predicted and observed suitable areas. A combination of the generated information and published literature was also used while building a habitat suitability map for the tiger. The modeling approach has taken the habitat preference parameters of the tiger and the potential distribution of prey species into account. For assessing the potential distribution of prey species, independent suitability models were developed and validated with the ground truth. It is envisaged that inclusion of the prey distribution probability strengthens the model when a key species is under question. The results of the analysis indicate that tigers occur throughout the sanctuary. The results have been found to be an important input as baseline information for population modeling and natural resource management in the wildlife sanctuary. The development and application of similar models can help in better management of the protected areas of national interest.

  17. Metaprop: a Stata command to perform meta-analysis of binomial data.

    PubMed

    Nyaga, Victoria N; Arbyn, Marc; Aerts, Marc

    2014-01-01

    Meta-analyses have become an essential tool in synthesizing evidence on clinical and epidemiological questions derived from a multitude of similar studies assessing the particular issue. Appropriate and accessible statistical software is needed to produce the summary statistic of interest. Metaprop is a statistical program implemented to perform meta-analyses of proportions in Stata. It builds further on the existing Stata procedure metan, which is typically used to pool effects (risk ratios, odds ratios, differences of risks or means) but which is also used to pool proportions. Metaprop implements procedures which are specific to binomial data and allows computation of exact binomial and score test-based confidence intervals. It provides appropriate methods for dealing with proportions close to or at the margins, where the normal approximation procedures often break down, by using the binomial distribution to model the within-study variability or by allowing the Freeman-Tukey double arcsine transformation to stabilize the variances. Metaprop was applied to two published meta-analyses: 1) prevalence of HPV infection in women with a Pap smear showing ASC-US; 2) cure rate after treatment for cervical precancer using cold coagulation. The first meta-analysis showed a pooled HPV prevalence of 43% (95% CI: 38%-48%). In the second meta-analysis, the pooled percentage of cured women was 94% (95% CI: 86%-97%). By using metaprop, no studies with 0% or 100% proportions were excluded from the meta-analysis. Furthermore, study-specific and pooled confidence intervals were always within admissible values, contrary to the original publication, where metan was used.

  18. A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.

    PubMed

    Ferrari, Alberto; Comelli, Mario

    2016-12-01

    In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and the sample size is small. A number of more advanced methods are available, but they are often technically challenging, and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely, linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; in addition, we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions for behavioral scientists dealing with clustered binary data and small sample sizes.

  19. HYPERSAMP - HYPERGEOMETRIC ATTRIBUTE SAMPLING SYSTEM BASED ON RISK AND FRACTION DEFECTIVE

    NASA Technical Reports Server (NTRS)

    De, Salvo L. J.

    1994-01-01

    HYPERSAMP is a demonstration of an attribute sampling system developed to determine the minimum sample size required for any preselected value for consumer's risk and fraction of nonconforming. This statistical method can be used in place of MIL-STD-105E sampling plans when a minimum sample size is desirable, such as when tests are destructive or expensive. HYPERSAMP utilizes the Hypergeometric Distribution and can be used for any fraction nonconforming. The program employs an iterative technique that circumvents the obstacle presented by the factorial of a non-whole number. HYPERSAMP provides the required Hypergeometric sample size for any equivalent real number of nonconformances in the lot or batch under evaluation. Many currently used sampling systems, such as the MIL-STD-105E, utilize the Binomial or the Poisson equations as an estimate of the Hypergeometric when performing inspection by attributes. However, this is primarily because of the difficulty in calculation of the factorials required by the Hypergeometric. Sampling plans based on the Binomial or Poisson equations will result in the maximum sample size possible with the Hypergeometric. The difference in the sample sizes between the Poisson or Binomial and the Hypergeometric can be significant. For example, a lot size of 400 devices with an error rate of 1.0% and a confidence of 99% would require a sample size of 400 (all units would need to be inspected) for the Binomial sampling plan and only 273 for a Hypergeometric sampling plan. The Hypergeometric results in a savings of 127 units, a significant reduction in the required sample size. HYPERSAMP is a demonstration program and is limited to sampling plans with zero defectives in the sample (acceptance number of zero). Since it is only a demonstration program, the sample size determination is limited to sample sizes of 1500 or less. The Hypergeometric Attribute Sampling System demonstration code is a spreadsheet program written for IBM PC compatible computers running DOS and Lotus 1-2-3 or Quattro Pro. This program is distributed on a 5.25 inch 360K MS-DOS format diskette, and the program price includes documentation. This statistical method was developed in 1992.
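
    The worked example above can be reproduced directly from the hypergeometric distribution: find the smallest sample size for which the probability of drawing zero defectives from the lot is at or below the consumer's risk.

        # Minimum zero-acceptance sample size: lot of 400, 1% nonconforming
        # (4 defective units), 99% confidence (consumer's risk 0.01).
        from scipy.stats import hypergeom

        N, D, risk = 400, 4, 0.01
        n = next(n for n in range(1, N + 1) if hypergeom.pmf(0, N, D, n) <= risk)
        print(n)   # 273, versus 400 under the binomial approximation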

  20. Higher moments of net-proton multiplicity distributions in a heavy-ion event pile-up scenario

    NASA Astrophysics Data System (ADS)

    Garg, P.; Mishra, D. K.

    2017-10-01

    High-luminosity modern accelerators, like the Relativistic Heavy Ion Collider (RHIC) at Brookhaven National Laboratory (BNL) and the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN), inherently have event pile-up scenarios which contribute significantly to physics events as a background. While state-of-the-art tracking algorithms and detector concepts take care of these event pile-up scenarios, several offline analytical techniques are used to remove such events from the physics analysis. It is still difficult to identify the remaining pile-up events in an event sample for physics analysis. Since the fraction of these events is small, they may not be as serious an issue for other analyses as they are for an event-by-event analysis; particular care is needed when the observables are characteristics of the multiplicity distribution. In the present work, we demonstrate how a small fraction of residual pile-up events can change the moments, and their ratios, of an event-by-event net-proton multiplicity distribution, which are sensitive to the dynamical fluctuations due to the QCD critical point. For this study, we assume that the individual event-by-event proton and antiproton multiplicity distributions follow Poisson, negative binomial, or binomial distributions. We observe a significant effect in cumulants and their ratios of net-proton multiplicity distributions due to pile-up events, particularly at lower energies. It might be crucial to estimate the fraction of pile-up events in the data sample while interpreting the experimental observables for the critical point.
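
    A toy version of the effect, under simple assumptions: event-by-event proton and antiproton numbers are independent Poissons (so net-proton cumulants are Skellam, with C4/C2 = 1), and a small fraction of events is replaced by the sum of two events. Parameters are illustrative.

        # Pile-up shifts net-proton cumulant ratios away from the Skellam baseline.
        import numpy as np

        rng = np.random.default_rng(4)
        mu_p, mu_pbar = 5.0, 2.0         # mean proton/antiproton multiplicities (assumed)
        n_events, f = 1_000_000, 0.005   # pile-up fraction f (assumed)

        def net_protons(n):
            return rng.poisson(mu_p, n) - rng.poisson(mu_pbar, n)

        x = net_protons(n_events)
        n_pile = int(f * n_events)
        x[:n_pile] += net_protons(n_pile)   # a piled-up event sums two events

        d = x - x.mean()
        c2 = (d**2).mean()
        c4 = (d**4).mean() - 3 * c2**2
        print(f"C4/C2 with pile-up: {c4 / c2:.3f}  (Skellam baseline: 1.000)")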

  1. Efficient estimation of abundance for patchily distributed populations via two-phase, adaptive sampling.

    USGS Publications Warehouse

    Conroy, M.J.; Runge, J.P.; Barker, R.J.; Schofield, M.R.; Fonnesbeck, C.J.

    2008-01-01

    Many organisms are patchily distributed, with some patches occupied at high density, others at lower densities, and others not occupied. Estimation of overall abundance can be difficult and is inefficient via intensive approaches such as capture-mark-recapture (CMR) or distance sampling. We propose a two-phase sampling scheme and model in a Bayesian framework to estimate abundance for patchily distributed populations. In the first phase, occupancy is estimated by binomial detection samples taken on all selected sites, where selection may be of all sites available, or a random sample of sites. Detection can be by visual surveys, detection of sign, physical captures, or other approach. At the second phase, if a detection threshold is achieved, CMR or other intensive sampling is conducted via standard procedures (grids or webs) to estimate abundance. Detection and CMR data are then used in a joint likelihood to model probability of detection in the occupancy sample via an abundance-detection model. CMR modeling is used to estimate abundance for the abundance-detection relationship, which in turn is used to predict abundance at the remaining sites, where only detection data are collected. We present a full Bayesian modeling treatment of this problem, in which posterior inference on abundance and other parameters (detection, capture probability) is obtained under a variety of assumptions about spatial and individual sources of heterogeneity. We apply the approach to abundance estimation for two species of voles (Microtus spp.) in Montana, USA. We also use a simulation study to evaluate the frequentist properties of our procedure given known patterns in abundance and detection among sites as well as design criteria. For most population characteristics and designs considered, bias and mean-square error (MSE) were low, and coverage of true parameter values by Bayesian credibility intervals was near nominal. Our two-phase, adaptive approach allows efficient estimation of abundance of rare and patchily distributed species and is particularly appropriate when sampling in all patches is impossible, but a global estimate of abundance is required.

  2. A Statistical Tool for Risk Assessment as Function of Number of Retrieved Lymph Nodes from Rectal Cancer Patients.

    PubMed

    Wu, Zhenyu; Qin, Guoyou; Zhao, Naiqing; Jia, Huixun; Zheng, Xueying

    2018-05-16

    Although a minimum of 12 lymph nodes (LNs) has been recommended for colorectal cancer, considerable debate remains for rectal cancer patients. An inadequate number of examined LNs can lead to under-staging and, as a consequence, inappropriate treatment. We describe a statistical tool that estimates the probability of false-negative nodal staging. A total of 26,778 adenocarcinoma rectum cancer patients with tumour stage (T stage) 1-3, diagnosed between 2004 and 2013, who did not receive neoadjuvant therapies and had at least one histologically assessed LN, were extracted from the Surveillance, Epidemiology and End Results (SEER) database. A statistical tool using the beta-binomial distribution was developed to estimate the probability that a patient classified as node-negative truly has no occult nodal disease, as a function of the total number of LNs examined and T stage. The probability of falsely identifying a patient as node-negative decreased with an increasing number of nodes examined for each stage. It was estimated to be 72%, 66% and 52% for T1, T2 and T3 patients, respectively, with a single node examined. To rule out occult nodal disease with 90% confidence, 5, 9, and 29 nodes need to be examined for patients from stages T1, T2, and T3, respectively. The false-negative rate of the examined lymph nodes in rectal cancer was verified to be dependent preoperatively on the clinical tumour stage. A more accurate nodal staging score was developed to recommend a threshold on the minimum number of examined nodes for the desired level of confidence.
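
    The beta-binomial idea can be sketched as follows: if a node-positive patient's per-node probability of a positive finding follows a Beta(a, b) distribution, the chance that all n examined nodes read negative is the beta-binomial probability of zero successes. The (a, b) values here are hypothetical, not the SEER-fitted, stage-specific parameters.

        # P(all examined nodes negative | occult nodal disease) vs. nodes examined.
        from scipy.stats import betabinom

        a, b = 0.8, 2.0   # hypothetical Beta parameters for per-node positivity
        for n in (1, 5, 12, 29):
            print(f"{n:2d} nodes examined -> P(all negative) = "
                  f"{betabinom.pmf(0, n, a, b):.3f}")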

  3. Scoring in genetically modified organism proficiency tests based on log-transformed results.

    PubMed

    Thompson, Michael; Ellison, Stephen L R; Owen, Linda; Mathieson, Kenneth; Powell, Joanne; Key, Pauline; Wood, Roger; Damant, Andrew P

    2006-01-01

    The study considers data from 2 UK-based proficiency schemes and includes data from a total of 29 rounds and 43 test materials over a period of 3 years. The results from the 2 schemes are similar and reinforce each other. The amplification process used in quantitative polymerase chain reaction determinations predicts a mixture of normal, binomial, and lognormal distributions dominated by the latter 2. As predicted, the study results consistently follow a positively skewed distribution. Log-transformation prior to calculating z-scores is effective in establishing near-symmetric distributions that are sufficiently close to normal to justify interpretation on the basis of the normal distribution.

  4. Probing the statistics of primordial fluctuations and their evolution

    NASA Technical Reports Server (NTRS)

    Gaztanaga, Enrique; Yokoyama, Jun'ichi

    1993-01-01

    The statistical distribution of fluctuations on various scales is analyzed in terms of the counts in cells of smoothed density fields, using volume-limited samples of galaxy redshift catalogs. It is shown that the distribution on large scales, with volume average of the two-point correlation function of the smoothed field less than about 0.05, is consistent with Gaussian. Statistics are shown to agree remarkably well with the negative binomial distribution, which has hierarchical correlations and a Gaussian behavior at large scales. If these observed properties correspond to the matter distribution, they suggest that our universe started with Gaussian fluctuations and evolved keeping hierarchical form.

  5. Distribution, occupancy, and habitat correlates of American martens (Martes americana) in Rocky Mountain National Park, Colorado

    USGS Publications Warehouse

    Baldwin, R.A.; Bender, L.C.

    2008-01-01

    A clear understanding of habitat associations of martens (Martes americana) is necessary to effectively manage and monitor populations. However, this information was lacking for martens in most of their southern range, particularly during the summer season. We studied the distribution and habitat correlates of martens from 2004 to 2006 in Rocky Mountain National Park (RMNP) across 3 spatial scales: site-specific, home-range, and landscape. We used remote cameras from early August through late October to inventory occurrence of martens and modeled occurrence as a function of habitat and landscape variables using binary response (BR) and binomial count (BC) logistic regression, and occupancy modeling (OM). We also assessed which was the most appropriate modeling technique for martens in RMNP. Of the 3 modeling techniques, OM appeared to be most appropriate given the explanatory power of derived models and its incorporation of detection probabilities, although the results from BR and BC provided corroborating evidence of important habitat correlates. Location of sites in the western portion of the park, riparian mixed-conifer stands, and mixed-conifer with aspen patches were most frequently positively correlated with occurrence of martens, whereas more xeric and open sites were avoided. Additionally, OM yielded unbiased occupancy values ranging from 91% to 100% and 20% to 30% for the western and eastern portions of RMNP, respectively. © 2008 American Society of Mammalogists.

  6. Analysis of overdispersed count data by mixtures of Poisson variables and Poisson processes.

    PubMed

    Hougaard, P; Lee, M L; Whitmore, G A

    1997-12-01

    Count data often show overdispersion compared to the Poisson distribution. Overdispersion is typically modeled by a random effect for the mean, based on the gamma distribution, leading to the negative binomial distribution for the count. This paper considers a larger family of mixture distributions, including the inverse Gaussian mixture distribution. It is demonstrated that it gives a significantly better fit for a data set on the frequency of epileptic seizures. The same approach can be used to generate counting processes from Poisson processes, where the rate or the time is random. A random rate corresponds to variation between patients, whereas a random time corresponds to variation within patients.
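
    The gamma-mixed Poisson construction mentioned above is easy to verify by simulation: drawing each subject's rate from a gamma distribution and then a Poisson count given that rate yields a negative binomial marginal. Parameters are illustrative.

        # Gamma-mixed Poisson = negative binomial (variance check).
        import numpy as np
        from scipy.stats import nbinom

        rng = np.random.default_rng(5)
        shape, mean_rate = 1.5, 4.0    # gamma shape and overall mean (assumed)
        rates = rng.gamma(shape, mean_rate / shape, size=200_000)
        counts = rng.poisson(rates)

        p = shape / (shape + mean_rate)   # matching NB: n = shape
        print("simulated var:", round(float(counts.var()), 3),
              "  NB var:", round(float(nbinom.var(shape, p)), 3))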

  7. Charged particle multiplicities in deep inelastic scattering at HERA

    NASA Astrophysics Data System (ADS)

    Aid, S.; Anderson, M.; Andreev, V.; Andrieu, B.; Appuhn, R.-D.; Babaev, A.; Bähr, J.; Bán, J.; Ban, Y.; Baranov, P.; Barrelet, E.; Barschke, R.; Bartel, W.; Barth, M.; Bassler, U.; Beck, H. P.; Behrend, H.-J.; Belousov, A.; Berger, Ch.; Bernardi, G.; Bertrand-Coremans, G.; Besançon, M.; Beyer, R.; Biddulph, P.; Bispham, P.; Bizot, J. C.; Blobel, V.; Borras, K.; Botterweck, F.; Boudry, V.; Braemer, A.; Braunschweig, W.; Brisson, V.; Bruel, P.; Bruncko, D.; Brune, C.; Buchholz, R.; Büngener, L.; Bürger, J.; Büsser, F. W.; Buniatian, A.; Burke, S.; Burton, M. J.; Calvet, D.; Campbell, A. J.; Carli, T.; Charlet, M.; Clarke, D.; Clegg, A. B.; Clerbaux, B.; Cocks, S.; Contreras, J. G.; Cormack, C.; Coughlan, J. A.; Courau, A.; Cousinou, M.-C.; Cozzika, G.; Criegee, L.; Cussans, D. G.; Cvach, J.; Dagoret, S.; Dainton, J. B.; Dau, W. D.; Daum, K.; David, M.; Davis, C. L.; Delcourt, B.; de Roeck, A.; de Wolf, E. A.; Dirkmann, M.; Dixon, P.; di Nezza, P.; Dlugosz, W.; Dollfus, C.; Dowell, J. D.; Dreis, H. B.; Droutskoi, A.; Dünger, O.; Duhm, H.; Ebert, J.; Ebert, T. R.; Eckerlin, G.; Efremenko, V.; Egli, S.; Eichler, R.; Eisele, F.; Eisenhandler, E.; Elsen, E.; Erdmann, M.; Erdmann, W.; Evrard, E.; Fahr, A. B.; Favart, L.; Fedotov, A.; Feeken, D.; Felst, R.; Feltesse, J.; Ferencei, J.; Ferrarotto, F.; Flamm, K.; Fleischer, M.; Flieser, M.; Flügge, G.; Fomenko, A.; Fominykh, B.; Formánek, J.; Foster, J. M.; Franke, G.; Fretwurst, E.; Gabathuler, E.; Gabathuler, K.; Gaede, F.; Garvey, J.; Gayler, J.; Gebauer, M.; Genzel, H.; Gerhards, R.; Glazov, A.; Goerlach, U.; Goerlich, L.; Gogitidze, N.; Goldberg, M.; Goldner, D.; Golec-Biernat, K.; Gonzalez-Pineiro, B.; Gorelov, I.; Grab, C.; Grässler, H.; Greenshaw, T.; Griffiths, R. K.; Grindhammer, G.; Gruber, A.; Gruber, C.; Haack, J.; Hadig, T.; Haidt, D.; Hajduk, L.; Hampel, M.; Haynes, W. J.; Heinzelmann, G.; Henderson, R. C. W.; Henschel, H.; Herynek, I.; Hess, M. F.; Hewitt, K.; Hildesheim, W.; Hiller, K. H.; Hilton, C. D.; Hladký, J.; Hoeger, K. C.; Höppner, M.; Hoffmann, D.; Holtom, T.; Horisberger, R.; Hudgson, V. L.; Hütte, M.; Ibbotson, M.; Itterbeck, H.; Jacholkowska, A.; Jacobsson, C.; Jaffre, M.; Janoth, J.; Jansen, T.; Jönsson, L.; Johnson, D. P.; Jung, H.; Kalmus, P. I. P.; Kander, M.; Kant, D.; Kaschowitz, R.; Kathage, U.; Katzy, J.; Kaufmann, H. H.; Kaufmann, O.; Kazarian, S.; Kenyon, I. R.; Kermiche, S.; Keuker, C.; Kiesling, C.; Klein, M.; Kleinwort, C.; Knies, G.; Köhler, T.; Köhne, J. H.; Kolanoski, H.; Kole, F.; Kolya, S. D.; Korbel, V.; Korn, M.; Kostka, P.; Kotelnikov, S. K.; Krämerkämper, T.; Krasny, M. W.; Krehbiel, H.; Krücker, D.; Küster, H.; Kuhlen, M.; Kurča, T.; Kurzhöfer, J.; Lacour, D.; Laforge, B.; Lander, R.; Landon, M. P. J.; Lange, W.; Langenegger, U.; Laporte, J.-F.; Lebedev, A.; Lehner, F.; Levonian, S.; Lindström, G.; Lindstroem, M.; Link, J.; Linsel, F.; Lipinski, J.; List, B.; Lobo, G.; Lomas, J. W.; Lopez, G. C.; Lubimov, V.; Lüke, D.; Magnussen, N.; Malinovski, E.; Mani, S.; Maraček, R.; Marage, P.; Marks, J.; Marshall, R.; Martens, J.; Martin, G.; Martin, R.; Martyn, H.-U.; Martyniak, J.; Mavroidis, T.; Maxfield, S. J.; McMahon, S. J.; Mehta, A.; Meier, K.; Meyer, A.; Meyer, A.; Meyer, H.; Meyer, J.; Meyer, P.-O.; Migliori, A.; Mikocki, S.; Milstead, D.; Moeck, J.; Moreau, F.; Morris, J. V.; Mroczko, E.; Müller, D.; Müller, G.; Müller, K.; Müller, M.; Murín, P.; Nagovizin, V.; Nahnhauer, R.; Naroska, B.; Naumann, Th.; Négri, I.; Newman, P. R.; Newton, D.; Nguyen, H. K.; Nicholls, T. 
C.; Niebergall, F.; Niebuhr, C.; Niedzballa, Ch.; Niggli, H.; Nisius, R.; Nowak, G.; Noyes, G. W.; Nyberg-Werther, M.; Oakden, M.; Oberlack, H.; Olsson, J. E.; Ozerov, D.; Palmen, P.; Panaro, E.; Panitch, A.; Pascaud, C.; Patel, G. D.; Pawletta, H.; Peppel, E.; Perez, E.; Phillips, J. P.; Pieuchot, A.; Pitzl, D.; Pope, G.; Prell, S.; Rabbertz, K.; Rädel, G.; Reimer, P.; Reinshagen, S.; Rick, H.; Riech, V.; Riedlberger, J.; Riepenhausen, F.; Riess, S.; Rizvi, E.; Robertson, S. M.; Robmann, P.; Roloff, H. E.; Roosen, R.; Rosenbauer, K.; Rostovtsev, A.; Rouse, F.; Royon, C.; Rüter, K.; Rusakov, S.; Rybicki, K.; Sankey, D. P. C.; Schacht, P.; Schiek, S.; Schleif, S.; Schleper, P.; von Schlippe, W.; Schmidt, D.; Schmidt, G.; Schöning, A.; Schröder, V.; Schuhmann, E.; Schwab, B.; Sefkow, F.; Seidel, M.; Sell, R.; Semenov, A.; Shekelyan, V.; Sheviakov, I.; Shtarkov, L. N.; Siegmon, G.; Siewert, U.; Sirois, Y.; Skillicorn, I. O.; Smirnov, P.; Smith, J. R.; Solochenko, V.; Soloviev, Y.; Specka, A.; Spiekermann, J.; Spielman, S.; Spitzer, H.; Squinabol, F.; Steenbock, M.; Steffen, P.; Steinberg, R.; Steiner, H.; Steinhart, J.; Stella, B.; Stellberger, A.; Stier, J.; Stiewe, J.; Stößlein, U.; Stolze, K.; Straumann, U.; Struczinski, W.; Sutton, J. P.; Tapprogge, S.; Taševský, M.; Tchernyshov, V.; Tchetchelnitski, S.; Theissen, J.; Thiebaux, C.; Thompson, G.; Truöl, P.; Tsipolitis, G.; Turnau, J.; Tutas, J.; Uelkes, P.; Usik, A.; Valkár, S.; Valkárová, A.; Vallée, C.; Vandenplas, D.; van Esch, P.; van Mechelen, P.; Vazdik, Y.; Verrecchia, P.; Villet, G.; Wacker, K.; Wagener, A.; Wagener, M.; Walther, A.; Waugh, B.; Weber, G.; Weber, M.; Wegener, D.; Wegner, A.; Wengler, T.; Werner, M.; West, L. R.; Wilksen, T.; Willard, S.; Winde, M.; Winter, G.-G.; Wittek, C.; Wobisch, M.; Wünsch, E.; Žáček, J.; Zarbock, D.; Zhang, Z.; Zhokin, A.; Zini, P.; Zomer, F.; Zsembery, J.; Zuber, K.; Zurnedden, M.

    1996-12-01

    Using the H1 detector at HERA, charged particle multiplicity distributions in deep inelastic e + p scattering have been measured over a large kinematical region. The evolution with W and Q 2 of the multiplicity distribution and of the multiplicity moments in pseudorapidity domains of varying size is studied in the current fragmentation region of the hadronic centre-of-mass frame. The results are compared with data from fixed target lepton-nucleon interactions, e + e - annihilations and hadron-hadron collisions as well as with expectations from QCD based parton models. Fits to the Negative Binomial and Lognormal distributions are presented.

  8. Investigation of shipping accident injury severity and mortality.

    PubMed

    Weng, Jinxian; Yang, Dong

    2015-03-01

    Shipping movements are operated in a complex and high-risk environment. Fatal shipping accidents are the nightmares of seafarers. With ten years of worldwide ship accident data, this study develops a binary logistic regression model and a zero-truncated binomial regression model to predict the probability of fatal shipping accidents and the corresponding mortalities. The model results show that both the probability of fatal accidents and mortalities are greater for collision, fire/explosion, contact, grounding, and sinking accidents occurring in adverse weather and darkness. Sinking has the largest effect on the increase in fatal accident probability and mortalities. The results also show that higher mortality is associated with shipping accidents occurring far away from the coastal area/harbor/port. In addition, cruise ships are found to have more mortalities than non-cruise ships. The results of this study are beneficial for policy-makers in proposing efficient strategies to prevent fatal shipping accidents.

  9. QNB: differential RNA methylation analysis for count-based small-sample sequencing data with a quad-negative binomial model.

    PubMed

    Liu, Lian; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia

    2017-08-31

    As a newly emerged research area, RNA epigenetics has drawn increasing attention recently for the participation of RNA methylation and other modifications in a number of crucial biological processes. Thanks to high-throughput sequencing techniques such as MeRIP-Seq, transcriptome-wide RNA methylation profiles are now available in the form of count-based data, with which it is often of interest to study the dynamics at the epitranscriptomic layer. However, the sample size of an RNA methylation experiment is usually very small due to its cost; additionally, there usually exist a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task. We present QNB, a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data. Compared with previous approaches such as the DRME model, which is based on a statistical test covering the IP samples only with two negative binomial distributions, QNB is based on four independent negative binomial distributions with their variances and means linked by local regressions, so that the input control samples are also properly taken into account. In addition, unlike the DRME approach, which relies on the input control samples alone to estimate the background, QNB uses a more robust estimator for gene expression by combining information from both input and IP samples, which can largely improve the testing performance for very lowly expressed genes. QNB showed improved performance on both simulated and real MeRIP-Seq datasets when compared with competing algorithms. The QNB model is also applicable to other datasets related to RNA modifications, including but not limited to RNA bisulfite sequencing, m1A-Seq, Par-CLIP, RIP-Seq, etc.

  10. [Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].

    PubMed

    Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L

    2017-03-10

    To evaluate the estimation of the prevalence ratio (PR) using a Bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence relative to caregivers' recognition of risk signs of diarrhea in their infants by using a Bayesian log-binomial regression model in the OpenBUGS software. The results showed that caregivers' recognition of an infant's risk signs of diarrhea was significantly associated with a 13% increase in medical care-seeking. Meanwhile, we compared the differences in the point and interval estimates of the PR and the convergence of three models (model 1: not adjusting for covariates; model 2: adjusting for duration of caregivers' education; model 3: adjusting for the distance between village and township and for child month-age, based on model 2) between the Bayesian log-binomial regression model and the conventional log-binomial regression model. The results showed that all three Bayesian log-binomial regression models converged and the estimated PRs were 1.130 (95%CI: 1.005-1.265), 1.128 (95%CI: 1.001-1.264) and 1.132 (95%CI: 1.004-1.267), respectively. Conventional log-binomial regression models 1 and 2 converged and their PRs were 1.130 (95%CI: 1.055-1.206) and 1.126 (95%CI: 1.051-1.203), respectively, but model 3 failed to converge, so the COPY method was used to estimate the PR, which was 1.125 (95%CI: 1.051-1.200). In addition, the point and interval estimates of the PRs from the three Bayesian log-binomial regression models differed only slightly from those of the conventional log-binomial regression model, and the two approaches were highly consistent in estimating the PR. Therefore, the Bayesian log-binomial regression model can effectively estimate the PR with fewer convergence problems and has more advantages in application compared with the conventional log-binomial regression model.
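
    For comparison with the conventional model discussed above, a minimal frequentist log-binomial sketch: a GLM with binomial family and log link, whose exponentiated coefficient is a prevalence ratio. The data are simulated, not the study's; log-binomial fits of this kind can fail to converge when fitted probabilities approach 1.

        # Conventional log-binomial regression on simulated data (assumed PR 1.13).
        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(6)
        n = 2000
        recognises = rng.binomial(1, 0.5, n)            # caregiver recognises signs
        p = np.minimum(0.55 * 1.13 ** recognises, 1.0)  # baseline 0.55, PR 1.13
        seeks_care = rng.binomial(1, p)

        X = sm.add_constant(recognises)
        fit = sm.GLM(seeks_care, X,
                     family=sm.families.Binomial(link=sm.families.links.Log())).fit()
        print("estimated PR:", round(float(np.exp(fit.params[1])), 3))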

  11. Drivers of multidimensional eco-innovation: empirical evidence from the Brazilian industry.

    PubMed

    da Silva Rabêlo, Olivan; de Azevedo Melo, Andrea Sales Soares

    2018-03-08

    The study analyses the relationships between the main drivers of eco-innovation introduced by innovative industries, with a focus on cooperation strategy. Eco-innovation is analysed by means of a multidimensional identification strategy, showing the relationships between the independent variables and the variable of interest. The literature discussing environmental innovation differs from that discussing other types of innovation in that it seeks to grasp its determinants and mostly highlights the relevance of environmental regulation. The key feature of this paper is that it ascribes special relevance to cooperation strategy with external partners and to the propensity of innovative industry to introduce eco-innovation. A sample of 35,060 Brazilian industries was analysed, between 2003 and 2011, by means of binomial, multinomial and ordinal logistic regressions using microdata from the innovation survey (PINTEC) of the Brazilian Institute of Geography and Statistics (Instituto Brasileiro de Geografia e Estatística). The econometric results estimated by the multinomial logit method suggest that cooperation with external partners practiced by innovative industries facilitates the adoption of eco-innovation with probability 64.59% in dimension 01, 57.63% in dimension 02 and 81.02% in dimension 03. The data reveal that the higher the degree of eco-innovation complexity, the more intensively industries seek cooperation with external partners. When estimated with the ordinal and binomial logit models, cooperation increases the probability that the industry is eco-innovative by 65.09% and 89.34%, respectively. Environmental regulation and innovation in product and information management were also positively correlated as drivers of eco-innovation.

  12. On performance of parametric and distribution-free models for zero-inflated and over-dispersed count responses.

    PubMed

    Tang, Wan; Lu, Naiji; Chen, Tian; Wang, Wenjuan; Gunzler, Douglas David; Han, Yu; Tu, Xin M

    2015-10-30

    Zero-inflated Poisson (ZIP) and negative binomial (ZINB) models are widely used to model zero-inflated count responses. These models extend the Poisson and negative binomial (NB) to address excessive zeros in the count response. By adding a degenerate distribution centered at 0 and interpreting it as describing a non-risk group in the population, the ZIP (ZINB) models a two-component population mixture. As in applications of Poisson and NB, the key difference between ZIP and ZINB is the allowance for overdispersion by the ZINB in its NB component in modeling the count response for the at-risk group. In practice, however, overdispersion often does not follow the NB, and applying the ZINB to such data yields invalid inference. If sources of overdispersion are known, other parametric models may be used to model the overdispersion directly. Such models, too, depend on their assumed distributions. Further, this approach may not be applicable if information about the sources of overdispersion is unavailable. In this paper, we propose a distribution-free alternative and compare its performance with these popular parametric models as well as a moment-based approach proposed by Yu et al. [Statistics in Medicine 2013; 32: 2390-2405]. Like generalized estimating equations, the proposed approach requires no elaborate distribution assumptions. Compared with the approach of Yu et al., it is more robust to overdispersed zero-inflated responses. We illustrate our approach with both simulated and real study data. Copyright © 2015 John Wiley & Sons, Ltd.
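    The parametric baseline in this comparison is easy to reproduce. As a minimal sketch (not the paper's code; the data and settings are hypothetical), statsmodels provides both zero-inflated models, and their information criteria can be compared directly:

    ```python
    import numpy as np
    from statsmodels.discrete.count_model import (ZeroInflatedPoisson,
                                                  ZeroInflatedNegativeBinomialP)

    rng = np.random.default_rng(2)
    n = 1000
    at_risk = rng.random(n) > 0.3                  # 30% structural (non-risk) zeros
    counts = rng.negative_binomial(2, 0.4, n) * at_risk

    X = np.ones((n, 1))                            # intercept-only count model
    zip_res = ZeroInflatedPoisson(counts, X).fit(disp=0, maxiter=200)
    zinb_res = ZeroInflatedNegativeBinomialP(counts, X).fit(disp=0, maxiter=200)
    print("ZIP AIC:", zip_res.aic, "ZINB AIC:", zinb_res.aic)
    ```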

  13. Generalized empirical Bayesian methods for discovery of differential data in high-throughput biology.

    PubMed

    Hardcastle, Thomas J

    2016-01-15

    High-throughput data are now commonplace in biological research. Rapidly changing technologies and applications mean that novel methods for detecting differential behaviour that account for a 'large P, small n' setting are required at an increasing rate. The development of such methods is, in general, done on an ad hoc basis, entailing repeated development cycles and a lack of standardization between analyses. We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach is based on our baySeq algorithm for identification of differential expression in RNA-seq data based on a negative binomial distribution, and in paired data based on a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs. The methods are implemented in the R baySeq (v2) package, available on Bioconductor http://www.bioconductor.org/packages/release/bioc/html/baySeq.html. tjh48@cam.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Determinants of the geographic distribution of Puumala virus and Lyme borreliosis infections in Belgium

    PubMed Central

    Linard, Catherine; Lamarque, Pénélope; Heyman, Paul; Ducoffre, Geneviève; Luyasu, Victor; Tersago, Katrien; Vanwambeke, Sophie O; Lambin, Eric F

    2007-01-01

    Background Vector-borne and zoonotic diseases generally display clear spatial patterns due to different space-dependent factors. Land cover and land use influence disease transmission by controlling both the spatial distribution of vectors or hosts, and the probability of contact with susceptible human populations. The objective of this study was to combine environmental and socio-economic factors to explain the spatial distribution of two emerging human diseases in Belgium, Puumala virus (PUUV) and Lyme borreliosis. Municipalities were taken as units of analysis. Results Negative binomial regressions including a correction for spatial endogeneity show that the spatial distributions of PUUV and Lyme borreliosis infections are associated with a combination of factors linked to the vector and host populations, to human behaviours, and to landscape attributes. Both diseases are associated with the presence of forests, which are the preferred habitat for vector or host populations. The PUUV infection risk is higher in remote forest areas, where the level of urbanisation is low, and among low-income populations. The Lyme borreliosis transmission risk is higher in mixed landscapes with forests and spatially dispersed houses, mostly in wealthy peri-urban areas. The spatial dependence resulting from a combination of endogenous and exogenous processes could be accounted for in the model of PUUV but not in that of Lyme borreliosis. Conclusion A large part of the spatial variation in disease risk can be explained by environmental and socio-economic factors. The two diseases not only are most prevalent in different regions but also affect different groups of people. Combining these two criteria may increase the efficiency of information campaigns through appropriate targeting. PMID:17474974

  15. Frequency distribution of Echinococcus multilocularis and other helminths of foxes in Kyrgyzstan

    PubMed Central

    Ziadinov, I.; Deplazes, P.; Mathis, A.; Mutunova, B.; Abdykerimov, K.; Nurgaziev, R.; Torgerson, P. R.

    2010-01-01

    Echinococcosis is a major emerging zoonosis in central Asia. A study of the helminth fauna of foxes from Naryn Oblast in central Kyrgyzstan was undertaken to investigate the abundance of Echinococcus multilocularis in a district where a high prevalence of this parasite had previously been detected in dogs. A total of 151 foxes (Vulpes vulpes) were investigated in a necropsy study. Of these, 96 (64%) were infected with E. multilocularis, with a mean abundance of 8669 parasites per fox. This indicates that red foxes are a major definitive host of E. multilocularis in this country. It also demonstrates that the abundance and prevalence of E. multilocularis in the natural definitive host are likely to be high in geographical regions where there is a concomitant high prevalence in alternative definitive hosts such as dogs. In addition, Mesocestoides spp., Dipylidium caninum, Taenia spp., Toxocara canis, Toxascaris leonina, Capillaria and Acanthocephala spp. were found in 99 (66%), 50 (33%), 48 (32%), 46 (30%), 9 (6%), 34 (23%) and 2 (1%) of foxes, respectively. The prevalence but not the abundance of E. multilocularis decreased with age. The abundance of Dipylidium caninum also decreased with age. The frequency distribution of E. multilocularis and Mesocestoides spp. followed a zero-inflated negative binomial distribution, whilst all other helminths had a negative binomial distribution. This demonstrates that the frequency distribution of positive counts, and not just the frequency of zeros in the data set, can determine whether a zero-inflated or non-zero-inflated model is more appropriate: the prevalences of E. multilocularis and Mesocestoides spp. were the highest (and hence had the fewest zero counts), yet these parasite distributions nevertheless gave a better fit to the zero-inflated models. PMID:20434845

  16. Assessing historical rate changes in global tsunami occurrence

    USGS Publications Warehouse

    Geist, E.L.; Parsons, T.

    2011-01-01

    The global catalogue of tsunami events is examined to determine if transient variations in tsunami rates are consistent with a Poisson process commonly assumed for tsunami hazard assessments. The primary data analyzed are tsunamis with maximum sizes >1 m. The record of these tsunamis appears to be complete since approximately 1890. A secondary data set of tsunamis >0.1 m is also analyzed that appears to be complete since approximately 1960. Various kernel density estimates used to determine the rate distribution with time indicate a prominent rate change in global tsunamis during the mid-1990s. Less prominent rate changes occur in the early and mid-20th century. To determine whether these rate fluctuations are anomalous, the distribution of annual event numbers for the tsunami catalogue is compared to Poisson and negative binomial distributions, the latter of which includes the effects of temporal clustering. Compared to a Poisson distribution, the negative binomial distribution model provides a consistent fit to tsunami event numbers for the >1 m data set, but the Poisson null hypothesis cannot be falsified for the shorter duration >0.1 m data set. Temporal clustering of tsunami sources is also indicated by the distribution of interevent times for both data sets. Tsunami event clusters consist of only two to four events, in contrast to the protracted sequences of earthquakes that make up foreshock-mainshock-aftershock sequences. From past studies of seismicity, it is likely that there is a physical triggering mechanism responsible for events within the tsunami source 'mini-clusters'. In conclusion, prominent transient rate increases in the occurrence of global tsunamis appear to be caused by temporal grouping of geographically distinct mini-clusters, in addition to the random preferential location of global M > 7 earthquakes along offshore fault zones.
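    The Poisson-versus-negative-binomial comparison at the heart of this record can be carried out in a few lines. A minimal sketch follows, assuming hypothetical annual event counts; the negative binomial is fit with its mean pinned to the sample mean so the comparison isolates the extra dispersion parameter:

    ```python
    import numpy as np
    from scipy import stats
    from scipy.optimize import minimize_scalar

    counts = np.array([2, 1, 4, 0, 3, 2, 5, 1, 2, 8, 3, 2, 1, 4, 6, 2, 3, 1, 0, 2])

    mu = counts.mean()
    ll_pois = stats.poisson.logpmf(counts, mu).sum()

    def nb_negll(r):
        p = r / (r + mu)                      # keep the mean fixed at the sample mean
        return -stats.nbinom.logpmf(counts, r, p).sum()

    res = minimize_scalar(nb_negll, bounds=(1e-3, 1e3), method="bounded")
    # one extra parameter: a likelihood-ratio test against chi-square(1) applies
    print("Poisson logL:", ll_pois, "NB logL:", -res.fun)
    ```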

  17. Students' Informal Inference about the Binomial Distribution of "Bunny Hops": A Dialogic Perspective

    ERIC Educational Resources Information Center

    Kazak, Sibel; Fujita, Taro; Wegerif, Rupert

    2016-01-01

    The study explores the development of 11-year-old students' informal inference about random bunny hops through student talk and use of computer simulation tools. Our aim in this paper is to draw on dialogic theory to explain how students make shifts in perspective, from intuition-based reasoning to more powerful, formal ways of using probabilistic…

  18. A Model Comparison for Count Data with a Positively Skewed Distribution with an Application to the Number of University Mathematics Courses Completed

    ERIC Educational Resources Information Center

    Liou, Pey-Yan

    2009-01-01

    The current study examines three regression models: OLS (ordinary least squares) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…

  19. A Negative Binomial Regression Model for Accuracy Tests

    ERIC Educational Resources Information Center

    Hung, Lai-Fa

    2012-01-01

    Rasch used a Poisson model to analyze errors and speed in reading tests. An important property of the Poisson distribution is that the mean and variance are equal. However, in social science research, it is very common for the variance to be greater than the mean (i.e., the data are overdispersed). This study embeds the Rasch model within an…

  20. Togetherness among Plasmodium falciparum gametocytes: interpretation through simulation and consequences for malaria transmission.

    PubMed

    Gaillard, F O; Boudin, C; Chau, N P; Robert, V; Pichon, G

    2003-11-01

    Previous experimental gametocyte infections of Anopheles arabiensis on 3 volunteers naturally infected with Plasmodium falciparum were conducted in Senegal. They showed that gametocyte counts in the mosquitoes are, like macroparasite intakes, heterogeneous (overdispersed). They followed a negative binomial distribution, with the overdispersion coefficient apparently constant (k = 3.1). To try to explain this heterogeneity, we used an individual-based model (IBM), simulating the behaviour of gametocytes in the human blood circulation and their ingestion by mosquitoes. The hypothesis was that the gametocytes cluster in the capillaries. From a series of simulations, the following results were obtained in the case of clustering: (i) the distribution of the gametocytes ingested by the mosquitoes followed a negative binomial, and (ii) the k coefficient increased significantly with the density of circulating gametocytes. To validate this model result, 2 more experiments were conducted in Cameroon. Pooled experiments showed a distinct density dependency of the k-values. The simulation results and the experimental results were thus in agreement and suggested that an aggregation process at the microscopic level might produce the density-dependent overdispersion at the macroscopic level. Simulations also suggested that the clustering of gametocytes might facilitate fertilization of gametes.
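    The density dependence of k reported here falls out of even a very simple clustering model. The sketch below is a toy compound-Poisson stand-in for the authors' individual-based model, with all parameter values hypothetical: each blood meal ingests a Poisson number of clusters with geometric sizes, and the moment estimate k = m^2/(v - m) rises with gametocyte density:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    def moment_k(x):
        m, v = x.mean(), x.var()
        return m**2 / (v - m)          # NB dispersion k by moments (needs v > m)

    for lam in (1, 3, 10):             # mean number of clusters per blood meal
        n_clusters = rng.poisson(lam, 5000)
        # hypothetical cluster sizes: geometric with mean 4 gametocytes
        counts = np.array([rng.geometric(0.25, c).sum() if c else 0
                           for c in n_clusters])
        print(lam, round(moment_k(counts), 2))   # k grows with density
    ```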

  1. Inference of R0 and Transmission Heterogeneity from the Size Distribution of Stuttering Chains

    PubMed Central

    Blumberg, Seth; Lloyd-Smith, James O.

    2013-01-01

    For many infectious disease processes such as emerging zoonoses and vaccine-preventable diseases, R0 < 1 and infections occur as self-limited stuttering transmission chains. A mechanistic understanding of transmission is essential for characterizing the risk of emerging diseases and monitoring spatio-temporal dynamics. Thus methods for inferring R0 and the degree of heterogeneity in transmission from stuttering chain data have important applications in disease surveillance and management. Previous researchers have used chain size distributions to infer R0, but estimation of the degree of individual-level variation in infectiousness (as quantified by the dispersion parameter, k) has typically required contact tracing data. Utilizing branching process theory along with a negative binomial offspring distribution, we demonstrate how maximum likelihood estimation can be applied to chain size data to infer both R0 and the dispersion parameter k that characterizes heterogeneity. While the maximum likelihood value for R0 is a simple function of the average chain size, the associated confidence intervals are dependent on the inferred degree of transmission heterogeneity. As demonstrated for monkeypox data from the Democratic Republic of Congo, this impacts when a statistically significant change in R0 is detectable. In addition, by allowing for superspreading events, inference of k shifts the threshold above which a transmission chain should be considered anomalously large for a given value of R0 (thus reducing the probability of false alarms about pathogen adaptation). Our analysis of monkeypox also clarifies the various ways that imperfect observation can impact inference of transmission parameters, and highlights the need to quantitatively evaluate whether observation is likely to significantly bias results. PMID:23658504
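    For a branching process with negative binomial offspring (reproduction number R0, dispersion k), the chain size distribution has the closed form P(j) = Γ(kj + j - 1) / (Γ(kj) Γ(j + 1)) × (R0/k)^(j-1) / (1 + R0/k)^(kj + j - 1), which makes the maximum likelihood procedure described above straightforward to sketch. The data below are hypothetical, and this is an illustration of the general technique rather than the authors' code:

    ```python
    import numpy as np
    from scipy.special import gammaln
    from scipy.optimize import minimize

    def log_pmf(j, R0, k):
        # log P(chain size = j) under NB(R0, k) offspring
        return (gammaln(k*j + j - 1) - gammaln(k*j) - gammaln(j + 1)
                + (j - 1)*np.log(R0/k) - (k*j + j - 1)*np.log1p(R0/k))

    chains = np.array([1]*40 + [2]*12 + [3]*6 + [5]*2 + [10])  # hypothetical sizes

    negll = lambda theta: -log_pmf(chains, theta[0], theta[1]).sum()
    fit = minimize(negll, x0=[0.5, 0.5], bounds=[(1e-3, 0.99), (1e-3, 50)])
    print("MLE (R0, k):", fit.x)
    print("moment check, R0 = 1 - 1/mean:", 1 - 1/chains.mean())
    ```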

  2. A comparison of observation-level random effect and Beta-Binomial models for modelling overdispersion in Binomial data in ecology & evolution.

    PubMed

    Harrison, Xavier A

    2015-01-01

    Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. Finally, both OLRE and Beta-Binomial models performed poorly when models contained <5 levels of the random intercept term, especially for estimating variance components, and this effect appeared independent of total sample size. These results suggest that OLRE are a useful tool for modelling overdispersion in Binomial data, but that they do not perform well in all circumstances and researchers should take care to verify the robustness of parameter estimates of OLRE models.
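    As a companion to the Beta-Binomial side of this comparison, the model can also be fit by direct maximum likelihood without mixed-model machinery. The sketch below, on simulated overdispersed proportion data with hypothetical parameters, is an illustration rather than the simulation code used in the study:

    ```python
    import numpy as np
    from scipy import stats
    from scipy.optimize import minimize

    rng = np.random.default_rng(4)
    n = np.full(200, 20)               # 20 binomial trials per observation
    p = rng.beta(2, 6, 200)            # between-observation heterogeneity in p
    y = rng.binomial(n, p)             # overdispersed binomial counts

    def negll(theta):
        a, b = np.exp(theta)           # log scale keeps both shape parameters positive
        return -stats.betabinom.logpmf(y, n, a, b).sum()

    fit = minimize(negll, x0=[0.0, 0.0])
    print(np.exp(fit.x))               # roughly recovers the generating (2, 6)
    ```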

  3. Landau-Zener extension of the Tavis-Cummings model: Structure of the solution

    DOE PAGES

    Sun, Chen; Sinitsyn, Nikolai A.

    2016-09-07

    We explore the recently discovered solution of the driven Tavis-Cummings model (DTCM), which describes the interaction of an arbitrary number of two-level systems with a bosonic mode whose frequency is linearly time-dependent. We derive compact and tractable expressions for transition probabilities in terms of well-known special functions. In this form, our formulas are suitable for fast numerical calculations and analytical approximations. As an application, we obtain the semiclassical limit of the exact solution and compare it to prior approximations. We also reveal a connection between the DTCM and q-deformed binomial statistics.

  4. M-Bonomial Coefficients and Their Identities

    ERIC Educational Resources Information Center

    Asiru, Muniru A.

    2010-01-01

    In this note, we introduce M-bonomial (or M-bonacci binomial) coefficients. These are similar to the binomial and the Fibonomial (or Fibonacci-binomial) coefficients and can be displayed in a triangle, similar to Pascal's triangle, from which some identities become obvious.
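    For concreteness, the Fibonacci case that these coefficients generalize replaces each factorial in C(n, k) with a "Fibonacci factorial". The following sketch (an illustration of the standard Fibonomial construction, not taken from the paper) prints the first rows of the resulting triangle:

    ```python
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    def fibonomial(n, k):
        # product of F(n-i) / F(i+1) for i = 0..k-1; the result is always an integer
        num = den = 1
        for i in range(k):
            num *= fib(n - i)
            den *= fib(i + 1)
        return num // den

    for n in range(7):                 # rows of the Fibonomial triangle
        print([fibonomial(n, k) for k in range(n + 1)])
    ```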

  5. Zero-state Markov switching count-data models: an empirical assessment.

    PubMed

    Malyshkina, Nataliya V; Mannering, Fred L

    2010-01-01

    In this study, a two-state Markov switching count-data model is proposed as an alternative to zero-inflated models to account for the preponderance of zeros sometimes observed in transportation count data, such as the number of accidents occurring on a roadway segment over some period of time. For this accident-frequency case, zero-inflated models assume the existence of two states: one of the states is a zero-accident count state, which has accident probabilities that are so low that they cannot be statistically distinguished from zero, and the other state is a normal-count state, in which counts can be non-negative integers that are generated by some counting process, for example, a Poisson or negative binomial process. While zero-inflated models have come under some criticism with regard to accident-frequency applications, one fact is undeniable: in many applications they provide a statistically superior fit to the data. The Markov switching approach we propose seeks to overcome some of the criticism associated with the zero-accident state of the zero-inflated model by allowing individual roadway segments to switch between zero and normal-count states over time. An important advantage of this Markov switching approach is that it allows for the direct statistical estimation of the specific roadway-segment state (i.e., zero-accident or normal-count state) whereas traditional zero-inflated models do not. To demonstrate the applicability of this approach, a two-state Markov switching negative binomial model (estimated with Bayesian inference) and standard zero-inflated negative binomial models are estimated using five-year accident frequencies on Indiana interstate highway segments. It is shown that the Markov switching model is a viable alternative and results in a superior statistical fit relative to the zero-inflated models.

  6. Neighborhood educational disparities in active commuting among women: the effect of distance between the place of residence and the place of work/study (an ACTI-Cités study).

    PubMed

    Perchoux, Camille; Nazare, Julie-Anne; Benmarhnia, Tarik; Salze, Paul; Feuillet, Thierry; Hercberg, Serge; Hess, Franck; Menai, Mehdi; Weber, Christiane; Charreire, Hélène; Enaux, Christophe; Oppert, Jean-Michel; Simon, Chantal

    2017-06-12

    Active transportation has been associated with favorable health outcomes. Previous research highlighted the influence of neighborhood educational level on active transportation. However, little is known regarding the effect of commuting distance on social disparities in active commuting. In this regard, women have been poorly studied. The objective of this paper was to evaluate the relationship between neighborhood educational level and active commuting, and to assess whether the commuting distance modifies this relationship in adult women. This cross-sectional study is based on a subsample of women from the Nutrinet-Santé web-cohort (N = 1169). Binomial, log-binomial and negative binomial regressions were used to assess the associations between neighborhood education level and (i) the likelihood of reporting any active commuting time, and (ii) the share of commuting time made by active transportation modes. Potential effect measure modification of distance to work on the previous associations was assessed both on the additive and the multiplicative scales. Neighborhood education level was positively associated with the probability of reporting any active commuting time (relative risk = 1.774; p < 0.05) and the share of commuting time spent active (relative risk = 1.423; p < 0.05). The impact of neighborhood education was greater at long distances to work for both outcomes. Our results suggest that neighborhood educational disparities in active commuting tend to increase with commuting distance among women. Further research is needed to provide geographically driven guidance for health promotion intervention aiming at reducing disparities in active transportation among socioeconomic groups.

  7. Regular exercise and related factors in patients with Parkinson's disease: Applying zero-inflated negative binomial modeling of exercise count data.

    PubMed

    Lee, JuHee; Park, Chang Gi; Choi, Moonki

    2016-05-01

    This study was conducted to identify risk factors that influence regular exercise among patients with Parkinson's disease in Korea. Parkinson's disease is prevalent in the elderly, and may lead to a sedentary lifestyle. Exercise can enhance physical and psychological health. However, patients with Parkinson's disease are less likely to exercise than are other populations due to physical disability. A secondary data analysis and cross-sectional descriptive study were conducted. A convenience sample of 106 patients with Parkinson's disease was recruited at an outpatient neurology clinic of a tertiary hospital in Korea. Demographic characteristics, disease-related characteristics (including disease duration and motor symptoms), self-efficacy for exercise, balance, and exercise level were investigated. Negative binomial regression and zero-inflated negative binomial regression for exercise count data were utilized to determine factors involved in exercise. The mean age of participants was 65.85 ± 8.77 years, and the mean duration of Parkinson's disease was 7.23 ± 6.02 years. Most participants indicated that they engaged in regular exercise (80.19%). Approximately half of participants exercised at least 5 days per week for 30 min, as recommended (51.9%). Motor symptoms were a significant predictor of exercise in the count model, and self-efficacy for exercise was a significant predictor of exercise in the zero model. Severity of motor symptoms was related to frequency of exercise. Self-efficacy contributed to the probability of exercise. Symptom management and improvement of self-efficacy for exercise are important to encourage regular exercise in patients with Parkinson's disease. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. Performance and structure of single-mode bosonic codes

    NASA Astrophysics Data System (ADS)

    Albert, Victor V.; Noh, Kyungjoo; Duivenvoorden, Kasper; Young, Dylan J.; Brierley, R. T.; Reinhold, Philip; Vuillot, Christophe; Li, Linshu; Shen, Chao; Girvin, S. M.; Terhal, Barbara M.; Jiang, Liang

    2018-03-01

    The early Gottesman, Kitaev, and Preskill (GKP) proposal for encoding a qubit in an oscillator has recently been followed by cat- and binomial-code proposals. Numerically optimized codes have also been proposed, and we introduce codes of this type here. These codes have yet to be compared using the same error model; we provide such a comparison by determining the entanglement fidelity of all codes with respect to the bosonic pure-loss channel (i.e., photon loss) after the optimal recovery operation. We then compare achievable communication rates of the combined encoding-error-recovery channel by calculating the channel's hashing bound for each code. Cat and binomial codes perform similarly, with binomial codes outperforming cat codes at small loss rates. Despite not being designed to protect against the pure-loss channel, GKP codes significantly outperform all other codes for most values of the loss rate. We show that the performance of GKP and some binomial codes increases monotonically with increasing average photon number of the codes. In order to corroborate our numerical evidence of the cat-binomial-GKP order of performance occurring at small loss rates, we analytically evaluate the quantum error-correction conditions of those codes. For GKP codes, we find an essential singularity in the entanglement fidelity in the limit of vanishing loss rate. In addition to comparing the codes, we draw parallels between binomial codes and discrete-variable systems. First, we characterize one- and two-mode binomial as well as multiqubit permutation-invariant codes in terms of spin-coherent states. Such a characterization allows us to introduce check operators and error-correction procedures for binomial codes. Second, we introduce a generalization of spin-coherent states, extending our characterization to qudit binomial codes and yielding a multiqudit code.

  9. Associations among habitat characteristics and meningeal worm prevalence in eastern South Dakota, USA

    USGS Publications Warehouse

    Jacques, Christopher N.; Jenks, Jonathan A.; Klaver, Robert W.; Dubay, Shelli A.

    2017-01-01

    Few studies have evaluated how wetland and forest characteristics influence the prevalence of meningeal worm (Parelaphostrongylus tenuis) infection of deer throughout the grassland biome of central North America. We used previously collected, county-level prevalence data to evaluate associations between habitat characteristics and probability of meningeal worm infection in white-tailed deer (Odocoileus virginianus) across eastern South Dakota, US. The highest-ranked binomial regression model for detecting probability of meningeal worm infection was spring temperature + summer precipitation + percent wetland; weight of evidence (wi = 0.71) favored this model over alternative models, though predictive capability was low (receiver operating characteristic = 0.62). Probability of meningeal worm infection increased by 1.3- and 1.6-fold for each 1-cm and 1-°C increase in summer precipitation and spring temperature, respectively. Similarly, probability of infection increased 1.2-fold for each 1% increase in wetland habitat. Our findings highlight the importance of wetland habitat in predicting meningeal worm infection across eastern South Dakota. Future research is warranted to evaluate the relationships between climatic conditions (e.g., drought, wet cycles) and deer habitat selection in maintaining P. tenuis along the western boundary of the parasite.

  10. A comparison of different ways of including baseline counts in negative binomial models for data from falls prevention trials.

    PubMed

    Zheng, Han; Kimber, Alan; Goodwin, Victoria A; Pickering, Ruth M

    2018-01-01

    A common design for a falls prevention trial is to assess falling at baseline, randomize participants into an intervention or control group, and ask them to record the number of falls they experience during a follow-up period of time. This paper addresses how best to include the baseline count in the analysis of the follow-up count of falls in negative binomial (NB) regression. We examine the performance of various approaches in simulated datasets where both counts are generated from a mixed Poisson distribution with shared random subject effect. Including the baseline count after log-transformation as a regressor in NB regression (NB-logged) or as an offset (NB-offset) resulted in greater power than including the untransformed baseline count (NB-unlogged). Cook and Wei's conditional negative binomial (CNB) model replicates the underlying process generating the data. In our motivating dataset, a statistically significant intervention effect resulted from the NB-logged, NB-offset, and CNB models, but not from NB-unlogged, and large, outlying baseline counts were overly influential in NB-unlogged but not in NB-logged. We conclude that there is little to lose by including the log-transformed baseline count in standard NB regression compared to CNB for moderate to larger sized datasets. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
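    The offset-versus-regressor distinction in this record is concise in GLM terms: including log(baseline) as an offset fixes its coefficient at 1, while including it as a regressor lets the data estimate that coefficient. A minimal sketch with statsmodels follows (simulated data; all names and effect sizes hypothetical):

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n = 300
    baseline = rng.poisson(3, n) + 1            # baseline falls, kept positive
    group = rng.integers(0, 2, n)               # 0 = control, 1 = intervention
    follow_up = rng.poisson(baseline * np.exp(-0.3 * group))  # true rate ratio ~0.74

    # NB-logged: log baseline enters as a regressor with a free coefficient
    X1 = sm.add_constant(np.column_stack([group, np.log(baseline)]))
    nb_logged = sm.GLM(follow_up, X1, family=sm.families.NegativeBinomial()).fit()

    # NB-offset: log baseline enters with its coefficient fixed at 1
    X2 = sm.add_constant(group.astype(float))
    nb_offset = sm.GLM(follow_up, X2, family=sm.families.NegativeBinomial(),
                       offset=np.log(baseline)).fit()

    print(np.exp(nb_logged.params[1]), np.exp(nb_offset.params[1]))  # rate ratios
    ```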

  11. Robust inference in the negative binomial regression model with an application to falls data.

    PubMed

    Aeberhard, William H; Cantoni, Eva; Heritier, Stephane

    2014-12-01

    A popular way to model overdispersed count data, such as the number of falls reported during intervention studies, is by means of the negative binomial (NB) distribution. Classical estimation methods are well known to be sensitive to model misspecifications, which take the form of patients falling much more often than expected in such intervention studies where the NB regression model is used. We extend in this article two approaches for building robust M-estimators of the regression parameters in the class of generalized linear models to the NB distribution. The first approach achieves robustness in the response by applying a bounded function on the Pearson residuals arising in the maximum likelihood estimating equations, while the second approach achieves robustness by bounding the unscaled deviance components. For both approaches, we explore different choices for the bounding functions. Through a unified notation, we show how close these approaches may actually be as long as the bounding functions are chosen and tuned appropriately, and provide the asymptotic distributions of the resulting estimators. Moreover, we introduce a robust weighted maximum likelihood estimator for the overdispersion parameter, specific to the NB distribution. Simulations under various settings show that redescending bounding functions yield estimates with smaller biases under contamination while keeping high efficiency at the assumed model, and this for both approaches. We present an application to a recent randomized controlled trial measuring the effectiveness of an exercise program at reducing the number of falls among people suffering from Parkinson's disease to illustrate the diagnostic use of such robust procedures and their need for reliable inference. © 2014, The International Biometric Society.

  12. Dispersion models and sampling of cacao mirid bug Sahlbergella singularis (Hemiptera: Miridae) on Theobroma Cacao in southern Cameroon.

    PubMed

    Bisseleua, D H B; Vidal, Stefan

    2011-02-01

    The spatio-temporal distribution of Sahlbergella singularis Haglund, a major pest of cacao trees (Theobroma cacao) (Malvaceae), was studied for 2 yr in traditional cacao forest gardens in the humid forest area of southern Cameroon. The first objective was to analyze the dispersion of this insect on cacao trees. The second objective was to develop sampling plans based on fixed levels of precision for estimating S. singularis populations. The following models were used to analyze the data: Taylor's power law, Iwao's patchiness regression, the Nachman model, and the negative binomial distribution. Our results document that Taylor's power law fit the data better than the Iwao and Nachman models. Taylor's b and Iwao's β were both significantly >1, indicating that S. singularis aggregated on specific trees. This result was further supported by the calculated common k of 1.75444. Iwao's α was significantly <0, indicating that the basic distribution component of S. singularis was the individual insect. Comparison of the negative binomial (NBD) and Nachman models indicated that the NBD model was appropriate for studying S. singularis distribution. Optimal sample sizes for fixed precision levels of 0.10, 0.15, and 0.25 were estimated with Taylor's regression coefficients. Required sample sizes increased dramatically with increasing levels of precision. This is the first study of S. singularis dispersion in cacao plantations. The sampling plans presented here should be a tool for research on population dynamics and pest management decisions for mirid bugs on cacao. © 2011 Entomological Society of America
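    Taylor's power law, the best-fitting dispersion model here, states that variance scales with mean as V = a m^b, so b is estimated as the slope of a log-log regression of sample variances on sample means. A minimal sketch with simulated counts (parameter values hypothetical, chosen to mimic aggregation with a common k near 1.75):

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    k = 1.75
    # per-date samples: counts on 30 trees at five mean density levels
    samples = [rng.negative_binomial(k, k / (k + m), 30) for m in (0.5, 1, 2, 4, 8)]

    means = np.array([s.mean() for s in samples])
    variances = np.array([s.var(ddof=1) for s in samples])

    # Taylor's power law: log V = log a + b log m
    b, log_a = np.polyfit(np.log(means), np.log(variances), 1)
    print("b =", round(b, 2), "a =", round(np.exp(log_a), 2))  # b > 1: aggregation
    ```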

  13. Problems on Divisibility of Binomial Coefficients

    ERIC Educational Resources Information Center

    Osler, Thomas J.; Smoak, James

    2004-01-01

    Twelve unusual problems involving divisibility of the binomial coefficients are presented in this article. The problems are listed in "The Problems" section. All twelve problems have short solutions which are listed in "The Solutions" section. These problems could be assigned to students in any course in which the binomial theorem and Pascal's…
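    Problems of this kind often reduce to evaluating binomial coefficients modulo a prime, for which Lucas' theorem is the standard tool: C(n, k) mod p is the product of the binomial coefficients of the base-p digits of n and k. A short illustration (not one of the twelve problems in the article):

    ```python
    from math import comb

    def binom_mod_p(n, k, p):
        # Lucas' theorem: multiply C(n_i, k_i) over the base-p digits of n and k
        result = 1
        while n or k:
            n, ni = divmod(n, p)
            k, ki = divmod(k, p)
            result = result * comb(ni, ki) % p
        return result

    print(binom_mod_p(1000, 300, 7), comb(1000, 300) % 7)  # the two values agree
    ```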

  14. Application of binomial-edited CPMG to shale characterization

    USGS Publications Warehouse

    Washburn, Kathryn E.; Birdwell, Justin E.

    2014-01-01

    Unconventional shale resources may contain a significant amount of hydrogen in organic solids such as kerogen, but it is not possible to directly detect these solids with many NMR systems. Binomial-edited pulse sequences capitalize on magnetization transfer between solids, semi-solids, and liquids to provide an indirect method of detecting solid organic materials in shales. When the organic solids can be directly measured, binomial-editing helps distinguish between different phases. We applied a binomial-edited CPMG pulse sequence to a range of natural and experimentally-altered shale samples. The most substantial signal loss is seen in shales rich in organic solids while fluids associated with inorganic pores seem essentially unaffected. This suggests that binomial-editing is a potential method for determining fluid locations, solid organic content, and kerogen–bitumen discrimination.

  15. Can Bayesian models play a role in dental caries epidemiology? Evidence from an application to the BELCAP data set.

    PubMed

    Matranga, Domenica; Firenze, Alberto; Vullo, Angela

    2013-10-01

    The aim of this study was to show the potential of Bayesian analysis in statistical modelling of dental caries data. Because of the bounded nature of the dmft (DMFT) index, zero-inflated binomial (ZIB) and beta-binomial (ZIBB) models were considered. The effects of incorporating prior information available about the parameters of the models were also shown. The data set used in this study was the Belo Horizonte Caries Prevention (BELCAP) study (Böhning et al., 1999), consisting of five variables collected among 797 Brazilian school children designed to evaluate four programmes for reducing caries. Only the eight primary molar teeth were considered in the data set. A data augmentation algorithm was used for estimation. Firstly, noninformative priors were used to express our lack of knowledge about the regression parameters. Secondly, prior information about the probability of being a structural zero dmft and the probability of being caries affected in the subpopulation of susceptible children was incorporated. With noninformative priors, the best fitting model was the ZIBB. Education (OR = 0.76, 95% CrI: 0.59, 0.99), all interventions (OR = 0.46, 95% CrI: 0.35, 0.62), rinsing (OR = 0.61, 95% CrI: 0.47, 0.80) and hygiene (OR = 0.65, 95% CrI: 0.49, 0.86) were demonstrated to be factors protecting children from being caries affected. Being male increased the probability of being caries diseased (OR = 1.19, 95% CrI: 1.01, 1.42). However, after incorporating informative priors, the ZIB models' estimates were not influenced, while the ZIBB models reduced deviance and confirmed the association with all interventions and rinsing only. In our application, Bayesian estimates showed similar accuracy and precision to likelihood-based estimates, although they offered many computational advantages and the possibility of expressing all forms of uncertainty in terms of probability. The overdispersion parameter may explain why the introduction of prior information had significant effects on the parameters of the ZIBB model, while the ZIB estimates remained unchanged. Finally, the superior performance of the ZIBB compared to the ZIB model was shown to stem from its ability to capture overdispersion in the data. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  16. Jump-and-return sandwiches: A new family of binomial-like selective inversion sequences with improved performance

    NASA Astrophysics Data System (ADS)

    Brenner, Tom; Chen, Johnny; Stait-Gardner, Tim; Zheng, Gang; Matsukawa, Shingo; Price, William S.

    2018-03-01

    A new family of binomial-like inversion sequences, named jump-and-return sandwiches (JRS), has been developed by inserting a binomial-like sequence into a standard jump-and-return sequence, discovered through use of a stochastic Genetic Algorithm optimisation. Compared to currently used binomial-like inversion sequences (e.g., 3-9-19 and W5), the new sequences afford wider inversion bands and narrower non-inversion bands with an equal number of pulses. As an example, two jump-and-return sandwich 10-pulse sequences achieved 95% inversion at offsets corresponding to 9.4% and 10.3% of the non-inversion band spacing, compared to 14.7% for the binomial-like W5 inversion sequence, i.e., they afforded non-inversion bands about two thirds the width of the W5 non-inversion band.

  17. Limits, discovery and cut optimization for a Poisson process with uncertainty in background and signal efficiency: TRolke 2.0

    NASA Astrophysics Data System (ADS)

    Lundberg, J.; Conrad, J.; Rolke, W.; Lopez, A.

    2010-03-01

    A C++ class was written for the calculation of frequentist confidence intervals using the profile likelihood method. Seven combinations of Poissonian, Gaussian and Binomial uncertainties are implemented. The package provides routines for the calculation of upper and lower limits, sensitivity and related properties. It also supports hypothesis tests which take uncertainties into account. It can be used in compiled C++ code, in Python or interactively via the ROOT analysis framework. Program summary: Program title: TRolke version 2.0. Catalogue identifier: AEFT_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEFT_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: MIT license. No. of lines in distributed program, including test data, etc.: 3431. No. of bytes in distributed program, including test data, etc.: 21 789. Distribution format: tar.gz. Programming language: ISO C++. Computer: Unix, GNU/Linux, Mac. Operating system: Linux 2.6 (Scientific Linux 4 and 5, Ubuntu 8.10), Darwin 9.0 (Mac-OS X 10.5.8). RAM: ~20 MB. Classification: 14.13. External routines: ROOT (http://root.cern.ch/drupal/). Nature of problem: The problem is to calculate a frequentist confidence interval on the parameter of a Poisson process with statistical or systematic uncertainties in signal efficiency or background. Solution method: Profile likelihood method, analytical. Running time: <10 seconds per extracted limit.

  18. Linear algebra of the permutation invariant Crow-Kimura model of prebiotic evolution.

    PubMed

    Bratus, Alexander S; Novozhilov, Artem S; Semenov, Yuri S

    2014-10-01

    A particular case of the famous quasispecies model, the Crow-Kimura model with a permutation-invariant fitness landscape, is investigated. Using the fact that the mutation matrix in the case of a permutation-invariant fitness landscape has a special tridiagonal form, a change of basis is suggested such that in the new coordinates a number of analytical results can be obtained. In particular, using the eigenvectors of the mutation matrix as the new basis, we show that the quasispecies distribution approaches a binomial one and give simple estimates for the speed of convergence. Another consequence of the suggested approach is a parametric solution to the system of equations determining the quasispecies. Using this parametric solution we show that our approach leads to exact asymptotic results in some cases which are not covered by the existing methods. In particular, we are able to present not only the limit behavior of the leading eigenvalue (mean population fitness), but also the exact formulas for the limit quasispecies eigenvector for special cases. For instance, this eigenvector has a geometric distribution in the case of the classical single-peaked fitness landscape. On the biological side, we propose a mathematical definition, based on the closeness of the quasispecies to the binomial distribution, which can be used as an operational definition of the notorious error threshold. Using this definition, we suggest two approximate formulas to estimate the critical mutation rate after which quasispecies delocalization occurs. Copyright © 2014 Elsevier Inc. All rights reserved.
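    The objects in this record are easy to compute directly. For sequences of length N grouped into Hamming classes, the permutation-invariant mutation matrix is tridiagonal, and the quasispecies is the leading eigenvector of the fitness-plus-mutation matrix. The sketch below (illustrative parameter values, not from the paper) builds that matrix for a single-peaked landscape and measures how close the quasispecies is to the binomial distribution, the closeness used above to define the error threshold:

    ```python
    import numpy as np
    from scipy.stats import binom

    N, mu = 50, 1.0
    f = np.zeros(N + 1)
    f[0] = 2.0                             # single-peaked fitness landscape
    M = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        M[k, k] = -N                       # total mutation outflow rate
        if k < N:
            M[k + 1, k] = N - k            # Hamming class k -> k + 1
        if k > 0:
            M[k - 1, k] = k                # Hamming class k -> k - 1

    vals, vecs = np.linalg.eig(np.diag(f) + mu * M)
    lead = np.abs(vecs[:, np.argmax(vals.real)].real)
    q = lead / lead.sum()                  # quasispecies over Hamming classes
    print(np.abs(q - binom.pmf(np.arange(N + 1), N, 0.5)).max())
    ```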

  19. Sequential sampling of ribes populations in the control of white pine blister rust (Cronartium ribicola Fischer) in California

    Treesearch

    Harold R. Offord

    1966-01-01

    Sequential sampling based on a negative binomial distribution of ribes populations required less than half the time taken by regular systematic line transect sampling in a comparison test. It gave the same control decision as the regular method in 9 of 13 field trials. A computer program that permits sequential plans to be built readily for other white pine regions is...

  20. Design and analysis of three-arm trials with negative binomially distributed endpoints.

    PubMed

    Mütze, Tobias; Munk, Axel; Friede, Tim

    2016-02-20

    A three-arm clinical trial design with an experimental treatment, an active control, and a placebo control, commonly referred to as the gold standard design, enables testing of non-inferiority or superiority of the experimental treatment compared with the active control. In this paper, we propose methods for designing and analyzing three-arm trials with negative binomially distributed endpoints. In particular, we develop a Wald-type test with a restricted maximum-likelihood variance estimator for testing non-inferiority or superiority. For this test, sample size and power formulas as well as optimal sample size allocations will be derived. The performance of the proposed test will be assessed in an extensive simulation study with regard to type I error rate, power, sample size, and sample size allocation. For the purpose of comparison, Wald-type statistics with a sample variance estimator and an unrestricted maximum-likelihood estimator are included in the simulation study. We found that the proposed Wald-type test with a restricted variance estimator performed well across the considered scenarios and is therefore recommended for application in clinical trials. The methods proposed are motivated and illustrated by a recent clinical trial in multiple sclerosis. The R package ThreeArmedTrials, which implements the methods discussed in this paper, is available on CRAN. Copyright © 2015 John Wiley & Sons, Ltd.

  1. Statistical procedures for analyzing mental health services data.

    PubMed

    Elhai, Jon D; Calhoun, Patrick S; Ford, Julian D

    2008-08-15

    In mental health services research, analyzing service utilization data often poses serious problems, given the presence of substantially skewed data distributions. This article presents a non-technical introduction to statistical methods specifically designed to handle the complexly distributed datasets that represent mental health service use, including Poisson, negative binomial, zero-inflated, and zero-truncated regression models. A flowchart is provided to assist the investigator in selecting the most appropriate method. Finally, a dataset of mental health service use reported by medical patients is described, and a comparison of results across several different statistical methods is presented. Implications of matching data analytic techniques appropriately with the often complexly distributed datasets of mental health services utilization variables are discussed.

  2. Estimating safety effects of pavement management factors utilizing Bayesian random effect models.

    PubMed

    Jiang, Ximiao; Huang, Baoshan; Zaretzki, Russell L; Richards, Stephen; Yan, Xuedong

    2013-01-01

    Previous studies of pavement management factors that relate to the occurrence of traffic-related crashes are rare. Traditional research has mostly employed summary statistics of bidirectional pavement quality measurements in extended longitudinal road segments over a long time period, which may cause a loss of important information and result in biased parameter estimates. The research presented in this article focuses on crash risk of roadways with overall fair to good pavement quality. Real-time and location-specific data were employed to estimate the effects of pavement management factors on the occurrence of crashes. This research is based on the crash data and corresponding pavement quality data for the Tennessee state route highways from 2004 to 2009. The potential temporal and spatial correlations among observations caused by unobserved factors were considered. Overall 6 models were built accounting for no correlation, temporal correlation only, and both the temporal and spatial correlations. These models included Poisson, negative binomial (NB), one random effect Poisson and negative binomial (OREP, ORENB), and two random effect Poisson and negative binomial (TREP, TRENB) models. The Bayesian method was employed to construct these models. The inference is based on the posterior distribution from the Markov chain Monte Carlo (MCMC) simulation. These models were compared using the deviance information criterion. Analysis of the posterior distribution of parameter coefficients indicates that the pavement management factors indexed by Present Serviceability Index (PSI) and Pavement Distress Index (PDI) had significant impacts on the occurrence of crashes, whereas the variable rutting depth was not significant. Among other factors, lane width, median width, type of terrain, and posted speed limit were significant in affecting crash frequency. The findings of this study indicate that a reduction in pavement roughness would reduce the likelihood of traffic-related crashes. Hence, maintaining a low level of pavement roughness is strongly suggested. In addition, the results suggested that the temporal correlation among observations was significant and that the ORENB model outperformed all other models.

  3. Revealing Word Order: Using Serial Position in Binomials to Predict Properties of the Speaker

    ERIC Educational Resources Information Center

    Iliev, Rumen; Smirnova, Anastasia

    2016-01-01

    Three studies test the link between word order in binomials and psychological and demographic characteristics of a speaker. While linguists have already suggested that psychological, cultural and societal factors are important in choosing word order in binomials, the vast majority of relevant research was focused on general factors and on broadly…

  4. Univariate and bivariate likelihood-based meta-analysis methods performed comparably when marginal sensitivity and specificity were the targets of inference.

    PubMed

    Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H

    2017-03-01

    To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples, thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities, to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity. Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Some characteristics of repeated sickness absence

    PubMed Central

    Ferguson, David

    1972-01-01

    Ferguson, D. (1972). Brit. J. industr. Med., 29, 420-431. Some characteristics of repeated sickness absence. Several studies have shown that frequency of absence attributed to sickness is not distributed randomly but tends to follow the negative binomial distribution, and this has been taken to support the concept of 'proneness' to such absence. Thus, the distribution of sickness absence resembles that of minor injury at work demonstrated over 50 years ago. Because the investigation of proneness to absence does not appear to have been reported by others in Australia, the opportunity was taken, during a wider study of health among telegraphists in a large communications undertaking, to analyse some characteristics of repeated sickness absence. The records of medically certified and uncertified sickness absence of all 769 telegraphists continuously employed in all State capitals over a two-and-a-half-year period were compared with those of 411 clerks and 415 mechanics and, in Sydney, 380 mail sorters and 80 of their supervisors. All telegraphists in Sydney, Melbourne, and Brisbane, and all mail sorters in Sydney, who were available and willing were later medically examined. From their absence pattern, repeaters (employees who had had eight or more certified absences in two and a half years) were separated into three types based on a presumptive origin in chance, recurrent disease and symptomatic non-specific disorder. The observed distribution of individual frequency of certified absence over the full two-and-a-half-year period of study followed that expected from the univariate negative binomial, using maximum likelihood estimators, rather than the Poisson distribution, in three of the four occupational groups in Sydney. Limited correlational and bivariate analysis supported the interpretation of proneness ascribed to the univariate fit. In the two groups studied, frequency of uncertified absence could not be fitted by the negative binomial, although the numbers of such absences in individuals in successive years were relatively highly correlated. All types of repeater were commoner in Sydney than in the other capital city offices, which differed little from each other. Repeaters were more common among those whose absence was attributed to neurosis, alimentary and upper respiratory tract disorder, and injury. Out of more than 90 health, personal, social, and industrial attributes determined at examination, only two (ethanol habit and adverse attitude to pay) showed any statistically significant association when telegraphist repeaters in Sydney were compared with employees who were rarely absent. Though repeating tended to be associated with chronic or recurrent ill health revealed at examination, one quarter of repeaters had little such ill health and one quarter of rarely absent employees had much. It was concluded that, in the population studied, the fitting of the negative binomial to frequency of certified sickness absence could, in the circumstances of the study, reasonably be given an interpretation of proneness. In that population also, repeating varies geographically and occupationally, and is poorly associated with disease and other attributes uncovered at examination, with the exception of the ethanol habit. Repeaters are more often neurotic than employees who are rarely absent but also are more often stable double jobbers. The repeater should be identified for what help may be given him, if needed; otherwise it would seem more profitable to attack those features in work design and organization which influence motivation to come to work. Social factors which predispose to repeated absence are less amenable to modification. PMID:4636662
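
    The proneness analysis above rests on comparing negative binomial and Poisson maximum likelihood fits to per-employee absence counts. A minimal sketch of that comparison, on simulated stand-in data rather than the study's records:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(0)
counts = rng.negative_binomial(n=2.0, p=0.4, size=769)   # hypothetical counts

# Poisson MLE is the sample mean.
ll_pois = stats.poisson.logpmf(counts, counts.mean()).sum()

# Negative binomial MLE for (size r, probability p) by numerical optimisation.
def nb_negll(params):
    r, p = params
    return -stats.nbinom.logpmf(counts, r, p).sum()

res = optimize.minimize(nb_negll, x0=[1.0, 0.5],
                        bounds=[(1e-6, None), (1e-6, 1 - 1e-6)])
print(f"log-likelihood: Poisson {ll_pois:.1f} vs negative binomial {-res.fun:.1f}")
```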

  6. Earthquake number forecasts testing

    NASA Astrophysics Data System (ADS)

    Kagan, Yan Y.

    2017-10-01

    We study the distributions of earthquake numbers in two global earthquake catalogues: Global Centroid-Moment Tensor and Preliminary Determinations of Epicenters. The properties of these distributions are needed, in particular, to develop the number test for our forecasts of future seismic activity rate, tested by the Collaboratory for Study of Earthquake Predictability (CSEP). A common assumption, as used in the CSEP tests, is that the numbers are described by the Poisson distribution. It is clear, however, that the Poisson assumption for the earthquake number distribution is incorrect, especially for the catalogues with a lower magnitude threshold. In contrast to the one-parameter Poisson distribution so widely used to describe earthquake occurrences, the negative-binomial distribution (NBD) has two parameters. The second parameter can be used to characterize the clustering or overdispersion of a process. We also introduce and study a more complex three-parameter beta negative-binomial distribution. We investigate the dependence of parameters for both Poisson and NBD distributions on the catalogue magnitude threshold and on temporal subdivision of catalogue duration. First, we study whether the Poisson law can be statistically rejected for various catalogue subdivisions. We find that for most cases of interest, the Poisson distribution can be shown to be rejected statistically at a high significance level in favour of the NBD. Thereafter, we investigate whether these distributions fit the observed distributions of seismicity. For this purpose, we study upper statistical moments of earthquake numbers (skewness and kurtosis) and compare them to the theoretical values for both distributions. Empirical values of the skewness and the kurtosis increase for smaller magnitude thresholds, and increase even more strongly for finer temporal subdivisions of the catalogues. The Poisson distribution for large rate values approaches the Gaussian law; therefore its skewness and kurtosis both tend to zero for large earthquake rates: for the Gaussian law, these values are identically zero. A calculation of the NBD skewness and kurtosis, based on the values of the first two statistical moments of the distribution, shows a rapid increase of these upper moments. However, the observed catalogue values of skewness and kurtosis rise even faster. This means that for small time intervals, the earthquake number distribution is even more heavy-tailed than the NBD predicts. Therefore, for small time intervals, we propose using appropriately smoothed empirical number distributions to test forecast earthquake numbers.
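
    A minimal sketch of the moment calculation referred to above: matching the NBD's first two moments to a sample mean and variance (values hypothetical) and reading off the implied skewness and excess kurtosis.

```python
from scipy import stats

m, v = 12.0, 40.0              # hypothetical sample mean and variance (v > m)
p = m / v                      # NBD success probability
r = m * m / (v - m)            # NBD size (dispersion) parameter

skew, kurt = stats.nbinom.stats(r, p, moments="sk")
print(f"r={r:.2f}, p={p:.2f} -> skewness {skew:.3f}, excess kurtosis {kurt:.3f}")
```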

  7. Modeling species-abundance relationships in multi-species collections

    USGS Publications Warehouse

    Peng, S.; Yin, Z.; Ren, H.; Guo, Q.

    2003-01-01

    Species-abundance relationship is one of the most fundamental aspects of community ecology. Since Motomura first developed the geometric series model to describe features of community structure, ecologists have developed many other models to fit species-abundance data in communities. These models can be classified into empirical and theoretical ones, including (1) statistical models, i.e., the negative binomial distribution (and its extension), log-series distribution (and its extension), geometric distribution, lognormal distribution, and Poisson-lognormal distribution; (2) niche models, i.e., geometric series, broken stick, overlapping niche, particulate niche, random assortment, dominance pre-emption, dominance decay, random fraction, weighted random fraction, composite niche, and the Zipf or Zipf-Mandelbrot model; and (3) dynamic models describing community dynamics and the restrictive function of the environment on the community. These models have different characteristics and fit species-abundance data in various communities or collections. Among them, the log-series distribution, lognormal distribution, geometric series, and broken stick model have been most widely used.

  8. Finding consistent patterns: A nonparametric approach for identifying differential expression in RNA-Seq data

    PubMed Central

    Li, Jun; Tibshirani, Robert

    2015-01-01

    We discuss the identification of features that are associated with an outcome in RNA-Sequencing (RNA-Seq) and other sequencing-based comparative genomic experiments. RNA-Seq data takes the form of counts, so models based on the normal distribution are generally unsuitable. The problem is especially challenging because different sequencing experiments may generate quite different total numbers of reads, or ‘sequencing depths’. Existing methods for this problem are based on Poisson or negative binomial models: they are useful but can be heavily influenced by ‘outliers’ in the data. We introduce a simple, nonparametric method with resampling to account for the different sequencing depths. The new method is more robust than parametric methods. It can be applied to data with quantitative, survival, two-class or multiple-class outcomes. We compare our proposed method to Poisson and negative binomial-based methods in simulated and real data sets, and find that our method discovers more consistent patterns than competing methods. PMID:22127579

  9. Modeling abundance using multinomial N-mixture models

    USGS Publications Warehouse

    Royle, Andy

    2016-01-01

    Multinomial N-mixture models are a generalization of the binomial N-mixture models described in Chapter 6 to allow for more complex and informative sampling protocols beyond simple counts. Many commonly used protocols such as multiple observer sampling, removal sampling, and capture-recapture produce a multivariate count frequency that has a multinomial distribution and for which multinomial N-mixture models can be developed. Such protocols typically result in more precise estimates than binomial mixture models because they provide direct information about parameters of the observation process. We demonstrate the analysis of these models in BUGS using several distinct formulations that afford great flexibility in the types of models that can be developed, and we demonstrate likelihood analysis using the unmarked package. Spatially stratified capture-recapture models are one class of models that fall into the multinomial N-mixture framework, and we discuss analysis of stratified versions of classical models such as model Mb, Mh and other classes of models that are only possible to describe within the multinomial N-mixture framework.

  10. Dynamic equilibrium of reconstituting hematopoietic stem cell populations.

    PubMed

    O'Quigley, John

    2010-12-01

    Clonal dominance in hematopoietic stem cell populations is an important question of interest, but not one we can answer directly. Any estimates are based on indirect measurement. For marked populations, we can equate empirical and theoretical moments for binomial sampling; in particular, we can use the well-known formula for the sampling variance of a binomial proportion. The empirical variance itself cannot always be reliably estimated and some caution is needed. We describe the difficulties here and identify ready solutions which require only appropriate use of variance-stabilizing transformations. From these we obtain estimators for the steady state, or dynamic equilibrium, of the number of hematopoietic stem cells involved in repopulating the marrow. The calculations themselves are not too involved. We give the distribution theory for the estimator as well as simple approximations for practical application. As an illustration, we rework data recently gathered to address the question of whether or not reconstitution of marrow grafts in the clinical setting might be considered oligoclonal.
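
    One classical example of the variance-stabilizing transformations mentioned above (a sketch under that assumption, not necessarily the transform used in the paper) is the arcsine square-root transform, whose variance is approximately 1/(4n) regardless of the true proportion:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
for p in (0.1, 0.5, 0.9):
    x = rng.binomial(n, p, size=100_000)      # binomial sampling
    z = np.arcsin(np.sqrt(x / n))             # variance-stabilized proportion
    print(f"p={p}: var(z)={z.var():.5f} vs 1/(4n)={1 / (4 * n):.5f}")
```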

  11. Jump-and-return sandwiches: A new family of binomial-like selective inversion sequences with improved performance.

    PubMed

    Brenner, Tom; Chen, Johnny; Stait-Gardner, Tim; Zheng, Gang; Matsukawa, Shingo; Price, William S

    2018-03-01

    A new family of binomial-like inversion sequences, named jump-and-return sandwiches (JRS), has been developed by inserting a binomial-like sequence into a standard jump-and-return sequence, discovered through use of a stochastic Genetic Algorithm optimisation. Compared to currently used binomial-like inversion sequences (e.g., 3-9-19 and W5), the new sequences afford wider inversion bands and narrower non-inversion bands with an equal number of pulses. As an example, two jump-and-return sandwich 10-pulse sequences achieved 95% inversion at offsets corresponding to 9.4% and 10.3% of the non-inversion band spacing, compared to 14.7% for the binomial-like W5 inversion sequence, i.e., they afforded non-inversion bands about two thirds the width of the W5 non-inversion band. Copyright © 2018 Elsevier Inc. All rights reserved.

  12. Adjusted Wald Confidence Interval for a Difference of Binomial Proportions Based on Paired Data

    ERIC Educational Resources Information Center

    Bonett, Douglas G.; Price, Robert M.

    2012-01-01

    Adjusted Wald intervals for binomial proportions in one-sample and two-sample designs have been shown to perform about as well as the best available methods. The adjusted Wald intervals are easy to compute and have been incorporated into introductory statistics courses. An adjusted Wald interval for paired binomial proportions is proposed here and…
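
    A minimal sketch of an adjusted Wald interval for a paired difference of proportions, using an Agresti-Min-style adjustment (adding 0.5 to each cell); the specific adjustment proposed in the article may differ. The counts are hypothetical.

```python
import numpy as np

def adjusted_wald_paired(a, b, c, d, z=1.96, add=0.5):
    """a, b, c, d: paired 2x2 cell counts; b and c are the discordant cells."""
    a, b, c, d = (x + add for x in (a, b, c, d))   # pseudo-observation adjustment
    n = a + b + c + d
    diff = (b - c) / n                             # estimate of p1 - p2
    var = (b + c - (b - c) ** 2 / n) / n ** 2      # Wald variance for paired data
    half = z * np.sqrt(var)
    return diff - half, diff + half

print(adjusted_wald_paired(40, 12, 5, 43))
```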

  13. Censored Hurdle Negative Binomial Regression (Case Study: Neonatorum Tetanus Case in Indonesia)

    NASA Astrophysics Data System (ADS)

    Yuli Rusdiana, Riza; Zain, Ismaini; Wulan Purnami, Santi

    2017-06-01

    Hurdle negative binomial regression is a method that can be used for a discrete dependent variable with excess zeros and under- or overdispersion. It uses a two-part approach: the first part, the zero hurdle model, models the zero observations of the dependent variable, and the second part, a truncated negative binomial model, models the non-zero (positive integer) observations. The discrete dependent variable in such cases is censored for some values; the type of censoring studied in this research is right censoring. This study aims to obtain the parameter estimator of hurdle negative binomial regression for a right-censored dependent variable. Maximum likelihood estimation (MLE) is used for parameter estimation. The hurdle negative binomial regression model for a right-censored dependent variable is applied to the number of neonatorum tetanus cases in Indonesia. The data are counts that contain zero values for some observations and varying positive values for others. This study also aims to obtain the parameter estimator and test statistic of the censored hurdle negative binomial model. Based on the regression results, the factors that influence neonatorum tetanus cases in Indonesia are the percentage of baby health care coverage and neonatal visits.

  14. Inhomogeneity of the density of Parascaris spp. eggs in faeces of individual foals and the use of hypothesis testing for treatment decision making.

    PubMed

    Wilkes, E J A; Cowling, A; Woodgate, R G; Hughes, K J

    2016-10-15

    Faecal egg counts (FEC) are used widely for monitoring of parasite infection in animals, treatment decision-making and estimation of anthelmintic efficacy. When a single count or sample mean is used as a point estimate of the expectation of the egg distribution over some time interval, the variability in the egg density is not accounted for. Although variability, including quantifying sources, of egg count data has been described, the spatiotemporal distribution of nematode eggs in faeces is not well understood. We believe that statistical inference about the mean egg count for treatment decision-making has not been used previously. The aim of this study was to examine the density of Parascaris eggs in solution and faeces and to describe the use of hypothesis testing for decision-making. Faeces from two foals with Parascaris burdens were mixed with magnesium sulphate solution and 30 McMaster chambers were examined to determine the egg distribution in a well-mixed solution. To examine the distribution of eggs in faeces from an individual animal, three faecal piles from a foal with a known Parascaris burden were obtained, from which 81 counts were performed. A single faecal sample was also collected daily from 20 foals on three consecutive days and a FEC was performed on three separate portions of each sample. As appropriate, Poisson or negative binomial confidence intervals for the distribution mean were calculated. Parascaris eggs in a well-mixed solution conformed to a homogeneous Poisson process, while the egg density in faeces was not homogeneous, but aggregated. This study provides an extension from homogeneous to inhomogeneous Poisson processes, leading to an understanding of why Poisson and negative binomial distributions correspondingly provide a good fit for egg count data. The application of one-sided hypothesis tests for decision-making is presented. Copyright © 2016 Elsevier B.V. All rights reserved.
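
    A minimal sketch of the one-sided test idea described above, assuming well-mixed Poisson counts: with several McMaster chamber counts (hypothetical values) and a treatment threshold, the total count is Poisson under the null hypothesis.

```python
from scipy import stats

counts = [3, 5, 2, 4, 6]     # hypothetical eggs per chamber
lam0 = 3.0                   # decision threshold, eggs per chamber

total, k = sum(counts), len(counts)
# Under H0 (mean <= lam0) the total is Poisson with mean k * lam0;
# the one-sided p-value is P(Total >= observed total).
p_value = stats.poisson.sf(total - 1, k * lam0)
print(f"total={total}, one-sided p-value={p_value:.3f}")
```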

  15. Node degree distribution in spanning trees

    NASA Astrophysics Data System (ADS)

    Pozrikidis, C.

    2016-03-01

    A method is presented for computing the number of spanning trees involving one link or a specified group of links, and excluding another link or a specified group of links, in a network described by a simple graph, in terms of derivatives of the spanning-tree generating function defined with respect to the eigenvalues of the Kirchhoff (weighted Laplacian) matrix. The method is applied to deduce the node degree distribution in a complete or randomized set of spanning trees of an arbitrary network. An important feature of the proposed method is that the explicit construction of spanning trees is not required. It is shown that the node degree distribution in the spanning trees of the complete network is described by the binomial distribution. Numerical results are presented for the node degree distribution in square, triangular, and honeycomb lattices.
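
    The spanning-tree counts underlying the method can be checked against the matrix-tree theorem: the number of spanning trees equals the product of the nonzero Kirchhoff (Laplacian) eigenvalues divided by the number of nodes. A minimal sketch for the complete graph K4, where Cayley's formula gives 4^(4-2) = 16:

```python
import numpy as np

A = np.ones((4, 4)) - np.eye(4)          # adjacency matrix of K4
L = np.diag(A.sum(axis=1)) - A           # Kirchhoff (Laplacian) matrix
eig = np.sort(np.linalg.eigvalsh(L))     # eigenvalues, the smallest being 0
n_trees = np.prod(eig[1:]) / len(A)      # matrix-tree theorem
print(round(n_trees))                    # -> 16
```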

  16. Ecological effects of the invasive giant madagascar day gecko on endemic mauritian geckos: applications of binomial-mixture and species distribution models.

    PubMed

    Buckland, Steeves; Cole, Nik C; Aguirre-Gutiérrez, Jesús; Gallagher, Laura E; Henshaw, Sion M; Besnard, Aurélien; Tucker, Rachel M; Bachraz, Vishnu; Ruhomaun, Kevin; Harris, Stephen

    2014-01-01

    The invasion of the giant Madagascar day gecko Phelsuma grandis has increased the threats to the four endemic Mauritian day geckos (Phelsuma spp.) that have survived on mainland Mauritius. We had two main aims: (i) to predict the spatial distribution and overlap of P. grandis and the endemic geckos at a landscape level; and (ii) to investigate the effects of P. grandis on the abundance and risks of extinction of the endemic geckos at a local scale. An ensemble forecasting approach was used to predict the spatial distribution and overlap of P. grandis and the endemic geckos. We used hierarchical binomial mixture models and repeated visual estimate surveys to calculate the abundance of the endemic geckos in sites with and without P. grandis. The predicted range of each species varied from 85 km2 to 376 km2. Sixty percent of the predicted range of P. grandis overlapped with the combined predicted ranges of the four endemic geckos; 15% of the combined predicted ranges of the four endemic geckos overlapped with P. grandis. Levin's niche breadth varied from 0.140 to 0.652 between P. grandis and the four endemic geckos. The abundance of endemic geckos was 89% lower in sites with P. grandis compared to sites without P. grandis, and the endemic geckos had been extirpated at four of ten sites we surveyed with P. grandis. Species Distribution Modelling, together with the breadth metrics, predicted that P. grandis can partly share the equivalent niche with endemic species and survive in a range of environmental conditions. We provide strong evidence that smaller endemic geckos are unlikely to survive in sympatry with P. grandis. This is a cause of concern in both Mauritius and other countries with endemic species of Phelsuma.

  17. Poisson, Poisson-gamma and zero-inflated regression models of motor vehicle crashes: balancing statistical fit and theory.

    PubMed

    Lord, Dominique; Washington, Simon P; Ivan, John N

    2005-01-01

    There has been considerable research conducted over the last 20 years focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions with each modeling approach, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states, perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of "excess" zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriately model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a sequence of Bernoulli trials with unequal probabilities of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data. A simulation experiment is then conducted to demonstrate how crash data give rise to the "excess" zeros frequently observed in crash data. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed, and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales and not an underlying dual-state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros.
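
    A minimal simulation sketch of the zero-generation argument above (parameter values hypothetical): site-to-site heterogeneity plus low exposure yields more zeros than a single fitted Poisson mean predicts, with no dual-state process anywhere in the data-generating mechanism.

```python
import numpy as np

rng = np.random.default_rng(2)
site_rates = rng.gamma(shape=0.8, scale=0.5, size=50_000)  # unobserved heterogeneity
crashes = rng.poisson(site_rates)                          # low-exposure counts

zero_obs = (crashes == 0).mean()
zero_pois = np.exp(-crashes.mean())    # zeros implied by one Poisson mean
print(f"observed zeros {zero_obs:.3f} vs Poisson-predicted {zero_pois:.3f}")
```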

  18. Demonstration of fundamental statistics by studying timing of electronics signals in a physics-based laboratory

    NASA Astrophysics Data System (ADS)

    Beach, Shaun E.; Semkow, Thomas M.; Remling, David J.; Bradt, Clayton J.

    2017-07-01

    We have developed accessible methods to demonstrate fundamental statistics in several phenomena, in the context of teaching electronic signal processing in a physics-based college-level curriculum. A relationship between the exponential time-interval distribution and Poisson counting distribution for a Markov process with constant rate is derived in a novel way and demonstrated using nuclear counting. Negative binomial statistics is demonstrated as a model for overdispersion and justified by the effect of electronic noise in nuclear counting. The statistics of digital packets on a computer network are shown to be compatible with the fractal-point stochastic process leading to a power-law as well as generalized inverse Gaussian density distributions of time intervals between packets.
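
    A minimal sketch of the first relationship above: simulating exponentially distributed time intervals (a constant-rate Markov process) and counting events in fixed windows reproduces the Poisson counting distribution. The rate and window are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
rate, t_window = 4.0, 1.0

arrivals = np.cumsum(rng.exponential(1 / rate, 200_000))    # event times
n_win = int(arrivals[-1] // t_window)
counts = np.bincount((arrivals // t_window).astype(int),
                     minlength=n_win)[:n_win]               # events per window

k = np.arange(6)
emp = np.bincount(counts, minlength=6)[:6] / len(counts)
print("empirical:", np.round(emp, 3))
print("Poisson  :", np.round(stats.poisson.pmf(k, rate * t_window), 3))
```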

  19. Variability in results from negative binomial models for Lyme disease measured at different spatial scales.

    PubMed

    Tran, Phoebe; Waller, Lance

    2015-01-01

    Lyme disease has been the subject of many studies due to increasing incidence rates year after year and the severe complications that can arise in later stages of the disease. Negative binomial models have been used to model Lyme disease in the past with some success. However, there has been little focus on the reliability and consistency of these models when they are used to study Lyme disease at multiple spatial scales. This study seeks to explore how sensitive/consistent negative binomial models are when they are used to study Lyme disease at different spatial scales (at the regional and sub-regional levels). The study area includes the thirteen states in the Northeastern United States with the highest Lyme disease incidence during the 2002-2006 period. Lyme disease incidence at county level for the period of 2002-2006 was linked with several previously identified key landscape and climatic variables in a negative binomial regression model for the Northeastern region and two smaller sub-regions (the New England sub-region and the Mid-Atlantic sub-region). This study found that negative binomial models, indeed, were sensitive/inconsistent when used at different spatial scales. We discuss various plausible explanations for such behavior of negative binomial models. Further investigation of the inconsistency and sensitivity of negative binomial models when used at different spatial scales is important for not only future Lyme disease studies and Lyme disease risk assessment/management but any study that requires use of this model type in a spatial context. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Surveillance of Physicians Causing Potential Drug-Drug Interactions in Ambulatory Care: A Pilot Study in Switzerland

    PubMed Central

    Bucher, Heiner C.; Achermann, Rita; Stohler, Nadja; Meier, Christoph R.

    2016-01-01

    Objectives We analysed potential drug-drug interactions (DDI) in ambulatory care in Switzerland, based on claims data from three large health insurers in 2010, to identify physicians whose prescription behaviour differed from peers of the same specialty. Methods We analysed contraindicated or potentially contraindicated DDI from the national drug formulary, calculated for each physician the ratio of the number of patients with a potential DDI to the number of patients at risk, and used a zero-inflated binomial distribution to correct for the inflated number of observations with no DDI. We then calculated, for each physician, the probability that the number of potential DDI caused was due to chance, classifying it as unlikely (p-value <0.05 and ≥0.01) or very unlikely (p-value <0.01). Results Of 1,607,233 females and 1,525,307 males, 1.3% and 1.2%, respectively, were exposed to at least one potential DDI during 12 months. When analysing the 40 most common DDI, 598 and 416 of 18,297 physicians (3.3% and 2.3%) caused potential DDI at a frequency unlikely (p<0.05 and p≥0.01) or very unlikely (p<0.01) to be explained by chance. Patients cared for by general practitioners and cardiologists had the lowest probability (0.20 and 0.26) of not being exposed to DDI. Conclusions Contraindicated or potentially contraindicated DDI are frequent in ambulatory care in Switzerland, with a small proportion of physicians causing potential DDI at a frequency that is very unlikely to be explained by chance when compared to peers of the same specialty. PMID:26808430

  1. The surprising benefit of passive-aggressive behaviour at Christmas parties: being crowned king of the crackers.

    PubMed

    Huang, B Emma; Clifford, David; Lê Cao, Kim-Anh

    2014-12-11

    To test the effects of technique and attitude in pulling Christmas crackers. A binomial trial conducted at a Christmas-in-July dinner party involving five anonymous dinner guests, including two of the authors. Number of wins achieved by different strategies, with a win defined as securing the larger portion of the cracker. The previously "guaranteed" strategy for victory, employing a downwards angle towards the puller, failed to differentiate itself from random chance (win rate, 6/15; probability of winning, 0.40; 95% CI, 0.15-0.65). A novel passive-aggressive strategy, in which one individual just holds on without pulling, provided a significant advantage (win rate, 11/12; probability of winning, 0.92; 95% CI, 0.76-1.00). The passive-aggressive strategy of failing to pull has a high rate of success at winning Christmas crackers; however, excessive adoption of this approach will result in a complete failure, with no winners at all.

  2. Derivation of the expressions for γ50 and D50 for different individual TCP and NTCP models

    NASA Astrophysics Data System (ADS)

    Stavreva, N.; Stavrev, P.; Warkentin, B.; Fallone, B. G.

    2002-10-01

    This paper presents a complete set of formulae for the position (D50) and the normalized slope (γ50) of the dose-response relationship based on the most commonly used radiobiological models for tumours as well as for normal tissues. The functional subunit response models (critical element and critical volume) are used in the derivation of the formulae for the normal tissue. Binomial statistics are used to describe the tumour control probability, the functional subunit response as well as the normal tissue complication probability. The formulae are derived for the single hit and linear quadratic models of cell kill in terms of the number of fractions and dose per fraction. It is shown that the functional subunit models predict very steep, almost step-like, normal tissue individual dose-response relationships. Furthermore, the formulae for the normalized gradient depend on the cellular parameters α and β when written in terms of number of fractions, but not when written in terms of dose per fraction.

  3. Analysis of dengue fever risk using geostatistics model in bone regency

    NASA Astrophysics Data System (ADS)

    Amran, Stang, Mallongi, Anwar

    2017-03-01

    This research aims to analyse dengue fever risk using a geostatistical model in Bone Regency. Risk levels of dengue fever are represented by a parameter of the binomial distribution. The effects of temperature, rainfall, elevation, and larval abundance are investigated through the geostatistical model. A Bayesian hierarchical method is used for estimation. Using dengue fever data from eleven locations, this research shows that temperature and rainfall have a significant effect on dengue fever risk in Bone Regency.

  4. Measurement error and outcome distributions: Methodological issues in regression analyses of behavioral coding data.

    PubMed

    Holsclaw, Tracy; Hallgren, Kevin A; Steyvers, Mark; Smyth, Padhraic; Atkins, David C

    2015-12-01

    Behavioral coding is increasingly used for studying mechanisms of change in psychosocial treatments for substance use disorders (SUDs). However, behavioral coding data typically include features that can be problematic in regression analyses, including measurement error in independent variables, non-normal distributions of count outcome variables, and conflation of predictor and outcome variables with third variables, such as session length. Methodological research in econometrics has shown that these issues can lead to biased parameter estimates, inaccurate standard errors, and increased Type I and Type II error rates, yet these statistical issues are not widely known within SUD treatment research, or more generally, within psychotherapy coding research. Using minimally technical language intended for a broad audience of SUD treatment researchers, the present paper illustrates the nature in which these data issues are problematic. We draw on real-world data and simulation-based examples to illustrate how these data features can bias estimation of parameters and interpretation of models. A weighted negative binomial regression is introduced as an alternative to ordinary linear regression that appropriately addresses the data characteristics common to SUD treatment behavioral coding data. We conclude by demonstrating how to use and interpret these models with data from a study of motivational interviewing. SPSS and R syntax for weighted negative binomial regression models is included in online supplemental materials. (c) 2016 APA, all rights reserved.

  5. Measurement error and outcome distributions: Methodological issues in regression analyses of behavioral coding data

    PubMed Central

    Holsclaw, Tracy; Hallgren, Kevin A.; Steyvers, Mark; Smyth, Padhraic; Atkins, David C.

    2015-01-01

    Behavioral coding is increasingly used for studying mechanisms of change in psychosocial treatments for substance use disorders (SUDs). However, behavioral coding data typically include features that can be problematic in regression analyses, including measurement error in independent variables, non-normal distributions of count outcome variables, and conflation of predictor and outcome variables with third variables, such as session length. Methodological research in econometrics has shown that these issues can lead to biased parameter estimates, inaccurate standard errors, and increased type-I and type-II error rates, yet these statistical issues are not widely known within SUD treatment research, or more generally, within psychotherapy coding research. Using minimally-technical language intended for a broad audience of SUD treatment researchers, the present paper illustrates the nature in which these data issues are problematic. We draw on real-world data and simulation-based examples to illustrate how these data features can bias estimation of parameters and interpretation of models. A weighted negative binomial regression is introduced as an alternative to ordinary linear regression that appropriately addresses the data characteristics common to SUD treatment behavioral coding data. We conclude by demonstrating how to use and interpret these models with data from a study of motivational interviewing. SPSS and R syntax for weighted negative binomial regression models is included in supplementary materials. PMID:26098126

  6. [Spatial epidemiological study on malaria epidemics in Hainan province].

    PubMed

    Wen, Liang; Shi, Run-He; Fang, Li-Qun; Xu, De-Zhong; Li, Cheng-Yi; Wang, Yong; Yuan, Zheng-Quan; Zhang, Hui

    2008-06-01

    To better understand the characteristics of the spatial distribution of malaria epidemics in Hainan province and to explore the relationship between malaria epidemics and environmental factors, as well as to develop a prediction model for malaria epidemics. Data on malaria and meteorological factors were collected in all 19 counties in Hainan province from May to October 2000, and the proportions of land use types of these counties in this period were extracted from a digital map of land use in Hainan province. Land surface temperatures (LST) were extracted from MODIS images and elevations of these counties were extracted from a DEM of Hainan province. The coefficients of correlation between malaria incidence and these environmental factors were then calculated with SPSS 13.0, and negative binomial regression analysis was done using SAS 9.0. The incidence of malaria showed (1) positive correlations with elevation and the proportions of forest land and grassland area; (2) negative correlations with the proportions of cultivated area, urban and rural residential and industrial enterprise area, and with LST; (3) no correlation with meteorological factors or the proportions of water area and unused land area. The prediction model of malaria derived from the negative binomial regression analysis was: I (monthly, unit: 1/1,000,000) = exp(-1.672 - 0.399 × LST). The spatial distribution of malaria epidemics was associated with some environmental factors, and a prediction model of malaria epidemics could be developed with indexes extracted from satellite remote sensing images.

  7. nbCNV: a multi-constrained optimization model for discovering copy number variants in single-cell sequencing data.

    PubMed

    Zhang, Changsheng; Cai, Hongmin; Huang, Jingying; Song, Yan

    2016-09-17

    Variations in DNA copy number make an important contribution to the development of several diseases, including autism, schizophrenia and cancer. Single-cell sequencing technology allows the dissection of genomic heterogeneity at the single-cell level, thereby providing important evolutionary information about cancer cells. In contrast to traditional bulk sequencing, single-cell sequencing requires the amplification of the whole genome of a single cell to accumulate enough material for sequencing. However, the amplification process inevitably introduces amplification bias, resulting in an over-dispersed portion of the sequencing data. A recent study has shown that the over-dispersed portion of single-cell sequencing data can be well modelled by negative binomial distributions. We developed a read-depth-based method, nbCNV, to detect copy number variants (CNVs). The nbCNV method uses two constraints, sparsity and smoothness, to fit the CNV patterns under the assumption that the read signals follow a negative binomial distribution. The problem of CNV detection was formulated as a quadratic optimization problem, and was solved by an efficient numerical solution based on the classical alternating direction minimization method. Extensive experiments to compare nbCNV with existing benchmark models were conducted on both simulated data and empirical single-cell sequencing data. The results of those experiments demonstrate that nbCNV achieves superior performance and high robustness for the detection of CNVs in single-cell sequencing data.

  8. Genetic parameters for racing records in trotters using linear and generalized linear models.

    PubMed

    Suontama, M; van der Werf, J H J; Juga, J; Ojala, M

    2012-09-01

    Heritability and repeatability and genetic and phenotypic correlations were estimated for trotting race records with linear and generalized linear models using 510,519 records on 17,792 Finnhorses and 513,161 records on 25,536 Standardbred trotters. Heritability and repeatability were estimated for single racing time and earnings traits with linear models, and logarithmic scale was used for racing time and fourth-root scale for earnings to correct for nonnormality. Generalized linear models with a gamma distribution were applied for single racing time and with a multinomial distribution for single earnings traits. In addition, genetic parameters for annual earnings were estimated with linear models on the observed and fourth-root scales. Racing success traits of single placings, winnings, breaking stride, and disqualifications were analyzed using generalized linear models with a binomial distribution. Estimates of heritability were greatest for racing time, which ranged from 0.32 to 0.34. Estimates of heritability were low for single earnings with all distributions, ranging from 0.01 to 0.09. Annual earnings were closer to normal distribution than single earnings. Heritability estimates were moderate for annual earnings on the fourth-root scale, 0.19 for Finnhorses and 0.27 for Standardbred trotters. Heritability estimates for binomial racing success variables ranged from 0.04 to 0.12, being greatest for winnings and least for breaking stride. Genetic correlations among racing traits were high, whereas phenotypic correlations were mainly low to moderate, except correlations between racing time and earnings were high. On the basis of a moderate heritability and moderate to high repeatability for racing time and annual earnings, selection of horses for these traits is effective when based on a few repeated records. Because of high genetic correlations, direct selection for racing time and annual earnings would also result in good genetic response in racing success.

  9. Multiplicity distributions of charged hadrons in νp and ν̄p charged current interactions

    NASA Astrophysics Data System (ADS)

    Jones, G. T.; Jones, R. W. L.; Kennedy, B. W.; Morrison, D. R. O.; Mobayyen, M. M.; Wainstein, S.; Aderholz, M.; Hantke, D.; Katz, U. F.; Kern, J.; Schmitz, N.; Wittek, W.; Borner, H. P.; Myatt, G.; Radojicic, D.; Burke, S.

    1992-03-01

    Using data on νp and ν̄p charged current interactions from a bubble chamber experiment with BEBC at CERN, the multiplicity distributions of charged hadrons are investigated. The analysis is based on ~20000 events with incident ν and ~10000 events with incident ν̄. The invariant mass W of the total hadronic system ranges from 3 GeV to ~14 GeV. The experimental multiplicity distributions are fitted by the binomial function (for different intervals of W and in different intervals of the rapidity y), by the Lévy function and the lognormal function. All three parametrizations give acceptable values of χ2. For fixed W, forward and backward multiplicities are found to be uncorrelated. The normalized moments of the charged multiplicity distributions are measured as a function of W. They show a violation of KNO scaling.

  10. Analytical theory of mesoscopic Bose-Einstein condensation in an ideal gas

    NASA Astrophysics Data System (ADS)

    Kocharovsky, Vitaly V.; Kocharovsky, Vladimir V.

    2010-03-01

    We find the universal structure and scaling of the Bose-Einstein condensation (BEC) statistics and thermodynamics (Gibbs free energy, average energy, heat capacity) for a mesoscopic canonical-ensemble ideal gas in a trap with an arbitrary number of atoms, any volume, and any temperature, including the whole critical region. We identify a universal constraint-cutoff mechanism that makes BEC fluctuations strongly non-Gaussian and is responsible for all unusual critical phenomena of the BEC phase transition in the ideal gas. The main result is an analytical solution to the problem of critical phenomena. It is derived by, first, calculating analytically the universal probability distribution of the noncondensate occupation, or a Landau function, and then using it for the analytical calculation of the universal functions for the particular physical quantities via the exact formulas which express the constraint-cutoff mechanism. We find asymptotics of that analytical solution as well as its simple analytical approximations which describe the universal structure of the critical region in terms of the parabolic cylinder or confluent hypergeometric functions. The obtained results for the order parameter, all higher-order moments of BEC fluctuations, and thermodynamic quantities perfectly match the known asymptotics outside the critical region for both low and high temperature limits. We suggest two- and three-level trap models of BEC and find their exact solutions in terms of the cutoff negative binomial distribution (which tends to the cutoff gamma distribution in the continuous limit) and the confluent hypergeometric distribution, respectively. Also, we present an exactly solvable cutoff Gaussian model of BEC in a degenerate interacting gas. All these exact solutions confirm the universality and constraint-cutoff origin of the strongly non-Gaussian BEC statistics. We introduce a regular refinement scheme for the condensate statistics approximations on the basis of the infrared universality of higher-order cumulants and the method of superposition and show how to model BEC statistics in the actual traps. In particular, we find that the three-level trap model with matching the first four or five cumulants is enough to yield remarkably accurate results for all interesting quantities in the whole critical region. We derive an exact multinomial expansion for the noncondensate occupation probability distribution and find its high-temperature asymptotics (Poisson distribution) and corrections to it. Finally, we demonstrate that the critical exponents and a few known terms of the Taylor expansion of the universal functions, which were calculated previously from fitting the finite-size simulations within the phenomenological renormalization-group theory, can be easily obtained from the presented full analytical solutions for the mesoscopic BEC as certain approximations in the close vicinity of the critical point.

  11. New Class of Quantum Error-Correcting Codes for a Bosonic Mode

    NASA Astrophysics Data System (ADS)

    Michael, Marios H.; Silveri, Matti; Brierley, R. T.; Albert, Victor V.; Salmilehto, Juha; Jiang, Liang; Girvin, S. M.

    2016-07-01

    We construct a new class of quantum error-correcting codes for a bosonic mode, which are advantageous for applications in quantum memories, communication, and scalable computation. These "binomial quantum codes" are formed from a finite superposition of Fock states weighted with binomial coefficients. The binomial codes can exactly correct errors that are polynomial up to a specific degree in bosonic creation and annihilation operators, including amplitude damping and displacement noise as well as boson addition and dephasing errors. For realistic continuous-time dissipative evolution, the codes can perform approximate quantum error correction to any given order in the time step between error detection measurements. We present an explicit approximate quantum error recovery operation based on projective measurements and unitary operations. The binomial codes are tailored for detecting boson loss and gain errors by means of measurements of the generalized number parity. We discuss optimization of the binomial codes and demonstrate that by relaxing the parity structure, codes with even lower unrecoverable error rates can be achieved. The binomial codes are related to existing two-mode bosonic codes, but offer the advantage of requiring only a single bosonic mode to correct amplitude damping as well as the ability to correct other errors. Our codes are similar in spirit to "cat codes" based on superpositions of the coherent states but offer several advantages such as smaller mean boson number, exact rather than approximate orthonormality of the code words, and an explicit unitary operation for repumping energy into the bosonic mode. The binomial quantum codes are realizable with current superconducting circuit technology, and they should prove useful in other quantum technologies, including bosonic quantum memories, photonic quantum communication, and optical-to-microwave up- and down-conversion.

  12. Estimating spatial and temporal components of variation in count data using negative binomial mixed models

    USGS Publications Warehouse

    Irwin, Brian J.; Wagner, Tyler; Bence, James R.; Kepler, Megan V.; Liu, Weihai; Hayes, Daniel B.

    2013-01-01

    Partitioning total variability into its component temporal and spatial sources is a powerful way to better understand time series and elucidate trends. The data available for such analyses of fish and other populations are usually nonnegative integer counts of the number of organisms, often dominated by many low values with few observations of relatively high abundance. These characteristics are not well approximated by the Gaussian distribution. We present a detailed description of a negative binomial mixed-model framework that can be used to model count data and quantify temporal and spatial variability. We applied these models to data from four fishery-independent surveys of Walleyes Sander vitreus across the Great Lakes basin. Specifically, we fitted models to gill-net catches from Wisconsin waters of Lake Superior; Oneida Lake, New York; Saginaw Bay in Lake Huron, Michigan; and Ohio waters of Lake Erie. These long-term monitoring surveys varied in overall sampling intensity, the total catch of Walleyes, and the proportion of zero catches. Parameter estimation included the negative binomial scaling parameter, and we quantified the random effects as the variations among gill-net sampling sites, the variations among sampled years, and site × year interactions. This framework (i.e., the application of a mixed model appropriate for count data in a variance-partitioning context) represents a flexible approach that has implications for monitoring programs (e.g., trend detection) and for examining the potential of individual variance components to serve as response metrics to large-scale anthropogenic perturbations or ecological changes.

  13. Comparison of robustness to outliers between robust poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study.

    PubMed

    Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P

    2014-06-26

    To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
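
    A minimal sketch (not the paper's simulation) of the two estimators on hypothetical data, using statsmodels: a Poisson GLM with robust standard errors and a binomial GLM with a log link.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.binomial(1, 0.5, 2_000)                 # binary exposure
p = np.minimum(0.2 * np.exp(0.5 * x), 1.0)      # true risk ratio exp(0.5)
y = rng.binomial(1, p)                          # common binary outcome
X = sm.add_constant(x)

pois = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")
logb = sm.GLM(y, X,
              family=sm.families.Binomial(link=sm.families.links.Log())).fit()
print("robust Poisson RR:", np.exp(pois.params[1]))
print("log-binomial RR  :", np.exp(logb.params[1]))
```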

  14. Speech-discrimination scores modeled as a binomial variable.

    PubMed

    Thornton, A R; Raffin, M J

    1978-09-01

    Many studies have reported variability data for tests of speech discrimination, and the disparate results of these studies have not been given a simple explanation. Arguments over the relative merits of 25- vs 50-word tests have ignored the basic mathematical properties inherent in the use of percentage scores. The present study models performance on clinical tests of speech discrimination as a binomial variable. A binomial model was developed, and some of its characteristics were tested against data from 4120 scores obtained on the CID Auditory Test W-22. A table for determining significant deviations between scores was generated and compared to observed differences in half-list scores for the W-22 tests. Good agreement was found between predicted and observed values. Implications of the binomial characteristics of speech-discrimination scores are discussed.
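
    A minimal sketch of the binomial reasoning above (normal approximation; the paper's tables rest on exact binomial arguments): deciding whether two percentage scores differ by more than binomial sampling error.

```python
import numpy as np

def scores_differ(k1, n1, k2, n2, z=1.96):
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)                       # pooled proportion
    se = np.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))   # SE of the difference
    return abs(p1 - p2) > z * se

# 84% vs 72% on two hypothetical 25-word half-lists: not significant.
print(scores_differ(21, 25, 18, 25))
```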

  15. Possibility and Challenges of Conversion of Current Virus Species Names to Linnaean Binomials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Postler, Thomas S.; Clawson, Anna N.; Amarasinghe, Gaya K.

    Botanical, mycological, zoological, and prokaryotic species names follow the Linnaean format, consisting of an italicized Latinized binomen with a capitalized genus name and a lower case species epithet (e.g., Homo sapiens). Virus species names, however, do not follow a uniform format, and, even when binomial, are not Linnaean in style. In this thought exercise, we attempted to convert all currently official names of species included in the virus family Arenaviridae and the virus order Mononegavirales to Linnaean binomials, and to identify and address associated challenges and concerns. Surprisingly, this endeavor was not as complicated or time-consuming as even the authors of this article expected when conceiving the experiment. [Arenaviridae; binomials; ICTV; International Committee on Taxonomy of Viruses; Mononegavirales; virus nomenclature; virus taxonomy.]

  16. Mutational Analysis of the Adaptor Protein 2 Sigma Subunit (AP2S1) Gene: Search for Autosomal Dominant Hypocalcemia Type 3 (ADH3)

    PubMed Central

    Rogers, Angela; Nesbit, M. Andrew; Hannan, Fadil M.; Howles, Sarah A.; Gorvin, Caroline M.; Cranston, Treena; Allgrove, Jeremy; Bevan, John S.; Bano, Gul; Brain, Caroline; Datta, Vipan; Grossman, Ashley B.; Hodgson, Shirley V.; Izatt, Louise; Millar-Jones, Lynne; Pearce, Simon H.; Robertson, Lisa; Selby, Peter L.; Shine, Brian; Snape, Katie; Warner, Justin

    2014-01-01

    Context: Autosomal dominant hypocalcemia (ADH) types 1 and 2 are due to calcium-sensing receptor (CASR) and G-protein subunit-α11 (GNA11) gain-of-function mutations, respectively, whereas CASR and GNA11 loss-of-function mutations result in familial hypocalciuric hypercalcemia (FHH) types 1 and 2, respectively. Loss-of-function mutations of the adaptor protein-2 sigma subunit (AP2σ2), encoded by AP2S1, cause FHH3, and we therefore sought gain-of-function AP2S1 mutations that may cause an additional form of ADH, which we designated ADH3. Objective: The objective of the study was to investigate the hypothesis that gain-of-function AP2S1 mutations may cause ADH3. Design: The sample size required for the detection of at least one mutation with a greater than 95% likelihood was determined by binomial probability analysis. Nineteen patients (including six familial cases) with hypocalcemia in association with low or normal serum PTH concentrations, consistent with ADH, but who did not have CASR or GNA11 mutations, were ascertained. Leukocyte DNA was used for sequence and copy number variation analysis of AP2S1. Results: Binomial probability analysis, using the assumption that AP2S1 mutations would occur in hypocalcemic patients at a prevalence of 20%, which is observed in FHH patients without CASR or GNA11 mutations, indicated that the likelihood of detecting at least one AP2S1 mutation was greater than 95% and greater than 98% in sample sizes of 14 and 19 hypocalcemic patients, respectively. AP2S1 mutations and copy number variations were not detected in the 19 hypocalcemic patients. Conclusion: The absence of AP2S1 abnormalities in hypocalcemic patients suggests that ADH3 may not occur or otherwise represents a rare hypocalcemic disorder. PMID:24708097
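
    The binomial sample-size argument in the abstract can be reproduced directly: the probability of detecting at least one mutation among n patients at 20% prevalence is 1 - 0.8^n.

```python
def p_detect(n, prev=0.20):
    # Probability of at least one mutation carrier among n patients.
    return 1 - (1 - prev) ** n

print(f"n=14: {p_detect(14):.3f}")   # > 0.95, as stated above
print(f"n=19: {p_detect(19):.3f}")   # > 0.98, as stated above
```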

  17. Detection of influenza-like illness aberrations by directly monitoring Pearson residuals of fitted negative binomial regression models.

    PubMed

    Chan, Ta-Chien; Teng, Yung-Chu; Hwang, Jing-Shiang

    2015-02-21

    Emerging novel influenza outbreaks have increasingly been a threat to the public and a major concern of public health departments. Real-time data in seamless surveillance systems such as health insurance claims data for influenza-like illnesses (ILI) are ready for analysis, making it highly desirable to develop practical techniques to analyze such readymade data for outbreak detection so that the public can receive timely influenza epidemic warnings. This study proposes a simple and effective approach to analyze area-based health insurance claims data including outpatient and emergency department (ED) visits for early detection of any aberrations of ILI. The health insurance claims data during 2004-2009 from a national health insurance research database were used for developing early detection methods. The proposed approach fitted the daily new ILI visits and monitored the Pearson residuals directly for aberration detection. First, negative binomial regression was used for both outpatient and ED visits to adjust for potentially influential factors such as holidays, weekends, seasons, temporal dependence and temperature. Second, if the Pearson residuals exceeded 1.96, aberration signals were issued. The empirical validation of the model was done in 2008 and 2009. In addition, we designed a simulation study to compare the time of outbreak detection, non-detection probability and false alarm rate between the proposed method and modified CUSUM. The model successfully detected the aberrations of 2009 pandemic (H1N1) influenza virus in northern, central and southern Taiwan. The proposed approach was more sensitive in identifying aberrations in ED visits than those in outpatient visits. Simulation studies demonstrated that the proposed approach could detect the aberrations earlier, and with lower non-detection probability and mean false alarm rate in detecting aberrations compared to modified CUSUM methods. The proposed simple approach was able to filter out temporal trends, adjust for temperature, and issue warning signals for the first wave of the influenza epidemic in a timely and accurate manner.
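
    A minimal sketch of the detection rule described above, on simulated stand-in data: fit a negative binomial regression to daily visit counts with weekend and seasonal covariates, then flag days whose Pearson residual exceeds 1.96.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
days = np.arange(365)
weekend = (days % 7 >= 5).astype(float)
season = np.sin(2 * np.pi * days / 365)
mu = np.exp(3.0 + 0.3 * weekend + 0.5 * season)
y = rng.negative_binomial(n=10, p=10 / (10 + mu))     # overdispersed counts

X = sm.add_constant(np.column_stack([weekend, season]))
fit = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=0.1)).fit()
alerts = np.where(fit.resid_pearson > 1.96)[0]
print("aberration signals on days:", alerts[:10])
```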

  18. The role of vaspin as a predictor of coronary angiography result in SCAD (stable coronary artery disease) patients.

    PubMed

    Stančík, Matej; Ságová, Ivana; Kantorová, Ema; Mokáň, Marián

    2017-05-08

    The role of vaspin in the pathogenesis of stable coronary artery disease (SCAD) has been repeatedly addressed in clinical studies. However, from the point of view of clinical practice, the results of earlier studies are still inconclusive. The data of 106 SCAD patients who received coronary angiography and 85 coronary artery disease-free controls were collected and analysed. The patients were divided into subgroups according to their pre-test probability (PTP) and according to the result of coronary angiography. Fasting vaspin concentrations were compared between subgroups of SCAD patients and between the target group and controls. The effect of age and smoking on the result of coronary angiography was compared to the effect of vaspin using binomial regression. We did not find a significant difference in vaspin level between the target group and controls. Unless the pre-test probability was taken into account, we did not find a vaspin difference in the target group when dividing patients on the basis of the presence or absence of significant coronary stenosis. In the subgroup of SCAD patients with PTP between 15% and 65%, those with significant coronary stenoses had a higher mean vaspin concentration (0.579 ± 0.898 ng/ml) than patients without significant stenoses (0.379 ± 0.732 ng/ml) (t = -2.595; p = 0.012; d = 0.658; 1-β = 0.850). Age, smoking status and vaspin significantly contributed to the HSCS prediction in the binomial regression model in patients with low PTP (OR: 1.1, 4.9, 8.7, respectively). According to our results, vaspin cannot be used as an independent marker for the presence of CAD in the general population. However, our results indicate that measuring vaspin in SCAD patients might be clinically useful in patients with PTP below 66%.

  19. [Individual variation in the frequency of chromosome aberrations under the influence of chemical mutagens. I. Inter-cultural and inter-individual variations in the effect of mutagens on human lymphocytes].

    PubMed

    Iakovenko, K N; Tarusina, T O

    1976-01-01

    The distribution law of human peripheral blood cultures with respect to sensitivity to thiophosphamide was studied. In the first experiment the blood of one person was used; in the second, blood from different persons. "The percent of aberrant cells" and "the number of chromosome breaks per 100 cells" were scored. The distribution law of the cultures in all the experiments was found to be normal. Analysis of the variances of the percent of aberrant cells showed that the distribution of the cultures received from one donor corresponded to the binomial distribution, and that of the cultures received from different donors to the Poisson distribution.

  20. Maximally Informative Stimuli and Tuning Curves for Sigmoidal Rate-Coding Neurons and Populations

    NASA Astrophysics Data System (ADS)

    McDonnell, Mark D.; Stocks, Nigel G.

    2008-08-01

    A general method for deriving maximally informative sigmoidal tuning curves for neural systems with small normalized variability is presented. The optimal tuning curve is a nonlinear function of the cumulative distribution function of the stimulus and depends on the mean-variance relationship of the neural system. The derivation is based on a known relationship between Shannon's mutual information and Fisher information, and the optimality of Jeffreys' prior. It relies on the existence of closed-form solutions to the converse problem of optimizing the stimulus distribution for a given tuning curve. It is shown that maximum mutual information corresponds to constant Fisher information only if the stimulus is uniformly distributed. As an example, the case of sub-Poisson binomial firing statistics is analyzed in detail.

  1. Radial basis function and its application in tourism management

    NASA Astrophysics Data System (ADS)

    Hu, Shan-Feng; Zhu, Hong-Bin; Zhao, Lei

    2018-05-01

    In this work, several applications and the performance of the radial basis function (RBF) are briefly reviewed. After that, the binomial function combined with three different RBFs, namely the multiquadric (MQ), inverse quadric (IQ) and inverse multiquadric (IMQ) distributions, is adopted to model the tourism data of Huangshan in China. Simulation results showed that all the models match the sample data very well. Among the three models, the IMQ-RBF model is found to be more suitable for forecasting the tourist flow.
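
    A hedged sketch of RBF interpolation with the three kernels named above; the tourism series is replaced here by synthetic monthly data, and the shape parameter c is an assumption.

    ```python
    import numpy as np

    def mq(r, c=1.0):   return np.sqrt(r**2 + c**2)        # multiquadric
    def iq(r, c=1.0):   return 1.0 / (r**2 + c**2)          # inverse quadric
    def imq(r, c=1.0):  return 1.0 / np.sqrt(r**2 + c**2)   # inverse multiquadric

    def rbf_fit_predict(x, y, x_new, kernel):
        A = kernel(np.abs(x[:, None] - x[None, :]))   # interpolation matrix
        w = np.linalg.solve(A, y)                     # kernel weights
        return kernel(np.abs(x_new[:, None] - x[None, :])) @ w

    months = np.arange(12, dtype=float)               # stand-in monthly index
    visitors = (50 + 30 * np.sin(2 * np.pi * months / 12)
                + np.random.default_rng(1).normal(0, 2, 12))
    print(rbf_fit_predict(months, visitors, np.array([12.0]), imq))  # one-step forecast
    ```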

  2. The relative contribution of climatic, edaphic, and biotic drivers to risk of tree mortality from drought

    NASA Astrophysics Data System (ADS)

    March, R. G.; Moore, G. W.; Edgar, C. B.; Lawing, A. M.; Washington-Allen, R. A.

    2015-12-01

    In recorded history, the 2011 Texas Drought was comparable in severity only to a drought that occurred 300 years ago. By mid-September, 88% of the state experienced 'exceptional' conditions, with the rest experiencing 'extreme' or 'severe' drought. By recent estimates, the 2011 Texas Drought killed 6.2% of all the state's trees, at a rate nearly 9 times greater than average. The vast spatial scale and relatively uniform intensity of this drought provided an opportunity to examine the comparative interactions among forest types, terrain, and edaphic factors across major climate gradients which in 2011 were subjected to extreme drought conditions that ultimately caused massive tree mortality. We used maximum entropy modeling (Maxent) to rank environmental landscape factors with the potential to drive drought-related tree mortality and to test the assumption that the relative importance of these factors is scale-dependent. Occurrence data for dead trees were collected during the summer of 2012 from 599 field plots distributed across Texas, with 30% used for model evaluation. Bioclimatic variables, ecoregions, soil characteristics, and topographic variables were modeled with drought-killed tree occurrence. Their relative contribution to the model was taken as their relative importance in driving mortality. To test determinants at a more local scale, we examined Landsat 7 scenes in East and West Texas with moderate-resolution data for the same variables above, with the exception of climate. All models were significantly better than random in binomial tests of omission and receiver operating characteristic analyses. The modeled spatial distribution of probability of occurrence showed high probability of mortality in the east-central oak woodlands and the mixed pine-hardwood forest region in northeast Texas. Both regional and local models were dominated by biotic factors (ecoregion and forest type, respectively). Forest density and precipitation of the driest month also contributed highly to the regional model. The local models gave more importance to available water storage at the root zone and hillshade. Understanding how environmental factors drive drought-related mortality can help predict vulnerable landscapes and aid in preparing for future drought events.

  3. Predictive accuracy of particle filtering in dynamic models supporting outbreak projections.

    PubMed

    Safarishahrbijari, Anahita; Teyhouee, Aydin; Waldner, Cheryl; Liu, Juxin; Osgood, Nathaniel D

    2017-09-26

    While a new generation of computational statistics algorithms and the availability of data streams raise the potential for recurrently regrounding dynamic models with incoming observations, the effectiveness of such arrangements can be highly subject to specifics of the configuration (e.g., frequency of sampling and representation of behaviour change), and there has been little attempt to identify effective configurations. Combining dynamic models with particle filtering, we explored a solution focusing on creating quickly formulated models regrounded automatically and recurrently as new data become available. Given a latent underlying case count, we assumed that observed incident case counts followed a negative binomial distribution. In accordance with the condensation algorithm, each such observation led to updating of particle weights. We evaluated the effectiveness of various particle filtering configurations against each other and against an approach without particle filtering, according to the accuracy of the model in predicting future prevalence given data up to a certain point, as measured by a norm-based discrepancy metric. We examined the effectiveness of particle filtering under varying times between observations, negative binomial dispersion parameters, and rates with which the contact rate could evolve. We observed that more frequent observations of empirical data yielded super-linearly improved accuracy in model predictions. We further found that for the data studied here, the most favourable assumptions to make regarding the parameters associated with the negative binomial distribution and changes in contact rate were robust across observation frequency and the observation point in the outbreak. Combining dynamic models with particle filtering can perform well in projecting the future evolution of an outbreak. Most importantly, the remarkable improvements in predictive accuracy resulting from more frequent sampling suggest that investments to achieve efficient reporting mechanisms may be more than paid back by improved planning capacity. The robustness of the results on particle filter configuration in this case study suggests that it may be possible to formulate effective standard guidelines and regularized approaches for such techniques in particular epidemiological contexts. The work also tentatively suggests the potential for health decision makers to secure strong guidance when anticipating outbreak evolution for emerging infectious diseases by combining even very rough models with particle filtering methods.
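
    A condensation-style weight update with a negative binomial observation model can be sketched in a few lines; the particle representation, dispersion r, and resampling scheme below are assumptions for illustration, not the authors' model.

    ```python
    import numpy as np
    from scipy.stats import nbinom

    rng = np.random.default_rng(2)

    def nb_update(particles, observed, r=10.0):
        """Condensation step: weight each particle's latent case count by the
        NB likelihood of the observed count, then resample (multinomial)."""
        p = r / (r + particles)          # NB(r, p) then has mean = particles
        w = nbinom.pmf(observed, r, p)
        w /= w.sum()
        idx = rng.choice(len(particles), size=len(particles), p=w)
        return particles[idx]

    particles = rng.uniform(50.0, 150.0, 1000)   # prior latent case counts
    particles = nb_update(particles, observed=120)
    print("posterior mean latent count:", particles.mean())
    ```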

  4. Modeling Tetanus Neonatorum case using the regression of negative binomial and zero-inflated negative binomial

    NASA Astrophysics Data System (ADS)

    Amaliana, Luthfatul; Sa'adah, Umu; Wayan Surya Wardhani, Ni

    2017-12-01

    Tetanus Neonatorum is an infectious disease that can be prevented by immunization. The number of Tetanus Neonatorum cases in East Java Province was the highest in Indonesia through 2015. The Tetanus Neonatorum data exhibit overdispersion and a large proportion of zeros. Negative Binomial (NB) regression is an alternative method when overdispersion occurs in Poisson regression. However, data exhibiting both overdispersion and zero-inflation are more appropriately analyzed using Zero-Inflated Negative Binomial (ZINB) regression. The purposes of this study are: (1) to model Tetanus Neonatorum cases in East Java Province, with a 71.05 percent proportion of zeros, using NB and ZINB regression, and (2) to obtain the best model. The results indicate that ZINB is better than NB regression, with a smaller AIC.
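
    A minimal sketch of the NB-versus-ZINB comparison by AIC, using statsmodels on simulated counts with injected excess zeros (not the East Java data); convergence settings may need tuning in practice.

    ```python
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.discrete.discrete_model import NegativeBinomial
    from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

    rng = np.random.default_rng(3)
    n = 200
    x = rng.normal(size=n)
    X = sm.add_constant(x)

    mu = np.exp(0.5 + 0.4 * x)
    y = rng.negative_binomial(5, 5 / (5 + mu))   # overdispersed counts
    y[rng.random(n) < 0.4] = 0                   # inject excess zeros

    nb = NegativeBinomial(y, X).fit(disp=False)
    zinb = ZeroInflatedNegativeBinomialP(y, X, exog_infl=np.ones((n, 1))).fit(disp=False)
    print("NB AIC:", nb.aic, " ZINB AIC:", zinb.aic)   # smaller AIC is preferred
    ```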

  5. Assumptions of acceptance sampling and the implications for lot contamination: Escherichia coli O157 in lots of Australian manufacturing beef.

    PubMed

    Kiermeier, Andreas; Mellor, Glen; Barlow, Robert; Jenson, Ian

    2011-04-01

    The aims of this work were to determine the distribution and concentration of Escherichia coli O157 in lots of beef destined for grinding (manufacturing beef) that failed to meet Australian requirements for export, to use these data to better understand the performance of sampling plans based on the binomial distribution, and to consider alternative approaches for evaluating sampling plans. For each of five lots from which E. coli O157 had been detected, 900 samples from the external carcass surface were tested. E. coli O157 was not detected in three lots, whereas in two lots E. coli O157 was detected in 2 and 74 samples. For lots in which E. coli O157 was not detected in the present study, the E. coli O157 level was estimated to be <12 cells per 27.2-kg carton. For the most contaminated carton, the total number of E. coli O157 cells was estimated to be 813. In the two lots in which E. coli O157 was detected, the pathogen was detected in 1 of 12 and 2 of 12 cartons. The use of acceptance sampling plans based on a binomial distribution can provide a falsely optimistic view of the value of sampling as a control measure when applied to assessment of E. coli O157 contamination in manufacturing beef. Alternative approaches to understanding sampling plans, which do not assume homogeneous contamination throughout the lot, appear more realistic. These results indicate that despite the application of stringent sampling plans, sampling and testing approaches are inefficient for controlling microbiological quality.
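
    For reference, the detection probability that a binomial-based plan ascribes to itself is 1 - (1 - p)^n under homogeneous within-lot prevalence p; clustering of contamination, as observed here, breaks that assumption. A small sketch with invented numbers:

    ```python
    from scipy.stats import binom

    n = 60                      # samples tested per lot (assumed)
    for p in (0.001, 0.01, 0.05):
        p_detect = 1 - binom.pmf(0, n, p)   # P(at least one positive sample)
        print(f"prevalence {p:.3f}: nominal detection probability {p_detect:.3f}")
    ```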

  6. GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences

    PubMed Central

    Di, Yanming; Schafer, Daniel W.; Wilhelm, Larry J.; Fox, Samuel E.; Sullivan, Christopher M.; Curzon, Aron D.; Carrington, James C.; Mockler, Todd C.; Chang, Jeff H.

    2011-01-01

    GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts. PMID:21998647

  7. Video performance for high security applications.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Connell, Jack C.; Norman, Bradley C.

    2010-06-01

    The complexity of physical protection systems has increased to address modern threats to national security and emerging commercial technologies. A key element of modern physical protection systems is the data presented to the human operator used for rapid determination of the cause of an alarm, whether false (e.g., caused by an animal, debris, etc.) or real (e.g., a human adversary). Alarm assessment, the human validation of a sensor alarm, primarily relies on imaging technologies and video systems. Developing measures of effectiveness (MOE) that drive the design or evaluation of a video system or technology becomes a challenge, given the subjectivity of the application (e.g., alarm assessment). Sandia National Laboratories has conducted empirical analysis using field test data and mathematical models such as the binomial distribution and Johnson target transfer functions to develop MOEs for video system technologies. Depending on the technology, the task of the security operator and the distance to the target, the probability of assessment (PA) can be determined as a function of a variety of conditions or assumptions. PA used as an MOE allows the systems engineer to conduct trade studies, make informed design decisions, or evaluate new higher-risk technologies. This paper outlines general video system design trade-offs, discusses ways video can be used to increase system performance and lists MOEs for video systems used in subjective applications such as alarm assessment.

  8. Are star formation rates of galaxies bimodal?

    NASA Astrophysics Data System (ADS)

    Feldmann, Robert

    2017-09-01

    Star formation rate (SFR) distributions of galaxies are often assumed to be bimodal with modes corresponding to star-forming and quiescent galaxies, respectively. Both classes of galaxies are typically studied separately, and SFR distributions of star-forming galaxies are commonly modelled as lognormals. Using both observational data and results from numerical simulations, I argue that this division into star-forming and quiescent galaxies is unnecessary from a theoretical point of view and that the SFR distributions of the whole population can be well fitted by zero-inflated negative binomial distributions. This family of distributions has three parameters that determine the average SFR of the galaxies in the sample, the scatter relative to the star-forming sequence and the fraction of galaxies with zero SFRs, respectively. The proposed distributions naturally account for (I) the discrete nature of star formation, (II) the presence of 'dead' galaxies with zero SFRs and (III) asymmetric scatter. Excluding 'dead' galaxies, the distribution of log SFR is unimodal with a peak at the star-forming sequence and an extended tail towards low SFRs. However, uncertainties and biases in the SFR measurements can create the appearance of a bimodal distribution.

  9. Abstract knowledge versus direct experience in processing of binomial expressions

    PubMed Central

    Morgan, Emily; Levy, Roger

    2016-01-01

    We ask whether word order preferences for binomial expressions of the form A and B (e.g. bread and butter) are driven by abstract linguistic knowledge of ordering constraints referencing the semantic, phonological, and lexical properties of the constituent words, or by prior direct experience with the specific items in question. Using forced-choice and self-paced reading tasks, we demonstrate that online processing of never-before-seen binomials is influenced by abstract knowledge of ordering constraints, which we estimate with a probabilistic model. In contrast, online processing of highly frequent binomials is primarily driven by direct experience, which we estimate from corpus frequency counts. We propose a trade-off wherein processing of novel expressions relies upon abstract knowledge, while reliance upon direct experience increases with increased exposure to an expression. Our findings support theories of language processing in which both compositional generation and direct, holistic reuse of multi-word expressions play crucial roles. PMID:27776281

  10. [Perceived discrimination at work for being an immigrant: a study on self-perceived mental health status among immigrants in Italy].

    PubMed

    Di Napoli, Anteo; Gatta, Rosaria; Rossi, Alessandra; Perez, Monica; Costanzo, Gianfranco; Mirisola, Concetta; Petrelli, Alessio

    2017-01-01

    Exposure to discrimination is widely understood as a social determinant of psychophysical health and a contributing factor to health inequities among social groups. Few studies exist, particularly in Italy, about the effects of workplace discrimination against immigrants. This study analyses the association between perceived discrimination at work for being an immigrant and mental health status among immigrants in Italy. A sub-sample of 12,408 immigrants residing in Italy was analysed. Data came from the survey "Social conditions and integration of foreign citizens in Italy", carried out in 2011-2012 by the Italian National Institute of Statistics (Istat). Self-perceived mental health status was measured through the mental component summary (MCS) of the SF-12 questionnaire, with an MCS score distribution ≤1st quartile taken as worse health status. In order to evaluate the probability of poor health status, a multivariate log-binomial model was fitted, with discrimination at work for being an immigrant as the determinant variable and age, gender, educational level, employment status, area of origin, residence in Italy, length of stay in Italy, self-perceived loneliness and satisfaction with life as potential confounding variables. Among immigrants, 15.8% reported discrimination in their workplace in Italy for being an immigrant. A higher probability of poor mental health status was observed for immigrants who reported discrimination in the workplace (Prevalence Rate Ratio - PRR: 1.16), for those who had arrived in Italy at least 5 years earlier (PRR: 1.14), for subjects not employed (PRR: 1.31), and for people from the Americas (PRR: 1.14). A lower probability of poor mental health status was found in immigrants from Western-Central Asia (PRR: 0.83) and Eastern-Pacific Asia (PRR: 0.79). Compared to immigrants residing in North-Eastern Italy, a higher probability of worse mental health status was observed in people who resided in North-Western (PRR: 1.30), Central (PRR: 1.26), and Southern (PRR: 1.15) Italian regions. Our findings confirm that workplace discrimination for being an immigrant is a risk factor for self-perceived mental health among immigrants in Italy, suggesting that an overall public health response is essential in addition to work-based interventions. Improving working conditions, promoting organisational strategies to support coping behaviours, and challenging discrimination can improve the mental health status of immigrants.
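
    As an illustration of the modelling choice, a log-binomial model is a binomial GLM with a log link, so its exponentiated coefficients are prevalence rate ratios like the PRRs above. A minimal sketch with simulated data (not the Istat survey):

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 2000
    discrim = rng.integers(0, 2, n).astype(float)   # 0/1 exposure indicator
    p = np.exp(-1.6 + np.log(1.16) * discrim)       # true PRR of 1.16
    poor_mh = rng.binomial(1, p)

    X = sm.add_constant(discrim)
    # Older statsmodels versions spell the link class links.log()
    fit = sm.GLM(poor_mh, X,
                 family=sm.families.Binomial(link=sm.families.links.Log())).fit()
    print("estimated PRR:", np.exp(fit.params[1]))
    ```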

  11. Ageing under unequal circumstances: a cross-sectional analysis of the gender and socioeconomic patterning of functional limitations among the Southern European elderly.

    PubMed

    Serrano-Alarcón, Manuel; Perelman, Julian

    2017-10-03

    In a context of population ageing, it is a priority for planning and prevention to understand the socioeconomic (SE) patterning of functional limitations and its consequences on healthcare needs. This paper aims at measuring the gender and SE inequalities in functional limitations and their age of onset among the Southern European elderly; then, we evaluate how functional status is linked to formal and informal care use. We used Portuguese, Italian and Spanish data from the Survey of Health, Ageing and Retirement in Europe (SHARE) of 2011 (n = 9233). We constructed a summary functional limitation score as the sum of two variables: i) Activities of Daily Living (ADL) and ii) Instrumental Activities of Daily Living (IADL). We modelled the functional limitation as a function of age, gender, education, subjective poverty, employment and marital status using multinomial logit models. We then estimated how functional limitation affected informal and formal care demand using negative binomial and logistic models. Women were 2.3 percentage points (pp) more likely to experience severe functional limitation than men, and overcame a 10% probability threshold of suffering from severe limitation around 5 years earlier. Subjective poverty was associated with a 3.1 pp. higher probability of severe functional limitation. Having a university degree reduced the probability of severe functional limitation by 3.5 pp. as compared to none educational level. Discrepancies were wider for the oldest old: women aged 65-79 years old were 3.3 pp. more likely to suffer severe limitations, the excess risk increasing to 15.5 pp. among those older than 80. Similarly, educational inequalities in functional limitation were wider at older ages. Being severely limited was related with a 32.1 pp. higher probability of receiving any informal care, as compared to those moderately limited. Finally, those severely limited had on average 3.2 hospitalization days and 4.6 doctor consultations more, per year, than those without limitations. Functional limitations are unequally distributed, hitting women and the worse-off earlier and more severely, with consequences on care needs. Considering the burden on healthcare systems and families, public health policies should seek to reduce current inequalities in functional limitations.

  12. Recall of anti-tobacco advertisements and effects on quitting behavior: results from the California smokers cohort.

    PubMed

    Leas, Eric C; Myers, Mark G; Strong, David R; Hofstetter, C Richard; Al-Delaimy, Wael K

    2015-02-01

    We assessed whether an anti-tobacco television advertisement called "Stages," which depicted a woman giving a brief emotional narrative of her experiences with tobacco use, would be recalled more often and have a greater effect on smoking cessation than 3 other advertisements with different intended themes. Our data were derived from a sample of 2596 California adult smokers. We used multivariable log-binomial and modified Poisson regression models to calculate respondents' probability of quitting as a result of advertisement recall. More respondents recalled the "Stages" ad (58.5%) than the 3 other ads (23.1%, 23.4%, and 25.6%; P<.001). Respondents who recalled "Stages" at baseline had a higher probability than those who did not recall the ad of making a quit attempt between baseline and follow-up (adjusted risk ratio [RR]=1.18; 95% confidence interval [CI]=1.03, 1.34) and a higher probability of being in a period of smoking abstinence for at least a month at follow-up (adjusted RR=1.55; 95% CI=1.02, 2.37). Anti-tobacco television advertisements that depict visceral and personal messages may be recalled by a larger percentage of smokers and may have a greater impact on smoking cessation than other types of advertisements.

  13. Diversity in the American Society of Anesthesiologists Leadership.

    PubMed

    Toledo, Paloma; Duce, Lorent; Adams, Jerome; Ross, Vernon H; Thompson, Kelli M; Wong, Cynthia A

    2017-05-01

    Women and minorities are underrepresented in US academic medicine. The Sullivan Commission on Diversity in the Healthcare Workforce emphasized the importance of diverse leadership for reducing health care disparities. The objective of this study was to evaluate the demographics of the American Society of Anesthesiologists leadership. We hypothesized that the percentage of women and underrepresented minorities is less than that of their respective proportions in the general physician workforce. An electronic survey was developed by the authors and mailed to 595 members of the American Society of Anesthesiologists leadership who had valid email addresses, including the members of the 2014 House of Delegates and state society leaders who were not members of the House of Delegates. Univariate statistics were used to characterize survey responses, and the probability distributions were estimated using the binomial distribution. A one-sample t test was used to compare the percentage of women and minorities in the survey pool to the corresponding percentages in the general physician workforce (38.0% women and 8.9% minorities) and the US population (51.0% women and 32.0% minorities). The survey response rate was 54%. A total of 21.1% (95% confidence interval: 16.4%-25.7%) of respondents were women and 6.0% (95% confidence interval: 3.3%-8.7%) were minorities. The proportion of women in the American Society of Anesthesiologists leadership was lower than in the general medical workforce and the US population (P < .001 for both); the proportion of underrepresented minorities was lower than in the US population (P < .001). Women and minorities are underrepresented in the leadership of the American Society of Anesthesiologists. Efforts should be made to increase the diversity of the American Society of Anesthesiologists leadership with the goal of reducing overall anesthesia workforce disparities.
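
    The reported intervals are standard binomial confidence intervals for proportions; a sketch with assumed counts (the abstract does not give the exact respondent numbers):

    ```python
    from statsmodels.stats.proportion import proportion_confint

    n_resp = 321    # hypothetical: roughly 54% of 595 surveyed
    n_women = 68    # hypothetical count giving about 21.1% women
    lo, hi = proportion_confint(n_women, n_resp, alpha=0.05, method="normal")
    print(f"{n_women / n_resp:.1%} women, 95% CI ({lo:.1%}, {hi:.1%})")
    # method="beta" would give the exact Clopper-Pearson interval instead
    ```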

  14. Self-affirmation model for football goal distributions

    NASA Astrophysics Data System (ADS)

    Bittner, E.; Nußbaumer, A.; Janke, W.; Weigel, M.

    2007-06-01

    Analyzing football score data with statistical techniques, we investigate how the highly co-operative nature of the game is reflected in averaged properties such as the distributions of goals scored by the home and away teams. It turns out that in particular the tails of the distributions are not well described by independent Bernoulli trials, but are rather well modeled by negative binomial or generalized extreme value distributions. To understand this behavior from first principles, we suggest modifying the Bernoulli random process to include a simple component of self-affirmation, which seems to describe the data surprisingly well and allows us to interpret the observed deviation from Gaussian statistics. The phenomenological distributions used before can be understood as special cases within this framework. We analyzed historical football score data from many leagues in Europe as well as from international tournaments and found the proposed models to be applicable rather universally. In particular, here we compare men's and women's leagues and the separate German leagues during the Cold War era and find some remarkable differences.

  15. A probability metric for identifying high-performing facilities: an application for pay-for-performance programs.

    PubMed

    Shwartz, Michael; Peköz, Erol A; Burgess, James F; Christiansen, Cindy L; Rosen, Amy K; Berlowitz, Dan

    2014-12-01

    Two approaches are commonly used for identifying high-performing facilities on a performance measure: one, that the facility is in a top quantile (eg, quintile or quartile); and two, that a confidence interval is below (or above) the average of the measure for all facilities. This type of yes/no designation often does not do well in distinguishing high-performing from average-performing facilities. To illustrate an alternative continuous-valued metric for profiling facilities--the probability a facility is in a top quantile--and show the implications of using this metric for profiling and pay-for-performance. We created a composite measure of quality from fiscal year 2007 data based on 28 quality indicators from 112 Veterans Health Administration nursing homes. A Bayesian hierarchical multivariate normal-binomial model was used to estimate shrunken rates of the 28 quality indicators, which were combined into a composite measure using opportunity-based weights. Rates were estimated using Markov Chain Monte Carlo methods as implemented in WinBUGS. The probability metric was calculated from the simulation replications. Our probability metric allowed better discrimination of high performers than the point or interval estimate of the composite score. In a pay-for-performance program, a smaller top quantile (eg, a quintile) resulted in more resources being allocated to the highest performers, whereas a larger top quantile (eg, being above the median) distinguished less among high performers and allocated more resources to average performers. The probability metric has potential but needs to be evaluated by stakeholders in different types of delivery systems.
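
    A sketch of the proposed metric, assuming simulated posterior draws in place of the WinBUGS output: for each facility, count the fraction of MCMC draws in which its composite score ranks in the top quintile.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    n_fac, n_draws = 112, 4000

    # Stand-in posterior draws: each facility has an assumed mean composite
    # score, with posterior uncertainty around it
    means = rng.normal(0, 1, n_fac)
    draws = rng.normal(means[None, :], 0.5, size=(n_draws, n_fac))

    cutoff_rank = int(np.ceil(0.8 * n_fac))         # top-quintile boundary
    ranks = draws.argsort(axis=1).argsort(axis=1)   # per-draw ranks (0 = worst)
    p_top = (ranks >= cutoff_rank).mean(axis=0)     # Pr(facility in top quintile)
    print("most likely top performers:", np.argsort(-p_top)[:5])
    ```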

  16. [Estimating survival of thrushes: modeling capture-recapture probabilities].

    PubMed

    Burskiî, O V

    2011-01-01

    The stochastic modeling technique serves as a way to correctly separate the "return rate" of marked animals into survival rate (phi) and capture probability (p). The method can readily be used with the program MARK, freely distributed through the Internet (Cooch, White, 2009). Input data for the program consist of "capture histories" of marked animals--strings of ones and zeros indicating presence or absence of the individual among captures (or sightings) along the set of consecutive recapture occasions (e.g., years). The probability of any history is a product of binomial probabilities phi, p or their complements (1 - phi) and (1 - p) for each year of observation of the individual. Assigning certain values to the parameters phi and p, one can predict the composition of all individual histories in the sample and assess the likelihood of the prediction. The survival parameters for different occasions and cohorts of individuals can be set either equal or different, and the recapture parameters can likewise be set in different ways. The parameters can be constrained, according to the hypothesis being tested, in the form of a specific model. Within the specified constraints, the program searches for parameter values that describe the observed composition of histories with the maximum likelihood. It computes the parameter estimates along with confidence limits and the overall model likelihood. There is a set of tools for testing the model goodness-of-fit under the assumption of equality of survival rates among individuals and independence of their fates. Other tools offer a proper selection among a possible variety of models, providing the best balance between detail and precision in describing reality. The method was applied to a 20-yr recapture and resighting data series on 4 thrush species (genera Turdus, Zoothera) breeding in the Yenisei River floodplain within the middle taiga subzone. The capture probabilities were quite independent of fluctuations in observation effort, while differing significantly between the species and sexes. The estimates of adult survival rate obtained for the Siberian migratory populations were lower than those for sedentary populations from both the tropics and intermediate latitudes with marine climate (data by Ricklefs, 1997). Two factors, the average temperature influencing birds during their annual movements and climatic seasonality (temperature difference between summer and winter) in the breeding area, fit the latitudinal pattern of survival most closely (R2 = 0.90). The final survival of migrants reflects an adaptive life history compromise for the use of superabundant resources in the breeding area at the cost of avoidance of severe winter conditions.
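
    To make the history-probability construction concrete, here is a minimal sketch with hypothetical phi and p values; the terminal "never seen again" (chi) term of the full Cormack-Jolly-Seber likelihood is omitted for brevity.

    ```python
    def history_probability(h, phi, p):
        """Probability of the observed part of capture history h (tuple of 0/1,
        h[0] = 1 marks first capture/release) up to the last sighting.
        phi[t-1] and p[t-1] are survival to and capture probability at occasion t."""
        last = max(i for i, v in enumerate(h) if v == 1)
        prob = 1.0
        for t in range(1, last + 1):
            prob *= phi[t - 1]                                # survived to occasion t
            prob *= p[t - 1] if h[t] == 1 else (1 - p[t - 1]) # seen or missed
        return prob  # the "never seen again" tail term is omitted here

    # History (1, 0, 1): marked at occasion 1, missed at 2, resighted at 3
    print(history_probability((1, 0, 1), phi=[0.6, 0.6], p=[0.4, 0.4]))
    # = 0.6 * (1 - 0.4) * 0.6 * 0.4 = 0.0864
    ```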

  17. Computational Aspects of N-Mixture Models

    PubMed Central

    Dennis, Emily B; Morgan, Byron JT; Ridout, Martin S

    2015-01-01

    The N-mixture model is widely used to estimate the abundance of a population in the presence of unknown detection probability from only a set of counts subject to spatial and temporal replication (Royle, 2004, Biometrics 60, 105–115). We explain and exploit the equivalence of N-mixture and multivariate Poisson and negative-binomial models, which provides powerful new approaches for fitting these models. We show that particularly when detection probability and the number of sampling occasions are small, infinite estimates of abundance can arise. We propose a sample covariance as a diagnostic for this event, and demonstrate its good performance in the Poisson case. Infinite estimates may be missed in practice, due to numerical optimization procedures terminating at arbitrarily large values. It is shown that the use of a bound, K, for an infinite summation in the N-mixture likelihood can result in underestimation of abundance, so that default values of K in computer packages should be avoided. Instead we propose a simple automatic way to choose K. The methods are illustrated by analysis of data on Hermann's tortoise Testudo hermanni. PMID:25314629
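
    To make the role of the bound K concrete, here is a minimal sketch of the N-mixture likelihood at a single site (counts and parameter values are invented): the log-likelihood stabilizes only once K is large enough, which is why default values of K can underestimate abundance.

    ```python
    import numpy as np
    from scipy.stats import poisson, binom

    counts = np.array([3, 5, 2, 4])   # repeat counts at one site

    def nmix_loglik(lam, p, counts, K):
        # Sum over plausible latent abundances N: P(N) * prod_t P(count_t | N, p)
        N = np.arange(counts.max(), K + 1)
        terms = poisson.pmf(N, lam)
        for c in counts:
            terms = terms * binom.pmf(c, N, p)
        return np.log(terms.sum())

    for K in (10, 50, 200):
        print(K, nmix_loglik(lam=8.0, p=0.5, counts=counts, K=K))
    ```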

  18. Factors that influence the interests of farmer in shallots farming at Cinta Dame village of Simanindo sub district of Samosir district

    NASA Astrophysics Data System (ADS)

    Siregar, A. F.; Supriana, T.

    2018-02-01

    Shallots contains a lot of usefull ingredients for human life, especially as flavor to dishes by Indonesian. The need for shallots was increasing as increasing population. The increased demand of shallots caused the price to increase due to production in North Sumatera was low. The objective of this study is to analyze interest and factors that affect the interest of farmers in shallots farming and analyze the responses from each factors to the interest of farmers in shallots farming. The samples were 85 farmers in shallots farming. Binomial logit was used as data analysis method. The result of the study showed that the factors that influence the interest of farmers in shallots farming consist of land area, experience, income, supporting and trauma.. The opportunity of farmers in shallots farming increased 22% if the area of land increased by one acre. The probability variable with the supporting is higher 0,3 % compared without the supporting. While the probability variable without the trauma is higher 0,014 % compared with the trauma.

  19. Metocean design parameter estimation for fixed platform based on copula functions

    NASA Astrophysics Data System (ADS)

    Zhai, Jinjin; Yin, Qilin; Dong, Sheng

    2017-08-01

    Considering the dependent relationship among wave height, wind speed, and current velocity, we construct novel trivariate joint probability distributions via Archimedean copula functions. Thirty years of wave height, wind speed, and current velocity data in the Bohai Sea are hindcast and sampled for a case study. Four kinds of distributions, namely, the Gumbel distribution, lognormal distribution, Weibull distribution, and Pearson Type III distribution, are candidate models for the marginal distributions of wave height, wind speed, and current velocity. The Pearson Type III distribution is selected as the optimal model. Bivariate and trivariate probability distributions of these environmental conditions are established based on four bivariate and trivariate Archimedean copulas, namely, the Clayton, Frank, Gumbel-Hougaard, and Ali-Mikhail-Haq copulas. These joint probability models can maximize marginal information and the dependence among the three variables. The design return values of these three variables can be obtained by three methods: univariate probability, conditional probability, and joint probability. The joint return periods of different load combinations are estimated by the proposed models. Platform responses (including base shear, overturning moment, and deck displacement) are further calculated. For the same return period, the design values of wave height, wind speed, and current velocity obtained by the conditional and joint probability models are much smaller than those obtained by univariate probability. By accounting for the dependence among variables, the multivariate probability distributions provide design parameters closer to the actual sea state for ocean platform design.
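
    A hedged sketch of the joint probability calculation, writing the Gumbel-Hougaard copula explicitly; the Pearson Type III marginal parameters and the copula parameter theta below are invented stand-ins for the fitted Bohai Sea values.

    ```python
    import numpy as np
    from scipy.stats import pearson3

    def gumbel_copula_cdf(u, theta):
        # Gumbel-Hougaard copula: C(u) = exp(-(sum_i (-ln u_i)^theta)^(1/theta))
        s = np.sum((-np.log(u)) ** theta)
        return np.exp(-s ** (1.0 / theta))

    # Assumed Pearson Type III marginals for (wave height, wind speed, current)
    marg = [pearson3(skew=0.5, loc=3.0, scale=1.0),
            pearson3(skew=0.4, loc=12.0, scale=3.0),
            pearson3(skew=0.6, loc=1.0, scale=0.3)]
    x = np.array([5.0, 16.0, 1.4])                  # candidate design point
    u = np.array([m.cdf(xi) for m, xi in zip(marg, x)])

    c = gumbel_copula_cdf(u, theta=2.0)
    print("joint non-exceedance probability:", c)
    print("joint return period (years, for annual maxima):", 1.0 / (1.0 - c))
    ```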

  20. Models of multidimensional discrete distribution of probabilities of random variables in information systems

    NASA Astrophysics Data System (ADS)

    Gromov, Yu Yu; Minin, Yu V.; Ivanova, O. G.; Morozova, O. N.

    2018-03-01

    Multidimensional discrete probability distributions of independent random variables were derived. Their one-dimensional counterparts are widely used in probability theory. Generating functions of these multidimensional distributions were also obtained.

  1. On extinction time of a generalized endemic chain-binomial model.

    PubMed

    Aydogmus, Ozgur

    2016-09-01

    We considered a chain-binomial epidemic model not conferring immunity after infection. Mean field dynamics of the model has been analyzed and conditions for the existence of a stable endemic equilibrium are determined. The behavior of the chain-binomial process is probabilistically linked to the mean field equation. As a result of this link, we were able to show that the mean extinction time of the epidemic increases at least exponentially as the population size grows. We also present simulation results for the process to validate our analytical findings. Copyright © 2016 Elsevier Inc. All rights reserved.
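
    A simulation sketch of an SIS-type chain-binomial process without immunity (the parameter values and recovery scheme are illustrative, not the paper's): mean extinction time grows rapidly with population size, consistent with the at-least-exponential growth shown analytically.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    def extinction_time(N, I0, beta=1.5, max_steps=50000):
        I = I0
        for t in range(1, max_steps + 1):
            # Each susceptible escapes infection w.p. (1 - beta/N)^I
            p_inf = 1.0 - (1.0 - beta / N) ** I
            I = rng.binomial(N - I, p_inf)   # infecteds recover; no immunity
            if I == 0:
                return t
        return max_steps

    for N in (10, 20, 30):
        times = [extinction_time(N, I0=max(1, N // 5)) for _ in range(100)]
        print(N, np.mean(times))
    ```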

  2. Solar San Diego: The Impact of Binomial Rate Structures on Real PV Systems; Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    VanGeet, O.; Brown, E.; Blair, T.

    2008-05-01

    There is confusion in the marketplace regarding the impact of solar photovoltaics (PV) on the user's actual electricity bill under California Net Energy Metering, particularly with binomial tariffs (those that include both demand and energy charges) and time-of-use (TOU) rate structures. The City of San Diego has extensive real-time electrical metering on most of its buildings and PV systems, with interval data for overall consumption and PV electrical production available for multiple years. This paper uses 2007 PV-system data from two city facilities to illustrate the impacts of binomial rate designs. The analysis determines the energy and demand savings that the PV systems are achieving relative to the absence of systems. A financial analysis of PV-system performance under various rate structures is presented. The data revealed that actual demand and energy use benefits of binomial tariffs increase in summer months, when solar resources allow for maximized electricity production. In a binomial tariff system, varying on- and semi-peak times can result in approximately $1,100 change in demand charges per month over not having a PV system in place, an approximate 30% cost savings. The PV systems are also shown to have a 30%-50% reduction in facility energy charges in 2007.

  3. ATM Quality of Service Tests for Digitized Video Using ATM Over Satellite: Laboratory Tests

    NASA Technical Reports Server (NTRS)

    Ivancic, William D.; Brooks, David E.; Frantz, Brian D.

    1997-01-01

    A digitized video application was used to help determine minimum quality of service parameters for asynchronous transfer mode (ATM) over satellite. For these tests, binomially distributed and other errors were digitally inserted in an intermediate frequency link via a satellite modem and a commercial Gaussian noise generator. In this paper, the relationship between the ATM cell error and cell loss parameter specifications is discussed with regard to this application. In addition, the video-encoding algorithms, test configurations, and results are presented in detail.

  4. Population Characteristics and the Nature of Egg Shells of two Phthirapteran Species Parasitizing Indian Cattle Egrets

    PubMed Central

    Ahmad, Aftab; Khan, Vikram; Badola, Smita; Arya, Gaurav; Bansal, Nayanci; Saxena, A. K.

    2010-01-01

    The prevalence, intensities of infestation, range of infestation and population composition of two phthirapteran species, Ardeicola expallidus Blagoveshtchensky (Phthiraptera: Philopteridae) and Ciconiphilus decimfasciatus Boisduval and Lacordaire (Menoponidae) on seventy cattle egrets were recorded during August 2004 to March 2005, in India. The frequency distribution patterns of both the species were skewed but did not correspond to the negative binomial model. The oviposition sites, egg laying patterns and the nature of the eggs of the two species were markedly different. PMID:21067416

  5. Cost-effective binomial sequential sampling of western bean cutworm, Striacosta albicosta (Lepidoptera: Noctuidae), egg masses in corn.

    PubMed

    Paula-Moraes, S; Burkness, E C; Hunt, T E; Wright, R J; Hein, G L; Hutchison, W D

    2011-12-01

    Striacosta albicosta (Smith) (Lepidoptera: Noctuidae), is a native pest of dry beans (Phaseolus vulgaris L.) and corn (Zea mays L.). As a result of larval feeding damage on corn ears, S. albicosta has a narrow treatment window; thus, early detection of the pest in the field is essential, and egg mass sampling has become a popular monitoring tool. Three action thresholds for field and sweet corn currently are used by crop consultants, including 4% of plants infested with egg masses on sweet corn in the silking-tasseling stage, 8% of plants infested with egg masses on field corn with approximately 95% tasseled, and 20% of plants infested with egg masses on field corn during mid-milk-stage corn. The current monitoring recommendation is to sample 20 plants at each of five locations per field (100 plants total). In an effort to develop a more cost-effective sampling plan for S. albicosta egg masses, several alternative binomial sampling plans were developed using Wald's sequential probability ratio test, and validated using Resampling for Validation of Sampling Plans (RVSP) software. The benefit-cost ratio also was calculated and used to determine the final selection of sampling plans. Based on final sampling plans selected for each action threshold, the average sample number required to reach a treat or no-treat decision ranged from 38 to 41 plants per field. This represents a significant savings in sampling cost over the current recommendation of 100 plants.
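
    For readers who want the mechanics, here is a sketch of Wald's sequential probability ratio test boundaries for a binomial count of infested plants; the infestation proportions and error rates below are invented, not the paper's fitted values.

    ```python
    import numpy as np

    p0, p1 = 0.04, 0.12      # "no-treat" vs "treat" infestation proportions
    alpha, beta = 0.1, 0.1   # acceptable Type I / Type II error rates

    # Wald boundaries: after n plants with d infested, decide when d crosses
    # the lines d = s*n + h0 (no-treat) or d = s*n + h1 (treat)
    g1 = np.log(p1 / p0)
    g2 = np.log((1 - p0) / (1 - p1))
    s = g2 / (g1 + g2)                            # common slope
    h0 = np.log(beta / (1 - alpha)) / (g1 + g2)   # lower intercept
    h1 = np.log((1 - beta) / alpha) / (g1 + g2)   # upper intercept

    for n in (10, 20, 30, 40):
        print(f"n={n}: no-treat if d <= {s*n + h0:.1f}, treat if d >= {s*n + h1:.1f}")
    ```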

  6. Empirical Tests of Acceptance Sampling Plans

    NASA Technical Reports Server (NTRS)

    White, K. Preston, Jr.; Johnson, Kenneth L.

    2012-01-01

    Acceptance sampling is a quality control procedure applied as an alternative to 100% inspection. A random sample of items is drawn from a lot to determine the fraction of items which have a required quality characteristic. Both the number of items to be inspected and the criterion for determining conformance of the lot to the requirement are given by an appropriate sampling plan with specified risks of Type I and Type II sampling errors. In this paper, we present the results of empirical tests of the accuracy of selected sampling plans reported in the literature. These plans are for measurable quality characteristics which are known to have either binomial, exponential, normal, gamma, Weibull, inverse Gaussian, or Poisson distributions. In the main, results support the accepted wisdom that variables acceptance plans are superior to attributes (binomial) acceptance plans, in the sense that they provide comparable protection against risks at reduced sampling cost. For the Gaussian and Weibull plans, however, there are ranges of the shape parameters for which the required sample sizes are in fact larger than for the corresponding attributes plans, dramatically so for instances of large skew. Tests further confirm that the published inverse-Gaussian (IG) plan is flawed, as reported by White and Johnson (2011).

  7. Comparison and Field Validation of Binomial Sampling Plans for Oligonychus perseae (Acari: Tetranychidae) on Hass Avocado in Southern California.

    PubMed

    Lara, Jesus R; Hoddle, Mark S

    2015-08-01

    Oligonychus perseae Tuttle, Baker, & Abatiello is a foliar pest of 'Hass' avocados [Persea americana Miller (Lauraceae)]. The recommended action threshold is 50-100 motile mites per leaf, but this count range and other ecological factors associated with O. perseae infestations limit the application of enumerative sampling plans in the field. Consequently, a comprehensive modeling approach was implemented to compare the practical application of various binomial sampling models for decision-making of O. perseae in California. An initial set of sequential binomial sampling models were developed using three mean-proportion modeling techniques (i.e., Taylor's power law, maximum likelihood, and an empirical model) in combination with two-leaf infestation tally thresholds of either one or two mites. Model performance was evaluated using a robust mite count database consisting of >20,000 Hass avocado leaves infested with varying densities of O. perseae and collected from multiple locations. Operating characteristic and average sample number results for sequential binomial models were used as the basis to develop and validate a standardized fixed-size binomial sampling model with guidelines on sample tree and leaf selection within blocks of avocado trees. This final validated model requires a leaf sampling cost of 30 leaves and takes into account the spatial dynamics of O. perseae to make reliable mite density classifications for a 50-mite action threshold. Recommendations for implementing this fixed-size binomial sampling plan to assess densities of O. perseae in commercial California avocado orchards are discussed. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  8. Statistical tests for whether a given set of independent, identically distributed draws comes from a specified probability density.

    PubMed

    Tygert, Mark

    2010-09-21

    We discuss several tests for determining whether a given set of independent and identically distributed (i.i.d.) draws does not come from a specified probability density function. The most commonly used are Kolmogorov-Smirnov tests, particularly Kuiper's variant, which focus on discrepancies between the cumulative distribution function for the specified probability density and the empirical cumulative distribution function for the given set of i.i.d. draws. Unfortunately, variations in the probability density function often get smoothed over in the cumulative distribution function, making it difficult to detect discrepancies in regions where the probability density is small in comparison with its values in surrounding regions. We discuss tests without this deficiency, complementing the classical methods. The tests of the present paper are based on the plain fact that it is unlikely to draw a random number whose probability is small, provided that the draw is taken from the same distribution used in calculating the probability (thus, if we draw a random number whose probability is small, then we can be confident that we did not draw the number from the same distribution used in calculating the probability).

  9. Replacement/Refurbishment of JSC/NASA POD Specimens

    NASA Technical Reports Server (NTRS)

    Castner, Willard L.

    2010-01-01

    The NASA Special NDE certification process requires demonstration of NDE capability by test per NASA-STD-5009. This test is performed with fatigue cracked specimens containing very small cracks. The certification test results are usually based on binomial statistics and must meet a 90/95 Probability of Detection (POD). The assumption is that fatigue cracks are tightly closed, difficult to detect, and inspectors and processes passing such a test are well qualified for inspecting NASA fracture critical hardware. The JSC NDE laboratory has what may be the largest inventory that exists of such fatigue cracked NDE demonstration specimens. These specimens were produced by the hundreds in the late 1980s and early 1990s. None have been produced since that time and the condition and usability of the specimens are questionable.
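
    The binomial statistics behind a 90/95 POD demonstration reduce to a one-line calculation: find the smallest flaw-set size n such that detecting all n flaws is inconsistent with a true POD below 0.90 at 95% confidence, i.e. 0.90**n <= 0.05.

    ```python
    # Smallest n with 0.90**n <= 0.05: if an inspector finds all n flaws, a POD
    # of 0.90 or less would make that outcome happen less than 5% of the time.
    n = 1
    while 0.90 ** n > 0.05:
        n += 1
    print(n)  # 29 -- the familiar "29 of 29" demonstration criterion
    ```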

  10. Using the Binomial Series to Prove the Arithmetic Mean-Geometric Mean Inequality

    ERIC Educational Resources Information Center

    Persky, Ronald L.

    2003-01-01

    In 1968, Leon Gerber compared (1 + x)[superscript a] to its kth partial sum as a binomial series. His result is stated and, as an application of this result, a proof of the arithmetic mean-geometric mean inequality is presented.

  11. Four Bootstrap Confidence Intervals for the Binomial-Error Model.

    ERIC Educational Resources Information Center

    Lin, Miao-Hsiang; Hsiung, Chao A.

    1992-01-01

    Four bootstrap methods are identified for constructing confidence intervals for the binomial-error model. The extent to which similar results are obtained and the theoretical foundation of each method and its relevance and ranges of modeling the true score uncertainty are discussed. (SLD)

  12. Possibility and Challenges of Conversion of Current Virus Species Names to Linnaean Binomials.

    PubMed

    Postler, Thomas S; Clawson, Anna N; Amarasinghe, Gaya K; Basler, Christopher F; Bavari, Sbina; Benko, Mária; Blasdell, Kim R; Briese, Thomas; Buchmeier, Michael J; Bukreyev, Alexander; Calisher, Charles H; Chandran, Kartik; Charrel, Rémi; Clegg, Christopher S; Collins, Peter L; Juan Carlos, De La Torre; Derisi, Joseph L; Dietzgen, Ralf G; Dolnik, Olga; Dürrwald, Ralf; Dye, John M; Easton, Andrew J; Emonet, Sébastian; Formenty, Pierre; Fouchier, Ron A M; Ghedin, Elodie; Gonzalez, Jean-Paul; Harrach, Balázs; Hewson, Roger; Horie, Masayuki; Jiang, Dàohóng; Kobinger, Gary; Kondo, Hideki; Kropinski, Andrew M; Krupovic, Mart; Kurath, Gael; Lamb, Robert A; Leroy, Eric M; Lukashevich, Igor S; Maisner, Andrea; Mushegian, Arcady R; Netesov, Sergey V; Nowotny, Norbert; Patterson, Jean L; Payne, Susan L; PaWeska, Janusz T; Peters, Clarence J; Radoshitzky, Sheli R; Rima, Bertus K; Romanowski, Victor; Rubbenstroth, Dennis; Sabanadzovic, Sead; Sanfaçon, Hélène; Salvato, Maria S; Schwemmle, Martin; Smither, Sophie J; Stenglein, Mark D; Stone, David M; Takada, Ayato; Tesh, Robert B; Tomonaga, Keizo; Tordo, Noël; Towner, Jonathan S; Vasilakis, Nikos; Volchkov, Viktor E; Wahl-Jensen, Victoria; Walker, Peter J; Wang, Lin-Fa; Varsani, Arvind; Whitfield, Anna E; Zerbini, F Murilo; Kuhn, Jens H

    2017-05-01

    Botanical, mycological, zoological, and prokaryotic species names follow the Linnaean format, consisting of an italicized Latinized binomen with a capitalized genus name and a lower case species epithet (e.g., Homo sapiens). Virus species names, however, do not follow a uniform format, and, even when binomial, are not Linnaean in style. In this thought exercise, we attempted to convert all currently official names of species included in the virus family Arenaviridae and the virus order Mononegavirales to Linnaean binomials, and to identify and address associated challenges and concerns. Surprisingly, this endeavor was not as complicated or time-consuming as even the authors of this article expected when conceiving the experiment. [Arenaviridae; binomials; ICTV; International Committee on Taxonomy of Viruses; Mononegavirales; virus nomenclature; virus taxonomy.]. Published by Oxford University Press on behalf of Society of Systematic Biologists 2016. This work is written by a US Government employee and is in the public domain in the US.

  13. The arcsine is asinine: the analysis of proportions in ecology.

    PubMed

    Warton, David I; Hui, Francis K C

    2011-01-01

    The arcsine square root transformation has long been standard procedure when analyzing proportional data in ecology, with applications in data sets containing binomial and non-binomial response variables. Here, we argue that the arcsine transform should not be used in either circumstance. For binomial data, logistic regression has greater interpretability and higher power than analyses of transformed data. However, it is important to check the data for additional unexplained variation, i.e., overdispersion, and to account for it via the inclusion of random effects in the model if found. For non-binomial data, the arcsine transform is undesirable on the grounds of interpretability, and because it can produce nonsensical predictions. The logit transformation is proposed as an alternative approach to address these issues. Examples are presented in both cases to illustrate these advantages, comparing various methods of analyzing proportions including untransformed, arcsine- and logit-transformed linear models and logistic regression (with or without random effects). Simulations demonstrate that logistic regression usually provides a gain in power over other methods.
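
    A minimal side-by-side sketch of the comparison the authors recommend, using simulated binomial data: logistic regression on the raw success counts versus a linear model on arcsine-square-root-transformed proportions.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n_trials = 20
    x = rng.uniform(-2, 2, 100)
    p = 1 / (1 + np.exp(-(0.3 + 0.8 * x)))
    successes = rng.binomial(n_trials, p)

    X = sm.add_constant(x)
    # Logistic regression on (successes, failures) pairs
    logit_fit = sm.GLM(np.column_stack([successes, n_trials - successes]), X,
                       family=sm.families.Binomial()).fit()
    # Classical approach: OLS on arcsine-square-root-transformed proportions
    arcsine_fit = sm.OLS(np.arcsin(np.sqrt(successes / n_trials)), X).fit()
    print(logit_fit.params, arcsine_fit.params)
    ```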

  14. Spiritual and ceremonial plants in North America: an assessment of Moerman's ethnobotanical database comparing Residual, Binomial, Bayesian and Imprecise Dirichlet Model (IDM) analysis.

    PubMed

    Turi, Christina E; Murch, Susan J

    2013-07-09

    Ethnobotanical research and the study of plants used for rituals, ceremonies and to connect with the spirit world have led to the discovery of many novel psychoactive compounds such as nicotine, caffeine, and cocaine. In North America, spiritual and ceremonial uses of plants are well documented and can be accessed online via the University of Michigan's Native American Ethnobotany Database. The objective of the study was to compare Residual, Bayesian, Binomial and Imprecise Dirichlet Model (IDM) analyses of ritual, ceremonial and spiritual plants in Moerman's ethnobotanical database and to identify genera that may be good candidates for the discovery of novel psychoactive compounds. The database was queried with the following format "Family Name AND Ceremonial OR Spiritual" for 263 North American botanical families. Spiritual and ceremonial flora consisted of 86 families with 517 species belonging to 292 genera. Spiritual taxa were then grouped further into ceremonial medicines and items categories. Residual, Bayesian, Binomial and IDM analyses were performed to identify over- and under-utilized families. The 4 statistical approaches were in good agreement when identifying under-utilized families, but large families (>393 species) were underemphasized by the Binomial, Bayesian and IDM approaches for over-utilization. Residual, Binomial, and IDM analyses identified similar families as over-utilized in the medium (92-392 species) and small (<92 species) classes. The families Apiaceae, Asteraceae, Ericaceae, Pinaceae and Salicaceae were identified as significantly over-utilized as ceremonial medicines in medium and large sized families. Analysis of genera within the Apiaceae and Asteraceae suggests that the genera Ligusticum and Artemisia are good candidates for facilitating the discovery of novel psychoactive compounds. The 4 statistical approaches were not consistent in their selection of over-utilized flora. Residual analysis revealed overall trends that were supported by Binomial analysis when separated into small, medium and large families. The Bayesian, Binomial and IDM approaches identified different genera as potentially important. Species belonging to the genera Artemisia and Ligusticum were most consistently identified and may be valuable in future ethnopharmacological studies. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  15. Single particle momentum and angular distributions in hadron-hadron collisions at ultrahigh energies

    NASA Technical Reports Server (NTRS)

    Chou, T. T.; Chen, N. Y.

    1985-01-01

    The forward-backward charged multiplicity distribution P(n_F, n_B) of events in the 540 GeV antiproton-proton collider has been extensively studied by the UA5 Collaboration. It was pointed out that the distribution with respect to n = n_F + n_B satisfies approximate KNO scaling and that with respect to Z = n_F - n_B it is binomial. The geometrical model of hadron-hadron collisions interprets the large multiplicity fluctuation as due to the widely different nature of collisions at different impact parameters b. For a single impact parameter b, the collision in the geometrical model should exhibit stochastic behavior. This separation of the stochastic and nonstochastic (KNO) aspects of multiparticle production processes gives conceptually a lucid and attractive picture of such collisions, leading to the concept of the partition temperature T_p and the single-particle momentum spectrum to be discussed in detail.

  16. How to retrieve additional information from the multiplicity distributions

    NASA Astrophysics Data System (ADS)

    Wilk, Grzegorz; Włodarczyk, Zbigniew

    2017-01-01

    Multiplicity distributions (MDs) P(N) measured in multiparticle production processes are most frequently described by the negative binomial distribution (NBD). However, with increasing collision energy some systematic discrepancies have become more and more apparent. They are usually attributed to the possible multi-source structure of the production process and described using a multi-NBD form of the MD. We investigate the possibility of keeping a single NBD but with its parameters depending on the multiplicity N. This is done by modifying the widely known clan model of particle production leading to the NBD form of P(N). This is then confronted with the approach based on the so-called cascade-stochastic formalism which is based on different types of recurrence relations defining P(N). We demonstrate that a combination of both approaches allows the retrieval of additional valuable information from the MDs, namely the oscillatory behavior of the counting statistics apparently visible in the high energy data.

  17. Selecting Tools to Model Integer and Binomial Multiplication

    ERIC Educational Resources Information Center

    Pratt, Sarah Smitherman; Eddy, Colleen M.

    2017-01-01

    Mathematics teachers frequently provide concrete manipulatives to students during instruction; however, the rationale for using certain manipulatives in conjunction with concepts may not be explored. This article focuses on area models that are currently used in classrooms to provide concrete examples of integer and binomial multiplication. The…

  18. On mobile wireless ad hoc IP video transports

    NASA Astrophysics Data System (ADS)

    Kazantzidis, Matheos

    2006-05-01

    Multimedia transports in wireless, ad-hoc, multi-hop or mobile networks must be capable of obtaining information about the network and adaptively tuning sending and encoding parameters to the network response. Obtaining meaningful metrics to guide a stable congestion control mechanism in the transport (i.e. passive, simple, end-to-end and network technology independent) is a complex problem. Equally difficult is obtaining a reliable QoS metric that agrees with user perception in a client/server or distributed environment. Existing metrics, objective or subjective, are commonly applied before or after a transmission to test or report on it, and require access to both original and transmitted frames. In this paper, we propose that efficient and successful video delivery and the optimization of overall network QoS require innovation in a) a direct measurement of available and bottleneck capacity for congestion control and b) a meaningful subjective QoS metric that is dynamically reported to the video sender. Once these are in place, a binomial (stable, fair and TCP-friendly) algorithm can be used to determine the sending rate and other packet video parameters. An adaptive MPEG codec can then continually test and fit its parameters and its temporal-spatial data-error control balance using the perceived QoS dynamic feedback. We suggest a new measurement based on a packet dispersion technique that is independent of underlying network mechanisms. We then present a binomial control based on direct measurements. We implement a QoS metric that is known to agree with user perception (MPQM) in a client/server, distributed environment by using predetermined table lookups and characterization of video content.
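
    The binomial congestion controls referred to here generalize AIMD: the window grows by alpha/w^k per round-trip time and shrinks by beta*w^l on loss, and members with k + l = 1 are TCP-friendly, which is what makes them attractive for smoothly adapting video senders. A minimal Python sketch of the update rule, with illustrative parameters (the SQRT member, k = l = 0.5), follows.

        def binomial_update(w, loss, alpha=1.0, beta=0.5, k=0.5, l=0.5):
            # Binomial congestion control window update; AIMD is the special
            # case k = 0, l = 1, and TCP-friendliness requires k + l = 1.
            if loss:
                return max(w - beta * w**l, 1.0)
            return w + alpha / w**k

        w = 10.0
        for rtt in range(100):
            w = binomial_update(w, loss=(rtt % 20 == 19))  # synthetic losses
        print(w)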

  19. Association between month of birth and melanoma risk: fact or fiction?

    PubMed

    Fiessler, Cornelia; Pfahlberg, Annette B; Keller, Andrea K; Radespiel-Tröger, Martin; Uter, Wolfgang; Gefeller, Olaf

    2017-04-01

    Evidence on the effect of ultraviolet radiation (UVR) exposure in infancy on melanoma risk in later life is scarce. Three recent studies suggest that people born in spring carry a higher melanoma risk. Our study aimed at verifying whether such a seasonal pattern of melanoma risk actually exists. Data from the population-based Cancer Registry Bavaria (CRB) on the birth months of 28 374 incident melanoma cases between 2002 and 2012 were analysed and compared with data from the Bavarian State Office for Statistics and Data Processing on the birth month distribution in the Bavarian population. Crude and adjusted analyses using negative binomial regression models were performed in the total study group and supplemented by several subgroup analyses. In the crude analysis, the birth months March-May were over-represented among melanoma cases. Negative binomial regression models adjusted only for sex and birth year revealed a seasonal association between melanoma risk and birth month, with 13-21% higher relative incidence rates for March, April and May compared with the reference December. However, after additionally adjusting for the birth month distribution of the Bavarian population, these risk estimates decreased markedly and no association with birth month remained. Similar results emerged in all subgroup analyses. Our large registry-based study provides no evidence that people born in spring carry a higher risk of developing melanoma in later life and thus lends no support to the hypothesis of higher UVR susceptibility during the first months of life. © The Author 2016; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association
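
    The decisive adjustment, accounting for the population's own birth-month distribution, can be expressed as an exposure offset in a negative binomial regression. The Python sketch below is illustrative only (the counts are hypothetical, not CRB data) and tests a simple spring indicator with statsmodels.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        df = pd.DataFrame({
            "month": range(1, 13),
            "cases": [2310, 2255, 2590, 2540, 2530, 2360,         # hypothetical
                      2340, 2320, 2350, 2330, 2220, 2230],
            "births": [81000, 76000, 86000, 84000, 85000, 82000,  # hypothetical
                       84000, 83000, 85000, 84000, 79000, 80000],
        })

        spring = df["month"].isin([3, 4, 5]).astype(float)
        X = sm.add_constant(spring)
        # log(births) enters as an offset: incidence is modelled per person
        # born in each month rather than per calendar month
        fit = sm.GLM(df["cases"], X,
                     family=sm.families.NegativeBinomial(alpha=1.0),
                     offset=np.log(df["births"])).fit()
        print(np.exp(fit.params))            # rate ratio, spring vs rest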

  20. Generalized seasonal autoregressive integrated moving average models for count data with application to malaria time series with low case numbers.

    PubMed

    Briët, Olivier J T; Amerasinghe, Priyanie H; Vounatsou, Penelope

    2013-01-01

    With the renewed drive towards malaria elimination, there is a need for improved surveillance tools. While time series analysis is an important tool for surveillance, prediction and for measuring interventions' impact, approximations by commonly used Gaussian methods are prone to inaccuracies when case counts are low. Therefore, statistical methods appropriate for count data are required, especially during "consolidation" and "pre-elimination" phases. Generalized autoregressive moving average (GARMA) models were extended to generalized seasonal autoregressive integrated moving average (GSARIMA) models for parsimonious observation-driven modelling of non-Gaussian, non-stationary and/or seasonal time series of count data. The models were applied to monthly malaria case time series in a district in Sri Lanka, where malaria has decreased dramatically in recent years. The malaria series showed long-term changes in the mean, unstable variance and seasonality. After fitting negative-binomial Bayesian models, both a GSARIMA and a GARIMA deterministic seasonality model were selected based on different criteria. Posterior predictive distributions indicated that negative-binomial models provided better predictions than Gaussian models, especially when counts were low. The G(S)ARIMA models were able to capture the autocorrelation in the series. G(S)ARIMA models may be particularly useful in the drive towards malaria elimination, since episode count series are often seasonal and non-stationary, especially when control is increased. Although building and fitting GSARIMA models is laborious, they may provide more realistic prediction distributions than do Gaussian methods and may be more suitable when counts are low.

  1. Generalized Seasonal Autoregressive Integrated Moving Average Models for Count Data with Application to Malaria Time Series with Low Case Numbers

    PubMed Central

    Briët, Olivier J. T.; Amerasinghe, Priyanie H.; Vounatsou, Penelope

    2013-01-01

    Introduction With the renewed drive towards malaria elimination, there is a need for improved surveillance tools. While time series analysis is an important tool for surveillance, prediction and for measuring interventions’ impact, approximations by commonly used Gaussian methods are prone to inaccuracies when case counts are low. Therefore, statistical methods appropriate for count data are required, especially during “consolidation” and “pre-elimination” phases. Methods Generalized autoregressive moving average (GARMA) models were extended to generalized seasonal autoregressive integrated moving average (GSARIMA) models for parsimonious observation-driven modelling of non-Gaussian, non-stationary and/or seasonal time series of count data. The models were applied to monthly malaria case time series in a district in Sri Lanka, where malaria has decreased dramatically in recent years. Results The malaria series showed long-term changes in the mean, unstable variance and seasonality. After fitting negative-binomial Bayesian models, both a GSARIMA and a GARIMA deterministic seasonality model were selected based on different criteria. Posterior predictive distributions indicated that negative-binomial models provided better predictions than Gaussian models, especially when counts were low. The G(S)ARIMA models were able to capture the autocorrelation in the series. Conclusions G(S)ARIMA models may be particularly useful in the drive towards malaria elimination, since episode count series are often seasonal and non-stationary, especially when control is increased. Although building and fitting GSARIMA models is laborious, they may provide more realistic prediction distributions than do Gaussian methods and may be more suitable when counts are low. PMID:23785448

  2. [Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

    PubMed

    Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

    2011-11-01

    To explore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to the increase of injury frequency. 2917 primary and secondary school students were selected from Hefei by a cluster random sampling method and surveyed by questionnaire. The data on count-based injury events were used to fit modified Poisson regression and negative binomial regression models. The risk factors associated with increased unintentional injury frequency among juvenile students were explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model showed over-dispersion (P < 0.0001) based on the Lagrange multiplier test. Therefore, the over-dispersed data were better fitted by the modified Poisson regression and negative binomial regression models. Both showed that male gender, younger age, a father working outside of the hometown, a guardian educated above junior high school level, and smoking might be associated with higher injury frequencies. For clustered frequency data on injury events, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
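
    A brief Python sketch of the two approaches on simulated over-dispersed counts (illustrative; the covariate and parameter values are hypothetical, not the Hefei survey data): the "modified Poisson" is a Poisson GLM with a robust sandwich covariance, while the negative binomial handles the over-dispersion through its variance function.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(0)
        male = rng.integers(0, 2, 500).astype(float)
        X = sm.add_constant(male)
        mu = np.exp(0.2 + 0.4 * male)
        y = rng.negative_binomial(1.5, 1.5 / (1.5 + mu))  # over-dispersed

        # Modified Poisson: Poisson GLM with robust (sandwich) covariance.
        mod_pois = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")
        # Negative binomial GLM (dispersion alpha fixed here for simplicity).
        nb = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=1.0)).fit()

        print(mod_pois.bse, nb.bse)          # compare standard errors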

  3. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

    PubMed

    Bennett, Bradley C; Husby, Chad E

    2008-03-28

    Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
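
    The per-family binomial test is easy to reproduce. In the Python sketch below (numbers are hypothetical, not the Shuar data), the null model says a family of n species should contribute medicinal species at the flora-wide rate p0, so the observed medicinal count is referred to a Binomial(n, p0) distribution.

        from scipy.stats import binomtest

        total_species, total_medicinal = 4500, 730    # hypothetical flora
        p0 = total_medicinal / total_species

        n, medicinal = 210, 58                        # hypothetical family
        result = binomtest(medicinal, n, p0)          # two-sided by default
        direction = "over" if medicinal / n > p0 else "under"
        print(f"p = {result.pvalue:.4g} ({direction}-represented)")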

  4. General solution of the chemical master equation and modality of marginal distributions for hierarchic first-order reaction networks.

    PubMed

    Reis, Matthias; Kromer, Justus A; Klipp, Edda

    2018-01-20

    Multimodality is a phenomenon which complicates the analysis of statistical data based exclusively on mean and variance. Here, we present criteria for multimodality in hierarchic first-order reaction networks, consisting of catalytic and splitting reactions. Those networks are characterized by independent and dependent subnetworks. First, we prove the general solvability of the Chemical Master Equation (CME) for this type of reaction network and thereby extend the class of solvable CMEs. Our general solution is analytical in the sense that it allows for a detailed analysis of its statistical properties. Given Poisson/deterministic initial conditions, we then prove the independent species to be Poisson/binomially distributed, while the dependent species exhibit generalized Poisson/Khatri Type B distributions. Generalized Poisson/Khatri Type B distributions are multimodal for an appropriate choice of parameters. We illustrate our criteria for multimodality by several basic models, as well as the well-known two-stage transcription-translation network and Bateman's model from nuclear physics. For both examples, multimodality was previously not reported.

  5. Investigation of Dielectric Breakdown Characteristics for Double-break Vacuum Interrupter and Dielectric Breakdown Probability Distribution in Vacuum Interrupter

    NASA Astrophysics Data System (ADS)

    Shioiri, Tetsu; Asari, Naoki; Sato, Junichi; Sasage, Kosuke; Yokokura, Kunio; Homma, Mitsutaka; Suzuki, Katsumi

    To investigate the reliability of vacuum insulation equipment, a study was carried out to clarify breakdown probability distributions in a vacuum gap. Further, a double-break vacuum circuit breaker was investigated for its breakdown probability distribution. The test results show that the breakdown probability distribution of the vacuum gap can be represented by a Weibull distribution using a location parameter, which shows the voltage that permits a zero breakdown probability. The location parameter obtained from the Weibull plot depends on electrode area. The shape parameter obtained from the Weibull plot of the vacuum gap was 10∼14, and is constant irrespective of the non-uniform field factor. The breakdown probability distribution after no-load switching can also be represented by a Weibull distribution using a location parameter. The shape parameter after no-load switching was 6∼8.5, and is constant irrespective of gap length. This indicates that the scatter of breakdown voltage was increased by no-load switching. If the vacuum circuit breaker uses a double break, breakdown probability at low voltage becomes lower than single-break probability. Although potential distribution is a concern in the double-break vacuum circuit breaker, its insulation reliability is better than that of the single-break vacuum interrupter even if the bias of the vacuum interrupter's sharing voltage is taken into account.
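
    For illustration, a three-parameter Weibull of this kind can be fitted in Python with scipy, where the location parameter plays the role of the zero-breakdown-probability voltage described above; the data below are synthetic, not the measurements reported here.

        import numpy as np
        from scipy.stats import weibull_min

        rng = np.random.default_rng(1)
        v_bd = 80.0 + 40.0 * rng.weibull(12.0, size=50)   # synthetic kV data

        shape, loc, scale = weibull_min.fit(v_bd)
        print(f"shape = {shape:.1f}, location = {loc:.1f} kV "
              "(voltage with zero breakdown probability)")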

  6. Multilevel Models for Binary Data

    ERIC Educational Resources Information Center

    Powers, Daniel A.

    2012-01-01

    The methods and models for categorical data analysis cover considerable ground, ranging from regression-type models for binary and binomial data, count data, to ordered and unordered polytomous variables, as well as regression models that mix qualitative and continuous data. This article focuses on methods for binary or binomial data, which are…

  7. Using a Betabinomial distribution to estimate the prevalence of adherence to physical activity guidelines among children and youth.

    PubMed

    Garriguet, Didier

    2016-04-01

    Estimates of the prevalence of adherence to physical activity guidelines in the population are generally the result of averaging individual probability of adherence based on the number of days people meet the guidelines and the number of days they are assessed. Given this number of active and inactive days (days assessed minus days active), the conditional probability of meeting the guidelines that has been used in the past is a Beta(1 + active days, 1 + inactive days) distribution assuming the probability p of a day being active is bounded by 0 and 1 and averages 50%. A change in the assumption about the distribution of p is required to better match the discrete nature of the data and to better assess the probability of adherence when the percentage of active days in the population differs from 50%. Using accelerometry data from the Canadian Health Measures Survey, the probability of adherence to physical activity guidelines is estimated using a conditional probability given the number of active and inactive days distributed as a Betabinomial(n, α + active days, β + inactive days) assuming that p is randomly distributed as Beta(α, β), where the parameters α and β are estimated by maximum likelihood. The resulting Betabinomial distribution is discrete. For children aged 6 or older, the probability of meeting physical activity guidelines 7 out of 7 days is similar to published estimates. For pre-schoolers, the Betabinomial distribution yields higher estimates of adherence to the guidelines than the Beta distribution, in line with the probability of being active on any given day. In estimating the probability of adherence to physical activity guidelines, the Betabinomial distribution has several advantages over the previously used Beta distribution. It is a discrete distribution and maximizes the richness of accelerometer data.
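
    A small Python sketch of the posterior-predictive calculation (parameter values are illustrative, not the survey estimates): with p ~ Beta(α, β) and "active" of "assessed" days observed, the probability of meeting the guidelines on 7 of 7 days is the mass at 7 of a Betabinomial(7, α + active, β + inactive).

        from scipy.stats import betabinom

        a, b = 2.0, 1.5               # hypothetical ML estimates of (α, β)
        active, assessed = 5, 6       # hypothetical accelerometer record
        posterior = betabinom(7, a + active, b + (assessed - active))
        print(posterior.pmf(7))       # P(active on 7 of 7 days)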

  8. Conditional modeling of antibody titers using a zero-inflated poisson random effects model: application to Fabrazyme.

    PubMed

    Bonate, Peter L; Sung, Crystal; Welch, Karen; Richards, Susan

    2009-10-01

    Patients that are exposed to biotechnology-derived therapeutics often develop antibodies to the therapeutic, the magnitude of which is assessed by measuring antibody titers. A statistical approach for analyzing antibody titer data conditional on seroconversion is presented. The proposed method is to first transform the antibody titer data based on a geometric series using a common ratio of 2 and a scale factor of 50 and then analyze the exponent using a zero-inflated or hurdle model assuming a Poisson or negative binomial distribution with random effects to account for patient heterogeneity. Patient specific covariates can be used to model the probability of developing an antibody response, i.e., seroconversion, as well as the magnitude of the antibody titer itself. The method was illustrated using antibody titer data from 87 male seroconverted Fabry patients receiving Fabrazyme. Titers from five clinical trials were collected over 276 weeks of therapy with anti-Fabrazyme IgG titers ranging from 100 to 409,600 after exclusion of seronegative patients. The best model to explain seroconversion was a zero-inflated Poisson (ZIP) model where cumulative dose (under a constant dose regimen of dosing every 2 weeks) influenced the probability of seroconversion. There was an 80% chance of seroconversion when the cumulative dose reached 210 mg (90% confidence interval: 194-226 mg). No difference in antibody titers was noted between Japanese or Western patients. Once seroconverted, antibody titers did not remain constant but decreased in an exponential manner from an initial magnitude to a new lower steady-state value. The expected titer after the new steady-state titer had been achieved was 870 (90% CI: 630-1109). The half-life to the new steady-state value after seroconversion was 44 weeks (90% CI: 17-70 weeks). Time to seroconversion did not appear to be correlated with titer at the time of seroconversion. The method can be adequately used to model antibody titer data.

  9. Establishing endangered species recovery criteria using predictive simulation modeling

    USGS Publications Warehouse

    McGowan, Conor P.; Catlin, Daniel H.; Shaffer, Terry L.; Gratto-Trevor, Cheri L.; Aron, Carol

    2014-01-01

    Listing a species under the Endangered Species Act (ESA) and developing a recovery plan requires U.S. Fish and Wildlife Service to establish specific and measurable criteria for delisting. Generally, species are listed because they face (or are perceived to face) elevated risk of extinction due to issues such as habitat loss, invasive species, or other factors. Recovery plans identify recovery criteria that reduce extinction risk to an acceptable level. It logically follows that the recovery criteria, the defined conditions for removing a species from ESA protections, need to be closely related to extinction risk. Extinction probability is a population parameter estimated with a model that uses current demographic information to project the population into the future over a number of replicates, calculating the proportion of replicated populations that go extinct. We simulated extinction probabilities of piping plovers in the Great Plains and estimated the relationship between extinction probability and various demographic parameters. We tested the fit of regression models linking initial abundance, productivity, or population growth rate to extinction risk, and then, using the regression parameter estimates, determined the conditions required to reduce extinction probability to some pre-defined acceptable threshold. Binomial regression models with mean population growth rate and the natural log of initial abundance were the best predictors of extinction probability 50 years into the future. For example, based on our regression models, an initial abundance of approximately 2400 females with an expected mean population growth rate of 1.0 will limit extinction risk for piping plovers in the Great Plains to less than 0.048. Our method provides a straightforward way of developing specific and measurable recovery criteria linked directly to the core issue of extinction risk. Published by Elsevier Ltd.
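
    The inversion step, from a fitted binomial regression to a measurable recovery criterion, is simple to sketch in Python. The coefficients below are hypothetical stand-ins, not the paper's estimates; the assumed model is logit(P_extinct) = b0 + b1*ln(N0) + b2*lambda.

        import numpy as np

        b0, b1, b2 = 6.0, -0.64, -4.0       # hypothetical fitted coefficients
        target_risk, lam = 0.048, 1.0       # acceptable risk, mean growth rate

        # Invert the logit link to find the initial abundance that meets
        # the target extinction risk at the given growth rate.
        logit_target = np.log(target_risk / (1 - target_risk))
        ln_n0 = (logit_target - b0 - b2 * lam) / b1
        print(f"required initial abundance ~ {np.exp(ln_n0):.0f} females")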

  10. Speed congenics: accelerated genome recovery using genetic markers.

    PubMed

    Visscher, P M

    1999-08-01

    Genetic markers throughout the genome can be used to speed up 'recovery' of the recipient genome in the backcrossing phase of the construction of a congenic strain. The prediction of the genomic proportion during backcrossing depends on the assumptions regarding the distribution of chromosome segments, the population structure, the marker spacing and the selection strategy. In this study simulation was used to investigate the rate of recovery of the recipient genome for a mouse, Drosophila and Arabidopsis genome. It was shown that an incorrect assumption of a binomial distribution of chromosome segments, and failing to take account of a reduction in variance in genomic proportion due to selection, can lead to a downward bias of up to two generations in the estimation of the number of generations required for the formation of a congenic strain.
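
    The binomial assumption criticized above is easy to state in code: it treats every marker as transmitted independently. The Python sketch below simulates marker-assisted selection under exactly that assumption (parameters are illustrative); because real chromosome segments are linked, such a simulation is optimistic, which is the source of the bias the study quantifies.

        import numpy as np

        rng = np.random.default_rng(2)
        n_markers, litter = 100, 20

        het = np.ones(n_markers, dtype=bool)     # F1: heterozygous everywhere
        for g in range(1, 6):
            # Each backcross offspring keeps each heterozygous marker with
            # probability 1/2, independently across markers (the binomial
            # assumption); markers fixed for the recipient allele stay fixed.
            progeny = rng.random((litter, n_markers)) < 0.5
            progeny &= het
            het = progeny[progeny.sum(axis=1).argmin()]   # pick best offspring
            print(f"BC{g}: recipient genome ~ {1 - het.mean() / 2:.3f}")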

  11. Epidemiological study of the intestinal helminths of wild boar (Sus scrofa) and mouflon (Ovis gmelini musimon) in central Italy.

    PubMed

    Magi, M; Bertani, M; Dell'Omodarme, M; Prati, M C

    2002-12-01

    Since 1995 the population of wild ungulates has increased significantly in the "Parco provinciale dei Monti Livornesi" (Livorno, Tuscany, Central Italy). We studied the intestinal macroparasites of two hosts, the wild boar (Sus scrofa) and the mouflon (Ovis gmelini musimon). In the case of wild boars we found a dominant parasite species, Globocephalus urosubulatus. For this parasite the frequency distribution of the number of parasites per host agrees with a negative binomial distribution. There is no significant correlation between the age of the animals and the parasitosis. Furthermore, the mean parasite burden of male and female wild boars does not differ significantly. In the case of mouflons we found a dominant parasite species, Nematodirus filicollis, with Trichuris ovis as a codominant species.
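
    The standard check behind a statement like this fits the negative binomial's aggregation parameter k by moments. A Python sketch with hypothetical per-host counts (not the study's data):

        import numpy as np
        from scipy.stats import nbinom

        counts = np.array([0, 0, 1, 0, 3, 12, 0, 2, 45, 7, 0, 1, 0, 5, 19])
        m, s2 = counts.mean(), counts.var(ddof=1)
        k = m**2 / (s2 - m)       # moment estimator; needs s2 > m (aggregation)
        p = k / (k + m)           # scipy parameterization: nbinom(k, p)
        print(f"mean = {m:.2f}, k = {k:.2f}")
        print("fitted P(X=0):", nbinom.pmf(0, k, p),
              "observed:", (counts == 0).mean())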

  12. Bursts of Self-Conscious Emotions in the Daily Lives of Emerging Adults.

    PubMed

    Conroy, David E; Ram, Nilam; Pincus, Aaron L; Rebar, Amanda L

    Self-conscious emotions play a role in regulating daily achievement strivings, social behavior, and health, but little is known about the processes underlying their daily manifestation. Emerging adults (n = 182) completed daily diaries for eight days and multilevel models were estimated to evaluate whether, how much, and why their emotions varied from day-to-day. Within-person variation in authentic pride was normally-distributed across people and days whereas the other emotions were burst-like and characterized by zero-inflated, negative binomial distributions. Perceiving social interactions as generally communal increased the odds of hubristic pride activation and reduced the odds of guilt activation; daily communal behavior reduced guilt intensity. Results illuminated processes through which meaning about the self-in-relation-to-others is constructed during a critical period of development.

  13. Flood Frequency Analysis With Historical and Paleoflood Information

    NASA Astrophysics Data System (ADS)

    Stedinger, Jery R.; Cohn, Timothy A.

    1986-05-01

    An investigation is made of flood quantile estimators which can employ "historical" and paleoflood information in flood frequency analyses. Two categories of historical information are considered: "censored" data, where the magnitudes of historical flood peaks are known; and "binomial" data, where only threshold exceedance information is available. A Monte Carlo study employing the two-parameter lognormal distribution shows that maximum likelihood estimators (MLEs) can extract the equivalent of an additional 10-30 years of gage record from a 50-year period of historical observation. The MLE routines are shown to be substantially better than an adjusted-moment estimator similar to the one recommended in Bulletin 17B of the United States Water Resources Council Hydrology Committee (1982). The MLE methods performed well even when floods were drawn from other than the assumed lognormal distribution.
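
    The "binomial" likelihood contribution is worth spelling out: a historical period of h years in which a perception threshold T was exceeded k times contributes a Binomial(h, 1 - F(T)) term, alongside the usual density terms for the gauged record. A self-contained Python sketch for the two-parameter lognormal, with synthetic data rather than any real series:

        import numpy as np
        from scipy.stats import norm
        from scipy.optimize import minimize

        rng = np.random.default_rng(3)
        gauged = rng.lognormal(6.0, 0.5, size=50)   # 50 yr of measured peaks
        T, h, k = 1500.0, 100, 4                    # 4 exceedances in 100 yr

        def neg_loglik(theta):
            mu, log_sigma = theta
            sigma = np.exp(log_sigma)
            z = (np.log(T) - mu) / sigma
            # lognormal log-density for the gauged record
            ll = (norm.logpdf(np.log(gauged), mu, sigma) - np.log(gauged)).sum()
            # binomial (threshold-exceedance) term for the historical period
            ll += k * norm.logsf(z) + (h - k) * norm.logcdf(z)
            return -ll

        fit = minimize(neg_loglik, x0=[np.log(gauged).mean(), 0.0])
        mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
        print(f"100-yr flood ~ {np.exp(mu_hat + sigma_hat * norm.ppf(0.99)):.0f}")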

  14. A robust design mark-resight abundance estimator allowing heterogeneity in resighting probabilities

    USGS Publications Warehouse

    McClintock, B.T.; White, Gary C.; Burnham, K.P.

    2006-01-01

    This article introduces the beta-binomial estimator (BBE), a closed-population abundance mark-resight model combining the favorable qualities of maximum likelihood theory and the allowance of individual heterogeneity in sighting probability (p). The model may be parameterized for a robust sampling design consisting of multiple primary sampling occasions where closure need not be met between primary occasions. We applied the model to brown bear data from three study areas in Alaska and compared its performance to the joint hypergeometric estimator (JHE) and Bowden's estimator (BOWE). BBE estimates suggest heterogeneity levels were non-negligible and discourage the use of JHE for these data. Compared to JHE and BOWE, confidence intervals were considerably shorter for the AICc model-averaged BBE. To evaluate the properties of BBE relative to JHE and BOWE when sample sizes are small, simulations were performed with data from three primary occasions generated under both individual heterogeneity and temporal variation in p. All models remained consistent regardless of levels of variation in p. In terms of precision, the AICc model-averaged BBE showed advantages over JHE and BOWE when heterogeneity was present and mean sighting probabilities were similar between primary occasions. Based on the conditions examined, BBE is a reliable alternative to JHE or BOWE and provides a framework for further advances in mark-resight abundance estimation. ?? 2006 American Statistical Association and the International Biometric Society.

  15. Recall of Anti-Tobacco Advertisements and Effects on Quitting Behavior: Results From the California Smokers Cohort

    PubMed Central

    Leas, Eric C.; Myers, Mark G.; Strong, David R.; Hofstetter, C. Richard

    2015-01-01

    Objectives. We assessed whether an anti-tobacco television advertisement called “Stages,” which depicted a woman giving a brief emotional narrative of her experiences with tobacco use, would be recalled more often and have a greater effect on smoking cessation than 3 other advertisements with different intended themes. Methods. Our data were derived from a sample of 2596 California adult smokers. We used multivariable log-binomial and modified Poisson regression models to calculate respondents’ probability of quitting as a result of advertisement recall. Results. More respondents recalled the “Stages” ad (58.5%) than the 3 other ads (23.1%, 23.4%, and 25.6%; P < .001). Respondents who recalled “Stages” at baseline had a higher probability than those who did not recall the ad of making a quit attempt between baseline and follow-up (adjusted risk ratio [RR] = 1.18; 95% confidence interval [CI] = 1.03, 1.34) and a higher probability of being in a period of smoking abstinence for at least a month at follow-up (adjusted RR = 1.55; 95% CI = 1.02, 2.37). Conclusions. Anti-tobacco television advertisements that depict visceral and personal messages may be recalled by a larger percentage of smokers and may have a greater impact on smoking cessation than other types of advertisements. PMID:25521871

  16. Possibility and Challenges of Conversion of Current Virus Species Names to Linnaean Binomials

    PubMed Central

    Postler, Thomas S.; Clawson, Anna N.; Amarasinghe, Gaya K.; Basler, Christopher F.; Bavari, Sbina; Benkő, Mária; Blasdell, Kim R.; Briese, Thomas; Buchmeier, Michael J.; Bukreyev, Alexander; Calisher, Charles H.; Chandran, Kartik; Charrel, Rémi; Clegg, Christopher S.; Collins, Peter L.; Juan Carlos, De La Torre; Derisi, Joseph L.; Dietzgen, Ralf G.; Dolnik, Olga; Dürrwald, Ralf; Dye, John M.; Easton, Andrew J.; Emonet, Sébastian; Formenty, Pierre; Fouchier, Ron A. M.; Ghedin, Elodie; Gonzalez, Jean-Paul; Harrach, Balázs; Hewson, Roger; Horie, Masayuki; Jiāng, Dàohóng; Kobinger, Gary; Kondo, Hideki; Kropinski, Andrew M.; Krupovic, Mart; Kurath, Gael; Lamb, Robert A.; Leroy, Eric M.; Lukashevich, Igor S.; Maisner, Andrea; Mushegian, Arcady R.; Netesov, Sergey V.; Nowotny, Norbert; Patterson, Jean L.; Payne, Susan L.; PaWeska, Janusz T.; Peters, Clarence J.; Radoshitzky, Sheli R.; Rima, Bertus K.; Romanowski, Victor; Rubbenstroth, Dennis; Sabanadzovic, Sead; Sanfaçon, Hélène; Salvato, Maria S.; Schwemmle, Martin; Smither, Sophie J.; Stenglein, Mark D.; Stone, David M.; Takada, Ayato; Tesh, Robert B.; Tomonaga, Keizo; Tordo, Noël; Towner, Jonathan S.; Vasilakis, Nikos; Volchkov, Viktor E.; Wahl-Jensen, Victoria; Walker, Peter J.; Wang, Lin-Fa; Varsani, Arvind; Whitfield, Anna E.; Zerbini, F. Murilo; Kuhn, Jens H.

    2017-01-01

    Abstract Botanical, mycological, zoological, and prokaryotic species names follow the Linnaean format, consisting of an italicized Latinized binomen with a capitalized genus name and a lower case species epithet (e.g., Homo sapiens). Virus species names, however, do not follow a uniform format, and, even when binomial, are not Linnaean in style. In this thought exercise, we attempted to convert all currently official names of species included in the virus family Arenaviridae and the virus order Mononegavirales to Linnaean binomials, and to identify and address associated challenges and concerns. Surprisingly, this endeavor was not as complicated or time-consuming as even the authors of this article expected when conceiving the experiment. PMID:27798405

  17. Binomial Coefficients Modulo a Prime--A Visualization Approach to Undergraduate Research

    ERIC Educational Resources Information Center

    Bardzell, Michael; Poimenidou, Eirini

    2011-01-01

    In this article we present, as a case study, results of undergraduate research involving binomial coefficients modulo a prime "p." We will discuss how undergraduates were involved in the project, even with a minimal mathematical background beforehand. There are two main avenues of exploration described to discover these binomial…
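
    A standard entry point for explorations of this kind is Lucas' theorem: C(n, k) mod p is the product of the binomial coefficients of the base-p digits of n and k. A short Python sketch, added here for illustration:

        from math import comb

        def binom_mod_p(n, k, p):
            # C(n, k) mod p for prime p, via Lucas' theorem
            result = 1
            while n or k:
                n, nd = divmod(n, p)
                k, kd = divmod(k, p)
                if kd > nd:
                    return 0          # a base-p digit of k exceeds that of n
                result = result * comb(nd, kd) % p
            return result

        # Pascal's triangle mod 2 shows the Sierpinski-triangle pattern:
        for n in range(8):
            print([binom_mod_p(n, j, 2) for j in range(n + 1)])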

  18. Integer Solutions of Binomial Coefficients

    ERIC Educational Resources Information Center

    Gilbertson, Nicholas J.

    2016-01-01

    A good formula is like a good story, rich in description, powerful in communication, and eye-opening to readers. The formula presented in this article for determining the coefficients of the binomial expansion of (x + y)n is one such "good read." The beauty of this formula is in its simplicity--both describing a quantitative situation…

  19. Confidence Intervals for Weighted Composite Scores under the Compound Binomial Error Model

    ERIC Educational Resources Information Center

    Kim, Kyung Yong; Lee, Won-Chan

    2018-01-01

    Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…

  20. Coin Tossing Explains the Activity of Opposing Microtubule Motors on Phagosomes.

    PubMed

    Sanghavi, Paulomi; D'Souza, Ashwin; Rai, Ashim; Rai, Arpan; Padinhatheeri, Ranjith; Mallik, Roop

    2018-05-07

    How the opposing activity of kinesin and dynein motors generates polarized distribution of organelles inside cells is poorly understood and hotly debated [1, 2]. Possible explanations include stochastic mechanical competition [3, 4], coordinated regulation by motor-associated proteins [5-7], mechanical activation of motors [8], and lipid-induced organization [9]. Here, we address this question by using phagocytosed latex beads to generate early phagosomes (EPs) that move bidirectionally along microtubules (MTs) in an in vitro assay [9]. Dynein/kinesin activity on individual EPs is recorded as real-time force generation of the motors against an optical trap. Activity of one class of motors frequently coincides with, or is rapidly followed by, that of the opposite motors. This leads to frequent and rapid reversals of EPs in the trap. Remarkably, the choice between dynein and kinesin can be explained by the tossing of a coin. Opposing motors therefore appear to function stochastically and independently of each other, as also confirmed by observing no effect on kinesin function when dynein is inhibited on the EPs. A simple binomial probability calculation based on the geometry of EP-microtubule contact explains the observed activity of dynein and kinesin on phagosomes. This understanding of intracellular transport in terms of a hypothetical coin, if it holds true for other cargoes, provides a conceptual framework to explain the polarized localization of organelles inside cells. Copyright © 2018 The Author(s). Published by Elsevier Ltd. All rights reserved.

  1. Macro-level pedestrian and bicycle crash analysis: Incorporating spatial spillover effects in dual state count models.

    PubMed

    Cai, Qing; Lee, Jaeyoung; Eluru, Naveen; Abdel-Aty, Mohamed

    2016-08-01

    This study attempts to explore the viability of dual-state models (i.e., zero-inflated and hurdle models) for traffic analysis zones (TAZs) based pedestrian and bicycle crash frequency analysis. Additionally, spatial spillover effects are explored in the models by employing exogenous variables from neighboring zones. The dual-state models such as zero-inflated negative binomial and hurdle negative binomial models (with and without spatial effects) are compared with the conventional single-state model (i.e., negative binomial). The model comparison for pedestrian and bicycle crashes revealed that the models that considered observed spatial effects perform better than the models that did not consider the observed spatial effects. Across the models with spatial spillover effects, the dual-state models especially zero-inflated negative binomial model offered better performance compared to single-state models. Moreover, the model results clearly highlighted the importance of various traffic, roadway, and sociodemographic characteristics of the TAZ as well as neighboring TAZs on pedestrian and bicycle crash frequency. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. A comparison of different statistical methods analyzing hypoglycemia data using bootstrap simulations.

    PubMed

    Jiang, Honghua; Ni, Xiao; Huster, William; Heilmann, Cory

    2015-01-01

    Hypoglycemia has long been recognized as a major barrier to achieving normoglycemia with intensive diabetic therapies. It is a common safety concern for diabetes patients. Therefore, it is important to apply appropriate statistical methods when analyzing hypoglycemia data. Here, we carried out bootstrap simulations to investigate the performance of four commonly used statistical models (Poisson, negative binomial, analysis of covariance [ANCOVA], and rank ANCOVA) based on data from a diabetes clinical trial. A zero-inflated Poisson (ZIP) model and a zero-inflated negative binomial (ZINB) model were also evaluated. Simulation results showed that the Poisson model inflated type I error, while the negative binomial model was overly conservative. However, after adjusting for dispersion, both the Poisson and negative binomial models yielded only slightly inflated type I errors, close to the nominal level, and reasonable power. The ANCOVA model provided reasonable control of type I error. The rank ANCOVA model was associated with the greatest power and with reasonable control of type I error. Inflated type I error was observed with the ZIP and ZINB models.

  3. Random trinomial tree models and vanilla options

    NASA Astrophysics Data System (ADS)

    Ganikhodjaev, Nasir; Bayram, Kamola

    2013-09-01

    In this paper we introduce and study the random trinomial model. The usual trinomial model is prescribed by a triple of numbers (u, d, m). We call the triple (u, d, m) an environment of the trinomial model. A triple (Un, Dn, Mn), where {Un}, {Dn} and {Mn} are sequences of independent, identically distributed random variables with 0 < Dn < 1 < Un and Mn = 1 for all n, is called a random environment, and a trinomial tree model with a random environment is called a random trinomial model. The random trinomial model is considered to produce more accurate results than the random binomial model or the usual trinomial model.
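
    For a concrete anchor, here is a Python sketch of vanilla call pricing on the usual (non-random) trinomial tree with m = 1; the random version would redraw (u, d) at each step. The probabilities follow the common two-step construction; all numbers are illustrative.

        import numpy as np

        def trinomial_call(S0, K, r, sigma, T, steps):
            dt = T / steps
            u = np.exp(sigma * np.sqrt(2 * dt))      # environment (u, 1/u, 1)
            e = np.exp(sigma * np.sqrt(dt / 2))
            pu = ((np.exp(r * dt / 2) - 1 / e) / (e - 1 / e)) ** 2
            pd = ((e - np.exp(r * dt / 2)) / (e - 1 / e)) ** 2
            pm = 1 - pu - pd
            disc = np.exp(-r * dt)
            j = np.arange(-steps, steps + 1)         # node offsets at maturity
            v = np.maximum(S0 * u**j - K, 0.0)       # terminal call payoffs
            for _ in range(steps):                   # backward induction
                v = disc * (pu * v[2:] + pm * v[1:-1] + pd * v[:-2])
            return v[0]

        print(trinomial_call(100, 100, 0.05, 0.2, 1.0, 500))  # ~10.45 (Black-Scholes)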

  4. Event-by-event gluon multiplicity, energy density, and eccentricities in ultrarelativistic heavy-ion collisions

    NASA Astrophysics Data System (ADS)

    Schenke, Björn; Tribedy, Prithwish; Venugopalan, Raju

    2012-09-01

    The event-by-event multiplicity distribution, the energy densities and energy density weighted eccentricity moments ε_n (up to n = 6) at early times in heavy-ion collisions at both the BNL Relativistic Heavy Ion Collider (RHIC) (√s = 200 GeV) and the CERN Large Hadron Collider (LHC) (√s = 2.76 TeV) are computed in the IP-Glasma model. This framework combines the impact parameter dependent saturation model (IP-Sat) for nucleon parton distributions (constrained by HERA deeply inelastic scattering data) with an event-by-event classical Yang-Mills description of early-time gluon fields in heavy-ion collisions. The model produces multiplicity distributions that are convolutions of negative binomial distributions without further assumptions or parameters. In the limit of large dense systems, the n-particle gluon distribution predicted by the Glasma-flux tube model is demonstrated to be nonperturbatively robust. In the general case, the effect of additional geometrical fluctuations is quantified. The eccentricity moments are compared to the MC-KLN model; a noteworthy feature is that fluctuation dominated odd moments are consistently larger than in the MC-KLN model.

  5. Intermittency via moments and distributions in central O+Cu collisions at 14. 6 A[center dot]GeV/c

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tannenbaum, M.J.

    Fluctuations in pseudorapidity distributions of charged particles from central (ZCAL) collisions of ¹⁶O+Cu at 14.6 A·GeV/c have been analyzed by Ju Kang using the method of scaled factorial moments as a function of the interval δη; an apparent power-law growth of moments with decreasing interval is observed down to δη ≈ 0.1, and the measured slope parameters are found to obey two scaling rules. Previous experience with E_T distributions suggested that fluctuations of multiplicity and transverse energy can be well described by Gamma or Negative Binomial Distributions (NBD), and excellent fits to NBD were obtained in all δη bins. The k parameter of the NBD fit was found to increase linearly with the δη interval, which, due to the well known property of the NBD under convolution, indicates that the multiplicity distributions in adjacent bins of pseudorapidity δη ≈ 0.1 are largely statistically independent.

  6. Intermittency via moments and distributions in central O+Cu collisions at 14.6 A{center_dot}GeV/c

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tannenbaum, M.J.; The E802 Collaboration

    Fluctuations in pseudorapidity distributions of charged particles from central (ZCAL) collisions of ¹⁶O+Cu at 14.6 A·GeV/c have been analyzed by Ju Kang using the method of scaled factorial moments as a function of the interval δη; an apparent power-law growth of moments with decreasing interval is observed down to δη ≈ 0.1, and the measured slope parameters are found to obey two scaling rules. Previous experience with E_T distributions suggested that fluctuations of multiplicity and transverse energy can be well described by Gamma or Negative Binomial Distributions (NBD), and excellent fits to NBD were obtained in all δη bins. The k parameter of the NBD fit was found to increase linearly with the δη interval, which, due to the well known property of the NBD under convolution, indicates that the multiplicity distributions in adjacent bins of pseudorapidity δη ≈ 0.1 are largely statistically independent.

  7. Spatial distribution of citizen science casuistic observations for different taxonomic groups.

    PubMed

    Tiago, Patrícia; Ceia-Hasse, Ana; Marques, Tiago A; Capinha, César; Pereira, Henrique M

    2017-10-16

    Opportunistic citizen science databases are becoming an important way of gathering information on species distributions. These data are temporally and spatially dispersed and could have limitations regarding biases in the distribution of the observations in space and/or time. In this work, we test the influence of landscape variables on the distribution of citizen science observations for eight taxonomic groups. We use data collected through a Portuguese citizen science database (biodiversity4all.org). We use a zero-inflated negative binomial regression to model the distribution of observations as a function of a set of variables representing the landscape features plausibly influencing the spatial distribution of the records. Results suggest that the density of paths is the most important variable, having a statistically significant positive relationship with the number of observations for seven of the eight taxa considered. Wetland coverage was also identified as having a significant positive relationship for birds, amphibians and reptiles, and mammals. Our results highlight that the distribution of species observations in citizen science projects is spatially biased. Higher frequency of observations is driven largely by accessibility and by the presence of water bodies. We conclude that efforts are required to increase the spatial evenness of sampling effort from volunteers.

  8. Kinetics of low-temperature transitions and a reaction rate theory from non-equilibrium distributions

    PubMed Central

    Aquilanti, Vincenzo; Coutinho, Nayara Dantas

    2017-01-01

    This article surveys the empirical information which originated both by laboratory experiments and by computational simulations, and expands previous understanding of the rates of chemical processes in the low-temperature range, where deviations from linearity of Arrhenius plots were revealed. The phenomenological two-parameter Arrhenius equation requires improvement for applications where interpolation or extrapolations are demanded in various areas of modern science. Based on Tolman's theorem, the dependence of the reciprocal of the apparent activation energy as a function of reciprocal absolute temperature permits the introduction of a deviation parameter d covering uniformly a variety of rate processes, from those where quantum mechanical tunnelling is significant and d < 0, to those where d > 0, corresponding to the Pareto–Tsallis statistical weights: these generalize the Boltzmann–Gibbs weight, which is recovered for d = 0. It is shown here how the weights arise, relaxing the thermodynamic equilibrium limit, either for a binomial distribution if d > 0 or for a negative binomial distribution if d < 0, formally corresponding to Fermion-like or Boson-like statistics, respectively. The current status of the phenomenology is illustrated emphasizing case studies; specifically (i) the super-Arrhenius kinetics, where transport phenomena accelerate processes as the temperature increases; (ii) the sub-Arrhenius kinetics, where quantum mechanical tunnelling propitiates low-temperature reactivity; (iii) the anti-Arrhenius kinetics, where processes with no energetic obstacles are rate-limited by molecular reorientation requirements. Particular attention is given for case (i) to the treatment of diffusion and viscosity, for case (ii) to formulation of a transition rate theory for chemical kinetics including quantum mechanical tunnelling, and for case (iii) to the stereodirectional specificity of the dynamics of reactions strongly hindered by the increase of temperature. This article is part of the themed issue ‘Theoretical and computational studies of non-equilibrium and non-statistical dynamics in the gas phase, in the condensed phase and at interfaces’. PMID:28320904

  9. Kinetics of low-temperature transitions and a reaction rate theory from non-equilibrium distributions.

    PubMed

    Aquilanti, Vincenzo; Coutinho, Nayara Dantas; Carvalho-Silva, Valter Henrique

    2017-04-28

    This article surveys the empirical information which originated both by laboratory experiments and by computational simulations, and expands previous understanding of the rates of chemical processes in the low-temperature range, where deviations from linearity of Arrhenius plots were revealed. The phenomenological two-parameter Arrhenius equation requires improvement for applications where interpolation or extrapolations are demanded in various areas of modern science. Based on Tolman's theorem, the dependence of the reciprocal of the apparent activation energy as a function of reciprocal absolute temperature permits the introduction of a deviation parameter d covering uniformly a variety of rate processes, from those where quantum mechanical tunnelling is significant and d < 0, to those where d > 0, corresponding to the Pareto-Tsallis statistical weights: these generalize the Boltzmann-Gibbs weight, which is recovered for d = 0. It is shown here how the weights arise, relaxing the thermodynamic equilibrium limit, either for a binomial distribution if d > 0 or for a negative binomial distribution if d < 0, formally corresponding to Fermion-like or Boson-like statistics, respectively. The current status of the phenomenology is illustrated emphasizing case studies; specifically (i) the super-Arrhenius kinetics, where transport phenomena accelerate processes as the temperature increases; (ii) the sub-Arrhenius kinetics, where quantum mechanical tunnelling propitiates low-temperature reactivity; (iii) the anti-Arrhenius kinetics, where processes with no energetic obstacles are rate-limited by molecular reorientation requirements. Particular attention is given for case (i) to the treatment of diffusion and viscosity, for case (ii) to formulation of a transition rate theory for chemical kinetics including quantum mechanical tunnelling, and for case (iii) to the stereodirectional specificity of the dynamics of reactions strongly hindered by the increase of temperature. This article is part of the themed issue 'Theoretical and computational studies of non-equilibrium and non-statistical dynamics in the gas phase, in the condensed phase and at interfaces'. © 2017 The Author(s).

  10. Zero adjusted models with applications to analysing helminths count data.

    PubMed

    Chipeta, Michael G; Ngwira, Bagrey M; Simoonga, Christopher; Kazembe, Lawrence N

    2014-11-27

    It is common in public health and epidemiology that the outcome of interest is a count of event occurrences. Analysing these data using classical linear models is mostly inappropriate, even after transformation of outcome variables, due to overdispersion. Zero-adjusted mixture count models such as zero-inflated and hurdle count models are applied to count data when over-dispersion and excess zeros exist. The main objective of the current paper is to apply such models to analyse risk factors associated with human helminths (S. haematobium), particularly in a case where there is a high proportion of zero counts. The data were collected during a community-based randomised controlled trial assessing the impact of mass drug administration (MDA) with praziquantel in Malawi, and a school-based cross-sectional epidemiology survey in Zambia. Count data models including traditional (Poisson and negative binomial) models, zero-modified models (zero-inflated Poisson and zero-inflated negative binomial) and hurdle models (Poisson logit hurdle and negative binomial logit hurdle) were fitted and compared. Using the Akaike information criterion (AIC), the negative binomial logit hurdle (NBLH) and zero-inflated negative binomial (ZINB) showed the best performance in both datasets. With regard to capturing zero counts, these models performed better than the others. This paper showed that the zero-modified NBLH and ZINB models are more appropriate methods for the analysis of data with excess zeros. The choice between the hurdle and zero-inflated models should be based on the aim and endpoints of the study.
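
    As a compact illustration of the zero-inflation idea (not the paper's ZINB/hurdle fits, which include covariates), the Python sketch below fits a zero-inflated Poisson by direct maximum likelihood on simulated counts and reports an AIC of the kind used for the model comparison.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.special import expit, gammaln

        rng = np.random.default_rng(4)
        y = np.where(rng.random(400) < 0.6, 0, rng.poisson(3.0, 400))

        def zip_negloglik(theta):
            pi, lam = expit(theta[0]), np.exp(theta[1])   # constrain ranges
            log_pois = -lam + y * np.log(lam) - gammaln(y + 1)
            ll = np.where(y == 0,
                          np.log(pi + (1 - pi) * np.exp(-lam)),
                          np.log(1 - pi) + log_pois)
            return -ll.sum()

        fit = minimize(zip_negloglik, x0=[0.0, 0.0])
        print("pi =", expit(fit.x[0]), "lambda =", np.exp(fit.x[1]))
        print("AIC =", 2 * 2 + 2 * fit.fun)   # 2 parameters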

  11. Probability Distributome: A Web Computational Infrastructure for Exploring the Properties, Interrelations, and Applications of Probability Distributions.

    PubMed

    Dinov, Ivo D; Siegrist, Kyle; Pearl, Dennis K; Kalinin, Alexandr; Christou, Nicolas

    2016-06-01

    Probability distributions are useful for modeling, simulation, analysis, and inference on varieties of natural processes and physical phenomena. There are uncountably many probability distributions. However, a few dozen families of distributions are commonly defined and are frequently used in practice for problem solving, experimental applications, and theoretical studies. In this paper, we present a new computational and graphical infrastructure, the Distributome , which facilitates the discovery, exploration and application of diverse spectra of probability distributions. The extensible Distributome infrastructure provides interfaces for (human and machine) traversal, search, and navigation of all common probability distributions. It also enables distribution modeling, applications, investigation of inter-distribution relations, as well as their analytical representations and computational utilization. The entire Distributome framework is designed and implemented as an open-source, community-built, and Internet-accessible infrastructure. It is portable, extensible and compatible with HTML5 and Web2.0 standards (http://Distributome.org). We demonstrate two types of applications of the probability Distributome resources: computational research and science education. The Distributome tools may be employed to address five complementary computational modeling applications (simulation, data-analysis and inference, model-fitting, examination of the analytical, mathematical and computational properties of specific probability distributions, and exploration of the inter-distributional relations). Many high school and college science, technology, engineering and mathematics (STEM) courses may be enriched by the use of modern pedagogical approaches and technology-enhanced methods. The Distributome resources provide enhancements for blended STEM education by improving student motivation, augmenting the classical curriculum with interactive webapps, and overhauling the learning assessment protocols.

  12. Probability Distributome: A Web Computational Infrastructure for Exploring the Properties, Interrelations, and Applications of Probability Distributions

    PubMed Central

    Dinov, Ivo D.; Siegrist, Kyle; Pearl, Dennis K.; Kalinin, Alexandr; Christou, Nicolas

    2015-01-01

    Probability distributions are useful for modeling, simulation, analysis, and inference on varieties of natural processes and physical phenomena. There are uncountably many probability distributions. However, a few dozen families of distributions are commonly defined and are frequently used in practice for problem solving, experimental applications, and theoretical studies. In this paper, we present a new computational and graphical infrastructure, the Distributome, which facilitates the discovery, exploration and application of diverse spectra of probability distributions. The extensible Distributome infrastructure provides interfaces for (human and machine) traversal, search, and navigation of all common probability distributions. It also enables distribution modeling, applications, investigation of inter-distribution relations, as well as their analytical representations and computational utilization. The entire Distributome framework is designed and implemented as an open-source, community-built, and Internet-accessible infrastructure. It is portable, extensible and compatible with HTML5 and Web2.0 standards (http://Distributome.org). We demonstrate two types of applications of the probability Distributome resources: computational research and science education. The Distributome tools may be employed to address five complementary computational modeling applications (simulation, data-analysis and inference, model-fitting, examination of the analytical, mathematical and computational properties of specific probability distributions, and exploration of the inter-distributional relations). Many high school and college science, technology, engineering and mathematics (STEM) courses may be enriched by the use of modern pedagogical approaches and technology-enhanced methods. The Distributome resources provide enhancements for blended STEM education by improving student motivation, augmenting the classical curriculum with interactive webapps, and overhauling the learning assessment protocols. PMID:27158191

  13. Random Partition Distribution Indexed by Pairwise Information

    PubMed Central

    Dahl, David B.; Day, Ryan; Tsai, Jerry W.

    2017-01-01

    We propose a random partition distribution indexed by pairwise similarity information such that partitions compatible with the similarities are given more probability. The use of pairwise similarities, in the form of distances, is common in some clustering algorithms (e.g., hierarchical clustering), but we show how to use this type of information to define a prior partition distribution for flexible Bayesian modeling. A defining feature of the distribution is that it allocates probability among partitions within a given number of subsets, but it does not shift probability among sets of partitions with different numbers of subsets. Our distribution places more probability on partitions that group similar items yet keeps the total probability of partitions with a given number of subsets constant. The distribution of the number of subsets (and its moments) is available in closed-form and is not a function of the similarities. Our formulation has an explicit probability mass function (with a tractable normalizing constant) so the full suite of MCMC methods may be used for posterior inference. We compare our distribution with several existing partition distributions, showing that our formulation has attractive properties. We provide three demonstrations to highlight the features and relative performance of our distribution. PMID:29276318

  14. A brief introduction to probability.

    PubMed

    Di Paola, Gioacchino; Bertani, Alessandro; De Monte, Lavinia; Tuzzolino, Fabio

    2018-02-01

    The theory of probability has been debated for centuries: back in 1600, French mathematicians used the rules of probability to place and win bets. Subsequently, the knowledge of probability has significantly evolved and is now an essential tool for statistics. In this paper, the basic theoretical principles of probability will be reviewed, with the aim of facilitating the comprehension of statistical inference. After a brief general introduction on probability, we will review the concept of the "probability distribution," that is, a function providing the probabilities of occurrence of the different possible outcomes of a categorical or continuous variable. Specific attention will be given to the normal distribution, the distribution most widely applied in statistical analysis.

  15. Evaluation of an operational malaria outbreak identification and response system in Mpumalanga Province, South Africa.

    PubMed

    Coleman, Marlize; Coleman, Michael; Mabuza, Aaron M; Kok, Gerdalize; Coetzee, Maureen; Durrheim, David N

    2008-04-27

    To evaluate the performance of a novel malaria outbreak identification system in the epidemic-prone rural area of Mpumalanga Province, South Africa, for timely identification of malaria outbreaks and guiding integrated public health responses. Using five years of historical notification data, two binomial thresholds were determined for each primary health care facility in the highest malaria risk area of Mpumalanga province. Whenever the thresholds were exceeded at health facility level (tier 1), primary health care staff notified the malaria control programme, which then confirmed adequate stocks of malaria treatment to manage potential increased cases. The cases were followed up at household level to verify the likely source of infection. The binomial thresholds were reviewed at village/town level (tier 2) to determine whether additional response measures were required. In addition, an automated electronic outbreak identification system at town/village level (tier 2) was integrated into the case notification database (tier 3) to ensure that unexpected increases in case notification were not missed. The performance of these binomial outbreak thresholds was evaluated against other currently recommended thresholds using retrospective data. The acceptability of the system at primary health care level was evaluated through structured interviews with health facility staff. Eighty-four percent of health facilities reported outbreaks within 24 hours (n = 95), 92% (n = 104) within 48 hours and 100% (n = 113) within 72 hours. Appropriate responses to all malaria outbreaks (n = 113 at tier 1; n = 46 at tier 2) were achieved within 24 hours. The system was viewed positively by all health facility staff. When compared to other epidemiological systems for a specified 12-month outbreak season (June 2003 to July 2004), the binomial exact thresholds produced one false weekly outbreak, the C-sum 12 false weekly outbreaks, and the mean + 2 SD nine false weekly outbreaks. Exceeding the binomial level 1 threshold triggered an alert four weeks prior to an outbreak, whereas exceeding the binomial level 2 threshold identified an outbreak as it occurred. The malaria outbreak surveillance system using binomial thresholds achieved its primary goal of identifying outbreaks early, facilitating appropriate local public health responses aimed at averting a possible large-scale epidemic in a low and unstable malaria transmission setting.
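
    The binomial exact threshold at the heart of this system can be sketched numerically: given a facility's population at risk and its historical weekly case probability, the alert threshold is the smallest count whose upper-tail probability falls below a chosen alarm level. A minimal sketch with scipy; the population, baseline risk, and alarm level are invented for illustration, not taken from the study:

    ```python
    from scipy.stats import binom

    def binomial_threshold(n_at_risk, baseline_rate, alpha=0.05):
        """Smallest case count c with P(X >= c) < alpha under Binomial(n, p)."""
        for c in range(n_at_risk + 1):
            # sf(c - 1) is the upper-tail probability P(X >= c)
            if binom.sf(c - 1, n_at_risk, baseline_rate) < alpha:
                return c
        return n_at_risk + 1

    # hypothetical facility: 5,000 people at risk, historical weekly risk 0.001
    print(binomial_threshold(5000, 0.001))  # weekly case count that triggers an alert
    ```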

  16. Impact of cigarette smoking on utilization of nursing home services.

    PubMed

    Warner, Kenneth E; McCammon, Ryan J; Fries, Brant E; Langa, Kenneth M

    2013-11-01

    Few studies have examined the effects of smoking on nursing home utilization, and those that have generally relied on poor data on smoking status. No previous study has distinguished the utilization of recent quitters from that of long-term quitters. Using the Health and Retirement Study, we assessed nursing home utilization by never-smokers, long-term quitters (quit >3 years), recent quitters (quit ≤3 years), and current smokers. We used logistic regression to evaluate the likelihood of a nursing home admission. For those with an admission, we used negative binomial regression on the number of nursing home nights. Finally, we employed zero-inflated negative binomial regression to estimate nights for the full sample. Controlling for other variables, compared with never-smokers, long-term quitters have an odds ratio (OR) for nursing home admission of 1.18 (95% CI: 1.07-1.2), current smokers 1.39 (1.23-1.57), and recent quitters 1.55 (1.29-1.87). The probability of admission rises rapidly with age and is lower for African Americans and Hispanics, more affluent respondents, respondents with a spouse present in the home, and respondents with a living child. Given admission, smoking status is not associated with length of stay (LOS). LOS is longer for older respondents and women and shorter for more affluent respondents and those with spouses present. Compared with otherwise identical never-smokers, former and current smokers have a significantly increased risk of nursing home admission. That recent quitters are at greatest risk of admission is consistent with evidence that many stop smoking because they are sick, often due to smoking.
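
    The third model in this sequence, a zero-inflated negative binomial for nights over the full sample, can be sketched with statsmodels (ZeroInflatedNegativeBinomialP); the data below are simulated placeholders, not HRS variables:

    ```python
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

    rng = np.random.default_rng(1)
    n = 2000
    X = sm.add_constant(rng.standard_normal((n, 2)))  # stand-ins for age, smoking status
    # toy outcome: a count process switched off for a structural-zero subgroup
    nights = rng.negative_binomial(1, 0.3, n) * rng.binomial(1, 0.3, n)

    # one design matrix for the count part, one for the zero-inflation part
    model = ZeroInflatedNegativeBinomialP(nights, X, exog_infl=X, p=2)
    fit = model.fit(method="bfgs", maxiter=500, disp=False)
    print(fit.summary())
    ```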

  17. Landau-Zener extension of the Tavis-Cummings model: structure of the solution

    NASA Astrophysics Data System (ADS)

    Sun, Chen; Sinitsyn, Nikolai

    We explore the recently discovered solution of the driven Tavis-Cummings model (DTCM). It describes the interaction of an arbitrary number of two-level systems with a bosonic mode whose frequency depends linearly on time. We derive compact and tractable expressions for transition probabilities in terms of well-known special functions. In this new form, our formulas are suitable for fast numerical calculations and analytical approximations. As an application, we obtain the semiclassical limit of the exact solution and compare it to prior approximations. We also reveal a connection between the DTCM and q-deformed binomial statistics. This work was performed under the auspices of the National Nuclear Security Administration of the U.S. Department of Energy at Los Alamos National Laboratory under Contract No. DE-AC52-06NA25396. The authors also thank the LDRD program at LANL for support.

  18. Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach.

    PubMed

    Nagelkerke, Nico; Fidler, Vaclav

    2015-01-01

    The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those who are, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores the properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.

  19. Digital simulation of two-dimensional random fields with arbitrary power spectra and non-Gaussian probability distribution functions.

    PubMed

    Yura, Harold T; Hanson, Steen G

    2012-04-01

    Methods for simulation of two-dimensional signals with arbitrary power spectral densities and signal amplitude probability density functions are disclosed. The method relies on initially transforming a white noise sample set of random Gaussian distributed numbers into a corresponding set with the desired spectral distribution, after which this colored Gaussian probability distribution is transformed via an inverse transform into the desired probability distribution. In most cases the method provides satisfactory results and can thus be considered an engineering approach. Several illustrative examples with relevance for optics are given.
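
    The two-step recipe in this abstract (color white Gaussian noise to the target spectrum, then apply a memoryless inverse-CDF transform to the target marginal) is short in numpy. The power-law spectrum and gamma target below are assumptions for illustration; the pointwise transform slightly distorts the imposed spectrum, which is why the authors characterize the method as an engineering approach:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 256
    white = rng.standard_normal((n, n))

    # step 1: color the noise with an assumed isotropic power-law spectrum
    kx = np.fft.fftfreq(n)
    ky = np.fft.fftfreq(n)
    k = np.sqrt(kx[None, :] ** 2 + ky[:, None] ** 2)
    k[0, 0] = 1.0                      # avoid division by zero at the DC bin
    psd = k ** -2.0
    colored = np.real(np.fft.ifft2(np.fft.fft2(white) * np.sqrt(psd)))

    # step 2: map the (approximately Gaussian) marginal onto a target
    # distribution, here a gamma, via the probability integral transform
    u = stats.norm.cdf((colored - colored.mean()) / colored.std())
    field = stats.gamma.ppf(u, a=2.0)
    ```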

  20. An evaluation of the NASA/GSFC Barnes field spectral reflectometer model 14-758, using signal/noise as a measure of utility

    NASA Astrophysics Data System (ADS)

    Bell, R.; Labovitz, M. L.

    1982-07-01

    A Barnes field spectral reflectometer which collected information in 373 channels covering the region from 0.4 to 2.5 micrometers was assessed for signal utility. A band was judged unsatisfactory if the probability was 0.1 or greater that its signal-to-noise ratio was less than eight to one. For each of the bands, the probability of a noisy observation was estimated under a binomial assumption from a set of field crop spectra covering an entire growing season. A 95% confidence interval was calculated about each estimate, and bands whose lower confidence limits were greater than 0.1 were judged unacceptable. As a result, 283 channels were deemed statistically satisfactory. Excluded channels correspond to portions of the electromagnetic spectrum (EMS) where high atmospheric absorption and filter wheel overlap occur. In addition, the analyses uncovered intervals of unsatisfactory detection capability within the blue, red and far infrared regions of vegetation spectra. From the results of the analysis, it was recommended that 90 channels monitored by the instrument under consideration be eliminated from future studies. These channels are tabulated and discussed.
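
    The band-screening rule translates directly into a binomial confidence interval computation: estimate each band's probability of a noisy observation and reject the band when the lower 95% confidence limit exceeds 0.1. A sketch with invented counts; the Clopper-Pearson interval is one reasonable choice, as the paper does not specify its interval method:

    ```python
    from statsmodels.stats.proportion import proportion_confint

    # hypothetical band: 37 noisy observations out of 120 spectra
    noisy, total = 37, 120
    low, high = proportion_confint(noisy, total, alpha=0.05, method="beta")  # Clopper-Pearson
    print(f"p_hat={noisy / total:.3f}, 95% CI=({low:.3f}, {high:.3f})")
    if low > 0.1:
        print("band judged unacceptable")
    ```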

  1. Impact of a letter-grade program on restaurant sanitary conditions and diner behavior in New York City.

    PubMed

    Wong, Melissa R; McKelvey, Wendy; Ito, Kazuhiko; Schiff, Corinne; Jacobson, J Bryan; Kass, Daniel

    2015-03-01

    We evaluated the impact of the New York City restaurant letter-grading program on restaurant hygiene, food safety practices, and public awareness. We analyzed data from 43,448 restaurants inspected between 2007 and 2013 to measure changes in inspection score and violation citations since program launch in July 2010. We used binomial regression to assess probability of scoring 0 to 13 points (A-range score). Two population-based random-digit-dial telephone surveys assessed public perceptions of the program. After we controlled for repeated restaurant observations, season of inspection, and chain restaurant status, the probability of scoring 0 to 13 points on an unannounced inspection increased 35% (95% confidence interval [CI]=31%, 40%) 3 years after compared with 3 years before grading. There were notable improvements in compliance with some specific requirements, including having a certified kitchen manager on site and being pest-free. More than 91% (95% CI=88%, 94%) of New Yorkers approved of the program and 88% (95% CI=85%, 92%) considered grades in dining decisions in 2012. Restaurant letter grading in New York City has resulted in improved sanitary conditions on unannounced inspection, suggesting that the program is an effective regulatory tool.

  2. An evaluation of the NASA/GSFC Barnes field spectral reflectometer model 14-758, using signal/noise as a measure of utility

    NASA Technical Reports Server (NTRS)

    Bell, R.; Labovitz, M. L.

    1982-01-01

    A Barnes field spectral reflectometer which collected information in 373 channels covering the region from 0.4 to 2.5 micrometers was assessed for signal utility. A band was judged unsatisfactory if the probability was 0.1 or greater that its signal-to-noise ratio was less than eight to one. For each of the bands, the probability of a noisy observation was estimated under a binomial assumption from a set of field crop spectra covering an entire growing season. A 95% confidence interval was calculated about each estimate, and bands whose lower confidence limits were greater than 0.1 were judged unacceptable. As a result, 283 channels were deemed statistically satisfactory. Excluded channels correspond to portions of the electromagnetic spectrum (EMS) where high atmospheric absorption and filter wheel overlap occur. In addition, the analyses uncovered intervals of unsatisfactory detection capability within the blue, red and far infrared regions of vegetation spectra. From the results of the analysis, it was recommended that 90 channels monitored by the instrument under consideration be eliminated from future studies. These channels are tabulated and discussed.

  3. The global impact distribution of Near-Earth objects

    NASA Astrophysics Data System (ADS)

    Rumpf, Clemens; Lewis, Hugh G.; Atkinson, Peter M.

    2016-02-01

    Asteroids that could collide with the Earth are listed on the publicly available Near-Earth object (NEO) hazard web sites maintained by the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA). The impact probability distributions of 69 potentially threatening NEOs from these lists, which produce 261 dynamically distinct impact instances, or Virtual Impactors (VIs), were calculated using the Asteroid Risk Mitigation and Optimization Research (ARMOR) tool in conjunction with OrbFit. ARMOR projected the impact probability of each VI onto the surface of the Earth as a spatial probability distribution. The projection considers orbit solution accuracy and the global impact probability. The method of ARMOR is introduced and the tool is validated against two asteroid-Earth collision cases with objects 2008 TC3 and 2014 AA. In the analysis, the natural distribution of impact corridors is contrasted against the impact probability distribution to evaluate the distributions' conformity with the uniform impact distribution assumption. The distribution of impact corridors is based on the NEO population and orbital mechanics. The analysis shows that the distribution of impact corridors matches the common assumption of uniform impact distribution, and the result extends the evidence base for the uniform assumption from qualitative analysis of historic impact events into the future in a quantitative way. This finding is confirmed in a parallel analysis of impact points belonging to a synthetic population of 10,006 VIs. Taking the impact probabilities into account introduces significant variation into the results, and the impact probability distribution consequently deviates markedly from uniformity. The concept of impact probabilities is a product of the asteroid observation and orbit determination technique and, thus, represents a man-made component that is largely disconnected from natural processes. It is important to consider impact probabilities because such information represents the best estimate of where an impact might occur.

  4. DRME: Count-based differential RNA methylation analysis at small sample size scenario.

    PubMed

    Liu, Lian; Zhang, Shao-Wu; Gao, Fan; Zhang, Yixin; Huang, Yufei; Chen, Runsheng; Meng, Jia

    2016-04-15

    Differential methylation, which concerns differences in the degree of epigenetic regulation via methylation between two conditions, has been formulated with beta or beta-binomial distributions to address the within-group biological variability in sequencing data. However, a beta or beta-binomial model is usually difficult to infer in small-sample scenarios with discrete read counts. On the other hand, as an emerging research field, RNA methylation has drawn more and more attention recently, and the differential analysis of RNA methylation is significantly different from that of DNA methylation due to the impact of transcriptional regulation. We developed DRME to better address the differential RNA methylation problem. The proposed model can effectively describe within-group biological variability in small-sample scenarios and handles the impact of transcriptional regulation on RNA methylation. We tested the newly developed DRME algorithm on simulated data and on 4 MeRIP-Seq case-control studies and compared it with Fisher's exact test. It is in principle widely applicable to several other RNA-related data types as well, including RNA bisulfite sequencing and PAR-CLIP. The code together with a MeRIP-Seq dataset is available online (https://github.com/lzcyzm/DRME) for evaluation and reproduction of the figures shown in this article. Copyright © 2016 Elsevier Inc. All rights reserved.
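
    The beta-binomial building block discussed here is available directly in scipy (scipy.stats.betabinom); a minimal single-site maximum likelihood sketch on invented read counts, not the DRME model itself:

    ```python
    import numpy as np
    from scipy.stats import betabinom
    from scipy.optimize import minimize

    # hypothetical methylated / total read counts for one site across replicates
    k = np.array([12, 9, 15, 7])
    n = np.array([40, 35, 50, 30])

    def neg_loglik(params):
        a, b = np.exp(params)          # keep the shape parameters positive
        return -betabinom.logpmf(k, n, a, b).sum()

    res = minimize(neg_loglik, x0=[0.0, 1.0], method="Nelder-Mead")
    a, b = np.exp(res.x)
    print(f"alpha={a:.2f}, beta={b:.2f}, mean methylation level={a / (a + b):.3f}")
    ```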

  5. Possibility and challenges of conversion of current virus species names to Linnaean binomials

    USGS Publications Warehouse

    Thomas, Postler; Clawson, Anna N.; Amarasinghe, Gaya K.; Basler, Christopher F.; Bavari, Sina; Benko, Maria; Blasdell, Kim R.; Briese, Thomas; Buchmeier, Michael J.; Bukreyev, Alexander; Calisher, Charles H.; Chandran, Kartik; Charrel, Remi; Clegg, Christopher S.; Collins, Peter L.; De la Torre, Juan Carlos; DeRisi, Joseph L.; Dietzgen, Ralf G.; Dolnik, Olga; Durrwald, Ralf; Dye, John M.; Easton, Andrew J.; Emonet, Sebastian; Formenty, Pierre; Fouchier, Ron A. M.; Ghedin, Elodie; Gonzalez, Jean-Paul; Harrach, Balazs; Hewson, Roger; Horie, Masayuki; Jiang, Daohong; Kobinger, Gary P.; Kondo, Hideki; Kropinski, Andrew; Krupovic, Mart; Kurath, Gael; Lamb, Robert A.; Leroy, Eric M.; Lukashevich, Igor S.; Maisner, Andrea; Mushegian, Arcady; Netesov, Sergey V.; Nowotny, Norbert; Patterson, Jean L.; Payne, Susan L.; Paweska, Janusz T.; Peters, C.J.; Radoshitzky, Sheli; Rima, Bertus K.; Romanowski, Victor; Rubbenstroth, Dennis; Sabanadzovic, Sead; Sanfacon, Helene; Salvato , Maria; Schwemmle, Martin; Smither, Sophie J.; Stenglein, Mark; Stone, D.M.; Takada , Ayato; Tesh, Robert B.; Tomonaga, Keizo; Tordo, N.; Towner, Jonathan S.; Vasilakis, Nikos; Volchkov, Victor E.; Jensen, Victoria; Walker, Peter J.; Wang, Lin-Fa; Varsani, Arvind; Whitfield , Anna E.; Zerbini, Francisco Murilo; Kuhn, Jens H.

    2017-01-01

    Botanical, mycological, zoological, and prokaryotic species names follow the Linnaean format, consisting of an italicized Latinized binomen with a capitalized genus name and a lower case species epithet (e.g., Homo sapiens). Virus species names, however, do not follow a uniform format, and, even when binomial, are not Linnaean in style. In this thought exercise, we attempted to convert all currently official names of species included in the virus family Arenaviridae and the virus order Mononegavirales to Linnaean binomials, and to identify and address associated challenges and concerns. Surprisingly, this endeavor was not as complicated or time-consuming as even the authors of this article expected when conceiving the experiment.

  6. A semi-nonparametric Poisson regression model for analyzing motor vehicle crash data.

    PubMed

    Ye, Xin; Wang, Ke; Zou, Yajie; Lord, Dominique

    2018-01-01

    This paper develops a semi-nonparametric Poisson regression model to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US. Motor vehicle crash frequency on rural highways is a topic of interest in the area of transportation safety due to higher driving speeds and the resultant severity level. Unlike the traditional Negative Binomial (NB) model, the semi-nonparametric Poisson regression model can accommodate unobserved heterogeneity following a highly flexible semi-nonparametric (SNP) distribution. Simulation experiments are conducted to demonstrate that the SNP distribution can closely mimic a large family of distributions, including normal distributions, log-gamma distributions, and bimodal and trimodal distributions. Empirical estimation results show that the flexibility offered by the SNP distribution can greatly improve model precision and the overall goodness of fit. The semi-nonparametric distribution can provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity. When estimated coefficients in the empirical models are compared, the SNP and NB models are found to have substantially different coefficients for the dummy variable indicating lane width. The SNP model, with better statistical performance, suggests that the NB model overestimates the effect of lane width on crash frequency reduction by 83.1%.

  7. Factors affecting the probability of first year medical student dropout in the UK: a logistic analysis for the intake cohorts of 1980-92.

    PubMed

    Arulampalam, Wiji; Naylor, Robin; Smith, Jeremy

    2004-05-01

    In the context of the 1997 Report of the Medical Workforce Standing Advisory Committee, it is important that we develop an understanding of the factors influencing medical school retention rates. To analyse the determinants of the probability that an individual medical student will drop out of medical school during their first year of study. Binomial and multinomial logistic regression analysis of individual-level administrative data on 51 810 students in 21 medical schools in the UK for the intake cohorts of 1980-92 was performed. The overall average first year dropout rate over the period 1980-92 was calculated to be 3.8%. We found that the probability that a student would drop out of medical school during their first year of study was influenced significantly by both the subjects studied at A-level and by the scores achieved. For example, achieving 1 grade higher in biology, chemistry or physics reduced the dropout probability by 0.38% points, equivalent to a fall of 10%. We also found that males were about 8% more likely to drop out than females. The medical school attended also had a significant effect on the estimated dropout probability. Indicators of both the social class and the previous school background of the student were largely insignificant. Policies aimed at increasing the size of the medical student intake in the UK and of widening access to students from non-traditional backgrounds should be informed by evidence that student dropout probabilities are sensitive to measures of A-level attainment, such as subject studied and scores achieved. If traditional entry requirements or standards are relaxed, then this is likely to have detrimental effects on medical schools' retention rates unless accompanied by appropriate measures such as focussed student support.

  8. Risk factors related to Toxoplasma gondii seroprevalence in indoor-housed Dutch dairy goats.

    PubMed

    Deng, Huifang; Dam-Deisz, Cecile; Luttikholt, Saskia; Maas, Miriam; Nielen, Mirjam; Swart, Arno; Vellema, Piet; van der Giessen, Joke; Opsteegh, Marieke

    2016-02-01

    Toxoplasma gondii can cause disease in goats, but also has an impact on human health through food-borne transmission. Our aims were to determine the seroprevalence of T. gondii infection in indoor-housed Dutch dairy goats and to identify the risk factors related to T. gondii seroprevalence. Fifty-two of the ninety farmers with indoor-kept goats who were approached (58%) participated by answering a standardized questionnaire and contributing 32 goat blood samples each. Serum samples were tested for T. gondii SAG1 antibodies by ELISA, and the results showed that the frequency distribution of the log10-transformed OD-values fitted well with a binary mixture of a shifted gamma and a shifted reflected gamma distribution. The overall animal seroprevalence was 13.3% (95% CI: 11.7–14.9%), and at least one seropositive animal was found on 61.5% (95% CI: 48.3–74.7%) of the farms. To evaluate potential risk factors at herd level, three modeling strategies (Poisson, negative binomial and zero-inflated) were compared. The negative binomial model fitted the data best, with the number of cats (1–4 cats: IR: 2.6, 95% CI: 1.1–6.5; ≥5 cats: IR: 14.2, 95% CI: 3.9–51.1) and mean animal age (IR: 1.5, 95% CI: 1.1–2.1) related to herd positivity. In conclusion, the ELISA test was 100% sensitive and specific based on binary mixture analysis. T. gondii infection is prevalent in indoor-housed Dutch dairy goats, but at a lower overall animal-level seroprevalence than outdoor-farmed goats in other European countries, and cat exposure is an important risk factor.

  9. Building test data from real outbreaks for evaluating detection algorithms.

    PubMed

    Texier, Gaetan; Jackson, Michael L; Siwe, Leonel; Meynard, Jean-Baptiste; Deparis, Xavier; Chaudet, Herve

    2017-01-01

    Benchmarking surveillance systems requires realistic simulations of disease outbreaks. However, obtaining these data in sufficient quantity, with a realistic shape and covering a sufficient range of agents, size and duration, is known to be very difficult. The dataset of outbreak signals generated should reflect the likely distribution of authentic situations faced by the surveillance system, including very unlikely outbreak signals. We propose and evaluate a new approach based on the use of historical outbreak data to simulate tailored outbreak signals. The method relies on a homothetic transformation of the historical distribution followed by resampling processes (Binomial, Inverse Transform Sampling Method-ITSM, Metropolis-Hastings Random Walk, Metropolis-Hastings Independent, Gibbs Sampler, Hybrid Gibbs Sampler). We carried out an analysis to identify the most important input parameters for simulation quality and to evaluate performance for each of the resampling algorithms. Our analysis confirms the influence of the type of algorithm used and simulation parameters (i.e. days, number of cases, outbreak shape, overall scale factor) on the results. We show that, regardless of the outbreaks, algorithms and metrics chosen for the evaluation, simulation quality decreased with the increase in the number of days simulated and increased with the number of cases simulated. Simulating outbreaks with fewer cases than days of duration (i.e. overall scale factor less than 1) resulted in an important loss of information during the simulation. We found that Gibbs sampling with a shrinkage procedure provides a good balance between accuracy and data dependency. If dependency is of little importance, binomial and ITSM methods are accurate. Given the constraint of keeping the simulation within a range of plausible epidemiological curves faced by the surveillance system, our study confirms that our approach can be used to generate a large spectrum of outbreak signals.
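
    Of the resampling schemes listed, inverse transform sampling from a rescaled historical curve is the simplest to sketch: stretch the historical daily-case shape onto the target duration (the homothetic step), then draw each case's day from the empirical CDF. The historical curve below is made up:

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    historical = np.array([1, 3, 8, 15, 22, 18, 10, 5, 2, 1], dtype=float)  # toy outbreak shape

    def simulate_outbreak(curve, n_days, n_cases, rng):
        """Homothetic rescaling + inverse transform sampling of daily case counts."""
        # stretch/compress the historical shape onto the target duration
        shape = np.interp(np.linspace(0, 1, n_days),
                          np.linspace(0, 1, len(curve)), curve)
        cdf = np.cumsum(shape) / shape.sum()
        days = np.searchsorted(cdf, rng.random(n_cases))  # one ITSM draw per case
        return np.bincount(days, minlength=n_days)

    print(simulate_outbreak(historical, n_days=14, n_cases=120, rng=rng))
    ```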

  10. Building test data from real outbreaks for evaluating detection algorithms

    PubMed Central

    Texier, Gaetan; Jackson, Michael L.; Siwe, Leonel; Meynard, Jean-Baptiste; Deparis, Xavier; Chaudet, Herve

    2017-01-01

    Benchmarking surveillance systems requires realistic simulations of disease outbreaks. However, obtaining these data in sufficient quantity, with a realistic shape and covering a sufficient range of agents, size and duration, is known to be very difficult. The dataset of outbreak signals generated should reflect the likely distribution of authentic situations faced by the surveillance system, including very unlikely outbreak signals. We propose and evaluate a new approach based on the use of historical outbreak data to simulate tailored outbreak signals. The method relies on a homothetic transformation of the historical distribution followed by resampling processes (Binomial, Inverse Transform Sampling Method—ITSM, Metropolis-Hastings Random Walk, Metropolis-Hastings Independent, Gibbs Sampler, Hybrid Gibbs Sampler). We carried out an analysis to identify the most important input parameters for simulation quality and to evaluate performance for each of the resampling algorithms. Our analysis confirms the influence of the type of algorithm used and simulation parameters (i.e. days, number of cases, outbreak shape, overall scale factor) on the results. We show that, regardless of the outbreaks, algorithms and metrics chosen for the evaluation, simulation quality decreased with the increase in the number of days simulated and increased with the number of cases simulated. Simulating outbreaks with fewer cases than days of duration (i.e. overall scale factor less than 1) resulted in an important loss of information during the simulation. We found that Gibbs sampling with a shrinkage procedure provides a good balance between accuracy and data dependency. If dependency is of little importance, binomial and ITSM methods are accurate. Given the constraint of keeping the simulation within a range of plausible epidemiological curves faced by the surveillance system, our study confirms that our approach can be used to generate a large spectrum of outbreak signals. PMID:28863159

  11. Flood return level analysis of Peaks over Threshold series under changing climate

    NASA Astrophysics Data System (ADS)

    Li, L.; Xiong, L.; Hu, T.; Xu, C. Y.; Guo, S.

    2016-12-01

    Obtaining insights into future flood estimation is of great significance for water planning and management. Traditional flood return level analysis under the stationarity assumption has been challenged by changing environments. A method that takes the nonstationary context into consideration has been extended to derive flood return levels for Peaks over Threshold (POT) series. In applications to POT series, a Poisson distribution is normally assumed to describe the arrival rate of exceedance events, but this distribution assumption has at times been reported as invalid. The Negative Binomial (NB) distribution is therefore proposed as an alternative to the Poisson assumption. Flood return levels were extrapolated in a nonstationary context for the POT series of the Weihe basin, China under future climate scenarios. The results show that flood return levels estimated under nonstationarity can differ depending on whether a Poisson or an NB distribution is assumed. The difference is found to be related to the threshold value of the POT series. The study indicates the importance of distribution selection in flood return level analysis under nonstationarity and provides a reference on the impact of climate change on flood estimation in the Weihe basin for the future.
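
    The Poisson-versus-NB question for exceedance counts can be screened with a dispersion check and a likelihood comparison; the annual exceedance counts below are fabricated, and the NB parameters come from simple moment matching rather than the paper's estimation procedure:

    ```python
    import numpy as np
    from scipy.stats import poisson, nbinom

    counts = np.array([2, 5, 1, 7, 3, 9, 2, 6, 4, 8])  # hypothetical annual exceedances
    m, v = counts.mean(), counts.var(ddof=1)
    print(f"mean={m:.2f}, var={v:.2f} (var well above mean suggests overdispersion)")

    # method-of-moments NB: var = m + m^2/r  =>  r = m^2/(v - m), p = r/(r + m)
    r = m**2 / (v - m)
    p = r / (r + m)
    ll_pois = poisson.logpmf(counts, m).sum()
    ll_nb = nbinom.logpmf(counts, r, p).sum()
    print(f"logL Poisson={ll_pois:.2f}, logL NB={ll_nb:.2f}")
    ```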

  12. Bursts of Self-Conscious Emotions in the Daily Lives of Emerging Adults

    PubMed Central

    Conroy, David E.; Ram, Nilam; Pincus, Aaron L.; Rebar, Amanda L.

    2015-01-01

    Self-conscious emotions play a role in regulating daily achievement strivings, social behavior, and health, but little is known about the processes underlying their daily manifestation. Emerging adults (n = 182) completed daily diaries for eight days, and multilevel models were estimated to evaluate whether, how much, and why their emotions varied from day to day. Within-person variation in authentic pride was normally distributed across people and days, whereas the other emotions were burst-like and characterized by zero-inflated negative binomial distributions. Perceiving social interactions as generally communal increased the odds of hubristic pride activation and reduced the odds of guilt activation; daily communal behavior reduced guilt intensity. Results illuminated processes through which meaning about the self-in-relation-to-others is constructed during a critical period of development. PMID:25859164

  13. Nonadditive entropies yield probability distributions with biases not warranted by the data.

    PubMed

    Pressé, Steve; Ghosh, Kingshuk; Lee, Julian; Dill, Ken A

    2013-11-01

    Different quantities that go by the name of entropy are used in variational principles to infer probability distributions from limited data. Shore and Johnson showed that maximizing the Boltzmann-Gibbs form of the entropy ensures that the probability distributions inferred satisfy the multiplication rule of probability for independent events in the absence of data coupling such events. Entropies that violate the Shore and Johnson axioms, including nonadditive entropies such as the Tsallis entropy, therefore fail this basic consistency requirement. Here we use the axiomatic framework of Shore and Johnson to show how such nonadditive entropy functions generate biases in probability distributions that are not warranted by the underlying data.

  14. Raw and Central Moments of Binomial Random Variables via Stirling Numbers

    ERIC Educational Resources Information Center

    Griffiths, Martin

    2013-01-01

    We consider here the problem of calculating the moments of binomial random variables. It is shown how formulae for both the raw and the central moments of such random variables may be obtained in a recursive manner utilizing Stirling numbers of the first kind. Suggestions are also provided as to how students might be encouraged to explore this…
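
    The article's recursion is via Stirling numbers of the first kind; as an independent numerical cross-check, raw moments of a binomial variable also satisfy the standard identity E[X^m] = Σ_{k=0}^{m} S(m,k) n(n−1)···(n−k+1) p^k, with S(m,k) the Stirling numbers of the second kind (a textbook identity, not the article's derivation). A sketch:

    ```python
    from math import comb, prod
    from sympy.functions.combinatorial.numbers import stirling  # kind=2 by default

    def raw_moment(m, n, p):
        """E[X^m] for X ~ Binomial(n, p) via second-kind Stirling numbers."""
        return sum(stirling(m, k) * prod(n - j for j in range(k)) * p**k
                   for k in range(m + 1))

    def raw_moment_direct(m, n, p):
        """Brute-force expectation, for comparison."""
        return sum(x**m * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))

    print(raw_moment(3, 10, 0.4), raw_moment_direct(3, 10, 0.4))  # both 93.28
    ```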

  15. A Mixed-Effects Heterogeneous Negative Binomial Model for Postfire Conifer Regeneration in Northeastern California, USA

    Treesearch

    Justin S. Crotteau; Martin W. Ritchie; J. Morgan Varner

    2014-01-01

    Many western USA fire regimes are typified by mixed-severity fire, which compounds the variability inherent to natural regeneration densities in associated forests. Tree regeneration data are often discrete and nonnegative; accordingly, we fit a series of Poisson and negative binomial variation models to conifer seedling counts across four distinct burn severities and...

  16. Performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data.

    PubMed

    Yelland, Lisa N; Salter, Amy B; Ryan, Philip

    2011-10-15

    Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
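
    Modified Poisson regression for clustered binary data, as evaluated here, amounts to a Poisson GEE with a log link, a working correlation structure, and robust (sandwich) standard errors; a sketch on fabricated clustered data, exponentiating the exposure coefficient to recover the relative risk:

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n, n_clusters = 600, 60
    groups = rng.integers(0, n_clusters, n)
    x = rng.binomial(1, 0.5, n)                         # binary exposure
    y = rng.binomial(1, np.where(x == 1, 0.30, 0.20))   # binary outcome, true RR = 1.5

    X = sm.add_constant(x)
    model = sm.GEE(y, X, groups=groups,
                   family=sm.families.Poisson(),        # log link by default
                   cov_struct=sm.cov_struct.Exchangeable())
    fit = model.fit()  # GEE's robust sandwich covariance is the default
    print(np.exp(np.asarray(fit.params)[1]))  # estimated relative risk
    ```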

  17. RELAV - RELIABILITY/AVAILABILITY ANALYSIS PROGRAM

    NASA Technical Reports Server (NTRS)

    Bowerman, P. N.

    1994-01-01

    RELAV (Reliability/Availability Analysis Program) is a comprehensive analytical tool to determine the reliability or availability of any general system which can be modeled as embedded k-out-of-n groups of items (components) and/or subgroups. Both ground and flight systems at NASA's Jet Propulsion Laboratory have utilized this program. RELAV can assess current system performance during the later testing phases of a system design, as well as model candidate designs/architectures or validate and form predictions during the early phases of a design. Systems are commonly modeled as System Block Diagrams (SBDs). RELAV calculates the success probability of each group of items and/or subgroups within the system assuming k-out-of-n operating rules apply for each group. The program operates on a folding basis; i.e. it works its way towards the system level from the most embedded level by folding related groups into single components. The entire folding process involves probabilities; therefore, availability problems are performed in terms of the probability of success, and reliability problems are performed for specific mission lengths. An enhanced cumulative binomial algorithm is used for groups where all probabilities are equal, while a fast algorithm based upon "Computing k-out-of-n System Reliability", Barlow & Heidtmann, IEEE TRANSACTIONS ON RELIABILITY, October 1984, is used for groups with unequal probabilities. Inputs to the program include a description of the system and any one of the following: 1) availabilities of the items, 2) mean time between failures and mean time to repairs for the items from which availabilities are calculated, 3) mean time between failures and mission length(s) from which reliabilities are calculated, or 4) failure rates and mission length(s) from which reliabilities are calculated. The results are probabilities of success of each group and the system in the given configuration. RELAV assumes exponential failure distributions for reliability calculations and infinite repair resources for availability calculations. No more than 967 items or groups can be modeled by RELAV. If larger problems can be broken into subsystems of 967 items or less, the subsystem results can be used as item inputs to a system problem. The calculated availabilities are steady-state values. Group results are presented in the order in which they were calculated (from the most embedded level out to the system level). This provides a good mechanism to perform trade studies. Starting from the system result and working backwards, the granularity gets finer; therefore, system elements that contribute most to system degradation are detected quickly. RELAV is a C-language program originally developed under the UNIX operating system on a MASSCOMP MC500 computer. It has been modified, as necessary, and ported to an IBM PC compatible with a math coprocessor. The current version of the program runs in the DOS environment and requires a Turbo C vers. 2.0 compiler. RELAV has a memory requirement of 103 KB and was developed in 1989. RELAV is a copyrighted work with all copyright vested in NASA.
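
    The two group-level computations described (a cumulative binomial when item probabilities are equal, and a Barlow-Heidtmann-style algorithm when they are not) can be sketched as follows; this is an illustration of the k-out-of-n arithmetic, not RELAV's actual code:

    ```python
    from math import comb

    def kofn_equal(k, n, p):
        """P(at least k of n items work), all items with success probability p."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

    def kofn_unequal(k, probs):
        """Same, unequal probabilities, via a DP over the count of working items."""
        dist = [1.0]                       # dist[j] = P(exactly j items work so far)
        for p in probs:
            dist = [(dist[j] * (1 - p) if j < len(dist) else 0.0)
                    + (dist[j - 1] * p if j > 0 else 0.0)
                    for j in range(len(dist) + 1)]
        return sum(dist[k:])

    print(kofn_equal(2, 3, 0.9))             # 0.972
    print(kofn_unequal(2, [0.9, 0.9, 0.9]))  # matches the equal-p case
    ```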

  18. ProbOnto: ontology and knowledge base of probability distributions.

    PubMed

    Swat, Maciej J; Grenon, Pierre; Wimalaratne, Sarala

    2016-09-01

    Probability distributions play a central role in mathematical and statistical modelling. The encoding, annotation and exchange of such models could be greatly simplified by a resource providing a common reference for the definition of probability distributions. Although some resources exist, no suitably detailed and complex ontology exists, nor any database allowing programmatic access. ProbOnto is an ontology-based knowledge base of probability distributions, featuring more than 80 uni- and multivariate distributions with their defining functions, characteristics, relationships and re-parameterization formulas. It can be used for model annotation and facilitates the encoding of distribution-based models, related functions and quantities. http://probonto.org mjswat@ebi.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  19. PRODIGEN: visualizing the probability landscape of stochastic gene regulatory networks in state and time space.

    PubMed

    Ma, Chihua; Luciani, Timothy; Terebus, Anna; Liang, Jie; Marai, G Elisabeta

    2017-02-15

    Visualizing the complex probability landscape of stochastic gene regulatory networks can further biologists' understanding of phenotypic behavior associated with specific genes. We present PRODIGEN (PRObability DIstribution of GEne Networks), a web-based visual analysis tool for the systematic exploration of probability distributions over simulation time and state space in such networks. PRODIGEN was designed in collaboration with bioinformaticians who research stochastic gene networks. The analysis tool combines in a novel way existing, expanded, and new visual encodings to capture the time-varying characteristics of probability distributions: spaghetti plots over one dimensional projection, heatmaps of distributions over 2D projections, enhanced with overlaid time curves to display temporal changes, and novel individual glyphs of state information corresponding to particular peaks. We demonstrate the effectiveness of the tool through two case studies on the computed probabilistic landscape of a gene regulatory network and of a toggle-switch network. Domain expert feedback indicates that our visual approach can help biologists: 1) visualize probabilities of stable states, 2) explore the temporal probability distributions, and 3) discover small peaks in the probability landscape that have potential relation to specific diseases.

  20. Comparison of the different probability distributions for earthquake hazard assessment in the North Anatolian Fault Zone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yilmaz, Şeyda, E-mail: seydayilmaz@ktu.edu.tr; Bayrak, Erdem, E-mail: erdmbyrk@gmail.com; Bayrak, Yusuf, E-mail: bayrak@ktu.edu.tr

    In this study we examined and compared three different probability distribution methods to determine the most suitable model for the probabilistic assessment of earthquake hazards. We analyzed a reliable homogeneous earthquake catalogue for the period 1900-2015 for magnitude M ≥ 6.0 and estimated the probabilistic seismic hazard in the North Anatolian Fault zone (39°-41° N, 30°-40° E) using three distribution methods, namely the Weibull distribution, the Frechet distribution and the three-parameter Weibull distribution. The suitability of the distribution parameters was evaluated with the Kolmogorov-Smirnov (K-S) goodness-of-fit test. We also compared the estimated cumulative probability and the conditional probabilities of occurrence of earthquakes for different elapsed times using these three distribution methods. We used Easyfit and Matlab software to calculate the distribution parameters and plotted the conditional probability curves. We concluded that the Weibull distribution method was more suitable than the other distribution methods in this region.
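
    Fitting candidate distributions and screening them with the K-S test, as described, takes a few lines with scipy; the magnitudes below are synthetic, and scipy's invweibull stands in for the Fréchet distribution:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    mags = 6.0 + stats.weibull_min.rvs(1.3, scale=0.6, size=200, random_state=rng)

    for name, dist in [("Weibull", stats.weibull_min), ("Frechet", stats.invweibull)]:
        params = dist.fit(mags, floc=5.99)       # fix the location just below the data
        d, p = stats.kstest(mags, dist.cdf, args=params)
        # note: testing on the same data used for fitting makes p approximate
        print(f"{name}: K-S D={d:.3f}, p={p:.3f}")
    ```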

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sigeti, David E.; Pelak, Robert A.

    We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, θ, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for θ may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of θ and, in particular, how there will always be a region around θ = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used as a kind of "plan B metric" in the case that the analysis shows that θ is close to 1/2, and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a general beta-function prior for θ, enabling sequential analysis in which a small number of new simulations may be done and the resulting posterior for θ used as a prior to inform the next stage of power analysis.
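
    The conjugate beta-binomial update underlying this methodology is compact; a sketch with an (assumed) Jeffreys Beta(1/2, 1/2) prior and invented win/loss counts:

    ```python
    from scipy.stats import beta

    a0, b0 = 0.5, 0.5     # beta prior for theta (Jeffreys, one choice among many)
    wins, losses = 14, 6  # hypothetical: new code better in 14 of 20 experiments

    post = beta(a0 + wins, b0 + losses)
    print(f"P(theta > 1/2 | data) = {post.sf(0.5):.3f}")  # confidence of improvement
    print(f"posterior sd = {post.std():.3f}")             # the 'plan B' metric
    ```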

  2. Quasi-equilibrium theory for the distribution of rare alleles in a subdivided population: justification and implications.

    PubMed

    Burr, T L

    2000-05-01

    This paper examines a quasi-equilibrium theory of rare alleles for subdivided populations that follow an island-model version of the Wright-Fisher model of evolution. All mutations are assumed to create new alleles. We present four results: (1) conditions for the theory to apply are formally established using properties of the moments of the binomial distribution; (2) approximations currently in the literature can be replaced with exact results that are in better agreement with our simulations; (3) a modified maximum likelihood estimator of migration rate exhibits the same good performance on island-model data or on data simulated from the multinomial mixed with the Dirichlet distribution, and (4) a connection between the rare-allele method and the Ewens Sampling Formula for the infinite-allele mutation model is made. This introduces a new and simpler proof for the expected number of alleles implied by the Ewens Sampling Formula. Copyright 2000 Academic Press.

  3. Confidence of compliance: a Bayesian approach for percentile standards.

    PubMed

    McBride, G B; Ellis, J C

    2001-04-01

    Rules for assessing compliance with percentile standards commonly limit the number of exceedances permitted in a batch of samples taken over a defined assessment period. Such rules are commonly developed using classical statistical methods. Results from alternative Bayesian methods are presented (using beta-distributed prior information and a binomial likelihood), resulting in "confidence of compliance" graphs. These allow simple reading of the consumer's risk and the supplier's risks for any proposed rule. The influence of the prior assumptions required by the Bayesian technique on the confidence results is demonstrated, using two reference priors (uniform and Jeffreys') and also using optimistic and pessimistic user-defined priors. All four give less pessimistic results than does the classical technique, because interpreting classical results as "confidence of compliance" actually invokes a Bayesian approach with an extreme prior distribution. Jeffreys' prior is shown to be the most generally appropriate choice of prior distribution. Cost savings can be expected using rules based on this approach.
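
    The resulting "confidence of compliance" is just a beta posterior probability that the exceedance proportion meets the standard; a sketch with the Jeffreys prior the authors favor (the batch numbers are invented):

    ```python
    from scipy.stats import beta

    a0, b0 = 0.5, 0.5       # Jeffreys prior
    n, exceedances = 52, 2  # hypothetical monitoring batch
    p_standard = 0.05       # e.g., a 95th-percentile standard

    post = beta(a0 + exceedances, b0 + n - exceedances)
    print(f"P(exceedance rate <= {p_standard}) = {post.cdf(p_standard):.3f}")
    ```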

  4. Incorporating Skew into RMS Surface Roughness Probability Distribution

    NASA Technical Reports Server (NTRS)

    Stahl, Mark T.; Stahl, H. Philip.

    2013-01-01

    The standard treatment of RMS surface roughness data is the application of a Gaussian probability distribution. This handling of surface roughness ignores the skew present in the surface and overestimates the most probable RMS of the surface, the mode. Using experimental data, we confirm that the Gaussian distribution overestimates the mode and that application of an asymmetric distribution provides a better fit. Implementing the proposed asymmetric distribution in the optical manufacturing process would reduce the polishing time required to meet surface roughness specifications.

  5. Statistical guides to estimating the number of undiscovered mineral deposits: an example with porphyry copper deposits

    USGS Publications Warehouse

    Singer, Donald A.; Menzie, W.D.; Cheng, Qiuming; Bonham-Carter, G. F.

    2005-01-01

    Estimating numbers of undiscovered mineral deposits is a fundamental part of assessing mineral resources. Some statistical tools can act as guides to low variance, unbiased estimates of the number of deposits. The primary guide is that the estimates must be consistent with the grade and tonnage models. Another statistical guide is the deposit density (i.e., the number of deposits per unit area of permissive rock in well-explored control areas). Preliminary estimates and confidence limits of the number of undiscovered deposits in a tract of given area may be calculated using linear regression and refined using frequency distributions with appropriate parameters. A Poisson distribution leads to estimates having lower relative variances than the regression estimates and implies a random distribution of deposits. Coefficients of variation are used to compare uncertainties of negative binomial, Poisson, or MARK3 empirical distributions that have the same expected number of deposits as the deposit density. Statistical guides presented here allow simple yet robust estimation of the number of undiscovered deposits in permissive terranes. 

  6. The Estimation of Tree Posterior Probabilities Using Conditional Clade Probability Distributions

    PubMed Central

    Larget, Bret

    2013-01-01

    In this article I introduce the idea of conditional independence of separated subtrees as a principle by which to estimate the posterior probability of trees using conditional clade probability distributions rather than simple sample relative frequencies. I describe an algorithm for these calculations and software which implements these ideas. I show that these alternative calculations are very similar to simple sample relative frequencies for high probability trees but are substantially more accurate for relatively low probability trees. The method allows the posterior probability of unsampled trees to be calculated when these trees contain only clades that are in other sampled trees. Furthermore, the method can be used to estimate the total probability of the set of sampled trees which provides a measure of the thoroughness of a posterior sample. [Bayesian phylogenetics; conditional clade distributions; improved accuracy; posterior probabilities of trees.] PMID:23479066

  7. Negative Binomial Fits to Multiplicity Distributions from Central Collisions of (16)O+Cu at 14.6A GeV/c and Intermittency

    NASA Technical Reports Server (NTRS)

    Tannenbaum, M. J.

    1994-01-01

    The concept of "Intermittency" was introduced by Bialas and Peschanski to try to explain the "large" fluctuations of multiplicity in restricted intervals of rapidity or pseudorapidity. A formalism was proposed to study non-statistical (more precisely, non-Poisson) fluctuations as a function of the size of the rapidity interval, and it was further suggested that the "spikes" in the rapidity fluctuations were evidence of fractal or intermittent behavior, in analogy to turbulence in fluid dynamics, which is characterized by self-similar fluctuations at all scales, that is, the absence of a well-defined length scale.

  8. Definite Integrals, Some Involving Residue Theory Evaluated by Maple Code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowman, Kimiko o

    2010-01-01

    The calculus of residues is applied to evaluate certain integrals over the range $(-\infty, \infty)$ using the Maple symbolic code. These integrals are of the form $\int_{-\infty}^{\infty} \frac{\cos x}{(x^{2}+a^{2})(x^{2}+b^{2})(x^{2}+c^{2})}\,dx$ and similar extensions. The Maple code is also applied to expressions for maximum likelihood estimator moments when sampling from the negative binomial distribution. In general, the Maple code approach to the integrals gives correct answers to the specified number of decimal places, but the symbolic result may be extremely long and complex.
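
    For distinct positive $a$, $b$, $c$, closing the contour in the upper half-plane at the poles $ia$, $ib$, $ic$ gives the standard residue result $\int_{-\infty}^{\infty} \frac{\cos x\,dx}{(x^{2}+a^{2})(x^{2}+b^{2})(x^{2}+c^{2})} = \pi\left[\frac{e^{-a}}{a(b^{2}-a^{2})(c^{2}-a^{2})} + \frac{e^{-b}}{b(a^{2}-b^{2})(c^{2}-b^{2})} + \frac{e^{-c}}{c(a^{2}-c^{2})(b^{2}-c^{2})}\right]$ (a textbook computation, not the paper's Maple output), which a quick quadrature check confirms:

    ```python
    import numpy as np
    from scipy.integrate import quad

    a, b, c = 1.0, 2.0, 3.0

    num, _ = quad(lambda x: np.cos(x) / ((x**2 + a**2) * (x**2 + b**2) * (x**2 + c**2)),
                  -np.inf, np.inf)

    # one residue term per pole in the upper half-plane
    term = lambda u, v, w: np.exp(-u) / (u * (v**2 - u**2) * (w**2 - u**2))
    closed = np.pi * (term(a, b, c) + term(b, a, c) + term(c, a, b))
    print(num, closed)  # the two values agree to quadrature accuracy
    ```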

  9. Vertical changes in the probability distribution of downward irradiance within the near-surface ocean under sunny conditions

    NASA Astrophysics Data System (ADS)

    Gernez, Pierre; Stramski, Dariusz; Darecki, Miroslaw

    2011-07-01

    Time series measurements of fluctuations in underwater downward irradiance, Ed, within the green spectral band (532 nm) show that the probability distribution of instantaneous irradiance varies greatly as a function of depth within the near-surface ocean under sunny conditions. Because of intense light flashes caused by surface wave focusing, the near-surface probability distributions are highly skewed to the right and are heavy tailed. The coefficients of skewness and excess kurtosis at depths smaller than 1 m can exceed 3 and 20, respectively. We tested several probability models, such as lognormal, Gumbel, Fréchet, log-logistic, and Pareto, which are potentially suited to describe the highly skewed heavy-tailed distributions. We found that the models cannot approximate with consistently good accuracy the high irradiance values within the right tail of the experimental distribution where the probability of these values is less than 10%. This portion of the distribution corresponds approximately to light flashes with Ed > 1.5⟨Ed⟩, where ⟨Ed⟩ is the time-averaged downward irradiance. However, the remaining part of the probability distribution covering all irradiance values smaller than the 90th percentile can be described with a reasonable accuracy (i.e., within 20%) with a lognormal model for all 86 measurements from the top 10 m of the ocean included in this analysis. As the intensity of irradiance fluctuations decreases with depth, the probability distribution tends toward a function symmetrical around the mean like the normal distribution. For the examined data set, the skewness and excess kurtosis assumed values very close to zero at a depth of about 10 m.

  10. Estimating cavity tree and snag abundance using negative binomial regression models and nearest neighbor imputation methods

    Treesearch

    Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett

    2009-01-01

    Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....

  11. The influence of environmental variables on the presence of white sharks, Carcharodon carcharias at two popular Cape Town bathing beaches: a generalized additive mixed model.

    PubMed

    Weltz, Kay; Kock, Alison A; Winker, Henning; Attwood, Colin; Sikweyiya, Monwabisi

    2013-01-01

    Shark attacks on humans are high profile events which can significantly influence policies related to the coastal zone. A shark warning system in South Africa, Shark Spotters, recorded 378 white shark (Carcharodon carcharias) sightings at two popular beaches, Fish Hoek and Muizenberg, during 3690 six-hour spotting shifts in the months September to May, 2006 to 2011. The probabilities of shark sightings were related to environmental variables using Binomial Generalized Additive Mixed Models (GAMMs). Sea surface temperature was significant, with the probability of shark sightings increasing rapidly as SST exceeded 14 °C and approaching a maximum at 18 °C, whereafter it remained high. An 8 times (Muizenberg) and 5 times (Fish Hoek) greater likelihood of sighting a shark was predicted at 18 °C than at 14 °C. Lunar phase was also significant, with a prediction of 1.5 times (Muizenberg) and 4 times (Fish Hoek) greater likelihood of a shark sighting at new moon than at full moon. At Fish Hoek, the probability of sighting a shark was 1.6 times higher during the afternoon shift compared to the morning shift, but no diel effect was found at Muizenberg. A significant increase in the number of shark sightings was identified over the last three years, highlighting the need for ongoing research into shark attack mitigation. These patterns will be incorporated into shark awareness and bather safety campaigns in Cape Town.

  12. Utilization of infertility services: how much does money matter?

    PubMed

    Farley Ordovensky Staniec, J; Webb, Natalie J

    2007-06-01

    To estimate the effects of financial access and other individual characteristics on the likelihood that a woman pursues infertility treatment and on her choice of treatment type. The 1995 National Survey of Family Growth. We use a binomial logit model to estimate the effects of financial access and individual characteristics on the likelihood that a woman pursues infertility treatment. We then use a multinomial logit model to estimate the differential effects of these variables across treatment types. This study analyzes the subset of 1,210 women who meet the definition of infertile or subfecund from the 1995 National Survey of Family Growth. We find that income, insurance coverage, age, and parity (number of previous births) all significantly affect the probability of seeking infertility treatment; however, the effect of these variables on the choice of treatment type varies significantly. Neither income nor insurance influences the probability of seeking advice, a relatively low-cost, low-yield treatment. At the other end of the spectrum, the choice to pursue assisted reproductive technologies (ARTs), a much more expensive but potentially more productive option, is highly influenced by income, but merely having private insurance has no significant effect. In the middle of the spectrum are treatment options such as testing, surgery, and medications, for which financial access increases the probability of selection. Our results illustrate that, for the sample of infertile or subfecund women of childbearing age studied, and considering their options, financial access to infertility treatment does matter.

  13. The Influence of Environmental Variables on the Presence of White Sharks, Carcharodon carcharias at Two Popular Cape Town Bathing Beaches: A Generalized Additive Mixed Model

    PubMed Central

    Weltz, Kay; Kock, Alison A.; Winker, Henning; Attwood, Colin; Sikweyiya, Monwabisi

    2013-01-01

    Shark attacks on humans are high profile events which can significantly influence policies related to the coastal zone. A shark warning system in South Africa, Shark Spotters, recorded 378 white shark (Carcharodon carcharias) sightings at two popular beaches, Fish Hoek and Muizenberg, over 3690 six-hour spotting shifts during the months September to May, 2006 to 2011. The probabilities of shark sightings were related to environmental variables using Binomial Generalized Additive Mixed Models (GAMMs). Sea surface temperature was significant, with the probability of shark sightings increasing rapidly as SST exceeded 14°C and approaching a maximum at 18°C, whereafter it remained high. An 8 times (Muizenberg) and 5 times (Fish Hoek) greater likelihood of sighting a shark was predicted at 18°C than at 14°C. Lunar phase was also significant, with a predicted 1.5 times (Muizenberg) and 4 times (Fish Hoek) greater likelihood of a shark sighting at new moon than at full moon. At Fish Hoek, the probability of sighting a shark was 1.6 times higher during the afternoon shift than during the morning shift, but no diel effect was found at Muizenberg. A significant increase in the number of shark sightings was identified over the last three years, highlighting the need for ongoing research into shark attack mitigation. These patterns will be incorporated into shark awareness and bather safety campaigns in Cape Town. PMID:23874668

  14. Tackling missing radiographic progression data: multiple imputation technique compared with inverse probability weights and complete case analysis.

    PubMed

    Descalzo, Miguel Á; Garcia, Virginia Villaverde; González-Alvaro, Isidoro; Carbonell, Jordi; Balsa, Alejandro; Sanmartí, Raimon; Lisbona, Pilar; Hernandez-Barrera, Valentín; Jiménez-Garcia, Rodrigo; Carmona, Loreto

    2013-02-01

    To describe the results of different statistical ways of addressing radiographic outcome affected by missing data--multiple imputation technique, inverse probability weights and complete case analysis--using data from an observational study. A random sample of 96 RA patients was selected for a follow-up study in which radiographs of hands and feet were scored. Radiographic progression was tested by comparing the change in the total Sharp-van der Heijde radiographic score (TSS) and the joint erosion score (JES) from baseline to the end of the second year of follow-up. The MI technique, inverse probability weights in a weighted estimating equation (WEE) and CC analysis were used to fit a negative binomial regression. Major predictors of radiographic progression were JES and joint space narrowing (JSN) at baseline, together with baseline disease activity measured by DAS28 for TSS and MTX use for JES. Results from CC analysis show larger coefficients and standard errors compared with the MI and weighted techniques. The results from the WEE model were broadly in line with those of MI. If it seems plausible that CC or MI analysis may be valid, then MI should be preferred because of its greater efficiency. CC analysis resulted in inefficient estimates or, translated into non-statistical terminology, could lead to inaccurate results and unwise conclusions. The methods discussed here will contribute to the use of alternative approaches for tackling missing data in observational studies.

  15. Perceptions of Unprofessional Social Media Behavior Among Emergency Medicine Physicians.

    PubMed

    Soares, William; Shenvi, Christina; Waller, Nikki; Johnson, Reuben; Hodgson, Carol S

    2017-02-01

    Use of social media (SM) by physicians has exposed issues of privacy and professionalism. While guidelines have been created for SM use, details regarding specific SM behaviors that could lead to disciplinary action presently do not exist. To compare State Medical Board (SMB) directors' perceptions of investigation for specific SM behaviors with those of emergency medicine (EM) physicians. A multicenter anonymous survey was administered to physicians at 3 academic EM residency programs. Surveys consisted of case vignettes, asking, "If the SMB were informed of the content, how likely would they be to initiate an investigation, possibly leading to disciplinary action?" (1, very unlikely, to 4, very likely). Results were compared to published probabilities using exact binomial testing. Of 205 eligible physicians, 119 (58%) completed the survey. Compared to SMB directors, EM physicians indicated similar probabilities of investigation for themes involving identifying patient images, inappropriate communication, and discriminatory speech. Participants indicated lower probabilities of investigation for themes including derogatory speech (32%, 95% confidence interval [CI] 24-41 versus 46%, P < .05); alcohol intoxication (41%, 95% CI 32-51 versus 73%, P < .05); and holding alcohol without intoxication (7%, 95% CI 3-13 versus 40%, P < .05). There were no significant associations with position, hospital site, years since medical school, or prior SM professionalism training. Physicians reported a lower likelihood of investigation for themes that intersect with social identity, compared to SMB directors, particularly for images of alcohol and derogatory speech.

  16. Predicting the probability of slip in gait: methodology and distribution study.

    PubMed

    Gragg, Jared; Yang, James

    2016-01-01

    The likelihood of a slip is related to the available and required friction for a certain activity, here gait. Classical slip and fall analysis presumed that a walking surface was safe if the difference between the mean available and required friction coefficients exceeded a certain threshold. Previous research was dedicated to reformulating the classical slip and fall theory to include the stochastic variation of the available and required friction when predicting the probability of slip in gait. However, when predicting the probability of a slip, previous researchers have either ignored the variation in the required friction or assumed the available and required friction to be normally distributed. Also, there are no published results that actually give the probability of slip for various combinations of required and available frictions. This study proposes a modification to the equation for predicting the probability of slip, reducing the previous equation from a double-integral to a more convenient single-integral form. Also, a simple numerical integration technique is provided to predict the probability of slip in gait: the trapezoidal method. The effect of the random variable distributions on the probability of slip is also studied. It is shown that both the required and available friction distributions cannot automatically be assumed as being normally distributed. The proposed methods allow for any combination of distributions for the available and required friction, and numerical results are compared to analytical solutions for an error analysis. The trapezoidal method is shown to be highly accurate and efficient. The probability of slip is also shown to be sensitive to the input distributions of the required and available friction. Lastly, a critical value for the probability of slip is proposed based on the number of steps taken by an average person in a single day.
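    The single-integral form described here is P(slip) = P(available < required), which can be written as the integral of the available-friction CDF against the required-friction density. A minimal numerical sketch, assuming illustrative normal and lognormal distributions (not the paper's data) and the trapezoidal rule the authors propose:

      import numpy as np
      from scipy import stats
      from scipy.integrate import trapezoid

      # Illustrative distributions (assumptions, not the paper's data): the
      # available friction is taken as normal and the required friction as
      # lognormal, so neither is automatically assumed normal.
      avail = stats.norm(loc=0.45, scale=0.08)   # available friction coefficient
      req = stats.lognorm(s=0.25, scale=0.20)    # required friction coefficient

      # Single-integral form: P(slip) = P(available < required)
      #                               = integral of F_avail(u) * f_req(u) du
      u = np.linspace(0.0, 1.2, 2001)            # friction-coefficient grid
      integrand = avail.cdf(u) * req.pdf(u)
      p_slip = trapezoid(integrand, u)           # trapezoidal rule
      print(f"probability of slip ~ {p_slip:.4f}")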

  17. Integrated-Circuit Pseudorandom-Number Generator

    NASA Technical Reports Server (NTRS)

    Steelman, James E.; Beasley, Jeff; Aragon, Michael; Ramirez, Francisco; Summers, Kenneth L.; Knoebel, Arthur

    1992-01-01

    Integrated circuit produces 8-bit pseudorandom numbers from a specified probability distribution at a rate of 10 MHz. Using Boolean logic, the circuit implements a pseudorandom-number-generating algorithm. The circuit includes eight 12-bit pseudorandom-number generators whose outputs are uniformly distributed. 8-bit pseudorandom numbers satisfying the specified nonuniform probability distribution are generated by processing the uniformly distributed outputs of the eight 12-bit pseudorandom-number generators through a "pipeline" of D flip-flops, comparators, and memories implementing conditional probabilities on zeros and ones.
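    The chip maps uniform random bits to a nonuniform distribution through stored conditional probabilities. A minimal software analogue of that idea, assuming an arbitrary illustrative target pmf and using an inverse-CDF lookup in place of the comparator/memory pipeline, is sketched below:

      import numpy as np

      rng = np.random.default_rng(1)

      # Arbitrary nonuniform target pmf over 8-bit values (illustrative only).
      values = np.arange(256)
      pmf = np.exp(-0.5 * ((values - 100) / 30.0) ** 2)
      pmf /= pmf.sum()
      cdf = np.cumsum(pmf)

      # Uniform 12-bit pseudorandom integers stand in for the hardware
      # generators; the inverse-CDF lookup plays the role of the pipeline
      # of comparators and memories.
      u12 = rng.integers(0, 4096, size=100_000)
      samples = np.searchsorted(cdf, (u12 + 0.5) / 4096.0)
      print(f"sample mean {samples.mean():.1f}, std {samples.std():.1f}")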

  18. Inferring patterns in mitochondrial DNA sequences through hypercube independent spanning trees.

    PubMed

    Silva, Eduardo Sant Ana da; Pedrini, Helio

    2016-03-01

    Given a graph G, a set of spanning trees rooted at a vertex r of G is said to be vertex/edge independent if, for each vertex v of G, v≠r, the paths from r to v in any pair of trees are vertex/edge disjoint. Independent spanning trees (ISTs) provide a number of advantages in data broadcasting due to their fault tolerant properties. For this reason, some studies have addressed the issue by providing mechanisms for constructing independent spanning trees efficiently. In this work, we investigate how to construct independent spanning trees on hypercubes, which are generated based upon spanning binomial trees, and how to use them to predict mitochondrial DNA sequence parts through paths on the hypercube. The prediction works both for inferring mitochondrial DNA sequence parts composed of six bases and for inferring anomalies that probably do not belong to the mitochondrial DNA standard. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Bivariate normal, conditional and rectangular probabilities: A computer program with applications

    NASA Technical Reports Server (NTRS)

    Swaroop, R.; Brownlow, J. D.; Ashwworth, G. R.; Winter, W. R.

    1980-01-01

    Some results for bivariate normal distribution analysis are presented. Computer programs for conditional normal probabilities, marginal probabilities, and joint probabilities for rectangular regions are given; routines for computing fractile points and distribution functions are also presented. Some examples from a closed-circuit television experiment are included.
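    The rectangular and conditional probabilities described in this report are straightforward to reproduce with modern libraries. A short sketch with illustrative parameters (a standard bivariate normal with correlation 0.6, not the report's data), using inclusion-exclusion on the joint CDF and the closed-form conditional normal:

      import numpy as np
      from scipy.stats import multivariate_normal, norm

      # Illustrative parameters: standard margins, correlation rho = 0.6.
      mvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.6], [0.6, 1.0]])

      # Rectangular probability P(a1 < X < b1, a2 < Y < b2) by
      # inclusion-exclusion on the joint CDF.
      a1, b1, a2, b2 = -1.0, 1.0, -0.5, 2.0
      p_rect = (mvn.cdf([b1, b2]) - mvn.cdf([a1, b2])
                - mvn.cdf([b1, a2]) + mvn.cdf([a1, a2]))
      print(f"rectangular probability ~ {p_rect:.4f}")

      # Conditional probability: given X = x0, Y is normal with mean rho*x0
      # and variance 1 - rho**2 (standardized margins).
      rho, x0 = 0.6, 1.0
      p_cond = norm.cdf(2.0, loc=rho * x0, scale=np.sqrt(1 - rho**2))
      print(f"P(Y < 2 | X = 1) ~ {p_cond:.4f}")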

  20. Assessment of source probabilities for potential tsunamis affecting the U.S. Atlantic coast

    USGS Publications Warehouse

    Geist, E.L.; Parsons, T.

    2009-01-01

    Estimating the likelihood of tsunamis occurring along the U.S. Atlantic coast critically depends on knowledge of tsunami source probability. We review available information on both earthquake and landslide probabilities from potential sources that could generate local and transoceanic tsunamis. Estimating source probability includes defining both size and recurrence distributions for earthquakes and landslides. For the former distribution, source sizes are often distributed according to a truncated or tapered power-law relationship. For the latter distribution, sources are often assumed to occur in time according to a Poisson process, simplifying the way tsunami probabilities from individual sources can be aggregated. For the U.S. Atlantic coast, earthquake tsunami sources primarily occur at transoceanic distances along plate boundary faults. Probabilities for these sources are constrained from previous statistical studies of global seismicity for similar plate boundary types. In contrast, there is presently little information constraining landslide probabilities that may generate local tsunamis. Though there is significant uncertainty in tsunami source probabilities for the Atlantic, results from this study yield a comparative analysis of tsunami source recurrence rates that can form the basis for future probabilistic analyses.

  1. Patterns and determinants of use of pharmacological therapies for intermittent claudication in PAD outpatients: results of the IDOMENEO study.

    PubMed

    Cimminiello, Claudio; Polo Friz, Hernan; Marano, Giuseppe; Arpaia, Guido; Boracchi, Patrizia; Spezzigu, Gabriella; Visonà, Adriana

    2017-06-01

    Peripheral arterial disease (PAD) usually presents with intermittent claudication (IC). The aim of the present study was to assess, in clinical practice, the pattern of use of pharmacological therapies for IC in stable PAD outpatients. A propensity analysis was performed using data from the IDOMENEO study, an observational prospective multicenter cohort study. The association between any pharmacological symptomatic IC therapy and different variables was investigated using generalized linear mixed models with pharmacological therapy as the response variable and binomial error. The study population comprised 213 patients, 147 (69.0%) male, with a mean age of 70.0±8.6 years. Only 36.6% were under pharmacological treatment for IC, with cilostazol being the most frequently used medication (21.6%). Univariate analysis showed a probability of a patient being assigned to any pharmacological symptomatic IC therapy of 67.0% when the Ankle-Brachial Index (ABI) was <0.6 and 29.8% when the ABI was >0.6 (P=0.0048), and a propensity to avoid pharmacological treatment in patients with a high number of drugs to treat cardiovascular risk factors (probability of 55.2% for <4 drugs and 19.6% for >4 drugs, P=0.0317). Multivariate analysis confirmed a higher probability of assigning treatment for ABI <0.6 (P=0.0274), and a trend to a lower probability in patients under polypharmacy (>4 drugs: OR=0.13, P=0.0546). In clinical practice, only one third of stable outpatients with IC used symptomatic pharmacological therapy for IC. We found a propensity of clinicians to assign symptomatic pharmacological IC therapy to patients with lower values of ABI and a propensity to avoid this kind of treatment in patients under polypharmacy.

  2. The sedimentation properties of ferritins. New insights and analysis of methods of nanoparticle preparation

    PubMed Central

    May, Carrie A.; Grady, John K.; Laue, Thomas M.; Poli, Maura; Arosio, Paolo; Chasteen, N. Dennis

    2010-01-01

    Background Ferritin exhibits complex behavior in the ultracentrifuge due to variability in iron core size among molecules. A comprehensive study was undertaken to develop procedures for obtaining more uniform cores and assessing their homogeneity. Methods Analytical ultracentrifugation was used to measure the mineral core size distributions obtained by adding iron under high- and low-flux conditions to horse spleen (apoHoSF) and human H-chain (apoHuHF) apoferritins. Results More uniform core sizes are obtained with the homopolymer human H-chain ferritin than with the heteropolymer horse spleen HoSF protein, in which subpopulations of HoSF molecules with varying iron content are observed. A binomial probability distribution of H- and L-subunits among protein shells qualitatively accounts for the observed subpopulations. The addition of Fe2+ to apoHuHF produces iron core particle size diameters from 3.8 ± 0.3 to 6.2 ± 0.3 nm. Diameters from 3.4 ± 0.6 to 6.5 ± 0.6 nm are obtained with natural HoSF after sucrose gradient fractionation. The change in the sedimentation coefficient as iron accumulates in ferritin suggests that the protein shell contracts ~10% to a more compact structure, a finding consistent with published electron micrographs. The physicochemical parameters for apoHoSF (15%/85% H/L subunits) are M = 484,120 g/mol, ν̄ = 0.735 mL/g, s20,w = 17.0 S and D20,w = 3.21 × 10−7 cm2/s; and for apoHuHF, M = 506,266 g/mol, ν̄ = 0.724 mL/g, s20,w = 18.3 S and D20,w = 3.18 × 10−7 cm2/s. Significance The methods presented here should prove useful in the synthesis of size-controlled nanoparticles of other minerals. PMID:20307627
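    The binomial subunit model in the Results can be made concrete: apoferritin shells are 24-mers, and with roughly 15% H-chain content (the abstract's figure for HoSF), a random-assembly assumption puts the number of H-subunits per shell at Binomial(24, 0.15). A brief sketch of the resulting subpopulation structure:

      from scipy.stats import binom

      # 24 subunits per apoferritin shell; ~15% H-chain for horse spleen
      # ferritin (from the abstract).  Random assembly is an assumption.
      n_subunits, frac_h = 24, 0.15
      dist = binom(n_subunits, frac_h)
      for k in range(9):
          print(f"P({k} H-subunits per shell) = {dist.pmf(k):.3f}")
      print(f"mean H-subunits per shell: {dist.mean():.2f}")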

  3. Multinomial mixture model with heterogeneous classification probabilities

    USGS Publications Warehouse

    Holland, M.D.; Gray, B.R.

    2011-01-01

    Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.

  4. Spatial distribution of parasitism on Phyllocnistis citrella Stainton, 1856 (Lepidoptera: Gracillariidae) in citrus orchards.

    PubMed

    Jahnke, S M; Redaelli, L R; Diefenbach, L M G; Efrom, C F

    2008-11-01

    Many species of microhymenopterous parasitoids have been recorded on Phyllocnistis citrella, the citrus leafminer. The present study aimed to identify the spatial distribution pattern of the native and introduced parasitoids of P. citrella in two citrus orchards in Montenegro, RS. The new shoots from 24 randomly selected trees in each orchard were inspected at the bottom (0-1.5 m) and top (1.5-2.5 m) strata, and their position relative to the quadrants (North, South, East and West) was recorded every 15 days from July/2002 to June/2003. The leaves with pupae were collected and kept isolated until the emergence of parasitoids or of the leafminer; the sampling was therefore biased towards parasitoids that emerge in the host pupal phase. The horizontal spatial distribution was evaluated by testing the fit of the data to the Poisson and negative binomial distributions. In Montenegrina, there was no significant difference between the top and bottom strata in the number of parasitoids (chi2 = 0.66; df = 1; P > 0.05) or in the mean number of pupae (chi2 = 0.27; df = 1; P > 0.05). In relation to the quadrants, the highest average numbers of leafminer pupae and of parasitoids were recorded in the East quadrant (chi2 = 11.81; df = 3; P < 0.05), (chi2 = 10.36; df = 3; P < 0.05). In the Murcott orchard, a higher number of parasitoids was found in the top stratum (63.5%) (chi2 = 7.24; df = 1; P < 0.05), the same occurring with the average number of P. citrella pupae (62.9%) (chi2 = 6.66; df = 1; P < 0.05). The highest numbers of parasitoids and of miners were recorded in the North quadrant (chi2 = 19.29; df = 3; P < 0.05), (chi2 = 4.39; df = 3; P < 0.05). In both orchards, there was no difference in the number of shoots relative either to the strata or to the quadrants. As the number of shoots did not vary much among quadrants, it is possible that the higher numbers of miners and parasitoids in the East and West quadrants were influenced by the higher solar exposure of these quadrants. The data on the horizontal spatial distribution of parasitism fit the negative binomial distribution on all sampling occasions, indicating an aggregated pattern.
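    The aggregated pattern reported here corresponds to counts whose variance exceeds their mean. A minimal sketch, on simulated counts rather than the study's data, of checking overdispersion and estimating the negative binomial aggregation parameter k by the method of moments:

      import numpy as np

      rng = np.random.default_rng(7)

      # Hypothetical per-sample parasitism counts (illustrative only).
      counts = rng.negative_binomial(n=1.2, p=0.35, size=200)

      mean = counts.mean()
      var = counts.var(ddof=1)
      print(f"mean = {mean:.2f}, variance = {var:.2f}")  # variance >> mean

      # Method-of-moments negative binomial fit: var = mean + mean**2 / k,
      # so the aggregation parameter k is estimated as
      k_hat = mean**2 / (var - mean)
      print(f"k ~ {k_hat:.2f}  (small k indicates strong aggregation)")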

  5. Ubiquity of Benford's law and emergence of the reciprocal distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Friar, James Lewis; Goldman, Terrance J.; Pérez-Mercader, J.

    In this paper, we apply the Law of Total Probability to the construction of scale-invariant probability distribution functions (pdf's), and require that probability measures be dimensionless and unitless under a continuous change of scales. If the scale-change distribution function is scale invariant then the constructed distribution will also be scale invariant. Repeated application of this construction on an arbitrary set of (normalizable) pdf's results again in scale-invariant distributions. The invariant function of this procedure is given uniquely by the reciprocal distribution, suggesting a kind of universality. Finally, we separately demonstrate that the reciprocal distribution results uniquely from requiring maximum entropy for size-class distributions with uniform bin sizes.

  6. Positive phase space distributions and uncertainty relations

    NASA Technical Reports Server (NTRS)

    Kruger, Jan

    1993-01-01

    In contrast to a widespread belief, Wigner's theorem allows the construction of true joint probabilities in phase space for distributions describing the object system as well as for distributions depending on the measurement apparatus. The fundamental role of Heisenberg's uncertainty relations in Schroedinger form (including correlations) is pointed out for these two possible interpretations of joint probability distributions. Hence, in order that a multivariate normal probability distribution in phase space may correspond to a Wigner distribution of a pure or a mixed state, it is necessary and sufficient that Heisenberg's uncertainty relation in Schroedinger form should be satisfied.

  7. Bayesian analysis of volcanic eruptions

    NASA Astrophysics Data System (ADS)

    Ho, Chih-Hsiang

    1990-10-01

    The simple Poisson model generally gives a good fit to many volcanoes for volcanic eruption forecasting. Nonetheless, empirical evidence suggests that volcanic activity in successive equal time-periods tends to be more variable than a simple Poisson with constant eruptive rate. An alternative model is therefore examined in which the eruptive rate (λ) for a given volcano or cluster(s) of volcanoes is described by a gamma distribution (prior) rather than treated as a constant value as in the assumptions of a simple Poisson model. Bayesian analysis is performed to link the two distributions together to give the aggregate behavior of the volcanic activity. When the Poisson process is expanded to accommodate a gamma mixing distribution on λ, a consequence of this mixed (or compound) Poisson model is that the frequency distribution of eruptions in any given time-period of equal length follows the negative binomial distribution (NBD). Applications of the proposed model and comparisons between the generalized model and the simple Poisson model are discussed based on the historical eruptive count data of volcanoes Mauna Loa (Hawaii) and Etna (Italy). Several relevant facts lead to the conclusion that the generalized model is preferable for practical use both in space and time.
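    The gamma-Poisson mixture can be verified numerically: drawing λ from a gamma prior with shape α and rate β and then counts from Poisson(λ) reproduces the negative binomial pmf with r = α and success probability p = β/(β+1). A short simulation with illustrative hyperparameters:

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)

      # Gamma prior on the eruptive rate (illustrative hyperparameters).
      alpha, beta = 2.0, 0.5                       # shape, rate
      lam = rng.gamma(alpha, 1.0 / beta, size=200_000)
      counts = rng.poisson(lam)                    # compound (mixed) Poisson draws

      # Marginal counts are negative binomial: r = alpha, p = beta/(beta+1).
      k = np.arange(8)
      simulated = np.bincount(counts, minlength=8)[:8] / counts.size
      nb_pmf = stats.nbinom.pmf(k, alpha, beta / (beta + 1.0))
      for ki, s, t in zip(k, simulated, nb_pmf):
          print(f"k = {ki}: simulated {s:.4f}   NB pmf {t:.4f}")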

  8. Ubiquity of Benford's law and emergence of the reciprocal distribution

    DOE PAGES

    Friar, James Lewis; Goldman, Terrance J.; Pérez-Mercader, J.

    2016-04-07

    In this paper, we apply the Law of Total Probability to the construction of scale-invariant probability distribution functions (pdf's), and require that probability measures be dimensionless and unitless under a continuous change of scales. If the scale-change distribution function is scale invariant then the constructed distribution will also be scale invariant. Repeated application of this construction on an arbitrary set of (normalizable) pdf's results again in scale-invariant distributions. The invariant function of this procedure is given uniquely by the reciprocal distribution, suggesting a kind of universality. Finally, we separately demonstrate that the reciprocal distribution results uniquely from requiring maximum entropy for size-class distributions with uniform bin sizes.
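    The emergence of the reciprocal distribution under repeated changes of scale can be illustrated by simulation: multiplying an arbitrary positive sample by independent random scale factors drives the first-digit frequencies toward Benford's law, log10(1 + 1/d). A sketch in which both the initial pdf and the scale-factor distribution are arbitrary illustrative choices:

      import numpy as np

      rng = np.random.default_rng(3)

      # Repeated rescaling spreads log10(x) over many decades, so log10(x)
      # mod 1 approaches uniformity and the density approaches 1/x on each
      # decade; first digits then follow Benford's law.
      x = rng.uniform(1.0, 2.0, size=500_000)
      for _ in range(10):
          x *= rng.uniform(0.1, 10.0, size=x.size)   # one change of scale

      first_digit = (x / 10.0 ** np.floor(np.log10(x))).astype(int)
      freq = np.bincount(first_digit, minlength=10)[1:] / x.size
      benford = np.log10(1.0 + 1.0 / np.arange(1, 10))
      for d in range(9):
          print(f"digit {d + 1}: simulated {freq[d]:.3f}   Benford {benford[d]:.3f}")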

  9. Spatial Probability Distribution of Strata's Lithofacies and its Impacts on Land Subsidence in Huairou Emergency Water Resources Region of Beijing

    NASA Astrophysics Data System (ADS)

    Li, Y.; Gong, H.; Zhu, L.; Guo, L.; Gao, M.; Zhou, C.

    2016-12-01

    Continuous over-exploitation of groundwater causes dramatic drawdown and leads to regional land subsidence in the Huairou Emergency Water Resources region, which is located in the upper-middle part of the Chaobai river basin of Beijing. Owing to the spatial heterogeneity of strata's lithofacies of the alluvial fan, ground deformation has no significant positive correlation with groundwater drawdown, and one of the challenges ahead is to quantify the spatial distribution of strata's lithofacies. The transition probability geostatistics approach provides potential for characterizing the distribution of heterogeneous lithofacies in the subsurface. By combining the thickness of the clay layer extracted from the simulation with the deformation field acquired from PS-InSAR technology, the influence of strata's lithofacies on land subsidence can be analyzed quantitatively. The strata's lithofacies derived from borehole data were generalized into four categories, of which clay was the predominant compressible material, and their probability distribution in the observed space was mined using transition probability geostatistics. Geologically plausible realizations of lithofacies distribution were produced, accounting for complex heterogeneity in the alluvial plain. At a probability level of more than 40 percent, the volume of clay defined was 55 percent of the total volume of strata's lithofacies. This level, nearly equaling the volume of compressible clay derived from the geostatistics, was thus chosen to represent the boundary between compressible and uncompressible material. The method incorporates statistical geological information, such as distribution proportions, average lengths and juxtaposition tendencies of geological types, mainly derived from borehole data and expert knowledge, into the Markov chain model of transition probability. Some similarities of pattern were indicated between the spatial distributions of the deformation field and the clay layer: in areas with roughly similar water-table decline, subsidence occurred more where the subsurface had a higher probability of compressible material than where that probability was lower. Such an estimate of the spatial probability distribution is useful for analyzing the uncertainty of land subsidence.

  10. The exact probability distribution of the rank product statistics for replicated experiments.

    PubMed

    Eisinga, Rob; Breitling, Rainer; Heskes, Tom

    2013-03-18

    The rank product method is a widely accepted technique for detecting differentially regulated genes in replicated microarray experiments. To approximate the sampling distribution of the rank product statistic, the original publication proposed a permutation approach, whereas recently an alternative approximation based on the continuous gamma distribution was suggested. However, both approximations are imperfect for estimating small tail probabilities. In this paper we relate the rank product statistic to number theory and provide a derivation of its exact probability distribution and the true tail probabilities. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  11. Modeling the probability distribution of peak discharge for infiltrating hillslopes

    NASA Astrophysics Data System (ADS)

    Baiamonte, Giorgio; Singh, Vijay P.

    2017-07-01

    Hillslope response plays a fundamental role in the prediction of peak discharge at the basin outlet. The peak discharge for the critical duration of rainfall and its probability distribution are needed for designing urban infrastructure facilities. This study derives the probability distribution, denoted as the GABS model, by coupling three models: (1) the Green-Ampt model for computing infiltration, (2) the kinematic wave model for computing the discharge hydrograph from the hillslope, and (3) the intensity-duration-frequency (IDF) model for computing design rainfall intensity. The Hortonian mechanism for runoff generation is employed for computing the surface runoff hydrograph. Since the antecedent soil moisture condition (ASMC) significantly affects the rate of infiltration, its effect on the probability distribution of peak discharge is investigated. Application to a watershed in Sicily, Italy, shows that as the probability increases, the expected effect of ASMC in increasing the maximum discharge diminishes. Only for low values of probability is the critical duration of rainfall influenced by ASMC, whereas its effect on the peak discharge appears small for any probability. For a set of parameters, the derived probability distribution of peak discharge seems to be well fitted by the gamma distribution. Finally, an application to a small watershed, with the aim of testing the possibility of arranging in advance the runoff coefficient tables to be used for the rational method, and a comparison between peak discharges obtained by the GABS model and those measured in an experimental flume for a loamy-sand soil, were carried out.

  12. Narrow log-periodic modulations in non-Markovian random walks

    NASA Astrophysics Data System (ADS)

    Diniz, R. M. B.; Cressoni, J. C.; da Silva, M. A. A.; Mariz, A. M.; de Araújo, J. M.

    2017-12-01

    What are the necessary ingredients for log-periodicity to appear in the dynamics of a random walk model? Can they be subtle enough to be overlooked? Previous studies suggest that long-range damaged memory and negative feedback together are necessary conditions for the emergence of log-periodic oscillations. The role of negative feedback would then be crucial, forcing the system to change direction. In this paper we show that small-amplitude log-periodic oscillations can emerge when the system is driven by positive feedback. Due to their very small amplitude, these oscillations can easily be mistaken for numerical finite-size effects. The models we use consist of discrete-time random walks with strong memory correlations where the decision process is taken from memory profiles based either on a binomial distribution or on a delta distribution. Anomalous superdiffusive behavior and log-periodic modulations are shown to arise in the large time limit for convenient choices of the model parameters.

  13. It's time to move on from the bell curve.

    PubMed

    Robinson, Lawrence R

    2017-11-01

    The bell curve was first described in the 18th century by de Moivre and Gauss to depict the distribution of binomial events, such as coin tossing, or repeated measures of physical objects. In the 19th and 20th centuries, the bell curve was appropriated, or perhaps misappropriated, to apply to biologic and social measures across people. For many years we used it to derive reference values for our electrophysiologic studies. There is, however, no reason to believe that electrophysiologic measures should approximate a bell-curve distribution, and empiric evidence suggests they do not. The concept of using mean ± 2 standard deviations should be abandoned. Reference values are best derived by using non-parametric analyses, such as percentile values. This proposal aligns with the recommendation of the recent normative data task force of the American Association of Neuromuscular & Electrodiagnostic Medicine and follows sound statistical principles. Muscle Nerve 56: 859-860, 2017. © 2017 Wiley Periodicals, Inc.
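    The recommended percentile approach is simple to apply. A minimal sketch with simulated right-skewed data (illustrative values, not actual electrophysiologic measurements), contrasting mean ± 2 SD limits with non-parametric percentile-based reference limits:

      import numpy as np

      rng = np.random.default_rng(42)

      # Hypothetical right-skewed measurements (illustrative only).
      latencies = rng.lognormal(mean=1.2, sigma=0.25, size=300)

      # Parametric limits assume normality and can mislead for skewed data.
      mean, sd = latencies.mean(), latencies.std(ddof=1)
      print(f"mean ± 2 SD       : ({mean - 2 * sd:.2f}, {mean + 2 * sd:.2f})")

      # Non-parametric reference limits from percentiles, as recommended.
      lo, hi = np.percentile(latencies, [2.5, 97.5])
      print(f"2.5th-97.5th pct  : ({lo:.2f}, {hi:.2f})")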

  14. Host nutrition alters the variance in parasite transmission potential

    PubMed Central

    Vale, Pedro F.; Choisy, Marc; Little, Tom J.

    2013-01-01

    The environmental conditions experienced by hosts are known to affect their mean parasite transmission potential. How different conditions may affect the variance of transmission potential has received less attention, but is an important question for disease management, especially if specific ecological contexts are more likely to foster a few extremely infectious hosts. Using the obligate-killing bacterium Pasteuria ramosa and its crustacean host Daphnia magna, we analysed how host nutrition affected the variance of individual parasite loads, and, therefore, transmission potential. Under low food, individual parasite loads showed similar mean and variance, following a Poisson distribution. By contrast, among well-nourished hosts, parasite loads were right-skewed and overdispersed, following a negative binomial distribution. Abundant food may, therefore, yield individuals causing potentially more transmission than the population average. Measuring both the mean and variance of individual parasite loads in controlled experimental infections may offer a useful way of revealing risk factors for potential highly infectious hosts. PMID:23407498

  15. Host nutrition alters the variance in parasite transmission potential.

    PubMed

    Vale, Pedro F; Choisy, Marc; Little, Tom J

    2013-04-23

    The environmental conditions experienced by hosts are known to affect their mean parasite transmission potential. How different conditions may affect the variance of transmission potential has received less attention, but is an important question for disease management, especially if specific ecological contexts are more likely to foster a few extremely infectious hosts. Using the obligate-killing bacterium Pasteuria ramosa and its crustacean host Daphnia magna, we analysed how host nutrition affected the variance of individual parasite loads, and, therefore, transmission potential. Under low food, individual parasite loads showed similar mean and variance, following a Poisson distribution. By contrast, among well-nourished hosts, parasite loads were right-skewed and overdispersed, following a negative binomial distribution. Abundant food may, therefore, yield individuals causing potentially more transmission than the population average. Measuring both the mean and variance of individual parasite loads in controlled experimental infections may offer a useful way of revealing risk factors for potential highly infectious hosts.

  16. Discrete epidemic models with arbitrary stage distributions and applications to disease control.

    PubMed

    Hernandez-Ceron, Nancy; Feng, Zhilan; Castillo-Chavez, Carlos

    2013-10-01

    W.O. Kermack and A.G. McKendrick introduced in their fundamental paper, A Contribution to the Mathematical Theory of Epidemics, published in 1927, a deterministic model that captured the qualitative dynamic behavior of single infectious disease outbreaks. A Kermack–McKendrick discrete-time general framework, motivated by the emergence of a multitude of models used to forecast the dynamics of epidemics, is introduced in this manuscript. Results that allow us to measure quantitatively the role of classical and general distributions on disease dynamics are presented. The case of the geometric distribution is used to evaluate the impact of waiting-time distributions on epidemiological processes or public health interventions. In short, the geometric distribution is used to set up the baseline or null epidemiological model used to test the relevance of realistic stage-period distribution on the dynamics of single epidemic outbreaks. A final size relationship involving the control reproduction number, a function of transmission parameters and the means of distributions used to model disease or intervention control measures, is computed. Model results and simulations highlight the inconsistencies in forecasting that emerge from the use of specific parametric distributions. Examples, using the geometric, Poisson and binomial distributions, are used to highlight the impact of the choices made in quantifying the risk posed by single outbreaks and the relative importance of various control measures.

  17. Constructing inverse probability weights for continuous exposures: a comparison of methods.

    PubMed

    Naimi, Ashley I; Moodie, Erica E M; Auger, Nathalie; Kaufman, Jay S

    2014-03-01

    Inverse probability-weighted marginal structural models with binary exposures are common in epidemiology. Constructing inverse probability weights for a continuous exposure can be complicated by the presence of outliers, and the need to identify a parametric form for the exposure and account for nonconstant exposure variance. We explored the performance of various methods to construct inverse probability weights for continuous exposures using Monte Carlo simulation. We generated two continuous exposures and binary outcomes using data sampled from a large empirical cohort. The first exposure followed a normal distribution with homoscedastic variance. The second exposure followed a contaminated Poisson distribution, with heteroscedastic variance equal to the conditional mean. We assessed six methods to construct inverse probability weights using: a normal distribution, a normal distribution with heteroscedastic variance, a truncated normal distribution with heteroscedastic variance, a gamma distribution, a t distribution (1, 3, and 5 degrees of freedom), and a quantile binning approach (based on 10, 15, and 20 exposure categories). We estimated the marginal odds ratio for a single-unit increase in each simulated exposure in a regression model weighted by the inverse probability weights constructed using each approach, and then computed the bias and mean squared error for each method. For the homoscedastic exposure, the standard normal, gamma, and quantile binning approaches performed best. For the heteroscedastic exposure, the quantile binning, gamma, and heteroscedastic normal approaches performed best. Our results suggest that the quantile binning approach is a simple and versatile way to construct inverse probability weights for continuous exposures.
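    Of the approaches compared, quantile binning is the simplest to sketch. The toy example below (hypothetical data, and a crude confounder stratification standing in for a fitted exposure model) builds stabilized inverse probability weights by discretizing a continuous exposure into categories:

      import numpy as np
      import pandas as pd

      rng = np.random.default_rng(11)
      n = 5_000

      # Hypothetical confounder and continuous exposure (illustrative data).
      c = rng.normal(size=n)
      exposure = 0.5 * c + rng.normal(size=n)

      # Quantile binning: discretize the exposure, then weight each subject
      # by marginal bin probability / confounder-conditional bin probability.
      bins = pd.qcut(exposure, q=10, labels=False)     # 10 exposure categories
      marginal = np.bincount(bins) / n                 # stabilizing numerator

      # Conditional bin probabilities from a simple confounder stratification
      # (a real analysis would model these, e.g. by multinomial regression).
      c_strata = pd.qcut(c, q=5, labels=False)
      conditional = np.zeros(n)
      for s in range(5):
          in_s = c_strata == s
          probs = np.bincount(bins[in_s], minlength=10) / in_s.sum()
          conditional[in_s] = probs[bins[in_s]]

      weights = marginal[bins] / conditional
      print("weight percentiles (1, 50, 99):", np.percentile(weights, [1, 50, 99]))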

  18. Nonvisualization of the ovaries on pelvic ultrasound: does MRI add anything?

    PubMed

    Lisanti, Christopher J; Wood, Jonathan R; Schwope, Ryan B

    2014-02-01

    The purpose of our study is to assess the utility of pelvic magnetic resonance imaging (MRI) in the event that either one or both ovaries are not visualized by pelvic ultrasound. This HIPAA-compliant retrospective study was approved by our local institutional review board and informed consent waived. A total of 1926 pelvic MRI examinations between March 2007 and December 2011 were reviewed and included if a combined transabdominal and endovaginal pelvic ultrasound had been performed in the preceding 6 months with at least one ovary nonvisualized. Ovaries not visualized on pelvic ultrasound were assumed to be normal and compared with the pelvic MRI findings. MRI findings were categorized as concordant or discordant, and discordant findings were divided into malignant, non-malignant physiologic, or non-malignant non-physiologic. The modified Wald, "rule of thirds", and binomial distribution probability tests were performed. In all, 255 pelvic ultrasounds met inclusion criteria, with 364 ovaries not visualized. No malignancies were detected on MRI. Of the nonvisualized ovaries, 6.9% (25/364) had non-malignant discordant findings on MRI: 5.2% (19/364) physiologic and 1.6% (6/364) non-physiologic. Physiologic findings included 16 functional cysts and 3 hemorrhagic cysts. Non-physiologic findings included 3 cysts in post-menopausal women, 1 hydrosalpinx, and 2 broad ligament fibroids. The theoretical risk of detecting an ovarian carcinoma on pelvic MRI when an ovary is not visualized on ultrasound ranges from 0 to 1.3%. If an ovary is not visualized on pelvic ultrasound, it can be assumed to be without carcinoma, and MRI rarely adds additional information.
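    With zero events observed, upper confidence bounds come from the zero-numerator rule (what this abstract calls the "rule of thirds" is commonly known as the rule of three) or from solving the binomial probability of zero events directly. A quick check for the 364 nonvisualized ovaries:

      # Zero malignancies in n = 364 ovaries: an approximate 95% upper bound
      # on the true risk is 3/n; the exact bound solves (1 - p)**n = 0.05.
      n = 364
      print(f"3/n upper bound   : {3 / n:.3%}")
      p_upper = 1 - 0.05 ** (1 / n)
      print(f"exact upper bound : {p_upper:.3%}")  # P(0 events | p_upper) = 5%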

  19. Model studies of the beam-filling error for rain-rate retrieval with microwave radiometers

    NASA Technical Reports Server (NTRS)

    Ha, Eunho; North, Gerald R.

    1995-01-01

    Low-frequency (less than 20 GHz) single-channel microwave retrievals of rain rate encounter the problem of beam-filling error. This error stems from the fact that the relationship between microwave brightness temperature and rain rate is nonlinear, coupled with the fact that the field of view is large or comparable to important scales of variability of the rain field. This means that one may not simply insert the area average of the brightness temperature into the formula for rain rate without incurring both bias and random error. The statistical heterogeneity of the rain-rate field in the footprint of the instrument is key to determining the nature of these errors. This paper makes use of a series of random rain-rate fields to study the size of the bias and random error associated with beam filling. A number of examples are analyzed in detail: the binomially distributed field, the gamma, the Gaussian, the mixed gamma, the lognormal, and the mixed lognormal ('mixed' here means there is a finite probability of zero rain rate at a point of space-time). Of particular interest are the applicability of a simple error formula due to Chiu and collaborators and a formula that might hold in the large field-of-view limit. It is found that the simple formula holds for Gaussian rain-rate fields but begins to fail for highly skewed fields such as the mixed lognormal. While not conclusively demonstrated here, it is suggested that the notion of climatologically adjusting the retrievals to remove the beam-filling bias is a reasonable proposition.
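    The root of the beam-filling bias is Jensen's inequality: with a nonlinear brightness-temperature response, inverting the footprint-mean temperature does not recover the mean rain rate. A toy demonstration with an invented concave Tb(R) curve and a mixed lognormal rain field (all parameters illustrative, not from the paper):

      import numpy as np

      rng = np.random.default_rng(5)

      # Invented concave brightness-temperature response to rain rate.
      def tb(rain):
          return 280.0 - 120.0 * np.exp(-0.2 * rain)

      # Its exact inverse, as a retrieval would apply it.
      def tb_inverse(t):
          return -5.0 * np.log((280.0 - t) / 120.0)

      # Mixed lognormal rain field: 60% of points dry, the rest lognormal.
      n = 1_000_000
      raining = rng.random(n) < 0.4
      rain = np.where(raining, rng.lognormal(1.0, 1.0, n), 0.0)

      # Retrieval from the footprint-mean Tb underestimates the mean rain.
      true_mean = rain.mean()
      retrieved = tb_inverse(tb(rain).mean())
      print(f"true mean rain rate    : {true_mean:.3f}")
      print(f"retrieved from mean Tb : {retrieved:.3f}")
      print(f"beam-filling bias      : {retrieved - true_mean:.3f}")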

  20. Spatial Dependence and Sampling of Phytoseiid Populations on Hass Avocados in Southern California.

    PubMed

    Lara, Jesús R; Amrich, Ruth; Saremi, Naseem T; Hoddle, Mark S

    2016-04-22

    Research on phytoseiid mites has been critical for developing an effective biocontrol strategy for suppressing Oligonychus perseae Tuttle, Baker, and Abatiello (Acari: Tetranychidae) in California avocado orchards. However, basic understanding of the spatial ecology of natural populations of phytoseiids in relation to O. perseae infestations and the validation of research-based strategies for assessing densities of these predators has been limited. To address these shortcomings, cross-sectional and longitudinal observations consisting of >3,000 phytoseiids and 500,000 O. perseae counted on 11,341 leaves were collected across 10 avocado orchards during a 10-yr period. Subsets of these data were analyzed statistically to characterize the spatial distribution of phytoseiids in avocado orchards and to evaluate the merits of developing binomial and enumerative sampling strategies for these predators. Spatial correlation of phytoseiids between trees was detected at one site, and a strong association of phytoseiids with elevated O. perseae densities was detected at four sites. Sampling simulations revealed that enumeration-based sampling performed better than binomial sampling for estimating phytoseiid densities. The ecological implications of these findings and the potential for developing a custom sampling plan to estimate densities of phytoseiids inhabiting sampled trees in avocado orchards in California are discussed. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  1. Spatial distribution of the risk of dengue fever in southeast Brazil, 2006-2007

    PubMed Central

    2011-01-01

    Background Many factors have been associated with circulation of the dengue fever virus and vector, although the dynamics of transmission are not yet fully understood. The aim of this work is to estimate the spatial distribution of the risk of dengue fever in an area of continuous dengue occurrence. Methods This is a spatial population-based case-control study that analyzed 538 cases and 727 controls in one district of the municipality of Campinas, São Paulo, Brazil, from 2006-2007, considering socio-demographic, ecological, case severity, and household infestation variables. Information was collected by in-home interviews and inspection of living conditions in and around the homes studied. Cases were classified as mild or severe according to clinical data, and they were compared with controls through a multinomial logistic model. A generalized additive model was used in order to include space in a non-parametric fashion with cubic smoothing splines. Results Variables associated with increased incidence of all dengue cases in the multiple binomial regression model were: higher larval density (odds ratio (OR) = 2.3 (95%CI: 2.0-2.7)), reports of mosquito bites during the day (OR = 1.8 (95%CI: 1.4-2.4)), the practice of water storage at home (OR = 2.5 (95%CI: 1.4-4.3)), low frequency of garbage collection (OR = 2.6 (95%CI: 1.6-4.5)) and lack of basic sanitation (OR = 2.9 (95%CI: 1.8-4.9)). Staying at home during the day was protective against the disease (OR = 0.5 (95%CI: 0.3-0.6)). When cases were analyzed by categories (mild and severe) in the multinomial model, age and the presence of more than 10 breeding sites were significant only for the occurrence of severe cases (OR = 0.97 (95%CI: 0.96-0.99) and OR = 2.1 (95%CI: 1.2-3.5), respectively). Spatial distributions of the risks of mild and severe dengue fever differed from each other in the 2006/2007 epidemic in the study area. Conclusions Age and the presence of more than 10 breeding sites were significant only for severe cases. Other predictors of mild and severe cases were similar in the multiple models. The analyses of multinomial models and spatial distribution maps of dengue fever probabilities suggest an area-specific epidemic with varying clinical and demographic characteristics. PMID:21599980

  2. Plasmodium falciparum in the southeastern Atlantic forest: a challenge to the bromeliad-malaria paradigm?

    PubMed

    Laporta, Gabriel Zorello; Burattini, Marcelo Nascimento; Levy, Debora; Fukuya, Linah Akemi; de Oliveira, Tatiane Marques Porangaba; Maselli, Luciana Morganti Ferreira; Conn, Jan Evelyn; Massad, Eduardo; Bydlowski, Sergio Paulo; Sallum, Maria Anice Mureb

    2015-04-25

    Recently an unexpectedly high prevalence of Plasmodium falciparum was found in asymptomatic blood donors living in the southeastern Brazilian Atlantic forest. The bromeliad-malaria paradigm assumes that transmission of Plasmodium vivax and Plasmodium malariae involves species of the subgenus Kerteszia of Anopheles, and only a few cases of P. vivax malaria are reported annually in this region. The expectations of this paradigm are a low prevalence of P. vivax and a null prevalence of P. falciparum. Therefore, the aim of this study was to verify whether P. falciparum is actively circulating in remnants of the southeastern Brazilian Atlantic forest. In this study, anophelines were collected with Shannon and CDC light traps in seven distinct Atlantic forest landscapes over a 4-month period. Field-collected Anopheles mosquitoes were tested by real-time PCR assay in pools of ten, and then each mosquito from every positive pool was tested separately for P. falciparum and P. vivax. Genomic DNA of P. falciparum or P. vivax from positive anophelines was then amplified by traditional PCR for sequencing of the 18S ribosomal DNA to confirm the Plasmodium species. Binomial probabilities were calculated to identify non-random results among the P. falciparum-infected anopheline findings. The overall proportion of anophelines naturally infected with P. falciparum was 4.4% (21/480) and only 0.8% (4/480) with P. vivax. All of the infected mosquitoes were found in intermixed natural and human-modified environments and most were Anopheles cruzii (22/25 = 88%, 18 P. falciparum plus 4 P. vivax). Plasmodium falciparum was confirmed by sequencing in 76% (16/21) of positive mosquitoes, whereas P. vivax was confirmed in only 25% (1/4). Binomial probabilities suggest that P. falciparum actively circulates throughout the region and that there may be a threshold in the ratio of forested to human-modified environment above which the proportion of P. falciparum-infected anophelines increases significantly. These results show that P. falciparum actively circulates, in higher proportion than P. vivax, among Anopheles mosquitoes of fragments of the southeastern Brazilian Atlantic forest. This finding challenges the classical bromeliad-malaria paradigm, which considers P. vivax circulation as the driver for the dynamics of residual malaria transmission in this region.

  3. Induced abortion in a Southern European region: examining inequalities between native and immigrant women.

    PubMed

    Rodriguez-Alvarez, Elena; Borrell, Luisa N; González-Rábago, Yolanda; Martín, Unai; Lanborena, Nerea

    2016-09-01

    To examine induced abortion (IA) inequalities between native and immigrant women in a Southern European region and whether these inequalities depend on a 2010 Law facilitating IA. We conducted two analyses: (1) prevalence of total, repeat and second trimester IAs in native and immigrant women aged 12-49 years for the years 2009-2013 according to country of origin; and (2) log-binomial regression to quantify the association of place of origin with repeat and second trimester IAs among women with IAs. Immigrants were more likely to have an IA than Spanish women, with the highest probability among women from Sub-Saharan Africa (PR 8.32, 95% CI 3.66-18.92). Immigrant women with an IA from countries other than the Maghreb and Asia had higher probabilities of a repeat IA than women from Spain. Women from non-EU Europe/Romania were 50% (95% CI 0.30-0.79) less likely to have a second trimester IA, while women from Central America/Caribbean were 45% (95% CI 1.11-1.89) more likely than Spanish women. The 2010 Law did not affect these associations. There is a need for parenthood planning programs and for more information on and access to contraception methods, especially among immigrant women, to help decrease IAs.

  4. Estimating synaptic parameters from mean, variance, and covariance in trains of synaptic responses.

    PubMed

    Scheuss, V; Neher, E

    2001-10-01

    Fluctuation analysis of synaptic transmission using the variance-mean approach has been restricted in the past to steady-state responses. Here we extend this method to short repetitive trains of synaptic responses, during which the response amplitudes are not stationary. We consider intervals between trains, long enough so that the system is in the same average state at the beginning of each train. This allows analysis of ensemble means and variances for each response in a train separately. Thus, modifications in synaptic efficacy during short-term plasticity can be attributed to changes in synaptic parameters. In addition, we provide practical guidelines for the analysis of the covariance between successive responses in trains. Explicit algorithms to estimate synaptic parameters are derived and tested by Monte Carlo simulations on the basis of a binomial model of synaptic transmission, allowing for quantal variability, heterogeneity in the release probability, and postsynaptic receptor saturation and desensitization. We find that the combined analysis of variance and covariance is advantageous in yielding an estimate for the number of release sites, which is independent of heterogeneity in the release probability under certain conditions. Furthermore, it allows one to calculate the apparent quantal size for each response in a sequence of stimuli.
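    The variance-mean relation behind this approach follows from the binomial model: with N release sites, release probability p and quantal size q, the mean response is Npq and the variance is Npq²(1-p), so variance = q·mean − mean²/N. A simulation sketch (illustrative parameters, ignoring quantal variability and heterogeneity) that recovers q and N from the fitted parabola:

      import numpy as np

      rng = np.random.default_rng(9)

      # Simulated binomial release: N sites, quantal size q, with the
      # release probability p varied across conditions (all illustrative).
      N_true, q_true = 12, 0.8
      p_values = np.linspace(0.1, 0.9, 9)
      means, variances = [], []
      for p in p_values:
          amps = q_true * rng.binomial(N_true, p, size=2_000)
          means.append(amps.mean())
          variances.append(amps.var(ddof=1))
      means, variances = np.array(means), np.array(variances)

      # Variance-mean parabola: var = q*mean - mean**2/N.  A least-squares
      # fit of [mean, -mean**2] against var recovers q and 1/N.
      A = np.column_stack([means, -means**2])
      q_hat, invN_hat = np.linalg.lstsq(A, variances, rcond=None)[0]
      print(f"q ~ {q_hat:.2f} (true {q_true}), N ~ {1 / invN_hat:.1f} (true {N_true})")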

  5. Impact of a Letter-Grade Program on Restaurant Sanitary Conditions and Diner Behavior in New York City

    PubMed Central

    McKelvey, Wendy; Ito, Kazuhiko; Schiff, Corinne; Jacobson, J. Bryan; Kass, Daniel

    2015-01-01

    Objectives. We evaluated the impact of the New York City restaurant letter-grading program on restaurant hygiene, food safety practices, and public awareness. Methods. We analyzed data from 43 448 restaurants inspected between 2007 and 2013 to measure changes in inspection score and violation citations since program launch in July 2010. We used binomial regression to assess probability of scoring 0 to 13 points (A-range score). Two population-based random-digit-dial telephone surveys assessed public perceptions of the program. Results. After we controlled for repeated restaurant observations, season of inspection, and chain restaurant status, the probability of scoring 0 to 13 points on an unannounced inspection increased 35% (95% confidence interval [CI] = 31%, 40%) 3 years after compared with 3 years before grading. There were notable improvements in compliance with some specific requirements, including having a certified kitchen manager on site and being pest-free. More than 91% (95% CI = 88%, 94%) of New Yorkers approved of the program and 88% (95% CI = 85%, 92%) considered grades in dining decisions in 2012. Conclusions. Restaurant letter grading in New York City has resulted in improved sanitary conditions on unannounced inspection, suggesting that the program is an effective regulatory tool. PMID:25602861

  6. Effects of fish species composition on Diphyllobothrium spp. infections in brown trout - is three-spined stickleback a key species?

    PubMed

    Kuhn, J A; Frainer, A; Knudsen, R; Kristoffersen, R; Amundsen, P-A

    2016-11-01

    Subarctic populations of brown trout (Salmo trutta) are often heavily infected with cestodes of the genus Diphyllobothrium, assumedly because of their piscivorous behaviour. This study explores possible associations between availability of fish prey and Diphyllobothrium spp. infections in lacustrine trout populations. Trout in (i) allopatry (group T); (ii) sympatry with Arctic charr (Salvelinus alpinus) (group TC); and (iii) sympatry with charr and three-spined stickleback (Gasterosteus aculeatus) (group TCS) were contrasted. Mean abundance and intensity of Diphyllobothrium spp. were higher in group TCS compared to groups TC and T. Prevalence, however, was similarly higher in groups TCS and TC compared to group T. Zero-altered negative binomial modelling identified the lowest probability of infection in group T and similar probabilities of infection in groups TC and TCS, whereas the highest intensity was predicted in group TCS. The most infected trout were from the group co-occurring with stickleback (TCS), possibly due to a higher availability of fish prey. In conclusion, our study demonstrates elevated Diphyllobothrium spp. infections in lacustrine trout populations where fish prey are available and suggests that highly available and easily caught stickleback prey may play a key role in the transmission of Diphyllobothrium spp. parasite larvae. © 2016 John Wiley & Sons Ltd.

  7. Comparison of Deterministic and Probabilistic Radial Distribution Systems Load Flow

    NASA Astrophysics Data System (ADS)

    Gupta, Atma Ram; Kumar, Ashwani

    2017-12-01

    Distribution system networks today face the challenge of meeting increased load demands from the industrial, commercial and residential sectors. The pattern of load is highly dependent on consumer behavior and temporal factors such as season of the year, day of the week or time of day. For deterministic radial distribution load flow studies, load is taken as constant. But load varies continually and with a high degree of uncertainty, so there is a need to model probable realistic load. Monte-Carlo simulation is used to model probable realistic load by generating random values of active and reactive power load from the mean and standard deviation of the load and by solving a deterministic radial load flow for these values. The probabilistic solution is reconstructed from the deterministic data obtained for each simulation. The main contributions of the work are: finding the impact of probable realistic ZIP load modeling on balanced radial distribution load flow; finding the impact of probable realistic ZIP load modeling on unbalanced radial distribution load flow; and comparing the voltage profile and losses under probable realistic ZIP load modeling for balanced and unbalanced radial distribution load flow.
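    A minimal sketch of the Monte-Carlo procedure follows, with a toy surrogate standing in for a real backward/forward-sweep radial load-flow solver; the solver, feeder size and per-unit load statistics are all assumptions for illustration:

      import numpy as np

      rng = np.random.default_rng(2)

      # Hypothetical stand-in for a deterministic radial load-flow solver
      # (a real study would run a backward/forward sweep over the feeder).
      def radial_load_flow(p_load, q_load):
          s = np.hypot(p_load, q_load).sum()
          v_end = 1.0 - 0.03 * s            # toy end-bus voltage surrogate
          losses = 0.02 * s**2              # toy loss surrogate
          return v_end, losses

      # Probabilistic load model: normal active power per bus, fixed pf.
      n_bus, n_sim = 33, 10_000
      p_mu, p_sigma = 0.06, 0.012           # per-unit mean and std (assumed)
      v_samples, loss_samples = [], []
      for _ in range(n_sim):
          p = rng.normal(p_mu, p_sigma, n_bus).clip(min=0.0)
          q = 0.6 * p                       # assumed constant power factor
          v, loss = radial_load_flow(p, q)
          v_samples.append(v)
          loss_samples.append(loss)

      print(f"end-bus voltage: mean {np.mean(v_samples):.4f}, std {np.std(v_samples):.4f}")
      print(f"losses: mean {np.mean(loss_samples):.4f}, 95th pct {np.percentile(loss_samples, 95):.4f}")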

  8. Probability Distribution of Turbulent Kinetic Energy Dissipation Rate in Ocean: Observations and Approximations

    NASA Astrophysics Data System (ADS)

    Lozovatsky, I.; Fernando, H. J. S.; Planella-Morato, J.; Liu, Zhiyu; Lee, J.-H.; Jinadasa, S. U. P.

    2017-10-01

    The probability distribution of the turbulent kinetic energy dissipation rate in the stratified ocean usually deviates from the classic lognormal distribution that has been formulated for, and often observed in, unstratified homogeneous layers of atmospheric and oceanic turbulence. Our measurements of vertical profiles of micro-scale shear, collected in the East China Sea, the northern Bay of Bengal, to the south and east of Sri Lanka, and in the Gulf Stream region, show that the probability distributions of the averaged dissipation rate ε_r in the pycnoclines (r ≈ 1.4 m is the averaging scale) can be successfully modeled by the Burr (type XII) probability distribution. In weakly stratified boundary layers, the lognormal distribution of ε_r is preferable, although the Burr is an acceptable alternative. The skewness Sk_ε and the kurtosis K_ε of the dissipation rate appear to be well correlated over a wide range of Sk_ε and K_ε variability.
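    scipy ships the Burr type XII family as scipy.stats.burr12, so the paper's model comparison can be imitated on synthetic data. A sketch fitting Burr XII and lognormal distributions to an illustrative lognormal sample (normalized by its median to keep the optimizer well scaled):

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(8)

      # Hypothetical dissipation-rate sample (a lognormal stand-in for
      # measured epsilon values), normalized before fitting.
      eps = rng.lognormal(mean=-18.0, sigma=1.4, size=5_000)
      eps_n = eps / np.median(eps)

      # Compare Burr XII and lognormal fits by maximized log-likelihood.
      burr_p = stats.burr12.fit(eps_n, floc=0)
      logn_p = stats.lognorm.fit(eps_n, floc=0)
      ll_burr = stats.burr12.logpdf(eps_n, *burr_p).sum()
      ll_logn = stats.lognorm.logpdf(eps_n, *logn_p).sum()
      print(f"log-likelihood: Burr XII {ll_burr:.1f}, lognormal {ll_logn:.1f}")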

  9. Poisson and negative binomial item count techniques for surveys with sensitive question.

    PubMed

    Tian, Guo-Liang; Tang, Man-Lai; Wu, Qin; Liu, Yin

    2017-04-01

    Although the item count technique is useful in surveys with sensitive questions, privacy of those respondents who possess the sensitive characteristic of interest may not be well protected due to a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and negative binomial item count technique) which replace several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide a closed-form variance estimate and a confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents' privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that they yield accurate parameter estimates and confidence intervals.
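
    The logic behind item count designs can be illustrated with a toy simulation of the Poisson variant: the control arm reports only an innocuous Poisson count, the treatment arm adds one for the sensitive trait. The difference-in-means estimator below is the generic list-experiment one, not necessarily the exact estimator derived in the paper.

        import numpy as np

        rng = np.random.default_rng(7)
        n, lam, pi_true = 2000, 3.0, 0.30   # per-arm size, Poisson mean, true proportion

        # Control arm reports only the innocuous Poisson count; the treatment
        # arm adds 1 when the respondent carries the sensitive trait.
        control = rng.poisson(lam, n)
        treat = rng.poisson(lam, n) + (rng.random(n) < pi_true)

        pi_hat = treat.mean() - control.mean()       # difference-in-means estimate
        se = np.sqrt(treat.var(ddof=1)/n + control.var(ddof=1)/n)
        print(f"pi_hat = {pi_hat:.3f} +/- {1.96*se:.3f}")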

  10. Evaluation of the Three Parameter Weibull Distribution Function for Predicting Fracture Probability in Composite Materials

    DTIC Science & Technology

    1978-03-01

    This AFIT thesis (AFIT/GAE) evaluates the three-parameter Weibull distribution function for predicting fracture probability in composite materials, including an expression for the risk of rupture of a unidirectionally laminated composite subjected to pure bending.
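
    For reference, the three-parameter Weibull fracture probability takes the form P_f = 1 - exp(-((σ - σ_u)/σ_0)^m) above the threshold stress σ_u and is zero below it. A direct sketch, with illustrative threshold, scale and shape values rather than fitted material constants:

        import numpy as np

        def weibull3_fracture_prob(stress, sigma_u, sigma_0, m):
            """P_f = 1 - exp(-((s - sigma_u)/sigma_0)**m), zero below sigma_u."""
            z = np.clip((np.asarray(stress, float) - sigma_u) / sigma_0, 0.0, None)
            return 1.0 - np.exp(-z**m)

        # Illustrative threshold, scale and shape (Weibull modulus) values.
        print(weibull3_fracture_prob([50.0, 100.0, 150.0],
                                     sigma_u=40.0, sigma_0=120.0, m=8.0))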

  11. Bivariate extreme value distributions

    NASA Technical Reports Server (NTRS)

    Elshamy, M.

    1992-01-01

    In certain engineering applications, such as those occurring in the analyses of ascent structural loads for the Space Transportation System (STS), some of the load variables have a lower bound of zero. Thus, the need for practical models of bivariate extreme value probability distribution functions with lower limits was identified. We discuss the Gumbel models and present practical forms of bivariate extreme probability distributions of Weibull and Fréchet types with two parameters. Bivariate extreme value probability distribution functions can be expressed in terms of the marginal extremal distributions and a 'dependence' function subject to certain analytical conditions. Properties of such bivariate extreme distributions, sums and differences of paired extremals, as well as the corresponding forms of conditional distributions, are discussed. Practical estimation techniques are also given.
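
    One classical member of this family is the logistic (Gumbel) dependence model; the sketch below combines it with Weibull-type margins bounded below by zero. The parameter values are illustrative, and this is only one of the forms the report discusses, not its specific construction.

        import numpy as np
        from scipy import stats

        def bev_logistic_cdf(x, y, margin_x, margin_y, alpha):
            """Bivariate extreme-value CDF with logistic dependence:
            G(x,y) = exp(-[(-ln F1)^(1/a) + (-ln F2)^(1/a)]^a), 0 < a <= 1.
            alpha = 1 gives independence; alpha -> 0 complete dependence."""
            u = -np.log(margin_x.cdf(x))
            v = -np.log(margin_y.cdf(y))
            return np.exp(-(u**(1/alpha) + v**(1/alpha))**alpha)

        fx = stats.weibull_min(c=2.0, scale=10.0)   # margins with lower bound 0
        fy = stats.weibull_min(c=1.5, scale=5.0)
        print(bev_logistic_cdf(12.0, 6.0, fx, fy, alpha=0.5))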

  12. Eruption probabilities for the Lassen Volcanic Center and regional volcanism, northern California, and probabilities for large explosive eruptions in the Cascade Range

    USGS Publications Warehouse

    Nathenson, Manuel; Clynne, Michael A.; Muffler, L.J. Patrick

    2012-01-01

    Chronologies for eruptive activity of the Lassen Volcanic Center and for eruptions from the regional mafic vents in the surrounding area of the Lassen segment of the Cascade Range are here used to estimate probabilities of future eruptions. For the regional mafic volcanism, the ages of many vents are known only within broad ranges, and two models are developed that should bracket the actual eruptive ages. These chronologies are used with exponential, Weibull, and mixed-exponential probability distributions to match the data for time intervals between eruptions. For the Lassen Volcanic Center, the probability of an eruption in the next year is 1.4×10⁻⁴ for the exponential distribution and 2.3×10⁻⁴ for the mixed-exponential distribution. For the regional mafic vents, the exponential distribution gives a probability of an eruption in the next year of 6.5×10⁻⁴, but the mixed-exponential distribution indicates that the current probability, 12,000 years after the last event, could be significantly lower. For the exponential distribution, the highest probability is for an eruption from a regional mafic vent. Data on areas and volumes of lava flows and domes of the Lassen Volcanic Center and of eruptions from the regional mafic vents provide constraints on the probable sizes of future eruptions. Probabilities of lava-flow coverage are similar for the Lassen Volcanic Center and for regional mafic vents, whereas the probable eruptive volumes for the mafic vents are generally smaller. Data have been compiled for large explosive eruptions (≳5 km³ in deposit volume) in the Cascade Range during the past 1.2 m.y. in order to estimate probabilities of eruption. For erupted volumes ≳5 km³, the rate of occurrence since 13.6 ka is much higher than for the entire period, and we use these data to calculate an annual probability of a large eruption of 4.6×10⁻⁴. For erupted volumes ≥10 km³, the rate of occurrence has been reasonably constant from 630 ka to the present, giving more confidence in the estimate, and we use those data to calculate an annual probability of a large eruption of 1.4×10⁻⁵.
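
    The role of the repose-time model can be made concrete with a survival-function calculation: the exponential model is memoryless, while a mixed-exponential hazard falls as the repose lengthens. The mixture weights and rates below are placeholders, not the fitted values from the paper.

        import numpy as np

        def annual_prob(survival, t_since_last):
            """P(eruption within the next year | quiet for t_since_last years)."""
            s_now = survival(t_since_last)
            return (s_now - survival(t_since_last + 1.0)) / s_now

        lam = 6.5e-4                                    # events/yr, exponential model
        S_exp = lambda t: np.exp(-lam * t)

        w, lam1, lam2 = 0.7, 2e-3, 1e-4                 # illustrative mixture values
        S_mix = lambda t: w*np.exp(-lam1*t) + (1 - w)*np.exp(-lam2*t)

        print(annual_prob(S_exp, 12_000))   # ~6.5e-4: memoryless, time-independent
        print(annual_prob(S_mix, 12_000))   # lower: long repose favors the slow mode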

  13. Burst wait time simulation of CALIBAN reactor at delayed super-critical state

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Humbert, P.; Authier, N.; Richard, B.

    2012-07-01

    In the past, the super-prompt-critical wait time probability distribution was measured on the CALIBAN fast burst reactor [4]. Afterwards, these experiments were simulated with very good agreement by solving the non-extinction probability equation [5]. Recently, the burst wait time probability distribution has been measured at CEA-Valduc on CALIBAN at different delayed super-critical states [6]. However, in the delayed super-critical case the non-extinction probability does not give access to the wait time distribution. In this case it is necessary to compute the time-dependent evolution of the full neutron count number probability distribution. In this paper we present the point-model deterministic method used to calculate the probability distribution of the wait time before a prescribed count level, taking into account prompt neutrons and delayed neutron precursors. This method is based on the solution of the time-dependent adjoint Kolmogorov master equations for the number of detections, using the generating function methodology [8,9,10] and inverse discrete Fourier transforms. The obtained results are then compared to the measurements and to Monte-Carlo calculations based on the algorithm presented in [7]. (authors)
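
    The generating-function/inverse-DFT step can be shown in isolation: if the probability generating function G(z) of the count distribution can be evaluated, sampling it on the unit circle and applying a discrete Fourier transform recovers the count probabilities. This is only the inversion step, not the point-model master-equation solution itself.

        import numpy as np

        def pmf_from_pgf(G, M):
            """Recover P(N = 0..M-1) from a probability generating function by
            sampling it at the M-th roots of unity and applying a DFT."""
            z = np.exp(2j * np.pi * np.arange(M) / M)
            return np.fft.fft(G(z)).real / M

        # Check against a Poisson distribution, whose PGF is exp(lam*(z - 1)).
        lam = 4.2
        p = pmf_from_pgf(lambda z: np.exp(lam * (z - 1)), 64)
        print(p[:5])   # matches exp(-lam) * lam**k / k! for k = 0..4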

  14. Steady state, relaxation and first-passage properties of a run-and-tumble particle in one-dimension

    NASA Astrophysics Data System (ADS)

    Malakar, Kanaya; Jemseena, V.; Kundu, Anupam; Vijay Kumar, K.; Sabhapandit, Sanjib; Majumdar, Satya N.; Redner, S.; Dhar, Abhishek

    2018-04-01

    We investigate the motion of a run-and-tumble particle (RTP) in one dimension. We find the exact probability distribution of the particle with and without diffusion on the infinite line, as well as in a finite interval. In the infinite domain, this probability distribution approaches a Gaussian form in the long-time limit, as in the case of a regular Brownian particle. At intermediate times, this distribution exhibits unexpected multi-modal forms. In a finite domain, the probability distribution reaches a steady-state form with peaks at the boundaries, in contrast to a Brownian particle. We also study the relaxation to the steady-state analytically. Finally we compute the survival probability of the RTP in a semi-infinite domain with an absorbing boundary condition at the origin. In the finite interval, we compute the exit probability and the associated exit times. We provide numerical verification of our analytical results.
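
    A direct simulation reproduces the qualitative picture: at times of order the tumbling time the position histogram is strongly non-Gaussian, while at long times it Gaussianizes. The parameters below are arbitrary, and this sketch covers only the infinite-line case without translational diffusion.

        import numpy as np

        rng = np.random.default_rng(0)
        v, gamma, dt, t_end, n = 1.0, 1.0, 1e-3, 2.0, 100_000

        x = np.zeros(n)
        sigma = rng.choice([-1.0, 1.0], size=n)    # internal direction state
        for _ in range(int(t_end / dt)):
            x += v * sigma * dt                    # run
            flip = rng.random(n) < gamma * dt      # tumble at rate gamma
            sigma[flip] *= -1.0

        # At t ~ 1/gamma weight piles up near the ballistic fronts x = +/- v*t;
        # for t >> 1/gamma the variance grows diffusively, ~ (v**2/gamma) * t.
        print(x.mean(), x.var())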

  15. Fitness Probability Distribution of Bit-Flip Mutation.

    PubMed

    Chicano, Francisco; Sutton, Andrew M; Whitley, L Darrell; Alba, Enrique

    2015-01-01

    Bit-flip mutation is a common mutation operator for evolutionary algorithms applied to optimize functions over binary strings. In this paper, we develop results from the theory of landscapes and Krawtchouk polynomials to exactly compute the probability distribution of fitness values of a binary string undergoing uniform bit-flip mutation. We prove that this probability distribution can be expressed as a polynomial in p, the probability of flipping each bit. We analyze these polynomials and provide closed-form expressions for an easy linear problem (Onemax), and an NP-hard problem, MAX-SAT. We also discuss a connection of the results with runtime analysis.
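
    For Onemax the exact distribution can be written down directly: a parent of fitness k loses X ~ Bin(k, p) ones and gains Y ~ Bin(n-k, p) ones, so every probability is a polynomial in p, consistent with the general result described above. A sketch of this special case:

        import numpy as np
        from scipy import stats

        def onemax_mutant_pmf(n, k, p):
            """Exact pmf of the Onemax fitness of a mutant of a parent with k ones,
            after flipping each of n bits independently with probability p."""
            lost = stats.binom.pmf(np.arange(k + 1), k, p)           # ones lost
            gained = stats.binom.pmf(np.arange(n - k + 1), n - k, p) # ones gained
            pmf = np.zeros(n + 1)
            for x, px in enumerate(lost):
                for y, py in enumerate(gained):
                    pmf[k - x + y] += px * py    # each entry is a polynomial in p
            return pmf

        pmf = onemax_mutant_pmf(n=20, k=15, p=0.05)
        print(pmf.sum(), pmf.argmax())           # 1.0, most likely mutant fitness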

  16. Factors Associated With Bites to a Child From a Dog Living in the Same Home: A Bi-National Comparison.

    PubMed

    Messam, Locksley L McV; Kass, Philip H; Chomel, Bruno B; Hart, Lynette A

    2018-01-01

    We conducted a veterinary clinic-based retrospective cohort study aimed at identifying child-, dog-, and home-environment factors associated with dog bites to children aged 5-15 years living in the same home as a dog in Kingston, Jamaica (n = 236) and San Francisco, USA (n = 61). Secondarily, we wished to compare these factors to risk factors for dog bites to the general public. Participant information was collected via interviewer-administered questionnaire using proxy respondents. Data were analyzed using log-binomial regression to estimate relative risks and associated 95% confidence intervals (CIs) for each exposure-dog bite relationship. Exploiting the correspondence between X% confidence intervals and X% Bayesian probability intervals obtained using a uniform prior distribution, we calculated, for each exposure, the probability that the true (population) RR is ≥1.25 or ≤0.8, for positive or negative associations, respectively. Boys and younger children were at higher risk for bites than girls and older children, respectively. Dogs living in a home with no yard space were at an elevated risk (RR = 2.97; 95% CI: 1.06-8.33) of biting a child living in the same home, compared to dogs that had yard space. Dogs routinely allowed inside for some portion of the day (RR = 3.00; 95% CI: 0.94-9.62) and dogs routinely allowed to sleep in a family member's bedroom (RR = 2.82; 95% CI: 1.17-6.81) were also more likely to bite a child living in the home than those that were not. In San Francisco, but less so in Kingston, bites were inversely associated with the number of children in the home. In Kingston, but not in San Francisco, smaller breeds and dogs obtained for companionship were at higher risk for biting than larger breeds and dogs obtained for protection, respectively. Overall, for most exposures, the observed associations were consistent with population RRs of practical importance (i.e., RRs ≥ 1.25 or ≤0.8). Finally, we found substantial consistency between risk factors for bites to children and previously reported risk factors for general bites.
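
    A sketch of the two-step analysis on stand-in data: a log-binomial GLM for the relative risk, then a flat-prior normal reading of the fitted log(RR) to get P(RR ≥ 1.25). Variable names and data are illustrative, and log-binomial fits can fail to converge on some data sets.

        import numpy as np
        import statsmodels.api as sm
        from scipy import stats

        # Stand-in data: one binary exposure (e.g. "no yard space") and bite outcome.
        rng = np.random.default_rng(11)
        n = 297
        exposure = rng.integers(0, 2, n)
        bite = rng.binomial(1, np.where(exposure == 1, 0.20, 0.08))

        X = sm.add_constant(exposure)
        # Binomial family with log link, so exp(beta) is a relative risk.
        fit = sm.GLM(bite, X,
                     family=sm.families.Binomial(link=sm.families.links.Log())).fit()

        log_rr, se = fit.params[1], fit.bse[1]
        print("RR =", np.exp(log_rr))
        # Uniform-prior Bayesian reading: posterior of log(RR) ~ Normal(log_rr, se).
        print("P(RR >= 1.25) =", 1 - stats.norm.cdf(np.log(1.25), log_rr, se))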

  17. The variation in the health status of immigrants and Italians during the global crisis and the role of socioeconomic factors.

    PubMed

    Petrelli, Alessio; Di Napoli, Anteo; Rossi, Alessandra; Costanzo, Gianfranco; Mirisola, Concetta; Gargiulo, Lidia

    2017-06-12

    The effects of the recent global economic and financial crisis especially affected the most vulnerable social groups. The objective of the study was to investigate variation in self-perceived health status among Italians and immigrants during the global economic crisis, focusing on demographic and socioeconomic factors. Through a cross-sectional design we analyzed the national sample of the multipurpose surveys "Health conditions and use of health services" (2005 and 2013) conducted by the Italian National Institute of Statistics (ISTAT). Physical Component Summary (PCS) and Mental Component Summary (MCS) scores, derived from the SF-12 questionnaire, were taken as study outcomes, dichotomizing the variables' distributions at the 1st quartile. Prevalence rate ratios (PRR) were estimated through log-binomial regression models, stratified by citizenship and gender, evaluating the association of PCS and MCS with survey year, adjusting for age, educational level, employment status, self-perceived economic resources, smoking habits, and body mass index. From 2005 to 2013 the proportion of people not employed or reporting scarce/insufficient economic resources increased, especially among men, in particular immigrants. Compared with 2005, we observed in 2013 among Italians a significantly lower probability of worse PCS (PRR = 0.96 both for males and females), while no differences were observed among immigrants; a higher probability of worse MCS was observed, particularly among men (Italians: PRR = 1.26; 95%CI: 1.22-1.29; immigrants: PRR = 1.19; 95%CI: 1.03-1.38). Self-perceived scarce/insufficient economic resources were strongly and significantly associated with worse PCS and MCS for all subgroups. Lower educational level was strongly associated with worse PCS in Italians and slightly associated with worse MCS for all subgroups. Being not employed was associated with worse health status, especially mental health among men. Our findings support the hypothesis that the global economic crisis may have negatively affected the health status, particularly the mental health, of Italians and immigrants. Furthermore, the results suggest an increase in socioeconomic inequalities, particularly in the availability of economic resources. In a context of limited public health resources due to the financial crisis, policy decision makers and health service managers must face the challenge of equity in health.

  18. On the Relationship between Molecular Hit Rates in High-Throughput Screening and Molecular Descriptors.

    PubMed

    Hansson, Mari; Pemberton, John; Engkvist, Ola; Feierberg, Isabella; Brive, Lars; Jarvis, Philip; Zander-Balderud, Linda; Chen, Hongming

    2014-06-01

    High-throughput screening (HTS) is widely used in the pharmaceutical industry to identify novel chemical starting points for drug discovery projects. The current study focuses on the relationship between the molecular hit rate in recent in-house HTS and four common molecular descriptors: lipophilicity (ClogP), size (heavy atom count, HEV), fraction of sp³-hybridized carbons (Fsp3), and fraction of molecular framework (fMF). The molecular hit rate is defined as the fraction of times the molecule has been assigned as active in the HTS campaigns where it has been screened. Beta-binomial statistical models were built to model the molecular hit rate as a function of these descriptors. The advantage of the beta-binomial statistical models is that the correlation between the descriptors is taken into account. Higher-degree polynomial terms of the descriptors were also added to the beta-binomial statistical model to improve the model quality. The relative influence of the different molecular descriptors on the molecular hit rate has been estimated, taking into account that the descriptors are correlated with each other, by applying beta-binomial statistical modeling. The results show that ClogP has the largest influence on the molecular hit rate, followed by Fsp3 and HEV. fMF has only a minor influence besides its correlation with the other molecular descriptors. © 2013 Society for Laboratory Automation and Screening.

  19. Data mining of tree-based models to analyze freeway accident frequency.

    PubMed

    Chang, Li-Yen; Chen, Wen-Chieh

    2005-01-01

    Statistical models, such as Poisson or negative binomial regression models, have been employed to analyze vehicle accident frequency for many years. However, these models have their own model assumptions and pre-defined underlying relationship between dependent and independent variables. If these assumptions are violated, the model could lead to erroneous estimation of accident likelihood. Classification and Regression Tree (CART), one of the most widely applied data mining techniques, has been commonly employed in business administration, industry, and engineering. CART does not require any pre-defined underlying relationship between target (dependent) variable and predictors (independent variables) and has been shown to be a powerful tool, particularly for dealing with prediction and classification problems. This study collected the 2001-2002 accident data of National Freeway 1 in Taiwan. A CART model and a negative binomial regression model were developed to establish the empirical relationship between traffic accidents and highway geometric variables, traffic characteristics, and environmental factors. The CART findings indicated that the average daily traffic volume and precipitation variables were the key determinants for freeway accident frequencies. By comparing the prediction performance between the CART and the negative binomial regression models, this study demonstrates that CART is a good alternative method for analyzing freeway accident frequencies.
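
    In outline, the comparison pairs a regression tree with a negative binomial GLM on the same segment-level data. The sketch below uses stand-in data and sklearn's CART implementation; the variables mirror the abstract's key determinants but the coefficients are invented.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf
        from sklearn.tree import DecisionTreeRegressor

        rng = np.random.default_rng(5)
        n = 500
        df = pd.DataFrame({"adt": rng.uniform(10, 120, n),      # daily traffic (1000s)
                           "precip": rng.uniform(0, 3000, n)})  # precipitation (mm)
        df["crashes"] = rng.poisson(np.exp(-1.0 + 0.02*df.adt + 4e-4*df.precip))

        tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=30)
        tree.fit(df[["adt", "precip"]], df["crashes"])
        nb = smf.glm("crashes ~ adt + precip", data=df,
                     family=sm.families.NegativeBinomial(alpha=1.0)).fit()

        print(tree.feature_importances_)   # split importance: ADT vs precipitation
        print(nb.params)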

  20. The Impact of an Instructional Intervention Designed to Support Development of Stochastic Understanding of Probability Distribution

    ERIC Educational Resources Information Center

    Conant, Darcy Lynn

    2013-01-01

    Stochastic understanding of probability distribution undergirds development of conceptual connections between probability and statistics and supports development of a principled understanding of statistical inference. This study investigated the impact of an instructional course intervention designed to support development of stochastic…

  1. A new framework of statistical inferences based on the valid joint sampling distribution of the observed counts in an incomplete contingency table.

    PubMed

    Tian, Guo-Liang; Li, Hui-Qiong

    2017-08-01

    Some existing confidence interval methods and hypothesis testing methods in the analysis of a contingency table with incomplete observations in both margins entirely depend on an underlying assumption that the sampling distribution of the observed counts is a product of independent multinomial/binomial distributions for complete and incomplete counts. However, it can be shown that this independence assumption is incorrect and can result in unreliable conclusions because of the under-estimation of the uncertainty. Therefore, the first objective of this paper is to derive the valid joint sampling distribution of the observed counts in a contingency table with incomplete observations in both margins. The second objective is to provide a new framework for analyzing incomplete contingency tables based on the derived joint sampling distribution of the observed counts, by developing a Fisher scoring algorithm to calculate maximum likelihood estimates of the parameters of interest, bootstrap confidence interval methods, and bootstrap hypothesis testing methods. We compare the differences between the valid sampling distribution and the sampling distribution under the independence assumption. Simulation studies showed that the average/expected confidence-interval widths of parameters based on the sampling distribution under the independence assumption are shorter than those based on the new sampling distribution, yielding unrealistic results. A real data set is analyzed to illustrate the application of the new sampling distribution for incomplete contingency tables, and the analysis results again confirm the conclusions obtained from the simulation studies.

  2. Probability distributions of the electroencephalogram envelope of preterm infants.

    PubMed

    Saji, Ryoya; Hirasawa, Kyoko; Ito, Masako; Kusuda, Satoshi; Konishi, Yukuo; Taga, Gentaro

    2015-06-01

    To determine the stationary characteristics of electroencephalogram (EEG) envelopes for prematurely born (preterm) infants and investigate the intrinsic characteristics of early brain development in preterm infants. Twenty sets of EEGs recorded in neurologically normal infants with a post-conceptional age (PCA) range of 26-44 weeks (mean 37.5 ± 5.0 weeks) were analyzed. The Hilbert transform was applied to extract the envelope. We determined the suitable probability distribution of the envelope and performed a statistical analysis. It was found that (i) the probability distributions for preterm EEG envelopes were best fitted by lognormal distributions at 38 weeks PCA or less, and by gamma distributions at 44 weeks PCA; (ii) the scale parameter of the lognormal distribution had positive correlations with PCA as well as a strong negative correlation with the percentage of low-voltage activity; (iii) the shape parameter of the lognormal distribution had significant positive correlations with PCA; (iv) the mode statistic showed significant linear relationships with PCA and was therefore considered a useful index for PCA prediction. These statistics, including the scale parameter of the lognormal distribution and the skewness and mode derived from a suitable probability distribution, may be good indexes for estimating the stationary nature of developing brain activity in preterm infants. The stationary characteristics, such as discontinuity, asymmetry, and unimodality, of preterm EEGs are well captured by the statistics estimated from the probability distribution of the preterm EEG envelopes. Copyright © 2014 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
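
    The envelope extraction and the lognormal-versus-gamma comparison can be sketched as follows. The signal here is synthetic noise standing in for an EEG trace, and the sampling rate is an assumption, so the likelihoods illustrate the mechanics rather than the paper's results.

        import numpy as np
        from scipy.signal import hilbert
        from scipy import stats

        fs = 250.0                                   # sampling rate (Hz), assumed
        t = np.arange(0, 60, 1/fs)
        eeg = np.random.default_rng(0).standard_normal(t.size)  # stand-in trace

        envelope = np.abs(hilbert(eeg))              # amplitude envelope

        # Compare lognormal and gamma fits, as in the paper.
        s, loc, scale = stats.lognorm.fit(envelope, floc=0)
        a, locg, scaleg = stats.gamma.fit(envelope, floc=0)
        ll_logn = stats.lognorm.logpdf(envelope, s, loc, scale).sum()
        ll_gamma = stats.gamma.logpdf(envelope, a, locg, scaleg).sum()
        print(f"lognormal ll={ll_logn:.1f}, gamma ll={ll_gamma:.1f}")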

  3. A flexible count data model to fit the wide diversity of expression profiles arising from extensively replicated RNA-seq experiments

    PubMed Central

    2013-01-01

    Background High-throughput RNA sequencing (RNA-seq) offers unprecedented power to capture the real dynamics of gene expression. Experimental designs with extensive biological replication present a unique opportunity to exploit this feature and distinguish expression profiles with higher resolution. RNA-seq data analysis methods so far have been mostly applied to data sets with few replicates and their default settings try to provide the best performance under this constraint. These methods are based on two well-known count data distributions: the Poisson and the negative binomial. The way to properly calibrate them with large RNA-seq data sets is not trivial for the non-expert bioinformatics user. Results Here we show that expression profiles produced by extensively-replicated RNA-seq experiments lead to a rich diversity of count data distributions beyond the Poisson and the negative binomial, such as Poisson-Inverse Gaussian or Pólya-Aeppli, which can be captured by a more general family of count data distributions called the Poisson-Tweedie. The flexibility of the Poisson-Tweedie family enables a direct fitting of emerging features of large expression profiles, such as heavy-tails or zero-inflation, without the need to alter a single configuration parameter. We provide a software package for R called tweeDEseq implementing a new test for differential expression based on the Poisson-Tweedie family. Using simulations on synthetic and real RNA-seq data we show that tweeDEseq yields P-values that are equally or more accurate than competing methods under different configuration parameters. By surveying the tiny fraction of sex-specific gene expression changes in human lymphoblastoid cell lines, we also show that tweeDEseq accurately detects differentially expressed genes in a real large RNA-seq data set with improved performance and reproducibility over the previously compared methodologies. Finally, we compared the results with those obtained from microarrays in order to check for reproducibility. Conclusions RNA-seq data with many replicates leads to a handful of count data distributions which can be accurately estimated with the statistical model illustrated in this paper. This method provides a better fit to the underlying biological variability; this may be critical when comparing groups of RNA-seq samples with markedly different count data distributions. The tweeDEseq package forms part of the Bioconductor project and it is available for download at http://www.bioconductor.org. PMID:23965047

  4. Coronary artery calcium distributions in older persons in the AGES-Reykjavik study

    PubMed Central

    Gudmundsson, Elias Freyr; Gudnason, Vilmundur; Sigurdsson, Sigurdur; Launer, Lenore J.; Harris, Tamara B.; Aspelund, Thor

    2013-01-01

    Coronary Artery Calcium (CAC) is a sign of advanced atherosclerosis and an independent risk factor for cardiac events. Here, we describe CAC-distributions in an unselected aged population and compare modelling methods to characterize CAC-distribution. CAC is difficult to model because it has a skewed and zero inflated distribution with over-dispersion. Data are from the AGES-Reykjavik sample, a large population based study [2002-2006] in Iceland of 5,764 persons aged 66-96 years. Linear regressions using logarithmic- and Box-Cox transformations on CAC+1, quantile regression and a Zero-Inflated Negative Binomial model (ZINB) were applied. Methods were compared visually and with the PRESS-statistic, R2 and number of detected associations with concurrently measured variables. There were pronounced differences in CAC according to sex, age, history of coronary events and presence of plaque in the carotid artery. Associations with conventional coronary artery disease (CAD) risk factors varied between the sexes. The ZINB model provided the best results with respect to the PRESS-statistic, R2, and predicted proportion of zero scores. The ZINB model detected similar numbers of associations as the linear regression on ln(CAC+1) and usually with the same risk factors. PMID:22990371
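
    statsmodels provides a ZINB implementation suitable for this kind of comparison. The sketch below fits it to synthetic zero-inflated counts standing in for CAC scores (the AGES-Reykjavik data themselves are not reproduced here); convergence on real data may need tuned starting values.

        import numpy as np
        import statsmodels.api as sm
        from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

        rng = np.random.default_rng(1)
        n = 1000
        age = rng.uniform(66, 96, n)
        sex = rng.integers(0, 2, n)
        X = sm.add_constant(np.column_stack([age, sex]))
        zero = rng.random(n) < 0.3                      # structural zeros
        cac = np.where(zero, 0, rng.negative_binomial(1, 0.01, n))

        model = ZeroInflatedNegativeBinomialP(cac, X, exog_infl=X, p=2)  # NB2 counts
        res = model.fit(method="bfgs", maxiter=500, disp=False)
        print(res.params)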

  5. Measurement of higher cumulants of net-charge multiplicity distributions in Au+Au collisions at √s_NN = 7.7-200 GeV

    NASA Astrophysics Data System (ADS)

    Adare, A.; Afanasiev, S.; Aidala, C.; Ajitanand, N. N.; Akiba, Y.; Akimoto, R.; Al-Bataineh, H.; Alexander, J.; Al-Ta'Ani, H.; Angerami, A.; Aoki, K.; Apadula, N.; Aramaki, Y.; Asano, H.; Aschenauer, E. C.; Atomssa, E. T.; Averbeck, R.; Awes, T. C.; Azmoun, B.; Babintsev, V.; Bai, M.; Baksay, G.; Baksay, L.; Bannier, B.; Barish, K. N.; Bassalleck, B.; Basye, A. T.; Bathe, S.; Baublis, V.; Baumann, C.; Baumgart, S.; Bazilevsky, A.; Belikov, S.; Belmont, R.; Bennett, R.; Berdnikov, A.; Berdnikov, Y.; Bickley, A. A.; Black, D.; Blau, D. S.; Bok, J. S.; Boyle, K.; Brooks, M. L.; Bryslawskyj, J.; Buesching, H.; Bumazhnov, V.; Bunce, G.; Butsyk, S.; Camacho, C. M.; Campbell, S.; Castera, P.; Chen, C.-H.; Chi, C. Y.; Chiu, M.; Choi, I. J.; Choi, J. B.; Choi, S.; Choudhury, R. K.; Christiansen, P.; Chujo, T.; Chung, P.; Chvala, O.; Cianciolo, V.; Citron, Z.; Cole, B. A.; Connors, M.; Constantin, P.; Cronin, N.; Crossette, N.; Csanád, M.; Csörgő, T.; Dahms, T.; Dairaku, S.; Danchev, I.; Das, K.; Datta, A.; Daugherity, M. S.; David, G.; Dehmelt, K.; Denisov, A.; Deshpande, A.; Desmond, E. J.; Dharmawardane, K. V.; Dietzsch, O.; Ding, L.; Dion, A.; Do, J. H.; Donadelli, M.; D'Orazio, L.; Drapier, O.; Drees, A.; Drees, K. A.; Durham, J. M.; Durum, A.; Dutta, D.; Edwards, S.; Efremenko, Y. V.; Ellinghaus, F.; Engelmore, T.; Enokizono, A.; En'yo, H.; Esumi, S.; Eyser, K. O.; Fadem, B.; Fields, D. E.; Finger, M.; Finger, M.; Fleuret, F.; Fokin, S. L.; Fraenkel, Z.; Frantz, J. E.; Franz, A.; Frawley, A. D.; Fujiwara, K.; Fukao, Y.; Fusayasu, T.; Gainey, K.; Gal, C.; Garg, P.; Garishvili, A.; Garishvili, I.; Giordano, F.; Glenn, A.; Gong, H.; Gong, X.; Gonin, M.; Goto, Y.; Granier de Cassagnac, R.; Grau, N.; Greene, S. V.; Grosse Perdekamp, M.; Gu, Y.; Gunji, T.; Guo, L.; Gustafsson, H.-Å.; Hachiya, T.; Haggerty, J. S.; Hahn, K. I.; Hamagaki, H.; Hamblen, J.; Han, R.; Hanks, J.; Hartouni, E. P.; Hashimoto, K.; Haslum, E.; Hayano, R.; Hayashi, S.; He, X.; Heffner, M.; Hemmick, T. K.; Hester, T.; Hill, J. C.; Hohlmann, M.; Hollis, R. S.; Holzmann, W.; Homma, K.; Hong, B.; Horaguchi, T.; Hori, Y.; Hornback, D.; Huang, S.; Ichihara, T.; Ichimiya, R.; Ide, J.; Iinuma, H.; Ikeda, Y.; Imai, K.; Imazu, Y.; Imrek, J.; Inaba, M.; Iordanova, A.; Isenhower, D.; Ishihara, M.; Isinhue, A.; Isobe, T.; Issah, M.; Isupov, A.; Ivanishchev, D.; Jacak, B. V.; Javani, M.; Jia, J.; Jiang, X.; Jin, J.; Johnson, B. M.; Joo, K. S.; Jouan, D.; Jumper, D. S.; Kajihara, F.; Kametani, S.; Kamihara, N.; Kamin, J.; Kaneti, S.; Kang, B. H.; Kang, J. H.; Kang, J. S.; Kapustinsky, J.; Karatsu, K.; Kasai, M.; Kawall, D.; Kawashima, M.; Kazantsev, A. V.; Kempel, T.; Key, J. A.; Khandai, P. K.; Khanzadeev, A.; Kijima, K. M.; Kim, B. I.; Kim, C.; Kim, D. H.; Kim, D. J.; Kim, E.; Kim, E.-J.; Kim, H. J.; Kim, K.-B.; Kim, S. H.; Kim, Y.-J.; Kim, Y. K.; Kinney, E.; Kiriluk, K.; Kiss, Á.; Kistenev, E.; Klatsky, J.; Kleinjan, D.; Kline, P.; Kochenda, L.; Komatsu, Y.; Komkov, B.; Konno, M.; Koster, J.; Kotchetkov, D.; Kotov, D.; Kozlov, A.; Král, A.; Kravitz, A.; Krizek, F.; Kunde, G. J.; Kurita, K.; Kurosawa, M.; Kwon, Y.; Kyle, G. S.; Lacey, R.; Lai, Y. S.; Lajoie, J. G.; Lebedev, A.; Lee, B.; Lee, D. M.; Lee, J.; Lee, K.; Lee, K. B.; Lee, K. S.; Lee, S. H.; Lee, S. R.; Leitch, M. J.; Leite, M. A. L.; Leitgab, M.; Leitner, E.; Lenzi, B.; Lewis, B.; Li, X.; Liebing, P.; Lim, S. H.; Linden Levy, L. A.; Liška, T.; Litvinenko, A.; Liu, H.; Liu, M. X.; Love, B.; Luechtenborg, R.; Lynch, D.; Maguire, C. F.; Makdisi, Y. 
I.; Makek, M.; Malakhov, A.; Malik, M. D.; Manion, A.; Manko, V. I.; Mannel, E.; Mao, Y.; Maruyama, T.; Masui, H.; Masumoto, S.; Matathias, F.; McCumber, M.; McGaughey, P. L.; McGlinchey, D.; McKinney, C.; Means, N.; Meles, A.; Mendoza, M.; Meredith, B.; Miake, Y.; Mibe, T.; Midori, J.; Mignerey, A. C.; Mikeš, P.; Miki, K.; Milov, A.; Mishra, D. K.; Mishra, M.; Mitchell, J. T.; Miyachi, Y.; Miyasaka, S.; Mohanty, A. K.; Mohapatra, S.; Moon, H. J.; Morino, Y.; Morreale, A.; Morrison, D. P.; Moskowitz, M.; Motschwiller, S.; Moukhanova, T. V.; Murakami, T.; Murata, J.; Mwai, A.; Nagae, T.; Nagamiya, S.; Nagle, J. L.; Naglis, M.; Nagy, M. I.; Nakagawa, I.; Nakamiya, Y.; Nakamura, K. R.; Nakamura, T.; Nakano, K.; Nattrass, C.; Nederlof, A.; Netrakanti, P. K.; Newby, J.; Nguyen, M.; Nihashi, M.; Niida, T.; Nouicer, R.; Novitzky, N.; Nukariya, A.; Nyanin, A. S.; Obayashi, H.; O'Brien, E.; Oda, S. X.; Ogilvie, C. A.; Oka, M.; Okada, K.; Onuki, Y.; Oskarsson, A.; Ouchida, M.; Ozawa, K.; Pak, R.; Pantuev, V.; Papavassiliou, V.; Park, B. H.; Park, I. H.; Park, J.; Park, S.; Park, S. K.; Park, W. J.; Pate, S. F.; Patel, L.; Pei, H.; Peng, J.-C.; Pereira, H.; Perepelitsa, D. V.; Peresedov, V.; Peressounko, D. Yu.; Petti, R.; Pinkenburg, C.; Pisani, R. P.; Proissl, M.; Purschke, M. L.; Purwar, A. K.; Qu, H.; Rak, J.; Rakotozafindrabe, A.; Ravinovich, I.; Read, K. F.; Reygers, K.; Reynolds, D.; Riabov, V.; Riabov, Y.; Richardson, E.; Riveli, N.; Roach, D.; Roche, G.; Rolnick, S. D.; Rosati, M.; Rosen, C. A.; Rosendahl, S. S. E.; Rosnet, P.; Rukoyatkin, P.; Ružička, P.; Ryu, M. S.; Sahlmueller, B.; Saito, N.; Sakaguchi, T.; Sakashita, K.; Sako, H.; Samsonov, V.; Sano, M.; Sano, S.; Sarsour, M.; Sato, S.; Sato, T.; Sawada, S.; Sedgwick, K.; Seele, J.; Seidl, R.; Semenov, A. Yu.; Sen, A.; Seto, R.; Sett, P.; Sharma, D.; Shein, I.; Shibata, T.-A.; Shigaki, K.; Shimomura, M.; Shoji, K.; Shukla, P.; Sickles, A.; Silva, C. L.; Silvermyr, D.; Silvestre, C.; Sim, K. S.; Singh, B. K.; Singh, C. P.; Singh, V.; Skolnik, M.; Slunečka, M.; Solano, S.; Soltz, R. A.; Sondheim, W. E.; Sorensen, S. P.; Sourikova, I. V.; Sparks, N. A.; Stankus, P. W.; Steinberg, P.; Stenlund, E.; Stepanov, M.; Ster, A.; Stoll, S. P.; Sugitate, T.; Sukhanov, A.; Sun, J.; Sziklai, J.; Takagui, E. M.; Takahara, A.; Taketani, A.; Tanabe, R.; Tanaka, Y.; Taneja, S.; Tanida, K.; Tannenbaum, M. J.; Tarafdar, S.; Taranenko, A.; Tarján, P.; Tennant, E.; Themann, H.; Thomas, T. L.; Todoroki, T.; Togawa, M.; Toia, A.; Tomášek, L.; Tomášek, M.; Torii, H.; Towell, R. S.; Tserruya, I.; Tsuchimoto, Y.; Tsuji, T.; Vale, C.; Valle, H.; van Hecke, H. W.; Vargyas, M.; Vazquez-Zambrano, E.; Veicht, A.; Velkovska, J.; Vértesi, R.; Vinogradov, A. A.; Virius, M.; Voas, B.; Vossen, A.; Vrba, V.; Vznuzdaev, E.; Wang, X. R.; Watanabe, D.; Watanabe, K.; Watanabe, Y.; Watanabe, Y. S.; Wei, F.; Wei, R.; Wessels, J.; Whitaker, S.; White, S. N.; Winter, D.; Wolin, S.; Wood, J. P.; Woody, C. L.; Wright, R. M.; Wysocki, M.; Xia, B.; Xie, W.; Yamaguchi, Y. L.; Yamaura, K.; Yang, R.; Yanovich, A.; Ying, J.; Yokkaichi, S.; You, Z.; Young, G. R.; Younus, I.; Yushmanov, I. E.; Zajc, W. A.; Zelenski, A.; Zhang, C.; Zhou, S.; Zolin, L.; Phenix Collaboration

    2016-01-01

    We report the measurement of cumulants (C_n, n = 1, ..., 4) of the net-charge distributions measured within pseudorapidity (|η| < 0.35) in Au+Au collisions at √s_NN = 7.7-200 GeV with the PHENIX experiment at the Relativistic Heavy Ion Collider. The ratios of cumulants (e.g., C_1/C_2, C_3/C_1) of the net-charge distributions, which can be related to volume-independent susceptibility ratios, are studied as a function of centrality and energy. These quantities are important for understanding the quantum-chromodynamics phase diagram and the possible existence of a critical end point. The measured values are very well described by expectations from negative binomial distributions. We do not observe any nonmonotonic behavior in the ratios of the cumulants as a function of collision energy. The measured values of C_1/C_2 and C_3/C_1 can be directly compared to lattice quantum-chromodynamics calculations and thus allow extraction of both the chemical freeze-out temperature and the baryon chemical potential at each center-of-mass energy. The extracted baryon chemical potentials are in excellent agreement with a thermal-statistical analysis model.
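
    The negative binomial baseline is easy to check numerically: scipy returns mean, variance, skewness and excess kurtosis, from which C_1 = mean, C_2 = variance, C_3 = skew * C_2^(3/2), C_4 = (excess kurtosis) * C_2^2. Parameters below are illustrative; for net charge, cumulants of independent positive- and negative-charge distributions combine as C_n(+) + (-1)^n C_n(-).

        import numpy as np
        from scipy import stats

        r, q = 10.0, 0.4                        # scipy's nbinom (n, p) parameters
        m, var, g1, g2 = stats.nbinom.stats(r, q, moments="mvsk")
        C1, C2 = m, var
        C3 = g1 * var**1.5                      # third cumulant from skewness
        C4 = g2 * var**2                        # fourth from excess kurtosis
        print(C1/C2, C3/C1, C4/C2)              # volume-independent ratios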

  6. Flood Frequency Curves - Use of information on the likelihood of extreme floods

    NASA Astrophysics Data System (ADS)

    Faber, B.

    2011-12-01

    Investment in the infrastructure that reduces flood risk for flood-prone communities must incorporate information on the magnitude and frequency of flooding in that area. Traditionally, that information has been a probability distribution of annual maximum streamflows developed from the historical gaged record at a stream site. Practice in the United States fits a Log-Pearson Type III distribution to the annual maximum flows of an unimpaired streamflow record, using the method of moments to estimate distribution parameters. The procedure makes the assumptions that annual peak streamflow events are (1) independent, (2) identically distributed, and (3) form a representative sample of the overall probability distribution. Each of these assumptions can be challenged. We rarely have enough data to form a representative sample, and therefore must compute and display the uncertainty in the estimated flood distribution. But is there a wet/dry cycle that makes precipitation less than independent between successive years? Are the peak flows caused by different types of events from different statistical populations? How do changes in the watershed or climate over time (non-stationarity) affect the probability distribution of floods? Potential approaches to avoid these assumptions vary from estimating trend and shift and removing them from early data (and so forming a homogeneous data set), to methods that estimate statistical parameters that vary with time. A further issue in estimating a probability distribution of flood magnitude (the flood frequency curve) is whether a purely statistical approach can accurately capture the range and frequency of floods that are of interest. A meteorologically-based analysis produces "probable maximum precipitation" (PMP) and subsequently a "probable maximum flood" (PMF) that attempts to describe an upper bound on flood magnitude in a particular watershed. This analysis can help constrain the upper tail of the probability distribution, well beyond the range of gaged data or even historical or paleo-flood data, which can be very important in risk analyses performed for flood risk management and dam and levee safety studies.
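
    The standard fitting procedure named above is short enough to sketch: moments of the log-flows plus a Pearson III frequency factor. The annual maxima below are simulated stand-ins, and the regional-skew weighting and uncertainty analysis the abstract calls for are omitted.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(8)
        peaks = rng.lognormal(mean=6.0, sigma=0.5, size=80)  # stand-in annual maxima

        logs = np.log10(peaks)
        mean, std = logs.mean(), logs.std(ddof=1)
        skew = stats.skew(logs, bias=False)

        aep = 0.01                                  # annual exceedance probability
        k = stats.pearson3.ppf(1 - aep, skew)       # frequency factor for this skew
        q100 = 10 ** (mean + k * std)
        print(f"estimated 100-year flood: {q100:.0f} (input units)")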

  7. The probability and severity of decompression sickness

    PubMed Central

    Hada, Ethan A.; Vann, Richard D.; Denoble, Petar J.

    2017-01-01

    Decompression sickness (DCS), which is caused by inert gas bubbles in tissues, is an injury of concern for scuba divers, compressed air workers, astronauts, and aviators. Case reports for 3322 air and N2-O2 dives, resulting in 190 DCS events, were retrospectively analyzed and the outcomes were scored as (1) serious neurological, (2) cardiopulmonary, (3) mild neurological, (4) pain, (5) lymphatic or skin, and (6) constitutional or nonspecific manifestations. Following standard U.S. Navy medical definitions, the data were grouped into mild (Type I: manifestations 4-6) and serious (Type II: manifestations 1-3). Additionally, we considered an alternative grouping of mild (Type A: manifestations 3-6) and serious (Type B: manifestations 1 and 2). The current U.S. Navy guidance allows for a 2% probability of mild DCS and a 0.1% probability of serious DCS. We developed a hierarchical trinomial (3-state) probabilistic DCS model that simultaneously predicts the probability of mild and serious DCS given a dive exposure. Both the Type I/II and Type A/B discriminations of mild and serious DCS resulted in a highly significant (p << 0.01) improvement in trinomial model fit over the binomial (2-state) model. With the Type I/II definition, we found that the predicted probability of 'mild' DCS resulted in a longer allowable bottom time for the same 2% limit. However, for the 0.1% serious DCS limit, we found a vastly decreased allowable bottom dive time for all dive depths. If the Type A/B scoring was assigned to outcome severity, the no-decompression limits (NDL) for air dives were still controlled by the acceptable serious DCS risk limit rather than the acceptable mild DCS risk limit. However, in this case, longer NDL limits were allowed than with the Type I/II scoring. The trinomial model mild and serious probabilities agree reasonably well with the current air NDL only with the Type A/B scoring and when 0.2% risk of serious DCS is allowed. PMID:28296928

  8. The influence of baseline marijuana use on treatment of cocaine dependence: application of an informative-priors bayesian approach.

    PubMed

    Green, Charles; Schmitz, Joy; Lindsay, Jan; Pedroza, Claudia; Lane, Scott; Agnelli, Rob; Kjome, Kimberley; Moeller, F Gerard

    2012-01-01

    Marijuana use is prevalent among patients with cocaine dependence and often non-exclusionary in clinical trials of potential cocaine medications. The dual-focus of this study was to (1) examine the moderating effect of baseline marijuana use on response to treatment with levodopa/carbidopa for cocaine dependence; and (2) apply an informative-priors, Bayesian approach for estimating the probability of a subgroup-by-treatment interaction effect. A secondary data analysis of two previously published, double-blind, randomized controlled trials provided complete data for the historical (Study 1: N = 64 placebo), and current (Study 2: N = 113) data sets. Negative binomial regression evaluated Treatment Effectiveness Scores (TES) as a function of medication condition (levodopa/carbidopa, placebo), baseline marijuana use (days in past 30), and their interaction. Bayesian analysis indicated that there was a 96% chance that baseline marijuana use predicts differential response to treatment with levodopa/carbidopa. Simple effects indicated that among participants receiving levodopa/carbidopa the probability that baseline marijuana confers harm in terms of reducing TES was 0.981; whereas the probability that marijuana confers harm within the placebo condition was 0.163. For every additional day of marijuana use reported at baseline, participants in the levodopa/carbidopa condition demonstrated a 5.4% decrease in TES; while participants in the placebo condition demonstrated a 4.9% increase in TES. The potential moderating effect of marijuana on cocaine treatment response should be considered in future trial designs. Applying Bayesian subgroup analysis proved informative in characterizing this patient-treatment interaction effect.
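
    A frequentist core of this analysis (negative binomial regression with a medication-by-marijuana interaction) can be sketched with statsmodels on simulated data. The informative-priors Bayesian layer reported above is not reproduced here, and all coefficients below are stand-ins, loosely echoing the reported percentages.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(3)
        n = 177
        df = pd.DataFrame({"med": rng.integers(0, 2, n),    # 1 = levodopa/carbidopa
                           "mj": rng.integers(0, 31, n)})   # baseline marijuana days
        mu = np.exp(2.0 - 0.054*df.med*df.mj + 0.048*(1 - df.med)*df.mj)
        df["tes"] = rng.poisson(mu)             # stand-in for TES counts

        nb = smf.glm("tes ~ med * mj", data=df,
                     family=sm.families.NegativeBinomial(alpha=1.0)).fit()
        print(np.exp(nb.params["med:mj"]))      # multiplicative TES change per day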

  9. Estimating a test's accuracy using tailored meta-analysis-How setting-specific data may aid study selection.

    PubMed

    Willis, Brian H; Hyde, Christopher J

    2014-05-01

    To determine a plausible estimate for a test's performance in a specific setting using a new method for selecting studies. It is shown how routine data from practice may be used to define an "applicable region" for studies in receiver operating characteristic space. After qualitative appraisal, studies are selected based on the probability that their study accuracy estimates arose from parameters lying in this applicable region. Three methods for calculating these probabilities are developed and used to tailor the selection of studies for meta-analysis. The Pap test applied to the UK National Health Service (NHS) Cervical Screening Programme provides a case example. The meta-analysis for the Pap test included 68 studies, but at most 17 studies were considered applicable to the NHS. For conventional meta-analysis, the sensitivity and specificity (with 95% confidence intervals) were estimated to be 72.8% (65.8, 78.8) and 75.4% (68.1, 81.5) compared with 50.9% (35.8, 66.0) and 98.0% (95.4, 99.1) from tailored meta-analysis using a binomial method for selection. Thus, for a cervical intraepithelial neoplasia (CIN) 1 prevalence of 2.2%, the post-test probability for CIN 1 would increase from 6.2% to 36.6% between the two methods of meta-analysis. Tailored meta-analysis provides a method for augmenting study selection based on the study's applicability to a setting. As such, the summary estimate is more likely to be plausible for a setting and could improve diagnostic prediction in practice. Copyright © 2014 Elsevier Inc. All rights reserved.
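
    The post-test probabilities quoted above follow from Bayes' rule applied to the two pairs of summary estimates; a short check:

        def post_test_probability(sens, spec, prev):
            """Positive predictive value from sensitivity, specificity, prevalence."""
            return sens*prev / (sens*prev + (1 - spec)*(1 - prev))

        # CIN 1 prevalence of 2.2% with the two meta-analytic estimates above:
        print(post_test_probability(0.728, 0.754, 0.022))  # conventional: ~0.062
        print(post_test_probability(0.509, 0.980, 0.022))  # tailored:     ~0.366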

  10. A discrete choice model of drug abuse treatment location.

    PubMed Central

    Goodman, A C; Nishiura, E; Hankin, J R

    1998-01-01

    OBJECTIVE: To identify short-term drug abuse treatment location risk factors for ten large, self-insured firms starting January 1, 1989 and ending December 31, 1991. DATA SOURCES/STUDY SETTING: Study population selected from a large database of health insurance claims for all treatment events starting January 1, 1989 and ending December 31, 1991. STUDY DESIGN: A nested binomial logit method is used to estimate firm-specific patterns of treatment location. The differences in treatment location patterns among firms are then decomposed into firm effects (holding explanatory variables constant among firms) and variable effects (holding firm-specific parameters constant). PRINCIPAL FINDINGS: Probability of inpatient drug treatment is directly related to the type of drug diagnosis. The most important factors are diagnoses of drug dependence (versus drug abuse) and/or a cocaine dependence. Firm-specific factors also make a substantive difference. Controlling for patient risk factors, firm-specific probabilities of inpatient treatment vary by as much as 87 percent. Controlling for practices of firms and their insurance carriers, differing patient risk profiles cause probabilities of inpatient treatment to vary by as much as 69 percent among firms. Use of the outpatient setting increased over the three-year period. CONCLUSIONS: There are two plausible explanations for the findings. First, people beginning treatment later in the three-year period had less severe conditions than earlier cases and therefore had less need of inpatient treatment. Second, drug abuse treatment experienced the same trend toward the increased use of outpatient care that characterized treatment for other illnesses in the 1980s and early 1990s. PMID:9566181

  11. The Influence of Baseline Marijuana Use on Treatment of Cocaine Dependence: Application of an Informative-Priors Bayesian Approach

    PubMed Central

    Green, Charles; Schmitz, Joy; Lindsay, Jan; Pedroza, Claudia; Lane, Scott; Agnelli, Rob; Kjome, Kimberley; Moeller, F. Gerard

    2012-01-01

    Background: Marijuana use is prevalent among patients with cocaine dependence and often non-exclusionary in clinical trials of potential cocaine medications. The dual-focus of this study was to (1) examine the moderating effect of baseline marijuana use on response to treatment with levodopa/carbidopa for cocaine dependence; and (2) apply an informative-priors, Bayesian approach for estimating the probability of a subgroup-by-treatment interaction effect. Method: A secondary data analysis of two previously published, double-blind, randomized controlled trials provided complete data for the historical (Study 1: N = 64 placebo), and current (Study 2: N = 113) data sets. Negative binomial regression evaluated Treatment Effectiveness Scores (TES) as a function of medication condition (levodopa/carbidopa, placebo), baseline marijuana use (days in past 30), and their interaction. Results: Bayesian analysis indicated that there was a 96% chance that baseline marijuana use predicts differential response to treatment with levodopa/carbidopa. Simple effects indicated that among participants receiving levodopa/carbidopa the probability that baseline marijuana confers harm in terms of reducing TES was 0.981; whereas the probability that marijuana confers harm within the placebo condition was 0.163. For every additional day of marijuana use reported at baseline, participants in the levodopa/carbidopa condition demonstrated a 5.4% decrease in TES; while participants in the placebo condition demonstrated a 4.9% increase in TES. Conclusion: The potential moderating effect of marijuana on cocaine treatment response should be considered in future trial designs. Applying Bayesian subgroup analysis proved informative in characterizing this patient-treatment interaction effect. PMID:23115553

  12. Technology-enhanced Interactive Teaching of Marginal, Joint and Conditional Probabilities: The Special Case of Bivariate Normal Distribution

    PubMed Central

    Dinov, Ivo D.; Kamino, Scott; Bhakhrani, Bilal; Christou, Nicolas

    2014-01-01

    Summary Data analysis requires subtle probability reasoning to answer questions like What is the chance of event A occurring, given that event B was observed? This generic question arises in discussions of many intriguing scientific questions such as What is the probability that an adolescent weighs between 120 and 140 pounds given that they are of average height? and What is the probability of (monetary) inflation exceeding 4% and housing price index below 110? To address such problems, learning some applied, theoretical or cross-disciplinary probability concepts is necessary. Teaching such courses can be improved by utilizing modern information technology resources. Students’ understanding of multivariate distributions, conditional probabilities, correlation and causation can be significantly strengthened by employing interactive web-based science educational resources. Independent of the type of a probability course (e.g. majors, minors or service probability course, rigorous measure-theoretic, applied or statistics course) student motivation, learning experiences and knowledge retention may be enhanced by blending modern technological tools within the classical conceptual pedagogical models. We have designed, implemented and disseminated a portable open-source web-application for teaching multivariate distributions, marginal, joint and conditional probabilities using the special case of bivariate Normal distribution. A real adolescent height and weight dataset is used to demonstrate the classroom utilization of the new web-application to address problems of parameter estimation, univariate and multivariate inference. PMID:25419016

  13. Technology-enhanced Interactive Teaching of Marginal, Joint and Conditional Probabilities: The Special Case of Bivariate Normal Distribution.

    PubMed

    Dinov, Ivo D; Kamino, Scott; Bhakhrani, Bilal; Christou, Nicolas

    2013-01-01

    Data analysis requires subtle probability reasoning to answer questions like What is the chance of event A occurring, given that event B was observed? This generic question arises in discussions of many intriguing scientific questions such as What is the probability that an adolescent weighs between 120 and 140 pounds given that they are of average height? and What is the probability of (monetary) inflation exceeding 4% and housing price index below 110? To address such problems, learning some applied, theoretical or cross-disciplinary probability concepts is necessary. Teaching such courses can be improved by utilizing modern information technology resources. Students' understanding of multivariate distributions, conditional probabilities, correlation and causation can be significantly strengthened by employing interactive web-based science educational resources. Independent of the type of a probability course (e.g. majors, minors or service probability course, rigorous measure-theoretic, applied or statistics course) student motivation, learning experiences and knowledge retention may be enhanced by blending modern technological tools within the classical conceptual pedagogical models. We have designed, implemented and disseminated a portable open-source web-application for teaching multivariate distributions, marginal, joint and conditional probabilities using the special case of bivariate Normal distribution. A real adolescent height and weight dataset is used to demonstrate the classroom utilization of the new web-application to address problems of parameter estimation, univariate and multivariate inference.
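
    The weight-given-height question has a closed-form answer under a bivariate normal model. The sketch below uses made-up adolescent parameters; the web-application's real dataset would supply these values.

        import numpy as np
        from scipy import stats

        # Hypothetical parameters: height (in), weight (lb), correlation rho.
        mu_h, sd_h, mu_w, sd_w, rho = 65.0, 3.0, 125.0, 15.0, 0.6

        h = mu_h                                     # condition on average height
        cond_mean = mu_w + rho * sd_w / sd_h * (h - mu_h)
        cond_sd = sd_w * np.sqrt(1 - rho**2)

        # P(120 <= weight <= 140 | height = average)
        print(stats.norm.cdf(140, cond_mean, cond_sd)
              - stats.norm.cdf(120, cond_mean, cond_sd))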

  14. Distributional assumptions in food and feed commodities- development of fit-for-purpose sampling protocols.

    PubMed

    Paoletti, Claudia; Esbensen, Kim H

    2015-01-01

    Material heterogeneity influences the effectiveness of sampling procedures. Most sampling guidelines used for the assessment of food and/or feed commodities are based on classical statistical distributions (the normal, binomial, and Poisson) and almost universally rely on the assumption of randomness. However, this is unrealistic. The scientific food and feed community recognizes a strong preponderance of non-random distributions within commodity lots, which should be a more realistic prerequisite for the definition of effective sampling protocols. Nevertheless, these heterogeneity issues are overlooked, as the prime focus is often placed only on financial, time, equipment, and personnel constraints instead of mandating acquisition of documented representative samples under realistic heterogeneity conditions. This study shows how the principles promulgated in the Theory of Sampling (TOS) and practically tested over 60 years provide an effective framework for dealing with the complete set of adverse aspects of both compositional and distributional heterogeneity (material sampling errors), as well as with the errors incurred by the sampling process itself. The results of an empirical European Union study on genetically modified soybean heterogeneity (Kernel Lot Distribution Assessment) are summarized, as they have a strong bearing on the issue of proper sampling protocol development. TOS principles apply universally in the food and feed realm and must therefore be considered the only basis for development of valid sampling protocols free from distributional constraints.

  15. Initial conditions in high-energy collisions

    NASA Astrophysics Data System (ADS)

    Petreska, Elena

    This thesis is focused on the initial stages of high-energy collisions in the saturation regime. We start by extending the McLerran-Venugopalan distribution of color sources in the initial wave-function of nuclei in heavy-ion collisions. We derive a fourth-order operator in the action and discuss its relevance for the description of color charge distributions in protons in high-energy experiments. We calculate the dipole scattering amplitude in proton-proton collisions with the quartic action and find agreement with experimental data. We also obtain a modification to the fluctuation parameter of the negative binomial distribution of particle multiplicities in proton-proton experiments. The result implies that the fourth-order action approaches a Gaussian form as the energy is increased. Finally, we calculate perturbatively the expectation value of the magnetic Wilson loop operator in the first moments of heavy-ion collisions. For the magnetic flux we obtain a first non-trivial term that is proportional to the square of the area of the loop. The result is close to numerical calculations for small-area loops.

  16. Characterization of trapped charges distribution in terms of mirror plot curve.

    PubMed

    Al-Obaidi, Hassan N; Mahdi, Ali S; Khaleel, Imad H

    2018-01-01

    The accumulation of charges (electrons) at the specimen surface in a scanning electron microscope (SEM) leads to the generation of an electrostatic potential. Using the method of image charges, this potential is defined in the chamber space of the apparatus. The deduced formula is expressed in terms of a general volumetric distribution, which is proposed to be an infinitesimal spherical extension. With the aid of the binomial theorem, the defined potential is expanded into a multipolar form. The resulting formula is then adopted to modify a novel mirror plot equation so as to detect the real distribution of trapped charges. Simulation results reveal that trapped charges may take various sorts of arrangement, such as monopole, quadrupole, and octupole. However, none of these arrangements is likely to exist alone; rather, some mixture of them is formed. The influence of each type of profile depends on the distance between the incident electron and the surface of the sample. Results also show that the amount of trapped charge can indicate a threshold at which the point-charge approximation fails. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Stylized facts in internal rates of return on stock index and its derivative transactions

    NASA Astrophysics Data System (ADS)

    Pichl, Lukáš; Kaizoji, Taisei; Yamano, Takuya

    2007-08-01

    Universal features in stock markets and their derivative markets are studied by means of probability distributions in internal rates of return on buy-and-sell transaction pairs. Unlike the stylized facts in normalized log returns, the probability distributions for such single-asset encounters incorporate the time factor by means of the internal rate of return, defined as the continuous compound interest. The resulting stylized facts are shown in the probability distributions derived from the daily series of TOPIX, S&P 500 and FTSE 100 index close values. The application of the above analysis to minute-tick data of the NIKKEI 225 and its futures market reveals an interesting difference in the behavior of the two probability distributions when a threshold on the minimal duration of the long position is imposed. It is therefore suggested that the probability distributions of the internal rates of return could be used for causality mining between the underlying and derivative stock markets. The highly specific discrete spectrum, which results from noise trader strategies, as opposed to the smooth distributions observed for fundamentalist strategies in single-encounter transactions, may be useful in deducing the type of investment strategy from the trading revenues of small portfolio investors.
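
    The quantity under study is easy to compute from a price series: the continuously compounded, annualized rate over each buy/sell pair. The series below is simulated, purely to show the mechanics of building the empirical distribution.

        import numpy as np

        def internal_rate_of_return(p_buy, p_sell, holding_days):
            """Continuously compounded IRR of a buy/sell pair, annualized."""
            return np.log(p_sell / p_buy) * (365.0 / holding_days)

        # Distribution over many simulated encounters of a daily index series.
        rng = np.random.default_rng(2)
        prices = 100 * np.exp(np.cumsum(rng.normal(2e-4, 0.01, 2500)))
        buy = rng.integers(0, 2000, 10_000)       # entry days
        hold = rng.integers(1, 250, 10_000)       # holding periods (days)
        irr = internal_rate_of_return(prices[buy], prices[buy + hold], hold)
        print(np.percentile(irr, [5, 50, 95]))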

  18. Probabilistic Reasoning for Robustness in Automated Planning

    NASA Technical Reports Server (NTRS)

    Schaffer, Steven; Clement, Bradley; Chien, Steve

    2007-01-01

    A general-purpose computer program for planning the actions of a spacecraft or other complex system has been augmented by incorporating a subprogram that reasons about uncertainties in such continuous variables as the times taken to perform tasks and the amounts of resources to be consumed. This subprogram computes parametric probability distributions for time and resource variables on the basis of user-supplied models of actions and the resources that they consume. The current system accepts bounded Gaussian distributions over action duration and resource use. The distributions are then combined during planning to determine the net probability distribution of each resource at any time point. In addition to a full combinatoric approach, several approximations for arriving at these combined distributions are available, including maximum-likelihood and pessimistic algorithms. Each such probability distribution can then be integrated to obtain the probability that execution of the plan under consideration would violate any constraints on the resource. The key idea is to use these probabilities of conflict to score potential plans and drive a search toward planning low-risk actions. An output plan provides a balance between the user's specified averseness to risk and other measures of optimality.
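
    For independent, unbounded Gaussian resource models the combination step is analytic: means and variances add, and the violation probability is a normal tail. A minimal sketch with illustrative numbers; the system described above also supports bounded distributions and other combination algorithms, which are not shown.

        import numpy as np
        from scipy import stats

        # Independent Gaussian resource-use estimates for three planned actions.
        use_mean = np.array([3.0, 5.5, 2.0])     # e.g. amp-hours per action
        use_sd = np.array([0.5, 1.0, 0.4])

        total_mean = use_mean.sum()              # means of independent Gaussians add
        total_sd = np.sqrt((use_sd**2).sum())    # variances add

        capacity = 13.0
        p_violation = stats.norm.sf(capacity, total_mean, total_sd)
        print(f"probability the plan exceeds capacity: {p_violation:.3f}")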

  19. Does the name really matter? The importance of botanical nomenclature and plant taxonomy in biomedical research.

    PubMed

    Bennett, Bradley C; Balick, Michael J

    2014-03-28

    Medical research on plant-derived compounds requires a breadth of expertise, from field to laboratory and clinical skills. Too often, basic botanical skills are evidently lacking, especially with respect to plant taxonomy and botanical nomenclature. Binomial and familial names, synonyms and author citations are often misconstrued. The correct botanical name, linked to a vouchered specimen, is the sine qua non of phytomedical research. Without the unique identifier of a proper binomial, research cannot accurately be linked to the existing literature. Perhaps more significant is the ambiguity of species determinations that ensues from poor taxonomic practices. This uncertainty, not surprisingly, obstructs reproducibility of results, the cornerstone of science. Based on our combined six decades of experience with medicinal plants, we discuss the problems of inaccurate taxonomy and botanical nomenclature in biomedical research. These problems appear all too frequently in manuscripts and grant applications that we review, and they extend to the published literature. We also review the literature on the importance of taxonomy in other disciplines that relate to medicinal plant research. In most cases, questions regarding orthography, synonymy, author citations, and current family designations of most plant binomials can be resolved using widely available online databases and other electronic resources. Some complex problems require consultation with a professional plant taxonomist, which is also important for accurate identification of voucher specimens. Researchers should provide the currently accepted binomial and complete author citation, provide relevant synonyms, and employ the Angiosperm Phylogeny Group III family name. Taxonomy is a vital adjunct not only to plant-medicine research but to virtually every field of science. Medicinal plant researchers can increase the precision and utility of their investigations by following sound practices with respect to botanical nomenclature. Correct spellings, accepted binomials, author citations, synonyms, and current family designations can readily be found in reliable online databases. When questions arise, researchers should consult plant taxonomists. © 2013 Published by Elsevier Ireland Ltd.

  20. Mathematical Model to estimate the wind power using four-parameter Burr distribution

    NASA Astrophysics Data System (ADS)

    Liu, Sanming; Wang, Zhijie; Pan, Zhaoxu

    2018-03-01

    When the true probability distribution of wind speed at a given site needs to be described, the four-parameter Burr distribution is more suitable than other distributions. This paper introduces its important properties and characteristics. The application of the four-parameter Burr distribution to wind speed prediction is also discussed, and the expression for the probability distribution of the output power of a wind turbine is deduced.
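
    As a rough sketch of the approach (the wind-speed sample, the turbine power curve and all parameter values are ours), SciPy's Burr XII law has two shape parameters plus location and scale, i.e. four parameters, and an output-power distribution follows by pushing the fitted wind-speed law through a power curve.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(1)
        speeds = stats.weibull_min.rvs(2.0, scale=8.0, size=5000, random_state=rng)

        # Four-parameter Burr fit: shapes c, d plus location and scale.
        c, d, loc, scale = stats.burr12.fit(speeds, floc=0.0)

        def power(v, rated=2.0):  # MW; cut-in 3, rated 12, cut-out 25 m/s (ours)
            return np.where(v < 3, 0.0,
                   np.where(v < 12, rated * (v - 3) / 9,
                   np.where(v < 25, rated, 0.0)))

        sample = stats.burr12.rvs(c, d, loc=loc, scale=scale, size=100000,
                                  random_state=rng)
        p_hist, p_edges = np.histogram(power(sample), bins=40, density=True)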

  1. Entropy Methods For Univariate Distributions in Decision Analysis

    NASA Astrophysics Data System (ADS)

    Abbas, Ali E.

    2003-03-01

    One of the most important steps in decision analysis practice is the elicitation of the decision-maker's belief about an uncertainty of interest in the form of a representative probability distribution. However, the probability elicitation process is a task that involves many cognitive and motivational biases. Alternatively, the decision-maker may provide other information about the distribution of interest, such as its moments, and the maximum entropy method can be used to obtain a full distribution subject to the given moment constraints. In practice, however, decision makers cannot readily provide moments for the distribution, and are much more comfortable providing information about the fractiles of the distribution of interest or bounds on its cumulative probabilities. In this paper we present a graphical method to determine the maximum entropy distribution between upper and lower probability bounds and provide an interpretation for the shape of the maximum entropy distribution subject to fractile constraints (FMED). We also discuss the drawbacks of the FMED, namely that it is discontinuous and flat over each fractile interval. We present a heuristic approximation for the case where, in addition to its fractiles, the distribution is known to be continuous, and we work through full examples to illustrate the approach.
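
    The key property of the FMED is easy to state concretely: between elicited fractiles, the maximum entropy density is flat, with each interval carrying exactly its elicited probability mass. A minimal sketch (the fractile values are ours):

        import numpy as np

        xs = np.array([0.0, 2.0, 5.0, 12.0, 20.0])  # support ends and fractiles
        ps = np.array([0.0, 0.25, 0.5, 0.75, 1.0])  # P(X <= x_k)

        # FMED density: probability mass of each interval / interval width.
        dens = np.diff(ps) / np.diff(xs)

        def fmed_pdf(x):
            k = np.clip(np.searchsorted(xs, x, side="right") - 1, 0, len(dens) - 1)
            return np.where((x >= xs[0]) & (x < xs[-1]), dens[k], 0.0)

    The jumps in dens at the fractiles are exactly the discontinuities that the paper's continuous heuristic is designed to smooth.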

  2. Work probability distribution for a ferromagnet with long-ranged and short-ranged correlations

    NASA Astrophysics Data System (ADS)

    Bhattacharjee, J. K.; Kirkpatrick, T. R.; Sengers, J. V.

    2018-04-01

    Work fluctuations and work probability distributions are fundamentally different in systems with short-ranged versus long-ranged correlations. Specifically, in systems with long-ranged correlations the work distribution is extraordinarily broad compared to systems with short-ranged correlations. This difference profoundly affects the possible applicability of fluctuation theorems like the Jarzynski fluctuation theorem. The Heisenberg ferromagnet, well below its Curie temperature, is a system with long-ranged correlations in very low magnetic fields due to the presence of Goldstone modes. As the magnetic field is increased the correlations gradually become short ranged. Hence, such a ferromagnet is an ideal system for elucidating the changes of the work probability distribution as one goes from a domain with long-ranged correlations to a domain with short-ranged correlations by tuning the magnetic field. A quantitative analysis of this crossover behavior of the work probability distribution and the associated fluctuations is presented.

  3. Methods to elicit probability distributions from experts: a systematic review of reported practice in health technology assessment.

    PubMed

    Grigore, Bogdan; Peters, Jaime; Hyde, Christopher; Stein, Ken

    2013-11-01

    Elicitation is a technique that can be used to obtain probability distributions from experts about unknown quantities. We conducted a methodology review of reports where probability distributions had been elicited from experts to be used in model-based health technology assessments. Databases including MEDLINE, EMBASE and the CRD database were searched from inception to April 2013. Reference lists were checked and citation mapping was also used. Studies describing their approach to the elicitation of probability distributions were included. Data were abstracted on pre-defined aspects of the elicitation technique. Reports were critically appraised on their consideration of the validity, reliability and feasibility of the elicitation exercise. Fourteen articles were included. Across these studies, the most marked features were heterogeneity in elicitation approach and failure to report key aspects of the elicitation method. The most frequently used approaches to elicitation were the histogram technique and the bisection method. Only three papers explicitly considered the validity, reliability and feasibility of the elicitation exercises. Judged by the studies identified in the review, reports of expert elicitation are insufficiently detailed, and this impacts on the perceived usability of expert-elicited probability distributions. In this context, the wider credibility of elicitation will only be improved by better reporting and greater standardisation of approach. Until then, the advantage of eliciting probability distributions from experts may be lost.
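
    One common way to turn elicited fractiles into a full distribution is to fit a parametric family by matching its quantile function to the elicited values. The sketch below (the family choice, numbers and names are ours) fits a lognormal to a median and quartiles such as a bisection-style exercise would produce.

        import numpy as np
        from scipy import optimize, stats

        probs = np.array([0.25, 0.5, 0.75])
        elicited = np.array([0.08, 0.15, 0.30])   # expert's quartiles and median

        def loss(theta):
            mu, sigma = theta
            sigma = abs(sigma)                    # keep the scale positive
            q = stats.lognorm.ppf(probs, sigma, scale=np.exp(mu))
            return np.sum((q - elicited) ** 2)

        res = optimize.minimize(loss, x0=[np.log(0.15), 0.5], method="Nelder-Mead")
        mu, sigma = res.x                         # fitted lognormal parameters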

  4. Computer simulation of random variables and vectors with arbitrary probability distribution laws

    NASA Technical Reports Server (NTRS)

    Bogdan, V. M.

    1981-01-01

    Assume that there is given an arbitrary n-dimensional probability distribution F. A recursive construction is found for a sequence of functions x_1 = f_1(U_1, ..., U_n), ..., x_n = f_n(U_1, ..., U_n) such that if U_1, ..., U_n are independent random variables having uniform distribution over the open interval (0,1), then the joint distribution of the variables x_1, ..., x_n coincides with the distribution F. Since uniform independent random variables can be well simulated by means of a computer, this result allows one to simulate arbitrary n-dimensional random variables if their joint probability distribution is known.
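
    The construction is the familiar chain of conditional inverse CDFs: x_1 is obtained from U_1 through the inverse marginal CDF, x_2 from U_2 through the inverse of the conditional CDF given x_1, and so on. A two-dimensional sketch with a joint distribution chosen by us:

        import numpy as np

        # X1 ~ Exp(1); given X1 = x1, X2 ~ Exp(rate = x1).
        def f1(u1):
            return -np.log1p(-u1)           # inverse CDF of Exp(1)

        def f2(u1, u2):
            x1 = f1(u1)
            return -np.log1p(-u2) / x1      # inverse conditional CDF given x1

        rng = np.random.default_rng(2)
        u1, u2 = rng.uniform(size=(2, 100000))
        x1, x2 = f1(u1), f2(u1, u2)         # (x1, x2) follows the target joint F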

  5. Nuclear risk analysis of the Ulysses mission

    NASA Astrophysics Data System (ADS)

    Bartram, Bart W.; Vaughan, Frank R.; Englehart, Richard W., Dr.

    1991-01-01

    The use of a radioisotope thermoelectric generator fueled with plutonium-238 dioxide on the Space Shuttle-launched Ulysses mission implies some level of risk due to potential accidents. This paper describes the method used to quantify risks in the Ulysses mission Final Safety Analysis Report prepared for the U.S. Department of Energy. The starting point for the analysis described herein is the input of source term probability distributions supplied by the General Electric Company. A Monte Carlo technique is used to develop probability distributions of radiological consequences for a range of accident scenarios throughout the mission. Factors affecting radiological consequences are identified, the probability distribution of the effect of each factor is determined, and the functional relationship among all the factors is established. The probability distributions of all the factor effects are then combined using a Monte Carlo technique. The results of the analysis are presented in terms of complementary cumulative distribution functions (CCDF) by mission sub-phase, phase, and the overall mission. The CCDFs show the total probability that consequences (calculated health effects) would be equal to or greater than a given value.
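
    The Monte Carlo combination step and the CCDF presentation can be sketched as follows; the factor distributions and values below are ours, not those of the safety analysis.

        import numpy as np

        rng = np.random.default_rng(3)
        n = 100000

        # Multiplicative factors affecting radiological consequences (ours).
        source = rng.lognormal(0.0, 1.0, n)      # source term
        dilution = rng.lognormal(-2.0, 0.7, n)   # atmospheric dilution
        dose = rng.lognormal(-1.0, 0.5, n)       # dose conversion
        consequence = source * dilution * dose   # calculated health effects

        # CCDF: probability that consequences equal or exceed a given value.
        c = np.sort(consequence)
        ccdf = 1.0 - np.arange(1, n + 1) / n     # plot ccdf against c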

  6. Force Density Function Relationships in 2-D Granular Media

    NASA Technical Reports Server (NTRS)

    Youngquist, Robert C.; Metzger, Philip T.; Kilts, Kelly N.

    2004-01-01

    An integral transform relationship is developed to convert between two important probability density functions (distributions) used in the study of contact forces in granular physics. Developing this transform has now made it possible to compare and relate various theoretical approaches with one another and with the experimental data, despite the fact that one may predict the Cartesian probability density and another the force magnitude probability density. The transforms also identify which functional forms are relevant to describe the probability density observed in nature, and so the modified Bessel function of the second kind has been identified as the relevant form for the Cartesian probability density corresponding to exponential forms in the force magnitude distribution. Furthermore, it is shown that this transform pair supplies a sufficient mathematical framework to describe the evolution of the force magnitude distribution under shearing. Apart from the choice of several coefficients, whose evolving values must be explained by the physics, this framework successfully reproduces the features of the distribution that are taken to be an indicator of jamming and unjamming in a granular packing. Keywords: granular physics, probability density functions, Fourier transforms.
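
    The quoted Bessel-function correspondence can be checked numerically: sampling an exponential force magnitude with an isotropic direction in 2-D and histogramming one Cartesian component reproduces K_0(|f_x|)/pi. The normalization and sample size below are ours.

        import numpy as np
        from scipy.special import k0

        rng = np.random.default_rng(4)
        f = rng.exponential(1.0, size=1_000_000)        # magnitude: P(f) = exp(-f)
        theta = rng.uniform(0.0, 2.0 * np.pi, size=f.size)
        fx = f * np.cos(theta)                          # Cartesian component

        hist, edges = np.histogram(fx, bins=200, range=(-5, 5), density=True)
        mid = 0.5 * (edges[:-1] + edges[1:])
        predicted = k0(np.abs(mid)) / np.pi             # matches hist closely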

  7. A modified chain binomial model to analyse the ongoing measles epidemic in Greece, July 2017 to February 2018

    PubMed Central

    Lytras, Theodore; Georgakopoulou, Theano; Tsiodras, Sotirios

    2018-01-01

    Greece is currently experiencing a large measles outbreak, in the context of multiple similar outbreaks across Europe. We devised and applied a modified chain-binomial epidemic model, requiring very simple data, to estimate the transmission parameters of this outbreak. Model results indicate sustained measles transmission among the Greek Roma population, necessitating a targeted mass vaccination campaign to halt further spread of the epidemic. Our model may be useful for other countries facing similar measles outbreaks. PMID:29717695

  8. A modified chain binomial model to analyse the ongoing measles epidemic in Greece, July 2017 to February 2018.

    PubMed

    Lytras, Theodore; Georgakopoulou, Theano; Tsiodras, Sotirios

    2018-04-01

    Greece is currently experiencing a large measles outbreak, in the context of multiple similar outbreaks across Europe. We devised and applied a modified chain-binomial epidemic model, requiring very simple data, to estimate the transmission parameters of this outbreak. Model results indicate sustained measles transmission among the Greek Roma population, necessitating a targeted mass vaccination campaign to halt further spread of the epidemic. Our model may be useful for other countries facing similar measles outbreaks.
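
    For readers unfamiliar with this family of models, a plain (unmodified) Reed-Frost chain binomial is sketched below; it is not the authors' modified model, and the parameter values are ours. In each generation, every susceptible escapes infection from each infectious case independently.

        import numpy as np

        rng = np.random.default_rng(5)

        def reed_frost(s0, i0, p, generations=20):
            s, i, history = s0, i0, []
            for _ in range(generations):
                p_inf = 1.0 - (1.0 - p) ** i    # P(a susceptible is infected)
                i = rng.binomial(s, p_inf)      # next generation of cases
                s -= i
                history.append(i)
            return history

        epidemic = reed_frost(s0=5000, i0=5, p=0.0004)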

  9. An evaluation of procedures to estimate monthly precipitation probabilities

    NASA Astrophysics Data System (ADS)

    Legates, David R.

    1991-01-01

    Many frequency distributions have been used to evaluate monthly precipitation probabilities. Eight of these distributions (including Pearson type III, extreme value, and transform-normal probability density functions) are comparatively examined to determine their ability to accurately represent variations in monthly precipitation totals for global hydroclimatological analyses. Results indicate that a modified version of the Box-Cox transform-normal distribution describes the 'true' precipitation distribution more adequately than any of the other methods. This assessment was made using a cross-validation procedure for a global network of 253 stations for which at least 100 years of monthly precipitation totals were available.
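
    A Box-Cox transform-normal fit is short to express: choose the lambda that makes the data most nearly Gaussian, then model the transformed totals with a normal law. The synthetic data and the 120 mm threshold below are ours.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(6)
        precip = rng.gamma(shape=2.0, scale=30.0, size=1200)  # monthly totals, mm

        transformed, lam = stats.boxcox(precip)   # transform: (x**lam - 1) / lam
        mu, sigma = stats.norm.fit(transformed)

        # Probability that a month exceeds 120 mm under the fitted model.
        t120 = (120.0**lam - 1.0) / lam
        p_exceed = stats.norm.sf(t120, loc=mu, scale=sigma)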

  10. q-Gaussian distributions of leverage returns, first stopping times, and default risk valuations

    NASA Astrophysics Data System (ADS)

    Katz, Yuri A.; Tian, Li

    2013-10-01

    We study the probability distributions of daily leverage returns of 520 North American industrial companies that survive de-listing during the financial crisis, 2006-2012. We provide evidence that distributions of unbiased leverage returns of all individual firms belong to the class of q-Gaussian distributions with the Tsallis entropic parameter within the interval 1
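
    For reference, the q-Gaussian density with entropic index 1 < q < 3 in its standard Tsallis normalization is sketched below; the parameter values are generic illustrations, not the paper's fitted estimates.

        import numpy as np
        from scipy.special import gamma

        def q_gaussian_pdf(x, q, beta):
            cq = (np.sqrt(np.pi) * gamma((3 - q) / (2 * (q - 1)))
                  / (np.sqrt(q - 1) * gamma(1 / (q - 1))))
            return (np.sqrt(beta) / cq
                    * (1 + (q - 1) * beta * x**2) ** (-1 / (q - 1)))

        x = np.linspace(-5, 5, 1001)
        pdf = q_gaussian_pdf(x, q=1.4, beta=1.0)  # fatter tails than a Gaussian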

  11. Estimating the prevalence and intensity of Schistosoma mansoni infection among rural communities in Western Tanzania: The influence of sampling strategy and statistical approach

    PubMed Central

    Bakuza, Jared S.; Denwood, Matthew J.; Nkwengulila, Gamba

    2017-01-01

    Background Schistosoma mansoni is a parasite of major public health importance in developing countries, where it causes a neglected tropical disease known as intestinal schistosomiasis. However, the distribution of the parasite within many endemic regions is currently unknown, which hinders effective control. The purpose of this study was to characterize the prevalence and intensity of infection of S. mansoni in a remote area of western Tanzania. Methodology/Principal findings Stool samples were collected from 192 children and 147 adults residing in Gombe National Park and four nearby villages. Children were actively sampled in local schools, and adults were sampled passively by voluntary presentation at the local health clinics. The two datasets were therefore analysed separately. Faecal worm egg count (FWEC) data were analysed using negative binomial and zero-inflated negative binomial (ZINB) models with explanatory variables of site, sex, and age. The ZINB models indicated that a substantial proportion of the observed zero FWEC reflected a failure to detect eggs in truly infected individuals, meaning that the estimated true prevalence was much higher than the apparent prevalence as calculated based on the simple proportion of non-zero FWEC. For the passively sampled data from adults, the data were consistent with close to 100% true prevalence of infection. Both the prevalence and intensity of infection differed significantly between sites, but there were no significant associations with sex or age. Conclusions/Significance Overall, our data suggest a more widespread distribution of S. mansoni in this part of Tanzania than was previously thought. The apparent prevalence estimates substantially under-estimated the true prevalence as determined by the ZINB models, and the two types of sampling strategies also resulted in differing conclusions regarding prevalence of infection. We therefore recommend that future surveillance programmes designed to assess risk factors should use active sampling whenever possible, in order to avoid the self-selection bias associated with passive sampling. PMID:28934206
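
    The core ZINB idea, estimating how many observed zero counts are structural, can be sketched with statsmodels; the simulated counts and all parameter values below are ours, and the intercept-only model stands in for the site/sex/age covariates used in the study.

        import numpy as np
        from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

        rng = np.random.default_rng(7)
        n = 339
        infected = rng.random(n) > 0.3             # 30% structural zeros
        counts = np.where(infected,
                          rng.negative_binomial(0.5, 0.02, size=n), 0)

        exog = np.ones((n, 1))                     # intercept-only model
        model = ZeroInflatedNegativeBinomialP(counts, exog, exog_infl=exog, p=2)
        res = model.fit(method="bfgs", maxiter=500, disp=False)

        # The inflation part uses a logit link; its intercept comes first.
        pi_zero = 1.0 / (1.0 + np.exp(-res.params[0]))
        true_prevalence = 1.0 - pi_zero            # vs apparent (counts > 0).mean()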

  12. Probability weighted moments: Definition and relation to parameters of several distributions expressable in inverse form

    USGS Publications Warehouse

    Greenwood, J. Arthur; Landwehr, J. Maciunas; Matalas, N.C.; Wallis, J.R.

    1979-01-01

    Distributions whose inverse forms are explicitly defined, such as Tukey's lambda, may present problems in deriving their parameters by more conventional means. Probability weighted moments are introduced and shown to be potentially useful in expressing the parameters of these distributions.
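
    A short sketch of sample probability weighted moments and a classic use, Gumbel parameter estimation (the unbiased estimator form and the Gumbel relations are standard results; the data are ours):

        import numpy as np

        def pwm(x, r):
            # Unbiased estimator of b_r = E[X F(X)^r] from the ordered sample.
            x = np.sort(x)
            n = len(x)
            j = np.arange(1, n + 1)
            w = np.ones(n)
            for k in range(1, r + 1):
                w *= (j - k) / (n - k)
            return np.mean(w * x)

        rng = np.random.default_rng(8)
        sample = rng.gumbel(loc=10.0, scale=3.0, size=5000)

        b0, b1 = pwm(sample, 0), pwm(sample, 1)
        scale_hat = (2 * b1 - b0) / np.log(2)       # Gumbel scale from PWMs
        loc_hat = b0 - 0.5772156649 * scale_hat     # Gumbel location (Euler gamma)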

  13. Univariate Probability Distributions

    ERIC Educational Resources Information Center

    Leemis, Lawrence M.; Luckett, Daniel J.; Powell, Austin G.; Vermeer, Peter E.

    2012-01-01

    We describe a web-based interactive graphic that can be used as a resource in introductory classes in mathematical statistics. This interactive graphic presents 76 common univariate distributions and gives details on (a) various features of the distribution such as the functional form of the probability density function and cumulative distribution…

  14. A probability space for quantum models

    NASA Astrophysics Data System (ADS)

    Lemmens, L. F.

    2017-06-01

    A probability space contains a set of outcomes, a collection of events formed by subsets of the set of outcomes, and probabilities defined for all events. A reformulation in terms of propositions allows one to use the maximum entropy method to assign the probabilities taking some constraints into account. The construction of a probability space for quantum models is determined by the choice of propositions, the choice of constraints, and the probability assignment by the maximum entropy method. This approach shows how typical quantum distributions such as Maxwell-Boltzmann, Fermi-Dirac and Bose-Einstein are partly related to well-known classical distributions. The relation between the conditional probability density, given some averages as constraints, and the appropriate ensemble is elucidated.
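
    How a maximum entropy assignment under an average constraint produces a Gibbs-type distribution can be shown in a few lines (the energy levels and target mean are ours):

        import numpy as np
        from scipy import optimize

        energies = np.array([0.0, 1.0, 2.0, 3.0])
        target_mean = 1.2

        def mean_energy(lam):
            p = np.exp(-lam * energies)
            p /= p.sum()
            return p @ energies

        # Solve for the Lagrange multiplier that matches the constraint.
        lam = optimize.brentq(lambda l: mean_energy(l) - target_mean, -50.0, 50.0)
        p = np.exp(-lam * energies)
        p /= p.sum()    # maximum entropy probabilities, Boltzmann-like in form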

  15. Regional probability distribution of the annual reference evapotranspiration and its effective parameters in Iran

    NASA Astrophysics Data System (ADS)

    Khanmohammadi, Neda; Rezaie, Hossein; Montaseri, Majid; Behmanesh, Javad

    2017-10-01

    The reference evapotranspiration (ET0) plays an important role in water management plans in arid and semi-arid countries such as Iran. For this reason, the regional analysis of this parameter is important. However, the ET0 process is affected by several meteorological parameters such as wind speed, solar radiation, temperature and relative humidity. Therefore, the effect of the distribution type of the effective meteorological variables on the ET0 distribution was analyzed. For this purpose, the regional probability distributions of the annual ET0 and its effective parameters were selected. The data used in this research were recorded at 30 synoptic stations in Iran during 1960-2014. Using the probability plot correlation coefficient (PPCC) test and the L-moment method, five common distributions were compared and the best distribution was selected. The results of the PPCC test and the L-moment diagram indicated that the Pearson type III distribution was the best probability distribution for fitting the annual ET0 and its four effective parameters. The RMSE results showed that the abilities of the PPCC test and the L-moment method for regional analysis of reference evapotranspiration and its effective parameters were similar. The results also showed that the distribution type of the parameters which affect ET0 values can affect the distribution of reference evapotranspiration.
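
    The PPCC statistic itself is simply the correlation between the ordered sample and the fitted distribution's quantiles at plotting positions. A sketch for a Pearson type III fit (synthetic series; all parameter values are ours):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(9)
        annual_et0 = stats.pearson3.rvs(skew=0.8, loc=1100.0, scale=120.0,
                                        size=55, random_state=rng)

        skew_hat, loc_hat, scale_hat = stats.pearson3.fit(annual_et0)
        ordered = np.sort(annual_et0)
        n = ordered.size
        pp = (np.arange(1, n + 1) - 0.44) / (n + 0.12)   # Gringorten positions
        q = stats.pearson3.ppf(pp, skew_hat, loc=loc_hat, scale=scale_hat)
        ppcc = np.corrcoef(ordered, q)[0, 1]             # near 1 means a good fit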

  16. Probabilistic reasoning in data analysis.

    PubMed

    Sirovich, Lawrence

    2011-09-20

    This Teaching Resource provides lecture notes, slides, and a student assignment for a lecture on probabilistic reasoning in the analysis of biological data. General probabilistic frameworks are introduced, and a number of standard probability distributions are described using simple intuitive ideas. Particular attention is focused on random arrivals that are independent of prior history (Markovian events), with an emphasis on waiting times, Poisson processes, and Poisson probability distributions. The use of these various probability distributions is applied to biomedical problems, including several classic experimental studies.
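
    The Markovian-arrival material lends itself to a one-screen demonstration: exponential waiting times imply Poisson counts, so the count mean and variance both agree with the rate. The rate and window choices below are ours.

        import numpy as np

        rng = np.random.default_rng(10)
        rate = 4.0                                  # events per unit time
        waits = rng.exponential(1.0 / rate, size=100000)
        arrivals = np.cumsum(waits)

        # Events per unit-length window are Poisson(rate) distributed.
        counts = np.histogram(arrivals, bins=np.arange(0.0, 20000.0))[0]
        mean, var = counts.mean(), counts.var()     # both close to the rate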

  17. Multiobjective fuzzy stochastic linear programming problems with inexact probability distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamadameen, Abdulqader Othman; Zainuddin, Zaitul Marlizawati

    This study deals with multiobjective fuzzy stochastic linear programming problems with an uncertain probability distribution, defined through fuzzy assertions by ambiguous experts. The problem formulation is presented, and the two solution strategies are the fuzzy transformation via a ranking function, and the stochastic transformation in which the α-cut technique and linguistic hedges are applied to the uncertain probability distribution. A development of Sen's method is employed to find a compromise solution, supported by an illustrative numerical example.

  18. Work probability distribution and tossing a biased coin

    NASA Astrophysics Data System (ADS)

    Saha, Arnab; Bhattacharjee, Jayanta K.; Chakraborty, Sagar

    2011-01-01

    We show that the rare events present in the dissipated work that enters the Jarzynski equality, when mapped appropriately to the phenomenon of large deviations found in a biased coin toss, are enough to yield a quantitative work probability distribution for the Jarzynski equality. This allows us to propose a recipe for constructing the work probability distribution independent of the details of any relevant system. The underlying framework, developed herein, is expected to be of use in modeling other physical phenomena where rare events play an important role.
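
    The role of rare events is easy to see numerically: for Gaussian work, the Jarzynski average <exp(-W/kT)> equals exp(-dF/kT) with dF = mu - sigma^2/(2kT), and the Monte Carlo estimate converges slowly because the rare low-W tail dominates. All values below are ours.

        import numpy as np

        rng = np.random.default_rng(11)
        kT = 1.0
        mu_w, sigma_w = 2.0, 1.5
        work = rng.normal(mu_w, sigma_w, size=10_000_000)

        lhs = np.mean(np.exp(-work / kT))         # Jarzynski average
        dF = mu_w - sigma_w**2 / (2.0 * kT)       # exact for Gaussian work
        rhs = np.exp(-dF / kT)                    # lhs -> rhs, driven by rare W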

  19. Hybrid computer technique yields random signal probability distributions

    NASA Technical Reports Server (NTRS)

    Cameron, W. D.

    1965-01-01

    Hybrid computer determines the probability distributions of instantaneous and peak amplitudes of random signals. This combined digital and analog computer system reduces the errors and delays of manual data analysis.

  20. Fast Reliability Assessing Method for Distribution Network with Distributed Renewable Energy Generation

    NASA Astrophysics Data System (ADS)

    Chen, Fan; Huang, Shaoxiong; Ding, Jinjin; Gao, Bo; Xie, Yuguang; Wang, Xiaoming

    2018-01-01

    This paper proposes a fast reliability assessment method for a distribution grid with distributed renewable energy generation. First, the Weibull distribution and the Beta distribution are used to describe the probability distribution characteristics of wind speed and solar irradiance, respectively, and models of the wind farm, solar park and local load are built for reliability assessment. Then, based on production cost simulation, probability discretization and linearized power flow, an optimal power flow problem with the objective of minimizing the cost of conventional generation is solved. A reliability assessment for the distribution grid is thus implemented quickly and accurately. The Loss Of Load Probability (LOLP) and Expected Energy Not Supplied (EENS) are selected as the reliability indices; a MATLAB simulation of the IEEE RBTS BUS6 system indicates that the fast method calculates the reliability indices much faster than the Monte Carlo method while maintaining accuracy.
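
    As a Monte Carlo baseline for the indices named above (every parameter value, the power curve and the load level below are ours), wind speed can be drawn from a Weibull law and normalized irradiance from a Beta law:

        import numpy as np

        rng = np.random.default_rng(12)
        n = 200_000

        v = 8.0 * rng.weibull(2.0, size=n)                  # wind speed, m/s
        wind_mw = 2.0 * np.clip((v - 3.0) / 9.0, 0.0, 1.0)  # 2 MW turbine
        wind_mw[v > 25.0] = 0.0                             # cut-out

        g = rng.beta(2.0, 2.0, size=n)                      # irradiance in [0, 1]
        solar_mw = 1.5 * g                                  # 1.5 MW solar park

        load_mw, conventional_mw = 4.0, 1.0
        shortfall = np.maximum(load_mw - conventional_mw - wind_mw - solar_mw, 0.0)

        lolp = np.mean(shortfall > 0.0)       # Loss Of Load Probability
        eens = shortfall.mean() * 8760.0      # Expected Energy Not Supplied, MWh/yr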
