Sample records for sample size model

  1. Sample Size and Statistical Conclusions from Tests of Fit to the Rasch Model According to the Rasch Unidimensional Measurement Model (Rumm) Program in Health Outcome Measurement.

    PubMed

    Hagell, Peter; Westergren, Albert

    Sample size is a major factor in statistical null hypothesis testing, which is the basis for many approaches to testing Rasch model fit. Few sample size recommendations for testing fit to the Rasch model concern the Rasch Unidimensional Measurement Models (RUMM) software, which features chi-square and ANOVA/F-ratio based fit statistics, including Bonferroni and algebraic sample size adjustments. This paper explores the occurrence of Type I errors with RUMM fit statistics, and the effects of algebraic sample size adjustments. Data with simulated Rasch model fitting 25-item dichotomous scales and sample sizes ranging from N = 50 to N = 2500 were analysed with and without algebraically adjusted sample sizes. Results suggest the occurrence of Type I errors with N less than or equal to 500, and that Bonferroni correction as well as downward algebraic sample size adjustment are useful to avoid such errors, whereas upward adjustment of smaller samples falsely signals misfit. Our observations suggest that sample sizes around N = 250 to N = 500 may provide a good balance for the statistical interpretation of the RUMM fit statistics studied here with respect to Type I errors and under the assumption of Rasch model fit within the examined frame of reference (i.e., about 25 item parameters well targeted to the sample).
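
    The Bonferroni correction mentioned in this abstract can be illustrated with a minimal sketch. The function names and the 0.05 significance level are assumptions made for illustration; this is not the RUMM implementation, and the algebraic sample size adjustment is not reproduced here.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false rejection across n_tests tests.

    Assumes the tests are independent, which item-fit tests on a single
    scale generally are not; treat the value as a rough guide only.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    n_items = 25   # scale length taken from the abstract
    alpha = 0.05   # conventional level, assumed here
    print(bonferroni_alpha(alpha, n_items))
    print(familywise_error_rate(alpha, n_items))
```

    Under the independence assumption, testing 25 items at an uncorrected 0.05 level gives a high chance of at least one spurious misfit flag, which is the motivation for dividing the level by the number of tests.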

  2. Sample Size Determination for Regression Models Using Monte Carlo Methods in R

    ERIC Educational Resources Information Center

    Beaujean, A. Alexander

    2014-01-01

    A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…
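
    The general idea of Monte Carlo sample size determination can be sketched as follows. The abstract refers to R; this sketch uses Python with the standard library only, a single standardized predictor, unit-variance normal errors, and a normal approximation to the t critical value. All of these choices, and the function names, are assumptions for illustration rather than the article's procedure.

```python
import random
from statistics import NormalDist


def simulate_power(n, beta, n_sims=2000, alpha=0.05, seed=0):
    """Estimate power to detect slope `beta` in y = beta * x + e by simulation."""
    rng = random.Random(seed)
    # Normal approximation to the two-sided t critical value (assumption).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        mean_x, mean_y = sum(x) / n, sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
        std_err = (rss / (n - 2) / sxx) ** 0.5
        if abs(slope / std_err) > z_crit:
            rejections += 1
    return rejections / n_sims


def smallest_n(beta, target=0.8, candidates=range(20, 301, 20)):
    """Return the first candidate n whose simulated power reaches `target`."""
    for n in candidates:
        if simulate_power(n, beta) >= target:
            return n
    return None  # no candidate was large enough
```

    Calling `smallest_n(0.3)` scans the candidate sizes in order; the result depends on the seed, the number of simulations, and the grid of candidates.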

  3. Hierarchical modeling of cluster size in wildlife surveys

    USGS Publications Warehouse

    Royle, J. Andrew

    2008-01-01

    Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).

  4. A Note on Sample Size and Solution Propriety for Confirmatory Factor Analytic Models

    ERIC Educational Resources Information Center

    Jackson, Dennis L.; Voth, Jennifer; Frey, Marc P.

    2013-01-01

    Determining an appropriate sample size for use in latent variable modeling techniques has presented ongoing challenges to researchers. In particular, small sample sizes are known to present concerns over sampling error for the variances and covariances on which model estimation is based, as well as for fit indexes and convergence failures. The…

  5. The attention-weighted sample-size model of visual short-term memory: Attention capture predicts resource allocation and memory load.

    PubMed

    Smith, Philip L; Lilburn, Simon D; Corbett, Elaine A; Sewell, David K; Kyllingsbæk, Søren

    2016-09-01

    We investigated the capacity of visual short-term memory (VSTM) in a phase discrimination task that required judgments about the configural relations between pairs of black and white features. Sewell et al. (2014) previously showed that VSTM capacity in an orientation discrimination task was well described by a sample-size model, which views VSTM as a resource comprised of a finite number of noisy stimulus samples. The model predicts the invariance of [Formula: see text], the sum of squared sensitivities across items, for displays of different sizes. For phase discrimination, the set-size effect significantly exceeded that predicted by the sample-size model for both simultaneously and sequentially presented stimuli. Instead, the set-size effect and the serial position curves with sequential presentation were predicted by an attention-weighted version of the sample-size model, which assumes that one of the items in the display captures attention and receives a disproportionate share of resources. The choice probabilities and response time distributions from the task were well described by a diffusion decision model in which the drift rates embodied the assumptions of the attention-weighted sample-size model. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  6. Sample sizes and model comparison metrics for species distribution models

    Treesearch

    B.B. Hanberry; H.S. He; D.C. Dey

    2012-01-01

    Species distribution models use small samples to produce continuous distribution maps. The question of how small a sample can be to produce an accurate model generally has been answered based on comparisons to maximum sample sizes of 200 observations or fewer. In addition, model comparisons often are made with the kappa statistic, which has become controversial....

  7. Effects of sample size on estimates of population growth rates calculated with matrix models.

    PubMed

    Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M

    2008-08-28

    Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda: Jensen's Inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated if sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
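
    The Jensen's inequality argument can be made concrete with a small sketch. It assumes a hypothetical two-stage matrix [[0, f], [s_j, s_a]] with fecundity 2.0 and adult survival 0.5 (neither taken from the study), and it replaces the study's simulation with an exact binomial expectation. Only juvenile survival is estimated, from n sampled individuals.

```python
from math import comb, sqrt


def growth_rate(juvenile_survival, fecundity=2.0, adult_survival=0.5):
    """Dominant eigenvalue (lambda) of the 2x2 matrix [[0, f], [s_j, s_a]]."""
    discriminant = adult_survival ** 2 + 4.0 * fecundity * juvenile_survival
    return (adult_survival + sqrt(discriminant)) / 2.0


def expected_estimate(n, true_survival=0.5):
    """Exact expectation of lambda when s_j is estimated as k / n, k ~ Binomial(n, s_j)."""
    total = 0.0
    for k in range(n + 1):
        prob = comb(n, k) * true_survival ** k * (1.0 - true_survival) ** (n - k)
        total += prob * growth_rate(k / n)
    return total


def bias(n, true_survival=0.5):
    """Expected lambda estimate minus lambda at the true survival rate."""
    return expected_estimate(n, true_survival) - growth_rate(true_survival)


if __name__ == "__main__":
    for n in (10, 50, 500):
        print(n, bias(n))
```

    Because lambda is a concave function of juvenile survival in this toy matrix, the survival estimate is unbiased while the lambda estimate is biased downward, and the bias shrinks as n grows.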

  8. The Statistics and Mathematics of High Dimension Low Sample Size Asymptotics.

    PubMed

    Shen, Dan; Shen, Haipeng; Zhu, Hongtu; Marron, J S

    2016-10-01

    The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the ratio between the dimension and the product of the sample size with the spike size. When this ratio converges to a nonzero constant, the sample eigenvector converges to a cone, with a certain angle to its corresponding population eigenvector. In the High Dimension, Low Sample Size case, the angle between the sample eigenvector and its population counterpart converges to a limiting distribution. Several generalizations of the multi-spike covariance models are also explored, and additional theoretical results are presented.

  9. Sample size considerations using mathematical models: an example with Chlamydia trachomatis infection and its sequelae pelvic inflammatory disease.

    PubMed

    Herzog, Sereina A; Low, Nicola; Berghold, Andrea

    2015-06-19

    The success of an intervention to prevent the complications of an infection is influenced by the natural history of the infection. Assumptions about the temporal relationship between infection and the development of sequelae can affect the predicted effect size of an intervention and the sample size calculation. This study investigates how a mathematical model can be used to inform sample size calculations for a randomised controlled trial (RCT) using the example of Chlamydia trachomatis infection and pelvic inflammatory disease (PID). We used a compartmental model to imitate the structure of a published RCT. We considered three different processes for the timing of PID development, in relation to the initial C. trachomatis infection: immediate, constant throughout, or at the end of the infectious period. For each process we assumed that, of all women infected, the same fraction would develop PID in the absence of an intervention. We examined two sets of assumptions used to calculate the sample size in a published RCT that investigated the effect of chlamydia screening on PID incidence. We also investigated the influence of the natural history parameters of chlamydia on the required sample size. The assumed event rates and effect sizes used for the sample size calculation implicitly determined the temporal relationship between chlamydia infection and PID in the model. Even small changes in the assumed PID incidence and relative risk (RR) led to considerable differences in the hypothesised mechanism of PID development. The RR and the sample size needed per group also depend on the natural history parameters of chlamydia. Mathematical modelling helps to understand the temporal relationship between an infection and its sequelae and can show how uncertainties about natural history parameters affect sample size calculations when planning a RCT.

  10. Sample Size Requirements for Structural Equation Models: An Evaluation of Power, Bias, and Solution Propriety

    ERIC Educational Resources Information Center

    Wolf, Erika J.; Harrington, Kelly M.; Clark, Shaunna L.; Miller, Mark W.

    2013-01-01

    Determining sample size requirements for structural equation modeling (SEM) is a challenge often faced by investigators, peer reviewers, and grant writers. Recent years have seen a large increase in SEMs in the behavioral science literature, but consideration of sample size requirements for applied SEMs often relies on outdated rules-of-thumb.…

  11. Sample Size and Item Parameter Estimation Precision When Utilizing the One-Parameter "Rasch" Model

    ERIC Educational Resources Information Center

    Custer, Michael

    2015-01-01

    This study examines the relationship between sample size and item parameter estimation precision when utilizing the one-parameter model. Item parameter estimates are examined relative to "true" values by evaluating the decline in root mean squared deviation (RMSD) and the number of outliers as sample size increases. This occurs across…

  12. The Impact of Sample Size and Other Factors When Estimating Multilevel Logistic Models

    ERIC Educational Resources Information Center

    Schoeneberger, Jason A.

    2016-01-01

    The design of research studies utilizing binary multilevel models must necessarily incorporate knowledge of multiple factors, including estimation method, variance component size, or number of predictors, in addition to sample sizes. This Monte Carlo study examined the performance of random effect binary outcome multilevel models under varying…

  13. Rasch fit statistics and sample size considerations for polytomous data.

    PubMed

    Smith, Adam B; Rush, Robert; Fallowfield, Lesley J; Velikova, Galina; Sharpe, Michael

    2008-05-29

    Previous research on educational data has demonstrated that Rasch fit statistics (mean squares and t-statistics) are highly susceptible to sample size variation for dichotomously scored rating data, although little is known about this relationship for polytomous data. These statistics help inform researchers about how well items fit to a unidimensional latent trait, and are an important adjunct to modern psychometrics. Given the increasing use of Rasch models in health research the purpose of this study was therefore to explore the relationship between fit statistics and sample size for polytomous data. Data were collated from a heterogeneous sample of cancer patients (n = 4072) who had completed both the Patient Health Questionnaire - 9 and the Hospital Anxiety and Depression Scale. Ten samples were drawn with replacement for each of eight sample sizes (n = 25 to n = 3200). The Rating and Partial Credit Models were applied and the mean square and t-fit statistics (infit/outfit) derived for each model. The results demonstrated that t-statistics were highly sensitive to sample size, whereas mean square statistics remained relatively stable for polytomous data. It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges.

  14. Rasch fit statistics and sample size considerations for polytomous data

    PubMed Central

    Smith, Adam B; Rush, Robert; Fallowfield, Lesley J; Velikova, Galina; Sharpe, Michael

    2008-01-01

    Background: Previous research on educational data has demonstrated that Rasch fit statistics (mean squares and t-statistics) are highly susceptible to sample size variation for dichotomously scored rating data, although little is known about this relationship for polytomous data. These statistics help inform researchers about how well items fit to a unidimensional latent trait, and are an important adjunct to modern psychometrics. Given the increasing use of Rasch models in health research the purpose of this study was therefore to explore the relationship between fit statistics and sample size for polytomous data. Methods: Data were collated from a heterogeneous sample of cancer patients (n = 4072) who had completed both the Patient Health Questionnaire – 9 and the Hospital Anxiety and Depression Scale. Ten samples were drawn with replacement for each of eight sample sizes (n = 25 to n = 3200). The Rating and Partial Credit Models were applied and the mean square and t-fit statistics (infit/outfit) derived for each model. Results: The results demonstrated that t-statistics were highly sensitive to sample size, whereas mean square statistics remained relatively stable for polytomous data. Conclusion: It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges. PMID:18510722

  15. Determination of sample size for higher volatile data using new framework of Box-Jenkins model with GARCH: A case study on gold price

    NASA Astrophysics Data System (ADS)

    Roslindar Yaziz, Siti; Zakaria, Roslinazairimah; Hura Ahmad, Maizah

    2017-09-01

    The Box-Jenkins - GARCH model has been shown to be a promising tool for forecasting highly volatile time series. In this study, a framework for determining the optimal sample size using the Box-Jenkins model with GARCH is proposed for practical application in analysing and forecasting highly volatile data. The proposed framework is applied to the daily world gold price series from 1971 to 2013. The data are divided into 12 different sample sizes (from 30 to 10200). Each sample is tested using different combinations of the hybrid Box-Jenkins - GARCH model. Our study shows that the optimal sample size for forecasting the gold price using the framework of the hybrid model is 1250 observations, a 5-year sample. Hence, the empirical results of the model selection criteria and 1-step-ahead forecasting evaluations suggest that the latest 12.25% (5-year data) of the 10200 observations is sufficient for the Box-Jenkins - GARCH model, with forecasting performance similar to that obtained using the 41-year data.

  16. A novel sample size formula for the weighted log-rank test under the proportional hazards cure model.

    PubMed

    Xiong, Xiaoping; Wu, Jianrong

    2017-01-01

    The treatment of cancer has progressed dramatically in recent decades, such that it is no longer uncommon to see a cure or long-term survival in a significant proportion of patients with various types of cancer. To adequately account for the cure fraction when designing clinical trials, the cure models should be used. In this article, a sample size formula for the weighted log-rank test is derived under the fixed alternative hypothesis for the proportional hazards cure models. Simulation showed that the proposed sample size formula provides an accurate estimation of sample size for designing clinical trials under the proportional hazards cure models. Copyright © 2016 John Wiley & Sons, Ltd.

  17. Simulation on Poisson and negative binomial models of count road accident modeling

    NASA Astrophysics Data System (ADS)

    Sapuan, M. S.; Razali, A. M.; Zamzuri, Z. H.; Ibrahim, K.

    2016-11-01

    Accident count data have often been shown to be overdispersed. The data may also contain excess zero counts. A simulation study was conducted to create scenarios in which accidents happen at a T-junction, under the assumption that the dependent variable of the generated data follows a given distribution, namely the Poisson or the negative binomial distribution, with sample sizes from n=30 to n=500. The study objective was accomplished by fitting Poisson regression, negative binomial regression and hurdle negative binomial models to the simulated data. Model validation was compared, and the simulation results show that, for each sample size, not every model fits the data well even when the data are generated from that model's own distribution, especially when the sample size is larger. Furthermore, larger sample sizes are associated with more zero accident counts in the dataset.

  18. Quantifying and Mitigating the Effect of Preferential Sampling on Phylodynamic Inference

    PubMed Central

    Karcher, Michael D.; Palacios, Julia A.; Bedford, Trevor; Suchard, Marc A.; Minin, Vladimir N.

    2016-01-01

    Phylodynamics seeks to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. One way to accomplish this task formulates an observed sequence data likelihood exploiting a coalescent model for the sampled individuals’ genealogy and then integrating over all possible genealogies via Monte Carlo or, less efficiently, by conditioning on one genealogy estimated from the sequence data. However, when analyzing sequences sampled serially through time, current methods implicitly assume either that sampling times are fixed deterministically by the data collection protocol or that their distribution does not depend on the size of the population. Through simulation, we first show that, when sampling times do probabilistically depend on effective population size, estimation methods may be systematically biased. To correct for this deficiency, we propose a new model that explicitly accounts for preferential sampling by modeling the sampling times as an inhomogeneous Poisson process dependent on effective population size. We demonstrate that in the presence of preferential sampling our new model not only reduces bias, but also improves estimation precision. Finally, we compare the performance of the currently used phylodynamic methods with our proposed model through clinically-relevant, seasonal human influenza examples. PMID:26938243

  19. Sample size, confidence, and contingency judgement.

    PubMed

    Clément, Mélanie; Mercier, Pierre; Pastò, Luigi

    2002-06-01

    According to statistical models, the acquisition function of contingency judgement is due to confidence increasing with sample size. According to associative models, the function reflects the accumulation of associative strength on which the judgement is based. Which view is right? Thirty university students assessed the relation between a fictitious medication and a symptom of skin discoloration in conditions that varied sample size (4, 6, 8 or 40 trials) and contingency (delta P = .20, .40, .60 or .80). Confidence was also collected. Contingency judgement was lower for smaller samples, while confidence level correlated inversely with sample size. This dissociation between contingency judgement and confidence contradicts the statistical perspective.
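
    The contingency measure delta P used in this study is conventionally the difference between the probability of the outcome when the cue is present and when it is absent. A minimal sketch, with hypothetical cell counts for a 40-trial condition (the abstract does not report the actual trial breakdown):

```python
def delta_p(cue_outcome, cue_no_outcome, no_cue_outcome, no_cue_no_outcome):
    """Delta P = P(outcome | cue) - P(outcome | no cue) from a 2x2 table of counts."""
    cue_trials = cue_outcome + cue_no_outcome
    no_cue_trials = no_cue_outcome + no_cue_no_outcome
    if cue_trials == 0 or no_cue_trials == 0:
        raise ValueError("both cue-present and cue-absent trials are required")
    return cue_outcome / cue_trials - no_cue_outcome / no_cue_trials


if __name__ == "__main__":
    # Hypothetical 40 trials: 20 with the medication, 20 without.
    print(delta_p(14, 6, 6, 14))
```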

  20. Review of Sample Size for Structural Equation Models in Second Language Testing and Learning Research: A Monte Carlo Approach

    ERIC Educational Resources Information Center

    In'nami, Yo; Koizumi, Rie

    2013-01-01

    The importance of sample size, although widely discussed in the literature on structural equation modeling (SEM), has not been widely recognized among applied SEM researchers. To narrow this gap, we focus on second language testing and learning studies and examine the following: (a) Is the sample size sufficient in terms of precision and power of…

  1. Dispersion and sampling of adult Dermacentor andersoni in rangeland in Western North America.

    PubMed

    Rochon, K; Scoles, G A; Lysyk, T J

    2012-03-01

    A fixed precision sampling plan was developed for off-host populations of adult Rocky Mountain wood tick, Dermacentor andersoni (Stiles) based on data collected by dragging at 13 locations in Alberta, Canada; Washington; and Oregon. In total, 222 site-date combinations were sampled. Each site-date combination was considered a sample, and each sample ranged in size from 86 to 250 10 m² quadrats. Analysis of simulated quadrats ranging in size from 10 to 50 m² indicated that the most precise sample unit was the 10 m² quadrat. Samples taken when abundance was < 0.04 ticks per 10 m² were more likely to not depart significantly from statistical randomness than samples taken when abundance was greater. Data were grouped into ten abundance classes and assessed for fit to the Poisson and negative binomial distributions. The Poisson distribution fit only data in abundance classes < 0.02 ticks per 10 m², while the negative binomial distribution fit data from all abundance classes. A negative binomial distribution with common k = 0.3742 fit data in eight of the 10 abundance classes. Both the Taylor and Iwao mean-variance relationships were fit and used to predict sample sizes for a fixed level of precision. Sample sizes predicted using the Taylor model tended to underestimate actual sample sizes, while sample sizes estimated using the Iwao model tended to overestimate actual sample sizes. Using a negative binomial with common k provided estimates of required sample sizes closest to empirically calculated sample sizes.
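
    Fixed-precision sample sizes of this kind are commonly computed from the variance-mean relationship, with precision D defined as the standard error divided by the mean. The sketch below uses the textbook forms for the negative binomial (variance m + m²/k) and Taylor's power law (variance a·m^b). The value k = 0.3742 comes from the abstract; the precision level D = 0.25 and the Taylor coefficients are assumptions for illustration, and the paper's exact formulas may differ.

```python
from math import ceil


def n_negative_binomial(mean, k, precision):
    """Quadrats needed when variance = mean + mean**2 / k and precision = SE / mean."""
    return (1.0 / mean + 1.0 / k) / precision ** 2


def n_taylor(mean, a, b, precision):
    """Quadrats needed when variance = a * mean**b (Taylor's power law)."""
    return a * mean ** (b - 2.0) / precision ** 2


if __name__ == "__main__":
    density = 0.04  # ticks per 10 m^2 quadrat, a threshold value from the abstract
    print(ceil(n_negative_binomial(density, k=0.3742, precision=0.25)))
    # a = b = 1 reduces Taylor's law to the Poisson case (hypothetical coefficients).
    print(n_taylor(density, a=1.0, b=1.0, precision=0.25))
```

    At low densities the 1/mean term dominates, so required sample sizes grow quickly as abundance falls.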

  2. PTM Modeling of Dredged Suspended Sediment at Proposed Polaris Point and Ship Repair Facility CVN Berthing Sites - Apra Harbor, Guam

    DTIC Science & Technology

    2017-09-01

    ADCP locations used for model calibration. Figure 4-3. Sample water... Example of fine sediment sample [Set d, Sample B30]. (B) Example of coarse sediment sample [Set d, sample B05... Turning Basin average sediment size distribution curve. Figure 5-5. Turning Basin average size

  3. Sample Size in Qualitative Interview Studies: Guided by Information Power.

    PubMed

    Malterud, Kirsti; Siersma, Volkert Dirk; Guassora, Ann Dorrit

    2015-11-27

    Sample sizes must be ascertained in qualitative studies as in quantitative studies, but not by the same means. The prevailing concept for sample size in qualitative studies is "saturation." Saturation is closely tied to a specific methodology, and the term is inconsistently applied. We propose the concept "information power" to guide adequate sample size for qualitative studies. Information power indicates that the more information the sample holds, relevant for the actual study, the fewer participants are needed. We suggest that the size of a sample with sufficient information power depends on (a) the aim of the study, (b) sample specificity, (c) use of established theory, (d) quality of dialogue, and (e) analysis strategy. We present a model where these elements of information and their relevant dimensions are related to information power. Application of this model in the planning and during data collection of a qualitative study is discussed. © The Author(s) 2015.

  4. Performance and separation occurrence of binary probit regression estimator using maximum likelihood method and Firths approach under different sample size

    NASA Astrophysics Data System (ADS)

    Lusiana, Evellin Dewi

    2017-12-01

    The parameters of the binary probit regression model are commonly estimated using the maximum likelihood estimation (MLE) method. However, the MLE method has a limitation when the binary data contain separation. Separation is the condition in which one or several independent variables exactly group the categories of the binary response. It causes the MLE estimators to become non-convergent, so that they cannot be used in modeling. One way to resolve separation is to use Firth's approach instead. This research has two aims: first, to identify the chance of separation occurring in the binary probit regression model under the MLE method and Firth's approach; second, to compare the performance of the binary probit regression estimators obtained by the MLE method and Firth's approach using the RMSE criterion. Both are carried out by simulation under different sample sizes. The results showed that the chance of separation occurring with the MLE method for small sample sizes is higher than with Firth's approach. For larger sample sizes, on the other hand, the probability decreased and was nearly identical between the MLE method and Firth's approach. Meanwhile, Firth's estimators have smaller RMSE than the MLEs, especially for smaller sample sizes, whereas for larger sample sizes the RMSEs are not much different. This means that Firth's estimators outperformed the MLE estimators.
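
    For a single predictor, complete separation can be checked directly: it occurs when every predictor value in one response category lies strictly below every value in the other. The sketch below covers only that one-predictor case and does not detect quasi-complete separation (ties at the boundary) or separation by a combination of predictors. The function name is hypothetical and this is not the study's procedure.

```python
def is_completely_separated(x, y):
    """Return True if a single predictor x perfectly splits the 0/1 response y."""
    zeros = [xi for xi, yi in zip(x, y) if yi == 0]
    ones = [xi for xi, yi in zip(x, y) if yi == 1]
    if not zeros or not ones:
        raise ValueError("both response categories must be present")
    # Separated if the two groups occupy non-overlapping ranges of x.
    return max(zeros) < min(ones) or max(ones) < min(zeros)


if __name__ == "__main__":
    print(is_completely_separated([1, 2, 3, 4], [0, 0, 1, 1]))  # groups do not overlap
    print(is_completely_separated([1, 3, 2, 4], [0, 0, 1, 1]))  # groups overlap
```

    With small samples, non-overlapping ranges arise by chance more often, which is consistent with the abstract's finding that separation is more likely at small sample sizes.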

  5. Effects of Calibration Sample Size and Item Bank Size on Ability Estimation in Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Sahin, Alper; Weiss, David J.

    2015-01-01

    This study aimed to investigate the effects of calibration sample size and item bank size on examinee ability estimation in computerized adaptive testing (CAT). For this purpose, a 500-item bank pre-calibrated using the three-parameter logistic model with 10,000 examinees was simulated. Calibration samples of varying sizes (150, 250, 350, 500,…

  6. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…

  7. Sample size calculations for case-control studies

    Cancer.gov

    This R package can be used to calculate the required sample size for unconditional multivariate analyses of unmatched case-control studies. The sample sizes are for a scalar exposure effect, such as binary, ordinal or continuous exposures. The sample sizes can also be computed for scalar interaction effects. The analyses account for the effects of potential confounder variables that are also included in the multivariate logistic model.

  8. Characterization of the Particle Size and Polydispersity of Dicumarol Using Solid-State NMR Spectroscopy.

    PubMed

    Dempah, Kassibla Elodie; Lubach, Joseph W; Munson, Eric J

    2017-03-06

    A variety of particle sizes of a model compound, dicumarol, were prepared and characterized in order to investigate the correlation between particle size and solid-state NMR (SSNMR) proton spin-lattice relaxation (¹H T₁) times. Conventional laser diffraction and scanning electron microscopy were used as particle size measurement techniques and showed crystalline dicumarol samples with sizes ranging from tens of micrometers to a few micrometers. Dicumarol samples were prepared using both bottom-up and top-down particle size control approaches, via antisolvent microprecipitation and cryogrinding. It was observed that smaller particles of dicumarol generally had shorter ¹H T₁ times than larger ones. Additionally, cryomilled particles had the shortest ¹H T₁ times encountered (8 s). SSNMR ¹H T₁ times of all the samples were measured and showed as-received dicumarol to have a T₁ of 1500 s, whereas the ¹H T₁ times of the precipitated samples ranged from 20 to 80 s, with no apparent change in the physical form of dicumarol. Physical mixtures of different sized particles were also analyzed to determine the effect of sample inhomogeneity on ¹H T₁ values. Mixtures of cryoground and as-received dicumarol were clearly inhomogeneous as they did not fit well to a one-component relaxation model, but could be fit much better to a two-component model with both fast- and slow-relaxing regimes. Results indicate that samples of crystalline dicumarol containing two significantly different particle size populations could be deconvoluted solely based on their differences in ¹H T₁ times. Relative populations of each particle size regime could also be approximated using two-component fitting models. Using NMR theory on spin diffusion as a reference, and taking into account the presence of crystal defects, a model for the correlation between the particle size of dicumarol and its ¹H T₁ time was proposed.
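
    A two-component relaxation model of the kind described can be sketched as a weighted sum of two exponential recoveries. The sketch assumes a saturation-recovery form normalized to 1 at full recovery; the abstract does not state the pulse sequence or the fitting equation, so the functional form and the 50/50 mixing fraction are assumptions. The T₁ values of 8 s and 1500 s are the cryomilled and as-received values quoted above.

```python
from math import exp


def two_component_recovery(t, fast_fraction, t1_fast, t1_slow):
    """Normalized signal at recovery time t for a mixture of two T1 populations."""
    slow_fraction = 1.0 - fast_fraction
    fast = 1.0 - exp(-t / t1_fast)   # small-particle (fast-relaxing) population
    slow = 1.0 - exp(-t / t1_slow)   # large-particle (slow-relaxing) population
    return fast_fraction * fast + slow_fraction * slow


if __name__ == "__main__":
    for t in (0, 8, 80, 1500, 6000):
        print(t, two_component_recovery(t, 0.5, t1_fast=8.0, t1_slow=1500.0))
```

    With time constants this far apart, the fast population is almost fully recovered before the slow one has moved appreciably, which is why a single exponential fits such a mixture poorly.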

  9. Single and simultaneous binary mergers in Wright-Fisher genealogies.

    PubMed

    Melfi, Andrew; Viswanath, Divakar

    2018-05-01

    The Kingman coalescent is a commonly used model in genetics, which is often justified with reference to the Wright-Fisher (WF) model. Current proofs of convergence of WF and other models to the Kingman coalescent assume a constant sample size. However, sample sizes have become quite large in human genetics. Therefore, we develop a convergence theory that allows the sample size to increase with population size. If the haploid population size is N and the sample size is N^(1/3-ϵ), ϵ>0, we prove that Wright-Fisher genealogies involve at most a single binary merger in each generation with probability converging to 1 in the limit of large N. Single binary merger or no merger in each generation of the genealogy implies that the Kingman partition distribution is obtained exactly. If the sample size is N^(1/2-ϵ), Wright-Fisher genealogies may involve simultaneous binary mergers in a single generation but do not involve triple mergers in the large N limit. The asymptotic theory is verified using numerical calculations. Variable population sizes are handled algorithmically. It is found that even distant bottlenecks can increase the probability of triple mergers as well as simultaneous binary mergers in WF genealogies. Copyright © 2018 Elsevier Inc. All rights reserved.
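
    The single-generation claim above can be checked numerically with a short sketch. It assumes the textbook haploid Wright-Fisher step, in which each of n sampled lineages picks its parent uniformly and independently among N individuals; the function names are illustrative and this is a direct counting argument, not the authors' convergence proof.

```python
from math import comb


def falling_factorial(N, k):
    """N * (N-1) * ... * (N-k+1): ordered choices of k distinct parents."""
    result = 1
    for i in range(k):
        result *= N - i
    return result


def merger_probabilities(N, n):
    """One-generation probabilities for n lineages in a haploid WF population of size N.

    Returns (no merger, exactly one binary merger, anything else).
    "Anything else" covers simultaneous binary mergers and triple or larger mergers.
    """
    total = N ** n  # all equally likely parent assignments
    # All n lineages have distinct parents.
    p_none = falling_factorial(N, n) / total
    # Choose the pair that shares a parent, then n-1 distinct parents.
    p_single = comb(n, 2) * falling_factorial(N, n - 1) / total
    return p_none, p_single, 1.0 - p_none - p_single


# Sample size growing like N^(1/3 - eps), with eps = 0.05 chosen for illustration.
for N in (10**3, 10**6, 10**9):
    n = max(2, round(N ** (1 / 3 - 0.05)))
    p_none, p_single, p_other = merger_probabilities(N, n)
    print(N, n, p_none + p_single)
```

    Under these assumptions the printed probability of "at most a single binary merger" should move toward 1 as N grows, which is the behaviour the abstract describes for sample sizes of order N^(1/3-ϵ).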

  10. The Effects of Model Misspecification and Sample Size on LISREL Maximum Likelihood Estimates.

    ERIC Educational Resources Information Center

    Baldwin, Beatrice

    The robustness of LISREL computer program maximum likelihood estimates under specific conditions of model misspecification and sample size was examined. The population model used in this study contains one exogenous variable; three endogenous variables; and eight indicator variables, two for each latent variable. Conditions of model…

  11. A simple approach to power and sample size calculations in logistic regression and Cox regression models.

    PubMed

    Vaeth, Michael; Skovlund, Eva

    2004-06-15

    For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
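
    A minimal sketch of the equivalent two-sample construction for logistic regression follows. It assumes that "the parameters differ by the slope times twice the standard deviation" refers to the log-odds of the two groups, and that "expected number of events unchanged" means the average of the two group probabilities equals the overall response probability. The sample size formula is the common textbook two-proportion formula, used here for illustration and not necessarily the one the authors apply; all function names are hypothetical.

```python
from math import ceil, log, sqrt
from statistics import NormalDist


def logit(p):
    return log(p / (1.0 - p))


def equivalent_two_sample(beta, sd_x, p_bar):
    """Find (p1, p2) with logit(p2) - logit(p1) = 2*|beta|*sd_x and (p1 + p2)/2 = p_bar."""
    delta = 2.0 * abs(beta) * sd_x
    lo = max(0.0, 2.0 * p_bar - 1.0)  # keeps p2 = 2*p_bar - p1 below 1
    hi = p_bar
    for _ in range(200):  # bisection on p1; the log-odds gap shrinks as p1 grows
        mid = (lo + hi) / 2.0
        gap = logit(2.0 * p_bar - mid) - logit(mid)
        if gap > delta:
            lo = mid  # groups too far apart, move p1 toward p_bar
        else:
            hi = mid
    p1 = (lo + hi) / 2.0
    return p1, 2.0 * p_bar - p1


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Textbook sample size per group for comparing two proportions (two-sided test)."""
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2.0
    numerator = (z_a * sqrt(2.0 * p_bar * (1.0 - p_bar))
                 + z_b * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2)))
    return ceil((numerator / (p2 - p1)) ** 2)


# Illustrative inputs (assumed, not from the paper): slope 0.5 per unit of x,
# covariate SD 1.0, overall response probability 0.3.
p1, p2 = equivalent_two_sample(beta=0.5, sd_x=1.0, p_bar=0.3)
print(p1, p2, 2 * n_per_group(p1, p2))  # total size for two equal groups
```

    The total size of the two-sample problem is then read as the sample size for the regression problem, which is the substitution the abstract proposes.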

  12. Chi-Squared Test of Fit and Sample Size-A Comparison between a Random Sample Approach and a Chi-Square Value Adjustment Method.

    PubMed

    Bergh, Daniel

    2015-01-01

    Chi-square statistics are commonly used for tests of fit of measurement models. Chi-square is also sensitive to sample size, which is why several approaches to handle large samples in test-of-fit analysis have been developed. One strategy to handle the sample size problem may be to adjust the sample size in the analysis of fit. An alternative is to adopt a random sample approach. The purpose of this study was to analyze and to compare these two strategies using simulated data. Given an original sample size of 21,000, for reductions of sample sizes down to the order of 5,000 the adjusted sample size function works as well as the random sample approach. In contrast, when applying adjustments to sample sizes of lower order the adjustment function is less effective at approximating the chi-square value for an actual random sample of the relevant size. Hence, the fit is exaggerated and misfit under-estimated using the adjusted sample size function. Although there are big differences in chi-square values between the two approaches at lower sample sizes, the inferences based on the p-values may be the same.

  13. Monitoring Species of Concern Using Noninvasive Genetic Sampling and Capture-Recapture Methods

    DTIC Science & Technology

    2016-11-01

    ABBREVIATIONS AICc Akaike’s Information Criterion with small sample size correction AZGFD Arizona Game and Fish Department BMGR Barry M. Goldwater...MNKA Minimum Number Known Alive N Abundance Ne Effective Population Size NGS Noninvasive Genetic Sampling NGS-CR Noninvasive Genetic...parameter estimates from capture-recapture models require sufficient sample sizes, capture probabilities and low capture biases. For NGS-CR, sample

  14. Grain Size and Parameter Recovery with TIMSS and the General Diagnostic Model

    ERIC Educational Resources Information Center

    Skaggs, Gary; Wilkins, Jesse L. M.; Hein, Serge F.

    2016-01-01

    The purpose of this study was to explore the degree of grain size of the attributes and the sample sizes that can support accurate parameter recovery with the General Diagnostic Model (GDM) for a large-scale international assessment. In this resampling study, bootstrap samples were obtained from the 2003 Grade 8 TIMSS in Mathematics at varying…

  15. Validation of abundance estimates from mark–recapture and removal techniques for rainbow trout captured by electrofishing in small streams

    USGS Publications Warehouse

    Rosenberger, Amanda E.; Dunham, Jason B.

    2005-01-01

    Estimation of fish abundance in streams using the removal model or the Lincoln-Peterson mark-recapture model is a common practice in fisheries. These models produce misleading results if their assumptions are violated. We evaluated the assumptions of these two models via electrofishing of rainbow trout Oncorhynchus mykiss in central Idaho streams. For one-, two-, three-, and four-pass sampling effort in closed sites, we evaluated the influences of fish size and habitat characteristics on sampling efficiency and the accuracy of removal abundance estimates. We also examined the use of models to generate unbiased estimates of fish abundance through adjustment of total catch or biased removal estimates. Our results suggested that the assumptions of the mark-recapture model were satisfied and that abundance estimates based on this approach were unbiased. In contrast, the removal model assumptions were not met. Decreasing sampling efficiencies over removal passes resulted in underestimated population sizes and overestimates of sampling efficiency. This bias decreased, but was not eliminated, with increased sampling effort. Biased removal estimates based on different levels of effort were highly correlated with each other but were less correlated with unbiased mark-recapture estimates. Stream size decreased sampling efficiency, and stream size and instream wood increased the negative bias of removal estimates. We found that reliable estimates of population abundance could be obtained from models of sampling efficiency for different levels of effort. Validation of abundance estimates requires extra attention to routine sampling considerations but can help fisheries biologists avoid pitfalls associated with biased data and facilitate standardized comparisons among studies that employ different sampling methods.
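
    The two estimators compared in this abstract can be sketched in their simplest textbook forms: the two-sample Lincoln-Petersen estimator (the usual spelling of the estimator the abstract calls Lincoln-Peterson) and the two-pass removal estimator. The study itself used up to four passes and more elaborate models, so the code below is only an illustration, and the catch numbers are hypothetical expected values chosen to show how the bias arises.

```python
def lincoln_petersen(marked, caught, recaptured):
    """Two-sample mark-recapture estimate: N = M * C / R."""
    return marked * caught / recaptured


def two_pass_removal(c1, c2):
    """Two-pass removal estimate, assuming equal capture efficiency on both passes.

    Returns (abundance estimate, estimated capture efficiency).
    """
    if c2 >= c1:
        raise ValueError("second-pass catch must be smaller than first-pass catch")
    n_hat = c1 ** 2 / (c1 - c2)
    p_hat = (c1 - c2) / c1
    return n_hat, p_hat


# Hypothetical closed site with 100 fish.
# Constant efficiency 0.6: expected catches are 60, then 0.6 * 40 = 24.
print(two_pass_removal(60, 24))  # recovers N = 100 and p = 0.6
# Efficiency falls from 0.6 to 0.3 on the second pass: catches 60, then 0.3 * 40 = 12.
print(two_pass_removal(60, 12))  # N is underestimated, p is overestimated
# Mark-recapture: 50 marked, 40 caught later, 20 of them carrying marks.
print(lincoln_petersen(50, 40, 20))
```

    The second call reproduces the direction of bias reported above: when efficiency declines between passes, the removal estimator returns too few fish and too high an efficiency.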

  16. Inventory implications of using sampling variances in estimation of growth model coefficients

    Treesearch

    Albert R. Stage; William R. Wykoff

    2000-01-01

    Variables based on stand densities or stocking have sampling errors that depend on the relation of tree size to plot size and on the spatial structure of the population. Ignoring the sampling errors of such variables, which include most measures of competition used in both distance-dependent and distance-independent growth models, can bias the predictions obtained from...

  17. On sample size of the Kruskal-Wallis test with application to a mouse peritoneal cavity study.

    PubMed

    Fan, Chunpeng; Zhang, Donghui; Zhang, Cun-Hui

    2011-03-01

    As the nonparametric generalization of the one-way analysis of variance model, the Kruskal-Wallis test applies when the goal is to test the difference between multiple samples and the underlying population distributions are nonnormal or unknown. Although the Kruskal-Wallis test has been widely used for data analysis, power and sample size methods for this test have been investigated to a much lesser extent. This article proposes new power and sample size calculation methods for the Kruskal-Wallis test based on the pilot study in either a completely nonparametric model or a semiparametric location model. No assumption is made on the shape of the underlying population distributions. Simulation results show that, in terms of sample size calculation for the Kruskal-Wallis test, the proposed methods are more reliable and preferable to some more traditional methods. A mouse peritoneal cavity study is used to demonstrate the application of the methods. © 2010, The International Biometric Society.
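
    For orientation, a generic Monte Carlo power check for the Kruskal-Wallis test can be sketched as follows. This is not the pilot-study-based method the article proposes: it assumes three normally distributed groups with one group shifted, continuous data without ties, and the chi-square approximation to the null distribution of H (for three groups, 2 degrees of freedom, where the p-value is exactly exp(-H/2)). All names and settings are illustrative.

```python
import random
from math import exp


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, assuming no tied values."""
    pooled = sorted((value, gi) for gi, g in enumerate(groups) for value in g)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    for rank, (_, gi) in enumerate(pooled, start=1):
        rank_sums[gi] += rank
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


def simulated_power(shift, n_per_group, reps=1000, alpha=0.05, seed=1):
    """Share of simulated three-group data sets in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        groups = [
            [rng.gauss(0.0, 1.0) for _ in range(n_per_group)],
            [rng.gauss(0.0, 1.0) for _ in range(n_per_group)],
            [rng.gauss(shift, 1.0) for _ in range(n_per_group)],
        ]
        p_value = exp(-kruskal_wallis_h(groups) / 2.0)  # chi-square, 2 df
        if p_value < alpha:
            rejections += 1
    return rejections / reps


# Power for a one-standard-deviation shift in the third group (assumed scenario).
for n in (10, 20, 40):
    print(n, simulated_power(shift=1.0, n_per_group=n))
```

    In practice one would replace the normal draws with resampling from pilot data, which is closer in spirit to the nonparametric setting the abstract describes.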

  18. Designing a two-rank acceptance sampling plan for quality inspection of geospatial data products

    NASA Astrophysics Data System (ADS)

    Tong, Xiaohua; Wang, Zhenhua; Xie, Huan; Liang, Dan; Jiang, Zuoqin; Li, Jinchao; Li, Jun

    2011-10-01

    To address the disadvantages of classical sampling plans designed for traditional industrial products, we originally propose a two-rank acceptance sampling plan (TRASP) for the inspection of geospatial data outputs based on the acceptance quality level (AQL). The first rank sampling plan is to inspect the lot consisting of map sheets, and the second is to inspect the lot consisting of features in an individual map sheet. The TRASP design is formulated as an optimization problem with respect to sample size and acceptance number, which covers two lot size cases. The first case is for a small lot size with nonconformities being modeled by a hypergeometric distribution function, and the second is for a larger lot size with nonconformities being modeled by a Poisson distribution function. The proposed TRASP is illustrated through two empirical case studies. Our analysis demonstrates that: (1) the proposed TRASP provides a general approach for quality inspection of geospatial data outputs consisting of non-uniform items and (2) the proposed acceptance sampling plan based on TRASP performs better than other classical sampling plans. It overcomes the drawbacks of percent sampling, i.e., "strictness for large lot size, toleration for small lot size," and those of a national standard used specifically for industrial outputs, i.e., "lots with different sizes corresponding to the same sampling plan."
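
    The two lot-size cases mentioned above rest on standard acceptance probabilities, which the following sketch computes for a classical single sampling plan (sample size n, acceptance number c). It is not the two-rank TRASP optimization itself; the function names and example numbers are assumptions for illustration.

```python
from math import comb, exp, factorial


def accept_prob_hypergeom(lot_size, nonconforming, n, c):
    """P(accept) for a small lot: at most c nonconforming items in a sample of n drawn without replacement."""
    total = comb(lot_size, n)
    upper = min(c, nonconforming, n)
    favourable = sum(
        comb(nonconforming, k) * comb(lot_size - nonconforming, n - k)
        for k in range(upper + 1)
    )
    return favourable / total


def accept_prob_poisson(n, rate, c):
    """P(accept) for a large lot: nonconformities in the sample modelled as Poisson with mean n * rate."""
    lam = n * rate
    return sum(exp(-lam) * lam ** k / factorial(k) for k in range(c + 1))


# Hypothetical plans: a lot of 50 map sheets with 5 nonconforming, sample 8, accept on <= 1;
# and a large lot of features with a 2% nonconformity rate, sample 100, accept on <= 3.
print(accept_prob_hypergeom(50, 5, 8, 1))
print(accept_prob_poisson(100, 0.02, 3))
```

    A plan designer would search over n and c until these probabilities meet the chosen producer and consumer risks at the acceptance quality level.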

  19. A size-dependent constitutive model of bulk metallic glasses in the supercooled liquid region

    PubMed Central

    Yao, Di; Deng, Lei; Zhang, Mao; Wang, Xinyun; Tang, Na; Li, Jianjun

    2015-01-01

    Size effect is of great importance in micro forming processes. In this paper, micro cylinder compression was conducted to investigate the deformation behavior of bulk metallic glasses (BMGs) in the supercooled liquid region with different deformation variables including sample size, temperature and strain rate. It was found that the elastic and plastic behaviors of BMGs have a strong dependence on the sample size. The free volume and defect concentration were introduced to explain the size effect. In order to demonstrate the influence of deformation variables on steady stress, elastic modulus and overshoot phenomenon, four size-dependent factors were proposed to construct a size-dependent constitutive model based on the Maxwell-pulse type model previously presented by the authors according to viscosity theory and free volume model. The proposed constitutive model was then adopted in finite element method simulations, and validated by comparing the micro cylinder compression and micro double cup extrusion experimental data with the numerical results. Furthermore, the model provides a new approach to understanding the size-dependent plastic deformation behavior of BMGs. PMID:25626690

  20. Sample size requirements for the design of reliability studies: precision consideration.

    PubMed

    Shieh, Gwowen

    2014-09-01

    In multilevel modeling, the intraclass correlation coefficient based on the one-way random-effects model is routinely employed to measure the reliability or degree of resemblance among group members. To facilitate the advocated practice of reporting confidence intervals in future reliability studies, this article presents exact sample size procedures for precise interval estimation of the intraclass correlation coefficient under various allocation and cost structures. Although the suggested approaches do not admit explicit sample size formulas and require special algorithms for carrying out iterative computations, they are more accurate than the closed-form formulas constructed from large-sample approximations with respect to the expected width and assurance probability criteria. This investigation notes the deficiency of existing methods and expands the sample size methodology for the design of reliability studies that have not previously been discussed in the literature.

  1. Repopulation of calibrations with samples from the target site: effect of the size of the calibration.

    NASA Astrophysics Data System (ADS)

    Guerrero, C.; Zornoza, R.; Gómez, I.; Mataix-Solera, J.; Navarro-Pedreño, J.; Mataix-Beneyto, J.; García-Orenes, F.

    2009-04-01

    Near infrared (NIR) reflectance spectroscopy offers important advantages: it is a non-destructive technique, samples need minimal pre-treatment, and the spectrum of a sample is obtained in less than 1 minute without the need for chemical reagents. For these reasons, NIR is a fast and cost-effective method. Moreover, once a spectrum is obtained, NIR allows several constituents or parameters to be analysed simultaneously from it. This requires the development of soil spectral libraries (sets of samples analysed and scanned) and calibrations (using multivariate techniques). The calibrations should contain the variability of the soils at the target site where the calibration is to be used. This premise is often not easy to fulfil, especially for recently developed libraries. A classical way to solve this problem is to repopulate the libraries and then recalibrate the models. In this work we studied the changes in the accuracy of the predictions as a consequence of the successive addition of samples during repopulation. In general, calibrations with a large number of samples and high diversity are desired. But we hypothesized that calibrations with fewer samples (smaller size) will more easily absorb the spectral characteristics of the target site. Thus, we suspect that the size of the calibration (model) to be repopulated could be important. For this reason we also studied the effect of calibration size on the accuracy of predictions of the repopulated models. In this study we used the spectra in our library that had data on soil Kjeldahl nitrogen (NKj) content (nearly 1500 samples). First, the spectra from the target site were removed from the spectral library. Then, different quantities of library samples were selected (representing 5, 10, 25, 50, 75 and 100% of the total library). These samples were used to develop calibrations of different sizes (% of samples). We used partial least squares regression with leave-one-out cross-validation as the calibration method. Two methods were used to select the different quantities of samples (model sizes): (1) Based on Characteristics of Spectra (BCS), and (2) Based on NKj Values of Samples (BVS). Both methods aimed to select representative samples. Each of the calibrations (containing 5, 10, 25, 50, 75 or 100% of the total samples of the library) was repopulated with samples from the target site and then recalibrated (by leave-one-out cross-validation). This procedure was sequential. In each step, 2 samples from the target site were added to the models, which were then recalibrated. This process was repeated 10 times, so that 20 samples were added in total. A local model was also created with the 20 samples used for repopulation. The repopulated, non-repopulated and local calibrations were used to predict the NKj content of the target-site samples not included in the repopulations. To measure the accuracy of the predictions, the r2, RMSEP and slopes were calculated by comparing predicted with analysed NKj values. This scheme was repeated for each of the four target sites studied. In general, few differences were found between the results obtained with the BCS and BVS models. We observed that repopulating the models increased the r2 of the predictions in sites 1 and 3. Repopulation caused little change in the r2 of the predictions in sites 2 and 4, perhaps because of the high initial values (r2 > 0.90 using non-repopulated models). As a consequence of repopulation, the RMSEP decreased in all the sites except site 2, where a very low RMSEP was obtained before repopulation (0.4 g kg⁻¹). The slopes tended towards 1, but this value was reached only in site 4, after repopulation with 20 samples. In sites 3 and 4, accurate predictions were obtained using the local models. Predictions obtained with models of similar size (similar %) were averaged in order to describe the main patterns. Judged by r2, predictions obtained with larger models were not more accurate than those obtained with smaller models. After repopulation, the RMSEP of predictions using smaller models (5, 10 and 25% of the library samples) was lower than the RMSEP obtained with larger models (75 and 100%), indicating that small models can easily integrate the variability of the soils from the target site. The results suggest that small calibrations could be repopulated and "converted" into local calibrations. Accordingly, most of the effort can be focused on obtaining highly accurate analytical values for a reduced set of samples (including some samples from the target sites). The patterns observed here run counter to the idea of global models. These results could encourage the expansion of this technique, because very large databases seem not to be needed. Future studies with very different samples will help to confirm the robustness of the patterns observed. The authors thank "Bancaja-UMH" for the financial support of the project "NIRPROS".
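
    The three accuracy measures used in this abstract (r2, RMSEP and slope) can be computed as in the sketch below. The abstract does not say which variable was regressed on which, so the slope here is assumed to come from regressing predicted on analysed values; the function name and the example numbers are hypothetical.

```python
from math import sqrt


def prediction_accuracy(analysed, predicted):
    """Return (r2, RMSEP, slope) for paired analysed and predicted values."""
    n = len(analysed)
    mean_a = sum(analysed) / n
    mean_p = sum(predicted) / n
    s_aa = sum((a - mean_a) ** 2 for a in analysed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(analysed, predicted))
    r2 = s_ap ** 2 / (s_aa * s_pp)  # squared Pearson correlation
    rmsep = sqrt(sum((p - a) ** 2 for a, p in zip(analysed, predicted)) / n)
    slope = s_ap / s_aa  # assumption: predicted regressed on analysed
    return r2, rmsep, slope


# Made-up NKj values (g per kg) for a handful of target-site samples.
analysed = [1.2, 1.8, 2.5, 3.1, 3.9]
predicted = [1.4, 1.7, 2.9, 3.0, 4.3]
print(prediction_accuracy(analysed, predicted))
```

    A constant offset between predicted and analysed values leaves r2 and the slope at 1 while raising the RMSEP, which is why the abstract reports the three measures together.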

  2. Trap configuration and spacing influences parameter estimates in spatial capture-recapture models

    USGS Publications Warehouse

    Sun, Catherine C.; Fuller, Angela K.; Royle, J. Andrew

    2014-01-01

    An increasing number of studies employ spatial capture-recapture models to estimate population size, but there has been limited research on how different spatial sampling designs and trap configurations influence parameter estimators. Spatial capture-recapture models provide an advantage over non-spatial models by explicitly accounting for heterogeneous detection probabilities among individuals that arise due to the spatial organization of individuals relative to sampling devices. We simulated black bear (Ursus americanus) populations and spatial capture-recapture data to evaluate the influence of trap configuration and trap spacing on estimates of population size and a spatial scale parameter, sigma, that relates to home range size. We varied detection probability and home range size, and considered three trap configurations common to large-mammal mark-recapture studies: regular spacing, clustered, and a temporal sequence of different cluster configurations (i.e., trap relocation). We explored trap spacing and number of traps per cluster by varying the number of traps. The clustered arrangement performed well when detection rates were low, and provides for easier field implementation than the sequential trap arrangement. However, performance differences between trap configurations diminished as home range size increased. Our simulations suggest it is important to consider trap spacing relative to home range sizes, with traps ideally spaced no more than twice the spatial scale parameter. While spatial capture-recapture models can accommodate different sampling designs and still estimate parameters with accuracy and precision, our simulations demonstrate that aspects of sampling design, namely trap configuration and spacing, must consider study area size, ranges of individual movement, and home range sizes in the study population.

  3. Random-effects linear modeling and sample size tables for two special crossover designs of average bioequivalence studies: the four-period, two-sequence, two-formulation and six-period, three-sequence, three-formulation designs.

    PubMed

    Diaz, Francisco J; Berg, Michel J; Krebill, Ron; Welty, Timothy; Gidal, Barry E; Alloway, Rita; Privitera, Michael

    2013-12-01

    Due to concern and debate in the epilepsy medical community and to the current interest of the US Food and Drug Administration (FDA) in revising approaches to the approval of generic drugs, the FDA is currently supporting ongoing bioequivalence studies of antiepileptic drugs, the EQUIGEN studies. During the design of these crossover studies, the researchers could not find commercial or non-commercial statistical software that quickly allowed computation of sample sizes for their designs, particularly software implementing the FDA requirement of using random-effects linear models for the analyses of bioequivalence studies. This article presents tables for sample-size evaluations of average bioequivalence studies based on the two crossover designs used in the EQUIGEN studies: the four-period, two-sequence, two-formulation design, and the six-period, three-sequence, three-formulation design. Sample-size computations assume that random-effects linear models are used in bioequivalence analyses with crossover designs. Random-effects linear models have been traditionally viewed by many pharmacologists and clinical researchers as just mathematical devices to analyze repeated-measures data. In contrast, a modern view of these models attributes an important mathematical role in theoretical formulations in personalized medicine to them, because these models not only have parameters that represent average patients, but also have parameters that represent individual patients. Moreover, the notation and language of random-effects linear models have evolved over the years. Thus, another goal of this article is to provide a presentation of the statistical modeling of data from bioequivalence studies that highlights the modern view of these models, with special emphasis on power analyses and sample-size computations.

  4. The feasibility of using explicit method for linear correction of the particle size variation using NIR Spectroscopy combined with PLS2 regression method

    NASA Astrophysics Data System (ADS)

    Yulia, M.; Suhandy, D.

    2018-03-01

    NIR spectra obtained from a spectral data acquisition system contain both chemical information about the samples and physical information, such as particle size and bulk density. Several methods have been established for developing calibration models that can compensate for variations in sample physical information. One common approach is to include physical information variation in the calibration model, explicitly or implicitly. The objective of this study was to evaluate the feasibility of using the explicit method to compensate for the influence of different particle sizes of coffee powder on NIR calibration model performance. A total of 220 coffee powder samples of two different types of coffee (civet and non-civet) and two different particle sizes (212 and 500 µm) were prepared. Spectral data were acquired using an NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement. A discrimination method based on PLS-DA was applied, and the influence of particle size on the performance of PLS-DA was investigated. In the explicit method, the particle size is added directly as a predicted variable, resulting in an X block containing only the NIR spectra and a Y block containing the particle size and the type of coffee. The explicit inclusion of the particle size in the calibration model is expected to improve the accuracy of coffee type determination. The results show that, with the explicit method, the quality of the calibration model developed for coffee type determination is slightly superior, with a coefficient of determination (R2) of 0.99 and a root mean square error of cross-validation (RMSECV) of 0.041. The performance of the PLS2 calibration model for coffee type determination with particle size compensation was quite good, and the model was able to predict the type of coffee at two different particle sizes with relatively high R2 pred values. The prediction also resulted in low bias and RMSEP values.

  5. Regression modeling of particle size distributions in urban storm water: advancements through improved sample collection methods

    USGS Publications Warehouse

    Fienen, Michael N.; Selbig, William R.

    2012-01-01

    A new sample collection system was developed to improve the representation of sediment entrained in urban storm water by integrating water quality samples from the entire water column. The depth-integrated sampler arm (DISA) was able to mitigate sediment stratification bias in storm water, thereby improving the characterization of suspended-sediment concentration and particle size distribution at three independent study locations. Use of the DISA decreased variability, which improved statistical regression to predict particle size distribution using surrogate environmental parameters, such as precipitation depth and intensity. The performance of this statistical modeling technique was compared to results using traditional fixed-point sampling methods and was found to perform better. When environmental parameters can be used to predict particle size distributions, environmental managers have more options when characterizing concentrations, loads, and particle size distributions in urban runoff.

  6. Ranked set sampling: cost and optimal set size.

    PubMed

    Nahhas, Ramzi W; Wolfe, Douglas A; Chen, Haiying

    2002-12-01

    McIntyre (1952, Australian Journal of Agricultural Research 3, 385-390) introduced ranked set sampling (RSS) as a method for improving estimation of a population mean in settings where sampling and ranking of units from the population are inexpensive when compared with actual measurement of the units. Two of the major factors in the usefulness of RSS are the set size and the relative costs of the various operations of sampling, ranking, and measurement. In this article, we consider ranking error models and cost models that enable us to assess the effect of different cost structures on the optimal set size for RSS. For reasonable cost structures, we find that the optimal RSS set sizes are generally larger than had been anticipated previously. These results will provide a useful tool for determining whether RSS is likely to lead to an improvement over simple random sampling in a given setting and, if so, what RSS set size is best to use in this case.

  7. 40 CFR 90.706 - Engine sample selection.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... = emission test result for an individual engine. x = mean of emission test results of the actual sample. FEL... test with the last test result from the previous model year and then calculate the required sample size.... Test results used to calculate the variables in the following Sample Size Equation must be final...

  8. Modeling ultrasound propagation through material of increasing geometrical complexity.

    PubMed

    Odabaee, Maryam; Odabaee, Mostafa; Pelekanos, Matthew; Leinenga, Gerhard; Götz, Jürgen

    2018-06-01

    Ultrasound is increasingly being recognized as a neuromodulatory and therapeutic tool, inducing a broad range of bio-effects in the tissue of experimental animals and humans. To achieve these effects in a predictable manner in the human brain, the thick cancellous skull presents a problem, causing attenuation. In order to overcome this challenge, as a first step, the acoustic properties of a set of simple bone-modeling resin samples that displayed an increasing geometrical complexity (increasing step sizes) were analyzed. Using two Non-Destructive Testing (NDT) transducers, we found that Wiener deconvolution predicted the Ultrasound Acoustic Response (UAR) and attenuation caused by the samples. However, whereas the UAR of samples with step sizes larger than the wavelength could be accurately estimated, the prediction was not accurate when the sample had a smaller step size. Furthermore, a Finite Element Analysis (FEA) performed in ANSYS determined that the scattering and refraction of sound waves was significantly higher in complex samples with smaller step sizes compared to simple samples with a larger step size. Together, this reveals an interaction of frequency and geometrical complexity in predicting the UAR and attenuation. These findings could in future be applied to poro-visco-elastic materials that better model the human skull. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  9. Accounting for Incomplete Species Detection in Fish Community Monitoring

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McManamay, Ryan A; Orth, Dr. Donald J; Jager, Yetta

    2013-01-01

    Riverine fish assemblages are heterogeneous and very difficult to characterize with a one-size-fits-all approach to sampling. Furthermore, detecting changes in fish assemblages over time requires accounting for variation in sampling designs. We present a modeling approach that permits heterogeneous sampling by accounting for site and sampling covariates (including method) in a model-based framework for estimation (versus a sampling-based framework). We snorkeled during three surveys and electrofished during a single survey in a suite of delineated habitats stratified by reach types. We developed single-species occupancy models to determine covariates influencing patch occupancy and species detection probabilities whereas community occupancy models estimated species richness in light of incomplete detections. For most species, information-theoretic criteria showed higher support for models that included patch size and reach as covariates of occupancy. In addition, models including patch size and sampling method as covariates of detection probabilities also had higher support. Detection probability estimates for snorkeling surveys were higher for larger non-benthic species whereas electrofishing was more effective at detecting smaller benthic species. The number of sites and sampling occasions required to accurately estimate occupancy varied among fish species. For rare benthic species, our results suggested that a higher number of occasions, and especially the addition of electrofishing, may be required to improve detection probabilities and obtain accurate occupancy estimates. Community models suggested that richness was 41% higher than the number of species actually observed and the addition of an electrofishing survey increased estimated richness by 13%. These results can be useful to future fish assemblage monitoring efforts by informing sampling designs, such as site selection (e.g. stratifying based on patch size) and determining effort required (e.g. number of sites versus occasions).

  10. Seasonal variation in size-dependent survival of juvenile Atlantic salmon (Salmo salar): Performance of multistate capture-mark-recapture models

    USGS Publications Warehouse

    Letcher, B.H.; Horton, G.E.

    2008-01-01

    We estimated the magnitude and shape of size-dependent survival (SDS) across multiple sampling intervals for two cohorts of stream-dwelling Atlantic salmon (Salmo salar) juveniles using multistate capture-mark-recapture (CMR) models. Simulations designed to test the effectiveness of multistate models for detecting SDS in our system indicated that error in SDS estimates was low and that both time-invariant and time-varying SDS could be detected with sample sizes of >250, average survival of >0.6, and average probability of capture of >0.6, except for cases of very strong SDS. In the field (N ≈ 750, survival 0.6-0.8 among sampling intervals, probability of capture 0.6-0.8 among sampling occasions), about one-third of the sampling intervals showed evidence of SDS, with poorer survival of larger fish during the age-2+ autumn and quadratic survival (opposite direction between cohorts) during age-1+ spring. The varying magnitude and shape of SDS among sampling intervals suggest a potential mechanism for the maintenance of the very wide observed size distributions. Estimating SDS using multistate CMR models appears complementary to established approaches, can provide estimates with low error, and can be used to detect intermittent SDS. © 2008 NRC Canada.

  11. An Investigation of Sample Size Splitting on ATFIND and DIMTEST

    ERIC Educational Resources Information Center

    Socha, Alan; DeMars, Christine E.

    2013-01-01

    Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…

  12. VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS

    PubMed Central

    Huang, Jian; Horowitz, Joel L.; Wei, Fengrong

    2010-01-01

    We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is “small” relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model, and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method. PMID:21127739

  13. The relationship between national-level carbon dioxide emissions and population size: an assessment of regional and temporal variation, 1960-2005.

    PubMed

    Jorgenson, Andrew K; Clark, Brett

    2013-01-01

    This study examines the regional and temporal differences in the statistical relationship between national-level carbon dioxide emissions and national-level population size. The authors analyze panel data from 1960 to 2005 for a diverse sample of nations, and employ descriptive statistics and rigorous panel regression modeling techniques. Initial descriptive analyses indicate that all regions experienced overall increases in carbon emissions and population size during the 45-year period of investigation, but with notable differences. For carbon emissions, the sample of countries in Asia experienced the largest percent increase, followed by countries in Latin America, Africa, and lastly the sample of relatively affluent countries in Europe, North America, and Oceania combined. For population size, the sample of countries in Africa experienced the largest percent increase, followed by countries in Latin America, Asia, and the combined sample of countries in Europe, North America, and Oceania. Findings for two-way fixed effects panel regression elasticity models of national-level carbon emissions indicate that the estimated elasticity coefficient for population size is much smaller for nations in Africa than for nations in other regions of the world. Regarding potential temporal changes, from 1960 to 2005 the estimated elasticity coefficient for population size decreased by 25% for the sample of Africa countries, 14% for the sample of Asia countries, 6.5% for the sample of Latin America countries, but remained the same in size for the sample of countries in Europe, North America, and Oceania. Overall, while population size continues to be the primary driver of total national-level anthropogenic carbon dioxide emissions, the findings for this study highlight the need for future research and policies to recognize that the actual impacts of population size on national-level carbon emissions differ across both time and region.

  14. What is the extent of prokaryotic diversity?

    PubMed Central

    Curtis, Thomas P; Head, Ian M; Lunn, Mary; Woodcock, Stephen; Schloss, Patrick D; Sloan, William T

    2006-01-01

    The extent of microbial diversity is an intrinsically fascinating subject of profound practical importance. The term ‘diversity’ may allude to the number of taxa or species richness as well as their relative abundance. There is uncertainty about both, primarily because sample sizes are too small. Non-parametric diversity estimators make gross underestimates if used with small sample sizes on unevenly distributed communities. One can make richness estimates over many scales using small samples by assuming a species/taxa-abundance distribution. However, no one knows what the underlying taxa-abundance distributions are for bacterial communities. Latterly, diversity has been estimated by fitting data from gene clone libraries and extrapolating from this to taxa-abundance curves to estimate richness. However, since sample sizes are small, we cannot be sure that such samples are representative of the community from which they were drawn. It is however possible to formulate, and calibrate, models that predict the diversity of local communities and of samples drawn from that local community. The calibration of such models suggests that migration rates are small and decrease as the community gets larger. The preliminary predictions of the model are qualitatively consistent with the patterns seen in clone libraries in ‘real life’. The validation of this model is also confounded by small sample sizes. However, if such models were properly validated, they could form invaluable tools for the prediction of microbial diversity and a basis for the systematic exploration of microbial diversity on the planet. PMID:17028084

  15. Development of a copula-based particle filter (CopPF) approach for hydrologic data assimilation under consideration of parameter interdependence

    NASA Astrophysics Data System (ADS)

    Fan, Y. R.; Huang, G. H.; Baetz, B. W.; Li, Y. P.; Huang, K.

    2017-06-01

    In this study, a copula-based particle filter (CopPF) approach was developed for sequential hydrological data assimilation by considering parameter correlation structures. In CopPF, multivariate copulas are proposed to reflect parameter interdependence before the resampling procedure with new particles then being sampled from the obtained copulas. Such a process can overcome both particle degeneration and sample impoverishment. The applicability of CopPF is illustrated with three case studies using a two-parameter simplified model and two conceptual hydrologic models. The results for the simplified model indicate that model parameters are highly correlated in the data assimilation process, suggesting a demand for full description of their dependence structure. Synthetic experiments on hydrologic data assimilation indicate that CopPF can rejuvenate particle evolution in large spaces and thus achieve good performances with low sample size scenarios. The applicability of CopPF is further illustrated through two real-case studies. It is shown that, compared with traditional particle filter (PF) and particle Markov chain Monte Carlo (PMCMC) approaches, the proposed method can provide more accurate results for both deterministic and probabilistic prediction with a sample size of 100. Furthermore, the sample size would not significantly influence the performance of CopPF. Also, the copula resampling approach dominates parameter evolution in CopPF, with more than 50% of particles sampled by copulas in most sample size scenarios.

  16. Model selection with multiple regression on distance matrices leads to incorrect inferences.

    PubMed

    Franckowiak, Ryan P; Panasci, Michael; Jarvis, Karl J; Acuña-Rodriguez, Ian S; Landguth, Erin L; Fortin, Marie-Josée; Wagner, Helene H

    2017-01-01

    In landscape genetics, model selection procedures based on Information Theoretic and Bayesian principles have been used with multiple regression on distance matrices (MRM) to test the relationship between multiple vectors of pairwise genetic, geographic, and environmental distance. Using Monte Carlo simulations, we examined the ability of model selection criteria based on Akaike's information criterion (AIC), its small-sample correction (AICc), and the Bayesian information criterion (BIC) to reliably rank candidate models when applied with MRM while varying the sample size. The results showed a serious problem: all three criteria exhibit a systematic bias toward selecting unnecessarily complex models containing spurious random variables and erroneously suggest a high level of support for the incorrectly ranked best model. These problems effectively increased with increasing sample size. The failure of AIC, AICc, and BIC was likely driven by the inflated sample size and different sum-of-squares partitioned by MRM, and the resulting effect on delta values. Based on these findings, we strongly discourage the continued application of AIC, AICc, and BIC for model selection with MRM.
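
    For readers unfamiliar with the three criteria named in this abstract, the following is a minimal sketch of their textbook definitions and of the delta values used to rank candidate models. It is not the authors' MRM code; the function names and the example numbers are hypothetical, and the log-likelihoods are assumed to come from some already-fitted models.

```python
import math


def information_criteria(log_likelihood, k, n):
    """Return (AIC, AICc, BIC) for a model with k estimated parameters
    fitted to n observations, using the standard textbook definitions."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction
    bic = k * math.log(n) - 2 * log_likelihood
    return aic, aicc, bic


def delta_values(scores):
    """Difference between each model's score and the best (lowest) score."""
    best = min(scores)
    return [s - best for s in scores]


# Hypothetical example: three candidate models fitted to the same 50 observations.
candidates = [(-100.0, 3), (-99.5, 4), (-99.4, 5)]  # (log-likelihood, k)
aic_scores = [information_criteria(ll, k, 50)[0] for ll, k in candidates]
print(delta_values(aic_scores))
```

    Note that the value passed as n matters: the AICc correction and the BIC penalty both depend on it, which is one place where the "inflated sample size" mentioned in the abstract could enter.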

  17. The interplay of various sources of noise on reliability of species distribution models hinges on ecological specialisation.

    PubMed

    Soultan, Alaaeldin; Safi, Kamran

    2017-01-01

    Digitized species occurrence data provide an unprecedented source of information for ecologists and conservationists. Species distribution modelling (SDM) has become a popular method to utilise these data for understanding the spatial and temporal distribution of species, and for modelling biodiversity patterns. Our objective is to study the impact of noise in species occurrence data (namely sample size and positional accuracy) on the performance and reliability of SDM, considering the multiplicative impact of SDM algorithms, species specialisation, and grid resolution. We created a set of four 'virtual' species characterized by different specialisation levels. For each of these species, we built the suitable habitat models using five algorithms at two grid resolutions, with varying sample sizes and different levels of positional accuracy. We assessed the performance and reliability of the SDM according to classic model evaluation metrics (Area Under the Curve and True Skill Statistic) and model agreement metrics (Overall Concordance Correlation Coefficient and geographic niche overlap) respectively. Our study revealed that species specialisation had by far the most dominant impact on the SDM. In contrast to previous studies, we found that for widespread species, low sample size and low positional accuracy were acceptable, and useful distribution ranges could be predicted with as few as 10 species occurrences. Range predictions for narrow-ranged species, however, were sensitive to sample size and positional accuracy, such that useful distribution ranges required at least 20 species occurrences. Against expectations, the MAXENT algorithm poorly predicted the distribution of specialist species at low sample size.

  18. Impact of crystalline defects and size on X-ray line broadening: A phenomenological approach for tetragonal SnO2 nanocrystals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Muhammed Shafi, P.; Chandra Bose, A., E-mail: acbose@nitt.edu

    2015-05-15

    Nanocrystalline tin oxide (SnO2) powders with different grain sizes were prepared by a chemical precipitation method. The reaction was carried out by varying the period of hydrolysis and the as-prepared samples were annealed at different temperatures. The samples were characterized using an X-ray powder diffractometer and transmission electron microscopy. The microstrain and crystallite size were calculated for all the samples by using Williamson-Hall (W-H) models, namely, the isotropic strain model (ISM), anisotropic strain model (ASM) and uniform deformation energy density model (UDEDM). The morphology and particle size were determined using TEM micrographs. The direction-dependent Young's modulus was modified as an equation relating elastic compliances (s_ij) and Miller indices of the lattice plane (hkl) for the tetragonal crystal system and also the equation for elastic compliance in terms of stiffness constants was derived. The changes in crystallite size and microstrain due to lattice defects were observed while varying the hydrolysis time and the annealing temperature. The dependence of crystallite size on lattice strain was studied. The results were correlated with the available studies on electrical properties using impedance spectroscopy.

  19. Modeling motor vehicle crashes using Poisson-gamma models: examining the effects of low sample mean values and small sample size on the estimation of the fixed dispersion parameter.

    PubMed

    Lord, Dominique

    2006-07-01

    There has been considerable research conducted on the development of statistical models for predicting crashes on highway facilities. Despite numerous advancements made for improving the estimation tools of statistical models, the most common probabilistic structure used for modeling motor vehicle crashes remains the traditional Poisson and Poisson-gamma (or Negative Binomial) distribution; when crash data exhibit over-dispersion, the Poisson-gamma model is usually the model favored by transportation safety modelers. Crash data collected for safety studies often have the unusual attributes of being characterized by low sample mean values. Studies have shown that the goodness-of-fit of statistical models produced from such datasets can be significantly affected. This issue has been defined as the "low mean problem" (LMP). Despite recent developments on methods to circumvent the LMP and test the goodness-of-fit of models developed using such datasets, no work has so far examined how the LMP affects the fixed dispersion parameter of Poisson-gamma models used for modeling motor vehicle crashes. The dispersion parameter plays an important role in many types of safety studies and should, therefore, be reliably estimated. The primary objective of this research project was to verify whether the LMP affects the estimation of the dispersion parameter and, if so, to determine the magnitude of the problem. The secondary objective consisted of determining the effects of an unreliably estimated dispersion parameter on common analyses performed in highway safety studies. To accomplish the objectives of the study, a series of Poisson-gamma distributions were simulated using different values describing the mean, the dispersion parameter, and the sample size. Three estimators commonly used by transportation safety modelers for estimating the dispersion parameter of Poisson-gamma models were evaluated: the method of moments, the weighted regression, and the maximum likelihood method. In an attempt to complement the outcome of the simulation study, Poisson-gamma models were fitted to crash data collected in Toronto, Ont. characterized by a low sample mean and small sample size. The study shows that a low sample mean combined with a small sample size can seriously affect the estimation of the dispersion parameter, no matter which estimator is used within the estimation process. The probability that the dispersion parameter is unreliably estimated increases significantly as the sample mean and sample size decrease. Consequently, the results show that an unreliably estimated dispersion parameter can significantly undermine empirical Bayes (EB) estimates as well as the estimation of confidence intervals for the gamma mean and predicted response. The paper ends with recommendations about minimizing the likelihood of producing Poisson-gamma models with an unreliable dispersion parameter for modeling motor vehicle crashes.
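
    The kind of simulation described in this abstract can be sketched as follows. This is an illustration only, not the study's code: it assumes the common parameterisation Var(Y) = mean + alpha * mean^2, uses only the method-of-moments estimator, and every function name and parameter value below is hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method,
    which is adequate for the small means of interest here."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_poisson_gamma(rng, mean, alpha, n):
    """Draw n counts with E[Y] = mean and Var[Y] = mean + alpha * mean**2
    by mixing a Poisson over a gamma-distributed rate."""
    shape = 1.0 / alpha
    scale = alpha * mean  # gamma mean = shape * scale = mean
    return [sample_poisson(rng, rng.gammavariate(shape, scale)) for _ in range(n)]


def mom_dispersion(counts):
    """Method-of-moments estimate of alpha: (s^2 - m) / m^2.
    Returns NaN when the sample mean is zero (all counts are zero)."""
    n = len(counts)
    m = sum(counts) / n
    if m == 0:
        return float("nan")
    s2 = sum((y - m) ** 2 for y in counts) / (n - 1)
    return (s2 - m) / m ** 2


# Compare a small, low-mean sample with a large, higher-mean sample.
rng = random.Random(2006)
small = simulate_poisson_gamma(rng, mean=0.5, alpha=0.5, n=50)
large = simulate_poisson_gamma(rng, mean=2.0, alpha=0.5, n=20000)
print("small sample estimate:", mom_dispersion(small))
print("large sample estimate:", mom_dispersion(large))
```

    Repeating the small-sample draw many times and looking at the spread of the estimates (including negative or undefined values) gives a rough feel for the instability the abstract reports.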

  20. Analysis of YBCO high temperature superconductor doped with silver nanoparticles and carbon nanotubes using Williamson-Hall and size-strain plot

    NASA Astrophysics Data System (ADS)

    Dadras, Sedigheh; Davoudiniya, Masoumeh

    2018-05-01

    This paper sets out to investigate and compare the effects of Ag nanoparticles and carbon nanotubes (CNTs) doping on the mechanical properties of Y1Ba2Cu3O7-δ (YBCO) high temperature superconductor. For this purpose, the pure and doped YBCO samples were synthesized by sol-gel method. The microstructural analysis of the samples is performed using X-ray diffraction (XRD). The crystalline size, lattice strain and stress of the pure and doped YBCO samples were estimated by modified forms of Williamson-Hall analysis (W-H), namely, uniform deformation model (UDM), uniform deformation stress model (UDSM) and the size-strain plot method (SSP). These results show that the crystalline size, lattice strain and stress of the YBCO samples declined by Ag nanoparticles and CNTs doping.
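
    As background for the uniform deformation model (UDM) named above, the standard Williamson-Hall relation is beta * cos(theta) = K * lambda / D + 4 * epsilon * sin(theta), so a straight-line fit of beta * cos(theta) against 4 * sin(theta) gives the strain epsilon as the slope and the crystallite size D from the intercept. The sketch below assumes peak widths already corrected for instrumental broadening and expressed in radians; the shape factor K = 0.9, the Cu K-alpha wavelength, and the peak positions are illustrative assumptions, not values taken from the paper.

```python
import math


def williamson_hall_udm(two_theta_deg, beta_rad, wavelength_nm, shape_factor=0.9):
    """Least-squares fit of beta*cos(theta) = K*lambda/D + 4*eps*sin(theta).

    Returns (crystallite_size_nm, strain)."""
    thetas = [math.radians(tt / 2.0) for tt in two_theta_deg]
    xs = [4.0 * math.sin(t) for t in thetas]
    ys = [b * math.cos(t) for b, t in zip(beta_rad, thetas)]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # lattice strain epsilon
    intercept = mean_y - slope * mean_x    # K * lambda / D
    return shape_factor * wavelength_nm / intercept, slope


# Synthetic check: build peak widths from a known size and strain, then recover them.
WAVELENGTH_NM = 0.15406                    # Cu K-alpha, assumed for illustration
TRUE_SIZE_NM, TRUE_STRAIN = 25.0, 0.002
peaks_deg = [23.0, 32.5, 40.3, 47.0, 58.2]  # hypothetical 2-theta positions
widths = [
    (0.9 * WAVELENGTH_NM / TRUE_SIZE_NM + 4.0 * TRUE_STRAIN * math.sin(math.radians(p / 2)))
    / math.cos(math.radians(p / 2))
    for p in peaks_deg
]
size_nm, strain = williamson_hall_udm(peaks_deg, widths, WAVELENGTH_NM)
print(size_nm, strain)
```

    With real diffraction data the points scatter about the line, and a poor linear fit is one reason the modified forms (UDSM, SSP) mentioned in the abstract are used.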

  1. Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.

    PubMed

    Hooper, Richard; Teerenstra, Steven; de Hoop, Esther; Eldridge, Sandra

    2016-11-20

    The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least two levels of clustering: the clusters themselves and times within clusters. We derive formulae for sample size for repeated cross-section and closed cohort cluster randomised trials with normally distributed outcome measures, under a multilevel model allowing for variation between clusters and between times within clusters. Our formulae agree with those previously described for special cases such as crossover and analysis of covariance designs, although simulation suggests that the formulae could underestimate required sample size when the number of clusters is small. Whether using a formula or simulation, a sample size calculation requires estimates of nuisance parameters, which in our model include the intracluster correlation, cluster autocorrelation, and individual autocorrelation. A cluster autocorrelation less than 1 reflects a situation where individuals sampled from the same cluster at different times have less correlated outcomes than individuals sampled from the same cluster at the same time. Nuisance parameters could be estimated from time series obtained in similarly clustered settings with the same outcome measure, using analysis of variance to estimate variance components. Copyright © 2016 John Wiley & Sons, Ltd.
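
    The inflation mentioned in the first sentence of this abstract can be illustrated with the familiar design effect for a simple parallel cluster randomised trial, 1 + (m - 1) * ICC for clusters of size m. The sketch below shows only that baseline case with a normal-approximation sample size; it does not reproduce the longitudinal formulae derived in the paper, which additionally involve the cluster and individual autocorrelations. All function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm_individual(delta, sigma, alpha=0.05, power=0.80):
    """Per-arm sample size (normal approximation, two-sided test) for comparing
    two means under individual randomisation; returned unrounded."""
    z = NormalDist().inv_cdf
    return 2.0 * ((z(1.0 - alpha / 2.0) + z(power)) * sigma / delta) ** 2


def design_effect(cluster_size, icc):
    """Inflation factor for equal-sized clusters with intracluster correlation icc."""
    return 1.0 + (cluster_size - 1) * icc


def clusters_per_arm(delta, sigma, cluster_size, icc, alpha=0.05, power=0.80):
    """Number of clusters per arm after inflating by the design effect."""
    n = n_per_arm_individual(delta, sigma, alpha, power) * design_effect(cluster_size, icc)
    return math.ceil(n / cluster_size)


# Hypothetical planning values: detect a difference of 0.5 SD with clusters of 20.
print(math.ceil(n_per_arm_individual(0.5, 1.0)))    # individuals per arm, no clustering
print(design_effect(20, 0.05))                      # inflation factor
print(clusters_per_arm(0.5, 1.0, 20, 0.05))         # clusters per arm
```

    As the abstract cautions, formula-based answers of this kind may be optimistic when the number of clusters is small, so a simulation check is a sensible complement.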

  2. Using Structural Equation Modeling to Assess Functional Connectivity in the Brain: Power and Sample Size Considerations

    ERIC Educational Resources Information Center

    Sideridis, Georgios; Simos, Panagiotis; Papanicolaou, Andrew; Fletcher, Jack

    2014-01-01

    The present study assessed the impact of sample size on the power and fit of structural equation modeling applied to functional brain connectivity hypotheses. The data consisted of time-constrained minimum norm estimates of regional brain activity during performance of a reading task obtained with magnetoencephalography. Power analysis was first…

  3. Effect of dislocation pile-up on size-dependent yield strength in finite single-crystal micro-samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pan, Bo; Shibutani, Yoji, E-mail: sibutani@mech.eng.osaka-u.ac.jp; Zhang, Xu

    2015-07-07

    Recent research has explained that the steeply increasing yield strength in metals depends on decreasing sample size. In this work, we derive a statistical physical model of the yield strength of finite single-crystal micro-pillars that depends on single-ended dislocation pile-up inside the micro-pillars. We show that this size effect can be explained almost completely by considering the stochastic lengths of the dislocation source and the dislocation pile-up length in the single-crystal micro-pillars. The Hall–Petch-type relation holds even in a microscale single-crystal, which is characterized by its dislocation source lengths. Our quantitative conclusions suggest that the number of dislocation sources and pile-ups are significant factors for the size effect. They also indicate that starvation of dislocation sources is another reason for the size effect. Moreover, we investigated the explicit relationship between the stacking fault energy and the dislocation “pile-up” effect inside the sample: materials with low stacking fault energy exhibit an obvious dislocation pile-up effect. Our proposed physical model predicts a sample strength that agrees well with experimental data, and our model can give a more precise prediction than the current single arm source model, especially for materials with low stacking fault energy.

  4. What is a species? A new universal method to measure differentiation and assess the taxonomic rank of allopatric populations, using continuous variables

    PubMed Central

    Donegan, Thomas M.

    2018-01-01

    Existing models for assigning species, subspecies, or no taxonomic rank to populations which are geographically separated from one another were analyzed. This was done by subjecting over 3,000 pairwise comparisons of vocal or biometric data based on birds to a variety of statistical tests that have been proposed as measures of differentiation. One current model which aims to test diagnosability (Isler et al. 1998) is highly conservative, applying a hard cut-off, which excludes from consideration differentiation below diagnosis. It also includes non-overlap as a requirement, a measure which penalizes increases to sample size. The “species scoring” model of Tobias et al. (2010) involves less drastic cut-offs, but unlike Isler et al. (1998), does not control adequately for sample size and attributes scores in many cases to differentiation which is not statistically significant. Four different models of assessing effect sizes were analyzed: using both pooled and unpooled standard deviations and controlling for sample size using t-distributions or omitting to do so. Pooled standard deviations produced more conservative effect sizes when uncontrolled for sample size but less conservative effect sizes when so controlled. Pooled models require assumptions to be made that are typically elusive or unsupported for taxonomic studies. Modifications to improve these frameworks are proposed, including: (i) introducing statistical significance as a gateway to attributing any weighting to findings of differentiation; (ii) abandoning non-overlap as a test; (iii) recalibrating Tobias et al. (2010) scores based on effect sizes controlled for sample size using t-distributions. A new universal method is proposed for measuring differentiation in taxonomy using continuous variables and a formula is proposed for ranking allopatric populations. This is based first on calculating effect sizes using unpooled standard deviations, controlled for sample size using t-distributions, for a series of different variables. All non-significant results are excluded by scoring them as zero. Distance between any two populations is calculated using Euclidean summation of non-zeroed effect size scores. If the score of an allopatric pair exceeds that of a related sympatric pair, then the allopatric population can be ranked as species and, if not, then at most subspecies rank should be assigned. A spreadsheet has been programmed and is being made available which allows this and other tests of differentiation and rank studied in this paper to be rapidly analyzed. PMID:29780266

  5. Performance of the likelihood ratio difference (G2 Diff) test for detecting unidimensionality in applications of the multidimensional Rasch model.

    PubMed

    Harrell-Williams, Leigh; Wolfe, Edward W

    2014-01-01

    Previous research has investigated the influence of sample size, model misspecification, test length, ability distribution offset, and generating model on the likelihood ratio difference test in applications of item response models. This study extended that research to the evaluation of dimensionality using the multidimensional random coefficients multinomial logit model (MRCMLM). Logistic regression analysis of simulated data reveals that sample size and test length have a large effect on the capacity of the LR difference test to correctly identify unidimensionality, with shorter tests and smaller sample sizes leading to smaller Type I error rates. Higher levels of simulated misfit resulted in fewer incorrect decisions than data with no or little misfit. However, Type I error rates indicate that the likelihood ratio difference test is not suitable under any of the simulated conditions for evaluating dimensionality in applications of the MRCMLM.

  6. DESCARTES' RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA.

    PubMed

    Bhaskar, Anand; Song, Yun S

    2014-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the "folded" SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform of piecewise continuous functions.
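
    To make the summary statistic concrete, the sketch below computes an unfolded and a folded SFS from a small matrix of aligned sequences coded 0 (ancestral) and 1 (derived). It illustrates the definition only and is unrelated to the identifiability proofs in the paper; the input encoding and function names are assumptions made for this example.

```python
def sample_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry i-1 counts the segregating sites at which the derived
    allele (coded 1) appears in exactly i of the n sampled sequences."""
    n = len(haplotypes)
    sfs = [0] * (n - 1)
    for site in zip(*haplotypes):      # iterate over alignment columns
        derived = sum(site)
        if 0 < derived < n:            # skip sites that are not segregating
            sfs[derived - 1] += 1
    return sfs


def fold(sfs):
    """Folded SFS: pool classes i and n - i, for use when the ancestral
    allele is unknown; entry j-1 counts sites with minor allele count j."""
    n = len(sfs) + 1
    folded = [0] * (n // 2)
    for i, count in enumerate(sfs, start=1):
        folded[min(i, n - i) - 1] += count
    return folded


# Four sampled sequences (rows) at five sites (columns), hypothetical data.
sample = [
    [0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0],
]
unfolded = sample_frequency_spectrum(sample)
print(unfolded, fold(unfolded))
```

    The second column is fixed for the derived allele in this sample, so it contributes to neither spectrum.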

  7. DESCARTES’ RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA

    PubMed Central

    Bhaskar, Anand; Song, Yun S.

    2016-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the “folded” SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes’ rule of signs for polynomials to the Laplace transform of piecewise continuous functions. PMID:28018011

  8. Sampling through time and phylodynamic inference with coalescent and birth–death models

    PubMed Central

    Volz, Erik M.; Frost, Simon D. W.

    2014-01-01

    Many population genetic models have been developed for the purpose of inferring population size and growth rates from random samples of genetic data. We examine two popular approaches to this problem, the coalescent and the birth–death-sampling model (BDM), in the context of estimating population size and birth rates in a population growing exponentially according to the birth–death branching process. For sequences sampled at a single time, we found the coalescent and the BDM gave virtually indistinguishable results in terms of the growth rates and fraction of the population sampled, even when sampling from a small population. For sequences sampled at multiple time points, we find that the birth–death model estimators are subject to large bias if the sampling process is misspecified. Since BDMs incorporate a model of the sampling process, we show how much of the statistical power of BDMs arises from the sequence of sample times and not from the genealogical tree. This motivates the development of a new coalescent estimator, which is augmented with a model of the known sampling process and is potentially more precise than the coalescent that does not use sample time information. PMID:25401173

  9. Meta-analysis of genome-wide association from genomic prediction models

    USDA-ARS's Scientific Manuscript database

    A limitation of many genome-wide association studies (GWA) in animal breeding is that there are many loci with small effect sizes; thus, larger sample sizes (N) are required to guarantee suitable power of detection. To increase sample size, results from different GWA can be combined in a meta-analys...

  10. Generalized Sample Size Determination Formulas for Investigating Contextual Effects by a Three-Level Random Intercept Model.

    PubMed

    Usami, Satoshi

    2017-03-01

    Behavioral and psychological researchers have shown strong interest in investigating contextual effects (i.e., the influences of combinations of individual- and group-level predictors on individual-level outcomes). The present research provides generalized formulas for determining the sample size needed in investigating contextual effects according to the desired level of statistical power as well as the width of the confidence interval. These formulas are derived within a three-level random intercept model that includes one predictor/contextual variable at each level to simultaneously cover the various kinds of contextual effects in which researchers may be interested. The relative influences of indices included in the formulas on the standard errors of contextual effects estimates are investigated with the aim of further simplifying sample size determination procedures. In addition, simulation studies are performed to investigate finite sample behavior of calculated statistical power, showing that estimated sample sizes based on derived formulas can be both positively and negatively biased due to complex effects of unreliability of contextual variables, multicollinearity, and violation of the assumption regarding the known variances. Thus, it is advisable to compare estimated sample sizes under various specifications of indices and to evaluate their potential bias, as illustrated in the example.

  11. Optimal sample sizes for the design of reliability studies: power consideration.

    PubMed

    Shieh, Gwowen

    2014-09-01

    Intraclass correlation coefficients are used extensively to measure the reliability or degree of resemblance among group members in multilevel research. This study concerns the problem of the necessary sample size to ensure adequate statistical power for hypothesis tests concerning the intraclass correlation coefficient in the one-way random-effects model. In view of the incomplete and problematic numerical results in the literature, the approximate sample size formula constructed from Fisher's transformation is reevaluated and compared with an exact approach across a wide range of model configurations. These comprehensive examinations showed that the Fisher transformation method is appropriate only under limited circumstances, and therefore it is not recommended as a general method in practice. For advance design planning of reliability studies, the exact sample size procedures are fully described and illustrated for various allocation and cost schemes. Corresponding computer programs are also developed to implement the suggested algorithms.

  12. Advanced hierarchical distance sampling

    USGS Publications Warehouse

    Royle, Andy

    2016-01-01

    In this chapter, we cover a number of important extensions of the basic hierarchical distance-sampling (HDS) framework from Chapter 8. First, we discuss the inclusion of “individual covariates,” such as group size, in the HDS model. This is important in many surveys where animals form natural groups that are the primary observation unit, with the size of the group expected to have some influence on detectability. We also discuss HDS integrated with time-removal and double-observer or capture-recapture sampling. These “combined protocols” can be formulated as HDS models with individual covariates, and thus they have a commonality with HDS models involving group structure (group size being just another individual covariate). We cover several varieties of open-population HDS models that accommodate population dynamics. On one end of the spectrum, we cover models that allow replicate distance sampling surveys within a year, which estimate abundance relative to availability and temporary emigration through time. We consider a robust design version of that model. We then consider models with explicit dynamics based on the Dail and Madsen (2011) model and the work of Sollmann et al. (2015). The final major theme of this chapter is relatively newly developed spatial distance sampling models that accommodate explicit models describing the spatial distribution of individuals known as Point Process models. We provide novel formulations of spatial DS and HDS models in this chapter, including implementations of those models in the unmarked package using a hack of the pcount function for N-mixture models.

  13. RnaSeqSampleSize: real data based sample size estimation for RNA sequencing.

    PubMed

    Zhao, Shilin; Li, Chung-I; Guo, Yan; Sheng, Quanhu; Shyr, Yu

    2018-05-30

    One of the most important and often neglected components of a successful RNA sequencing (RNA-Seq) experiment is sample size estimation. A few negative binomial model-based methods have been developed to estimate sample size based on the parameters of a single gene. However, thousands of genes are quantified and tested for differential expression simultaneously in RNA-Seq experiments. Thus, additional issues should be carefully addressed, including the false discovery rate for multiple statistical tests, and widely distributed read counts and dispersions for different genes. To solve these issues, we developed a sample size and power estimation method named RnaSeqSampleSize, based on the distributions of gene average read counts and dispersions estimated from real RNA-Seq data. Datasets from previous, similar experiments such as the Cancer Genome Atlas (TCGA) can be used as a point of reference. Read counts and their dispersions were estimated from the reference's distribution; using that information, we estimated and summarized the power and sample size. RnaSeqSampleSize is implemented in the R language and can be installed from the Bioconductor website. A user-friendly graphical web interface is provided at http://cqs.mc.vanderbilt.edu/shiny/RnaSeqSampleSize/. RnaSeqSampleSize provides a convenient and powerful way to estimate power and sample size for an RNA-Seq experiment. It is also equipped with several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.

  14. Global Sensitivity Analysis of Environmental Models: Convergence, Robustness and Validation

    NASA Astrophysics Data System (ADS)

    Sarrazin, Fanny; Pianosi, Francesca; Khorashadi Zadeh, Farkhondeh; Van Griensven, Ann; Wagener, Thorsten

    2015-04-01

    Global Sensitivity Analysis aims to characterize the impact that variations in model input factors (e.g. the parameters) have on the model output (e.g. simulated streamflow). In sampling-based Global Sensitivity Analysis, the sample size has to be chosen carefully in order to obtain reliable sensitivity estimates while spending computational resources efficiently. Furthermore, insensitive parameters are typically identified through the definition of a screening threshold: the theoretical value of their sensitivity index is zero but in a sampling-based framework they regularly take non-zero values. However, little guidance is available for these two steps in environmental modelling. The objective of the present study is to support modellers in making appropriate choices, regarding both sample size and screening threshold, so that a robust sensitivity analysis can be implemented. We performed sensitivity analysis for the parameters of three hydrological models with increasing level of complexity (Hymod, HBV and SWAT), and tested three widely used sensitivity analysis methods (Elementary Effect Test or method of Morris, Regional Sensitivity Analysis, and Variance-Based Sensitivity Analysis). We defined criteria based on a bootstrap approach to assess three different types of convergence: the convergence of the value of the sensitivity indices, of the ranking (the ordering among the parameters) and of the screening (the identification of the insensitive parameters). We investigated the screening threshold through the definition of a validation procedure. The results showed that full convergence of the value of the sensitivity indices is not necessarily needed to rank or to screen the model input factors. Furthermore, typical values of the sample sizes that are reported in the literature can be well below the sample sizes that actually ensure convergence of ranking and screening.

  15. Internal pilots for a class of linear mixed models with Gaussian and compound symmetric data

    PubMed Central

    Gurka, Matthew J.; Coffey, Christopher S.; Muller, Keith E.

    2015-01-01

    SUMMARY An internal pilot design uses interim sample size analysis, without interim data analysis, to adjust the final number of observations. The approach helps to choose a sample size sufficiently large (to achieve the statistical power desired), but not too large (which would waste money and time). We report on recent research in cerebral vascular tortuosity (curvature in three dimensions) which would benefit greatly from internal pilots due to uncertainty in the parameters of the covariance matrix used for study planning. Unfortunately, observations correlated across the four regions of the brain and small sample sizes preclude using existing methods. However, as in a wide range of medical imaging studies, tortuosity data have no missing or mistimed data, a factorial within-subject design, the same between-subject design for all responses, and a Gaussian distribution with compound symmetry. For such restricted models, we extend exact, small sample univariate methods for internal pilots to linear mixed models with any between-subject design (not just two groups). Planning a new tortuosity study illustrates how the new methods help to avoid sample sizes that are too small or too large while still controlling the type I error rate. PMID:17318914

  16. Valid approximation of spatially distributed grain size distributions - A priori information encoded to a feedforward network

    NASA Astrophysics Data System (ADS)

    Berthold, T.; Milbradt, P.; Berkhahn, V.

    2018-04-01

    This paper presents a model for the approximation of multiple, spatially distributed grain size distributions based on a feedforward neural network. Since a classical feedforward network does not guarantee to produce valid cumulative distribution functions, a priori information is incorporated into the model by applying weight and architecture constraints. The model is derived in two steps. First, a model is presented that is able to produce a valid distribution function for a single sediment sample. Although initially developed for sediment samples, the model is not limited in its application; it can also be used to approximate any other multimodal continuous distribution function. In the second part, the network is extended in order to capture the spatial variation of the sediment samples that have been obtained from 48 locations in the investigation area. Results show that the model provides an adequate approximation of grain size distributions, satisfying the requirements of a cumulative distribution function.

  17. Numerical calculations of spectral turnover and synchrotron self-absorption in CSS and GPS radio sources

    NASA Astrophysics Data System (ADS)

    Jeyakumar, S.

    2016-06-01

    The dependence of the turnover frequency on the linear size is presented for a sample of Gigahertz Peaked Spectrum and Compact Steep Spectrum radio sources derived from complete samples. The dependence of the luminosity of the emission at the peak frequency on the linear size and the peak frequency is also presented for the galaxies in the sample. The luminosity of the smaller sources evolves strongly with the linear size. Optical depth effects have been included in the 3D model for the radio source of Kaiser to study the spectral turnover. Using this model, the observed trend can be explained by synchrotron self-absorption. The observed trend in the peak-frequency-linear-size plane is not affected by the luminosity evolution of the sources.

  18. Nomogram for sample size calculation on a straightforward basis for the kappa statistic.

    PubMed

    Hong, Hyunsook; Choi, Yunhee; Hahn, Seokyung; Park, Sue Kyung; Park, Byung-Joo

    2014-09-01

    Kappa is a widely used measure of agreement. However, it may not be straightforward in some situations such as sample size calculation due to the kappa paradox: high agreement but low kappa. Hence, it seems reasonable in sample size calculation that the level of agreement under a certain marginal prevalence is considered in terms of a simple proportion of agreement rather than a kappa value. Therefore, sample size formulae and nomograms using a simple proportion of agreement rather than a kappa under certain marginal prevalences are proposed. A sample size formula was derived using the kappa statistic under the common correlation model and goodness-of-fit statistic. The nomogram for the sample size formula was developed using SAS 9.3. The sample size formulae using a simple proportion of agreement instead of a kappa statistic and nomograms to eliminate the inconvenience of using a mathematical formula were produced. A nomogram for sample size calculation with a simple proportion of agreement should be useful in the planning stages when the focus of interest is on testing the hypothesis of interobserver agreement involving two raters and nominal outcome measures. Copyright © 2014 Elsevier Inc. All rights reserved.
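
    The kappa paradox mentioned in this abstract can be reproduced with a few lines of arithmetic. The sketch below uses the standard definition of Cohen's kappa for two raters and a binary outcome; it is not the paper's sample size formula, and the function name and the two example tables are assumptions chosen for illustration.

```python
def agreement_and_kappa(a, b, c, d):
    """Observed agreement and Cohen's kappa for a 2x2 table.

    a = both raters positive, d = both raters negative,
    b = rater 1 positive only, c = rater 2 positive only.
    Undefined (division by zero) when chance agreement equals 1.
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    p1_pos = (a + b) / n  # marginal prevalence according to rater 1
    p2_pos = (a + c) / n  # marginal prevalence according to rater 2
    p_chance = p1_pos * p2_pos + (1 - p1_pos) * (1 - p2_pos)
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, kappa


# Two hypothetical tables with the same 90% raw agreement.
balanced = agreement_and_kappa(40, 5, 5, 50)  # prevalence near 50%
skewed = agreement_and_kappa(90, 5, 5, 0)     # prevalence near 95%
```

    Both tables show 90% agreement, yet kappa is about 0.80 for the balanced table and slightly negative for the skewed one, which is the situation that motivates planning on the proportion of agreement at a stated marginal prevalence.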

  19. Parameter Estimation with Small Sample Size: A Higher-Order IRT Model Approach

    ERIC Educational Resources Information Center

    de la Torre, Jimmy; Hong, Yuan

    2010-01-01

    Sample size ranks as one of the most important factors that affect the item calibration task. However, due to practical concerns (e.g., item exposure) items are typically calibrated with much smaller samples than what is desired. To address the need for a more flexible framework that can be used in small sample item calibration, this article…

  20. Discriminant Analysis of Defective and Non-Defective Field Pea (Pisum sativum L.) into Broad Market Grades Based on Digital Image Features.

    PubMed

    McDonald, Linda S; Panozzo, Joseph F; Salisbury, Phillip A; Ford, Rebecca

    2016-01-01

    Field peas (Pisum sativum L.) are generally traded based on seed appearance, which subjectively defines broad market-grades. In this study, we developed an objective Linear Discriminant Analysis (LDA) model to classify market grades of field peas based on seed colour, shape and size traits extracted from digital images. Seeds were imaged in a high-throughput system consisting of a camera and laser positioned over a conveyor belt. Six colour intensity digital images were captured (under 405, 470, 530, 590, 660 and 850nm light) for each seed, and surface height was measured at each pixel by laser. Colour, shape and size traits were compiled across all seed in each sample to determine the median trait values. Defective and non-defective seed samples were used to calibrate and validate the model. Colour components were sufficient to correctly classify all non-defective seed samples into correct market grades. Defective samples required a combination of colour, shape and size traits to achieve 87% and 77% accuracy in market grade classification of calibration and validation sample-sets respectively. Following these results, we used the same colour, shape and size traits to develop an LDA model which correctly classified over 97% of all validation samples as defective or non-defective.

  1. Discriminant Analysis of Defective and Non-Defective Field Pea (Pisum sativum L.) into Broad Market Grades Based on Digital Image Features

    PubMed Central

    McDonald, Linda S.; Panozzo, Joseph F.; Salisbury, Phillip A.; Ford, Rebecca

    2016-01-01

    Field peas (Pisum sativum L.) are generally traded based on seed appearance, which subjectively defines broad market-grades. In this study, we developed an objective Linear Discriminant Analysis (LDA) model to classify market grades of field peas based on seed colour, shape and size traits extracted from digital images. Seeds were imaged in a high-throughput system consisting of a camera and laser positioned over a conveyor belt. Six colour intensity digital images were captured (under 405, 470, 530, 590, 660 and 850nm light) for each seed, and surface height was measured at each pixel by laser. Colour, shape and size traits were compiled across all seed in each sample to determine the median trait values. Defective and non-defective seed samples were used to calibrate and validate the model. Colour components were sufficient to correctly classify all non-defective seed samples into correct market grades. Defective samples required a combination of colour, shape and size traits to achieve 87% and 77% accuracy in market grade classification of calibration and validation sample-sets respectively. Following these results, we used the same colour, shape and size traits to develop an LDA model which correctly classified over 97% of all validation samples as defective or non-defective. PMID:27176469

  2. Non-destructive crystal size determination in geological samples of archaeological use by means of infrared spectroscopy.

    PubMed

    Olivares, M; Larrañaga, A; Irazola, M; Sarmiento, A; Murelaga, X; Etxebarria, N

    2012-08-30

    The determination of crystal size of chert samples can provide suitable information about the raw material used for the manufacture of archeological items. X-ray diffraction (XRD) has been widely used for this purpose in several scientific areas. However, the historical value of archeological pieces makes this procedure sometimes unfeasible and thus, non-invasive new analytical approaches are required. In this sense, a new method was developed relating the crystal size obtained by means of XRD and infrared spectroscopy (IR) using partial least squares regression. The IR spectra collected from a large number of different geological chert samples of archeological use were pre-processed following different treatments (i.e., derivatization or sample-wise normalization) to obtain the best regression model. The full cross-validation was satisfactorily validated using real samples and the experimental root mean standard error of precision value was 165 Å whereas the average precision of the estimated size value was 3%. The features of infrared bands were also evaluated in order to understand the basis of the predictive ability. In the studied case, the variance in the model was associated with the differences in the characteristic stretching and bending infrared bands of SiO2. Based on this fact, it would be feasible to estimate the crystal size if a chemometric model relating the size measured by standard methods and the IR spectra is built beforehand. Copyright © 2012 Elsevier B.V. All rights reserved.

  3. Sample size determination for mediation analysis of longitudinal data.

    PubMed

    Pan, Haitao; Liu, Suyu; Miao, Danmin; Yuan, Ying

    2018-03-27

    Sample size planning for longitudinal data is crucial when designing mediation studies because sufficient statistical power is not only required in grant applications and peer-reviewed publications, but is essential to reliable research results. However, sample size determination is not straightforward for mediation analysis of longitudinal designs. To facilitate planning the sample size for longitudinal mediation studies with a multilevel mediation model, this article provides the sample size required to achieve 80% power by simulations under various sizes of the mediation effect, within-subject correlations and numbers of repeated measures. The sample size calculation is based on three commonly used mediation tests: Sobel's method, distribution of product method and the bootstrap method. Among the three methods of testing the mediation effects, Sobel's method required the largest sample size to achieve 80% power. Bootstrapping and the distribution of the product method performed similarly and were more powerful than Sobel's method, as reflected by the relatively smaller sample sizes. For all three methods, the sample size required to achieve 80% power depended on the value of the ICC (i.e., within-subject correlation). A larger value of ICC typically required a larger sample size to achieve 80% power. Simulation results also illustrated the advantage of the longitudinal study design. The sample size tables for the scenarios most often encountered in practice have also been published for convenient use. Extensive simulation studies showed that the distribution of the product method and the bootstrapping method outperform Sobel's method, but the product method is recommended in practice because it requires less computation time than the bootstrapping method. An R package has been developed for the product method of sample size determination in longitudinal mediation study designs.

  4. The effective elastic properties of human trabecular bone may be approximated using micro-finite element analyses of embedded volume elements.

    PubMed

    Daszkiewicz, Karol; Maquer, Ghislain; Zysset, Philippe K

    2017-06-01

    Boundary conditions (BCs) and sample size affect the measured elastic properties of cancellous bone. Samples too small to be representative appear stiffer under kinematic uniform BCs (KUBCs) than under periodicity-compatible mixed uniform BCs (PMUBCs). To avoid those effects, we propose to determine the effective properties of trabecular bone using an embedded configuration. Cubic samples of various sizes (2.63, 5.29, 7.96, 10.58 and 15.87 mm) were cropped from [Formula: see text] scans of femoral heads and vertebral bodies. They were converted into [Formula: see text] models and their stiffness tensor was established via six uniaxial and shear load cases. PMUBCs- and KUBCs-based tensors were determined for each sample. "In situ" stiffness tensors were also evaluated for the embedded configuration, i.e. when the loads were transmitted to the samples via a layer of trabecular bone. The Zysset-Curnier model accounting for bone volume fraction and fabric anisotropy was fitted to those stiffness tensors, and model parameters [Formula: see text] (Poisson's ratio), [Formula: see text] and [Formula: see text] (elastic and shear moduli) were compared between sizes. BCs and sample size had little impact on [Formula: see text]. However, KUBCs- and PMUBCs-based [Formula: see text] and [Formula: see text], respectively, decreased and increased with growing size, though convergence was not reached even for our largest samples. Both BCs produced upper and lower bounds for the in situ values that were almost constant across sample dimensions, thus appearing as an approximation of the effective properties. PMUBCs seem also appropriate for mimicking the trabecular core, but they still underestimate its elastic properties (especially in shear) even for nearly orthotropic samples.

  5. The Relationship between National-Level Carbon Dioxide Emissions and Population Size: An Assessment of Regional and Temporal Variation, 1960–2005

    PubMed Central

    Jorgenson, Andrew K.; Clark, Brett

    2013-01-01

    This study examines the regional and temporal differences in the statistical relationship between national-level carbon dioxide emissions and national-level population size. The authors analyze panel data from 1960 to 2005 for a diverse sample of nations, and employ descriptive statistics and rigorous panel regression modeling techniques. Initial descriptive analyses indicate that all regions experienced overall increases in carbon emissions and population size during the 45-year period of investigation, but with notable differences. For carbon emissions, the sample of countries in Asia experienced the largest percent increase, followed by countries in Latin America, Africa, and lastly the sample of relatively affluent countries in Europe, North America, and Oceania combined. For population size, the sample of countries in Africa experienced the largest percent increase, followed by countries in Latin America, Asia, and the combined sample of countries in Europe, North America, and Oceania. Findings for two-way fixed effects panel regression elasticity models of national-level carbon emissions indicate that the estimated elasticity coefficient for population size is much smaller for nations in Africa than for nations in other regions of the world. Regarding potential temporal changes, from 1960 to 2005 the estimated elasticity coefficient for population size decreased by 25% for the sample of African countries, 14% for the sample of Asian countries, 6.5% for the sample of Latin American countries, but remained the same in size for the sample of countries in Europe, North America, and Oceania. Overall, while population size continues to be the primary driver of total national-level anthropogenic carbon dioxide emissions, the findings of this study highlight the need for future research and policies to recognize that the actual impacts of population size on national-level carbon emissions differ across both time and region. PMID:23437323

  6. Estimation and applications of size-biased distributions in forestry

    Treesearch

    Jeffrey H. Gove

    2003-01-01

    Size-biased distributions arise naturally in several contexts in forestry and ecology. Simple power relationships (e.g. basal area and diameter at breast height) between variables are one such area of interest arising from a modelling perspective. Another, probability proportional to size (PPS) sampling, is found in the most widely used methods for sampling standing or...

  7. Evaluating multi-level models to test occupancy state responses of Plethodontid salamanders

    USGS Publications Warehouse

    Kroll, Andrew J.; Garcia, Tiffany S.; Jones, Jay E.; Dugger, Catherine; Murden, Blake; Johnson, Josh; Peerman, Summer; Brintz, Ben; Rochelle, Michael

    2015-01-01

    Plethodontid salamanders are diverse and widely distributed taxa and play critical roles in ecosystem processes. Due to salamander use of structurally complex habitats, and because only a portion of a population is available for sampling, evaluation of sampling designs and estimators is critical to provide strong inference about Plethodontid ecology and responses to conservation and management activities. We conducted a simulation study to evaluate the effectiveness of multi-scale and hierarchical single-scale occupancy models in the context of a Before-After Control-Impact (BACI) experimental design with multiple levels of sampling. Also, we fit the hierarchical single-scale model to empirical data collected for Oregon slender and Ensatina salamanders across two years on 66 forest stands in the Cascade Range, Oregon, USA. All models were fit within a Bayesian framework. Estimator precision in both models improved with increasing numbers of primary and secondary sampling units, underscoring the potential gains accrued when adding secondary sampling units. Both models showed evidence of estimator bias at low detection probabilities and low sample sizes; this problem was particularly acute for the multi-scale model. Our results suggested that sufficient sample sizes at both the primary and secondary sampling levels could ameliorate this issue. Empirical data indicated Oregon slender salamander occupancy was associated strongly with the amount of coarse woody debris (posterior mean = 0.74; SD = 0.24); Ensatina occupancy was not associated with amount of coarse woody debris (posterior mean = -0.01; SD = 0.29). Our simulation results indicate that either model is suitable for use in an experimental study of Plethodontid salamanders provided that sample sizes are sufficiently large. However, hierarchical single-scale and multi-scale models describe different processes and estimate different parameters. As a result, we recommend careful consideration of study questions and objectives prior to sampling data and fitting models.

  8. On the repeated measures designs and sample sizes for randomized controlled trials.

    PubMed

    Tango, Toshiro

    2016-04-01

    For the analysis of longitudinal or repeated measures data, generalized linear mixed-effects models provide a flexible and powerful tool to deal with heterogeneity among subject response profiles. However, the typical statistical design adopted in usual randomized controlled trials is an analysis of covariance type analysis using a pre-defined pair of "pre-post" data, in which pre-(baseline) data are used as a covariate for adjustment together with other covariates. Then, the major design issue is to calculate the sample size or the number of subjects allocated to each treatment group. In this paper, we propose a new repeated measures design and sample size calculations combined with generalized linear mixed-effects models that depend not only on the number of subjects but on the number of repeated measures before and after randomization per subject used for the analysis. The main advantages of the proposed design combined with the generalized linear mixed-effects models are (1) it can easily handle missing data by applying the likelihood-based ignorable analyses under the missing at random assumption and (2) it may lead to a reduction in sample size, compared with the simple pre-post design. The proposed designs and the sample size calculations are illustrated with real data arising from randomized controlled trials. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  9. Reporting of sample size calculations in analgesic clinical trials: ACTTION systematic review.

    PubMed

    McKeown, Andrew; Gewandter, Jennifer S; McDermott, Michael P; Pawlowski, Joseph R; Poli, Joseph J; Rothstein, Daniel; Farrar, John T; Gilron, Ian; Katz, Nathaniel P; Lin, Allison H; Rappaport, Bob A; Rowbotham, Michael C; Turk, Dennis C; Dworkin, Robert H; Smith, Shannon M

    2015-03-01

    Sample size calculations determine the number of participants required to have sufficiently high power to detect a given treatment effect. In this review, we examined the reporting quality of sample size calculations in 172 publications of double-blind randomized controlled trials of noninvasive pharmacologic or interventional (ie, invasive) pain treatments published in European Journal of Pain, Journal of Pain, and Pain from January 2006 through June 2013. Sixty-five percent of publications reported a sample size calculation but only 38% provided all elements required to replicate the calculated sample size. In publications reporting at least 1 element, 54% provided a justification for the treatment effect used to calculate sample size, and 24% of studies with continuous outcome variables justified the variability estimate. Publications of clinical pain condition trials reported a sample size calculation more frequently than experimental pain model trials (77% vs 33%, P < .001) but did not differ in the frequency of reporting all required elements. No significant differences in reporting of any or all elements were detected between publications of trials with industry and nonindustry sponsorship. Twenty-eight percent included a discrepancy between the reported number of planned and randomized participants. This study suggests that sample size calculation reporting in analgesic trial publications is usually incomplete. Investigators should provide detailed accounts of sample size calculations in publications of clinical trials of pain treatments, which is necessary for reporting transparency and communication of pre-trial design decisions. In this systematic review of analgesic clinical trials, sample size calculations and the required elements (eg, treatment effect to be detected; power level) were incompletely reported. A lack of transparency regarding sample size calculations may raise questions about the appropriateness of the calculated sample size. Copyright © 2015 American Pain Society. All rights reserved.
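
    The elements this abstract names (treatment effect, variability estimate, power level) are exactly the inputs a reader needs to replicate a calculation. As a minimal sketch, assuming a two-arm trial with a continuous outcome, equal allocation, a common standard deviation and a two-sided test under the normal approximation (none of which the abstract specifies), the per-arm sample size could be reproduced as follows. The function name `n_per_group` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants per arm for a two-sided comparison of two means.

    Assumes the normal approximation, equal allocation and a common SD:
        n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta) ** 2
    delta is the treatment effect to be detected, sd the variability estimate.
    """
    if delta == 0 or sd <= 0:
        raise ValueError("delta must be non-zero and sd must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from any reviewed trial.
    print(n_per_group(delta=1.0, sd=2.0))
```

    If any one of the four inputs is missing from a publication, the result cannot be recomputed, which is the sense in which the abstract calls such reporting incomplete.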

  10. Rule-of-thumb adjustment of sample sizes to accommodate dropouts in a two-stage analysis of repeated measurements.

    PubMed

    Overall, John E; Tonidandel, Scott; Starbuck, Robert R

    2006-01-01

    Recent contributions to the statistical literature have provided elegant model-based solutions to the problem of estimating sample sizes for testing the significance of differences in mean rates of change across repeated measures in controlled longitudinal studies with differentially correlated error and missing data due to dropouts. However, the mathematical complexity and model specificity of these solutions make them generally inaccessible to most applied researchers who actually design and undertake treatment evaluation research in psychiatry. In contrast, this article relies on a simple two-stage analysis in which dropout-weighted slope coefficients fitted to the available repeated measurements for each subject separately serve as the dependent variable for a familiar ANCOVA test of significance for differences in mean rates of change. This article is about how a sample size that is estimated or calculated to provide desired power for testing that hypothesis without considering dropouts can be adjusted appropriately to take dropouts into account. Empirical results support the conclusion that, whatever reasonable level of power would be provided by a given sample size in the absence of dropouts, essentially the same power can be realized in the presence of dropouts simply by adding to the original dropout-free sample size the number of subjects who would be expected to drop from a sample of that original size under conditions of the proposed study.
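
    The rule of thumb in the last sentence is simple arithmetic, and a short sketch may make it concrete. The function name `adjust_for_dropouts` is hypothetical, and rounding the expected number of dropouts up to a whole subject is an assumption made here, not something the abstract states.

```python
import math


def adjust_for_dropouts(n_no_dropout, expected_dropout_rate):
    """Add the dropouts expected from a sample of the original size.

    n_no_dropout: sample size giving the desired power with no dropouts.
    expected_dropout_rate: proportion expected to drop out (0 <= rate < 1).
    Rounding up to a whole subject is an assumption of this sketch.
    """
    if n_no_dropout <= 0:
        raise ValueError("n_no_dropout must be positive")
    if not 0.0 <= expected_dropout_rate < 1.0:
        raise ValueError("expected_dropout_rate must be in [0, 1)")
    expected_dropouts = math.ceil(n_no_dropout * expected_dropout_rate)
    return n_no_dropout + expected_dropouts


if __name__ == "__main__":
    # Illustrative numbers only: 80 subjects needed, 25% expected to drop out.
    print(adjust_for_dropouts(80, 0.25))
```

    Note that this differs from the common inflation n / (1 - rate): the rule described above adds only the dropouts expected from the original sample, so it yields a smaller total.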

  11. A Monte-Carlo simulation analysis for evaluating the severity distribution functions (SDFs) calibration methodology and determining the minimum sample-size requirements.

    PubMed

    Shirazi, Mohammadali; Reddy Geedipally, Srinivas; Lord, Dominique

    2017-01-01

    Severity distribution functions (SDFs) are used in highway safety to estimate the severity of crashes and conduct different types of safety evaluations and analyses. Developing a new SDF is a difficult task and demands significant time and resources. To simplify the process, the Highway Safety Manual (HSM) has started to document SDF models for different types of facilities. As such, SDF models have recently been introduced for freeway and ramps in HSM addendum. However, since these functions or models are fitted and validated using data from a few selected number of states, they are required to be calibrated to the local conditions when applied to a new jurisdiction. The HSM provides a methodology to calibrate the models through a scalar calibration factor. However, the proposed methodology to calibrate SDFs was never validated through research. Furthermore, there are no concrete guidelines to select a reliable sample size. Using extensive simulation, this paper documents an analysis that examined the bias between the 'true' and 'estimated' calibration factors. It was indicated that as the value of the true calibration factor deviates further away from '1', more bias is observed between the 'true' and 'estimated' calibration factors. In addition, simulation studies were performed to determine the calibration sample size for various conditions. It was found that, as the average of the coefficient of variation (CV) of the 'KAB' and 'C' crashes increases, the analyst needs to collect a larger sample size to calibrate SDF models. Taking this observation into account, sample-size guidelines are proposed based on the average CV of crash severities that are used for the calibration process. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. The impact of accelerating faster than exponential population growth on genetic variation.

    PubMed

    Reppell, Mark; Boehnke, Michael; Zöllner, Sebastian

    2014-03-01

    Current human sequencing projects observe an abundance of extremely rare genetic variation, suggesting recent acceleration of population growth. To better understand the impact of such accelerating growth on the quantity and nature of genetic variation, we present a new class of models capable of incorporating faster than exponential growth in a coalescent framework. Our work shows that such accelerated growth affects only the population size in the recent past and thus large samples are required to detect the models' effects on patterns of variation. When we compare models with fixed initial growth rate, models with accelerating growth achieve very large current population sizes and large samples from these populations contain more variation than samples from populations with constant growth. This increase is driven almost entirely by an increase in singleton variation. Moreover, linkage disequilibrium decays faster in populations with accelerating growth. When we instead condition on current population size, models with accelerating growth result in less overall variation and slower linkage disequilibrium decay compared to models with exponential growth. We also find that pairwise linkage disequilibrium of very rare variants contains information about growth rates in the recent past. Finally, we demonstrate that models of accelerating growth may substantially change estimates of present-day effective population sizes and growth times.

  13. Using Data-Dependent Priors to Mitigate Small Sample Bias in Latent Growth Models: A Discussion and Illustration Using Mplus

    ERIC Educational Resources Information Center

    McNeish, Daniel M.

    2016-01-01

    Mixed-effects models (MEMs) and latent growth models (LGMs) are often considered interchangeable save the discipline-specific nomenclature. Software implementations of these models, however, are not interchangeable, particularly with small sample sizes. Restricted maximum likelihood estimation that mitigates small sample bias in MEMs has not been…

  14. Support vector regression to predict porosity and permeability: Effect of sample size

    NASA Astrophysics Data System (ADS)

    Al-Anazi, A. F.; Gates, I. D.

    2012-02-01

    Porosity and permeability are key petrophysical parameters obtained from laboratory core analysis. Cores, obtained from drilled wells, are often few in number for most oil and gas fields. Porosity and permeability correlations based on conventional techniques such as linear regression or neural networks trained with core and geophysical logs suffer poor generalization to wells with only geophysical logs. The generalization problem of correlation models often becomes pronounced when the training sample size is small. This is attributed to the underlying assumption that conventional techniques employing the empirical risk minimization (ERM) inductive principle converge asymptotically to the true risk values as the number of samples increases. In small sample size estimation problems, the available training samples must span the complexity of the parameter space so that the model is able both to match the available training samples reasonably well and to generalize to new data. This is achieved using the structural risk minimization (SRM) inductive principle by matching the capability of the model to the available training data. One method that uses SRM is the support vector regression (SVR) network. In this research, the capability of SVR to predict porosity and permeability in a heterogeneous sandstone reservoir under the effect of small sample size is evaluated. In particular, the impact of Vapnik's ɛ-insensitivity loss function and the least-modulus loss function on generalization performance was empirically investigated. The results are compared to the multilayer perceptron (MLP) neural network, a widely used regression method, which operates under the ERM principle. The mean square error and correlation coefficients were used to measure the quality of predictions. The results demonstrate that SVR yields consistently better predictions of the porosity and permeability with small sample size than the MLP method. Also, the performance of SVR depends on both the kernel function type and the loss function used.

  15. Study design requirements for RNA sequencing-based breast cancer diagnostics.

    PubMed

    Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias

    2016-02-01

    Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic.

  16. Design considerations for case series models with exposure onset measurement error.

    PubMed

    Mohammed, Sandra M; Dalrymple, Lorien S; Sentürk, Damla; Nguyen, Danh V

    2013-02-28

    The case series model allows for estimation of the relative incidence of events, such as cardiovascular events, within a pre-specified time window after an exposure, such as an infection. The method requires only cases (individuals with events) and controls for all fixed/time-invariant confounders. The measurement error case series model extends the original case series model to handle imperfect data, where the timing of an infection (exposure) is not known precisely. In this work, we propose a method for power/sample size determination for the measurement error case series model. Extensive simulation studies are used to assess the accuracy of the proposed sample size formulas. We also examine the magnitude of the relative loss of power due to exposure onset measurement error, compared with the ideal situation where the time of exposure is measured precisely. To facilitate the design of case series studies, we provide publicly available web-based tools for determining power/sample size for both the measurement error case series model as well as the standard case series model. Copyright © 2012 John Wiley & Sons, Ltd.

  17. Fragment size distribution statistics in dynamic fragmentation of laser shock-loaded tin

    NASA Astrophysics Data System (ADS)

    He, Weihua; Xin, Jianting; Zhao, Yongqiang; Chu, Genbai; Xi, Tao; Shui, Min; Lu, Feng; Gu, Yuqiu

    2017-06-01

    This work investigates the geometric statistics method to characterize the size distribution of tin fragments produced in the laser shock-loaded dynamic fragmentation process. In the shock experiments, the material ejected from a tin sample with a V-shaped groove etched in its free surface is collected by the soft recovery technique. Subsequently, the produced fragments are automatically detected with fine post-shot analysis techniques, including X-ray micro-tomography and the improved watershed method. To characterize the size distributions of the fragments, a theoretical random geometric statistics model based on Poisson mixtures is derived for the dynamic heterogeneous fragmentation problem, which yields a distribution that is a linear combination of exponentials. The experimental data on fragment size distributions of the laser shock-loaded tin sample are examined with the proposed theoretical model, and its fitting performance is compared with that of other state-of-the-art fragment size distribution models. The comparison shows that the proposed model provides a far more reasonable fit for the laser shock-loaded tin.

  18. Dependence of flux-flow critical frequencies and generalized bundle sizes on distance of fluxoid traversal and fluxoid length in foil samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thompson, J.D.; Joiner, W.C.H.

    1979-10-01

    Flux-flow noise power spectra taken on Pb₈₀In₂₀ foils as a function of the orientation of the magnetic field with respect to the sample surfaces are used to study changes in frequencies and bundle sizes as distances of fluxoid traversal and fluxoid lengths change. The results obtained for the frequency dependence of the noise spectra are entirely consistent with our model for flux motion interrupted by pinning centers, provided one makes the reasonable assumption that the distance between pinning centers which a fluxoid may encounter scales inversely with the fluxoid length. The importance of pinning centers in determining the noise characteristics is also demonstrated by the way in which subpulse distributions and generalized bundle sizes are altered by changes in the metallurgical structure of the sample. In unannealed samples the dependence of bundle size on magnetic field orientation is controlled by a structural anisotropy, and we find a correlation between large bundle size and the absence of short subpulse times. Annealing removes this anisotropy, and we find a stronger angular variation of bundle size than would be expected using present simplified models.

  19. Effect of Mechanical Impact Energy on the Sorption and Diffusion of Moisture in Reinforced Polymer Composite Samples on Variation of Their Sizes

    NASA Astrophysics Data System (ADS)

    Startsev, V. O.; Il'ichev, A. V.

    2018-05-01

    The effect of mechanical impact energy on the sorption and diffusion of moisture in polymer composite samples on variation of their sizes was investigated. Square samples, with sides of 40, 60, 80, and 100 mm, made of a KMKU-2m-120.E0,1 carbon-fiber and KMKS-2m.120.T10 glass-fiber plastics with different resistances to calibrated impacts, were compared. Impact loading diagrams of the samples in relation to their sizes and impact energy were analyzed. It is shown that the moisture saturation and moisture diffusion coefficient of the impact-damaged materials can be modeled by Fick's second law with account of impact energy and sample sizes.
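
    For reference, the one-dimensional form of Fick's second law that the abstract invokes is shown below. The symbols are the standard textbook ones and are an assumption here (c for moisture concentration, D for the diffusion coefficient, x for the through-thickness coordinate, t for time); the abstract does not give the authors' parameterization in terms of impact energy or sample size, so none is reproduced.

```latex
% Fick's second law in one dimension with a constant diffusion coefficient D
\frac{\partial c(x,t)}{\partial t} = D \, \frac{\partial^{2} c(x,t)}{\partial x^{2}}
```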

  20. SU-E-I-46: Sample-Size Dependence of Model Observers for Estimating Low-Contrast Detection Performance From CT Images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reiser, I; Lu, Z

    2014-06-01

    Purpose: Recently, task-based assessment of diagnostic CT systems has attracted much attention. Detection task performance can be estimated using human observers, or mathematical observer models. While most models are well established, considerable bias can be introduced when performance is estimated from a limited number of image samples. Thus, the purpose of this work was to assess the effect of sample size on bias and uncertainty of two channelized Hotelling observers and a template-matching observer. Methods: The image data used for this study consisted of 100 signal-present and 100 signal-absent regions-of-interest, which were extracted from CT slices. The experimental conditions included two signal sizes and five different x-ray beam current settings (mAs). Human observer performance for these images was determined in 2-alternative forced choice experiments. These data were provided by the Mayo Clinic in Rochester, MN. Detection performance was estimated from three observer models, including channelized Hotelling observers (CHO) with Gabor or Laguerre-Gauss (LG) channels, and a template-matching observer (TM). Different sample sizes were generated by randomly selecting a subset of image pairs (N=20,40,60,80). Observer performance was quantified as proportion of correct responses (PC). Bias was quantified as the relative difference of PC for 20 and 80 image pairs. Results: For N=100, all observer models predicted human performance across mAs and signal sizes. Bias was 23% for CHO (Gabor), 7% for CHO (LG), and 3% for TM. The relative standard deviation, σ(PC)/PC at N=20 was highest for the TM observer (11%) and lowest for the CHO (Gabor) observer (5%). Conclusion: In order to make image quality assessment feasible in clinical practice, a statistically efficient observer model that can predict performance from few samples is needed. Our results identified two observer models that may be suited for this task.

  1. A comparison of confidence/credible interval methods for the area under the ROC curve for continuous diagnostic tests with small sample size.

    PubMed

    Feng, Dai; Cortese, Giuliana; Baumgartner, Richard

    2017-12-01

    The receiver operating characteristic (ROC) curve is frequently used as a measure of accuracy of continuous markers in diagnostic tests. The area under the ROC curve (AUC) is arguably the most widely used summary index for the ROC curve. Although the small sample size scenario is common in medical tests, a comprehensive study of small sample size properties of various methods for the construction of the confidence/credible interval (CI) for the AUC has been by and large missing in the literature. In this paper, we describe and compare 29 non-parametric and parametric methods for the construction of the CI for the AUC when the number of available observations is small. The methods considered include not only those that have been widely adopted, but also those that have been less frequently mentioned or, to our knowledge, never applied to the AUC context. To compare different methods, we carried out a simulation study with data generated from binormal models with equal and unequal variances and from exponential models with various parameters and with equal and unequal small sample sizes. We found that the larger the true AUC value and the smaller the sample size, the larger the discrepancy among the results of different approaches. When the model is correctly specified, the parametric approaches tend to outperform the non-parametric ones. Moreover, in the non-parametric domain, we found that a method based on the Mann-Whitney statistic is in general superior to the others. We further elucidate potential issues and provide possible solutions, along with general guidance on the CI construction for the AUC when the sample size is small. Finally, we illustrate the utility of different methods through real life examples.
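
    The link between the AUC and the Mann-Whitney statistic mentioned above can be illustrated with a point estimate. This is only a sketch of the well-known identity AUC = U / (n1 * n0); it is not one of the 29 interval methods the paper compares, and the function name `auc_mann_whitney` and the example marker values are hypothetical.

```python
def auc_mann_whitney(diseased, healthy):
    """Non-parametric AUC estimate via the Mann-Whitney U statistic.

    Counts, over all (diseased, healthy) pairs, how often the diseased
    subject has the higher marker value; ties count one half.
    """
    if not diseased or not healthy:
        raise ValueError("both groups need at least one observation")
    u = 0.0
    for x in diseased:
        for y in healthy:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u / (len(diseased) * len(healthy))


if __name__ == "__main__":
    # Tiny made-up marker values, mimicking a small-sample setting.
    print(auc_mann_whitney([0.35, 0.8], [0.1, 0.4]))
```

    With so few pairs the estimate can only take a handful of distinct values, which hints at why interval construction is delicate when the sample is small.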

  2. Affected States Soft Independent Modeling by Class Analogy from the Relation Between Independent Variables, Number of Independent Variables and Sample Size

    PubMed Central

    Kanık, Emine Arzu; Temel, Gülhan Orekici; Erdoğan, Semra; Kaya, İrem Ersöz

    2013-01-01

    Objective: The aim of this study is to introduce the Soft Independent Modeling of Class Analogy (SIMCA) method and to determine whether the method is affected by the number of independent variables, the relationship between variables, and sample size. Study Design: Simulation study. Material and Methods: The SIMCA model is fitted in two stages. Simulations were run to determine whether the method is influenced by the number of independent variables, the relationship between variables, and sample size. The conditions considered were equal sample sizes in both groups, with 30, 100 and 1000 samples; 2, 3, 5, 10, 50 and 100 variables; and high, medium and low correlation between variables. Results: The average classification accuracy over 1000 simulation runs for each condition of the trial plan is given in tables. Conclusion: Diagnostic accuracy increases as the number of independent variables increases. SIMCA is a method that can be used when the relationship between variables is quite high, the independent variables are numerous, and the data contain outliers. PMID:25207065

  3. Affected States soft independent modeling by class analogy from the relation between independent variables, number of independent variables and sample size.

    PubMed

    Kanık, Emine Arzu; Temel, Gülhan Orekici; Erdoğan, Semra; Kaya, Irem Ersöz

    2013-03-01

    The aim of this study is to introduce the Soft Independent Modeling of Class Analogy (SIMCA) method and to determine whether the method is affected by the number of independent variables, the relationship between variables, and sample size. Simulation study. The SIMCA model is fitted in two stages. Simulations were run to determine whether the method is influenced by the number of independent variables, the relationship between variables, and sample size. The conditions considered were equal sample sizes in both groups, with 30, 100 and 1000 samples; 2, 3, 5, 10, 50 and 100 variables; and high, medium and low correlation between variables. The average classification accuracy over 1000 simulation runs for each condition of the trial plan is given in tables. Diagnostic accuracy increases as the number of independent variables increases. SIMCA is a method that can be used when the relationship between variables is quite high, the independent variables are numerous, and the data contain outliers.

  4. Sample size calculation in cost-effectiveness cluster randomized trials: optimal and maximin approaches.

    PubMed

    Manju, Md Abu; Candel, Math J J M; Berger, Martijn P F

    2014-07-10

    In this paper, the optimal sample sizes at the cluster and person levels for each of two treatment arms are obtained for cluster randomized trials where the cost-effectiveness of treatments on a continuous scale is studied. The optimal sample sizes maximize the efficiency or power for a given budget or minimize the budget for a given efficiency or power. Optimal sample sizes require information on the intra-cluster correlations (ICCs) for effects and costs, the correlations between costs and effects at individual and cluster levels, the ratio of the variance of effects translated into costs to the variance of the costs (the variance ratio), sampling and measuring costs, and the budget. When planning a study, information on the model parameters usually is not available. To overcome this local optimality problem, the current paper also presents maximin sample sizes. The maximin sample sizes turn out to be rather robust against misspecifying the correlation between costs and effects at the cluster and individual levels but may lose much efficiency when misspecifying the variance ratio. The robustness of the maximin sample sizes against misspecifying the ICCs depends on the variance ratio. The maximin sample sizes are robust under misspecification of the ICC for costs for realistic values of the variance ratio greater than one but not robust under misspecification of the ICC for effects. Finally, we show how to calculate optimal or maximin sample sizes that yield sufficient power for a test on the cost-effectiveness of an intervention.

  5. The Impact of Accelerating Faster than Exponential Population Growth on Genetic Variation

    PubMed Central

    Reppell, Mark; Boehnke, Michael; Zöllner, Sebastian

    2014-01-01

    Current human sequencing projects observe an abundance of extremely rare genetic variation, suggesting recent acceleration of population growth. To better understand the impact of such accelerating growth on the quantity and nature of genetic variation, we present a new class of models capable of incorporating faster than exponential growth in a coalescent framework. Our work shows that such accelerated growth affects only the population size in the recent past and thus large samples are required to detect the models’ effects on patterns of variation. When we compare models with fixed initial growth rate, models with accelerating growth achieve very large current population sizes and large samples from these populations contain more variation than samples from populations with constant growth. This increase is driven almost entirely by an increase in singleton variation. Moreover, linkage disequilibrium decays faster in populations with accelerating growth. When we instead condition on current population size, models with accelerating growth result in less overall variation and slower linkage disequilibrium decay compared to models with exponential growth. We also find that pairwise linkage disequilibrium of very rare variants contains information about growth rates in the recent past. Finally, we demonstrate that models of accelerating growth may substantially change estimates of present-day effective population sizes and growth times. PMID:24381333

  6. Two models of the sound-signal frequency dependence on the animal body size as exemplified by the ground squirrels of Eurasia (mammalia, rodentia).

    PubMed

    Nikol'skii, A A

    2017-11-01

    Dependence of the sound-signal frequency on the animal body length was studied in 14 ground squirrel species (genus Spermophilus) of Eurasia. Regression analysis of the total sample yielded a low determination coefficient (R² = 26%), because the total sample proved to be heterogeneous in terms of signal frequency within the dimension classes of animals. When the total sample was divided into two groups according to signal frequency, two statistically significant models (regression equations) were obtained in which signal frequency depended on the body size at high determination coefficients (R² = 73 and 94% versus 26% for the total sample). Thus, the problem of correlation between animal body size and the frequency of their vocal signals does not have a unique solution.

  7. The Effect of Small Sample Size on Measurement Equivalence of Psychometric Questionnaires in MIMIC Model: A Simulation Study.

    PubMed

    Jamali, Jamshid; Ayatollahi, Seyyed Mohammad Taghi; Jafari, Peyman

    2017-01-01

    Evaluating measurement equivalence (also known as differential item functioning (DIF)) is an important part of the process of validating psychometric questionnaires. This study aimed at evaluating the multiple indicators multiple causes (MIMIC) model for DIF detection when the latent construct distribution is nonnormal and the focal group sample size is small. In this simulation-based study, Type I error rates and power of the MIMIC model for detecting uniform-DIF were investigated under different combinations of reference to focal group sample size ratio, magnitude of the uniform-DIF effect, scale length, the number of response categories, and latent trait distribution. Moderate and high skewness in the latent trait distribution decreased the power of the MIMIC model for detecting uniform-DIF by 0.33% and 0.47%, respectively. The findings indicated that increasing the scale length, the number of response categories, and the magnitude of DIF improved the power of the MIMIC model by 3.47%, 4.83%, and 20.35%, respectively, and decreased the Type I error of the MIMIC approach by 2.81%, 5.66%, and 0.04%, respectively. This study revealed that the power of the MIMIC model was at an acceptable level when latent trait distributions were skewed. However, the empirical Type I error rate was slightly greater than the nominal significance level. Consequently, the MIMIC model was recommended for detection of uniform-DIF when the latent construct distribution is nonnormal and the focal group sample size is small.

  8. The Effect of Small Sample Size on Measurement Equivalence of Psychometric Questionnaires in MIMIC Model: A Simulation Study

    PubMed Central

    Jafari, Peyman

    2017-01-01

    Evaluating measurement equivalence (also known as differential item functioning (DIF)) is an important part of the process of validating psychometric questionnaires. This study aimed at evaluating the multiple indicators multiple causes (MIMIC) model for DIF detection when the latent construct distribution is nonnormal and the focal group sample size is small. In this simulation-based study, Type I error rates and power of the MIMIC model for detecting uniform-DIF were investigated under different combinations of reference to focal group sample size ratio, magnitude of the uniform-DIF effect, scale length, the number of response categories, and latent trait distribution. Moderate and high skewness in the latent trait distribution decreased the power of the MIMIC model for detecting uniform-DIF by 0.33% and 0.47%, respectively. The findings indicated that increasing the scale length, the number of response categories, and the magnitude of DIF improved the power of the MIMIC model by 3.47%, 4.83%, and 20.35%, respectively, and decreased the Type I error of the MIMIC approach by 2.81%, 5.66%, and 0.04%, respectively. This study revealed that the power of the MIMIC model was at an acceptable level when latent trait distributions were skewed. However, the empirical Type I error rate was slightly greater than the nominal significance level. Consequently, the MIMIC model was recommended for detection of uniform-DIF when the latent construct distribution is nonnormal and the focal group sample size is small. PMID:28713828

  9. Spatially explicit dynamic N-mixture models

    USGS Publications Warehouse

    Zhao, Qing; Royle, Andy; Boomer, G. Scott

    2017-01-01

    Knowledge of demographic parameters such as survival, reproduction, emigration, and immigration is essential to understand metapopulation dynamics. Traditionally the estimation of these demographic parameters requires intensive data from marked animals. The development of dynamic N-mixture models makes it possible to estimate demographic parameters from count data of unmarked animals, but the original dynamic N-mixture model does not distinguish emigration and immigration from survival and reproduction, limiting its ability to explain important metapopulation processes such as movement among local populations. In this study we developed a spatially explicit dynamic N-mixture model that estimates survival, reproduction, emigration, local population size, and detection probability from count data under the assumption that movement only occurs among adjacent habitat patches. Simulation studies showed that the inference of our model depends on detection probability, local population size, and the implementation of robust sampling design. Our model provides reliable estimates of survival, reproduction, and emigration when detection probability is high, regardless of local population size or the type of sampling design. When detection probability is low, however, our model only provides reliable estimates of survival, reproduction, and emigration when local population size is moderate to high and robust sampling design is used. A sensitivity analysis showed that our model is robust against the violation of the assumption that movement only occurs among adjacent habitat patches, suggesting wide applications of this model. Our model can be used to improve our understanding of metapopulation dynamics based on count data that are relatively easy to collect in many systems.

  10. Alternative Models for Small Samples in Psychological Research: Applying Linear Mixed Effects Models and Generalized Estimating Equations to Repeated Measures Data

    ERIC Educational Resources Information Center

    Muth, Chelsea; Bales, Karen L.; Hinde, Katie; Maninger, Nicole; Mendoza, Sally P.; Ferrer, Emilio

    2016-01-01

    Unavoidable sample size issues beset psychological research that involves scarce populations or costly laboratory procedures. When incorporating longitudinal designs these samples are further reduced by traditional modeling techniques, which perform listwise deletion for any instance of missing data. Moreover, these techniques are limited in their…

  11. Modeling of Grain Size Distribution of Tsunami Sand Deposits in V-shaped Valley of Numanohama During the 2011 Tohoku Tsunami

    NASA Astrophysics Data System (ADS)

    Gusman, A. R.; Satake, K.; Goto, T.; Takahashi, T.

    2016-12-01

    Estimating tsunami amplitude from tsunami sand deposits has been a challenge. The grain size distribution of a tsunami sand deposit may correlate with the tsunami inundation process, and further with its source characteristics. In order to test this hypothesis, we need a tsunami sediment transport model that can accurately estimate the grain size distribution of tsunami deposits. Here, we build and validate a tsunami sediment transport model that can simulate grain size distribution. Our numerical model has three layers: a suspended load layer, an active bed layer, and a parent bed layer. The two bed layers contain information about the grain size distribution. This numerical model can handle a wide range of grain sizes from 0.063 (4 ϕ) to 5.657 mm (-2.5 ϕ). We apply the numerical model to simulate the sedimentation process during the 2011 Tohoku earthquake in Numanohama, Iwate prefecture, Japan. The grain size distributions at 15 sample points along a 900 m transect from the beach are used to validate the tsunami sediment transport model. The tsunami deposits are dominated by coarse sand with a diameter of 0.5 - 1 mm and their thickness is up to 25 cm. Our tsunami model can reproduce well the observed tsunami run-ups, which ranged from 16 to 34 m along the steep valley in Numanohama. The shapes of the simulated grain size distributions at many sample points located within 300 m from the shoreline are similar to the observations. The differences between the observed and simulated peaks of the grain size distributions are less than 1 ϕ. Our result also shows that the simulated sand thickness distribution along the transect is consistent with the observation.
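
    The grain sizes in the abstract above are quoted both in millimetres and in ϕ units. As a minimal sketch, assuming the standard Krumbein definition ϕ = -log2(d / 1 mm), the conversion can be written as follows. The function names are illustrative and are not taken from the authors' model.

```python
import math


def mm_to_phi(d_mm: float) -> float:
    """Convert a grain diameter in millimetres to Krumbein phi units."""
    if d_mm <= 0:
        raise ValueError("grain diameter must be positive")
    # phi = -log2(d) is written as log2(1/d) so that 1 mm maps to +0.0
    return math.log2(1.0 / d_mm)


def phi_to_mm(phi: float) -> float:
    """Convert Krumbein phi units back to a diameter in millimetres."""
    return 2.0 ** (-phi)


if __name__ == "__main__":
    # Limits of the modelled range and the coarse-sand interval from the abstract
    for d in (0.063, 0.5, 1.0, 5.657):
        print(f"{d:6.3f} mm -> {mm_to_phi(d):+.2f} phi")
```

    Because the scale is logarithmic, a mismatch of less than 1 ϕ between observed and simulated peaks corresponds to less than a factor of two in grain diameter.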

  12. Effects of crystallite size on the structure and magnetism of ferrihydrite

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Xiaoming; Zhu, Mengqiang; Koopal, Luuk K.

    2015-12-15

    The structure and magnetic properties of nano-sized (1.6 to 4.4 nm) ferrihydrite samples are systematically investigated through a combination of X-ray diffraction (XRD), X-ray pair distribution function (PDF), X-ray absorption spectroscopy (XAS) and magnetic analyses. The XRD, PDF and Fe K-edge XAS data of the ferrihydrite samples are all fitted well with the Michel ferrihydrite model, indicating similar local-, medium- and long-range ordered structures. PDF and XAS fitting results indicate that, with increasing crystallite size, the average coordination numbers of Fe–Fe and the unit cell parameter c increase, while Fe2 and Fe3 vacancies and the unit cell parameter a decrease. Mössbauer results indicate that the surface layer is relatively disordered, which might have been caused by the random distribution of Fe vacancies. These results support Hiemstra's surface-depletion model in terms of the location of disorder and the variations of Fe2 and Fe3 occupancies with size. Magnetic data indicate that the ferrihydrite samples show antiferromagnetism superimposed with a ferromagnetic-like moment at lower temperatures (100 K and 10 K), but ferrihydrite is paramagnetic at room temperature. In addition, both the magnetization and coercivity decrease with increasing ferrihydrite crystallite size due to strong surface effects in fine-grained ferrihydrites. Smaller ferrihydrite samples show less magnetic hyperfine splitting and a lower unblocking temperature (T_B) than larger samples. The dependence of magnetic properties on grain size for nano-sized ferrihydrite provides a practical way to determine the crystallite size of ferrihydrite quantitatively in natural environments or artificial systems.

  13. Only pick the right grains: Modelling the bias due to subjective grain-size interval selection for chronometric and fingerprinting approaches.

    NASA Astrophysics Data System (ADS)

    Dietze, Michael; Fuchs, Margret; Kreutzer, Sebastian

    2016-04-01

    Many modern approaches of radiometric dating or geochemical fingerprinting rely on sampling sedimentary deposits. A key assumption of most concepts is that the extracted grain-size fraction of the sampled sediment adequately represents the actual process to be dated or the source area to be fingerprinted. However, these assumptions are not always well constrained. Rather, they have to align with arbitrary, method-determined size intervals, such as "coarse grain" or "fine grain" with partly even different definitions. Such arbitrary intervals violate principal process-based concepts of sediment transport and can thus introduce significant bias to the analysis outcome (i.e., a deviation of the measured from the true value). We present a flexible numerical framework (numOlum) for the statistical programming language R that allows quantifying the bias due to any given analysis size interval for different types of sediment deposits. This framework is applied to synthetic samples from the realms of luminescence dating and geochemical fingerprinting, i.e. a virtual reworked loess section. We show independent validation data from artificially dosed and subsequently mixed grain-size proportions and we present a statistical approach (end-member modelling analysis, EMMA) that allows accounting for the effect of measuring the compound dosimetric history or geochemical composition of a sample. EMMA separates polymodal grain-size distributions into the underlying transport process-related distributions and their contribution to each sample. These underlying distributions can then be used to adjust grain-size preparation intervals to minimise the incorporation of "undesired" grain-size fractions.

  14. Single-arm phase II trial design under parametric cure models.

    PubMed

    Wu, Jianrong

    2015-01-01

    The current practice of designing single-arm phase II survival trials is limited under the exponential model. Trial design under the exponential model may not be appropriate when a portion of patients are cured. There is no literature available for designing single-arm phase II trials under the parametric cure model. In this paper, a test statistic is proposed, and a sample size formula is derived for designing single-arm phase II trials under a class of parametric cure models. Extensive simulations showed that the proposed test and sample size formula perform very well under different scenarios. Copyright © 2015 John Wiley & Sons, Ltd.

  15. Nonidentifiability of population size from capture-recapture data with heterogeneous detection probabilities

    USGS Publications Warehouse

    Link, W.A.

    2003-01-01

    Heterogeneity in detection probabilities has long been recognized as problematic in mark-recapture studies, and numerous models have been developed to accommodate its effects. Individual heterogeneity is especially problematic, in that reasonable alternative models may predict essentially identical observations from populations of substantially different sizes. Thus even with very large samples, the analyst will not be able to distinguish among reasonable models of heterogeneity, even though these yield quite distinct inferences about population size. The problem is illustrated with models for closed and open populations.

  16. Sample size and power calculations for detecting changes in malaria transmission using antibody seroconversion rate.

    PubMed

    Sepúlveda, Nuno; Paulino, Carlos Daniel; Drakeley, Chris

    2015-12-30

    Several studies have highlighted the use of serological data in detecting a reduction in malaria transmission intensity. These studies have typically used serology as an adjunct measure and no formal examination of sample size calculations for this approach has been conducted. A sample size calculator is proposed for cross-sectional surveys using data simulation from a reverse catalytic model assuming a reduction in seroconversion rate (SCR) at a given change point before sampling. This calculator is based on logistic approximations for the underlying power curves to detect a reduction in SCR in relation to the hypothesis of a stable SCR for the same data. Sample sizes are illustrated for a hypothetical cross-sectional survey from an African population assuming a known or unknown change point. Overall, data simulation demonstrates that power is strongly affected by assuming a known or unknown change point. Small sample sizes are sufficient to detect strong reductions in SCR, but invariably lead to poor precision of estimates for current SCR. In this situation, sample size is better determined by controlling the precision of SCR estimates. Conversely, larger sample sizes are required for detecting more subtle reductions in malaria transmission, but those invariably increase precision whilst reducing putative estimation bias. The proposed sample size calculator, although based on data simulation, shows promise of being easily applicable to a range of populations and survey types. Since the change point is a major source of uncertainty, obtaining or assuming prior information about this parameter might reduce both the sample size and the chance of generating biased SCR estimates.

  17. Estimating population size for Capercaillie (Tetrao urogallus L.) with spatial capture-recapture models based on genotypes from one field sample

    USGS Publications Warehouse

    Mollet, Pierre; Kery, Marc; Gardner, Beth; Pasinelli, Gilberto; Royle, Andy

    2015-01-01

    We conducted a survey of an endangered and cryptic forest grouse, the capercaillie Tetrao urogallus, based on droppings collected on two sampling occasions in eight forest fragments in central Switzerland in early spring 2009. We used genetic analyses to sex and individually identify birds. We estimated sex-dependent detection probabilities and population size using a modern spatial capture-recapture (SCR) model for the data from pooled surveys. A total of 127 capercaillie genotypes were identified (77 males, 46 females, and 4 of unknown sex). The SCR model yielded a total population size estimate (posterior mean) of 137.3 capercaillies (posterior sd 4.2, 95% CRI 130–147). The observed sex ratio was skewed towards males (0.63). The posterior mean of the sex ratio under the SCR model was 0.58 (posterior sd 0.02, 95% CRI 0.54–0.61), suggesting a male-biased sex ratio in our study area. A subsampling simulation study indicated that a reduced sampling effort representing 75% of the actual detections would still yield practically acceptable estimates of total size and sex ratio in our population. Hence, field work and financial effort could be reduced without compromising accuracy when the SCR model is used to estimate key population parameters of cryptic species.

  18. Accuracy in parameter estimation for targeted effects in structural equation modeling: sample size planning for narrow confidence intervals.

    PubMed

    Lai, Keke; Kelley, Ken

    2011-06-01

    In addition to evaluating a structural equation model (SEM) as a whole, often the model parameters are of interest and confidence intervals for those parameters are formed. Given a model with a good overall fit, it is entirely possible for the targeted effects of interest to have very wide confidence intervals, thus giving little information about the magnitude of the population targeted effects. With the goal of obtaining sufficiently narrow confidence intervals for the model parameters of interest, sample size planning methods for SEM are developed from the accuracy in parameter estimation approach. One method plans for the sample size so that the expected confidence interval width is sufficiently narrow. An extended procedure ensures that the obtained confidence interval will be no wider than desired, with some specified degree of assurance. A Monte Carlo simulation study was conducted that verified the effectiveness of the procedures in realistic situations. The methods developed have been implemented in the MBESS package in R so that they can be easily applied by researchers. © 2011 American Psychological Association

  19. Population size and stopover duration estimation using mark–resight data and Bayesian analysis of a superpopulation model

    USGS Publications Warehouse

    Lyons, James E.; Kendall, William L.; Royle, J. Andrew; Converse, Sarah J.; Andres, Brad A.; Buchanan, Joseph B.

    2016-01-01

    We present a novel formulation of a mark–recapture–resight model that allows estimation of population size, stopover duration, and arrival and departure schedules at migration areas. Estimation is based on encounter histories of uniquely marked individuals and relative counts of marked and unmarked animals. We use a Bayesian analysis of a state–space formulation of the Jolly–Seber mark–recapture model, integrated with a binomial model for counts of unmarked animals, to derive estimates of population size and arrival and departure probabilities. We also provide a novel estimator for stopover duration that is derived from the latent state variable representing the interim between arrival and departure in the state–space model. We conduct a simulation study of field sampling protocols to understand the impact of superpopulation size, proportion marked, and number of animals sampled on bias and precision of estimates. Simulation results indicate that relative bias of estimates of the proportion of the population with marks was low for all sampling scenarios and never exceeded 2%. Our approach does not require enumeration of all unmarked animals detected or direct knowledge of the number of marked animals in the population at the time of the study. This provides flexibility and potential application in a variety of sampling situations (e.g., migratory birds, breeding seabirds, sea turtles, fish, pinnipeds, etc.). Application of the methods is demonstrated with data from a study of migratory sandpipers.

  20. Evaluation of Confluence Model Variables on IQ and Achievement Test Scores in a Sample of 6- to 11-Year-Old Children.

    ERIC Educational Resources Information Center

    Svanum, Soren; Bringle, Robert G.

    1980-01-01

    The confluence model of cognitive development was tested on 7,060 children. Family size, sibling order within family sizes, and hypothesized age-dependent effects were tested. Findings indicated an inverse relationship between family size and the cognitive measures; age-dependent effects and other confluence variables were found to be…

  1. A Proposed Approach for Joint Modeling of the Longitudinal and Time-To-Event Data in Heterogeneous Populations: An Application to HIV/AIDS's Disease.

    PubMed

    Roustaei, Narges; Ayatollahi, Seyyed Mohammad Taghi; Zare, Najaf

    2018-01-01

    In recent years, the joint models have been widely used for modeling the longitudinal and time-to-event data simultaneously. In this study, we proposed an approach (PA) to study the longitudinal and survival outcomes simultaneously in heterogeneous populations. PA relaxes the assumption of conditional independence (CI). We also compared PA with joint latent class model (JLCM) and separate approach (SA) for various sample sizes (150, 300, and 600) and different association parameters (0, 0.2, and 0.5). The average bias of parameters estimation (AB-PE), average SE of parameters estimation (ASE-PE), and coverage probability of the 95% confidence interval (CP) among the three approaches were compared. In most cases, when the sample sizes increased, AB-PE and ASE-PE decreased for the three approaches, and CP got closer to the nominal level of 0.95. When there was a considerable association, PA in comparison with SA and JLCM performed better in the sense that PA had the smallest AB-PE and ASE-PE for the longitudinal submodel among the three approaches for the small and moderate sample sizes. Moreover, JLCM was desirable for the none-association and the large sample size. Finally, the evaluated approaches were applied on a real HIV/AIDS dataset for validation, and the results were compared.

  2. Relative efficiency of unequal versus equal cluster sizes in cluster randomized trials using generalized estimating equation models.

    PubMed

    Liu, Jingxia; Colditz, Graham A

    2018-05-01

    There is growing interest in conducting cluster randomized trials (CRTs). For simplicity in sample size calculation, the cluster sizes are assumed to be identical across all clusters. However, equal cluster sizes are not guaranteed in practice. Therefore, the relative efficiency (RE) of unequal versus equal cluster sizes has been investigated when testing the treatment effect. One of the most important approaches to analyze a set of correlated data is the generalized estimating equation (GEE) proposed by Liang and Zeger, in which the "working correlation structure" is introduced and the association pattern depends on a vector of association parameters denoted by ρ. In this paper, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in CRTs. The variances of the estimator of the treatment effect are derived for the different types of outcome. RE is defined as the ratio of the variance of the estimator of the treatment effect for equal cluster sizes to that for unequal cluster sizes. We discuss a structure commonly used in CRTs, the exchangeable structure, and derive a simpler formula for RE with continuous, binary, and count outcomes. Finally, REs are investigated for several scenarios of cluster size distributions through simulation studies. We propose an adjusted sample size due to efficiency loss. Additionally, we also propose an optimal sample size estimation based on the GEE models under a fixed budget for known and unknown association parameter (ρ) in the working correlation structure within the cluster. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
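
    The abstract above defines RE as the variance for equal cluster sizes divided by the variance for unequal cluster sizes but does not reproduce the formula. The sketch below is not the authors' derivation. It assumes a continuous outcome, an exchangeable working correlation with parameter rho, a common outcome variance, and the same cluster-size distribution in both arms, in which case the variance of the treatment effect estimator is inversely proportional to the sum of n_i / (1 + (n_i - 1) * rho) over clusters. All function names are hypothetical.

```python
import math
from typing import Sequence


def effective_information(cluster_sizes: Sequence[int], rho: float) -> float:
    """Sum of n_i / (1 + (n_i - 1) * rho); the variance is proportional to its inverse."""
    return sum(n / (1.0 + (n - 1) * rho) for n in cluster_sizes)


def relative_efficiency(cluster_sizes: Sequence[int], rho: float) -> float:
    """RE = Var(equal sizes) / Var(unequal sizes), holding the total sample size fixed."""
    k = len(cluster_sizes)
    mean_size = sum(cluster_sizes) / k
    info_unequal = effective_information(cluster_sizes, rho)
    info_equal = k * mean_size / (1.0 + (mean_size - 1) * rho)
    # Variance is the reciprocal of information, so the variance ratio flips
    return info_unequal / info_equal


def adjusted_clusters(planned_clusters: int, re: float) -> int:
    """Assumed adjustment: inflate the planned number of clusters by 1 / RE."""
    return math.ceil(planned_clusters / re)


if __name__ == "__main__":
    sizes = [5, 20, 35]  # illustrative cluster sizes, mean 20
    re = relative_efficiency(sizes, rho=0.05)
    print(f"RE = {re:.3f}, clusters needed instead of 30: {adjusted_clusters(30, re)}")
```

    Under these assumptions RE never exceeds 1, and it equals 1 when all clusters have the same size or when rho is 0, which is consistent with treating unequal cluster sizes as an efficiency loss.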

  3. Accounting for sampling error when inferring population synchrony from time-series data: a Bayesian state-space modelling approach with applications.

    PubMed

    Santin-Janin, Hugues; Hugueny, Bernard; Aubry, Philippe; Fouchet, David; Gimenez, Olivier; Pontier, Dominique

    2014-01-01

    Data collected to inform time variations in natural population size are tainted by sampling error. Ignoring sampling error in population dynamics models induces bias in parameter estimators, e.g., density-dependence. In particular, when sampling errors are independent among populations, the classical estimator of the synchrony strength (zero-lag correlation) is biased downward. However, this bias is rarely taken into account in synchrony studies although it may lead to overemphasizing the role of intrinsic factors (e.g., dispersal) with respect to extrinsic factors (the Moran effect) in generating population synchrony as well as to underestimating the extinction risk of a metapopulation. The aim of this paper was first to illustrate the extent of the bias that can be encountered in empirical studies when sampling error is neglected. Second, we presented a state-space modelling approach that explicitly accounts for sampling error when quantifying population synchrony. Third, we exemplify our approach with datasets for which sampling variance (i) has been previously estimated, and (ii) has to be jointly estimated with population synchrony. Finally, we compared our results to those of a standard approach neglecting sampling variance. We showed that ignoring sampling variance can mask a synchrony pattern whatever its true value and that the common practice of averaging a few replicates of population size estimates performed poorly at decreasing the bias of the classical estimator of the synchrony strength. The state-space model used in this study provides a flexible way of accurately quantifying the strength of synchrony patterns from most population size data encountered in field studies, including over-dispersed count data. We provided a user-friendly R-program and a tutorial example to encourage further studies aiming at quantifying the strength of population synchrony to account for uncertainty in population size estimates.

  4. Accounting for Sampling Error When Inferring Population Synchrony from Time-Series Data: A Bayesian State-Space Modelling Approach with Applications

    PubMed Central

    Santin-Janin, Hugues; Hugueny, Bernard; Aubry, Philippe; Fouchet, David; Gimenez, Olivier; Pontier, Dominique

    2014-01-01

    Background: Data collected to inform time variations in natural population size are tainted by sampling error. Ignoring sampling error in population dynamics models induces bias in parameter estimators, e.g., density-dependence. In particular, when sampling errors are independent among populations, the classical estimator of the synchrony strength (zero-lag correlation) is biased downward. However, this bias is rarely taken into account in synchrony studies although it may lead to overemphasizing the role of intrinsic factors (e.g., dispersal) with respect to extrinsic factors (the Moran effect) in generating population synchrony as well as to underestimating the extinction risk of a metapopulation. Methodology/Principal findings: The aim of this paper was first to illustrate the extent of the bias that can be encountered in empirical studies when sampling error is neglected. Second, we presented a state-space modelling approach that explicitly accounts for sampling error when quantifying population synchrony. Third, we exemplify our approach with datasets for which sampling variance (i) has been previously estimated, and (ii) has to be jointly estimated with population synchrony. Finally, we compared our results to those of a standard approach neglecting sampling variance. We showed that ignoring sampling variance can mask a synchrony pattern whatever its true value and that the common practice of averaging a few replicates of population size estimates performed poorly at decreasing the bias of the classical estimator of the synchrony strength. Conclusion/Significance: The state-space model used in this study provides a flexible way of accurately quantifying the strength of synchrony patterns from most population size data encountered in field studies, including over-dispersed count data. We provided a user-friendly R-program and a tutorial example to encourage further studies aiming at quantifying the strength of population synchrony to account for uncertainty in population size estimates. PMID:24489839

  5. Analyses of sweep-up, ejecta, and fallback material from the 4250 metric ton high explosive test "MISTY PICTURE"

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wohletz, K.H.; Raymond, R. Jr.; Rawson, G.

    1988-01-01

    The MISTY PICTURE surface burst was detonated at the White Sands Missile Range in May of 1987. The Los Alamos National Laboratory dust characterization program was expanded to help correlate and interrelate aspects of the overall MISTY PICTURE dust and ejecta characterization program. Pre-shot sampling of the test bed included composite samples from 15 to 75 m distance from Surface Ground Zero (SGZ) representing depths down to 2.5 m, interval samples from 15 to 25 m from SGZ representing depths down to 3 m, and samples of surface material (top 0.5 cm) out to distances of 190 m from SGZ. Sweep-up samples were collected in GREG/SNOB gages located within the DPR. All samples were dry-sieved between 8.0 mm and 0.045 mm (16 size fractions); selected samples were analyzed for fines by a centrifugal settling technique. The size distributions were analyzed using spectral decomposition based upon a sequential fragmentation model. Results suggest that the same particle size subpopulations are present in the ejecta, fallout, and sweep-up samples as are present in the pre-shot test bed. The particle size distribution in post-shot environments apparently can be modelled taking into account heterogeneities in the pre-shot test bed and dominant wind direction during and following the shot. 13 refs., 12 figs., 2 tabs.

  6. Further improvement of hydrostatic pressure sample injection for microchip electrophoresis.

    PubMed

    Luo, Yong; Zhang, Qingquan; Qin, Jianhua; Lin, Bingcheng

    2007-12-01

    The hydrostatic pressure sample injection method can minimize the number of electrodes needed for a microchip electrophoresis process; however, it can neither be applied to electrophoretic DNA sizing nor be implemented on the widely used single-cross microchip. This paper presents an injector design that makes the hydrostatic pressure sample injection method suitable for DNA sizing. By introducing an assistant channel into the normal double-cross injector, a rugged DNA sample plug suitable for sizing can be successfully formed within the cross area during the sample loading. This paper also demonstrates that the hydrostatic pressure sample injection can be performed in the single-cross microchip by controlling the radial position of the detection point in the separation channel. Rhodamine 123 and its derivative, used as model samples, were successfully separated.

  7. Variable criteria sequential stopping rule: Validity and power with repeated measures ANOVA, multiple correlation, MANOVA and relation to Chi-square distribution.

    PubMed

    Fitts, Douglas A

    2017-09-21

    The variable criteria sequential stopping rule (vcSSR) is an efficient way to add sample size to planned ANOVA tests while holding the observed rate of Type I errors, α_o, constant. The only difference from regular null hypothesis testing is that criteria for stopping the experiment are obtained from a table based on the desired power, rate of Type I errors, and beginning sample size. The vcSSR was developed using between-subjects ANOVAs, but it should work with p values from any type of F test. In the present study, the α_o remained constant at the nominal level when using the previously published table of criteria with repeated measures designs with various numbers of treatments per subject, Type I error rates, values of ρ, and four different sample size models. New power curves allow researchers to select the optimal sample size model for a repeated measures experiment. The criteria held α_o constant either when used with a multiple correlation that varied the sample size model and the number of predictor variables, or when used with MANOVA with multiple groups and two levels of a within-subject variable at various levels of ρ. Although not recommended for use with χ² tests such as the Friedman rank ANOVA test, the vcSSR produces predictable results based on the relation between F and χ². Together, the data confirm the view that the vcSSR can be used to control Type I errors during sequential sampling with any t- or F-statistic rather than being restricted to certain ANOVA designs.

  8. Freeze-cast alumina pore networks: Effects of freezing conditions and dispersion medium

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, S. M.; Xiao, X.; Faber, K. T.

    Alumina ceramics were freeze-cast from water- and camphene-based slurries under varying freezing conditions and examined using X-ray computed tomography (XCT). Pore network characteristics, i.e., porosity, pore size, geometric surface area, and tortuosity, were measured from XCT reconstructions and the data were used to develop a model to predict feature size from processing conditions. Classical solidification theory was used to examine relationships between pore size, temperature gradients, and freezing front velocity. Freezing front velocity was subsequently predicted from casting conditions via the two-phase Stefan problem. Resulting models for water-based samples agreed with solidification-based theories predicting lamellar spacing of binary eutectic alloys, and models for camphene-based samples concurred with those for dendritic growth. Relationships between freezing conditions and geometric surface area were also modeled by considering the inverse relationship between pore size and surface area. Tortuosity was determined to be dependent primarily on the type of dispersion medium. © 2015 Elsevier Ltd. All rights reserved.

  9. Identification of usual interstitial pneumonia pattern using RNA-Seq and machine learning: challenges and solutions.

    PubMed

    Choi, Yoonha; Liu, Tiffany Ting; Pankratz, Daniel G; Colby, Thomas V; Barth, Neil M; Lynch, David A; Walsh, P Sean; Raghu, Ganesh; Kennedy, Giulia C; Huang, Jing

    2018-05-09

    We developed a classifier using RNA sequencing data that identifies the usual interstitial pneumonia (UIP) pattern for the diagnosis of idiopathic pulmonary fibrosis. We addressed significant challenges, including limited sample size, biological and technical sample heterogeneity, and reagent and assay batch effects. We identified inter- and intra-patient heterogeneity, particularly within the non-UIP group. The models classified UIP on transbronchial biopsy samples with a receiver-operating characteristic area under the curve of ~0.9 in cross-validation. Using in silico mixed samples in training, we prospectively defined a decision boundary to optimize specificity at ≥85%. The penalized logistic regression model showed greater reproducibility across technical replicates and was chosen as the final model. The final model showed sensitivity of 70% and specificity of 88% in the test set. We demonstrated that the suggested methodologies appropriately addressed challenges of the sample size, disease heterogeneity and technical batch effects and developed a highly accurate and robust classifier leveraging RNA sequencing for the classification of UIP.

  10. Extension of latin hypercube samples with correlated variables.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hora, Stephen Curtis; Helton, Jon Craig; Sallaberry, Cedric J. PhD.

    2006-11-01

    A procedure for extending the size of a Latin hypercube sample (LHS) with rank correlated variables is described and illustrated. The extension procedure starts with an LHS of size m and associated rank correlation matrix C and constructs a new LHS of size 2m that contains the elements of the original LHS and has a rank correlation matrix that is close to the original rank correlation matrix C. The procedure is intended for use in conjunction with uncertainty and sensitivity analysis of computationally demanding models in which it is important to make efficient use of a necessarily limited number of model evaluations.
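
    The doubling step described above can be illustrated with a simplified sketch. It covers only the stratification part: it extends an LHS of size m on the unit hypercube to size 2m while keeping the original points, and it makes no attempt to match a target rank correlation matrix C, which is the central contribution of the report. Uniform marginals and all function names are assumptions made for illustration.

```python
import random


def latin_hypercube(m, d, rng):
    """Return m points in [0, 1)^d with exactly one point per 1/m stratum in each column."""
    columns = []
    for _ in range(d):
        strata = list(range(m))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / m for s in strata])
    return [list(row) for row in zip(*columns)]


def extend_lhs(points, rng):
    """Add m new points so that all 2m points form an LHS of size 2m."""
    m, d = len(points), len(points[0])
    new_columns = []
    for j in range(d):
        # Split each stratum in two; every old point occupies one of the halves
        occupied = {int(p[j] * 2 * m) for p in points}
        free = [s for s in range(2 * m) if s not in occupied]
        rng.shuffle(free)  # random pairing of free half-strata across columns
        new_columns.append([(s + rng.random()) / (2 * m) for s in free])
    new_points = [list(row) for row in zip(*new_columns)]
    return points + new_points


def is_lhs(points):
    """Check that every column has exactly one point in each of the n equal strata."""
    n, d = len(points), len(points[0])
    return all(sorted(int(p[j] * n) for p in points) == list(range(n)) for j in range(d))


if __name__ == "__main__":
    rng = random.Random(0)
    base = latin_hypercube(5, 3, rng)
    extended = extend_lhs(base, rng)
    print(is_lhs(base), is_lhs(extended), extended[:5] == base)
```

    Reusing the original m points is what saves model evaluations: only the m new points need to be run through the expensive model.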

  11. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    ERIC Educational Resources Information Center

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  12. Integrated investigation of the mixed origin of lunar sample 72161,11

    NASA Technical Reports Server (NTRS)

    Basu, A.; Des Marais, D. J.; Hayes, J. M.; Meinschein, W. G.

    1975-01-01

    The comminution-agglutination model and the solar-wind implantation-retention model are used to postulate the origins of the particulate components of lunar sample (72161,11), a submillimeter fraction of a surface sample for the dark mantle regolith at LRV-3. Grain-size analysis was performed by wet sieving with liquid argon, and analyses for CO2, CO, CH4, and H2 were carried out by stepwise pyrolysis in a helium atmosphere. The results indicate that the present sample is from a mature regolith, but the agglutinate content is only 30% in the particle-size range between 90 and 177 microns, indicating an apparent departure from steady state. Analyses of the carbon, methane, and hydrogen concentrations in size fractions larger than 149 microns show that the volume-correlated component of these species increases with increased grain size. It is suggested that the observed increase can be explained in terms of mixing of a dominant local population of coarser agglutinates having high carbon and hydrogen concentrations with an imported population of finer agglutinates relatively poor in carbon and hydrogen.

  13. Efficient computation of the joint sample frequency spectra for multiple populations.

    PubMed

    Kamm, John A; Terhorst, Jonathan; Song, Yun S

    2017-01-01

    A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
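
    For orientation, the observed (empirical) single-population SFS can be computed from a 0/1 genotype matrix as sketched below. This is only an illustration of the summary statistic itself; it is not the momi algorithm, which computes the expected joint SFS under a demographic model. The function name and the toy data are assumptions.

```python
import numpy as np

def sample_frequency_spectrum(genotypes):
    """genotypes: (n_sequences, n_sites) array of 0 (ancestral) / 1 (derived)."""
    g = np.asarray(genotypes)
    n = g.shape[0]
    derived_counts = g.sum(axis=0)          # derived-allele count at each site
    # sfs[k] = number of sites at which exactly k of the n sequences carry the derived allele
    return np.bincount(derived_counts, minlength=n + 1)

# Toy data: 3 sequences, 4 sites (hypothetical values)
toy = [[1, 0, 1, 1],
       [1, 0, 0, 1],
       [0, 0, 1, 1]]
sfs = sample_frequency_spectrum(toy)
```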

  14. Efficient computation of the joint sample frequency spectra for multiple populations

    PubMed Central

    Kamm, John A.; Terhorst, Jonathan; Song, Yun S.

    2016-01-01

    A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity. PMID:28239248

  15. Analysis of Genetic Algorithm for Rule-Set Production (GARP) modeling approach for predicting distributions of fleas implicated as vectors of plague, Yersinia pestis, in California.

    PubMed

    Adjemian, Jennifer C Z; Girvetz, Evan H; Beckett, Laurel; Foley, Janet E

    2006-01-01

    More than 20 species of fleas in California are implicated as potential vectors of Yersinia pestis. Extremely limited spatial data exist for plague vectors, a key component to understanding where the greatest risks for human, domestic animal, and wildlife health exist. This study increases the spatial data available for 13 potential plague vectors by using the ecological niche modeling system Genetic Algorithm for Rule-Set Production (GARP) to predict their respective distributions. Because the available sample sizes in our data set varied greatly from one species to another, we also performed an analysis of the robustness of GARP by using the data available for flea Oropsylla montana (Baker) to quantify the effects that sample size and the chosen explanatory variables have on the final species distribution map. GARP effectively modeled the distributions of 13 vector species. Furthermore, our analyses show that all of these modeled ranges are robust, with a sample size of six fleas or greater not significantly impacting the percentage of the in-state area where the flea was predicted to be found, or the testing accuracy of the model. The results of this study will help guide the sampling efforts of future studies focusing on plague vectors.

  16. Malaria prevalence metrics in low- and middle-income countries: an assessment of precision in nationally-representative surveys.

    PubMed

    Alegana, Victor A; Wright, Jim; Bosco, Claudio; Okiro, Emelda A; Atkinson, Peter M; Snow, Robert W; Tatem, Andrew J; Noor, Abdisalan M

    2017-11-21

    One pillar to monitoring progress towards the Sustainable Development Goals is the investment in high quality data to strengthen the scientific basis for decision-making. At present, nationally-representative surveys are the main source of data for establishing a scientific evidence base, monitoring, and evaluation of health metrics. However, little is known about the optimal precision of various population-level health and development indicators, which remains unquantified in nationally-representative household surveys. Here, a retrospective analysis of the precision of prevalence from these surveys was conducted. Using malaria indicators, data were assembled in nine sub-Saharan African countries with at least two nationally-representative surveys. A Bayesian statistical model was used to estimate between- and within-cluster variability for fever and malaria prevalence, and insecticide-treated bed nets (ITNs) use in children under the age of 5 years. The intra-class correlation coefficient was estimated along with the optimal sample size for each indicator with associated uncertainty. Results suggest that the estimated sample sizes for the current nationally-representative surveys increase with declining malaria prevalence. Comparison between the actual sample size and the modelled estimate showed a requirement to increase the sample size for parasite prevalence by up to 77.7% (95% Bayesian credible intervals 74.7-79.4) for the 2015 Kenya MIS (estimated sample size of children 0-4 years 7218 [7099-7288]), and 54.1% [50.1-56.5] for the 2014-2015 Rwanda DHS (12,220 [11,950-12,410]). This study highlights the importance of defining indicator-relevant sample sizes to achieve the required precision in the current national surveys. While expanding the current surveys would need additional investment, the study highlights the need for improved approaches to cost effective sampling.
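
    The abstract does not give a closed-form formula (the authors use a Bayesian model), but the textbook design-effect calculation gives a rough sense of why the required sample size grows as prevalence falls. The sketch below assumes a fixed relative margin of error and equal cluster sizes; the function name, parameters, and example values are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist

def cluster_sample_size(p, rel_margin, icc, cluster_size, conf=0.95):
    """Sample size for estimating prevalence p to within +/- rel_margin * p.

    Assumptions: normal approximation, equal cluster sizes, and the
    standard design effect DEFF = 1 + (cluster_size - 1) * icc.
    """
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    # Simple-random-sampling size for a relative margin of error.
    n_srs = z**2 * (1 - p) / (p * rel_margin**2)
    deff = 1 + (cluster_size - 1) * icc
    return math.ceil(n_srs * deff)

# Lower prevalence needs a larger sample for the same relative precision.
n_low = cluster_sample_size(p=0.05, rel_margin=0.2, icc=0.1, cluster_size=20)
n_high = cluster_sample_size(p=0.20, rel_margin=0.2, icc=0.1, cluster_size=20)
```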

  17. Gravity or turbulence? IV. Collapsing cores in out-of-virial disguise

    NASA Astrophysics Data System (ADS)

    Ballesteros-Paredes, Javier; Vázquez-Semadeni, Enrique; Palau, Aina; Klessen, Ralf S.

    2018-06-01

    We study the dynamical state of massive cores by using a simple analytical model, an observational sample, and numerical simulations of collapsing massive cores. From the analytical model, we find that cores increase their column density and velocity dispersion as they collapse, resulting in a time evolution path in the Larson velocity dispersion-size diagram from large sizes and small velocity dispersions to small sizes and large velocity dispersions, while they tend to equipartition between gravity and kinetic energy. From the observational sample, we find that: (a) cores with substantially different column densities in the sample do not follow a Larson-like linewidth-size relation. Instead, cores with higher column densities tend to be located in the upper-left corner of the Larson velocity dispersion (σ_v,3D) versus size (R) diagram, a result explained in the hierarchical and chaotic collapse scenario. (b) Cores appear to have overvirial values. Finally, our numerical simulations reproduce the behavior predicted by the analytical model and depicted in the observational sample: collapsing cores evolve towards larger velocity dispersions and smaller sizes as they collapse and increase their column density. More importantly, however, they exhibit overvirial states. This apparent excess is due to the assumption that the gravitational energy is given by the energy of an isolated homogeneous sphere. However, such excess disappears when the gravitational energy is correctly calculated from the actual spatial mass distribution. We conclude that the observed energy budget of cores is consistent with their non-thermal motions being driven by their self-gravity and in the process of dynamical collapse.

  18. Spatial Sampling of Weather Data for Regional Crop Yield Simulations

    NASA Technical Reports Server (NTRS)

    Van Bussel, Lenny G. J.; Ewert, Frank; Zhao, Gang; Hoffmann, Holger; Enders, Andreas; Wallach, Daniel; Asseng, Senthold; Baigorria, Guillermo A.; Basso, Bruno; Biernath, Christian

    2016-01-01

    Field-scale crop models are increasingly applied at spatio-temporal scales that range from regions to the globe and from decades up to 100 years. Sufficiently detailed data to capture the prevailing spatio-temporal heterogeneity in weather, soil, and management conditions as needed by crop models are rarely available. Effective sampling may overcome the problem of missing data but has rarely been investigated. In this study the effect of sampling weather data has been evaluated for simulating yields of winter wheat in a region in Germany over a 30-year period (1982-2011) using 12 process-based crop models. A stratified sampling was applied to compare the effect of different sizes of spatially sampled weather data (10, 30, 50, 100, 500, 1000 and full coverage of 34,078 sampling points) on simulated wheat yields. Stratified sampling was further compared with random sampling. Possible interactions between sample size and crop model were evaluated. The results showed differences in simulated yields among crop models but all models reproduced well the pattern of the stratification. Importantly, the regional mean of simulated yields based on full coverage could already be reproduced by a small sample of 10 points. This was also true for reproducing the temporal variability in simulated yields but more sampling points (about 100) were required to accurately reproduce spatial yield variability. The number of sampling points can be smaller when a stratified sampling is applied as compared to a random sampling. However, differences between crop models were observed including some interaction between the effect of sampling on simulated yields and the model used. We concluded that stratified sampling can considerably reduce the number of required simulations. But, differences between crop models must be considered as the choice for a specific model can have larger effects on simulated yields than the sampling strategy. Assessing the impact of sampling soil and crop management data for regional simulations of crop yields is still needed.

  19. Reproducibility of preclinical animal research improves with heterogeneity of study samples

    PubMed Central

    Vogt, Lucile; Sena, Emily S.; Würbel, Hanno

    2018-01-01

    Single-laboratory studies conducted under highly standardized conditions are the gold standard in preclinical animal research. Using simulations based on 440 preclinical studies across 13 different interventions in animal models of stroke, myocardial infarction, and breast cancer, we compared the accuracy of effect size estimates between single-laboratory and multi-laboratory study designs. Single-laboratory studies generally failed to predict effect size accurately, and larger sample sizes rendered effect size estimates even less accurate. By contrast, multi-laboratory designs including as few as 2 to 4 laboratories increased coverage probability by up to 42 percentage points without a need for larger sample sizes. These findings demonstrate that within-study standardization is a major cause of poor reproducibility. More representative study samples are required to improve the external validity and reproducibility of preclinical animal research and to prevent wasting animals and resources for inconclusive research. PMID:29470495

  20. An empirical analysis of the quantitative effect of data when fitting quadratic and cubic polynomials

    NASA Technical Reports Server (NTRS)

    Canavos, G. C.

    1974-01-01

    A study is made of the extent to which the size of the sample affects the accuracy of a quadratic or a cubic polynomial approximation of an experimentally observed quantity, and the trend with regard to improvement in the accuracy of the approximation as a function of sample size is established. The task is made possible through a simulated analysis carried out by the Monte Carlo method in which data are simulated by using several transcendental or algebraic functions as models. Contaminated data of varying amounts are fitted to either quadratic or cubic polynomials, and the behavior of the mean-squared error of the residual variance is determined as a function of sample size. Results indicate that the effect of the size of the sample is significant only for relatively small sizes and diminishes drastically for moderate and large amounts of experimental data.
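
    A Monte Carlo experiment of the kind described can be sketched as follows. The generating function (an exponential), the noise level, the design points, and the replication count are all assumptions chosen for illustration; the report's actual functions and settings are not given in the abstract.

```python
import numpy as np

def mse_of_residual_variance(n, degree, sigma=0.1, reps=500, seed=0):
    """Monte Carlo estimate of the mean-squared error of the residual variance.

    Data come from a transcendental model, exp(x), plus Gaussian noise with
    standard deviation sigma, and are fitted by a polynomial of the given degree.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)
    errs = np.empty(reps)
    for r in range(reps):
        y = np.exp(x) + rng.normal(0.0, sigma, n)
        coef = np.polyfit(x, y, degree)
        resid = y - np.polyval(coef, x)
        s2 = resid @ resid / (n - degree - 1)   # residual variance estimate
        errs[r] = (s2 - sigma**2) ** 2          # squared error vs. true noise variance
    return errs.mean()

# Quadratic fit: the error drops quickly at first, then flattens out.
mse_by_n = {n: mse_of_residual_variance(n, degree=2) for n in (10, 50, 200)}
```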

  1. Sample size considerations for paired experimental design with incomplete observations of continuous outcomes.

    PubMed

    Zhu, Hong; Xu, Xiaohan; Ahn, Chul

    2017-01-01

    Paired experimental design is widely used in clinical and health behavioral studies, where each study unit contributes a pair of observations. Investigators often encounter incomplete observations of paired outcomes in the data collected. Some study units contribute complete pairs of observations, while the others contribute either pre- or post-intervention observations. Statistical inference for paired experimental design with incomplete observations of continuous outcomes has been extensively studied in the literature. However, sample size methods for such study designs are sparsely available. We derive a closed-form sample size formula based on the generalized estimating equation approach by treating the incomplete observations as missing data in a linear model. The proposed method properly accounts for the impact of mixed structure of observed data: a combination of paired and unpaired outcomes. The sample size formula is flexible to accommodate different missing patterns, magnitude of missingness, and correlation parameter values. We demonstrate that under complete observations, the proposed generalized estimating equation sample size estimate is the same as that based on the paired t-test. In the presence of missing data, the proposed method would lead to a more accurate sample size estimate compared with the crude adjustment. Simulation studies are conducted to evaluate the finite-sample performance of the generalized estimating equation sample size formula. A real application example is presented for illustration.
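
    The complete-data benchmark mentioned above, the paired t-test sample size, can be sketched with the usual normal approximation. This is not the authors' GEE formula, which the abstract does not reproduce; the function name and the equal-variance assumption for the pre- and post-intervention outcomes are ours.

```python
import math
from statistics import NormalDist

def paired_sample_size(delta, sigma, rho, alpha=0.05, power=0.8):
    """Number of complete pairs for a two-sided paired test (normal approximation).

    delta: mean pre/post difference to detect
    sigma: common standard deviation of each outcome (assumed equal)
    rho:   correlation between the paired outcomes
    """
    z = NormalDist()
    var_diff = 2 * sigma**2 * (1 - rho)   # variance of a within-pair difference
    n = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2 * var_diff / delta**2
    return math.ceil(n)

n_pairs = paired_sample_size(delta=0.5, sigma=1.0, rho=0.5)
```

    A higher within-pair correlation shrinks the variance of the difference and therefore the number of pairs required.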

  2. A Bayesian approach for incorporating economic factors in sample size design for clinical trials of individual drugs and portfolios of drugs.

    PubMed

    Patel, Nitin R; Ankolekar, Suresh

    2007-11-30

    Classical approaches to clinical trial design ignore economic factors that determine economic viability of a new drug. We address the choice of sample size in Phase III trials as a decision theory problem using a hybrid approach that takes a Bayesian view from the perspective of a drug company and a classical Neyman-Pearson view from the perspective of regulatory authorities. We incorporate relevant economic factors in the analysis to determine the optimal sample size to maximize the expected profit for the company. We extend the analysis to account for risk by using a 'satisficing' objective function that maximizes the chance of meeting a management-specified target level of profit. We extend the models for single drugs to a portfolio of clinical trials and optimize the sample sizes to maximize the expected profit subject to budget constraints. Further, we address the portfolio risk and optimize the sample sizes to maximize the probability of achieving a given target of expected profit.

  3. Violation of the Sphericity Assumption and Its Effect on Type-I Error Rates in Repeated Measures ANOVA and Multi-Level Linear Models (MLM).

    PubMed

    Haverkamp, Nicolas; Beauducel, André

    2017-01-01

    We investigated the effects of violations of the sphericity assumption on Type I error rates for different methodical approaches of repeated measures analysis using a simulation approach. In contrast to previous simulation studies on this topic, up to nine measurement occasions were considered. Effects of the level of inter-correlations between measurement occasions on Type I error rates were considered for the first time. Two populations with non-violation of the sphericity assumption, one with uncorrelated measurement occasions and one with moderately correlated measurement occasions, were generated. One population with violation of the sphericity assumption combines uncorrelated with highly correlated measurement occasions. A second population with violation of the sphericity assumption combines moderately correlated and highly correlated measurement occasions. From these four populations without any between-group effect or within-subject effect 5,000 random samples were drawn. Finally, the mean Type I error rates for Multilevel linear models (MLM) with an unstructured covariance matrix (MLM-UN), MLM with compound-symmetry (MLM-CS) and for repeated measures analysis of variance (rANOVA) models (without correction, with Greenhouse-Geisser-correction, and Huynh-Feldt-correction) were computed. To examine the effect of both the sample size and the number of measurement occasions, sample sizes of n = 20, 40, 60, 80, and 100 were considered as well as measurement occasions of m = 3, 6, and 9. With respect to rANOVA, the results favor the use of rANOVA with Huynh-Feldt-correction, especially when the sphericity assumption is violated, the sample size is rather small and the number of measurement occasions is large. For MLM-UN, the results illustrate a massive progressive bias for small sample sizes (n = 20) and m = 6 or more measurement occasions. This effect could not be found in previous simulation studies with a smaller number of measurement occasions. The proportionality of bias and number of measurement occasions should be considered when MLM-UN is used. The good news is that this proportionality can be compensated by means of large sample sizes. Accordingly, MLM-UN can be recommended even for small sample sizes for about three measurement occasions and for large sample sizes for about nine measurement occasions.

  4. Atomistic origin of size effects in fatigue behavior of metallic glasses

    NASA Astrophysics Data System (ADS)

    Sha, Zhendong; Wong, Wei Hin; Pei, Qingxiang; Branicio, Paulo Sergio; Liu, Zishun; Wang, Tiejun; Guo, Tianfu; Gao, Huajian

    2017-07-01

    While many experiments and simulations on metallic glasses (MGs) have focused on their tensile ductility under monotonic loading, the fatigue mechanisms of MGs under cyclic loading still remain largely elusive. Here we perform molecular dynamics (MD) and finite element simulations of tension-compression fatigue tests in MGs to elucidate their fatigue mechanisms with focus on the sample size effect. Shear band (SB) thickening is found to be the inherent fatigue mechanism for nanoscale MGs. The difference in fatigue mechanisms between macroscopic and nanoscale MGs originates from whether the SB forms partially or fully through the cross-section of the specimen. Furthermore, a qualitative investigation of the sample size effect suggests that small sample size increases the fatigue life while large sample size promotes cyclic softening and necking. Our observations on the size-dependent fatigue behavior can be rationalized by the Gurson model and the concept of surface tension of the nanovoids. The present study sheds light on the fatigue mechanisms of MGs and can be useful in interpreting previous experimental results.

  5. Estimating the Size of a Large Network and its Communities from a Random Sample

    PubMed Central

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W.

    2017-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. PMID:28867924

  6. Estimating the Size of a Large Network and its Communities from a Random Sample.

    PubMed

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W

    2016-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios.

  7. Sub-sampling genetic data to estimate black bear population size: A case study

    USGS Publications Warehouse

    Tredick, C.A.; Vaughan, M.R.; Stauffer, D.F.; Simek, S.L.; Eason, T.

    2007-01-01

    Costs for genetic analysis of hair samples collected for individual identification of bears average approximately US$50 [2004] per sample. This can easily exceed budgetary allowances for large-scale studies or studies of high-density bear populations. We used 2 genetic datasets from 2 areas in the southeastern United States to explore how reducing costs of analysis by sub-sampling affected precision and accuracy of resulting population estimates. We used several sub-sampling scenarios to create subsets of the full datasets and compared summary statistics, population estimates, and precision of estimates generated from these subsets to estimates generated from the complete datasets. Our results suggested that bias and precision of estimates improved as the proportion of total samples used increased, and heterogeneity models (e.g., Mh[CHAO]) were more robust to reduced sample sizes than other models (e.g., behavior models). We recommend that only high-quality samples (>5 hair follicles) be used when budgets are constrained, and efforts should be made to maximize capture and recapture rates in the field.

  8. The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival

    PubMed Central

    Kordjazi, Ziya; Frusher, Stewart; Buxton, Colin; Gardner, Caleb; Bird, Tomas

    2016-01-01

    Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters) were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day) and four levels of the number of sampling-days (2, 4, 6 and 7 days). The most parsimonious Cormack-Jolly-Seber (CJS) model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery. PMID:26990561

  9. Measures of precision for dissimilarity-based multivariate analysis of ecological communities

    PubMed Central

    Anderson, Marti J; Santana-Garcon, Julia

    2015-01-01

    Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. PMID:25438826
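
    The abstract mentions R functions supplied by the authors; the Python sketch below is an independent illustration for a single group, assuming the definition MultSE = sqrt(V/n), where V = SS/(n - 1) and SS is the sum of squared dissimilarities over all pairs divided by n. The function name `mult_se` is hypothetical, and the double resampling step is not shown.

```python
import numpy as np

def mult_se(d):
    """Pseudo multivariate standard error from an (n x n) dissimilarity matrix."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    # Sum of squared dissimilarities over the upper triangle (each pair once), divided by n.
    ss = (np.triu(d, k=1) ** 2).sum() / n
    v = ss / (n - 1)          # pseudo multivariate variance
    return np.sqrt(v / n)

# Sanity check: with Euclidean distance on one variable, this reduces to
# the ordinary standard error of the mean.
x = np.array([1.0, 2.0, 4.0, 7.0])
d = np.abs(x[:, None] - x[None, :])
se = mult_se(d)
```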

  10. Performance of Bootstrapping Approaches To Model Test Statistics and Parameter Standard Error Estimation in Structural Equation Modeling.

    ERIC Educational Resources Information Center

    Nevitt, Jonathan; Hancock, Gregory R.

    2001-01-01

    Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…

  11. Optimal number of features as a function of sample size for various classification rules.

    PubMed

    Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R

    2005-04-15

    Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as resource for those working with small-sample classification. For the companion website, please visit http://public.tgen.org/tamu/ofs/. Contact: e-dougherty@ee.tamu.edu.

  12. Waif goodbye! Average-size female models promote positive body image and appeal to consumers.

    PubMed

    Diedrichs, Phillippa C; Lee, Christina

    2011-10-01

    Despite consensus that exposure to media images of thin fashion models is associated with poor body image and disordered eating behaviours, few attempts have been made to enact change in the media. This study sought to investigate an effective alternative to current media imagery, by exploring the advertising effectiveness of average-size female fashion models, and their impact on the body image of both women and men. A sample of 171 women and 120 men were assigned to one of three advertisement conditions: no models, thin models and average-size models. Women and men rated average-size models as equally effective in advertisements as thin and no models. For women with average and high levels of internalisation of cultural beauty ideals, exposure to average-size female models was associated with a significantly more positive body image state in comparison to exposure to thin models and no models. For men reporting high levels of internalisation, exposure to average-size models was also associated with a more positive body image state in comparison to viewing thin models. These findings suggest that average-size female models can promote positive body image and appeal to consumers.

  13. Pore-scale simulations of drainage in granular materials: Finite size effects and the representative elementary volume

    NASA Astrophysics Data System (ADS)

    Yuan, Chao; Chareyre, Bruno; Darve, Félix

    2016-09-01

    A pore-scale model is introduced for two-phase flow in dense packings of polydisperse spheres. The model is developed as a component of a more general hydromechanical coupling framework based on the discrete element method, which will be elaborated in future papers and will apply to various processes of interest in soil science, in geomechanics and in oil and gas production. Here the emphasis is on the generation of a network of pores mapping the void space between spherical grains, and the definition of local criteria governing the primary drainage process. The pore space is decomposed by Regular Triangulation, from which a set of pores connected by throats is identified. A local entry capillary pressure is evaluated for each throat, based on the balance of capillary pressure and surface tension at equilibrium. The model reflects the possible entrapment of disconnected patches of the receding wetting phase. It is validated by a comparison with drainage experiments. In the last part of the paper, a series of simulations are reported to illustrate size and boundary effects, key questions when studying small samples made of spherical particles, be it in simulations or experiments. Repeated tests on samples of different sizes give water content evolutions that are not only scattered but also strongly biased for small sample sizes. More than 20,000 spheres are needed to reduce the bias on saturation below 0.02. Additional statistics are generated by subsampling a large sample of 64,000 spheres. They suggest that the minimal sampling volume for evaluating saturation is one hundred times greater than the sampling volume needed for measuring porosity with the same accuracy. This requirement in terms of sample size induces a need for efficient computer codes. The method described herein has a low algorithmic complexity in order to satisfy this requirement. It will be well suited to further developments toward coupled flow-deformation problems in which evolution of the microstructure requires frequent updates of the pore network.

  14. Role of sediment size and biostratinomy on the development of biofilms in recent avian vertebrate remains

    NASA Astrophysics Data System (ADS)

    Peterson, Joseph E.; Lenczewski, Melissa E.; Clawson, Steven R.; Warnock, Jonathan P.

    2017-04-01

    Microscopic soft tissues have been identified in fossil vertebrate remains collected from various lithologies. However, the diagenetic mechanisms to preserve such tissues have remained elusive. While previous studies have described infiltration of biofilms in Haversian and Volkmann’s canals, biostratinomic alteration (e.g., trampling), and iron derived from hemoglobin as playing roles in the preservation processes, the influence of sediment texture has not previously been investigated. This study uses a Kolmogorov Smirnov Goodness-of-Fit test to explore the influence of biostratinomic variability and burial media against the infiltration of biofilms in bone samples. Controlled columns of sediment with bone samples were used to simulate burial and subsequent groundwater flow. Sediments used in this study include clay-, silt-, and sand-sized particles modeled after various fluvial facies commonly associated with fossil vertebrates. Extant limb bone samples obtained from Gallus gallus domesticus (Domestic Chicken) buried in clay-rich sediment exhibit heavy biofilm infiltration, while bones buried in sands and silts exhibit moderate levels. Crushed bones exhibit significantly lower biofilm infiltration than whole bone samples. Strong interactions between biostratinomic alteration and sediment size are also identified with respect to biofilm development. Sediments modeling crevasse splay deposits exhibit considerable variability; whole-bone crevasse splay samples exhibit higher frequencies of high-level biofilm infiltration, and crushed-bone samples in modeled crevasse splay deposits display relatively high frequencies of low-level biofilm infiltration. These results suggest that sediment size, depositional setting, and biostratinomic condition play key roles in biofilm infiltration in vertebrate remains, and may influence soft tissue preservation in fossil vertebrates.

  15. Diagnostic test accuracy and prevalence inferences based on joint and sequential testing with finite population sampling.

    PubMed

    Su, Chun-Lung; Gardner, Ian A; Johnson, Wesley O

    2004-07-30

    The two-test two-population model, originally formulated by Hui and Walter, for estimation of test accuracy and prevalence estimation assumes conditionally independent tests, constant accuracy across populations and binomial sampling. The binomial assumption is incorrect if all individuals in a population e.g. child-care centre, village in Africa, or a cattle herd are sampled or if the sample size is large relative to population size. In this paper, we develop statistical methods for evaluating diagnostic test accuracy and prevalence estimation based on finite sample data in the absence of a gold standard. Moreover, two tests are often applied simultaneously for the purpose of obtaining a 'joint' testing strategy that has either higher overall sensitivity or specificity than either of the two tests considered singly. Sequential versions of such strategies are often applied in order to reduce the cost of testing. We thus discuss joint (simultaneous and sequential) testing strategies and inference for them. Using the developed methods, we analyse two real and one simulated data sets, and we compare 'hypergeometric' and 'binomial-based' inferences. Our findings indicate that the posterior standard deviations for prevalence (but not sensitivity and specificity) based on finite population sampling tend to be smaller than their counterparts for infinite population sampling. Finally, we make recommendations about how small the sample size should be relative to the population size to warrant use of the binomial model for prevalence estimation. Copyright 2004 John Wiley & Sons, Ltd.
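    The finite-population point in this abstract can be illustrated with a minimal sketch. It assumes a perfect test (no sensitivity or specificity error) and a hypothetical herd of 100 animals with 20 positives, so it only shows the sampling part of the problem, not the authors' Bayesian method. The function names are illustrative. When half the herd is sampled, the hypergeometric variance of the number of positives is the binomial variance multiplied by the finite population correction (N - n) / (N - 1).

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(k positives) when n units are drawn with replacement (infinite population)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeom_pmf(k, N, K, n):
    """P(k positives) when n units are drawn without replacement
    from a population of N units containing K positives."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def variance(pmf_values):
    """Variance of a distribution on 0, 1, 2, ... given as a list of probabilities."""
    mean = sum(k * p for k, p in enumerate(pmf_values))
    return sum((k - mean) ** 2 * p for k, p in enumerate(pmf_values))


# Hypothetical herd: N animals, K truly positive, n sampled (assumed values).
N, K, n = 100, 20, 50
p = K / N

var_binom = variance([binomial_pmf(k, n, p) for k in range(n + 1)])
var_hyper = variance([hypergeom_pmf(k, N, K, n) for k in range(n + 1)])
fpc = (N - n) / (N - 1)  # finite population correction

print(f"binomial variance:       {var_binom:.4f}")
print(f"hypergeometric variance: {var_hyper:.4f}")
print(f"binomial variance x FPC: {var_binom * fpc:.4f}")
```

    As n becomes small relative to N the correction approaches 1 and the two models agree, which is the regime in which the binomial approximation is usually considered acceptable.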

  16. A time-varying effect model for examining group differences in trajectories of zero-inflated count outcomes with applications in substance abuse research.

    PubMed

    Yang, Songshan; Cranford, James A; Jester, Jennifer M; Li, Runze; Zucker, Robert A; Buu, Anne

    2017-02-28

    This study proposes a time-varying effect model for examining group differences in trajectories of zero-inflated count outcomes. The motivating example demonstrates that this zero-inflated Poisson model allows investigators to study group differences in different aspects of substance use (e.g., the probability of abstinence and the quantity of alcohol use) simultaneously. The simulation study shows that the accuracy of estimation of trajectory functions improves as the sample size increases; the accuracy under equal group sizes is only higher when the sample size is small (100). In terms of the performance of the hypothesis testing, the type I error rates are close to their corresponding significance levels under all settings. Furthermore, the power increases as the alternative hypothesis deviates more from the null hypothesis, and the rate of this increasing trend is higher when the sample size is larger. Moreover, the hypothesis test for the group difference in the zero component tends to be less powerful than the test for the group difference in the Poisson component. Copyright © 2016 John Wiley & Sons, Ltd.
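    For readers unfamiliar with the zero-inflated Poisson distribution that underlies this model, the following is a minimal sketch of its probability mass function at a single time point. The parameter names `pi_zero` (probability of a structural zero, e.g. abstinence) and `lam` (Poisson mean among users) are assumptions for illustration. The time-varying coefficients of the paper are not modeled here.

```python
from math import exp, factorial


def zip_pmf(k, pi_zero, lam):
    """Zero-inflated Poisson pmf.

    With probability pi_zero the count is a structural zero;
    otherwise it is drawn from Poisson(lam).
    """
    poisson = exp(-lam) * lam**k / factorial(k)
    if k == 0:
        return pi_zero + (1 - pi_zero) * poisson
    return (1 - pi_zero) * poisson


def zip_mean(pi_zero, lam):
    """Mean of the zero-inflated Poisson distribution."""
    return (1 - pi_zero) * lam


# Assumed illustrative values: 30% structural zeros, mean of 2 drinks otherwise.
pi_zero, lam = 0.3, 2.0
for k in range(5):
    print(k, round(zip_pmf(k, pi_zero, lam), 4))
```

    The two parameters correspond to the two components compared in the abstract: a group difference in `pi_zero` is a difference in the zero component, and a group difference in `lam` is a difference in the Poisson component.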

  17. Bootstrap Estimation of Sample Statistic Bias in Structural Equation Modeling.

    ERIC Educational Resources Information Center

    Thompson, Bruce; Fan, Xitao

    This study empirically investigated bootstrap bias estimation in the area of structural equation modeling (SEM). Three correctly specified SEM models were used under four different sample size conditions. Monte Carlo experiments were carried out to generate the criteria against which bootstrap bias estimation should be judged. For SEM fit indices,…

  18. VARS-TOOL: A Comprehensive, Efficient, and Robust Sensitivity Analysis Toolbox

    NASA Astrophysics Data System (ADS)

    Razavi, S.; Sheikholeslami, R.; Haghnegahdar, A.; Esfahbod, B.

    2016-12-01

    VARS-TOOL is an advanced sensitivity and uncertainty analysis toolbox, applicable to the full range of computer simulation models, including Earth and Environmental Systems Models (EESMs). The toolbox was developed originally around VARS (Variogram Analysis of Response Surfaces), which is a general framework for Global Sensitivity Analysis (GSA) that utilizes the variogram/covariogram concept to characterize the full spectrum of sensitivity-related information, thereby providing a comprehensive set of "global" sensitivity metrics with minimal computational cost. VARS-TOOL is unique in that, with a single sample set (set of simulation model runs), it generates simultaneously three philosophically different families of global sensitivity metrics, including (1) variogram-based metrics called IVARS (Integrated Variogram Across a Range of Scales - VARS approach), (2) variance-based total-order effects (Sobol approach), and (3) derivative-based elementary effects (Morris approach). VARS-TOOL is also enabled with two novel features; the first one being a sequential sampling algorithm, called Progressive Latin Hypercube Sampling (PLHS), which allows progressively increasing the sample size for GSA while maintaining the required sample distributional properties. The second feature is a "grouping strategy" that adaptively groups the model parameters based on their sensitivity or functioning to maximize the reliability of GSA results. These features in conjunction with bootstrapping enable the user to monitor the stability, robustness, and convergence of GSA with the increase in sample size for any given case study. VARS-TOOL has been shown to achieve robust and stable results within 1-2 orders of magnitude smaller sample sizes (fewer model runs) than alternative tools. VARS-TOOL, available in MATLAB and Python, is under continuous development and new capabilities and features are forthcoming.
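    As background for the sampling step mentioned above, here is a minimal sketch of ordinary Latin hypercube sampling on the unit hypercube. It is not the Progressive Latin Hypercube Sampling (PLHS) algorithm and does not use or reproduce any VARS-TOOL interface. The function name and arguments are hypothetical.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in [0, 1)^n_dims.

    Each dimension is split into n_samples equal strata, and every
    stratum receives exactly one point (plain LHS, not PLHS).
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([(s + rng.random()) / n_samples for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


points = latin_hypercube(n_samples=10, n_dims=3, seed=1)
for pt in points[:3]:
    print(tuple(round(x, 3) for x in pt))
```

    A plain LHS design loses its one-point-per-stratum property when more points are appended. According to the abstract, PLHS is designed to grow the sample while maintaining the required distributional properties.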

  19. Clinical and MRI activity as determinants of sample size for pediatric multiple sclerosis trials

    PubMed Central

    Verhey, Leonard H.; Signori, Alessio; Arnold, Douglas L.; Bar-Or, Amit; Sadovnick, A. Dessa; Marrie, Ruth Ann; Banwell, Brenda

    2013-01-01

    Objective: To estimate sample sizes for pediatric multiple sclerosis (MS) trials using new T2 lesion count, annualized relapse rate (ARR), and time to first relapse (TTFR) endpoints. Methods: Poisson and negative binomial models were fit to new T2 lesion and relapse count data, and negative binomial time-to-event and exponential models were fit to TTFR data of 42 children with MS enrolled in a national prospective cohort study. Simulations were performed by resampling from the best-fitting model of new T2 lesion count, number of relapses, or TTFR, under various assumptions of the effect size, trial duration, and model parameters. Results: Assuming a 50% reduction in new T2 lesions over 6 months, 90 patients/arm are required, whereas 165 patients/arm are required for a 40% treatment effect. Sample sizes for 2-year trials using relapse-related endpoints are lower than that for 1-year trials. For 2-year trials and a conservative assumption of overdispersion (ϑ), sample sizes range from 70 patients/arm (using ARR) to 105 patients/arm (TTFR) for a 50% reduction in relapses, and 230 patients/arm (ARR) to 365 patients/arm (TTFR) for a 30% relapse reduction. Assuming a less conservative ϑ, 2-year trials using ARR require 45 patients/arm (60 patients/arm for TTFR) for a 50% reduction in relapses and 145 patients/arm (200 patients/arm for TTFR) for a 30% reduction. Conclusion: Six-month phase II trials using new T2 lesion count as an endpoint are feasible in the pediatric MS population; however, trials powered on ARR or TTFR will need to be 2 years in duration and will require multicentered collaboration. PMID:23966255

  20. Rapid and non-invasive analysis of deoxynivalenol in durum and common wheat by Fourier-Transform Near Infrared (FT-NIR) spectroscopy.

    PubMed

    De Girolamo, A; Lippolis, V; Nordkvist, E; Visconti, A

    2009-06-01

    Fourier transform near-infrared spectroscopy (FT-NIR) was used for rapid and non-invasive analysis of deoxynivalenol (DON) in durum and common wheat. The relevance of using ground wheat samples with a homogeneous particle size distribution to minimize measurement variations and avoid DON segregation among particles of different sizes was established. Calibration models for durum wheat, common wheat and durum + common wheat samples, with particle size <500 µm, were obtained by using partial least squares (PLS) regression with an external validation technique. Values of root mean square error of prediction (RMSEP, 306-379 µg kg⁻¹) were comparable and not too far from values of root mean square error of cross-validation (RMSECV, 470-555 µg kg⁻¹). Coefficients of determination (r²) indicated an "approximate to good" level of prediction of the DON content by FT-NIR spectroscopy in the PLS calibration models (r² = 0.71-0.83), and a "good" discrimination between low and high DON contents in the PLS validation models (r² = 0.58-0.63). A "limited to good" practical utility of the models was ascertained by range error ratio (RER) values higher than 6. A qualitative model, based on 197 calibration samples, was developed to discriminate between blank and naturally contaminated wheat samples by setting a cut-off at 300 µg kg⁻¹ DON to separate the two classes. The model correctly classified 69% of the 65 validation samples with most misclassified samples (16 of 20) showing DON contamination levels quite close to the cut-off level. These findings suggest that FT-NIR analysis is suitable for the determination of DON in unprocessed wheat at levels far below the maximum permitted limits set by the European Commission.

  1. Prediction of bovine milk technological traits from mid-infrared spectroscopy analysis in dairy cows.

    PubMed

    Visentin, G; McDermott, A; McParland, S; Berry, D P; Kenny, O A; Brodkorb, A; Fenelon, M A; De Marchi, M

    2015-09-01

    Rapid, cost-effective monitoring of milk technological traits is a significant challenge for dairy industries specialized in cheese manufacturing. The objective of the present study was to investigate the ability of mid-infrared spectroscopy to predict rennet coagulation time, curd-firming time, curd firmness at 30 and 60 min after rennet addition, heat coagulation time, casein micelle size, and pH in cow milk samples, and to quantify associations between these milk technological traits and conventional milk quality traits. Samples (n=713) were collected from 605 cows from multiple herds; the samples represented multiple breeds, stages of lactation, parities, and milking times. Reference analyses were undertaken in accordance with standardized methods, and mid-infrared spectra in the range of 900 to 5,000 cm⁻¹ were available for all samples. Prediction models were developed using partial least squares regression, and prediction accuracy was based on both cross and external validation. The proportion of variance explained by the prediction models in external validation was greatest for pH (71%), followed by rennet coagulation time (55%) and milk heat coagulation time (46%). Models to predict curd firmness 60 min from rennet addition and casein micelle size, however, were poor, explaining only 25 and 13%, respectively, of the total variance in each trait within external validation. On average, all prediction models tended to be unbiased. The linear regression coefficient of the reference value on the predicted value varied from 0.17 (casein micelle size regression model) to 0.83 (pH regression model) but all differed from 1. The ratio performance deviation of 1.07 (casein micelle size prediction model) to 1.79 (pH prediction model) for all prediction models in the external validation was <2, suggesting that none of the prediction models could be used for analytical purposes. With the exception of casein micelle size and curd firmness at 60 min after rennet addition, the developed prediction models may be useful as a screening method, because the concordance correlation coefficient ranged from 0.63 (heat coagulation time prediction model) to 0.84 (pH prediction model) in the external validation. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
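    The validation metrics quoted in this abstract can be sketched as follows. The definitions used here are common ones and are assumptions, since the abstract does not state its formulas: RPD is taken as the sample standard deviation of the reference values divided by the root mean square error of prediction, and the concordance correlation coefficient is Lin's version with population (1/n) moments. The data are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    return sqrt(mean([(r - p) ** 2 for r, p in zip(reference, predicted)]))


def rpd(reference, predicted):
    """Ratio performance deviation: SD of reference values / RMSEP (assumed definition)."""
    return stdev(reference) / rmsep(reference, predicted)


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient using 1/n moments."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    var_x = mean([(a - mx) ** 2 for a in x])
    var_y = mean([(b - my) ** 2 for b in y])
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


# Made-up reference and predicted values (predictions biased by +1).
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 3.0, 4.0, 5.0, 6.0]

print("RMSEP:", rmsep(reference, predicted))
print("RPD:  ", rpd(reference, predicted))
print("CCC:  ", concordance_cc(reference, predicted))
```

    Unlike the Pearson correlation, which is 1 for this example, the concordance correlation coefficient is penalized by the constant bias, which is why it is often reported alongside RPD when judging agreement with reference values.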

  2. Model calibration and validation for OFMSW and sewage sludge co-digestion reactors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Esposito, G., E-mail: giovanni.esposito@unicas.it; Frunzo, L., E-mail: luigi.frunzo@unina.it; Panico, A., E-mail: anpanico@unina.it

    2011-12-15

    Highlights: > Disintegration is the limiting step of the anaerobic co-digestion process. > Disintegration kinetic constant does not depend on the waste particle size. > Disintegration kinetic constant depends only on the waste nature and composition. > The model calibration can be performed on organic waste of any particle size. - Abstract: A mathematical model has recently been proposed by the authors to simulate the biochemical processes that prevail in a co-digestion reactor fed with sewage sludge and the organic fraction of municipal solid waste. This model is based on the Anaerobic Digestion Model no. 1 of the International Water Association, which has been extended to include the co-digestion processes, using surface-based kinetics to model the organic waste disintegration and conversion to carbohydrates, proteins and lipids. When organic waste solids are present in the reactor influent, the disintegration process is the rate-limiting step of the overall co-digestion process. The main advantage of the proposed modeling approach is that the kinetic constant of such a process does not depend on the waste particle size distribution (PSD) and rather depends only on the nature and composition of the waste particles. The model calibration aimed to assess the kinetic constant of the disintegration process can therefore be conducted using organic waste samples of any PSD, and the resulting value will be suitable for all the organic wastes of the same nature as the investigated samples, independently of their PSD. This assumption was proven in this study by biomethane potential experiments that were conducted on organic waste samples with different particle sizes. The results of these experiments were used to calibrate and validate the mathematical model, resulting in a good agreement between the simulated and observed data for any investigated particle size of the solid waste. This study confirms the strength of the proposed model and calibration procedure, which can thus be used to assess the treatment efficiency and predict the methane production of full-scale digesters.

  3. Optical properties of mineral dust aerosol including analysis of particle size, composition, and shape effects, and the impact of physical and chemical processing

    NASA Astrophysics Data System (ADS)

    Alexander, Jennifer Mary

    Atmospheric mineral dust has a large impact on the earth's radiation balance and climate. The radiative effects of mineral dust depend on factors including particle size, shape, and composition, which can all be extremely complex. Mineral dust particles are typically irregular in shape and can include sharp edges, voids, and fine scale surface roughness. Particle shape can also depend on the type of mineral and can vary as a function of particle size. In addition, atmospheric mineral dust is a complex mixture of different minerals as well as other, possibly organic, components that have been mixed in while these particles are suspended in the atmosphere. Aerosol optical properties are investigated in this work, including studies of the effect of particle size, shape, and composition on the infrared (IR) extinction and visible scattering properties in order to achieve more accurate modeling methods. Studies of particle shape effects on dust optical properties for single component mineral samples of silicate clay and diatomaceous earth are carried out here first. Experimental measurements are modeled using T-matrix theory in a uniform spheroid approximation. Previous efforts to simulate the measured optical properties of silicate clay, using models that assumed particle shape was independent of particle size, have achieved only limited success. However, a model which accounts for a correlation between particle size and shape for the silicate clays offers a large improvement over earlier modeling approaches. Diatomaceous earth is also studied as an example of a single component mineral dust aerosol with extreme particle shapes. A particle shape distribution, determined by fitting the experimental IR extinction data, is used as a basis for modeling the visible light scattering properties. 
While the visible simulations show only modestly good agreement with the scattering data, the fits are generally better than those obtained using more commonly invoked particle shape distributions. The next goal of this work is to investigate if modeling methods developed in the studies of single mineral components can be generalized to predict the optical properties of more authentic aerosol samples which are complex mixtures of different minerals. Samples of Saharan sand, Iowa loess, and Arizona road dust are used here as test cases. T-matrix based simulations of the authentic samples, using measured particle size distributions, empirical mineralogies, and a priori particle shape models for each mineral component are directly compared with the measured IR extinction spectra and visible scattering profiles. This modeling approach offers a significant improvement over more commonly applied models that ignore variations in particle shape with size or mineralogy and include only a moderate range of shape parameters. Mineral dust samples processed with organic acids and humic material are also studied in order to explore how the optical properties of dust can change after being aged in the atmosphere. Processed samples include quartz mixed with humic material, and calcite reacted with acetic and oxalic acid. Clear differences in the light scattering properties are observed for all three processed mineral dust samples when compared to the unprocessed mineral dust or organic salt products. These interactions result in both internal and external mixtures depending on the sample. In addition, the presence of these organic materials can alter the mineral dust particle shape. Overall, however, these results demonstrate the need to account for the effects of atmospheric aging of mineral dust on aerosol optical properties. Particle shape can also affect the aerodynamic properties of mineral dust aerosol. 
In order to account for these effects, the dynamic shape factor is used to give a measure of particle asphericity. Dynamic shape factors of quartz are measured by mass and mobility selecting particles and measuring their vacuum aerodynamic diameter. From this, dynamic shape factors in both the transition and vacuum regime can be derived. The measured dynamic shape factors of quartz agree quite well with the spheroidal shape distributions derived through studies of the optical properties.

  4. Modified Toxicity Probability Interval Design: A Safer and More Reliable Method Than the 3 + 3 Design for Practical Phase I Trials

    PubMed Central

    Ji, Yuan; Wang, Sue-Jane

    2013-01-01

    The 3 + 3 design is the most common choice among clinicians for phase I dose-escalation oncology trials. In recent reviews, more than 95% of phase I trials have been based on the 3 + 3 design. Given that it is intuitive and its implementation does not require a computer program, clinicians can conduct 3 + 3 dose escalations in practice with virtually no logistic cost, and trial protocols based on the 3 + 3 design pass institutional review board and biostatistics reviews quickly. However, the performance of the 3 + 3 design has rarely been compared with model-based designs in simulation studies with matched sample sizes. In the vast majority of statistical literature, the 3 + 3 design has been shown to be inferior in identifying true maximum-tolerated doses (MTDs), although the sample size required by the 3 + 3 design is often orders-of-magnitude smaller than model-based designs. In this article, through comparative simulation studies with matched sample sizes, we demonstrate that the 3 + 3 design has higher risks of exposing patients to toxic doses above the MTD than the modified toxicity probability interval (mTPI) design, a newly developed adaptive method. In addition, compared with the mTPI design, the 3 + 3 design does not yield higher probabilities in identifying the correct MTD, even when the sample size is matched. Given that the mTPI design is equally transparent, costless to implement with free software, and more flexible in practical situations, we highly encourage its adoption in early dose-escalation studies whenever the 3 + 3 design is also considered. We provide free software to allow direct comparisons of the 3 + 3 design with other model-based designs in simulation studies with matched sample sizes. PMID:23569307
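    The 3 + 3 rule discussed above can be simulated in a few lines. This is a minimal sketch of one common textbook variant, and it is not the authors' software or the mTPI design: a cohort of 3 is treated; 0 dose-limiting toxicities (DLTs) leads to escalation, 1 DLT leads to 3 more patients at the same dose, and 2 or more DLTs in the 3 or 6 patients stops the trial with the previous dose declared the MTD. The function name and the toxicity probabilities are assumptions for illustration.

```python
import random


def simulate_3plus3(tox_probs, rng):
    """Simulate one 3 + 3 trial (no de-escalation, no expansion at the MTD).

    tox_probs: true DLT probability at each dose level, lowest first.
    Returns (mtd_index, n_patients); mtd_index is -1 if even the
    lowest dose is judged too toxic.
    """
    dose = 0
    n_patients = 0
    while True:
        dlts = sum(rng.random() < tox_probs[dose] for _ in range(3))
        n_patients += 3
        if dlts == 1:
            # Exactly one DLT in three: treat three more at the same dose.
            dlts += sum(rng.random() < tox_probs[dose] for _ in range(3))
            n_patients += 3
        if dlts >= 2:
            return dose - 1, n_patients  # previous dose is declared the MTD
        if dose == len(tox_probs) - 1:
            return dose, n_patients  # highest dose reached without excess toxicity
        dose += 1


# Hypothetical true toxicity curve; a target DLT rate near 0.30 would favor index 3.
tox_probs = [0.05, 0.10, 0.20, 0.30, 0.50]
rng = random.Random(2013)
n_trials = 10_000
counts = {}
total_patients = 0
for _ in range(n_trials):
    mtd, n = simulate_3plus3(tox_probs, rng)
    counts[mtd] = counts.get(mtd, 0) + 1
    total_patients += n

for mtd in sorted(counts):
    print(f"MTD index {mtd}: selected in {counts[mtd] / n_trials:.1%} of trials")
print(f"mean sample size: {total_patients / n_trials:.1f}")
```

    Recording the sample size of each simulated trial is what makes a matched-sample-size comparison with a model-based design possible, which is the comparison the abstract argues has rarely been done.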

  5. Estimating sample size for landscape-scale mark-recapture studies of North American migratory tree bats

    USGS Publications Warehouse

    Ellison, Laura E.; Lukacs, Paul M.

    2014-01-01

    Concern for migratory tree-roosting bats in North America has grown because of possible population declines from wind energy development. This concern has driven interest in estimating population-level changes. Mark-recapture methodology is one possible analytical framework for assessing bat population changes, but sample size requirements to produce reliable estimates have not been estimated. To illustrate the sample sizes necessary for a mark-recapture-based monitoring program we conducted power analyses using a statistical model that allows reencounters of live and dead marked individuals. We ran 1,000 simulations for each of five broad sample size categories in a Burnham joint model, and then compared the proportion of simulations in which 95% confidence intervals overlapped between and among years for a 4-year study. Additionally, we conducted sensitivity analyses of sample size to various capture probabilities and recovery probabilities. More than 50,000 individuals per year would need to be captured and released to accurately determine 10% and 15% declines in annual survival. To detect more dramatic declines of 33% or 50% survival over four years, then sample sizes of 25,000 or 10,000 per year, respectively, would be sufficient. Sensitivity analyses reveal that increasing recovery of dead marked individuals may be more valuable than increasing capture probability of marked individuals. Because of the extraordinary effort that would be required, we advise caution should such a mark-recapture effort be initiated because of the difficulty in attaining reliable estimates. We make recommendations for what techniques show the most promise for mark-recapture studies of bats because some techniques violate the assumptions of mark-recapture methodology when used to mark bats.

  6. GI Joe or Average Joe? The impact of average-size and muscular male fashion models on men's and women's body image and advertisement effectiveness.

    PubMed

    Diedrichs, Phillippa C; Lee, Christina

    2010-06-01

    Increasing body size and shape diversity in media imagery may promote positive body image. While research has largely focused on female models and women's body image, men may also be affected by unrealistic images. We examined the impact of average-size and muscular male fashion models on men's and women's body image and perceived advertisement effectiveness. A sample of 330 men and 289 women viewed one of four advertisement conditions: no models, muscular, average-slim or average-large models. Men and women rated average-size models as equally effective in advertisements as muscular models. For men, exposure to average-size models was associated with more positive body image in comparison to viewing no models, but no difference was found in comparison to muscular models. Similar results were found for women. Internalisation of beauty ideals did not moderate these effects. These findings suggest that average-size male models can promote positive body image and appeal to consumers. 2010 Elsevier Ltd. All rights reserved.

  7. A Model of Thermal Conductivity for Planetary Soils: 1. Theory for Unconsolidated Soils

    NASA Technical Reports Server (NTRS)

    Piqueux, S.; Christensen, P. R.

    2009-01-01

    We present a model of heat conduction for mono-sized spherical particulate media under stagnant gases based on the kinetic theory of gases, numerical modeling of Fourier's law of heat conduction, theoretical constraints on the gas thermal conductivity at various Knudsen regimes, and laboratory measurements. Incorporating the effect of the temperature allows for the derivation of the pore-filling gas conductivity and bulk thermal conductivity of samples using additional parameters (pressure, gas composition, grain size, and porosity). The radiative and solid-to-solid conductivities are also accounted for. Our thermal model reproduces the well-established bulk thermal conductivity dependency of a sample with the grain size and pressure and also confirms laboratory measurements finding that higher porosities generally lead to lower conductivities. It predicts the existence of the plateau conductivity at high pressure, where the bulk conductivity does not depend on the grain size. The good agreement between the model predictions and published laboratory measurements under a variety of pressures, temperatures, gas compositions, and grain sizes provides additional confidence in our results. On Venus, Earth, and Titan, the pressure and temperature combinations are too high to observe a soil thermal conductivity dependency on the grain size, but each planet has a unique thermal inertia due to their different surface temperatures. On Mars, the temperature and pressure combination is ideal to observe the soil thermal conductivity dependency on the average grain size. Thermal conductivity models that do not take the temperature and the pore-filling gas composition into account may yield significant errors.

  8. Will Outer Tropical Cyclone Size Change due to Anthropogenic Warming?

    NASA Astrophysics Data System (ADS)

    Schenkel, B. A.; Lin, N.; Chavas, D. R.; Vecchi, G. A.; Knutson, T. R.; Oppenheimer, M.

    2017-12-01

    Prior research has shown significant interbasin and intrabasin variability in outer tropical cyclone (TC) size. Moreover, outer TC size has even been shown to vary substantially over the lifetime of the majority of TCs. However, the factors responsible for both setting initial outer TC size and determining its evolution throughout the TC lifetime remain uncertain. Given these gaps in our physical understanding, there remains uncertainty in how outer TC size will change, if at all, due to anthropogenic warming. The present study seeks to quantify whether outer TC size will change significantly in response to anthropogenic warming using data from a high-resolution global climate model and a regional hurricane model. Similar to prior work, the outer TC size metric used in this study is the radius in which the azimuthal-mean surface azimuthal wind equals 8 m/s. The initial results from the high-resolution global climate model data suggest that the distribution of outer TC size shifts significantly towards larger values in each global TC basin during future climates, as revealed by 1) statistically significant increase of the median outer TC size by 5-10% (p<0.05) according to a 1,000-sample bootstrap resampling approach with replacement and 2) statistically significant differences between distributions of outer TC size from current and future climate simulations as shown using two-sample Kolmogorov Smirnov testing (p<<0.01). Additional analysis of the high-resolution global climate model data reveals that outer TC size does not uniformly increase within each basin in future climates, but rather shows substantial locational dependence. Future work will incorporate the regional mesoscale hurricane model data to help focus on identifying the source of the spatial variability in outer TC size increases within each basin during future climates and, more importantly, why outer TC size changes in response to anthropogenic warming.
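    The two statistical checks named in this abstract can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' analysis: the bootstrap here forms a percentile interval for the ratio of future to current median size, the Kolmogorov-Smirnov function returns only the test statistic (no p-value), and the input radii are synthetic numbers generated in the example.

```python
import random
from bisect import bisect_right
from statistics import median


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the two empirical cumulative distribution functions."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


def bootstrap_median_ratio(current, future, n_boot=1000, seed=0):
    """Approximate 95% percentile interval for median(future) / median(current),
    resampling each group with replacement."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        cur = rng.choices(current, k=len(current))
        fut = rng.choices(future, k=len(future))
        ratios.append(median(fut) / median(cur))
    ratios.sort()
    return ratios[int(0.025 * n_boot)], ratios[int(0.975 * n_boot) - 1]


# Synthetic outer-size radii in km (assumed values, not model output).
rng = random.Random(42)
current = [rng.lognormvariate(6.0, 0.4) for _ in range(300)]
future = [1.08 * rng.lognormvariate(6.0, 0.4) for _ in range(300)]

low, high = bootstrap_median_ratio(current, future)
print(f"median ratio 95% interval: {low:.3f} to {high:.3f}")
print(f"KS statistic: {ks_statistic(current, future):.3f}")
```

    An interval for the median ratio that excludes 1 would correspond to the kind of significant shift in median size reported in the abstract; converting the KS statistic into a p-value requires the Kolmogorov distribution, which is omitted here.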

  9. Evaluation of char combustion models: measurement and analysis of variability in char particle size and density

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maloney, Daniel J; Monazam, Esmail R; Casleton, Kent H

    Char samples representing a range of combustion conditions and extents of burnout were obtained from a well-characterized laminar flow combustion experiment. Individual particles from the parent coal and char samples were characterized to determine distributions in particle volume, mass, and density at different extents of burnout. The data were then compared with predictions from a comprehensive char combustion model referred to as the char burnout kinetics model (CBK). The data clearly reflect the particle-to-particle heterogeneity of the parent coal and show a significant broadening in the size and density distributions of the chars resulting from both devolatilization and combustion. Data for chars prepared in a lower oxygen content environment (6% oxygen by vol.) are consistent with zone II type combustion behavior where most of the combustion is occurring near the particle surface. At higher oxygen contents (12% by vol.), the data show indications of more burning occurring in the particle interior. The CBK model does a good job of predicting the general nature of the development of size and density distributions during burning but the input distribution of particle size and density is critical to obtaining good predictions. A significant reduction in particle size was observed to occur as a result of devolatilization. For comprehensive combustion models to provide accurate predictions, this size reduction phenomenon needs to be included in devolatilization models so that representative char distributions are carried through the calculations.

  10. A New Model of Size-graded Soil Veneer on the Lunar Surface

    NASA Technical Reports Server (NTRS)

    Basu, Abhijit; McKay, David S.

    2005-01-01

    Introduction. We propose a new model of distribution of submillimeter sized lunar soil grains on the lunar surface. We propose that in the uppermost millimeter or two of the lunar surface, soil-grains are size graded with the finest nanoscale dust on top and larger micron-scale particles below. This standard state is perturbed by ejecta deposition of larger grains at the lunar surface, which have a coating of dusty layer that may not have substrates of intermediate sizes. Distribution of solar wind elements (SWE), agglutinates, vapor deposited nanophase Fe⁰ in size fractions of lunar soils and IR spectra of size fractions of lunar soils are compatible with this model. A direct test of this model requires bringing back glue-impregnated tubes of lunar soil samples to be dissected and examined on Earth.

  11. Modeling grain size variations of aeolian gypsum deposits at White Sands, New Mexico, using AVIRIS imagery

    USGS Publications Warehouse

    Ghrefat, H.A.; Goodell, P.C.; Hubbard, B.E.; Langford, R.P.; Aldouri, R.E.

    2007-01-01

    Visible and Near-Infrared (VNIR) through Short Wavelength Infrared (SWIR) (0.4-2.5 µm) AVIRIS data, along with laboratory spectral measurements and analyses of field samples, were used to characterize grain size variations in aeolian gypsum deposits across barchan-transverse, parabolic, and barchan dunes at White Sands, New Mexico, USA. All field samples contained a mineralogy of ~100% gypsum. In order to document grain size variations at White Sands, surficial gypsum samples were collected along three transects parallel to the prevailing downwind direction. Grain size analyses were carried out on the samples by sieving them into seven size fractions ranging from 45 to 621 µm, which were subjected to spectral measurements. Absorption band depths of the size fractions were determined after applying an automated continuum-removal procedure to each spectrum. Then, the relationship between absorption band depth and gypsum size fraction was established using a linear regression. Three software processing steps were carried out to measure the grain size variations of gypsum in the Dune Area using AVIRIS data. AVIRIS mapping results, field work and laboratory analysis all show that the interdune areas have lower absorption band depth values and consist of finer grained gypsum deposits. In contrast, the dune crest areas have higher absorption band depth values and consist of coarser grained gypsum deposits. Based on laboratory estimates, a representative barchan-transverse dune (Transect 1) has a mean grain size of 1.16 φ (449 µm). The error bar results show that the error ranges from -50 to +50 µm. Mean grain size for a representative parabolic dune (Transect 2) is 1.51 φ (352 µm), and 1.52 φ (347 µm) for a representative barchan dune (Transect 3). T-test results confirm that there are differences in the grain size distributions between barchan and parabolic dunes and between interdune and dune crest areas. The t-test results also show that there are no significant differences between modeled and laboratory-measured grain size values. Hyperspectral grain size modeling can help to determine dynamic processes shaping the formation of the dunes such as wind directions, and the relative strengths of winds through time. This has implications for studying such processes on other planetary landforms that have mineralogy with unique absorption bands in VNIR-SWIR hyperspectral data. © 2006 Elsevier B.V. All rights reserved.
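
    The paired grain-size values in this abstract (a logarithmic size unit followed by micrometres) are consistent with the Krumbein phi scale, in which phi = -log2(d / 1 mm). The conversion can be sketched as below; the function names are hypothetical, and reading the abstract's values as phi units is an assumption that the numbers support to within a few micrometres.

```python
import math


def phi_to_microns(phi):
    """Krumbein phi scale: phi = -log2(d / 1 mm), so d = 2**(-phi) mm.
    The result is returned in micrometres."""
    return 1000.0 * 2.0 ** (-phi)


def microns_to_phi(d_um):
    """Inverse conversion: grain diameter in micrometres to phi units."""
    return -math.log2(d_um / 1000.0)
```

    For example, phi_to_microns(1.16) is close to the 449 µm quoted for Transect 1; small differences are expected because the phi values in the abstract are rounded to two decimals.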

  12. Drawing a representative sample from the NCSS soil database: Building blocks for the national wind erosion network

    USDA-ARS's Scientific Manuscript database

    Developing national wind erosion models for the continental United States requires a comprehensive spatial representation of continuous soil particle size distributions (PSD) for model input. While the current coverage of soil survey is nearly complete, the most detailed particle size classes have c...

  13. The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory

    ERIC Educational Resources Information Center

    Sahin, Alper; Anil, Duygu

    2017-01-01

    This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of…

  14. Minimal-assumption inference from population-genomic data

    NASA Astrophysics Data System (ADS)

    Weissman, Daniel; Hallatschek, Oskar

    Samples of multiple complete genome sequences contain vast amounts of information about the evolutionary history of populations, much of it in the associations among polymorphisms at different loci. Current methods that take advantage of this linkage information rely on models of recombination and coalescence, limiting the sample sizes and populations that they can analyze. We introduce a method, Minimal-Assumption Genomic Inference of Coalescence (MAGIC), that reconstructs key features of the evolutionary history, including the distribution of coalescence times, by integrating information across genomic length scales without using an explicit model of recombination, demography or selection. Using simulated data, we show that MAGIC's performance is comparable to PSMC' on single diploid samples generated with standard coalescent and recombination models. More importantly, MAGIC can also analyze arbitrarily large samples and is robust to changes in the coalescent and recombination processes. Using MAGIC, we show that the inferred coalescence time histories of samples of multiple human genomes exhibit inconsistencies with a description in terms of an effective population size based on single-genome data.

  15. Towards Monitoring Biodiversity in Amazonian Forests: How Regular Samples Capture Meso-Scale Altitudinal Variation in 25 km2 Plots

    PubMed Central

    Norris, Darren; Fortin, Marie-Josée; Magnusson, William E.

    2014-01-01

    Background Ecological monitoring and sampling optima are context and location specific. Novel applications (e.g. biodiversity monitoring for environmental service payments) call for renewed efforts to establish reliable and robust monitoring in biodiversity rich areas. As there is little information on the distribution of biodiversity across the Amazon basin, we used altitude as a proxy for biological variables to test whether meso-scale variation can be adequately represented by different sample sizes in a standardized, regular-coverage sampling arrangement. Methodology/Principal Findings We used Shuttle-Radar-Topography-Mission digital elevation values to evaluate if the regular sampling arrangement in standard RAPELD (rapid assessments (“RAP”) over the long-term (LTER [“PELD” in Portuguese])) grids captured patterns in meso-scale spatial variation. The adequacy of different sample sizes (n = 4 to 120) was examined within 32,325 km2/3,232,500 ha (1293×25 km2 sample areas) distributed across the legal Brazilian Amazon. Kolmogorov-Smirnov-tests, correlation and root-mean-square-error were used to measure sample representativeness, similarity and accuracy respectively. Trends and thresholds of these responses in relation to sample size and standard-deviation were modeled using Generalized-Additive-Models and conditional-inference-trees respectively. We found that a regular arrangement of 30 samples captured the distribution of altitude values within these areas. Sample size was more important than sample standard deviation for representativeness and similarity. In contrast, accuracy was more strongly influenced by sample standard deviation. Additionally, analysis of spatially interpolated data showed that spatial patterns in altitude were also recovered within areas using a regular arrangement of 30 samples. Conclusions/Significance Our findings show that the logistically feasible sample used in the RAPELD system successfully recovers meso-scale altitudinal patterns. This suggests that the sample size and regular arrangement may also be generally appropriate for quantifying spatial patterns in biodiversity at similar scales across at least 90% (≈5 million km2) of the Brazilian Amazon. PMID:25170894

  16. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments

    PubMed Central

    2013-01-01

    Background Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. Results To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. Conclusions We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs. PMID:24160725

  17. The effect of clustering on lot quality assurance sampling: a probabilistic model to calculate sample sizes for quality assessments.

    PubMed

    Hedt-Gauthier, Bethany L; Mitsunaga, Tisha; Hund, Lauren; Olives, Casey; Pagano, Marcello

    2013-10-26

    Traditional Lot Quality Assurance Sampling (LQAS) designs assume observations are collected using simple random sampling. Alternatively, randomly sampling clusters of observations and then individuals within clusters reduces costs but decreases the precision of the classifications. In this paper, we develop a general framework for designing the cluster(C)-LQAS system and illustrate the method with the design of data quality assessments for the community health worker program in Rwanda. To determine sample size and decision rules for C-LQAS, we use the beta-binomial distribution to account for inflated risk of errors introduced by sampling clusters at the first stage. We present general theory and code for sample size calculations. The C-LQAS sample sizes provided in this paper constrain misclassification risks below user-specified limits. Multiple C-LQAS systems meet the specified risk requirements, but numerous considerations, including per-cluster versus per-individual sampling costs, help identify optimal systems for distinct applications. We show the utility of C-LQAS for data quality assessments, but the method generalizes to numerous applications. This paper provides the necessary technical detail and supplemental code to support the design of C-LQAS for specific programs.
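
    The idea of using a beta-binomial distribution to compute misclassification risks under cluster sampling can be sketched as follows. This is not the paper's supplemental code. It is a simplified illustration that assumes equal-sized clusters, independent beta-binomial counts per cluster parameterised by a coverage proportion p and an intracluster correlation rho, and a decision rule of the form "classify as acceptable if the total count is at least d". All function and parameter names are hypothetical.

```python
import math


def betabinom_pmf(k, n, a, b):
    """Beta-binomial probability of k successes in n trials, shape (a, b)."""
    log_p = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + math.lgamma(k + a) + math.lgamma(n - k + b)
             - math.lgamma(n + a + b)
             + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))
    return math.exp(log_p)


def total_pmf(p, rho, clusters, per_cluster):
    """Distribution of the total count over `clusters` independent clusters.

    Assumes 0 < p < 1 and 0 < rho < 1. With a + b = (1 - rho) / rho, the
    beta-binomial intracluster correlation 1 / (a + b + 1) equals rho.
    """
    a = p * (1 - rho) / rho
    b = (1 - p) * (1 - rho) / rho
    one = [betabinom_pmf(k, per_cluster, a, b) for k in range(per_cluster + 1)]
    total = [1.0]
    for _ in range(clusters):
        # Convolve the running total with one more cluster's distribution.
        new = [0.0] * (len(total) + per_cluster)
        for i, p_i in enumerate(total):
            for k, p_k in enumerate(one):
                new[i + k] += p_i * p_k
        total = new
    return total


def risks(d, p_upper, p_lower, rho, clusters, per_cluster):
    """Misclassification risks for the rule 'acceptable if total >= d'.

    alpha: P(total < d) when true coverage is p_upper (good lot rejected).
    beta:  P(total >= d) when true coverage is p_lower (poor lot accepted).
    """
    high = total_pmf(p_upper, rho, clusters, per_cluster)
    low = total_pmf(p_lower, rho, clusters, per_cluster)
    return sum(high[:d]), sum(low[d:])
```

    A design search would then loop over candidate numbers of clusters, cluster sizes, and decision rules d, keeping the combinations for which both risks fall below the user-specified limits.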

  18. Recommendations for choosing an analysis method that controls Type I error for unbalanced cluster sample designs with Gaussian outcomes.

    PubMed

    Johnson, Jacqueline L; Kreidler, Sarah M; Catellier, Diane J; Murray, David M; Muller, Keith E; Glueck, Deborah H

    2015-11-30

    We used theoretical and simulation-based approaches to study Type I error rates for one-stage and two-stage analytic methods for cluster-randomized designs. The one-stage approach uses the observed data as outcomes and accounts for within-cluster correlation using a general linear mixed model. The two-stage model uses the cluster specific means as the outcomes in a general linear univariate model. We demonstrate analytically that both one-stage and two-stage models achieve exact Type I error rates when cluster sizes are equal. With unbalanced data, an exact size α test does not exist, and Type I error inflation may occur. Via simulation, we compare the Type I error rates for four one-stage and six two-stage hypothesis testing approaches for unbalanced data. With unbalanced data, the two-stage model, weighted by the inverse of the estimated theoretical variance of the cluster means, and with variance constrained to be positive, provided the best Type I error control for studies having at least six clusters per arm. The one-stage model with Kenward-Roger degrees of freedom and unconstrained variance performed well for studies having at least 14 clusters per arm. The popular analytic method of using a one-stage model with denominator degrees of freedom appropriate for balanced data performed poorly for small sample sizes and low intracluster correlation. Because small sample sizes and low intracluster correlation are common features of cluster-randomized trials, the Kenward-Roger method is the preferred one-stage approach. Copyright © 2015 John Wiley & Sons, Ltd.

  19. Simulation of parametric model towards the fixed covariate of right censored lung cancer data

    NASA Astrophysics Data System (ADS)

    Afiqah Muhamad Jamil, Siti; Asrul Affendi Abdullah, M.; Kek, Sie Long; Ridwan Olaniran, Oyebayo; Enera Amran, Syahila

    2017-09-01

    In this study, a simulation procedure was applied to measure the fixed covariate of right-censored data using a parametric survival model. The scale and shape parameters were varied to differentiate the analyses of the parametric regression survival model. The bias, mean bias and coverage probability were used to evaluate the results. Sample sizes of 50, 100, 150 and 200 were employed to distinguish the impact of the parametric regression model on right-censored data. R statistical software was used to code the simulation with right-censored data. In addition, the final model from the right-censored simulation was compared with right-censored lung cancer data from Malaysia. It was found that different values of the shape and scale parameters, together with different sample sizes, help to improve the simulation strategy for right-censored data, and that the Weibull regression survival model is a suitable fit for simulating the survival of lung cancer patients in Malaysia.
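
    A simulation of this general kind can be sketched as below. The study itself used R and a regression model with a covariate; this sketch is in Python, omits the covariate, assumes a fixed administrative censoring time, and treats the Weibull shape as known so that the scale has a closed-form maximum likelihood estimate. All names and default values are illustrative assumptions.

```python
import random


def simulate_censored(n, shape, scale, censor_time, rng):
    """Draw n Weibull event times and right-censor them at censor_time.
    Each record is (observed_time, event_observed)."""
    data = []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)  # arguments are (scale, shape)
        data.append((min(t, censor_time), t <= censor_time))
    return data


def scale_mle_known_shape(data, shape):
    """Closed-form MLE of the Weibull scale under right censoring when the
    shape k is known: scale**k = sum(t_i**k) / number_of_events."""
    events = sum(1 for _, observed in data if observed)
    if events == 0:
        raise ValueError("no observed events; the scale is not estimable")
    total = sum(t ** shape for t, _ in data)
    return (total / events) ** (1.0 / shape)


def mean_bias(n, shape=1.5, scale=10.0, censor_time=15.0, reps=500, seed=1):
    """Average (estimate - true scale) over repeated simulated data sets."""
    rng = random.Random(seed)
    estimates = [
        scale_mle_known_shape(
            simulate_censored(n, shape, scale, censor_time, rng), shape)
        for _ in range(reps)
    ]
    return sum(estimates) / reps - scale
```

    Calling mean_bias for n = 50, 100, 150 and 200 gives a small table of bias against sample size, which is the kind of comparison the abstract describes.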

  20. Equations for hydraulic conductivity estimation from particle size distribution: A dimensional analysis

    NASA Astrophysics Data System (ADS)

    Wang, Ji-Peng; François, Bertrand; Lambert, Pierre

    2017-09-01

    Estimating hydraulic conductivity from particle size distribution (PSD) is an important issue for various engineering problems. Classical models such as the Hazen model, the Beyer model, and the Kozeny-Carman model usually regard the grain diameter at 10% passing (d10) as an effective grain size, and the effects of particle size uniformity (in the Beyer model) or porosity (in the Kozeny-Carman model) are sometimes embedded. This technical note applies dimensional analysis (Buckingham's Π theorem) to analyze the relationship between hydraulic conductivity and PSD. The porosity is regarded as a dependent variable on the grain size distribution in unconsolidated conditions. It indicates that the coefficient of grain size uniformity and a dimensionless group representing the gravity effect, which is proportional to the mean grain volume, are the main two determinative parameters for estimating hydraulic conductivity. Regression analysis is then carried out on a database comprising 431 samples collected from different depositional environments and new equations are developed for hydraulic conductivity estimation. The new equation, validated in specimens beyond the database, shows an improved prediction compared with the classic models.
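
    As background for the classical models mentioned above, the sketch below reads d10 and d60 off a sieve curve and applies the Hazen relation K = C * d10^2. It does not reproduce the new equations of this note, which the abstract does not state. The log-linear interpolation, the function names, and the coefficient C = 100 (for K in cm/s and d10 in cm, a commonly quoted textbook value that in practice varies widely with the material) are assumptions.

```python
import math


def d_at_percent(sizes_mm, percent_passing, target):
    """Grain diameter (mm) at which `target` percent of the mass passes,
    interpolated linearly in log(diameter) between measured sieve points."""
    pts = sorted(zip(sizes_mm, percent_passing))
    for (d0, p0), (d1, p1) in zip(pts, pts[1:]):
        if p0 <= target <= p1 and p1 > p0:
            frac = (target - p0) / (p1 - p0)
            return math.exp(math.log(d0) + frac * (math.log(d1) - math.log(d0)))
    raise ValueError("target percentage lies outside the measured curve")


def uniformity_coefficient(sizes_mm, percent_passing):
    """Coefficient of uniformity Cu = d60 / d10."""
    return (d_at_percent(sizes_mm, percent_passing, 60.0)
            / d_at_percent(sizes_mm, percent_passing, 10.0))


def hazen_k_cm_per_s(d10_mm, c=100.0):
    """Hazen estimate K = C * d10**2 with d10 converted to cm; C is assumed."""
    d10_cm = d10_mm / 10.0
    return c * d10_cm ** 2
```

    With d10 = 0.1 mm and the assumed C, the estimate is about 0.01 cm/s.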

  1. A statistical analysis of seat belt effectiveness in 1973-1975 model cars involved in towaway crashes. Volume 1

    DOT National Transportation Integrated Search

    1976-09-01

    Standardized injury rates and seat belt effectiveness measures are derived from a probability sample of towaway accidents involving 1973-1975 model cars. The data were collected in five different geographic regions. Weighted sample size available for...

  2. Comparison and Field Validation of Binomial Sampling Plans for Oligonychus perseae (Acari: Tetranychidae) on Hass Avocado in Southern California.

    PubMed

    Lara, Jesus R; Hoddle, Mark S

    2015-08-01

    Oligonychus perseae Tuttle, Baker, & Abatiello is a foliar pest of 'Hass' avocados [Persea americana Miller (Lauraceae)]. The recommended action threshold is 50-100 motile mites per leaf, but this count range and other ecological factors associated with O. perseae infestations limit the application of enumerative sampling plans in the field. Consequently, a comprehensive modeling approach was implemented to compare the practical application of various binomial sampling models for decision-making of O. perseae in California. An initial set of sequential binomial sampling models were developed using three mean-proportion modeling techniques (i.e., Taylor's power law, maximum likelihood, and an empirical model) in combination with two-leaf infestation tally thresholds of either one or two mites. Model performance was evaluated using a robust mite count database consisting of >20,000 Hass avocado leaves infested with varying densities of O. perseae and collected from multiple locations. Operating characteristic and average sample number results for sequential binomial models were used as the basis to develop and validate a standardized fixed-size binomial sampling model with guidelines on sample tree and leaf selection within blocks of avocado trees. This final validated model requires a leaf sampling cost of 30 leaves and takes into account the spatial dynamics of O. perseae to make reliable mite density classifications for a 50-mite action threshold. Recommendations for implementing this fixed-size binomial sampling plan to assess densities of O. perseae in commercial California avocado orchards are discussed. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  3. Model Choice and Sample Size in Item Response Theory Analysis of Aphasia Tests

    ERIC Educational Resources Information Center

    Hula, William D.; Fergadiotis, Gerasimos; Martin, Nadine

    2012-01-01

    Purpose: The purpose of this study was to identify the most appropriate item response theory (IRT) measurement model for aphasia tests requiring 2-choice responses and to determine whether small samples are adequate for estimating such models. Method: Pyramids and Palm Trees (Howard & Patterson, 1992) test data that had been collected from…

  4. Thermal conductivity of nanocrystalline silicon: importance of grain size and frequency-dependent mean free paths.

    PubMed

    Wang, Zhaojie; Alaniz, Joseph E; Jang, Wanyoung; Garay, Javier E; Dames, Chris

    2011-06-08

    The thermal conductivity reduction due to grain boundary scattering is widely interpreted using a scattering length assumed equal to the grain size and independent of the phonon frequency (gray). To assess these assumptions and decouple the contributions of porosity and grain size, five samples of undoped nanocrystalline silicon have been measured with average grain sizes ranging from 550 to 64 nm and porosities from 17% to less than 1%, at temperatures from 310 to 16 K. The samples were prepared using current activated, pressure assisted densification (CAPAD). At low temperature the thermal conductivities of all samples show a T² dependence which cannot be explained by any traditional gray model. The measurements are explained over the entire temperature range by a new frequency-dependent model in which the mean free path for grain boundary scattering is inversely proportional to the phonon frequency, which is shown to be consistent with asymptotic analysis of atomistic simulations from the literature. In all cases the recommended boundary scattering length is smaller than the average grain size. These results should prove useful for the integration of nanocrystalline materials in devices such as advanced thermoelectrics.

  5. Multiscale modeling of porous ceramics using movable cellular automaton method

    NASA Astrophysics Data System (ADS)

    Smolin, Alexey Yu.; Smolin, Igor Yu.; Smolina, Irina Yu.

    2017-10-01

    The paper presents a multiscale model for porous ceramics based on the movable cellular automaton method, which is a particle method in novel computational mechanics of solids. The initial scale of the proposed approach corresponds to the characteristic size of the smallest pores in the ceramics. At this scale, we model uniaxial compression of several representative samples with an explicit account of pores of the same size but with unique positions in space. As a result, we get the average values of Young's modulus and strength, as well as the parameters of the Weibull distribution of these properties at the current scale level. These data allow us to describe the material behavior at the next scale level, where only the larger pores are considered explicitly, while the influence of small pores is included via effective properties determined earlier. If the pore size distribution function of the material has N maxima, we need to perform computations for N-1 levels in order to get the properties step by step from the lowest scale up to the macroscale. The proposed approach was applied to modeling zirconia ceramics with a bimodal pore size distribution. The obtained results show correct behavior of the model sample at the macroscale.

  6. Small Sample Sizes Yield Biased Allometric Equations in Temperate Forests

    PubMed Central

    Duncanson, L.; Rourke, O.; Dubayah, R.

    2015-01-01

    Accurate quantification of forest carbon stocks is required for constraining the global carbon cycle and its impacts on climate. The accuracies of forest biomass maps are inherently dependent on the accuracy of the field biomass estimates used to calibrate models, which are generated with allometric equations. Here, we provide a quantitative assessment of the sensitivity of allometric parameters to sample size in temperate forests, focusing on the allometric relationship between tree height and crown radius. We use LiDAR remote sensing to isolate between 10,000 and more than 1,000,000 tree height and crown radius measurements per site in six U.S. forests. We find that fitted allometric parameters are highly sensitive to sample size, producing systematic overestimates of height. We extend our analysis to biomass through the application of empirical relationships from the literature, and show that given the small sample sizes used in common allometric equations for biomass, the average site-level biomass bias is ~+70% with a standard deviation of 71%, ranging from −4% to +193%. These findings underscore the importance of increasing the sample sizes used for allometric equation generation. PMID:26598233

  7. Coalescent: an open-science framework for importance sampling in coalescent theory.

    PubMed

    Tewari, Susanta; Spouge, John L

    2015-01-01

    Background. In coalescent theory, computer programs often use importance sampling to calculate likelihoods and other statistical quantities. An importance sampling scheme can exploit human intuition to improve statistical efficiency of computations, but unfortunately, in the absence of general computer frameworks on importance sampling, researchers often struggle to translate new sampling schemes computationally or benchmark against different schemes, in a manner that is reliable and maintainable. Moreover, most studies use computer programs lacking a convenient user interface or the flexibility to meet the current demands of open science. In particular, current computer frameworks can only evaluate the efficiency of a single importance sampling scheme or compare the efficiencies of different schemes in an ad hoc manner. Results. We have designed a general framework (http://coalescent.sourceforge.net; language: Java; License: GPLv3) for importance sampling that computes likelihoods under the standard neutral coalescent model of a single, well-mixed population of constant size over time following the infinite sites model of mutation. The framework models the necessary core concepts, comes integrated with several data sets of varying size, implements the standard competing proposals, and integrates tightly with our previous framework for calculating exact probabilities. For a given dataset, it computes the likelihood and provides the maximum likelihood estimate of the mutation parameter. Well-known benchmarks in the coalescent literature validate the accuracy of the framework. The framework provides an intuitive user interface with minimal clutter. For performance, the framework switches automatically to modern multicore hardware, if available. It runs on three major platforms (Windows, Mac and Linux). Extensive tests and coverage make the framework reliable and maintainable. Conclusions. In coalescent theory, many studies of computational efficiency consider only effective sample size. Here, we evaluate proposals in the coalescent literature, to discover that the order of efficiency among the three importance sampling schemes changes when one considers running time as well as effective sample size. We also describe a computational technique called "just-in-time delegation" available to improve the trade-off between running time and precision by constructing improved importance sampling schemes from existing ones. Thus, our systems approach is a potential solution to the "2^8 programs problem" highlighted by Felsenstein, because it provides the flexibility to include or exclude various features of similar coalescent models or importance sampling schemes.

  8. Effects of model complexity and priors on estimation using sequential importance sampling/resampling for species conservation

    USGS Publications Warehouse

    Dunham, Kylee; Grand, James B.

    2016-01-01

    We examined the effects of complexity and priors on the accuracy of models used to estimate ecological and observational processes, and to make predictions regarding population size and structure. State-space models are useful for estimating complex, unobservable population processes and making predictions about future populations based on limited data. To better understand the utility of state space models in evaluating population dynamics, we used them in a Bayesian framework and compared the accuracy of models with differing complexity, with and without informative priors using sequential importance sampling/resampling (SISR). Count data were simulated for 25 years using known parameters and observation process for each model. We used kernel smoothing to reduce the effect of particle depletion, which is common when estimating both states and parameters with SISR. Models using informative priors estimated parameter values and population size with greater accuracy than their non-informative counterparts. While the estimates of population size and trend did not suffer greatly in models using non-informative priors, the algorithm was unable to accurately estimate demographic parameters. This model framework provides reasonable estimates of population size when little to no information is available; however, when information on some vital rates is available, SISR can be used to obtain more precise estimates of population size and process. Incorporating model complexity such as that required by structured populations with stage-specific vital rates affects precision and accuracy when estimating latent population variables and predicting population dynamics. These results are important to consider when designing monitoring programs and conservation efforts requiring management of specific population segments.
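
    The mechanics of sequential importance sampling/resampling with kernel smoothing can be sketched on a toy problem. This is not the authors' model. It assumes a single unstructured population with exponential growth, a normal observation error on the counts, and a Liu-West style shrinkage kernel for the unknown log growth rate; the function name, priors, and default values are all illustrative.

```python
import math
import random


def sisr(counts, n_particles=2000, obs_sd=10.0, proc_sd=0.05,
         shrink=0.98, seed=0):
    """Filtered estimates of population size, one per observed count.

    Each particle carries a state N (population size) and a parameter r
    (log growth rate). Kernel smoothing shrinks r toward the particle mean
    and adds jitter, which limits particle depletion for the parameter.
    """
    rng = random.Random(seed)
    particles = [(rng.uniform(0.5, 1.5) * counts[0], rng.gauss(0.0, 0.1))
                 for _ in range(n_particles)]
    estimates = []
    for y in counts:
        # Importance weights from the observation model y ~ Normal(N, obs_sd).
        weights = [math.exp(-0.5 * ((y - n) / obs_sd) ** 2)
                   for n, _ in particles]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed; fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        estimates.append(
            sum(w * n for w, (n, _) in zip(weights, particles)) / total)
        # Resampling step: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # Kernel smoothing of the parameter r.
        r_mean = sum(r for _, r in particles) / n_particles
        r_var = sum((r - r_mean) ** 2 for _, r in particles) / n_particles
        jitter_sd = math.sqrt((1.0 - shrink ** 2) * r_var)
        moved = []
        for n, r in particles:
            r_new = (shrink * r + (1.0 - shrink) * r_mean
                     + rng.gauss(0.0, jitter_sd))
            # Propagate the state one year ahead with process noise.
            n_new = n * math.exp(r_new + rng.gauss(0.0, proc_sd))
            moved.append((n_new, r_new))
        particles = moved
    return estimates
```

    Replacing the wide initial draw of r with a tighter one centred on prior knowledge of the growth rate is the toy analogue of the informative priors compared in the abstract.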

  9. Sample Size Estimation for Alzheimer's Disease Trials from Japanese ADNI Serial Magnetic Resonance Imaging.

    PubMed

    Fujishima, Motonobu; Kawaguchi, Atsushi; Maikusa, Norihide; Kuwano, Ryozo; Iwatsubo, Takeshi; Matsuda, Hiroshi

    2017-01-01

    Little is known about the sample sizes required for clinical trials of Alzheimer's disease (AD)-modifying treatments using atrophy measures from serial brain magnetic resonance imaging (MRI) in the Japanese population. The primary objective of the present study was to estimate how large a sample size would be needed for future clinical trials for AD-modifying treatments in Japan using atrophy measures of the brain as a surrogate biomarker. Sample sizes were estimated from the rates of change of the whole brain and hippocampus by the k-means normalized boundary shift integral (KN-BSI) and cognitive measures using the data of 537 Japanese Alzheimer's Neuroimaging Initiative (J-ADNI) participants with a linear mixed-effects model. We also examined the potential use of ApoE status as a trial enrichment strategy. The hippocampal atrophy rate required smaller sample sizes than cognitive measures of AD and mild cognitive impairment (MCI). Inclusion of ApoE status reduced sample sizes for AD and MCI patients in the atrophy measures. These results show the potential use of longitudinal hippocampal atrophy measurement using automated image analysis as a progression biomarker and ApoE status as a trial enrichment strategy in a clinical trial of AD-modifying treatment in Japanese people.

  10. Accounting for parameter uncertainty in the definition of parametric distributions used to describe individual patient variation in health economic models.

    PubMed

    Degeling, Koen; IJzerman, Maarten J; Koopman, Miriam; Koffijberg, Hendrik

    2017-12-15

    Parametric distributions based on individual patient data can be used to represent both stochastic and parameter uncertainty. Although general guidance is available on how parameter uncertainty should be accounted for in probabilistic sensitivity analysis, there is no comprehensive guidance on reflecting parameter uncertainty in the (correlated) parameters of distributions used to represent stochastic uncertainty in patient-level models. This study aims to provide this guidance by proposing appropriate methods and illustrating the impact of this uncertainty on modeling outcomes. Two approaches, 1) using non-parametric bootstrapping and 2) using multivariate Normal distributions, were applied in a simulation and case study. The approaches were compared based on point-estimates and distributions of time-to-event and health economic outcomes. To assess sample size impact on the uncertainty in these outcomes, sample size was varied in the simulation study and subgroup analyses were performed for the case-study. Accounting for parameter uncertainty in distributions that reflect stochastic uncertainty substantially increased the uncertainty surrounding health economic outcomes, illustrated by larger confidence ellipses surrounding the cost-effectiveness point-estimates and different cost-effectiveness acceptability curves. Although both approaches performed similarly for larger sample sizes (i.e. n = 500), the second approach was more sensitive to extreme values for small sample sizes (i.e. n = 25), yielding infeasible modeling outcomes. Modelers should be aware that parameter uncertainty in distributions used to describe stochastic uncertainty needs to be reflected in probabilistic sensitivity analysis, as it could substantially impact the total amount of uncertainty surrounding health economic outcomes. If feasible, the bootstrap approach is recommended to account for this uncertainty.
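
    The non-parametric bootstrap approach named in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that time-to-event follows an exponential distribution whose rate is estimated by maximum likelihood (n divided by the sum of the times); the function name and defaults are hypothetical and are not taken from the study.

```python
import random


def bootstrap_exponential_rates(times, n_boot=1000, seed=0):
    """Bootstrap distribution of an exponential rate estimate.

    Assumption for illustration only: times are exponentially
    distributed, so the maximum-likelihood rate is n / sum(times).
    """
    if not times or any(t <= 0 for t in times):
        raise ValueError("times must be a non-empty list of positive values")
    rng = random.Random(seed)  # seeded for reproducibility
    n = len(times)
    rates = []
    for _ in range(n_boot):
        # Resample patients with replacement, keeping the sample size.
        resample = rng.choices(times, k=n)
        rates.append(n / sum(resample))
    return rates
```

    Each bootstrap rate can then be fed into one run of a probabilistic sensitivity analysis, so that the spread of the rates carries the parameter uncertainty into the model outcomes.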

  11. The impact of multiple endpoint dependency on Q and I(2) in meta-analysis.

    PubMed

    Thompson, Christopher Glen; Becker, Betsy Jane

    2014-09-01

    A common assumption in meta-analysis is that effect sizes are independent. When correlated effect sizes are analyzed using traditional univariate techniques, this assumption is violated. This research assesses the impact of dependence arising from treatment-control studies with multiple endpoints on homogeneity measures Q and I(2) in scenarios using the unbiased standardized-mean-difference effect size. Univariate and multivariate meta-analysis methods are examined. Conditions included different overall outcome effects, study sample sizes, numbers of studies, between-outcomes correlations, dependency structures, and ways of computing the correlation. The univariate approach used typical fixed-effects analyses whereas the multivariate approach used generalized least-squares (GLS) estimates of a fixed-effects model, weighted by the inverse variance-covariance matrix. Increased dependence among effect sizes led to increased Type I error rates from univariate models. When effect sizes were strongly dependent, error rates were drastically higher than nominal levels regardless of study sample size and number of studies. In contrast, using GLS estimation to account for multiple-endpoint dependency maintained error rates within nominal levels. Conversely, mean I(2) values were not greatly affected by increased amounts of dependency. Last, we point out that the between-outcomes correlation should be estimated as a pooled within-groups correlation rather than using a full-sample estimator that does not consider treatment/control group membership. Copyright © 2014 John Wiley & Sons, Ltd.
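
    For reference, the univariate homogeneity measures discussed in this abstract can be computed as in the sketch below. It uses the standard textbook definitions (inverse-variance weights, Q as the weighted sum of squared deviations from the pooled effect, and I^2 = max(0, (Q - df)/Q)), not the authors' code, and it treats the effect sizes as independent, which is exactly the assumption the study shows to be problematic for multiple-endpoint data.

```python
def q_and_i2(effects, variances):
    """Return Cochran's Q and I^2 (in percent) for a fixed-effects analysis."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two effects with matching variances")
    # Inverse-variance weights.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation beyond what chance alone would give.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2
```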

  12. Urban Land Cover Mapping Accuracy Assessment - A Cost-benefit Analysis Approach

    NASA Astrophysics Data System (ADS)

    Xiao, T.

    2012-12-01

    One of the most important components in urban land cover mapping is mapping accuracy assessment. Many statistical models have been developed to help design simple schemes based on both accuracy and confidence levels. It is intuitive that an increased number of samples increases the accuracy as well as the cost of an assessment. Understanding cost and sampling size is crucial in implementing efficient and effective field data collection. Few studies have included a cost calculation component as part of the assessment. In this study, a cost-benefit sampling analysis model was created by combining sample size design and sampling cost calculation. The sampling cost included transportation cost, field data collection cost, and laboratory data analysis cost. Simple Random Sampling (SRS) and Modified Systematic Sampling (MSS) methods were used to design sample locations and to extract land cover data in ArcGIS. High resolution land cover data layers of Denver, CO and Sacramento, CA, street networks, and parcel GIS data layers were used in this study to test and verify the model. The relationship between the cost and accuracy was used to determine the effectiveness of each sample method. The results of this study can be applied to other environmental studies that require spatial sampling.

  13. Sample size in psychological research over the past 30 years.

    PubMed

    Marszalek, Jacob M; Barber, Carolyn; Kohlhart, Julie; Holmes, Cooper B

    2011-04-01

    The American Psychological Association (APA) Task Force on Statistical Inference was formed in 1996 in response to a growing body of research demonstrating methodological issues that threatened the credibility of psychological research, and made recommendations to address them. One issue was the small, even dramatically inadequate, size of samples used in studies published by leading journals. The present study assessed the progress made since the Task Force's final report in 1999. Sample sizes reported in four leading APA journals in 1955, 1977, 1995, and 2006 were compared using nonparametric statistics, while data from the last two waves were fit to a hierarchical generalized linear growth model for more in-depth analysis. Overall, results indicate that the recommendations for increasing sample sizes have not been integrated in core psychological research, although results slightly vary by field. This and other implications are discussed in the context of current methodological critique and practice.

  14. Accounting for between-study variation in incremental net benefit in value of information methodology.

    PubMed

    Willan, Andrew R; Eckermann, Simon

    2012-10-01

    Previous applications of value of information methods for determining optimal sample size in randomized clinical trials have assumed no between-study variation in mean incremental net benefit. By adopting a hierarchical model, we provide a solution for determining optimal sample size with this assumption relaxed. The solution is illustrated with two examples from the literature. Expected net gain increases with increasing between-study variation, reflecting the increased uncertainty in incremental net benefit and reduced extent to which data are borrowed from previous evidence. Hence, a trial can become optimal where current evidence is sufficient assuming no between-study variation. However, despite the expected net gain increasing, the optimal sample size in the illustrated examples is relatively insensitive to the amount of between-study variation. Further percentage losses in expected net gain were small even when choosing sample sizes that reflected widely different between-study variation. Copyright © 2011 John Wiley & Sons, Ltd.

  15. Size-segregated aerosol in a hot-spot pollution urban area: Chemical composition and three-way source apportionment.

    PubMed

    Bernardoni, V; Elser, M; Valli, G; Valentini, S; Bigi, A; Fermo, P; Piazzalunga, A; Vecchi, R

    2017-12-01

    In this work, a comprehensive characterisation and source apportionment of size-segregated aerosol collected using a multistage cascade impactor was performed. The samples were collected during wintertime in Milan (Italy), which is located in the Po Valley, one of the main pollution hot-spot areas in Europe. For every sampling, size-segregated mass concentration, elemental and ionic composition, and levoglucosan concentration were determined. Size-segregated data were inverted using the program MICRON to identify and quantify modal contributions of all the measured components. The detailed chemical characterisation allowed the application of a three-way (3-D) receptor model (implemented using Multilinear Engine) for size-segregated source apportionment and chemical profiles identification. It is noteworthy that - as far as we know - this is the first time that three-way source apportionment is attempted using data of aerosol collected by traditional cascade impactors. Seven factors were identified: wood burning, industry, resuspended dust, regional aerosol, construction works, traffic 1, and traffic 2. Further insights into size-segregated factor profiles suggested that the traffic 1 factor can be associated with diesel vehicles and traffic 2 with gasoline vehicles. The regional aerosol factor was the main contributor (nearly 50%) to the droplet mode (accumulation sub-mode with modal diameter in the range 0.5-1 μm), whereas the overall contribution from the two factors related to traffic was the most important one in the other size modes (34-41%). The results showed that applying a 3-D receptor model to size-segregated samples allows identifying factors of local and regional origin while receptor modelling on integrated PM fractions usually singles out factors characterised by primary (e.g. industry, traffic, soil dust) and secondary (e.g. ammonium sulphate and nitrate) origin. Furthermore, the results suggested that the information on size-segregated chemical composition in different size classes was exploited by the model to relate primary emissions to rapidly-formed secondary compounds. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. How large a training set is needed to develop a classifier for microarray data?

    PubMed

    Dobbin, Kevin K; Zhao, Yingdong; Simon, Richard M

    2008-01-01

    A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how many samples are needed in the training set to produce a good classifier from high-dimensional microarray data is challenging. We present a model-based approach to determining the sample size required to adequately train a classifier. It is shown that sample size can be determined from three quantities: standardized fold change, class prevalence, and number of genes or features on the arrays. Numerous examples and important experimental design issues are discussed. The method is adapted to address ex post facto determination of whether the size of a training set used to develop a classifier was adequate. An interactive web site for performing the sample size calculations is provided. We showed that sample size calculations for classifier development from high-dimensional microarray data are feasible, discussed numerous important considerations, and presented examples.

  17. Accounting for treatment by center interaction in sample size determinations and the use of surrogate outcomes in the pessary for the prevention of preterm birth trial: a simulation study.

    PubMed

    Willan, Andrew R

    2016-07-05

    The Pessary for the Prevention of Preterm Birth Study (PS3) is an international, multicenter, randomized clinical trial designed to examine the effectiveness of the Arabin pessary in preventing preterm birth in pregnant women with a short cervix. During the design of the study two methodological issues regarding power and sample size were raised. Since treatment in the Standard Arm will vary between centers, it is anticipated that so too will the probability of preterm birth in that arm. This will likely result in a treatment by center interaction, and the issue of how this will affect the sample size requirements was raised. The sample size requirements to examine the effect of the pessary on the baby's clinical outcome were prohibitively high, so the second issue is how best to examine the effect on clinical outcome. The approaches taken to address these issues are presented. Simulation and sensitivity analysis were used to address the sample size issue. The probability of preterm birth in the Standard Arm was assumed to vary between centers following a Beta distribution with a mean of 0.3 and a coefficient of variation of 0.3. To address the second issue a Bayesian decision model is proposed that combines the information regarding the between-treatment difference in the probability of preterm birth from PS3 with the data from the Multiple Courses of Antenatal Corticosteroids for Preterm Birth Study that relate preterm birth and perinatal mortality/morbidity. The approach provides a between-treatment comparison with respect to the probability of a bad clinical outcome. The performance of the approach was assessed using simulation and sensitivity analysis. Accounting for a possible treatment by center interaction increased the sample size from 540 to 700 patients per arm for the base case. The sample size requirements increase with the coefficient of variation and decrease with the number of centers. Under the same assumptions used for determining the sample size requirements, the simulated mean probability that pessary reduces the risk of perinatal mortality/morbidity is 0.98. The simulated mean decreased with coefficient of variation and increased with the number of clinical sites. Employing simulation and sensitivity analysis is a useful approach for determining sample size requirements while accounting for the additional uncertainty due to a treatment by center interaction. Using a surrogate outcome in conjunction with a Bayesian decision model is an efficient way to compare important clinical outcomes in a randomized clinical trial in situations where the direct approach requires a prohibitively high sample size.
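
    The Beta distribution described in this abstract (mean 0.3, coefficient of variation 0.3) can be turned into shape parameters by moment matching. The sketch below is an illustration and not the author's simulation code; the function names are hypothetical. With a mean of 0.3 and a CV of 0.3, the standard deviation is 0.09, the variance is 0.0081, and the shape parameters come out at roughly alpha = 7.48 and beta = 17.45.

```python
import random


def beta_from_mean_cv(mean, cv):
    """Moment-match Beta(alpha, beta) to a given mean and coefficient of variation."""
    if not 0.0 < mean < 1.0 or cv <= 0.0:
        raise ValueError("mean must lie in (0, 1) and cv must be positive")
    var = (cv * mean) ** 2
    if var >= mean * (1.0 - mean):
        raise ValueError("variance too large for a Beta distribution")
    # For a Beta distribution, var = mean * (1 - mean) / (alpha + beta + 1).
    total = mean * (1.0 - mean) / var - 1.0
    return mean * total, (1.0 - mean) * total


def draw_center_probabilities(n_centers, mean=0.3, cv=0.3, seed=0):
    """Draw one preterm-birth probability per center (hypothetical helper)."""
    alpha, beta = beta_from_mean_cv(mean, cv)
    rng = random.Random(seed)
    return [rng.betavariate(alpha, beta) for _ in range(n_centers)]
```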

  18. Measures of precision for dissimilarity-based multivariate analysis of ecological communities.

    PubMed

    Anderson, Marti J; Santana-Garcon, Julia

    2015-01-01

    Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. © 2014 The Authors. Ecology Letters published by John Wiley & Sons Ltd and CNRS.
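
    A minimal sketch of the single-group MultSE calculation follows. It assumes the definition is MultSE = sqrt(V/n), with pseudo-variance V = SS/(n - 1) and SS equal to the sum of squared pairwise dissimilarities divided by n; this is a Python illustration of that assumed formula, not the R functions the authors provide, and it omits the double resampling step.

```python
import math


def mult_se(dist):
    """Pseudo multivariate standard error from a square dissimilarity matrix.

    Assumed definition: SS = (1/n) * sum_{i<j} d_ij^2,
    V = SS / (n - 1), MultSE = sqrt(V / n).
    """
    n = len(dist)
    if n < 2 or any(len(row) != n for row in dist):
        raise ValueError("dist must be a square matrix with n >= 2")
    # Use only the upper triangle so each pair is counted once.
    ss = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    pseudo_variance = ss / (n - 1)
    return math.sqrt(pseudo_variance / n)
```

    With Euclidean distances on a single variable this quantity reduces to the classical standard error of the mean, which is a convenient sanity check.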

  19. Passive vs. Parachute System Architecture for Robotic Sample Return Vehicles

    NASA Technical Reports Server (NTRS)

    Maddock, Robert W.; Henning, Allen B.; Samareh, Jamshid A.

    2016-01-01

    The Multi-Mission Earth Entry Vehicle (MMEEV) is a flexible vehicle concept based on the Mars Sample Return (MSR) EEV design which can be used in the preliminary sample return mission study phase to parametrically investigate any trade space of interest to determine the best entry vehicle design approach for that particular mission concept. In addition to the trade space dimensions often considered (e.g. entry conditions, payload size and mass, vehicle size, etc.), the MMEEV trade space considers whether it might be more beneficial for the vehicle to utilize a parachute system during descent/landing or to be fully passive (i.e. not use a parachute). In order to evaluate this trade space dimension, a simplified parachute system model has been developed based on inputs such as vehicle size/mass, payload size/mass and landing requirements. This model works in conjunction with analytical approximations of a mission trade space dataset provided by the MMEEV System Analysis for Planetary EDL (M-SAPE) tool to help quantify the differences between an active (with parachute) and a passive (no parachute) vehicle concept.

  20. Linear models for airborne-laser-scanning-based operational forest inventory with small field sample size and highly correlated LiDAR data

    USGS Publications Warehouse

    Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.

    2015-01-01

    Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as the tree height, the timber volume, and the biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.

  1. Automated sampling assessment for molecular simulations using the effective sample size

    PubMed Central

    Zhang, Xin; Bhatt, Divesh; Zuckerman, Daniel M.

    2010-01-01

    To quantify the progress in the development of algorithms and forcefields used in molecular simulations, a general method for the assessment of the sampling quality is needed. Statistical mechanics principles suggest the populations of physical states characterize equilibrium sampling in a fundamental way. We therefore develop an approach for analyzing the variances in state populations, which quantifies the degree of sampling in terms of the effective sample size (ESS). The ESS estimates the number of statistically independent configurations contained in a simulated ensemble. The method is applicable to both traditional dynamics simulations as well as more modern (e.g., multi–canonical) approaches. Our procedure is tested in a variety of systems from toy models to atomistic protein simulations. We also introduce a simple automated procedure to obtain approximate physical states from dynamic trajectories: this allows sample–size estimation in systems for which physical states are not known in advance. PMID:21221418
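
    The idea of reading an effective sample size off the variance of state populations can be sketched with a binomial argument: if a state has true population p and N independent configurations are drawn, the estimated population has variance p(1 - p)/N, so N can be recovered as p(1 - p) divided by the observed variance. The code below is an illustration of that argument only; the function name is hypothetical and the paper's own procedure, including how states are defined and combined, may differ.

```python
import statistics


def effective_sample_size(population_estimates):
    """Estimate ESS for one state from repeated population estimates.

    population_estimates: fractions of configurations found in the state,
    one value per independent run or trajectory block.
    Assumes binomial statistics: var = p * (1 - p) / N_eff.
    """
    if len(population_estimates) < 2:
        raise ValueError("need at least two estimates to compute a variance")
    p_bar = statistics.mean(population_estimates)
    var = statistics.variance(population_estimates)  # sample variance
    if var == 0:
        return float("inf")  # no observed fluctuation between runs
    return p_bar * (1.0 - p_bar) / var
```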

  2. A normative inference approach for optimal sample sizes in decisions from experience

    PubMed Central

    Ostwald, Dirk; Starke, Ludger; Hertwig, Ralph

    2015-01-01

    “Decisions from experience” (DFE) refers to a body of work that emerged in research on behavioral decision making over the last decade. One of the major experimental paradigms employed to study experience-based choice is the “sampling paradigm,” which serves as a model of decision making under limited knowledge about the statistical structure of the world. In this paradigm respondents are presented with two payoff distributions, which, in contrast to standard approaches in behavioral economics, are specified not in terms of explicit outcome-probability information, but by the opportunity to sample outcomes from each distribution without economic consequences. Participants are encouraged to explore the distributions until they feel confident enough to decide from which they would prefer to draw in a final trial involving real monetary payoffs. One commonly employed measure to characterize the behavior of participants in the sampling paradigm is the sample size, that is, the number of outcome draws which participants choose to obtain from each distribution prior to terminating sampling. A natural question that arises in this context concerns the “optimal” sample size, which could be used as a normative benchmark to evaluate human sampling behavior in DFE. In this theoretical study, we relate the DFE sampling paradigm to the classical statistical decision theoretic literature and, under a probabilistic inference assumption, evaluate optimal sample sizes for DFE. In our treatment we go beyond analytically established results by showing how the classical statistical decision theoretic framework can be used to derive optimal sample sizes under arbitrary, but numerically evaluable, constraints. Finally, we critically evaluate the value of deriving optimal sample sizes under this framework as testable predictions for the experimental study of sampling behavior in DFE. PMID:26441720

  3. Modeling misidentification errors that result from use of genetic tags in capture-recapture studies

    USGS Publications Warehouse

    Yoshizaki, J.; Brownie, C.; Pollock, K.H.; Link, W.A.

    2011-01-01

    Misidentification of animals is potentially important when naturally existing features (natural tags) such as DNA fingerprints (genetic tags) are used to identify individual animals. For example, when misidentification leads to multiple identities being assigned to an animal, traditional estimators tend to overestimate population size. Accounting for misidentification in capture-recapture models requires detailed understanding of the mechanism. Using genetic tags as an example, we outline a framework for modeling the effect of misidentification in closed population studies when individual identification is based on natural tags that are consistent over time (non-evolving natural tags). We first assume a single sample is obtained per animal for each capture event, and then generalize to the case where multiple samples (such as hair or scat samples) are collected per animal per capture occasion. We introduce methods for estimating population size and, using a simulation study, we show that our new estimators perform well for cases with moderately high capture probabilities or high misidentification rates. In contrast, conventional estimators can seriously overestimate population size when errors due to misidentification are ignored. © 2009 Springer Science+Business Media, LLC.

  4. Hard choices in assessing survival past dams — a comparison of single- and paired-release strategies

    USGS Publications Warehouse

    Zydlewski, Joseph D.; Stich, Daniel S.; Sigourney, Douglas B.

    2017-01-01

    Mark–recapture models are widely used to estimate survival of salmon smolts migrating past dams. Paired releases have been used to improve estimate accuracy by removing components of mortality not attributable to the dam. This method is accompanied by reduced precision because (i) sample size is reduced relative to a single, large release; and (ii) variance calculations inflate error. We modeled an idealized system with a single dam to assess trade-offs between accuracy and precision and compared methods using root mean squared error (RMSE). Simulations were run under predefined conditions (dam mortality, background mortality, detection probability, and sample size) to determine scenarios when the paired release was preferable to a single release. We demonstrate that a paired-release design provides a theoretical advantage over a single-release design only at large sample sizes and high probabilities of detection. At release numbers typical of many survival studies, paired release can result in overestimation of dam survival. Failures to meet model assumptions of a paired release may result in further overestimation of dam-related survival. Under most conditions, a single-release strategy was preferable.
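
    The accuracy and precision trade-off that this abstract evaluates with RMSE rests on the identity RMSE^2 = bias^2 + variance. The sketch below shows that decomposition for a set of simulated survival estimates against a known true value; it is a generic illustration with hypothetical names, not the authors' simulation.

```python
import math


def rmse_components(estimates, truth):
    """Return (bias, variance, rmse) of estimates around a known true value.

    Variance is the population variance of the estimates, so that
    rmse**2 == bias**2 + variance holds up to rounding.
    """
    if not estimates:
        raise ValueError("estimates must not be empty")
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - truth
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    return bias, variance, rmse
```

    A design with lower bias but much higher variance, such as a paired release at small sample sizes, can therefore end up with a larger RMSE than a slightly biased but more precise single release.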

  5. High-resolution observations of low-luminosity gigahertz-peaked spectrum and compact steep-spectrum sources

    NASA Astrophysics Data System (ADS)

    Collier, J. D.; Tingay, S. J.; Callingham, J. R.; Norris, R. P.; Filipović, M. D.; Galvin, T. J.; Huynh, M. T.; Intema, H. T.; Marvil, J.; O'Brien, A. N.; Roper, Q.; Sirothia, S.; Tothill, N. F. H.; Bell, M. E.; For, B.-Q.; Gaensler, B. M.; Hancock, P. J.; Hindson, L.; Hurley-Walker, N.; Johnston-Hollitt, M.; Kapińska, A. D.; Lenc, E.; Morgan, J.; Procopio, P.; Staveley-Smith, L.; Wayth, R. B.; Wu, C.; Zheng, Q.; Heywood, I.; Popping, A.

    2018-06-01

    We present very long baseline interferometry observations of a faint and low-luminosity (L_{1.4 GHz} < 10^{27} W Hz^{-1}) gigahertz-peaked spectrum (GPS) and compact steep-spectrum (CSS) sample. We select eight sources from deep radio observations that have radio spectra characteristic of a GPS or CSS source and an angular size of θ ≲ 2 arcsec, and detect six of them with the Australian Long Baseline Array. We determine their linear sizes, and model their radio spectra using synchrotron self-absorption (SSA) and free-free absorption (FFA) models. We derive statistical model ages, based on a fitted scaling relation, and spectral ages, based on the radio spectrum, which are generally consistent with the hypothesis that GPS and CSS sources are young and evolving. We resolve the morphology of one CSS source with a radio luminosity of 10^{25} W Hz^{-1}, and find what appear to be two hotspots spanning 1.7 kpc. We find that our sources follow the turnover-linear size relation, and that both homogeneous SSA and an inhomogeneous FFA model can account for the spectra with observable turnovers. All but one of the FFA models do not require a spectral break to account for the radio spectrum, while all but one of the alternative SSA and power-law models do require a spectral break to account for the radio spectrum. We conclude that our low-luminosity sample is similar to brighter samples in terms of their spectral shape, turnover frequencies, linear sizes, and ages, but cannot test for a difference in morphology.

  6. An Investigation of the Sampling Distribution of the Congruence Coefficient.

    ERIC Educational Resources Information Center

    Broadbooks, Wendy J.; Elmore, Patricia B.

    This study developed and investigated an empirical sampling distribution of the congruence coefficient. The effects of sample size, number of variables, and population value of the congruence coefficient on the sampling distribution of the congruence coefficient were examined. Sample data were generated on the basis of the common factor model and…

  7. A comparison of observation-level random effect and Beta-Binomial models for modelling overdispersion in Binomial data in ecology & evolution.

    PubMed

    Harrison, Xavier A

    2015-01-01

    Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. Finally, both OLRE and Beta-Binomial models performed poorly when models contained <5 levels of the random intercept term, especially for estimating variance components, and this effect appeared independent of total sample size. These results suggest that OLRE are a useful tool for modelling overdispersion in Binomial data, but that they do not perform well in all circumstances and researchers should take care to verify the robustness of parameter estimates of OLRE models.

  8. Does an uneven sample size distribution across settings matter in cross-classified multilevel modeling? Results of a simulation study.

    PubMed

    Milliren, Carly E; Evans, Clare R; Richmond, Tracy K; Dunn, Erin C

    2018-06-06

    Recent advances in multilevel modeling allow for modeling non-hierarchical levels (e.g., youth in non-nested schools and neighborhoods) using cross-classified multilevel models (CCMM). Current practice is to cluster samples from one context (e.g., schools) and utilize the observations however they are distributed from the second context (e.g., neighborhoods). However, it is unknown whether an uneven distribution of sample size across these contexts leads to incorrect estimates of random effects in CCMMs. Using the school and neighborhood data structure in Add Health, we examined the effect of neighborhood sample size imbalance on the estimation of variance parameters in models predicting BMI. We differentially assigned students from a given school to neighborhoods within that school's catchment area using three scenarios of (im)balance. 1000 random datasets were simulated for each of five combinations of school- and neighborhood-level variance and imbalance scenarios, for a total of 15,000 simulated data sets. For each simulation, we calculated 95% CIs for the variance parameters to determine whether the true simulated variance fell within the interval. Across all simulations, the "true" school and neighborhood variance parameters were estimated 93-96% of the time. Only 5% of models failed to capture neighborhood variance; 6% failed to capture school variance. These results suggest that there is no systematic bias in the ability of CCMM to capture the true variance parameters regardless of the distribution of students across neighborhoods. Ongoing efforts to use CCMM are warranted and can proceed without concern for the sample imbalance across contexts. Copyright © 2018 Elsevier Ltd. All rights reserved.

  9. Rectification of depth measurement using pulsed thermography with logarithmic peak second derivative method

    NASA Astrophysics Data System (ADS)

    Li, Xiaoli; Zeng, Zhi; Shen, Jingling; Zhang, Cunlin; Zhao, Yuejin

    2018-03-01

    The logarithmic peak second derivative (LPSD) method is the most popular method for depth prediction in pulsed thermography. It is widely accepted that this method is independent of defect size. The theoretical model for the LPSD method is based on the one-dimensional solution of heat conduction without considering the effect of defect size. When a decay term accounting for defect aspect ratio is introduced into the solution to correct for the three-dimensional thermal diffusion effect, the analytical model shows that the LPSD method is affected by defect size. Furthermore, we constructed the relation between the characteristic time of the LPSD method and defect aspect ratio, which was verified with the experimental results of stainless steel and glass fiber reinforced plate (GFRP) samples. We also proposed an improved LPSD method for depth prediction that takes the effect of defect size into account, and the rectification results for the stainless steel and GFRP samples are presented and discussed.

  10. Application of the Zero-Order Reaction Rate Model and Transition State Theory to predict porous Ti6Al4V bending strength.

    PubMed

    Reig, L; Amigó, V; Busquets, D; Calero, J A; Ortiz, J L

    2012-08-01

    Porous Ti6Al4V samples were produced by microsphere sintering. The Zero-Order Reaction Rate Model and Transition State Theory were used to model the sintering process and to estimate the bending strength of the porous samples developed. The evolution of the surface area during the sintering process was used to obtain sintering parameters (sintering constant, activation energy, frequency factor, constant of activation and Gibbs energy of activation). These were then correlated with the bending strength in order to obtain a simple model with which to estimate the evolution of the bending strength of the samples when the sintering temperature and time are modified: σY=P+B·[lnT·t-ΔGa/R·T]. Although the sintering parameters were obtained only for the microsphere sizes analysed here, the strength of intermediate sizes could easily be estimated following this model. Copyright © 2012 Elsevier B.V. All rights reserved.

  11. Risk Factors for Addiction and Their Association with Model-Based Behavioral Control.

    PubMed

    Reiter, Andrea M F; Deserno, Lorenz; Wilbertz, Tilmann; Heinze, Hans-Jochen; Schlagenhauf, Florian

    2016-01-01

    Addiction shows familial aggregation and previous endophenotype research suggests that healthy relatives of addicted individuals share altered behavioral and cognitive characteristics with individuals suffering from addiction. In this study we asked whether impairments in behavioral control proposed for addiction, namely a shift from goal-directed, model-based toward habitual, model-free control, extends toward an unaffected sample (n = 20) of adult children of alcohol-dependent fathers as compared to a sample without any personal or family history of alcohol addiction (n = 17). Using a sequential decision-making task designed to investigate model-free and model-based control combined with a computational modeling analysis, we did not find any evidence for altered behavioral control in individuals with a positive family history of alcohol addiction. Independent of family history of alcohol dependence, we however observed that the interaction of two different risk factors of addiction, namely impulsivity and cognitive capacities, predicts the balance of model-free and model-based behavioral control. Post-hoc tests showed a positive association of model-based behavior with cognitive capacity in the lower, but not in the higher impulsive group of the original sample. In an independent sample of particularly high- vs. low-impulsive individuals, we confirmed the interaction effect of cognitive capacities and high vs. low impulsivity on model-based control. In the confirmation sample, a positive association of omega with cognitive capacity was observed in highly impulsive individuals, but not in low impulsive individuals. Due to the moderate sample size of the study, further investigation of the association of risk factors for addiction with model-based behavior in larger sample sizes is warranted.

  12. Parallel Nonnegative Least Squares Solvers for Model Order Reduction

    DTIC Science & Technology

    2016-03-01

    NNLS problems that arise when the Energy Conserving Sampling and Weighting hyper-reduction procedure is used when constructing a reduced-order model...ScaLAPACK and performance results are presented. nonnegative least squares, model order reduction, hyper-reduction, Energy Conserving Sampling and...optimal solution. ........................................ 20 Table 6 Reduced mesh sizes produced for each solver in the ECSW hyper-reduction step

  13. Designing image segmentation studies: Statistical power, sample size and reference standard quality.

    PubMed

    Gibson, Eli; Hu, Yipeng; Huisman, Henkjan J; Barratt, Dean C

    2017-12-01

    Segmentation algorithms are typically evaluated by comparison to an accepted reference standard. The cost of generating accurate reference standards for medical image segmentation can be substantial. Since the study cost and the likelihood of detecting a clinically meaningful difference in accuracy both depend on the size and on the quality of the study reference standard, balancing these trade-offs supports the efficient use of research resources. In this work, we derive a statistical power calculation that enables researchers to estimate the appropriate sample size to detect clinically meaningful differences in segmentation accuracy (i.e. the proportion of voxels matching the reference standard) between two algorithms. Furthermore, we derive a formula to relate reference standard errors to their effect on the sample sizes of studies using lower-quality (but potentially more affordable and practically available) reference standards. The accuracy of the derived sample size formula was estimated through Monte Carlo simulation, demonstrating, with 95% confidence, a predicted statistical power within 4% of simulated values across a range of model parameters. This corresponds to sample size errors of less than 4 subjects and errors in the detectable accuracy difference less than 0.6%. The applicability of the formula to real-world data was assessed using bootstrap resampling simulations for pairs of algorithms from the PROMISE12 prostate MR segmentation challenge data set. The model predicted the simulated power for the majority of algorithm pairs within 4% for simulated experiments using a high-quality reference standard and within 6% for simulated experiments using a low-quality reference standard. A case study, also based on the PROMISE12 data, illustrates using the formulae to evaluate whether to use a lower-quality reference standard in a prostate segmentation study. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  14. Hindlimb muscle architecture in non-human great apes and a comparison of methods for analysing inter-species variation

    PubMed Central

    Myatt, Julia P; Crompton, Robin H; Thorpe, Susannah K S

    2011-01-01

    By relating an animal's morphology to its functional role and the behaviours performed, we can further develop our understanding of the selective factors and constraints acting on the adaptations of great apes. Comparison of muscle architecture between different ape species, however, is difficult because only small sample sizes are ever available. Further, such samples are often comprised of different age–sex classes, so studies have to rely on scaling techniques to remove body mass differences. However, the reliability of such scaling techniques has been questioned. As datasets increase in size, more reliable statistical analysis may eventually become possible. Here we employ geometric and allometric scaling techniques, and ancovas (a form of general linear model, GLM) to highlight and explore the different methods available for comparing functional morphology in the non-human great apes. Our results underline the importance of regressing data against a suitable body size variable to ascertain the relationship (geometric or allometric) and of choosing appropriate exponents by which to scale data. ancova models, while likely to be more robust than scaling for species comparisons when sample sizes are high, suffer from reduced power when sample sizes are low. Therefore, until sample sizes are radically increased it is preferable to include scaling analyses along with ancovas in data exploration. Overall, the results obtained from the different methods show little significant variation, whether in muscle belly mass, fascicle length or physiological cross-sectional area between the different species. This may reflect relatively close evolutionary relationships of the non-human great apes; a universal influence on morphology of generalised orthograde locomotor behaviours or, quite likely, both. PMID:21507000

  15. [Effects of soil trituration size on adsorption of oxytetracycline on soils].

    PubMed

    Qi, Rui-Huan; Li, Zhao-Jun; Long, Jian; Fan, Fei-Fei; Liang, Yong-Chao

    2011-02-01

    To understand the effects of soil trituration size on the adsorption of oxytetracycline (OTC) on soils, two contrasting soils, moisture soil and purplish soil, were selected, and OTC adsorption was investigated at trituration sizes of no more than 0.20 mm, 0.84 mm, 0.25 mm and 0.15 mm using batch equilibrium experiments. The results were as follows: (1) The amount of OTC adsorbed on moisture soil and purplish soil increased with sampling time and reached equilibrium at 24 h. First-order, second-order, parabolic-diffusion, Elovich, and two-constant kinetic models could all be used to fit the changes in adsorption with sampling time. Adsorption of OTC on the two soils consisted of two processes, quick adsorption and slow adsorption; quick adsorption occurred during the first 0-0.5 h. The adsorption rates of OTC were higher at small trituration sizes than at large ones, and at the same trituration size the k(f) of purplish soil was about two times higher than that of moisture soil. (2) Adsorption isotherms of OTC on the two soils at different trituration sizes deviated from the linear model. The data were fitted well by the Freundlich and Langmuir models, with correlation coefficients between 0.956 and 0.999. The values of k(f) and q(m) for purplish soil were higher than those for moisture soil. For a given soil, the amount of OTC adsorbed increased as trituration size decreased. The results suggest that, when investigating the fate of antibiotics in soils, it is important to select an appropriate trituration size based on physical and chemical properties such as soil particle composition.

  16. Efficient Bayesian mixed model analysis increases association power in large cohorts

    PubMed Central

    Loh, Po-Ru; Tucker, George; Bulik-Sullivan, Brendan K; Vilhjálmsson, Bjarni J; Finucane, Hilary K; Salem, Rany M; Chasman, Daniel I; Ridker, Paul M; Neale, Benjamin M; Berger, Bonnie; Patterson, Nick; Price, Alkes L

    2014-01-01

    Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts, and may not optimize power. All existing methods require time cost O(MN^2) (where N = #samples and M = #SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here, we present a far more efficient mixed model association method, BOLT-LMM, which requires only a small number of O(MN)-time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to nine quantitative traits in 23,294 samples from the Women’s Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for GWAS in large cohorts. PMID:25642633

  17. Does size really matter? A multisite study assessing the latent structure of the proposed ICD-11 and DSM-5 diagnostic criteria for PTSD

    PubMed Central

    Hansen, Maj; Hyland, Philip; Karstoft, Karen-Inge; Vaegter, Henrik B.; Bramsen, Rikke H.; Nielsen, Anni B. S.; Armour, Cherie; Andersen, Søren B.; Høybye, Mette Terp; Larsen, Simone Kongshøj; Andersen, Tonny E.

    2017-01-01

    Background: Researchers and clinicians within the field of trauma have to choose between different diagnostic descriptions of posttraumatic stress disorder (PTSD) in the DSM-5 and the proposed ICD-11. Several studies support different competing models of the PTSD structure according to both diagnostic systems; however, findings show that the choice of diagnostic systems can affect the estimated prevalence rates. Objectives: The present study aimed to investigate the potential impact of using a large (i.e. the DSM-5) compared to a small (i.e. the ICD-11) diagnostic description of PTSD. In other words, does the size of PTSD really matter? Methods: The aim was investigated by examining differences in diagnostic rates between the two diagnostic systems and independently examining the model fit of the competing DSM-5 and ICD-11 models of PTSD across three trauma samples: university students (N = 4213), chronic pain patients (N = 573), and military personnel (N = 118). Results: Diagnostic rates of PTSD were significantly lower according to the proposed ICD-11 criteria in the university sample, but no significant differences were found for chronic pain patients and military personnel. The proposed ICD-11 three-factor model provided the best fit of the tested ICD-11 models across all samples, whereas the DSM-5 seven-factor Hybrid model provided the best fit in the university and pain samples, and the DSM-5 six-factor Anhedonia model provided the best fit in the military sample of the tested DSM-5 models. Conclusions: The advantages and disadvantages of using a broad or narrow set of symptoms for PTSD can be debated, however, this study demonstrated that choice of diagnostic system may influence the estimated PTSD rates both qualitatively and quantitatively. In the current described diagnostic criteria only the ICD-11 model can reflect the configuration of symptoms satisfactorily. Thus, size does matter when assessing PTSD. PMID:29201287

  18. Does size really matter? A multisite study assessing the latent structure of the proposed ICD-11 and DSM-5 diagnostic criteria for PTSD.

    PubMed

    Hansen, Maj; Hyland, Philip; Karstoft, Karen-Inge; Vaegter, Henrik B; Bramsen, Rikke H; Nielsen, Anni B S; Armour, Cherie; Andersen, Søren B; Høybye, Mette Terp; Larsen, Simone Kongshøj; Andersen, Tonny E

    2017-01-01

    Background: Researchers and clinicians within the field of trauma have to choose between different diagnostic descriptions of posttraumatic stress disorder (PTSD) in the DSM-5 and the proposed ICD-11. Several studies support different competing models of the PTSD structure according to both diagnostic systems; however, findings show that the choice of diagnostic systems can affect the estimated prevalence rates. Objectives: The present study aimed to investigate the potential impact of using a large (i.e. the DSM-5) compared to a small (i.e. the ICD-11) diagnostic description of PTSD. In other words, does the size of PTSD really matter? Methods: The aim was investigated by examining differences in diagnostic rates between the two diagnostic systems and independently examining the model fit of the competing DSM-5 and ICD-11 models of PTSD across three trauma samples: university students (N = 4213), chronic pain patients (N = 573), and military personnel (N = 118). Results: Diagnostic rates of PTSD were significantly lower according to the proposed ICD-11 criteria in the university sample, but no significant differences were found for chronic pain patients and military personnel. The proposed ICD-11 three-factor model provided the best fit of the tested ICD-11 models across all samples, whereas the DSM-5 seven-factor Hybrid model provided the best fit in the university and pain samples, and the DSM-5 six-factor Anhedonia model provided the best fit in the military sample of the tested DSM-5 models. Conclusions: The advantages and disadvantages of using a broad or narrow set of symptoms for PTSD can be debated, however, this study demonstrated that choice of diagnostic system may influence the estimated PTSD rates both qualitatively and quantitatively. In the current described diagnostic criteria only the ICD-11 model can reflect the configuration of symptoms satisfactorily. Thus, size does matter when assessing PTSD.

  19. Determination of the influence of dispersion pattern of pesticide-resistant individuals on the reliability of resistance estimates using different sampling plans.

    PubMed

    Shah, R; Worner, S P; Chapman, R B

    2012-10-01

    Pesticide resistance monitoring includes resistance detection and subsequent documentation/measurement. Resistance detection would require at least one (≥1) resistant individual to be present in a sample to initiate management strategies. Resistance documentation, on the other hand, would attempt to get an estimate of the entire population (≥90%) of the resistant individuals. A computer simulation model was used to compare the efficiency of simple random and systematic sampling plans to detect resistant individuals and to document their frequencies when the resistant individuals were randomly or patchily distributed. A patchy dispersion pattern of resistant individuals influenced the sampling efficiency of systematic sampling plans while the efficiency of random sampling was independent of such patchiness. When resistant individuals were randomly distributed, sample sizes required to detect at least one resistant individual (resistance detection) with a probability of 0.95 were 300 (1%) and 50 (10% and 20%); whereas, when resistant individuals were patchily distributed, using systematic sampling, sample sizes required for such detection were 6000 (1%), 600 (10%) and 300 (20%). Sample sizes of 900 and 400 would be required to detect ≥90% of resistant individuals (resistance documentation) with a probability of 0.95 when resistant individuals were randomly dispersed and present at a frequency of 10% and 20%, respectively; whereas, when resistant individuals were patchily distributed, using systematic sampling, a sample size of 3000 and 1500, respectively, was necessary. Small sample sizes either underestimated or overestimated the resistance frequency. A simple random sampling plan is, therefore, recommended for insecticide resistance detection and subsequent documentation.
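    For the randomly distributed case, the detection criterion above (at least one resistant individual in the sample with probability 0.95) has a simple closed form if individuals are assumed to be drawn independently: the smallest n satisfying 1 - (1 - p)^n >= 0.95. The sketch below computes that analytic bound; the independence (binomial) assumption and the function name are illustrative and are not the study's simulation model, so the values need not match the simulated sample sizes reported in the abstract.

```python
import math


def min_sample_size_for_detection(frequency: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one resistant individual in n draws) >= confidence.

    Assumption: independent draws from a very large population in which a
    fraction `frequency` of individuals is resistant (binomial model).
    """
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - frequency)^n <= 1 - confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


if __name__ == "__main__":
    for p in (0.01, 0.10, 0.20):
        print(f"frequency {p:.0%}: n >= {min_sample_size_for_detection(p)}")
```

    Under these assumptions the bound at a 1% resistance frequency is 299, which is in line with the simulated figure of 300 for randomly distributed individuals; patchy distributions sampled systematically are outside the scope of this formula.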

  20. Model of Tooth Morphogenesis Predicts Carabelli Cusp Expression, Size, and Symmetry in Humans

    PubMed Central

    Hunter, John P.; Guatelli-Steinberg, Debbie; Weston, Theresia C.; Durner, Ryan; Betsinger, Tracy K.

    2010-01-01

    Background The patterning cascade model of tooth morphogenesis accounts for shape development through the interaction of a small number of genes. In the model, gene expression both directs development and is controlled by the shape of developing teeth. Enamel knots (zones of nonproliferating epithelium) mark the future sites of cusps. In order to form, a new enamel knot must escape the inhibitory fields surrounding other enamel knots before crown components become spatially fixed as morphogenesis ceases. Because cusp location on a fully formed tooth reflects enamel knot placement and tooth size is limited by the cessation of morphogenesis, the model predicts that cusp expression varies with intercusp spacing relative to tooth size. Although previous studies in humans have supported the model's implications, here we directly test the model's predictions for the expression, size, and symmetry of Carabelli cusp, a variation present in many human populations. Methodology/Principal Findings In a dental cast sample of upper first molars (M1s) (187 rights, 189 lefts, and 185 antimeric pairs), we measured tooth area and intercusp distances with a Hirox digital microscope. We assessed Carabelli expression quantitatively as an area in a subsample and qualitatively using two typological schemes in the full sample. As predicted, low relative intercusp distance is associated with Carabelli expression in both right and left samples using either qualitative or quantitative measures. Furthermore, asymmetry in Carabelli area is associated with asymmetry in relative intercusp spacing. Conclusions/Significance These findings support the model's predictions for Carabelli cusp expression both across and within individuals. By comparing right-left pairs of the same individual, our data show that small variations in developmental timing or spacing of enamel knots can influence cusp pattern independently of genotype. 
    Our findings suggest that during evolution new cusps may first appear as a result of small changes in the spacing of enamel knots relative to crown size. PMID:20689576

  1. A log-linear model approach to estimation of population size using the line-transect sampling method

    USGS Publications Warehouse

    Anderson, D.R.; Burnham, K.P.; Crain, B.R.

    1978-01-01

    The technique of estimating wildlife population size and density using the belt or line-transect sampling method has been used in many past projects, such as the estimation of density of waterfowl nestling sites in marshes, and is being used currently in such areas as the assessment of Pacific porpoise stocks in regions of tuna fishing activity. A mathematical framework for line-transect methodology has only emerged in the last 5 yr. In the present article, we extend this mathematical framework to a line-transect estimator based upon a log-linear model approach.

  2. Accounting for imperfect detection of groups and individuals when estimating abundance.

    PubMed

    Clement, Matthew J; Converse, Sarah J; Royle, J Andrew

    2017-09-01

    If animals are independently detected during surveys, many methods exist for estimating animal abundance despite detection probabilities <1. Common estimators include double-observer models, distance sampling models and combined double-observer and distance sampling models (known as mark-recapture-distance-sampling models; MRDS). When animals reside in groups, however, the assumption of independent detection is violated. In this case, the standard approach is to account for imperfect detection of groups, while assuming that individuals within groups are detected perfectly. However, this assumption is often unsupported. We introduce an abundance estimator for grouped animals when detection of groups is imperfect and group size may be under-counted, but not over-counted. The estimator combines an MRDS model with an N-mixture model to account for imperfect detection of individuals. The new MRDS-Nmix model requires the same data as an MRDS model (independent detection histories, an estimate of distance to transect, and an estimate of group size), plus a second estimate of group size provided by the second observer. We extend the model to situations in which detection of individuals within groups declines with distance. We simulated 12 data sets and used Bayesian methods to compare the performance of the new MRDS-Nmix model to an MRDS model. Abundance estimates generated by the MRDS-Nmix model exhibited minimal bias and nominal coverage levels. In contrast, MRDS abundance estimates were biased low and exhibited poor coverage. Many species of conservation interest reside in groups and could benefit from an estimator that better accounts for imperfect detection. Furthermore, the ability to relax the assumption of perfect detection of individuals within detected groups may allow surveyors to re-allocate resources toward detection of new groups instead of extensive surveys of known groups. 
    We believe the proposed estimator is feasible because the only additional field data required are a second estimate of group size.

  3. Accounting for imperfect detection of groups and individuals when estimating abundance

    USGS Publications Warehouse

    Clement, Matthew J.; Converse, Sarah J.; Royle, J. Andrew

    2017-01-01

    If animals are independently detected during surveys, many methods exist for estimating animal abundance despite detection probabilities <1. Common estimators include double-observer models, distance sampling models and combined double-observer and distance sampling models (known as mark-recapture-distance-sampling models; MRDS). When animals reside in groups, however, the assumption of independent detection is violated. In this case, the standard approach is to account for imperfect detection of groups, while assuming that individuals within groups are detected perfectly. However, this assumption is often unsupported. We introduce an abundance estimator for grouped animals when detection of groups is imperfect and group size may be under-counted, but not over-counted. The estimator combines an MRDS model with an N-mixture model to account for imperfect detection of individuals. The new MRDS-Nmix model requires the same data as an MRDS model (independent detection histories, an estimate of distance to transect, and an estimate of group size), plus a second estimate of group size provided by the second observer. We extend the model to situations in which detection of individuals within groups declines with distance. We simulated 12 data sets and used Bayesian methods to compare the performance of the new MRDS-Nmix model to an MRDS model. Abundance estimates generated by the MRDS-Nmix model exhibited minimal bias and nominal coverage levels. In contrast, MRDS abundance estimates were biased low and exhibited poor coverage. Many species of conservation interest reside in groups and could benefit from an estimator that better accounts for imperfect detection. Furthermore, the ability to relax the assumption of perfect detection of individuals within detected groups may allow surveyors to re-allocate resources toward detection of new groups instead of extensive surveys of known groups. 
    We believe the proposed estimator is feasible because the only additional field data required are a second estimate of group size.

  4. Around Marshall

    NASA Image and Video Library

    1996-06-10

    The dart and associated launching system was developed by engineers at MSFC to collect a sample of the aluminum oxide particles during the static fire testing of the Shuttle's solid rocket motor. The dart is launched through the exhaust and recovered post test. The particles are collected on sticky copper tapes affixed to a cylindrical shaft in the dart. A protective sleeve draws over the tape after the sample is collected to prevent contamination. The sample is analyzed under a scanning electron microscope under high magnification and a particle size distribution is determined. This size distribution is input into the analytical model to predict the radiative heating rates from the motor exhaust. Good prediction models are essential to optimizing the development of the thermal protection system for the Shuttle.

  5. Random Distribution Pattern and Non-adaptivity of Genome Size in a Highly Variable Population of Festuca pallens

    PubMed Central

    Šmarda, Petr; Bureš, Petr; Horová, Lucie

    2007-01-01

    Background and Aims The spatial and statistical distribution of genome sizes and the adaptivity of genome size to some types of habitat, vegetation or microclimatic conditions were investigated in a tetraploid population of Festuca pallens. The population was previously documented to vary highly in genome size and is assumed as a model for the study of the initial stages of genome size differentiation. Methods Using DAPI flow cytometry, samples were measured repeatedly with diploid Festuca pallens as the internal standard. Altogether 172 plants from 57 plots (2·25 m²), distributed in contrasting habitats over the whole locality in South Moravia, Czech Republic, were sampled. The differences in DNA content were confirmed by the double peaks of simultaneously measured samples. Key Results At maximum, a 1·115-fold difference in genome size was observed. The statistical distribution of genome sizes was found to be continuous and best fits the extreme (Gumbel) distribution with rare occurrences of extremely large genomes (positive-skewed), as it is similar for the log-normal distribution of the whole Angiosperms. Even plants from the same plot frequently varied considerably in genome size and the spatial distribution of genome sizes was generally random and unautocorrelated (P > 0·05). The observed spatial pattern and the overall lack of correlations of genome size with recognized vegetation types or microclimatic conditions indicate the absence of ecological adaptivity of genome size in the studied population. Conclusions These experimental data on intraspecific genome size variability in Festuca pallens argue for the absence of natural selection and the selective non-significance of genome size in the initial stages of genome size differentiation, and corroborate the current hypothetical model of genome size evolution in Angiosperms (Bennetzen et al., 2005, Annals of Botany 95: 127–132). PMID:17565968

  6. Modeling change in potential landscape vulnerability to forest insect and pathogen disturbances: methods for forested subwatersheds sampled in the midscale interior Columbia River basin assessment.

    Treesearch

    Paul F. Hessburg; Bradley G. Smith; Craig A. Miller; Scott D. Kreiter; R. Brion Salter

    1999-01-01

    In the interior Columbia River basin midscale ecological assessment, including portions of the Klamath and Great Basins, we mapped and characterized historical and current vegetation composition and structure of 337 randomly sampled subwatersheds (9500 ha average size) in 43 subbasins (404 000 ha average size). We compared landscape patterns, vegetation structure and...

  7. Sample Size and Power Estimates for a Confirmatory Factor Analytic Model in Exercise and Sport: A Monte Carlo Approach

    ERIC Educational Resources Information Center

    Myers, Nicholas D.; Ahn, Soyeon; Jin, Ying

    2011-01-01

    Monte Carlo methods can be used in data analytic situations (e.g., validity studies) to make decisions about sample size and to estimate power. The purpose of using Monte Carlo methods in a validity study is to improve the methodological approach within a study where the primary focus is on construct validity issues and not on advancing…
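    As a rough illustration of how Monte Carlo methods can estimate power for a candidate sample size, the sketch below repeatedly simulates data under an assumed effect and records how often a test rejects the null hypothesis. It deliberately substitutes a simple two-group comparison of means with a normal-approximation test for the confirmatory factor analytic model discussed in the article; the function name, effect size, and group size are hypothetical choices made for the example.

```python
import math
import random
import statistics


def simulated_power(effect_size=0.5, n_per_group=64, n_sims=2000, seed=7):
    """Monte Carlo estimate of power for a two-sided, two-group z-type test.

    Assumptions: both groups are normal with unit variance, the group means
    differ by `effect_size`, and the test uses a 1.96 critical value.
    """
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% critical value (normal approximation)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        std_err = math.sqrt(
            statistics.variance(control) / n_per_group
            + statistics.variance(treated) / n_per_group
        )
        z = (statistics.fmean(treated) - statistics.fmean(control)) / std_err
        if abs(z) > z_crit:
            rejections += 1
    # The rejection rate across simulated data sets estimates power.
    return rejections / n_sims


if __name__ == "__main__":
    for n in (20, 40, 64, 100):
        print(f"n per group = {n}: estimated power = {simulated_power(n_per_group=n):.2f}")
```

    In practice one would rerun the simulation over a grid of sample sizes, as in the usage loop, and choose the smallest size whose estimated power reaches the target (commonly 0.80).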

  8. Sample size determinations for group-based randomized clinical trials with different levels of data hierarchy between experimental and control arms.

    PubMed

    Heo, Moonseong; Litwin, Alain H; Blackstock, Oni; Kim, Namhee; Arnsten, Julia H

    2017-02-01

    We derived sample size formulae for detecting main effects in group-based randomized clinical trials with different levels of data hierarchy between experimental and control arms. Such designs are necessary when experimental interventions need to be administered to groups of subjects whereas control conditions need to be administered to individual subjects. This type of trial, often referred to as a partially nested or partially clustered design, has been implemented for management of chronic diseases such as diabetes and is beginning to emerge more commonly in wider clinical settings. Depending on the research setting, the level of hierarchy of data structure for the experimental arm can be three or two, whereas that for the control arm is two or one. Such different levels of data hierarchy assume correlation structures of outcomes that are different between arms, regardless of whether research settings require two or three level data structure for the experimental arm. Therefore, the different correlations should be taken into account for statistical modeling and for sample size determinations. To this end, we considered mixed-effects linear models with different correlation structures between experimental and control arms to theoretically derive and empirically validate the sample size formulae with simulation studies.

  9. Robustness of methods for blinded sample size re-estimation with overdispersed count data.

    PubMed

    Schneider, Simon; Schmidli, Heinz; Friede, Tim

    2013-09-20

    Counts of events are increasingly common as primary endpoints in randomized clinical trials. With between-patient heterogeneity leading to variances in excess of the mean (referred to as overdispersion), statistical models reflecting this heterogeneity by mixtures of Poisson distributions are frequently employed. Sample size calculation in the planning of such trials requires knowledge of the nuisance parameters, that is, the control (or overall) event rate and the overdispersion parameter. Usually, there is little prior knowledge regarding these parameters in the design phase, resulting in considerable uncertainty regarding the sample size. In this situation internal pilot studies have been found very useful, and very recently several blinded procedures for sample size re-estimation have been proposed for overdispersed count data, one of which is based on an EM-algorithm. In this paper we investigate the EM-algorithm based procedure with respect to aspects of its implementation by studying the algorithm's dependence on the choice of convergence criterion and find that the procedure is sensitive to the choice of the stopping criterion in scenarios relevant to clinical practice. We also compare the EM-based procedure to other competing procedures regarding their operating characteristics such as sample size distribution and power. Furthermore, the robustness of these procedures to deviations from the model assumptions is explored. We find that some of the procedures are robust to at least moderate deviations. The results are illustrated using data from the US National Heart, Lung and Blood Institute sponsored Asymptomatic Cardiac Ischemia Pilot study. Copyright © 2013 John Wiley & Sons, Ltd.

  10. On the role of the grain size in the magnetic behavior of sintered permanent magnets

    NASA Astrophysics Data System (ADS)

    Efthimiadis, K. G.; Ntallis, N.

    2018-02-01

    In this work the finite element method is used to simulate, by micromagnetic modeling, the magnetic behavior of sintered anisotropic magnets. Hysteresis loops were simulated for different grain sizes in an oriented multigrain sample. By excluding other parameters that contribute to the magnetic microstructure, such as the sample size, the grain morphology and the grain boundary mismatch, it has been found that the grain size affects the magnetic properties only if the grains are exchange-decoupled. In this case, as the grain size decreases, the nucleation field of a reverse magnetic domain decreases and the coercive field increases due to the pinning of the magnetic domain walls at the grain boundaries.

  11. Non-invasive genetic censusing and monitoring of primate populations.

    PubMed

    Arandjelovic, Mimi; Vigilant, Linda

    2018-03-01

    Knowing the density or abundance of primate populations is essential for their conservation management and contextualizing socio-demographic and behavioral observations. When direct counts of animals are not possible, genetic analysis of non-invasive samples collected from wildlife populations allows estimates of population size with higher accuracy and precision than is possible using indirect signs. Furthermore, in contrast to traditional indirect survey methods, prolonged or periodic genetic sampling across months or years enables inference of group membership, movement, dynamics, and some kin relationships. Data may also be used to estimate sex ratios, sex differences in dispersal distances, and detect gene flow among locations. Recent advances in capture-recapture models have further improved the precision of population estimates derived from non-invasive samples. Simulations using these methods have shown that the confidence interval of point estimates includes the true population size when assumptions of the models are met, and therefore this range of population size minima and maxima should be emphasized in population monitoring studies. Innovations such as the use of sniffer dogs or anti-poaching patrols for sample collection are important to ensure adequate sampling, and the expected development of efficient and cost-effective genotyping by sequencing methods for DNAs derived from non-invasive samples will automate and speed analyses. © 2018 Wiley Periodicals, Inc.

  12. MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples.

    PubMed

    Malekpour, Seyed Amir; Pezeshk, Hamid; Sadeghi, Mehdi

    2018-03-05

    Currently a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for the CNV detection based on multiple samples, the majority of the current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and also a Poisson distribution for emitting the read counts, in each genomic position. MSeq-CNV is applied to simulated data and also to real data of six HapMap individuals with high-coverage sequencing in the 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. Ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied for detecting CNVs in two samples with low-coverage sequencing in the 1000 Genomes Project and six samples from the Simons Genome Diversity Project.

  13. More Power to OATP1B1: An Evaluation of Sample Size in Pharmacogenetic Studies Using a Rosuvastatin PBPK Model for Intestinal, Hepatic, and Renal Transporter‐Mediated Clearances

    PubMed Central

    Burt, Howard; Abduljalil, Khaled; Neuhoff, Sibylle

    2016-01-01

    Rosuvastatin is a substrate of choice in clinical studies of organic anion‐transporting polypeptide (OATP)1B1‐ and OATP1B3‐associated drug interactions; thus, understanding the effect of OATP1B1 polymorphisms on the pharmacokinetics of rosuvastatin is crucial. Here, physiologically based pharmacokinetic (PBPK) modeling was coupled with a power calculation algorithm to evaluate the influence of sample size on the ability to detect an effect (80% power) of OATP1B1 phenotype on pharmacokinetics of rosuvastatin. Intestinal, hepatic, and renal transporters were mechanistically incorporated into a rosuvastatin PBPK model using permeability‐limited models for intestine, liver, and kidney, respectively, nested within a full PBPK model. Simulated plasma rosuvastatin concentrations in healthy volunteers were in agreement with previously reported clinical data. Power calculations were used to determine the influence of sample size on study power while accounting for OATP1B1 haplotype frequency and abundance in addition to its correlation with OATP1B3 abundance. It was determined that 10 poor‐transporter and 45 intermediate‐transporter individuals are required to achieve 80% power to discriminate the AUC0‐48h of rosuvastatin from that of the extensive‐transporter phenotype. This number was reduced to 7 poor‐transporter and 40 intermediate‐transporter individuals when the reported correlation between OATP1B1 and 1B3 abundance was taken into account. The current study represents the first example in which PBPK modeling in conjunction with power analysis has been used to investigate sample size in clinical studies of OATP1B1 polymorphisms. This approach highlights the influence of interindividual variability and correlation of transporter abundance on study power and should allow more informed decision making in pharmacogenomic study design. PMID:27385171

  14. Measurement of particle size distribution of soil and selected aggregate sizes using the hydrometer method and laser diffractometry

    NASA Astrophysics Data System (ADS)

    Guzmán, G.; Gómez, J. A.; Giráldez, J. V.

    2010-05-01

    Soil particle size distribution has been traditionally determined by the hydrometer or the sieve-pipette methods, both of them time consuming and requiring a relatively large soil sample. This might be a limitation in situations, such as the analysis of suspended sediment, when the sample is small. A possible alternative to these methods is optical techniques such as laser diffractometry. However, the literature indicates that the use of this technique as an alternative to traditional methods is still limited because of the difficulty in replicating the results obtained with the standard methods. In this study we present the percentages of soil grain size determined using laser diffractometry within ranges set between 0.04 - 2000 μm. A Beckman-Coulter ® LS-230 with a 750 nm laser beam and software version 3.2 was used in five soils representative of southern Spain: Alameda, Benacazón, Conchuela, Lanjarón and Pedrera. In three of the studied soils (Alameda, Benacazón and Conchuela) the particle size distribution of each aggregate size class was also determined. Aggregate size classes were obtained by dry sieve analysis using a Retsch AS 200 basic ®. Two hundred grams of air dried soil were sieved for 150 s, at amplitude 2 mm, yielding nine different sizes between 2000 μm and 10 μm. Analyses were performed in triplicate. The soil sample preparation was also adapted to our conditions. A small amount of each soil sample (less than 1 g) was transferred to the fluid module full of running water and disaggregated by ultrasonication at energy level 4 with 80 ml of sodium hexametaphosphate solution for 580 seconds. Two replicates of each sample were performed. Each measurement was made for a 90 second reading at a pump speed of 62. After the laser diffractometry analysis, each soil and its aggregate classes were processed by calibrating their own optical model, fitting the optical parameters, which mainly depend on the color and the shape of the analyzed particles. As a second alternative, a unique optical model valid for a broad range of soils, developed by the Department of Soil, Water, and Environmental Science of the University of Arizona (personal communication, already submitted), was tested. The results were compared with the particle size distribution measured in the same soils and aggregate classes using the hydrometer method. Preliminary results indicate a better calibration of the technique using the optical model of the Department of Soil, Water, and Environmental Science of the University of Arizona, which obtained good correlations (r² > 0.85). This result suggests that, with an appropriate calibration of the optical model, laser diffractometry might provide a reliable soil particle characterization.

  15. Optimizing variable radius plot size and LiDAR resolution to model standing volume in conifer forests

    Treesearch

    Ram Kumar Deo; Robert E. Froese; Michael J. Falkowski; Andrew T. Hudak

    2016-01-01

    The conventional approach to LiDAR-based forest inventory modeling depends on field sample data from fixed-radius plots (FRP). Because FRP sampling is cost intensive, combining variable-radius plot (VRP) sampling and LiDAR data has the potential to improve inventory efficiency. The overarching goal of this study was to evaluate the integration of LiDAR and VRP data....

  16. Robust Covariate-Adjusted Log-Rank Statistics and Corresponding Sample Size Formula for Recurrent Events Data

    PubMed Central

    Song, Rui; Kosorok, Michael R.; Cai, Jianwen

    2009-01-01

    Recurrent events data are frequently encountered in clinical trials. This article develops robust covariate-adjusted log-rank statistics applied to recurrent events data with arbitrary numbers of events under independent censoring and the corresponding sample size formula. The proposed log-rank tests are robust with respect to different data-generating processes and are adjusted for predictive covariates. It reduces to the Kong and Slud (1997, Biometrika 84, 847–862) setting in the case of a single event. The sample size formula is derived based on the asymptotic normality of the covariate-adjusted log-rank statistics under certain local alternatives and a working model for baseline covariates in the recurrent event data context. When the effect size is small and the baseline covariates do not contain significant information about event times, it reduces to the same form as that of Schoenfeld (1983, Biometrics 39, 499–503) for cases of a single event or independent event times within a subject. We carry out simulations to study the control of type I error and the comparison of powers between several methods in finite samples. The proposed sample size formula is illustrated using data from an rhDNase study. PMID:18162107

  17. Connecting Research to Teaching: Using Data to Motivate the Use of Empirical Sampling Distributions

    ERIC Educational Resources Information Center

    Lee, Hollylynne S.; Starling, Tina T.; Gonzalez, Marggie D.

    2014-01-01

    Research shows that students often struggle with understanding empirical sampling distributions. Using hands-on and technology models and simulations of problems generated by real data help students begin to make connections between repeated sampling, sample size, distribution, variation, and center. A task to assist teachers in implementing…

  18. Thermal conductivity of lunar regolith simulant JSC-1A under vacuum

    NASA Astrophysics Data System (ADS)

    Sakatani, Naoya; Ogawa, Kazunori; Arakawa, Masahiko; Tanaka, Satoshi

    2018-07-01

    Many air-less planetary bodies, including the Moon, asteroids, and comets, are covered by regolith. The thermal conductivity of the regolith is an essential parameter controlling the surface temperature variation. A thermal conductivity model applicable to natural soils as well as planetary surface regolith is required to analyze infrared remote sensing data. In this study, we investigated the temperature and compressional stress dependence of the thermal conductivity of the lunar regolith simulant JSC-1A, and the temperature dependence of sieved JSC-1A samples under vacuum conditions. We confirmed that a series of the experimental data for JSC-1A are fitted well by our analytical model of the thermal conductivity (Sakatani et al., 2017). Comparison of the calibration data of the sieved samples with those for original JSC-1A indicates that the thermal conductivity of natural samples with a wide grain size distribution can be modeled as mono-sized grains with a volumetric median size. The calibrated model can be used to estimate the volumetric median grain size from infrared remote sensing data. Our experiments and the calibrated model indicate that uncompressed JSC-1A has similar thermal conductivity to lunar top-surface materials, but the lunar subsurface thermal conductivity cannot be explained only by the effects of the density and self-weighted compressional stress. We infer that the nature of the lunar subsurface regolith grains is much different from JSC-1A and lunar top-surface regolith, and/or the lunar subsurface regolith is over-consolidated and the compressional stress higher than the hydrostatic pressure is stored in the lunar regolith layer.

  19. Meta-analysis of multiple outcomes: a multilevel approach.

    PubMed

    Van den Noortgate, Wim; López-López, José Antonio; Marín-Martínez, Fulgencio; Sánchez-Meca, Julio

    2015-12-01

    In meta-analysis, dependent effect sizes are very common. An example is where in one or more studies the effect of an intervention is evaluated on multiple outcome variables for the same sample of participants. In this paper, we evaluate a three-level meta-analytic model to account for this kind of dependence, extending the simulation results of Van den Noortgate, López-López, Marín-Martínez, and Sánchez-Meca (Behavior Research Methods, 45, 576-594, 2013) by allowing for a variation in the number of effect sizes per study, in the between-study variance, in the correlations between pairs of outcomes, and in the sample size of the studies. At the same time, we explore the performance of the approach if the outcomes used in a study can be regarded as a random sample from a population of outcomes. We conclude that although this approach is relatively simple and does not require prior estimates of the sampling covariances between effect sizes, it gives appropriate mean effect size estimates, standard error estimates, and confidence interval coverage proportions in a variety of realistic situations.

  20. Estimation and applications of size-based distributions in forestry

    Treesearch

    Jeffrey H. Gove

    2003-01-01

    Size-based distributions arise in several contexts in forestry and ecology. Simple power relationships (e.g., basal area and diameter at breast height) between variables are one such area of interest arising from a modeling perspective. Another, probability proportional to size sampling (PPS), is found in the most widely used methods for sampling standing or dead and...

  1. Parameter recovery, bias and standard errors in the linear ballistic accumulator model.

    PubMed

    Visser, Ingmar; Poessé, Rens

    2017-05-01

    The linear ballistic accumulator (LBA) model (Brown & Heathcote, , Cogn. Psychol., 57, 153) is increasingly popular in modelling response times from experimental data. An R package, glba, has been developed to fit the LBA model using maximum likelihood estimation which is validated by means of a parameter recovery study. At sufficient sample sizes parameter recovery is good, whereas at smaller sample sizes there can be large bias in parameters. In a second simulation study, two methods for computing parameter standard errors are compared. The Hessian-based method is found to be adequate and is (much) faster than the alternative bootstrap method. The use of parameter standard errors in model selection and inference is illustrated in an example using data from an implicit learning experiment (Visser et al., , Mem. Cogn., 35, 1502). It is shown that typical implicit learning effects are captured by different parameters of the LBA model. © 2017 The British Psychological Society.

  2. Simulating recurrent event data with hazard functions defined on a total time scale.

    PubMed

    Jahn-Eimermacher, Antje; Ingel, Katharina; Ozga, Ann-Kathrin; Preussler, Stella; Binder, Harald

    2015-03-08

    In medical studies with recurrent event data a total time scale perspective is often needed to adequately reflect disease mechanisms. This means that the hazard process is defined on the time since some starting point, e.g. the beginning of some disease, in contrast to a gap time scale where the hazard process restarts after each event. While techniques such as the Andersen-Gill model have been developed for analyzing data from a total time perspective, techniques for the simulation of such data, e.g. for sample size planning, have not been investigated so far. We have derived a simulation algorithm covering the Andersen-Gill model that can be used for sample size planning in clinical trials as well as the investigation of modeling techniques. Specifically, we allow for fixed and/or random covariates and an arbitrary hazard function defined on a total time scale. Furthermore we take into account that individuals may be temporarily insusceptible to a recurrent incidence of the event. The methods are based on conditional distributions of the inter-event times conditional on the total time of the preceding event or study start. Closed form solutions are provided for common distributions. The derived methods have been implemented in a readily accessible R script. The proposed techniques are illustrated by planning the sample size for a clinical trial with complex recurrent event data. The required sample size is shown to be affected not only by censoring and intra-patient correlation, but also by the presence of risk-free intervals. This demonstrates the need for a simulation algorithm that particularly allows for complex study designs where no analytical sample size formulas might exist. The derived simulation algorithm is seen to be useful for the simulation of recurrent event data that follow an Andersen-Gill model. Next to the use of a total time scale, it allows for intra-patient correlation and risk-free intervals as are often observed in clinical trial data. Its application therefore allows the simulation of data that closely resemble real settings and thus can improve the use of simulation studies for designing and analysing studies.
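
    The conditioning idea described in this abstract can be sketched for one common case. The example below is not the authors' R script; it is a minimal Python illustration that assumes a Weibull-type cumulative hazard Λ(t) = rate · t^shape on the total time scale, a fixed hazard ratio for the covariates, and no risk-free intervals or frailty. Given the previous event at t_prev, the increment Λ(t_next) − Λ(t_prev) follows a unit exponential distribution, which yields a closed-form draw for the next event time. All function and parameter names are hypothetical.

```python
import random


def simulate_recurrent_events(scale, shape, follow_up, hazard_ratio=1.0, rng=None):
    """Draw event times for one subject on a total time scale.

    Assumes cumulative hazard Lambda(t) = scale * hazard_ratio * t**shape
    and administrative censoring at follow_up.
    """
    if rng is None:
        rng = random.Random()
    rate = scale * hazard_ratio
    times = []
    t_prev = 0.0
    while True:
        # Lambda(t_next) - Lambda(t_prev) is a unit exponential draw.
        increment = rng.expovariate(1.0)
        t_next = (t_prev ** shape + increment / rate) ** (1.0 / shape)
        if t_next > follow_up:
            return times  # censored before the next event
        times.append(t_next)
        t_prev = t_next
```

    Under these assumptions the expected number of events per subject is rate · follow_up^shape, which gives a quick sanity check on simulated data before it is used for sample size planning.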

  3. Heavy metals in the gold mine soil of the upstream area of a metropolitan drinking water source.

    PubMed

    Ding, Huaijian; Ji, Hongbing; Tang, Lei; Zhang, Aixing; Guo, Xinyue; Li, Cai; Gao, Yang; Briki, Mergem

    2016-02-01

    Pinggu District is adjacent to the county of Miyun, which contains the largest drinking water source of Beijing (Miyun Reservoir). The Wanzhuang gold field and tailing deposits are located in Pinggu, threatening Beijing's drinking water security. In this study, soil samples were collected from the surface of the mining area and the tailings piles and analyzed for physical and chemical properties, as well as heavy metal contents and particle size fraction to study the relationship between degree of pollution and particle size. Most metal concentrations in the gold mine soil samples exceeded the background levels in Beijing. The spatial distribution of As, Cd, Cu, Pb, and Zn was the same, while that of Cr and Ni was relatively similar. Trace element concentrations increased in larger particles, decreased in the 50-74 μm size fraction, and were lowest in the <2 μm size fraction. Multivariate analysis showed that Cu, Cd, Zn, and Pb originated from anthropogenic sources, while Cr, Ni, and Sc were of natural origin. The geo-accumulation index indicated serious Pb, As, and Cd pollution, but moderate to no Ni, Cr, and Hg pollution. The Tucker 3 model revealed three factors for particle fractions, metals, and samples. There were two factors in model A and three factors for both the metals and samples (models B and C, respectively). The potential ecological risk index shows that most of the study areas have very high potential ecological risk, a small portion has high potential ecological risk, and only a few sampling points on the perimeter have moderate ecological risk, with higher risk closer to the mining area.

  4. The Discovery of Single-Nucleotide Polymorphisms—and Inferences about Human Demographic History

    PubMed Central

    Wakeley, John; Nielsen, Rasmus; Liu-Cordero, Shau Neen; Ardlie, Kristin

    2001-01-01

    A method of historical inference that accounts for ascertainment bias is developed and applied to single-nucleotide polymorphism (SNP) data in humans. The data consist of 84 short fragments of the genome that were selected, from three recent SNP surveys, to contain at least two polymorphisms in their respective ascertainment samples and that were then fully resequenced in 47 globally distributed individuals. Ascertainment bias is the deviation, from what would be observed in a random sample, caused either by discovery of polymorphisms in small samples or by locus selection based on levels or patterns of polymorphism. The three SNP surveys from which the present data were derived differ both in their protocols for ascertainment and in the size of the samples used for discovery. We implemented a Monte Carlo maximum-likelihood method to fit a subdivided-population model that includes a possible change in effective size at some time in the past. Incorrectly assuming that ascertainment bias does not exist causes errors in inference, affecting both estimates of migration rates and historical changes in size. Migration rates are overestimated when ascertainment bias is ignored. However, the direction of error in inferences about changes in effective population size (whether the population is inferred to be shrinking or growing) depends on whether either the numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We use the abbreviation “SDL,” for “SNP-discovered locus,” in recognition of the genomic-discovery context of SNPs. When ascertainment bias is modeled fully, both the number of SNPs per SDL and their allele frequencies support a scenario of growth in effective size in the context of a subdivided population. If subdivision is ignored, however, the hypothesis of constant effective population size cannot be rejected. 
An important conclusion of this work is that, in demographic or other studies, SNP data are useful only to the extent that their ascertainment can be modeled. PMID:11704929

  5. Robust model selection and the statistical classification of languages

    NASA Astrophysics Data System (ADS)

    García, J. E.; González-López, V. A.; Viola, M. L. L.

    2012-10-01

    In this paper we address the problem of model selection for the set of finite memory stochastic processes with finite alphabet, when the data is contaminated. We consider m independent samples, with more than half of them being realizations of the same stochastic process with law Q, which is the one we want to retrieve. We devise a model selection procedure such that for a sample size large enough, the selected process is the one with law Q. Our model selection strategy is based on estimating relative entropies to select a subset of samples that are realizations of the same law. Although the procedure is valid for any family of finite order Markov models, we will focus on the family of variable length Markov chain models, which include the fixed order Markov chain model family. We define the asymptotic breakdown point (ABDP) for a model selection procedure, and we show the ABDP for our procedure. This means that if the proportion of contaminated samples is smaller than the ABDP, then, as the sample size grows our procedure selects a model for the process with law Q. We also use our procedure in a setting where we have one sample formed by the concatenation of sub-samples of two or more stochastic processes, with most of the subsamples having law Q. We conducted a simulation study. In the application section we address the question of the statistical classification of languages according to their rhythmic features using speech samples. This is an important open problem in phonology. A persistent difficulty with this problem is that the speech samples correspond to several sentences produced by diverse speakers, corresponding to a mixture of distributions. The usual procedure to deal with this problem has been to choose a subset of the original sample which seems to best represent each language. The selection is made by listening to the samples. In our application we use the full dataset without any preselection of samples. We apply our robust methodology, estimating a model which represents the main law for each language. Our findings agree with the linguistic conjecture related to the rhythm of the languages included in our dataset.

  6. Bayesian sample size calculations in phase II clinical trials using a mixture of informative priors.

    PubMed

    Gajewski, Byron J; Mayo, Matthew S

    2006-08-15

    A number of researchers have discussed phase II clinical trials from a Bayesian perspective. A recent article by Mayo and Gajewski focuses on sample size calculations, which they determine by specifying an informative prior distribution and then calculating a posterior probability that the true response will exceed a prespecified target. In this article, we extend these sample size calculations to include a mixture of informative prior distributions. The mixture comes from several sources of information. For example, consider information from two (or more) clinicians. The first clinician is pessimistic about the drug and the second clinician is optimistic. We tabulate the results for sample size design using the fact that the simple mixture of Betas is a conjugate family for the Beta-Binomial model. We discuss the theoretical framework for these types of Bayesian designs and show that the Bayesian designs in this paper approximate this theoretical framework. Copyright 2006 John Wiley & Sons, Ltd.
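
    The conjugacy property mentioned in this abstract can be illustrated with a short sketch. It is not the authors' tabulation procedure; it only shows the standard update for a mixture of Beta priors after observing binomial data, followed by the posterior probability that the response rate exceeds a target. The prior parameters, data, and function names are hypothetical, and the Beta tail probability is approximated by simple midpoint integration, which assumes Beta parameters of at least 1.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_tail(a, b, threshold, steps=20000):
    """Approximate P(theta > threshold) for theta ~ Beta(a, b) by midpoint integration."""
    width = (1.0 - threshold) / steps
    log_norm = log_beta(a, b)
    total = 0.0
    for i in range(steps):
        theta = threshold + (i + 0.5) * width
        log_density = (a - 1) * math.log(theta) + (b - 1) * math.log1p(-theta) - log_norm
        total += math.exp(log_density)
    return total * width


def posterior_mixture(priors, responses, n):
    """Update a mixture of Beta priors, given as (weight, a, b) tuples, with binomial data."""
    log_terms = []
    for weight, a, b in priors:
        # Each weight is rescaled by that component's marginal likelihood.
        log_marginal = log_beta(a + responses, b + n - responses) - log_beta(a, b)
        log_terms.append(math.log(weight) + log_marginal)
    top = max(log_terms)
    unnormalised = [math.exp(term - top) for term in log_terms]
    total = sum(unnormalised)
    return [
        (u / total, a + responses, b + n - responses)
        for u, (_, a, b) in zip(unnormalised, priors)
    ]


def prob_exceeds(mixture, target):
    """Posterior probability that the response rate exceeds the target."""
    return sum(weight * beta_tail(a, b, target) for weight, a, b in mixture)
```

    For instance, a pessimistic clinician might be encoded as (0.5, 2, 8) and an optimistic one as (0.5, 8, 2). After the data are observed, the mixture weights shift toward whichever prior predicted the data better, and a sample size search would repeat this calculation over candidate values of n.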

  7. Finite mixture model: A maximum likelihood estimation approach on time series data

    NASA Astrophysics Data System (ADS)

    Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad

    2014-09-01

    Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its asymptotic properties. In addition, it is consistent as the sample size increases to infinity, which illustrates that maximum likelihood estimation is an asymptotically unbiased estimator. Moreover, the parameter estimates obtained from maximum likelihood estimation have the smallest variance compared with other statistical methods as the sample size increases. Thus, maximum likelihood estimation is adopted in this paper to fit the two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, Philippines and Indonesia. Results show a negative relationship between rubber price and exchange rate for all selected countries.

  8. Assessing accuracy of point fire intervals across landscapes with simulation modelling

    Treesearch

    Russell A. Parsons; Emily K. Heyerdahl; Robert E. Keane; Brigitte Dorner; Joseph Fall

    2007-01-01

    We assessed accuracy in point fire intervals using a simulation model that sampled four spatially explicit simulated fire histories. These histories varied in fire frequency and size and were simulated on a flat landscape with two forest types (dry versus mesic). We used three sampling designs (random, systematic grids, and stratified). We assessed the sensitivity of...

  9. Examination of Polytomous Items' Psychometric Properties According to Nonparametric Item Response Theory Models in Different Test Conditions

    ERIC Educational Resources Information Center

    Sengul Avsar, Asiye; Tavsancil, Ezel

    2017-01-01

    This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three sample sizes (100, 250 and 500)--were generated by conducting 20…

  10. Bias and Precision of Measures of Association for a Fixed-Effect Multivariate Analysis of Variance Model

    ERIC Educational Resources Information Center

    Kim, Soyoung; Olejnik, Stephen

    2005-01-01

    The sampling distributions of five popular measures of association with and without two bias adjusting methods were examined for the single factor fixed-effects multivariate analysis of variance model. The number of groups, sample sizes, number of outcomes, and the strength of association were manipulated. The results indicate that all five…

  11. Designing clinical trials to test disease-modifying agents: application to the treatment trials of Alzheimer's disease.

    PubMed

    Xiong, Chengjie; van Belle, Gerald; Miller, J Philip; Morris, John C

    2011-02-01

    Therapeutic trials of disease-modifying agents on Alzheimer's disease (AD) require novel designs and analyses involving switch of treatments for at least a portion of subjects enrolled. Randomized start and randomized withdrawal designs are two examples of such designs. Crucial design parameters such as sample size and the time of treatment switch are important to understand in designing such clinical trials. The purpose of this article is to provide methods to determine sample sizes and time of treatment switch as well as optimum statistical tests of treatment efficacy for clinical trials of disease-modifying agents on AD. A general linear mixed effects model is proposed to test the disease-modifying efficacy of novel therapeutic agents on AD. This model links the longitudinal growth from both the placebo arm and the treatment arm at the time of treatment switch for those in the delayed treatment arm or early withdrawal arm and incorporates the potential correlation on the rate of cognitive change before and after the treatment switch. Sample sizes and the optimum time for treatment switch of such trials as well as optimum test statistic for the treatment efficacy are determined according to the model. Assuming an evenly spaced longitudinal design over a fixed duration, the optimum treatment switching time in a randomized start or a randomized withdrawal trial is half way through the trial. With the optimum test statistic for the treatment efficacy and over a wide spectrum of model parameters, the optimum sample size allocations are fairly close to the simplest design with a sample size ratio of 1:1:1 among the treatment arm, the delayed treatment or early withdrawal arm, and the placebo arm. The application of the proposed methodology to AD provides evidence that much larger sample sizes are required to adequately power disease-modifying trials when compared with those for symptomatic agents, even when the treatment switch time and efficacy test are optimally chosen. The proposed method assumes that the only and immediate effect of treatment switch is on the rate of cognitive change. Crucial design parameters for the clinical trials of disease-modifying agents on AD can be optimally chosen. Government and industry officials as well as academic researchers should consider the optimum use of the clinical trials design for disease-modifying agents on AD in their effort to search for the treatments with the potential to modify the underlying pathophysiology of AD.

  12. Go big or … don't? A field-based diet evaluation of freshwater piscivore and prey fish size relationships

    PubMed Central

    Ahrenstorff, Tyler D.; Diana, James S.; Fetzer, William W.; Jones, Thomas S.; Lawson, Zach J.; McInerny, Michael C.; Santucci, Victor J.; Vander Zanden, M. Jake

    2018-01-01

    Body size governs predator-prey interactions, which in turn structure populations, communities, and food webs. Understanding predator-prey size relationships is valuable from a theoretical perspective, in basic research, and for management applications. However, predator-prey size data are limited and costly to acquire. We quantified predator-prey total length and mass relationships for several freshwater piscivorous taxa: crappie (Pomoxis spp.), largemouth bass (Micropterus salmoides), muskellunge (Esox masquinongy), northern pike (Esox lucius), rock bass (Ambloplites rupestris), smallmouth bass (Micropterus dolomieu), and walleye (Sander vitreus). The range of prey total lengths increased with predator total length. The median and maximum ingested prey total length varied with predator taxon and length, but generally ranged from 10–20% and 32–46% of predator total length, respectively. Predators tended to consume larger fusiform prey than laterally compressed prey. With the exception of large muskellunge, predators most commonly consumed prey between 16 and 73 mm. A sensitivity analysis indicated estimates can be very accurate at sample sizes greater than 1,000 diet items and fairly accurate at sample sizes greater than 100. However, sample sizes less than 50 should be evaluated with caution. Furthermore, median log10 predator-prey body mass ratios ranged from 1.9–2.5, nearly 50% lower than values previously reported for freshwater fishes. Managers, researchers, and modelers could use our findings as a tool for numerous predator-prey evaluations from stocking size optimization to individual-based bioenergetics analyses identifying prey size structure. To this end, we have developed a web-based user interface to maximize the utility of our models that can be found at www.LakeEcologyLab.org/pred_prey. PMID:29543856

  13. Go big or … don't? A field-based diet evaluation of freshwater piscivore and prey fish size relationships.

    PubMed

    Gaeta, Jereme W; Ahrenstorff, Tyler D; Diana, James S; Fetzer, William W; Jones, Thomas S; Lawson, Zach J; McInerny, Michael C; Santucci, Victor J; Vander Zanden, M Jake

    2018-01-01

    Body size governs predator-prey interactions, which in turn structure populations, communities, and food webs. Understanding predator-prey size relationships is valuable from a theoretical perspective, in basic research, and for management applications. However, predator-prey size data are limited and costly to acquire. We quantified predator-prey total length and mass relationships for several freshwater piscivorous taxa: crappie (Pomoxis spp.), largemouth bass (Micropterus salmoides), muskellunge (Esox masquinongy), northern pike (Esox lucius), rock bass (Ambloplites rupestris), smallmouth bass (Micropterus dolomieu), and walleye (Sander vitreus). The range of prey total lengths increased with predator total length. The median and maximum ingested prey total length varied with predator taxon and length, but generally ranged from 10-20% and 32-46% of predator total length, respectively. Predators tended to consume larger fusiform prey than laterally compressed prey. With the exception of large muskellunge, predators most commonly consumed prey between 16 and 73 mm. A sensitivity analysis indicated estimates can be very accurate at sample sizes greater than 1,000 diet items and fairly accurate at sample sizes greater than 100. However, sample sizes less than 50 should be evaluated with caution. Furthermore, median log10 predator-prey body mass ratios ranged from 1.9-2.5, nearly 50% lower than values previously reported for freshwater fishes. Managers, researchers, and modelers could use our findings as a tool for numerous predator-prey evaluations from stocking size optimization to individual-based bioenergetics analyses identifying prey size structure. To this end, we have developed a web-based user interface to maximize the utility of our models that can be found at www.LakeEcologyLab.org/pred_prey.

  14. Emission characteristics and chemical components of size-segregated particulate matter in iron and steel industry

    NASA Astrophysics Data System (ADS)

    Jia, Jia; Cheng, Shuiyuan; Yao, Sen; Xu, Tiebing; Zhang, Tingting; Ma, Yuetao; Wang, Hongliang; Duan, Wenjiao

    2018-06-01

    As one of the highest energy consumption and pollution industries, the iron and steel industry is regarded as a most important source of particulate matter emission. In this study, chemical components of size-segregated particulate matter (PM) emitted from different manufacturing units in the iron and steel industry were sampled by a comprehensive sampling system. Results showed that the average particle mass concentration was highest in the sintering process, followed by the puddling, steelmaking and then rolling processes. PM samples were divided into eight size fractions for testing the chemical components. SO₄²⁻ and NH₄⁺ were distributed more into fine particles, while most of the Ca²⁺ was concentrated in coarse particles; the size distribution of mineral elements depended on the raw materials applied. Moreover, a local database with PM chemical source profiles of the iron and steel industry was built and applied in CMAQ modeling for simulating SO₄²⁻ and NO₃⁻ concentrations; results showed that the accuracy of the model simulation improved with local chemical source profiles compared to the SPECIATE database. The results gained from this study are expected to be helpful in understanding the components of PM in the iron and steel industry and to contribute to source apportionment research.

  15. Measuring size evolution of distant, faint galaxies in the radio regime

    NASA Astrophysics Data System (ADS)

    Lindroos, L.; Knudsen, K. K.; Stanley, F.; Muxlow, T. W. B.; Beswick, R. J.; Conway, J.; Radcliffe, J. F.; Wrigley, N.

    2018-05-01

    We measure the evolution of sizes for star-forming galaxies as seen in 1.4 GHz continuum radio for z = 0-3. The measurements are based on combined VLA+MERLIN data of the Hubble Deep Field, using a uv-stacking algorithm combined with model fitting to estimate the average sizes of galaxies. A sample of ~1000 star-forming galaxies is selected from optical and near-infrared catalogues, with stellar masses M* ≈ 10^10-10^11 M⊙ and photometric redshifts 0-3. The median sizes are parametrized for stellar mass M* = 5 × 10^10 M⊙ as R_e = A × (H(z)/H(1.5))^{α_z}. We find that the median radio sizes evolve towards larger sizes at later times with α_z = -1.1 ± 0.6, and A (the median size at z ≈ 1.5) is found to be 0.26″ ± 0.07″ or 2.3 ± 0.6 kpc. The measured radio sizes are typically a factor of 2 smaller than those measured in the optical, and are also smaller than the typical Hα sizes in the literature. This indicates that star formation, as traced by the radio continuum, is typically concentrated towards the centre of galaxies, for the sampled redshift range. Furthermore, the discrepancy of measured sizes from different tracers of star formation indicates the need for models of size evolution to adopt a multiwavelength approach in the measurement of the sizes of star-forming regions.
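
    The parametrization quoted in this abstract can be evaluated numerically. The sketch below is illustrative only: it assumes a flat ΛCDM cosmology with Ωm = 0.3 and ΩΛ = 0.7 (the abstract does not state the cosmology used), and the function names are hypothetical. Only the central values A = 2.3 kpc and α_z = -1.1 come from the abstract; their uncertainties are ignored here.

```python
import math


def hubble_ratio(z, z_ref=1.5, omega_m=0.3, omega_lambda=0.7):
    """Return H(z)/H(z_ref) for a flat LCDM cosmology.

    The density parameters are assumptions, not values from the abstract.
    """
    def e(zz):
        # Dimensionless expansion rate E(z) = H(z)/H0 (radiation neglected).
        return math.sqrt(omega_m * (1.0 + zz) ** 3 + omega_lambda)

    return e(z) / e(z_ref)


def median_radio_size(z, a=2.3, alpha_z=-1.1):
    """R_e = A * (H(z)/H(1.5))**alpha_z, with A in kpc at z ~ 1.5."""
    return a * hubble_ratio(z) ** alpha_z


if __name__ == "__main__":
    for z in (0.5, 1.5, 3.0):
        print(f"z = {z:.1f}: R_e ~ {median_radio_size(z):.2f} kpc")
```

    Because α_z is negative and H(z) grows with redshift, the computed size is larger at lower redshift, which matches the stated trend towards larger sizes at later times.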

  16. Assessing residential exposure to urban noise using environmental models: does the size of the local living neighborhood matter?

    PubMed

    Tenailleau, Quentin M; Bernard, Nadine; Pujol, Sophie; Houot, Hélène; Joly, Daniel; Mauny, Frédéric

    2015-01-01

    Environmental epidemiological studies rely on the quantification of the exposure level in a surface defined as the subject's exposure area. For residential exposure, this area is often the subject's neighborhood. However, the variability of the size and nature of the neighborhoods makes comparison of the findings across studies difficult. This article examines the impact of the neighborhood's definition on environmental noise exposure levels obtained from four commonly used sampling techniques: address point, façade, buffers, and official zoning. A high-definition noise model, built on a middle-sized French city, has been used to estimate LAeq,24 h exposure in the vicinity of 10,825 residential buildings. Twelve noise exposure indicators have been used to assess inhabitants' exposure. Influence of urban environmental factors was analyzed using multilevel modeling. When the sampled area increases, the average exposure increases (+3.9 dB), whereas the SD decreases (-1.6 dB) (P<0.01). Most of the indicators differ statistically. When comparing indicators from the 50-m and 400-m radius buffers, the assigned LAeq,24 h level varies across buildings from -9.4 to +22.3 dB. This variation is influenced by urban environmental characteristics (P<0.01). On the basis of this study's findings, sampling technique, neighborhood size, and environmental composition should be carefully considered in further exposure studies.

  17. Assessing the Application of a Geographic Presence-Only Model for Land Suitability Mapping

    PubMed Central

    Heumann, Benjamin W.; Walsh, Stephen J.; McDaniel, Phillip M.

    2011-01-01

    Recent advances in ecological modeling have focused on novel methods for characterizing the environment that use presence-only data and machine-learning algorithms to predict the likelihood of species occurrence. These novel methods may have great potential for land suitability applications in the developing world where detailed land cover information is often unavailable or incomplete. This paper assesses the adaptation and application of the presence-only geographic species distribution model, MaxEnt, for agricultural crop suitability mapping in rural Thailand where lowland paddy rice and upland field crops predominate. To assess this modeling approach, three independent crop presence datasets were used including a social-demographic survey of farm households, a remote sensing classification of land use/land cover, and ground control points, used for geodetic and thematic reference that vary in their geographic distribution and sample size. Disparate environmental data were integrated to characterize environmental settings across Nang Rong District, a region of approximately 1,300 sq. km in size. Results indicate that the MaxEnt model is capable of modeling crop suitability for upland and lowland crops, including rice varieties, although model results varied between datasets due to the high sensitivity of the model to the distribution of observed crop locations in geographic and environmental space. Accuracy assessments indicate that model outcomes were influenced by the sample size and the distribution of sample points in geographic and environmental space. The need for further research into accuracy assessments of presence-only models lacking true absence data is discussed. We conclude that the MaxEnt model can provide good estimates of crop suitability, but many areas need to be carefully scrutinized including geographic distribution of input data and assessment methods to ensure realistic modeling results. PMID:21860606

  18. Gravel-Sand-Clay Mixture Model for Predictions of Permeability and Velocity of Unconsolidated Sediments

    NASA Astrophysics Data System (ADS)

    Konishi, C.

    2014-12-01

    A gravel-sand-clay mixture model is proposed particularly for unconsolidated sediments to predict permeability and velocity from volume fractions of the three components (i.e. gravel, sand, and clay). A well-known sand-clay mixture model or bimodal mixture model treats clay contents as the volume fraction of the small particle and the rest of the volume is considered as that of the large particle. This simple approach has been commonly accepted and has been validated by many studies. However, a collection of laboratory measurements of permeability and grain size distribution for unconsolidated samples shows an impact of the presence of another large particle; i.e. only a few percent of gravel particles increases the permeability of the sample significantly. This observation cannot be explained by the bimodal mixture model and it suggests the necessity of considering the gravel-sand-clay mixture model. In the proposed model, I consider the three volume fractions of each component instead of using only the clay contents. Sand becomes either the larger or the smaller particle in the three-component mixture model, whereas it is always the large particle in the bimodal mixture model. The total porosity of the two cases, one in which the sand is the smaller particle and the other in which the sand is the larger particle, can be modeled independently from the sand volume fraction in the same fashion as in the bimodal model. However, the two cases can co-exist in one sample; thus, the total porosity of the mixed sample is calculated as a weighted average of the two cases by the volume fractions of gravel and clay. The effective porosity is distinguished from the total porosity by assuming that the porosity associated with clay has zero effective porosity. In addition, the effective grain size can be computed from the volume fractions and representative grain sizes for each component.
    Using the effective porosity and the effective grain size, the permeability is predicted by the Kozeny-Carman equation. Furthermore, elastic properties are obtainable by general Hashin-Shtrikman-Walpole bounds. The predicted results by this new mixture model are qualitatively consistent with laboratory measurements and well logs obtained for unconsolidated sediments. Acknowledgement: A part of this study was accomplished with a subsidy of River Environment Fund of Japan.
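
    The final step described in this abstract, predicting permeability from effective porosity and effective grain size, can be sketched as follows. This is a minimal illustration, not the author's model: it uses a common textbook form of the Kozeny-Carman equation, k = φ³d² / (C(1 − φ)²) with C = 180, and the harmonic-mean rule for the effective grain size is an assumption introduced here, since the abstract does not say how that quantity is computed. All function names and input values are hypothetical.

```python
def effective_grain_size(fractions, sizes):
    """Volume-weighted harmonic mean grain size.

    The averaging rule is an assumption for illustration; the abstract
    does not specify how the effective grain size is obtained.
    """
    total = sum(fractions)
    return total / sum(f / d for f, d in zip(fractions, sizes))


def kozeny_carman(porosity, grain_size, constant=180.0):
    """Permeability in m^2 for a grain size in m.

    Uses k = phi^3 * d^2 / (C * (1 - phi)^2); C = 180 is a commonly
    quoted value for packed spheres, assumed here.
    """
    if not 0.0 < porosity < 1.0:
        raise ValueError("porosity must lie strictly between 0 and 1")
    return porosity ** 3 * grain_size ** 2 / (constant * (1.0 - porosity) ** 2)


if __name__ == "__main__":
    # Hypothetical gravel, sand and clay volume fractions and grain sizes (m).
    d_eff = effective_grain_size([0.1, 0.7, 0.2], [5e-3, 3e-4, 2e-6])
    print(f"effective grain size: {d_eff:.3e} m")
    print(f"permeability: {kozeny_carman(0.25, d_eff):.3e} m^2")
```

    The effective porosity passed to kozeny_carman would come from the mixture model itself, which is not reproduced in this sketch.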

  19. Size-selective separation of polydisperse gold nanoparticles in supercritical ethane.

    PubMed

    Williams, Dylan P; Satherley, John

    2009-04-09

    The aim of this study was to use supercritical ethane to selectively disperse alkanethiol-stabilized gold nanoparticles of one size from a polydisperse sample in order to recover a monodisperse fraction of the nanoparticles. A disperse sample of metal nanoparticles with diameters in the range of 1-5 nm was prepared using established techniques then further purified by Soxhlet extraction. The purified sample was subjected to supercritical ethane at a temperature of 318 K in the pressure range 50-276 bar. Particles were characterized by UV-vis absorption spectroscopy, TEM, and MALDI-TOF mass spectroscopy. The results show that with increasing pressure the dispersibility of the nanoparticles increases; this effect is most pronounced for smaller nanoparticles. At the highest pressure investigated, a sample of the particles was effectively stripped of all the smaller particles, leaving a monodisperse sample. The relationship between dispersibility and supercritical fluid density for two different size samples of alkanethiol-stabilized gold nanoparticles was considered using the Chrastil chemical equilibrium model.

  20. Swimsuit issues: promoting positive body image in young women's magazines.

    PubMed

    Boyd, Elizabeth Reid; Moncrieff-Boyd, Jessica

    2011-08-01

    This preliminary study reviews the promotion of healthy body image to young Australian women, following the 2009 introduction of the voluntary Industry Code of Conduct on Body Image. The Code includes using diverse sized models in magazines. A qualitative content analysis of the 2010 annual 'swimsuit issues' was conducted on 10 Australian young women's magazines. Pictorial and/or textual editorial evidence of promoting diverse body shapes and sizes was regarded as indicative of the magazines' upholding aspects of the voluntary Code of Conduct for Body Image. Diverse sized models were incorporated in four of the seven magazines with swimsuit features sampled. Body size differentials were presented as part of the swimsuit features in three of the magazines sampled. Tips for diverse body type enhancement were included in four of the magazines. All magazines met at least one criterion. One magazine displayed evidence of all three criteria. Preliminary examination suggests that more than half of young women's magazines are upholding elements of the voluntary Code of Conduct for Body Image, through representation of diverse-sized women in their swimsuit issues.

  1. Connecting Clump Sizes in Turbulent Disk Galaxies to Instability Theory

    NASA Astrophysics Data System (ADS)

    Fisher, David B.; Glazebrook, Karl; Abraham, Roberto G.; Damjanov, Ivana; White, Heidi A.; Obreschkow, Danail; Basset, Robert; Bekiaris, Georgios; Wisnioski, Emily; Green, Andy; Bolatto, Alberto D.

    2017-04-01

    In this letter we study the mean sizes of Hα clumps in turbulent disk galaxies relative to kinematics, gas fractions, and Toomre Q. We use ~100 pc resolution HST images, IFU kinematics, and gas fractions of a sample of rare, nearby turbulent disks with properties closely matched to z ~ 1.5-2 main-sequence galaxies (the DYNAMO sample). We find linear correlations of normalized mean clump sizes with both the gas fraction and the velocity dispersion-to-rotation velocity ratio of the host galaxy. We show that these correlations are consistent with predictions derived from a model of instabilities in a self-gravitating disk (the so-called “violent disk instability model”). We also observe, using a two-fluid model for Q, a correlation between the size of clumps and self-gravity-driven unstable regions. These results are most consistent with the hypothesis that massive star-forming clumps in turbulent disks are the result of instabilities in self-gravitating gas-rich disks, and therefore provide a direct connection between resolved clump sizes and this in situ mechanism.

  2. Infraocclusion: Dental development and associated dental variations in singletons and twins.

    PubMed

    Odeh, Ruba; Townsend, Grant; Mihailidis, Suzanna; Lähdesmäki, Raija; Hughes, Toby; Brook, Alan

    2015-09-01

    The aim of this study was to investigate the prevalence of selected dental variations in association with infraocclusion, as well as determining the effects of infraocclusion on dental development and tooth size, in singletons and twins. Two samples were analysed. The first sample comprised 1454 panoramic radiographs of singleton boys and girls aged 8-11 years. The second sample comprised dental models of 202 pairs of monozygotic and dizygotic twins aged 8-11 years. Adobe Photoshop CS5 was used to construct reference lines and measure the extent of infraocclusion (in mm) of primary molars on the panoramic radiographs and on 2D images obtained from the dental models. The panoramic radiographs were examined for the presence of selected dental variations and to assess dental development following the Demirjian and Willems systems. The twins' dental models were measured to assess mesiodistal crown widths. In the singleton sample there was a significant association of canines in an altered position during eruption and the lateral incisor complex (agenesis and/or small tooth size) with infraocclusion (P<0.001), but there was no significant association between infraocclusion and agenesis of premolars. Dental age assessment revealed that dental development was delayed in individuals with infraocclusion compared to controls. The primary mandibular canines were significantly smaller in size in the infraoccluded group (P<0.05). The presence of other dental variations in association with infraocclusion, as well as delayed dental development and reduced tooth size, suggests the presence of a pleiotropic effect. The underlying aetiological factors may be genetic and/or epigenetic. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. A single test for rejecting the null hypothesis in subgroups and in the overall sample.

    PubMed

    Lin, Yunzhi; Zhou, Kefei; Ganju, Jitendra

    2017-01-01

    In clinical trials, some patient subgroups are likely to demonstrate larger effect sizes than other subgroups. For example, the effect size, or informally the benefit with treatment, is often greater in patients with a moderate condition of a disease than in those with a mild condition. A limitation of the usual method of analysis is that it does not incorporate this ordering of effect size by patient subgroup. We propose a test statistic which supplements the conventional test by including this information and simultaneously tests the null hypothesis in pre-specified subgroups and in the overall sample. It results in more power than the conventional test when the differences in effect sizes across subgroups are at least moderately large; otherwise it loses power. The method involves combining p-values from models fit to pre-specified subgroups and the overall sample in a manner that assigns greater weight to subgroups in which a larger effect size is expected. Results are presented for randomized trials with two and three subgroups.

  4. Sampling guidelines for oral fluid-based surveys of group-housed animals.

    PubMed

    Rotolo, Marisa L; Sun, Yaxuan; Wang, Chong; Giménez-Lirola, Luis; Baum, David H; Gauger, Phillip C; Harmon, Karen M; Hoogland, Marlin; Main, Rodger; Zimmerman, Jeffrey J

    2017-09-01

    Formulas and software for calculating sample size for surveys based on individual animal samples are readily available. However, sample size formulas are not available for oral fluids and other aggregate samples that are increasingly used in production settings. Therefore, the objective of this study was to develop sampling guidelines for oral fluid-based porcine reproductive and respiratory syndrome virus (PRRSV) surveys in commercial swine farms. Oral fluid samples were collected in 9 weekly samplings from all pens in 3 barns on one production site beginning shortly after placement of weaned pigs. Samples (n=972) were tested by real-time reverse-transcription PCR (RT-rtPCR) and the binary results analyzed using a piecewise exponential survival model for interval-censored, time-to-event data with misclassification. Thereafter, simulation studies were used to study the barn-level probability of PRRSV detection as a function of sample size, sample allocation (simple random sampling vs fixed spatial sampling), assay diagnostic sensitivity and specificity, and pen-level prevalence. These studies provided estimates of the probability of detection by sample size and within-barn prevalence. Detection using fixed spatial sampling was as good as, or better than, simple random sampling. Sampling multiple barns on a site increased the probability of detection with the number of barns sampled. These results are relevant to PRRSV control or elimination projects at the herd, regional, or national levels, but the results are also broadly applicable to contagious pathogens of swine for which oral fluid tests of equivalent performance are available. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
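
    The relationship between sample size and probability of detection that this abstract studies by simulation can be approximated with a simple closed form. The sketch below is a deliberately simplified stand-in, not the authors' piecewise exponential survival model: it assumes independent pen-level samples, perfect specificity, and a fixed pen-level prevalence, so that P(detect) = 1 − (1 − prevalence × sensitivity)^n. Function names and the example values are hypothetical.

```python
import math


def detection_probability(n_samples, prevalence, sensitivity=1.0):
    """P(at least one positive result) among n independent pen samples.

    Assumes perfect specificity and independence between pens, which is
    a simplification of the simulation approach in the abstract.
    """
    p_pos = prevalence * sensitivity
    return 1.0 - (1.0 - p_pos) ** n_samples


def samples_needed(target, prevalence, sensitivity=1.0):
    """Smallest n whose detection probability reaches the target."""
    p_pos = prevalence * sensitivity
    if not 0.0 < p_pos < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("target and prevalence*sensitivity must be in (0, 1)")
    n = max(1, math.ceil(math.log(1.0 - target) / math.log(1.0 - p_pos)))
    # Guard against floating-point rounding at the boundary.
    while detection_probability(n, prevalence, sensitivity) < target:
        n += 1
    return n


if __name__ == "__main__":
    for n in (2, 6, 12, 24):
        print(n, round(detection_probability(n, prevalence=0.1, sensitivity=0.9), 3))
    print("n for 95% detection:", samples_needed(0.95, prevalence=0.1, sensitivity=0.9))
```

    Spatial clustering of infected pens, which the abstract addresses through fixed spatial versus simple random sampling, is not captured by this independence assumption.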

  5. A Generalized Least Squares Regression Approach for Computing Effect Sizes in Single-Case Research: Application Examples

    ERIC Educational Resources Information Center

    Maggin, Daniel M.; Swaminathan, Hariharan; Rogers, Helen J.; O'Keeffe, Breda V.; Sugai, George; Horner, Robert H.

    2011-01-01

    A new method for deriving effect sizes from single-case designs is proposed. The strategy is applicable to small-sample time-series data with autoregressive errors. The method uses Generalized Least Squares (GLS) to model the autocorrelation of the data and estimate regression parameters to produce an effect size that represents the magnitude of…

  6. Alpha spectrometric characterization of process-related particle size distributions from active particle sampling at the Los Alamos National Laboratory uranium foundry

    NASA Astrophysics Data System (ADS)

    Plionis, A. A.; Peterson, D. S.; Tandon, L.; LaMont, S. P.

    2010-03-01

    Uranium particles within the respirable size range pose a significant hazard to the health and safety of workers. Significant differences in the deposition and incorporation patterns of aerosols within the respirable range can be identified and integrated into sophisticated health physics models. Data characterizing the uranium particle size distribution resulting from specific foundry-related processes are needed. Using personal air sampling cascade impactors, particles collected from several foundry processes were sorted by activity median aerodynamic diameter onto various Marple substrates. After an initial gravimetric assessment of each impactor stage, the substrates were analyzed by alpha spectrometry to determine the uranium content of each stage. Alpha spectrometry provides rapid non-destructive isotopic data that can distinguish process uranium from natural sources and the degree of uranium contribution to the total accumulated particle load. In addition, the particle size bins utilized by the impactors provide adequate resolution to determine if a process particle size distribution is lognormal, bimodal, or trimodal. Data on process uranium particle size values and distributions facilitate the development of more sophisticated and accurate models for internal dosimetry, resulting in an improved understanding of foundry worker health and safety.

  7. R2 effect-size measures for mediation analysis

    PubMed Central

    Fairchild, Amanda J.; MacKinnon, David P.; Taborga, Marcia P.; Taylor, Aaron B.

    2010-01-01

    R2 effect-size measures are presented to assess variance accounted for in mediation models. The measures offer a means to evaluate both component paths and the overall mediated effect in mediation models. Statistical simulation results indicate acceptable bias across varying parameter and sample-size combinations. The measures are applied to a real-world example using data from a team-based health promotion program to improve the nutrition and exercise habits of firefighters. SAS and SPSS computer code are also provided for researchers to compute the measures in their own data. PMID:19363189

  8. R2 effect-size measures for mediation analysis.

    PubMed

    Fairchild, Amanda J; Mackinnon, David P; Taborga, Marcia P; Taylor, Aaron B

    2009-05-01

    R(2) effect-size measures are presented to assess variance accounted for in mediation models. The measures offer a means to evaluate both component paths and the overall mediated effect in mediation models. Statistical simulation results indicate acceptable bias across varying parameter and sample-size combinations. The measures are applied to a real-world example using data from a team-based health promotion program to improve the nutrition and exercise habits of firefighters. SAS and SPSS computer code are also provided for researchers to compute the measures in their own data.

  9. Joint inversion of NMR and SIP data to estimate pore size distribution of geomaterials

    NASA Astrophysics Data System (ADS)

    Niu, Qifei; Zhang, Chi

    2018-03-01

    There is growing interest in using geophysical tools to characterize the microstructure of geomaterials because of their non-invasive nature and applicability in the field. In these applications, multiple types of geophysical data sets are usually processed separately, which may be inadequate to constrain the key features of target variables. Therefore, simultaneous processing of multiple data sets could potentially improve the resolution. In this study, we propose a method to estimate pore size distribution by joint inversion of nuclear magnetic resonance (NMR) T2 relaxation and spectral induced polarization (SIP) spectra. The petrophysical relation between NMR T2 relaxation time and SIP relaxation time is incorporated in a nonlinear least squares problem formulation, which is solved using the Gauss-Newton method. The joint inversion scheme is applied to a synthetic sample and a Berea sandstone sample. The jointly estimated pore size distributions are very close to the true model and to results from another experimental method. Even when the knowledge of the petrophysical models of the sample is incomplete, the joint inversion can still capture the main features of the pore size distribution of the samples, including the general shape and relative peak positions of the distribution curves. It is also found from the numerical example that the surface relaxivity of the sample could be extracted with the joint inversion of NMR and SIP data if the diffusion coefficient of the ions in the electrical double layer is known. Compared to individual inversions, the joint inversion could improve the resolution of the estimated pore size distribution because of the addition of extra data sets. The proposed approach might constitute a first step towards a comprehensive joint inversion that can extract the full pore geometry information of a geomaterial from NMR and SIP data.

  10. Enumerative and binomial sampling plans for citrus mealybug (Homoptera: Pseudococcidae) in citrus groves.

    PubMed

    Martínez-Ferrer, María Teresa; Ripollés, José Luís; Garcia-Marí, Ferran

    2006-06-01

    The spatial distribution of the citrus mealybug, Planococcus citri (Risso) (Homoptera: Pseudococcidae), was studied in citrus groves in northeastern Spain. Constant precision sampling plans were designed for all developmental stages of citrus mealybug under the fruit calyx, for late stages on fruit, and for females on trunks and main branches; more than 66, 286, and 101 data sets, respectively, were collected from nine commercial fields during 1992-1998. Dispersion parameters were determined using Taylor's power law, giving aggregated spatial patterns for citrus mealybug populations in three locations of the tree sampled. A significant relationship between the number of insects per organ and the percentage of occupied organs was established using either Wilson and Room's binomial model or Kono and Sugino's empirical formula. Constant precision (E = 0.25) sampling plans (i.e., enumerative plans) for estimating mean densities were developed using Green's equation and the two binomial models. For making management decisions, enumerative counts may be less labor-intensive than binomial sampling. Therefore, we recommend enumerative sampling plans for the use in an integrated pest management program in citrus. Required sample sizes for the range of population densities near current management thresholds, in the three plant locations calyx, fruit, and trunk were 50, 110-330, and 30, respectively. Binomial sampling, especially the empirical model, required a higher sample size to achieve equivalent levels of precision.
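
    The link between Taylor's power law and a constant-precision sample size can be made concrete. If the variance follows s² = a·m^b and precision E is defined as the standard error divided by the mean, then E² = s²/(n·m²), so n = a·m^(b−2)/E². The sketch below evaluates that relation; the abstract does not report the fitted a and b, so the coefficients used here are hypothetical, and whether this is the exact form of Green's equation the authors applied is not stated.

```python
import math


def taylor_variance(mean, a, b):
    """Taylor's power law: variance = a * mean**b."""
    return a * mean ** b


def enumerative_sample_size(mean, a, b, precision=0.25):
    """Sample size n giving SE/mean = precision under Taylor's power law.

    Derived from precision**2 = variance / (n * mean**2).
    """
    if mean <= 0 or precision <= 0:
        raise ValueError("mean and precision must be positive")
    return math.ceil(a * mean ** (b - 2.0) / precision ** 2)


if __name__ == "__main__":
    # Hypothetical Taylor coefficients for an aggregated population (b > 1).
    a, b = 2.0, 1.5
    for density in (0.25, 1.0, 4.0):
        n = enumerative_sample_size(density, a, b)
        print(f"mean density {density}: n = {n}")
```

    With b < 2 the required sample size falls as mean density rises, which is why sample sizes are usually quoted for the densities near management thresholds, as in the abstract.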

  11. Hierarchical distance-sampling models to estimate population size and habitat-specific abundance of an island endemic

    USGS Publications Warehouse

    Sillett, Scott T.; Chandler, Richard B.; Royle, J. Andrew; Kéry, Marc; Morrison, Scott A.

    2012-01-01

    Population size and habitat-specific abundance estimates are essential for conservation management. A major impediment to obtaining such estimates is that few statistical models are able to simultaneously account for both spatial variation in abundance and heterogeneity in detection probability, and still be amenable to large-scale applications. The hierarchical distance-sampling model of J. A. Royle, D. K. Dawson, and S. Bates provides a practical solution. Here, we extend this model to estimate habitat-specific abundance and rangewide population size of a bird species of management concern, the Island Scrub-Jay (Aphelocoma insularis), which occurs solely on Santa Cruz Island, California, USA. We surveyed 307 randomly selected, 300 m diameter, point locations throughout the 250-km² island during October 2008 and April 2009. Population size was estimated to be 2267 (95% CI 1613-3007) and 1705 (1212-2369) during the fall and spring respectively, considerably lower than a previously published but statistically problematic estimate of 12 500. This large discrepancy emphasizes the importance of proper survey design and analysis for obtaining reliable information for management decisions. Jays were most abundant in low-elevation chaparral habitat; the detection function depended primarily on the percent cover of chaparral and forest within count circles. Vegetation change on the island has been dramatic in recent decades, due to release from herbivory following the eradication of feral sheep (Ovis aries) from the majority of the island in the mid-1980s. We applied best-fit fall and spring models of habitat-specific jay abundance to a vegetation map from 1985, and estimated the population size of A. insularis was 1400-1500 at that time. The 20-30% increase in the jay population suggests that the species has benefited from the recovery of native vegetation since sheep removal.
    Nevertheless, this jay's tiny range and small population size make it vulnerable to natural disasters and to habitat alteration related to climate change. Our results demonstrate that hierarchical distance-sampling models hold promise for estimating population size and spatial density variation at large scales. Our statistical methods have been incorporated into the R package unmarked to facilitate their use by animal ecologists, and we provide annotated code in the Supplement.

  12. Influence of grain size and sintering temperature grain size on the critical behavior near the paramagnetic to ferromagnetic phase transition temperature in La0.67Sr0.33MnO3 nanoparticles

    NASA Astrophysics Data System (ADS)

    Baaziz, H.; Tozri, A.; Dhahri, E.; Hlil, E. K.

    2018-03-01

    We have undertaken a systematic study of critical behavior in La0.67Sr0.33MnO3 nanoparticles, sintered at different temperatures (L6, L8, L10 and L12 sintered at 600 °C, 800 °C, 1000 °C, 1200 °C respectively), by magnetization measurements. The critical exponents are estimated by various techniques such as the Modified Arrott plot, Kouvel-Fisher plot and critical isotherm technique. Compared to standard models, the critical exponents are close to those expected by the Mean-field model (with β = 0.5, γ = 1, and δ = 3) for the L6, L8, and L10 samples and by the (3D) Heisenberg model (β = 0.365, γ = 1.336 and δ = 4.80) for the L12 sample. We conclude that the reduction of grain size strongly influences the universality class.

  13. Determination of grain-size characteristics from electromagnetic seabed mapping data: A NW Iberian shelf study

    NASA Astrophysics Data System (ADS)

    Baasch, Benjamin; Müller, Hendrik; von Dobeneck, Tilo; Oberle, Ferdinand K. J.

    2017-05-01

    The electric conductivity and magnetic susceptibility of sediments are fundamental parameters in environmental geophysics. Both can be derived from marine electromagnetic profiling, a novel, fast and non-invasive seafloor mapping technique. Here we present statistical evidence that electric conductivity and magnetic susceptibility can help to determine physical grain-size characteristics (size, sorting and mud content) of marine surficial sediments. Electromagnetic data acquired with the bottom-towed electromagnetic profiler MARUM NERIDIS III were analysed and compared with grain size data from 33 samples across the NW Iberian continental shelf. A negative correlation between mean grain size and conductivity (R=-0.79) as well as mean grain size and susceptibility (R=-0.78) was found. Simple and multiple linear regression analyses were carried out to predict mean grain size, mud content and the standard deviation of the grain-size distribution from conductivity and susceptibility. The comparison of both methods showed that multiple linear regression models predict the grain-size distribution characteristics better than the simple models. This exemplary study demonstrates that electromagnetic benthic profiling is capable of estimating mean grain size, sorting and mud content of marine surficial sediments at a very high significance level. Transfer functions can be calibrated using grain-size data from a few reference samples and extrapolated along shelf-wide survey lines. This study suggests that electromagnetic benthic profiling should play a larger role for coastal zone management, seafloor contamination and sediment provenance studies in worldwide continental shelf systems.

  14. Microstress, strain, band gap tuning and photocatalytic properties of thermally annealed and Cu-doped ZnO nanoparticles

    NASA Astrophysics Data System (ADS)

    Prasad, Neena; V. M. M, Saipavitra; Swaminathan, Hariharan; Thangaraj, Pandiyarajan; Ramalinga Viswanathan, Mangalaraja; Balasubramanian, Karthikeyan

    2016-06-01

    ZnO nanoparticles and Cu-doped ZnO nanoparticles were prepared by co-precipitation method. Also, a part of the pure ZnO nanoparticles were annealed at 750 °C for 3, 6, and 9 h. X-ray diffraction studies were carried out and the lattice parameters, unit cell volume, interplanar spacing, and Young's modulus were calculated for all the samples, and also the crystallite size was found using the Scherrer method. X-ray peak broadening analysis was used to estimate the crystallite sizes and the strain using the Williamson-Hall (W-H) method and the size-strain plot (SSP) method. Stress and the energy density were calculated using the W-H method assuming different models such as uniform deformation model, uniform strain deformation model, uniform deformation energy density model, and the SSP method. Optical absorption properties of the samples were understood from their UV-visible spectra. Photocatalytic activities of ZnO and 5 % Cu-doped ZnO were observed by the degradation of methylene blue dye in aqueous medium under the irradiation of 20-W compact fluorescent lamp for an hour.

  15. Accounting for missing data in the estimation of contemporary genetic effective population size (N(e)).

    PubMed

    Peel, D; Waples, R S; Macbeth, G M; Do, C; Ovenden, J R

    2013-03-01

    Theoretical models are often applied to population genetic data sets without fully considering the effect of missing data. Researchers can deal with missing data by removing individuals that have failed to yield genotypes and/or by removing loci that have failed to yield allelic determinations, but despite their best efforts, most data sets still contain some missing data. As a consequence, realized sample size differs among loci, and this poses a problem for unbiased methods that must explicitly account for random sampling error. One commonly used solution for the calculation of contemporary effective population size (N(e)) is to calculate the effective sample size as an unweighted mean or harmonic mean across loci. This is not ideal because it fails to account for the fact that loci with different numbers of alleles have different information content. Here we consider this problem for genetic estimators of contemporary effective population size (N(e)). To evaluate bias and precision of several statistical approaches for dealing with missing data, we simulated populations with known N(e) and various degrees of missing data. Across all scenarios, one method of correcting for missing data (fixed-inverse variance-weighted harmonic mean) consistently performed the best for both single-sample and two-sample (temporal) methods of estimating N(e) and outperformed some methods currently in widespread use. The approach adopted here may be a starting point to adjust other population genetics methods that include per-locus sample size components. © 2012 Blackwell Publishing Ltd.
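
    As a rough illustration of the two simple summaries mentioned above (the unweighted mean and the harmonic mean of per-locus sample sizes), the following Python sketch compares them on made-up counts. The numbers and variable names are hypothetical and are not taken from the study; the inverse-variance-weighted variant that the authors favour is not reproduced here because its weights are not given in the abstract.

```python
from statistics import harmonic_mean, mean

# Hypothetical per-locus counts of successfully genotyped individuals
# (illustrative numbers only, not data from the study).
genotyped_per_locus = [50, 48, 30, 50, 42]

# Unweighted arithmetic mean across loci.
arithmetic_n = mean(genotyped_per_locus)

# Unweighted harmonic mean across loci; it is pulled down more strongly
# by loci with many missing genotypes.
harmonic_n = harmonic_mean(genotyped_per_locus)

print(f"unweighted mean sample size: {arithmetic_n:.2f}")
print(f"harmonic mean sample size:   {harmonic_n:.2f}")
```

    The harmonic mean never exceeds the arithmetic mean, so it gives a more conservative effective sample size when a few loci have many missing genotypes.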

  16. Using meta-analysis to inform the design of subsequent studies of diagnostic test accuracy.

    PubMed

    Hinchliffe, Sally R; Crowther, Michael J; Phillips, Robert S; Sutton, Alex J

    2013-06-01

    An individual diagnostic accuracy study rarely provides enough information to make conclusive recommendations about the accuracy of a diagnostic test; particularly when the study is small. Meta-analysis methods provide a way of combining information from multiple studies, reducing uncertainty in the result and hopefully providing substantial evidence to underpin reliable clinical decision-making. Very few investigators consider any sample size calculations when designing a new diagnostic accuracy study. However, it is important to consider the number of subjects in a new study in order to achieve a precise measure of accuracy. Sutton et al. have suggested previously that when designing a new therapeutic trial, it could be more beneficial to consider the power of the updated meta-analysis including the new trial rather than of the new trial itself. The methodology involves simulating new studies for a range of sample sizes and estimating the power of the updated meta-analysis with each new study added. Plotting the power values against the range of sample sizes allows the clinician to make an informed decision about the sample size of a new trial. This paper extends this approach from the trial setting and applies it to diagnostic accuracy studies. Several meta-analytic models are considered including bivariate random effects meta-analysis that models the correlation between sensitivity and specificity. Copyright © 2012 John Wiley & Sons, Ltd.

  17. Evaluation of alternative model selection criteria in the analysis of unimodal response curves using CART

    USGS Publications Warehouse

    Ribic, C.A.; Miller, T.W.

    1998-01-01

    We investigated CART performance with a unimodal response curve for one continuous response and four continuous explanatory variables, where two variables were important (i.e., directly related to the response) and the other two were not. We explored performance under three relationship strengths and two explanatory variable conditions: equal importance and one variable four times as important as the other. We compared CART variable selection performance using three tree-selection rules ('minimum risk', 'minimum risk complexity', 'one standard error') to stepwise polynomial ordinary least squares (OLS) under four sample size conditions. The one-standard-error and minimum-risk-complexity methods performed about as well as stepwise OLS with large sample sizes when the relationship was strong. With weaker relationships, equally important explanatory variables and larger sample sizes, the one-standard-error and minimum-risk-complexity rules performed better than stepwise OLS. With weaker relationships and explanatory variables of unequal importance, tree-structured methods did not perform as well as stepwise OLS. Comparing performance within tree-structured methods, with a strong relationship and equally important explanatory variables, the one-standard-error-rule was more likely to choose the correct model than were the other tree-selection rules 1) with weaker relationships and equally important explanatory variables; and 2) under all relationship strengths when explanatory variables were of unequal importance and sample sizes were lower.

  18. Investigating effects of sample pretreatment on protein stability using size-exclusion chromatography and high-resolution continuum source atomic absorption spectrometry.

    PubMed

    Rakow, Tobias; El Deeb, Sami; Hahne, Thomas; El-Hady, Deia Abd; AlBishri, Hassan M; Wätzig, Hermann

    2014-09-01

    In this study, size-exclusion chromatography and high-resolution atomic absorption spectrometry methods have been developed and evaluated to test the stability of proteins during sample pretreatment. This especially includes different storage conditions but also adsorption before or even during the chromatographic process. For the development of the size exclusion method, a Biosep S3000 5 μm column was used for investigating a series of representative model proteins, namely bovine serum albumin, ovalbumin, monoclonal immunoglobulin G antibody, and myoglobin. Ambient temperature storage was found to be harmful to all model proteins, whereas short-term storage up to 14 days could be done in an ordinary refrigerator. Freezing the protein solutions was always complicated and had to be evaluated for each protein in the corresponding solvent. To keep the proteins in their native state a gentle freezing temperature should be chosen, hence liquid nitrogen should be avoided. Furthermore, a high-resolution continuum source atomic absorption spectrometry method was developed to observe the adsorption of proteins on container material and chromatographic columns. Adsorption to any container led to a sample loss and lowered the recovery rates. During the pretreatment and high-performance size-exclusion chromatography, adsorption caused sample losses of up to 33%. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Constraining Ω0 with the Angular Size-Redshift Relation of Double-lobed Quasars in the FIRST Survey

    NASA Astrophysics Data System (ADS)

    Buchalter, Ari; Helfand, David J.; Becker, Robert H.; White, Richard L.

    1998-02-01

    In previous attempts to measure cosmological parameters from the angular size-redshift (θ-z) relation of double-lobed radio sources, the observed data have generally been consistent with a static Euclidean universe rather than with standard Friedmann models, and past authors have disagreed significantly as to what effects are responsible for this observation. These results and different interpretations may be due largely to a variety of selection effects and differences in the sample definitions destroying the integrity of the data sets, and inconsistencies in the analysis undermining the results. Using the VLA FIRST survey, we investigate the θ-z relation for a new sample of double-lobed quasars. We define a set of 103 sources, carefully addressing the various potential problems that, we believe, have compromised past work, including a robust definition of size and the completeness and homogeneity of the sample, and further devise a self-consistent method to assure accurate morphological classification and account for finite resolution effects in the analysis. Before focusing on cosmological constraints, we investigate the possible impact of correlations among the intrinsic properties of these sources over the entire assumed range of allowed cosmological parameter values. For all cases, we find apparent size evolution of the form l ~ (1 + z)^c, with c ~ -0.8 ± 0.4, which is found to arise mainly from a power-size correlation of the form l ~ P^β (β ~ -0.13 ± 0.06) coupled with a power-redshift correlation. Intrinsic size evolution is consistent with zero. We also find that in all cases, a subsample with c ~ 0 can be defined, whose θ-z relation should therefore arise primarily from cosmological effects. These results are found to be independent of orientation effects, although other evidence indicates that orientation effects are present and consistent with predictions of the unified scheme for radio-loud active galactic nuclei.
The above results are all confirmed by nonparametric analysis. Contrary to past work, we find that the observed θ-z relation for our sample is more consistent with standard Friedmann models than with a static Euclidean universe. Though the current data cannot distinguish with high significance between various Friedmann models, significant constraints on the cosmological parameters within a given model are obtained. In particular, we find that a flat, matter-dominated universe (Ω0 = 1), a flat universe with a cosmological constant, and an open universe all provide comparably good fits to the data, with the latter two models both yielding Ω0 ~ 0.35 with 1 σ ranges including values between ~0.25 and 1.0; the c ~ 0 subsamples yield values of Ω0 near unity in these models, though with even greater error ranges. We also examine the values of H0 implied by the data, using plausible assumptions about the intrinsic source sizes, and find these to be consistent with the currently accepted range of values. We determine the sample size needed to improve significantly the results and outline future strategies for such work.

  20. Multiscale Simulation of Porous Ceramics Based on Movable Cellular Automaton Method

    NASA Astrophysics Data System (ADS)

    Smolin, A.; Smolin, I.; Eremina, G.; Smolina, I.

    2017-10-01

    The paper presents a model for simulating mechanical behaviour of multiscale porous ceramics based on movable cellular automaton method, which is a novel particle method in computational mechanics of solids. The initial scale of the proposed approach corresponds to the characteristic size of the smallest pores in the ceramics. At this scale, we model uniaxial compression of several representative samples with an explicit account of pores of the same size but with random, unique positions in space. As a result, we get the average values of Young’s modulus and strength, as well as the parameters of the Weibull distribution of these properties at the current scale level. These data allow us to describe the material behaviour at the next scale level, where only the larger pores are considered explicitly, while the influence of small pores is included via the effective properties determined at the previous scale level. If the pore size distribution function of the material has N maxima we need to perform computations for N - 1 levels in order to get the properties from the lowest scale up to the macroscale step by step. The proposed approach was applied to modelling zirconia ceramics with bimodal pore size distribution. The obtained results show correct behaviour of the model sample at the macroscale.

  1. PDF-based heterogeneous multiscale filtration model.

    PubMed

    Gong, Jian; Rutland, Christopher J

    2015-04-21

    Motivated by modeling of gasoline particulate filters (GPFs), a probability density function (PDF) based heterogeneous multiscale filtration (HMF) model is developed to calculate filtration efficiency of clean particulate filters. A new methodology based on statistical theory and classic filtration theory is developed in the HMF model. Based on the analysis of experimental porosimetry data, a pore size probability density function is introduced to represent heterogeneity and multiscale characteristics of the porous wall. The filtration efficiency of a filter can be calculated as the sum of the contributions of individual collectors. The resulting HMF model overcomes the limitations of classic mean filtration models which rely on tuning of the mean collector size. Sensitivity analysis shows that the HMF model recovers the classical mean model when the pore size variance is very small. The HMF model is validated by fundamental filtration experimental data from different scales of filter samples. The model shows a good agreement with experimental data at various operating conditions. The effects of the microstructure of filters on filtration efficiency as well as the most penetrating particle size are correctly predicted by the model.

  2. Change-in-ratio methods for estimating population size

    USGS Publications Warehouse

    Udevitz, Mark S.; Pollock, Kenneth H.; McCullough, Dale R.; Barrett, Reginald H.

    2002-01-01

    Change-in-ratio (CIR) methods can provide an effective, low cost approach for estimating the size of wildlife populations. They rely on being able to observe changes in proportions of population subclasses that result from the removal of a known number of individuals from the population. These methods were first introduced in the 1940’s to estimate the size of populations with 2 subclasses under the assumption of equal subclass encounter probabilities. Over the next 40 years, closed population CIR models were developed to consider additional subclasses and use additional sampling periods. Models with assumptions about how encounter probabilities vary over time, rather than between subclasses, also received some attention. Recently, all of these CIR models have been shown to be special cases of a more general model. Under the general model, information from additional samples can be used to test assumptions about the encounter probabilities and to provide estimates of subclass sizes under relaxations of these assumptions. These developments have greatly extended the applicability of the methods. CIR methods are attractive because they do not require the marking of individuals, and subclass proportions often can be estimated with relatively simple sampling procedures. However, CIR methods require a carefully monitored removal of individuals from the population, and the estimates will be of poor quality unless the removals induce substantial changes in subclass proportions. In this paper, we review the state of the art for closed population estimation with CIR methods. Our emphasis is on the assumptions of CIR methods and on identifying situations where these methods are likely to be effective. We also identify some important areas for future CIR research.
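
    To make the basic idea concrete, here is a minimal Python sketch of the classical two-subclass CIR estimator, which follows algebraically from assuming equal encounter probabilities for both subclasses and a closed population apart from the known removals. The function name, argument names and the numbers are hypothetical and chosen only for illustration; the general model reviewed in the paper is not implemented.

```python
def cir_population_estimate(p1, p2, removed_x, removed_total):
    """Two-subclass change-in-ratio estimate of pre-removal population size.

    p1, p2        -- estimated proportion of subclass x before and after removal
    removed_x     -- known number of subclass-x individuals removed
    removed_total -- known total number of individuals removed

    Derivation: p2 = (p1 * N1 - removed_x) / (N1 - removed_total),
    solved for the pre-removal population size N1.
    """
    if p1 == p2:
        # No change in subclass proportions: the estimator is undefined.
        raise ValueError("removals must change the subclass proportion")
    return (removed_x - removed_total * p2) / (p1 - p2)


# Illustrative check: 1000 animals, 400 of subclass x (p1 = 0.4).
# Remove 300 animals, 250 of them subclass x, leaving 150 of 700.
estimate = cir_population_estimate(
    p1=0.4, p2=150 / 700, removed_x=250, removed_total=300
)
print(round(estimate))
```

    The denominator p1 - p2 shows why the abstract stresses that removals must induce substantial changes in subclass proportions: when the change is small, sampling error in the proportions is greatly amplified.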

  3. Cost-efficient designs for three-arm trials with treatment delivered by health professionals: Sample sizes for a combination of nested and crossed designs

    PubMed Central

    Moerbeek, Mirjam

    2018-01-01

    Background This article studies the design of trials that compare three treatment conditions that are delivered by two types of health professionals. The one type of health professional delivers one treatment, and the other type delivers two treatments, hence, this design is a combination of a nested and crossed design. As each health professional treats multiple patients, the data have a nested structure. This nested structure has thus far been ignored in the design of such trials, which may result in an underestimate of the required sample size. In the design stage, the sample sizes should be determined such that a desired power is achieved for each of the three pairwise comparisons, while keeping costs or sample size at a minimum. Methods The statistical model that relates outcome to treatment condition and explicitly takes the nested data structure into account is presented. Mathematical expressions that relate sample size to power are derived for each of the three pairwise comparisons on the basis of this model. The cost-efficient design achieves sufficient power for each pairwise comparison at lowest costs. Alternatively, one may minimize the total number of patients. The sample sizes are found numerically and an Internet application is available for this purpose. The design is also compared to a nested design in which each health professional delivers just one treatment. Results Mathematical expressions show that this design is more efficient than the nested design. For each pairwise comparison, power increases with the number of health professionals and the number of patients per health professional. The methodology of finding a cost-efficient design is illustrated using a trial that compares treatments for social phobia. The optimal sample sizes reflect the costs for training and supervising psychologists and psychiatrists, and the patient-level costs in the three treatment conditions. 
Conclusion This article provides the methodology for designing trials that compare three treatment conditions while taking the nesting of patients within health professionals into account. As such, it helps to avoid underpowered trials. To use the methodology, a priori estimates of the total outcome variances and intraclass correlation coefficients must be obtained from experts’ opinions or findings in the literature. PMID:29316807

  4. The use of group sequential, information-based sample size re-estimation in the design of the PRIMO study of chronic kidney disease.

    PubMed

    Pritchett, Yili; Jemiai, Yannis; Chang, Yuchiao; Bhan, Ishir; Agarwal, Rajiv; Zoccali, Carmine; Wanner, Christoph; Lloyd-Jones, Donald; Cannata-Andía, Jorge B; Thompson, Taylor; Appelbaum, Evan; Audhya, Paul; Andress, Dennis; Zhang, Wuyan; Solomon, Scott; Manning, Warren J; Thadhani, Ravi

    2011-04-01

    Chronic kidney disease is associated with a marked increase in risk for left ventricular hypertrophy and cardiovascular mortality compared with the general population. Therapy with vitamin D receptor activators has been linked with reduced mortality in chronic kidney disease and an improvement in left ventricular hypertrophy in animal studies. PRIMO (Paricalcitol capsules benefits in Renal failure Induced cardiac MOrbidity) is a multinational, multicenter randomized controlled trial to assess the effects of paricalcitol (a selective vitamin D receptor activator) on mild to moderate left ventricular hypertrophy in patients with chronic kidney disease. Subjects with mild-moderate chronic kidney disease are randomized to paricalcitol or placebo after confirming left ventricular hypertrophy using a cardiac echocardiogram. Cardiac magnetic resonance imaging is then used to assess left ventricular mass index at baseline, 24 and 48 weeks, which is the primary efficacy endpoint of the study. Because of limited prior data to estimate sample size, a maximum information group sequential design with sample size re-estimation is implemented to allow sample size adjustment based on the nuisance parameter estimated using the interim data. An interim efficacy analysis is planned at a pre-specified time point conditioned on the status of enrollment. The decision to increase sample size depends on the observed treatment effect. A repeated measures analysis model, using available data at Week 24 and 48 with a backup model of an ANCOVA analyzing change from baseline to the final nonmissing observation, are pre-specified to evaluate the treatment effect. Gamma-family of spending function is employed to control family-wise Type I error rate as stopping for success is planned in the interim efficacy analysis.
If enrollment is slower than anticipated, the smaller sample size used in the interim efficacy analysis and the greater percent of missing week 48 data might decrease the parameter estimation accuracy, either for the nuisance parameter or for the treatment effect, which might in turn affect the interim decision-making. The application of combining a group sequential design with a sample-size re-estimation in clinical trial design has the potential to improve efficiency and to increase the probability of trial success while ensuring integrity of the study.

  5. Sample size determination for a three-arm equivalence trial of Poisson and negative binomial responses.

    PubMed

    Chang, Yu-Wei; Tsong, Yi; Zhao, Zhigen

    2017-01-01

    Assessing equivalence or similarity has drawn much attention recently as many drug products have lost or will lose their patents in the next few years, especially certain best-selling biologics. To claim equivalence between the test treatment and the reference treatment when assay sensitivity is well established from historical data, one has to demonstrate both superiority of the test treatment over placebo and equivalence between the test treatment and the reference treatment. Thus, there is urgency for practitioners to derive a practical way to calculate sample size for a three-arm equivalence trial. The primary endpoints of a clinical trial may not always be continuous, but may be discrete. In this paper, the authors derive power function and discuss sample size requirement for a three-arm equivalence trial with Poisson and negative binomial clinical endpoints. In addition, the authors examine the effect of the dispersion parameter on the power and the sample size by varying its coefficient from small to large. In extensive numerical studies, the authors demonstrate that required sample size heavily depends on the dispersion parameter. Therefore, misusing a Poisson model for negative binomial data may easily lose power up to 20%, depending on the value of the dispersion parameter.

  6. Elaboration of austenitic stainless steel samples with bimodal grain size distributions and investigation of their mechanical behavior

    NASA Astrophysics Data System (ADS)

    Flipon, B.; de la Cruz, L. Garcia; Hug, E.; Keller, C.; Barbe, F.

    2017-10-01

    Samples of 316L austenitic stainless steel with bimodal grain size distributions are elaborated using two distinct routes. The first one is based on powder metallurgy using spark plasma sintering of two powders with different particle sizes. The second route applies the reverse-annealing method: it consists in inducing martensitic phase transformation by plastic strain and further annealing in order to obtain two austenitic grain populations with different sizes. Microstructural analyses reveal that both methods are suitable to generate significant grain size contrast and to control this contrast according to the elaboration conditions. Mechanical properties under tension are then characterized for different grain size distributions. Crystal plasticity finite element modelling is further applied in a configuration of bimodal distribution to analyse the role played by coarse grains within a matrix of fine grains, considering not only their volume fraction but also their spatial arrangement.

  7. Monitoring landscape metrics by point sampling: accuracy in estimating Shannon's diversity and edge density.

    PubMed

    Ramezani, Habib; Holm, Sören; Allard, Anna; Ståhl, Göran

    2010-05-01

    Environmental monitoring of landscapes is of increasing interest. To quantify landscape patterns, a number of metrics are used, of which Shannon's diversity, edge length, and density are studied here. As an alternative to complete mapping, point sampling was applied to estimate the metrics for already mapped landscapes selected from the National Inventory of Landscapes in Sweden (NILS). Monte-Carlo simulation was applied to study the performance of different designs. Random and systematic samplings were applied for four sample sizes and five buffer widths. The latter feature was relevant for edge length, since length was estimated through the number of points falling in buffer areas around edges. In addition, two landscape complexities were tested by applying two classification schemes with seven or 20 land cover classes to the NILS data. As expected, the root mean square error (RMSE) of the estimators decreased with increasing sample size. The estimators of both metrics were slightly biased, but the bias of Shannon's diversity estimator was shown to decrease when sample size increased. In the edge length case, an increasing buffer width resulted in larger bias due to the increased impact of boundary conditions; this effect was shown to be independent of sample size. However, we also developed adjusted estimators that eliminate the bias of the edge length estimator. The rates of decrease of RMSE with increasing sample size and buffer width were quantified by a regression model. Finally, indicative cost-accuracy relationships were derived showing that point sampling could be a competitive alternative to complete wall-to-wall mapping.
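
    The point-sampling idea for Shannon's diversity can be sketched in a few lines of Python. The sketch below assumes the simple plug-in estimator (class proportions estimated from the sampled points and substituted into H = -Σ p ln p); the land cover classes, their proportions and the sample sizes are invented for illustration, and neither the NILS data nor the authors' edge length estimators are reproduced.

```python
import math
import random
from collections import Counter


def shannon_diversity(class_labels):
    """Plug-in Shannon diversity H = -sum(p * ln p) from sampled class labels."""
    counts = Counter(class_labels)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Hypothetical landscape: land cover classes and their true area proportions.
classes = ["forest", "arable", "water", "urban"]
true_proportions = [0.5, 0.3, 0.15, 0.05]
true_h = -sum(p * math.log(p) for p in true_proportions)

# Simple random point sampling at increasing sample sizes.
rng = random.Random(1)  # fixed seed so the sketch is repeatable
for n_points in (25, 100, 400):
    sample = rng.choices(classes, weights=true_proportions, k=n_points)
    print(n_points, round(shannon_diversity(sample), 3), round(true_h, 3))
```

    Repeating the sampling many times for each sample size, as in the Monte-Carlo simulation described above, would give the bias and RMSE of the estimator.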

  8. Modelling forest canopy height by integrating airborne LiDAR samples with satellite Radar and multispectral imagery

    NASA Astrophysics Data System (ADS)

    García, Mariano; Saatchi, Sassan; Ustin, Susan; Balzter, Heiko

    2018-04-01

    Spatially-explicit information on forest structure is paramount to estimating aboveground carbon stocks for designing sustainable forest management strategies and mitigating greenhouse gas emissions from deforestation and forest degradation. LiDAR measurements provide samples of forest structure that must be integrated with satellite imagery to predict and to map landscape scale variations of forest structure. Here we evaluate the capability of existing satellite synthetic aperture radar (SAR) with multispectral data to estimate forest canopy height over five study sites across two biomes in North America, namely temperate broadleaf and mixed forests and temperate coniferous forests. Pixel size affected the modelling results, with an improvement in model performance as pixel resolution coarsened from 25 m to 100 m. Likewise, the sample size was an important factor in the uncertainty of height prediction using the Support Vector Machine modelling approach. Larger sample size yielded better results but the improvement stabilised when the sample size reached approximately 10% of the study area. We also evaluated the impact of surface moisture (soil and vegetation moisture) on the modelling approach. Whereas the impact of surface moisture had a moderate effect on the proportion of the variance explained by the model (up to 14%), its impact was more evident in the bias of the models with bias reaching values up to 4 m. Averaging the incidence angle corrected radar backscatter coefficient (γ°) reduced the impact of surface moisture on the models and improved their performance at all study sites, with R2 ranging between 0.61 and 0.82, RMSE between 2.02 and 5.64 and bias between 0.02 and -0.06, respectively, at 100 m spatial resolution. 
An evaluation of the relative importance of the variables in the model performance showed that for the study sites located within the temperate broadleaf and mixed forests biome ALOS-PALSAR HV polarised backscatter was the most important variable, with Landsat Tasselled Cap Transformation components barely contributing to the models for two of the study sites whereas it had a significant contribution at the third one. Over the temperate conifer forests, Landsat Tasselled Cap variables contributed more than the ALOS-PALSAR HV band to predict the landscape height variability. In all cases, incorporation of multispectral data improved the retrieval of forest canopy height and reduced the estimation uncertainty for tall forests. Finally, we concluded that models trained at one study site had higher uncertainty when applied to other sites, but a model developed from multiple sites performed equally to site-specific models to predict forest canopy height. This result suggests that a biome level model developed from several study sites can be used as a reliable estimator of biome-level forest structure from existing satellite imagery.

  9. Ambient air particulates and particulate-bound mercury Hg(p) concentrations: dry deposition study over a Traffic, Airport, Park (T.A.P.) areas during years of 2011-2012.

    PubMed

    Fang, Guor-Cheng; Lin, Yen-Heng; Zheng, Yu-Cheng

    2016-02-01

    The main purpose of this study was to monitor ambient air particles and particulate-bound mercury Hg(p) in total suspended particulate (TSP) concentrations and dry deposition at the Hung Kuang (Traffic), Taichung airport and Westing Park sampling sites during the daytime and nighttime, from 2011 to 2012. In addition, the calculated/measured dry deposition flux ratios of ambient air particles and particulate-bound mercury Hg(p) were studied with the Baklanov & Sorensen and the Williams models. For a particle size of 10 μm, the Baklanov & Sorensen model yielded better predictions of dry deposition of ambient air particulates and particulate-bound mercury Hg(p) at the Hung Kuang (Traffic), Taichung airport and Westing Park sampling sites during the daytime and nighttime sampling periods. However, for particulates with sizes of 20-23 μm, the Williams model provided better predictions for ambient air particulates and particulate-bound mercury Hg(p) at all sampling sites in this study.

  10. Variability in group size and the evolution of collective action.

    PubMed

    Peña, Jorge; Nöldeke, Georg

    2016-01-21

    Models of the evolution of collective action typically assume that interactions occur in groups of identical size. In contrast, social interactions between animals occur in groups of widely dispersed size. This paper models collective action problems as two-strategy multiplayer games and studies the effect of variability in group size on the evolution of cooperative behavior under the replicator dynamics. The analysis identifies elementary conditions on the payoff structure of the game implying that the evolution of cooperative behavior is promoted or inhibited when the group size experienced by a focal player is more or less variable. Similar but more stringent conditions are applicable when the confounding effect of size-biased sampling, which causes the group-size distribution experienced by a focal player to differ from the statistical distribution of group sizes, is taken into account. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. A Model Based Approach to Sample Size Estimation in Recent Onset Type 1 Diabetes

    PubMed Central

    Bundy, Brian; Krischer, Jeffrey P.

    2016-01-01

    The area under the curve C-peptide following a 2-hour mixed meal tolerance test from 481 individuals enrolled on 5 prior TrialNet studies of recent onset type 1 diabetes from baseline to 12 months after enrollment were modelled to produce estimates of its rate of loss and variance. Age at diagnosis and baseline C-peptide were found to be significant predictors and adjusting for these in an ANCOVA resulted in estimates with lower variance. Using these results as planning parameters for new studies results in a nearly 50% reduction in the target sample size. The modelling also produces an expected C-peptide that can be used in Observed vs. Expected calculations to estimate the presumption of benefit in ongoing trials. PMID:26991448

  12. Development of a simplified optical technique for the simultaneous measurement of particle size distribution and velocity

    NASA Technical Reports Server (NTRS)

    Smith, J. L.

    1983-01-01

    Existing techniques were surveyed, an experimental procedure was developed, a laboratory test model was fabricated, limited data were recovered for proof of principle, and the relationship between particle size distribution and amplitude measurements was illustrated in an effort to develop a low cost, simplified optical technique for measuring particle size distributions and velocities in fluidized bed combustors and gasifiers. A He-Ne laser illuminated Ronchi rulings (range 10 to 500 lines per inch). Various samples of known particle size distributions were passed through the fringe pattern produced by the rulings. A photomultiplier tube converted light from the fringe volume to an electrical signal which was recorded using an oscilloscope and camera. The signal amplitudes were correlated against the known particle size distributions. The correlation holds true for various samples.

  13. A Fracture Mechanics Approach to Thermal Shock Investigation in Alumina-Based Refractory

    NASA Astrophysics Data System (ADS)

    Volkov-Husović, T.; Heinemann, R. Jančić; Mitraković, D.

    2008-02-01

    The thermal shock behavior of large grain size, alumina-based refractories was investigated experimentally using a standard water quench test. A mathematical model was employed to simulate the thermal stability behavior. Behavior of the samples under repeated thermal shock was monitored using ultrasonic measurements of dynamic Young's modulus. Image analysis was used to observe the extent of surface degradation. Analysis of the obtained results for the behavior of large grain size samples under conditions of rapid temperature changes is given.

  14. Power and sample size for multivariate logistic modeling of unmatched case-control studies.

    PubMed

    Gail, Mitchell H; Haneuse, Sebastien

    2017-01-01

    Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we outline the theory needed to detect scalar exposure effects or scalar interactions while controlling for other covariates in logistic regression. Both analytical and simulation methods are presented, together with links to the corresponding software.

  15. Modeling and Simulation of A Microchannel Cooling System for Vitrification of Cells and Tissues.

    PubMed

    Wang, Y; Zhou, X M; Jiang, C J; Yu, Y T

    The microchannel heat exchange system has several advantages and can be used to enhance heat transfer for vitrification. The aim was to evaluate the microchannel cooling method and to analyze the effects of key parameters such as channel structure, flow rate and sample size. A computational flow dynamics model is applied to study the two-phase flow in microchannels and its related heat transfer process. The fluid-solid coupling problem is solved with a whole field solution method (i.e., flow profile in channels and temperature distribution in the system being simulated simultaneously). Simulation indicates that a cooling rate >10^4 °C/min is easily achievable using the microchannel method with a high flow rate for a broad range of sample sizes. Channel size and the material used have a significant impact on cooling performance. Computational flow dynamics is useful for optimizing the design and operation of the microchannel system.

  16. Estimating modal abundances from the spectra of natural and laboratory pyroxene mixtures using the modified Gaussian model

    NASA Technical Reports Server (NTRS)

    Sunshine, Jessica M.; Pieters, Carle M.

    1993-01-01

    The modified Gaussian model (MGM) is used to explore spectra of samples containing multiple pyroxene components as a function of modal abundance. The MGM allows spectra to be analyzed directly, without the use of actual or assumed end-member spectra and therefore holds great promise for remote applications. A series of mass fraction mixtures created from several different particle size fractions are analyzed with the MGM to quantify the properties of pyroxene mixtures as a function of both modal abundance and grain size. Band centers, band widths, and relative band strengths of absorptions from individual pyroxenes in mixture spectra are found to be largely independent of particle size. Spectral properties of both zoned and exsolved pyroxene components are resolved in exsolved samples using the MGM, and modal abundances are accurately estimated to within 5-10 percent without predetermined knowledge of the end-member spectra.

  17. Data splitting for artificial neural networks using SOM-based stratified sampling.

    PubMed

    May, R J; Maier, H R; Dandy, G C

    2010-03-01

    Data splitting is an important consideration during artificial neural network (ANN) development where hold-out cross-validation is commonly employed to ensure generalization. Even for a moderate sample size, the sampling methodology used for data splitting can have a significant effect on the quality of the subsets used for training, testing and validating an ANN. Poor data splitting can result in inaccurate and highly variable model performance; however, the choice of sampling methodology is rarely given due consideration by ANN modellers. Increased confidence in the sampling is of paramount importance, since the hold-out sampling is generally performed only once during ANN development. This paper considers the variability in the quality of subsets that are obtained using different data splitting approaches. A novel approach to stratified sampling, based on Neyman sampling of the self-organizing map (SOM), is developed, with several guidelines identified for setting the SOM size and sample allocation in order to minimize the bias and variance in the datasets. Using an example ANN function approximation task, the SOM-based approach is evaluated in comparison to random sampling, DUPLEX, systematic stratified sampling, and trial-and-error sampling to minimize the statistical differences between data sets. Of these approaches, DUPLEX is found to provide benchmark performance with good model performance, with no variability. The results show that the SOM-based approach also reliably generates high-quality samples and can therefore be used with greater confidence than other approaches, especially in the case of non-uniform datasets, with the benefit of scalability to perform data splitting on large datasets. Copyright 2009 Elsevier Ltd. All rights reserved.

  18. Effects of sample size and sampling frequency on studies of brown bear home ranges and habitat use

    USGS Publications Warehouse

    Arthur, Steve M.; Schwartz, Charles C.

    1999-01-01

    We equipped 9 brown bears (Ursus arctos) on the Kenai Peninsula, Alaska, with collars containing both conventional very-high-frequency (VHF) transmitters and global positioning system (GPS) receivers programmed to determine an animal's position at 5.75-hr intervals. We calculated minimum convex polygon (MCP) and fixed and adaptive kernel home ranges for randomly-selected subsets of the GPS data to examine the effects of sample size on accuracy and precision of home range estimates. We also compared results obtained by weekly aerial radiotracking versus more frequent GPS locations to test for biases in conventional radiotracking data. Home ranges based on the MCP were 20-606 km2 (x = 201) for aerial radiotracking data (n = 12-16 locations/bear) and 116-1,505 km2 (x = 522) for the complete GPS data sets (n = 245-466 locations/bear). Fixed kernel home ranges were 34-955 km2 (x = 224) for radiotracking data and 16-130 km2 (x = 60) for the GPS data. Differences between means for radiotracking and GPS data were due primarily to the larger samples provided by the GPS data. Means did not differ between radiotracking data and equivalent-sized subsets of GPS data (P > 0.10). For the MCP, home range area increased and variability decreased asymptotically with number of locations. For the kernel models, both area and variability decreased with increasing sample size. Simulations suggested that the MCP and kernel models required >60 and >80 locations, respectively, for estimates to be both accurate (change in area <1%/additional location) and precise (CV < 50%). Although the radiotracking data appeared unbiased, except for the relationship between area and sample size, these data failed to indicate some areas that likely were important to bears. Our results suggest that the usefulness of conventional radiotracking data may be limited by potential biases and variability due to small samples. 
Investigators that use home range estimates in statistical tests should consider the effects of variability of those estimates. Use of GPS-equipped collars can facilitate obtaining larger samples of unbiased data and improve accuracy and precision of home range estimates.
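
    The dependence of minimum convex polygon (MCP) area on the number of locations can be illustrated with a minimal sketch. It is not the authors' analysis: the locations are simulated from a bivariate normal distribution (an assumption), and the helper names `convex_hull`, `mcp_area` and `simulate_locations` are hypothetical. It only shows that the MCP area of nested subsets cannot shrink as locations are added.

```python
import random


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Last point of each chain is the first point of the other chain
    return lower[:-1] + upper[:-1]


def mcp_area(points):
    """Area of the minimum convex polygon via the shoelace formula."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def simulate_locations(n, seed=0):
    """Hypothetical animal locations (km), bivariate normal around a centre."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 10.0), rng.gauss(0.0, 10.0)) for _ in range(n)]


locations = simulate_locations(400)
# Nested subsets: the first n locations of the same simulated track
areas = {n: mcp_area(locations[:n]) for n in (15, 60, 200, 400)}
for n, area in areas.items():
    print(f"n = {n:3d} locations -> MCP area = {area:8.1f} km^2")
```

    Because each subset contains the previous one, the printed areas are non-decreasing, which mirrors the small-sample underestimation of MCP home ranges described in the abstract.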

  19. The Usability of Rock-Like Materials for Numerical Studies on Rocks

    NASA Astrophysics Data System (ADS)

    Zengin, Enes; Abiddin Erguler, Zeynal

    2017-04-01

    Synthetic rock material and rock mass approaches are widely used by researchers to understand the failure behavior of different rocks. To model the failure behavior of rock material, researchers take advantage of different techniques and software, most of which are based on the distinct element method (DEM). To model the failure behavior of rocks, and thus to create a fundamental synthetic rock material model, related laboratory experiments are required to provide strength parameters. In modelling studies, model calibration is performed using parameters of intact rocks such as porosity, grain size, modulus of elasticity and Poisson's ratio. In some cases, it can be difficult or even impossible to acquire representative rock samples for laboratory experiments from heavily jointed rock masses and vuggy rocks. Considering this limitation, this study aimed to investigate the applicability of rock-like material (e.g. concrete) to understand and model the failure behavior of rock materials having complex inherent structures. For this purpose, concrete samples made from a mixture of 65% cement dust and 35% water were utilized. Accordingly, intact concrete samples representing rocks were prepared in laboratory conditions and their physical properties such as porosity, pore size and density were determined. In addition, to acquire the mechanical parameters of the concrete samples, uniaxial compressive strength (UCS) tests were performed while simultaneously measuring strain. The measured physical and mechanical properties of these concrete samples were used to create the synthetic material, and uniaxial compressive tests were then modeled using the two-dimensional discontinuum program Particle Flow Code (PFC2D). 
After the modeling studies in PFC2D, approximately similar failure mechanisms and test results were obtained from the experimental and numerical simulations. The results obtained from these laboratory tests and modelling studies were compared with other researchers' studies with respect to the failure mechanisms of different types of rock. It can be concluded that concrete and rock materials have similar failure mechanisms. Therefore, results obtained from concrete samples prepared at different porosities and pore sizes can be used in future studies to select micro-mechanical and physical properties for synthetic rock materials, in order to understand the failure mechanism of rocks having complex inherent structures such as vuggy rocks or heavily jointed rock masses.

  20. Effect of geometric size on mechanical properties of dielectric elastomers based on an improved visco-hyperelastic film model

    NASA Astrophysics Data System (ADS)

    Chang, Mengzhou; Wang, Zhenqing; Tong, Liyong; Liang, Wenyan

    2017-03-01

    Dielectric polymers show complex mechanical behaviors with different boundary conditions, geometry size and pre-stress. In this work, a viscoelastic model suitable for inhomogeneous deformation is presented that integrates the Kelvin-Voigt model in a new form. For different types of uniaxial tensile test loading along the length direction of the sample (single-step-relaxation tests, loading-unloading tests and tensile-creep-relaxation tests), the improved model compares quite favorably with the experimental results. Moreover, the mechanical properties of test samples with several length-width ratios under different boundary conditions are also investigated. The influences of the different boundary conditions are calculated with a stress applied on the boundary point, and the results show that the fixed boundary increases the stress compared with homogeneous deformation. In modeling the effect of pre-stress in the shear test, three pre-stressed modes are discussed. The model validation on the general mechanical behavior shows excellent predictive capability.

  1. Evaluation of a new tear osmometer for repeatability and accuracy, using 0.5-microL (500-Nanoliter) samples.

    PubMed

    Yildiz, Elvin H; Fan, Vincent C; Banday, Hina; Ramanathan, Lakshmi V; Bitra, Ratna K; Garry, Eileen; Asbell, Penny A

    2009-07-01

    To evaluate the repeatability and accuracy of a new tear osmometer that measures the osmolality of 0.5-microL (500-nanoliter) samples. Four standardized solutions were tested with 0.5-microL (500-nanoliter) samples for repeatability of measurements and comparability to standardized technique. Two known standard salt solutions (290 mOsm/kg H2O, 304 mOsm/kg H2O), a normal artificial tear matrix sample (306 mOsm/kg H2O), and an abnormal artificial tear matrix sample (336 mOsm/kg H2O) were repeatedly tested (n = 20 each) for osmolality with use of the Advanced Instruments Model 3100 Tear Osmometer (0.5-microL [500-nanoliter] sample size) and the FDA-approved Advanced Instruments Model 3D2 Clinical Osmometer (250-microL sample size). Four standard solutions were used, with osmolality values of 290, 304, 306, and 336 mOsm/kg H2O. The respective precision data, including the mean and standard deviation, were: 291.8 +/- 4.4, 305.6 +/- 2.4, 305.1 +/- 2.3, and 336.4 +/- 2.2 mOsm/kg H2O. The percent recoveries for the 290 mOsm/kg H2O standard solution, the 304 mOsm/kg H2O reference solution, the normal value-assigned 306 mOsm/kg H2O sample, and the abnormal value-assigned 336 mOsm/kg H2O sample were 100.3%, 100.2%, 99.8%, and 100.3%, respectively. The repeatability data are in accordance with data obtained on clinical osmometers with use of larger sample sizes. All 4 samples tested on the tear osmometer have osmolality values that correlate well to the clinical instrument method. The tear osmometer is a suitable instrument for testing the osmolality of microliter-sized samples, such as tears, and therefore may be useful in diagnosing, monitoring, and classifying tear abnormalities such as the severity of dry eye disease.

  2. Experimental and numerical modeling research of rubber material during microwave heating process

    NASA Astrophysics Data System (ADS)

    Chen, Hailong; Li, Tao; Li, Kunling; Li, Qingling

    2018-05-01

    This paper investigates the heating behavior of block rubber by experimental and simulation methods. The COMSOL Multiphysics 5.0 software was used for the numerical simulation. The effects of microwave frequency, power and sample size on temperature distribution are examined. The effect of frequency on temperature distribution is pronounced. The maximum and minimum temperatures of the block rubber first increase and then decrease as frequency increases. The microwave heating efficiency is highest at a microwave frequency of 2450 MHz. However, the temperature distribution is more uniform at other microwave frequencies. The influence of microwave power on temperature distribution is also marked. The smaller the power, the more uniform the temperature distribution in the block rubber. The effect of power on microwave heating efficiency is not obvious. Sample size also clearly affects temperature distribution. The smaller the sample size, the more uniform the temperature distribution in the block rubber. However, the smaller the sample size, the lower the microwave heating efficiency. The results can serve as references for research on heating rubber material by microwave technology.

  3. Simple and multiple linear regression: sample size considerations.

    PubMed

    Hanley, James A

    2016-11-01

    The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
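
    As an illustration of the kind of closed-form reasoning the abstract alludes to, and not a reproduction of the article's own rearrangement, the sketch below assumes the textbook ordinary least squares result Var(b_j) = sigma^2 / ((n - 1) * s_j^2 * (1 - R_j^2)), where sigma is the residual standard deviation, s_j is the standard deviation of X_j, and R_j^2 comes from regressing X_j on the other covariates. The function name `required_n` and all numeric inputs are hypothetical.

```python
import math


def required_n(sigma, sd_x, r2_x_others, target_se):
    """Smallest n for which the standard error of one coefficient is <= target_se.

    Solves sigma^2 / ((n - 1) * sd_x^2 * (1 - R^2)) <= target_se^2 for n.
    """
    if not 0.0 <= r2_x_others < 1.0:
        raise ValueError("r2_x_others must lie in [0, 1)")
    denom = target_se ** 2 * sd_x ** 2 * (1.0 - r2_x_others)
    return math.ceil(sigma ** 2 / denom) + 1


# Hypothetical planning values: residual SD 2, exposure SD 1, target SE 0.5
n_unadjusted = required_n(sigma=2.0, sd_x=1.0, r2_x_others=0.0, target_se=0.5)
# Same target when confounders explain half the variance of the exposure
n_adjusted = required_n(sigma=2.0, sd_x=1.0, r2_x_others=0.5, target_se=0.5)
print(n_unadjusted, n_adjusted)
```

    Under these assumed inputs, correlation between the exposure and the adjustment variables doubles the (n - 1) term, which is why a fixed subjects-per-variable rule cannot capture the requirement for a single coefficient.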

  4. Modeling the transport of engineered nanoparticles in saturated porous media - an experimental setup

    NASA Astrophysics Data System (ADS)

    Braun, A.; Neukum, C.; Azzam, R.

    2011-12-01

    The accelerating production and application of engineered nanoparticles is causing concerns regarding their release and fate in the environment. For assessing the risk that is posed to drinking water resources it is important to understand the transport and retention mechanisms of engineered nanoparticles in soil and groundwater. In this study an experimental setup for analyzing the mobility of silver and titanium dioxide nanoparticles in saturated porous media is presented. Batch and column experiments with glass beads and two different soils as matrices are carried out under varied conditions to study the impact of electrolyte concentration and pore water velocities. The analysis of nanoparticles implies several challenges, such as the detection and characterization and the preparation of a well dispersed sample with defined properties, as nanoparticles tend to form agglomerates when suspended in an aqueous medium. The analytical part of the experiments is mainly undertaken with Flow Field-Flow Fractionation (FlFFF). This chromatography like technique separates a particulate sample according to size. It is coupled to a UV/Vis and a light scattering detector for analyzing concentration and size distribution of the sample. The advantage of this technique is the ability to analyze also complex environmental samples, such as the effluent of column experiments including soil components, and the gentle sample treatment. For optimization of the sample preparation and for getting a first idea of the aggregation behavior in soil solutions, in sedimentation experiments the effect of ionic strength, sample concentration and addition of a surfactant on particle or aggregate size and temporal dispersion stability was investigated. In general the samples are more stable the lower the concentration of particles is. For TiO2 nanoparticles, the addition of a surfactant yielded the most stable samples with smallest aggregate sizes. 
Furthermore the suspension stability is increasing with electrolyte concentration. Depending on the dispersing medium the results show that TiO2 nanoparticles tend to form aggregates between 100-200 nm in diameter while the primary particle size is given as 21 nm by the manufacturer. Aggregate sizes are increasing with time. The particle size distribution of the silver nanoparticle samples is quite uniform in each medium. The fresh samples show aggregate sizes between 40 and 45 nm while the primary particle size is 15 nm according to the manufacturer. Aggregate size is only slightly increasing with time during the sedimentation experiments. These results are used as a reference when analyzing the effluent of column experiments.

  5. Inferring the demographic history from DNA sequences: An importance sampling approach based on non-homogeneous processes.

    PubMed

    Ait Kaci Azzou, S; Larribe, F; Froda, S

    2016-10-01

    In Ait Kaci Azzou et al. (2015) we introduced an Importance Sampling (IS) approach for estimating the demographic history of a sample of DNA sequences, the skywis plot. More precisely, we proposed a new nonparametric estimate of a population size that changes over time. We showed on simulated data that the skywis plot can work well in typical situations where the effective population size does not undergo very steep changes. In this paper, we introduce an iterative procedure which extends the previous method and gives good estimates under such rapid variations. In the iterative calibrated skywis plot we approximate the effective population size by a piecewise constant function, whose values are re-estimated at each step. These piecewise constant functions are used to generate the waiting times of non homogeneous Poisson processes related to a coalescent process with mutation under a variable population size model. Moreover, the present IS procedure is based on a modified version of the Stephens and Donnelly (2000) proposal distribution. Finally, we apply the iterative calibrated skywis plot method to a simulated data set from a rapidly expanding exponential model, and we show that the method based on this new IS strategy correctly reconstructs the demographic history. Copyright © 2016. Published by Elsevier Inc.

  6. The prevalence of terraced treescapes in analyses of phylogenetic data sets.

    PubMed

    Dobrin, Barbara H; Zwickl, Derrick J; Sanderson, Michael J

    2018-04-04

    The pattern of data availability in a phylogenetic data set may lead to the formation of terraces, collections of equally optimal trees. Terraces can arise in tree space if trees are scored with parsimony or with partitioned, edge-unlinked maximum likelihood. Theory predicts that terraces can be large, but their prevalence in contemporary data sets has never been surveyed. We selected 26 data sets and phylogenetic trees reported in recent literature and investigated the terraces to which the trees would belong, under a common set of inference assumptions. We examined terrace size as a function of the sampling properties of the data sets, including taxon coverage density (the proportion of taxon-by-gene positions with any data present) and a measure of gene sampling "sufficiency". We evaluated each data set in relation to the theoretical minimum gene sampling depth needed to reduce terrace size to a single tree, and explored the impact of the terraces found in replicate trees in bootstrap methods. Terraces were identified in nearly all data sets with taxon coverage densities < 0.90. They were not found, however, in high-coverage-density (i.e., ≥ 0.94) transcriptomic and genomic data sets. The terraces could be very large, and size varied inversely with taxon coverage density and with gene sampling sufficiency. Few data sets achieved a theoretical minimum gene sampling depth needed to reduce terrace size to a single tree. Terraces found during bootstrap resampling reduced overall support. If certain inference assumptions apply, trees estimated from empirical data sets often belong to large terraces of equally optimal trees. Terrace size correlates to data set sampling properties. Data sets seldom include enough genes to reduce terrace size to one tree. When bootstrap replicate trees lie on a terrace, statistical support for phylogenetic hypotheses may be reduced. 
Although some of the published analyses surveyed were conducted with edge-linked inference models (which do not induce terraces), unlinked models have been used and advocated. The present study describes the potential impact of that inference assumption on phylogenetic inference in the context of the kinds of multigene data sets now widely assembled for large-scale tree construction.

  7. Tungsten Carbide Grain Size Computation for WC-Co Dissimilar Welds

    NASA Astrophysics Data System (ADS)

    Zhou, Dongran; Cui, Haichao; Xu, Peiquan; Lu, Fenggui

    2016-06-01

    A "two-step" image processing method based on electron backscatter diffraction in scanning electron microscopy was used to compute the tungsten carbide (WC) grain size distribution for tungsten inert gas (TIG) welds and laser welds. Twenty-four images were collected on randomly set fields per sample located at the top, middle, and bottom of a cross-sectional micrograph. Each field contained 500 to 1500 WC grains. The images were recognized through clustering-based image segmentation and WC grain growth recognition. According to the WC grain size computation and experiments, a simple WC-WC interaction model was developed to explain the WC dissolution, grain growth, and aggregation in welded joints. The WC-WC interaction and blunt corners were characterized using scanning and transmission electron microscopy. The WC grain size distribution and the effects of heat input E on grain size distribution for the laser samples were discussed. The results indicate that (1) the grain size distribution follows a Gaussian distribution. Grain sizes at the top of the weld were larger than those near the middle and weld root because of power attenuation. (2) Significant WC grain growth occurred during welding as observed in the as-welded micrographs. The average grain size was 11.47 μm in the TIG samples, which was much larger than that in base metal 1 (BM1 2.13 μm). The grain size distribution curves for the TIG samples revealed a broad particle size distribution without fine grains. The average grain size (1.59 μm) in laser samples was larger than that in base metal 2 (BM2 1.01 μm). (3) WC-WC interaction exhibited complex plane, edge, and blunt corner characteristics during grain growth. A WC $\{1\bar{1}00\}$ to WC $\{011\bar{0}\}$ edge disappeared and became a blunt plane WC $\{101\bar{0}\}$, several grains with two- or three-sided planes and edges disappeared into a multi-edge, and a WC-WC merged.

  8. Radar volume reflectivity estimation using an array of ground-based rainfall drop size detectors

    NASA Astrophysics Data System (ADS)

    Lane, John; Merceret, Francis; Kasparis, Takis; Roy, D.; Muller, Brad; Jones, W. Linwood

    2000-08-01

    Rainfall drop size distribution (DSD) measurements made by single disdrometers at isolated ground sites have traditionally been used to estimate the transformation between weather radar reflectivity Z and rainfall rate R. Despite the immense disparity in sampling geometries, the resulting Z-R relation obtained by these single point measurements has historically been important in the study of applied radar meteorology. Simultaneous DSD measurements made at several ground sites within a microscale area may be used to improve the estimate of radar reflectivity in the air volume surrounding the disdrometer array. By applying the equations of motion for non-interacting hydrometeors, a volume estimate of Z is obtained from the array of ground based disdrometers by first calculating a 3D drop size distribution. The 3D-DSD model assumes that only gravity and terminal velocity due to atmospheric drag within the sampling volume influence hydrometeor dynamics. The sampling volume is characterized by wind velocities, which are input parameters to the 3D-DSD model, composed of vertical and horizontal components. Reflectivity data from four consecutive WSR-88D volume scans, acquired during a thunderstorm near Melbourne, FL on June 1, 1997, are compared to data processed using the 3D-DSD model and data from three ground based disdrometers of a microscale array.

  9. Observational studies of patients in the emergency department: a comparison of 4 sampling methods.

    PubMed

    Valley, Morgan A; Heard, Kennon J; Ginde, Adit A; Lezotte, Dennis C; Lowenstein, Steven R

    2012-08-01

    We evaluate the ability of 4 sampling methods to generate representative samples of the emergency department (ED) population. We analyzed the electronic records of 21,662 consecutive patient visits at an urban, academic ED. From this population, we simulated different models of study recruitment in the ED by using 2 sample sizes (n=200 and n=400) and 4 sampling methods: true random, random 4-hour time blocks by exact sample size, random 4-hour time blocks by a predetermined number of blocks, and convenience or "business hours." For each method and sample size, we obtained 1,000 samples from the population. Using χ² tests, we measured the number of statistically significant differences between the sample and the population for 8 variables (age, sex, race/ethnicity, language, triage acuity, arrival mode, disposition, and payer source). Then, for each variable, method, and sample size, we compared the proportion of the 1,000 samples that differed from the overall ED population to the expected proportion (5%). Only the true random samples represented the population with respect to sex, race/ethnicity, triage acuity, mode of arrival, language, and payer source in at least 95% of the samples. Patient samples obtained using random 4-hour time blocks and business hours sampling systematically differed from the overall ED patient population for several important demographic and clinical variables. However, the magnitude of these differences was not large. Common sampling strategies selected for ED-based studies may affect parameter estimates for several representative population variables. However, the potential for bias for these variables appears small. Copyright © 2012. Published by Mosby, Inc.
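
    The resampling design described above can be sketched in a few lines. The example below is a minimal illustration on synthetic data, not a reproduction of the study: the population is invented, only one binary variable (a hypothetical "ambulance arrival" flag whose prevalence depends on hour of day) is tested, and the business-hours window of 08:00 to 18:00 is an assumption. It draws 1,000 samples of n = 200 per method and counts how often a chi-square goodness-of-fit test (1 degree of freedom, critical value 3.841 at the 5% level) flags the sample as different from the population.

```python
import random

CHI2_CRIT_1DF = 3.841  # chi-square critical value, 1 df, alpha = 0.05

def chi2_binary(sample, pop_prop):
    """Goodness-of-fit statistic for a 0/1 variable against pop_prop."""
    n = len(sample)
    obs_yes = sum(sample)
    exp_yes = n * pop_prop
    exp_no = n - exp_yes
    return ((obs_yes - exp_yes) ** 2 / exp_yes
            + ((n - obs_yes) - exp_no) ** 2 / exp_no)

def rejection_rate(draw_sample, pop_prop, n_reps=1000):
    """Share of repeated samples that differ significantly from the population."""
    rejections = sum(chi2_binary(draw_sample(), pop_prop) > CHI2_CRIT_1DF
                     for _ in range(n_reps))
    return rejections / n_reps

rng = random.Random(0)

# Synthetic population: (arrival hour, ambulance flag). Values are invented.
visits = []
for _ in range(21662):
    hour = rng.randrange(24)
    p_amb = 0.15 if 8 <= hour < 18 else 0.30
    visits.append((hour, 1 if rng.random() < p_amb else 0))

all_flags = [flag for _, flag in visits]
business_flags = [flag for hour, flag in visits if 8 <= hour < 18]
pop_prop = sum(all_flags) / len(all_flags)

n = 200
random_rate = rejection_rate(lambda: rng.sample(all_flags, n), pop_prop)
convenience_rate = rejection_rate(lambda: rng.sample(business_flags, n), pop_prop)
```

    With true random sampling the rejection rate should sit near the nominal 5%, while the business-hours sample rejects far more often because the synthetic flag was made to depend on time of day.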

  10. Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data.

    PubMed

    Bhaskar, Anand; Wang, Y X Rachel; Song, Yun S

    2015-02-01

    With the recent increase in study sample sizes in human genetics, there has been growing interest in inferring historical population demography from genomic variation data. Here, we present an efficient inference method that can scale up to very large samples, with tens or hundreds of thousands of individuals. Specifically, by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation, which allows us to compute gradients exactly, we develop a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies. Our method is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum. In addition to inferring demography, our method can also accurately estimate locus-specific mutation rates. We perform extensive validation of our method on simulated data and show that it can accurately infer multiple recent epochs of rapid exponential growth, a signal that is difficult to pick up with small sample sizes. Lastly, we use our method to analyze data from recent sequencing studies, including a large-sample exome-sequencing data set of tens of thousands of individuals assayed at a few hundred genic regions. © 2015 Bhaskar et al.; Published by Cold Spring Harbor Laboratory Press.

  11. Addressing small sample size bias in multiple-biomarker trials: Inclusion of biomarker-negative patients and Firth correction.

    PubMed

    Habermehl, Christina; Benner, Axel; Kopp-Schneider, Annette

    2018-03-01

    In recent years, numerous approaches for biomarker-based clinical trials have been developed. One of these developments are multiple-biomarker trials, which aim to investigate multiple biomarkers simultaneously in independent subtrials. For low-prevalence biomarkers, small sample sizes within the subtrials have to be expected, as well as many biomarker-negative patients at the screening stage. The small sample sizes may make it unfeasible to analyze the subtrials individually. This imposes the need to develop new approaches for the analysis of such trials. With an expected large group of biomarker-negative patients, it seems reasonable to explore options to benefit from including them in such trials. We consider advantages and disadvantages of the inclusion of biomarker-negative patients in a multiple-biomarker trial with a survival endpoint. We discuss design options that include biomarker-negative patients in the study and address the issue of small sample size bias in such trials. We carry out a simulation study for a design where biomarker-negative patients are kept in the study and are treated with standard of care. We compare three different analysis approaches based on the Cox model to examine if the inclusion of biomarker-negative patients can provide a benefit with respect to bias and variance of the treatment effect estimates. We apply the Firth correction to reduce the small sample size bias. The results of the simulation study suggest that for small sample situations, the Firth correction should be applied to adjust for the small sample size bias. Additional to the Firth penalty, the inclusion of biomarker-negative patients in the analysis can lead to further but small improvements in bias and standard deviation of the estimates. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Theoretical size controls of the giant Phaeocystis globosa colonies

    NASA Astrophysics Data System (ADS)

    Liu, Xiao; Smith, Walker O.; Tang, Kam W.; Doan, Nhu Hai; Nguyen, Ngoc Lam

    2015-06-01

    An unusual characteristic of the cosmopolitan haptophyte Phaeocystis globosa is its ability to form colonies of strikingly large size, up to 3 cm in diameter. The large size and the presence of a mucoid envelope are believed to contribute to the formation of dense blooms in Southeast Asia. We collected colonies of different sizes in shallow coastal waters of Viet Nam and conducted a series of measurements and experiments on individual colonies. Using these empirical data, we developed a simple carbon-based model to predict the growth and maximal size of P. globosa colonies. Our model suggests that growth of a colony from 0.2 cm to 1.4 cm (the maximal size in our samples) would take 16 days. This number, however, is strongly influenced by the maximal photosynthetic rate and other physiological parameters used in the model. The model also returns a specific growth rate of 0.30 d⁻¹ for colonial cells, comparable to satellite estimates, but lower than have been measured for unicellular P. globosa in batch culture at similar temperatures. We attribute this low growth rate to not only the model uncertainties, but factors such as self-shading and diffusive limitation of nutrient uptake.

  13. Utilizing soil polypedons to improve model performance for digital soil mapping

    USDA-ARS's Scientific Manuscript database

    Most digital soil mapping approaches that use point data to develop relationships with covariate data intersect sample locations with one raster pixel regardless of pixel size. Resulting models are subject to spurious values in covariate data which may limit model performance. An alternative approac...

  14. Polytomous Rasch Models in Counseling Assessment

    ERIC Educational Resources Information Center

    Willse, John T.

    2017-01-01

    This article provides a brief introduction to the Rasch model. Motivation for using Rasch analyses is provided. Important Rasch model concepts and key aspects of result interpretation are introduced, with major points reinforced using a simulation demonstration. Concrete guidelines are provided regarding sample size and the evaluation of items.

  15. Measuring the specific surface area of natural and manmade glasses: effects of formation process, morphology, and particle size

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Papelis, Charalambos; Um, Wooyong; Russel, Charles E.

    2003-03-28

    The specific surface area of natural and manmade solid materials is a key parameter controlling important interfacial processes in natural environments and engineered systems, including dissolution reactions and sorption processes at solid-fluid interfaces. To improve our ability to quantify the release of trace elements trapped in natural glasses, the release of hazardous compounds trapped in manmade glasses, or the release of radionuclides from nuclear melt glass, we measured the specific surface area of natural and manmade glasses as a function of particle size, morphology, and composition. Volcanic ash, volcanic tuff, tektites, obsidian glass, and in situ vitrified rock were analyzed. Specific surface area estimates were obtained using krypton as gas adsorbent and the BET model. The range of surface areas measured exceeded three orders of magnitude. A tektite sample had the highest surface area (1.65 m²/g), while one of the samples of in situ vitrified rock had the lowest surface area (0.0016 m²/g). The specific surface area of the samples was a function of particle size, decreasing with increasing particle size. Different types of materials, however, showed variable dependence on particle size, and could be assigned to one of three distinct groups: (1) samples with low surface area dependence on particle size and surface areas approximately two orders of magnitude higher than the surface area of smooth spheres of equivalent size. The specific surface area of these materials was attributed mostly to internal porosity and surface roughness. (2) samples that showed a trend of decreasing surface area dependence on particle size as the particle size increased. The minimum specific surface area of these materials was between 0.1 and 0.01 m²/g and was also attributed to internal porosity and surface roughness. (3) samples whose surface area showed a monotonic decrease with increasing particle size, never reaching an ultimate surface area limit within the particle size range examined. The surface area results were consistent with particle morphology, examined by scanning electron microscopy, and have significant implications for the release of radionuclides and toxic metals in the environment.
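
    The comparison with "smooth spheres of equivalent size" rests on the geometric specific surface area of a sphere, 6 / (ρ d). The sketch below works that out and forms the ratio of a measured value to the geometric one. The glass density of 2.5 g/cm³ and the 100 μm particle diameter are hypothetical inputs chosen for illustration; only the 1.65 m²/g figure comes from the record, and the record does not state the particle size it was measured at.

```python
def smooth_sphere_ssa(diameter_um, density_g_cm3):
    """Geometric specific surface area of a smooth sphere, in m^2/g.

    Area / mass = (pi d^2) / (rho * pi d^3 / 6) = 6 / (rho * d).
    With rho in g/cm^3 and d in micrometres the unit factors cancel to m^2/g.
    """
    return 6.0 / (density_g_cm3 * diameter_um)

def roughness_factor(measured_ssa, diameter_um, density_g_cm3):
    """Ratio of measured to smooth-sphere surface area (dimensionless)."""
    return measured_ssa / smooth_sphere_ssa(diameter_um, density_g_cm3)

# Hypothetical particle: 100 um diameter, density 2.5 g/cm^3 (assumed values).
geometric = smooth_sphere_ssa(100.0, 2.5)
ratio = roughness_factor(1.65, 100.0, 2.5)
```

    A ratio near 100 would correspond to the "two orders of magnitude" excess that the record attributes to internal porosity and surface roughness.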

  16. Effect of flaw size and temperature on the matrix cracking behavior of a brittle ceramic matrix composite

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anandakumar, U.; Webb, J.E.; Singh, R.N.

    The matrix cracking behavior of a zircon matrix - uniaxial SCS 6 fiber composite was studied as a function of initial flaw size and temperature. The composites were fabricated by a tape casting and hot pressing technique. Surface flaws of controlled size were introduced using a Vickers indenter. The composite samples were tested in three point flexure at three different temperatures to study the non steady state and steady state matrix cracking behavior. The composite samples exhibited steady state and non steady matrix cracking behavior at all temperatures. The steady state matrix cracking stress and steady state crack size increased with increasing temperature. The results of the study correlated well with the results predicted by the matrix cracking models.

  17. Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA.

    PubMed

    Kelly, Brendan J; Gross, Robert; Bittinger, Kyle; Sherrill-Mix, Scott; Lewis, James D; Collman, Ronald G; Bushman, Frederic D; Li, Hongzhe

    2015-08-01

    The variation in community composition between microbiome samples, termed beta diversity, can be measured by pairwise distance based on either presence-absence or quantitative species abundance data. PERMANOVA, a permutation-based extension of multivariate analysis of variance to a matrix of pairwise distances, partitions within-group and between-group distances to permit assessment of the effect of an exposure or intervention (grouping factor) upon the sampled microbiome. Within-group distance and exposure/intervention effect size must be accurately modeled to estimate statistical power for a microbiome study that will be analyzed with pairwise distances and PERMANOVA. We present a framework for PERMANOVA power estimation tailored to marker-gene microbiome studies that will be analyzed by pairwise distances, which includes: (i) a novel method for distance matrix simulation that permits modeling of within-group pairwise distances according to pre-specified population parameters; (ii) a method to incorporate effects of different sizes within the simulated distance matrix; (iii) a simulation-based method for estimating PERMANOVA power from simulated distance matrices; and (iv) an R statistical software package that implements the above. Matrices of pairwise distances can be efficiently simulated to satisfy the triangle inequality and incorporate group-level effects, which are quantified by the adjusted coefficient of determination, omega-squared (ω²). From simulated distance matrices, available PERMANOVA power or necessary sample size can be estimated for a planned microbiome study. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
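
    To make the idea of simulation-based PERMANOVA power concrete, here is a minimal sketch. It is not the authors' R package or their distance-matrix simulation method: as a stand-in, it simulates two groups of points in the plane, uses Euclidean distances, and shifts one group by a hypothetical amount `shift`. The pseudo-F statistic is computed from sums of squared pairwise distances, a p-value comes from label permutation, and power is the share of simulated studies with p ≤ α.

```python
import math
import random

def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2
                         for a, i in enumerate(idx) for j in idx[a + 1:]) / len(idx)
    ss_between = ss_total - ss_within
    a = len(groups)
    return (ss_between / (a - 1)) / (ss_within / (n - a))

def permanova_p(dist, labels, rng, n_perm=99):
    """Permutation p-value: share of shuffled labelings with F >= observed."""
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    count = 1  # the observed labeling counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            count += 1
    return count / (n_perm + 1)

def simulate_power(n_per_group, shift, n_sim=30, alpha=0.05, seed=0):
    """Fraction of simulated two-group studies that reach p <= alpha."""
    rng = random.Random(seed)
    labels = [0] * n_per_group + [1] * n_per_group
    hits = 0
    for _ in range(n_sim):
        # Stand-in data model: group 1 is shifted along the first axis.
        pts = [(rng.gauss(shift * g, 1.0), rng.gauss(0.0, 1.0)) for g in labels]
        dist = [[math.dist(p, q) for q in pts] for p in pts]
        if permanova_p(dist, labels, rng) <= alpha:
            hits += 1
    return hits / n_sim

power_large_effect = simulate_power(10, 3.0)
power_no_effect = simulate_power(10, 0.0)
```

    With no group effect the estimated "power" should hover near α, which is a quick sanity check on any simulation of this kind; the numbers of simulations and permutations are kept small here only to keep the run short.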

  18. Investigating the relative permeability behavior of microporosity-rich carbonates and tight sandstones with multiscale pore network models

    NASA Astrophysics Data System (ADS)

    Bultreys, Tom; Stappen, Jeroen Van; Kock, Tim De; Boever, Wesley De; Boone, Marijn A.; Hoorebeke, Luc Van; Cnudde, Veerle

    2016-11-01

    The relative permeability behavior of rocks with wide ranges of pore sizes is in many cases still poorly understood and is difficult to model at the pore scale. In this work, we investigate the capillary pressure and relative permeability behavior of three outcrop carbonates and two tight reservoir sandstones with wide, multimodal pore size distributions. To examine how the drainage and imbibition properties of these complex rock types are influenced by the connectivity of macropores to each other and to zones with unresolved small-scale porosity, we apply a previously presented microcomputed-tomography-based multiscale pore network model to these samples. The sensitivity to the properties of the small-scale porosity is studied by performing simulations with different artificial sphere-packing-based networks as a proxy for these pores. Finally, the mixed-wet water-flooding behavior of the samples is investigated, assuming different wettability distributions for the microporosity and macroporosity. While this work is not an attempt to perform predictive modeling, it seeks to qualitatively explain the behavior of the investigated samples and illustrates some of the most recent developments in multiscale pore network modeling.

  19. Change-in-ratio estimators for populations with more than two subclasses

    USGS Publications Warehouse

    Udevitz, Mark S.; Pollock, Kenneth H.

    1991-01-01

    Change-in-ratio methods have been developed to estimate the size of populations with two or three population subclasses. Most of these methods require the often unreasonable assumption of equal sampling probabilities for individuals in all subclasses. This paper presents new models based on the weaker assumption that ratios of sampling probabilities are constant over time for populations with three or more subclasses. Estimation under these models requires that a value be assumed for one of these ratios when there are two samples. Explicit expressions are given for the maximum likelihood estimators under models for two samples with three or more subclasses and for three samples with two subclasses. A numerical method using readily available statistical software is described for obtaining the estimators and their standard errors under all of the models. Likelihood ratio tests that can be used in model selection are discussed. Emphasis is on the two-sample, three-subclass models for which Monte-Carlo simulation results and an illustrative example are presented.
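
    For orientation, the classic two-subclass, two-sample change-in-ratio estimator that these models generalize can be written down directly. The sketch below is that textbook estimator under the equal-sampling-probability assumption the record calls often unreasonable; it is not the new maximum likelihood estimators presented in the paper, and the numbers in the example are hypothetical.

```python
def cir_estimate(p1, p2, removed_x, removed_total):
    """Initial population size from a change in subclass proportion.

    p1, p2        : proportion of x-type animals before and after the removal
    removed_x     : number of x-type animals removed between the two samples
    removed_total : total number of animals removed

    From p2 = (p1*N - removed_x) / (N - removed_total), solving for N gives
    N = (removed_x - removed_total * p2) / (p1 - p2).
    """
    if p1 == p2:
        raise ValueError("proportions must differ for the estimator to exist")
    return (removed_x - removed_total * p2) / (p1 - p2)

# Hypothetical check: 1000 animals, 400 of type x; remove 200 x and 50 others.
p_before = 400 / 1000
p_after = (400 - 200) / (1000 - 250)
n_hat = cir_estimate(p_before, p_after, removed_x=200, removed_total=250)
```

    The estimator becomes unstable as the two proportions approach each other, which is why a removal that changes the subclass ratio substantially is needed in practice.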

  20. Fossil shrews from Honduras and their significance for late glacial evolution in body size (Mammalia: Soricidae: Cryptotis)

    USGS Publications Warehouse

    Woodman, N.; Croft, D.A.

    2005-01-01

    Our study of mammalian remains excavated in the 1940s from McGrew Cave, north of Copan, Honduras, yielded an assemblage of 29 taxa that probably accumulated predominantly as the result of predation by owls. Among the taxa present are three species of small-eared shrews, genus Cryptotis. One species, Cryptotis merriami, is relatively rare among the fossil remains. The other two shrews, Cryptotis goodwini and Cryptotis orophila, are abundant and exhibit morphometrical variation distinguishing them from modern populations. Fossils of C. goodwini are distinctly and consistently smaller than modern members of the species. To quantify the size differences, we derived common measures of body size for fossil C. goodwini using regression models based on modern samples of shrews in the Cryptotis mexicana-group. Estimated mean length of head and body for the fossil sample is 72-79 mm, and estimated mean mass is 7.6-9.6 g. These numbers indicate that the fossil sample averaged 6-14% smaller in head and body length and 39-52% less in mass than the modern sample and that increases of 6-17% in head and body length and 65-108% in mass occurred to achieve the mean body size of the modern sample. Conservative estimates of fresh (wet) food intake based on mass indicate that such a size increase would require a 37-58% increase in daily food consumption. In contrast to C. goodwini, fossil C. orophila from the cave is not different in mean body size from modern samples. The fossil sample does, however, show slightly greater variation in size than is currently present throughout the modern geographical distribution of the taxon. Moreover, variation in some other dental and mandibular characters is more constrained, exhibiting a more direct relationship to overall size. Our study of these species indicates that North American shrews have not all been static in size through time, as suggested by some previous work with fossil soricids. Lack of stratigraphic control within the site and our failure to obtain reliable radiometric dates on remains restrict our opportunities to place the site in a firm temporal context. However, the morphometrical differences we document for fossil C. orophila and C. goodwini show them to be distinct from modern populations of these shrews. Some other species of fossil mammals from McGrew Cave exhibit distinct size changes of the magnitudes experienced by many northern North American and some Mexican mammals during the transition from late glacial to Holocene environmental conditions, and it is likely that at least some of the remains from the cave are late Pleistocene in age. One curious factor is that, whereas most mainland mammals that exhibit large-scale size shifts during the late glacial/postglacial transition experienced dwarfing, C. goodwini increased in size. The lack of clinal variation in modern C. goodwini supports the hypothesis that size evolution can result from local selection rather than from cline translocation. Models of size change in mammals indicate that increased size, such as that observed for C. goodwini, is a likely consequence of increased availability of resources and, thereby, a relaxation of selection during critical times of the year.

  1. A model-based approach to sample size estimation in recent onset type 1 diabetes.

    PubMed

    Bundy, Brian N; Krischer, Jeffrey P

    2016-11-01

    The area under the curve C-peptide following a 2-h mixed meal tolerance test from 498 individuals enrolled on five prior TrialNet studies of recent onset type 1 diabetes from baseline to 12 months after enrolment were modelled to produce estimates of its rate of loss and variance. Age at diagnosis and baseline C-peptide were found to be significant predictors, and adjusting for these in an ANCOVA resulted in estimates with lower variance. Using these results as planning parameters for new studies results in a nearly 50% reduction in the target sample size. The modelling also produces an expected C-peptide that can be used in observed versus expected calculations to estimate the presumption of benefit in ongoing trials. Copyright © 2016 John Wiley & Sons, Ltd.
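
    The link between lower residual variance and a smaller target sample size can be illustrated with the standard normal-approximation formula for comparing two group means, n per group = 2 (z₁₋α/₂ + z₁₋β)² σ² / δ². The sketch below is a generic illustration, not the authors' model: the standard deviation, the effect size δ, and the share of variance explained by the covariates (R² = 0.5) are hypothetical planning values, chosen so that the adjusted variance is half the unadjusted one.

```python
import math
from statistics import NormalDist

def n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample comparison of means
    (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

def ancova_sd(sd, r_squared):
    """Residual SD after covariates explaining r_squared of the variance."""
    return sd * math.sqrt(1 - r_squared)

# Hypothetical planning values.
n_unadjusted = n_per_group(sd=1.0, delta=0.5)
n_adjusted = n_per_group(sd=ancova_sd(1.0, 0.5), delta=0.5)
```

    Since the required n scales with σ², covariates that explain about half of the outcome variance cut the target sample size roughly in half, which is the order of reduction the record reports.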

  2. Atomically precise (catalytic) particles synthesized by a novel cluster deposition instrument

    DOE PAGES

    Yin, C.; Tyo, E.; Kuchta, K.; ...

    2014-05-06

    Here, we report a new high vacuum instrument which is dedicated to the preparation of well-defined clusters supported on model and technologically relevant supports for catalytic and materials investigations. The instrument is based on deposition of size selected metallic cluster ions that are produced by a high flux magnetron cluster source. Furthermore, we maximize the throughput of the apparatus by collecting and focusing ions utilizing a conical octupole ion guide and a linear ion guide. The size selection is achieved by a quadrupole mass filter. The new design of the sample holder provides for the preparation of multiple samples on supports of various sizes and shapes in one session. After cluster deposition onto the support of interest, samples will be taken out of the chamber for a variety of testing and characterization.

  3. Children's accuracy of portion size estimation using digital food images: effects of interface design and size of image on computer screen.

    PubMed

    Baranowski, Tom; Baranowski, Janice C; Watson, Kathleen B; Martin, Shelby; Beltran, Alicia; Islam, Noemi; Dadabhoy, Hafza; Adame, Su-heyla; Cullen, Karen; Thompson, Debbe; Buday, Richard; Subar, Amy

    2011-03-01

    To test the effect of image size and presence of size cues on the accuracy of portion size estimation by children. Children were randomly assigned to seeing images with or without food size cues (utensils and checked tablecloth) and were presented with sixteen food models (foods commonly eaten by children) in varying portion sizes, one at a time. They estimated each food model's portion size by selecting a digital food image. The same food images were presented in two ways: (i) as small, graduated portion size images all on one screen or (ii) by scrolling across large, graduated portion size images, one per sequential screen. Laboratory-based with computer and food models. Volunteer multi-ethnic sample of 120 children, equally distributed by gender and ages (8 to 13 years) in 2008-2009. Average percentage of correctly classified foods was 60·3 %. There were no differences in accuracy by any design factor or demographic characteristic. Multiple small pictures on the screen at once took half the time to estimate portion size compared with scrolling through large pictures. Larger pictures had more overestimation of size. Multiple images of successively larger portion sizes of a food on one computer screen facilitated quicker portion size responses with no decrease in accuracy. This is the method of choice for portion size estimation on a computer.

  4. Hierarchical Linear Modeling Meta-Analysis of Single-Subject Design Research

    ERIC Educational Resources Information Center

    Gage, Nicholas A.; Lewis, Timothy J.

    2014-01-01

    The identification of evidence-based practices continues to provoke issues of disagreement across multiple fields. One area of contention is the role of single-subject design (SSD) research in providing scientific evidence. The debate about SSD's utility centers on three issues: sample size, effect size, and serial dependence. One potential…

  5. Dimension- and shape-dependent thermal transport in nano-patterned thin films investigated by scanning thermal microscopy

    NASA Astrophysics Data System (ADS)

    Ge, Yunfei; Zhang, Yuan; Weaver, Jonathan M. R.; Dobson, Phillip S.

    2017-12-01

    Scanning thermal microscopy (SThM) is a technique which is often used for the measurement of the thermal conductivity of materials at the nanometre scale. The impact of nano-scale feature size and shape on apparent thermal conductivity, as measured using SThM, has been investigated. To achieve this, our recently developed topography-free samples with 200 and 400 nm wide gold wires (50 nm thick) of length of 400-2500 nm were fabricated and their thermal resistance measured and analysed. This data was used in the development and validation of a rigorous but simple heat transfer model that describes a nanoscopic contact to an object with finite shape and size. This model, in combination with a recently proposed thermal resistance network, was then used to calculate the SThM probe signal obtained by measuring these features. These calculated values closely matched the experimental results obtained from the topography-free sample. By using the model to analyse the dimensional dependence of thermal resistance, we demonstrate that feature size and shape has a significant impact on measured thermal properties that can result in a misinterpretation of material thermal conductivity. In the case of a gold nanowire embedded within a silicon nitride matrix it is found that the apparent thermal conductivity of the wire appears to be depressed by a factor of twenty from the true value. These results clearly demonstrate the importance of knowing both probe-sample thermal interactions and feature dimensions as well as shape when using SThM to quantify material thermal properties. Finally, the new model is used to identify the heat flux sensitivity, as well as the effective contact size of the conventional SThM system used in this study.

  6. Detecting Mixtures from Structural Model Differences Using Latent Variable Mixture Modeling: A Comparison of Relative Model Fit Statistics

    ERIC Educational Resources Information Center

    Henson, James M.; Reise, Steven P.; Kim, Kevin H.

    2007-01-01

    The accuracy of structural model parameter estimates in latent variable mixture modeling was explored with a 3 (sample size) × 3 (exogenous latent mean difference) × 3 (endogenous latent mean difference) × 3 (correlation between factors) × 3 (mixture proportions) factorial design. In addition, the efficacy of several…

  7. Evaluation of weighted regression and sample size in developing a taper model for loblolly pine

    Treesearch

    Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold

    1992-01-01

    A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeast. The weighted regression increased model fit marginally, but did not substantially increase model performance. In all cases, the unweighted regression models performed as well as the...

  8. The Stability of Post Hoc Model Modifications in Covariance Structure Models.

    ERIC Educational Resources Information Center

    Hutchinson, Susan R.

    The work of R. MacCallum et al. (1992) was extended by examining chance modifications through a Monte Carlo simulation. The stability of post hoc model modifications was examined under varying sample size, model complexity, and severity of misspecification using 2- and 4-factor oblique confirmatory factor analysis (CFA) models with four and eight…

  9. Applying information theory to small groups assessment: emotions and well-being at work.

    PubMed

    García-Izquierdo, Antonio León; Moreno, Blanca; García-Izquierdo, Mariano

    2010-05-01

    This paper explores and analyzes the relations between emotions and well-being in a sample of aviation personnel, passenger crew (flight attendants). There is an increasing interest in studying the influence of emotions and its role as psychosocial factors in the work environment as they are able to act as facilitators or shock absorbers. The contrast of the theoretical models by using traditional parametric techniques requires a large sample size to the efficient estimation of the coefficients that quantify the relations between variables. Since the available sample that we have is small, the most common size in European enterprises, we used the maximum entropy principle to explore the emotions that are involved in the psychosocial risks. The analyses show that this method takes advantage of the limited information available and guarantee an optimal estimation, the results of which are coherent with theoretical models and numerous empirical researches about emotions and well-being.

  10. Mechanical properties and failure behavior of unidirectional porous ceramics

    NASA Astrophysics Data System (ADS)

    Seuba, Jordi; Deville, Sylvain; Guizard, Christian; Stevenson, Adam J.

    2016-04-01

    We show that the honeycomb out-of-plane model derived by Gibson and Ashby can be applied to describe the compressive behavior of unidirectional porous materials. Ice-templating allowed us to process samples with accurate control over pore volume, size, and morphology. These samples allowed us to evaluate the effect of these microstructural variations on the compressive strength in a porosity range of 45-80%. The maximum strength of 286 MPa was achieved in the least porous ice-templated sample (P(%) = 49.9), with the smallest pore size (3 μm). We found that the out-of-plane model only holds when buckling is the dominant failure mode, as should be expected. Furthermore, we controlled total pore volume by adjusting solids loading and sintering temperature. This strategy allows us to independently control macroporosity and densification of walls, and the compressive strength of ice-templated materials is exclusively dependent on total pore volume.

  11. Mechanical properties and failure behavior of unidirectional porous ceramics.

    PubMed

    Seuba, Jordi; Deville, Sylvain; Guizard, Christian; Stevenson, Adam J

    2016-04-14

    We show that the honeycomb out-of-plane model derived by Gibson and Ashby can be applied to describe the compressive behavior of unidirectional porous materials. Ice-templating allowed us to process samples with accurate control over pore volume, size, and morphology. These samples allowed us to evaluate the effect of these microstructural variations on the compressive strength in a porosity range of 45-80%. The maximum strength of 286 MPa was achieved in the least porous ice-templated sample (P(%) = 49.9), with the smallest pore size (3 μm). We found that the out-of-plane model only holds when buckling is the dominant failure mode, as should be expected. Furthermore, we controlled total pore volume by adjusting solids loading and sintering temperature. This strategy allows us to independently control macroporosity and densification of walls, and the compressive strength of ice-templated materials is exclusively dependent on total pore volume.

  12. Purification of complex samples: Implementation of a modular and reconfigurable droplet-based microfluidic platform with cascaded deterministic lateral displacement separation modules

    PubMed Central

    Pudda, Catherine; Boizot, François; Verplanck, Nicolas; Revol-Cavalier, Frédéric; Berthier, Jean; Thuaire, Aurélie

    2018-01-01

    Particle separation in microfluidic devices is a common problem for sample preparation in biology. Deterministic lateral displacement (DLD) is efficiently implemented as a size-based fractionation technique to separate two populations of particles around a specific size. However, real biological samples contain components of many different sizes and a single DLD separation step is not sufficient to purify these complex samples. When connecting several DLD modules in series, pressure balancing at the DLD outlets of each step becomes critical to ensure an optimal separation efficiency. A generic microfluidic platform is presented in this paper to optimize pressure balancing, when DLD separation is connected either to another DLD module or to a different microfluidic function. This is made possible by generating droplets at T-junctions connected to the DLD outlets. Droplets act as pressure controllers, which perform at the same time the encapsulation of DLD sorted particles and the balance of output pressures. The optimized pressures to apply on DLD modules and on T-junctions are determined by a general model that ensures the equilibrium of the entire platform. The proposed separation platform is completely modular and reconfigurable since the same predictive model applies to any cascaded DLD modules of the droplet-based cartridge. PMID:29768490

  13. Threshold-dependent sample sizes for selenium assessment with stream fish tissue

    USGS Publications Warehouse

    Hitt, Nathaniel P.; Smith, David R.

    2015-01-01

    Natural resource managers are developing assessments of selenium (Se) contamination in freshwater ecosystems based on fish tissue concentrations. We evaluated the effects of sample size (i.e., number of fish per site) on the probability of correctly detecting mean whole-body Se values above a range of potential management thresholds. We modeled Se concentrations as gamma distributions with shape and scale parameters fitting an empirical mean-to-variance relationship in data from southwestern West Virginia, USA (63 collections, 382 individuals). We used parametric bootstrapping techniques to calculate statistical power as the probability of detecting true mean concentrations up to 3 mg Se/kg above management thresholds ranging from 4 to 8 mg Se/kg. Sample sizes required to achieve 80% power varied as a function of management thresholds and Type I error tolerance (α). Higher thresholds required more samples than lower thresholds because populations were more heterogeneous at higher mean Se levels. For instance, to assess a management threshold of 4 mg Se/kg, a sample of eight fish could detect an increase of approximately 1 mg Se/kg with 80% power (given α = 0.05), but this sample size would be unable to detect such an increase from a management threshold of 8 mg Se/kg with more than a coin-flip probability. Increasing α decreased sample size requirements to detect above-threshold mean Se concentrations with 80% power. For instance, at an α-level of 0.05, an 8-fish sample could detect an increase of approximately 2 units above a threshold of 8 mg Se/kg with 80% power, but when α was relaxed to 0.2, this sample size was more sensitive to increasing mean Se concentrations, allowing detection of an increase of approximately 1.2 units with equivalent power. 
    Combining individuals into 2- and 4-fish composite samples for laboratory analysis did not decrease power because the reduced number of laboratory samples was compensated for by increased precision of composites for estimating mean conditions. However, low sample sizes (<5 fish) did not achieve 80% power to detect near-threshold values (i.e., <1 mg Se/kg) under any scenario we evaluated. This analysis can assist the sampling design and interpretation of Se assessments from fish tissue by accounting for natural variation in stream fish populations.
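
    The parametric-bootstrap power calculation described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the power-law mean-to-variance relationship and its coefficients (`a`, `b`), the use of the sample mean as the test statistic, and all function names are hypothetical and are not taken from the paper.

```python
import random


def variance_from_mean(mean, a=0.2, b=2.0):
    """Hypothetical mean-to-variance relationship: var = a * mean**b (assumed, not from the paper)."""
    return a * mean ** b


def gamma_params(mean, var):
    """Gamma shape and scale chosen so that shape*scale = mean and shape*scale**2 = var."""
    scale = var / mean
    shape = mean / scale
    return shape, scale


def simulate_sample_means(mean, n_fish, n_sims, rng):
    """Draw n_sims samples of n_fish gamma-distributed concentrations; return their sample means."""
    shape, scale = gamma_params(mean, variance_from_mean(mean))
    return [
        sum(rng.gammavariate(shape, scale) for _ in range(n_fish)) / n_fish
        for _ in range(n_sims)
    ]


def bootstrap_power(threshold, increase, n_fish, alpha=0.05, n_sims=4000, seed=1):
    """Estimate the probability of detecting a true mean of threshold + increase.

    The critical value is the (1 - alpha) quantile of sample means simulated
    with the true mean set equal to the management threshold.
    """
    rng = random.Random(seed)
    null_means = sorted(simulate_sample_means(threshold, n_fish, n_sims, rng))
    index = min(int((1 - alpha) * n_sims), n_sims - 1)
    critical = null_means[index]
    alt_means = simulate_sample_means(threshold + increase, n_fish, n_sims, rng)
    return sum(m > critical for m in alt_means) / n_sims


if __name__ == "__main__":
    for increase in (0.0, 1.0, 2.0, 3.0):
        print(increase, bootstrap_power(threshold=4.0, increase=increase, n_fish=8))
```

    With zero increase the estimated power should sit near α, which is a quick sanity check on the simulated critical value.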

  14. Properties of hypothesis testing techniques and (Bayesian) model selection for exploration-based and theory-based (order-restricted) hypotheses.

    PubMed

    Kuiper, Rebecca M; Nederhoff, Tim; Klugkist, Irene

    2015-05-01

    In this paper, the performance of six types of techniques for comparisons of means is examined. These six emerge from the distinction between the method employed (hypothesis testing, model selection using information criteria, or Bayesian model selection) and the set of hypotheses that is investigated (a classical, exploration-based set of hypotheses containing equality constraints on the means, or a theory-based limited set of hypotheses with equality and/or order restrictions). A simulation study is conducted to examine the performance of these techniques. We demonstrate that, if one has specific, a priori specified hypotheses, confirmation (i.e., investigating theory-based hypotheses) has advantages over exploration (i.e., examining all possible equality-constrained hypotheses). Furthermore, examining reasonable order-restricted hypotheses has more power to detect the true effect/non-null hypothesis than evaluating only equality restrictions. Additionally, when investigating more than one theory-based hypothesis, model selection is preferred over hypothesis testing. Because of the first two results, we further examine the techniques that are able to evaluate order restrictions in a confirmatory fashion by examining their performance when the homogeneity of variance assumption is violated. Results show that the techniques are robust to heterogeneity when the sample sizes are equal. When the sample sizes are unequal, the performance is affected by heterogeneity. The size and direction of the deviations from the baseline, where there is no heterogeneity, depend on the effect size (of the means) and on the trend in the group variances with respect to the ordering of the group sizes. Importantly, the deviations are less pronounced when the group variances and sizes exhibit the same trend (e.g., are both increasing with group number). © 2014 The British Psychological Society.

  15. Complex Population Dynamics and the Coalescent Under Neutrality

    PubMed Central

    Volz, Erik M.

    2012-01-01

    Estimates of the coalescent effective population size Ne can be poorly correlated with the true population size. The relationship between Ne and the population size is sensitive to the way in which birth and death rates vary over time. The problem of inference is exacerbated when the mechanisms underlying population dynamics are complex and depend on many parameters. In instances where nonparametric estimators of Ne such as the skyline struggle to reproduce the correct demographic history, model-based estimators that can draw on prior information about population size and growth rates may be more efficient. A coalescent model is developed for a large class of populations such that the demographic history is described by a deterministic nonlinear dynamical system of arbitrary dimension. This class of demographic model differs from those typically used in population genetics. Birth and death rates are not fixed, and no assumptions are made regarding the fraction of the population sampled. Furthermore, the population may be structured in such a way that gene copies reproduce both within and across demes. For this large class of models, it is shown how to derive the rate of coalescence, as well as the likelihood of a gene genealogy with heterochronous sampling and labeled taxa, and how to simulate a coalescent tree conditional on a complex demographic history. This theoretical framework encapsulates many of the models used by ecologists and epidemiologists and should facilitate the integration of population genetics with the study of mathematical population dynamics. PMID:22042576

  16. Scalable population estimates using spatial-stream-network (SSN) models, fish density surveys, and national geospatial database frameworks for streams

    Treesearch

    Daniel J. Isaak; Jay M. Ver Hoef; Erin E. Peterson; Dona L. Horan; David E. Nagel

    2017-01-01

    Population size estimates for stream fishes are important for conservation and management, but sampling costs limit the extent of most estimates to small portions of river networks that encompass 100s–10 000s of linear kilometres. However, the advent of large fish density data sets, spatial-stream-network (SSN) models that benefit from nonindependence among samples,...

  17. Free-Energy Fluctuations and Chaos in the Sherrington-Kirkpatrick Model

    NASA Astrophysics Data System (ADS)

    Aspelmeier, T.

    2008-03-01

    The sample-to-sample fluctuations ΔF_N of the free-energy in the Sherrington-Kirkpatrick model are shown rigorously to be related to bond chaos. Via this connection, the fluctuations become analytically accessible by replica methods. The replica calculation for bond chaos shows that the exponent μ governing the growth of the fluctuations with system size N, ΔF_N ∼ N^μ, is bounded by μ ≤ 1/4.

  18. [Primary branch size of Pinus koraiensis plantation: a prediction based on linear mixed effect model].

    PubMed

    Dong, Ling-Bo; Liu, Zhao-Gang; Li, Feng-Ri; Jiang, Li-Chun

    2013-09-01

    By using the branch analysis data of 955 standard branches from 60 sampled trees in 12 sampling plots of Pinus koraiensis plantation in Mengjiagang Forest Farm in Heilongjiang Province of Northeast China, and based on the linear mixed-effect model theory and methods, the models for predicting branch variables, including primary branch diameter, length, and angle, were developed. Considering tree effect, the MIXED module of SAS software was used to fit the prediction models. The results indicated that the fitting precision of the models could be improved by choosing appropriate random-effect parameters and variance-covariance structure. Then, the correlation structures including complex symmetry structure (CS), first-order autoregressive structure [AR(1)], and first-order autoregressive and moving average structure [ARMA(1,1)] were added to the optimal branch size mixed-effect model. The AR(1) structure significantly improved the fitting precision of the branch diameter and length mixed-effect models, but none of the three structures improved the precision of the branch angle mixed-effect model. In order to describe the heteroscedasticity when building the mixed-effect models, the CF1 and CF2 functions were added to the branch mixed-effect model. The CF1 function significantly improved the fit of the branch angle mixed model, whereas the CF2 function significantly improved the fit of the branch diameter and length mixed models. Model validation confirmed that the mixed-effect model could improve the precision of prediction, as compared to the traditional regression model for the branch size prediction of Pinus koraiensis plantation.

  19. Class Extraction and Classification Accuracy in Latent Class Models

    ERIC Educational Resources Information Center

    Wu, Qiong

    2009-01-01

    Despite the increasing popularity of latent class models (LCM) in educational research, methodological studies have not yet accumulated much information on the appropriate application of this modeling technique, especially with regard to requirement on sample size and number of indicators. This dissertation study represented an initial attempt to…

  20. Sample Size Limits for Estimating Upper Level Mediation Models Using Multilevel SEM

    ERIC Educational Resources Information Center

    Li, Xin; Beretvas, S. Natasha

    2013-01-01

    This simulation study investigated use of the multilevel structural equation model (MLSEM) for handling measurement error in both mediator and outcome variables ("M" and "Y") in an upper level multilevel mediation model. Mediation and outcome variable indicators were generated with measurement error. Parameter and standard…

  1. Modeling aboveground biomass of Tamarix ramosissima in the Arkansas River Basin of Southeastern Colorado, USA

    USGS Publications Warehouse

    Evangelista, P.; Kumar, S.; Stohlgren, T.J.; Crall, A.W.; Newman, G.J.

    2007-01-01

    Predictive models of aboveground biomass of nonnative Tamarix ramosissima of various sizes were developed using destructive sampling techniques on 50 individuals and four 100-m2 plots. Each sample was measured for average height (m) of stems and canopy area (m2) prior to cutting, drying, and weighing. Five competing regression models (P < 0.05) were developed to estimate aboveground biomass of T. ramosissima using average height and/or canopy area measurements and were evaluated using Akaike's Information Criterion corrected for small sample size (AICc). Our best model (AICc = -148.69, ΔAICc = 0) successfully predicted T. ramosissima aboveground biomass (R2 = 0.97) and used average height and canopy area as predictors. Our 2nd-best model, using the same predictors, was also successful in predicting aboveground biomass (R2 = 0.97, AICc = -131.71, ΔAICc = 16.98). A 3rd model demonstrated high correlation between only aboveground biomass and canopy area (R2 = 0.95), while 2 additional models found high correlations between aboveground biomass and average height measurements only (R2 = 0.90 and 0.70, respectively). These models illustrate how simple field measurements, such as height and canopy area, can be used in allometric relationships to accurately predict aboveground biomass of T. ramosissima. Although a correction factor may be necessary for predictions at larger scales, the models presented will prove useful for many research and management initiatives.
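
    As a brief illustration of the model-ranking step mentioned in this abstract, the sketch below computes AICc from a log-likelihood using the standard small-sample correction, and derives AICc differences relative to the best model. The function names are hypothetical; the two AICc values in the usage lines are the ones quoted in the abstract, and nothing here reproduces the authors' actual fitting code.

```python
def aicc(log_likelihood, k, n):
    """AIC corrected for small sample size.

    AIC  = 2k - 2 ln(L)
    AICc = AIC + 2k(k + 1) / (n - k - 1)
    where k is the number of estimated parameters and n is the sample size.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def delta_aicc(scores):
    """Difference of each AICc score from the smallest (best) score."""
    best = min(scores)
    return [score - best for score in scores]


if __name__ == "__main__":
    # AICc values quoted in the abstract for the best and 2nd-best models.
    deltas = delta_aicc([-148.69, -131.71])
    print([round(d, 2) for d in deltas])
```

    The correction term shrinks toward zero as n grows relative to k, so AICc and AIC agree for large samples.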

  2. QA/QC requirements for physical properties sampling and analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Innis, B.E.

    1993-07-21

    This report presents results of an assessment of the available information concerning US Environmental Protection Agency (EPA) quality assurance/quality control (QA/QC) requirements and guidance applicable to sampling, handling, and analyzing physical parameter samples at Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA) investigation sites. Geotechnical testing laboratories measure the following physical properties of soil and sediment samples collected during CERCLA remedial investigations (RI) at the Hanford Site: moisture content, grain size by sieve, grain size by hydrometer, specific gravity, bulk density/porosity, saturated hydraulic conductivity, moisture retention, unsaturated hydraulic conductivity, and permeability of rocks by flowing air. Geotechnical testing laboratories also measure the following chemical parameters of soil and sediment samples collected during Hanford Site CERCLA RI: calcium carbonate and saturated column leach testing. Physical parameter data are used for (1) characterization of vadose and saturated zone geology and hydrogeology, (2) selection of monitoring well screen sizes, (3) to support modeling and analysis of the vadose and saturated zones, and (4) for engineering design. The objectives of this report are to determine the QA/QC levels accepted in the EPA Region 10 for the sampling, handling, and analysis of soil samples for physical parameters during CERCLA RI.

  3. Valuing Trial Designs from a Pharmaceutical Perspective Using Value-Based Pricing.

    PubMed

    Breeze, Penny; Brennan, Alan

    2015-11-01

    Our aim was to adapt the traditional framework for expected net benefit of sampling (ENBS) to be more compatible with drug development trials from the pharmaceutical perspective. We modify the traditional framework for conducting ENBS and assume that the price of the drug is conditional on the trial outcomes. We use a value-based pricing (VBP) criterion to determine price conditional on trial data using Bayesian updating of cost-effectiveness (CE) model parameters. We assume that there is a threshold price below which the company would not market the new intervention. We present a case study in which a phase III trial sample size and trial duration are varied. For each trial design, we sampled 10,000 trial outcomes and estimated VBP using a CE model. The expected commercial net benefit is calculated as the expected profits minus the trial costs. A clinical trial with shorter follow-up, and larger sample size, generated the greatest expected commercial net benefit. Increasing the duration of follow-up had a modest impact on profit forecasts. Expected net benefit of sampling can be adapted to value clinical trials in the pharmaceutical industry to optimise the expected commercial net benefit. However, the analyses can be very time consuming for complex CE models. © 2014 The Authors. Health Economics published by John Wiley & Sons Ltd.

  4. Dispersion models and sampling of cacao mirid bug Sahlbergella singularis (Hemiptera: Miridae) on Theobroma Cacao in southern Cameroon.

    PubMed

    Bisseleua, D H B; Vidal, Stefan

    2011-02-01

    The spatio-temporal distribution of Sahlbergella singularis Haglung, a major pest of cacao trees (Theobroma cacao) (Malvaceae), was studied for 2 yr in traditional cacao forest gardens in the humid forest area of southern Cameroon. The first objective was to analyze the dispersion of this insect on cacao trees. The second objective was to develop sampling plans based on fixed levels of precision for estimating S. singularis populations. The following models were used to analyze the data: Taylor's power law, Iwao's patchiness regression, the Nachman model, and the negative binomial distribution. Our results document that Taylor's power law was a better fit for the data than the Iwao and Nachman models. Taylor's b and Iwao's β were both significantly >1, indicating that S. singularis aggregated on specific trees. This result was further supported by the calculated common k of 1.75444. Iwao's α was significantly <0, indicating that the basic distribution component of S. singularis was the individual insect. Comparison of negative binomial (NBD) and Nachman models indicated that the NBD model was appropriate for studying S. singularis distribution. Optimal sample sizes for fixed precision levels of 0.10, 0.15, and 0.25 were estimated with Taylor's regression coefficients. Required sample sizes increased dramatically with increasing levels of precision. This is the first study on S. singularis dispersion in cacao plantations. Sampling plans, presented here, should be a tool for research on population dynamics and pest management decisions of mirid bugs on cacao. © 2011 Entomological Society of America
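
    The fixed-precision sample size calculation mentioned in this abstract is commonly derived from Taylor's power law, s² = a·m^b, which gives n = a·m^(b−2)/D² when precision D is expressed as the ratio of the standard error to the mean. The sketch below applies that general formula; the coefficients `a` and `b` used in the usage lines are purely hypothetical, since the abstract does not report the fitted values.

```python
import math


def taylor_sample_size(mean_density, a, b, precision):
    """Sample size for a fixed precision level from Taylor's power law.

    Taylor's power law: variance = a * mean**b
    Precision D = standard error / mean, so n = a * mean**(b - 2) / D**2.
    The result is rounded up to a whole number of sampling units.
    """
    if mean_density <= 0 or precision <= 0:
        raise ValueError("mean_density and precision must be positive")
    return math.ceil(a * mean_density ** (b - 2) / precision ** 2)


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    a, b = 2.0, 1.5
    for d in (0.10, 0.15, 0.25):
        print(d, taylor_sample_size(mean_density=4.0, a=a, b=b, precision=d))
```

    Because n scales with 1/D², tightening the precision level from 0.25 to 0.10 multiplies the required sample size by more than six, which matches the abstract's remark that sample sizes rise sharply at higher precision.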

  5. Statistical Modelling of Temperature and Moisture Uptake of Biochars Exposed to Selected Relative Humidity of Air.

    PubMed

    Bastistella, Luciane; Rousset, Patrick; Aviz, Antonio; Caldeira-Pires, Armando; Humbert, Gilles; Nogueira, Manoel

    2018-02-09

    New experimental techniques, as well as modern variants on known methods, have recently been employed to investigate the fundamental reactions underlying the oxidation of biochar. The purpose of this paper was to experimentally and statistically study how the relative humidity of air, mass, and particle size of four biochars influenced the adsorption of water and the increase in temperature. A random factorial design was employed using the intuitive statistical software Xlstat. A simple linear regression model and an analysis of variance with a pairwise comparison were performed. The experimental study was carried out on the wood of Quercus pubescens, Cyclobalanopsis glauca, Trigonostemon huangmosun, and Bambusa vulgaris, and involved five relative humidity conditions (22, 43, 75, 84, and 90%), two mass samples (0.1 and 1 g), and two particle sizes (powder and piece). Two response variables including water adsorption and temperature increase were analyzed and discussed. The temperature did not increase linearly with the adsorption of water. Temperature was modeled by nine explanatory variables, while water adsorption was modeled by eight. Five variables, including factors and their interactions, were found to be common to the two models. Sample mass and relative humidity influenced the two qualitative variables, while particle size and biochar type only influenced the temperature.

  6. Gaps in sampling and limitations to tree biomass estimation: a review of past sampling efforts over the past 50 years

    Treesearch

    Aaron Weiskittel; Jereme Frank; James Westfall; David Walker; Phil Radtke; David Affleck; David Macfarlane

    2015-01-01

    Tree biomass models are widely used but differ due to variation in the quality and quantity of data used in their development. We reviewed over 250 biomass studies and categorized them by species, location, sampled diameter distribution, and sample size. Overall, less than half of the tree species in Forest Inventory and Analysis database (FIADB) are without a...

  7. Self-assembled indium arsenide quantum dots: Structure, formation dynamics, optical properties

    NASA Astrophysics Data System (ADS)

    Lee, Hao

    1998-12-01

    In this dissertation, we investigate the properties of InAs/GaAs quantum dots grown by molecular beam epitaxy. The structure and formation dynamics of InAs quantum dots are studied by a variety of structural characterization techniques. Correlations among the growth conditions, the structural characteristics, and the observed optical properties are explored. The most fundamental structural characteristic of the InAs quantum dots is their shape. Through detailed study of the reflection high energy electron diffraction patterns, we determined that self-assembled InAs islands possess a pyramidal shape with {136} bounding facets. Cross-sectional transmission electron microscopy images and atomic force microscopy images strongly support this model. The {136} model we proposed is the first model that is consistent with all reported shape features determined using different methods. The dynamics of coherent island formation is also studied with the goal of establishing the factors most important in determining the size, density, and the shape of self-organized InAs quantum dots. Our studies clearly demonstrate the roles that indium diffusion and desorption play in InAs island formation. An unexpected finding (from atomic force microscopy images) was that the island size distribution bifurcated during post-growth annealing. Photoluminescence spectra of the samples subjected to in-situ annealing prior to the growth of a capping layer show a distinctive double-peak feature. The power-dependence and temperature-dependence of the photoluminescence spectra reveal that the double-peak emission is associated with the ground-state transition of islands in two different size branches. These results confirm the island size bifurcation observed from atomic force microscopy images. The island size bifurcation provides a new approach to the control and manipulation of the island size distribution.
    Unexpected dependence of the photoluminescence line-shape on sample temperature and pump intensity was observed for samples grown at relatively high substrate temperatures. The behavior is modeled and explained in terms of competition between two overlapping transitions. The study underscores that the growth conditions can have a dramatic impact on the optical properties of the quantum dots. This dissertation includes both my previously published and unpublished authored materials.

  8. Freeway travel speed calculation model based on ETC transaction data.

    PubMed

    Weng, Jiancheng; Yuan, Rongliang; Wang, Ru; Wang, Chang

    2014-01-01

    The real-time operating condition of freeway traffic flow is becoming critical information for freeway users and managers. Electronic toll collection (ETC) transaction data effectively record operational information of vehicles on the freeway, which provides a new method to estimate freeway travel speed. First, the paper analyzed the structure of ETC transaction data and presented the data preprocessing procedure. Then, a dual-level travel speed calculation model was established under different levels of sample sizes. In order to ensure a sufficient sample size, ETC data from pairs of entry and exit toll plazas spanning more than one road segment were used to calculate the travel speed of every road segment. The reduction coefficient α and reliable weight θ for sample vehicle speed were introduced in the model. Finally, the model was verified by specially designed field experiments which were conducted on several freeways in Beijing at different time periods. The experimental results demonstrated that the average relative error was about 6.5%, which means that the freeway travel speed could be estimated accurately by the proposed model. The proposed model is helpful for improving freeway operation monitoring and freeway management, as well as for providing useful information to freeway travelers.

  9. Information for forest process models: a review of NRS-FIA vegetation measurements

    Treesearch

    Charles D. Canham; William H. McWilliams

    2012-01-01

    The Forest Inventory and Analysis Program of the Northern Research Station (NRS-FIA) has re-designed Phase 3 measurements and intensified the sample intensity following a study to balance costs, utility, and sample size. The sampling scheme consists of estimating canopy-cover percent for six vegetation growth habits on 24-foot-radius subplots in four height classes and as an...

  10. Geostatistics and the representative elementary volume of gamma ray tomography attenuation in rocks cores

    USGS Publications Warehouse

    Vogel, J.R.; Brown, G.O.

    2003-01-01

    Semivariograms of samples of Culebra Dolomite have been determined at two different resolutions for gamma ray computed tomography images. By fitting models to semivariograms, small-scale and large-scale correlation lengths are determined for four samples. Different semivariogram parameters were found for adjacent cores at both resolutions. Representative elementary volume (REV) concepts are related to the stationarity of the sample. A scale disparity factor is defined and is used to determine sample size required for ergodic stationarity with a specified correlation length. This allows for comparison of geostatistical measures and representative elementary volumes. The modifiable areal unit problem is also addressed and used to determine resolution effects on correlation lengths. By changing resolution, a range of correlation lengths can be determined for the same sample. Comparison of voxel volume to the best-fit model correlation length of a single sample at different resolutions reveals a linear scaling effect. Using this relationship, the range of the point value semivariogram is determined. This is the range approached as the voxel size goes to zero. Finally, these results are compared to the regularization theory of point variables for borehole cores and are found to be a better fit for predicting the volume-averaged range.

  11. Impact of multicollinearity on small sample hydrologic regression models

    NASA Astrophysics Data System (ADS)

    Kroll, Charles N.; Song, Peter

    2013-06-01

    Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, it is recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
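
    To make the variance inflation factor (VIF) screening mentioned in this abstract concrete, the sketch below handles the simplest case of two explanatory variables, where VIF = 1/(1 − r²) and r is their Pearson correlation. The function names and the sample data are hypothetical, and this is not the screening procedure used in the study, which is not described in the abstract.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression: 1 / (1 - r**2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


if __name__ == "__main__":
    # Hypothetical explanatory variables, for illustration only.
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
    x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
    print(pearson_r(x1, x2), vif_two_predictors(x1, x2))
```

    With more than two predictors, each VIF comes from regressing one predictor on all the others, so a matrix-based routine would be needed instead of a single correlation.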

  12. Monitoring diesel particulate matter and calculating diesel particulate densities using Grimm model 1.109 real-time aerosol monitors in underground mines.

    PubMed

    Kimbal, Kyle C; Pahler, Leon; Larson, Rodney; VanDerslice, Jim

    2012-01-01

    Currently, there is no Mine Safety and Health Administration (MSHA)-approved sampling method that provides real-time results for ambient concentrations of diesel particulates. This study investigated whether a commercially available aerosol spectrometer, the Grimm Portable Aerosol Spectrometer Model 1.109, could be used during underground mine operations to provide accurate real-time diesel particulate data relative to MSHA-approved cassette-based sampling methods. A secondary aim was to estimate size-specific diesel particle densities to potentially improve the diesel particulate concentration estimates using the aerosol monitor. Concurrent sampling was conducted during underground metal mine operations using six duplicate diesel particulate cassettes, according to the MSHA-approved method, and two identical Grimm Model 1.109 instruments. Linear regression was used to develop adjustment factors relating the Grimm results to the average of the cassette results. Statistical models using the Grimm data produced predicted diesel particulate concentrations that were highly correlated with the time-weighted average cassette results (R^2 = 0.86, 0.88). Size-specific diesel particulate densities were not constant over the range of particle diameters observed. The variance of the calculated diesel particulate densities by particle diameter size supports the current understanding that diesel emissions are a mixture of particulate aerosols and a complex host of gases and vapors not limited to elemental and organic carbon. Finally, diesel particulate concentrations measured by the Grimm Model 1.109 can be adjusted to provide sufficiently accurate real-time air monitoring data for an underground mining environment.

  13. Function approximation and documentation of sampling data using artificial neural networks.

    PubMed

    Zhang, Wenjun; Barrion, Albert

    2006-11-01

    Biodiversity studies in ecology often begin with the fitting and documentation of sampling data. This study is conducted to make function approximation on sampling data and to document the sampling information using artificial neural network algorithms, based on the invertebrate data sampled in the irrigated rice field. Three types of sampling data, i.e., the curve of species richness vs. sample size, the rarefaction curve, and the curve of mean abundance of newly sampled species vs. sample size, are fitted and documented using a BP (backpropagation) network and an RBF (radial basis function) network. For comparison, the Arrhenius model, the rarefaction model, and the power function are tested for their ability to fit these data. The results show that the BP network and RBF network fit the data better than these models, with smaller errors. BP and RBF networks can fit non-linear functions (sampling data) with specified accuracy and do not require mathematical assumptions. In addition to interpolation, the BP network is used to extrapolate the functions, and the asymptote of the sampling data can be drawn. The BP network takes longer to train, and its results are always less stable than those of the RBF network. The RBF network requires more neurons to fit functions and generally may not be used to extrapolate the functions. The mathematical function for sampling data can be exactly fitted using artificial neural network algorithms by adjusting the desired accuracy and maximum iterations. The total number of functional species of invertebrates in the tropical irrigated rice field is extrapolated as 140 to 149 using the trained BP network, which is similar to the observed richness.

  14. Orphan therapies: making best use of postmarket data.

    PubMed

    Maro, Judith C; Brown, Jeffrey S; Dal Pan, Gerald J; Li, Lingling

    2014-08-01

    Postmarket surveillance of the comparative safety and efficacy of orphan therapeutics is challenging, particularly when multiple therapeutics are licensed for the same orphan indication. To make best use of product-specific registry data collected to fulfill regulatory requirements, we propose the creation of a distributed electronic health data network among registries. Such a network could support sequential statistical analyses designed to detect early warnings of excess risks. We use a simulated example to explore the circumstances under which a distributed network may prove advantageous. We perform sample size calculations for sequential and non-sequential statistical studies aimed at comparing the incidence of hepatotoxicity following initiation of two newly licensed therapies for homozygous familial hypercholesterolemia. We calculate the sample size savings ratio, or the proportion of sample size saved if one conducted a sequential study as compared to a non-sequential study. Then, using models to describe the adoption and utilization of these therapies, we simulate when these sample sizes are attainable in calendar years. We then calculate the analytic calendar time savings ratio, analogous to the sample size savings ratio. We repeat these analyses for numerous scenarios. Sequential analyses detect effect sizes earlier or at the same time as non-sequential analyses. The most substantial potential savings occur when the market share is more imbalanced (i.e., 90% for therapy A) and the effect size is closest to the null hypothesis. However, due to low exposure prevalence, these savings are difficult to realize within the 30-year time frame of this simulation for scenarios in which the outcome of interest occurs at or more frequently than one event/100 person-years. 
    We illustrate a process to assess whether sequential statistical analyses of registry data performed via distributed networks may prove a worthwhile infrastructure investment for pharmacovigilance.
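
    The two savings ratios defined in this abstract can be sketched as a single Python helper. The function name and the inputs are hypothetical; the abstract describes the ratio only in words, as the proportion saved by a sequential study compared with a non-sequential one.

```python
def savings_ratio(sequential, non_sequential):
    """Proportion saved by the sequential design relative to the fixed design."""
    if non_sequential <= 0:
        raise ValueError("non_sequential must be positive")
    return (non_sequential - sequential) / non_sequential

# Hypothetical inputs, not results from the study.
sample_size_saving = savings_ratio(sequential=800, non_sequential=1000)
calendar_time_saving = savings_ratio(sequential=6.0, non_sequential=7.5)
```

    The same function serves both quantities because the calendar time savings ratio is described as analogous to the sample size savings ratio; only the units of the inputs differ.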

  15. Snow particles extracted from X-ray computed microtomography imagery and their single-scattering properties

    NASA Astrophysics Data System (ADS)

    Ishimoto, Hiroshi; Adachi, Satoru; Yamaguchi, Satoru; Tanikawa, Tomonori; Aoki, Teruo; Masuda, Kazuhiko

    2018-04-01

    Sizes and shapes of snow particles were determined from X-ray computed microtomography (micro-CT) images, and their single-scattering properties were calculated at visible and near-infrared wavelengths using a Geometrical Optics Method (GOM). We analyzed seven snow samples including fresh and aged artificial snow and natural snow obtained from field samples. Individual snow particles were numerically extracted, and the shape of each snow particle was defined by applying a rendering method. The size distribution and specific surface area distribution were estimated from the geometrical properties of the snow particles, and an effective particle radius was derived for each snow sample. The GOM calculations at wavelengths of 0.532 and 1.242 μm revealed that the realistic snow particles had similar scattering phase functions as those of previously modeled irregular shaped particles. Furthermore, distinct dendritic particles had a characteristic scattering phase function and asymmetry factor. The single-scattering properties of particles of effective radius reff were compared with the size-averaged single-scattering properties. We found that the particles of reff could be used as representative particles for calculating the average single-scattering properties of the snow. Furthermore, the single-scattering properties of the micro-CT particles were compared to those of particle shape models using our current snow retrieval algorithm. For the single-scattering phase function, the results of the micro-CT particles were consistent with those of a conceptual two-shape model. However, the particle size dependence differed for the single-scattering albedo and asymmetry factor.

  16. Personalized prediction of chronic wound healing: an exponential mixed effects model using stereophotogrammetric measurement.

    PubMed

    Xu, Yifan; Sun, Jiayang; Carter, Rebecca R; Bogie, Kath M

    2014-05-01

    Stereophotogrammetric digital imaging enables rapid and accurate detailed 3D wound monitoring. This rich data source was used to develop a statistically validated model to provide personalized predictive healing information for chronic wounds. 147 valid wound images were obtained from a sample of 13 category III/IV pressure ulcers from 10 individuals with spinal cord injury. Statistical comparison of several models indicated the best fit for the clinical data was a personalized mixed-effects exponential model (pMEE), with initial wound size and time as predictors and observed wound size as the response variable. Random effects capture personalized differences. Other models are only valid when wound size constantly decreases. This is often not achieved for clinical wounds. Our model accommodates this reality. Two criteria to determine effective healing time outcomes are proposed: r-fold wound size reduction time, t(r-fold), is defined as the time when wound size reduces to 1/r of initial size. t(δ) is defined as the time when the rate of the wound healing/size change reduces to a predetermined threshold δ < 0. Healing rate differs from patient to patient. Model development and validation indicates that accurate monitoring of wound geometry can adaptively predict healing progression and that larger wounds heal more rapidly. Accuracy of the prediction curve in the current model improves with each additional evaluation. Routine assessment of wounds using detailed stereophotogrammetric imaging can provide personalized predictions of wound healing time. Application of a valid model will help the clinical team to determine wound management care pathways. Published by Elsevier Ltd.
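
    The two healing-time criteria can be illustrated with a short Python sketch. It assumes a plain exponential decay S(t) = S0 * exp(-k * t) for a single wound, which is a simplification: the abstract does not give the functional form of the pMEE model or its random effects, so the function names, parameters, and formulas below are assumptions for illustration only.

```python
import math

def t_r_fold(r, k):
    """Time at which size falls to 1/r of its initial value: ln(r) / k."""
    if r <= 1 or k <= 0:
        raise ValueError("requires r > 1 and k > 0")
    return math.log(r) / k

def t_delta(s0, k, delta):
    """Time at which the rate dS/dt = -k * s0 * exp(-k t) reaches delta < 0."""
    if delta >= 0 or s0 <= 0 or k <= 0:
        raise ValueError("requires delta < 0, s0 > 0 and k > 0")
    initial_rate = -k * s0
    if delta <= initial_rate:
        return 0.0  # the rate is already at or past the threshold at t = 0
    return math.log(initial_rate / delta) / k

# Hypothetical parameters: initial size 10, decay rate 0.1 per day.
halving_time = t_r_fold(r=2, k=0.1)
slowdown_time = t_delta(s0=10.0, k=0.1, delta=-0.5)
```

    Under this simplified model, t(r-fold) depends only on the decay rate, while t(δ) also depends on the initial wound size.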

  17. Discovery of the linear region of Near Infrared Diffuse Reflectance spectra using the Kubelka-Munk theory

    NASA Astrophysics Data System (ADS)

    Dai, Shengyun; Pan, Xiaoning; Ma, Lijuan; Huang, Xingguo; Du, Chenzhao; Qiao, Yanjiang; Wu, Zhisheng

    2018-05-01

    Particle size is of great importance for the quantitative model of the NIR diffuse reflectance. In this paper, the effect of sample particle size on the measurement of harpagoside in Radix Scrophulariae powder by near-infrared (NIR) diffuse reflectance spectroscopy was explored. High-performance liquid chromatography (HPLC) was employed as a reference method to construct the quantitative particle size model. Several spectral preprocessing methods were compared, and the particle size models obtained by different preprocessing methods were used to establish the partial least-squares (PLS) models of harpagoside. Data showed that the particle size distribution of 125-150 μm for Radix Scrophulariae exhibited the best prediction ability with R²pre = 0.9513, RMSEP = 0.1029 mg·g⁻¹, and RPD = 4.78. For the hybrid granularity calibration model, the particle size distribution of 90-180 μm exhibited the best prediction ability with R²pre = 0.8919, RMSEP = 0.1632 mg·g⁻¹, and RPD = 3.09. Furthermore, the Kubelka-Munk theory was used to relate the absorption coefficient k (concentration-dependent) and scatter coefficient s (particle size-dependent). The scatter coefficient s was calculated based on the Kubelka-Munk theory to study the changes of s after being mathematically preprocessed. A linear relationship was observed between k/s and absorption A within a certain range and the value for k/s was greater than 4. According to this relationship, the model was more accurately constructed with the particle size distribution of 90-180 μm when s was kept constant or in a small linear region. This region provided a good reference for the linear modeling of diffuse reflectance spectroscopy. To establish a diffuse reflectance NIR model, further accurate assessment should be obtained in advance for a precise linear model.
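
    The relation between k/s and reflectance referred to in this abstract can be sketched in Python using the standard Kubelka-Munk function for an optically thick sample. The abstract does not state how absorption A was computed; the conventional apparent absorbance log10(1/R) is assumed here, and the reflectance values are hypothetical.

```python
import math

def kubelka_munk(reflectance):
    """Kubelka-Munk function k/s = (1 - R)^2 / (2 R) for 0 < R <= 1."""
    if not 0 < reflectance <= 1:
        raise ValueError("reflectance must be in (0, 1]")
    return (1 - reflectance) ** 2 / (2 * reflectance)

def apparent_absorbance(reflectance):
    """Apparent absorbance A = log10(1 / R) (assumed definition)."""
    return math.log10(1 / reflectance)

# Hypothetical reflectance values, for illustration only.
for r in (0.05, 0.1, 0.2, 0.5):
    print(f"R={r:.2f}  k/s={kubelka_munk(r):.3f}  A={apparent_absorbance(r):.3f}")
```

    Tabulating both quantities over a range of reflectances is one way to inspect where k/s and A move together approximately linearly.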

  18. Mixture models for estimating the size of a closed population when capture rates vary among individuals

    USGS Publications Warehouse

    Dorazio, R.M.; Royle, J. Andrew

    2003-01-01

    We develop a parameterization of the beta-binomial mixture that provides sensible inferences about the size of a closed population when probabilities of capture or detection vary among individuals. Three classes of mixture models (beta-binomial, logistic-normal, and latent-class) are fitted to recaptures of snowshoe hares for estimating abundance and to counts of bird species for estimating species richness. In both sets of data, rates of detection appear to vary more among individuals (animals or species) than among sampling occasions or locations. The estimates of population size and species richness are sensitive to model-specific assumptions about the latent distribution of individual rates of detection. We demonstrate using simulation experiments that conventional diagnostics for assessing model adequacy, such as deviance, cannot be relied on for selecting classes of mixture models that produce valid inferences about population size. Prior knowledge about sources of individual heterogeneity in detection rates, if available, should be used to help select among classes of mixture models that are to be used for inference.

  19. Evidence for a Global Sampling Process in Extraction of Summary Statistics of Item Sizes in a Set.

    PubMed

    Tokita, Midori; Ueda, Sachiyo; Ishiguchi, Akira

    2016-01-01

    Several studies have shown that our visual system may construct a "summary statistical representation" over groups of visual objects. Although there is a general understanding that human observers can accurately represent sets of a variety of features, many questions on how summary statistics, such as an average, are computed remain unanswered. This study investigated sampling properties of visual information used by human observers to extract two types of summary statistics of item sets, average and variance. We presented three models of ideal observers to extract the summary statistics: a global sampling model without sampling noise, global sampling model with sampling noise, and limited sampling model. We compared the performance of an ideal observer of each model with that of human observers using statistical efficiency analysis. Results suggest that summary statistics of items in a set may be computed without representing individual items, which makes it possible to discard the limited sampling account. Moreover, the extraction of summary statistics may not necessarily require the representation of individual objects with focused attention when the sets of items are larger than 4.

  20. Laboratory measurements of electric properties of composite mine dump samples from Colorado and New Mexico

    USGS Publications Warehouse

    Anderson, Anita L.; Campbell, David L.; Beanland, Shay

    2001-01-01

    Individual mine waste samples were collected and combined to form one composite sample at each of eight mine dump sites in Colorado and New Mexico. The samples were air-dried and sieved to determine the geochemical composition of their <2 mm size fraction. Splits of the samples were then rehydrated and their electrical properties were measured in the US Geological Survey Petrophysical Laboratory, Denver, Colorado (PetLab). The PetLab measurements were done twice: in 1999, using convenient amounts of rehydration water ranging from 5% to 8%; and in 2000, using carefully controlled rehydrations to 5% and 10% water. This report gives geochemical analyses of the <2 mm size fraction of the composite samples (Appendix A), PetLab graphs of the 1999 measurements (Appendix B), PetLab graphs of the 2000 measurements (Appendix C), and Cole-Cole models of the PetLab data from the 2000 measurements (Appendix D).

  1. Lack of association between ectoparasite intensities and rabies virus neutralizing antibody seroprevalence in wild big brown bats (Eptesicus fuscus), Fort Collins, Colorado

    USGS Publications Warehouse

    Pearce, R.D.; O'Shea, T.J.; Shankar, V.; Rupprecht, C.E.

    2007-01-01

    Recently, bat ectoparasites have been demonstrated to harbor pathogens of potential importance to humans. We evaluated antirabies antibody seroprevalence and the presence of ectoparasites in big brown bats (Eptesicus fuscus) sampled in 2002 and 2003 in Colorado to investigate if an association existed between ectoparasite intensity and exposure to rabies virus (RV). We used logistic regression and Akaike's Information Criterion adjusted for sample size (AICc) in a post-hoc analysis to investigate the relative importance of three ectoparasite species, as well as bat colony size, year sampled, age class, and a colony size and year interaction, on the presence of rabies virus neutralizing antibodies (VNA) in serum of wild E. fuscus. We obtained serum samples and ectoparasite counts from big brown bats simultaneously in 2002 and 2003. Although the presence of ectoparasites (Steatonyssus occidentalis and Spinturnix bakeri) was important in elucidating VNA seroprevalence, their intensities were higher in seronegative bats than in seropositive bats, and the presence of a third ectoparasite (Cimex pilosellus) was inconsequential. Colony size and year sampled were the most important variables in these AICc models. These findings suggest that these ectoparasites do not enhance exposure of big brown bats to RV. © 2007 Mary Ann Liebert, Inc.
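
    The sample-size adjustment behind AICc can be sketched in a few lines of Python. The correction term 2k(k+1)/(n-k-1) is the standard one; the helper names, model labels, log-likelihoods, and parameter counts below are hypothetical and are not values from the study.

```python
def aicc(log_likelihood, n_params, n_obs):
    """AIC with the small-sample correction term 2k(k+1)/(n-k-1)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc requires n_obs > n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)

def rank_models(models, n_obs):
    """Return (name, AICc, delta AICc) tuples sorted from best to worst."""
    scored = [(name, aicc(ll, k, n_obs)) for name, (ll, k) in models.items()]
    best = min(score for _, score in scored)
    return sorted(((n, s, s - best) for n, s in scored), key=lambda t: t[1])

# Hypothetical log-likelihoods and parameter counts, not values from the study.
candidates = {"colony_size + year": (-120.4, 3), "ectoparasites only": (-125.9, 4)}
ranking = rank_models(candidates, n_obs=200)
```

    Ranking candidate models by their AICc differences is the usual way such a criterion is used to judge the relative importance of predictors.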

  2. Quantifying the size-resolved dynamics of indoor bioaerosol transport and control.

    PubMed

    Kunkel, S A; Azimi, P; Zhao, H; Stark, B C; Stephens, B

    2017-09-01

    Understanding the bioaerosol dynamics of droplets and droplet nuclei emitted during respiratory activities is important for understanding how infectious diseases are transmitted and potentially controlled. To this end, we conducted experiments to quantify the size-resolved dynamics of indoor bioaerosol transport and control in an unoccupied apartment unit operating under four different HVAC particle filtration conditions. Two model organisms (Escherichia coli K12 and bacteriophage T4) were aerosolized under alternating low and high flow rates to roughly represent constant breathing and periodic coughing. Size-resolved aerosol sampling and settle plate swabbing were conducted in multiple locations. Samples were analyzed by DNA extraction and quantitative polymerase chain reaction (qPCR). DNA from both organisms was detected during all test conditions in all air samples up to 7 m away from the source, but decreased in magnitude with the distance from the source. A greater fraction of T4 DNA was recovered from the aerosol size fractions smaller than 1 μm than E. coli K12 at all air sampling locations. Higher efficiency HVAC filtration also reduced the amount of DNA recovered in air samples and on settle plates located 3-7 m from the source. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  3. Dimensions of design space: a decision-theoretic approach to optimal research design.

    PubMed

    Conti, Stefano; Claxton, Karl

    2009-01-01

    Bayesian decision theory can be used not only to establish the optimal sample size and its allocation in a single clinical study but also to identify an optimal portfolio of research combining different types of study design. Within a single study, the highest societal payoff to proposed research is achieved when its sample sizes and allocation between available treatment options are chosen to maximize the expected net benefit of sampling (ENBS). Where a number of different types of study informing different parameters in the decision problem could be conducted, the simultaneous estimation of ENBS across all dimensions of the design space is required to identify the optimal sample sizes and allocations within such a research portfolio. This is illustrated through a simple example of a decision model of zanamivir for the treatment of influenza. The possible study designs include: 1) a single trial of all the parameters, 2) a clinical trial providing evidence only on clinical endpoints, 3) an epidemiological study of natural history of disease, and 4) a survey of quality of life. The possible combinations, sample sizes, and allocation between trial arms are evaluated over a range of cost-effectiveness thresholds. The computational challenges are addressed by implementing optimization algorithms to search the ENBS surface more efficiently over such large dimensions.

  4. Spatial structure, sampling design and scale in remotely-sensed imagery of a California savanna woodland

    NASA Technical Reports Server (NTRS)

    Mcgwire, K.; Friedl, M.; Estes, J. E.

    1993-01-01

    This article describes research related to sampling techniques for establishing linear relations between land surface parameters and remotely-sensed data. Predictive relations are estimated between percentage tree cover in a savanna environment and a normalized difference vegetation index (NDVI) derived from the Thematic Mapper sensor. Spatial autocorrelation in original measurements and regression residuals is examined using semi-variogram analysis at several spatial resolutions. Sampling schemes are then tested to examine the effects of autocorrelation on predictive linear models in cases of small sample sizes. Regression models between image and ground data are affected by the spatial resolution of analysis. Reducing the influence of spatial autocorrelation by enforcing minimum distances between samples may also improve empirical models which relate ground parameters to satellite data.
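
    Two ideas from this abstract can be sketched in Python: computing NDVI from near-infrared and red reflectance, and enforcing a minimum distance between sample locations to limit spatial autocorrelation. The greedy thinning rule, the function names, and the coordinates are assumptions for illustration; the abstract does not describe the sampling schemes in detail.

```python
import math

def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0

def thin_by_min_distance(points, min_dist):
    """Greedily keep points lying at least min_dist from every kept point."""
    kept = []
    for x, y in points:
        if all(math.hypot(x - kx, y - ky) >= min_dist for kx, ky in kept):
            kept.append((x, y))
    return kept

# Hypothetical sample coordinates along a transect (arbitrary map units).
candidates = [(0, 0), (1, 0), (3, 0), (3.5, 0), (6, 0)]
samples = thin_by_min_distance(candidates, min_dist=2)
```

    The greedy rule depends on the order of the candidate points, so a real design would normally randomize that order first.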

  5. Comparison of Two Methods Used to Model Shape Parameters of Pareto Distributions

    USGS Publications Warehouse

    Liu, C.; Charpentier, R.R.; Su, J.

    2011-01-01

    Two methods are compared for estimating the shape parameters of Pareto field-size (or pool-size) distributions for petroleum resource assessment. Both methods assume mature exploration in which most of the larger fields have been discovered. Both methods use the sizes of larger discovered fields to estimate the numbers and sizes of smaller fields: (1) the tail-truncated method uses a plot of field size versus size rank, and (2) the log-geometric method uses data binned in field-size classes and the ratios of adjacent bin counts. Simulation experiments were conducted using discovered oil and gas pool-size distributions from four petroleum systems in Alberta, Canada and using Pareto distributions generated by Monte Carlo simulation. The estimates of the shape parameters of the Pareto distributions, calculated by both the tail-truncated and log-geometric methods, generally stabilize where discovered pool numbers are greater than 100. However, with fewer than 100 discoveries, these estimates can vary greatly with each new discovery. The estimated shape parameters of the tail-truncated method are more stable and larger than those of the log-geometric method where the number of discovered pools is more than 100. Both methods, however, tend to underestimate the shape parameter. Monte Carlo simulation was also used to create sequences of discovered pool sizes by sampling from a Pareto distribution with a discovery process model using a defined exploration efficiency (in order to show how biased the sampling was in favor of larger fields being discovered first). A higher (more biased) exploration efficiency gives better estimates of the Pareto shape parameters. © 2011 International Association for Mathematical Geosciences.
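
    The idea of reading a Pareto shape parameter off a plot of size versus size rank can be sketched in Python as a least-squares fit of log(rank) on log(size), whose slope is minus the shape parameter for an ideal Pareto sample. This is a generic rank-size regression offered under that assumption; it is not claimed to be the authors' tail-truncated procedure, and the function name and test data are hypothetical.

```python
import math

def pareto_shape_from_ranks(sizes):
    """Estimate the Pareto shape as minus the slope of log(rank) vs log(size)."""
    ordered = sorted(sizes, reverse=True)          # rank 1 = largest field
    xs = [math.log(s) for s in ordered]
    ys = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Idealized sizes lying exactly on a Pareto rank-size line with shape 1.2.
true_shape, n_fields, min_size = 1.2, 50, 10.0
sizes = [min_size * (n_fields / r) ** (1 / true_shape) for r in range(1, n_fields + 1)]
estimate = pareto_shape_from_ranks(sizes)
```

    With real discovery data the points scatter around the line, and the estimate can shift noticeably with each new discovery when few fields are known, which is the instability the abstract reports for fewer than 100 pools.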

  6. Finite element model updating using the shadow hybrid Monte Carlo technique

    NASA Astrophysics Data System (ADS)

    Boulkaibet, I.; Mthembu, L.; Marwala, T.; Friswell, M. I.; Adhikari, S.

    2015-02-01

    Recent research in the field of finite element model updating (FEM) advocates the adoption of Bayesian analysis techniques for dealing with the uncertainties associated with these models. However, Bayesian formulations require the evaluation of the Posterior Distribution Function which may not be available in analytical form. This is the case in FEM updating. In such cases sampling methods can provide good approximations of the Posterior distribution when implemented in the Bayesian context. Markov Chain Monte Carlo (MCMC) algorithms are the most popular sampling tools used to sample probability distributions. However, the efficiency of these algorithms is affected by the complexity of the systems (the size of the parameter space). The Hybrid Monte Carlo (HMC) offers a very important MCMC approach to dealing with higher-dimensional complex problems. The HMC uses the molecular dynamics (MD) steps as the global Monte Carlo (MC) moves to reach areas of high probability where the gradient of the log-density of the Posterior acts as a guide during the search process. However, the acceptance rate of HMC is sensitive to the system size as well as the time step used to evaluate the MD trajectory. To overcome this limitation we propose the use of the Shadow Hybrid Monte Carlo (SHMC) algorithm. The SHMC algorithm is a modified version of the HMC algorithm and is designed to improve sampling for large system sizes and time steps. This is done by sampling from a modified Hamiltonian function instead of the normal Hamiltonian function. In this paper, the efficiency and accuracy of the SHMC method are tested on the updating of two real structures, an unsymmetrical H-shaped beam structure and a GARTEUR SM-AG19 structure, and are compared with the application of the HMC algorithm on the same structures.
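
    The mechanics of an HMC transition described in this abstract (molecular dynamics steps followed by an accept/reject test) can be sketched in Python for a one-dimensional toy target. This is the standard HMC scheme with a leapfrog integrator, not the SHMC algorithm proposed in the paper; the target density, step size, and function names are assumptions for illustration.

```python
import math
import random

def leapfrog(q, p, grad_u, step, n_steps):
    """Integrate Hamiltonian dynamics for H(q, p) = U(q) + p^2 / 2."""
    p -= 0.5 * step * grad_u(q)          # initial half step in momentum
    for i in range(n_steps):
        q += step * p                    # full step in position
        if i < n_steps - 1:
            p -= step * grad_u(q)        # full step in momentum
    p -= 0.5 * step * grad_u(q)          # final half step in momentum
    return q, p

def hmc_step(q, u, grad_u, step, n_steps, rng):
    """One HMC transition: sample momentum, integrate, accept or reject."""
    p0 = rng.gauss(0.0, 1.0)
    q_new, p_new = leapfrog(q, p0, grad_u, step, n_steps)
    h_old = u(q) + 0.5 * p0 ** 2
    h_new = u(q_new) + 0.5 * p_new ** 2
    # Metropolis test on the energy error introduced by the finite time step.
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return q_new, True
    return q, False

# Toy target (an assumption for illustration): standard normal, U(q) = q^2 / 2.
u = lambda q: 0.5 * q * q
grad_u = lambda q: q
rng = random.Random(0)
q, accepted = 0.0, 0
for _ in range(1000):
    q, ok = hmc_step(q, u, grad_u, step=0.2, n_steps=10, rng=rng)
    accepted += ok
acceptance_rate = accepted / 1000
```

    The accept/reject test depends on the energy error h_new - h_old, which grows with the time step; this is the sensitivity that motivates sampling from a modified Hamiltonian in SHMC.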

  7. Improved canopy reflectance modeling and scene inference through improved understanding of scene pattern

    NASA Technical Reports Server (NTRS)

    Franklin, Janet; Simonett, David

    1988-01-01

    The Li-Strahler reflectance model, driven by LANDSAT Thematic Mapper (TM) data, provided regional estimates of tree size and density within 20 percent of sampled values in two bioclimatic zones in West Africa. This model exploits tree geometry in an inversion technique to predict average tree size and density from reflectance data using a few simple parameters measured in the field (spatial pattern, shape, and size distribution of trees) and in the imagery (spectral signatures of scene components). Trees are treated as simply shaped objects, and multispectral reflectance of a pixel is assumed to be related only to the proportions of tree crown, shadow, and understory in the pixel. These, in turn, are a direct function of the number and size of trees, the solar illumination angle, and the spectral signatures of crown, shadow and understory. Given the variance in reflectance from pixel to pixel within a homogeneous area of woodland, caused by the variation in the number and size of trees, the model can be inverted to give estimates of average tree size and density. Because the inversion is sensitive to correct determination of component signatures, predictions are not accurate for small areas.

  8. A multi-scale study of Orthoptera species richness and human population size controlling for sampling effort

    NASA Astrophysics Data System (ADS)

    Cantarello, Elena; Steck, Claude E.; Fontana, Paolo; Fontaneto, Diego; Marini, Lorenzo; Pautasso, Marco

    2010-03-01

    Recent large-scale studies have shown that biodiversity-rich regions also tend to be densely populated areas. The most obvious explanation is that biodiversity and human beings tend to match the distribution of energy availability, environmental stability and/or habitat heterogeneity. However, the species-people correlation can also be an artefact, as more populated regions could show more species because of a more thorough sampling. Few studies have tested this sampling bias hypothesis. Using a newly collated dataset, we studied whether Orthoptera species richness is related to human population size in Italy’s regions (average area 15,000 km²) and provinces (2,900 km²). As expected, the observed number of species increases significantly with increasing human population size for both grain sizes, although the proportion of variance explained is minimal at the provincial level. However, variations in observed Orthoptera species richness are primarily associated with the available number of records, which is in turn well correlated with human population size (at least at the regional level). Estimated Orthoptera species richness (Chao2 and Jackknife) also increases with human population size both for regions and provinces. Both for regions and provinces, this increase is not significant when controlling for variation in area and number of records. Our study confirms the hypothesis that broad-scale human population-biodiversity correlations can in some cases be artefactual. More systematic sampling of less studied taxa such as invertebrates is necessary to ascertain whether biogeographical patterns persist when sampling effort is kept constant or included in models.
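
    The two richness estimators named in this abstract can be sketched in Python from an incidence matrix of sampling units by species. The bias-corrected form of Chao2 and the first-order jackknife are assumed here, since the abstract does not say which variants were used; the function names and the example matrix are hypothetical.

```python
def incidence_counts(matrix):
    """Number of sampling units in which each species (column) was recorded."""
    return [sum(1 for row in matrix if row[j]) for j in range(len(matrix[0]))]

def _observed_uniques_duplicates(matrix):
    counts = [c for c in incidence_counts(matrix) if c > 0]
    uniques = sum(1 for c in counts if c == 1)      # species found in one unit
    duplicates = sum(1 for c in counts if c == 2)   # species found in two units
    return len(counts), uniques, duplicates

def chao2(matrix):
    """Bias-corrected Chao2: S_obs + ((m-1)/m) * Q1(Q1-1) / (2(Q2+1))."""
    m = len(matrix)
    s_obs, q1, q2 = _observed_uniques_duplicates(matrix)
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))

def jackknife1(matrix):
    """First-order jackknife: S_obs + Q1 * (m-1)/m."""
    m = len(matrix)
    s_obs, q1, _ = _observed_uniques_duplicates(matrix)
    return s_obs + q1 * (m - 1) / m

# Hypothetical presence/absence data: 4 sampling units (rows) x 5 species.
records = [
    [1, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]
estimates = (chao2(records), jackknife1(records))
```

    Both estimators add to the observed count a term driven by species recorded only once or twice, which is why they respond to the number of records as well as to true richness.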

  9. Cluster randomised crossover trials with binary data and unbalanced cluster sizes: application to studies of near-universal interventions in intensive care.

    PubMed

    Forbes, Andrew B; Akram, Muhammad; Pilcher, David; Cooper, Jamie; Bellomo, Rinaldo

    2015-02-01

    Cluster randomised crossover trials have been utilised in recent years in the health and social sciences. Methods for analysis have been proposed; however, for binary outcomes, these have received little assessment of their appropriateness. In addition, methods for determination of sample size are currently limited to balanced cluster sizes both between clusters and between periods within clusters. This article aims to extend this work to unbalanced situations and to evaluate the properties of a variety of methods for analysis of binary data, with a particular focus on the setting of potential trials of near-universal interventions in intensive care to reduce in-hospital mortality. We derive a formula for sample size estimation for unbalanced cluster sizes, and apply it to the intensive care setting to demonstrate the utility of the cluster crossover design. We conduct a numerical simulation of the design in the intensive care setting and for more general configurations, and we assess the performance of three cluster summary estimators and an individual-data estimator based on binomial-identity-link regression. For settings similar to the intensive care scenario involving large cluster sizes and small intra-cluster correlations, the sample size formulae developed and analysis methods investigated are found to be appropriate, with the unweighted cluster summary method performing well relative to the more optimal but more complex inverse-variance weighted method. More generally, we find that the unweighted and cluster-size-weighted summary methods perform well, with the relative efficiency of each largely determined systematically from the study design parameters. Performance of individual-data regression is adequate with small cluster sizes but becomes inefficient for large, unbalanced cluster sizes. 
    When outcome prevalences are 6% or less and the within-cluster-within-period correlation is 0.05 or larger, all methods display sub-nominal confidence interval coverage, with the less prevalent the outcome the worse the coverage. As with all simulation studies, conclusions are limited to the configurations studied. We confined attention to detecting intervention effects on an absolute risk scale using marginal models and did not explore properties of binary random effects models. Cluster crossover designs with binary outcomes can be analysed using simple cluster summary methods, and sample size in unbalanced cluster size settings can be determined using relatively straightforward formulae. However, caution needs to be applied in situations with low prevalence outcomes and moderate to high intra-cluster correlations. © The Author(s) 2014.

  10. Modeling the development of written language

    PubMed Central

    Puranik, Cynthia S.; Foorman, Barbara; Foster, Elizabeth; Wilson, Laura Gehron; Tschinkel, Erika; Kantor, Patricia Thatcher

    2011-01-01

    Alternative models of the structure of individual and developmental differences of written composition and handwriting fluency were tested using confirmatory factor analysis of writing samples provided by first- and fourth-grade students. For both groups, a five-factor model provided the best fit to the data. Four of the factors represented aspects of written composition: macro-organization (use of top sentence and number and ordering of ideas), productivity (number and diversity of words used), complexity (mean length of T-unit and syntactic density), and spelling and punctuation. The fifth factor represented handwriting fluency. Handwriting fluency was correlated with written composition factors at both grades. The magnitude of developmental differences between first grade and fourth grade expressed as effect sizes varied for variables representing the five constructs: large effect sizes were found for productivity and handwriting fluency variables; moderate effect sizes were found for complexity and macro-organization variables; and minimal effect sizes were found for spelling and punctuation variables. PMID:22228924

  11. Developing GIS-based eastern equine encephalitis vector-host models in Tuskegee, Alabama.

    PubMed

    Jacob, Benjamin G; Burkett-Cadena, Nathan D; Luvall, Jeffrey C; Parcak, Sarah H; McClure, Christopher J W; Estep, Laura K; Hill, Geoffrey E; Cupp, Eddie W; Novak, Robert J; Unnasch, Thomas R

    2010-02-24

    A site near Tuskegee, Alabama was examined for vector-host activities of eastern equine encephalomyelitis virus (EEEV). Land cover maps of the study site were created in ArcInfo 9.2 from QuickBird data encompassing visible and near-infrared (NIR) band information (0.45 to 0.72 μm) acquired July 15, 2008. Georeferenced mosquito and bird sampling sites, and their associated land cover attributes from the study site, were overlaid onto the satellite data. SAS 9.1.4 was used to explore univariate statistics and to generate regression models using the field and remote-sampled mosquito and bird data. Regression models indicated that Culex erracticus and Northern Cardinals were the most abundant mosquito and bird species, respectively. Spatial linear prediction models were then generated in Geostatistical Analyst Extension of ArcGIS 9.2. Additionally, a model of the study site was generated, based on a Digital Elevation Model (DEM), using ArcScene extension of ArcGIS 9.2. For total mosquito count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 5.041 km, nugget of 6.325 km, lag size of 7.076 km, and range of 31.43 km, using 12 lags. For total adult Cx. erracticus count, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 5.764 km, nugget of 6.114 km, lag size of 7.472 km, and range of 32.62 km, using 12 lags. For the total bird count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 4.998 km, nugget of 5.413 km, lag size of 7.549 km and range of 35.27 km, using 12 lags. For the Northern Cardinal count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 6.387 km, nugget of 5.935 km, lag size of 8.549 km and a range of 41.38 km, using 12 lags. 
    Results of the DEM analyses indicated a statistically significant inverse linear relationship between total sampled mosquito data and elevation (R2 = -.4262; p < .0001), with a standard deviation (SD) of 10.46, and total sampled bird data and elevation (R2 = -.5111; p < .0001), with a SD of 22.97. DEM statistics also indicated a significant inverse linear relationship between total sampled Cx. erracticus data and elevation (R2 = -.4711; p < .0001), with a SD of 11.16, and the total sampled Northern Cardinal data and elevation (R2 = -.5831; p < .0001), SD of 11.42. These data demonstrate that GIS/remote sensing models and spatial statistics can capture space-varying functional relationships between field-sampled mosquito and bird parameters for determining risk for EEEV transmission.

  12. Accurate in situ measurement of complex refractive index and particle size in intralipid emulsions

    NASA Astrophysics Data System (ADS)

    Dong, Miao L.; Goyal, Kashika G.; Worth, Bradley W.; Makkar, Sorab S.; Calhoun, William R.; Bali, Lalit M.; Bali, Samir

    2013-08-01

    A first accurate measurement of the complex refractive index in an intralipid emulsion is demonstrated, and thereby the average scatterer particle size using standard Mie scattering calculations is extracted. Our method is based on measurement and modeling of the reflectance of a divergent laser beam from the sample surface. In the absence of any definitive reference data for the complex refractive index or particle size in highly turbid intralipid emulsions, we base our claim of accuracy on the fact that our work offers several critically important advantages over previously reported attempts. First, our measurements are in situ in the sense that they do not require any sample dilution, thus eliminating dilution errors. Second, our theoretical model does not employ any fitting parameters other than the two quantities we seek to determine, i.e., the real and imaginary parts of the refractive index, thus eliminating ambiguities arising from multiple extraneous fitting parameters. Third, we fit the entire reflectance-versus-incident-angle data curve instead of focusing on only the critical angle region, which is just a small subset of the data. Finally, despite our use of highly scattering opaque samples, our experiment uniquely satisfies a key assumption behind the Mie scattering formalism, namely, no multiple scattering occurs. Further proof of our method's validity is given by the fact that our measured particle size finds good agreement with the value obtained by dynamic light scattering.

  13. Accurate in situ measurement of complex refractive index and particle size in intralipid emulsions.

    PubMed

    Dong, Miao L; Goyal, Kashika G; Worth, Bradley W; Makkar, Sorab S; Calhoun, William R; Bali, Lalit M; Bali, Samir

    2013-08-01

    A first accurate measurement of the complex refractive index in an intralipid emulsion is demonstrated, and thereby the average scatterer particle size using standard Mie scattering calculations is extracted. Our method is based on measurement and modeling of the reflectance of a divergent laser beam from the sample surface. In the absence of any definitive reference data for the complex refractive index or particle size in highly turbid intralipid emulsions, we base our claim of accuracy on the fact that our work offers several critically important advantages over previously reported attempts. First, our measurements are in situ in the sense that they do not require any sample dilution, thus eliminating dilution errors. Second, our theoretical model does not employ any fitting parameters other than the two quantities we seek to determine, i.e., the real and imaginary parts of the refractive index, thus eliminating ambiguities arising from multiple extraneous fitting parameters. Third, we fit the entire reflectance-versus-incident-angle data curve instead of focusing on only the critical angle region, which is just a small subset of the data. Finally, despite our use of highly scattering opaque samples, our experiment uniquely satisfies a key assumption behind the Mie scattering formalism, namely, no multiple scattering occurs. Further proof of our method's validity is given by the fact that our measured particle size finds good agreement with the value obtained by dynamic light scattering.

  14. A Re-Evaluation of the Size of the White Shark (Carcharodon carcharias) Population off California, USA

    PubMed Central

    Burgess, George H.; Bruce, Barry D.; Cailliet, Gregor M.; Goldman, Kenneth J.; Grubbs, R. Dean; Lowe, Christopher G.; MacNeil, M. Aaron; Mollet, Henry F.; Weng, Kevin C.; O'Sullivan, John B.

    2014-01-01

    White sharks are highly migratory and segregate by sex, age and size. Unlike marine mammals, they neither surface to breathe nor frequent haul-out sites, hindering generation of abundance data required to estimate population size. A recent tag-recapture study used photographic identifications of white sharks at two aggregation sites to estimate abundance in “central California” at 219 mature and sub-adult individuals. They concluded this represented approximately one-half of the total abundance of mature and sub-adult sharks in the entire eastern North Pacific Ocean (ENP). This low estimate generated great concern within the conservation community, prompting petitions for governmental endangered species designations. We critically examine that study and find violations of model assumptions that, when considered in total, lead to population underestimates. We also use a Bayesian mixture model to demonstrate that the inclusion of transient sharks, characteristic of white shark aggregation sites, would substantially increase abundance estimates for the adults and sub-adults in the surveyed sub-population. Using a dataset obtained from the same sampling locations and widely accepted demographic methodology, our analysis indicates a minimum all-life stages population size of >2000 individuals in the California subpopulation is required to account for the number and size range of individual sharks observed at the two sampled sites. Even accounting for methodological and conceptual biases, an extrapolation of these data to estimate the white shark population size throughout the ENP is inappropriate. The true ENP white shark population size is likely several-fold greater as both our study and the original published estimate exclude non-aggregating sharks and those that independently aggregate at other important ENP sites. 
Accurately estimating the central California and ENP white shark population size requires methodologies that account for biases introduced by sampling a limited number of sites and that account for all life history stages across the species' range of habitats. PMID:24932483

  15. A re-evaluation of the size of the white shark (Carcharodon carcharias) population off California, USA.

    PubMed

    Burgess, George H; Bruce, Barry D; Cailliet, Gregor M; Goldman, Kenneth J; Grubbs, R Dean; Lowe, Christopher G; MacNeil, M Aaron; Mollet, Henry F; Weng, Kevin C; O'Sullivan, John B

    2014-01-01

    White sharks are highly migratory and segregate by sex, age and size. Unlike marine mammals, they neither surface to breathe nor frequent haul-out sites, hindering generation of abundance data required to estimate population size. A recent tag-recapture study used photographic identifications of white sharks at two aggregation sites to estimate abundance in "central California" at 219 mature and sub-adult individuals. They concluded this represented approximately one-half of the total abundance of mature and sub-adult sharks in the entire eastern North Pacific Ocean (ENP). This low estimate generated great concern within the conservation community, prompting petitions for governmental endangered species designations. We critically examine that study and find violations of model assumptions that, when considered in total, lead to population underestimates. We also use a Bayesian mixture model to demonstrate that the inclusion of transient sharks, characteristic of white shark aggregation sites, would substantially increase abundance estimates for the adults and sub-adults in the surveyed sub-population. Using a dataset obtained from the same sampling locations and widely accepted demographic methodology, our analysis indicates a minimum all-life stages population size of >2000 individuals in the California subpopulation is required to account for the number and size range of individual sharks observed at the two sampled sites. Even accounting for methodological and conceptual biases, an extrapolation of these data to estimate the white shark population size throughout the ENP is inappropriate. The true ENP white shark population size is likely several-fold greater as both our study and the original published estimate exclude non-aggregating sharks and those that independently aggregate at other important ENP sites. 
Accurately estimating the central California and ENP white shark population size requires methodologies that account for biases introduced by sampling a limited number of sites and that account for all life history stages across the species' range of habitats.

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saxena, Shailendra K., E-mail: phd1211512@iiti.ac.in; Sahu, Gayatri; Sagdeo, Pankaj R.

    Quantum confinement effect has been studied in cheese-like silicon nanostructures (Ch-SiNS) fabricated by metal-induced chemical etching using different etching times. Scanning electron microscopy is used for the morphological study of these Ch-SiNS. A visible photoluminescence (PL) emission is observed from the samples under UV excitation at room temperature due to the quantum confinement effect. The average size of the silicon nanostructures (SiNS) present in the samples has been estimated by the bond polarizability model using Raman spectroscopy, from the red-shift observed from the SiNS as compared to their bulk counterpart. The size of the SiNS present in the samples decreases as the etching time increases from 45 to 75 minutes.

  17. Dual-window dual-bandwidth spectroscopic optical coherence tomography metric for qualitative scatterer size differentiation in tissues.

    PubMed

    Tay, Benjamin Chia-Meng; Chow, Tzu-Hao; Ng, Beng-Koon; Loh, Thomas Kwok-Seng

    2012-09-01

    This study investigates the autocorrelation bandwidths of dual-window (DW) optical coherence tomography (OCT) k-space scattering profile of different-sized microspheres and their correlation to scatterer size. A dual-bandwidth spectroscopic metric defined as the ratio of the 10% to 90% autocorrelation bandwidths is found to change monotonically with microsphere size and gives the best contrast enhancement for scatterer size differentiation in the resulting spectroscopic image. A simulation model supports the experimental results and revealed a tradeoff between the smallest detectable scatterer size and the maximum scatterer size in the linear range of the dual-window dual-bandwidth (DWDB) metric, which depends on the choice of the light source optical bandwidth. Spectroscopic OCT (SOCT) images of microspheres and tonsil tissue samples based on the proposed DWDB metric showed clear differentiation between different-sized scatterers as compared to those derived from conventional short-time Fourier transform metrics. The DWDB metric significantly improves the contrast in SOCT imaging and can aid the visualization and identification of dissimilar scatterer size in a sample. Potential applications include the early detection of cell nuclear changes in tissue carcinogenesis, the monitoring of healing tendons, and cell proliferation in tissue scaffolds.

  18. Improving risk classification of critical illness with biomarkers: a simulation study

    PubMed Central

    Seymour, Christopher W.; Cooke, Colin R.; Wang, Zheyu; Kerr, Kathleen F.; Yealy, Donald M.; Angus, Derek C.; Rea, Thomas D.; Kahn, Jeremy M.; Pepe, Margaret S.

    2012-01-01

    Purpose: Optimal triage of patients at risk of critical illness requires accurate risk prediction, yet few data exist on the performance criteria required of a potential biomarker to be clinically useful. Materials and Methods: We studied an adult cohort of non-arrest, non-trauma emergency medical services encounters transported to a hospital from 2002–2006. We simulated hypothetical biomarkers increasingly associated with critical illness during hospitalization, and determined the biomarker strength and sample size necessary to improve risk classification beyond a best clinical model. Results: Of 57,647 encounters, 3,121 (5.4%) were hospitalized with critical illness and 54,526 (94.6%) without critical illness. The addition of a moderate strength biomarker (odds ratio = 3.0 for critical illness) to a clinical model improved discrimination (c-statistic 0.85 vs. 0.8, p<0.01), reclassification (net reclassification improvement = 0.15, 95% CI: 0.13, 0.18), and increased the proportion of cases in the highest risk category by +8.6% (95% CI: 7.5, 10.8%). Introducing correlation between the biomarker and physiological variables in the clinical risk score did not modify the results. Statistically significant changes in net reclassification required a sample size of at least 1000 subjects. Conclusions: Clinical models for triage of critical illness could be significantly improved by incorporating biomarkers, yet substantial sample sizes and biomarker strength may be required. PMID:23566734
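
    The net reclassification improvement reported above can be illustrated with a minimal sketch. It assumes the common categorical definition, NRI = [P(up | event) - P(down | event)] + [P(down | non-event) - P(up | non-event)]; the function name and the toy data are hypothetical and are not taken from the study.

```python
def net_reclassification_improvement(old_cat, new_cat, event):
    """Categorical NRI from old/new risk categories (ordered ints) and event flags."""
    up_e = down_e = n_e = 0   # movements among subjects with the event
    up_n = down_n = n_n = 0   # movements among subjects without the event
    for old, new, ev in zip(old_cat, new_cat, event):
        if ev:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    # Upward moves are "correct" for events, downward moves for non-events.
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Hypothetical toy data: risk categories 0 (low) to 2 (high).
old = [0, 1, 1, 2, 0, 1, 2, 1]
new = [1, 1, 2, 2, 0, 0, 1, 1]
evt = [1, 1, 1, 1, 0, 0, 0, 0]
nri = net_reclassification_improvement(old, new, evt)
```

    In this toy example two of four events move up and two of four non-events move down, so each component contributes 0.5.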

  19. A Note on Structural Equation Modeling Estimates of Reliability

    ERIC Educational Resources Information Center

    Yang, Yanyun; Green, Samuel B.

    2010-01-01

    Reliability can be estimated using structural equation modeling (SEM). Two potential problems with this approach are that estimates may be unstable with small sample sizes and biased with misspecified models. A Monte Carlo study was conducted to investigate the quality of SEM estimates of reliability by themselves and relative to coefficient…

  20. A Structural Equation Model for Predicting Business Student Performance

    ERIC Educational Resources Information Center

    Pomykalski, James J.; Dion, Paul; Brock, James L.

    2008-01-01

    In this study, the authors developed a structural equation model that accounted for 79% of the variability of a student's final grade point average by using a sample size of 147 students. The model is based on student grades in 4 foundational business courses: introduction to business, macroeconomics, statistics, and using databases. Educators and…

  1. Spatial sampling considerations of the CERES (Clouds and Earth Radiant Energy System) instrument

    NASA Astrophysics Data System (ADS)

    Smith, G. L.; Manalo-Smith, Natividdad; Priestley, Kory

    2014-10-01

    The CERES (Clouds and Earth Radiant Energy System) instrument is a scanning radiometer with three channels for measuring Earth radiation budget. At present CERES models are operating aboard the Terra, Aqua and Suomi/NPP spacecraft and flights of CERES instruments are planned for the JPSS-1 spacecraft and its successors. CERES scans from one limb of the Earth to the other and back. The footprint size grows with distance from nadir simply due to geometry, so that the size of the smallest features which can be resolved from the data increases and spatial sampling errors increase with nadir angle. This paper presents an analysis of the effect of nadir angle on spatial sampling errors of the CERES instrument. The analysis is performed in the Fourier domain. Spatial sampling errors are created by smoothing (blurring) of features which are the size of the footprint and smaller, and by inadequate sampling, which causes aliasing errors. These spatial sampling errors are computed in terms of the system transfer function, which is the Fourier transform of the point response function, the spacing of data points and the spatial spectrum of the radiance field.

  2. Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment

    PubMed Central

    Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu

    2013-01-01

    The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability. PMID:23861920

  3. Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA

    PubMed Central

    Kelly, Brendan J.; Gross, Robert; Bittinger, Kyle; Sherrill-Mix, Scott; Lewis, James D.; Collman, Ronald G.; Bushman, Frederic D.; Li, Hongzhe

    2015-01-01

    Motivation: The variation in community composition between microbiome samples, termed beta diversity, can be measured by pairwise distance based on either presence–absence or quantitative species abundance data. PERMANOVA, a permutation-based extension of multivariate analysis of variance to a matrix of pairwise distances, partitions within-group and between-group distances to permit assessment of the effect of an exposure or intervention (grouping factor) upon the sampled microbiome. Within-group distance and exposure/intervention effect size must be accurately modeled to estimate statistical power for a microbiome study that will be analyzed with pairwise distances and PERMANOVA. Results: We present a framework for PERMANOVA power estimation tailored to marker-gene microbiome studies that will be analyzed by pairwise distances, which includes: (i) a novel method for distance matrix simulation that permits modeling of within-group pairwise distances according to pre-specified population parameters; (ii) a method to incorporate effects of different sizes within the simulated distance matrix; (iii) a simulation-based method for estimating PERMANOVA power from simulated distance matrices; and (iv) an R statistical software package that implements the above. Matrices of pairwise distances can be efficiently simulated to satisfy the triangle inequality and incorporate group-level effects, which are quantified by the adjusted coefficient of determination, omega-squared (ω²). From simulated distance matrices, available PERMANOVA power or necessary sample size can be estimated for a planned microbiome study. Availability and implementation: http://github.com/brendankelly/micropower. Contact: brendank@mail.med.upenn.edu or hongzhe@upenn.edu PMID:25819674
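
    The partitioning of within-group and between-group distances described above can be sketched in a few lines. This is a minimal illustration of the standard PERMANOVA pseudo-F statistic with a label-permutation p-value, not the micropower package itself; the function names are hypothetical, and the omega-squared expression is the usual ANOVA-style estimator, assumed here rather than taken from the package.

```python
import random


def pseudo_f(dist, labels):
    """Return (pseudo-F, omega-squared) from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]) / len(idx)
    ss_between = ss_total - ss_within
    ms_within = ss_within / (n - a)
    f_stat = (ss_between / (a - 1)) / ms_within
    omega2 = (ss_between - (a - 1) * ms_within) / (ss_total + ms_within)
    return f_stat, omega2


def permanova_p_value(dist, labels, n_perm=999, seed=0):
    """Permutation p-value: share of label shuffles with pseudo-F at least as large."""
    rng = random.Random(seed)
    f_obs, _ = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        f_perm, _ = pseudo_f(dist, shuffled)
        if f_perm >= f_obs - 1e-9:  # small tolerance for floating-point ties
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: six samples on a line, two well-separated groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(x - y) for y in points] for x in points]
labels = ["a", "a", "a", "b", "b", "b"]
f_obs, omega2 = pseudo_f(dist, labels)
p_value = permanova_p_value(dist, labels)
```

    Repeating such a test over many simulated distance matrices and counting rejections would give a power estimate for a given sample size, which is the general idea the abstract describes.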

  4. Application of a time-dependent coalescence process for inferring the history of population size changes from DNA sequence data.

    PubMed

    Polanski, A; Kimmel, M; Chakraborty, R

    1998-05-12

    Distribution of pairwise differences of nucleotides from data on a sample of DNA sequences from a given segment of the genome has been used in the past to draw inferences about the past history of population size changes. However, all earlier methods assume a given model of population size changes (such as sudden expansion), parameters of which (e.g., time and amplitude of expansion) are fitted to the observed distributions of nucleotide differences among pairwise comparisons of all DNA sequences in the sample. Our theory indicates that for any time-dependent population size, N(tau) (in which time tau is counted backward from present), a time-dependent coalescence process yields the distribution, p(tau), of the time of coalescence between two DNA sequences randomly drawn from the population. Prediction of p(tau) and N(tau) requires the use of a reverse Laplace transform known to be unstable. Nevertheless, simulated data obtained from three models of monotone population change (stepwise, exponential, and logistic) indicate that the pattern of a past population size change leaves its signature on the pattern of DNA polymorphism. Application of the theory to the published mtDNA sequences indicates that the current mtDNA sequence variation is not inconsistent with a logistic growth of the human population.
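
    As a hedged illustration of the relationship described above, the coalescence-time density for two sequences under a time-varying population size is commonly written as follows. The sketch assumes that τ is measured in generations, that N(τ) is the effective number of gene copies, and that mutations follow an infinite-sites model with rate μ per sequence per generation; the symbols σ, μ, K and s are introduced here for illustration.

```latex
% Density of the coalescence time of two randomly drawn sequences
p(\tau) = \frac{1}{N(\tau)} \exp\!\left( -\int_0^{\tau} \frac{d\sigma}{N(\sigma)} \right)

% Given \tau, the number of pairwise differences K is Poisson with mean 2\mu\tau,
% so its probability generating function is a Laplace transform of p:
E\!\left[ s^{K} \right] = \int_0^{\infty} p(\tau)\, e^{-2\mu\tau (1 - s)} \, d\tau
```

    Under these assumptions, recovering p(τ), and hence N(τ), from the observed mismatch distribution amounts to inverting a Laplace transform, which is why the abstract notes the instability of the inversion.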

  5. Effects of field plot size on prediction accuracy of aboveground biomass in airborne laser scanning-assisted inventories in tropical rain forests of Tanzania.

    PubMed

    Mauya, Ernest William; Hansen, Endre Hofstad; Gobakken, Terje; Bollandsås, Ole Martin; Malimbwi, Rogers Ernest; Næsset, Erik

    2015-12-01

    Airborne laser scanning (ALS) has recently emerged as a promising tool to acquire auxiliary information for improving aboveground biomass (AGB) estimation in sample-based forest inventories. Under design-based and model-assisted inferential frameworks, the estimation relies on a model that relates the auxiliary ALS metrics to AGB estimated on ground plots. The size of the field plots has been identified as one source of model uncertainty because of the so-called boundary effects, which increase with decreasing plot size. Recent research in tropical forests has aimed to quantify the boundary effects on model prediction accuracy, but evidence of the consequences for the final AGB estimates is lacking. In this study we analyzed the effect of field plot size on model prediction accuracy and its implication when used in a model-assisted inferential framework. The results showed that the prediction accuracy of the model improved as the plot size increased. The adjusted R² increased from 0.35 to 0.74 while the relative root mean square error decreased from 63.6 to 29.2%. Indicators of boundary effects were identified and confirmed to have significant effects on the model residuals. Variance estimates of model-assisted mean AGB, relative to corresponding variance estimates of pure field-based AGB, decreased with increasing plot size in the range from 200 to 3000 m². The variance ratio of field-based estimates relative to model-assisted variance ranged from 1.7 to 7.7. This study showed that the relative improvement in precision of AGB estimation when increasing field-plot size was greater for an ALS-assisted inventory than for a pure field-based inventory.
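
    The two accuracy measures quoted above can be computed as in the following minimal sketch. It assumes the usual definitions (adjusted R² penalised by the number of predictors, and RMSE expressed as a percentage of the observed mean); the function names and the four plot values are hypothetical.

```python
import math


def adjusted_r2(observed, predicted, n_predictors):
    """Adjusted coefficient of determination for a model with n_predictors terms."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


def relative_rmse_percent(observed, predicted):
    """Root mean square error as a percentage of the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100 * rmse / (sum(observed) / n)


# Hypothetical plot-level AGB values (Mg/ha) and model predictions.
agb_observed = [100, 200, 300, 400]
agb_predicted = [110, 190, 310, 390]
adj_r2 = adjusted_r2(agb_observed, agb_predicted, n_predictors=1)
rel_rmse = relative_rmse_percent(agb_observed, agb_predicted)
```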

  6. Structural brain development between childhood and adulthood: Convergence across four longitudinal samples.

    PubMed

    Mills, Kathryn L; Goddings, Anne-Lise; Herting, Megan M; Meuwese, Rosa; Blakemore, Sarah-Jayne; Crone, Eveline A; Dahl, Ronald E; Güroğlu, Berna; Raznahan, Armin; Sowell, Elizabeth R; Tamnes, Christian K

    2016-11-01

    Longitudinal studies including brain measures acquired through magnetic resonance imaging (MRI) have enabled population models of human brain development, crucial for our understanding of typical development as well as neurodevelopmental disorders. Brain development in the first two decades generally involves early cortical grey matter volume (CGMV) increases followed by decreases, and monotonic increases in cerebral white matter volume (CWMV). However, inconsistencies regarding the precise developmental trajectories call into question the comparability of samples. This issue can be addressed by conducting a comprehensive study across multiple datasets from diverse populations. Here, we present replicable models for gross structural brain development between childhood and adulthood (ages 8-30 years) by repeating analyses in four separate longitudinal samples (391 participants; 852 scans). In addition, we address how accounting for global measures of cranial/brain size affects these developmental trajectories. First, we found evidence for continued development of both intracranial volume (ICV) and whole brain volume (WBV) through adolescence, albeit following distinct trajectories. Second, our results indicate that CGMV is at its highest in childhood, decreasing steadily through the second decade with deceleration in the third decade, while CWMV increases until mid-to-late adolescence before decelerating. Importantly, we show that accounting for cranial/brain size affects models of regional brain development, particularly with respect to sex differences. Our results increase confidence in our knowledge of the pattern of brain changes during adolescence, reduce concerns about discrepancies across samples, and suggest some best practices for statistical control of cranial volume and brain size in future studies.

  7. Statistical power calculations for mixed pharmacokinetic study designs using a population approach.

    PubMed

    Kloprogge, Frank; Simpson, Julie A; Day, Nicholas P J; White, Nicholas J; Tarning, Joel

    2014-09-01

    Simultaneous modelling of dense and sparse pharmacokinetic data is possible with a population approach. To determine the number of individuals required to detect the effect of a covariate, simulation-based power calculation methodologies can be employed. The Monte Carlo Mapped Power method (a simulation-based power calculation methodology using the likelihood ratio test) was extended in the current study to perform sample size calculations for mixed pharmacokinetic studies (i.e. both sparse and dense data collection). A workflow guiding an easy and straightforward pharmacokinetic study design, considering also the cost-effectiveness of alternative study designs, was used in this analysis. Initially, data were simulated for a hypothetical drug and then for the anti-malarial drug, dihydroartemisinin. Two datasets (sampling design A: dense; sampling design B: sparse) were simulated using a pharmacokinetic model that included a binary covariate effect and subsequently re-estimated using (1) the same model and (2) a model not including the covariate effect in NONMEM 7.2. Power calculations were performed for varying numbers of patients with sampling designs A and B. Study designs with statistical power >80% were selected and further evaluated for cost-effectiveness. The simulation studies of the hypothetical drug and the anti-malarial drug dihydroartemisinin demonstrated that the simulation-based power calculation methodology, based on the Monte Carlo Mapped Power method, can be utilised to evaluate and determine the sample size of mixed (part sparsely and part densely sampled) study designs. The developed method can contribute to the design of robust and efficient pharmacokinetic studies.
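
    The general idea of a simulation-based power calculation with a likelihood ratio test can be sketched as follows. This is a brute-force simulate-and-re-estimate illustration on a simple normal model with a binary covariate, not the Monte Carlo Mapped Power method or a NONMEM workflow; the function names, effect size and sample sizes are hypothetical.

```python
import math
import random

CHI2_CRIT_1DF = 3.841458820694124  # 95th percentile of chi-square with 1 df


def lrt_statistic(group_a, group_b):
    """-2 log likelihood ratio: normal model with vs. without a group effect."""
    pooled = group_a + group_b
    n = len(pooled)
    grand = sum(pooled) / n
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    rss_reduced = sum((x - grand) ** 2 for x in pooled)  # no covariate effect
    rss_full = (sum((x - mean_a) ** 2 for x in group_a)
                + sum((x - mean_b) ** 2 for x in group_b))  # with covariate effect
    return n * math.log(rss_reduced / rss_full)


def simulated_power(n_per_group, effect, sd=1.0, n_sims=500, seed=1):
    """Fraction of simulated studies in which the LRT detects the covariate effect."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        b = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        if lrt_statistic(a, b) > CHI2_CRIT_1DF:
            rejections += 1
    return rejections / n_sims


power_null = simulated_power(n_per_group=20, effect=0.0)   # approximates the type I error
power_large = simulated_power(n_per_group=20, effect=1.5)  # large covariate effect
```

    Scanning `n_per_group` upward until the returned value exceeds 0.8 would mirror the ">80% power" selection step described in the abstract, under the simplifying assumptions stated here.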

  8. Effects of growth rate, size, and light availability on tree survival across life stages: a demographic analysis accounting for missing values and small sample sizes.

    PubMed

    Moustakas, Aristides; Evans, Matthew R

    2015-02-28

    Plant survival is a key factor in forest dynamics and survival probabilities often vary across life stages. Studies specifically aimed at assessing tree survival are unusual and so data initially designed for other purposes often need to be used; such data are more likely to contain errors than data collected for this specific purpose. We investigate the survival rates of ten tree species in a dataset designed to monitor growth rates. As some individuals were not included in the census at some time points we use capture-mark-recapture methods both to allow us to account for missing individuals, and to estimate relocation probabilities. Growth rates, size, and light availability were included as covariates in the model predicting survival rates. The study demonstrates that tree mortality is best described as constant between years and size-dependent at early life stages and size independent at later life stages for most species of UK hardwood. We have demonstrated that even with a twenty-year dataset it is possible to discern variability both between individuals and between species. Our work illustrates the potential utility of the method applied here for calculating plant population dynamics parameters in time replicated datasets with small sample sizes and missing individuals without any loss of sample size, and including explanatory covariates.

  9. EFFECT OF ENVIRONMENT ON GALAXIES' MASS-SIZE DISTRIBUTION: UNVEILING THE TRANSITION FROM OUTSIDE-IN TO INSIDE-OUT EVOLUTION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cappellari, Michele

    2013-11-20

    The distribution of galaxies on the mass-size plane as a function of redshift or environment is a powerful test for galaxy formation models. Here we use integral-field stellar kinematics to interpret the variation of the mass-size distribution in two galaxy samples spanning extreme environmental densities. The samples are both identically and nearly mass-selected (stellar mass M_* ≳ 6 × 10^9 M_☉) and volume-limited. The first consists of nearby field galaxies from the ATLAS^3D parent sample. The second consists of galaxies in the Coma Cluster (Abell 1656), one of the densest environments for which good, resolved spectroscopy can be obtained. The mass-size distribution in the dense environment differs from the field one in two ways: (1) spiral galaxies are replaced by bulge-dominated disk-like fast-rotator early-type galaxies (ETGs), which follow the same mass-size relation and have the same mass distribution as in the field sample; (2) the slow-rotator ETGs are segregated in mass from the fast rotators, with their size increasing proportionally to their mass. A transition between the two processes appears around the stellar mass M_crit ≈ 2 × 10^11 M_☉. We interpret this as evidence for bulge growth (outside-in evolution) and bulge-related environmental quenching dominating at low masses, with little influence from merging. In contrast, significant dry mergers (inside-out evolution) and halo-related quenching drive the mass and size growth at the high-mass end. The existence of these two processes naturally explains the diverse size evolution of galaxies of different masses and the separability of mass and environmental quenching.

  10. Thermal conductivity enhancement and sedimentation reduction of magnetorheological fluids with nano-sized Cu and Al additives

    NASA Astrophysics Data System (ADS)

    Rahim, M. S. A.; Ismail, I.; Choi, S. B.; Azmi, W. H.; Aqida, S. N.

    2017-11-01

    This work presents enhanced material characteristics of smart magnetorheological (MR) fluids obtained by utilizing nano-sized metal particles. In particular, the work focuses on enhancing the thermal conductivity and reducing the sedimentation rate of MR fluids, both of which are crucial properties for their applications. To achieve this goal, a series of MR fluid samples is prepared using carbonyl iron particles (CIP) and hydraulic oil, with added nano-sized particles of copper (Cu), aluminium (Al), and fumed silica (SiO2). Subsequently, the thermal conductivity is measured with a thermal property analyser, and the sedimentation of the MR fluids is measured using glass tubes without any excitation for a long time. The measured thermal conductivity is then compared with theoretical models such as the Maxwell model at various CIP concentrations. In addition, to show the effectiveness of the MR fluids synthesized in this work, the thermal conductivity of the commercially available MRF-132DG is measured and compared with those of the prepared samples. It is observed that the thermal conductivity of the samples is much better than that of MRF-132DG, showing a 148% increase with 40 vol% of the magnetic particles. It is also observed that the sedimentation rate of the prepared MR fluid samples is less than that of MRF-132DG, showing a 9% reduction with 40 vol% of the magnetic particles. A mixture-optimized sample with high conductivity and low sedimentation was also obtained. The magnetization of this sample showed an enhancement of 70.5% when compared to MRF-132DG. Furthermore, the shear yield stress of the sample was also increased, both with and without the influence of a magnetic field.

  11. Implementing Generalized Additive Models to Estimate the Expected Value of Sample Information in a Microsimulation Model: Results of Three Case Studies.

    PubMed

    Rabideau, Dustin J; Pei, Pamela P; Walensky, Rochelle P; Zheng, Amy; Parker, Robert A

    2018-02-01

    The expected value of sample information (EVSI) can help prioritize research but its application is hampered by computational infeasibility, especially for complex models. We investigated an approach by Strong and colleagues to estimate EVSI by applying generalized additive models (GAM) to results generated from a probabilistic sensitivity analysis (PSA). For 3 potential HIV prevention and treatment strategies, we estimated life expectancy and lifetime costs using the Cost-effectiveness of Preventing AIDS Complications (CEPAC) model, a complex patient-level microsimulation model of HIV progression. We fitted a GAM (a flexible regression model that estimates the functional form as part of the model fitting process) to the incremental net monetary benefits obtained from the CEPAC PSA. For each case study, we calculated the expected value of partial perfect information (EVPPI) using both the conventional nested Monte Carlo approach and the GAM approach. EVSI was calculated using the GAM approach. For all 3 case studies, the GAM approach consistently gave similar estimates of EVPPI compared with the conventional approach. The EVSI behaved as expected: it increased and converged to EVPPI for larger sample sizes. For each case study, generating the PSA results for the GAM approach required 3 to 4 days on a shared cluster, after which EVPPI and EVSI across a range of sample sizes were evaluated in minutes. The conventional approach required approximately 5 weeks for the EVPPI calculation alone. Estimating EVSI using the GAM approach with results from a PSA dramatically reduced the time required to conduct a computationally intense project, which would otherwise have been impractical. Using the GAM approach, we can efficiently provide policy makers with EVSI estimates, even for complex patient-level microsimulation models.
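
    The regression-based estimate of EVPPI from PSA output can be sketched as follows. To stay self-contained, the sketch swaps the GAM for an ordinary least-squares line, which is a deliberate simplification and only adequate when the incremental net benefit is roughly linear in the parameter of interest; the function names and the simulated PSA draws are hypothetical and unrelated to the CEPAC model.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares intercept and slope (stand-in for a GAM smoother)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def evppi_two_strategies(theta, inb):
    """EVPPI for one parameter from PSA draws of incremental net benefit (new vs. old)."""
    intercept, slope = fit_line(theta, inb)
    # Fitted values approximate E[INB | theta] for each PSA draw.
    fitted = [intercept + slope * t for t in theta]
    value_with_info = sum(max(0.0, f) for f in fitted) / len(fitted)
    value_without_info = max(0.0, sum(fitted) / len(fitted))
    return value_with_info - value_without_info


# Hypothetical PSA: INB depends on theta plus noise from other parameters.
rng = random.Random(42)
theta_draws = [rng.gauss(0.0, 1.0) for _ in range(5000)]
inb_draws = [1000.0 * t + rng.gauss(0.0, 500.0) for t in theta_draws]
evppi = evppi_two_strategies(theta_draws, inb_draws)
```

    In this toy setup the true value is about 1000/sqrt(2π), roughly 399, so the estimate should land near that figure; a single regression on existing PSA output replaces the nested Monte Carlo loop, which is the source of the time savings the abstract reports.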

  12. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining.

    PubMed

    Hero, Alfred O; Rajaratnam, Bala

    2016-01-01

    When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data". Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exascale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.

  13. Update to core reporting practices in structural equation modeling.

    PubMed

    Schreiber, James B

    This paper is a technical update to "Core Reporting Practices in Structural Equation Modeling." As such, the content covered in this paper includes sample size, missing data, specification and identification of models, estimation method choices, fit and residual concerns, nested, alternative, and equivalent models, and unique issues within the SEM family of techniques. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. The Effect of Small Sample Size on Two-Level Model Estimates: A Review and Illustration

    ERIC Educational Resources Information Center

    McNeish, Daniel M.; Stapleton, Laura M.

    2016-01-01

    Multilevel models are an increasingly popular method to analyze data that originate from a clustered or hierarchical structure. To effectively utilize multilevel models, one must have an adequately large number of clusters; otherwise, some model parameters will be estimated with bias. The goals for this paper are to (1) raise awareness of the…

  15. Whether the weather drives patterns of endemic amphibian chytridiomycosis: a pathogen proliferation approach.

    PubMed

    Murray, Kris A; Skerratt, Lee F; Garland, Stephen; Kriticos, Darren; McCallum, Hamish

    2013-01-01

    The pandemic amphibian disease chytridiomycosis often exhibits strong seasonality in both prevalence and disease-associated mortality once it becomes endemic. One hypothesis that could explain this temporal pattern is that simple weather-driven pathogen proliferation (population growth) is a major driver of chytridiomycosis disease dynamics. Despite various elaborations of this hypothesis in the literature for explaining amphibian declines (e.g., the chytrid thermal-optimum hypothesis) it has not been formally tested on infection patterns in the wild. In this study we developed a simple process-based model to simulate the growth of the pathogen Batrachochytrium dendrobatidis (Bd) under varying weather conditions to provide an a priori test of a weather-linked pathogen proliferation hypothesis for endemic chytridiomycosis. We found strong support for several predictions of the proliferation hypothesis when applied to our model species, Litoria pearsoniana, sampled across multiple sites and years: the weather-driven simulations of pathogen growth potential (represented as a growth index in the 30 days prior to sampling; GI30) were positively related to both the prevalence and intensity of Bd infections, which were themselves strongly and positively correlated. In addition, a machine-learning classifier achieved ~72% success in classifying positive qPCR results when utilising just three informative predictors: 1) GI30, 2) frog body size and 3) rain on the day of sampling. Hence, while intrinsic traits of the individuals sampled (species, size, sex) and nuisance sampling variables (rainfall when sampling) influenced infection patterns obtained when sampling via qPCR, our results also strongly suggest that weather-linked pathogen proliferation plays a key role in the infection dynamics of endemic chytridiomycosis in our study system. Predictive applications of the model include surveillance design, outbreak preparedness and response, climate change scenario modelling and the interpretation of historical patterns of amphibian decline.

  16. A scenario tree model for the Canadian Notifiable Avian Influenza Surveillance System and its application to estimation of probability of freedom and sample size determination.

    PubMed

    Christensen, Jette; Stryhn, Henrik; Vallières, André; El Allaki, Farouk

    2011-05-01

    In 2008, Canada designed and implemented the Canadian Notifiable Avian Influenza Surveillance System (CanNAISS) with six surveillance activities in a phased-in approach. CanNAISS was a surveillance system because it had more than one surveillance activity or component in 2008: passive surveillance; pre-slaughter surveillance; and voluntary enhanced notifiable avian influenza surveillance. Our objectives were to give a short overview of two active surveillance components in CanNAISS; describe the CanNAISS scenario tree model and its application to estimation of probability of populations being free of NAI virus infection and sample size determination. Our data from the pre-slaughter surveillance component included diagnostic test results from 6296 serum samples representing 601 commercial chicken and turkey farms collected from 25 August 2008 to 29 January 2009. In addition, we included data from a sub-population of farms with high biosecurity standards: 36,164 samples from 55 farms sampled repeatedly over the 24 months study period from January 2007 to December 2008. All submissions were negative for Notifiable Avian Influenza (NAI) virus infection. We developed the CanNAISS scenario tree model, so that it will estimate the surveillance component sensitivity and the probability of a population being free of NAI at the 0.01 farm-level and 0.3 within-farm-level prevalences. We propose that a general model, such as the CanNAISS scenario tree model, may have a broader application than more detailed models that require disease specific input parameters, such as relative risk estimates. Crown Copyright © 2011. Published by Elsevier B.V. All rights reserved.
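
    As a rough illustration of the sample size side of a freedom-from-infection calculation, the sketch below uses the simple binomial detection formula rather than the CanNAISS scenario tree model itself, which the abstract does not specify in detail. It assumes an effectively infinite population, independent samples, and perfect specificity; the function names and the default test sensitivity are hypothetical choices made for illustration.

```python
import math


def detection_probability(n, design_prevalence, test_sensitivity=1.0):
    """Probability that at least one of n sampled units tests positive.

    Assumes independent sampling from a large population in which the
    infection is present at the design prevalence (simple binomial model).
    """
    p_positive = design_prevalence * test_sensitivity
    return 1.0 - (1.0 - p_positive) ** n


def samples_for_detection(design_prevalence, test_sensitivity=1.0, confidence=0.95):
    """Smallest n whose detection probability reaches the target confidence."""
    if not 0.0 < design_prevalence <= 1.0:
        raise ValueError("design_prevalence must be in (0, 1]")
    if not 0.0 < test_sensitivity <= 1.0:
        raise ValueError("test_sensitivity must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    p_positive = design_prevalence * test_sensitivity
    if p_positive >= 1.0:
        return 1  # every sampled unit is certain to test positive
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_positive))


if __name__ == "__main__":
    # Design prevalences quoted in the abstract: 0.01 (farm level), 0.3 (within farm).
    print(samples_for_detection(0.01))
    print(samples_for_detection(0.3))
```

    A scenario tree model refines this building block by weighting risk groups and chaining the farm-level and within-farm steps, so the numbers from this sketch should be read only as an order-of-magnitude guide.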

  17. Sample size and classification error for Bayesian change-point models with unlabelled sub-groups and incomplete follow-up.

    PubMed

    White, Simon R; Muniz-Terrera, Graciela; Matthews, Fiona E

    2018-05-01

    Many medical (and ecological) processes involve the change of shape, whereby one trajectory changes into another trajectory at a specific time point. There has been little investigation into the study design needed to investigate these models. We consider the class of fixed effect change-point models with an underlying shape comprising two joined linear segments, also known as broken-stick models. We extend this model to include two sub-groups with different trajectories at the change-point, a change and no change class, and also include a missingness model to account for individuals with incomplete follow-up. Through a simulation study, we consider the relationship of sample size to the estimates of the underlying shape, the existence of a change-point, and the classification-error of sub-group labels. We use a Bayesian framework to account for the missing labels, and the analysis of each simulation is performed using standard Markov chain Monte Carlo techniques. Our simulation study is inspired by cognitive decline as measured by the Mini-Mental State Examination, where our extended model is appropriate due to the commonly observed mixture of individuals within studies who do or do not exhibit accelerated decline. We find that even for studies of modest size (n = 500, with 50 individuals observed past the change-point) in the fixed effect setting, a change-point can be detected and reliably estimated across a range of observation-errors.

  18. Nearest neighbor density ratio estimation for large-scale applications in astronomy

    NASA Astrophysics Data System (ADS)

    Kremer, J.; Gieseke, F.; Steenstrup Pedersen, K.; Igel, C.

    2015-09-01

    In astronomical applications of machine learning, the distribution of objects used for building a model is often different from the distribution of the objects the model is later applied to. This is known as sample selection bias, which is a major challenge for statistical inference as one can no longer assume that the labeled training data are representative. To address this issue, one can re-weight the labeled training patterns to match the distribution of unlabeled data that are available already in the training phase. There are many examples in practice where this strategy yielded good results, but estimating the weights reliably from a finite sample is challenging. We consider an efficient nearest neighbor density ratio estimator that can exploit large samples to increase the accuracy of the weight estimates. To solve the problem of choosing the right neighborhood size, we propose to use cross-validation on a model selection criterion that is unbiased under covariate shift. The resulting algorithm is our method of choice for density ratio estimation when the feature space dimensionality is small and sample sizes are large. The approach is simple and, because of the model selection, robust. We empirically find that it is on a par with established kernel-based methods on relatively small regression benchmark datasets. However, when applied to large-scale photometric redshift estimation, our approach outperforms the state-of-the-art.

  19. Bayesian Modal Estimation of the Four-Parameter Item Response Model in Real, Realistic, and Idealized Data Sets.

    PubMed

    Waller, Niels G; Feuerstahler, Leah

    2017-01-01

    In this study, we explored item and person parameter recovery of the four-parameter model (4PM) in over 24,000 real, realistic, and idealized data sets. In the first analyses, we fit the 4PM and three alternative models to data from three Minnesota Multiphasic Personality Inventory-Adolescent form factor scales using Bayesian modal estimation (BME). Our results indicated that the 4PM fits these scales better than simpler item response theory (IRT) models. Next, using the parameter estimates from these real data analyses, we estimated 4PM item parameters in 6,000 realistic data sets to establish minimum sample size requirements for accurate item and person parameter recovery. Using a factorial design that crossed discrete levels of item parameters, sample size, and test length, we also fit the 4PM to an additional 18,000 idealized data sets to extend our parameter recovery findings. Our combined results demonstrated that 4PM item parameters and parameter functions (e.g., item response functions) can be accurately estimated using BME in moderate to large samples (N ≥ 5,000) and person parameters can be accurately estimated in smaller samples (N ≥ 1,000). In the supplemental files, we report annotated [Formula: see text] code that shows how to estimate 4PM item and person parameters in [Formula: see text] (Chalmers, 2012).

  20. Performance of digital RGB reflectance color extraction for plaque lesion

    NASA Astrophysics Data System (ADS)

    Hashim, Hadzli; Taib, Mohd Nasir; Jailani, Rozita; Sulaiman, Saadiah; Baba, Roshidah

    2005-01-01

    Several clinical psoriasis lesion groups have been studied for digital RGB color feature extraction. Previous works have used sample sizes that included all the outliers lying beyond the standard deviation factors from the peak histograms. This paper describes the statistical performance of the RGB model with and without removing these outliers. Plaque lesion is compared with other types of psoriasis. The statistical tests are compared with respect to three sample sizes: the original 90 samples, the first size reduction by removing outliers beyond 2 standard deviations (2SD), and the second size reduction by removing outliers beyond 1 standard deviation (1SD). Quantification of data images through the normal/direct and differential variants of the conventional reflectance method is considered. Performance is assessed by observing the error plots with 95% confidence intervals and the findings of the inference T-tests applied. The statistical test outcomes have shown that the B component for the conventional differential method can be used to distinctively classify plaque from the other psoriasis groups, consistent with the error plot findings, with an improvement in p-value greater than 0.5.

  1. A novel measure of effect size for mediation analysis.

    PubMed

    Lachowicz, Mark J; Preacher, Kristopher J; Kelley, Ken

    2018-06-01

    Mediation analysis has become one of the most popular statistical methods in the social sciences. However, many currently available effect size measures for mediation have limitations that restrict their use to specific mediation models. In this article, we develop a measure of effect size that addresses these limitations. We show how modification of a currently existing effect size measure results in a novel effect size measure with many desirable properties. We also derive an expression for the bias of the sample estimator for the proposed effect size measure and propose an adjusted version of the estimator. We present a Monte Carlo simulation study conducted to examine the finite sampling properties of the adjusted and unadjusted estimators, which shows that the adjusted estimator is effective at recovering the true value it estimates. Finally, we demonstrate the use of the effect size measure with an empirical example. We provide freely available software so that researchers can immediately implement the methods we discuss. Our developments here extend the existing literature on effect sizes and mediation by developing a potentially useful method of communicating the magnitude of mediation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  2. Calculation for tensile strength and fracture toughness of granite with three kinds of grain sizes using three-point-bending test

    PubMed Central

    Yu, Miao; Wei, Chenhui; Niu, Leilei; Li, Shaohua; Yu, Yongjun

    2018-01-01

    Tensile strength and fracture toughness, important parameters of rock for engineering applications, are difficult to measure. This paper therefore selected three kinds of granite samples (grain sizes = 1.01 mm, 2.12 mm and 3 mm), combined physical experiments and numerical simulation (RFPA-DIP version) to conduct three-point-bending (3-p-b) tests with different notches, and introduced an acoustic emission monitoring system to analyze the fracture mechanism around the notch tips. To study the effects of grain size on the tensile strength and toughness of rock samples, a modified fracture model was established linking the fictitious crack to the grain size so that the microstructure of the specimens and fictitious crack growth can be considered together. The fractal method was introduced to represent the microstructure of the three kinds of granite and used to determine the length of the fictitious crack. It is a simple and novel method to calculate the tensile strength and fracture toughness directly. Finally, the theoretical model was verified by comparison with the numerical experiments by calculating the nominal strength σn and maximum loads Pmax. PMID:29596422

  3. Calculation for tensile strength and fracture toughness of granite with three kinds of grain sizes using three-point-bending test.

    PubMed

    Yu, Miao; Wei, Chenhui; Niu, Leilei; Li, Shaohua; Yu, Yongjun

    2018-01-01

    Tensile strength and fracture toughness, important parameters of rock for engineering applications, are difficult to measure. This paper therefore selected three kinds of granite samples (grain sizes = 1.01 mm, 2.12 mm and 3 mm), combined physical experiments and numerical simulation (RFPA-DIP version) to conduct three-point-bending (3-p-b) tests with different notches, and introduced an acoustic emission monitoring system to analyze the fracture mechanism around the notch tips. To study the effects of grain size on the tensile strength and toughness of rock samples, a modified fracture model was established linking the fictitious crack to the grain size so that the microstructure of the specimens and fictitious crack growth can be considered together. The fractal method was introduced to represent the microstructure of the three kinds of granite and used to determine the length of the fictitious crack. It is a simple and novel method to calculate the tensile strength and fracture toughness directly. Finally, the theoretical model was verified by comparison with the numerical experiments by calculating the nominal strength σn and maximum loads Pmax.

  4. Analytical template protection performance and maximum key size given a Gaussian-modeled biometric source

    NASA Astrophysics Data System (ADS)

    Kelkboom, Emile J. C.; Breebaart, Jeroen; Buhan, Ileana; Veldhuis, Raymond N. J.

    2010-04-01

    Template protection techniques are used within biometric systems in order to protect the stored biometric template against privacy and security threats. A great portion of template protection techniques are based on extracting a key from or binding a key to a biometric sample. The achieved protection depends on the size of the key and its closeness to being random. In the literature it can be observed that there is a large variation on the reported key lengths at similar classification performance of the same template protection system, even when based on the same biometric modality and database. In this work we determine the analytical relationship between the system performance and the theoretical maximum key size given a biometric source modeled by parallel Gaussian channels. We consider the case where the source capacity is evenly distributed across all channels and the channels are independent. We also determine the effect of the parameters such as the source capacity, the number of enrolment and verification samples, and the operating point selection on the maximum key size. We show that a trade-off exists between the privacy protection of the biometric system and its convenience for its users.

  5. Numerical modeling of the tensile strength of a biological granular aggregate: Effect of the particle size distribution

    NASA Astrophysics Data System (ADS)

    Heinze, Karsta; Frank, Xavier; Lullien-Pellerin, Valérie; George, Matthieu; Radjai, Farhang; Delenne, Jean-Yves

    2017-06-01

    Wheat grains can be considered as a natural cemented granular material. They are milled under high forces to produce food products such as flour. The major part of the grain is the so-called starchy endosperm. It contains stiff starch granules, which show a multi-modal size distribution, and a softer protein matrix that surrounds the granules. Experimental milling studies and numerical simulations are going hand in hand to better understand the fragmentation behavior of this biological material and to improve milling performance. We present a numerical study of the effect of granule size distribution on the strength of such a cemented granular material. Samples of bi-modal starch granule size distribution were created and submitted to uniaxial tension, using a peridynamics method. We show that, when compared to the effects of starch-protein interface adhesion and voids, the granule size distribution has a limited effect on the samples' yield stress.

  6. Combining the boundary shift integral and tensor-based morphometry for brain atrophy estimation

    NASA Astrophysics Data System (ADS)

    Michalkiewicz, Mateusz; Pai, Akshay; Leung, Kelvin K.; Sommer, Stefan; Darkner, Sune; Sørensen, Lauge; Sporring, Jon; Nielsen, Mads

    2016-03-01

    Brain atrophy from structural magnetic resonance images (MRIs) is widely used as an imaging surrogate marker for Alzheimer's disease. Its utility has been limited due to the large degree of variance and subsequently high sample size estimates. The only consistent and reasonably powerful atrophy estimation method has been the boundary shift integral (BSI). In this paper, we first propose a tensor-based morphometry (TBM) method to measure voxel-wise atrophy that we combine with BSI. The combined model decreases the sample size estimates significantly when compared to BSI and TBM alone.

  7. Developing optimum sample size and multistage sampling plans for Lobesia botrana (Lepidoptera: Tortricidae) larval infestation and injury in northern Greece.

    PubMed

    Ifoulis, A A; Savopoulou-Soultani, M

    2006-10-01

    The purpose of this research was to quantify the spatial pattern and develop a sampling program for larvae of Lobesia botrana Denis and Schiffermüller (Lepidoptera: Tortricidae), an important vineyard pest in northern Greece. Taylor's power law and Iwao's patchiness regression were used to model the relationship between the mean and the variance of larval counts. Analysis of covariance was carried out, separately for infestation and injury, with combined second and third generation data, for vine and half-vine sample units. Common regression coefficients were estimated to permit use of the sampling plan over a wide range of conditions. Optimum sample sizes for infestation and injury, at three levels of precision, were developed. An investigation of a multistage sampling plan with a nested analysis of variance showed that if the goal of sampling is focusing on larval infestation, three grape clusters should be sampled in a half-vine; if the goal of sampling is focusing on injury, then two grape clusters per half-vine are recommended.
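
    To make the link between Taylor's power law and an optimum sample size concrete, the sketch below applies the standard fixed-precision argument. If the variance follows s² = a·m^b and precision D is defined as the standard error divided by the mean, then n = s²/(D²·m²) = a·m^(b−2)/D². The coefficient values in the usage lines are placeholders, not the estimates reported in the paper, and the function name is hypothetical.

```python
import math


def taylor_sample_size(mean_density, a, b, precision):
    """Sample units needed for a fixed relative precision under Taylor's power law.

    Assumes variance = a * mean**b and precision D = standard error / mean,
    which gives n = a * mean**(b - 2) / D**2.
    """
    if mean_density <= 0 or a <= 0 or precision <= 0:
        raise ValueError("mean_density, a and precision must be positive")
    return math.ceil(a * mean_density ** (b - 2) / precision ** 2)


if __name__ == "__main__":
    # Placeholder coefficients (a = 2.0, b = 1.5), chosen only for illustration.
    for density in (0.5, 1.0, 2.0, 5.0):
        print(density, taylor_sample_size(density, a=2.0, b=1.5, precision=0.25))
```

    With b below 2 the required sample size falls as the mean density rises, which is why such plans are usually tabulated across a range of densities and precision levels.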

  8. Verification of ARMA identification for modelling temporal correlation of GPS observations using the toolbox ARMASA

    NASA Astrophysics Data System (ADS)

    Luo, Xiaoguang; Mayer, Michael; Heck, Bernhard

    2010-05-01

    One essential deficiency of the stochastic model used in many GNSS (Global Navigation Satellite Systems) software products consists in neglecting temporal correlation of GNSS observations. Analysing appropriately detrended time series of observation residuals resulting from GPS (Global Positioning System) data processing, the temporal correlation behaviour of GPS observations can be sufficiently described by means of so-called autoregressive moving average (ARMA) processes. Using the toolbox ARMASA which is available free of charge in MATLAB® Central (open exchange platform for the MATLAB® and SIMULINK® user community), a well-fitting time series model can be identified automatically in three steps. Firstly, AR, MA, and ARMA models are computed up to some user-specified maximum order. Subsequently, for each model type, the best-fitting model is selected using the combined (for AR processes) resp. generalised (for MA and ARMA processes) information criterion. The final model identification among the best-fitting AR, MA, and ARMA models is performed based on the minimum prediction error characterising the discrepancies between the given data and the fitted model. The ARMA coefficients are computed using Burg's maximum entropy algorithm (for AR processes), Durbin's first (for MA processes) and second (for ARMA processes) methods, respectively. This paper verifies the performance of the automated ARMA identification using the toolbox ARMASA. For this purpose, a representative data base is generated by means of ARMA simulation with respect to sample size, correlation level, and model complexity. The model error defined as a transform of the prediction error is used as measure for the deviation between the true and the estimated model. The results of the study show that the recognition rates of underlying true processes increase with increasing sample sizes and decrease with rising model complexity. Considering large sample sizes, the true underlying processes can be correctly recognised for nearly 80% of the analysed data sets. Additionally, the model errors of first-order AR resp. MA processes converge clearly more rapidly to the corresponding asymptotical values than those of high-order ARMA processes.

  9. Effect of Microstructural Interfaces on the Mechanical Response of Crystalline Metallic Materials

    NASA Astrophysics Data System (ADS)

    Aitken, Zachary H.

    Advances in nano-scale mechanical testing have brought about progress in the understanding of physical phenomena in materials and a measure of control in the fabrication of novel materials. In contrast to bulk materials that display size-invariant mechanical properties, sub-micron metallic samples show a critical dependence on sample size. The strength of nano-scale single crystalline metals is well-described by a power-law function, σ ∝ D^(−n), where D is a critical sample size and n is an experimentally fit positive exponent. This relationship is attributed to source-driven plasticity and demonstrates a strengthening as the decreasing sample size begins to limit the size and number of dislocation sources. A full understanding of this size-dependence is complicated by the presence of microstructural features such as interfaces that can compete with the dominant dislocation-based deformation mechanisms. In this thesis, the effects of microstructural features such as grain boundaries and anisotropic crystallinity on nano-scale metals are investigated through uniaxial compression testing. We find that nano-sized Cu covered by a hard coating displays a Bauschinger effect and the emergence of this behavior can be explained through a simple dislocation-based analytic model. Al nano-pillars containing a single vertically-oriented coincident site lattice grain boundary are found to show similar deformation to single-crystalline nano-pillars with slip traces passing through the grain boundary. With increasing tilt angle of the grain boundary from the pillar axis, we observe a transition from dislocation-dominated deformation to grain boundary sliding. Crystallites are observed to shear along the grain boundary and molecular dynamics simulations reveal a mechanism of atomic migration that accommodates boundary sliding. We conclude with an analysis of the effects of inherent crystal anisotropy and alloying on the mechanical behavior of the Mg alloy, AZ31. Through comparison to pure Mg, we show that the size effect dominates the strength of samples below 10 μm and that differences in the size effect between hexagonal slip systems are due to the inherent crystal anisotropy, suggesting that the fundamental mechanism of the size effect in these slip systems is the same.
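
    The power-law size effect described above is usually quantified by fitting a straight line in log-log space, since a strength proportional to D raised to the power −n implies log(strength) = log(A) − n·log(D). The sketch below shows one way to recover the exponent by ordinary least squares. The data in the usage lines are synthetic and the function name is hypothetical; nothing here reproduces the measurements of the thesis.

```python
import math


def fit_power_law(sizes, strengths):
    """Least-squares fit of strength = A * size**(-n) in log-log space.

    Returns (n, A). Assumes all sizes and strengths are positive and that
    at least two distinct sizes are supplied.
    """
    if len(sizes) != len(strengths) or len(sizes) < 2:
        raise ValueError("need two or more paired observations")
    xs = [math.log(d) for d in sizes]
    ys = [math.log(s) for s in strengths]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


if __name__ == "__main__":
    # Synthetic pillar diameters (nm) and strengths (MPa) generated with n = 0.6.
    diameters = [100.0, 200.0, 400.0, 800.0]
    strengths = [1000.0 * d ** -0.6 for d in diameters]
    exponent, prefactor = fit_power_law(diameters, strengths)
    print(round(exponent, 3), round(prefactor, 1))
```

    Fitting in log space weights relative rather than absolute errors, which is a common choice for strength data spanning several orders of magnitude in sample size, although a nonlinear fit on the raw scale is an equally valid alternative.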

  10. Behavior and sensitivity of an optimal tree diameter growth model under data uncertainty

    Treesearch

    Don C. Bragg

    2005-01-01

    Using loblolly pine, shortleaf pine, white oak, and northern red oak as examples, this paper considers the behavior of potential relative increment (PRI) models of optimal tree diameter growth under data uncertainty. Recommendations on initial sample size and the PRI iterative curve fitting process are provided. Combining different state inventories prior to PRI model...

  11. Relative Performance of Rescaling and Resampling Approaches to Model Chi Square and Parameter Standard Error Estimation in Structural Equation Modeling.

    ERIC Educational Resources Information Center

    Nevitt, Johnathan; Hancock, Gregory R.

    Though common structural equation modeling (SEM) methods are predicated upon the assumption of multivariate normality, applied researchers often find themselves with data clearly violating this assumption and without sufficient sample size to use distribution-free estimation methods. Fortunately, promising alternatives are being integrated into…

  12. The Impact of Various Class-Distinction Features on Model Selection in the Mixture Rasch Model

    ERIC Educational Resources Information Center

    Choi, In-Hee; Paek, Insu; Cho, Sun-Joo

    2017-01-01

    The purpose of the current study is to examine the performance of four information criteria (Akaike's information criterion [AIC], corrected AIC [AICC], Bayesian information criterion [BIC], sample-size adjusted BIC [SABIC]) for detecting the correct number of latent classes in the mixture Rasch model through simulations. The simulation study…

  13. A model for estimating understory vegetation response to fertilization and precipitation in loblolly pine plantations

    Treesearch

    Curtis L. VanderSchaaf; Ryan W. McKnight; Thomas R. Fox; H. Lee Allen

    2010-01-01

    A model form is presented, where the model contains regressors selected for inclusion based on biological rationale, to predict how fertilization, precipitation amounts, and overstory stand density affect understory vegetation biomass. Due to time, economic, and logistic constraints, datasets of large sample sizes generally do not exist for understory vegetation. Thus...

  14. Metapopulation models for historical inference.

    PubMed

    Wakeley, John

    2004-04-01

    The genealogical process for a sample from a metapopulation, in which local populations are connected by migration and can undergo extinction and subsequent recolonization, is shown to have a relatively simple structure in the limit as the number of populations in the metapopulation approaches infinity. The result, which is an approximation to the ancestral behaviour of samples from a metapopulation with a large number of populations, is the same as that previously described for other metapopulation models, namely that the genealogical process is closely related to Kingman's unstructured coalescent. The present work considers a more general class of models that includes two kinds of extinction and recolonization, and the possibility that gamete production precedes extinction. In addition, following other recent work, this result for a metapopulation divided into many populations is shown to hold both for finite population sizes and in the usual diffusion limit, which assumes that population sizes are large. Examples illustrate when the usual diffusion limit is appropriate and when it is not. Some shortcomings and extensions of the model are considered, and the relevance of such models to understanding human history is discussed.

  15. Structural characterizations of pure SnS and In-doped SnS thin films using isotropic and anisotropic models

    NASA Astrophysics Data System (ADS)

    Kafashan, Hosein

    2018-04-01

    An electrochemical route has been employed to prepare pure SnS and indium-doped SnS thin films. Six samples, including undoped SnS and In-doped SnS thin films, were deposited on fluorine-doped tin oxide (FTO) glass substrates. An aqueous solution containing SnCl2 and Na2S2O3 was used as the primary electrolyte. Different In-doped SnS samples were prepared by adding different amounts of 1 mM InCl3 solution to the first electrolyte. The applied potential (E), time of deposition (t), pH and bath temperature (T) were kept at −1 V, 30 min, 2.1 and 60 °C, respectively. For all samples, all the deposition parameters except the In-dopant concentration were the same. After preparation, X-ray diffraction (XRD), field emission scanning electron microscopy (FESEM) with an energy dispersive X-ray analyzer (EDX) attachment, atomic force microscopy (AFM), and transmission electron microscopy (TEM) were used to determine structural properties of the as-deposited films. XRD patterns revealed that the synthesized undoped and In-doped SnS thin films were crystallized in the orthorhombic structure. The shape of SnS crystals was spherical in the TEM image. X-ray peak broadening studies were done by applying Scherrer's method, Williamson-Hall (W–H) models (including uniform deformation model (UDM), uniform strain deformation model (UDSM), and uniform deformation energy density model (UDEDM)), and the size-strain plot (SSP) method. Using these techniques, the crystallite size and the lattice strains have been predicted. There was good agreement between the particle size obtained by the W–H and SSP methods and the TEM image.

  16. The Power of Low Back Pain Trials: A Systematic Review of Power, Sample Size, and Reporting of Sample Size Calculations Over Time, in Trials Published Between 1980 and 2012.

    PubMed

    Froud, Robert; Rajendran, Dévan; Patel, Shilpa; Bright, Philip; Bjørkli, Tom; Eldridge, Sandra; Buchbinder, Rachelle; Underwood, Martin

    2017-06-01

    A systematic review of nonspecific low back pain trials published between 1980 and 2012. The aims were to explore what proportion of trials have been powered to detect different bands of effect size; whether there is evidence that sample size in low back pain trials has been increasing; what proportion of trial reports include a sample size calculation; and whether the likelihood of reporting sample size calculations has increased. Clinical trials should have a sample size sufficient to detect a minimally important difference for a given power and type I error rate. An underpowered trial is one in which the probability of type II error is too high. Meta-analyses do not mitigate underpowered trials. Reviewers independently abstracted data on sample size at point of analysis, whether a sample size calculation was reported, and year of publication. Descriptive analyses were used to explore ability to detect effect sizes, and regression analyses to explore the relationship between sample size, or reporting sample size calculations, and time. We included 383 trials. One-third were powered to detect a standardized mean difference of less than 0.5, and 5% were powered to detect less than 0.3. The average sample size was 153 people, which increased only slightly (∼4 people/yr) from 1980 to 2000, and declined slightly (∼4.5 people/yr) from 2005 to 2011 (P < 0.00005). Sample size calculations were reported in 41% of trials. The odds of reporting a sample size calculation (compared to not reporting one) increased until 2005 and then declined (Equation is included in full-text article.). Sample sizes in back pain trials and the reporting of sample size calculations may need to be increased. It may be justifiable to power a trial to detect only large effects in the case of novel interventions.
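    To make the link between sample size and detectable effect concrete, the sketch below approximates the power of a two-arm trial for a given standardized mean difference. It is an illustration, not the review's method: it assumes equal allocation, a two-sided test at α = 0.05, and a normal approximation that ignores the negligible chance of rejecting in the wrong direction. The figure of 76 participants per arm is simply the reported average of 153 split in two, which is an assumption made here.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(smd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    smd       : standardized mean difference to detect
    n_per_arm : participants per arm (equal allocation assumed)
    alpha     : two-sided type I error rate
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = smd * sqrt(n_per_arm / 2)
    return z.cdf(noncentrality - z_crit)


# Average trial size of 153 split evenly (assumed), for the two effect-size bands cited.
for smd in (0.5, 0.3):
    print(f"SMD {smd}: power ~ {approx_power(smd, 76):.2f}")
```

    Under these assumptions a trial of average size has reasonable power for a standardized mean difference of 0.5 but falls below 50% for 0.3, in line with the abstract's observation that few trials were powered for smaller effects.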

  17. Split-plot microarray experiments: issues of design, power and sample size.

    PubMed

    Tsai, Pi-Wen; Lee, Mei-Ling Ting

    2005-01-01

    This article focuses on microarray experiments with two or more factors in which treatment combinations of the factors corresponding to the samples paired together onto arrays are not completely random. A main effect of one (or more) factor(s) is confounded with arrays (the experimental blocks). This is called a split-plot microarray experiment. We utilise an analysis of variance (ANOVA) model to assess differentially expressed genes for between-array and within-array comparisons that are generic under a split-plot microarray experiment. Instead of standard t- or F-test statistics that rely on mean square errors of the ANOVA model, we use a robust method, referred to as 'a pooled percentile estimator', to identify genes that are differentially expressed across different treatment conditions. We illustrate the design and analysis of split-plot microarray experiments based on a case application described by Jin et al. A brief discussion of power and sample size for split-plot microarray experiments is also presented.

  18. Reduction of Sample Size Requirements by Bilateral Versus Unilateral Research Designs in Animal Models for Cartilage Tissue Engineering

    PubMed Central

    Orth, Patrick; Zurakowski, David; Alini, Mauro; Cucchiarini, Magali

    2013-01-01

    Advanced tissue engineering approaches for articular cartilage repair in the knee joint rely on translational animal models. In these investigations, cartilage defects may be established either in one joint (unilateral design) or in both joints of the same animal (bilateral design). We hypothesized that a lower intraindividual variability following the bilateral strategy would reduce the number of required joints. Standardized osteochondral defects were created in the trochlear groove of 18 rabbits. In 12 animals, defects were produced unilaterally (unilateral design; n=12 defects), while defects were created bilaterally in 6 animals (bilateral design; n=12 defects). After 3 weeks, osteochondral repair was evaluated histologically applying an established grading system. Based on intra- and interindividual variabilities, required sample sizes for the detection of discrete differences in the histological score were determined for both study designs (α=0.05, β=0.20). Coefficients of variation (%CV) of the total histological score values were 1.9-fold increased following the unilateral design when compared with the bilateral approach (26 versus 14%CV). The resulting numbers of joints needed to treat were always higher for the unilateral design, resulting in an up to 3.9-fold increase in the required number of experimental animals. This effect was most pronounced for the detection of small-effect sizes and estimating large standard deviations. The data underline the possible benefit of bilateral study designs for the decrease of sample size requirements for certain investigations in articular cartilage research. These findings might also be transferred to other scoring systems, defect types, or translational animal models in the field of cartilage tissue engineering. PMID:23510128
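    The effect of variability on required sample size can be sketched with a standard normal-approximation formula for comparing two group means, n per group = 2·(z₁₋α/₂ + z₁₋β)²·(CV/Δ)², where Δ is the difference to detect expressed as a percentage of the mean. This is not the authors' calculation: the 20% target difference is a hypothetical value, and both designs are treated here as independent-group comparisons purely to show how the reported coefficients of variation (26% versus 14%) scale the requirement.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(cv_percent, diff_percent, alpha=0.05, beta=0.20):
    """Joints per group needed to detect a relative difference between two means.

    cv_percent   : coefficient of variation of the score, in percent
    diff_percent : difference to detect, in percent of the mean (hypothetical input)
    alpha, beta  : two-sided type I error rate and type II error rate
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    return ceil(2 * (z_alpha + z_beta) ** 2 * (cv_percent / diff_percent) ** 2)


# Reported %CV values from the abstract; the 20% target difference is assumed.
for design, cv in (("unilateral", 26.0), ("bilateral", 14.0)):
    print(design, n_per_group(cv, diff_percent=20.0))
```

    Because the requirement grows with the square of CV/Δ, the gap between the two designs widens in absolute terms as the target difference shrinks, which is consistent with the abstract's remark that the benefit was most pronounced for small effect sizes.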

  19. Transport of dissolved organic matter in Boom Clay: Size effects

    NASA Astrophysics Data System (ADS)

    Durce, D.; Aertsens, M.; Jacques, D.; Maes, N.; Van Gompel, M.

    2018-01-01

    A coupled experimental-modelling approach was developed to evaluate the effects of molecular weight (MW) of dissolved organic matter (DOM) on its transport through intact Boom Clay (BC) samples. Natural DOM was sampled in-situ in the BC layer. Transport was investigated with percolation experiments on 1.5 cm BC samples by measuring the outflow MW distribution (MWD) by size exclusion chromatography (SEC). A one-dimensional reactive transport model was developed to account for retardation, diffusion and entrapment (attachment and/or straining) of DOM. These parameters were determined along the MWD by implementing a discretisation of DOM into several MW points and modelling the breakthrough of each point. The pore throat diameter of BC was determined as 6.6-7.6 nm. Below this critical size, transport of DOM is MW dependent and two major types of transport were identified. Below MW of 2 kDa, DOM was neither strongly trapped nor strongly retarded. This fraction had an averaged capacity factor of 1.19 ± 0.24 and an apparent dispersion coefficient ranging from 7.5 × 10⁻¹¹ to 1.7 × 10⁻¹¹ m²/s with increasing MW. DOM with MW > 2 kDa was affected by both retardation and straining that increased significantly with increasing MW while apparent dispersion coefficients decreased. Values ranging from 1.36 to 19.6 were determined for the capacity factor and 3.2 × 10⁻¹¹ to 1.0 × 10⁻¹¹ m²/s for the apparent dispersion coefficient for species with 2.2 kDa < MW < 9.3 kDa. Straining resulted in an immobilisation of on average 49 ± 6% of the injected 9.3 kDa species. Our findings show that an accurate description of DOM transport requires the consideration of the size effects.

  20. Experimental calibration of Phreatic and Hydrothermal Explosions: A case study on Lake Okaro, New Zealand

    NASA Astrophysics Data System (ADS)

    Foote, L. C.; Scheu, B.; Kennedy, B.; Gravley, D.; Dingwell, D. B.

    2011-12-01

    Phreatic and hydrothermal eruptions, the most common on earth, frequently lead to magmatic eruptions. They often occur with little or no warning, representing a significant hazard. These eruptions occur over a range of temperature and pressure, and within widely differing rock types. Additionally, these eruptions may be triggered by earthquakes or landslides. Regardless of the trigger, they occur when hydrothermal/supercritical fluid rapidly flashes to steam due to either heating or decompression. Despite the frequency of these eruptions, previous studies have largely been focused exclusively on either the physical characteristics of the eruptions or experimental modelling of the trigger processes, with very few combining the two. Here, a new experimental procedure has been developed to model phreatic fragmentation based on the shock-tube experiments of magmatic fragmentation introduced by Alidibirov & Dingwell (1996). This technique uses water-saturated samples, producing fragmentation from a combination of argon gas overpressure and steam flashing within the vesicles. By integrating measurements of physical characteristics such as porosity, permeability and mineralogy into the analysis of the results of these experiments, a model of phreatic fragmentation is proposed to aid in future hazard modelling. The phreatic explosion crater forming Lake Okaro, within the Taupo Volcanic Zone of New Zealand, was used as a case study. The eruption was triggered within the Rangitaiki Ignimbrite, which served as the sample material for these experiments. In order to evaluate the effects of alteration, both original, unaltered material and hydrothermally altered samples were analysed. As fragmentation is driven by gas overpressure/steam expansion within vesicles, porosity plays a critical role. For these samples average porosity values are 24 and 40% respectively.
Experimental conditions were chosen primarily to reflect the conditions of the study location but also to study the effect of water saturation on the fragmentation behavior. Thus experiments were run at both room temperature and 300°C, and from 4 to 15 MPa. Pressure sensors were used to record the speed of fragmentation, and fragments were recovered in order to determine grain-size distributions. First analyses of the fragmentation speed reveal no significant difference between dry and saturated samples (14-42 m/s, depending on applied energy). In contrast, the results of the grain size analysis show a clear shift to smaller grain sizes with saturated samples (independent of pressure or sample type), possibly reflecting the more efficient conversion of energy involved in phreatic eruptions, most likely in combination with a strength reduction of the samples due to water weakening effects. We provide herewith a first parameterisation of conditions for phreatic and hydrothermal eruptions and offer an explanation for the reduction in grain size associated with phreatic eruptions.

  1. [Potentials in the regionalization of health indicators using small-area estimation methods : Exemplary results based on the 2009, 2010 and 2012 GEDA studies].

    PubMed

    Kroll, Lars Eric; Schumann, Maria; Müters, Stephan; Lampert, Thomas

    2017-12-01

    Nationwide health surveys can be used to estimate regional differences in health. Using traditional estimation techniques, the spatial depth for these estimates is limited due to the constrained sample size. So far - without special refreshment samples - results have only been available for larger populated federal states of Germany. An alternative is regression-based small-area estimation techniques. These models can generate smaller-scale data, but are also subject to greater statistical uncertainties because of the model assumptions. In the present article, exemplary regionalized results based on the studies "Gesundheit in Deutschland aktuell" (GEDA studies) 2009, 2010 and 2012, are compared to the self-rated health status of the respondents. The aim of the article is to analyze the range of regional estimates in order to assess the usefulness of the techniques for health reporting more adequately. The results show that the estimated prevalence is relatively stable when using different samples. Important determinants of the variation of the estimates are the achieved sample size on the district level and the type of the district (cities vs. rural regions). Overall, the present study shows that small-area modeling of prevalence is associated with additional uncertainties compared to conventional estimates, which should be taken into account when interpreting the corresponding findings.

  2. Segmented polynomial taper equation incorporating years since thinning for loblolly pine plantations

    Treesearch

    A. Gordon Holley; Thomas B. Lynch; Charles T. Stiff; William Stansfield

    2010-01-01

    Data from 108 trees felled from 16 loblolly pine stands owned by Temple-Inland Forest Products Corp. were used to determine effects of years since thinning (YST) on stem taper using the Max–Burkhart type segmented polynomial taper model. Sample tree YST ranged from two to nine years prior to destructive sampling. In an effort to equalize sample sizes, tree data were...

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Fangyan; Zhang, Song; Chung Wong, Pak

    Effectively visualizing large graphs and capturing the statistical properties are two challenging tasks. To aid in these two tasks, many sampling approaches for graph simplification have been proposed, falling into three categories: node sampling, edge sampling, and traversal-based sampling. It is still unknown which approach is the best. We evaluate commonly used graph sampling methods through a combined visual and statistical comparison of graphs sampled at various rates. We conduct our evaluation on three graph models: random graphs, small-world graphs, and scale-free graphs. Initial results indicate that the effectiveness of a sampling method is dependent on the graph model, the size of the graph, and the desired statistical property. This benchmark study can be used as a guideline in choosing the appropriate method for a particular graph sampling task, and the results presented can be incorporated into graph visualization and analysis tools.

  4. Probing defects in chemically synthesized ZnO nanostructures by positron annihilation and photoluminescence spectroscopy

    NASA Astrophysics Data System (ADS)

    Chaudhuri, S. K.; Ghosh, Manoranjan; Das, D.; Raychaudhuri, A. K.

    2010-09-01

    The present article describes the size-induced changes in the structural arrangement of intrinsic defects present in chemically synthesized ZnO nanoparticles of various sizes. Routine x-ray diffraction and transmission electron microscopy have been performed to determine the shapes and sizes of the nanocrystalline ZnO samples. Detailed studies using positron annihilation spectroscopy reveal the presence of zinc vacancies, whereas analysis of the photoluminescence results indicates the signature of charged oxygen vacancies. The size-induced changes in positron parameters, as well as in the photoluminescence properties, have shown contrasting or nonmonotonous trends as size varies from 4 to 85 nm. Small spherical particles below a critical size (~23 nm) receive more positive surface charge due to the higher occupancy of the doubly charged oxygen vacancy as compared to the bigger nanostructures, where the singly charged oxygen vacancy predominates. This electronic alteration has been seen to trigger yet another interesting phenomenon, described as positron confinement inside nanoparticles. Finally, based on all the results, a model of the structural arrangement of the intrinsic defects in the present samples has been reconciled.

  5. DEM Particle Fracture Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Boning; Herbold, Eric B.; Homel, Michael A.

    2015-12-01

    An adaptive particle fracture model in poly-ellipsoidal Discrete Element Method is developed. The poly-ellipsoidal particle will break into several sub-poly-ellipsoids by Hoek-Brown fracture criterion based on continuum stress and the maximum tensile stress in contacts. Also Weibull theory is introduced to consider the statistics and size effects on particle strength. Finally, high strain-rate split Hopkinson pressure bar experiment of silica sand is simulated using this newly developed model. Comparisons with experiments show that our particle fracture model can capture the mechanical behavior of this experiment very well, both in stress-strain response and particle size redistribution. The effects of density and packings of the samples are also studied in numerical examples.

  6. Empirical tests of harvest-induced body-size evolution along a geographic gradient in Australian macropods.

    PubMed

    Prowse, Thomas A A; Correll, Rachel A; Johnson, Christopher N; Prideaux, Gavin J; Brook, Barry W

    2015-01-01

    Life-history theory predicts the progressive dwarfing of animal populations that are subjected to chronic mortality stress, but the evolutionary impact of harvesting terrestrial herbivores has seldom been tested. In Australia, marsupials of the genus Macropus (kangaroos and wallabies) are subjected to size-selective commercial harvesting. Mathematical modelling suggests that harvest quotas (c. 10-20% of population estimates annually) could be driving body-size evolution in these species. We tested this hypothesis for three harvested macropod species with continental-scale distributions. To do so, we measured more than 2000 macropod skulls sourced from wildlife collections spanning the last 130 years. We analysed these data using spatial Bayesian models that controlled for the age and sex of specimens as well as environmental drivers and island effects. We found no evidence for the hypothesized decline in body size for any species; rather, models that fit trend terms supported minor body size increases over time. This apparently counterintuitive result is consistent with reduced mortality due to a depauperate predator guild and increased primary productivity of grassland vegetation following European settlement in Australia. Spatial patterns in macropod body size supported the heat dissipation limit and productivity hypotheses proposed to explain geographic body-size variation (i.e. skull size increased with decreasing summer maximum temperature and increasing rainfall, respectively). There is no empirical evidence that size-selective harvesting has driven the evolution of smaller body size in Australian macropods. Bayesian models are appropriate for investigating the long-term impact of human harvesting because they can impute missing data, fit nonlinear growth models and account for non-random spatial sampling inherent in wildlife collections. © 2014 The Authors. Journal of Animal Ecology © 2014 British Ecological Society.

  7. Microplastic distribution in global marine surface waters: results of an extensive citizen science study

    NASA Astrophysics Data System (ADS)

    Barrows, A.; Petersen, C.

    2017-12-01

    Plastic is a major pollutant throughout the world. The majority of the 322 million tons produced annually is used for single-use packaging. The features that make plastic an attractive packaging material (it is cheap, light-weight and durable) are also the features that help make it a common and persistent pollutant. There is a growing body of research on microplastic, particles less than 5 mm in size. Microfibers are the most common microplastic in the marine environment. Global estimates of marine microplastic surface concentrations are based on relatively small sample sizes when compared to the vast geographic scale of the ocean. Microplastic residence time and movement along the coast and sea surface outside of the gyres is still not well researched. This five-year project utilized global citizen scientists to collect 1,628 1-liter surface grab samples in every major ocean. The Arctic and Southern oceans contained the highest average number of particles per liter of surface water. Open ocean samples (further than 12 nm from land, n = 686) contained a higher particle average (17 pieces L⁻¹) than coastal samples (n = 723; 6 pieces L⁻¹). Particles were predominantly 100 µm to 1.5 mm in length (77%), smaller than what has been captured in the majority of surface studies. Utilization of citizen scientists to collect data both in fairly accessible regions of the world and in areas that are hard to reach and therefore undersampled provides us with a wider perspective of global microplastics occurrence. Our findings confirm global microplastic accumulation zone model predictions. The open ocean and poles have sequestered and trapped plastic for over half a century, and our data show that not only plastics but also anthropogenic fibers are polluting the environment. Continuing to fill knowledge gaps on microplastic shape, size and color in remote ocean areas will drive more accurate oceanographic models of plastic accumulation zones.
Incorporation of smaller-sized particles in these models, which has previously been lacking, will help to better understand the potential fate and transformation of microplastic and anthropogenic particles in the marine environment.

  8. Development of property-transfer models for estimating the hydraulic properties of deep sediments at the Idaho National Engineering and Environmental Laboratory, Idaho

    USGS Publications Warehouse

    Winfield, Kari A.

    2005-01-01

    Because characterizing the unsaturated hydraulic properties of sediments over large areas or depths is costly and time consuming, development of models that predict these properties from more easily measured bulk-physical properties is desirable. At the Idaho National Engineering and Environmental Laboratory, the unsaturated zone is composed of thick basalt flow sequences interbedded with thinner sedimentary layers. Determining the unsaturated hydraulic properties of sedimentary layers is one step in understanding water flow and solute transport processes through this complex unsaturated system. Multiple linear regression was used to construct simple property-transfer models for estimating the water-retention curve and saturated hydraulic conductivity of deep sediments at the Idaho National Engineering and Environmental Laboratory. The regression models were developed from 109 core sample subsets with laboratory measurements of hydraulic and bulk-physical properties. The core samples were collected at depths of 9 to 175 meters at two facilities within the southwestern portion of the Idaho National Engineering and Environmental Laboratory: the Radioactive Waste Management Complex and the Vadose Zone Research Park southwest of the Idaho Nuclear Technology and Engineering Center. Four regression models were developed using bulk-physical property measurements (bulk density, particle density, and particle size) as the potential explanatory variables. Three representations of the particle-size distribution were compared: (1) textural-class percentages (gravel, sand, silt, and clay), (2) geometric statistics (mean and standard deviation), and (3) graphical statistics (median and uniformity coefficient). The four response variables, estimated from linear combinations of the bulk-physical properties, included saturated hydraulic conductivity and three parameters that define the water-retention curve.
For each core sample, values of each water-retention parameter were estimated from the appropriate regression equation and used to calculate an estimated water-retention curve. The degree to which the estimated curve approximated the measured curve was quantified using a goodness-of-fit indicator, the root-mean-square error. Comparison of the root-mean-square-error distributions for each alternative particle-size model showed that the estimated water-retention curves were insensitive to the way the particle-size distribution was represented. Bulk density, the median particle diameter, and the uniformity coefficient were chosen as input parameters for the final models. The property-transfer models developed in this study allow easy determination of hydraulic properties without need for their direct measurement. Additionally, the models provide the basis for development of theoretical models that rely on physical relationships between the pore-size distribution and the bulk-physical properties of the media. With this adaptation, the property-transfer models should have greater application throughout the Idaho National Engineering and Environmental Laboratory and other geographic locations.

  9. A Note on Cluster Effects in Latent Class Analysis

    ERIC Educational Resources Information Center

    Kaplan, David; Keller, Bryan

    2011-01-01

    This article examines the effects of clustering in latent class analysis. A comprehensive simulation study is conducted, which begins by specifying a true multilevel latent class model with varying within- and between-cluster sample sizes, varying latent class proportions, and varying intraclass correlations. These models are then estimated under…

  10. Effect of Differential Item Functioning on Test Equating

    ERIC Educational Resources Information Center

    Kabasakal, Kübra Atalay; Kelecioglu, Hülya

    2015-01-01

    This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…

  11. DEVELOPMENT AND PEER REVIEW OF TIME-TO-EFFECT MODELS FOR THE ANALYSIS OF NEUROTOXICITY AND OTHER TIME DEPENDENT DATA

    EPA Science Inventory

    Neurobehavioral studies pose unique challenges for dose-response modeling, including small sample size and relatively large intra-subject variation, repeated measurements over time, multiple endpoints with both continuous and ordinal scales, and time dependence of risk characteri...

  12. A Comparison of Normal and Elliptical Estimation Methods in Structural Equation Models.

    ERIC Educational Resources Information Center

    Schumacker, Randall E.; Cheevatanarak, Suchittra

    Monte Carlo simulation compared chi-square statistics, parameter estimates, and root mean square error of approximation values using normal and elliptical estimation methods. Three research conditions were imposed on the simulated data: sample size, population contamination percent, and kurtosis. A Bentler-Weeks structural model established the…

  13. Apatite (U-Th-Sm)/He age dispersion arising from analysis of variable grain sizes and broken crystals - examples from the Scottish Southern Uplands

    NASA Astrophysics Data System (ADS)

    Łuszczak, Katarzyna; Persano, Cristina; Stuart, Finlay; Brown, Roderick

    2016-04-01

    Apatite (U-Th-Sm)/He (AHe) thermochronometry is a powerful technique for deciphering denudation of the uppermost crust. However, age dispersion among single grains from the same rock is typical, and this hampers establishing accurate thermal histories when low grain numbers are analysed. Dispersion arising from the analysis of broken crystal fragments[1] has been proposed as an important cause of age dispersion, along with grain size and radiation damage. A new tool, Helfrag[2], allows constraints to be placed on the low temperature history derived from the analysis of apatite crystal fragments. However, the age dispersion model has not been fully tested on natural samples yet. We have performed AHe analysis of multiple (n = 20-25) grains from four rock samples from the Scottish Southern Uplands, which were subjected to the same exhumation episodes, although the amount of exhumation varied between the localities. This is evident from the range of AFT ages (~60 to ~200 Ma) and variable thermal histories showing strong, moderate or no support for a rapid cooling event at ~60 Ma. Apatite grains of different sizes and fragment geometries were analysed in order to maximise age dispersion. In general, the age dispersion increases with increasing AFT age (from 47% to 127%), consistent with the prediction from the fragmentation model. Thermal histories obtained using Helfrag were compared with those obtained by standard codes based on the spherical approximation. In one case, the Helfrag model was capable of resolving the higher complexity of the thermal history of the rock, constraining several heating/cooling events that are not predicted by the standard models, but are in good agreement with the regional geology.
In other cases, the thermal histories are similar for both Helfrag and standard models and the age predictions for Helfrag are only slightly better than for the standard model, implying that the grain size has the dominant role in generating the age dispersion. Rather than suggesting that grain size is the predominant factor in controlling age dispersion in all data sets, our results may be linked to the actual size of the picked grains; for grain widths smaller than 100 μm, the He profile within the crystal may not be differentiated enough to produce a dispersion measurable outside the uncertainty associated with the age. It is also easier for long-thin and short-thick than long-thick and short-thin grains to be preserved; this minimises the age dispersion that can be generated from fragmentation. We suggest that, in order to obtain valuable information from both fragmentation and grain size, >20 large (width >100 μm) grain fragments of variable length have to be analyzed, together with a few smaller grains. Our results point to a strategy that favours multiple single-grain AHe age determinations on carefully selected samples, with good quality apatite crystals of variable dimensions, rather than fewer determinations on many samples. [1] Brown, R. et al. 2013. Geochim. Cosmochim. Acta 122, 478-497. [2] Beucher, R. et al. 2013. Geochim. Cosmochim. Acta 120, 395-416.

  14. Is the permeability of naturally fractured rocks scale dependent?

    NASA Astrophysics Data System (ADS)

    Azizmohammadi, Siroos; Matthäi, Stephan K.

    2017-09-01

    The equivalent permeability, keq, of stratified fractured porous rocks and its anisotropy are important for hydrocarbon reservoir engineering, groundwater hydrology, and subsurface contaminant transport. However, it is difficult to constrain this tensor property as it is strongly influenced by infrequent large fractures. Boreholes miss them and their directional sampling bias affects the collected geostatistical data. Samples taken at any scale smaller than that of interest truncate distributions and this bias leads to an incorrect characterization and property upscaling. To better understand this sampling problem, we have investigated a collection of outcrop-data-based Discrete Fracture and Matrix (DFM) models with mechanically constrained fracture aperture distributions, trying to establish a useful Representative Elementary Volume (REV). Finite-element analysis and flow-based upscaling have been used to determine keq eigenvalues and anisotropy. While our results indicate a convergence toward a scale-invariant keq REV with increasing sample size, keq magnitude can have multi-modal distributions. REV size relates to the length of dilated fracture segments as opposed to overall fracture length. Tensor orientation and degree of anisotropy also converge with sample size. However, the REV for keq anisotropy is larger than that for keq magnitude. Across scales, tensor orientation varies spatially, reflecting inhomogeneity of the fracture patterns. Inhomogeneity is particularly pronounced where the ambient stress selectively activates late as opposed to early (through-going) fractures. While we cannot detect any increase of keq with sample size as postulated in some earlier studies, our results highlight a strong keq anisotropy that influences scale dependence.

  15. Statistical theory and methodology for remote sensing data analysis

    NASA Technical Reports Server (NTRS)

    Odell, P. L.

    1974-01-01

    A model is developed for the evaluation of acreages (proportions) of different crop types over a geographical area using a classification approach, and methods for estimating the crop acreages are given. In estimating the acreage of a specific crop type such as wheat, it is suggested to treat the problem as a two-crop problem, wheat vs. nonwheat, since this simplifies the estimation problem considerably. The error analysis and the sample size problem are investigated for the two-crop approach. Certain numerical results for sample sizes are given for a JSC-ERTS-1 data example on wheat identification performance in Hill County, Montana and Burke County, North Dakota. Lastly, for a large-area crop acreage inventory, a sampling scheme is suggested for acquiring sample data, and the problem of crop acreage estimation and the error analysis are discussed.

  16. Improving size estimates of open animal populations by incorporating information on age

    USGS Publications Warehouse

    Manly, Bryan F.J.; McDonald, Trent L.; Amstrup, Steven C.; Regehr, Eric V.

    2003-01-01

    Around the world, a great deal of effort is expended each year to estimate the sizes of wild animal populations. Unfortunately, population size has proven to be one of the most intractable parameters to estimate. The capture-recapture estimation models most commonly used (of the Jolly-Seber type) are complicated and require numerous, sometimes questionable, assumptions. The derived estimates usually have large variances and lack consistency over time. In capture–recapture studies of long-lived animals, the ages of captured animals can often be determined with great accuracy and relative ease. We show how to incorporate age information into size estimates for open populations, where the size changes through births, deaths, immigration, and emigration. The proposed method allows more precise estimates of population size than the usual models, and it can provide these estimates from two sample occasions rather than the three usually required. Moreover, this method does not require specialized programs for capture-recapture data; researchers can derive their estimates using the logistic regression module in any standard statistical package.

  17. Size distribution and growth rate of crystal nuclei near critical undercooling in small volumes

    NASA Astrophysics Data System (ADS)

    Kožíšek, Z.; Demo, P.

    2017-11-01

    Kinetic equations are numerically solved within the standard nucleation model to determine the size distribution of nuclei in small volumes near critical undercooling. Critical undercooling, when the first nuclei are detected within the system, depends on the droplet volume. The size distribution of nuclei reaches the stationary value after some time delay and decreases with nucleus size. Only a certain maximum size of nuclei is reached in small volumes near critical undercooling. As a model system, we selected recently studied nucleation in a Ni droplet [J. Bokeloh et al., Phys. Rev. Lett. 107 (2011) 145701] due to available experimental and simulation data. However, using these data for sample masses from 23 μg up to 63 mg (corresponding to experiments) leads to a size distribution of nuclei in which no critical nuclei are formed in the Ni droplet (the number of critical nuclei < 1). If one takes into account the size dependence of the interfacial energy, the size distribution of nuclei increases to reasonable values. In lower volumes (V ≤ 10⁻⁹ m³) nucleus size reaches some maximum extreme size, which quickly increases with undercooling. Supercritical clusters continue their growth only if the number of critical nuclei is sufficiently high.

  18. Estimation of polydispersity in aggregating red blood cells by quantitative ultrasound backscatter analysis.

    PubMed

    de Monchy, Romain; Rouyer, Julien; Destrempes, François; Chayer, Boris; Cloutier, Guy; Franceschini, Emilie

    2018-04-01

    Quantitative ultrasound techniques based on the backscatter coefficient (BSC) have been commonly used to characterize red blood cell (RBC) aggregation. Specifically, a scattering model is fitted to measured BSC and estimated parameters can provide a meaningful description of the RBC aggregates' structure (i.e., aggregate size and compactness). In most cases, scattering models assumed monodisperse RBC aggregates. This study proposes the Effective Medium Theory combined with the polydisperse Structure Factor Model (EMTSFM) to incorporate the polydispersity of aggregate size. From the measured BSC, this model allows estimating three structural parameters: the mean radius of the aggregate size distribution, the width of the distribution, and the compactness of the aggregates. Two successive experiments were conducted: a first experiment on blood sheared in a Couette flow device coupled with an ultrasonic probe, and a second experiment, on the same blood sample, sheared in a plane-plane rheometer coupled to a light microscope. Results demonstrated that the polydisperse EMTSFM provided the best fit to the BSC data when compared to the classical monodisperse models for the higher levels of aggregation at hematocrits between 10% and 40%. Fitting the polydisperse model yielded aggregate size distributions that were consistent with direct light microscope observations at low hematocrits.

  19. Publication Bias in Psychology: A Diagnosis Based on the Correlation between Effect Size and Sample Size

    PubMed Central

    Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas

    2014-01-01

    Background: The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. Methods: We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. Results: We found a negative correlation of r = −.45 [95% CI: −.53; −.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. Conclusion: The negative correlation between effect size and sample size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology. PMID:25192357
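
    The diagnostic described above rests on a simple computation: the correlation between reported effect sizes and sample sizes across studies. A minimal sketch of that computation is given below. The five (sample size, effect size) pairs are invented toy values for illustration and are not data from the study; the function name is likewise an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy, made-up values: smaller studies reporting larger effects.
sample_sizes = [20, 40, 80, 160, 320]
effect_sizes = [0.60, 0.45, 0.35, 0.20, 0.15]

r = pearson_r(sample_sizes, effect_sizes)
print(f"r = {r:.2f}")  # negative for this toy pattern
```

    If effect size were truly independent of sample size, r would be expected to scatter around zero; a clearly negative value is the pattern the authors interpret as a sign of publication bias.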

  20. Publication bias in psychology: a diagnosis based on the correlation between effect size and sample size.

    PubMed

    Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas

    2014-01-01

    The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. We found a negative correlation of r = -.45 [95% CI: -.53; -.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. The negative correlation between effect size and sample size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology.

  1. Statistical aspects of genetic association testing in small samples, based on selective DNA pooling data in the arctic fox.

    PubMed

    Szyda, Joanna; Liu, Zengting; Zatoń-Dobrowolska, Magdalena; Wierzbicki, Heliodor; Rzasa, Anna

    2008-01-01

    We analysed data from a selective DNA pooling experiment with 130 individuals of the arctic fox (Alopex lagopus), which originated from 2 different types regarding body size. The association between alleles of 6 selected unlinked molecular markers and body size was tested by using univariate and multinomial logistic regression models, applying odds ratio and test statistics from the power divergence family. Due to the small sample size and the resulting sparseness of the data table, in hypothesis testing we could not rely on the asymptotic distributions of the tests. Instead, we tried to account for data sparseness by (i) modifying confidence intervals of odds ratio; (ii) using a normal approximation of the asymptotic distribution of the power divergence tests with different approaches for calculating moments of the statistics; and (iii) assessing P values empirically, based on bootstrap samples. As a result, a significant association was observed for 3 markers. Furthermore, we used simulations to assess the validity of the normal approximation of the asymptotic distribution of the test statistics under the conditions of small and sparse samples.

  2. EVALUATION OF A NEW MEAN SCALED AND MOMENT ADJUSTED TEST STATISTIC FOR SEM.

    PubMed

    Tong, Xiaoxiao; Bentler, Peter M

    2013-01-01

    Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and two well-known robust test statistics. A modification to the Satorra-Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the four test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies seven sample sizes and three distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ² test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra-Bentler scaled test statistic performed best overall, while the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.

  3. Generalizing the Network Scale-Up Method: A New Estimator for the Size of Hidden Populations*

    PubMed Central

    Feehan, Dennis M.; Salganik, Matthew J.

    2018-01-01

    The network scale-up method enables researchers to estimate the size of hidden populations, such as drug injectors and sex workers, using sampled social network data. The basic scale-up estimator offers advantages over other size estimation techniques, but it depends on problematic modeling assumptions. We propose a new generalized scale-up estimator that can be used in settings with non-random social mixing and imperfect awareness about membership in the hidden population. Further, the new estimator can be used when data are collected via complex sample designs and from incomplete sampling frames. However, the generalized scale-up estimator also requires data from two samples: one from the frame population and one from the hidden population. In some situations these data from the hidden population can be collected by adding a small number of questions to already planned studies. For other situations, we develop interpretable adjustment factors that can be applied to the basic scale-up estimator. We conclude with practical recommendations for the design and analysis of future studies. PMID:29375167

  4. Inactivation of Alicyclobacillus acidoterrestris ATCC 49025 spores in apple juice by pulsed light. Influence of initial contamination and required reduction levels.

    PubMed

    Ferrario, Mariana I; Guerrero, Sandra N

    The purpose of this study was to analyze the response of different initial contamination levels of Alicyclobacillus acidoterrestris ATCC 49025 spores in apple juice as affected by pulsed light treatment (PL, batch mode, xenon lamp, 3 pulses/s, 0-71.6 J/cm²). Biphasic and Weibull frequency distribution models were used to characterize the relationship between inoculum size and treatment time with the reductions achieved after PL exposure. Additionally, a second order polynomial model was computed to relate required PL processing time to inoculum size and requested log reductions. PL treatment caused up to 3.0-3.5 log reductions, depending on the initial inoculum size. Inactivation curves corresponding to PL-treated samples were adequately characterized by both Weibull and biphasic models (adjusted R² of 94-96%), and revealed that lower initial inoculum sizes were associated with higher inactivation rates. According to the polynomial model, the predicted time for PL treatment increased exponentially with inoculum size. Copyright © 2017 Asociación Argentina de Microbiología. Published by Elsevier España, S.L.U. All rights reserved.

  5. An empirical model of human aspiration in low-velocity air using CFD investigations.

    PubMed

    Anthony, T Renée; Anderson, Kimberly R

    2015-01-01

    Computational fluid dynamics (CFD) modeling was performed to investigate the aspiration efficiency of the human head in low velocities to examine whether the current inhaled particulate mass (IPM) sampling criterion matches the aspiration efficiency of an inhaling human in airflows common to worker exposures. Data from both mouth and nose inhalation, averaged to assess omnidirectional aspiration efficiencies, were compiled and used to generate a unifying model to relate particle size to aspiration efficiency of the human head. Multiple linear regression was used to generate an empirical model to estimate human aspiration efficiency and included particle size as well as breathing and freestream velocities as independent variables. A new set of simulated mouth and nose breathing aspiration efficiencies was generated and used to test the fit of empirical models. Further, empirical relationships between test conditions and CFD estimates of aspiration were compared to experimental data from mannequin studies, including both calm-air and ultra-low velocity experiments. While a linear relationship between particle size and aspiration is reported in calm air studies, the CFD simulations identified a more reasonable fit using the square of particle aerodynamic diameter, which better addressed the shape of the efficiency curve's decline toward zero for large particles. The ultimate goal of this work was to develop an empirical model that incorporates real-world variations in critical factors associated with particle aspiration to inform low-velocity modifications to the inhalable particle sampling criterion.

  6. A Comparison of the Fit of Empirical Data to Two Latent Trait Models. Report No. 92.

    ERIC Educational Resources Information Center

    Hutten, Leah R.

    Goodness of fit of raw test score data were compared, using two latent trait models: the Rasch model and the Birnbaum three-parameter logistic model. Data were taken from various achievement tests and the Scholastic Aptitude Test (Verbal). A minimum sample size of 1,000 was required, and the minimum test length was 40 items. Results indicated that…

  7. Optimum sample size allocation to minimize cost or maximize power for the two-sample trimmed mean test.

    PubMed

    Guo, Jiin-Huarng; Luh, Wei-Ming

    2009-05-01

    When planning a study, sample size determination is one of the most important tasks facing the researcher. The size will depend on the purpose of the study, the cost limitations, and the nature of the data. By specifying the standard deviation ratio and/or the sample size ratio, the present study considers the problem of heterogeneous variances and non-normality for Yuen's two-group test and develops sample size formulas to minimize the total cost or maximize the power of the test. For a given power, the sample size allocation ratio can be manipulated so that the proposed formulas can minimize the total cost, the total sample size, or the sum of total sample size and total cost. On the other hand, for a given total cost, the optimum sample size allocation ratio can maximize the statistical power of the test. After the sample size is determined, the present simulation applies Yuen's test to the sample generated, and then the procedure is validated in terms of Type I errors and power. Simulation results show that the proposed formulas can control Type I errors and achieve the desired power under the various conditions specified. Finally, the implications for determining sample sizes in experimental studies and future research are discussed.

  8. Stardust Sample Collection at Wild 2 and Its Preliminary Examination

    NASA Technical Reports Server (NTRS)

    Tsou, P.; Brownlee, D. E.; Hoerz, F.; Newburn, R. L.; Sandford, S. A.; Sekanina, Z.; Zolensky, M. E.

    2004-01-01

    The primary objective of STARDUST is to collect coma samples from 81P/Wild 2. This was made on January 2, 2004. Before the encounter three significant model predictions existed for the number and size of samples to be captured. Three investigations during the Wild 2 encounter (Dust Flux Monitor, Comet and Interstellar Dust Analyzer and Dynamic Science) made in situ measurements of the dust. Spectacular images were captured of the Wild 2 nucleus and dust jets. This abstract compares the model predictions with the in situ measurements and Wild 2 images and assesses the likely samples to be returned for analysis on January 15, 2006. To give some lead time for sample analysts to prepare for the analyses of the returned samples, the organization of the Preliminary Examination is presented.

  9. A Bayesian model for estimating population means using a link-tracing sampling design.

    PubMed

    St Clair, Katherine; O'Connell, Daniel

    2012-03-01

    Link-tracing sampling designs can be used to study human populations that contain "hidden" groups who tend to be linked together by a common social trait. These links can be used to increase the sampling intensity of a hidden domain by tracing links from individuals selected in an initial wave of sampling to additional domain members. Chow and Thompson (2003, Survey Methodology 29, 197-205) derived a Bayesian model to estimate the size or proportion of individuals in the hidden population for certain link-tracing designs. We propose an addition to their model that will allow for the modeling of a quantitative response. We assess properties of our model using a constructed population and a real population of at-risk individuals, both of which contain two domains of hidden and nonhidden individuals. Our results show that our model can produce good point and interval estimates of the population mean and domain means when our population assumptions are satisfied. © 2011, The International Biometric Society.

  10. Vigilance behaviour of the year-round territorial vicuña (Vicugna vicugna) outside the breeding season: influence of group size, social factors and distance to a water source.

    PubMed

    Torres, M Eugenia Mosca; Puig, Silvia; Novillo, Agustina; Ovejero, Ramiro

    2015-04-01

    We conducted focal observations of vicuña, a year-round territorial mammal, to compare vigilance behaviour between territorial and bachelor males outside the reproductive season. We hypothesized that the time spent vigilant would depend on male social status, considering the potential effects of several variables: sampling year, group size, distances to the nearest neighbour and to a vega (mountain wetland). We fit GLM models to assess how these variables, and their interactions, affected time allocation of territorial and bachelor males. We found no significant differences between territorial and bachelor males in the time devoted to vigilance behaviour. Vigilance of territorial males was influenced by the sampling year and the distance to the vega. In turn, vigilance in bachelor males was influenced mainly by the sampling year, the group size and the distance to the vega. Our results suggest that sampling year and distance to the vega are more important than social factors in conditioning the behaviour of male vicuñas during the non-reproductive season. Future studies of behaviour in water-dependent ungulates should consider the influence of water and forage availabilities, and the interactions between group size and other variables. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Robust functional statistics applied to Probability Density Function shape screening of sEMG data.

    PubMed

    Boudaoud, S; Rix, H; Al Harrach, M; Marin, F

    2014-01-01

    Recent studies pointed out possible shape modifications of the Probability Density Function (PDF) of surface electromyographical (sEMG) data according to several contexts like fatigue and muscle force increase. Following this idea, criteria have been proposed to monitor these shape modifications mainly using High Order Statistics (HOS) parameters like skewness and kurtosis. In experimental conditions, these parameters are confronted with small sample size in the estimation process. This small sample size induces errors in the estimated HOS parameters restraining real-time and precise sEMG PDF shape monitoring. Recently, a functional formalism, the Core Shape Model (CSM), has been used to analyse shape modifications of PDF curves. In this work, taking inspiration from CSM method, robust functional statistics are proposed to emulate both skewness and kurtosis behaviors. These functional statistics combine both kernel density estimation and PDF shape distances to evaluate shape modifications even in presence of small sample size. Then, the proposed statistics are tested, using Monte Carlo simulations, on both normal and Log-normal PDFs that mimic observed sEMG PDF shape behavior during muscle contraction. According to the obtained results, the functional statistics seem to be more robust than HOS parameters to small sample size effect and more accurate in sEMG PDF shape screening applications.

  12. Intra-class correlation estimates for assessment of vitamin A intake in children.

    PubMed

    Agarwal, Girdhar G; Awasthi, Shally; Walter, Stephen D

    2005-03-01

    In many community-based surveys, multi-level sampling is inherent in the design. In the design of these studies, especially to calculate the appropriate sample size, investigators need good estimates of intra-class correlation coefficient (ICC), along with the cluster size, to adjust for variation inflation due to clustering at each level. The present study used data on the assessment of clinical vitamin A deficiency and intake of vitamin A-rich food in children in a district in India. For the survey, 16 households were sampled from 200 villages nested within eight randomly-selected blocks of the district. ICCs and components of variances were estimated from a three-level hierarchical random effects analysis of variance model. Estimates of ICCs and variance components were obtained at village and block levels. Between-cluster variation was evident at each level of clustering. In these estimates, ICCs were inversely related to cluster size, but the design effect could be substantial for large clusters. At the block level, most ICC estimates were below 0.07. At the village level, many ICC estimates ranged from 0.014 to 0.45. These estimates may provide useful information for the design of epidemiological studies in which the sampled (or allocated) units range in size from households to large administrative zones.
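
    The adjustment the abstract refers to is commonly expressed through the design effect, DEFF = 1 + (m − 1) × ICC, where m is the cluster size. The sketch below shows how it inflates a sample size computed under simple random sampling. The function names and the inputs (ICC = 0.05, a simple-random-sampling size of 385) are hypothetical illustration values; only the cluster size of 16 households echoes the survey design described above.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from sampling clusters of equal size:
    DEFF = 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def clustered_sample_size(n_srs, cluster_size, icc):
    """Inflate a simple-random-sampling size n_srs by the design effect."""
    return math.ceil(n_srs * design_effect(cluster_size, icc))


# Hypothetical inputs: 16 households per village, assumed ICC of 0.05,
# and 385 households required under simple random sampling.
deff = design_effect(cluster_size=16, icc=0.05)
n_needed = clustered_sample_size(n_srs=385, cluster_size=16, icc=0.05)
print(f"design effect = {deff:.2f}, households needed = {n_needed}")
```

    Because the design effect grows linearly with cluster size, even a small ICC can inflate the required sample substantially for large clusters, which is consistent with the abstract's remark about large clusters.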

  13. Beyond Gorilla and Pongo: alternative models for evaluating variation and sexual dimorphism in fossil hominoid samples.

    PubMed

    Scott, Jeremiah E; Schrein, Caitlin M; Kelley, Jay

    2009-10-01

    Sexual size dimorphism in the postcanine dentition of the late Miocene hominoid Lufengpithecus lufengensis exceeds that in Pongo pygmaeus, demonstrating that the maximum degree of molar size dimorphism in apes is not represented among the extant Hominoidea. It has not been established, however, that the molars of Pongo are more dimorphic than those of any other living primate. In this study, we used resampling-based methods to compare molar dimorphism in Gorilla, Pongo, and Lufengpithecus to that in the papionin Mandrillus leucophaeus to test two hypotheses: (1) Pongo possesses the most size-dimorphic molars among living primates and (2) molar size dimorphism in Lufengpithecus is greater than that in the most dimorphic living primates. Our results show that M. leucophaeus exceeds great apes in its overall level of dimorphism and that L. lufengensis is more dimorphic than the extant species. Using these samples, we also evaluated molar dimorphism and taxonomic composition in two other Miocene ape samples: Ouranopithecus macedoniensis from Greece, specimens of which can be sexed based on associated canines and P₃s, and the Sivapithecus sample from Haritalyangar, India. Ouranopithecus is more dimorphic than the extant taxa but is similar to Lufengpithecus, demonstrating that the level of molar dimorphism required for the Greek fossil sample under the single-species taxonomy is not unprecedented when the comparative framework is expanded to include extinct primates. In contrast, the Haritalyangar Sivapithecus sample, if it represents a single species, exhibits substantially greater molar dimorphism than does Lufengpithecus. Given these results, the taxonomic status of this sample remains equivocal.

  14. Estimating animal populations and body sizes from burrows: Marine ecologists have their heads buried in the sand

    NASA Astrophysics Data System (ADS)

    Schlacher, Thomas A.; Lucrezi, Serena; Peterson, Charles H.; Connolly, Rod M.; Olds, Andrew D.; Althaus, Franziska; Hyndes, Glenn A.; Maslo, Brooke; Gilby, Ben L.; Leon, Javier X.; Weston, Michael A.; Lastra, Mariano; Williams, Alan; Schoeman, David S.

    2016-06-01

    Most ecological studies require knowledge of animal abundance, but it can be challenging and destructive of habitat to obtain accurate density estimates for cryptic species, such as crustaceans that tunnel deeply into the seafloor, beaches, or mudflats. Such fossorial species are, however, widely used in environmental impact assessments, requiring sampling techniques that are reliable, efficient, and environmentally benign for these species and environments. Counting and measuring the entrances of burrows made by cryptic species is commonly employed to index population and body sizes of individuals. The fundamental premise is that burrow metrics consistently predict density and size. Here we review the evidence for this premise. We also review criteria for selecting among sampling methods: burrow counts, visual censuses, and physical collections. A simple 1:1 correspondence between the number of holes and population size cannot be assumed. Occupancy rates, indexed by the slope of regression models, vary widely between species and among sites for the same species. Thus, 'average' or 'typical' occupancy rates should not be extrapolated from site- or species-specific field validations and then be used as conversion factors in other situations. Predictions of organism density made from burrow counts often have large uncertainty, being double to half of the predicted mean value. Whether such prediction uncertainty is 'acceptable' depends on investigators' judgements regarding the desired detectable effect sizes. Regression models predicting body size from burrow entrance dimensions are more precise, but parameter estimates of most models are specific to species and subject to site-to-site variation within species. These results emphasise the need to undertake thorough field validations of indirect census techniques that include tests of how sensitive predictive models are to changes in habitat conditions or human impacts. In addition, new technologies (e.g. drones, thermal-, acoustic- or chemical sensors) should be used to enhance visual census techniques of burrows and surface-active animals.

  15. Investigating Compaction by Intergranular Pressure Solution Using the Discrete Element Method

    NASA Astrophysics Data System (ADS)

    van den Ende, M. P. A.; Marketos, G.; Niemeijer, A. R.; Spiers, C. J.

    2018-01-01

    Intergranular pressure solution creep is an important deformation mechanism in the Earth's crust. The phenomenon has been frequently studied and several analytical models have been proposed that describe its constitutive behavior. These models require assumptions regarding the geometry of the aggregate and the grain size distribution in order to solve for the contact stresses and often neglect shear tractions. Furthermore, analytical models tend to overestimate experimental compaction rates at low porosities, an observation for which the underlying mechanisms remain to be elucidated. Here we present a conceptually simple, 3-D discrete element method (DEM) approach for simulating intergranular pressure solution creep that explicitly models individual grains, relaxing many of the assumptions that are required by analytical models. The DEM model is validated against experiments by direct comparison of macroscopic sample compaction rates. Furthermore, the sensitivity of the overall DEM compaction rate to the grain size and applied stress is tested. The effects of the interparticle friction and of a distributed grain size on macroscopic strain rates are subsequently investigated. Overall, we find that the DEM model is capable of reproducing realistic compaction behavior, and that the strain rates produced by the model are in good agreement with uniaxial compaction experiments. Characteristic features, such as the dependence of the strain rate on grain size and applied stress, as predicted by analytical models, are also observed in the simulations. DEM results show that interparticle friction and a distributed grain size affect the compaction rates by less than half an order of magnitude.

  16. Linear Combinations of Multiple Outcome Measures to Improve the Power of Efficacy Analysis ---Application to Clinical Trials on Early Stage Alzheimer Disease

    PubMed Central

    Xiong, Chengjie; Luo, Jingqin; Morris, John C; Bateman, Randall

    2018-01-01

    Modern clinical trials on Alzheimer disease (AD) focus on the early symptomatic stage or even the preclinical stage. Subtle disease progression at the early stages, however, poses a major challenge in designing such clinical trials. We propose a multivariate mixed model on repeated measures to model the disease progression over time on multiple efficacy outcomes, and derive the optimum weights to combine multiple outcome measures by minimizing the sample sizes to adequately power the clinical trials. A cross-validation simulation study is conducted to assess the accuracy for the estimated weights as well as the improvement in reducing the sample sizes for such trials. The proposed methodology is applied to the multiple cognitive tests from the ongoing observational study of the Dominantly Inherited Alzheimer Network (DIAN) to power future clinical trials in the DIAN with a cognitive endpoint. Our results show that the optimum weights to combine multiple outcome measures can be accurately estimated, and that compared to the individual outcomes, the combined efficacy outcome with these weights significantly reduces the sample size required to adequately power clinical trials. When applied to the clinical trial in the DIAN, the estimated linear combination of six cognitive tests can adequately power the clinical trial. PMID:29546251

  17. Ferromagnetic linewidth measurements employing electrodynamic model of the magnetic plasmon resonance

    NASA Astrophysics Data System (ADS)

    Krupka, Jerzy; Aleshkevych, Pavlo; Salski, Bartlomiej; Kopyt, Pawel

    2018-02-01

    The mode of uniform precession, or Kittel mode, in a magnetized ferromagnetic sphere, has recently been proven to be the magnetic plasmon resonance. In this paper we show how to apply the electrodynamic model of the magnetic plasmon resonance for accurate measurements of the ferromagnetic resonance linewidth ΔH. Two measurement methods are presented. The first one employs Q-factor measurements of the magnetic plasmon resonance coupled to the resonance of an empty metallic cavity. Such coupled modes are known as magnon-polariton modes, i.e. hybridized modes between the collective spin excitation and the cavity excitation. The second one employs direct Q-factor measurements of the magnetic plasmon resonance in a filter setup with two orthogonal semi-loops used for coupling. Q-factor measurements are performed employing a vector network analyser. The methods presented in this paper allow one to extend the measurement range of the ferromagnetic resonance linewidth ΔH well beyond the limits of the commonly used measurement standards in terms of the size of the samples and the lowest measurable linewidths. Samples that can be measured with the newly proposed methods may have larger size as compared to the size of samples that were used in the standard methods restricted by the limits of perturbation theory.

  18. DRME: Count-based differential RNA methylation analysis at small sample size scenario.

    PubMed

    Liu, Lian; Zhang, Shao-Wu; Gao, Fan; Zhang, Yixin; Huang, Yufei; Chen, Runsheng; Meng, Jia

    2016-04-15

    Differential methylation, which concerns the difference in the degree of epigenetic regulation via methylation between two conditions, has been formulated with a beta or beta-binomial distribution to address the within-group biological variability in sequencing data. However, a beta or beta-binomial model is usually difficult to infer in small-sample-size scenarios with discrete read counts in sequencing data. On the other hand, as an emerging research field, RNA methylation has drawn more and more attention recently, and the differential analysis of RNA methylation is significantly different from that of DNA methylation due to the impact of transcriptional regulation. We developed DRME to better address the differential RNA methylation problem. The proposed model can effectively describe within-group biological variability in small-sample-size scenarios and handles the impact of transcriptional regulation on RNA methylation. We tested the newly developed DRME algorithm on simulated data and 4 MeRIP-Seq case-control studies and compared it with Fisher's exact test. It is in principle widely applicable to several other RNA-related data types as well, including RNA Bisulfite sequencing and PAR-CLIP. The code together with an MeRIP-Seq dataset is available online (https://github.com/lzcyzm/DRME) for evaluation and reproduction of the figures shown in this article. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Investigation of the effects of the macrophysical and microphysical properties of cirrus clouds on the retrieval of optical properties: Results for FIRE 2

    NASA Technical Reports Server (NTRS)

    Stackhouse, Paul W., Jr.; Stephens, Graeme L.

    1993-01-01

    Due to the prevalence and persistence of cirrus cloudiness across the globe, cirrus clouds are believed to have an important effect on the climate. Stephens et al. (1990), among others, have shown that the important factor determining how cirrus clouds modulate the climate is the balance between the albedo and emittance effect of the cloud systems. This factor was shown to depend in part upon the effective sizes of the cirrus cloud particles. Since effective sizes of cirrus cloud microphysical distributions are used as a basis of parameterizations in climate models, it is crucial that the relationships between effective sizes and radiative properties be clearly established. In this preliminary study, the retrieval of cirrus cloud effective sizes is examined using a two dimensional radiative transfer model for a cirrus cloud case sampled during FIRE Cirrus II. The purpose of this paper is to present preliminary results from the SHSG model demonstrating the sensitivity of the bispectral relationships of reflected radiances and thus the retrieval of effective sizes to phase function and dimensionality.

  20. Optical phonons in nanostructured thin films composed by zincblende zinc selenide quantum dots in strong size-quantization regime: Competition between phonon confinement and strain-related effects

    NASA Astrophysics Data System (ADS)

    Pejova, Biljana

    2014-05-01

    Raman scattering in combination with optical spectroscopy and structural studies by X-ray diffraction was employed to investigate the phonon confinement and strain-induced effects in 3D assemblies of variable-size zincblende ZnSe quantum dots close packed in thin film form. Nanostructured thin films were synthesized by colloidal chemical approach, while tuning of the nanocrystal size was enabled by post-deposition thermal annealing treatment. In-depth insights into the factors governing the observed trends of the position and half-width of the 1LO band as a function of the average QD size were gained. The overall shifts in the position of 1LO band were found to result from an intricate compromise between the influence of phonon confinement and lattice strain-induced effects. Both contributions were quantitatively and exactly modeled. Accurate assignments of the bands due to surface optical (SO) modes as well as of the theoretically forbidden transverse optical (TO) modes were provided, on the basis of reliable physical models (such as the dielectric continuum model of Ruppin and Englman). The size-dependence of the ratio of intensities of the TO and LO modes was studied and discussed as well. Relaxation time characterizing the phonon decay processes in as-deposited samples was found to be approximately 0.38 ps, while upon post-deposition annealing already at 200 °C it increases to about 0.50 ps. Both of these values are, however, significantly smaller than those characteristic for a macrocrystalline ZnSe sample.

  1. State-space modeling of population sizes and trends in Nihoa Finch and Millerbird

    USGS Publications Warehouse

    Gorresen, P. Marcos; Brinck, Kevin W.; Camp, Richard J.; Farmer, Chris; Plentovich, Sheldon M.; Banko, Paul C.

    2016-01-01

    Both of the 2 passerines endemic to Nihoa Island, Hawai‘i, USA—the Nihoa Millerbird (Acrocephalus familiaris kingi) and Nihoa Finch (Telespiza ultima)—are listed as endangered by federal and state agencies. Their abundances have been estimated by irregularly implemented fixed-width strip-transect sampling from 1967 to 2012, from which area-based extrapolation of the raw counts produced highly variable abundance estimates for both species. To evaluate an alternative survey method and improve abundance estimates, we conducted variable-distance point-transect sampling between 2010 and 2014. We compared our results to those obtained from strip-transect samples. In addition, we applied state-space models to derive improved estimates of population size and trends from the legacy time series of strip-transect counts. Both species were fairly evenly distributed across Nihoa and occurred in all or nearly all available habitat. Population trends for Nihoa Millerbird were inconclusive because of high within-year variance. Trends for Nihoa Finch were positive, particularly since the early 1990s. Distance-based analysis of point-transect counts produced mean estimates of abundance similar to those from strip-transects but was generally more precise. However, both survey methods produced biologically unrealistic variability between years. State-space modeling of the long-term time series of abundances obtained from strip-transect counts effectively reduced uncertainty in both within- and between-year estimates of population size, and allowed short-term changes in abundance trajectories to be smoothed into a long-term trend.

  2. Risk of bias reporting in the recent animal focal cerebral ischaemia literature.

    PubMed

    Bahor, Zsanett; Liao, Jing; Macleod, Malcolm R; Bannach-Brown, Alexandra; McCann, Sarah K; Wever, Kimberley E; Thomas, James; Ottavi, Thomas; Howells, David W; Rice, Andrew; Ananiadou, Sophia; Sena, Emily

    2017-10-15

    Findings from in vivo research may be less reliable where studies do not report measures to reduce risks of bias. The experimental stroke community has been at the forefront of implementing changes to improve reporting, but it is not known whether these efforts are associated with continuous improvements. Our aims here were firstly to validate an automated tool to assess risks of bias in published works, and secondly to assess the reporting of measures taken to reduce the risk of bias within recent literature for two experimental models of stroke. We developed and used text analytic approaches to automatically ascertain reporting of measures to reduce risk of bias from full-text articles describing animal experiments inducing middle cerebral artery occlusion (MCAO) or modelling lacunar stroke. Compared with previous assessments, there were improvements in the reporting of measures taken to reduce risks of bias in the MCAO literature but not in the lacunar stroke literature. Accuracy of automated annotation of risk of bias in the MCAO literature was 86% (randomization), 94% (blinding) and 100% (sample size calculation); and in the lacunar stroke literature accuracy was 67% (randomization), 91% (blinding) and 96% (sample size calculation). There remains substantial opportunity for improvement in the reporting of animal research modelling stroke, particularly in the lacunar stroke literature. Further, automated tools perform sufficiently well to identify whether studies report blinded assessment of outcome, but improvements are required in the tools to ascertain whether randomization and a sample size calculation were reported. © 2017 The Author(s).

  3. Size-sex variation in survival rates and abundance of pig frogs, Rana grylio, in northern Florida wetlands

    USGS Publications Warehouse

    Wood, K.V.; Nichols, J.D.; Percival, H.F.; Hines, J.E.

    1998-01-01

    During 1991-1993, we conducted capture-recapture studies on pig frogs, Rana grylio, in seven study locations in northcentral Florida. Resulting data were used to test hypotheses about variation in survival probability over different size-sex classes of pig frogs. We developed multistate capture-recapture models for the resulting data and used them to estimate survival rates and frog abundance. Tests provided strong evidence of survival differences among size-sex classes, with adult females showing the highest survival probabilities. Adult males and juvenile frogs had lower survival rates that were similar to each other. Adult females were more abundant than adult males in most locations at most sampling occasions. We recommended probabilistic capture-recapture models in general, and multistate models in particular, for robust estimation of demographic parameters in amphibian populations.

  4. The Response of Simple Polymer Structures Under Dynamic Loading

    NASA Astrophysics Data System (ADS)

    Proud, William; Ellison, Kay; Yapp, Su; Cole, Cloe; Galimberti, Stefano; Institute of Shock Physics Team

    2017-06-01

    The dynamic response of polymeric materials has been widely studied with the effects of degree of crystallinity, strain rate, temperature and sample size being commonly reported. This study uses a simple PMMA structure, a right cylindrical sample, with structural features such as holes. The features are added and varied in a systematic fashion. Samples were dynamically loaded using a Split Hopkinson Pressure Bar up to failure. The resulting stress-strain curves are presented showing the change in sample response. The strain to failure is shown to increase initially with the presence of holes, while failure stress is relatively unaffected. The fracture patterns seen in the failed samples change, with tensile cracks, Hertzian cones, and shear effects being dominant for different hole sizes and geometries. The samples were prepared by laser cutting and checked for residual stress before the experiment. The data are used to validate predictive models in which material, structure and damage are included. The Institute of Shock Physics acknowledges the support of Imperial College London and the Atomic Weapons Establishment.

  5. Investigating textural controls on Archie's porosity exponent using process-based, pore-scale modelling

    NASA Astrophysics Data System (ADS)

    Niu, Q.; Zhang, C.

    2017-12-01

    Archie's law is an important empirical relationship linking the electrical resistivity of geological materials to their porosity. It has been found experimentally that the porosity exponent m in Archie's law in sedimentary rocks might be related to the degree of cementation, and therefore m is termed the "cementation factor" in most of the literature. Although the exponent has been known for many years, a well-accepted physical interpretation of it is still lacking. Some theoretical and experimental evidence has also shown that m may be controlled by the particle and/or pore shape. In this study, we conduct pore-scale modeling of the porosity exponent that incorporates different geological processes. The evolution of m for eight synthetic samples with different particle sizes and shapes is calculated during two geological processes, i.e., compaction and cementation. The numerical results show that in dilute conditions, m is controlled by the particle shape. As the samples deviate from dilute conditions, m increases gradually due to the strong interaction between particles. When the samples are at static equilibrium, m is noticeably larger than its value in dilute conditions. The numerical simulation results also show that both geological compaction and cementation induce a significant increase in m. In addition, the geometric characteristics of these samples (e.g., pore space/throat size, and their distributions) during compaction and cementation are also calculated. Preliminary analysis shows a unique correlation between the pore size broadness and porosity exponent for all eight samples. However, such a correlation is not found between m and other geometric characteristics.
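
    For readers unfamiliar with the porosity exponent, a minimal sketch of how m is read off from a resistivity measurement may help. It assumes the common form of Archie's law for a fully saturated rock with unit tortuosity factor, F = porosity^(-m), where the formation factor F is the ratio of bulk resistivity to pore-fluid resistivity. The function names and the numerical values are hypothetical and are not taken from the study.

```python
import math

def formation_factor(porosity, m):
    """Archie's law with unit tortuosity factor: F = porosity ** (-m)."""
    return porosity ** (-m)

def porosity_exponent(porosity, f):
    """Invert Archie's law for the porosity exponent m."""
    return -math.log(f) / math.log(porosity)

# Hypothetical sample: 20% porosity, bulk resistivity 25 times the fluid's
phi = 0.20
F = 25.0
m = porosity_exponent(phi, F)
print(f"porosity exponent m = {m:.2f}")
```

    Because m appears as an exponent, a modest rise in m at fixed porosity produces a large rise in the predicted resistivity, which is why its physical controls matter.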

  6. Drying regimes in homogeneous porous media from macro- to nanoscale

    NASA Astrophysics Data System (ADS)

    Thiery, J.; Rodts, S.; Weitz, D. A.; Coussot, P.

    2017-07-01

    Magnetic resonance imaging visualization down to nanometric liquid films in model porous media with pore sizes from micro- to nanometers enables one to fully characterize the physical mechanisms of drying. For pore size larger than a few tens of nanometers, we identify an initial constant drying rate period, probing homogeneous desaturation, followed by a falling drying rate period. This second period is associated with the development of a gradient in saturation underneath the sample free surface that initiates the inward recession of the contact line. During this latter stage, the drying rate varies in accordance with vapor diffusion through the dry porous region, possibly affected by the Knudsen effect for small pore size. However, we show that for sufficiently small pore size and/or saturation the drying rate is increasingly reduced by the Kelvin effect. Subsequently, we demonstrate that this effect governs the kinetics of evaporation in nanopores as a homogeneous desaturation occurs. Eventually, under our experimental conditions, we show that the saturation unceasingly decreases in a homogeneous manner throughout the wet regions of the medium regardless of pore size or drying regime considered. This finding suggests the existence of continuous liquid flow towards the interface of higher evaporation, down to very low saturation or very small pore size. Paradoxically, even if this net flow is unidirectional and capillary driven, it corresponds to a series of diffused local capillary equilibrations over the full height of the sample, which might explain that a simple Darcy's law model does not predict the effect of scaling of the net flow rate on the pore size observed in our tests.

  7. Study the fragment size distribution in dynamic fragmentation of laser shock loading tin

    NASA Astrophysics Data System (ADS)

    He, Weihua; Xin, Jianting; Chu, Genbai; Shui, Min; Xi, Tao; Zhao, Yongqiang; Gu, Yuqiu

    2017-06-01

    Characterizing the distribution of fragment sizes produced by dynamic fragmentation is very important for fundamental science, such as predicting the dynamic response of materials, and for a variety of engineering applications. However, only a few data on fragment mass or size have been obtained because dynamic measurement is very challenging. This paper focuses on the fragment size distribution from the dynamic fragmentation of laser shock-loaded metal. Material ejected from a tin sample with a wedge-shaped groove in the free surface is collected with a soft recovery technique. Via fine post-shot analysis techniques including X-ray micro-tomography and the improved watershed method, it is found that fragments can be well detected. To characterize their size distributions, a random geometric statistics method based on Poisson mixtures was derived for the dynamic heterogeneous fragmentation problem, which leads to a linear combinational exponential distribution. Finally, we examined the size distribution of laser shock-loaded tin with the derived model, and provided comparisons with other state-of-the-art models. The resulting comparisons show that our proposed model provides a more reasonable fit for laser shock-loaded metal.

  8. Probe measurements and numerical model predictions of evolving size distributions in premixed flames

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De Filippo, A.; Sgro, L.A.; Lanzuolo, G.

    2009-09-15

    Particle size distributions (PSDs), measured with a dilution probe and a Differential Mobility Analyzer (DMA), and numerical predictions of these PSDs, based on a model that includes only coagulation or alternatively inception and coagulation, are compared to investigate particle growth processes and possible sampling artifacts in the post-flame region of a C/O = 0.65 premixed laminar ethylene-air flame. Inputs to the numerical model are the PSD measured early in the flame (the initial condition for the aerosol population) and the temperature profile measured along the flame's axial centerline. The measured PSDs are initially unimodal, with a modal mobility diameter of 2.2 nm, and become bimodal later in the post-flame region. The smaller mode is best predicted with a size-dependent coagulation model, which allows some fraction of the smallest particles to escape collisions without resulting in coalescence or coagulation through the size-dependent coagulation efficiency (γ_SD). Instead, when γ = 1 and the coagulation rate is equal to the collision rate for all particles regardless of their size, the coagulation model significantly under predicts the number concentration of both modes and over predicts the size of the largest particles in the distribution compared to the measured size distributions at various heights above the burner. The coagulation (γ_SD) model alone is unable to reproduce well the larger particle mode (mode II). Combining persistent nucleation with size-dependent coagulation brings the predicted PSDs to within experimental error of the measurements, which seems to suggest that surface growth processes are relatively insignificant in these flames. Shifting measured PSDs a few mm closer to the burner surface, generally adopted to correct for probe perturbations, does not produce a better matching between the experimental and the numerical results. (author)

  9. Catching ghosts with a coarse net: use and abuse of spatial sampling data in detecting synchronization

    PubMed Central

    2017-01-01

    Synchronization of population dynamics in different habitats is a frequently observed phenomenon. A common mathematical tool to reveal synchronization is the (cross)correlation coefficient between time courses of values of the population size of a given species where the population size is evaluated from spatial sampling data. The corresponding sampling net or grid is often coarse, i.e. it does not resolve all details of the spatial configuration, and the evaluation error—i.e. the difference between the true value of the population size and its estimated value—can be considerable. We show that this estimation error can make the value of the correlation coefficient very inaccurate or even irrelevant. We consider several population models to show that the value of the correlation coefficient calculated on a coarse sampling grid rarely exceeds 0.5, even if the true value is close to 1, so that the synchronization is effectively lost. We also observe ‘ghost synchronization’ when the correlation coefficient calculated on a coarse sampling grid is close to 1 but in reality the dynamics are not correlated. Finally, we suggest a simple test to check the sampling grid coarseness and hence to distinguish between the true and artifactual values of the correlation coefficient. PMID:28202589
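
    The loss of apparent synchronization through estimation error can be illustrated with a deliberately simplified simulation. It does not use the population models or the spatial sampling grids of the paper; instead it assumes two perfectly synchronized series and adds independent Gaussian estimation error to each as a stand-in for a coarse grid. The series length, error magnitudes, and function names are hypothetical. The sketch shows only the attenuation effect, not the "ghost synchronization" case.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)          # fixed seed for reproducibility
T = 500                         # number of time steps (hypothetical)

# Two habitats driven by the same fluctuation: true dynamics fully synchronized
driver = [rng.gauss(0, 1) for _ in range(T)]
pop_a = [100 + 10 * s for s in driver]
pop_b = [80 + 8 * s for s in driver]

def estimate(series, err_sd):
    """Add independent estimation error, mimicking a coarse sampling grid."""
    return [v + rng.gauss(0, err_sd) for v in series]

est_a = estimate(pop_a, err_sd=15)
est_b = estimate(pop_b, err_sd=12)

r_true = pearson(pop_a, pop_b)
r_est = pearson(est_a, est_b)
print(f"true correlation: {r_true:.3f}, estimated from noisy counts: {r_est:.3f}")
```

    When the error variance exceeds the variance of the true fluctuations, as assumed here, the estimated coefficient falls well below the true value even though the underlying dynamics are identical.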

  10. Fitting models of continuous trait evolution to incompletely sampled comparative data using approximate Bayesian computation.

    PubMed

    Slater, Graham J; Harmon, Luke J; Wegmann, Daniel; Joyce, Paul; Revell, Liam J; Alfaro, Michael E

    2012-03-01

    In recent years, a suite of methods has been developed to fit multiple rate models to phylogenetic comparative data. However, most methods have limited utility at broad phylogenetic scales because they typically require complete sampling of both the tree and the associated phenotypic data. Here, we develop and implement a new, tree-based method called MECCA (Modeling Evolution of Continuous Characters using ABC) that uses a hybrid likelihood/approximate Bayesian computation (ABC)-Markov-Chain Monte Carlo approach to simultaneously infer rates of diversification and trait evolution from incompletely sampled phylogenies and trait data. We demonstrate via simulation that MECCA has considerable power to choose among single versus multiple evolutionary rate models, and thus can be used to test hypotheses about changes in the rate of trait evolution across an incomplete tree of life. We finally apply MECCA to an empirical example of body size evolution in carnivores, and show that there is no evidence for an elevated rate of body size evolution in the pinnipeds relative to terrestrial carnivores. ABC approaches can provide a useful alternative set of tools for future macroevolutionary studies where likelihood-dependent approaches are lacking. © 2011 The Author(s). Evolution © 2011 The Society for the Study of Evolution.

  11. Estimating Animal Abundance in Ground Beef Batches Assayed with Molecular Markers

    PubMed Central

    Hu, Xin-Sheng; Simila, Janika; Platz, Sindey Schueler; Moore, Stephen S.; Plastow, Graham; Meghen, Ciaran N.

    2012-01-01

    Estimating animal abundance in industrial scale batches of ground meat is important for mapping meat products through the manufacturing process and for effectively tracing the finished product during a food safety recall. The processing of ground beef involves a potentially large number of animals from diverse sources in a single product batch, which produces a high heterogeneity in capture probability. In order to estimate animal abundance through DNA profiling of ground beef constituents, two parameter-based statistical models were developed for incidence data. Simulations were applied to evaluate the maximum likelihood estimate (MLE) of a joint likelihood function from multiple surveys, showing superiority in the presence of high capture heterogeneity with small sample sizes, or comparable estimation in the presence of low capture heterogeneity with a large sample size when compared to other existing models. Our model employs the full information on the pattern of the capture-recapture frequencies from multiple samples. We applied the proposed models to estimate animal abundance in six manufacturing beef batches, genotyped using 30 single nucleotide polymorphism (SNP) markers, from a large scale beef grinding facility. Results show that between 411 and 1367 animals were present in six manufacturing beef batches. These estimates are informative as a reference for improving recall processes and tracing finished meat products back to source. PMID:22479559

  12. Effects of Melt Convection and Solid Transport on Macrosegregation and Grain Structure in Equiaxed Al-Cu Alloys

    NASA Technical Reports Server (NTRS)

    Rerko, Rodney S.; deGroh, Henry C., III; Beckermann, Christoph

    2000-01-01

    Macrosegregation in metal casting can be caused by thermal and solutal melt convection, and the transport of unattached solid crystals resulting from nucleation in the bulk liquid or dendrite fragmentation. To develop a comprehensive numerical model for the casting of alloys, an experimental study has been conducted to generate benchmark data with which such a solidification model could be tested. The objectives were: (1) experimentally study the effects of solid transport and thermosolutal convection on macrosegregation and grain size; and (2) provide a complete set of boundary conditions, temperature data, segregation data, and grain size data to validate numerical models. Through the control of end cooling and side wall heating, radial temperature gradients in the sample and furnace were minimized. Thus the vertical crucible wall was adiabatic. Samples at room temperature were 24 cc and 95 mm long. The alloys used were Al-1 wt. pct. Cu and Al-10 wt. pct. Cu; the starting point for solidification was isothermal at 710 and 685 C respectively. To induce an equiaxed structure, various amounts of the grain refiner TiB2 were added. Samples were cooled either from the top or from the bottom. Several trends in the data stand out. In attempting to model these experiments, concentrating on these trends or differences may be beneficial.

  13. Accuracy or precision: Implications of sample design and methodology on abundance estimation

    USGS Publications Warehouse

    Kowalewski, Lucas K.; Chizinski, Christopher J.; Powell, Larkin A.; Pope, Kevin L.; Pegg, Mark A.

    2015-01-01

    Sampling by spatially replicated counts (point-count) is an increasingly popular method of estimating population size of organisms. Challenges exist when sampling by the point-count method, and it is often impractical to sample the entire area of interest and impossible to detect every individual present. Ecologists encounter logistical limitations that force them to sample either few large sample units or many small sample units, introducing biases to sample counts. We generated a computer environment and simulated sampling scenarios to test the role of number of samples, sample unit area, number of organisms, and distribution of organisms in the estimation of population sizes using N-mixture models. Many sample units of small area provided estimates that were consistently closer to true abundance than sample scenarios with few sample units of large area. However, sample scenarios with few sample units of large area provided more precise abundance estimates than abundance estimates derived from sample scenarios with many sample units of small area. It is important to consider accuracy and precision of abundance estimates during the sample design process with study goals and objectives fully recognized; in practice, however, and with consequence, consideration of accuracy and precision of abundance estimates is often an afterthought that occurs during the data analysis process.

  14. Technical note: Alternatives to reduce adipose tissue sampling bias.

    PubMed

    Cruz, G D; Wang, Y; Fadel, J G

    2014-10-01

    Understanding the mechanisms by which nutritional and pharmaceutical factors can manipulate adipose tissue growth and development in production animals has direct and indirect effects on the profitability of an enterprise. Adipocyte cellularity (number and size) is a key biological response that is commonly measured in animal science research. The variability and sampling of adipocyte cellularity within a muscle has been addressed in previous studies, but no attempt to critically investigate these issues has been proposed in the literature. The present study evaluated 2 sampling techniques (random and systematic) in an attempt to minimize sampling bias and to determine the minimum number of samples from 1 to 15 needed to represent the overall adipose tissue in the muscle. Both sampling procedures were applied on adipose tissue samples dissected from 30 longissimus muscles from cattle finished either on grass or grain. Briefly, adipose tissue samples were fixed with osmium tetroxide, and size and number of adipocytes were determined by a Coulter Counter. These results were then fit in a finite mixture model to obtain distribution parameters of each sample. To evaluate the benefits of increasing number of samples and the advantage of the new sampling technique, the concept of acceptance ratio was used; simply stated, the higher the acceptance ratio, the better the representation of the overall population. As expected, a great improvement in the estimation of the overall adipocyte cellularity parameters was observed using both sampling techniques when the number of samples increased from 1 to 15, considering both techniques' acceptance ratio increased from approximately 3 to 25%. When comparing sampling techniques, the systematic procedure slightly improved parameter estimation. The results suggest that more detailed research using other sampling techniques may provide better estimates for minimum sampling.

  15. A Model for Hydraulic Properties Based on Angular Pores with Lognormal Size Distribution

    NASA Astrophysics Data System (ADS)

    Durner, W.; Diamantopoulos, E.

    2014-12-01

    Soil water retention and unsaturated hydraulic conductivity curves are mandatory for modeling water flow in soils. It is a common approach to measure a few points of the water retention curve and to calculate the hydraulic conductivity curve by assuming that the soil can be represented as a bundle of capillary tubes. Both curves are then used to predict water flow at larger spatial scales. However, the predictive power of these curves is often very limited. This can be very easily illustrated if we measure the soil hydraulic properties (SHPs) for a drainage experiment and then use these properties to predict the water flow in the case of imbibition. Further complications arise from the incomplete wetting of water at the solid matrix which results in finite values of the contact angles between the solid-water-air interfaces. To address these problems we present a physically-based model for hysteretic SHPs. This model is based on bundles of angular pores. Hysteresis for individual pores is caused by (i) different snap-off pressures during filling and emptying of single angular pores and (ii) different advancing and receding contact angles for fluids that are not perfectly wetting. We derive a model of hydraulic conductivity as a function of contact angle by assuming flow perpendicular to pore cross sections and present closed-form expressions for both the sample scale water retention and hydraulic conductivity function by assuming a log-normal statistical distribution of pore size. We tested the new model against drainage and imbibition experiments for various sandy materials which were conducted with various liquids of differing wettability. The model described both imbibition and drainage experiments very well by assuming a unique pore size distribution of the sample and a zero contact angle for the perfectly wetting liquid. Eventually, we see the possibility to relate the particle size distribution with a model which describes the SHPs.

  16. Optimizing occupational exposure measurement strategies when estimating the log-scale arithmetic mean value--an example from the reinforced plastics industry.

    PubMed

    Lampa, Erik G; Nilsson, Leif; Liljelind, Ingrid E; Bergdahl, Ingvar A

    2006-06-01

    When assessing occupational exposures, repeated measurements are in most cases required. Repeated measurements are more resource intensive than a single measurement, so careful planning of the measurement strategy is necessary to assure that resources are spent wisely. The optimal strategy depends on the objectives of the measurements. Here, two different models of random effects analysis of variance (ANOVA) are proposed for the optimization of measurement strategies by the minimization of the variance of the estimated log-transformed arithmetic mean value of a worker group, i.e. the strategies are optimized for precise estimation of that value. The first model is a one-way random effects ANOVA model. For that model it is shown that the best precision in the estimated mean value is always obtained by including as many workers as possible in the sample while restricting the number of replicates to two or at most three regardless of the size of the variance components. The second model introduces the 'shared temporal variation' which accounts for those random temporal fluctuations of the exposure that the workers have in common. It is shown for that model that the optimal sample allocation depends on the relative sizes of the between-worker component and the shared temporal component, so that if the between-worker component is larger than the shared temporal component more workers should be included in the sample and vice versa. The results are illustrated graphically with an example from the reinforced plastics industry. If there exists a shared temporal variation at a workplace, that variability needs to be accounted for in the sampling design and the more complex model is recommended.
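
    The trade-off between workers and replicates can be sketched with the variance of a plain grand mean under a balanced design. This is a simplification, not the authors' criterion: the abstract optimizes the variance of the estimated log-transformed arithmetic mean, which also requires estimating the variance components (one reason at least two replicates are needed), whereas the sketch below treats the components as known. It further assumes that all workers are measured on the same occasions when a shared temporal component is present. The function name and all numerical values are hypothetical.

```python
def variance_of_grand_mean(k, n, var_between, var_within, var_shared_time=0.0):
    """Variance of the grand mean for k workers measured on n common occasions.

    Model assumed for illustration: y_ij = mu + b_i + t_j + e_ij, where b_i is
    the between-worker effect, t_j the shared temporal effect (set to zero
    variance to recover the one-way model) and e_ij the within-worker residual.
    """
    return var_between / k + var_shared_time / n + var_within / (k * n)

TOTAL = 24  # fixed measurement budget, k * n (hypothetical)

# One-way model: with the budget fixed, more workers always lowers the variance
one_way_many_workers = variance_of_grand_mean(12, 2, var_between=1.0, var_within=0.5)
one_way_few_workers = variance_of_grand_mean(4, 6, var_between=1.0, var_within=0.5)

# Shared temporal variation: the better allocation depends on which component dominates
between_dominates = [variance_of_grand_mean(k, TOTAL // k, 1.0, 0.5, 0.2) for k in (12, 2)]
time_dominates = [variance_of_grand_mean(k, TOTAL // k, 0.2, 0.5, 1.0) for k in (12, 2)]

print("one-way, 12x2 vs 4x6:", one_way_many_workers, one_way_few_workers)
print("between-worker dominant, 12x2 vs 2x12:", between_dominates)
print("shared-time dominant,    12x2 vs 2x12:", time_dominates)
```

    Under these assumptions the allocation with more workers wins when the between-worker component dominates, and the allocation with more occasions wins when the shared temporal component dominates, in line with the qualitative conclusion stated in the abstract.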

  17. One size does not fit all: an examination of low birthweight disparities among a diverse set of racial/ethnic groups.

    PubMed

    Johnelle Sparks, P

    2009-11-01

    To examine disparities in low birthweight using a diverse set of racial/ethnic categories and a nationally representative sample. This research explored the degree to which sociodemographic characteristics, health care access, maternal health status, and health behaviors influence birthweight disparities among seven racial/ethnic groups. Binary logistic regression models were estimated using a nationally representative sample of singleton, normal for gestational age births from 2001 using the ECLS-B, which has an approximate sample size of 7,800 infants. The multiple variable models examine disparities in low birthweight (LBW) for seven racial/ethnic groups, including non-Hispanic white, non-Hispanic black, U.S.-born Mexican-origin Hispanic, foreign-born Mexican-origin Hispanic, other Hispanic, Native American, and Asian mothers. Race-stratified logistic regression models were also examined. In the full sample models, only non-Hispanic black mothers have a LBW disadvantage compared to non-Hispanic white mothers. Maternal WIC usage was protective against LBW in the full models. No prenatal care and adequate plus prenatal care increase the odds of LBW. In the race-stratified models, prenatal care adequacy and high maternal health risks are the only variables that influence LBW for all racial/ethnic groups. The race-stratified models highlight the different mechanisms that are important across the racial/ethnic groups in determining LBW. Differences in the distribution of maternal sociodemographic, health care access, health status, and behavior characteristics by race/ethnicity demonstrate that a single empirical framework may distort associations with LBW for certain racial and ethnic groups. More attention must be given to the specific mechanisms linking maternal risk factors to poor birth outcomes for specific racial/ethnic groups.

  18. Sampling and counting genome rearrangement scenarios

    PubMed Central

    2015-01-01

    Background Even for moderate-size inputs, there is a tremendous number of optimal rearrangement scenarios, regardless of what the model is and which specific question is to be answered. Therefore giving one optimal solution might be misleading and cannot be used for statistical inference. Statistically well-founded methods are necessary to sample uniformly from the solution space, and then a small number of samples is sufficient for statistical inference. Contribution In this paper, we give a mini-review of the state of the art of sampling and counting rearrangement scenarios, focusing on the reversal, DCJ and SCJ models. In addition, we give a Gibbs sampler for sampling most parsimonious labelings of evolutionary trees under the SCJ model. The method has been implemented and tested on real-life data. The software package together with example data can be downloaded from http://www.renyi.hu/~miklosi/SCJ-Gibbs/ PMID:26452124

  19. Does the choice of nucleotide substitution models matter topologically?

    PubMed

    Hoff, Michael; Orf, Stefan; Riehm, Benedikt; Darriba, Diego; Stamatakis, Alexandros

    2016-03-24

    In the context of a master-level programming practical at the computer science department of the Karlsruhe Institute of Technology, we developed and make available an open-source code for testing all 203 possible nucleotide substitution models in the Maximum Likelihood (ML) setting under the common Akaike, corrected Akaike, and Bayesian information criteria. We address the question of whether model selection matters topologically, that is, whether conducting ML inferences under the optimal model, instead of a standard General Time Reversible model, yields different tree topologies. We also assess to what degree models selected and trees inferred under the three standard criteria (AIC, AICc, BIC) differ. Finally, we assess whether the definition of the sample size (#sites versus #sites × #taxa) yields different models and, as a consequence, different tree topologies. We find that all three factors (by order of impact: nucleotide model selection, information criterion used, sample size definition) can yield substantially different final tree topologies (topological difference exceeding 10 %) for approximately 5 % of the tree inferences conducted on the 39 empirical datasets used in our study. We find that using the best-fit nucleotide substitution model may change the final ML tree topology compared to an inference under a default GTR model. The effect is less pronounced when comparing distinct information criteria. Nonetheless, in some cases we did obtain substantial topological differences.
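    As a rough illustration of why the sample size definition can matter, the sketch below evaluates the standard formulas AIC = 2k − 2 ln L, AICc = AIC + 2k(k + 1)/(n − k − 1) and BIC = k ln n − 2 ln L for two choices of n. The function name and all numbers (log-likelihood, parameter count, alignment dimensions) are hypothetical and are not taken from the study or its software.

```python
import math


def information_criteria(log_likelihood, k, n):
    """Return (AIC, AICc, BIC) for k free parameters and sample size n."""
    aic = 2 * k - 2 * log_likelihood
    # The small-sample correction is only defined for n > k + 1.
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_likelihood
    return aic, aicc, bic


# Hypothetical values for one fitted model (assumptions, not study data).
log_l = -12345.6   # maximized log-likelihood
k = 50             # number of free parameters
n_sites = 1000     # alignment length
n_taxa = 20        # number of sequences

for label, n in (("#sites", n_sites), ("#sites x #taxa", n_sites * n_taxa)):
    aic, aicc, bic = information_criteria(log_l, k, n)
    print(f"{label:>15}: AIC={aic:.1f}  AICc={aicc:.1f}  BIC={bic:.1f}")
```

    AIC does not depend on n, whereas the AICc correction shrinks and the BIC penalty grows as n increases, so the ranking of candidate models under AICc or BIC can change with the definition of n.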

  20. Brain size growth in wild and captive chimpanzees (Pan troglodytes).

    PubMed

    Cofran, Zachary

    2018-05-24

    Despite many studies of chimpanzee brain size growth, intraspecific variation is under-explored. Brain size data from chimpanzees of the Taï Forest and the Yerkes Primate Research Center enable a unique glimpse into brain growth variation as age at death is known for individuals, allowing cross-sectional growth curves to be estimated. Because Taï chimpanzees are from the wild but Yerkes apes are captive, potential environmental effects on neural development can also be explored. Previous research has revealed differences in growth and health between wild and captive primates, but such habitat effects have yet to be investigated for brain growth. Here, I use an iterative curve fitting procedure to estimate brain growth and regression parameters for each population, statistically comparing growth models using bootstrapped confidence intervals. Yerkes and Taï brain sizes overlap at all ages, although the sole Taï newborn is at the low end of captive neonatal variation. Growth rate and duration are statistically indistinguishable between the two populations. Resampling the Yerkes sample to match the Taï sample size and age group composition shows that ontogenetic variation in the two groups is remarkably similar despite the latter's limited size. Best fit growth curves for each sample indicate cessation of brain size growth at around 2 years, earlier than has previously been reported. The overall similarity between wild and captive chimpanzees points to the canalization of brain growth in this species. © 2018 Wiley Periodicals, Inc.

  1. Moderating the Covariance Between Family Member’s Substance Use Behavior

    PubMed Central

    Eaves, Lindon J.; Neale, Michael C.

    2014-01-01

    Twin and family studies implicitly assume that the covariation between family members remains constant across differences in age between the members of the family. However, age-specificity in gene expression for shared environmental factors could generate higher correlations between family members who are more similar in age. Cohort effects (cohort × genotype or cohort × common environment) could have the same effects, and both potentially reduce effect sizes estimated in genome-wide association studies where the subjects are heterogeneous in age. In this paper we describe a model in which the covariance between twins and non-twin siblings is moderated as a function of age difference. We describe the details of the model and simulate data using a variety of different parameter values to demonstrate that model fitting returns unbiased parameter estimates. Power analyses are then conducted to estimate the sample sizes required to detect the effects of moderation in a design of twins and siblings. Finally, the model is applied to data on cigarette smoking. We find that (1) the model effectively recovers the simulated parameters, (2) the power is relatively low and therefore requires large sample sizes before small to moderate effect sizes can be found reliably, and (3) the genetic covariance between siblings for smoking behavior decays very rapidly. Result 3 implies that, e.g., genome-wide studies of smoking behavior that use individuals assessed at different ages, or belonging to different birth-year cohorts may have had substantially reduced power to detect effects of genotype on cigarette use. It also implies that significant special twin environmental effects can be explained by age-moderation in some cases. This effect likely contributes to the missing heritability paradox. PMID:24647834

  2. A simulation study on Bayesian Ridge regression models for several collinearity levels

    NASA Astrophysics Data System (ADS)

    Efendi, Achmad; Effrihan

    2017-12-01

    When data are analyzed with a multiple regression model and collinearities are present, one or several predictor variables are usually omitted from the model. However, there are sometimes reasons, for instance medical or economic ones, why all the predictors are important and should be included in the model. Ridge regression is commonly used in such research to cope with collinearity. In this approach, weights for the predictor variables are used in estimating the parameters. Estimation can then follow the likelihood concept. Alternatively, a Bayesian version of the estimation can now be used. This estimation method is less popular than the likelihood one because of some difficulties, computation among them. Nevertheless, with recent improvements in computational methodology, this caveat should no longer be a problem. This paper discusses a simulation process for evaluating the characteristics of Bayesian ridge regression parameter estimates. There are several simulation settings based on a variety of collinearity levels and sample sizes. The results show that the Bayesian method performs better for relatively small sample sizes, and that for other settings it performs similarly to the likelihood method.
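    For orientation, a minimal sketch of the classical ridge estimate (X'X + λI)⁻¹X'y on nearly collinear predictors is given below. It is not the paper's simulation design: the function name, the data-generating values and the penalties are assumptions made for illustration. Under a zero-mean Gaussian prior on the coefficients with variance τ² and Gaussian noise with variance σ², the posterior mean coincides with this estimate for λ = σ²/τ²; whether the paper uses that particular prior is not stated in the abstract.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge estimate (X'X + lam*I)^-1 X'y; no intercept, so center X and y first."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Hypothetical small sample with two nearly collinear predictors.
rng = np.random.default_rng(0)
n = 30
x1 = rng.standard_normal(n)
x2 = x1 + 0.05 * rng.standard_normal(n)   # almost a copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.standard_normal(n)

for lam in (0.0, 1.0, 10.0):
    # lam = 0 is ordinary least squares; larger lam shrinks the coefficients.
    print(lam, ridge_fit(X, y, lam))
```

    With λ = 0 the two coefficients are poorly determined because the predictors carry almost the same information; a positive λ stabilizes them at the cost of some bias.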

  3. An Overview and Evaluation of Recent Machine Learning Imputation Methods Using Cardiac Imaging Data.

    PubMed

    Liu, Yuzhe; Gopalakrishnan, Vanathi

    2017-03-01

    Many clinical research datasets have a large percentage of missing values that directly impacts their usefulness in yielding high accuracy classifiers when used for training in supervised machine learning. While missing value imputation methods have been shown to work well with smaller percentages of missing values, their ability to impute sparse clinical research data can be problem specific. We previously attempted to learn quantitative guidelines for ordering cardiac magnetic resonance imaging during the evaluation for pediatric cardiomyopathy, but missing data significantly reduced our usable sample size. In this work, we sought to determine if increasing the usable sample size through imputation would allow us to learn better guidelines. We first review several machine learning methods for estimating missing data. Then, we apply four popular methods (mean imputation, decision tree, k-nearest neighbors, and self-organizing maps) to a clinical research dataset of pediatric patients undergoing evaluation for cardiomyopathy. Using Bayesian Rule Learning (BRL) to learn ruleset models, we compared the performance of imputation-augmented models versus unaugmented models. We found that all four imputation-augmented models performed similarly to unaugmented models. While imputation did not improve performance, it did provide evidence for the robustness of our learned models.

  4. Effect of the centrifugal force on domain chaos in Rayleigh-Bénard convection.

    PubMed

    Becker, Nathan; Scheel, J D; Cross, M C; Ahlers, Guenter

    2006-06-01

    Experiments and simulations from a variety of sample sizes indicated that the centrifugal force significantly affects the domain-chaos state observed in rotating Rayleigh-Bénard convection-patterns. In a large-aspect-ratio sample, we observed a hybrid state consisting of domain chaos close to the sample center, surrounded by an annulus of nearly stationary nearly radial rolls populated by occasional defects reminiscent of undulation chaos. Although the Coriolis force is responsible for domain chaos, by comparing experiment and simulation we show that the centrifugal force is responsible for the radial rolls. Furthermore, simulations of the Boussinesq equations for smaller aspect ratios neglecting the centrifugal force yielded a domain precession-frequency f ~ ε^μ with μ ≈ 1 as predicted by the amplitude-equation model for domain chaos, but contradicted by previous experiment. Additionally the simulations gave a domain size that was larger than in the experiment. When the centrifugal force was included in the simulation, μ and the domain size were consistent with experiment.

  5. A Bayesian nonparametric method for prediction in EST analysis

    PubMed Central

    Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor

    2007-01-01

    Background Expressed sequence tags (ESTs) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed sequencing the library and, in case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. Results In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; b) the number of new unique genes to be observed in a future sample; c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries, previously studied with frequentist methods, are analyzed in detail. Conclusion The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample. PMID:17868445

  6. Injection molding of iPP samples in controlled conditions and resulting morphology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sessa, Nino, E-mail: ninosessa.ns@gmail.com; De Santis, Felice, E-mail: fedesantis@unisa.it; Pantani, Roberto, E-mail: rpantani@unisa.it

    2015-12-17

    Injection molded parts are being driven down in size and weight, especially for electronic applications. In this work, an investigation was carried out on the process of injection molding of thin iPP samples and on the morphology of these parts. Melt flow in the mold cavity was analyzed and described with a mathematical model. The influence of mold temperature and injection pressure was analyzed. Sample orientation was studied using optical microscopy.

  7. POWER ANALYSIS FOR COMPLEX MEDIATIONAL DESIGNS USING MONTE CARLO METHODS

    PubMed Central

    Thoemmes, Felix; MacKinnon, David P.; Reiser, Mark R.

    2013-01-01

    Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been addressed in the literature. We describe a general framework for power analyses for complex mediational models. The approach is based on the well known technique of generating a large number of samples in a Monte Carlo study, and estimating power as the percentage of cases in which an estimate of interest is significantly different from zero. Examples of power calculation for commonly used mediational models are provided. Power analyses for the single mediator, multiple mediators, three-path mediation, mediation with latent variables, moderated mediation, and mediation in longitudinal designs are described. Annotated sample syntax for Mplus is appended and tabled values of required sample sizes are shown for some models. PMID:23935262
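    The Monte Carlo idea described above can be sketched for the simplest case, a single-mediator model X → M → Y. The sketch is an illustration only and does not reproduce the appended Mplus syntax: the function names, path values, unit error variances, the joint-significance criterion (both the a and b paths significant) and the normal critical value of 1.96 are all assumptions made here.

```python
import numpy as np


def ols_t(X, y):
    """t statistics of the OLS coefficients for design matrix X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta / np.sqrt(np.diag(cov))


def mediation_power(n, a, b, c_prime=0.0, reps=2000, crit=1.96, seed=0):
    """Share of simulated samples in which both the a and b paths are significant."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        m = a * x + rng.standard_normal(n)                 # mediator model
        y = b * m + c_prime * x + rng.standard_normal(n)   # outcome model
        ones = np.ones(n)
        t_a = ols_t(np.column_stack([ones, x]), m)[1]
        t_b = ols_t(np.column_stack([ones, m, x]), y)[1]
        if abs(t_a) > crit and abs(t_b) > crit:
            hits += 1
    return hits / reps


# Hypothetical path coefficients; power is estimated for several sample sizes.
for n in (50, 100, 200):
    print(n, mediation_power(n, a=0.3, b=0.3))
```

    Repeating the call over a grid of sample sizes and reading off the smallest n that reaches the target power (for example 0.80) gives a required sample size for the assumed effect sizes.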

  8. Element enrichment factor calculation using grain-size distribution and functional data regression.

    PubMed

    Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R

    2015-01-01

    In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Reliability of dose volume constraint inference from clinical data.

    PubMed

    Lutz, C M; Møller, D S; Hoffmann, L; Knap, M M; Alber, M

    2017-04-21

    Dose volume histogram points (DVHPs) frequently serve as dose constraints in radiotherapy treatment planning. An experiment was designed to investigate the reliability of DVHP inference from clinical data for multiple cohort sizes and complication incidence rates. The experimental background was radiation pneumonitis in non-small cell lung cancer and the DVHP inference method was based on logistic regression. From 102 NSCLC real-life dose distributions and a postulated DVHP model, an 'ideal' cohort was generated where the most predictive model was equal to the postulated model. A bootstrap and a Cohort Replication Monte Carlo (CoRepMC) approach were applied to create 1000 equally sized populations each. The cohorts were then analyzed to establish inference frequency distributions. This was applied to nine scenarios for cohort sizes of 102 (1), 500 (2) to 2000 (3) patients (by sampling with replacement) and three postulated DVHP models. The Bootstrap was repeated for a 'non-ideal' cohort, where the most predictive model did not coincide with the postulated model. The Bootstrap produced chaotic results for all models of cohort size 1 for both the ideal and non-ideal cohorts. For cohort size 2 and 3, the distributions for all populations were more concentrated around the postulated DVHP. For the CoRepMC, the inference frequency increased with cohort size and incidence rate. Correct inference rates >85 % were only achieved by cohorts with more than 500 patients. Both Bootstrap and CoRepMC indicate that inference of the correct or approximate DVHP for typical cohort sizes is highly uncertain. CoRepMC results were less spurious than Bootstrap results, demonstrating the large influence that randomness in dose-response has on the statistical analysis.
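    The resampling step (drawing populations of a chosen size from a base cohort with replacement) can be sketched as follows. This is only an illustration of bootstrap resampling applied to a simple statistic, the complication incidence; it does not implement the DVHP inference or the CoRepMC method, and the function name and the base cohort of 102 patients with 20 events are hypothetical.

```python
import numpy as np


def bootstrap_incidence(outcomes, cohort_size, n_boot=1000, seed=0):
    """Incidence in n_boot cohorts drawn with replacement from `outcomes`."""
    rng = np.random.default_rng(seed)
    outcomes = np.asarray(outcomes)
    # Each row holds the patient indices of one resampled cohort.
    idx = rng.integers(0, outcomes.size, size=(n_boot, cohort_size))
    return outcomes[idx].mean(axis=1)


# Hypothetical base cohort: 102 patients, 20 with the complication (1 = event).
base = np.array([1] * 20 + [0] * 82)

for size in (102, 500, 2000):
    rates = bootstrap_incidence(base, size)
    lo, hi = np.percentile(rates, [2.5, 97.5])
    print(f"cohort size {size}: 95% of resampled incidences in [{lo:.3f}, {hi:.3f}]")
```

    The spread of the resampled statistic narrows as the cohort size grows, which is the general mechanism behind more concentrated inference distributions for larger cohorts.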

  10. Reliability of dose volume constraint inference from clinical data

    NASA Astrophysics Data System (ADS)

    Lutz, C. M.; Møller, D. S.; Hoffmann, L.; Knap, M. M.; Alber, M.

    2017-04-01

    Dose volume histogram points (DVHPs) frequently serve as dose constraints in radiotherapy treatment planning. An experiment was designed to investigate the reliability of DVHP inference from clinical data for multiple cohort sizes and complication incidence rates. The experimental background was radiation pneumonitis in non-small cell lung cancer and the DVHP inference method was based on logistic regression. From 102 NSCLC real-life dose distributions and a postulated DVHP model, an ‘ideal’ cohort was generated where the most predictive model was equal to the postulated model. A bootstrap and a Cohort Replication Monte Carlo (CoRepMC) approach were applied to create 1000 equally sized populations each. The cohorts were then analyzed to establish inference frequency distributions. This was applied to nine scenarios for cohort sizes of 102 (1), 500 (2) to 2000 (3) patients (by sampling with replacement) and three postulated DVHP models. The Bootstrap was repeated for a ‘non-ideal’ cohort, where the most predictive model did not coincide with the postulated model. The Bootstrap produced chaotic results for all models of cohort size 1 for both the ideal and non-ideal cohorts. For cohort size 2 and 3, the distributions for all populations were more concentrated around the postulated DVHP. For the CoRepMC, the inference frequency increased with cohort size and incidence rate. Correct inference rates  >85 % were only achieved by cohorts with more than 500 patients. Both Bootstrap and CoRepMC indicate that inference of the correct or approximate DVHP for typical cohort sizes is highly uncertain. CoRepMC results were less spurious than Bootstrap results, demonstrating the large influence that randomness in dose-response has on the statistical analysis.

  11. Detection probability in aerial surveys of feral horses

    USGS Publications Warehouse

    Ransom, Jason I.

    2011-01-01

    Observation bias pervades data collected during aerial surveys of large animals, and although some sources can be mitigated with informed planning, others must be addressed using valid sampling techniques that carefully model detection probability. Nonetheless, aerial surveys are frequently employed to count large mammals without applying such methods to account for heterogeneity in visibility of animal groups on the landscape. This often leaves managers and interest groups at odds over decisions that are not adequately informed. I analyzed detection of feral horse (Equus caballus) groups by dual independent observers from 24 fixed-wing and 16 helicopter flights using mixed-effect logistic regression models to investigate potential sources of observation bias. I accounted for observer skill, population location, and aircraft type in the model structure and analyzed the effects of group size, sun effect (position related to observer), vegetation type, topography, cloud cover, percent snow cover, and observer fatigue on detection of horse groups. The most important model-averaged effects for both fixed-wing and helicopter surveys included group size (fixed-wing: odds ratio = 0.891, 95% CI = 0.850–0.935; helicopter: odds ratio = 0.640, 95% CI = 0.587–0.698) and sun effect (fixed-wing: odds ratio = 0.632, 95% CI = 0.350–1.141; helicopter: odds ratio = 0.194, 95% CI = 0.080–0.470). Observer fatigue was also an important effect in the best model for helicopter surveys, with detection probability declining after 3 hr of survey time (odds ratio = 0.278, 95% CI = 0.144–0.537). Biases arising from sun effect and observer fatigue can be mitigated by pre-flight survey design. Other sources of bias, such as those arising from group size, topography, and vegetation can only be addressed by employing valid sampling techniques such as double sampling, mark–resight (batch-marked animals), mark–recapture (uniquely marked and identifiable animals), sightability bias correction models, and line transect distance sampling; however, some of these techniques may still only partially correct for negative observation biases.

  12. Recent Structural Evolution of Early-Type Galaxies: Size Growth from z = 1 to z = 0

    NASA Astrophysics Data System (ADS)

    van der Wel, Arjen; Holden, Bradford P.; Zirm, Andrew W.; Franx, Marijn; Rettura, Alessandro; Illingworth, Garth D.; Ford, Holland C.

    2008-11-01

    Strong size and internal density evolution of early-type galaxies between z ~ 2 and the present has been reported by several authors. Here we analyze samples of nearby and distant (z ~ 1) galaxies with dynamically measured masses in order to confirm the previous, model-dependent results and constrain the uncertainties that may play a role. Velocity dispersion (σ) measurements are taken from the literature for 50 morphologically selected 0.8 < z < 1.2 field and cluster early-type galaxies with typical masses M_dyn = 2 × 10^11 M⊙. Sizes (R_eff) are determined with Advanced Camera for Surveys imaging. We compare the distant sample with a large sample of nearby (0.04 < z < 0.08) early-type galaxies extracted from the Sloan Digital Sky Survey for which we determine sizes, masses, and densities in a consistent manner, using simulations to quantify systematic differences between the size measurements of nearby and distant galaxies. We find a highly significant difference between the σ - R_eff distributions of the nearby and distant samples, regardless of sample selection effects. The implied evolution in R_eff at fixed mass between z = 1 and the present is a factor of 1.97 ± 0.15. This is in qualitative agreement with semianalytic models; however, the observed evolution is much faster than the predicted evolution. Our results reinforce and are quantitatively consistent with previous, photometric studies that found size evolution of up to a factor of 5 since z ~ 2. A combination of structural evolution of individual galaxies through the accretion of companions and the continuous formation of early-type galaxies through increasingly gas-poor mergers is one plausible explanation of the observations.
Based on observations with the Hubble Space Telescope, obtained at the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS5-26555, and observations made with the Spitzer Space Telescope, which is operated by the Jet Propulsion Laboratory, California Institute of Technology, under NASA contract 1407. Based on observations collected at the European Southern Observatory, Chile (169.A-0458). Some of the data presented herein were obtained at the W. M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W.M. Keck Foundation.

  13. Modified Distribution-Free Goodness-of-Fit Test Statistic.

    PubMed

    Chun, So Yeon; Browne, Michael W; Shapiro, Alexander

    2018-03-01

    Covariance structure analysis and its structural equation modeling extensions have become one of the most widely used methodologies in social sciences such as psychology, education, and economics. An important issue in such analysis is to assess the goodness of fit of a model under analysis. One of the most popular test statistics used in covariance structure analysis is the asymptotically distribution-free (ADF) test statistic introduced by Browne (Br J Math Stat Psychol 37:62-83, 1984). The ADF statistic can be used to test models without any specific distribution assumption (e.g., multivariate normal distribution) of the observed data. Despite its advantage, it has been shown in various empirical studies that unless sample sizes are extremely large, this ADF statistic could perform very poorly in practice. In this paper, we provide a theoretical explanation for this phenomenon and further propose a modified test statistic that improves the performance in samples of realistic size. The proposed statistic deals with the possible ill-conditioning of the involved large-scale covariance matrices.

  14. Computational Process Modeling for Additive Manufacturing

    NASA Technical Reports Server (NTRS)

    Bagg, Stacey; Zhang, Wei

    2014-01-01

    Computational Process and Material Modeling of Powder Bed additive manufacturing of IN 718. Optimize material build parameters with reduced time and cost through modeling. Increase understanding of build properties. Increase reliability of builds. Decrease time to adoption of process for critical hardware. Potential to decrease post-build heat treatments. Conduct single-track and coupon builds at various build parameters. Record build parameter information and QM Meltpool data. Refine Applied Optimization powder bed AM process model using data. Report thermal modeling results. Conduct metallography of build samples. Calibrate STK models using metallography findings. Run STK models using AO thermal profiles and report STK modeling results. Validate modeling with additional build. Photodiode Intensity measurements highly linear with power input. Melt Pool Intensity highly correlated to Melt Pool Size. Melt Pool size and intensity increase with power. Applied Optimization will use data to develop powder bed additive manufacturing process model.

  15. Stochastic theory of size exclusion chromatography by the characteristic function approach.

    PubMed

    Dondi, Francesco; Cavazzini, Alberto; Remelli, Maurizio; Felinger, Attila; Martin, Michel

    2002-01-18

    A general stochastic theory of size exclusion chromatography (SEC) able to account for size dependence on both pore ingress and egress processes, moving zone dispersion and pore size distribution, was developed. The relationship between stochastic-chromatographic and batch equilibrium conditions is discussed and the fundamental role of the 'ergodic' hypothesis in establishing a link between them is emphasized. SEC models are solved by means of the characteristic function method and chromatographic parameters like plate height, peak skewness and excess are derived. The peak shapes are obtained by numerical inversion of the characteristic function under the most general conditions of the exploited models. Separate size effects on pore ingress and pore egress processes are investigated and their effects on both retention selectivity and efficiency are clearly shown. The peak splitting phenomenon and peak tailing due to incomplete sample sorption near the exclusion limit are discussed. An SEC model for columns with two types of pores is discussed and several effects on retention selectivity and efficiency coming from pore size differences and their relative abundance are singled out. The relevance of moving zone dispersion on separation is investigated. The present approach proves to be general and able to account for more complex SEC conditions such as continuous pore size distributions and mixed retention mechanisms.

  16. [Sequential sampling plans to Orthezia praelonga Douglas (Hemiptera: Sternorrhyncha, Ortheziidae) in citrus].

    PubMed

    Costa, Marilia G; Barbosa, José C; Yamamoto, Pedro T

    2007-01-01

    Sequential sampling is characterized by the use of samples of variable size, and has the advantage of reducing sampling time and costs compared with fixed-size sampling. To support adequate management of orthezia, sequential sampling plans were developed for orchards under low and high infestation. Data were collected in Matão, SP, in commercial stands of the orange variety 'Pêra Rio', at five, nine and 15 years of age. Twenty samplings were performed over the whole area of each stand by recording the presence or absence of scales on plants, with plots consisting of ten plants. After observing that in all three stands the scale population was distributed according to the contagious model, fitting the Negative Binomial Distribution in most samplings, two sequential sampling plans were constructed according to the Sequential Likelihood Ratio Test (SLRT). To construct these plans an economic threshold of 2% was adopted and the type I and II error probabilities were fixed at alpha = beta = 0.10. Results showed that the maximum numbers of samples expected to determine the need for control were 172 and 76 samples for stands with low and high infestation, respectively.
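    To show how a sequential plan of this kind reaches a decision, the sketch below implements Wald's sequential probability ratio test for a simple presence/absence (binomial) outcome. This is a deliberate simplification: the study fitted a negative binomial distribution, and the function names, the hypothesized infestation levels p0 = 1% and p1 = 4% bracketing the 2% threshold, and the example data are assumptions made here. Only alpha = beta = 0.10 is taken from the abstract.

```python
import math


def sprt_lines(p0, p1, alpha=0.10, beta=0.10):
    """Slope and intercepts of Wald's decision lines for a binomial proportion."""
    g = math.log(p1 * (1 - p0) / (p0 * (1 - p1)))
    slope = math.log((1 - p0) / (1 - p1)) / g
    upper = math.log((1 - beta) / alpha) / g   # intercept of the upper line
    lower = math.log(beta / (1 - alpha)) / g   # intercept of the lower line
    return slope, lower, upper


def sprt_decide(observations, p0, p1, alpha=0.10, beta=0.10):
    """Scan 0/1 observations in order; stop at the first boundary crossing."""
    slope, lower, upper = sprt_lines(p0, p1, alpha, beta)
    total = 0
    for n, infested in enumerate(observations, start=1):
        total += infested
        if total >= upper + slope * n:
            return "control needed", n
        if total <= lower + slope * n:
            return "no control needed", n
    return "continue sampling", len(observations)


# Hypothetical inspection records: 1 = scales present on the sampled unit.
records = [0] * 30 + [1] + [0] * 100
print(sprt_decide(records, p0=0.01, p1=0.04))
```

    Sampling stops as soon as the cumulative count of infested units leaves the band between the two lines, which is why the number of samples needed varies from field to field.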

  17. The effect of salt crust on the thermal conductivity of one sample of fluvial particulate materials under Martian atmospheric pressures

    NASA Astrophysics Data System (ADS)

    Presley, Marsha A.; Craddock, Robert A.; Zolotova, Natalya

    2009-11-01

    A line-heat source apparatus was used to measure thermal conductivities of a lightly cemented fluvial sediment (salinity = 1.1 g·kg⁻¹), and the same sample with the cement bonds almost completely disrupted, under low-pressure carbon dioxide atmospheres. The thermal conductivities of the cemented sample were approximately 3× higher, over the range of atmospheric pressures tested, than the thermal conductivities of the same sample after the cement bonds were broken. A thermal conductivity-derived particle size was determined for each sample by comparing these thermal conductivity measurements to previous data that demonstrated the dependence of thermal conductivity on particle size. Actual particle-size distributions were determined via physical separation through brass sieves. When uncemented, 87% of the particles were less than 125 μm in diameter, with 60% of the sample being less than 63 μm in diameter. As much as 35% of the cemented sample was composed of conglomerate particles with diameters greater than 500 μm. The thermal conductivities of the cemented sample were most similar to those of 500-μm glass beads, whereas the thermal conductivities of the uncemented sample were most similar to those of 75-μm glass beads. This study demonstrates that even a small amount of salt cement can significantly increase the thermal conductivity of particulate materials, as predicted by thermal modeling estimates by previous investigators.

  18. Improved variance estimation of classification performance via reduction of bias caused by small sample size.

    PubMed

    Wickenberg-Bolin, Ulrika; Göransson, Hanna; Fryknäs, Mårten; Gustafsson, Mats G; Isaksson, Anders

    2006-03-13

    Supervised learning for classification of cancer employs a set of design examples to learn how to discriminate between tumors. In practice it is crucial to confirm that the classifier is robust with good generalization performance to new examples, or at least that it performs better than random guessing. A suggested alternative is to obtain a confidence interval of the error rate using repeated design and test sets selected from available examples. However, it is known that even in the ideal situation of repeated designs and tests with completely novel samples in each cycle, a small test set size leads to a large bias in the estimate of the true variance between design sets. Therefore, different methods for small-sample performance estimation, such as a recently proposed procedure called Repeated Random Sampling (RSS), are also expected to result in heavily biased estimates, which in turn translate into biased confidence intervals. Here we explore such biases and develop a refined algorithm called Repeated Independent Design and Test (RIDT). Our simulations reveal that repeated designs and tests based on resampling in a fixed bag of samples yield a biased variance estimate. We also demonstrate that it is possible to obtain an improved variance estimate by means of a procedure that explicitly models how this bias depends on the number of samples used for testing. For the special case of repeated designs and tests using new samples for each design and test, we present an exact analytical expression for how the expected value of the bias decreases with the size of the test set. We show that via modeling and subsequent reduction of the small-sample bias, it is possible to obtain an improved estimate of the variance of classifier performance between design sets. However, the uncertainty of the variance estimate is large in the simulations performed, indicating that the method in its present form cannot be directly applied to small data sets.

  19. 10 CFR 431.135 - Units to be tested.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... EQUIPMENT Automatic Commercial Ice Makers Test Procedures § 431.135 Units to be tested. For each basic model of automatic commercial ice maker selected for testing, a sample of sufficient size shall be selected...

  20. Optimal flexible sample size design with robust power.

    PubMed

    Zhang, Lanju; Cui, Lu; Yang, Bo

    2016-08-30

    It is well recognized that sample size determination is challenging because of uncertainty about the treatment effect size. Several remedies are available in the literature. Group sequential designs start with a sample size based on a conservative (smaller) effect size and allow early stopping at interim looks. Sample size re-estimation designs start with a sample size based on an optimistic (larger) effect size and allow a sample size increase if the observed effect size is smaller than planned. Different opinions favoring one type over the other exist. We propose an optimal approach using an appropriate optimality criterion to select the best design among all the candidate designs. Our results show that (1) for the same type of designs, for example, group sequential designs, there is room for significant improvement through our optimization approach; (2) optimal promising zone designs appear to have no advantages over optimal group sequential designs; and (3) optimal designs with sample size re-estimation deliver the best adaptive performance. We conclude that to deal with the challenge of sample size determination due to effect size uncertainty, an optimal approach can help to select the best design that provides the most robust power across the effect size range of interest. Copyright © 2016 John Wiley & Sons, Ltd.
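
    Why effect size uncertainty matters so much can be seen from the textbook fixed-design calculation for a two-arm comparison of means under a normal approximation. The sketch below is only that standard formula, not the optimization approach proposed by the authors. The function name n_per_arm and the default values for the standard deviation, significance level and power are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def n_per_arm(delta, sigma=1.0, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-sample z-test.

    delta: assumed true difference in means (the effect size).
    sigma: assumed common standard deviation (illustrative default).
    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta) ** 2.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


if __name__ == "__main__":
    # Halving the assumed effect size roughly quadruples the sample size.
    for delta in (0.5, 0.25):
        print(delta, n_per_arm(delta))
```

    Because the required sample size scales with the inverse square of the assumed effect, a modest misjudgment of the effect size changes the planned sample size substantially, which is the difficulty the adaptive designs above aim to address.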
