Science.gov

Sample records for calculated sample size

  1. Sample size calculations.

    PubMed

    Noordzij, Marlies; Dekker, Friedo W; Zoccali, Carmine; Jager, Kitty J

    2011-01-01

    The sample size is the number of patients or other experimental units that need to be included in a study to answer the research question. Pre-study calculation of the sample size is important; if a sample size is too small, one will not be able to detect an effect, while a sample that is too large may be a waste of time and money. Methods to calculate the sample size are explained in statistical textbooks, but because there are many different formulas available, it can be difficult for investigators to decide which method to use. Moreover, these calculations are prone to errors, because small changes in the selected parameters can lead to large differences in the sample size. This paper explains the basic principles of sample size calculations and demonstrates how to perform such a calculation for a simple study design.
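
    As a minimal illustration of the kind of pre-study calculation described above, the sketch below computes the per-group sample size for comparing two means with a two-sided test using the standard normal-approximation formula; the effect size, standard deviation, alpha and power are illustrative placeholders, not values from the article.

    ```python
    # Per-group sample size for comparing two means (normal approximation):
    # n = 2 * (z_{1-alpha/2} + z_{1-beta})^2 * sigma^2 / delta^2
    import math
    from scipy.stats import norm

    def n_per_group_two_means(delta, sigma, alpha=0.05, power=0.80):
        """Approximate per-group n to detect a mean difference `delta`
        with common standard deviation `sigma`."""
        z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test
        z_beta = norm.ppf(power)
        return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

    # Example: detect a 5-unit difference when sigma = 10, alpha = 0.05, power = 0.80
    print(n_per_group_two_means(delta=5, sigma=10))  # 63 per group
    ```

    Halving the detectable difference to 2.5, for example, roughly quadruples the requirement (to 252 per group), which illustrates the abstract's point that small changes in the chosen parameters produce large changes in the sample size.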

  2. How to calculate sample size and why.

    PubMed

    Kim, Jeehyoung; Seo, Bong Soo

    2013-09-01

    Calculating the sample size is essential to reduce the cost of a study and to prove the hypothesis effectively. Referring to pilot studies and previous research studies, we can choose a proper hypothesis and simplify the studies by using a website or Microsoft Excel sheet that contains formulas for calculating sample size in the beginning stage of the study. There are numerous formulas for calculating the sample size for complicated statistics and studies, but most studies can use basic calculating methods for sample size calculation.
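
    A spreadsheet-style formula of the kind mentioned above can be reproduced in a few lines; the sketch below uses the standard normal-approximation formula for comparing two independent proportions, with illustrative inputs that are not from the article.

    ```python
    import math
    from scipy.stats import norm

    def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
        """Approximate per-group n for a two-sided comparison of two proportions."""
        z_alpha = norm.ppf(1 - alpha / 2)
        z_beta = norm.ppf(power)
        return math.ceil((z_alpha + z_beta) ** 2
                         * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2)

    # Example: control rate 20%, expected treatment rate 35%
    print(n_per_group_two_proportions(0.20, 0.35))  # 136 per group
    ```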

  3. How to Calculate Sample Size and Why

    PubMed Central

    Seo, Bong Soo

    2013-01-01

    Why: Calculating the sample size is essential to reduce the cost of a study and to prove the hypothesis effectively. How: Referring to pilot studies and previous research studies, we can choose a proper hypothesis and simplify the studies by using a website or Microsoft Excel sheet that contains formulas for calculating sample size in the beginning stage of the study. More: There are numerous formulas for calculating the sample size for complicated statistics and studies, but most studies can use basic calculating methods for sample size calculation. PMID:24009911

  4. Sample size calculations for prevalent cohort designs.

    PubMed

    Liu, Hao; Shen, Yu; Ning, Jing; Qin, Jing

    2017-02-01

    The cross-sectional prevalent cohort design has drawn considerable interest in studies of the association between risk factors and time-to-event outcomes. The sampling scheme in such a design gives rise to length-biased data that require a specialized analysis strategy but can improve study efficiency. Power and sample size calculation methods are, however, lacking for studies with a prevalent cohort design, and using the formulas developed for traditional survival data may overestimate the sample size. We derive sample size formulas appropriate for cross-sectional prevalent cohort studies under the assumptions of exponentially distributed event times and uniform follow-up. We perform numerical and simulation studies to compare the sample size requirements for achieving the same power between prevalent cohort and incident cohort designs. We also use a large prospective prevalent cohort study to demonstrate the procedure. Using rigorous designs and proper analysis tools, the prospective prevalent cohort design can be more efficient than the incident cohort design with the same total sample size and study duration.

  5. Sample size calculation in metabolic phenotyping studies.

    PubMed

    Billoir, Elise; Navratil, Vincent; Blaise, Benjamin J

    2015-09-01

    The number of samples needed to identify significant effects is a key question in biomedical studies, with consequences on experimental designs, costs and potential discoveries. In metabolic phenotyping studies, sample size determination remains a complex step. This is due particularly to the multiple hypothesis-testing framework and the top-down hypothesis-free approach, with no a priori known metabolic target. Until now, there was no standard procedure available to address this purpose. In this review, we discuss sample size estimation procedures for metabolic phenotyping studies. We release an automated implementation of the Data-driven Sample size Determination (DSD) algorithm for MATLAB and GNU Octave. Original research concerning DSD was published elsewhere. DSD allows the determination of an optimized sample size in metabolic phenotyping studies. The procedure uses analytical data only from a small pilot cohort to generate an expanded data set. The statistical recoupling of variables procedure is used to identify metabolic variables, and their intensity distributions are estimated by Kernel smoothing or log-normal density fitting. Statistically significant metabolic variations are evaluated using the Benjamini-Yekutieli correction and processed for data sets of various sizes. Optimal sample size determination is achieved in a context of biomarker discovery (at least one statistically significant variation) or metabolic exploration (a maximum of statistically significant variations). DSD toolbox is encoded in MATLAB R2008A (Mathworks, Natick, MA) for Kernel and log-normal estimates, and in GNU Octave for log-normal estimates (Kernel density estimates are not robust enough in GNU octave). It is available at http://www.prabi.fr/redmine/projects/dsd/repository, with a tutorial at http://www.prabi.fr/redmine/projects/dsd/wiki. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  6. Considerations when calculating the sample size for an inequality test.

    PubMed

    In, Junyong

    2016-08-01

    Calculating the sample size is a vital step during the planning of a study in order to ensure the desired power for detecting clinically meaningful differences. However, estimating the sample size is not always straightforward. A number of key components should be considered to calculate a suitable sample size. In this paper, general considerations for conducting sample size calculations for inequality tests are summarized.

  7. Considerations when calculating the sample size for an inequality test

    PubMed Central

    2016-01-01

    Calculating the sample size is a vital step during the planning of a study in order to ensure the desired power for detecting clinically meaningful differences. However, estimating the sample size is not always straightforward. A number of key components should be considered to calculate a suitable sample size. In this paper, general considerations for conducting sample size calculations for inequality tests are summarized. PMID:27482308

  8. On power and sample size calculation in ethnic sensitivity studies.

    PubMed

    Zhang, Wei; Sethuraman, Venkat

    2011-01-01

    In ethnic sensitivity studies, it is of interest to know whether the same dose has the same effect over populations in different regions. Glasbrenner and Rosenkranz (2006) proposed a criterion for ethnic sensitivity studies in the context of different dose-exposure models. Their method is liberal in the sense that their sample size will not achieve the target power. We will show that the power function can be easily calculated by numeric integration, and the sample size can be determined by bisection.
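
    The strategy described, evaluating the power function numerically and then inverting it for the sample size by bisection, follows a generic pattern sketched below; the simple one-sided z-test power function stands in for the dose-exposure criterion of the cited work, and all inputs are assumptions for illustration.

    ```python
    from scipy.stats import norm

    def power(n, delta=0.5, sigma=1.0, alpha=0.05):
        """Power of a one-sided one-sample z-test; a stand-in for any power
        function that can be evaluated numerically for a given n."""
        z_alpha = norm.ppf(1 - alpha)
        return norm.sf(z_alpha - delta * n ** 0.5 / sigma)

    def sample_size_by_bisection(target_power=0.80, lo=2, hi=100_000, **kwargs):
        """Smallest integer n with power(n) >= target_power, found by bisection
        (power is monotone increasing in n)."""
        while lo < hi:
            mid = (lo + hi) // 2
            if power(mid, **kwargs) >= target_power:
                hi = mid
            else:
                lo = mid + 1
        return lo

    print(sample_size_by_bisection())  # 25 for delta=0.5, sigma=1, one-sided alpha=0.05
    ```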

  9. Sample size calculation for comparing two negative binomial rates.

    PubMed

    Zhu, Haiyuan; Lakkis, Hassan

    2014-02-10

    The negative binomial model has been increasingly used to model count data in recent clinical trials. It is frequently chosen over the Poisson model for the overdispersed count data commonly seen in clinical trials. One of the challenges of applying the negative binomial model in clinical trial design is sample size estimation. In practice, simulation methods have frequently been used for sample size estimation. In this paper, an explicit formula is developed to calculate the sample size based on the negative binomial model. Depending on the approach used to estimate the variance under the null hypothesis, three variations of the sample size formula are proposed and discussed. Important characteristics of the formula include its accuracy and its ability to explicitly incorporate the dispersion parameter and exposure time. The performance of each variation of the formula is assessed using simulations.
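
    A minimal sketch of one common Wald-type approximation for comparing two negative binomial rates is shown below; it illustrates how the dispersion parameter and exposure time enter the calculation, but it is not necessarily identical to any of the three formula variations proposed in the paper, and the inputs are invented.

    ```python
    import math
    from scipy.stats import norm

    def n_per_group_nb_rates(rate0, rate1, dispersion, exposure,
                             alpha=0.05, power=0.80):
        """Per-group n for a two-sided Wald test on the log rate ratio under a
        negative binomial model; Var(log rate estimate) per subject is
        approximately 1/(exposure*rate) + dispersion."""
        z_alpha = norm.ppf(1 - alpha / 2)
        z_beta = norm.ppf(power)
        var_sum = (1 / (exposure * rate0) + dispersion) \
                  + (1 / (exposure * rate1) + dispersion)
        return math.ceil((z_alpha + z_beta) ** 2 * var_sum
                         / math.log(rate1 / rate0) ** 2)

    # Example: baseline rate 0.8 events/year, 30% reduction, dispersion 0.5, 1 year of exposure
    print(n_per_group_nb_rates(rate0=0.8, rate1=0.56, dispersion=0.5, exposure=1.0))
    ```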

  10. Power and sample size calculations for current status survival analysis.

    PubMed

    Williamson, John M; Lin, Hung-Mo; Kim, Hae-Young

    2009-07-10

    Although sample size calculations have become an important element in the design of research projects, such methods for studies involving current status data are scarce. Here, we propose a method for calculating power and sample size for studies using current status data. This method is based on a Weibull survival model for a two-group comparison. The Weibull model allows the investigator to specify a group difference in terms of a hazards ratio or a failure time ratio. We consider exponential, Weibull and uniformly distributed censoring distributions. We base our power calculations on a parametric approach with the Wald test because it is easy for medical investigators to conceptualize and specify the required input variables. As expected, studies with current status data have substantially less power than studies with the usual right-censored failure time data. Our simulation results demonstrate the merits of these proposed power calculations. Copyright 2009 John Wiley & Sons, Ltd.

  11. Power and Sample Size Calculations for Contrast Analysis in ANCOVA.

    PubMed

    Shieh, Gwowen

    2017-01-01

    Analysis of covariance (ANCOVA) is commonly used in behavioral and educational research to reduce the error variance and improve the power of analysis of variance by adjusting for covariate effects. For planning and evaluating randomized ANCOVA designs, a simple sample-size formula has been proposed to account for the variance deflation factor in the comparison of two treatment groups. The objective of this article is to highlight an overlooked potential problem of the existing approximation and to provide an alternative, exact solution for power and sample size assessments when testing treatment contrasts. Numerical investigations are conducted to reveal the relative performance of the two procedures in accommodating the covariate features that make the ANCOVA design particularly distinctive. The described approach has important advantages over the current method in general applicability, methodological justification, and overall accuracy. To enhance its practical usefulness, computer algorithms are presented to implement the recommended power calculations and sample-size determinations.
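
    To make the variance deflation factor concrete, here is a small sketch of the conventional approximate calculation for a two-group randomized ANCOVA, in which the covariate reduces the error variance by the factor 1 - rho^2; it reflects the approximation the article critiques rather than the exact method it proposes, and the numbers are illustrative.

    ```python
    import math
    from scipy.stats import norm

    def n_per_group_ancova(delta, sigma, rho, alpha=0.05, power=0.80):
        """Approximate per-group n for a two-group ANCOVA: the ordinary
        two-sample formula with the error variance deflated by (1 - rho**2),
        where rho is the covariate-outcome correlation."""
        z_alpha = norm.ppf(1 - alpha / 2)
        z_beta = norm.ppf(power)
        return math.ceil(2 * (z_alpha + z_beta) ** 2
                         * sigma ** 2 * (1 - rho ** 2) / delta ** 2)

    # A covariate with rho = 0.6 cuts the unadjusted requirement of about 63 per group to 41
    print(n_per_group_ancova(delta=5, sigma=10, rho=0.6))
    ```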

  12. GLIMMPSE Lite: Calculating Power and Sample Size on Smartphone Devices

    PubMed Central

    Munjal, Aarti; Sakhadeo, Uttara R.; Muller, Keith E.; Glueck, Deborah H.; Kreidler, Sarah M.

    2014-01-01

    Researchers seeking to develop complex statistical applications for mobile devices face a common set of difficult implementation issues. In this work, we discuss general solutions to the design challenges. We demonstrate the utility of the solutions for a free mobile application designed to provide power and sample size calculations for univariate, one-way analysis of variance (ANOVA), GLIMMPSE Lite. Our design decisions provide a guide for other scientists seeking to produce statistical software for mobile platforms. PMID:25541688

  13. Sample size calculation for the proportional hazards cure model.

    PubMed

    Wang, Songfeng; Zhang, Jiajia; Lu, Wenbin

    2012-12-20

    In clinical trials with time-to-event endpoints, it is not uncommon to see a significant proportion of patients being cured (or long-term survivors), such as trials for non-Hodgkin's lymphoma. The widely used sample size formula derived under the proportional hazards (PH) model may not be appropriate for designing a survival trial with a cure fraction, because the PH model assumption may be violated. To account for a cure fraction, the PH cure model is widely used in practice, where a PH model is used for survival times of uncured patients and a logistic distribution is used for the probability of patients being cured. In this paper, we develop a sample size formula on the basis of the PH cure model by investigating the asymptotic distributions of the standard weighted log-rank statistics under the null and local alternative hypotheses. The derived sample size formula under the PH cure model is more flexible because it can be used to test the differences in the short-term survival and/or the cure fraction. Furthermore, we also investigate, as numerical examples, the impacts of accrual methods and durations of accrual and follow-up periods on sample size calculation. The results show that ignoring the cure rate in sample size calculation can lead to either underpowered or overpowered studies. We evaluate the performance of the proposed formula by simulation studies and provide an example to illustrate its application with the use of data from a melanoma trial. Copyright © 2012 John Wiley & Sons, Ltd.

  14. Stratified Fisher's Exact Test and its Sample Size Calculation

    PubMed Central

    Jung, Sin-Ho

    2013-01-01

    Summary: Chi-squared test has been a popular approach to the analysis of a 2 × 2 table when the sample sizes for the four cells are large. When the large sample assumption does not hold, however, we need an exact testing method such as Fisher's test. When the study population is heterogeneous, we often partition the subjects into multiple strata, so that each stratum consists of homogeneous subjects and hence the stratified analysis has an improved testing power. While Mantel-Haenszel test has been widely used as an extension of the chi-squared test to test on stratified 2 × 2 tables with a large-sample approximation, we have been lacking an extension of Fisher's test for stratified exact testing. In this paper, we discuss an exact testing method for stratified 2 × 2 tables which is simplified to the standard Fisher's test in single 2 × 2 table cases, and propose its sample size calculation method that can be useful for designing a study with rare cell frequencies. PMID:24395208

  15. Stratified Fisher's exact test and its sample size calculation.

    PubMed

    Jung, Sin-Ho

    2014-01-01

    Chi-squared test has been a popular approach to the analysis of a 2 × 2 table when the sample sizes for the four cells are large. When the large sample assumption does not hold, however, we need an exact testing method such as Fisher's test. When the study population is heterogeneous, we often partition the subjects into multiple strata, so that each stratum consists of homogeneous subjects and hence the stratified analysis has an improved testing power. While Mantel-Haenszel test has been widely used as an extension of the chi-squared test to test on stratified 2 × 2 tables with a large-sample approximation, we have been lacking an extension of Fisher's test for stratified exact testing. In this paper, we discuss an exact testing method for stratified 2 × 2 tables that is simplified to the standard Fisher's test in single 2 × 2 table cases, and propose its sample size calculation method that can be useful for designing a study with rare cell frequencies. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Sample size and power calculations with correlated binary data.

    PubMed

    Pan, W

    2001-06-01

    Correlated binary data are common in biomedical studies. Such data can be analyzed using Liang and Zeger's generalized estimating equations (GEE) approach. An attractive point of the GEE approach is that one can use a misspecified working correlation matrix, such as the working independence model (i.e., the identity matrix), and draw (asymptotically) valid statistical inference by using the so-called robust or sandwich variance estimator. In this article we derive some explicit formulas for sample size and power calculations under various common situations. The given formulas are based on using the robust variance estimator in GEE. We believe that these formulas will facilitate the practice in planning two-arm clinical trials with correlated binary outcome data.
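
    For the common special case of equal cluster sizes and exchangeable correlation, a GEE-based sample size for comparing two proportions reduces to the independent-data formula inflated by the usual design effect; the sketch below shows only that special case with made-up inputs, not Pan's general formulas.

    ```python
    import math
    from scipy.stats import norm

    def clusters_per_arm(p1, p2, m, icc, alpha=0.05, power=0.80):
        """Clusters per arm for comparing two proportions with clusters of size m
        and intracluster correlation icc, via the design effect 1 + (m - 1)*icc."""
        z_alpha = norm.ppf(1 - alpha / 2)
        z_beta = norm.ppf(power)
        n_subjects = ((z_alpha + z_beta) ** 2
                      * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2)
        deff = 1 + (m - 1) * icc
        return math.ceil(n_subjects * deff / m)

    # Example: 20% vs 35%, clusters of 10 subjects, ICC = 0.05
    print(clusters_per_arm(0.20, 0.35, m=10, icc=0.05))  # 20 clusters per arm
    ```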

  17. A program to calculate sample size, power, and least detectable relative risk using a programmable calculator.

    PubMed

    Muhm, J M; Olshan, A F

    1989-01-01

    A program for the Hewlett Packard 41 series programmable calculator that determines sample size, power, and least detectable relative risk for comparative studies with independent groups is described. The user may specify any ratio of cases to controls (or exposed to unexposed subjects) and, if calculating least detectable relative risks, may specify whether the study is a case-control or cohort study.

  18. Calculating sample size in trials using historical controls.

    PubMed

    Zhang, Song; Cao, Jing; Ahn, Chul

    2010-08-01

    Makuch and Simon [Sample size considerations for non-randomised comparative studies. J Chronic Dis 1980; 33: 175-81.] developed a sample size formula for historical control trials. When assessing power, they assumed the true control treatment effect to be equal to the observed effect from the historical control group. Many researchers have pointed out that the Makuch-Simon approach does not preserve the nominal power and type I error when considering the uncertainty in the true historical control treatment effect. Our objective is to develop a sample size formula that properly accounts for the underlying randomness in the observations from the historical control group. We reveal the extremely skewed nature of the distributions of power and type I error, obtained over all the random realizations of the historical control data. The skewness motivates us to derive a sample size formula that controls the percentiles, instead of the means, of the power and type I error. A closed-form sample size formula is developed to control arbitrary percentiles of power and type I error for historical control trials. A simulation study further demonstrates that this approach preserves the operational characteristics in a more realistic scenario where the population variances are unknown and replaced by sample variances. The closed-form sample size formula is derived for continuous outcomes. The formula is more complicated for binary or survival time outcomes. We have derived a closed-form sample size formula that controls the percentiles instead of means of power and type I error in historical control trials, which have extremely skewed distributions over all the possible realizations of historical control data.

  19. Sample Size Calculations for Population Size Estimation Studies Using Multiplier Methods With Respondent-Driven Sampling Surveys.

    PubMed

    Fearon, Elizabeth; Chabata, Sungai T; Thompson, Jennifer A; Cowan, Frances M; Hargreaves, James R

    2017-09-14

    While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. Our aim is to guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys so as to reduce the random error around the estimate obtained. The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey.
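
    To make the setup concrete: the population size is estimated as N = M / P, and a delta-method argument (with a design effect for the respondent-driven sampling survey) links the survey sample size to the precision of that estimate. The sketch below follows this logic with invented numbers; it is a simplified illustration, not the authors' procedure.

    ```python
    import math
    from scipy.stats import norm

    def n_for_relative_precision(p, deff, rel_error, alpha=0.05):
        """Survey sample size so that the population size estimate N = M / P has an
        approximate relative error below `rel_error` at confidence 1 - alpha.
        By the delta method the coefficient of variation of N equals that of P:
        CV(N) ~= sqrt(deff * (1 - p) / (n * p))."""
        z = norm.ppf(1 - alpha / 2)
        return math.ceil(deff * (1 - p) * z ** 2 / (p * rel_error ** 2))

    # Example: 15% of respondents report receiving the unique object, design effect 2,
    # target precision of +/-25% relative error on the size estimate
    print(n_for_relative_precision(p=0.15, deff=2.0, rel_error=0.25))  # 697
    ```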

  20. Sample size calculation for the one-sample log-rank test.

    PubMed

    Schmidt, René; Kwiecien, Robert; Faldum, Andreas; Berthold, Frank; Hero, Barbara; Ligges, Sandra

    2015-03-15

    An improved method of sample size calculation for the one-sample log-rank test is provided. The one-sample log-rank test may be the method of choice if the survival curve of a single treatment group is to be compared with that of a historic control. Such settings arise, for example, in clinical phase-II trials if the response to a new treatment is measured by a survival endpoint. Present sample size formulas for the one-sample log-rank test are based on the number of events to be observed; that is, in order to achieve approximately the desired power at the allocated significance level and effect size, the trial is stopped as soon as a certain critical number of events is reached. We propose a new stopping criterion to be followed. Both approaches are shown to be asymptotically equivalent. For small sample sizes, though, a simulation study indicates that the new criterion might be preferred when planning a corresponding trial. In our simulations, the trial is usually underpowered and the intended significance level is not fully exploited if the traditional stopping criterion based on the number of events is used, whereas a trial based on the new stopping criterion maintains power with the type-I error rate still controlled.

  1. How to calculate sample size for different study designs in medical research?

    PubMed

    Charan, Jaykaran; Biswas, Tamoghna

    2013-04-01

    Calculation of the exact sample size is an important part of research design. It is very important to understand that different study designs need different methods of sample size calculation and that one formula cannot be used in all designs. In this short review we try to educate researchers regarding the various methods of sample size calculation available for different study designs. Sample size calculations for the most frequently used study designs are presented. For genetic and microbiological studies, readers are referred to other sources.
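
    As one simple worked example of a design-specific formula of the kind this review covers, the sketch below gives the usual sample size for estimating a prevalence with a specified absolute precision; the inputs are illustrative, not taken from the article.

    ```python
    import math
    from scipy.stats import norm

    def n_for_prevalence(p_expected, precision, alpha=0.05):
        """Sample size to estimate a prevalence within +/- `precision` (absolute)
        with confidence 1 - alpha, using the normal approximation."""
        z = norm.ppf(1 - alpha / 2)
        return math.ceil(z ** 2 * p_expected * (1 - p_expected) / precision ** 2)

    # Example: expected prevalence 30%, desired precision +/- 5 percentage points
    print(n_for_prevalence(0.30, 0.05))  # 323
    ```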

  2. Sample size calculation in clinical trials: part 13 of a series on evaluation of scientific publications.

    PubMed

    Röhrig, Bernd; du Prel, Jean-Baptist; Wachtlin, Daniel; Kwiecien, Robert; Blettner, Maria

    2010-08-01

    In this article, we discuss the purpose of sample size calculation in clinical trials, the need for it, and the methods by which it is accomplished. Study samples that are either too small or too large are unacceptable, for clinical, methodological, and ethical reasons. The physicians participating in clinical trials should be directly involved in sample size planning, because their expertise and knowledge of the literature are indispensable. We explain the process of sample size calculation on the basis of articles retrieved by a selective search of the international literature, as well as our own experience. We present a fictitious clinical trial in which two antihypertensive agents are to be compared to each other with a t-test and then show how the appropriate size of the study sample should be calculated. Next, we describe the general principles of sample size calculation that apply when any kind of statistical test is to be used. We give further illustrative examples and explain what types of expert medical knowledge and assumptions are needed to calculate the appropriate sample size for each. These generally depend on the particular statistical test that is to be performed. In any clinical trial, the sample size has to be planned on a justifiable, rational basis. The purpose of sample size calculation is to determine the optimal number of participants (patients) to be included in the trial. Sample size calculation requires the collaboration of experienced biostatisticians and physician-researchers: expert medical knowledge is an essential part of it.
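
    For a trial like the fictitious antihypertensive comparison analysed with a t-test, the same calculation can be done with a standard library routine; the standardized mean difference below is an invented value, not one from the article.

    ```python
    from statsmodels.stats.power import TTestIndPower

    # Per-group sample size for a two-sided, two-sample t-test,
    # assuming a standardized mean difference (Cohen's d) of 0.5.
    analysis = TTestIndPower()
    n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80,
                                       ratio=1.0, alternative='two-sided')
    print(round(n_per_group))  # about 64 per group
    ```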

  3. Reporting of sample size calculations in analgesic clinical trials: ACTTION systematic review.

    PubMed

    McKeown, Andrew; Gewandter, Jennifer S; McDermott, Michael P; Pawlowski, Joseph R; Poli, Joseph J; Rothstein, Daniel; Farrar, John T; Gilron, Ian; Katz, Nathaniel P; Lin, Allison H; Rappaport, Bob A; Rowbotham, Michael C; Turk, Dennis C; Dworkin, Robert H; Smith, Shannon M

    2015-03-01

    Sample size calculations determine the number of participants required to have sufficiently high power to detect a given treatment effect. In this review, we examined the reporting quality of sample size calculations in 172 publications of double-blind randomized controlled trials of noninvasive pharmacologic or interventional (ie, invasive) pain treatments published in European Journal of Pain, Journal of Pain, and Pain from January 2006 through June 2013. Sixty-five percent of publications reported a sample size calculation but only 38% provided all elements required to replicate the calculated sample size. In publications reporting at least 1 element, 54% provided a justification for the treatment effect used to calculate sample size, and 24% of studies with continuous outcome variables justified the variability estimate. Publications of clinical pain condition trials reported a sample size calculation more frequently than experimental pain model trials (77% vs 33%, P < .001) but did not differ in the frequency of reporting all required elements. No significant differences in reporting of any or all elements were detected between publications of trials with industry and nonindustry sponsorship. Twenty-eight percent included a discrepancy between the reported number of planned and randomized participants. This study suggests that sample size calculation reporting in analgesic trial publications is usually incomplete. Investigators should provide detailed accounts of sample size calculations in publications of clinical trials of pain treatments, which is necessary for reporting transparency and communication of pre-trial design decisions. In this systematic review of analgesic clinical trials, sample size calculations and the required elements (eg, treatment effect to be detected; power level) were incompletely reported. A lack of transparency regarding sample size calculations may raise questions about the appropriateness of the calculated sample size.

  4. A Comparative Study of Power and Sample Size Calculations for Multivariate General Linear Models

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2003-01-01

    Repeated measures and longitudinal studies arise often in social and behavioral science research. During the planning stage of such studies, the calculations of sample size are of particular interest to the investigators and should be an integral part of the research projects. In this article, we consider the power and sample size calculations for…

  5. Verification of the Accuracy of Sample-Size Equation Calculations for Visual Sample Plan Version 0.9C

    SciTech Connect

    Davidson, James R.

    2001-01-29

    Visual Sample Plan (VSP) is a software tool being developed to facilitate the design of environmental sampling plans using a site-map visual interface, standard sample-size equations, a variety of sampling grids and random sampling plans, and graphs to visually depict the results to the user. This document provides comparisons between sample sizes calculated by VSP Version 0.9C, and sample sizes calculated by test code written in the S-Plus language. All sample sizes calculated by VSP matched the independently calculated sample sizes. Also, the VSP implementation of the ELIPGRID-PC algorithm for hot spot probabilities is shown to match previous results for 100 standard test cases. The Conclusions and Limitations section of this document lists some aspects of VSP that were not tested by this suite of tests and recommends simulation-based enhancements for future versions of VSP.

  6. Sample size calculation for weighted rank tests comparing survival distributions under cluster randomization: a simulation method.

    PubMed

    Jung, Sin-Ho

    2007-01-01

    We propose a sample size calculation method for rank tests comparing two survival distributions under cluster randomization with possibly variable cluster sizes. Here, sample size refers to the number of clusters. Our method is based on a simulation procedure that generates clustered exponential survival variables whose distribution is specified by the marginal hazard rate and the intracluster correlation coefficient. The sample size is calculated given the significance level, power, marginal hazard rates (or median survival times) under the alternative hypothesis, intracluster correlation coefficient, accrual rate, follow-up period, and cluster size distribution.

  7. [Formal sample size calculation and its limited validity in animal studies of medical basic research].

    PubMed

    Mayer, B; Muche, R

    2013-01-01

    Animal studies are highly relevant for basic medical research, although their usage is discussed controversially in public. Thus, an optimal sample size for these projects should be aimed at from a biometrical point of view. Statistical sample size calculation is usually the appropriate methodology in planning medical research projects. However, required information is often not valid or only available during the course of an animal experiment. This article critically discusses the validity of formal sample size calculation for animal studies. Within the discussion, some requirements are formulated to fundamentally regulate the process of sample size determination for animal experiments.

  8. Sample size calculation for the one-sample log-rank test.

    PubMed

    Wu, Jianrong

    2015-01-01

    In this paper, an exact variance of the one-sample log-rank test statistic is derived under the alternative hypothesis, and a sample size formula is proposed based on the derived exact variance. Simulation results showed that the proposed sample size formula provides adequate power to design a study to compare the survival of a single sample with that of a standard population.

  9. Clinical audit for occupational therapy intervention for children with autism spectrum disorder: sampling steps and sample size calculation.

    PubMed

    Weeks, Scott; Atlas, Alvin

    2015-06-30

    A priori sample size calculations are used to determine the adequate sample size to estimate the prevalence of the target population with good precision. However, published audits rarely report a priori calculations for their sample size. This article discusses a process in health services delivery mapping to generate a comprehensive sampling frame, which was used to calculate an a priori sample size for a targeted clinical record audit. We describe how we approached methodological and definitional issues in the following steps: (1) target population definition, (2) sampling frame construction, and (3) a priori sample size calculation. We recommend this process for clinicians, researchers, or policy makers when detailed information on a reference population is unavailable.

  10. Sample Size Calculations for Precise Interval Estimation of the Eta-Squared Effect Size

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2015-01-01

    Analysis of variance is one of the most frequently used statistical analyses in the behavioral, educational, and social sciences, and special attention has been paid to the selection and use of an appropriate effect size measure of association in analysis of variance. This article presents the sample size procedures for precise interval estimation…

  11. New method to estimate the sample size for calculation of a proportion assuming binomial distribution.

    PubMed

    Vallejo, Adriana; Muniesa, Ana; Ferreira, Chelo; de Blas, Ignacio

    2013-10-01

    Nowadays the formula used to calculate the sample size for estimating a proportion (such as a prevalence) is based on the normal distribution; however, it could instead be based on a binomial distribution, whose confidence interval can be calculated using the Wilson score method. When the two formulae (normal and binomial distributions) are compared, the difference in the width of the confidence intervals is relevant both in the tails and at the center of the curves. In order to calculate the required sample size, we simulated an iterative sampling procedure, which shows an underestimation of the sample size for prevalence values close to 0 or 1, and an overestimation for values close to 0.5. Based on these results, we propose an algorithm based on the Wilson score method that provides sample size values similar to those obtained empirically by simulation.
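
    A minimal sketch of the kind of algorithm described, searching for the smallest sample size whose Wilson score interval is narrower than the desired precision, is given below; the search rule and inputs are illustrative assumptions, not the authors' exact algorithm.

    ```python
    import math
    from scipy.stats import norm

    def wilson_halfwidth(p, n, alpha=0.05):
        """Half-width of the Wilson score interval for an observed proportion p."""
        z = norm.ppf(1 - alpha / 2)
        return (z / (1 + z ** 2 / n)) * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2))

    def n_wilson(p_expected, precision, alpha=0.05, n_max=10 ** 7):
        """Smallest n whose Wilson interval half-width is <= `precision`."""
        n = 1
        while wilson_halfwidth(p_expected, n, alpha) > precision and n < n_max:
            n += 1
        return n

    # Near the boundary the Wilson-based requirement exceeds the classical
    # normal-approximation value, as the abstract describes:
    print(n_wilson(0.02, 0.02))                                        # Wilson-based n
    print(math.ceil(norm.ppf(0.975) ** 2 * 0.02 * 0.98 / 0.02 ** 2))   # classical n = 189
    ```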

  12. SAMPLE SIZE/POWER CALCULATION FOR STRATIFIED CASE-COHORT DESIGN

    PubMed Central

    Hu, Wenrong; Cai, Jianwen; Zeng, Donglin

    2014-01-01

    The Case-cohort (CC) study design usually has been used for risk factor assessment in epidemiologic studies or disease prevention trials for rare diseases. The sample size/power calculation for the CC design is given in Cai and Zeng [1]. However, the sample size/power calculation for a stratified case-cohort (SCC) design has not been addressed before. This article extends the results of Cai and Zeng [1] to the SCC design. Simulation studies show that the proposed test for the SCC design utilizing small sub-cohort sampling fractions is valid and efficient for situations where the disease rate is low. Furthermore, optimization of sampling in the SCC design is discussed and compared with proportional and balanced sampling techniques. An epidemiological study is provided to illustrate the sample size calculation under the SCC design. PMID:24889145

  13. Sample Size Calculation for Clustered Binary Data with Sign Tests Using Different Weighting Schemes

    PubMed Central

    Ahn, Chul; Hu, Fan; Schucany, William R.

    2011-01-01

    We propose a sample size calculation approach for testing a proportion using the weighted sign test when binary observations are dependent within a cluster. Sample size formulas are derived with nonparametric methods using three weighting schemes: equal weights to observations, equal weights to clusters, and optimal weights that minimize the variance of the estimator. Sample size formulas are derived incorporating intracluster correlation and the variability in cluster sizes. Simulation studies are conducted to evaluate a finite sample performance of the proposed sample size formulas. Empirical powers are generally close to nominal levels. The number of clusters required increases as the imbalance in cluster size increases and the intracluster correlation increases. The estimator using optimal weights yields the smallest sample size estimate among three estimators. For small values of intracluster correlation the sample size estimates derived from the optimal weight estimator are close to that derived from the estimator assigning equal weights to observations. For large values of intracluster correlation, the optimal weight sample size estimate is close to the sample size estimate assigning equal weights to clusters. PMID:21339864

  14. Sample Size Calculation for Clustered Binary Data with Sign Tests Using Different Weighting Schemes.

    PubMed

    Ahn, Chul; Hu, Fan; Schucany, William R

    2011-02-01

    We propose a sample size calculation approach for testing a proportion using the weighted sign test when binary observations are dependent within a cluster. Sample size formulas are derived with nonparametric methods using three weighting schemes: equal weights to observations, equal weights to clusters, and optimal weights that minimize the variance of the estimator. Sample size formulas are derived incorporating intracluster correlation and the variability in cluster sizes. Simulation studies are conducted to evaluate a finite sample performance of the proposed sample size formulas. Empirical powers are generally close to nominal levels. The number of clusters required increases as the imbalance in cluster size increases and the intracluster correlation increases. The estimator using optimal weights yields the smallest sample size estimate among three estimators. For small values of intracluster correlation the sample size estimates derived from the optimal weight estimator are close to that derived from the estimator assigning equal weights to observations. For large values of intracluster correlation, the optimal weight sample size estimate is close to the sample size estimate assigning equal weights to clusters.

  15. Sample size calculation for differential expression analysis of RNA-seq data under Poisson distribution

    PubMed Central

    Li, Chung-I; Su, Pei-Fang; Guo, Yan

    2013-01-01

    Sample size determination is an important issue in the experimental design of biomedical research. Because of the complexity of RNA-seq experiments, however, the field currently lacks a sample size method widely applicable to differential expression studies utilizing RNA-seq technology. In this report, we propose several methods for sample size calculation for single-gene differential expression analysis of RNA-seq data under Poisson distribution. These methods are then extended to multiple genes, with consideration for addressing the multiple testing problem by controlling false discovery rate. Moreover, most of the proposed methods allow for closed-form sample size formulas with specification of the desired minimum fold change and minimum average read count, and thus are not computationally intensive. Simulation studies to evaluate the performance of the proposed sample size formulas are presented; the results indicate that our methods work well, with achievement of desired power. Finally, our sample size calculation methods are applied to three real RNA-seq data sets. PMID:24088268
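
    For the single-gene case, a generic Wald-type approximation on the log fold change under a Poisson model gives a closed-form sample size of the flavour described; the sketch below is that generic approximation with invented inputs, not necessarily one of the paper's formulas, and it ignores the multiple-testing (false discovery rate) adjustment.

    ```python
    import math
    from scipy.stats import norm

    def n_per_group_poisson(mu0, fold_change, alpha=0.05, power=0.80):
        """Per-group n to detect a given fold change for one gene under a Poisson
        model, via a Wald test on the log mean count:
        Var(log of the estimated mean) ~= 1/(n*mu) in each group."""
        z_alpha = norm.ppf(1 - alpha / 2)
        z_beta = norm.ppf(power)
        mu1 = mu0 * fold_change
        return math.ceil((z_alpha + z_beta) ** 2
                         * (1 / mu0 + 1 / mu1) / math.log(fold_change) ** 2)

    # Example: average read count 10 in the control group, two-fold change
    print(n_per_group_poisson(mu0=10, fold_change=2.0))  # single gene, no FDR control
    ```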

  16. Sample size calculation for differential expression analysis of RNA-seq data under Poisson distribution.

    PubMed

    Li, Chung-I; Su, Pei-Fang; Guo, Yan; Shyr, Yu

    2013-01-01

    Sample size determination is an important issue in the experimental design of biomedical research. Because of the complexity of RNA-seq experiments, however, the field currently lacks a sample size method widely applicable to differential expression studies utilising RNA-seq technology. In this report, we propose several methods for sample size calculation for single-gene differential expression analysis of RNA-seq data under Poisson distribution. These methods are then extended to multiple genes, with consideration for addressing the multiple testing problem by controlling false discovery rate. Moreover, most of the proposed methods allow for closed-form sample size formulas with specification of the desired minimum fold change and minimum average read count, and thus are not computationally intensive. Simulation studies to evaluate the performance of the proposed sample size formulas are presented; the results indicate that our methods work well, with achievement of desired power. Finally, our sample size calculation methods are applied to three real RNA-seq data sets.

  17. Power and sample size calculations for Mendelian randomization studies using one genetic instrument.

    PubMed

    Freeman, Guy; Cowling, Benjamin J; Schooling, C Mary

    2013-08-01

    Mendelian randomization, which is instrumental variable analysis using genetic variants as instruments, is an increasingly popular method of making causal inferences from observational studies. In order to design efficient Mendelian randomization studies, it is essential to calculate the sample sizes required. We present formulas for calculating the power of a Mendelian randomization study using one genetic instrument to detect an effect of a given size, and the minimum sample size required to detect effects for given levels of significance and power, using asymptotic statistical theory. We apply the formulas to some example data and compare the results with those from simulation methods. Power and sample size calculations using these formulas should be more straightforward to carry out than simulation approaches. These formulas make explicit that the sample size needed for a Mendelian randomization study is inversely proportional to the square of the correlation between the genetic instrument and the exposure and proportional to the residual variance of the outcome after removing the effect of the exposure, as well as inversely proportional to the square of the effect size.
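
    The proportionalities stated in the abstract translate directly into a back-of-the-envelope calculation; the sketch below encodes them (sample size inversely proportional to the squared instrument-exposure correlation and to the squared effect, proportional to the residual outcome variance) with invented inputs. It is only an approximation consistent with the abstract's description, not the paper's exact formula.

    ```python
    import math
    from scipy.stats import norm

    def n_mendelian_randomization(beta, r2_gx, var_x, var_y_resid,
                                  alpha=0.05, power=0.80):
        """Approximate n for a one-instrument Mendelian randomization study.
        beta        - causal effect of exposure X on outcome Y to detect
        r2_gx       - squared correlation between instrument G and exposure X
        var_x       - variance of the exposure
        var_y_resid - residual variance of the outcome after removing the exposure effect"""
        z = (norm.ppf(1 - alpha / 2) + norm.ppf(power)) ** 2
        return math.ceil(z * var_y_resid / (beta ** 2 * r2_gx * var_x))

    # Example: effect 0.2, instrument explaining 2% of the exposure variance, unit variances
    print(n_mendelian_randomization(beta=0.2, r2_gx=0.02, var_x=1.0, var_y_resid=1.0))
    ```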

  18. Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.

    PubMed

    Hooper, Richard; Teerenstra, Steven; de Hoop, Esther; Eldridge, Sandra

    2016-11-20

    The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least two levels of clustering: the clusters themselves and times within clusters. We derive formulae for sample size for repeated cross-section and closed cohort cluster randomised trials with normally distributed outcome measures, under a multilevel model allowing for variation between clusters and between times within clusters. Our formulae agree with those previously described for special cases such as crossover and analysis of covariance designs, although simulation suggests that the formulae could underestimate required sample size when the number of clusters is small. Whether using a formula or simulation, a sample size calculation requires estimates of nuisance parameters, which in our model include the intracluster correlation, cluster autocorrelation, and individual autocorrelation. A cluster autocorrelation less than 1 reflects a situation where individuals sampled from the same cluster at different times have less correlated outcomes than individuals sampled from the same cluster at the same time. Nuisance parameters could be estimated from time series obtained in similarly clustered settings with the same outcome measure, using analysis of variance to estimate variance components. Copyright © 2016 John Wiley & Sons, Ltd.

  19. Sample size/power calculation for stratified case-cohort design.

    PubMed

    Hu, Wenrong; Cai, Jianwen; Zeng, Donglin

    2014-10-15

    The case-cohort (CC) study design usually has been used for risk factor assessment in epidemiologic studies or disease prevention trials for rare diseases. The sample size/power calculation for a stratified CC (SCC) design has not been addressed before. This article derives such result based on a stratified test statistic. Simulation studies show that the proposed test for the SCC design utilizing small sub-cohort sampling fractions is valid and efficient for situations where the disease rate is low. Furthermore, optimization of sampling in the SCC design is discussed and compared with proportional and balanced sampling techniques. An epidemiological study is provided to illustrate the sample size calculation under the SCC design. Copyright © 2014 John Wiley & Sons, Ltd.

  1. Using design effects from previous cluster surveys to guide sample size calculation in emergency settings.

    PubMed

    Kaiser, Reinhard; Woodruff, Bradley A; Bilukha, Oleg; Spiegel, Paul B; Salama, Peter

    2006-06-01

    A good estimate of the design effect is critical for calculating the most efficient sample size for cluster surveys. We reviewed the design effects for seven nutrition and health outcomes from nine population-based cluster surveys conducted in emergency settings. Most of the design effects for outcomes in children, and one-half of the design effects for crude mortality, were below two. A reassessment of mortality data from Kosovo and Badghis, Afghanistan revealed that, given the same number of clusters, changing sample size had a relatively small impact on the precision of the estimate of mortality. We concluded that, in most surveys, assuming a design effect of 1.5 for acute malnutrition in children and two or less for crude mortality would produce a more efficient sample size. In addition, enhancing the sample size in cluster surveys without increasing the number of clusters may not result in substantial improvements in precision.
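
    The practical consequence of the assumed design effect can be made explicit with the standard cluster-survey formula, sketched below; the prevalence and precision are illustrative, while the design effects of 1.5 and 2 are the values the abstract recommends assuming.

    ```python
    import math
    from scipy.stats import norm

    def cluster_survey_n(p_expected, precision, deff, alpha=0.05):
        """Total sample size for estimating a prevalence within +/- `precision`
        in a cluster survey, inflating the simple random sample size by `deff`."""
        z = norm.ppf(1 - alpha / 2)
        return math.ceil(deff * z ** 2 * p_expected * (1 - p_expected) / precision ** 2)

    # Acute malnutrition: prevalence ~10%, precision +/-3%, design effect 1.5
    print(cluster_survey_n(0.10, 0.03, deff=1.5))  # 577
    # The same survey planned with a design effect of 2 would need 769
    print(cluster_survey_n(0.10, 0.03, deff=2.0))
    ```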

  2. [On the impact of sample size calculation and power in clinical research].

    PubMed

    Held, Ulrike

    2014-10-01

    The aim of a clinical trial is to judge the efficacy of a new therapy or drug. In the planning phase of the study, the calculation of the necessary sample size is crucial in order to obtain a meaningful result. The study design, the expected treatment effect in outcome and its variability, power and level of significance are factors which determine the sample size. It is often difficult to fix these parameters prior to the start of the study, but related papers from the literature can be helpful sources for the unknown quantities. For scientific as well as ethical reasons it is necessary to calculate the sample size in advance in order to be able to answer the study question.

  3. Sample Size Calculation for Estimating or Testing a Nonzero Squared Multiple Correlation Coefficient.

    PubMed

    Krishnamoorthy, K; Xia, Yanping

    2008-01-01

    The problems of hypothesis testing and interval estimation of the squared multiple correlation coefficient of a multivariate normal distribution are considered. It is shown that available one-sided tests are uniformly most powerful, and the one-sided confidence intervals are uniformly most accurate. An exact method of calculating sample size to carry out one-sided tests (null hypothesis may involve a nonzero value for the multiple correlation coefficient) to attain a specified power is given. Sample size calculation for computing confidence intervals for the squared multiple correlation coefficient with a specified expected width is also provided. Sample sizes for powers and confidence intervals are tabulated for a wide range of parameter configurations and dimensions. The results are illustrated using the empirical data from Timm (1975) that related scores from the Peabody Picture Vocabulary Test to four proficiency measures.

  4. Power and Sample Size Calculations for Multivariate Linear Models with Random Explanatory Variables

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2005-01-01

    This article considers the problem of power and sample size calculations for normal outcomes within the framework of multivariate linear models. The emphasis is placed on the practical situation that not only the values of response variables for each subject are just available after the observations are made, but also the levels of explanatory…

  5. [Sample size calculation in clinical post-marketing evaluation of traditional Chinese medicine].

    PubMed

    Fu, Yingkun; Xie, Yanming

    2011-10-01

    In recent years, as the Chinese government and public have paid more attention to post-marketing research on Chinese medicine, some traditional Chinese medicine products have begun, or are about to begin, post-marketing evaluation studies. In the design of a post-marketing evaluation, sample size calculation plays a decisive role. It not only ensures the accuracy and reliability of the post-marketing evaluation, but also assures that the intended trials will have the desired power for correctly detecting a clinically meaningful difference between the medicines under study if such a difference truly exists. Up to now, there has been no systematic method of sample size calculation tailored to traditional Chinese medicine. In this paper, based on the basic methods of sample size calculation and the characteristics of clinical evaluation of traditional Chinese medicine, sample size calculation methods for evaluating the efficacy and safety of Chinese medicine are discussed. We hope the paper will be beneficial to medical researchers and pharmaceutical scientists engaged in Chinese medicine research.

  6. CHI-B: Sample Size Calculation for Chi-Square Tests

    ERIC Educational Resources Information Center

    Pohl, Norval F.; Tsai, San-Yun W.

    1978-01-01

    The nature of the approximate chi-square test for hypotheses concerning multinomial probabilities is reviewed. Also, a BASIC computer program for calculating the sample size necessary to control for both Type I and Type II errors in chi-square tests for hypotheses concerning multinomial probabilities is described.
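
    The calculation such a program performs can be sketched with the noncentral chi-squared distribution: power is the upper tail probability of a noncentral chi-squared variable at the central critical value, with noncentrality n*w^2 for effect size w. The code below is a generic illustration with made-up multinomial probabilities, not a port of CHI-B.

    ```python
    from scipy.stats import chi2, ncx2

    def chisq_gof_sample_size(p_null, p_alt, alpha=0.05, power=0.80):
        """Smallest n so that the chi-squared goodness-of-fit test of p_null
        reaches the target power when the true cell probabilities are p_alt."""
        df = len(p_null) - 1
        w2 = sum((a - o) ** 2 / o for o, a in zip(p_null, p_alt))  # Cohen's w squared
        crit = chi2.ppf(1 - alpha, df)
        n = 1
        while ncx2.sf(crit, df, n * w2) < power:
            n += 1
        return n

    # Example: test a uniform three-category split against a mild departure
    print(chisq_gof_sample_size([1/3, 1/3, 1/3], [0.40, 0.35, 0.25]))
    ```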

  7. Sample Size Calculation for Estimating or Testing a Nonzero Squared Multiple Correlation Coefficient

    ERIC Educational Resources Information Center

    Krishnamoorthy, K.; Xia, Yanping

    2008-01-01

    The problems of hypothesis testing and interval estimation of the squared multiple correlation coefficient of a multivariate normal distribution are considered. It is shown that available one-sided tests are uniformly most powerful, and the one-sided confidence intervals are uniformly most accurate. An exact method of calculating sample size to…

  8. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    ERIC Educational Resources Information Center

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  9. Confidence intervals and sample-size calculations for the sisterhood method of estimating maternal mortality.

    PubMed

    Hanley, J A; Hagen, C A; Shiferaw, T

    1996-01-01

    The sisterhood method is an indirect method of estimating maternal mortality that has, in comparison with conventional direct methods, the dual advantages of ease of use in the field and smaller sample-size requirements. This report describes how to calculate a standard error to quantify the sampling variability for this method. This standard error can be used to construct confidence intervals and statistical tests and to plan the size of a sample survey that employs the sisterhood method. Statistical assumptions are discussed, particularly in relation to the effective sample size and to effects of extrabinomial variation. In a worked example of data from urban Pakistan, a maternal mortality ratio of 153 (95 percent confidence interval between 96 and 212) deaths per 100,000 live births is estimated.

  10. [Comparison of calculation methods for diagnostic trials under different sample size].

    PubMed

    Wang, Yang; Hu, Bo; Chen, Tao; Li, Wei

    2010-12-01

    We discuss sample size calculation methods used for diagnostic trials whose purpose is to demonstrate the sensitivity and specificity of a new method. Equations and results were directly compared, and Monte Carlo simulation was used to validate the results. The sample size obtained from the sampling method was always smaller than that from the target value method, and simulation showed that the target value method offers greater power. The two sample size determination methods showed essential differences in their results, suggesting that investigators should choose the appropriate method in accordance with the study design. If the study hypothesis is to demonstrate that the new diagnostic method meets clinical requirements, only the target value method provides enough statistical power.
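
    A rough sketch of the two kinds of calculation being compared, with invented numbers: a sampling (precision-based) sample size for estimating sensitivity, and a target value sample size for testing sensitivity against a prespecified clinical threshold. These are the standard single-proportion formulas and may differ in detail from the methods compared in the paper; the sizes refer to diseased subjects only and would still need to be scaled up by the expected disease prevalence.

    ```python
    import math
    from scipy.stats import norm

    def n_estimate_sensitivity(se_expected, precision, alpha=0.05):
        """Diseased subjects needed to estimate sensitivity within +/- `precision`."""
        z = norm.ppf(1 - alpha / 2)
        return math.ceil(z ** 2 * se_expected * (1 - se_expected) / precision ** 2)

    def n_test_sensitivity(se_target, se_expected, alpha=0.05, power=0.80):
        """Diseased subjects needed to show sensitivity exceeds a target value
        se_target when the true sensitivity is se_expected (one-sided test)."""
        z_a, z_b = norm.ppf(1 - alpha), norm.ppf(power)
        num = (z_a * math.sqrt(se_target * (1 - se_target))
               + z_b * math.sqrt(se_expected * (1 - se_expected))) ** 2
        return math.ceil(num / (se_expected - se_target) ** 2)

    print(n_estimate_sensitivity(0.90, 0.05))  # 139 diseased subjects (sampling method)
    print(n_test_sensitivity(0.85, 0.90))      # 283 diseased subjects (target value method)
    ```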

  11. Exact Power and Sample Size Calculations for the Two One-Sided Tests of Equivalence

    PubMed Central

    Shieh, Gwowen

    2016-01-01

    Equivalent testing has been strongly recommended for demonstrating the comparability of treatment effects in a wide variety of research fields including medical studies. Although the essential properties of the favorable two one-sided tests of equivalence have been addressed in the literature, the associated power and sample size calculations were illustrated mainly for selecting the most appropriate approximate method. Moreover, conventional power analysis does not consider the allocation restrictions and cost issues of different sample size choices. To extend the practical usefulness of the two one-sided tests procedure, this article describes exact approaches to sample size determinations under various allocation and cost considerations. Because the presented features are not generally available in common software packages, both R and SAS computer codes are presented to implement the suggested power and sample size computations for planning equivalence studies. The exact power function of the TOST procedure is employed to compute optimal sample sizes under four design schemes allowing for different allocation and cost concerns. The proposed power and sample size methodology should be useful for medical sciences to plan equivalence studies. PMID:27598468

  12. Exact Power and Sample Size Calculations for the Two One-Sided Tests of Equivalence.

    PubMed

    Shieh, Gwowen

    2016-01-01

    Equivalent testing has been strongly recommended for demonstrating the comparability of treatment effects in a wide variety of research fields including medical studies. Although the essential properties of the favorable two one-sided tests of equivalence have been addressed in the literature, the associated power and sample size calculations were illustrated mainly for selecting the most appropriate approximate method. Moreover, conventional power analysis does not consider the allocation restrictions and cost issues of different sample size choices. To extend the practical usefulness of the two one-sided tests procedure, this article describes exact approaches to sample size determinations under various allocation and cost considerations. Because the presented features are not generally available in common software packages, both R and SAS computer codes are presented to implement the suggested power and sample size computations for planning equivalence studies. The exact power function of the TOST procedure is employed to compute optimal sample sizes under four design schemes allowing for different allocation and cost concerns. The proposed power and sample size methodology should be useful for medical sciences to plan equivalence studies.

  15. Finding Alternatives to the Dogma of Power Based Sample Size Calculation: Is a Fixed Sample Size Prospective Meta-Experiment a Potential Alternative?

    PubMed Central

    Tavernier, Elsa; Trinquart, Ludovic; Giraudeau, Bruno

    2016-01-01

    Sample sizes for randomized controlled trials are typically based on power calculations. They require us to specify values for parameters such as the treatment effect, which is often difficult because we lack sufficient prior information. The objective of this paper is to provide an alternative design which circumvents the need for sample size calculation. In a simulation study, we compared a meta-experiment approach to the classical approach to assess treatment efficacy. The meta-experiment approach involves use of meta-analyzed results from 3 randomized trials of fixed sample size, 100 subjects. The classical approach involves a single randomized trial with the sample size calculated on the basis of an a priori-formulated hypothesis. For the sample size calculation in the classical approach, we used observed articles to characterize errors made on the formulated hypothesis. A prospective meta-analysis of data from trials of fixed sample size provided the same precision, power and type I error rate, on average, as the classical approach. The meta-experiment approach may provide an alternative design which does not require a sample size calculation and addresses the essential need for study replication; results may have greater external validity. PMID:27362939
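
    A minimal sketch of the meta-experiment idea described above: simulate three two-arm trials of fixed size and combine them with a fixed-effect, inverse-variance meta-analysis. The per-arm size of 50 (100 subjects per trial), the treatment effect and the SD are illustrative assumptions, and the generic pooling shown here is not necessarily the exact analysis used in the simulation study.

```python
import numpy as np

rng = np.random.default_rng(1)

def one_trial(n_per_arm=50, effect=0.3, sd=1.0):
    """Return the mean difference and its variance for one simulated two-arm trial."""
    control = rng.normal(0.0, sd, n_per_arm)
    treated = rng.normal(effect, sd, n_per_arm)
    diff = treated.mean() - control.mean()
    var = control.var(ddof=1) / n_per_arm + treated.var(ddof=1) / n_per_arm
    return diff, var

# Fixed-effect (inverse-variance) meta-analysis of three fixed-size trials.
diffs, variances = zip(*(one_trial() for _ in range(3)))
weights = 1.0 / np.array(variances)
pooled = np.sum(weights * np.array(diffs)) / weights.sum()
pooled_se = np.sqrt(1.0 / weights.sum())
print(f"pooled difference = {pooled:.3f}, z = {pooled / pooled_se:.2f}")
```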

  16. Sample size calculation to externally validate scoring systems based on logistic regression models.

    PubMed

    Palazón-Bru, Antonio; Folgado-de la Rosa, David Manuel; Cortés-Castell, Ernesto; López-Cascales, María Teresa; Gil-Guillén, Vicente Francisco

    2017-01-01

    A sample size containing at least 100 events and 100 non-events has been suggested for validating a predictive model, regardless of the model being validated and of the fact that certain factors (discrimination, parameterization and incidence) can influence calibration of the predictive model. Scoring systems based on binary logistic regression models are a specific type of predictive model. The aim of this study was to develop an algorithm to determine the sample size for validating a scoring system based on a binary logistic regression model and to apply it to a case study. The algorithm was based on bootstrap samples in which the area under the ROC curve, the observed event probabilities through smooth curves, and a measure to determine the lack of calibration (estimated calibration index) were calculated. To illustrate its use for interested researchers, the algorithm was applied to a scoring system, based on a binary logistic regression model, to determine mortality in intensive care units. In the case study provided, the algorithm obtained a sample size with 69 events, which is lower than the value suggested in the literature. An algorithm is provided for finding the appropriate sample size to validate scoring systems based on binary logistic regression models. This could be applied to determine the sample size in other similar cases.

  17. Calculating sample size for studies with expected all-or-none nonadherence and selection bias.

    PubMed

    Shardell, Michelle D; El-Kamary, Samer S

    2009-06-01

    We develop sample size formulas for studies aiming to test mean differences between a treatment and control group when all-or-none nonadherence (noncompliance) and selection bias are expected. Recent work by Fay, Halloran, and Follmann (2007, Biometrics 63, 465-474) addressed the increased variances within groups defined by treatment assignment when nonadherence occurs, compared to the scenario of full adherence, under the assumption of no selection bias. In this article, we extend the authors' approach to allow selection bias in the form of systematic differences in means and variances among latent adherence subgroups. We illustrate the approach by performing sample size calculations to plan clinical trials with and without pilot adherence data. Sample size formulas and tests for normally distributed outcomes are also developed in a Web Appendix that account for uncertainty of estimates from external or internal pilot data.

  18. Calculating Sample Size for Studies with Expected All-or-None Nonadherence and Selection Bias

    PubMed Central

    Shardell, Michelle D.; El-Kamary, Samer S.

    2015-01-01

    Summary We develop sample size formulas for studies aiming to test mean differences between a treatment and control group when all-or-none nonadherence (noncompliance) and selection bias are expected. Recent work addressed the increased variances within groups defined by treatment assignment when nonadherence occurs, compared to the scenario of full adherence, under the assumption of no selection bias. In this article, we extend the authors’ approach to allow selection bias in the form of systematic differences in means and variances among latent adherence subgroups. We illustrate the approach by performing sample size calculations to plan clinical trials with and without pilot adherence data. Sample size formulas and tests for normally distributed outcomes are also developed in a Web Appendix that account for uncertainty of estimates from external or internal pilot data. PMID:18759830

  19. Sample size calculation for testing differences between cure rates with the optimal log-rank test.

    PubMed

    Wu, Jianrong

    2017-01-01

    In this article, sample size calculations are developed for use when the main interest is in the differences between the cure rates of two groups. Following the work of Ewell and Ibrahim, the asymptotic distribution of the weighted log-rank test is derived under the local alternative. The optimal log-rank test under the proportional distributions alternative is discussed, and sample size formulas for the optimal and standard log-rank tests are derived. Simulation results show that the proposed formulas provide adequate sample size estimation for trial designs and that the optimal log-rank test is more efficient than the standard log-rank test, particularly when both cure rates and percentages of censoring are small.

  20. Sample size calculations in pediatric clinical trials conducted in an ICU: a systematic review.

    PubMed

    Nikolakopoulos, Stavros; Roes, Kit C B; van der Lee, Johanna H; van der Tweel, Ingeborg

    2014-07-08

    At the design stage of a clinical trial, several assumptions have to be made. These usually include guesses about parameters that are not of direct interest but must be accounted for in the analysis of the treatment effect and also in the sample size calculation (nuisance parameters, e.g. the standard deviation or the control group event rate). We conducted a systematic review to investigate the impact of misspecification of nuisance parameters in pediatric randomized controlled trials conducted in intensive care units. We searched MEDLINE through PubMed. We included all publications concerning two-arm RCTs where efficacy assessment was the main objective. We included trials with pharmacological interventions. Only trials with a dichotomous or a continuous outcome were included. This led to the inclusion of 70 articles describing 71 trials. In 49 trial reports a sample size calculation was reported. Relative misspecification could be calculated for 28 trials, 22 with a dichotomous and 6 with a continuous primary outcome. The median [inter-quartile range (IQR)] overestimation was 6.9 [-12.1, 57.8]% for the control group event rate in trials with dichotomous outcomes and -1.5 [-15.3, 5.1]% for the standard deviation in trials with continuous outcomes. Our results show that there is room for improvement in the clear reporting of sample size calculations in pediatric clinical trials conducted in ICUs. Researchers should be aware of the importance of nuisance parameters in study design and in the interpretation of the results.

  1. A web application for sample size and power calculation in case-control microbiome studies.

    PubMed

    Mattiello, Federico; Verbist, Bie; Faust, Karoline; Raes, Jeroen; Shannon, William D; Bijnens, Luc; Thas, Olivier

    2016-07-01

    When designing a case-control study to investigate differences in microbial composition, it is fundamental to assess the sample sizes needed to detect a hypothesized difference with sufficient statistical power. Our application includes power calculation for (i) a recoded version of the two-sample generalized Wald test of the 'HMP' R-package for comparing community composition, and (ii) the Wilcoxon-Mann-Whitney test for comparing operational taxonomic unit-specific abundances between two samples (optional). The simulation-based power calculations make use of the Dirichlet-Multinomial model to describe and generate abundances. The web interface allows for easy specification of sample and effect sizes. As an illustration of our application, we compared the statistical power of the two tests, with and without stratification of samples. We observed that statistical power increases considerably when stratification is employed, meaning that fewer samples are needed to detect the same effect size with the same power. The web interface is written in R code using Shiny (RStudio Inc., 2016) and is available at https://fedematt.shinyapps.io/shinyMB. The R code for the recoded generalized Wald test can be found at https://github.com/mafed/msWaldHMP. CONTACT: Federico.Mattiello@UGent.be
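
    The application's own power calculations use the HMP generalized Wald test, which is not reproduced here; the sketch below only illustrates the general simulation-based recipe the abstract describes, using a three-taxon Dirichlet-Multinomial model and a Wilcoxon-Mann-Whitney test on the relative abundance of a single taxon. The Dirichlet parameters, sequencing depth, group sizes and function names are invented for illustration.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

def group_abundances(alpha_vec, n_subjects, depth):
    """Per-subject relative abundance of taxon 0 under a Dirichlet-Multinomial model."""
    props = rng.dirichlet(alpha_vec, n_subjects)                   # subject-level compositions
    counts = np.array([rng.multinomial(depth, p) for p in props])  # sequencing counts
    return counts[:, 0] / depth

def simulated_power(n_per_group=30, depth=10_000, n_sim=500, alpha=0.05):
    controls = np.array([20.0, 30.0, 50.0])   # Dirichlet parameters, control group (assumed)
    cases = np.array([30.0, 25.0, 45.0])      # cases with taxon 0 enriched (assumed)
    hits = 0
    for _ in range(n_sim):
        p = mannwhitneyu(group_abundances(controls, n_per_group, depth),
                         group_abundances(cases, n_per_group, depth)).pvalue
        hits += p < alpha
    return hits / n_sim

print(simulated_power())
```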

  2. Sample size and power calculations for correlations between bivariate longitudinal data.

    PubMed

    Comulada, W Scott; Weiss, Robert E

    2010-11-30

    The analysis of a baseline predictor with a longitudinally measured outcome is well established and sample size calculations are reasonably well understood. Analysis of bivariate longitudinally measured outcomes is gaining in popularity and methods to address design issues are required. The focus in a random effects model for bivariate longitudinal outcomes is on the correlations that arise between the random effects and between the bivariate residuals. In the bivariate random effects model, we estimate the asymptotic variances of the correlations and we propose power calculations for testing and estimating the correlations. We compare asymptotic variance estimates to variance estimates obtained from simulation studies and compare our proposed power calculations for correlations on bivariate longitudinal data to power calculations for correlations on cross-sectional data.

  3. Power and sample size calculations for generalized regression models with covariate measurement error.

    PubMed

    Tosteson, Tor D; Buzas, Jeffrey S; Demidenko, Eugene; Karagas, Margaret

    2003-04-15

    Covariate measurement error is often a feature of scientific data used for regression modelling. The consequences of such errors include a loss of power of tests of significance for the regression parameters corresponding to the true covariates. Power and sample size calculations that ignore covariate measurement error tend to overestimate power and underestimate the actual sample size required to achieve a desired power. In this paper we derive a novel measurement error corrected power function for generalized linear models using a generalized score test based on quasi-likelihood methods. Our power function is flexible in that it is adaptable to designs with a discrete or continuous scalar covariate (exposure) that can be measured with or without error, allows for additional confounding variables and applies to a broad class of generalized regression and measurement error models. A program is described that provides sample size or power for a continuous exposure with a normal measurement error model and a single normal confounder variable in logistic regression. We demonstrate the improved properties of our power calculations with simulations and numerical studies. An example is given from an ongoing study of cancer and exposure to arsenic as measured by toenail concentrations and tap water samples.

  4. Sample Size Calculation in Oncology Trials: Quality of Reporting and Implications for Clinical Cancer Research.

    PubMed

    Bariani, Giovanni M; de Celis Ferrari, Anezka C R; Precivale, Maristela; Arai, Roberto; Saad, Everardo D; Riechelmann, Rachel P

    2015-12-01

    Sample size calculation (SSC) is a pivotal step in clinical trial conception and design. Herein, we describe the frequency with which oncology phase III trials report the parameters required for SSC. We systematically searched for phase III trials published in 6 leading journals, which were accompanied by editorials from January 2008 to October 2011. Two blinded investigators extracted required and optional parameters for SSC according to the primary endpoint. We retrieved 140 eligible phase III trials. The median target sample size was 596 subjects (50 to 40,000); in 66.4% of cases, the number of enrolled subjects was at least 90% of the target. The primary endpoint was a continuous variable in 5.7%, categorical in 30.0%, and a time-to-event variable in 64.3% of phase III trials. Although nearly 80% reported a target sample size, only 27.9% of the trials provided all the required parameters for proper SSC. The most commonly reported parameters for sample size computation were α (93.6%) and β (90.7%) errors. The parameters least reported were the expected outcomes in the control or experimental groups, each provided in only 57.9% of trials. The quality of SSC reporting in phase III cancer trials is poor. Such incomplete reporting may compromise future study designs, pooling of data, and interpretation of results. Lack of transparency in SSC reporting may also have ethical implications.

  5. Sample size calculations for the design of cluster randomized trials: A summary of methodology.

    PubMed

    Gao, Fei; Earnest, Arul; Matchar, David B; Campbell, Michael J; Machin, David

    2015-05-01

    Cluster randomized trial designs are growing in popularity in, for example, cardiovascular medicine research and other clinical areas and parallel statistical developments concerned with the design and analysis of these trials have been stimulated. Nevertheless, reviews suggest that design issues associated with cluster randomized trials are often poorly appreciated and there remain inadequacies in, for example, describing how the trial size is determined and the associated results are presented. In this paper, our aim is to provide pragmatic guidance for researchers on the methods of calculating sample sizes. We focus attention on designs with the primary purpose of comparing two interventions with respect to continuous, binary, ordered categorical, incidence rate and time-to-event outcome variables. Issues of aggregate and non-aggregate cluster trials, adjustment for variation in cluster size and the effect size are detailed. The problem of establishing the anticipated magnitude of between- and within-cluster variation to enable planning values of the intra-cluster correlation coefficient and the coefficient of variation are also described. Illustrative examples of calculations of trial sizes for each endpoint type are included.
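
    The paper covers several outcome types and adjustments; as a small companion, the sketch below shows only the most common continuous-outcome calculation: an individually randomised sample size inflated by the design effect 1 + ((CV² + 1)·m − 1)·ρ, which reduces to 1 + (m − 1)·ρ for equal cluster sizes. The effect size, SD, mean cluster size, ICC and CV in the example are placeholders.

```python
import math
from scipy.stats import norm

def n_individual(delta, sd, alpha=0.05, power=0.80):
    """Per-arm n for a two-sample comparison of means under individual randomisation."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 2 * (z * sd / delta) ** 2

def cluster_trial_size(delta, sd, mean_cluster_size, icc, cv=0.0, alpha=0.05, power=0.80):
    """Per-arm subjects and clusters after inflating by the design effect (unequal sizes via CV)."""
    deff = 1 + ((cv**2 + 1) * mean_cluster_size - 1) * icc
    n = n_individual(delta, sd, alpha, power) * deff
    return math.ceil(n), math.ceil(n / mean_cluster_size)

print(cluster_trial_size(delta=0.3, sd=1.0, mean_cluster_size=20, icc=0.05, cv=0.4))
```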

  6. Sample size calculations in pediatric clinical trials conducted in an ICU: a systematic review

    PubMed Central

    2014-01-01

    At the design stage of a clinical trial, several assumptions have to be made. These usually include guesses about parameters that are not of direct interest but must be accounted for in the analysis of the treatment effect and also in the sample size calculation (nuisance parameters, e.g. the standard deviation or the control group event rate). We conducted a systematic review to investigate the impact of misspecification of nuisance parameters in pediatric randomized controlled trials conducted in intensive care units. We searched MEDLINE through PubMed. We included all publications concerning two-arm RCTs where efficacy assessment was the main objective. We included trials with pharmacological interventions. Only trials with a dichotomous or a continuous outcome were included. This led to the inclusion of 70 articles describing 71 trials. In 49 trial reports a sample size calculation was reported. Relative misspecification could be calculated for 28 trials, 22 with a dichotomous and 6 with a continuous primary outcome. The median [inter-quartile range (IQR)] overestimation was 6.9 [-12.1, 57.8]% for the control group event rate in trials with dichotomous outcomes and -1.5 [-15.3, 5.1]% for the standard deviation in trials with continuous outcomes. Our results show that there is room for improvement in the clear reporting of sample size calculations in pediatric clinical trials conducted in ICUs. Researchers should be aware of the importance of nuisance parameters in study design and in the interpretation of the results. PMID:25004909

  7. Power and sample size calculation for paired recurrent events data based on robust nonparametric tests.

    PubMed

    Su, Pei-Fang; Chung, Chia-Hua; Wang, Yu-Wen; Chi, Yunchan; Chang, Ying-Ju

    2017-05-20

    The purpose of this paper is to develop a formula for calculating the required sample size for paired recurrent events data. The developed formula is based on robust non-parametric tests for comparing the marginal mean function of events between paired samples. This calculation can accommodate the associations among a sequence of paired recurrent event times with a specification of correlated gamma frailty variables for a proportional intensity model. We evaluate the performance of the proposed method with comprehensive simulations including the impacts of paired correlations, homogeneous or nonhomogeneous processes, marginal hazard rates, censoring rate, accrual and follow-up times, as well as the sensitivity analysis for the assumption of the frailty distribution. The use of the formula is also demonstrated using a premature infant study from the neonatal intensive care unit of a tertiary center in southern Taiwan. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Sample size calculation for recurrent events data in one-arm studies.

    PubMed

    Rebora, Paola; Galimberti, Stefania

    2012-01-01

    In some exceptional circumstances, as in very rare diseases, nonrandomized one-arm trials are the sole source of evidence to demonstrate efficacy and safety of a new treatment. The design of such studies needs a sound methodological approach in order to provide reliable information, and the determination of the appropriate sample size still represents a critical step of this planning process. As, to our knowledge, no method exists for sample size calculation in one-arm trials with a recurrent event endpoint, we propose here a closed sample size formula. It is derived assuming a mixed Poisson process, and it is based on the asymptotic distribution of the one-sample robust nonparametric test recently developed for the analysis of recurrent events data. The validity of this formula in managing a situation with heterogeneity of event rates, both in time and between patients, and time-varying treatment effect was demonstrated with exhaustive simulation studies. Moreover, although the method requires the specification of a process for events generation, it seems to be robust under erroneous definition of this process, provided that the number of events at the end of the study is similar to the one assumed in the planning phase. The motivating clinical context is represented by a nonrandomized one-arm study on gene therapy in a very rare immunodeficiency in children (ADA-SCID), where a major endpoint is the recurrence of severe infections. Copyright © 2012 John Wiley & Sons, Ltd.

  9. Sample size calculation for the weighted rank statistics with paired survival data.

    PubMed

    Jung, Sin-Ho

    2008-07-30

    This paper introduces a sample size calculation method for the weighted rank test statistics with paired two-sample survival data. Our sample size formula requires specification of joint survival and censoring distributions. For modelling the distribution of paired survival variables, we may use a paired exponential survival distribution that is specified by the marginal hazard rates and a measure of dependency. Also, in most trials randomizing paired subjects, the subjects of each pair are accrued and censored at the same time over an accrual period and an additional follow-up period, so that the paired subjects have a common censoring time. Under these practical settings, the design parameters include type I and type II error probabilities, marginal hazard rates under the alternative hypothesis, correlation coefficient, accrual period (or accrual rate) and follow-up period. If pilot data are available, we may estimate the survival distributions from them, but we specify the censoring distribution based on the specified accrual trend and the follow-up period planned for the new study. Through simulations, the formula is shown to provide accurate sample sizes under practical settings. Real studies are taken to demonstrate the proposed method.

  10. Sample size/power calculations for population pharmacodynamic experiments involving repeated-count measurements.

    PubMed

    Ogungbenro, Kayode; Aarons, Leon

    2010-09-01

    Repeated discrete outcome variables such as count measurements often arise in pharmacodynamic experiments. Count measurements can only take nonnegative integer values; this and correlation between repeated measurements from an individual make the design and analysis of repeated-count data special. Sample size/power calculation is an important part of clinical trial design to ensure adequate power for detecting significant effect, and it is often based on the procedure for analysis. This paper describes an approach for calculating sample size/power for population pharmacokinetic/pharmacodynamic experiments involving repeated-count measurements modeled as a Poisson process based on mixed-effects modeling technique. The noncentral version of the Wald chi(2) test is used for testing parameter/treatment significance. The approach was applied to two examples and the results were compared to results obtained from simulations in NONMEM. The first example involves calculating the power of a design to detect parameter significance between two groups: placebo and treatment group. The second example involves characterization of the dose-efficacy relationship of oxybutynin using a mixed-effects modeling approach. Weekly urge urinary incontinence episodes (a discrete count variable) is the primary efficacy variable and is modeled as a Poisson variable. A prospective study based on two different formulations of oxybutynin was designed using published population pharmacokinetic/pharmacodynamic model. The results of simulation studies showed good agreement between the proposed method and NONMEM simulations.

  11. Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data.

    PubMed

    Jinks, Rachel C; Royston, Patrick; Parmar, Mahesh K B

    2015-10-12

    Prognostic studies of time-to-event data, where researchers aim to develop or validate multivariable prognostic models in order to predict survival, are commonly seen in the medical literature; however, most are performed retrospectively and few consider sample size prior to analysis. Events per variable rules are sometimes cited, but these are based on bias and coverage of confidence intervals for model terms, which are not of primary interest when developing a model to predict outcome. In this paper we aim to develop sample size recommendations for multivariable models of time-to-event data, based on their prognostic ability. We derive formulae for determining the sample size required for multivariable prognostic models in time-to-event data, based on a measure of discrimination, D, developed by Royston and Sauerbrei. These formulae fall into two categories: either based on the significance of the value of D in a new study compared to a previous estimate, or based on the precision of the estimate of D in a new study in terms of confidence interval width. Using simulation we show that they give the desired power and type I error and are not affected by random censoring. Additionally, we conduct a literature review to collate published values of D in different disease areas. We illustrate our methods using parameters from a published prognostic study in liver cancer. The resulting sample sizes can be large, and we suggest controlling study size by expressing the desired accuracy in the new study as a relative value as well as an absolute value. To improve usability we use the values of D obtained from the literature review to develop an equation to approximately convert the commonly reported Harrell's c-index to D. A flow chart is provided to aid decision making when using these methods. We have developed a suite of sample size calculations based on the prognostic ability of a survival model, rather than the magnitude or significance of model coefficients. We have

  12. Sample size calculations for stepped wedge and cluster randomised trials: a unified approach

    PubMed Central

    Hemming, Karla; Taljaard, Monica

    2016-01-01

    Objectives To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. Study Design and Setting We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. Results For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is an infeasible design. Conclusion Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario. PMID:26344808

  13. Developing the Noncentrality Parameter for Calculating Group Sample Sizes in Heterogeneous Analysis of Variance

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2011-01-01

    Sample size determination is an important issue in planning research. In the context of one-way fixed-effect analysis of variance, the conventional sample size formula cannot be applied for the heterogeneous variance cases. This study discusses the sample size requirement for the Welch test in the one-way fixed-effect analysis of variance with…

  14. Developing the Noncentrality Parameter for Calculating Group Sample Sizes in Heterogeneous Analysis of Variance

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2011-01-01

    Sample size determination is an important issue in planning research. In the context of one-way fixed-effect analysis of variance, the conventional sample size formula cannot be applied for the heterogeneous variance cases. This study discusses the sample size requirement for the Welch test in the one-way fixed-effect analysis of variance with…

  15. Inference and sample size calculation for clinical trials with incomplete observations of paired binary outcomes.

    PubMed

    Zhang, Song; Cao, Jing; Ahn, Chul

    2017-02-20

    We investigate the estimation of intervention effect and sample size determination for experiments where subjects are supposed to contribute paired binary outcomes with some incomplete observations. We propose a hybrid estimator to appropriately account for the mixed nature of observed data: paired outcomes from those who contribute complete pairs of observations and unpaired outcomes from those who contribute either pre-intervention or post-intervention outcomes. We theoretically prove that if incomplete data are evenly distributed between the pre-intervention and post-intervention periods, the proposed estimator will always be more efficient than the traditional estimator. A numerical study shows that when the distribution of incomplete data is unbalanced, the proposed estimator will be superior when there is moderate-to-strong positive within-subject correlation. We further derive a closed-form sample size formula to help researchers determine how many subjects need to be enrolled in such studies. Simulation results suggest that the calculated sample size maintains the empirical power and type I error under various design configurations. We demonstrate the proposed method using a real application example. Copyright © 2016 John Wiley & Sons, Ltd.

  16. Testing bioequivalence for multiple formulations with power and sample size calculations.

    PubMed

    Zheng, Cheng; Wang, Jixian; Zhao, Lihui

    2012-01-01

    Bioequivalence (BE) trials play an important role in drug development for demonstrating the BE between test and reference formulations. The key statistical analysis for BE trials is the use of two one-sided tests (TOST), which is equivalent to showing that the 90% confidence interval of the relative bioavailability is within a given range. Power and sample size calculations for the comparison between one test formulation and the reference formulation has been intensively investigated, and tables and software are available for practical use. From a statistical and logistical perspective, it might be more efficient to test more than one formulation in a single trial. However, approaches for controlling the overall type I error may be required. We propose a method called multiplicity-adjusted TOST (MATOST) combining multiple comparison adjustment approaches, such as Hochberg's or Dunnett's method, with TOST. Because power and sample size calculations become more complex and are difficult to solve analytically, efficient simulation-based procedures for this purpose have been developed and implemented in an R package. Some numerical results for a range of scenarios are presented in the paper. We show that given the same overall type I error and power, a BE crossover trial designed to test multiple formulations simultaneously only requires a small increase in the total sample size compared with a simple 2 × 2 crossover design evaluating only one test formulation. Hence, we conclude that testing multiple formulations in a single study is generally an efficient approach. The R package MATOST is available at https://sites.google.com/site/matostbe/.

  17. Subgroup detection and sample size calculation with proportional hazards regression for survival data.

    PubMed

    Kang, Suhyun; Lu, Wenbin; Song, Rui

    2017-08-08

    In this paper, we propose a testing procedure for detecting and estimating the subgroup with an enhanced treatment effect in survival data analysis. Here, we consider a new proportional hazard model that includes a nonparametric component for the covariate effect in the control group and a subgroup-treatment-interaction effect defined by a change plane. We develop a score-type test for detecting the existence of the subgroup, which is doubly robust against misspecification of the baseline effect model or the propensity score but not both under mild assumptions for censoring. When the null hypothesis of no subgroup is rejected, the change-plane parameters that define the subgroup can be estimated on the basis of supremum of the normalized score statistic. The asymptotic distributions of the proposed test statistic under the null and local alternative hypotheses are established. On the basis of established asymptotic distributions, we further propose a sample size calculation formula for detecting a given subgroup effect and derive a numerical algorithm for implementing the sample size calculation in clinical trial designs. The performance of the proposed approach is evaluated by simulation studies. An application to an AIDS clinical trial data is also given for illustration. Copyright © 2017 John Wiley & Sons, Ltd.

  18. Sample size calculations in human electrophysiology (EEG and ERP) studies: A systematic review and recommendations for increased rigor.

    PubMed

    Larson, Michael J; Carbine, Kaylie A

    2017-01-01

    There is increasing focus across scientific fields on adequate sample sizes to ensure non-biased and reproducible effects. Very few studies, however, report sample size calculations or even the information needed to accurately calculate sample sizes for grants and future research. We systematically reviewed 100 randomly selected clinical human electrophysiology studies from six high impact journals that frequently publish electroencephalography (EEG) and event-related potential (ERP) research to determine the proportion of studies that reported sample size calculations, as well as the proportion of studies reporting the necessary components to complete such calculations. Studies were coded by the two authors blinded to the other's results. Inter-rater reliability was 100% for the sample size calculations and kappa above 0.82 for all other variables. Zero of the 100 studies (0%) reported sample size calculations. 77% utilized repeated-measures designs, yet zero studies (0%) reported the necessary variances and correlations among repeated measures to accurately calculate future sample sizes. Most studies (93%) reported study statistical values (e.g., F or t values). Only 40% reported effect sizes, 56% reported mean values, and 47% reported indices of variance (e.g., standard deviations/standard errors). Absence of such information hinders accurate determination of sample sizes for study design, grant applications, and meta-analyses of research and whether studies were adequately powered to detect effects of interest. Increased focus on sample size calculations, utilization of registered reports, and presenting information detailing sample size calculations and statistics for future researchers are needed and will increase sample size-related scientific rigor in human electrophysiology research.

  19. Pitfalls in reporting sample size calculation in randomized controlled trials published in leading anaesthesia journals: a systematic review.

    PubMed

    Abdulatif, M; Mukhtar, A; Obayah, G

    2015-11-01

    We have evaluated the pitfalls in reporting sample size calculation in randomized controlled trials (RCTs) published in the 10 highest impact factor anaesthesia journals. Superiority RCTs published in 2013 were identified and checked for the basic components required for sample size calculation and replication. The difference between the reported and replicated sample size was estimated. The sources used for estimating the expected effect size (Δ) were identified, and the difference between the expected and observed effect sizes (Δ gap) was estimated. We enrolled 194 RCTs. Sample size calculation was reported in 91.7% of studies. Replication of sample size calculation was possible in 80.3% of studies. The original and replicated sample sizes were identical in 67.8% of studies. The difference between the replicated and reported sample sizes exceeded 10% in 28.7% of studies. The expected and observed effect sizes were comparable in RCTs with positive outcomes (P=0.1). Studies with negative outcome tended to overestimate the effect size (Δ gap 42%, 95% confidence interval 32-51%), P<0.001. Post hoc power of negative studies was 20.2% (95% confidence interval 13.4-27.1%). Studies using data derived from pilot studies for sample size calculation were associated with the smallest Δ gaps (P=0.008). Sample size calculation is frequently reported in anaesthesia journals, but the details of basic elements for calculation are not consistently provided. In almost one-third of RCTs, the reported and replicated sample sizes were not identical and the assumptions for the expected effect size and variance were not supported by relevant literature or pilot studies.

  20. Determination of reference limits: statistical concepts and tools for sample size calculation.

    PubMed

    Wellek, Stefan; Lackner, Karl J; Jennen-Steinmetz, Christine; Reinhard, Iris; Hoffmann, Isabell; Blettner, Maria

    2014-12-01

    Reference limits are estimators for 'extreme' percentiles of the distribution of a quantitative diagnostic marker in the healthy population. In most cases, interest will be in the 90% or 95% reference intervals. The standard parametric method of determining reference limits consists of computing quantities of the form X̅±c·S. The proportion of covered values in the underlying population coincides with the specificity obtained when a measurement value falling outside the corresponding reference region is classified as diagnostically suspect. Nonparametrically, reference limits are estimated by means of so-called order statistics. In both approaches, the precision of the estimate depends on the sample size. We present computational procedures for calculating minimally required numbers of subjects to be enrolled in a reference study. The much more sophisticated concept of reference bands replacing statistical reference intervals in case of age-dependent diagnostic markers is also discussed.
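
    A small sketch of the parametric side of this record: reference limits of the form x̄ ± c·s, together with a rough sample size rule based on the normal-theory approximation SE(x̄ + c·s) ≈ σ·sqrt(1/n + c²/(2n)). This is a textbook simplification, not the exact procedures developed in the paper; the function names and the target precision in the example are my own illustrative choices.

```python
import math
from scipy.stats import norm

def parametric_limits(mean, sd, coverage=0.95):
    """Parametric reference limits x̄ ± c·s for an (approximately) normal marker."""
    c = norm.ppf(1 - (1 - coverage) / 2)
    return mean - c * sd, mean + c * sd

def n_for_limit_precision(target_se_fraction, coverage=0.95):
    """Smallest n with SE(x̄ + c·s) ≈ σ·sqrt((1 + c²/2)/n) at most target_se_fraction·σ."""
    c = norm.ppf(1 - (1 - coverage) / 2)
    return math.ceil((1 + c**2 / 2) / target_se_fraction**2)

print(parametric_limits(5.0, 1.2), n_for_limit_precision(0.10))
```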

  1. The quality of the reported sample size calculations in randomized controlled trials indexed in PubMed.

    PubMed

    Lee, Paul H; Tse, Andy C Y

    2017-05-01

    There are limited data on the quality of reporting of information essential for replication of the calculation as well as the accuracy of the sample size calculation. We examine the current quality of reporting of the sample size calculation in randomized controlled trials (RCTs) published in PubMed and the variation in reporting across study design, study characteristics, and journal impact factor. We also reviewed the targeted sample size reported in trial registries. We reviewed and analyzed all RCTs published in December 2014 with journals indexed in PubMed. The 2014 Impact Factors for the journals were used as proxies for their quality. Of the 451 analyzed papers, 58.1% reported an a priori sample size calculation. Nearly all papers provided the level of significance (97.7%) and desired power (96.6%), and most of the papers reported the minimum clinically important effect size (73.3%). The median (inter-quartile range, IQR) percentage difference between the reported and recalculated sample sizes was 0.0% (IQR -4.6%; 3.0%). The accuracy of the reported sample size was better for studies published in journals that endorsed the CONSORT statement and journals with an impact factor. A total of 98 papers had provided targeted sample size on trial registries and about two-thirds of these papers (n=62) reported sample size calculation, but only 25 (40.3%) had no discrepancy with the reported number in the trial registries. The reporting of the sample size calculation in RCTs published in PubMed-indexed journals and trial registries was poor. The CONSORT statement should be more widely endorsed. Copyright © 2016 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.

  2. Power and sample size calculations for interval-censored survival analysis.

    PubMed

    Kim, Hae-Young; Williamson, John M; Lin, Hung-Mo

    2016-04-15

    We propose a method for calculating power and sample size for studies involving interval-censored failure time data that only involves standard software required for fitting the appropriate parametric survival model. We use the framework of a longitudinal study where patients are assessed periodically for a response and the only resultant information available to the investigators is the failure window: the time between the last negative and first positive test results. The survival model is fit to an expanded data set using easily computed weights. We illustrate with a Weibull survival model and a two-group comparison. The investigator can specify a group difference in terms of a hazards ratio. Our simulation results demonstrate the merits of these proposed power calculations. We also explore how the number of assessments (visits), and thus the corresponding lengths of the failure intervals, affect study power. The proposed method can be easily extended to more complex study designs and a variety of survival and censoring distributions. Copyright © 2015 John Wiley & Sons, Ltd.

  3. Sample size calculation based on exact test for assessing differential expression analysis in RNA-seq data

    PubMed Central

    2013-01-01

    Background Sample size calculation is an important issue in the experimental design of biomedical research. For RNA-seq experiments, the sample size calculation method based on the Poisson model has been proposed; however, when there are biological replicates, RNA-seq data could exhibit variation significantly greater than the mean (i.e. over-dispersion). The Poisson model cannot appropriately model the over-dispersion, and in such cases, the negative binomial model has been used as a natural extension of the Poisson model. Because the field currently lacks a sample size calculation method based on the negative binomial model for assessing differential expression analysis of RNA-seq data, we propose a method to calculate the sample size. Results We propose a sample size calculation method based on the exact test for assessing differential expression analysis of RNA-seq data. Conclusions The proposed sample size calculation method is straightforward and not computationally intensive. Simulation studies to evaluate the performance of the proposed sample size method are presented; the results indicate our method works well, with achievement of desired power. PMID:24314022

  4. Sample size calculation based on exact test for assessing differential expression analysis in RNA-seq data.

    PubMed

    Li, Chung-I; Su, Pei-Fang; Shyr, Yu

    2013-12-06

    Sample size calculation is an important issue in the experimental design of biomedical research. For RNA-seq experiments, the sample size calculation method based on the Poisson model has been proposed; however, when there are biological replicates, RNA-seq data could exhibit variation significantly greater than the mean (i.e. over-dispersion). The Poisson model cannot appropriately model the over-dispersion, and in such cases, the negative binomial model has been used as a natural extension of the Poisson model. Because the field currently lacks a sample size calculation method based on the negative binomial model for assessing differential expression analysis of RNA-seq data, we propose a method to calculate the sample size. We propose a sample size calculation method based on the exact test for assessing differential expression analysis of RNA-seq data. The proposed sample size calculation method is straightforward and not computationally intensive. Simulation studies to evaluate the performance of the proposed sample size method are presented; the results indicate our method works well, with achievement of desired power.

  5. Quantification of Errors in Ordinal Outcome Scales Using Shannon Entropy: Effect on Sample Size Calculations

    PubMed Central

    Mandava, Pitchaiah; Krumpelman, Chase S.; Shah, Jharna N.; White, Donna L.; Kent, Thomas A.

    2013-01-01

    provide the user with programs to calculate and incorporate errors into sample size estimation. PMID:23861800

  6. Estimating design effect and calculating sample size for respondent-driven sampling studies of injection drug users in the United States.

    PubMed

    Wejnert, Cyprian; Pham, Huong; Krishna, Nevin; Le, Binh; DiNenno, Elizabeth

    2012-05-01

    Respondent-driven sampling (RDS) has become increasingly popular for sampling hidden populations, including injecting drug users (IDU). However, RDS data are unique and require specialized analysis techniques, many of which remain underdeveloped. RDS sample size estimation requires knowing design effect (DE), which can only be calculated post hoc. Few studies have analyzed RDS DE using real world empirical data. We analyze estimated DE from 43 samples of IDU collected using a standardized protocol. We find the previous recommendation that sample size be at least doubled, consistent with DE = 2, underestimates true DE and recommend researchers use DE = 4 as an alternate estimate when calculating sample size. A formula for calculating sample size for RDS studies among IDU is presented. Researchers faced with limited resources may wish to accept slightly higher standard errors to keep sample size requirements low. Our results highlight dangers of ignoring sampling design in analysis.
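
    The paper presents its own formula, which is not reproduced here; the sketch below is a generic precision-based version of the same idea: the usual z²·p(1−p)/d² size for estimating a proportion from a simple random sample, inflated by the design effect of 4 recommended in the abstract. The expected proportion, half-width and function name are placeholders.

```python
import math
from scipy.stats import norm

def rds_sample_size(p_expected, half_width, design_effect=4.0, alpha=0.05):
    """Sample size for estimating a proportion under RDS, inflated by an assumed design effect."""
    z = norm.ppf(1 - alpha / 2)
    n_srs = z**2 * p_expected * (1 - p_expected) / half_width**2
    return math.ceil(design_effect * n_srs)

print(rds_sample_size(p_expected=0.30, half_width=0.05))   # DE = 4 per the abstract
```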

  7. Sample size calculation through the incorporation of heteroscedasticity and dependence for a penalized t-statistic in microarray experiments.

    PubMed

    Hirakawa, Akihiro; Hamada, Chikuma; Yoshimura, Isao

    2012-01-01

    When identifying the differentially expressed genes (DEGs) in microarray data, we often observe heteroscedasticity between groups and dependence among genes. Incorporating these factors is necessary for sample size calculation in microarray experiments. A penalized t-statistic is widely used to improve the identifiability of DEGs. We develop a formula to calculate sample size with dependence adjustment for the penalized t-statistic. Sample size is determined on the basis of overall power under certain conditions to maintain a certain false discovery rate. The usefulness of the proposed method is demonstrated by numerical studies using both simulated data and real data.

  8. Confidence intervals and sample size calculations for the weighted eta-squared effect sizes in one-way heteroscedastic ANOVA.

    PubMed

    Shieh, Gwowen

    2013-03-01

    Effect size reporting and interpreting practices have been extensively recommended in academic journals when primary outcomes of all empirical studies have been analyzed. This article presents an alternative approach to constructing confidence intervals of the weighted eta-squared effect size within the context of one-way heteroscedastic ANOVA models. It is shown that the proposed interval procedure has advantages over an existing method in its theoretical justification, computational simplicity, and numerical performance. For design planning, the corresponding sample size procedures for precise interval estimation of the weighted eta-squared association measure are also delineated. Specifically, the developed formulas compute the necessary sample sizes with respect to the considerations of expected confidence interval width and tolerance probability of interval width within a designated value. Supplementary computer programs are provided to aid the implementation of the suggested techniques in practical applications of ANOVA designs when the assumption of homogeneous variances is not tenable.

  9. Sample size calculations for noninferiority trials with Poisson distributed count data.

    PubMed

    Stucke, Kathrin; Kieser, Meinhard

    2013-03-01

    Clinical trials with Poisson distributed count data as the primary outcome are common in various medical areas such as relapse counts in multiple sclerosis trials or the number of attacks in trials for the treatment of migraine. In this article, we present approximate sample size formulae for testing noninferiority using asymptotic tests which are based on restricted or unrestricted maximum likelihood estimators of the Poisson rates. The Poisson outcomes are allowed to be observed for unequal follow-up schemes, and both the situations that the noninferiority margin is expressed in terms of the difference and the ratio are considered. The exact type I error rates and powers of these tests are evaluated and the accuracy of the approximate sample size formulae is examined. The test statistic using the restricted maximum likelihood estimators (for the difference test problem) and the test statistic that is based on the logarithmic transformation and employs the maximum likelihood estimators (for the ratio test problem) show favorable type I error control and can be recommended for practical application. The approximate sample size formulae show high accuracy even for small sample sizes and provide power values identical or close to the aspired ones. The methods are illustrated by a clinical trial example from anesthesia.
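
    A hedged sketch of the difference-margin case only, using the unrestricted-maximum-likelihood Wald approximation: per-group n ≈ (z_{1−α} + z_{1−β})²·(λ_E/t_E + λ_C/t_C)/(δ − (λ_E − λ_C))². This reproduces the flavour of the approximate formulas discussed, not their exact restricted-MLE or ratio-margin versions, and the rates, follow-up times and margin in the example are invented.

```python
import math
from scipy.stats import norm

def poisson_ni_n_per_group(lam_exp, lam_ctrl, margin, t_exp=1.0, t_ctrl=1.0,
                           alpha=0.025, power=0.80):
    """Approximate per-group n for noninferiority on a Poisson rate difference (higher = worse)."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    var_per_subject = lam_exp / t_exp + lam_ctrl / t_ctrl   # Wald variance of the rate difference
    return math.ceil(z**2 * var_per_subject / (margin - (lam_exp - lam_ctrl)) ** 2)

# e.g. both arms expected at 2 events/year, 1-year follow-up, margin of 0.5 events/year
print(poisson_ni_n_per_group(lam_exp=2.0, lam_ctrl=2.0, margin=0.5))
```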

  10. Sample size calculations for cluster randomised controlled trials with a fixed number of clusters

    PubMed Central

    2011-01-01

    Background Cluster randomised controlled trials (CRCTs) are frequently used in health service evaluation. Assuming an average cluster size, required sample sizes are readily computed for both binary and continuous outcomes, by estimating a design effect or inflation factor. However, where the number of clusters is fixed in advance, but where it is possible to increase the number of individuals within each cluster, as is frequently the case in health service evaluation, sample size formulae have been less well studied. Methods We systematically outline sample size formulae (including required number of randomisation units, detectable difference and power) for CRCTs with a fixed number of clusters, to provide a concise summary for both binary and continuous outcomes. Extensions to the case of unequal cluster sizes are provided. Results For trials with a fixed number of equal sized clusters (k), the trial will be feasible provided the number of clusters is greater than the product of the number of individuals required under individual randomisation (nI) and the estimated intra-cluster correlation (ρ). So, a simple rule is that the number of clusters (k) will be sufficient provided k > nI × ρ. Where this is not the case, investigators can determine the maximum available power to detect the pre-specified difference, or the minimum detectable difference under the pre-specified value for power. Conclusions Designing a CRCT with a fixed number of clusters might mean that the study will not be feasible, leading to the notion of a minimum detectable difference (or a maximum achievable power), irrespective of how many individuals are included within each cluster. PMID:21718530
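
    The feasibility rule quoted above follows from the design-effect relation k·m = nI·(1 + (m − 1)·ρ), which can be solved for the required cluster size m when the number of clusters is fixed. The sketch below is a minimal sketch of that algebra (interpreting k and nI per arm); the numbers in the example are illustrative.

```python
import math

def cluster_size_for_fixed_k(n_individual, k, icc):
    """Cluster size needed per arm when the number of clusters per arm, k, is fixed.

    Solves k*m = n_individual * (1 + (m - 1)*icc); infeasible unless k > n_individual * icc.
    """
    if k <= n_individual * icc:
        return None                 # no cluster size can deliver the required power
    return math.ceil(n_individual * (1 - icc) / (k - n_individual * icc))

print(cluster_size_for_fixed_k(n_individual=131, k=10, icc=0.05))
```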

  11. Bayesian sample size calculation for estimation of the difference between two binomial proportions.

    PubMed

    Pezeshk, Hamid; Nematollahi, Nader; Maroufy, Vahed; Marriott, Paul; Gittins, John

    2013-12-01

    In this study, we discuss a decision theoretic or fully Bayesian approach to the sample size question in clinical trials with binary responses. Data are assumed to come from two binomial distributions. A Dirichlet distribution is assumed to describe prior knowledge of the two success probabilities p1 and p2. The parameter of interest is p = p1 - p2. The optimal size of the trial is obtained by maximising the expected net benefit function. The methodology presented in this article extends previous work by the assumption of dependent prior distributions for p1 and p2.

  12. Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments.

    PubMed

    Bi, Ran; Liu, Peng

    2016-03-31

    RNA-Sequencing (RNA-seq) experiments have been popularly applied to transcriptome studies in recent years. Such experiments are still relatively costly. As a result, RNA-seq experiments often employ a small number of replicates. Power analysis and sample size calculation are challenging in the context of differential expression analysis with RNA-seq data. One challenge is that there are no closed-form formulae to calculate power for the popularly applied tests for differential expression analysis. In addition, false discovery rate (FDR), instead of family-wise type I error rate, is controlled for the multiple testing error in RNA-seq data analysis. So far, there are very few proposals on sample size calculation for RNA-seq experiments. In this paper, we propose a procedure for sample size calculation while controlling FDR for RNA-seq experimental design. Our procedure is based on the weighted linear model analysis facilitated by the voom method which has been shown to have competitive performance in terms of power and FDR control for RNA-seq differential expression analysis. We derive a method that approximates the average power across the differentially expressed genes, and then calculate the sample size to achieve a desired average power while controlling FDR. Simulation results demonstrate that the actual power of several popularly applied tests for differential expression is achieved and is close to the desired power for RNA-seq data with sample size calculated based on our method. Our proposed method provides an efficient algorithm to calculate sample size while controlling FDR for RNA-seq experimental design. We also provide an R package ssizeRNA that implements our proposed method and can be downloaded from the Comprehensive R Archive Network ( http://cran.r-project.org ).

  13. Development of a sampling strategy and sample size calculation to estimate the distribution of mammographic breast density in Korean women.

    PubMed

    Jun, Jae Kwan; Kim, Mi Jin; Choi, Kui Son; Suh, Mina; Jung, Kyu-Won

    2012-01-01

    Mammographic breast density is a known risk factor for breast cancer. To conduct a survey to estimate the distribution of mammographic breast density in Korean women, appropriate sampling strategies for representative and efficient sampling design were evaluated through simulation. Using the target population from the National Cancer Screening Programme (NCSP) for breast cancer in 2009, we verified the distribution estimate by repeating the simulation 1,000 times using stratified random sampling to investigate the distribution of breast density of 1,340,362 women. According to the simulation results, using a sampling design stratifying the nation into three groups (metropolitan, urban, and rural), with a total sample size of 4,000, we estimated the distribution of breast density in Korean women at a level of 0.01% tolerance. Based on the results of our study, a nationwide survey for estimating the distribution of mammographic breast density among Korean women can be conducted efficiently.

  14. Exact calculation of power and sample size in bioequivalence studies using two one-sided tests.

    PubMed

    Shen, Meiyu; Russek-Cohen, Estelle; Slud, Eric V

    2015-01-01

    The number of subjects in a pharmacokinetic two-period two-treatment crossover bioequivalence study is typically small, most often less than 60. The most common approach to testing for bioequivalence is the two one-sided tests procedure. No explicit mathematical formula for the power function in the context of the two one-sided tests procedure exists in the statistical literature, although the exact power based on Owen's special case of bivariate noncentral t-distribution has been tabulated and graphed. Several approximations have previously been published for the probability of rejection in the two one-sided tests procedure for crossover bioequivalence studies. These approximations and associated sample size formulas are reviewed in this article and compared for various parameter combinations with exact power formulas derived here, which are computed analytically as univariate integrals and which have been validated by Monte Carlo simulations. The exact formulas for power and sample size are shown to improve markedly in realistic parameter settings over the previous approximations.

  15. Sample Size Calculation: Inaccurate A Priori Assumptions for Nuisance Parameters Can Greatly Affect the Power of a Randomized Controlled Trial.

    PubMed

    Tavernier, Elsa; Giraudeau, Bruno

    2015-01-01

    We aimed to examine the extent to which inaccurate assumptions for nuisance parameters used to calculate sample size can affect the power of a randomized controlled trial (RCT). In a simulation study, we separately considered an RCT with continuous, dichotomous or time-to-event outcomes, with associated nuisance parameters of standard deviation, success rate in the control group and survival rate in the control group at some time point, respectively. For each type of outcome, we calculated a required sample size N for a hypothesized treatment effect, an assumed nuisance parameter and a nominal power of 80%. We then assumed a nuisance parameter associated with a relative error at the design stage. For each type of outcome, we randomly drew 10,000 relative errors of the associated nuisance parameter (from empirical distributions derived from a previously published review). Then, retro-fitting the sample size formula, we derived, for the pre-calculated sample size N, the real power of the RCT, taking into account the relative error for the nuisance parameter. In total, 23%, 0% and 18% of RCTs with continuous, binary and time-to-event outcomes, respectively, were underpowered (i.e., the real power was < 60%, as compared with the 80% nominal power); 41%, 16% and 6%, respectively, were overpowered (i.e., with real power > 90%). Even with proper calculation of sample size, a substantial number of trials are underpowered or overpowered because of imprecise knowledge of nuisance parameters. Such findings raise questions about how sample size for RCTs should be determined.
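
    A minimal sketch of the retro-fitting idea for the continuous-outcome case, under standard two-sample normal approximations: plan n with an assumed SD, then recompute the power actually delivered when the true SD differs. The effect size and the 30% relative error on the SD are illustrative, and the function names are my own.

```python
import math
from scipy.stats import norm

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Planned per-group n for a two-sample comparison of means."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return math.ceil(2 * (z * sd / delta) ** 2)

def real_power(delta, sd_true, n, alpha=0.05):
    """Power actually achieved with n per group when the true SD differs from the planning value."""
    z_alpha = norm.ppf(1 - alpha / 2)
    return norm.cdf(delta * math.sqrt(n / 2) / sd_true - z_alpha)

n = n_per_group(delta=0.5, sd=1.0)                # planned with an assumed SD of 1.0
print(real_power(delta=0.5, sd_true=1.3, n=n))    # SD underestimated by 30% -> power drops
```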

  16. Integrating Software for Sample Size Calculations, Data Entry, and Tabulation: Software Demonstration of a System for Survey Research.

    ERIC Educational Resources Information Center

    Lambert, Richard; Flowers, Claudia; Sipe, Theresa; Idleman, Lynda

    This paper discusses three software packages that offer unique features and options that greatly simplify the research package for conducting surveys. The first package, EPSILON, from Resource Group, Ltd. of Dallas (Texas) is designed to perform a variety of sample size calculations covering most of the commonly encountered survey research…

  17. A convenient formula for sample size calculations in clinical trials with multiple co-primary continuous endpoints.

    PubMed

    Sugimoto, Tomoyuki; Sozu, Takashi; Hamasaki, Toshimitsu

    2012-01-01

    The clinical efficacy of a new treatment may often be better evaluated by two or more co-primary endpoints. Recently, in pharmaceutical drug development, there has been increasing discussion regarding establishing statistically significant favorable results on more than one endpoint in comparisons between treatments, which is referred to as a problem of multiple co-primary endpoints. Several methods have been proposed for calculating the sample size required to design a trial with multiple co-primary correlated endpoints. However, because these methods require users to have considerable mathematical sophistication and knowledge of programming techniques, their application and spread may be restricted in practice. To improve the convenience of these methods, in this paper, we provide a useful formula with accompanying numerical tables for sample size calculations to design clinical trials with two treatments, where the efficacy of a new treatment is demonstrated on continuous co-primary endpoints. In addition, we provide some examples to illustrate the sample size calculations made using the formula. Using the formula and the tables, which can be read according to the patterns of correlations and effect size ratios expected in multiple co-primary endpoints, makes it convenient to evaluate the required sample size promptly.
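
    A minimal sketch of the power calculation for two co-primary continuous endpoints, where the trial succeeds only if both one-sided tests are significant. It uses a bivariate-normal approximation with an assumed correlation rho and standardized effect sizes; it is not the article's formula or its tables.

```python
# Power for two co-primary endpoints: P(both z-statistics exceed the critical value).
import numpy as np
from scipy import stats

def copri_power(n_per_group, eff1, eff2, rho, alpha=0.025):
    z_crit = stats.norm.ppf(1 - alpha)
    mu = np.sqrt(n_per_group / 2) * np.array([eff1, eff2])   # noncentralities
    cov = [[1.0, rho], [rho, 1.0]]
    # P(Z1 > c, Z2 > c) computed as a lower-orthant probability of (-Z1, -Z2)
    mvn = stats.multivariate_normal(mean=-mu, cov=cov)
    return mvn.cdf([-z_crit, -z_crit])

for n in (80, 100, 120):
    print(n, round(float(copri_power(n, 0.4, 0.35, rho=0.5)), 3))
```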

  18. Statistical grand rounds: a review of analysis and sample size calculation considerations for Wilcoxon tests.

    PubMed

    Divine, George; Norton, H James; Hunt, Ronald; Dienemann, Jacqueline

    2013-09-01

    When a study uses an ordinal outcome measure with unknown differences in the anchors and a small range such as 4 or 7, use of the Wilcoxon rank sum test or the Wilcoxon signed rank test may be most appropriate. However, because nonparametric methods are at best indirect functions of standard measures of location such as means or medians, the choice of the most appropriate summary measure can be difficult. The issues underlying use of these tests are discussed. The Wilcoxon-Mann-Whitney odds directly reflects the quantity that the rank sum procedure actually tests, and thus it can be a superior summary measure. Unlike the means and medians, its value will have a one-to-one correspondence with the Wilcoxon rank sum test result. The companion article appearing in this issue of Anesthesia & Analgesia ("Aromatherapy as Treatment for Postoperative Nausea: A Randomized Trial") illustrates these issues and provides an example of a situation for which the medians imply no difference between 2 groups, even though the groups are, in fact, quite different. The trial cited also provides an example of a single sample that has a median of zero, yet there is a substantial shift for much of the nonzero data, and the Wilcoxon signed rank test is quite significant. These examples highlight the potential discordance between medians and Wilcoxon test results. Along with the issues surrounding the choice of a summary measure, there are considerations for the computation of sample size and power, confidence intervals, and multiple comparison adjustment. In addition, despite the increased robustness of the Wilcoxon procedures relative to parametric tests, some circumstances in which the Wilcoxon tests may perform poorly are noted, along with alternative versions of the procedures that correct for such limitations. 
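
    A toy illustration of the Wilcoxon-Mann-Whitney odds mentioned above: the odds that a randomly chosen observation from one group exceeds one from the other, with ties split evenly. The data are invented ordinal scores, not the cited trial's data.

```python
# Wilcoxon-Mann-Whitney odds: p / (1 - p), where p = P(A > B) + 0.5 * P(A = B).
import numpy as np

def wmw_odds(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    wins = (a[:, None] > b[None, :]).sum()
    ties = (a[:, None] == b[None, :]).sum()
    p = (wins + 0.5 * ties) / (len(a) * len(b))
    return p / (1 - p)

a = [0, 1, 2, 2, 3, 4]    # e.g. ordinal symptom scores in group A (toy data)
b = [0, 0, 1, 1, 2, 2]
print(wmw_odds(a, b))
```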

  19. Optimality, sample size, and power calculations for the sequential parallel comparison design.

    PubMed

    Ivanova, Anastasia; Qaqish, Bahjat; Schoenfeld, David A

    2011-10-15

    The sequential parallel comparison design (SPCD) has been proposed to increase the likelihood of success of clinical trials in therapeutic areas where high-placebo response is a concern. The trial is run in two stages, and subjects are randomized into three groups: (i) placebo in both stages; (ii) placebo in the first stage and drug in the second stage; and (iii) drug in both stages. We consider the case of binary response data (response/no response). In the SPCD, all first-stage and second-stage data from placebo subjects who failed to respond in the first stage of the trial are utilized in the efficacy analysis. We develop 1 and 2 degree of freedom score tests for treatment effect in the SPCD. We give formulae for asymptotic power and for sample size computations and evaluate their accuracy via simulation studies. We compute the optimal allocation ratio between drug and placebo in stage 1 for the SPCD to determine from a theoretical viewpoint whether a single-stage design, a two-stage design with placebo only in the first stage, or a two-stage design is the best design for a given set of response rates. As response rates are not known before the trial, a two-stage approach with allocation to active drug in both stages is a robust design choice. Copyright © 2011 John Wiley & Sons, Ltd.

  20. Some Issues of Sample Size Calculation for Time-to-Event Endpoints Using the Freedman and Schoenfeld Formulas.

    PubMed

    Abel, Ulrich R; Jensen, Katrin; Karapanagiotou-Schenkel, Irini; Kieser, Meinhard

    2015-01-01

    This article deals with seven special issues related to the assumptions, applicability, and practical use of formulas for calculating power or sample size, respectively, for comparative clinical trials with time-to-event endpoints, with particular focus on the well-known Freedman and Schoenfeld methods. All problems addressed are illustrated by numerical examples, and recommendations are given on how to deal with them in the planning of clinical trials.
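
    For orientation, a sketch of the two classical event-count formulas the article examines, Schoenfeld's and Freedman's, under 1:1 allocation, together with the usual conversion from required events to total patients via an assumed overall event probability. Inputs are illustrative.

```python
# Required number of events for a log-rank comparison: Schoenfeld vs. Freedman.
import math
from scipy import stats

def schoenfeld_events(hr, alpha=0.05, power=0.80, p_alloc=0.5):
    za, zb = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
    return (za + zb) ** 2 / (p_alloc * (1 - p_alloc) * math.log(hr) ** 2)

def freedman_events(hr, alpha=0.05, power=0.80):
    za, zb = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
    return (za + zb) ** 2 * ((hr + 1) / (hr - 1)) ** 2

hr, p_event = 0.7, 0.6     # assumed hazard ratio and overall event probability
for name, d in [("Schoenfeld", schoenfeld_events(hr)), ("Freedman", freedman_events(hr))]:
    print(name, math.ceil(d), "events,", math.ceil(d / p_event), "patients")
```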

  1. 45 CFR Appendix C to Part 1356 - Calculating Sample Size for NYTD Follow-Up Populations

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... Populations C Appendix C to Part 1356 Public Welfare Regulations Relating to Public Welfare (Continued) OFFICE... Follow-Up Populations 1. Using Finite Population Correction The Finite Population Correction (FPC) is applied when the sample is drawn from a population of one to 5,000 youth, because the sample is more than...

  2. 45 CFR Appendix C to Part 1356 - Calculating Sample Size for NYTD Follow-Up Populations

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... Populations C Appendix C to Part 1356 Public Welfare Regulations Relating to Public Welfare (Continued) OFFICE... Follow-Up Populations 1. Using Finite Population Correction The Finite Population Correction (FPC) is applied when the sample is drawn from a population of one to 5,000 youth, because the sample is more than...

  3. 45 CFR Appendix C to Part 1356 - Calculating Sample Size for NYTD Follow-Up Populations

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... Populations C Appendix C to Part 1356 Public Welfare Regulations Relating to Public Welfare (Continued) OFFICE... Follow-Up Populations 1. Using Finite Population Correction The Finite Population Correction (FPC) is applied when the sample is drawn from a population of one to 5,000 youth, because the sample is more than...

  4. 45 CFR Appendix C to Part 1356 - Calculating Sample Size for NYTD Follow-Up Populations

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... Populations C Appendix C to Part 1356 Public Welfare Regulations Relating to Public Welfare (Continued) OFFICE... Follow-Up Populations 1. Using Finite Population Correction The Finite Population Correction (FPC) is applied when the sample is drawn from a population of one to 5,000 youth, because the sample is more than...

  5. Sample size calculation based on generalized linear models for differential expression analysis in RNA-seq data.

    PubMed

    Li, Chung-I; Shyr, Yu

    2016-12-01

    As RNA-seq rapidly develops and costs continually decrease, the quantity and frequency of samples being sequenced will grow exponentially. With proteomic investigations becoming more multivariate and quantitative, determining a study's optimal sample size is now a vital step in experimental design. Current methods for calculating a study's required sample size are mostly based on the hypothesis testing framework, which assumes each gene count can be modeled through Poisson or negative binomial distributions; however, these methods are limited when it comes to accommodating covariates. To address this limitation, we propose an estimating procedure based on the generalized linear model. This easy-to-use method constructs a representative exemplary dataset and estimates the conditional power, all without requiring complicated mathematical approximations or formulas. Even more attractive, the downstream analysis can be performed with current R/Bioconductor packages. To demonstrate the practicability and efficiency of this method, we apply it to three real-world studies, and introduce our on-line calculator developed to determine the optimal sample size for a RNA-seq study.

  6. Sample size calculation in survival trials accounting for time-varying relationship between noncompliance and risk of outcome event.

    PubMed

    Li, Bingbing; Grambsch, Patricia

    2006-01-01

    Most methods of sample size calculations for survival trials adjust the estimated outcome event rates for noncompliance based on the assumption that non-compliance is independent of the risk of the outcome event although there has been published evidence that noncompliers are often at a higher risk than compliers. More recent work has started to consider the situations of informative noncompliance and different risks for noncompliers. However, the possibility of a time-varying association between noncompliance and risk has been ignored. Our analysis indicated a strong time-varying relationship between noncompliance defined as permanent discontinuation of study treatments and risk of the outcome event in the CONVINCE trial. The purpose of this research is to develop methods for the log-rank sample size calculations for two-arm clinical trials that allow for the relationship between risk and noncompliance to vary over time and to study how sample size requirements vary with different patterns of the time relationship. The method developed takes Lakatos' Markov chain approach as a basis, modifying it to incorporate time dynamics, and emphasizing permanent discontinuation of study medication as the form of noncompliance to be considered. Results with our method show that sample size depends on the relative rates of noncompliance in the two arms, the hazard for the outcome event following non-compliance, whether it involves switching to the hazard of the opposite arm or is common to both arms, and whether noncompliance occurs early or late in the trial. These factors interact with each other in complex ways, precluding simple summaries. This research focuses on two-arm clinical trials with time to event as primary outcome measure. The method developed is not directly applicable to trials with more complicated designs and/or trials with other types of primary outcome. The pattern of the relationship between noncompliance and risk can have a dramatic impact on the sample

  7. [Sample Size Calculation using SAS for Phase II in Two-stage Clinical Trials of Anti-tumor Drugs].

    PubMed

    Li, Ji-Jie; Hou, Li-Sha; Zhu, Ping; Du, Xu-Dong; Zhu, Cai-Rong

    2017-07-01

    To compare the Gehan two-stage design and the Simon two-stage design in sample size calculations for phase II clinical trials of anti-tumor drugs. We explained the sample size calculation methods for a single-stage design, the Gehan two-stage design, and the Simon optimal and minimax two-stage designs, all based on exact binomial probabilities. By setting different parameters in a SAS macro program, the advantages and disadvantages of these designs were compared. The minimax two-stage design does not increase the maximum sample size compared with the single-stage design. Compared with the Gehan two-stage design, the Simon two-stage design has the advantage of allowing early termination of a trial when no or low anti-tumor activity is evident. The Simon two-stage design is better than the single-stage and Gehan two-stage designs. The minimax design is more popular than the optimal design.
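
    A hedged sketch of the exact binomial calculations that underlie a Simon two-stage design: probability of early termination, expected sample size, and the probability of declaring the drug active. The design parameters (r1, n1, r, n) and response rates below are assumed for illustration and are not taken from Simon's published tables or from the article's SAS macro.

```python
# Operating characteristics of a Simon two-stage design via exact binomial probabilities.
from scipy.stats import binom

def simon_ocs(r1, n1, r, n, p):
    n2 = n - n1
    pet = binom.cdf(r1, n1, p)                  # P(stop early): <= r1 responses in stage 1
    expected_n = n1 + (1 - pet) * n2
    # P(declare drug active): continue past stage 1 and exceed r responses overall
    p_reject = sum(binom.pmf(x1, n1, p) * binom.sf(r - x1, n2, p)
                   for x1 in range(r1 + 1, n1 + 1))
    return pet, expected_n, p_reject

# Assumed illustrative design: r1/n1 = 1/12, r/n = 5/35, for p0 = 0.10 vs. p1 = 0.30.
for p in (0.10, 0.30):
    pet, en, rej = simon_ocs(r1=1, n1=12, r=5, n=35, p=p)
    print(f"p = {p}: PET = {pet:.3f}, E[N] = {en:.1f}, P(declare active) = {rej:.3f}")
```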

  8. Empirical power and sample size calculations for cluster-randomized and cluster-randomized crossover studies.

    PubMed

    Reich, Nicholas G; Myers, Jessica A; Obeng, Daniel; Milstone, Aaron M; Perl, Trish M

    2012-01-01

    In recent years, the number of studies using a cluster-randomized design has grown dramatically. In addition, the cluster-randomized crossover design has been touted as a methodological advance that can increase efficiency of cluster-randomized studies in certain situations. While the cluster-randomized crossover trial has become a popular tool, standards of design, analysis, reporting and implementation have not been established for this emergent design. We address one particular aspect of cluster-randomized and cluster-randomized crossover trial design: estimating statistical power. We present a general framework for estimating power via simulation in cluster-randomized studies with or without one or more crossover periods. We have implemented this framework in the clusterPower software package for R, freely available online from the Comprehensive R Archive Network. Our simulation framework is easy to implement and users may customize the methods used for data analysis. We give four examples of using the software in practice. The clusterPower package could play an important role in the design of future cluster-randomized and cluster-randomized crossover studies. This work is the first to establish a universal method for calculating power for both cluster-randomized and cluster-randomized clinical trials. More research is needed to develop standardized and recommended methodology for cluster-randomized crossover studies.

  9. Sample-size calculation and reestimation for a semiparametric analysis of recurrent event data taking robust standard errors into account.

    PubMed

    Ingel, Katharina; Jahn-Eimermacher, Antje

    2014-07-01

    In some clinical trials, the repeated occurrence of the same type of event is of primary interest and the Andersen-Gill model has been proposed to analyze recurrent event data. Existing methods to determine the required sample size for an Andersen-Gill analysis rely on the strong assumption that all heterogeneity in the individuals' risk to experience events can be explained by known covariates. In practice, however, this assumption might be violated due to unknown or unmeasured covariates affecting the time to events. In these situations, the use of a robust variance estimate in calculating the test statistic is highly recommended to assure the type I error rate, but this will in turn decrease the actual power of the trial. In this article, we derive a new sample-size formula to reach the desired power even in the presence of unexplained heterogeneity. The formula is based on an inflation factor that considers the degree of heterogeneity and characteristics of the robust variance estimate. Nevertheless, in the planning phase of a trial there will usually be some uncertainty about the size of the inflation factor. Therefore, we propose an internal pilot study design to reestimate the inflation factor during the study and adjust the sample size accordingly. In a simulation study, the performance and validity of this design with respect to type I error rate and power are proven. Our method is applied to the HepaTel trial evaluating a new intervention for patients with cirrhosis of the liver. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Towards power and sample size calculations for the comparison of two groups of patients with item response theory models.

    PubMed

    Hardouin, Jean-Benoit; Amri, Sarah; Feddag, Mohand-Larbi; Sébille, Véronique

    2012-05-20

    Evaluation of patient-reported outcomes (PRO) is increasingly performed in health sciences. PRO differs from other measurements because such patient characteristics cannot be directly observed. Item response theory (IRT) is an attractive way for PRO analysis. However, in the framework of IRT, sample size justification is rarely provided, or it ignores the fact that PRO measures are latent variables by using formulas developed for observed variables. It might therefore be inappropriate and might produce inadequately sized studies. The objective was to develop valid sample size methodology for the comparison of PRO in two groups of patients using IRT. The proposed approach takes into account the questionnaire's item parameters and the difference of the latent variable means, whose variance is approximated using the Cramer-Rao bound (CRB). We also computed the associated power. We performed a simulation study varying sample size, number of items, and the value of the group effect. We compared the power obtained from the CRB with the power obtained from simulations (SIM) and with the power based on observed variables (OBS). For a given sample size, powers using CRB and SIM were similar and always lower than OBS. We observed a strong impact of the number of items for CRB and SIM, the power increasing with the questionnaire's length, but not for OBS. In the context of latent variables, it seems important to use an adapted sample size formula because the formula developed for observed variables seems inadequate and leads to an underestimated study size.

  11. Analytic power and sample size calculation for the genotypic transmission/disequilibrium test in case-parent trio studies.

    PubMed

    Neumann, Christoph; Taub, Margaret A; Younkin, Samuel G; Beaty, Terri H; Ruczinski, Ingo; Schwender, Holger

    2014-11-01

    Case-parent trio studies considering genotype data from children affected by a disease and their parents are frequently used to detect single nucleotide polymorphisms (SNPs) associated with disease. The most popular statistical tests for this study design are transmission/disequilibrium tests (TDTs). Several types of these tests have been developed, for example, procedures based on alleles or genotypes. Therefore, it is of great interest to examine which of these tests have the highest statistical power to detect SNPs associated with disease. Comparisons of the allelic and the genotypic TDT for individual SNPs have so far been conducted based on simulation studies, since the test statistic of the genotypic TDT was determined numerically. Recently, however, it has been shown that this test statistic can be presented in closed form. In this article, we employ this analytic solution to derive equations for calculating the statistical power and the required sample size for different types of the genotypic TDT. The power of this test is then compared with that of the corresponding score test assuming the same mode of inheritance, as well as with the allelic TDT based on a multiplicative mode of inheritance, which is equivalent to the score test assuming an additive mode of inheritance. This is, thus, the first time the power of these tests is compared based on equations, yielding instant results and removing the need for time-consuming simulation studies. This comparison reveals that these tests have almost the same power, with the score test being slightly more powerful.

  12. "PowerUp"!: A Tool for Calculating Minimum Detectable Effect Sizes and Minimum Required Sample Sizes for Experimental and Quasi-Experimental Design Studies

    ERIC Educational Resources Information Center

    Dong, Nianbo; Maynard, Rebecca

    2013-01-01

    This paper and the accompanying tool are intended to complement existing supports for conducting power analysis tools by offering a tool based on the framework of Minimum Detectable Effect Sizes (MDES) formulae that can be used in determining sample size requirements and in estimating minimum detectable effect sizes for a range of individual- and…

  13. Fundamentals of estimating sample size.

    PubMed

    Malone, Helen Evelyn; Nicholl, Honor; Coyne, Imelda

    2016-05-01

    Estimating sample size is an integral requirement in the planning stages of quantitative studies. However, although abundant literature is available that describes techniques for calculating sample size, many are in-depth and have varying degrees of complexity. To provide an overview of four basic parameters that underpin the determination of sample size and to explain sample-size estimation for three study designs common in nursing research. Researchers can estimate basic sample size if they have a comprehension of four parameters, such as significance level, power, effect size, and standard deviation (for continuous data) or event rate (for dichotomous data). In this paper, these parameters are applied to determine sample size for the following well-established study designs: a comparison of two independent means, the paired mean study design and a comparison of two proportions. An informed choice of parameter values to input into estimates of sample size enables the researcher to derive the minimum sample size required with sufficient power to detect a meaningful effect. An understanding of the parameters provides the foundation from which to generalise to more complex size estimates. It also enables more informed entry of required parameters into sample size software. Underpinning the concept of evidence-based practice in nursing and midwifery is the application of findings that are statistically sound. Researchers with a good understanding of parameters, such as significance level, power, effect size, standard deviation and event rate, are enabled to calculate an informed sample size estimation and to report more clearly the rationale for applying any particular parameter value in sample size determination.
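
    As a concrete companion to the parameters listed above, a minimal sketch of the standard sample size calculation for comparing two proportions, using a pooled normal approximation; the event rates are illustrative assumptions.

```python
# Sample size per group for comparing two proportions (pooled normal approximation).
import math
from scipy import stats

def n_per_group_props(p1, p2, alpha=0.05, power=0.80):
    za, zb = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
    pbar = (p1 + p2) / 2
    num = (za * math.sqrt(2 * pbar * (1 - pbar)) +
           zb * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

print(n_per_group_props(0.20, 0.35))   # e.g. detect 20% vs. 35% event rates
```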

  14. A systematic review of the reporting of sample size calculations and corresponding data components in observational functional magnetic resonance imaging studies.

    PubMed

    Guo, Qing; Thabane, Lehana; Hall, Geoffrey; McKinnon, Margaret; Goeree, Ron; Pullenayegum, Eleanor

    2014-02-01

    Anecdotal evidence suggests that functional magnetic resonance imaging (fMRI) studies rarely consider statistical power when setting a sample size. This raises concerns since undersized studies may fail to detect effects of interest and encourage data dredging. Although sample size methodology in this field exists, implementation requires specifications of estimated effect size and variance components. We therefore systematically evaluated how often estimates of effect size and variance components were reported in observational fMRI studies involving clinical human participants published in six leading journals between January 2010 and December 2011. A random sample of 100 eligible articles was included in data extraction and analyses. Two independent reviewers assessed the reporting of sample size calculations and the data components required to perform the calculations in the fMRI literature. One article (1%) reported sample size calculations. The reporting of parameter estimates for effect size (8%), between-subject variance (4%), within-subject variance (1%) and temporal autocorrelation matrix (0%) was uncommon. Three articles (3%) reported Cohen's d or F effect sizes. The majority (83%) reported peak or average t, z or F statistics. The inter-rater agreement was very good, with a prevalence-adjusted bias-adjusted kappa (PABAK) value greater than 0.88. We concluded that sample size calculations were seldom reported in fMRI studies. Moreover, omission of parameter estimates for effect size, between- and within-subject variances, and temporal autocorrelation matrix could limit investigators' ability to perform power analyses for new studies. We suggest routine reporting of these quantities, and recommend strategies for reducing bias in their reported values.

  15. Power and sample size calculations for the Wilcoxon-Mann-Whitney test in the presence of death-censored observations.

    PubMed

    Matsouaka, Roland A; Betensky, Rebecca A

    2015-02-10

    We consider a clinical trial of a potentially lethal disease in which patients are randomly assigned to two treatment groups and are followed for a fixed period of time; a continuous endpoint is measured at the end of follow-up. For some patients; however, death (or severe disease progression) may preclude measurement of the endpoint. A statistical analysis that includes only patients with endpoint measurements may be biased. An alternative analysis includes all randomized patients, with rank scores assigned to the patients who are available for the endpoint measurement on the basis of the magnitude of their responses and with 'worst-rank' scores assigned to those patients whose death precluded the measurement of the continuous endpoint. The worst-rank scores are worse than all observed rank scores. The treatment effect is then evaluated using the Wilcoxon-Mann-Whitney test. In this paper, we derive closed-form formulae for the power and sample size of the Wilcoxon-Mann-Whitney test when missing measurements of the continuous endpoints because of death are replaced by worst-rank scores. We distinguish two approaches for assigning the worst-rank scores. In the tied worst-rank approach, all deaths are weighted equally, and the worst-rank scores are set to a single value that is worse than all measured responses. In the untied worst-rank approach, the worst-rank scores further rank patients according to their time of death, so that an earlier death is considered worse than a later death, which in turn is worse than all measured responses. In addition, we propose four methods for the implementation of the sample size formulae for a trial with expected early death. We conduct Monte Carlo simulation studies to evaluate the accuracy of our power and sample size formulae and to compare the four sample size estimation methods. Copyright © 2014 John Wiley & Sons, Ltd.

  16. The impact of obstructive sleep apnea variability measured in-lab versus in-home on sample size calculations.

    PubMed

    Levendowski, Daniel; Steward, David; Woodson, B Tucker; Olmstead, Richard; Popovic, Djordje; Westbrook, Philip

    2009-01-02

    When conducting a treatment intervention, it is assumed that variability associated with measurement of the disease can be controlled sufficiently to reasonably assess the outcome. In this study we investigate the variability of Apnea-Hypopnea Index obtained by polysomnography and by in-home portable recording in untreated mild to moderate obstructive sleep apnea (OSA) patients at a four- to six-month interval. Thirty-seven adult patients serving as placebo controls underwent a baseline polysomnography and in-home sleep study followed by a second set of studies under the same conditions. The polysomnography studies were acquired and scored at three independent American Academy of Sleep Medicine accredited sleep laboratories. The in-home studies were acquired by the patient and scored using validated auto-scoring algorithms. The initial in-home study was conducted on average two months prior to the first polysomnography, the follow-up polysomnography and in-home studies were conducted approximately five to six months after the initial polysomnography. When comparing the test-retest Apnea-hypopnea Index (AHI) and apnea index (AI), the in-home results were more highly correlated (r = 0.65 and 0.68) than the comparable PSG results (r = 0.56 and 0.58). The in-home results provided approximately 50% less test-retest variability than the comparable polysomnography AHI and AI values. Both the overall polysomnography AHI and AI showed a substantial bias toward increased severity upon retest (8 and 6 events/hr respectively) while the in-home bias was essentially zero. The in-home percentage of time supine showed a better correlation compared to polysomnography (r = 0.72 vs. 0.43). Patients biased toward more time supine during the initial polysomnography; no trends in time supine for in-home studies were noted. Night-to-night variability in sleep-disordered breathing can be a confounding factor in assessing treatment outcomes. The sample size of this study was small given

  17. A practical simulation method to calculate sample size of group sequential trials for time-to-event data under exponential and Weibull distribution.

    PubMed

    Jiang, Zhiwei; Wang, Ling; Li, Chanjuan; Xia, Jielai; Jia, Hongxia

    2012-01-01

    Group sequential design has been widely applied in clinical trials in the past few decades. The sample size estimation is a vital concern of sponsors and investigators. Especially in the survival group sequential trials, it is a thorny question because of its ambiguous distributional form, censored data and different definition of information time. A practical and easy-to-use simulation-based method is proposed for multi-stage two-arm survival group sequential design in the article and its SAS program is available. Besides the exponential distribution, which is usually assumed for survival data, the Weibull distribution is considered here. The incorporation of the probability of discontinuation in the simulation leads to the more accurate estimate. The assessment indexes calculated in the simulation are helpful to the determination of number and timing of the interim analysis. The use of the method in the survival group sequential trials is illustrated and the effects of the varied shape parameter on the sample size under the Weibull distribution are explored by employing an example. According to the simulation results, a method to estimate the shape parameter of the Weibull distribution is proposed based on the median survival time of the test drug and the hazard ratio, which are prespecified by the investigators and other participants. 10+ simulations are recommended to achieve the robust estimate of the sample size. Furthermore, the method is still applicable in adaptive design if the strategy of sample size scheme determination is adopted when designing or the minor modifications on the program are made.

  18. Design and analysis of genetic association studies to finely map a locus identified by linkage analysis: sample size and power calculations.

    PubMed

    Hanson, R L; Looker, H C; Ma, L; Muller, Y L; Baier, L J; Knowler, W C

    2006-05-01

    Association (e.g. case-control) studies are often used to finely map loci identified by linkage analysis. We investigated the influence of various parameters on power and sample size requirements for such a study. Calculations were performed for various values of a high-risk functional allele (fA), frequency of a marker allele associated with the high risk allele (f1), degree of linkage disquilibrium between functional and marker alleles (D') and trait heritability attributable to the functional locus (h2). The calculations show that if cases and controls are selected from equal but opposite extreme quantiles of a quantitative trait, the primary determinants of power are h2 and the specific quantiles selected. For a dichotomous trait, power also depends on population prevalence. Power is optimal if functional alleles are studied (fA= f1 and D'= 1.0) and can decrease substantially as D' diverges from 1.0 or as f(1) diverges from fA. These analyses suggest that association studies to finely map loci are most powerful if potential functional polymorphisms are identified a priori or if markers are typed to maximize haplotypic diversity. In the absence of such information, expected minimum power at a given location for a given sample size can be calculated by specifying a range of potential frequencies for fA (e.g. 0.1-0.9) and determining power for all markers within the region with specification of the expected D' between the markers and the functional locus. This method is illustrated for a fine-mapping project with 662 single nucleotide polymorphisms in 24 Mb. Regions differed by marker density and allele frequencies. Thus, in some, power was near its theoretical maximum and little additional information is expected from additional markers, while in others, additional markers appear to be necessary. These methods may be useful in the analysis and interpretation of fine-mapping studies.

  19. MSurvPow: a FORTRAN program to calculate the sample size and power for cluster-randomized clinical trials with survival outcomes.

    PubMed

    Gao, Feng; Manatunga, Amita K; Chen, Shande

    2005-04-01

    Manatunga and Chen [A.K. Manatunga, S. Chen, Sample size estimation for survival outcomes in cluster-randomized studies with small cluster sizes, Biometrics 56 (2000) 616-621] proposed a method to estimate sample size and power for cluster-randomized studies where the primary outcome variable was survival time. The sample size formula was constructed by considering a bivariate marginal distribution (Clayton-Oakes model) with univariate exponential marginal distributions. In this paper, a user-friendly FORTRAN 90 program was provided to implement this method and a simple example was used to illustrate the features of the program.

  20. Calculating body frame size (image)

    MedlinePlus

    Determining frame size: To determine the body frame size, measure the wrist with a tape measure and use the following chart to determine whether the person is small, medium, or large boned. Women: Height under 5'2" Small = wrist size less ...

  1. Sample Size for Correlation Estimates

    DTIC Science & Technology

    1989-09-01

    graphs, and computer programs are developed to find the sample number needed for a desired confidence interval size. Nonparametric measures of...correlation (Spearman’s ra and Kendall’s tau) are also examined for appropriate sample numbers when a specific confidence interval size desired.

  2. mHealth Series: Factors influencing sample size calculations for mHealth–based studies – A mixed methods study in rural China

    PubMed Central

    van Velthoven, Michelle Helena; Li, Ye; Wang, Wei; Du, Xiaozhen; Chen, Li; Wu, Qiong; Majeed, Azeem; Zhang, Yanfeng; Car, Josip

    2013-01-01

    Background An important issue for mHealth evaluation is the lack of information for sample size calculations. Objective To explore factors that influence sample size calculations for mHealth–based studies and to suggest strategies for increasing the participation rate. Methods We explored factors influencing recruitment and follow–up of participants (caregivers of children) in an mHealth text messaging data collection cross–over study. With help of village doctors, we recruited 1026 (25%) caregivers of children under five out of the 4170 registered. To explore factors influencing recruitment and provide recommendations for improving recruitment, we conducted semi–structured interviews with village doctors. Of the 1014 included participants, 662 (65%) responded to the first question about willingness to participate, 538 (53%) responded to the first survey question and 356 (35%) completed the text message survey. To explore factors influencing follow–up and provide recommendations for improving follow–up, we conducted interviews with participants. We added views from the researchers who were involved in the study to contextualize the findings. Results We found several factors influencing recruitment related to the following themes: experiences with recruitment, village doctors’ work, village doctors’ motivations, caregivers’ characteristics, caregivers’ motivations. Village doctors gave several recommendations for ways to recruit more caregivers and we added our views to these. We found the following factors influencing follow–up: mobile phone usage, ability to use mobile phone, problems with mobile phone, checking mobile phone, available time, paying back text message costs, study incentives, subjective norm, culture, trust, perceived usefulness of process, perceived usefulness of outcome, perceived ease of use, attitude, behavioural intention to use, and actual use. From our perspective, factors influencing follow–up were: different

  3. Sample size requirement for comparison of decompression outcomes using ultrasonically detected venous gas emboli (VGE): power calculations using Monte Carlo resampling from real data.

    PubMed

    Doolette, David J; Gault, Keith A; Gutvik, Christian R

    2014-03-01

    In studies of decompression procedures, ultrasonically detected venous gas emboli (VGE) are commonly used as a surrogate outcome if decompression sickness (DCS) is unlikely to be observed. There is substantial variability in observed VGE grades, and studies should be designed with sufficient power to detect an important effect. Data for estimating sample size requirements for studies using VGE as an outcome is provided by a comparison of two decompression schedules that found corresponding differences in DCS incidence (3/192 [DCS/dives] vs. 10/198) and median maximum VGE grade (2 vs. 3, P < 0.0001, Wilcoxon test). Sixty-two subjects dived each schedule at least once, accounting for 183 and 180 man-dives on each schedule. From these data, the frequency with which 10,000 randomly resampled, paired samples of maximum VGE grade were significantly different (paired Wilcoxon test, one-sided P ≤ 0.05 or 0.025) in the same direction as the VGE grades of the full data set were counted (estimated power). Resampling was also used to estimate power of a Bayesian method that ranks two samples based on DCS risks estimated from the VGE grades. Paired sample sizes of 50 subjects yielded about 80% power, but the power dropped to less than 50% with fewer than 30 subjects. Comparisons of VGE grades that fail to find a difference between paired sample sizes of 30 or fewer must be interpreted cautiously. Studies can be considered well powered if the sample size is 50 even if only a one-grade difference in median VGE grade is of interest.

  4. Biostatistics Series Module 5: Determining Sample Size

    PubMed Central

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Determining the appropriate sample size for a study, whatever be its type, is a fundamental aspect of biomedical research. An adequate sample ensures that the study will yield reliable information, regardless of whether the data ultimately suggests a clinically important difference between the interventions or elements being studied. The probability of Type 1 and Type 2 errors, the expected variance in the sample and the effect size are the essential determinants of sample size in interventional studies. Any method for deriving a conclusion from experimental data carries with it some risk of drawing a false conclusion. Two types of false conclusion may occur, called Type 1 and Type 2 errors, whose probabilities are denoted by the symbols α and β. A Type 1 error occurs when one concludes that a difference exists between the groups being compared when, in reality, it does not. This is akin to a false positive result. A Type 2 error occurs when one concludes that difference does not exist when, in reality, a difference does exist, and it is equal to or larger than the effect size defined by the alternative to the null hypothesis. This may be viewed as a false negative result. When considering the risk of Type 2 error, it is more intuitive to think in terms of power of the study or (1 − β). Power denotes the probability of detecting a difference when a difference does exist between the groups being compared. Smaller α or larger power will increase sample size. Conventional acceptable values for power and α are 80% or above and 5% or below, respectively, when calculating sample size. Increasing variance in the sample tends to increase the sample size required to achieve a given power level. The effect size is the smallest clinically important difference that is sought to be detected and, rather than statistical convention, is a matter of past experience and clinical judgment. Larger samples are required if smaller differences are to be detected. Although the

  5. Sample Size Verification for Clinical Trials

    PubMed Central

    2013-01-01

    In this paper, we shall provide simple methods where nonstatisticians can evaluate sample size calculations for most large simple trials, as an important part of the peer review process, whether a grant, an Institutional Review Board review, an internal scientific review committee, or a journal referee. Through the methods of the paper, not only can readers determine if there is a major disparity, but they can readily determine the correct sample size. It will be of comfort to find in most cases that the sample size computation is correct, but the implications can be major for the minority where serious errors occur. We shall provide three real examples, one where the sample size need was seriously overestimated, one (HIP PRO‐test of a device to prevent hip fractures) where the sample size need was dramatically underestimated, and one where the sample size was correct. The HIP PRO case is especially troubling as it went through an NIH study section and two peer reviewed journal reports without anyone catching this sample size error of a factor of more than five‐fold. PMID:24119049

  6. Sample size verification for clinical trials.

    PubMed

    Shuster, Jonathan J

    2014-02-01

    In this paper, we shall provide simple methods where nonstatisticians can evaluate sample size calculations for most large simple trials, as an important part of the peer review process, whether a grant, an Institutional Review Board review, an internal scientific review committee, or a journal referee. Through the methods of the paper, not only can readers determine if there is a major disparity, but they can readily determine the correct sample size. It will be of comfort to find in most cases that the sample size computation is correct, but the implications can be major for the minority where serious errors occur. We shall provide three real examples, one where the sample size need was seriously overestimated, one (HIP PRO-test of a device to prevent hip fractures) where the sample size need was dramatically underestimated, and one where the sample size was correct. The HIP PRO case is especially troubling as it went through an NIH study section and two peer reviewed journal reports without anyone catching this sample size error of a factor of more than five-fold. © 2013 Wiley Periodicals, Inc.

  7. Improving your Hypothesis Testing: Determining Sample Sizes.

    ERIC Educational Resources Information Center

    Luftig, Jeffrey T.; Norton, Willis P.

    1982-01-01

    This article builds on an earlier discussion of the importance of the Type II error (beta) and power to the hypothesis testing process (CE 511 484), and illustrates the methods by which sample size calculations should be employed so as to improve the research process. (Author/CT)

  8. [Clinical research V. Sample size].

    PubMed

    Talavera, Juan O; Rivas-Ruiz, Rodolfo; Bernal-Rosales, Laura Paola

    2011-01-01

    In clinical research it is impossible and inefficient to study all patients with a specific pathology, so it is necessary to study a sample of them. Estimating the sample size before starting a study guarantees the stability of the results and allows us to foresee the feasibility of the study depending on the availability of patients and cost. The basic structure of sample size estimation rests on the premise of demonstrating, among other aims, that the observed difference between two or more maneuvers in the subsequent state is real. Initially, it requires knowing the value of the expected difference (δ) and its variation (standard deviation). These data are usually obtained from previous studies. Then, other components must be considered: α (alpha), the accepted probability of erroneously asserting that the difference between means is real, usually 5%; and β (beta), the accepted probability of erroneously accepting the claim that there is no difference between the means, usually ranging from 15% to 20%. Finally, these values are substituted into the formula or entered into an electronic program for estimating sample size. While summary and dispersion measures vary with the type of outcome variable, the basic structure is the same.

  9. Sample sizes for confidence limits for reliability.

    SciTech Connect

    Darby, John L.

    2010-02-01

    We recently performed an evaluation of the implications of a reduced stockpile of nuclear weapons for surveillance to support estimates of reliability. We found that one technique developed at Sandia National Laboratories (SNL) under-estimates the required sample size for systems-level testing. For a large population the discrepancy is not important, but for a small population it is important. We found that another technique used by SNL provides the correct required sample size. For systems-level testing of nuclear weapons, samples are selected without replacement, and the hypergeometric probability distribution applies. Both of the SNL techniques focus on samples without defects from sampling without replacement. We generalized the second SNL technique to cases with defects in the sample. We created a computer program in Mathematica to automate the calculation of confidence for reliability. We also evaluated sampling with replacement where the binomial probability distribution applies.
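
    A small sketch of the sampling-with-replacement (binomial) case mentioned at the end of the abstract: the number of tests with zero defects needed to demonstrate a reliability R at confidence C. The finite-population (hypergeometric) corrections discussed in the report are not shown.

```python
# Zero-failure demonstration sample size under the binomial model: n = ln(1 - C) / ln(R).
import math

def zero_failure_n(reliability, confidence):
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

print(zero_failure_n(0.95, 0.90))   # ~45 tests with no defects
print(zero_failure_n(0.99, 0.90))   # ~230 tests with no defects
```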

  10. Sample size planning for classification models.

    PubMed

    Beleites, Claudia; Neugebauer, Ute; Bocklitz, Thomas; Krafft, Christoph; Popp, Jürgen

    2013-01-14

    In biospectroscopy, suitably annotated and statistically independent samples (e.g. patients, batches, etc.) for classifier training and testing are scarce and costly. Learning curves show the model performance as function of the training sample size and can help to determine the sample size needed to train good classifiers. However, building a good model is actually not enough: the performance must also be proven. We discuss learning curves for typical small sample size situations with 5-25 independent samples per class. Although the classification models achieve acceptable performance, the learning curve can be completely masked by the random testing uncertainty due to the equally limited test sample size. In consequence, we determine test sample sizes necessary to achieve reasonable precision in the validation and find that 75-100 samples will usually be needed to test a good but not perfect classifier. Such a data set will then allow refined sample size planning on the basis of the achieved performance. We also demonstrate how to calculate necessary sample sizes in order to show the superiority of one classifier over another: this often requires hundreds of statistically independent test samples or is even theoretically impossible. We demonstrate our findings with a data set of ca. 2550 Raman spectra of single cells (five classes: erythrocytes, leukocytes and three tumour cell lines BT-20, MCF-7 and OCI-AML3) as well as by an extensive simulation that allows precise determination of the actual performance of the models in question. Copyright © 2012 Elsevier B.V. All rights reserved.

  11. Improved sample size determination for attributes and variables sampling

    SciTech Connect

    Stirpe, D.; Picard, R.R.

    1985-01-01

    Earlier INMM papers have addressed the attributes/variables problem and, under conservative/limiting approximations, have reported analytical solutions for the attributes and variables sample sizes. Through computer simulation of this problem, we have calculated attributes and variables sample sizes as a function of falsification, measurement uncertainties, and required detection probability without using approximations. Using realistic assumptions for uncertainty parameters of measurement, the simulation results support the conclusions: (1) previously used conservative approximations can be expensive because they lead to larger sample sizes than needed; and (2) the optimal verification strategy, as well as the falsification strategy, are highly dependent on the underlying uncertainty parameters of the measurement instruments. 1 ref., 3 figs.

  12. Sample size recalculation in sequential diagnostic trials.

    PubMed

    Tang, Liansheng Larry; Liu, Aiyi

    2010-01-01

    Before a comparative diagnostic trial is carried out, maximum sample sizes for the diseased group and the nondiseased group need to be obtained to achieve a nominal power to detect a meaningful difference in diagnostic accuracy. Sample size calculation depends on the variance of the statistic of interest, which is the difference between receiver operating characteristic summary measures of 2 medical diagnostic tests. To obtain an appropriate value for the variance, one often has to assume an arbitrary parametric model and the associated parameter values for the 2 groups of subjects under 2 tests to be compared. It becomes more tedious to do so when the same subject undergoes 2 different tests because the correlation is then involved in modeling the test outcomes. The calculated variance based on incorrectly specified parametric models may be smaller than the true one, which will subsequently result in smaller maximum sample sizes, leaving the study underpowered. In this paper, we develop a nonparametric adaptive method for comparative diagnostic trials to update the sample sizes using interim data, while allowing early stopping during interim analyses. We show that the proposed method maintains the nominal power and type I error rate through theoretical proofs and simulation studies.

  13. How Sample Size Affects a Sampling Distribution

    ERIC Educational Resources Information Center

    Mulekar, Madhuri S.; Siegel, Murray H.

    2009-01-01

    If students are to understand inferential statistics successfully, they must have a profound understanding of the nature of the sampling distribution. Specifically, they must comprehend the determination of the expected value and standard error of a sampling distribution as well as the meaning of the central limit theorem. Many students in a high…

  15. Stepwise two-stage sample size adaptation.

    PubMed

    Wan, Hong; Ellenberg, Susan S; Anderson, Keaven M

    2015-01-15

    Several adaptive design methods have been proposed to reestimate sample size using the observed treatment effect after an initial stage of a clinical trial while preserving the overall type I error at the time of the final analysis. One unfortunate property of the algorithms used in some methods is that they can be inverted to reveal the exact treatment effect at the interim analysis. We propose using a step function with an inverted U-shape of observed treatment difference for sample size reestimation to lessen the information on treatment effect revealed. This will be referred to as stepwise two-stage sample size adaptation. This method applies calculation methods used for group sequential designs. We minimize expected sample size among a class of these designs and compare efficiency with the fully optimized two-stage design, optimal two-stage group sequential design, and designs based on promising conditional power. The trade-off between efficiency versus the improved blinding of the interim treatment effect will be discussed.

  16. Sample-size requirements for evaluating population size structure

    USGS Publications Warehouse

    Vokoun, J.C.; Rabeni, C.F.; Stanovick, J.S.

    2001-01-01

    A method with an accompanying computer program is described to estimate the number of individuals needed to construct a sample length-frequency with a given accuracy and precision. First, a reference length-frequency assumed to be accurate for a particular sampling gear and collection strategy was constructed. Bootstrap procedures created length-frequencies with increasing sample size that were randomly chosen from the reference data and then were compared with the reference length-frequency by calculating the mean squared difference. Outputs from two species collected with different gears and an artificial even length-frequency are used to describe the characteristics of the method. The relations between the number of individuals used to construct a length-frequency and the similarity to the reference length-frequency followed a negative exponential distribution and showed the importance of using 300-400 individuals whenever possible.
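
    A hedged sketch of the bootstrap procedure described: draw increasingly large random samples from a reference length-frequency and track the mean squared difference between sample and reference proportions. The reference lengths below are simulated stand-ins, not the authors' fish data, and the bin width is an assumption.

```python
# Bootstrap assessment of how many individuals are needed for a stable length-frequency.
import numpy as np

rng = np.random.default_rng(7)
reference = rng.gamma(shape=9, scale=30, size=5000)          # stand-in length data (mm)
bins = np.arange(0, 700, 25)                                 # assumed 25 mm length bins
ref_counts, _ = np.histogram(reference, bins=bins)
ref_prop = ref_counts / ref_counts.sum()

for n in (50, 100, 200, 300, 400):
    msd = []
    for _ in range(500):                                     # bootstrap resamples
        sample = rng.choice(reference, size=n, replace=True)
        counts, _ = np.histogram(sample, bins=bins)
        msd.append(np.mean((counts / counts.sum() - ref_prop) ** 2))
    print(n, f"{np.mean(msd):.6f}")                          # mean squared difference shrinks with n
```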

  17. Sample Size Estimation: The Easy Way

    ERIC Educational Resources Information Center

    Weller, Susan C.

    2015-01-01

    This article presents a simple approach to making quick sample size estimates for basic hypothesis tests. Although there are many sources available for estimating sample sizes, methods are not often integrated across statistical tests, levels of measurement of variables, or effect sizes. A few parameters are required to estimate sample sizes and…

  18. Effect size calculations for the clinician: methods and comparability.

    PubMed

    Seidel, Jason A; Miller, Scott D; Chow, Daryl L

    2014-01-01

    The measurement of clinical change via single-group pre-post effect size has become increasingly common in psychotherapy settings that collect practice-based evidence and engage in feedback-informed treatment. Different methods of calculating effect size for the same sample of clients and the same measure can lead to wide-ranging results, reducing interpretability. Effect sizes from therapists-including those drawn from a large web-based database of practicing clinicians-were calculated using nine different methods. The resulting effect sizes varied significantly depending on the method employed. Differences between measurement methods routinely exceeded 0.40 for individual therapists. Three methods for calculating effect sizes are recommended for moderating these differences, including two equations that show promise as valid and practical methods for use by clinicians in professional practice.
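
    A toy comparison of two common single-group pre-post effect size conventions, standardizing the mean change by the baseline SD versus by the SD of the change scores, to illustrate how the choice of method alone shifts the result. The data are simulated, and these two generic conventions are not the nine specific methods compared in the article.

```python
# Two pre-post effect size conventions applied to the same simulated sample.
import numpy as np

rng = np.random.default_rng(3)
pre = rng.normal(25, 8, 200)                      # e.g. intake distress scores (toy data)
post = pre - rng.normal(6, 7, 200)                # improvement plus noise

d_baseline_sd = (pre.mean() - post.mean()) / pre.std(ddof=1)       # change / baseline SD
d_change_sd = (pre - post).mean() / (pre - post).std(ddof=1)       # change / SD of change

print(round(d_baseline_sd, 2), round(d_change_sd, 2))   # the two conventions differ
```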

  19. Sample size matters: Investigating the optimal sample size for a logistic regression debris flow susceptibility model

    NASA Astrophysics Data System (ADS)

    Heckmann, Tobias; Gegg, Katharina; Becht, Michael

    2013-04-01

    Statistical approaches to landslide susceptibility modelling on the catchment and regional scale are used very frequently compared to heuristic and physically based approaches. In the present study, we deal with the problem of the optimal sample size for a logistic regression model. More specifically, a stepwise approach has been chosen in order to select those independent variables (from a number of derivatives of a digital elevation model and landcover data) that explain best the spatial distribution of debris flow initiation zones in two neighbouring central alpine catchments in Austria (used mutually for model calculation and validation). In order to minimise problems arising from spatial autocorrelation, we sample a single raster cell from each debris flow initiation zone within an inventory. In addition, as suggested by previous work using the "rare events logistic regression" approach, we take a sample of the remaining "non-event" raster cells. The recommendations given in the literature on the size of this sample appear to be motivated by practical considerations, e.g. the time and cost of acquiring data for non-event cases, which do not apply to the case of spatial data. In our study, we aim at finding empirically an "optimal" sample size in order to avoid two problems: First, a sample too large will violate the independent sample assumption as the independent variables are spatially autocorrelated; hence, a variogram analysis leads to a sample size threshold above which the average distance between sampled cells falls below the autocorrelation range of the independent variables. Second, if the sample is too small, repeated sampling will lead to very different results, i.e. the independent variables and hence the result of a single model calculation will be extremely dependent on the choice of non-event cells. Using a Monte-Carlo analysis with stepwise logistic regression, 1000 models are calculated for a wide range of sample sizes. For each sample size

  20. Requirements for Minimum Sample Size for Sensitivity and Specificity Analysis

    PubMed Central

    Adnan, Tassha Hilda

    2016-01-01

    Sensitivity and specificity analysis is commonly used for screening and diagnostic tests. The main issue researchers face is determining the sample sizes that are sufficient for screening and diagnostic studies. Although formulas for sample size calculation are available, the majority of researchers are not mathematicians or statisticians, so the calculation might not be easy for them. This review paper provides sample size tables for sensitivity and specificity analysis. The tables were derived from the formulation of the sensitivity and specificity test using Power Analysis and Sample Size (PASS) software, based on the desired type I error, power, and effect size. How to use the tables is also discussed. PMID:27891446

  1. Sample Size Estimation in Clinical Trial

    PubMed Central

    Sakpal, Tushar Vijay

    2010-01-01

    Every clinical trial should be planned. This plan should include the objective of the trial, primary and secondary end-points, the method of collecting data, the sample to be included, the sample size with scientific justification, the method of handling data, and the statistical methods and assumptions. This plan is termed the clinical trial protocol. One of the key aspects of this protocol is sample size estimation. The aim of this article is to discuss how important sample size estimation is for a clinical trial, and also to understand the effects of sample size overestimation or underestimation on the outcome of a trial. An attempt is also made to convey the importance of the minimum sample size needed to detect a clinically important difference. The article additionally provides input on the different parameters that affect sample size and basic rules for these parameters, with the help of some simple examples. PMID:21829786

  2. Methods for sample size determination in cluster randomized trials

    PubMed Central

    Rutterford, Clare; Copas, Andrew; Eldridge, Sandra

    2015-01-01

    Background: The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. Methods: We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. Results: We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. Conclusions: There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials. PMID:26174515
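
    As a rough illustration of the simplest approach described above, the sketch below inflates an individually randomized per-arm sample size by the usual design effect 1 + (m - 1)*ICC, where m is the average cluster size and ICC the intracluster correlation. It is not taken from the article; the function names and numerical values are illustrative assumptions.

      # Sketch: inflate an individually randomized per-arm sample size by the
      # standard design effect for cluster randomization (illustrative values).
      from math import ceil
      from scipy.stats import norm

      def n_per_arm_individual(delta, sd, alpha=0.05, power=0.80):
          """Per-arm n for a two-sample comparison of means under individual randomization."""
          z_a = norm.ppf(1 - alpha / 2)
          z_b = norm.ppf(power)
          return 2 * (sd * (z_a + z_b) / delta) ** 2

      def inflate_for_clustering(n_individual, cluster_size, icc):
          """Apply the design effect 1 + (m - 1) * ICC."""
          deff = 1 + (cluster_size - 1) * icc
          return ceil(n_individual * deff)

      n_ind = n_per_arm_individual(delta=0.5, sd=1.0)                   # about 63 per arm
      print(inflate_for_clustering(n_ind, cluster_size=20, icc=0.05))   # about 123 per arm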

  3. Additional Considerations in Determining Sample Size.

    ERIC Educational Resources Information Center

    Levin, Joel R.; Subkoviak, Michael J.

    Levin's (1975) sample-size determination procedure for completely randomized analysis of variance designs is extended to designs in which information on antecedent or blocking variables is considered. In particular, a researcher's choice of designs is framed in terms of determining the respective sample sizes necessary to detect specified contrasts…

  4. Semiparametric Regression in Size-Biased Sampling

    PubMed Central

    Chen, Ying Qing

    2009-01-01

    Summary Size-biased sampling arises when a positive-valued outcome variable is sampled with selection probability proportional to its size. In this article, we propose a semiparametric linear regression model to analyze size-biased outcomes. In our proposed model, the regression parameters of the covariates are of major interest, while the distribution of random errors is unspecified. Under the proposed model, we discover that the regression parameters are invariant regardless of size-biased sampling. Following this invariance property, we develop a simple estimation procedure for inferences. Our proposed methods are evaluated in simulation studies and applied to two real data analyses. PMID:19432792

  5. Adjustment for unbalanced sample size for analytical biosimilar equivalence assessment.

    PubMed

    Dong, Xiaoyu Cassie; Weng, Yu-Ting; Tsong, Yi

    2017-01-06

    Large sample size imbalance is not uncommon in biosimilar development. At the beginning of product development, the numbers of available biosimilar and reference product batches may be limited, so a sample size calculation may not be feasible. During the development stage, more reference product batches may be added at a later point to obtain a more reliable estimate of the reference variability. On the other hand, we also need a sufficient number of biosimilar batches in order to have a better understanding of the product. Those challenges lead to a potential sample size imbalance. In this paper, we show that large sample size imbalance may increase the power of the equivalence test in an unfavorable way, giving higher power for less similar products when the sample size of the biosimilar is much smaller than that of the reference product. Thus, it is necessary to adjust for sample size imbalance in order to motivate a sufficient sample size for the biosimilar as well. This paper discusses two adjustment methods for the equivalence test in analytical biosimilarity studies. Sufficient sample sizes for both biosimilar and reference products (if feasible) remain desirable at the planning stage.

  6. Rock sampling. [apparatus for controlling particle size

    NASA Technical Reports Server (NTRS)

    Blum, P. (Inventor)

    1971-01-01

    An apparatus for sampling rock and other brittle materials and for controlling resultant particle sizes is described. The device includes grinding means that cut grooves in the rock surface to produce a grouping of thin, shallow, parallel ridges, and cutter means that reduce these ridges to a powder specimen. Collection means is provided for the powder. The invention relates to rock grinding and particularly to the sampling of rock specimens with good size control.

  7. Size safety valve discharge piping with a programmable calculator

    SciTech Connect

    D'ambra, A.

    1982-10-01

    Discussed is a program that will aid in the proper sizing of steam safety valve discharge piping frequently encountered in steam distribution systems. Basis for calculation is the ASME/ANSI Power Piping Code. Code reference is not necessary for running the program. Presented is a safety valve installation schematic, the program listing, data registers, constants, and a sample problem. The calculation done by this program is a fluid momentum check to assure that selected pipe sizes yield velocities and back pressures such that the steam blowing out of the safety valve is driven up the stack and not backwards out of the clearance. Back pressure should not exceed safety valve manufacturers' limits to realize full design capacity of the installation.

  8. [Effect sizes, statistical power and sample sizes in "the Japanese Journal of Psychology"].

    PubMed

    Suzukawa, Yumi; Toyoda, Hideki

    2012-04-01

    This study analyzed the statistical power of research studies published in the "Japanese Journal of Psychology" in 2008 and 2009. Sample effect sizes and sample statistical powers were calculated for each statistical test and analyzed with respect to the analytical methods and the fields of the studies. The results show that in fields such as perception, cognition, or learning, the effect sizes were relatively large although the sample sizes were small. At the same time, because of the small sample sizes, some meaningful effects could not be detected. In the other fields, because of the large sample sizes, negligible effects could be detected. This implies that researchers who could not obtain large enough effect sizes would use larger samples to obtain significant results.

  9. Experimental determination of size distributions: analyzing proper sample sizes

    NASA Astrophysics Data System (ADS)

    Buffo, A.; Alopaeus, V.

    2016-04-01

    The measurement of various particle size distributions is a crucial aspect of many applications in the process industry. Size distribution is often related to the final product quality, as in crystallization or polymerization. In other cases it is related to the correct evaluation of heat and mass transfer and of reaction rates, which depend on the interfacial area between the different phases, or to the assessment of yield stresses of polycrystalline metal/alloy samples. The experimental determination of such distributions often involves laborious sampling procedures, and the statistical significance of the outcome is rarely investigated. In this work, we propose a novel, rigorous tool, based on inferential statistics, to determine the number of samples needed to obtain reliable measurements of a size distribution, according to specific requirements defined a priori. The methodology can be adopted regardless of the measurement technique used.

  10. Publication Bias in Psychology: A Diagnosis Based on the Correlation between Effect Size and Sample Size

    PubMed Central

    Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas

    2014-01-01

    Background The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. Methods We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. Results We found a negative correlation of r = −.45 [95% CI: −.53; −.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. Conclusion The negative correlation between effect size and sample size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology. PMID:25192357

  11. Publication bias in psychology: a diagnosis based on the correlation between effect size and sample size.

    PubMed

    Kühberger, Anton; Fritz, Astrid; Scherndl, Thomas

    2014-01-01

    The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. We found a negative correlation of r = -.45 [95% CI: -.53; -.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. The negative correlation between effect size and sample size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology.

  12. Sample Size Determination for Clustered Count Data

    PubMed Central

    Amatya, A.; Bhaumik, D.; Gibbons, R.D.

    2013-01-01

    We consider the problem of sample size determination for count data. Such data arise naturally in the context of multi-center (or cluster) randomized clinical trials, where patients are nested within research centers. We consider cluster-specific and population-average estimators (maximum likelihood based on generalized mixed-effects regression and generalized estimating equations respectively) for subject-level and cluster-level randomized designs respectively. We provide simple expressions for calculating number of clusters when comparing event rates of two groups in cross-sectional studies. The expressions we derive have closed form solutions and are based on either between-cluster variation or inter-cluster correlation for cross-sectional studies. We provide both theoretical and numerical comparisons of our methods with other existing methods. We specifically show that the performance of the proposed method is better for subject-level randomized designs, whereas the comparative performance depends on the rate ratio for the cluster-level randomized designs. We also provide a versatile method for longitudinal studies. Results are illustrated by three real data examples. PMID:23589228

  13. Sample size in orthodontic randomized controlled trials: are numbers justified?

    PubMed

    Koletsi, Despina; Pandis, Nikolaos; Fleming, Padhraig S

    2014-02-01

    Sample size calculations are advocated by the Consolidated Standards of Reporting Trials (CONSORT) group to justify sample sizes in randomized controlled trials (RCTs). This study aimed to analyse the reporting of sample size calculations in trials published as RCTs in orthodontic speciality journals. The performance of sample size calculations was assessed and calculations verified where possible. Related aspects, including number of authors; parallel, split-mouth, or other design; single- or multi-centre study; region of publication; type of data analysis (intention-to-treat or per-protocol basis); and number of participants recruited and lost to follow-up, were considered. Of 139 RCTs identified, complete sample size calculations were reported in 41 studies (29.5 per cent). Parallel designs were typically adopted (n = 113; 81 per cent), with 80 per cent (n = 111) involving two arms and 16 per cent having three arms. Data analysis was conducted on an intention-to-treat (ITT) basis in a small minority of studies (n = 18; 13 per cent). According to the calculations presented, overall, a median of 46 participants were required to demonstrate sufficient power to highlight meaningful differences (typically at a power of 80 per cent). The median number of participants recruited was 60, with a median of 4 participants being lost to follow-up. Our finding indicates good agreement between projected numbers required and those verified (median discrepancy: 5.3 per cent), although only a minority of trials (29.5 per cent) could be examined. Although sample size calculations are often reported in trials published as RCTs in orthodontic speciality journals, presentation is suboptimal and in need of significant improvement.

  14. Sample size recalculation using conditional power.

    PubMed

    Denne, J S

    The sample size required to achieve a given power at a prespecified absolute difference in mean response may depend on one or more nuisance parameters, which are usually unknown. Proposed methods for using an internal pilot to recalculate the sample size using estimates of these parameters have been well studied. Most of these methods ignore the fact that data on the parameter of interest from within this internal pilot will contribute towards the value of the final test statistic. We propose a method which involves recalculating the target sample size by computing the number of further observations required to maintain the probability of rejecting the null hypothesis at the end of the study under the prespecified absolute difference in mean response conditional on the data observed so far. We do this within the framework of a two-group error-spending sequential test, modified so as to prevent inflation of the type I error rate. Copyright 2001 John Wiley & Sons, Ltd.

  15. Calculation of the size of ice hummocks

    SciTech Connect

    Kozitskii, I.E.

    1985-03-01

    Ice hummocks are often seen during the breakup of water bodies; they result from shifting of the ice cover during spring movements and are confined both to the shore slope, or exposed stretches of the bottom, and to shallow waters. At the same time, the shore is often used for construction, transportation, power engineering, and other economic purposes, and cases of damage to structures and disruption of operations by ice hummocks are known. The authors therefore study the character and extent of the phenomenon as it affects the design of shore engineering structures. They add that existing standards do not fully reflect the composition of ice loads on structures, so it is expedient to develop theory on the expected size of ice hummocks.

  16. Causality in Statistical Power: Isomorphic Properties of Measurement, Research Design, Effect Size, and Sample Size

    PubMed Central

    Heidel, R. Eric

    2016-01-01

    Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power. PMID:27073717

  17. Causality in Statistical Power: Isomorphic Properties of Measurement, Research Design, Effect Size, and Sample Size.

    PubMed

    Heidel, R Eric

    2016-01-01

    Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power.

  18. Sample size considerations for clinical research studies in nuclear cardiology.

    PubMed

    Chiuzan, Cody; West, Erin A; Duong, Jimmy; Cheung, Ken Y K; Einstein, Andrew J

    2015-12-01

    Sample size calculation is an important element of research design that investigators need to consider in the planning stage of the study. Funding agencies and research review panels request a power analysis, for example, to determine the minimum number of subjects needed for an experiment to be informative. Calculating the right sample size is crucial to gaining accurate information and ensures that research resources are used efficiently and ethically. The simple question "How many subjects do I need?" does not always have a simple answer. Before calculating the sample size requirements, a researcher must address several aspects, such as purpose of the research (descriptive or comparative), type of samples (one or more groups), and data being collected (continuous or categorical). In this article, we describe some of the most frequent methods for calculating the sample size with examples from nuclear cardiology research, including for t tests, analysis of variance (ANOVA), non-parametric tests, correlation, Chi-squared tests, and survival analysis. For the ease of implementation, several examples are also illustrated via user-friendly free statistical software.
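
    As a minimal sketch of one of the test families the article reviews (the comparison of two proportions via the normal approximation), the helper below is illustrative only; the function name and the event rates are assumptions, not taken from the article.

      # Sketch: per-group sample size for comparing two proportions, normal
      # approximation; values are illustrative.
      from math import ceil
      from scipy.stats import norm

      def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
          z_a = norm.ppf(1 - alpha / 2)
          z_b = norm.ppf(power)
          var = p1 * (1 - p1) + p2 * (1 - p2)
          return ceil((z_a + z_b) ** 2 * var / (p1 - p2) ** 2)

      # e.g. detecting an increase in event rate from 20% to 35%
      print(n_per_group_two_proportions(0.20, 0.35))   # about 136 per group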

  19. Determining sample size for tree utilization surveys

    Treesearch

    Stanley J. Zarnoch; James W. Bentley; Tony G. Johnson

    2004-01-01

    The U.S. Department of Agriculture Forest Service has conducted many studies to determine what proportion of the timber harvested in the South is actually utilized. This paper describes the statistical methods used to determine required sample sizes for estimating utilization ratios for a required level of precision. The data used are those for 515 hardwood and 1,557...

  20. Exploratory Factor Analysis with Small Sample Sizes

    ERIC Educational Resources Information Center

    de Winter, J. C. F.; Dodou, D.; Wieringa, P. A.

    2009-01-01

    Exploratory factor analysis (EFA) is generally regarded as a technique for large sample sizes ("N"), with N = 50 as a reasonable absolute minimum. This study offers a comprehensive overview of the conditions in which EFA can yield good quality results for "N" below 50. Simulations were carried out to estimate the minimum required "N" for different…

  1. Sample size of the reference sample in a case-augmented study.

    PubMed

    Ghosh, Palash; Dewanji, Anup

    2017-03-13

    The case-augmented study, in which a case sample is augmented with a reference (random) sample from the source population with only covariates information known, is becoming popular in different areas of applied science such as pharmacovigilance, ecology, and econometrics. In general, the case sample is available from some source (for example, hospital database, case registry, etc.); however, the reference sample is required to be drawn from the corresponding source population. The required minimum size of the reference sample is an important issue in this regard. In this work, we address the minimum sample size calculation and discuss related issues. Copyright © 2017 John Wiley & Sons, Ltd.

  2. Statistical Analysis Techniques for Small Sample Sizes

    NASA Technical Reports Server (NTRS)

    Navard, S. E.

    1984-01-01

    The problem of small sample sizes encountered in the analysis of space-flight data is examined. Because of the small amount of data available, careful analyses are essential to extract the maximum amount of information with acceptable accuracy. Statistical analysis of small samples is described. The background material necessary for understanding statistical hypothesis testing is outlined, and the various tests that can be done on small samples are explained. Emphasis is on the underlying assumptions of each test and on the considerations needed to choose the most appropriate test for a given type of analysis.

  3. Sample size and optimal sample design in tuberculosis surveys

    PubMed Central

    Sánchez-Crespo, J. L.

    1967-01-01

    Tuberculosis surveys sponsored by the World Health Organization have been carried out in different communities during the last few years. Apart from the main epidemiological findings, these surveys have provided basic statistical data for use in the planning of future investigations. In this paper an attempt is made to determine the sample size desirable in future surveys that include one of the following examinations: tuberculin test, direct microscopy, and X-ray examination. The optimum cluster sizes are found to be 100-150 children under 5 years of age in the tuberculin test, at least 200 eligible persons in the examination for excretors of tubercle bacilli (direct microscopy) and at least 500 eligible persons in the examination for persons with radiological evidence of pulmonary tuberculosis (X-ray). Modifications of the optimum sample size in combined surveys are discussed. PMID:5300008

  4. The cost of large numbers of hypothesis tests on power, effect size and sample size.

    PubMed

    Lazzeroni, L C; Ray, A

    2012-01-01

    Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands, which can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level, with comparatively small increases in the effect size or sample size. For example, at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing.
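
    A minimal sketch of the kind of relationship described above, using a Bonferroni-adjusted two-sided test and the normal approximation; the article's own calculator may use different assumptions, and the function name and printed figures are illustrative.

      # Sketch: relative sample size needed to keep power fixed as the number of
      # hypothesis tests grows, using a Bonferroni-adjusted two-sided alpha.
      from scipy.stats import norm

      def relative_n(n_tests, alpha=0.05, power=0.80):
          """Sample size relative to a single test at the same alpha and power."""
          z_b = norm.ppf(power)
          z_one = norm.ppf(1 - alpha / 2)
          z_adj = norm.ppf(1 - alpha / (2 * n_tests))
          return ((z_adj + z_b) / (z_one + z_b)) ** 2

      print(relative_n(10))                                    # about 1.70 (70% more than 1 test)
      print(relative_n(10_000_000) / relative_n(1_000_000))    # about 1.13 (13% more than 1 million tests)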

  5. Information Conversion, Effective Samples, and Parameter Size

    PubMed Central

    Lin, Xiaodong; Pittman, Jennifer; Clarke, Bertrand

    2008-01-01

    Consider the relative entropy between a posterior density for a parameter given a sample and a second posterior density for the same parameter, based on a different model and a different data set. Then the relative entropy can be minimized over the second sample to get a virtual sample that would make the second posterior as close as possible to the first in an informational sense. If the first posterior is based on a dependent dataset and the second posterior uses an independence model, the effective inferential power of the dependent sample is transferred into the independent sample by the optimization. Examples of this optimization are presented for models with nuisance parameters, finite mixture models, and models for correlated data. Our approach is also used to choose the effective parameter size in a Bayesian hierarchical model. PMID:19079764

  6. Conservative Sample Size Determination for Repeated Measures Analysis of Covariance.

    PubMed

    Morgan, Timothy M; Case, L Douglas

    2013-07-05

    In the design of a randomized clinical trial with one pre-randomization and multiple post-randomization assessments of the outcome variable, one needs to account for the repeated measures in determining the appropriate sample size. Unfortunately, one seldom has a good estimate of the variance of the outcome measure, let alone the correlations among the measurements over time. We show how sample sizes can be calculated by making conservative assumptions regarding the correlations for a variety of covariance structures. The most conservative choice for the correlation depends on the covariance structure and the number of repeated measures. In the absence of good estimates of the correlations, the sample size is often based on a two-sample t-test, making the 'ultra' conservative and unrealistic assumption that there are zero correlations between the baseline and follow-up measures while at the same time assuming there are perfect correlations between the follow-up measures. Compared to the case of taking a single measurement, substantial savings in sample size can be realized by accounting for the repeated measures, even with very conservative assumptions regarding the parameters of the assumed correlation matrix. Assuming compound symmetry, the sample size from the two-sample t-test calculation can be reduced at least 44%, 56%, and 61% for repeated measures analysis of covariance by taking 2, 3, and 4 follow-up measures, respectively. The results offer a rational basis for determining a fairly conservative, yet efficient, sample size for clinical trials with repeated measures and a baseline value.
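
    The sketch below is a reconstruction, under compound symmetry, of a variance factor whose worst case reproduces the "at least 44% / 56% / 61%" reductions quoted above; it is not taken from the article, and the article should be consulted for the general covariance structures.

      # Sketch: variance factor for ANCOVA on the mean of k follow-up measures with
      # a baseline covariate, under compound symmetry with correlation rho, relative
      # to a single post-randomization measurement (factor = 1). The worst case over
      # rho matches the reductions quoted in the abstract.
      def ancova_factor(k, rho):
          return (1 + (k - 1) * rho) / k - rho ** 2

      for k in (2, 3, 4):
          rho_worst = (k - 1) / (2 * k)          # maximizes the factor over rho in [0, 1]
          worst = ancova_factor(k, rho_worst)
          print(k, round(1 - worst, 3))          # 0.438, 0.556, 0.609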

  7. Design issues and sample size when exposure measurement is inaccurate.

    PubMed

    Rippin, G

    2001-05-01

    Measurement error often leads to biased estimates and incorrect tests in epidemiological studies. These problems can be corrected by design modifications that allow for refined statistical models or, in some situations, by adjusted sample sizes to compensate for a power reduction. The design options are mainly an additional replication or internal validation study. Sample size calculations for these designs are more complex, since usually there is no unique design solution that attains a prespecified power. Thus, in addition to the power requirement, an optimal design should also minimize overall costs. In this review the corresponding strategies and formulae are described and appraised.

  8. Defining sample size and sampling strategy for dendrogeomorphic rockfall reconstructions

    NASA Astrophysics Data System (ADS)

    Morel, Pauline; Trappmann, Daniel; Corona, Christophe; Stoffel, Markus

    2015-05-01

    Optimized sampling strategies have been recently proposed for dendrogeomorphic reconstructions of mass movements with a large spatial footprint, such as landslides, snow avalanches, and debris flows. Such guidelines have, by contrast, been largely missing for rockfalls and cannot be transposed owing to the sporadic nature of this process and the occurrence of individual rocks and boulders. Based on a data set of 314 European larch (Larix decidua Mill.) trees (i.e., 64 trees/ha), growing on an active rockfall slope, this study bridges this gap and proposes an optimized sampling strategy for the spatial and temporal reconstruction of rockfall activity. Using random extractions of trees, iterative mapping, and a stratified sampling strategy based on an arbitrary selection of trees, we investigate subsets of the full tree-ring data set to define optimal sample size and sampling design for the development of frequency maps of rockfall activity. Spatially, our results demonstrate that the sampling of only 6 representative trees per ha can be sufficient to yield a reasonable mapping of the spatial distribution of rockfall frequencies on a slope, especially if the oldest and most heavily affected individuals are included in the analysis. At the same time, however, sampling such a low number of trees risks causing significant errors especially if nonrepresentative trees are chosen for analysis. An increased number of samples therefore improves the quality of the frequency maps in this case. Temporally, we demonstrate that at least 40 trees/ha are needed to obtain reliable rockfall chronologies. These results will facilitate the design of future studies, decrease the cost-benefit ratio of dendrogeomorphic studies and thus will permit production of reliable reconstructions with reasonable temporal efforts.

  9. Sample size estimation and power analysis for clinical research studies

    PubMed Central

    Suresh, KP; Chandrashekara, S

    2012-01-01

    Determining the optimal sample size for a study assures adequate power to detect statistical significance. Hence, it is a critical step in the design of a planned research protocol. Using too many participants in a study is expensive and exposes more subjects than necessary to the procedure. Similarly, if the study is underpowered, it will be statistically inconclusive and may make the whole protocol a failure. This paper covers the essentials in calculating power and sample size for a variety of applied study designs. Sample size computation for a single group mean, survey-type studies, two-group studies based on means and on proportions or rates, correlation studies, and case-control studies assessing a categorical outcome is presented in detail. PMID:22870008
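
    As a minimal sketch of one of the simplest cases listed above (a survey-type estimate of a single proportion to within a given absolute margin of error), the helper below is illustrative; the function name and values are assumptions, not from the article.

      # Sketch: sample size to estimate a single proportion within a given margin.
      from math import ceil
      from scipy.stats import norm

      def n_single_proportion(p_expected, margin, alpha=0.05):
          z = norm.ppf(1 - alpha / 2)
          return ceil(z ** 2 * p_expected * (1 - p_expected) / margin ** 2)

      print(n_single_proportion(p_expected=0.30, margin=0.05))   # about 323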

  10. Effect size estimates: current use, calculations, and interpretation.

    PubMed

    Fritz, Catherine O; Morris, Peter E; Richler, Jennifer J

    2012-02-01

    The Publication Manual of the American Psychological Association (American Psychological Association, 2001, American Psychological Association, 2010) calls for the reporting of effect sizes and their confidence intervals. Estimates of effect size are useful for determining the practical or theoretical importance of an effect, the relative contributions of factors, and the power of an analysis. We surveyed articles published in 2009 and 2010 in the Journal of Experimental Psychology: General, noting the statistical analyses reported and the associated reporting of effect size estimates. Effect sizes were reported for fewer than half of the analyses; no article reported a confidence interval for an effect size. The most often reported analysis was analysis of variance, and almost half of these reports were not accompanied by effect sizes. Partial η2 was the most commonly reported effect size estimate for analysis of variance. For t tests, 2/3 of the articles did not report an associated effect size estimate; Cohen's d was the most often reported. We provide a straightforward guide to understanding, selecting, calculating, and interpreting effect sizes for many types of data and to methods for calculating effect size confidence intervals and power analysis.
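
    The snippet below is a minimal sketch of the two estimates most often mentioned above, Cohen's d and partial eta squared, computed from summary statistics; the helper functions and the numbers are illustrative assumptions, not the authors' code.

      # Sketch: two common effect size estimates from summary statistics.
      from math import sqrt

      def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
          """Cohen's d for two independent groups, using the pooled SD."""
          pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
          return (mean1 - mean2) / pooled_sd

      def partial_eta_squared(ss_effect, ss_error):
          """Partial eta squared from ANOVA sums of squares."""
          return ss_effect / (ss_effect + ss_error)

      print(round(cohens_d(105, 100, 15, 15, 40, 40), 2))    # 0.33
      print(round(partial_eta_squared(12.5, 87.5), 3))        # 0.125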

  11. 40 CFR 80.127 - Sample size guidelines.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... attest engagement, the auditor shall sample relevant populations to which agreed-upon procedures will be... population; and (b) Sample size shall be determined using one of the following options: (1) Option 1. Determine the sample size using the following table: Sample Size, Based Upon Population Size No. in...

  12. (Sample) Size Matters! An Examination of Sample Size from the SPRINT Trial

    PubMed Central

    Bhandari, Mohit; Tornetta, Paul; Rampersad, Shelly-Ann; Sprague, Sheila; Heels-Ansdell, Diane; Sanders, David W.; Schemitsch, Emil H.; Swiontkowski, Marc; Walter, Stephen

    2012-01-01

    Introduction Inadequate sample size and power in randomized trials can result in misleading findings. This study demonstrates the effect of sample size in a large, clinical trial by evaluating the results of the SPRINT (Study to Prospectively evaluate Reamed Intramedullary Nails in Patients with Tibial fractures) trial as it progressed. Methods The SPRINT trial evaluated reamed versus unreamed nailing of the tibia in 1226 patients, as well as in open and closed fracture subgroups (N=400 and N=826, respectively). We analyzed the re-operation rates and relative risk comparing treatment groups at 50, 100 and then increments of 100 patients up to the final sample size. Results at various enrollments were compared to the final SPRINT findings. Results In the final analysis, there was a statistically significant decreased risk of re-operation with reamed nails for closed fractures (relative risk reduction 35%). Results for the first 35 patients enrolled suggested reamed nails increased the risk of reoperation in closed fractures by 165%. Only after 543 patients with closed fractures were enrolled did the results reflect the final advantage for reamed nails in this subgroup. Similarly, the trend towards an increased risk of re-operation for open fractures (23%) was not seen until 62 patients with open fractures were enrolled. Conclusions Our findings highlight the risk of conducting a trial with insufficient sample size and power. Such studies are not only at risk of missing true effects, but also of giving misleading results. Level of Evidence N/A PMID:23525086

  13. Cusp Catastrophe Polynomial Model: Power and Sample Size Estimation

    PubMed Central

    Chen, Ding-Geng(Din); Chen, Xinguang(Jim); Lin, Feng; Tang, Wan; Lio, Y. L.; Guo, (Tammy) Yuanyuan

    2016-01-01

    Guastello’s polynomial regression method for solving the cusp catastrophe model has been widely applied to analyze nonlinear behavior outcomes. However, no statistical power analysis for this modeling approach has been reported, probably due to the complex nature of the cusp catastrophe model. Since statistical power analysis is essential for research design, we propose a novel method in this paper to fill the gap. The method is simulation-based and can be used to calculate statistical power and sample size when Guastello’s polynomial regression method is used for cusp catastrophe modeling analysis. With this novel approach, a power curve is first produced to depict the relationship between statistical power and sample size under different model specifications. This power curve is then used to determine the sample size required for a specified statistical power. We first verify the method through four scenarios generated by Monte Carlo simulations, and then apply it to real published data on modeling early sexual initiation among young adolescents. The findings of our study suggest that this simulation-based power analysis method can be used to estimate sample size and statistical power for Guastello’s polynomial regression method in cusp catastrophe modeling. PMID:27158562

  14. Cusp Catastrophe Polynomial Model: Power and Sample Size Estimation.

    PubMed

    Chen, Ding-Geng Din; Chen, Xinguang Jim; Lin, Feng; Tang, Wan; Lio, Y L; Guo, Tammy Yuanyuan

    2014-12-01

    Guastello's polynomial regression method for solving the cusp catastrophe model has been widely applied to analyze nonlinear behavior outcomes. However, no statistical power analysis for this modeling approach has been reported, probably due to the complex nature of the cusp catastrophe model. Since statistical power analysis is essential for research design, we propose a novel method in this paper to fill the gap. The method is simulation-based and can be used to calculate statistical power and sample size when Guastello's polynomial regression method is used for cusp catastrophe modeling analysis. With this novel approach, a power curve is first produced to depict the relationship between statistical power and sample size under different model specifications. This power curve is then used to determine the sample size required for a specified statistical power. We first verify the method through four scenarios generated by Monte Carlo simulations, and then apply it to real published data on modeling early sexual initiation among young adolescents. The findings of our study suggest that this simulation-based power analysis method can be used to estimate sample size and statistical power for Guastello's polynomial regression method in cusp catastrophe modeling.
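
    To illustrate the general simulation-based power idea described above, the sketch below uses a simple linear regression slope rather than the cusp catastrophe model itself, which is considerably more involved; the function name, effect size and sample sizes are illustrative assumptions.

      # Generic sketch of simulation-based power: fraction of simulated datasets in
      # which the slope test rejects H0, traced over candidate sample sizes.
      import numpy as np
      from scipy import stats

      def simulated_power(n, beta=0.3, n_sims=2000, alpha=0.05, seed=0):
          rng = np.random.default_rng(seed)
          rejections = 0
          for _ in range(n_sims):
              x = rng.normal(size=n)
              y = beta * x + rng.normal(size=n)
              if stats.linregress(x, y).pvalue < alpha:
                  rejections += 1
          return rejections / n_sims

      # Power curve over candidate sample sizes; pick the smallest n reaching target power.
      for n in (50, 100, 150):
          print(n, simulated_power(n))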

  15. Public Opinion Polls, Chicken Soup and Sample Size

    ERIC Educational Resources Information Center

    Nguyen, Phung

    2005-01-01

    Cooking and tasting chicken soup in three different pots of very different size serves to demonstrate that it is the absolute sample size that matters the most in determining the accuracy of the findings of the poll, not the relative sample size, i.e. the size of the sample in relation to its population.

  16. Public Opinion Polls, Chicken Soup and Sample Size

    ERIC Educational Resources Information Center

    Nguyen, Phung

    2005-01-01

    Cooking and tasting chicken soup in three different pots of very different size serves to demonstrate that it is the absolute sample size that matters the most in determining the accuracy of the findings of the poll, not the relative sample size, i.e. the size of the sample in relation to its population.

  17. Sample-size redetermination for repeated measures studies.

    PubMed

    Zucker, David M; Denne, Jonathan

    2002-09-01

    Clinical trialists recently have shown interest in two-stage procedures for updating the sample-size calculation at an interim point in a trial. Because many clinical trials involve repeated measures designs, it is desirable to have available practical two-stage procedures for such designs. Shih and Gould (1995, Statistics in Medicine 14, 2239-2248) discuss sample-size redetermination for repeated measures studies but under a highly simplified setup. We develop two-stage procedures under the general mixed linear model, allowing for dropouts and missed visits. We present a range of procedures and compare their Type I error and power by simulation. We find that, in general, the achieved power is brought considerably closer to the required level without inflating the Type I error rate. We also derive an inflation factor that ensures the power requirement is more closely met.

  18. Randomized controlled trials 5: Determining the sample size and power for clinical trials and cohort studies.

    PubMed

    Greene, Tom

    2015-01-01

    Performing well-powered randomized controlled trials is of fundamental importance in clinical research. The goal of sample size calculations is to assure that statistical power is acceptable while maintaining a small probability of a type I error. This chapter overviews the fundamentals of sample size calculation for standard types of outcomes for two-group studies. It considers (1) the problems of determining the size of the treatment effect that the studies will be designed to detect, (2) the modifications to sample size calculations to account for loss to follow-up and nonadherence, (3) the options when initial calculations indicate that the feasible sample size is insufficient to provide adequate power, and (4) the implication of using multiple primary endpoints. Sample size estimates for longitudinal cohort studies must take account of confounding by baseline factors.
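
    The sketch below shows two standard adjustments of the kind mentioned above, inflating for loss to follow-up and for dilution of the treatment effect by nonadherence (crossover); the rates and function names are illustrative assumptions, not the chapter's worked figures.

      # Sketch: standard inflation for dropout and for nonadherence (crossover).
      from math import ceil

      def adjust_for_dropout(n, dropout_rate):
          """Recruit enough so the expected analysable sample is still n."""
          return ceil(n / (1 - dropout_rate))

      def adjust_for_nonadherence(n, crossover_control, crossover_treated):
          """Inflate n because crossover shrinks the observable effect."""
          dilution = 1 - crossover_control - crossover_treated
          return ceil(n / dilution ** 2)

      n = 200
      n = adjust_for_nonadherence(n, crossover_control=0.05, crossover_treated=0.10)  # 277
      n = adjust_for_dropout(n, dropout_rate=0.15)                                     # 326
      print(n)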

  19. 7 CFR 52.803 - Sample unit size.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... unit size. Compliance with requirements for size and the various quality factors is based on the following sample unit sizes for the applicable factor: (a) Pits, character, and harmless extraneous material...)—100 cherries. Factors of Quality ...

  20. 7 CFR 52.803 - Sample unit size.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... unit size. Compliance with requirements for size and the various quality factors is based on the following sample unit sizes for the applicable factor: (a) Pits, character, and harmless extraneous material...)—100 cherries. Factors of Quality ...

  1. 7 CFR 52.803 - Sample unit size.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... unit size. Compliance with requirements for size and the various quality factors is based on the following sample unit sizes for the applicable factor: (a) Pits, character, and harmless extraneous material...)—100 cherries. Factors of Quality ...

  2. 40 CFR 89.418 - Raw emission sampling calculations.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 20 2014-07-01 2013-07-01 true Raw emission sampling calculations. 89.418 Section 89.418 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS... Test Procedures § 89.418 Raw emission sampling calculations. (a) The final test results shall be...

  3. 40 CFR 89.418 - Raw emission sampling calculations.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 20 2010-07-01 2010-07-01 false Raw emission sampling calculations. 89.418 Section 89.418 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS... Test Procedures § 89.418 Raw emission sampling calculations. (a) The final test results shall be...

  4. 40 CFR 89.418 - Raw emission sampling calculations.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 21 2012-07-01 2012-07-01 false Raw emission sampling calculations. 89.418 Section 89.418 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS... Test Procedures § 89.418 Raw emission sampling calculations. (a) The final test results shall be...

  5. [Unconditioned logistic regression and sample size: a bibliographic review].

    PubMed

    Ortega Calvo, Manuel; Cayuela Domínguez, Aurelio

    2002-01-01

    Unconditional logistic regression is a highly useful risk-prediction method in epidemiology. This article reviews the different solutions provided by different authors concerning the interface between sample size calculation and the use of logistic regression. Based on the information initially provided, a review is made of customized regression and the predictive constriction phenomenon, the design of an ordinal exposure with a binary outcome, the events-of-interest-per-variable concept, indicator variables, the classic Freeman equation, etc. Some skeptical ideas regarding this subject are also included.
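
    A minimal sketch of the events-per-variable heuristic alluded to above, often stated as at least 10 events per candidate predictor; the review itself should be consulted for the Freeman equation and the other criteria, and the function name and rates below are illustrative assumptions.

      # Sketch: minimum sample size from an events-per-variable (EPV) target.
      from math import ceil

      def n_from_epv(n_predictors, event_rate, epv=10):
          """Smallest n whose expected number of events meets the EPV target."""
          return ceil(epv * n_predictors / event_rate)

      print(n_from_epv(n_predictors=8, event_rate=0.15))   # 534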

  6. 7 CFR 52.3757 - Standard sample unit size.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... Ripe Olives 1 Product Description, Types, Styles, and Grades § 52.3757 Standard sample unit size... following standard sample unit size for the applicable style: (a) Whole and pitted—50 olives. (b)...

  7. 7 CFR 52.3757 - Standard sample unit size.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... Ripe Olives 1 Product Description, Types, Styles, and Grades § 52.3757 Standard sample unit size... following standard sample unit size for the applicable style: (a) Whole and pitted—50 olives. (b)...

  8. 7 CFR 52.775 - Sample unit size.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... unit size. Compliance with requirements for the size and the various quality factors is based on the following sample unit sizes for the applicable factor: (a) Size, color, pits, and character—20 ounces of... extraneous material—The total contents of each container in the sample. Factors of Quality ...

  9. 7 CFR 52.775 - Sample unit size.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... unit size. Compliance with requirements for the size and the various quality factors is based on the following sample unit sizes for the applicable factor: (a) Size, color, pits, and character—20 ounces of... extraneous material—The total contents of each container in the sample. Factors of Quality ...

  10. Sample Size Estimation for Non-Inferiority Trials: Frequentist Approach versus Decision Theory Approach.

    PubMed

    Bouman, A C; ten Cate-Hoek, A J; Ramaekers, B L T; Joore, M A

    2015-01-01

    Non-inferiority trials are performed when the main therapeutic effect of the new therapy is expected to be not unacceptably worse than that of the standard therapy, and the new therapy is expected to have advantages over the standard therapy in costs or other (health) consequences. These advantages however are not included in the classic frequentist approach of sample size calculation for non-inferiority trials. In contrast, the decision theory approach of sample size calculation does include these factors. The objective of this study is to compare the conceptual and practical aspects of the frequentist approach and decision theory approach of sample size calculation for non-inferiority trials, thereby demonstrating that the decision theory approach is more appropriate for sample size calculation of non-inferiority trials. The frequentist approach and decision theory approach of sample size calculation for non-inferiority trials are compared and applied to a case of a non-inferiority trial on individually tailored duration of elastic compression stocking therapy compared to two years elastic compression stocking therapy for the prevention of post thrombotic syndrome after deep vein thrombosis. The two approaches differ substantially in conceptual background, analytical approach, and input requirements. The sample size calculated according to the frequentist approach yielded 788 patients, using a power of 80% and a one-sided significance level of 5%. The decision theory approach indicated that the optimal sample size was 500 patients, with a net value of €92 million. This study demonstrates and explains the differences between the classic frequentist approach and the decision theory approach of sample size calculation for non-inferiority trials. We argue that the decision theory approach of sample size estimation is most suitable for sample size calculation of non-inferiority trials.
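
    As a rough sketch of the classic frequentist non-inferiority calculation for a continuous outcome (normal approximation), the helper below uses an illustrative margin, SD and assumed true difference, not the parameters of the stocking-therapy trial discussed above.

      # Sketch: per-group n for a non-inferiority test of means, one-sided alpha.
      from math import ceil
      from scipy.stats import norm

      def n_per_group_noninferiority(sd, margin, true_diff=0.0, alpha=0.05, power=0.80):
          """Test of H0: new - standard <= -margin (higher outcome is better)."""
          z_a = norm.ppf(1 - alpha)          # one-sided, as in the example above
          z_b = norm.ppf(power)
          return ceil(2 * (sd * (z_a + z_b)) ** 2 / (margin + true_diff) ** 2)

      print(n_per_group_noninferiority(sd=1.0, margin=0.25))   # about 198 per group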

  11. The Relationship between Sample Sizes and Effect Sizes in Systematic Reviews in Education

    ERIC Educational Resources Information Center

    Slavin, Robert; Smith, Dewi

    2009-01-01

    Research in fields other than education has found that studies with small sample sizes tend to have larger effect sizes than those with large samples. This article examines the relationship between sample size and effect size in education. It analyzes data from 185 studies of elementary and secondary mathematics programs that met the standards of…

  12. The Relationship between Sample Sizes and Effect Sizes in Systematic Reviews in Education

    ERIC Educational Resources Information Center

    Slavin, Robert; Smith, Dewi

    2009-01-01

    Research in fields other than education has found that studies with small sample sizes tend to have larger effect sizes than those with large samples. This article examines the relationship between sample size and effect size in education. It analyzes data from 185 studies of elementary and secondary mathematics programs that met the standards of…

  13. Mesh size and code option effects of strength calculations

    SciTech Connect

    Kaul, Ann M

    2010-12-10

    Modern Lagrangian hydrodynamics codes include numerical methods which allow calculations to proceed past the point obtainable by a purely Lagrangian scheme. These options can be employed as the user deems necessary to 'complete' a calculation. While one could argue that any calculation is better than none, to truly understand the calculated results and their relationship to physical reality, the user needs to understand how their runtime choices affect the calculated results. One step toward this goal is to understand the effect of each runtime choice on particular pieces of the code physics. This paper will present simulation results for some experiments typically used for strength model validation. Topics to be covered include effect of mesh size, use of various ALE schemes for mesh detangling, and use of anti-hour-glassing schemes. Experiments to be modeled include the lower strain rate ({approx} 10{sup 4} s{sup -1}) gas gun driven Taylor impact experiments and the higher strain rate ({approx} 10{sup 5}-10{sup 6} s{sup -1}) HE products driven perturbed plate experiments. The necessary mesh resolution and the effect of the code runtime options are highly dependent on the amount of localization of strain and stress in each experiment. In turn, this localization is dependent on the geometry of the experimental setup and the drive conditions.

  14. Finite-size supercell correction schemes for charged defect calculations

    NASA Astrophysics Data System (ADS)

    Komsa, Hannu-Pekka; Rantala, Tapio T.; Pasquarello, Alfredo

    2012-07-01

    Various schemes for correcting the finite-size supercell errors in the case of charged defect calculations are analyzed and their performance for a series of defect systems is compared. We focus on the schemes proposed by Makov and Payne (MP), Freysoldt, Neugebauer, and Van de Walle (FNV), and Lany and Zunger (LZ). The role of the potential alignment is also assessed. We demonstrate a connection between the defect charge distribution and the potential alignment, which establishes a relation between the MP and FNV schemes. Calculations are performed using supercells of various sizes and the corrected formation energies are compared to the values obtained by extrapolation to infinitely large supercells. For defects with localized charge distributions, we generally find that the FNV scheme improves upon the LZ one, while the MP scheme tends to overcorrect except for point-charge-like defects. We also encountered a class of defects, for which all the correction schemes fail to produce results consistent with the extrapolated values. This behavior is found to be caused by partial delocalization of the defect charge. We associate this effect to hybridization between the defect state and the band-edge states of the host. The occurrence of defect charge delocalization also reflects in the evolution of the defect Kohn-Sham levels with increasing supercell size. We discuss the physical relevance of the latter class of defects.

  15. 40 CFR 600.208-77 - Sample calculation.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... ECONOMY AND CARBON-RELATED EXHAUST EMISSIONS OF MOTOR VEHICLES Fuel Economy Regulations for 1977 and Later Model Year Automobiles-Procedures for Calculating Fuel Economy Values § 600.208-77 Sample...

  16. 40 CFR 600.208-77 - Sample calculation.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... ECONOMY AND CARBON-RELATED EXHAUST EMISSIONS OF MOTOR VEHICLES Procedures for Calculating Fuel Economy and Carbon-Related Exhaust Emission Values for 1977 and Later Model Year Automobiles § 600.208-77 Sample...

  17. Optimal flexible sample size design with robust power.

    PubMed

    Zhang, Lanju; Cui, Lu; Yang, Bo

    2016-08-30

    It is well recognized that sample size determination is challenging because of the uncertainty on the treatment effect size. Several remedies are available in the literature. Group sequential designs start with a sample size based on a conservative (smaller) effect size and allow early stop at interim looks. Sample size re-estimation designs start with a sample size based on an optimistic (larger) effect size and allow sample size increase if the observed effect size is smaller than planned. Different opinions favoring one type over the other exist. We propose an optimal approach using an appropriate optimality criterion to select the best design among all the candidate designs. Our results show that (1) for the same type of designs, for example, group sequential designs, there is room for significant improvement through our optimization approach; (2) optimal promising zone designs appear to have no advantages over optimal group sequential designs; and (3) optimal designs with sample size re-estimation deliver the best adaptive performance. We conclude that to deal with the challenge of sample size determination due to effect size uncertainty, an optimal approach can help to select the best design that provides most robust power across the effect size range of interest. Copyright © 2016 John Wiley & Sons, Ltd.

  18. 7 CFR 52.775 - Sample unit size.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... Cherries 1 Sample Unit Size § 52.775 Sample unit size. Compliance with requirements for the size and the..., color, pits, and character—20 ounces of drained cherries. (b) Defects (other than harmless extraneous material)—100 cherries. (c) Harmless extraneous material—The total contents of each container in the...

  19. 7 CFR 52.775 - Sample unit size.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... Cherries 1 Sample Unit Size § 52.775 Sample unit size. Compliance with requirements for the size and the..., color, pits, and character—20 ounces of drained cherries. (b) Defects (other than harmless extraneous material)—100 cherries. (c) Harmless extraneous material—The total contents of each container in the...

  20. Estimating the sample size for a pilot randomised trial to minimise the overall trial sample size for the external pilot and main trial for a continuous outcome variable

    PubMed Central

    Julious, Steven A; Cooper, Cindy L; Campbell, Michael J

    2015-01-01

    Sample size justification is an important consideration when planning a clinical trial, not only for the main trial but also for any preliminary pilot trial. When the outcome is a continuous variable, the sample size calculation requires an accurate estimate of the standard deviation of the outcome measure. A pilot trial can be used to get an estimate of the standard deviation, which could then be used to anticipate what may be observed in the main trial. However, an important consideration is that pilot trials often estimate the standard deviation parameter imprecisely. This paper looks at how we can choose an external pilot trial sample size in order to minimise the sample size of the overall clinical trial programme, that is, the pilot and the main trial together. We produce a method of calculating the optimal solution to the required pilot trial sample size when the standardised effect size for the main trial is known. However, as it may not be possible to know the standardised effect size to be used prior to the pilot trial, approximate rules are also presented. For a main trial designed with 90% power and two-sided 5% significance, we recommend pilot trial sample sizes per treatment arm of 75, 25, 15 and 10 for standardised effect sizes that are extra small (≤0.1), small (0.2), medium (0.5) or large (0.8), respectively. PMID:26092476
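
    As a minimal sketch of how the quoted recommendations fit together, the code below pairs the usual normal-approximation sample size for a two-arm trial with a continuous outcome (90% power, two-sided 5% significance) with a lookup of the rule-of-thumb pilot sizes given above. The function names and the threshold-based lookup are illustrative assumptions, not taken from the paper.

        import math
        from scipy.stats import norm

        def main_trial_n_per_arm(d, alpha=0.05, power=0.90):
            """Approximate per-arm sample size to detect standardised effect size d."""
            z_a = norm.ppf(1 - alpha / 2)
            z_b = norm.ppf(power)
            return math.ceil(2 * (z_a + z_b) ** 2 / d ** 2)

        def recommended_pilot_n_per_arm(d):
            """Rule-of-thumb external pilot size per arm quoted in the abstract."""
            if d <= 0.1:
                return 75
            if d <= 0.2:
                return 25
            if d <= 0.5:
                return 15
            return 10

        for d in (0.1, 0.2, 0.5, 0.8):
            print(d, recommended_pilot_n_per_arm(d), main_trial_n_per_arm(d))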

  1. Propagation of Uncertainty in System Parameters of a LWR Model by Sampling MCNPX Calculations - Burnup Analysis

    NASA Astrophysics Data System (ADS)

    Campolina, Daniel de A. M.; Lima, Claubia P. B.; Veloso, Maria Auxiliadora F.

    2014-06-01

    There is an uncertainty associated with every physical component that comprises a nuclear system. Assessing the impact of uncertainties in the simulation of fissionable material systems is essential for best-estimate calculations, which have been replacing conservative model calculations as computational power increases. Propagating uncertainty through a Monte Carlo code by sampling the input parameters has only recently become practical because of the huge computational effort required. In this work a sample space of MCNPX calculations was used to propagate the uncertainty. The sample size was optimized using the Wilks formula for a 95th percentile and a two-sided statistical tolerance interval of 95%. Uncertainties in the input parameters of the reactor considered included geometry dimensions and densities. The results demonstrate the capability of the sampling-based method for burnup analysis when the calculation sample size is optimized and many parameter uncertainties are investigated together in the same input.
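
    The Wilks criterion referred to above has a simple closed form: for a first-order, two-sided non-parametric tolerance interval with coverage gamma and confidence beta, the smallest acceptable number of runs n satisfies 1 - gamma^n - n(1-gamma)gamma^(n-1) >= beta. The sketch below finds that n by direct search; it is a generic textbook implementation, not the authors' code.

        def wilks_two_sided_n(gamma=0.95, beta=0.95):
            """Minimum number of runs for a two-sided, first-order tolerance interval."""
            n = 2
            while True:
                confidence = 1.0 - gamma**n - n * (1.0 - gamma) * gamma**(n - 1)
                if confidence >= beta:
                    return n
                n += 1

        print(wilks_two_sided_n())   # 93 runs for the 95%/95% case used here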

  2. Are Sample Sizes Clear and Justified in RCTs Published in Dental Journals?

    PubMed Central

    Koletsi, Despina; Fleming, Padhraig S.; Seehra, Jadbinder; Bagos, Pantelis G.; Pandis, Nikolaos

    2014-01-01

    Sample size calculations are advocated by the CONSORT group to justify sample sizes in randomized controlled trials (RCTs). The aim of this study was primarily to evaluate the reporting of sample size calculations, to establish the accuracy of these calculations in dental RCTs and to explore potential predictors associated with adequate reporting. Electronic searching was undertaken in eight leading specific and general dental journals. Replication of sample size calculations was undertaken where possible. Assumed variances or odds for control and intervention groups were also compared against those observed. The relationship between parameters including journal type, number of authors, trial design, involvement of methodologist, single-/multi-center study and region and year of publication, and the accuracy of sample size reporting was assessed using univariable and multivariable logistic regression. Of 413 RCTs identified, sufficient information to allow replication of sample size calculations was provided in only 121 studies (29.3%). Recalculations demonstrated an overall median overestimation of sample size of 15.2% after provisions for losses to follow-up. There was evidence that journal, methodologist involvement (OR = 1.97, CI: 1.10, 3.53), multi-center settings (OR = 1.86, CI: 1.01, 3.43) and time since publication (OR = 1.24, CI: 1.12, 1.38) were significant predictors of adequate description of sample size assumptions. Among journals JCP had the highest odds of adequately reporting sufficient data to permit sample size recalculation, followed by AJODO and JDR, with 61% (OR = 0.39, CI: 0.19, 0.80) and 66% (OR = 0.34, CI: 0.15, 0.75) lower odds, respectively. Both assumed variances and odds were found to underestimate the observed values. Presentation of sample size calculations in the dental literature is suboptimal; incorrect assumptions may have a bearing on the power of RCTs. PMID:24465806

  3. Sample size determination in clinical proteomic profiling experiments using mass spectrometry for class comparison.

    PubMed

    Cairns, David A; Barrett, Jennifer H; Billingham, Lucinda J; Stanley, Anthea J; Xinarianos, George; Field, John K; Johnson, Phillip J; Selby, Peter J; Banks, Rosamonde E

    2009-01-01

    Mass spectrometric profiling approaches such as MALDI-TOF and SELDI-TOF are increasingly being used in disease marker discovery, particularly in the lower molecular weight proteome. However, little consideration has been given to the issue of sample size in experimental design. The aim of this study was to develop a protocol for the use of sample size calculations in proteomic profiling studies using MS. These sample size calculations can be based on a simple linear mixed model which allows the inclusion of estimates of biological and technical variation inherent in the experiment. The use of a pilot experiment to estimate these components of variance is investigated and is shown to work well when compared with larger studies. Examination of data from a number of studies using different sample types and different chromatographic surfaces shows the need for sample- and preparation-specific sample size calculations.

  4. Nonlinearity of Argon Isotope Measurements for Samples of Different Sizes

    NASA Astrophysics Data System (ADS)

    Cox, S. E.; Hemming, S. R.; Turrin, B. D.; Swisher, C. C.

    2010-12-01

    Uncertainty in isotope ratio linearity is not propagated into the final uncertainty of high-precision Ar-Ar analyses. Nonlinearity is assumed to be negligible compared to other sources of error, so mass discrimination is calculated using air pipettes of a single size similar to typical unknowns. The calculated discrimination factor is applied to all measured isotopes regardless of the difference in relative isotope abundance or the difference in gas pressure between the sample and the air pipette. We measured 40Ar/36Ar ratios of different size air samples created using up to twenty pipette shots on two different mass spectrometers with automated air pipette systems (a VG5400 and an MAP 215-50) in order to test the assumption that the measured isotope ratios are consistent at different gas pressures. We typically obtain reproducibility < 0.5% on the 40Ar/36Ar of similar size air standards, but we measured 40Ar/36Ar ratios for aliquots 0.5 to 20 times the typical volume from the same reservoir that varied by as much as 10% (Figure 1). In sets VG1, VG2, and MAP, 40Ar/36Ar ratios increased with gas pressure (expressed as number of air pipette shots; R2 > 0.9). Several months later, we performed the same measurements on the VG5400 with a new filament and different tuning parameters and obtained a different result (Set VG3). In this case, the 40Ar/36Ar ratios still varied with gas pressure, but less drastically (R2 > 0.3), and the slope was reversed--40Ar/36Ar ratios decreased with gas pressure. We conclude that isotope ratio nonlinearity is a common phenomenon that has the potential to affect monitor standard and timescale age calculations at the 0.1% level of significance defined by EARTHTIME. We propose that argon labs incorporate air pipettes of varying size and isotopic compositions to allow for routine calibration of isotope ratio nonlinearity in the course of high-precision analyses. Figure 1: Measured 40Ar/36Ar vs. number of air pipettes. Sets VG1, VG2, and MAP

  5. Sample size determination in medical and surgical research.

    PubMed

    Flikkema, Robert M; Toledo-Pereyra, Luis H

    2012-02-01

    One of the most critical yet frequently misunderstood principles of research is sample size determination. Obtaining an inadequate sample is a serious problem that can invalidate an entire study. Without an extensive background in statistics, the seemingly simple question of selecting a sample size can become quite a daunting task. This article aims to give a researcher with no background in statistics the basic tools needed for sample size determination. After reading this article, the researcher will be aware of all the factors involved in a power analysis and will be able to work more effectively with the statistician when determining sample size. This work also reviews the power of a statistical hypothesis, as well as how to estimate the effect size of a research study. These are the two key components of sample size determination. Several examples will be considered throughout the text.

  6. Ultrasonic energy in liposome production: process modelling and size calculation.

    PubMed

    Barba, A A; Bochicchio, S; Lamberti, G; Dalmoro, A

    2014-04-21

    The use of liposomes in several fields of biotechnology, as well as in pharmaceutical and food sciences, is continuously increasing. Liposomes can be used as carriers for drugs and other active molecules. Among other characteristics, one of the main features relevant to their target applications is the liposome size. The size of liposomes, which is determined during the production process, decreases due to the addition of energy. The energy is used to break the lipid bilayer into smaller pieces, which then close themselves into spherical structures. In this work, the mechanisms of rupture of the lipid bilayer and the formation of spheres were modelled, accounting for how the energy supplied by ultrasonic radiation is stored within the layers (as elastic energy due to curvature and as tension energy due to the edge) and for the kinetics of the bending phenomenon. An algorithm to solve the model equations was designed and the corresponding calculation code was written. A dedicated preparation protocol, which involves active periods during which the energy is supplied and passive periods during which the energy supply is set to zero, was defined and applied. The model predictions compare well with the experimental results when the energy supply rate and the time constant are used as fitting parameters. Working with liposomes of different sizes as the starting point of the experiments, the key parameter is the ratio between the energy supply rate and the initial surface area.

  7. An evaluation of increasing sample size based on conditional power.

    PubMed

    Gaffney, Michael; Ware, James H

    2017-02-06

    We evaluate properties of sample size re-estimation (SSR) designs similar to the promising zone design considered by Mehta and Pocock (2011). We evaluate these designs under true effect sizes ranging from 1.1 down to 0.4 times the protocol-specified effect size, using six measures: (1) the probability of a sample size increase; (2) the mean proportional increase in sample size given an increase; (3, 4) the mean true conditional power with and without a sample size increase; and (5, 6) the expected increases in sample size and power due to the SSR procedure. These measures show the probability of a sample size increase and the cost/benefit for given true effect sizes, particularly when the SSR may either be pursuing a small effect size of little clinical importance or be unnecessary because the true effect size is close to the protocol-specified effect size. The results show the clear superiority of conducting the SSR late in the study and the inefficiency of a mid-study SSR. The results indicate that waiting until late in the study for the SSR yields a smaller, better targeted set of studies with a greater increase in overall power than a mid-study SSR.
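
    For reference, the sketch below implements the textbook conditional power at information fraction t under the assumption that the currently observed trend continues; z_t is the interim z-statistic. The one-sided alpha of 0.025 and the example interim values are assumed for illustration, and the specific promising-zone decision rules evaluated in the paper are not reproduced.

        from scipy.stats import norm

        def conditional_power(z_t, t, alpha=0.025):
            """One-sided conditional power given interim z-statistic z_t at fraction t,
            assuming the observed effect (current trend) continues to the end."""
            z_alpha = norm.ppf(1 - alpha)
            drift = z_t / t**0.5          # estimated drift parameter
            b_t = z_t * t**0.5            # Brownian-motion value at time t
            shortfall = z_alpha - b_t - drift * (1 - t)
            return 1 - norm.cdf(shortfall / (1 - t)**0.5)

        print(conditional_power(z_t=1.5, t=0.5))   # ~0.59 for these assumed values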

  8. Sample size estimates for clinical trials of vasospasm in subarachnoid hemorrhage.

    PubMed

    Kreiter, Kurt T; Mayer, Stephan A; Howard, George; Knappertz, Volker; Ilodigwe, Don; Sloan, Michael A; Macdonald, R Loch

    2009-07-01

    Clinical trials for prevention of vasospasm after aneurysmal subarachnoid hemorrhage (SAH) seldom have improved overall outcome; one reason may be inadequate sample size. We used data from the tirilazad trials and the Columbia University subarachnoid hemorrhage outcomes project to estimate sample sizes for clinical trials for reduction of vasospasm after SAH, assuming trials must show an effect on 90-day patient-centered outcome. Sample size calculations were based on different definitions of vasospasm, enrichment strategies, sensitivity of short- and long-term outcome instruments for reflecting vasospasm-related morbidity, different event rates of vasospasm, calculation of effect size of vasospasm on outcome instruments, and different treatment effect sizes. Sensitivity analysis was performed for variable event rates of vasospasm for a given treatment effect size. Sample size tables were constructed for different rates of vasospasm and outcome instruments for a given treatment effect size. Vasospasm occurred in 12% to 30% of patients. Symptomatic deterioration and infarction from vasospasm exhibited the strongest relationship to mortality and morbidity after SAH. Enriching for vasospasm by selection of patients with thick SAH slightly decreased sample sizes. Assuming 80% power, alpha=0.05 (2-tailed) and a treatment effect size of 50%, the total sample size exceeds 5000 patients to demonstrate efficacy on 3-month patient-centered outcome (modified Rankin Scale). Clinical trials targeting vasospasm and using traditional patient-centered outcomes require very high sample sizes and will therefore be costly, time-consuming, and impractical. This will hinder development of new treatment strategies.

  9. Estimating optimal sampling unit sizes for satellite surveys

    NASA Technical Reports Server (NTRS)

    Hallum, C. R.; Perry, C. R., Jr.

    1984-01-01

    This paper reports on an approach for minimizing data loads associated with satellite-acquired data, while improving the efficiency of global crop area estimates using remotely sensed, satellite-based data. Results of a sampling unit size investigation are given, including closed-form models for both nonsampling and sampling error variances. These models provide estimates of the sampling unit sizes that minimize cost. Earlier findings from foundational sampling unit size studies conducted by Mahalanobis, Jessen, Cochran, and others are utilized in modeling the sampling error variance as a function of sampling unit size. A conservative nonsampling error variance model is proposed that is realistic in the remote sensing environment, where one is faced with numerous unknown nonsampling errors. This approach permits the sampling unit size selection in the global crop inventorying environment to be put on a more quantitative basis while conservatively guarding against expected component error variances.

  10. How Small Is Big: Sample Size and Skewness.

    PubMed

    Piovesana, Adina; Senior, Graeme

    2016-09-21

    Sample sizes of 50 have been cited as sufficient to obtain stable means and standard deviations in normative test data. The influence of skewness on this minimum number, however, has not been evaluated. Normative test data with varying levels of skewness were compiled for 12 measures from 7 tests collected as part of ongoing normative studies in Brisbane, Australia. Means and standard deviations were computed from sample sizes of 10 to 100 drawn with replacement from larger samples of 272 to 973 cases. The minimum sample size was determined by the number at which both mean and standard deviation estimates remained within the 90% confidence intervals surrounding the population estimates. Sample sizes of greater than 85 were found to generate stable means and standard deviations regardless of the level of skewness, with smaller samples required in skewed distributions. A formula was derived to compute recommended sample size at differing levels of skewness.

  11. Sample size for cluster randomized trials: effect of coefficient of variation of cluster size and analysis method.

    PubMed

    Eldridge, Sandra M; Ashby, Deborah; Kerry, Sally

    2006-10-01

    Cluster randomized trials are increasingly popular. In many of these trials, cluster sizes are unequal. This can affect trial power, but standard sample size formulae for these trials ignore this. Previous studies addressing this issue have mostly focused on continuous outcomes or methods that are sometimes difficult to use in practice. We show how a simple formula can be used to judge the possible effect of unequal cluster sizes for various types of analyses and both continuous and binary outcomes. We explore the practical estimation of the coefficient of variation of cluster size required in this formula and demonstrate the formula's performance for a hypothetical but typical trial randomizing UK general practices. The simple formula provides a good estimate of sample size requirements for trials analysed using cluster-level analyses weighting by cluster size and a conservative estimate for other types of analyses. For trials randomizing UK general practices the coefficient of variation of cluster size depends on variation in practice list size, variation in incidence or prevalence of the medical condition under examination, and practice and patient recruitment strategies, and for many trials is expected to be approximately 0.65. Individual-level analyses can be noticeably more efficient than some cluster-level analyses in this context. When the coefficient of variation is <0.23, the effect of adjustment for variable cluster size on sample size is negligible. Most trials randomizing UK general practices and many other cluster randomized trials should account for variable cluster size in their sample size calculations.
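
    A minimal sketch of the kind of inflation discussed above, using one widely quoted form of the design effect for unequal cluster sizes, 1 + ((cv^2 + 1) * m_bar - 1) * icc, where m_bar is the mean cluster size, icc the intracluster correlation and cv the coefficient of variation of cluster size. The numerical inputs are assumptions for illustration, and the exact formula appropriate to a given analysis should be taken from the paper.

        import math

        def design_effect(m_bar, icc, cv=0.0):
            """Design effect allowing for variable cluster size."""
            return 1.0 + ((cv**2 + 1.0) * m_bar - 1.0) * icc

        def cluster_trial_n(n_individual, m_bar, icc, cv=0.0):
            """Total individuals needed after inflating an individually randomised n."""
            return math.ceil(n_individual * design_effect(m_bar, icc, cv))

        # Assumed example: 420 individuals under individual randomisation,
        # mean cluster size 20, ICC 0.05, cv of cluster size 0.65
        print(design_effect(20, 0.05, 0.65))        # ~2.37
        print(cluster_trial_n(420, 20, 0.05, 0.65))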

  12. Estimation of sample size and testing power (Part 4).

    PubMed

    Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

    2012-01-01

    Sample size estimation is necessary for any experimental or survey research, and an appropriate estimate based on known information and statistical knowledge is of great significance. This article introduces methods of sample size estimation for difference tests under a design with one factor at two levels, covering both quantitative and qualitative data. It presents the estimation formulas and shows how to apply them directly and through the POWER procedure of SAS software. In addition, worked examples are provided to guide researchers in implementing the repetition principle during the research design phase.
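
    For the qualitative (two-proportion) case of this design, the normal-approximation formula is n per group = (z_(1-alpha/2) + z_(1-beta))^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2. The sketch below evaluates it directly; the example proportions are assumed, and a procedure such as SAS PROC POWER adds refinements (e.g. continuity corrections) that are omitted here.

        import math
        from scipy.stats import norm

        def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
            """Per-group sample size to detect a difference between two proportions."""
            z_a = norm.ppf(1 - alpha / 2)
            z_b = norm.ppf(power)
            variance = p1 * (1 - p1) + p2 * (1 - p2)
            return math.ceil((z_a + z_b) ** 2 * variance / (p1 - p2) ** 2)

        print(n_per_group_two_proportions(0.60, 0.75))   # ~150 per group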

  13. 40 CFR 761.286 - Sample size and procedure for collecting a sample.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ..., DISTRIBUTION IN COMMERCE, AND USE PROHIBITIONS Sampling To Verify Completion of Self-Implementing Cleanup and...) § 761.286 Sample size and procedure for collecting a sample. At each selected sampling location for...

  14. 7 CFR 52.775 - Sample unit size.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... United States Standards for Grades of Canned Red Tart Pitted Cherries 1 Sample Unit Size § 52.775 Sample... drained cherries. (b) Defects (other than harmless extraneous material)—100 cherries. (c)...

  15. 40 CFR 80.127 - Sample size guidelines.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ...) REGULATION OF FUELS AND FUEL ADDITIVES Attest Engagements § 80.127 Sample size guidelines. In performing the attest engagement, the auditor shall sample relevant populations to which agreed-upon procedures will...

  16. SNS Sample Activation Calculator Flux Recommendations and Validation

    SciTech Connect

    McClanahan, Tucker C.; Gallmeier, Franz X.; Iverson, Erik B.; Lu, Wei

    2015-02-01

    The Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL) uses the Sample Activation Calculator (SAC) to calculate the activation of a sample after the sample has been exposed to the neutron beam in one of the SNS beamlines. The SAC webpage takes user inputs (choice of beamline, the mass, composition and area of the sample, irradiation time, decay time, etc.) and calculates the activation for the sample. In recent years, the SAC has been incorporated into the user proposal and sample handling process, and instrument teams and users have noticed discrepancies in the predicted activation of their samples. The Neutronics Analysis Team validated SAC by performing measurements on select beamlines and confirmed the discrepancies seen by the instrument teams and users. The conclusions were that the discrepancies were a result of a combination of faulty neutron flux spectra for the instruments, improper inputs supplied by SAC (1.12), and a mishandling of cross section data in the Sample Activation Program for Easy Use (SAPEU) (1.1.2). This report focuses on the conclusion that the SAPEU (1.1.2) beamline neutron flux spectra have errors and are a significant contributor to the activation discrepancies. The results of the analysis of the SAPEU (1.1.2) flux spectra for all beamlines will be discussed in detail. The recommendations for the implementation of improved neutron flux spectra in SAPEU (1.1.3) are also discussed.

  17. How to calculate normal curvatures of sampled geological surfaces

    NASA Astrophysics Data System (ADS)

    Bergbauer, Stephan; Pollard, David D.

    2003-02-01

    Curvature has been used both to describe geological surfaces and to predict the distribution of deformation in folded or domed strata. Several methods have been proposed in the geoscience literature to approximate the curvature of surfaces; however we advocate a technique for the exact calculation of normal curvature for single-valued gridded surfaces. This technique, based on the First and Second Fundamental Forms of differential geometry, allows for the analytical calculation of the magnitudes and directions of principal curvatures, as well as Gaussian and mean curvature. This approach is an improvement over previous methods to calculate surface curvatures because it avoids common mathematical approximations, which introduce significant errors when calculated over sloped horizons. Moreover, the technique is easily implemented numerically as it calculates curvatures directly from gridded surface data (e.g. seismic or GPS data) without prior surface triangulation. In geological curvature analyses, problems arise because of the sampled nature of geological horizons, which introduces a dependence of calculated curvatures on the sample grid. This dependence makes curvature analysis without prior data manipulation problematic. To ensure a meaningful curvature analysis, surface data should be filtered to extract only those surface wavelengths that scale with the feature under investigation. A curvature analysis of the top-Pennsylvanian horizon at Goose Egg dome, Wyoming shows that sampled surfaces can be smoothed using a moving average low-pass filter to extract curvature information associated with the true morphology of the structure.
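
    A minimal numerical sketch of the approach described: estimate the derivatives of a single-valued gridded surface by finite differences, assemble the coefficients of the First and Second Fundamental Forms, and obtain Gaussian, mean, and principal curvatures analytically. The finite-difference gradients and the paraboloid test case are illustrative assumptions rather than the authors' implementation, and real horizons should be low-pass filtered as recommended above before the curvatures are interpreted.

        import numpy as np

        def principal_curvatures(z, dx=1.0, dy=1.0):
            """Principal curvatures of a single-valued gridded surface z[y, x]
            computed from the First and Second Fundamental Forms."""
            zy, zx = np.gradient(z, dy, dx)            # first derivatives
            zyy, _ = np.gradient(zy, dy, dx)           # second derivatives
            zxy, zxx = np.gradient(zx, dy, dx)
            E, F, G = 1.0 + zx**2, zx * zy, 1.0 + zy**2
            w = np.sqrt(1.0 + zx**2 + zy**2)
            L, M, N = zxx / w, zxy / w, zyy / w
            K = (L * N - M**2) / (E * G - F**2)                          # Gaussian
            H = (E * N + G * L - 2.0 * F * M) / (2.0 * (E * G - F**2))   # mean
            disc = np.sqrt(np.maximum(H**2 - K, 0.0))
            return H + disc, H - disc                                    # k1, k2

        # Check on a paraboloid z = (x^2 + y^2) / (2R): k1 = k2 = 1/R at the apex.
        x = np.linspace(-50, 50, 101)
        X, Y = np.meshgrid(x, x)
        R = 500.0
        k1, k2 = principal_curvatures((X**2 + Y**2) / (2 * R), dx=1.0, dy=1.0)
        print(k1[50, 50], 1.0 / R)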

  18. Estimating population size with correlated sampling unit estimates

    Treesearch

    David C. Bowden; Gary C. White; Alan B. Franklin; Joseph L. Ganey

    2003-01-01

    Finite population sampling theory is useful in estimating total population size (abundance) from abundance estimates of each sampled unit (quadrat). We develop estimators that allow correlated quadrat abundance estimates, even for quadrats in different sampling strata. Correlated quadrat abundance estimates based on mark–recapture or distance sampling methods occur...

  19. Sample size reassessment for a two-stage design controlling the false discovery rate

    PubMed Central

    Zehetmayer, Sonja; Graf, Alexandra C.; Posch, Martin

    2016-01-01

    Sample size calculations for gene expression microarray and NGS-RNA-Seq experiments are challenging because the overall power depends on unknown quantities such as the proportion of true null hypotheses and the distribution of the effect sizes under the alternative. We propose a two-stage design with an adaptive interim analysis where these quantities are estimated from the interim data. The second-stage sample size is chosen based on these estimates to achieve a specific overall power. The proposed procedure controls the power in all considered scenarios except for very low first-stage sample sizes. The false discovery rate (FDR) is controlled despite the data-dependent choice of sample size. The two-stage design can be a useful tool to determine the sample size of high-dimensional studies if in the planning phase there is high uncertainty regarding the expected effect sizes and variability. PMID:26461844

  20. Determination of sample size in genome-scale RNAi screens.

    PubMed

    Zhang, Xiaohua Douglas; Heyse, Joseph F

    2009-04-01

    For genome-scale RNAi research, it is critical to investigate the sample size required to achieve reasonably low false negative rates (FNR) and false positive rates. The analysis in this article reveals that current sample size designs contribute to low signal-to-noise ratios in genome-scale RNAi projects. The analysis suggests that (i) an arrangement of 16 wells per plate is acceptable and an arrangement of 20-24 wells per plate is preferable for a negative control to be used for hit selection in a primary screen without replicates; (ii) in a confirmatory screen or a primary screen with replicates, a sample size of 3 is not large enough, and there is a large reduction in FNRs when the sample size increases from 3 to 4. To balance benefit and cost, any sample size between 4 and 11 is a reasonable choice. If the main focus is the selection of siRNAs with strong effects, a sample size of 4 or 5 is a good choice. If we want to have enough power to detect siRNAs with moderate effects, the sample size needs to be 8, 9, 10 or 11. These discoveries about sample size bring insight into the design of a genome-scale RNAi screen experiment.

  1. Minimum Sample Size Recommendations for Conducting Factor Analyses

    ERIC Educational Resources Information Center

    Mundfrom, Daniel J.; Shaw, Dale G.; Ke, Tian Lu

    2005-01-01

    There is no shortage of recommendations regarding the appropriate sample size to use when conducting a factor analysis. Suggested minimums for sample size include from 3 to 20 times the number of variables and absolute ranges from 100 to over 1,000. For the most part, there is little empirical evidence to support these recommendations. This…

  2. Preliminary Proactive Sample Size Determination for Confirmatory Factor Analysis Models

    ERIC Educational Resources Information Center

    Koran, Jennifer

    2016-01-01

    Proactive preliminary minimum sample size determination can be useful for the early planning stages of a latent variable modeling study to set a realistic scope, long before the model and population are finalized. This study examined existing methods and proposed a new method for proactive preliminary minimum sample size determination.

  3. Sample Sizes when Using Multiple Linear Regression for Prediction

    ERIC Educational Resources Information Center

    Knofczynski, Gregory T.; Mundfrom, Daniel

    2008-01-01

    When using multiple regression for prediction purposes, the issue of minimum required sample size often needs to be addressed. Using a Monte Carlo simulation, models with varying numbers of independent variables were examined and minimum sample sizes were determined for multiple scenarios at each number of independent variables. The scenarios…

  4. Sample Size Requirements for Estimating Pearson, Spearman and Kendall Correlations.

    ERIC Educational Resources Information Center

    Bonett, Douglas G.; Wright, Thomas A.

    2000-01-01

    Reviews interval estimates of the Pearson, Kendall tau-alpha, and Spearman correlations and proposes an improved standard error for the Spearman correlation. Examines the sample size required to yield a confidence interval having the desired width. Findings show accurate results from a two-stage approximation to the sample size. (SLD)
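
    A minimal sketch of the planning idea for the Pearson case, assuming the standard Fisher-z confidence interval: search for the smallest n whose back-transformed interval is no wider than the target. The planning correlation and target width in the example are assumptions; the article's two-stage approximation and its Spearman and Kendall adjustments are not reproduced here.

        import math
        from scipy.stats import norm

        def n_for_correlation_ci(r, width, alpha=0.05, n_max=100000):
            """Smallest n whose Fisher-z CI for a planning correlation r has width <= width."""
            z_crit = norm.ppf(1 - alpha / 2)
            zr = math.atanh(r)
            for n in range(4, n_max):
                half = z_crit / math.sqrt(n - 3)
                if math.tanh(zr + half) - math.tanh(zr - half) <= width:
                    return n
            raise ValueError("n_max too small for the requested width")

        # Assumed planning values: r = 0.30, desired CI width 0.20
        print(n_for_correlation_ci(r=0.30, width=0.20))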

  5. Effects of Calibration Sample Size and Item Bank Size on Ability Estimation in Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Sahin, Alper; Weiss, David J.

    2015-01-01

    This study aimed to investigate the effects of calibration sample size and item bank size on examinee ability estimation in computerized adaptive testing (CAT). For this purpose, a 500-item bank pre-calibrated using the three-parameter logistic model with 10,000 examinees was simulated. Calibration samples of varying sizes (150, 250, 350, 500,…

  6. Growth in bone strength, body size, and muscle size in a juvenile longitudinal sample.

    PubMed

    Ruff, Christopher

    2003-09-01

    A longitudinal sample of 20 subjects, measured an average of 34 to 35 times each at approximately 6-month intervals from near birth through late adolescence, was used to investigate relationships between body size, muscle size, and bone structural development. The section modulus, an index of bone strength, was calculated from humeral and femoral diaphyseal breadth measurements obtained from serial radiographs. Muscle breadths of the forearm and thigh, also measured radiographically, were used to estimate muscle cross-sectional areas. Body size was assessed as the product of body weight and bone length (humeral or femoral). Stature was also investigated as a surrogate body size measure. Growth velocity in femoral strength was strongly correlated with growth velocity in body weight x femoral length (r2=0.65-0.80), very poorly correlated with growth velocity in stature (r2<0.06), and weakly but significantly correlated with growth velocity in thigh muscle size (r2=0.10-0.25). Growth velocity in humeral strength was moderately correlated with that for body weight x humeral length (r2=0.40-0.73), very poorly correlated with that for stature (r2<0.05), and showed a marked sex difference with forearm muscle area velocity, with males having a stronger correlation (r2 approximately 0.65) and females a much weaker correlation (r2 approximately 0.15). Ages at peak adolescent growth velocity were nonsignificantly different between bone strength, body weight x bone length, and muscle area, but significantly earlier for stature. Thus, while there was an early adolescent "lag" between stature and bone strength, there was no such "lag" between a more mechanically appropriate measure of body size and bone strength. "Infancy peaks" in bone strength velocities, earlier in the humerus than in the femur and not paralleled by similar changes in body size, may be the result of the initiation of walking, when mechanical loads relative to body size are changing in both the upper and lower

  7. Application of bag sampling technique for particle size distribution measurements.

    PubMed

    Mazaheri, M; Johnson, G R; Morawska, L

    2009-11-01

    Bag sampling techniques can be used to temporarily store the aerosol and therefore provide sufficient time to utilize sensitive but slow instrumental techniques for recording detailed particle size distributions. Laboratory based assessment of the method was conducted to examine size dependant deposition loss coefficients for aerosols held in Velostat bags conforming to a horizontal cylindrical geometry. Deposition losses of NaCl particles in the range of 10 nm to 160 nm were analysed in relation to the bag size, storage time, and sampling flow rate. Results of this study suggest that the bag sampling method is most useful for moderately short sampling periods of about 5 minutes.

  8. 40 CFR 91.419 - Raw emission sampling calculations.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... (CONTINUED) CONTROL OF EMISSIONS FROM MARINE SPARK-IGNITION ENGINES Gaseous Exhaust Test Procedures § 91.419 Raw emission sampling calculations. (a) Derive the final test results through the steps described in... following equations are used to determine the weighted emission values for the test engine:...

  9. 40 CFR 91.419 - Raw emission sampling calculations.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... (CONTINUED) CONTROL OF EMISSIONS FROM MARINE SPARK-IGNITION ENGINES Gaseous Exhaust Test Procedures § 91.419 Raw emission sampling calculations. (a) Derive the final test results through the steps described in... following equations are used to determine the weighted emission values for the test engine:...

  10. Hellman-Feynman operator sampling in diffusion Monte Carlo calculations.

    PubMed

    Gaudoin, R; Pitarke, J M

    2007-09-21

    Diffusion Monte Carlo (DMC) calculations typically yield highly accurate results in solid-state and quantum-chemical calculations. However, operators that do not commute with the Hamiltonian are at best sampled correctly up to second order in the error of the underlying trial wave function once simple corrections have been applied. This error is of the same order as that for the energy in variational calculations. Operators that suffer from these problems include potential energies and the density. This Letter presents a new method, based on the Hellman-Feynman theorem, for the correct DMC sampling of all operators diagonal in real space. Our method is easy to implement in any standard DMC code.

  11. Sample sizes and model comparison metrics for species distribution models

    Treesearch

    B.B. Hanberry; H.S. He; D.C. Dey

    2012-01-01

    Species distribution models use small samples to produce continuous distribution maps. The question of how small a sample can be to produce an accurate model generally has been answered based on comparisons to maximum sample sizes of 200 observations or fewer. In addition, model comparisons often are made with the kappa statistic, which has become controversial....

  12. Sampling strategies for estimating brook trout effective population size

    Treesearch

    Andrew R. Whiteley; Jason A. Coombs; Mark Hudy; Zachary Robinson; Keith H. Nislow; Benjamin H. Letcher

    2012-01-01

    The influence of sampling strategy on estimates of effective population size (Ne) from single-sample genetic methods has not been rigorously examined, though these methods are increasingly used. For headwater salmonids, spatially close kin association among age-0 individuals suggests that sampling strategy (number of individuals and location from...

  13. Alpha values as a function of sample size, effect size, and power: accuracy over inference.

    PubMed

    Bradley, M T; Brand, A

    2013-06-01

    Tables of alpha values as a function of sample size, effect size, and desired power were presented. The tables indicated expected alphas for small, medium, and large effect sizes given a variety of sample sizes. It was evident that sample sizes for most psychological studies are adequate for large effect sizes defined at .8. The typical alpha level of .05 and desired power of 90% can be achieved with 70 participants in two groups. It was perhaps doubtful if these ideal levels of alpha and power have generally been achieved for medium effect sizes in actual research, since 170 participants would be required. Small effect sizes have rarely been tested with an adequate number of participants or power. Implications were discussed.

  14. CSnrc: Correlated sampling Monte Carlo calculations using EGSnrc

    SciTech Connect

    Buckley, Lesley A.; Kawrakow, I.; Rogers, D.W.O.

    2004-12-01

    CSnrc, a new user-code for the EGSnrc Monte Carlo system, is described. This user-code improves the efficiency when calculating ratios of doses from similar geometries. It uses a correlated sampling variance reduction technique. CSnrc is developed from an existing EGSnrc user-code CAVRZnrc and improves upon the correlated sampling algorithm used in an earlier version of the code written for the EGS4 Monte Carlo system. Improvements over the EGS4 version of the algorithm avoid repetition of sections of particle tracks. The new code includes a rectangular phantom geometry not available in other EGSnrc cylindrical codes. Comparison to CAVRZnrc shows gains in efficiency of up to a factor of 64 for a variety of test geometries when computing the ratio of doses to the cavity for two geometries. CSnrc is well suited to in-phantom calculations and is used to calculate the central electrode correction factor P{sub cel} in high-energy photon and electron beams. Current dosimetry protocols base the value of P{sub cel} on earlier Monte Carlo calculations. The current CSnrc calculations achieve 0.02% statistical uncertainties on P{sub cel}, much lower than those previously published. The current values of P{sub cel} compare well with the values used in dosimetry protocols for photon beams. For electron beams, CSnrc calculations are reported at the reference depth used in recent protocols and show up to a 0.2% correction for a graphite electrode, a correction currently ignored by dosimetry protocols. The calculations show that for a 1 mm diameter aluminum central electrode, the correction factor differs somewhat from the values used in both the IAEA TRS-398 code of practice and the AAPM's TG-51 protocol.

  15. Determination of the optimal sample size for a clinical trial accounting for the population size.

    PubMed

    Stallard, Nigel; Miller, Frank; Day, Simon; Hee, Siew Wan; Madan, Jason; Zohar, Sarah; Posch, Martin

    2017-07-01

    The problem of choosing a sample size for a clinical trial is a very common one. In some settings, such as rare diseases or other small populations, the large sample sizes usually associated with the standard frequentist approach may be infeasible, suggesting that the sample size chosen should reflect the size of the population under consideration. Incorporation of the population size is possible in a decision-theoretic approach either explicitly by assuming that the population size is fixed and known, or implicitly through geometric discounting of the gain from future patients reflecting the expected population size. This paper develops such approaches. Building on previous work, an asymptotic expression is derived for the sample size for single and two-arm clinical trials in the general case of a clinical trial with a primary endpoint with a distribution of one parameter exponential family form that optimizes a utility function that quantifies the cost and gain per patient as a continuous function of this parameter. It is shown that as the size of the population, N, or the expected size, N*, in the case of geometric discounting, becomes large, the optimal trial size is O(N^(1/2)) or O(N*^(1/2)). The sample size obtained from the asymptotic expression is also compared with the exact optimal sample size in examples with responses with Bernoulli and Poisson distributions, showing that the asymptotic approximations can also be reasonable in relatively small sample sizes. © 2016 The Author. Biometrical Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Sample Size and Allocation of Effort in Point Count Sampling of Birds in Bottomland Hardwood Forests

    Treesearch

    Winston P. Smith; Daniel J. Twedt; Robert J. Cooper; David A. Wiedenfeld; Paul B. Hamel; Robert P. Ford

    1995-01-01

    To examine sample size requirements and optimum allocation of effort in point count sampling of bottomland hardwood forests, we computed minimum sample sizes from variation recorded during 82 point counts (May 7-May 16, 1992) from three localities containing three habitat types across three regions of the Mississippi Alluvial Valley (MAV). Also, we estimated the effect...

  17. The Precision Efficacy Analysis for Regression Sample Size Method.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.; Barcikowski, Robert S.

    The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…

  18. Effects of Mesh Size on Sieved Samples of Corophium volutator

    NASA Astrophysics Data System (ADS)

    Crewe, Tara L.; Hamilton, Diana J.; Diamond, Antony W.

    2001-08-01

    Corophium volutator (Pallas), gammaridean amphipods found on intertidal mudflats, are frequently collected in mud samples sieved on mesh screens. However, mesh sizes used vary greatly among studies, raising the possibility that sampling methods bias results. The effect of using different mesh sizes on the resulting size-frequency distributions of Corophium was tested by collecting Corophium from mud samples with 0.5 and 0.25 mm sieves. More than 90% of Corophium less than 2 mm long passed through the larger sieve. A significantly smaller, but still substantial, proportion of 2-2.9 mm Corophium (30%) was also lost. Larger size classes were unaffected by mesh size. Mesh size significantly changed the observed size-frequency distribution of Corophium, and effects varied with sampling date. It is concluded that a 0.5 mm sieve is suitable for studies concentrating on adults, but to accurately estimate Corophium density and size-frequency distributions, a 0.25 mm sieve must be used.

  19. Sample Size Determination: A Comparison of Attribute, Continuous Variable, and Cell Size Methods.

    ERIC Educational Resources Information Center

    Clark, Philip M.

    1984-01-01

    Describes three methods of sample size determination, each having its use in investigation of social science problems: Attribute method; Continuous Variable method; Galtung's Cell Size method. Statistical generalization, benefits of cell size method (ease of use, trivariate analysis and trichotomized variables), and choice of method are…

  20. The Sample Size Needed for the Trimmed "t" Test when One Group Size Is Fixed

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2009-01-01

    The sample size determination is an important issue for planning research. However, limitations in size have seldom been discussed in the literature. Thus, how to allocate participants into different treatment groups to achieve the desired power is a practical issue that still needs to be addressed when one group size is fixed. The authors focused…

  1. Sample size requirements for training high-dimensional risk predictors.

    PubMed

    Dobbin, Kevin K; Song, Xiao

    2013-09-01

    A common objective of biomarker studies is to develop a predictor of patient survival outcome. Determining the number of samples required to train a predictor from survival data is important for designing such studies. Existing sample size methods for training studies use parametric models for the high-dimensional data and cannot handle a right-censored dependent variable. We present a new training sample size method that is non-parametric with respect to the high-dimensional vectors, and is developed for a right-censored response. The method can be applied to any prediction algorithm that satisfies a set of conditions. The sample size is chosen so that the expected performance of the predictor is within a user-defined tolerance of optimal. The central method is based on a pilot dataset. To quantify uncertainty, a method to construct a confidence interval for the tolerance is developed. Adequacy of the size of the pilot dataset is discussed. An alternative model-based version of our method for estimating the tolerance when no adequate pilot dataset is available is presented. The model-based method requires a covariance matrix be specified, but we show that the identity covariance matrix provides adequate sample size when the user specifies three key quantities. Application of the sample size method to two microarray datasets is discussed.

  2. Sample size matters: investigating the effect of sample size on a logistic regression susceptibility model for debris flows

    NASA Astrophysics Data System (ADS)

    Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.

    2014-02-01

    Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable and reproducible results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and they approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial data sets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In

  3. Sample size matters: investigating the effect of sample size on a logistic regression debris flow susceptibility model

    NASA Astrophysics Data System (ADS)

    Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.

    2013-06-01

    Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial datasets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In view of these results, we

  4. Abstract: Sample Size Planning for Latent Curve Models.

    PubMed

    Lai, Keke

    2011-11-30

    When designing a study that uses structural equation modeling (SEM), an important task is to decide an appropriate sample size. Historically, this task is approached from the power analytic perspective, where the goal is to obtain sufficient power to reject a false null hypothesis. However, hypothesis testing only tells whether a population effect is zero and fails to address the question about the population effect size. Moreover, significance tests in the SEM context often reject the null hypothesis too easily, and therefore the problem in practice is having too much power instead of not enough power. An alternative means to infer the population effect is forming confidence intervals (CIs). A CI is more informative than hypothesis testing because a CI provides a range of plausible values for the population effect size of interest. Given the close relationship between CI and sample size, the sample size for an SEM study can be planned with the goal to obtain sufficiently narrow CIs for the population model parameters of interest. Latent curve models (LCMs) are an application of SEM with mean structure to studying change over time. The sample size planning method for LCM from the CI perspective is based on maximum likelihood and the expected information matrix. Given a sample, forming a CI for the model parameter of interest in an LCM requires the sample covariance matrix S, the sample mean vector x̄, and the sample size N. Therefore, the width (w) of the resulting CI can be considered a function of S, x̄, and N. Inverting the CI formation process gives the sample size planning process. The inverted process requires a proxy for the population covariance matrix Σ, population mean vector μ, and the desired width ω as input, and it returns N as output. The specification of the input information for sample size planning needs to be performed based on a systematic literature review. In the context of covariance structure analysis, Lai and Kelley

  5. Rasch fit statistics and sample size considerations for polytomous data

    PubMed Central

    Smith, Adam B; Rush, Robert; Fallowfield, Lesley J; Velikova, Galina; Sharpe, Michael

    2008-01-01

    Background Previous research on educational data has demonstrated that Rasch fit statistics (mean squares and t-statistics) are highly susceptible to sample size variation for dichotomously scored rating data, although little is known about this relationship for polytomous data. These statistics help inform researchers about how well items fit to a unidimensional latent trait, and are an important adjunct to modern psychometrics. Given the increasing use of Rasch models in health research the purpose of this study was therefore to explore the relationship between fit statistics and sample size for polytomous data. Methods Data were collated from a heterogeneous sample of cancer patients (n = 4072) who had completed both the Patient Health Questionnaire – 9 and the Hospital Anxiety and Depression Scale. Ten samples were drawn with replacement for each of eight sample sizes (n = 25 to n = 3200). The Rating and Partial Credit Models were applied and the mean square and t-fit statistics (infit/outfit) derived for each model. Results The results demonstrated that t-statistics were highly sensitive to sample size, whereas mean square statistics remained relatively stable for polytomous data. Conclusion It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges. PMID:18510722

  6. Sample sizes to control error estimates in determining soil bulk density in California forest soils

    Treesearch

    Youzhi Han; Jianwei Zhang; Kim G. Mattson; Weidong Zhang; Thomas A. Weber

    2016-01-01

    Characterizing forest soil properties with high variability is challenging, sometimes requiring large numbers of soil samples. Soil bulk density is a standard variable needed along with element concentrations to calculate nutrient pools. This study aimed to determine the optimal sample size, the number of observations (n), for predicting the soil bulk density with a...

  7. Sample Size Requirements for Traditional and Regression-Based Norms.

    PubMed

    Oosterhuis, Hannah E M; van der Ark, L Andries; Sijtsma, Klaas

    2016-04-01

    Test norms enable determining the position of an individual test taker in the group. The most frequently used approach to obtain test norms is traditional norming. Regression-based norming may be more efficient than traditional norming and is rapidly growing in popularity, but little is known about its technical properties. A simulation study was conducted to compare the sample size requirements for traditional and regression-based norming by examining the 95% interpercentile ranges for percentile estimates as a function of sample size, norming method, size of covariate effects on the test score, test length, and number of answer categories in an item. Provided the assumptions of the linear regression model hold in the data, for a subdivision of the total group into eight equal-size subgroups, we found that regression-based norming requires samples 2.5 to 5.5 times smaller than traditional norming. Sample size requirements are presented for each norming method, test length, and number of answer categories. We emphasize that additional research is needed to establish sample size requirements when the assumptions of the linear regression model are violated. © The Author(s) 2015.

  8. Sample size determination for the confidence interval of linear contrast in analysis of covariance.

    PubMed

    Liu, Xiaofeng Steven

    2013-03-11

    This article provides a way to determine sample size for the confidence interval of the linear contrast of treatment means in analysis of covariance (ANCOVA) without prior knowledge of the actual covariate means and covariate sum of squares, which are modeled as a t statistic. Using the t statistic, one can calculate the appropriate sample size to achieve the desired probability of obtaining a specified width in the confidence interval of the covariate-adjusted linear contrast.
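
    A simplified sketch of the underlying idea: choose the smallest per-group n for which the t-based half-width of the confidence interval for the contrast of adjusted means meets a target. It deliberately omits the extra variance contribution from the unknown covariate means and covariate sum of squares, which is precisely what the article's t-statistic formulation accounts for, so the result should be read as a lower bound; the contrast, error standard deviation, and target half-width are assumed for illustration.

        import math
        from scipy import stats

        def n_for_contrast_halfwidth(contrast, sigma, halfwidth, n_covariates=1, alpha=0.05):
            """Smallest per-group n so that t * sigma * sqrt(sum(c^2)/n) <= halfwidth."""
            k = len(contrast)
            c2 = sum(c * c for c in contrast)
            for n in range(2, 100000):
                df = k * (n - 1) - n_covariates
                if df < 1:
                    continue
                t_crit = stats.t.ppf(1 - alpha / 2, df)
                if t_crit * sigma * math.sqrt(c2 / n) <= halfwidth:
                    return n
            raise ValueError("target half-width not reachable")

        # Assumed example: treatment mean versus the average of two control means
        print(n_for_contrast_halfwidth(contrast=[1, -0.5, -0.5], sigma=1.0, halfwidth=0.4))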

  9. Survival estimates and sample size: what can we conclude?

    PubMed

    Shouman, R; Witten, M

    1995-05-01

    Attempts to understand aging processes often involve life-span measurements from which a survival curve is constructed and model parameters estimated. The parameter estimates are then compared, and conclusions concerning the underlying biological processes are subsequently deduced, based upon the magnitude of the parameter differences. In this article we discuss the role of sample size and sample fluctuation on the parameter estimates and the profound effect that these factors may play in our arrival at meaningful biological conclusions. We then extend this discussion to examine one methodology that can help select sample sizes for specific parametric survival models.

  10. On the deficiency of the sample median when the sample size is random

    NASA Astrophysics Data System (ADS)

    Bening, V. E.; Korolev, Victor; Zeifman, Alexander

    2017-07-01

    In this paper, general theorems concerning the asymptotic deficiencies of the sample median based on a sample of random size are presented. These results make it possible to compare the quality of the sample median constructed from samples with both random and non-random sizes in terms of additional observations. The cases of the binomial distribution and the distribution concentrated at three points are considered.

  11. RNAseqPS: A Web Tool for Estimating Sample Size and Power for RNAseq Experiment.

    PubMed

    Guo, Yan; Zhao, Shilin; Li, Chung-I; Sheng, Quanhu; Shyr, Yu

    2014-01-01

    Sample size and power determination is the first step in the experimental design of a successful study. Sample size and power calculation is required for applications for National Institutes of Health (NIH) funding. Sample size and power calculation is well established for traditional biological studies such as mouse model, genome wide association study (GWAS), and microarray studies. Recent developments in high-throughput sequencing technology have allowed RNAseq to replace microarray as the technology of choice for high-throughput gene expression profiling. However, the sample size and power analysis of RNAseq technology is an underdeveloped area. Here, we present RNAseqPS, an advanced online RNAseq power and sample size calculation tool based on the Poisson and negative binomial distributions. RNAseqPS was built using the Shiny package in R. It provides an interactive graphical user interface that allows the users to easily conduct sample size and power analysis for RNAseq experimental design. RNAseqPS can be accessed directly at http://cqs.mc.vanderbilt.edu/shiny/RNAseqPS/.
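
    The RNAseqPS tool itself is interactive, but the underlying idea — per-gene power under a negative binomial count model — can be approximated by simulation. The sketch below is not the RNAseqPS implementation; it simulates negative-binomial counts for two groups and applies a simple surrogate test (Welch's t on log counts), with all parameter values (mean count, fold change, dispersion) chosen purely for illustration.

```python
import numpy as np
from scipy import stats

def nb_power_sim(n_per_group, mu_control=100, fold_change=2.0,
                 dispersion=0.2, alpha=0.05, n_sim=2000, seed=0):
    """Simulation-based power estimate for detecting a fold change in one
    gene's counts under a negative binomial model.
    `dispersion` is phi in variance = mu + phi * mu^2 (illustrative values)."""
    rng = np.random.default_rng(seed)
    size = 1.0 / dispersion                      # NB "size" (r) parameter

    def draw(mu, n):
        p = size / (size + mu)                   # numpy's (n, p) parameterization
        return rng.negative_binomial(size, p, n)

    rejections = 0
    for _ in range(n_sim):
        a = draw(mu_control, n_per_group)
        b = draw(mu_control * fold_change, n_per_group)
        # Welch t-test on log-transformed counts as a simple surrogate test
        _, p_val = stats.ttest_ind(np.log1p(a), np.log1p(b), equal_var=False)
        rejections += p_val < alpha
    return rejections / n_sim

if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, round(nb_power_sim(n), 3))
```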

  12. Using electron microscopy to calculate optical properties of biological samples.

    PubMed

    Wu, Wenli; Radosevich, Andrew J; Eshein, Adam; Nguyen, The-Quyen; Yi, Ji; Cherkezyan, Lusik; Roy, Hemant K; Szleifer, Igal; Backman, Vadim

    2016-11-01

    The microscopic structural origins of optical properties in biological media are still not fully understood. Better understanding these origins can serve to improve the utility of existing techniques and facilitate the discovery of other novel techniques. We propose a novel analysis technique using electron microscopy (EM) to calculate optical properties of specific biological structures. This method is demonstrated with images of human epithelial colon cell nuclei. The spectrum of anisotropy factor g, the phase function and the shape factor D of the nuclei are calculated. The results show strong agreement with an independent study. This method provides a new way to extract the true phase function of biological samples and provides an independent validation for optical property measurement techniques.

  13. Using electron microscopy to calculate optical properties of biological samples

    PubMed Central

    Wu, Wenli; Radosevich, Andrew J.; Eshein, Adam; Nguyen, The-Quyen; Yi, Ji; Cherkezyan, Lusik; Roy, Hemant K.; Szleifer, Igal; Backman, Vadim

    2016-01-01

    The microscopic structural origins of optical properties in biological media are still not fully understood. Better understanding these origins can serve to improve the utility of existing techniques and facilitate the discovery of other novel techniques. We propose a novel analysis technique using electron microscopy (EM) to calculate optical properties of specific biological structures. This method is demonstrated with images of human epithelial colon cell nuclei. The spectrum of anisotropy factor g, the phase function and the shape factor D of the nuclei are calculated. The results show strong agreement with an independent study. This method provides a new way to extract the true phase function of biological samples and provides an independent validation for optical property measurement techniques. PMID:27896013

  14. Estimating hidden population size using Respondent-Driven Sampling data

    PubMed Central

    Handcock, Mark S.; Gile, Krista J.; Mar, Corinne M.

    2015-01-01

    Respondent-Driven Sampling (RDS) is an approach to sampling design and inference in hard-to-reach human populations. It is often used in situations where the target population is rare and/or stigmatized in the larger population, so that it is prohibitively expensive to contact them through the available frames. Common examples include injecting drug users, men who have sex with men, and female sex workers. Most analysis of RDS data has focused on estimating aggregate characteristics, such as disease prevalence. However, RDS is often conducted in settings where the population size is unknown and of great independent interest. This paper presents an approach to estimating the size of a target population based on data collected through RDS. The proposed approach uses a successive sampling approximation to RDS to leverage information in the ordered sequence of observed personal network sizes. The inference uses the Bayesian framework, allowing for the incorporation of prior knowledge. A flexible class of priors for the population size is used that aids elicitation. An extensive simulation study provides insight into the performance of the method for estimating population size under a broad range of conditions. A further study shows the approach also improves estimation of aggregate characteristics. Finally, the method demonstrates sensible results when used to estimate the size of known networked populations from the National Longitudinal Study of Adolescent Health, and when used to estimate the size of a hard-to-reach population at high risk for HIV. PMID:26180577

  15. Two-stage chain sampling inspection plans with different sample sizes in the two stages

    NASA Technical Reports Server (NTRS)

    Stephens, K. S.; Dodge, H. F.

    1976-01-01

    A further generalization of the family of 'two-stage' chain sampling inspection plans is developed - viz, the use of different sample sizes in the two stages. Evaluation of the operating characteristics is accomplished by the Markov chain approach of the earlier work, modified to account for the different sample sizes. Markov chains for a number of plans are illustrated and several algebraic solutions are developed. Since these plans involve a variable amount of sampling, an evaluation of the average sampling number (ASN) is developed. A number of OC curves and ASN curves are presented. Some comparisons with plans having only one sample size are presented and indicate that improved discrimination is achieved by the two-sample-size plans.

  16. Sample-Size Planning for More Accurate Statistical Power: A Method Adjusting Sample Effect Sizes for Publication Bias and Uncertainty.

    PubMed

    Anderson, Samantha F; Kelley, Ken; Maxwell, Scott E

    2017-09-01

    The sample size necessary to obtain a desired level of statistical power depends in part on the population value of the effect size, which is, by definition, unknown. A common approach to sample-size planning uses the sample effect size from a prior study as an estimate of the population value of the effect to be detected in the future study. Although this strategy is intuitively appealing, effect-size estimates, taken at face value, are typically not accurate estimates of the population effect size because of publication bias and uncertainty. We show that the use of this approach often results in underpowered studies, sometimes to an alarming degree. We present an alternative approach that adjusts sample effect sizes for bias and uncertainty, and we demonstrate its effectiveness for several experimental designs. Furthermore, we discuss an open-source R package, BUCSS, and user-friendly Web applications that we have made available to researchers so that they can easily implement our suggested methods.

  17. Sample size considerations for historical control studies with survival outcomes

    PubMed Central

    Zhu, Hong; Zhang, Song; Ahn, Chul

    2015-01-01

    Historical control trials (HCTs) are frequently conducted to compare an experimental treatment with a control treatment from a previous study, when they are applicable and favored over a randomized clinical trial (RCT) due to feasibility, ethics and cost concerns. Makuch and Simon developed a sample size formula for historical control (HC) studies with binary outcomes, assuming that the observed response rate in the HC group is the true response rate. This method was extended by Dixon and Simon to specify sample size for HC studies comparing survival outcomes. For HC studies with binary and continuous outcomes, many researchers have shown that the popular Makuch and Simon method does not preserve the nominal power and type I error, and suggested alternative approaches. For HC studies with survival outcomes, we reveal through simulation that the conditional power and type I error over all the random realizations of the HC data have highly skewed distributions. Therefore, the sampling variability of the HC data needs to be appropriately accounted for in determining sample size. A flexible sample size formula that controls arbitrary percentiles, instead of means, of the conditional power and type I error, is derived. Although an explicit sample size formula with survival outcomes is not available, the computation is straightforward. Simulations demonstrate that the proposed method preserves the operational characteristics in a more realistic scenario where the true hazard rate of the HC group is unknown. A real data application of an advanced non-small cell lung cancer (NSCLC) clinical trial is presented to illustrate sample size considerations for HC studies in comparison of survival outcomes. PMID:26098200

  18. Sample size considerations for historical control studies with survival outcomes.

    PubMed

    Zhu, Hong; Zhang, Song; Ahn, Chul

    2016-01-01

    Historical control trials (HCTs) are frequently conducted to compare an experimental treatment with a control treatment from a previous study, when they are applicable and favored over a randomized clinical trial (RCT) due to feasibility, ethics and cost concerns. Makuch and Simon developed a sample size formula for historical control (HC) studies with binary outcomes, assuming that the observed response rate in the HC group is the true response rate. This method was extended by Dixon and Simon to specify sample size for HC studies comparing survival outcomes. For HC studies with binary and continuous outcomes, many researchers have shown that the popular Makuch and Simon method does not preserve the nominal power and type I error, and suggested alternative approaches. For HC studies with survival outcomes, we reveal through simulation that the conditional power and type I error over all the random realizations of the HC data have highly skewed distributions. Therefore, the sampling variability of the HC data needs to be appropriately accounted for in determining sample size. A flexible sample size formula that controls arbitrary percentiles, instead of means, of the conditional power and type I error, is derived. Although an explicit sample size formula with survival outcomes is not available, the computation is straightforward. Simulations demonstrate that the proposed method preserves the operational characteristics in a more realistic scenario where the true hazard rate of the HC group is unknown. A real data application of an advanced non-small cell lung cancer (NSCLC) clinical trial is presented to illustrate sample size considerations for HC studies in comparison of survival outcomes.

  19. Sample Size Determination for One- and Two-Sample Trimmed Mean Tests

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Olejnik, Stephen; Guo, Jiin-Huarng

    2008-01-01

    Formulas to determine the necessary sample sizes for parametric tests of group comparisons are available from several sources and appropriate when population distributions are normal. However, in the context of nonnormal population distributions, researchers recommend Yuen's trimmed mean test, but formulas to determine sample sizes have not been…

  20. Sample Size Determination for One- and Two-Sample Trimmed Mean Tests

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Olejnik, Stephen; Guo, Jiin-Huarng

    2008-01-01

    Formulas to determine the necessary sample sizes for parametric tests of group comparisons are available from several sources and appropriate when population distributions are normal. However, in the context of nonnormal population distributions, researchers recommend Yuen's trimmed mean test, but formulas to determine sample sizes have not been…

  1. A modified approach to estimating sample size for simple logistic regression with one continuous covariate.

    PubMed

    Novikov, I; Fund, N; Freedman, L S

    2010-01-15

    Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
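
    For readers who want a concrete starting point, the Hsieh-style approximation discussed in this literature expresses the required sample size in terms of the event probability at the covariate mean (p1) and the log odds ratio per standard deviation of the covariate. The function below is a minimal sketch of that approximation, not the modified Schouten-based method the authors propose; the parameter names are ours.

```python
import math
from scipy.stats import norm

def hsieh_logistic_n(p1, beta_star, alpha=0.05, power=0.8):
    """Approximate total sample size for simple logistic regression with one
    standardized normal covariate (Hsieh-style formula).
    p1        : event probability at the covariate mean
    beta_star : log odds ratio for a one-SD increase in the covariate
    """
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return (z_a + z_b) ** 2 / (p1 * (1 - p1) * beta_star ** 2)

# Example: 10% prevalence at the covariate mean, OR = 1.5 per SD of the covariate
print(round(hsieh_logistic_n(0.10, math.log(1.5))))
```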

  2. Aircraft studies of size-dependent aerosol sampling through inlets

    NASA Technical Reports Server (NTRS)

    Porter, J. N.; Clarke, A. D.; Ferry, G.; Pueschel, R. F.

    1992-01-01

    Representative measurement of aerosol from aircraft-aspirated systems requires special efforts in order to maintain near isokinetic sampling conditions, estimate aerosol losses in the sample system, and obtain a measurement of sufficient duration to be statistically significant for all sizes of interest. This last point is especially critical for aircraft measurements which typically require fast response times while sampling in clean remote regions. This paper presents size-resolved tests, intercomparisons, and analysis of aerosol inlet performance as determined by a custom laser optical particle counter. Measurements discussed here took place during the Global Backscatter Experiment (1988-1989) and the Central Pacific Atmospheric Chemistry Experiment (1988). System configurations are discussed including (1) nozzle design and performance, (2) system transmission efficiency, (3) nonadiabatic effects in the sample line and its effect on the sample-line relative humidity, and (4) the use and calibration of a virtual impactor.

  3. Aircraft studies of size-dependent aerosol sampling through inlets

    NASA Technical Reports Server (NTRS)

    Porter, J. N.; Clarke, A. D.; Ferry, G.; Pueschel, R. F.

    1992-01-01

    Representative measurement of aerosol from aircraft-aspirated systems requires special efforts in order to maintain near isokinetic sampling conditions, estimate aerosol losses in the sample system, and obtain a measurement of sufficient duration to be statistically significant for all sizes of interest. This last point is especially critical for aircraft measurements which typically require fast response times while sampling in clean remote regions. This paper presents size-resolved tests, intercomparisons, and analysis of aerosol inlet performance as determined by a custom laser optical particle counter. Measurements discussed here took place during the Global Backscatter Experiment (1988-1989) and the Central Pacific Atmospheric Chemistry Experiment (1988). System configurations are discussed including (1) nozzle design and performance, (2) system transmission efficiency, (3) nonadiabatic effects in the sample line and its effect on the sample-line relative humidity, and (4) the use and calibration of a virtual impactor.

  4. Evaluation of Size Correction Factor for Size-specific Dose Estimates (SSDE) Calculation.

    PubMed

    Mizonobe, Kazufusa; Shiraishi, Yuta; Nakano, Satoshi; Fukuda, Chiaki; Asanuma, Osamu; Harada, Kohei; Date, Hiroyuki

    2016-09-01

    American Association of Physicists in Medicine (AAPM) Report No. 204 recommends the size-specific dose estimates (SSDE), wherein SSDE = computed tomography dose index-volume (CTDIvol) × size correction factor (SCF), as an index of CT dose that accounts for patient thickness. However, SSDE has not yet been studied for area detector CT (ADCT) devices such as a 320-row CT scanner. The purpose of this study was to evaluate the SCF values for ADCT by means of a simulation technique to look into the differences in SCF values due to beam width. In the simulation, to construct the geometry of the Aquilion ONE X-ray CT system (120 kV), the dose ratio and the effective energies were measured in the cone angle and fan angle directions, and these were incorporated into the simulation code, Electron Gamma Shower Ver. 5 (EGS5). By changing the thickness of a PMMA phantom from 8 cm to 40 cm, CTDIvol and SCF were determined. The SCF values for the beam widths in conventional and volume scans were calculated. The differences among the SCF values of conventional scans, volume scans, and AAPM were up to 23.0%. However, when SCF values were normalized in a phantom of 16 cm diameter, the error tended to decrease for the cases of thin body thickness, such as those of children. It was concluded that even if beam width and device are different, the SCF values recommended by AAPM are useful in clinical situations.
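
    The relation SSDE = CTDIvol × SCF is straightforward to apply once a size-correction table is available. The sketch below only illustrates the bookkeeping: the SCF values in the lookup table are placeholders, not the AAPM Report No. 204 coefficients or the values derived in this study.

```python
# Hypothetical size-correction factors keyed by effective diameter (cm);
# the values below are placeholders, not the AAPM Report 204 tabulation.
SCF_TABLE = {16: 1.55, 20: 1.38, 24: 1.23, 28: 1.10, 32: 0.98, 36: 0.88}

def ssde(ctdi_vol, effective_diameter_cm):
    """SSDE = CTDIvol x size-correction factor, using the nearest tabulated size."""
    nearest = min(SCF_TABLE, key=lambda d: abs(d - effective_diameter_cm))
    return ctdi_vol * SCF_TABLE[nearest]

print(ssde(ctdi_vol=12.0, effective_diameter_cm=22))  # mGy, illustrative input
```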

  5. Trainable fusion rules. II. Small sample-size effects.

    PubMed

    Raudys, Sarunas

    2006-12-01

    Profound theoretical analysis is performed of small-sample properties of trainable fusion rules to determine in which situations neural network ensembles can improve or degrade classification results. We consider small sample effects, specific only to multiple classifiers system design in the two-category case of two important fusion rules: (1) linear weighted average (weighted voting), realized either by the standard Fisher classifier or by the single-layer perceptron, and (2) the non-linear Behavior-Knowledge-Space method. The small sample effects include: (i) training bias, i.e. learning sample size influence on generalization error of the base experts or of the fusion rule, (ii) optimistic biased outputs of the experts (self-boasting effect) and (iii) sample size impact on determining optimal complexity of the fusion rule. Correction terms developed to reduce the self-boasting effect are studied. It is shown that small learning sets increase classification error of the expert classifiers and damage correlation structure between their outputs. If the sizes of learning sets used to develop the expert classifiers are too small, non-trainable fusion rules can outperform more sophisticated trainable ones. A practical technique to fight sample size problems is a noise injection technique. The noise injection reduces the fusion rule's complexity and diminishes the expert's boasting bias.

  6. Nonparametric Sample Size Estimation for Sensitivity and Specificity with Multiple Observations per Subject

    PubMed Central

    Hu, Fan; Schucany, William R.; Ahn, Chul

    2010-01-01

    Summary We propose a sample size calculation approach for the estimation of sensitivity and specificity of diagnostic tests with multiple observations per subject. Many diagnostic tests such as diagnostic imaging or periodontal tests are characterized by the presence of multiple observations for each subject. The number of observations frequently varies among subjects in diagnostic imaging experiments or periodontal studies. Nonparametric statistical methods for the analysis of clustered binary data have been recently developed by various authors. In this paper, we derive a sample size formula for sensitivity and specificity of diagnostic tests using the sign test while accounting for multiple observations per subject. Application of the sample size formula for the design of a diagnostic test is discussed. Since the sample size formula is based on large sample theory, simulation studies are conducted to evaluate the finite sample performance of the proposed method. We compare the performance of the proposed sample size formula with that of the parametric sample size formula that assigns equal weight to each observation. Simulation studies show that the proposed sample size formula generally yields empirical powers closer to the nominal level than the parametric method. Simulation studies also show that the number of subjects required increases as the variability in the number of observations per subject increases and the intracluster correlation increases. PMID:22114363

  7. Sample size in psychological research over the past 30 years.

    PubMed

    Marszalek, Jacob M; Barber, Carolyn; Kohlhart, Julie; Holmes, Cooper B

    2011-04-01

    The American Psychological Association (APA) Task Force on Statistical Inference was formed in 1996 in response to a growing body of research demonstrating methodological issues that threatened the credibility of psychological research, and made recommendations to address them. One issue was the small, even dramatically inadequate, size of samples used in studies published by leading journals. The present study assessed the progress made since the Task Force's final report in 1999. Sample sizes reported in four leading APA journals in 1955, 1977, 1995, and 2006 were compared using nonparametric statistics, while data from the last two waves were fit to a hierarchical generalized linear growth model for more in-depth analysis. Overall, results indicate that the recommendations for increasing sample sizes have not been integrated in core psychological research, although results slightly vary by field. This and other implications are discussed in the context of current methodological critique and practice.

  8. On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data.

    PubMed

    Molenberghs, Geert; Kenward, Michael G; Aerts, Marc; Verbeke, Geert; Tsiatis, Anastasios A; Davidian, Marie; Rizopoulos, Dimitris

    2014-02-01

    The vast majority of settings for which frequentist statistical properties are derived assume a fixed, a priori known sample size. Familiar properties then follow, such as, for example, the consistency, asymptotic normality, and efficiency of the sample average for the mean parameter, under a wide range of conditions. We are concerned here with the alternative situation in which the sample size is itself a random variable which may depend on the data being collected. Further, the rule governing this may be deterministic or probabilistic. There are many important practical examples of such settings, including missing data, sequential trials, and informative cluster size. It is well known that special issues can arise when evaluating the properties of statistical procedures under such sampling schemes, and much has been written about specific areas (Grambsch P. Sequential sampling based on the observed Fisher information to guarantee the accuracy of the maximum likelihood estimator. Ann Stat 1983; 11: 68-77; Barndorff-Nielsen O and Cox DR. The effect of sampling rules on likelihood statistics. Int Stat Rev 1984; 52: 309-326). Our aim is to place these various related examples into a single framework derived from the joint modeling of the outcomes and sampling process and so derive generic results that in turn provide insight, and in some cases practical consequences, for different settings. It is shown that, even in the simplest case of estimating a mean, some of the results appear counterintuitive. In many examples, the sample average may exhibit small sample bias and, even when it is unbiased, may not be optimal. Indeed, there may be no minimum variance unbiased estimator for the mean. Such results follow directly from key attributes such as non-ancillarity of the sample size and incompleteness of the minimal sufficient statistic of the sample size and sample sum. Although our results have direct and obvious implications for estimation following group sequential

  9. CALCULATING TIME LAGS FROM UNEVENLY SAMPLED LIGHT CURVES

    SciTech Connect

    Zoghbi, A.; Reynolds, C.; Cackett, E. M.

    2013-11-01

    Timing techniques are powerful tools to study dynamical astrophysical phenomena. In the X-ray band, they offer the potential of probing accretion physics down to the event horizon. Recent work has used frequency- and energy-dependent time lags as tools for studying relativistic reverberation around the black holes in several Seyfert galaxies. This was achieved due to the evenly sampled light curves obtained using XMM-Newton. Continuously sampled data are, however, not always available and standard Fourier techniques are not applicable. Here, building on the work of Miller et al., we discuss and use a maximum likelihood method to obtain frequency-dependent lags that takes into account light curve gaps. Instead of calculating the lag directly, the method estimates the most likely lag values at a particular frequency given two observed light curves. We use Monte Carlo simulations to assess the method's applicability and use it to obtain lag-energy spectra from Suzaku data for two objects, NGC 4151 and MCG-5-23-16, that had previously shown signatures of iron K reverberation. The lags obtained are consistent with those calculated using standard methods using XMM-Newton data.

  10. Sample size bias in the estimation of means.

    PubMed

    Smith, Andrew R; Price, Paul C

    2010-08-01

    The present research concerns the hypothesis that intuitive estimates of the arithmetic mean of a sample of numbers tend to increase as a function of the sample size; that is, they reflect a systematic sample size bias. A similar bias has been observed when people judge the average member of a group of people on an inferred quantity (e.g., a disease risk; see Price, 2001; Price, Smith, & Lench, 2006). Until now, however, it has been unclear whether it would be observed when the stimuli were numbers, in which case the quantity need not be inferred, and "average" can be precisely defined as the arithmetic mean. In two experiments, participants estimated the arithmetic mean of 12 samples of numbers. In the first experiment, samples of from 5 to 20 numbers were presented simultaneously and participants quickly estimated their mean. In the second experiment, the numbers in each sample were presented sequentially. The results of both experiments confirmed the existence of a systematic sample size bias.

  11. Approximate sample sizes required to estimate length distributions

    USGS Publications Warehouse

    Miranda, L.E.

    2007-01-01

    The sample sizes required to estimate fish length were determined by bootstrapping from reference length distributions. Depending on population characteristics and species-specific maximum lengths, 1-cm length-frequency histograms required 375-1,200 fish to estimate within 10% with 80% confidence, 2.5-cm histograms required 150-425 fish, proportional stock density required 75-140 fish, and mean length required 75-160 fish. In general, smaller species, smaller populations, populations with higher mortality, and simpler length statistics required fewer samples. Indices that require low sample sizes may be suitable for monitoring population status, and when large changes in length are evident, additional sampling effort may be allocated to more precisely define length status with more informative estimators. Copyright by the American Fisheries Society 2007.
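
    A design like this can be prototyped with a simple bootstrap: resample lengths from a reference distribution at candidate sample sizes and find the smallest n whose resampled mean falls within the stated tolerance often enough. The sketch below follows that logic with a made-up reference population; it is not the authors' procedure or thresholds.

```python
import numpy as np

def required_n(reference_lengths, tolerance=0.10, confidence=0.80,
               candidate_ns=(25, 50, 75, 100, 150, 200, 300, 400),
               n_boot=2000, seed=1):
    """Smallest sample size whose bootstrap mean falls within `tolerance`
    of the reference mean in at least `confidence` of resamples."""
    rng = np.random.default_rng(seed)
    ref = np.asarray(reference_lengths, dtype=float)
    target = ref.mean()
    for n in candidate_ns:
        boots = rng.choice(ref, size=(n_boot, n), replace=True).mean(axis=1)
        hit_rate = np.mean(np.abs(boots - target) <= tolerance * target)
        if hit_rate >= confidence:
            return n
    return None  # none of the candidate sizes met the criterion

# Illustrative reference population of fish lengths (cm)
pop = np.random.default_rng(0).gamma(shape=9, scale=3.5, size=5000)
print(required_n(pop))
```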

  12. Optimization of sample size in controlled experiments: the CLAST rule.

    PubMed

    Botella, Juan; Ximénez, Carmen; Revuelta, Javier; Suero, Manuel

    2006-02-01

    Sequential rules are explored in the context of null hypothesis significance testing. Several studies have demonstrated that the fixed-sample stopping rule, in which the sample size used by researchers is determined in advance, is less practical and less efficient than sequential stopping rules. It is proposed that a sequential stopping rule called CLAST (composite limited adaptive sequential test) is a superior variant of COAST (composite open adaptive sequential test), a sequential rule proposed by Frick (1998). Simulation studies are conducted to test the efficiency of the proposed rule in terms of sample size and power. Two statistical tests are used: the one-tailed t test of mean differences with two matched samples, and the chi-square independence test for twofold contingency tables. The results show that the CLAST rule is more efficient than the COAST rule and reflects more realistically the practice of experimental psychology researchers.
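
    A composite sequential rule of this kind can be illustrated with a small simulation: data are added in batches and the test stops as soon as the running p-value leaves a continuation region, subject to minimum and maximum sample sizes. The bounds and limits in the sketch below are illustrative assumptions rather than the published CLAST constants, and a two-sided one-sample t test on paired differences stands in for the matched-samples test.

```python
import numpy as np
from scipy import stats

def sequential_t_trial(effect=0.5, lower_p=0.01, upper_p=0.36,
                       n_min=20, n_max=80, step=4, seed=0):
    """One simulated matched-pairs study run under a composite sequential rule:
    keep adding `step` pairs while lower_p < p < upper_p, within n_min..n_max.
    Returns (decision, final_n). Bounds are illustrative, not the CLAST values."""
    rng = np.random.default_rng(seed)
    diffs = list(rng.normal(effect, 1.0, n_min))
    while True:
        _, p = stats.ttest_1samp(diffs, 0.0)
        if p <= lower_p:
            return "reject H0", len(diffs)
        if p >= upper_p or len(diffs) >= n_max:
            return "stop without rejecting", len(diffs)
        diffs.extend(rng.normal(effect, 1.0, step))

print(sequential_t_trial())
```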

  13. Calculating Size of the Saturn's "Leopard Skin" Spots

    NASA Astrophysics Data System (ADS)

    Kochemasov, G. G.

    2007-03-01

    An IR image of the saturnian south (PIA08333) shows a huge storm ~8000 km across containing smaller storms about 300 to 600 km across. Assuming a wave nature of this phenomenon, calculations with wave modulation give diameters of the small forms of ~400 km.

  14. Teaching Modelling Concepts: Enter the Pocket-Size Programmable Calculator.

    ERIC Educational Resources Information Center

    Gaar, Kermit A., Jr.

    1980-01-01

    Addresses the problem of the failure of students to see a physiological system in an integrated way. Programmable calculators armed with a printer are suggested as useful teaching devices that avoid the expense and the unavailability of computers for modelling in teaching physiology. (Author/SA)

  15. Sample size determination for testing nonequality under a three-treatment two-period incomplete block crossover trial.

    PubMed

    Lui, Kung-Jong; Chang, Kuang-Chao

    2015-05-01

    To reduce the lengthy duration of a crossover trial for comparing three treatments, the incomplete block design has been often considered. A sample size calculation procedure for testing nonequality between either of the two experimental treatments and a placebo under such a design is developed. To evaluate the performance of the proposed sample size calculation procedure, Monte Carlo simulation is employed. The accuracy of the sample size calculation procedure developed here is demonstrated in a variety of situations. As compared with the parallel groups design, a substantial proportional reduction in the total minimum required sample size in use of the incomplete block crossover design is found. A crossover trial comparing two different doses of formoterol with a placebo on the forced expiratory volume is applied to illustrate the use of the sample size calculation procedure.

  16. Sample Size Bias in Judgments of Perceptual Averages

    ERIC Educational Resources Information Center

    Price, Paul C.; Kimura, Nicole M.; Smith, Andrew R.; Marshall, Lindsay D.

    2014-01-01

    Previous research has shown that people exhibit a sample size bias when judging the average of a set of stimuli on a single dimension. The more stimuli there are in the set, the greater people judge the average to be. This effect has been demonstrated reliably for judgments of the average likelihood that groups of people will experience negative,…

  17. Sample-size needs for forestry herbicide trials

    Treesearch

    S.M. Zedaker; T.G. Gregoire; James H. Miller

    1994-01-01

    Forest herbicide experiments are increasingly being designed to evaluate smaller treatment differences when comparing existing effective treatments, tank mix ratios, surfactants, and new low-rate products. The ability to detect small differences in efficacy is dependent upon the relationship among sample size, type I and II error probabilities, and the coefficients of...

  18. Sample size bias in retrospective estimates of average duration.

    PubMed

    Smith, Andrew R; Rule, Shanon; Price, Paul C

    2017-03-25

    People often estimate the average duration of several events (e.g., on average, how long does it take to drive from one's home to his or her office). While there is a great deal of research investigating estimates of duration for a single event, few studies have examined estimates when people must average across numerous stimuli or events. The current studies were designed to fill this gap by examining how people's estimates of average duration were influenced by the number of stimuli being averaged (i.e., the sample size). Based on research investigating the sample size bias, we predicted that participants' judgments of average duration would increase as the sample size increased. Across four studies, we demonstrated a sample size bias for estimates of average duration with different judgment types (numeric estimates and comparisons), study designs (between and within-subjects), and paradigms (observing images and performing tasks). The results are consistent with the more general notion that psychological representations of magnitudes in one dimension (e.g., quantity) can influence representations of magnitudes in another dimension (e.g., duration).

  19. 7 CFR 52.803 - Sample unit size.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... CERTAIN OTHER PROCESSED FOOD PRODUCTS 1 United States Standards for Grades of Frozen Red Tart Pitted... quality factors is based on the following sample unit sizes for the applicable factor: (a) Pits, character... than harmless extraneous material)—100 cherries. Factors of Quality...

  20. 7 CFR 52.803 - Sample unit size.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... CERTAIN OTHER PROCESSED FOOD PRODUCTS 1 United States Standards for Grades of Frozen Red Tart Pitted... quality factors is based on the following sample unit sizes for the applicable factor: (a) Pits, character... than harmless extraneous material)—100 cherries. Factors of Quality...

  1. An Investigation of Sample Size Splitting on ATFIND and DIMTEST

    ERIC Educational Resources Information Center

    Socha, Alan; DeMars, Christine E.

    2013-01-01

    Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…

  2. Sample Size Tables, "t" Test, and a Prevalent Psychometric Distribution.

    ERIC Educational Resources Information Center

    Sawilowsky, Shlomo S.; Hillman, Stephen B.

    Psychology studies often have low statistical power. Sample size tables, as given by J. Cohen (1988), may be used to increase power, but they are based on Monte Carlo studies of relatively "tame" mathematical distributions, as compared to psychology data sets. In this study, Monte Carlo methods were used to investigate Type I and Type II…

  3. Small Sample Sizes Yield Biased Allometric Equations in Temperate Forests

    PubMed Central

    Duncanson, L.; Rourke, O.; Dubayah, R.

    2015-01-01

    Accurate quantification of forest carbon stocks is required for constraining the global carbon cycle and its impacts on climate. The accuracies of forest biomass maps are inherently dependent on the accuracy of the field biomass estimates used to calibrate models, which are generated with allometric equations. Here, we provide a quantitative assessment of the sensitivity of allometric parameters to sample size in temperate forests, focusing on the allometric relationship between tree height and crown radius. We use LiDAR remote sensing to isolate between 10,000 to more than 1,000,000 tree height and crown radius measurements per site in six U.S. forests. We find that fitted allometric parameters are highly sensitive to sample size, producing systematic overestimates of height. We extend our analysis to biomass through the application of empirical relationships from the literature, and show that given the small sample sizes used in common allometric equations for biomass, the average site-level biomass bias is ~+70% with a standard deviation of 71%, ranging from −4% to +193%. These findings underscore the importance of increasing the sample sizes used for allometric equation generation. PMID:26598233

  4. Dependent Variable Reliability and Determination of Sample Size.

    ERIC Educational Resources Information Center

    Maxwell, Scott E.

    Arguments have recently been put forth that standard textbook procedures for determining the sample size necessary to achieve a certain level of power in a completely randomized design are incorrect when the dependent variable is fallible because they ignore measurement error. In fact, however, there are several correct procedures, one of which is…

  5. Sample Size Bias in Judgments of Perceptual Averages

    ERIC Educational Resources Information Center

    Price, Paul C.; Kimura, Nicole M.; Smith, Andrew R.; Marshall, Lindsay D.

    2014-01-01

    Previous research has shown that people exhibit a sample size bias when judging the average of a set of stimuli on a single dimension. The more stimuli there are in the set, the greater people judge the average to be. This effect has been demonstrated reliably for judgments of the average likelihood that groups of people will experience negative,…

  6. Small Sample Sizes Yield Biased Allometric Equations in Temperate Forests.

    PubMed

    Duncanson, L; Rourke, O; Dubayah, R

    2015-11-24

    Accurate quantification of forest carbon stocks is required for constraining the global carbon cycle and its impacts on climate. The accuracies of forest biomass maps are inherently dependent on the accuracy of the field biomass estimates used to calibrate models, which are generated with allometric equations. Here, we provide a quantitative assessment of the sensitivity of allometric parameters to sample size in temperate forests, focusing on the allometric relationship between tree height and crown radius. We use LiDAR remote sensing to isolate between 10,000 to more than 1,000,000 tree height and crown radius measurements per site in six U.S. forests. We find that fitted allometric parameters are highly sensitive to sample size, producing systematic overestimates of height. We extend our analysis to biomass through the application of empirical relationships from the literature, and show that given the small sample sizes used in common allometric equations for biomass, the average site-level biomass bias is ~+70% with a standard deviation of 71%, ranging from -4% to +193%. These findings underscore the importance of increasing the sample sizes used for allometric equation generation.

  7. On sample size estimation and re-estimation adjusting for variability in confirmatory trials.

    PubMed

    Wu, Pei-Shien; Lin, Min; Chow, Shein-Chung

    2016-01-01

    Sample size estimation (SSE) is an important issue in the planning of clinical studies. While larger studies are likely to have sufficient power, it may be unethical to expose more patients than necessary to answer a scientific question. Budget considerations may also cause one to limit the study to an adequate size to answer the question at hand. Typically at the planning stage, a statistically based justification for sample size is provided. An effective sample size is usually planned under a pre-specified type I error rate, a desired power under a particular alternative and variability associated with the observations recorded. The nuisance parameter such as the variance is unknown in practice. Thus, information from a preliminary pilot study is often used to estimate the variance. However, calculating the sample size based on the estimated nuisance parameter may not be stable. Sample size re-estimation (SSR) at the interim analysis may provide an opportunity to re-evaluate the uncertainties using accrued data and continue the trial with an updated sample size. This article evaluates a proposed SSR method based on controlling the variability of nuisance parameter. A numerical study is used to assess the performance of proposed method with respect to the control of type I error. The proposed method and concepts could be extended to SSR approaches with respect to other criteria, such as maintaining effect size, achieving conditional power, and reaching a desired reproducibility probability.
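
    The basic mechanics of re-estimating sample size from an interim variance estimate can be seen with the standard normal-approximation formula for a two-sample comparison of means; this is a generic sketch, not the specific SSR criterion evaluated in the article.

```python
import math
from scipy.stats import norm

def n_per_group(sigma, delta, alpha=0.05, power=0.9):
    """Two-sample normal-approximation sample size per group for detecting
    a mean difference `delta` with common standard deviation `sigma`."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return math.ceil(2 * (sigma * z / delta) ** 2)

# Planning stage uses a pilot SD of 8; the interim analysis suggests the SD is closer to 11.
planned = n_per_group(sigma=8, delta=5)
re_estimated = n_per_group(sigma=11, delta=5)
print(planned, re_estimated)   # the interim look suggests enrolling more patients
```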

  8. How many more? Sample size determination in studies of morphological integration and evolvability.

    PubMed

    Grabowski, Mark; Porto, Arthur

    2017-05-01

    1. The variational properties of living organisms are an important component of current evolutionary theory. As a consequence, researchers working on the field of multivariate evolution have increasingly used integration and evolvability statistics as a way of capturing the potentially complex patterns of trait association and their effects over evolutionary trajectories. Little attention has been paid, however, to the cascading effects that inaccurate estimates of trait covariance have on these widely used evolutionary statistics. 2. Here, we analyze the relationship between sampling effort and inaccuracy in evolvability and integration statistics calculated from 10-trait matrices with varying patterns of covariation and magnitudes of integration. We then extrapolate our initial approach to different numbers of traits and different magnitudes of integration and estimate general equations relating the inaccuracy of the statistics of interest to sampling effort. We validate our equations using a dataset of cranial traits, and use them to make sample size recommendations. 3. Our results suggest that highly inaccurate estimates of evolvability and integration statistics resulting from small sample sizes are likely common in the literature, given the sampling effort necessary to properly estimate them. We also show that patterns of covariation have no effect on the sampling properties of these statistics, but overall magnitudes of integration interact with sample size and lead to varying degrees of bias, imprecision, and inaccuracy. 4. Finally, we provide R functions that can be used to calculate recommended sample sizes or to simply estimate the level of inaccuracy that should be expected in these statistics, given a sampling design.
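
    As a rough illustration of why sampling effort matters for such statistics, one can simulate multivariate normal data with a known correlation structure and track how far a common integration index (the mean squared pairwise correlation) drifts from its population value at different sample sizes. The setup below is hypothetical and is not the authors' R functions.

```python
import numpy as np

def integration_inaccuracy(true_corr=0.4, n_traits=10,
                           sample_sizes=(20, 50, 100, 500), n_rep=500, seed=0):
    """Average absolute error of the mean squared correlation (a common
    integration index) as a function of sample size, under an equicorrelated
    multivariate normal model. Illustrative only."""
    rng = np.random.default_rng(seed)
    cov = np.full((n_traits, n_traits), true_corr)
    np.fill_diagonal(cov, 1.0)
    off_diag = ~np.eye(n_traits, dtype=bool)
    true_r2 = np.mean(cov[off_diag] ** 2)
    results = {}
    for n in sample_sizes:
        errs = []
        for _ in range(n_rep):
            x = rng.multivariate_normal(np.zeros(n_traits), cov, size=n)
            r = np.corrcoef(x, rowvar=False)
            errs.append(abs(np.mean(r[off_diag] ** 2) - true_r2))
        results[n] = float(np.mean(errs))
    return results

print(integration_inaccuracy())
```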

  9. A Fourier analysis on the maximum acceptable grid size for discrete proton beam dose calculation.

    PubMed

    Li, Haisen S; Romeijn, H Edwin; Dempsey, James F

    2006-09-01

    We developed an analytical method for determining the maximum acceptable grid size for discrete dose calculation in proton therapy treatment plan optimization, so that the accuracy of the optimized dose distribution is guaranteed in the phase of dose sampling and the superfluous computational work is avoided. The accuracy of dose sampling was judged by the criterion that the continuous dose distribution could be reconstructed from the discrete dose within a 2% error limit. To keep the error caused by the discrete dose sampling under a 2% limit, the dose grid size cannot exceed a maximum acceptable value. The method was based on Fourier analysis and the Shannon-Nyquist sampling theorem as an extension of our previous analysis for photon beam intensity modulated radiation therapy [J. F. Dempsey, H. E. Romeijn, J. G. Li, D. A. Low, and J. R. Palta, Med. Phys. 32, 380-388 (2005)]. The proton beam model used for the analysis was a near monoenergetic (of width about 1% the incident energy) and monodirectional infinitesimal (nonintegrated) pencil beam in water medium. By monodirection, we mean that the proton particles are in the same direction before entering the water medium and the various scattering prior to entrance to water is not taken into account. In intensity modulated proton therapy, the elementary intensity modulation entity for proton therapy is either an infinitesimal or finite sized beamlet. Since a finite sized beamlet is the superposition of infinitesimal pencil beams, the result of the maximum acceptable grid size obtained with infinitesimal pencil beam also applies to finite sized beamlet. The analytic Bragg curve function proposed by Bortfeld [T. Bortfeld, Med. Phys. 24, 2024-2033 (1997)] was employed. The lateral profile was approximated by a depth dependent Gaussian distribution. The model included the spreads of the Bragg peak and the lateral profiles due to multiple Coulomb scattering. The dependence of the maximum acceptable dose grid size on the

  10. Sample size determination for logistic regression on a logit-normal distribution.

    PubMed

    Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance

    2017-06-01

    Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination (R²) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for R² for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.

  11. Optimal sample size allocation for Welch's test in one-way heteroscedastic ANOVA.

    PubMed

    Shieh, Gwowen; Jan, Show-Li

    2015-06-01

    The determination of an adequate sample size is a vital aspect in the planning stage of research studies. A prudent strategy should incorporate all of the critical factors and cost considerations into sample size calculations. This study concerns the allocation schemes of group sizes for Welch's test in a one-way heteroscedastic ANOVA. Optimal allocation approaches are presented for minimizing the total cost while maintaining adequate power and for maximizing power performance for a fixed cost. The commonly recommended ratio of sample sizes is proportional to the ratio of the population standard deviations or the ratio of the population standard deviations divided by the square root of the ratio of the unit sampling costs. Detailed numerical investigations have shown that these usual allocation methods generally do not give the optimal solution. The suggested procedures are illustrated using an example of the cost-efficiency evaluation in multidisciplinary pain centers.
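
    The two "commonly recommended" allocation heuristics mentioned above are easy to compute: group sizes proportional to the population standard deviations, or to the standard deviations divided by the square root of the unit sampling costs. The sketch below implements only these heuristics (which the study shows are generally not optimal), with illustrative inputs.

```python
import math

def heuristic_allocation(total_n, sds, unit_costs=None):
    """Split `total_n` across groups in proportion to the group SDs, optionally
    divided by the square root of the per-observation sampling costs.
    This is the conventional heuristic, not the paper's optimal allocation."""
    if unit_costs is None:
        weights = list(sds)
    else:
        weights = [s / math.sqrt(c) for s, c in zip(sds, unit_costs)]
    total_w = sum(weights)
    return [round(total_n * w / total_w) for w in weights]

print(heuristic_allocation(120, sds=[4, 8, 12]))                      # [20, 40, 60]
print(heuristic_allocation(120, sds=[4, 8, 12], unit_costs=[1, 1, 4]))
```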

  12. Sample size consideration for immunoassay screening cut-point determination.

    PubMed

    Zhang, Jianchun; Zhang, Lanju; Yang, Harry

    2014-01-01

    Past decades have seen a rapid growth of biopharmaceutical products on the market. The administration of such large molecules can generate antidrug antibodies that can induce unwanted immune reactions in the recipients. Assessment of immunogenicity is required by regulatory agencies in clinical and nonclinical development, and this demands a well-validated assay. One of the important performance characteristics during assay validation is the cut point, which serves as a threshold between positive and negative samples. To precisely determine the cut point, a sufficiently large data set is often needed. However, there is no guideline other than some rule-of-thumb recommendations for sample size requirement in immunoassays. In this article, we propose a systematic approach to sample size determination for immunoassays and provide tables that facilitate its applications by scientists.

  13. Computer program for the calculation of grain size statistics by the method of moments

    USGS Publications Warehouse

    Sawyer, Michael B.

    1977-01-01

    A computer program is presented for a Hewlett-Packard Model 9830A desk-top calculator (1) which calculates statistics using weight or point count data from a grain-size analysis. The program uses the method of moments in contrast to the more commonly used but less inclusive graphic method of Folk and Ward (1957). The merits of the program are: (1) it is rapid; (2) it can accept data in either grouped or ungrouped format; (3) it allows direct comparison with grain-size data in the literature that have been calculated by the method of moments; (4) it utilizes all of the original data rather than percentiles from the cumulative curve as in the approximation technique used by the graphic method; (5) it is written in the computer language BASIC, which is easily modified and adapted to a wide variety of computers; and (6) when used in the HP-9830A, it does not require punching of data cards. The method of moments should be used only if the entire sample has been measured and the worker defines the measured grain-size range. (1) Use of brand names in this paper does not imply endorsement of these products by the U.S. Geological Survey.
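
    The moment calculations themselves are simple enough to reproduce in a few lines. The sketch below computes the mean, sorting (standard deviation), skewness, and kurtosis from grouped weight-percent data on phi midpoints using the standard moment definitions; the sieve classes in the example are invented for illustration, and the code is not a translation of the original BASIC program.

```python
def moment_statistics(phi_midpoints, weight_percents):
    """Mean, standard deviation (sorting), skewness, and kurtosis of a
    grain-size distribution by the method of moments, from grouped data."""
    total = sum(weight_percents)
    freqs = [w / total for w in weight_percents]
    mean = sum(f * m for f, m in zip(freqs, phi_midpoints))
    var = sum(f * (m - mean) ** 2 for f, m in zip(freqs, phi_midpoints))
    sd = var ** 0.5
    skew = sum(f * (m - mean) ** 3 for f, m in zip(freqs, phi_midpoints)) / sd ** 3
    kurt = sum(f * (m - mean) ** 4 for f, m in zip(freqs, phi_midpoints)) / sd ** 4
    return mean, sd, skew, kurt

# Illustrative sieve classes: phi midpoints and weight percent retained in each class
print(moment_statistics([-1.5, -0.5, 0.5, 1.5, 2.5, 3.5],
                        [2.0, 10.0, 30.0, 35.0, 18.0, 5.0]))
```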

  14. Element enrichment factor calculation using grain-size distribution and functional data regression.

    PubMed

    Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R

    2015-01-01

    In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. Detecting Neuroimaging Biomarkers for Psychiatric Disorders: Sample Size Matters

    PubMed Central

    Schnack, Hugo G.; Kahn, René S.

    2016-01-01

    In a recent review, it was suggested that much larger cohorts are needed to prove the diagnostic value of neuroimaging biomarkers in psychiatry. While within a sample, an increase of diagnostic accuracy of schizophrenia (SZ) with number of subjects (N) has been shown, the relationship between N and accuracy is completely different between studies. Using data from a recent meta-analysis of machine learning (ML) in imaging SZ, we found that while low-N studies can reach 90% and higher accuracy, above N/2 = 50 the maximum accuracy achieved steadily drops to below 70% for N/2 > 150. We investigate the role N plays in the wide variability in accuracy results in SZ studies (63–97%). We hypothesize that the underlying cause of the decrease in accuracy with increasing N is sample heterogeneity. While smaller studies more easily include a homogeneous group of subjects (strict inclusion criteria are easily met; subjects live close to study site), larger studies inevitably need to relax the criteria/recruit from large geographic areas. A SZ prediction model based on a heterogeneous group of patients with presumably a heterogeneous pattern of structural or functional brain changes will not be able to capture the whole variety of changes, thus being limited to patterns shared by most patients. In addition to heterogeneity (sample size), we investigate other factors influencing accuracy and introduce a ML effect size. We derive a simple model of how the different factors, such as sample heterogeneity and study setup determine this ML effect size, and explain the variation in prediction accuracies found from the literature, both in cross-validation and independent sample testing. From this, we argue that smaller-N studies may reach high prediction accuracy at the cost of lower generalizability to other samples. Higher-N studies, on the other hand, will have more generalization power, but at the cost of lower accuracy. In conclusion, when comparing results from different

  16. Effects of sample size on KERNEL home range estimates

    USGS Publications Warehouse

    Seaman, D.E.; Millspaugh, J.J.; Kernohan, Brian J.; Brundige, Gary C.; Raedeke, Kenneth J.; Gitzen, Robert A.

    1999-01-01

    Kernel methods for estimating home range are being used increasingly in wildlife research, but the effect of sample size on their accuracy is not known. We used computer simulations of 10-200 points/home range and compared accuracy of home range estimates produced by fixed and adaptive kernels with the reference (REF) and least-squares cross-validation (LSCV) methods for determining the amount of smoothing. Simulated home ranges varied from simple to complex shapes created by mixing bivariate normal distributions. We used the size of the 95% home range area and the relative mean squared error of the surface fit to assess the accuracy of the kernel home range estimates. For both measures, the bias and variance approached an asymptote at about 50 observations/home range. The fixed kernel with smoothing selected by LSCV provided the least-biased estimates of the 95% home range area. All kernel methods produced similar surface fit for most simulations, but the fixed kernel with LSCV had the lowest frequency and magnitude of very poor estimates. We reviewed 101 papers published in The Journal of Wildlife Management (JWM) between 1980 and 1997 that estimated animal home ranges. A minority of these papers used nonparametric utilization distribution (UD) estimators, and most did not adequately report sample sizes. We recommend that home range studies using kernel estimates use LSCV to determine the amount of smoothing, obtain a minimum of 30 observations per animal (but preferably ≥50), and report sample sizes in published results.

  17. (Sample) Size Matters: Defining Error in Planktic Foraminiferal Isotope Measurement

    NASA Astrophysics Data System (ADS)

    Lowery, C.; Fraass, A. J.

    2015-12-01

    Planktic foraminifera have been used as carriers of stable isotopic signals since the pioneering work of Urey and Emiliani. In those heady days, instrumental limitations required hundreds of individual foraminiferal tests to return a usable value. This had the fortunate side-effect of smoothing any seasonal to decadal changes within the planktic foram population, which generally turns over monthly, removing that potential noise from each sample. With the advent of more sensitive mass spectrometers, smaller sample sizes have now become standard. This has been a tremendous advantage, allowing longer time series with the same investment of time and energy. Unfortunately, the use of smaller numbers of individuals to generate a data point has lessened the amount of time averaging in the isotopic analysis and decreased precision in paleoceanographic datasets. With fewer individuals per sample, the differences between individual specimens will result in larger variation, and therefore error, and less precise values for each sample. Unfortunately, most workers (the authors included) do not make a habit of reporting the error associated with their sample size. We have created an open-source model in R to quantify the effect of sample sizes under various realistic and highly modifiable parameters (calcification depth, diagenesis in a subset of the population, improper identification, vital effects, mass, etc.). For example, a sample in which only 1 in 10 specimens is diagenetically altered can be off by >0.3‰ δ18O VPDB or ~1°C. Additionally, and perhaps more importantly, we show that under unrealistically ideal conditions (perfect preservation, etc.) it takes ~5 individuals from the mixed-layer to achieve an error of less than 0.1‰. Including just the unavoidable vital effects inflates that number to ~10 individuals to achieve ~0.1‰. Combining these errors with the typical machine error inherent in mass spectrometers makes this a vital consideration moving forward.

  18. Rock sampling. [method for controlling particle size distribution

    NASA Technical Reports Server (NTRS)

    Blum, P. (Inventor)

    1971-01-01

    A method for sampling rock and other brittle materials and for controlling resultant particle sizes is described. The method involves cutting grooves in the rock surface to provide a grouping of parallel ridges and subsequently machining the ridges to provide a powder specimen. The machining step may comprise milling, drilling, lathe cutting or the like; but a planing step is advantageous. Control of the particle size distribution is effected primarily by changing the height and width of these ridges. This control exceeds that obtainable by conventional grinding.

  19. Determining Minimum Sample Sizes for Multiple Regression Grade Prediction Equations for Colleges. Research Report No. 83.

    ERIC Educational Resources Information Center

    Sawyer, Richard

    The American College Testing (ACT) Program offers research services through which colleges can predict the freshman grades of their future students. This paper describes research done to establish a minimum sample size requirement for calculating least-squares prediction equations for college freshman grade average. Prediction equations were…

  20. Sample size and power considerations in network meta-analysis

    PubMed Central

    2012-01-01

    Background Network meta-analysis is becoming increasingly popular for establishing comparative effectiveness among multiple interventions for the same disease. Network meta-analysis inherits all methodological challenges of standard pairwise meta-analysis, but with increased complexity due to the multitude of intervention comparisons. One issue that is now widely recognized in pairwise meta-analysis is the issue of sample size and statistical power. This issue, however, has so far only received little attention in network meta-analysis. To date, no approaches have been proposed for evaluating the adequacy of the sample size, and thus power, in a treatment network. Findings In this article, we develop easy-to-use flexible methods for estimating the ‘effective sample size’ in indirect comparison meta-analysis and network meta-analysis. The effective sample size for a particular treatment comparison can be interpreted as the number of patients in a pairwise meta-analysis that would provide the same degree and strength of evidence as that which is provided in the indirect comparison or network meta-analysis. We further develop methods for retrospectively estimating the statistical power for each comparison in a network meta-analysis. We illustrate the performance of the proposed methods for estimating effective sample size and statistical power using data from a network meta-analysis on interventions for smoking cessation including over 100 trials. Conclusion The proposed methods are easy to use and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates. PMID:22992327
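    As a hedged sketch of the idea, one commonly cited heuristic treats the effective sample size of an indirect comparison A versus B through a common comparator C as a harmonic-style combination of the two direct-comparison sample sizes; the function below is offered in that spirit and is not necessarily the exact expression derived in the article.

```r
# Hedged sketch: a commonly cited heuristic for the effective sample size of an
# indirect comparison A vs B through a common comparator C, where nAC and nBC
# are the total numbers of patients in the two direct comparisons. This is an
# approximation in the spirit of the article, not a verbatim reproduction.
effective_n_indirect <- function(nAC, nBC) nAC * nBC / (nAC + nBC)

effective_n_indirect(nAC = 1200, nBC = 800)   # ~480 "effective" patients
```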

  1. Air sampling filtration media: Collection efficiency for respirable size-selective sampling.

    PubMed

    Soo, Jhy-Charm; Monaghan, Keenan; Lee, Taekhee; Kashon, Mike; Harper, Martin

    2016-01-01

    The collection efficiencies of commonly used membrane air sampling filters in the ultrafine particle size range were investigated. Mixed cellulose ester (MCE; 0.45, 0.8, 1.2, and 5 μm pore sizes), polycarbonate (0.4, 0.8, 2, and 5 μm pore sizes), polytetrafluoroethylene (PTFE; 0.45, 1, 2, and 5 μm pore sizes), polyvinyl chloride (PVC; 0.8 and 5 μm pore sizes), and silver membrane (0.45, 0.8, 1.2, and 5 μm pore sizes) filters were exposed to polydisperse sodium chloride (NaCl) particles in the size range of 10-400 nm. Test aerosols were nebulized and introduced into a calm air chamber through a diffusion dryer and aerosol neutralizer. The testing filters (37 mm diameter) were mounted in a conductive polypropylene filter-holder (cassette) within a metal testing tube. The experiments were conducted at flow rates between 1.7 and 11.2 l min(-1). The particle size distributions of NaCl challenge aerosol were measured upstream and downstream of the test filters by a scanning mobility particle sizer (SMPS). Three different filters of each type with at least three repetitions for each pore size were tested. In general, the collection efficiency varied with airflow, pore size, and sampling duration. In addition, both collection efficiency and pressure drop increased with decreased pore size and increased sampling flow rate, but they differed among filter types and manufacturer. The present study confirmed that the MCE, PTFE, and PVC filters have a relatively high collection efficiency for challenge particles much smaller than their nominal pore size and are considerably more efficient than polycarbonate and silver membrane filters, especially at larger nominal pore sizes.
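    The efficiency reported in such experiments is essentially one minus the ratio of downstream to upstream concentration in each size bin; a minimal sketch with made-up numbers (not the study's measurements):

```r
# Illustrative calculation of size-resolved collection efficiency from paired
# upstream/downstream number concentrations (values are made up, not the
# study's measurements).
upstream   <- c(`20nm` = 5200, `50nm` = 8900, `100nm` = 6100, `300nm` = 900)
downstream <- c(`20nm` =  310, `50nm` = 1050, `100nm` =  980, `300nm` = 220)

efficiency <- 1 - downstream / upstream     # fraction collected, by size bin
round(100 * efficiency, 1)                  # percent collection efficiency
```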

  2. Air sampling filtration media: Collection efficiency for respirable size-selective sampling

    PubMed Central

    Soo, Jhy-Charm; Monaghan, Keenan; Lee, Taekhee; Kashon, Mike; Harper, Martin

    2016-01-01

    The collection efficiencies of commonly used membrane air sampling filters in the ultrafine particle size range were investigated. Mixed cellulose ester (MCE; 0.45, 0.8, 1.2, and 5 μm pore sizes), polycarbonate (0.4, 0.8, 2, and 5 μm pore sizes), polytetrafluoroethylene (PTFE; 0.45, 1, 2, and 5 μm pore sizes), polyvinyl chloride (PVC; 0.8 and 5 μm pore sizes), and silver membrane (0.45, 0.8, 1.2, and 5 μm pore sizes) filters were exposed to polydisperse sodium chloride (NaCl) particles in the size range of 10–400 nm. Test aerosols were nebulized and introduced into a calm air chamber through a diffusion dryer and aerosol neutralizer. The testing filters (37 mm diameter) were mounted in a conductive polypropylene filter-holder (cassette) within a metal testing tube. The experiments were conducted at flow rates between 1.7 and 11.2 l min−1. The particle size distributions of NaCl challenge aerosol were measured upstream and downstream of the test filters by a scanning mobility particle sizer (SMPS). Three different filters of each type with at least three repetitions for each pore size were tested. In general, the collection efficiency varied with airflow, pore size, and sampling duration. In addition, both collection efficiency and pressure drop increased with decreased pore size and increased sampling flow rate, but they differed among filter types and manufacturer. The present study confirmed that the MCE, PTFE, and PVC filters have a relatively high collection efficiency for challenge particles much smaller than their nominal pore size and are considerably more efficient than polycarbonate and silver membrane filters, especially at larger nominal pore sizes. PMID:26834310

  3. Sample size re-estimation for survival data in clinical trials with an adaptive design.

    PubMed

    Togo, Kanae; Iwasaki, Manabu

    2011-01-01

    In clinical trials with survival data, investigators may wish to re-estimate the sample size based on the observed effect size while the trial is ongoing. Besides the inflation of the type-I error rate due to sample size re-estimation, the method for calculating the sample size in an interim analysis should be carefully considered because the data in each stage are mutually dependent in trials with survival data. Although the interim hazard estimate is commonly used to re-estimate the sample size, the estimate can sometimes be considerably higher or lower than the hypothesized hazard by chance. We propose an interim hazard ratio estimate that can be used to re-estimate the sample size under those circumstances. The proposed method was demonstrated through a simulation study and an actual clinical trial as an example. The effect of the shape parameter for the Weibull survival distribution on the sample size re-estimation is presented. Copyright © 2010 John Wiley & Sons, Ltd.
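    The re-estimation step builds on the standard relationship between the targeted hazard ratio and the required number of events; the sketch below is the generic Schoenfeld approximation for a 1:1 two-arm log-rank comparison, not the authors' specific interim procedure.

```r
# Required number of events for a two-arm 1:1 log-rank test (Schoenfeld's
# approximation). Plugging an interim hazard-ratio estimate in place of the
# hypothesised one is the kind of update discussed above; this sketch is the
# generic formula only.
events_needed <- function(hr, alpha = 0.05, power = 0.8) {
  4 * (qnorm(1 - alpha / 2) + qnorm(power))^2 / log(hr)^2
}

events_needed(hr = 0.70)   # planned effect
events_needed(hr = 0.80)   # weaker interim estimate -> many more events
```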

  4. Speeding Up Non-Parametric Bootstrap Computations for Statistics Based on Sample Moments in Small/Moderate Sample Size Applications

    PubMed Central

    Chaibub Neto, Elias

    2015-01-01

    In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson’s sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling. PMID:26125965
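    A toy R illustration of the multinomial-weight formulation for the sample mean (not the authors' package code); a single matrix product replaces the usual resampling loop.

```r
# Toy illustration of the multinomial-weight formulation of the non-parametric
# bootstrap for the sample mean (not the authors' implementation).
set.seed(7)
x <- rnorm(100)           # observed data
B <- 10000                # bootstrap replications

# Each column of W holds multinomial counts summing to n; dividing by n gives
# resampling weights, and one matrix product yields all bootstrap means.
n <- length(x)
W <- rmultinom(B, size = n, prob = rep(1 / n, n)) / n   # n x B weight matrix
boot_means <- as.vector(x %*% W)                        # 1 x B -> vector

# Compare with a conventional resampling loop.
loop_means <- replicate(B, mean(sample(x, replace = TRUE)))
c(vectorized = sd(boot_means), loop = sd(loop_means))
```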

  5. Speeding Up Non-Parametric Bootstrap Computations for Statistics Based on Sample Moments in Small/Moderate Sample Size Applications.

    PubMed

    Chaibub Neto, Elias

    2015-01-01

    In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson's sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling.

  6. The endothelial sample size analysis in corneal specular microscopy clinical examinations.

    PubMed

    Abib, Fernando C; Holzchuh, Ricardo; Schaefer, Artur; Schaefer, Tania; Godois, Ronialci

    2012-05-01

    To evaluate endothelial cell sample size and statistical error in corneal specular microscopy (CSM) examinations. One hundred twenty examinations were conducted with 4 types of corneal specular microscopes: 30 with each BioOptics, CSO, Konan, and Topcon corneal specular microscopes. All endothelial image data were analyzed by respective instrument software and also by the Cells Analyzer software with a method developed in our lab. A reliability degree (RD) of 95% and a relative error (RE) of 0.05 were used as cut-off values to analyze images of the counted endothelial cells called samples. The sample size mean was the number of cells evaluated on the images obtained with each device. Only examinations with RE < 0.05 were considered statistically correct and suitable for comparisons with future examinations. The Cells Analyzer software was used to calculate the RE and customized sample size for all examinations. Bio-Optics: sample size, 97 ± 22 cells; RE, 6.52 ± 0.86; only 10% of the examinations had sufficient endothelial cell quantity (RE < 0.05); customized sample size, 162 ± 34 cells. CSO: sample size, 110 ± 20 cells; RE, 5.98 ± 0.98; only 16.6% of the examinations had sufficient endothelial cell quantity (RE < 0.05); customized sample size, 157 ± 45 cells. Konan: sample size, 80 ± 27 cells; RE, 10.6 ± 3.67; none of the examinations had sufficient endothelial cell quantity (RE > 0.05); customized sample size, 336 ± 131 cells. Topcon: sample size, 87 ± 17 cells; RE, 10.1 ± 2.52; none of the examinations had sufficient endothelial cell quantity (RE > 0.05); customized sample size, 382 ± 159 cells. A very high number of CSM examinations had sample errors based on Cells Analyzer software. The endothelial sample size (examinations) needs to include more cells to be reliable and reproducible. The Cells Analyzer tutorial routine will be useful for CSM examination reliability and reproducibility.

  7. Sample size determination for testing equality in a cluster randomized trial with noncompliance.

    PubMed

    Lui, Kung-Jong; Chang, Kuang-Chao

    2011-01-01

    For administrative convenience or cost efficiency, we may often employ a cluster randomized trial (CRT), in which randomized units are clusters of patients rather than individual patients. Furthermore, because of ethical reasons or patient's decision, it is not uncommon to encounter data in which there are patients not complying with their assigned treatments. Thus, the development of a sample size calculation procedure for a CRT with noncompliance is important and useful in practice. Under the exclusion restriction model, we have developed an asymptotic test procedure using a tanh(-1)(x) transformation for testing equality between two treatments among compliers for a CRT with noncompliance. We have further derived a sample size formula accounting for both noncompliance and the intraclass correlation for a desired power 1 - β at a nominal α level. We have employed Monte Carlo simulation to evaluate the finite-sample performance of the proposed test procedure with respect to type I error and the accuracy of the derived sample size calculation formula with respect to power in a variety of situations. Finally, we use the data taken from a CRT studying vitamin A supplementation to reduce mortality among preschool children to illustrate the use of sample size calculation proposed here.
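    A rough sketch of how the intraclass correlation and noncompliance enter such a calculation: inflate an individually randomized two-proportion sample size by the usual design effect and then, crudely, by the squared net compliance proportion. This is a generic approximation, not the authors' tanh(-1)-based formula.

```r
# Rough sketch (assumptions, not the authors' procedure): inflate an
# individually randomised two-proportion sample size by the cluster design
# effect, then crudely by the squared net compliance proportion.
n_crt <- function(p1, p2, m, icc, compliance = 1,
                  alpha = 0.05, power = 0.8) {
  n_ind <- power.prop.test(p1 = p1, p2 = p2,
                           sig.level = alpha, power = power)$n  # per arm
  deff  <- 1 + (m - 1) * icc            # design effect for cluster size m
  ceiling(n_ind * deff / compliance^2)  # crude dilution adjustment
}

n_crt(p1 = 0.15, p2 = 0.10, m = 20, icc = 0.02, compliance = 0.85)
```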

  8. Power and sample size in cost-effectiveness analysis.

    PubMed

    Laska, E M; Meisner, M; Siegel, C

    1999-01-01

    For resource allocation under a constrained budget, optimal decision rules for mutually exclusive programs require that the treatment with the highest incremental cost-effectiveness ratio (ICER) below a willingness-to-pay (WTP) criterion be funded. This is equivalent to determining the treatment with the smallest net health cost. The designer of a cost-effectiveness study needs to select a sample size so that the power to reject the null hypothesis, the equality of the net health costs of two treatments, is high. A recently published formula derived under normal distribution theory overstates sample-size requirements. Using net health costs, the authors present simple methods for power analysis based on conventional normal and on nonparametric statistical theory.
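    A conventional normal-theory version of the idea, phrased in terms of net monetary benefit (NMB = λE − C), is sketched below; the inputs and the formula are the generic textbook ones, not necessarily the authors' exact expressions.

```r
# Hedged sketch: per-arm sample size for detecting a difference in net monetary
# benefit, NMB = lambda * E - C, using a two-sample normal approximation.
# sd_e, sd_c and rho are the per-patient SDs of effects and costs and their
# correlation; all inputs here are illustrative assumptions.
n_cea <- function(lambda, d_effect, d_cost, sd_e, sd_c, rho,
                  alpha = 0.05, power = 0.8) {
  var_nmb <- lambda^2 * sd_e^2 + sd_c^2 - 2 * lambda * rho * sd_e * sd_c
  z <- qnorm(1 - alpha / 2) + qnorm(power)
  ceiling(2 * z^2 * var_nmb / (lambda * d_effect - d_cost)^2)
}

n_cea(lambda = 50000, d_effect = 0.05, d_cost = 1000,
      sd_e = 0.2, sd_c = 4000, rho = 0.1)
```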

  9. Estimation of sample size and testing power (Part 3).

    PubMed

    Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

    2011-12-01

    This article introduces the definition and sample size estimation of three special tests (namely, the non-inferiority test, equivalence test and superiority test) for qualitative data with the design of one factor with two levels having a binary response variable. A non-inferiority test refers to a research design whose objective is to verify that the efficacy of the experimental drug is not clinically inferior to that of the positive control drug. An equivalence test refers to a research design whose objective is to verify that the experimental drug and the control drug have clinically equivalent efficacy. A superiority test refers to a research design whose objective is to verify that the efficacy of the experimental drug is clinically superior to that of the control drug. Using specific examples, this article introduces formulas for sample size estimation for the three special tests, and their SAS realization in detail.
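    A generic normal-approximation version of the non-inferiority case is sketched below for orientation; the article's own formulas and SAS code may differ in detail.

```r
# Generic normal-approximation sample size per group for a non-inferiority
# test of two proportions (one-sided alpha), with margin 'delta' > 0 meaning
# "the experimental arm may be at most delta worse". Sketch only; the
# article's SAS-based formulas may differ in detail.
n_noninferiority <- function(p_trt, p_ctl, delta, alpha = 0.025, power = 0.8) {
  z <- qnorm(1 - alpha) + qnorm(power)
  num <- p_trt * (1 - p_trt) + p_ctl * (1 - p_ctl)
  ceiling(z^2 * num / (p_trt - p_ctl + delta)^2)
}

n_noninferiority(p_trt = 0.85, p_ctl = 0.85, delta = 0.10)  # per-group n
```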

  10. Estimation of sample size and testing power (part 6).

    PubMed

    Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

    2012-03-01

    The design of one factor with k levels (k ≥ 3) refers to research that involves only one experimental factor with k levels (k ≥ 3), with no arrangement of other important non-experimental factors. This paper introduces the estimation of sample size and testing power for quantitative data and for qualitative data having a binary response variable under the design of one factor with k levels (k ≥ 3).
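    For the quantitative-data case, base R's power.anova.test performs the same kind of one-way, k-level calculation; the example below is a generic illustration with assumed means and within-group variance, not the article's own formulas.

```r
# Generic one-way ANOVA sample-size calculation for k = 3 groups using base R
# (illustration only; the article presents its own formulas and SAS code).
power.anova.test(groups = 3,
                 between.var = var(c(10, 12, 14)),  # assumed group means
                 within.var  = 16,                  # assumed within-group var
                 sig.level   = 0.05,
                 power       = 0.80)                # returns n per group
```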

  11. Tooth Wear Prevalence and Sample Size Determination: A Pilot Study

    PubMed Central

    Abd. Karim, Nama Bibi Saerah; Ismail, Noorliza Mastura; Naing, Lin; Ismail, Abdul Rashid

    2008-01-01

    Tooth wear is the non-carious loss of tooth tissue, which results from three processes, namely attrition, erosion and abrasion. These can occur in isolation or simultaneously. Very mild tooth wear is a physiological effect of aging. This study aims to estimate the prevalence of tooth wear among 16-year-old Malay school children and determine a feasible sample size for further study. Fifty-five subjects were examined clinically, followed by the completion of self-administered questionnaires. Questionnaires consisted of socio-demographic and associated variables for tooth wear obtained from the literature. The Smith and Knight tooth wear index was used to chart tooth wear. Other oral findings were recorded using the WHO criteria. A software programme was used to determine pathological tooth wear. An approximately equal ratio of males to females was involved. It was found that 18.2% of subjects had no tooth wear, 63.6% had very mild tooth wear, 10.9% mild tooth wear, 5.5% moderate tooth wear and 1.8% severe tooth wear. In conclusion, 18.2% of subjects were deemed to have pathological tooth wear (mild, moderate & severe). Exploration with all associated variables gave a sample size ranging from 560 to 1715. The final sample size for further study greatly depends on available time and resources. PMID:22589636
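    The sample sizes quoted above are of the kind produced by the standard single-proportion precision formula, n = z^2 p(1 − p)/d^2; a generic R version with illustrative inputs (the study's 560 to 1715 range came from evaluating several variables, not necessarily these inputs):

```r
# Standard single-proportion sample-size formula, n = z^2 p (1 - p) / d^2,
# shown as a generic illustration.
n_prevalence <- function(p, d, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  ceiling(z^2 * p * (1 - p) / d^2)
}

n_prevalence(p = 0.182, d = 0.03)   # e.g. estimated prevalence 18.2%, +/- 3%
```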

  12. Simple and multiple linear regression: sample size considerations.

    PubMed

    Hanley, James A

    2016-11-01

    The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
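    In the single-coefficient case, the closed-form variance referred to above is approximately Var(beta_hat) = sigma^2 / [n · Var(X) · (1 − R²_X)], which can be inverted to give the n needed for a target standard error; the sketch below is a hedged illustration of that reasoning, not a formula taken from the article.

```r
# Hedged sketch of the variance-based reasoning: choose n so the standard
# error of one regression coefficient reaches a target value, using
# Var(beta_hat) ~= sigma^2 / (n * var_x * (1 - R2x)), where R2x is the squared
# multiple correlation of X with the other covariates. Illustrative only.
n_for_se <- function(target_se, sigma, var_x, R2x = 0) {
  ceiling(sigma^2 / (target_se^2 * var_x * (1 - R2x)))
}

n_for_se(target_se = 0.5, sigma = 10, var_x = 4, R2x = 0.3)
```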

  13. Sample size estimation for time-dependent receiver operating characteristic.

    PubMed

    Li, H; Gatsonis, C

    2014-03-15

    In contrast to the usual ROC analysis with a contemporaneous reference standard, the time-dependent setting introduces the possibility that the reference standard refers to an event at a future time and may not be known for every patient due to censoring. The goal of this research is to determine the sample size required for a study design to address the question of the accuracy of a diagnostic test using the area under the curve in time-dependent ROC analysis. We adapt a previously published estimator of the time-dependent area under the ROC curve, which is a function of the expected conditional survival functions. This estimator accommodates censored data. The estimation of the required sample size is based on approximations of the expected conditional survival functions and their variances, derived under parametric assumptions of an exponential failure time and an exponential censoring time. We also consider different patient enrollment strategies. The proposed method can provide an adequate sample size to ensure that the test's accuracy is estimated to a prespecified precision. We present results of a simulation study to assess the accuracy of the method and its robustness to departures from the parametric assumptions. We apply the proposed method to design of a study of positron emission tomography as predictor of disease free survival in women undergoing therapy for cervical cancer. Copyright © 2013 John Wiley & Sons, Ltd.

  14. New shooting algorithms for transition path sampling: centering moves and varied-perturbation sizes for improved sampling.

    PubMed

    Rowley, Christopher N; Woo, Tom K

    2009-12-21

    Transition path sampling has been established as a powerful tool for studying the dynamics of rare events. The trajectory generation moves of this Monte Carlo procedure, shooting moves and shifting moves, were developed primarily for rate constant calculations, although this method has been more extensively used to study the dynamics of reactive processes. We have devised and implemented three alternative trajectory generation moves for use with transition path sampling. The centering-shooting move incorporates a shifting move into a shooting move, which centers the transition period in the middle of the trajectory, eliminating the need for shifting moves and generating an ensemble where the transition event consistently occurs near the middle of the trajectory. We have also developed varied-perturbation size shooting moves, wherein smaller perturbations are made if the shooting point is far from the transition event. The trajectories generated using these moves decorrelate significantly faster than with conventional, constant-sized perturbations. This results in an increase in the statistical efficiency by a factor of 2.5-5 when compared to the conventional shooting algorithm. On the other hand, the new algorithm breaks detailed balance and introduces a small bias in the transition time distribution. We have developed a modification of this varied-perturbation size shooting algorithm that preserves detailed balance, albeit at the cost of decreased sampling efficiency. Both varied-perturbation size shooting algorithms are found to have improved sampling efficiency when compared to the original constant perturbation size shooting algorithm.

  15. Size-selective sampling of particulates using a physiologic sampling pump.

    PubMed

    Lee, Larry A; Lee, Eun Gyung; Lee, Taekhee; Kim, Seung Won; Slaven, James E; Harper, Martin

    2011-03-01

    Recent laboratory research indicates physiologic sampling of gas and vapor may provide more representative estimates of personal exposures than traditional methods. Modifications to the physiologic sampling pump (PSP) used in that research are described which extend its usefulness to size-selective sampling of particulates. PSPs used in previous research varied motor speed to keep sampling proportional to the subject's inhalation. This caused airflow and particle velocities through the collection device to continually change making those pumps unsuitable for sampling particulates. The modified implementation of the PSP pulls a constant airflow into and through a cyclone, then uses valves to either direct the airflow through, or divert the airflow around, the sampling filter. By using physiologic inputs to regulate the fraction of each second that air flows through the sampling filter, samples may be collected in proportion to inhalation rate. To evaluate the performance of a functional prototype 5 different sizes of monodisperse aerosols of ammonium fluorescein were generated by a vibrating orifice aerosol generator and introduced into a calm air chamber. To simulate different inhalation rates the valves of the PSP were energized using 9 different duty cycles. Efficiency curves are presented and compared to a standard respirable convention by bias mapping. The performance of the modified cyclone used in the PSP sampling head compared favorably with a commercially available cyclone of the same model, operating at a constant airflow (± 10% over almost all the size distributions of concern). The new method makes physiologic sampling of the respirable fraction of particulates feasible.

  16. Blinded sample size recalculation for clinical trials with normal data and baseline adjusted analysis.

    PubMed

    Friede, Tim; Kieser, Meinhard

    2011-01-01

    Baseline adjusted analyses are commonly encountered in practice, and regulatory guidelines endorse this practice. Sample size calculations for this kind of analyses require knowledge of the magnitude of nuisance parameters that are usually not given when the results of clinical trials are reported in the literature. It is therefore quite natural to start with a preliminary calculated sample size based on the sparse information available in the planning phase and to re-estimate the value of the nuisance parameters (and with it the sample size) when a portion of the planned number of patients have completed the study. We investigate the characteristics of this internal pilot study design when an analysis of covariance with normally distributed outcome and one random covariate is applied. For this purpose we first assess the accuracy of four approximate sample size formulae within the fixed sample size design. Then the performance of the recalculation procedure with respect to its actual Type I error rate and power characteristics is examined. The results of simulation studies show that this approach has favorable properties with respect to the Type I error rate and power. Together with its simplicity, these features should make it attractive for practical application.
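    A hedged sketch of the internal pilot idea in its simplest form: pool the interim outcomes without unblinding, use their one-sample variance as the nuisance-parameter estimate, and recompute a two-sample t-test sample size. This generic version omits the baseline (ANCOVA) adjustment that the article actually studies.

```r
# Hedged sketch of blinded sample size re-estimation with an internal pilot:
# the interim outcomes are pooled without unblinding, their one-sample variance
# is used as the nuisance-parameter estimate, and the sample size is
# recomputed for the originally targeted mean difference. Generic illustration
# only; it omits the baseline (ANCOVA) adjustment discussed in the article.
reestimate_n <- function(interim_outcomes, delta, alpha = 0.05, power = 0.8) {
  s_blinded <- sd(interim_outcomes)          # blinded (one-sample) SD
  power.t.test(delta = delta, sd = s_blinded,
               sig.level = alpha, power = power)$n   # per group
}

set.seed(11)
interim <- rnorm(40, mean = 0.3, sd = 1.1)   # pooled interim data (both arms)
ceiling(reestimate_n(interim, delta = 0.5))
```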

  17. SAMPL5: 3D-RISM partition coefficient calculations with partial molar volume corrections and solute conformational sampling

    NASA Astrophysics Data System (ADS)

    Luchko, Tyler; Blinov, Nikolay; Limon, Garrett C.; Joyce, Kevin P.; Kovalenko, Andriy

    2016-11-01

    Implicit solvent methods for classical molecular modeling are frequently used to provide fast, physics-based hydration free energies of macromolecules. Less commonly considered is the transferability of these methods to other solvents. The Statistical Assessment of Modeling of Proteins and Ligands 5 (SAMPL5) distribution coefficient dataset and the accompanying explicit solvent partition coefficient reference calculations provide a direct test of solvent model transferability. Here we use the 3D reference interaction site model (3D-RISM) statistical-mechanical solvation theory, with a well tested water model and a new united atom cyclohexane model, to calculate partition coefficients for the SAMPL5 dataset. The cyclohexane model performed well in training and testing (R=0.98 for amino acid neutral side chain analogues) but only if a parameterized solvation free energy correction was used. In contrast, the same protocol, using single solute conformations, performed poorly on the SAMPL5 dataset, obtaining R=0.73 compared to the reference partition coefficients, likely due to the much larger solute sizes. Including solute conformational sampling through molecular dynamics coupled with 3D-RISM (MD/3D-RISM) improved agreement with the reference calculation to R=0.93. Since our initial calculations only considered partition coefficients and not distribution coefficients, solute sampling provided little benefit comparing against experiment, where ionized and tautomer states are more important. Applying a simple pK_a correction improved agreement with experiment from R=0.54 to R=0.66, despite a small number of outliers. Better agreement is possible by accounting for tautomers and improving the ionization correction.

  18. SAMPL5: 3D-RISM partition coefficient calculations with partial molar volume corrections and solute conformational sampling.

    PubMed

    Luchko, Tyler; Blinov, Nikolay; Limon, Garrett C; Joyce, Kevin P; Kovalenko, Andriy

    2016-11-01

    Implicit solvent methods for classical molecular modeling are frequently used to provide fast, physics-based hydration free energies of macromolecules. Less commonly considered is the transferability of these methods to other solvents. The Statistical Assessment of Modeling of Proteins and Ligands 5 (SAMPL5) distribution coefficient dataset and the accompanying explicit solvent partition coefficient reference calculations provide a direct test of solvent model transferability. Here we use the 3D reference interaction site model (3D-RISM) statistical-mechanical solvation theory, with a well tested water model and a new united atom cyclohexane model, to calculate partition coefficients for the SAMPL5 dataset. The cyclohexane model performed well in training and testing (R=0.98 for amino acid neutral side chain analogues) but only if a parameterized solvation free energy correction was used. In contrast, the same protocol, using single solute conformations, performed poorly on the SAMPL5 dataset, obtaining R=0.73 compared to the reference partition coefficients, likely due to the much larger solute sizes. Including solute conformational sampling through molecular dynamics coupled with 3D-RISM (MD/3D-RISM) improved agreement with the reference calculation to R=0.93. Since our initial calculations only considered partition coefficients and not distribution coefficients, solute sampling provided little benefit comparing against experiment, where ionized and tautomer states are more important. Applying a simple pK_a correction improved agreement with experiment from R=0.54 to R=0.66, despite a small number of outliers. Better agreement is possible by accounting for tautomers and improving the ionization correction.

  19. Effect of Sampling Array Irregularity and Window Size on the Discrimination of Sampled Gratings

    PubMed Central

    Evans, David W.; Wang, Yizhong; Haggerty, Kevin M.; Thibos, Larry N.

    2009-01-01

    The effect of sampling irregularity and window size on orientation discrimination was investigated using discretely sampled gratings as stimuli. For regular sampling arrays, visual performance could be accounted for by a theoretical analysis of aliasing produced by undersampling. For irregular arrays produced by adding noise to the location of individual samples, the incidence of perceived orientation reversal declined and the spatial frequency range of flawless performance expanded well beyond the nominal Nyquist frequency. These results provide a psychophysical method to estimate the spatial density and the degree of irregularity in the neural sampling arrays that limit human visual resolution. PMID:19815023

  20. Proposed international conventions for particle size-selective sampling.

    PubMed

    Soderholm, S C

    1989-01-01

    Definitions are proposed for the inspirable (also called inhalable), thoracic and respirable fractions of airborne particles. Each definition is expressed as a sampling efficiency (S) which is a function of particle aerodynamic diameter (d) and specifies the fraction of the ambient concentration of airborne particles collected by an ideal sampler. For the inspirable fraction, SI(d) = 0.5(1 + e^(-0.06d)). For the thoracic fraction, ST(d) = SI(d)[1 - F(x)], where x = ln(d/Γ)/ln(Σ), with Γ = 11.64 μm and Σ = 1.5, and F(x) is the cumulative probability function of a standardized normal random variable. For the respirable fraction, SR(d) = SI(d)[1 - F(x)], with Γ = 4.25 μm and Σ = 1.5. International harmonization will require resolution of the differences between the firmly established BMRC [Orenstein, A. J. (1960) Proceedings of the Pneumoconiosis Conference, Johannesburg, 1959, pp. 610-621. A.J. Churchill Ltd, London] and ACGIH [(1985) Particle size-selective sampling in the workplace. Report of the ACGIH Technical Committee on Air Sampling Procedures] definitions of the respirable fraction. The proposed definition differs approximately equally from the BMRC and ACGIH definitions and is at least as defensible when compared to available human data. Several standard-setting organizations are in the process of adopting particle size-selective sampling conventions. Much confusion will be avoided if all adopt the same specifications of the collection efficiencies of ideal samplers, such as those proposed here.
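    The proposed conventions can be evaluated directly as efficiency curves; in the sketch below the thoracic parameters (Γ = 11.64 μm, Σ = 1.5) are the values commonly quoted for this convention and are stated here as an assumption.

```r
# Sketch of the proposed conventions as sampling-efficiency curves; the
# thoracic parameters (Gamma = 11.64 um, Sigma = 1.5) are the commonly quoted
# values and are an assumption here, not taken verbatim from the abstract.
S_inspirable <- function(d) 0.5 * (1 + exp(-0.06 * d))
S_fraction   <- function(d, Gamma, Sigma) {
  x <- log(d / Gamma) / log(Sigma)
  S_inspirable(d) * (1 - pnorm(x))
}

d <- c(1, 2, 4, 10, 25, 50)                       # aerodynamic diameter, um
data.frame(d_um       = d,
           inspirable = round(S_inspirable(d), 3),
           thoracic   = round(S_fraction(d, 11.64, 1.5), 3),
           respirable = round(S_fraction(d, 4.25, 1.5), 3))
```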

  1. Allocating Sample Sizes to Reduce Budget for Fixed-Effect 2×2 Heterogeneous Analysis of Variance

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2016-01-01

    This article discusses the sample size requirements for the interaction, row, and column effects, respectively, by forming a linear contrast for a 2×2 factorial design for fixed-effects heterogeneous analysis of variance. The proposed method uses the Welch t test and its corresponding degrees of freedom to calculate the final sample size in a…

  2. Allocating Sample Sizes to Reduce Budget for Fixed-Effect 2×2 Heterogeneous Analysis of Variance

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2016-01-01

    This article discusses the sample size requirements for the interaction, row, and column effects, respectively, by forming a linear contrast for a 2×2 factorial design for fixed-effects heterogeneous analysis of variance. The proposed method uses the Welch t test and its corresponding degrees of freedom to calculate the final sample size in a…

  3. GUIDE TO CALCULATING TRANSPORT EFFICIENCY OF AEROSOLS IN OCCUPATIONAL AIR SAMPLING SYSTEMS

    SciTech Connect

    Hogue, M.; Hadlock, D.; Thompson, M.; Farfan, E.

    2013-11-12

    This report will present hand calculations for transport efficiency based on aspiration efficiency and particle deposition losses. Because the hand calculations become long and tedious, especially for lognormal distributions of aerosols, an R script (R 2011) will be provided for each element examined. Calculations are provided for the most common elements in a remote air sampling system, including a thin-walled probe in ambient air, straight tubing, bends and a sample housing. One popular alternative approach would be to put such calculations in a spreadsheet, a thorough version of which is shared by Paul Baron via the Aerocalc spreadsheet (Baron 2012). To provide greater transparency and to avoid common spreadsheet vulnerabilities to errors (Burns 2012), this report uses R. The particle size is based on the concept of activity median aerodynamic diameter (AMAD). The AMAD is a particle size in an aerosol where fifty percent of the activity in the aerosol is associated with particles of aerodynamic diameter greater than the AMAD. This concept allows for the simplification of transport efficiency calculations where all particles are treated as spheres with the density of water (1g cm-3). In reality, particle densities depend on the actual material involved. Particle geometries can be very complicated. Dynamic shape factors are provided by Hinds (Hinds 1999). Some example factors are: 1.00 for a sphere, 1.08 for a cube, 1.68 for a long cylinder (10 times as long as it is wide), 1.05 to 1.11 for bituminous coal, 1.57 for sand and 1.88 for talc. Revision 1 is made to correct an error in the original version of this report. The particle distributions are based on activity weighting of particles rather than based on the number of particles of each size. Therefore, the mass correction made in the original version is removed from the text and the calculations. Results affected by the change are updated.

  4. Size Matters: FTIR Spectral Analysis of Apollo Regolith Samples Exhibits Grain Size Dependence.

    NASA Astrophysics Data System (ADS)

    Martin, Dayl; Joy, Katherine; Pernet-Fisher, John; Wogelius, Roy; Morlok, Andreas; Hiesinger, Harald

    2017-04-01

    The Mercury Thermal Infrared Spectrometer (MERTIS) on the upcoming BepiColombo mission is designed to analyse the surface of Mercury in thermal infrared wavelengths (7-14 μm) to investigate the physical properties of the surface materials [1]. Laboratory analyses of analogue materials are useful for investigating how various sample properties alter the resulting infrared spectrum. Laboratory FTIR analysis of Apollo fine (<1 mm) soil samples 14259,672, 15401,147, and 67481,96 has provided insight into how grain size, composition, maturity (i.e., exposure to space weathering processes), and proportion of glassy material affect their average infrared spectra. Each of these samples was analysed as a bulk sample and five size fractions: <25, 25-63, 63-125, 125-250, and <250 μm. Sample 14259,672 is a highly mature highlands regolith with a large proportion of agglutinates [2]. The high agglutinate content (>60%) causes a 'flattening' of the spectrum, with reflectance in the Reststrahlen Band (RB) region reduced by as much as 30% in comparison to samples that are dominated by a high proportion of crystalline material. Apollo 15401,147 is an immature regolith with a high proportion of volcanic glass pyroclastic beads [2]. The high mafic mineral content results in a systematic shift in the Christiansen Feature (CF - the point of lowest reflectance) to longer wavelength: 8.6 μm. The glass beads dominate the spectrum, displaying a broad peak around the main Si-O stretch band (at 10.8 μm). As such, individual mineral components of this sample cannot be resolved from the average spectrum alone. Apollo 67481,96 is a sub-mature regolith composed dominantly of anorthite plagioclase [2]. The CF position of the average spectrum is shifted to shorter wavelengths (8.2 μm) due to the higher proportion of felsic minerals. Its average spectrum is dominated by anorthite reflectance bands at 8.7, 9.1, 9.8, and 10.8 μm. The average reflectance is greater than the other samples due to

  5. Design and sample-size considerations in the detection of linkage disequilibrium with a disease locus

    SciTech Connect

    Olson, J.M.; Wijsman, E.M.

    1994-09-01

    The presence of linkage disequilibrium between closely linked loci can aid in the fine mapping of disease loci. The authors investigate the power of several designs for sampling individuals with different disease genotypes. As expected, haplotype data provide the greatest power for detecting disequilibrium, but, in the absence of parental information to resolve the phase of double heterozygotes, the most powerful design samples only individuals homozygous at the trait locus. For rare diseases, such a scheme is generally not feasible, and the authors also provide power and sample-size calculations for designs that sample heterozygotes. The results provide information useful in planning disequilibrium studies. 17 refs., 3 figs., 4 tabs.

  6. On the repeated measures designs and sample sizes for randomized controlled trials.

    PubMed

    Tango, Toshiro

    2016-04-01

    For the analysis of longitudinal or repeated measures data, generalized linear mixed-effects models provide a flexible and powerful tool to deal with heterogeneity among subject response profiles. However, the typical statistical design adopted in usual randomized controlled trials is an analysis of covariance type analysis using a pre-defined pair of "pre-post" data, in which pre-(baseline) data are used as a covariate for adjustment together with other covariates. Then, the major design issue is to calculate the sample size or the number of subjects allocated to each treatment group. In this paper, we propose a new repeated measures design and sample size calculations combined with generalized linear mixed-effects models that depend not only on the number of subjects but on the number of repeated measures before and after randomization per subject used for the analysis. The main advantages of the proposed design combined with the generalized linear mixed-effects models are (1) it can easily handle missing data by applying the likelihood-based ignorable analyses under the missing at random assumption and (2) it may lead to a reduction in sample size, compared with the simple pre-post design. The proposed designs and the sample size calculations are illustrated with real data arising from randomized controlled trials.

  7. A preliminary model to avoid the overestimation of sample size in bioequivalence studies.

    PubMed

    Ramírez, E; Abraira, V; Guerra, P; Borobia, A M; Duque, B; López, J L; Mosquera, B; Lubomirov, R; Carcas, A J; Frías, J

    2013-02-01

    Often the only data available in the literature for sample size estimation in bioequivalence studies is intersubject variability, which tends to result in overestimation of the sample size. In this paper, we propose a preliminary model of intrasubject variability based on intersubject variability for Cmax and AUC data from randomized, crossover bioequivalence (BE) studies. From 93 Cmax and 121 AUC data sets from test-reference comparisons that fulfilled BE criteria, we calculated intersubject variability for the reference formulation and intrasubject variability from ANOVA. Linear and exponential models (y = a(1 - e^(-bx))) were fitted, weighted by the inverse of the variance, to predict the intrasubject variability from the intersubject variability. To validate the model, we calculated the cross-validation coefficient using data from 30 new BE studies. The models fit very well (R2 = 0.997 and 0.990 for Cmax and AUC, respectively) and the cross-validation correlations were 0.847 for Cmax and 0.572 for AUC. This preliminary model allows us to estimate the intrasubject variability based on intersubject variability for sample size calculation purposes in BE studies. This approximation provides an opportunity for sample size reduction, avoiding unnecessary exposure of healthy volunteers. Further modelling studies are desirable to confirm these results, especially in the higher intersubject variability range.
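    The exponential model y = a(1 − e^(−bx)) described above can be fitted in R with nls and inverse-variance weights; the data, weights, and starting values below are made up for illustration and are not the study's 93/121 data sets.

```r
# Minimal sketch of fitting y = a * (1 - exp(-b * x)) with inverse-variance
# weights, as described above; x = intersubject variability, y = intrasubject
# variability. Data, weights and starting values are made up for illustration.
x <- c(10, 15, 20, 25, 30, 40, 50)           # intersubject variability (%)
y <- c(6,  8, 10, 12, 13, 15, 16)            # intrasubject variability (%)
w <- 1 / c(1.2, 1.0, 0.9, 1.1, 1.3, 1.5, 2)  # inverse of the variances

fit <- nls(y ~ a * (1 - exp(-b * x)),
           start = list(a = 20, b = 0.05),
           weights = w)
summary(fit)
predict(fit, newdata = data.frame(x = 35))   # predicted intrasubject value
```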

  8. Detecting spatial structures in throughfall data: the effect of extent, sample size, sampling design, and variogram estimation method

    NASA Astrophysics Data System (ADS)

    Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander

    2016-04-01

    In the last three decades, an increasing number of studies analyzed spatial patterns in throughfall to investigate the consequences of rainfall redistribution for biogeochemical and hydrological processes in forests. In the majority of cases, variograms were used to characterize the spatial properties of the throughfall data. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and an appropriate layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation methods on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with heavy outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling), and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates.
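    A method-of-moments (Matheron) empirical semivariogram can be computed in a few lines of base R; the sketch below simulates a small irregular sampling layout and bins squared half-differences by lag distance. It is an illustration only; the study itself used dedicated robust and residual-maximum-likelihood estimators as well.

```r
# Method-of-moments (Matheron) empirical semivariogram for an irregular set of
# throughfall sampling points; simulated, spatially smooth toy data.
set.seed(3)
n  <- 150
xy <- cbind(runif(n, 0, 50), runif(n, 0, 50))           # 50 m x 50 m plot
z  <- sin(xy[, 1] / 10) + cos(xy[, 2] / 15) + rnorm(n, 0, 0.3)

h     <- as.vector(dist(xy))                            # pairwise distances
gamma <- as.vector(dist(z))^2 / 2                       # semivariances
bins  <- cut(h, breaks = seq(0, 30, by = 3))            # lag classes to 30 m

emp_variogram <- data.frame(
  lag     = tapply(h,     bins, mean),
  semivar = tapply(gamma, bins, mean),
  pairs   = as.vector(table(bins)))
emp_variogram
```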

  9. Detecting spatial structures in throughfall data: The effect of extent, sample size, sampling design, and variogram estimation method

    NASA Astrophysics Data System (ADS)

    Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander

    2016-09-01

    In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. Given that most previous

  10. Influence of grain size on radionuclide activity concentrations and radiological hazard of building material samples.

    PubMed

    Elnobi, Sahar; Harb, S; Ahmed, N K

    2017-09-15

    The knowledge of radioactivity content in various radionuclides in building materials plays an important role in health physics; therefore, we measured the amount of naturally occurring radionuclides in building material (sand, granite, marble, and limestone) samples of different grain sizes by using NaI(Tl) and MCA1024 gamma-ray spectrometers. Data analyses were performed to determine (226)Ra, (232)Th, and (40)K activity concentrations. The results revealed an inverse relationship between activity concentration and grain size of the samples. The radium equivalent activity (Raeq), representative level index I, and annual absorbed dose rate were calculated. Copyright © 2017. Published by Elsevier Ltd.
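    The radium equivalent activity quoted above is a standard weighted sum of the three activity concentrations; the weighting factors below are the conventional ones, while the example activities are illustrative rather than the paper's measurements.

```r
# Radium equivalent activity (Bq/kg) from 226Ra, 232Th and 40K activity
# concentrations; the weighting factors are the conventional ones, the example
# activities below are illustrative only.
ra_eq <- function(A_Ra, A_Th, A_K) A_Ra + 1.43 * A_Th + 0.077 * A_K

ra_eq(A_Ra = 25, A_Th = 30, A_K = 400)   # e.g. a sand sample: 98.7 Bq/kg
```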

  11. How do respiratory state and measurement method affect bra size calculations?

    PubMed Central

    McGhee, D E; Steele, J R

    2006-01-01

    Objectives To investigate the effects of respiratory state and measurement method on bra size calculation. Methods The bra sizes of 16 large‐breasted women were measured during two respiratory states, end voluntary inspiration and relaxed voluntary expiration, and using two sizing methods, which were compared against subject‐reported bra sizes. Results Both respiratory state and measurement method significantly affected bra size estimations, whereby measuring chest circumference during inspiration increased band size and decreased cup size. However, whereas bra size calculated using the standard method differed significantly from subject‐reported bra size, cup size calculated using the breast hemi‐circumference method did not differ significantly from subject‐reported cup size. Conclusions As respiratory state significantly affects bra sizes, it should be standardised during bra size measurements. A more valid and reliable bra sizing method should be developed, possibly using the breast hemi‐circumference method for cup size estimations and raw under‐bust chest circumference values for band size. PMID:17021004

  12. Calculating Confidence, Uncertainty, and Numbers of Samples When Using Statistical Sampling Approaches to Characterize and Clear Contaminated Areas

    SciTech Connect

    Piepel, Gregory F.; Matzke, Brett D.; Sego, Landon H.; Amidan, Brett G.

    2013-04-27

    This report discusses the methodology, formulas, and inputs needed to make characterization and clearance decisions for Bacillus anthracis-contaminated and uncontaminated (or decontaminated) areas using a statistical sampling approach. Specifically, the report includes the methods and formulas for calculating the • number of samples required to achieve a specified confidence in characterization and clearance decisions • confidence in making characterization and clearance decisions for a specified number of samples for two common statistically based environmental sampling approaches. In particular, the report addresses an issue raised by the Government Accountability Office by providing methods and formulas to calculate the confidence that a decision area is uncontaminated (or successfully decontaminated) if all samples collected according to a statistical sampling approach have negative results. Key to addressing this topic is the probability that an individual sample result is a false negative, which is commonly referred to as the false negative rate (FNR). The two statistical sampling approaches currently discussed in this report are 1) hotspot sampling to detect small isolated contaminated locations during the characterization phase, and 2) combined judgment and random (CJR) sampling during the clearance phase. Typically if contamination is widely distributed in a decision area, it will be detectable via judgment sampling during the characterization phrase. Hotspot sampling is appropriate for characterization situations where contamination is not widely distributed and may not be detected by judgment sampling. CJR sampling is appropriate during the clearance phase when it is desired to augment judgment samples with statistical (random) samples. The hotspot and CJR statistical sampling approaches are discussed in the report for four situations: 1. qualitative data (detect and non-detect) when the FNR = 0 or when using statistical sampling methods that account
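    For the simplest case mentioned above (qualitative detect/non-detect data with FNR = 0), the number of random samples needed follows the familiar relation P^n ≤ 1 − C; the sketch below is that generic calculation, not the report's more elaborate hotspot or CJR formulas.

```r
# Hedged sketch of the simplest case discussed above: with no false negatives,
# n random samples all testing clean give confidence C that no more than a
# fraction (1 - P) of the decision area is contaminated when P^n <= 1 - C.
n_clearance <- function(C = 0.95, P = 0.99) ceiling(log(1 - C) / log(P))

n_clearance(C = 0.95, P = 0.99)   # ~299 samples for 95% confidence / 99% clean
```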

  13. Sample size for logistic regression with small response probability

    SciTech Connect

    Whittemore, A S

    1980-03-01

    The Fisher information matrix for the estimated parameters in a multiple logistic regression can be approximated by the augmented Hessian matrix of the moment generating function for the covariates. The approximation is valid when the probability of response is small. With its use one can obtain a simple closed form estimate of the asymptotic covariance matrix of the maximum likelihood parameter estimates, and thus approximate sample sizes needed to test hypotheses about the parameters. The method is developed for selected distributions of a single covariate, and for a class of exponential-type distributions of several covariates. It is illustrated with an example concerning risk factors for coronary heart disease.

  14. Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes

    PubMed Central

    Kelleher, Jerome; Etheridge, Alison M; McVean, Gilean

    2016-01-01

    A central challenge in the analysis of genetic variation is to provide realistic genome simulation across millions of samples. Present day coalescent simulations do not scale well, or use approximations that fail to capture important long-range linkage properties. Analysing the results of simulations also presents a substantial challenge, as current methods to store genealogies consume a great deal of space, are slow to parse and do not take advantage of shared structure in correlated trees. We solve these problems by introducing sparse trees and coalescence records as the key units of genealogical analysis. Using these tools, exact simulation of the coalescent with recombination for chromosome-sized regions over hundreds of thousands of samples is possible, and substantially faster than present-day approximate methods. We can also analyse the results orders of magnitude more quickly than with existing methods. PMID:27145223

  15. Maximum type 1 error rate inflation in multiarmed clinical trials with adaptive interim sample size modifications.

    PubMed

    Graf, Alexandra C; Bauer, Peter; Glimm, Ekkehard; Koenig, Franz

    2014-07-01

    Sample size modifications in the interim analyses of an adaptive design can inflate the type 1 error rate, if test statistics and critical boundaries are used in the final analysis as if no modification had been made. While this is already true for designs with an overall change of the sample size in a balanced treatment-control comparison, the inflation can be much larger if in addition a modification of allocation ratios is allowed as well. In this paper, we investigate adaptive designs with several treatment arms compared to a single common control group. Regarding modifications, we consider treatment arm selection as well as modifications of overall sample size and allocation ratios. The inflation is quantified for two approaches: a naive procedure that ignores not only all modifications, but also the multiplicity issue arising from the many-to-one comparison, and a Dunnett procedure that ignores modifications, but adjusts for the initially started multiple treatments. The maximum inflation of the type 1 error rate for such types of design can be calculated by searching for the "worst case" scenarios, that are sample size adaptation rules in the interim analysis that lead to the largest conditional type 1 error rate in any point of the sample space. To show the most extreme inflation, we initially assume unconstrained second stage sample size modifications leading to a large inflation of the type 1 error rate. Furthermore, we investigate the inflation when putting constraints on the second stage sample sizes. It turns out that, for example fixing the sample size of the control group, leads to designs controlling the type 1 error rate.

  16. Blinded sample size re-estimation in three-arm trials with 'gold standard' design.

    PubMed

    Mütze, Tobias; Friede, Tim

    2017-10-15

    In this article, we study blinded sample size re-estimation in the 'gold standard' design with internal pilot study for normally distributed outcomes. The 'gold standard' design is a three-arm clinical trial design that includes an active and a placebo control in addition to an experimental treatment. We focus on the absolute margin approach to hypothesis testing in three-arm trials, in which the non-inferiority of the experimental treatment and the assay sensitivity are assessed by pairwise comparisons. We compare several blinded sample size re-estimation procedures in a simulation study assessing operating characteristics including power and type I error. We find that sample size re-estimation based on the popular one-sample variance estimator results in overpowered trials. Moreover, sample size re-estimation based on unbiased variance estimators such as the Xing-Ganju variance estimator results in underpowered trials, as expected, because an overestimation of the variance and thus the sample size is in general required for the re-estimation procedure to eventually meet the target power. To overcome this problem, we propose an inflation factor for the sample size re-estimation with the Xing-Ganju variance estimator and show that this approach results in adequately powered trials. Because of favorable features of the Xing-Ganju variance estimator such as unbiasedness and a distribution independent of the group means, the inflation factor does not depend on the nuisance parameter and, therefore, can be calculated prior to a trial. Moreover, we prove that the sample size re-estimation based on the Xing-Ganju variance estimator does not bias the effect estimate. Copyright © 2017 John Wiley & Sons, Ltd.

  17. Optimization of finite-size errors in finite-temperature calculations of unordered phases

    NASA Astrophysics Data System (ADS)

    Iyer, Deepak; Srednicki, Mark; Rigol, Marcos

    It is common knowledge that the microcanonical, canonical, and grand canonical ensembles are equivalent in thermodynamically large systems. Here, we study finite-size effects in the latter two ensembles. We show that contrary to naive expectations, finite-size errors are exponentially small in grand canonical ensemble calculations of translationally invariant systems in unordered phases at finite temperature. Open boundary conditions and canonical ensemble calculations suffer from finite-size errors that are only polynomially small in the system size. We further show that finite-size effects are generally smallest in numerical linked cluster expansions. Our conclusions are supported by analytical and numerical analyses of classical and quantum systems.

  18. Optimization of finite-size errors in finite-temperature calculations of unordered phases

    NASA Astrophysics Data System (ADS)

    Iyer, Deepak; Srednicki, Mark; Rigol, Marcos

    2015-06-01

    It is common knowledge that the microcanonical, canonical, and grand-canonical ensembles are equivalent in thermodynamically large systems. Here, we study finite-size effects in the latter two ensembles. We show that contrary to naive expectations, finite-size errors are exponentially small in grand canonical ensemble calculations of translationally invariant systems in unordered phases at finite temperature. Open boundary conditions and canonical ensemble calculations suffer from finite-size errors that are only polynomially small in the system size. We further show that finite-size effects are generally smallest in numerical linked cluster expansions. Our conclusions are supported by analytical and numerical analyses of classical and quantum systems.

  19. NPHMC: an R-package for estimating sample size of proportional hazards mixture cure model.

    PubMed

    Cai, Chao; Wang, Songfeng; Lu, Wenbin; Zhang, Jiajia

    2014-01-01

    Due to advances in medical research, more and more diseases can be cured nowadays, which largely increases the need for an easy-to-use software in calculating sample size of clinical trials with cure fractions. Current available sample size software, such as PROC POWER in SAS, Survival Analysis module in PASS, powerSurvEpi package in R are all based on the standard proportional hazards (PH) model which is not appropriate to design a clinical trial with cure fractions. Instead of the standard PH model, the PH mixture cure model is an important tool in handling the survival data with possible cure fractions. However, there are no tools available that can help design a trial with cure fractions. Therefore, we develop an R package NPHMC to determine the sample size needed for such study design. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  20. NPHMC: An R-package for Estimating Sample Size of Proportional Hazards Mixture Cure Model

    PubMed Central

    Cai, Chao; Wang, Songfeng; Lu, Wenbin; Zhang, Jiajia

    2013-01-01

    Due to advances in medical research, more and more diseases can be cured nowadays, which largely increases the need for an easy-to-use software in calculating sample size of clinical trials with cure fractions. Current available sample size software, such as PROC POWER in SAS, Survival Analysis module in PASS, powerSurvEpi package in R are all based on the standard proportional hazards (PH) model which is not appropriate to design a clinical trial with cure fractions. Instead of the standard PH model, the PH mixture cure model is an important tool in handling the survival data with possible cure fractions. However, there are no tools available that can help design a trial with cure fractions. Therefore, we develop an R package NPHMC to determine the sample size needed for such study design. PMID:24199658

  1. Stocking, Forest Type, and Stand Size Class - The Southern Forest Inventory and Analysis Unit's Calculation of Three Important Stand Descriptors

    Treesearch

    Dennis M. May

    1990-01-01

    The procedures by which the Southern Forest Inventory and Analysis unit calculates stocking from tree data collected on inventory sample plots are described in this report. Stocking is then used to ascertain two other important stand descriptors: forest type and stand size class. Inventory data for three plots from the recently completed 1989 Tennessee survey are used...

  2. Enhanced Ligand Sampling for Relative Protein–Ligand Binding Free Energy Calculations

    PubMed Central

    2016-01-01

    Free energy calculations are used to study how strongly potential drug molecules interact with their target receptors. The accuracy of these calculations depends on the accuracy of the molecular dynamics (MD) force field as well as proper sampling of the major conformations of each molecule. However, proper sampling of ligand conformations can be difficult when there are large barriers separating the major ligand conformations. An example of this is for ligands with an asymmetrically substituted phenyl ring, where the presence of protein loops hinders the proper sampling of the different ring conformations. These ring conformations become more difficult to sample when the size of the functional groups attached to the ring increases. The Adaptive Integration Method (AIM) has been developed, which adaptively changes the alchemical coupling parameter λ during the MD simulation so that conformations sampled at one λ can aid sampling at the other λ values. The Accelerated Adaptive Integration Method (AcclAIM) builds on AIM by lowering potential barriers for specific degrees of freedom at intermediate λ values. However, these methods may not work when there are very large barriers separating the major ligand conformations. In this work, we describe a modification to AIM that improves sampling of the different ring conformations, even when there is a very large barrier between them. This method combines AIM with conformational Monte Carlo sampling, giving improved convergence of ring populations and the resulting free energy. This method, called AIM/MC, is applied to study the relative binding free energy for a pair of ligands that bind to thrombin and a different pair of ligands that bind to aspartyl protease β-APP cleaving enzyme 1 (BACE1). These protein–ligand binding free energy calculations illustrate the improvements in conformational sampling and the convergence of the free energy compared to both AIM and AcclAIM. PMID:25906170

  3. General Conformity Training Modules: Appendix A Sample Emissions Calculations

    EPA Pesticide Factsheets

    Appendix A of the training modules gives example calculations for external and internal combustion sources, construction, fuel storage and transfer, on-road vehicles, aircraft operations, storage piles, and paved roads.

  4. Sample size estimation for pilot animal experiments by using a Markov Chain Monte Carlo approach.

    PubMed

    Allgoewer, Andreas; Mayer, Benjamin

    2017-05-01

    The statistical determination of sample size is mandatory when planning animal experiments, but it is usually difficult to implement appropriately. The main reason is that prior information is hardly ever available, so the assumptions made cannot be verified reliably. This is especially true for pilot experiments. Statistical simulation might help in these situations. We used a Markov Chain Monte Carlo (MCMC) approach to verify the pragmatic assumptions made on different distribution parameters used for power and sample size calculations in animal experiments. Binomial and normal distributions, which are the most frequent distributions in practice, were simulated for categorical and continuous endpoints, respectively. The simulations showed that the common practice of using five or six animals per group for continuous endpoints is reasonable. Even in the case of small effect sizes, the statistical power would be sufficiently large (≥ 80%). For categorical outcomes, group sizes should never be under eight animals, otherwise sufficient statistical power cannot be guaranteed. This applies even in the case of large effects. The MCMC approach was demonstrated to be a useful method for calculating sample size in animal studies that lack prior data. Of course, the simulation results particularly depend on the assumptions made with regard to the distributional properties and effects to be detected, but the same also holds in situations where prior data are available. MCMC is therefore a promising approach toward the more informed planning of pilot research experiments involving the use of animals. 2017 FRAME.
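
    As a point of reference for continuous endpoints, the power of small group sizes can be checked with a plain Monte Carlo simulation of a two-sample t-test. This is a generic sketch, not the MCMC procedure of the paper; the standardized effect size used below (d = 2, large but not unusual in animal work) is a hypothetical example.

    ```python
    import numpy as np
    from scipy import stats

    def two_group_power(n_per_group, effect_size, alpha=0.05, n_sim=20_000, seed=0):
        """Monte Carlo power of a two-sided two-sample t-test for a
        standardized effect size (Cohen's d); hypothetical inputs."""
        rng = np.random.default_rng(seed)
        hits = 0
        for _ in range(n_sim):
            a = rng.normal(0.0, 1.0, n_per_group)
            b = rng.normal(effect_size, 1.0, n_per_group)
            _, p = stats.ttest_ind(a, b)
            hits += p < alpha
        return hits / n_sim

    for n in (5, 6, 8):
        print(n, two_group_power(n, effect_size=2.0))
    ```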

  5. 40 CFR 89.424 - Dilute emission sampling calculations.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... mode for bag measurements and diesel heat exchanger system measurements is determined from the..., for diesel heat exchanger systems, average hydrocarbon concentration of the dilute exhaust sample...

  6. Adaptive sample size modification in clinical trials: start small then ask for more?

    PubMed

    Jennison, Christopher; Turnbull, Bruce W

    2015-12-20

    We consider sample size re-estimation in a clinical trial, in particular when there is a significant delay before the measurement of patient response. Mehta and Pocock have proposed methods in which sample size is increased when interim results fall in a 'promising zone' where it is deemed worthwhile to increase conditional power by adding more subjects. Our analysis reveals potential pitfalls in applying this approach. Mehta and Pocock use results of Chen, DeMets and Lan to identify when increasing the sample size, while still applying a conventional level α significance test at the end of the trial, does not inflate the type I error rate: we have found the greatest gains in power per additional observation are liable to lie outside the region defined by this method. Mehta and Pocock increase sample size to achieve a particular conditional power, calculated under the current estimate of treatment effect: this leads to high increases in sample size for a small range of interim outcomes, whereas we have found it more efficient to make moderate increases in sample size over a wider range of cases. If the aforementioned pitfalls are avoided, we believe the broad framework proposed by Mehta and Pocock is valuable for clinical trial design. Working in this framework, we propose sample size rules that explicitly apply the principle of adding observations when they are most beneficial. The resulting trial designs are closely related to efficient group sequential tests for a delayed response proposed by Hampson and Jennison. Copyright © 2015 John Wiley & Sons, Ltd.

  7. A Bayesian predictive sample size selection design for single-arm exploratory clinical trials.

    PubMed

    Teramukai, Satoshi; Daimon, Takashi; Zohar, Sarah

    2012-12-30

    The aim of an exploratory clinical trial is to determine whether a new intervention is promising for further testing in confirmatory clinical trials. Most exploratory clinical trials are designed as single-arm trials using a binary outcome with or without interim monitoring for early stopping. In this context, we propose a Bayesian adaptive design denoted as predictive sample size selection design (PSSD). The design allows for sample size selection following any planned interim analyses for early stopping of a trial, together with sample size determination before starting the trial. In the PSSD, we determine the sample size using the method proposed by Sambucini (Statistics in Medicine 2008; 27:1199-1224), which adopts a predictive probability criterion with two kinds of prior distributions, that is, an 'analysis prior' used to compute posterior probabilities and a 'design prior' used to obtain prior predictive distributions. In the sample size determination of the PSSD, we provide two sample sizes, that is, N and Nmax, using two types of design priors. At each interim analysis, we calculate the predictive probabilities of achieving a successful result at the end of the trial using the analysis prior in order to stop the trial in case of low or high efficacy (Lee et al., Clinical Trials 2008; 5:93-106), and we select an optimal sample size, that is, either N or Nmax as needed, on the basis of the predictive probabilities. We investigate the operating characteristics through simulation studies, and the PSSD is retrospectively applied to a lung cancer clinical trial.
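
    The central quantity in such designs is the predictive probability of success at the end of the trial given the interim data. The sketch below computes this for a single-arm binary endpoint with a beta-binomial model; the prior, response threshold, and sample sizes are hypothetical, and it does not implement the full two-prior (analysis/design prior) machinery of the PSSD.

    ```python
    from scipy.stats import beta, betabinom

    def predictive_prob_success(x, n_interim, n_final, p0=0.2, a=1.0, b=1.0,
                                posterior_threshold=0.95):
        """Predictive probability that, at the final analysis, the posterior
        probability of (response rate > p0) exceeds posterior_threshold,
        given x responses in n_interim patients and a Beta(a, b) prior.
        All numbers are hypothetical."""
        m = n_final - n_interim               # patients still to be enrolled
        pp = 0.0
        for y in range(m + 1):                # possible numbers of future responses
            a_post = a + x + y
            b_post = b + n_final - x - y
            if beta.sf(p0, a_post, b_post) > posterior_threshold:
                pp += betabinom.pmf(y, m, a + x, b + n_interim - x)
        return pp

    print(predictive_prob_success(x=7, n_interim=20, n_final=40))
    ```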

  8. Reduced sample sizes for atrophy outcomes in Alzheimer's disease trials: baseline adjustment

    PubMed Central

    Schott, J.M.; Bartlett, J.W.; Barnes, J.; Leung, K.K.; Ourselin, S.; Fox, N.C.

    2010-01-01

    Cerebral atrophy rate is increasingly used as an outcome measure for Alzheimer's disease (AD) trials. We used the Alzheimer's disease Neuroimaging initiative (ADNI) dataset to assess if adjusting for baseline characteristics can reduce sample sizes. Controls (n = 199), patients with mild cognitive impairment (MCI) (n = 334) and AD (n = 144) had two MRI scans, 1-year apart; ~ 55% had baseline CSF tau, p-tau, and Aβ1-42. Whole brain (KN–BSI) and hippocampal (HMAPS-HBSI) atrophy rate, and ventricular expansion (VBSI) were calculated for each group; numbers required to power a placebo-controlled trial were estimated. Sample sizes per arm (80% power, 25% absolute rate reduction) for AD were (95% CI): brain atrophy = 81 (64,109), hippocampal atrophy = 88 (68,119), ventricular expansion = 118 (92,157); and for MCI: brain atrophy = 149 (122,188), hippocampal atrophy = 201 (160,262), ventricular expansion = 234 (191,295). To detect a 25% reduction relative to normal aging required increased sample sizes ~ 3-fold (AD), and ~ 5-fold (MCI). Disease severity and Aβ1-42 contributed significantly to atrophy rate variability. Adjusting for 11 predefined covariates reduced sample sizes by up to 30%. Treatment trials in AD should consider the effects of normal aging; adjusting for baseline characteristics can significantly reduce required sample sizes. PMID:20620665

  9. Sample Size for Assessing Agreement between Two Methods of Measurement by Bland-Altman Method.

    PubMed

    Lu, Meng-Jie; Zhong, Wei-Hua; Liu, Yu-Xiu; Miao, Hua-Zhang; Li, Yong-Chang; Ji, Mu-Huo

    2016-11-01

    The Bland-Altman method has been widely used for assessing agreement between two methods of measurement. However, the question of sample size estimation for this method has remained unsolved. We propose a new method of sample size estimation for Bland-Altman agreement assessment. According to the Bland-Altman method, the conclusion on agreement is made based on the width of the confidence interval for LOAs (limits of agreement) in comparison to a predefined clinical agreement limit. Under the theory of statistical inference, the formulae of sample size estimation are derived, which depend on the pre-determined levels of α and β, the mean and standard deviation of the differences between the two measurements, and the predefined limits. With this new method, the sample sizes are calculated under different parameter settings which occur frequently in method comparison studies, and Monte-Carlo simulation is used to obtain the corresponding powers. The results of Monte-Carlo simulation showed that the achieved powers could coincide with the pre-determined level of powers, thus validating the correctness of the method. The method of sample size estimation can be applied in the Bland-Altman method to assess agreement between two methods of measurement.
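
    The closed-form formulae of the paper are not reproduced here, but a candidate sample size can be checked by simulation: generate paired differences, compute the limits of agreement and their approximate confidence intervals, and count how often both intervals fall inside the clinical limit. The difference distribution and clinical limit below are hypothetical, and the standard large-sample approximation for the LOA standard error is assumed.

    ```python
    import numpy as np
    from scipy.stats import norm

    def ba_power(n, mu_d=0.0, sd_d=1.0, clinical_limit=2.5, alpha=0.05,
                 n_sim=20_000, seed=0):
        """Monte Carlo power that the (1-alpha) CIs of both Bland-Altman
        limits of agreement lie entirely within +/- clinical_limit."""
        rng = np.random.default_rng(seed)
        z = norm.ppf(1 - alpha / 2)
        hits = 0
        for _ in range(n_sim):
            d = rng.normal(mu_d, sd_d, n)                 # paired differences
            m, s = d.mean(), d.std(ddof=1)
            se_loa = s * np.sqrt(1.0 / n + z**2 / (2.0 * (n - 1)))
            upper = m + z * s + z * se_loa                # upper bound of upper LOA CI
            lower = m - z * s - z * se_loa                # lower bound of lower LOA CI
            hits += (upper < clinical_limit) and (lower > -clinical_limit)
        return hits / n_sim

    for n in (50, 100, 200):
        print(n, ba_power(n))
    ```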

  10. An approximate approach to sample size determination in bioequivalence testing with multiple pharmacokinetic responses.

    PubMed

    Tsai, Chen-An; Huang, Chih-Yang; Liu, Jen-Pei

    2014-08-30

    The approval of generic drugs requires the evidence of average bioequivalence (ABE) on both the area under the concentration-time curve and the peak concentration Cmax . The bioequivalence (BE) hypothesis can be decomposed into the non-inferiority (NI) and non-superiority (NS) hypothesis. Most of regulatory agencies employ the two one-sided tests (TOST) procedure to test ABE between two formulations. As it is based on the intersection-union principle, the TOST procedure is conservative in terms of the type I error rate. However, the type II error rate is the sum of the type II error rates with respect to each null hypothesis of NI and NS hypotheses. When the difference in population means between two treatments is not 0, no close-form solution for the sample size for the BE hypothesis is available. Current methods provide the sample sizes with either insufficient power or unnecessarily excessive power. We suggest an approximate method for sample size determination, which can also provide the type II rate for each of NI and NS hypotheses. In addition, the proposed method is flexible to allow extension from one pharmacokinetic (PK) response to determination of the sample size required for multiple PK responses. We report the results of a numerical study. An R code is provided to calculate the sample size for BE testing based on the proposed methods.
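
    For a single log-normal PK endpoint, the power of the TOST procedure at a candidate per-arm sample size can be checked by simulation, which makes the asymmetry between the NI and NS components visible when the true ratio differs from 1. The sketch below uses a two-arm parallel design with hypothetical geometric mean ratio, CV, and the usual 0.80-1.25 margins; it does not implement the paper's approximate multi-endpoint method.

    ```python
    import numpy as np
    from scipy import stats

    def tost_power(n_per_arm, gmr=0.95, cv=0.25, lo=0.80, hi=1.25, alpha=0.05,
                   n_sim=20_000, seed=0):
        """Monte Carlo power of the two one-sided tests (TOST) procedure for
        average bioequivalence of one log-normal PK endpoint (hypothetical inputs)."""
        rng = np.random.default_rng(seed)
        sd_log = np.sqrt(np.log(1.0 + cv**2))     # log-scale SD implied by the CV
        df = 2 * (n_per_arm - 1)                  # simple pooled-df approximation
        crit = stats.t.ppf(1 - alpha, df)
        hits = 0
        for _ in range(n_sim):
            t_arm = rng.normal(np.log(gmr), sd_log, n_per_arm)
            r_arm = rng.normal(0.0, sd_log, n_per_arm)
            diff = t_arm.mean() - r_arm.mean()
            se = np.sqrt(t_arm.var(ddof=1) / n_per_arm + r_arm.var(ddof=1) / n_per_arm)
            t_lo = (diff - np.log(lo)) / se       # H0: ratio <= 0.80 (non-inferiority)
            t_hi = (diff - np.log(hi)) / se       # H0: ratio >= 1.25 (non-superiority)
            hits += (t_lo > crit) and (t_hi < -crit)
        return hits / n_sim

    for n in (20, 30, 40):
        print(n, tost_power(n))
    ```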

  11. 40 CFR 91.419 - Raw emission sampling calculations.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... mass flow rate; MHCexh = Molecular weight of hydrocarbons in the exhaust, given by MHCexh = 12.01 + 1.008 × α, where α = hydrocarbon/carbon atomic ratio of the fuel; Mexh = Molecular weight of ..., calculated from the following equation: ER04OC96.019; WCO = Mass rate of CO in exhaust; MCO = Molecular weight...

  12. 40 CFR 91.419 - Raw emission sampling calculations.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... mass flow rate; MHCexh = Molecular weight of hydrocarbons in the exhaust, given by MHCexh = 12.01 + 1.008 × α, where α = hydrocarbon/carbon atomic ratio of the fuel; Mexh = Molecular weight of ..., calculated from the following equation: ER04OC96.019; WCO = Mass rate of CO in exhaust; MCO = Molecular weight...

  13. 40 CFR 91.419 - Raw emission sampling calculations.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... mass flow rate; MHCexh = Molecular weight of hydrocarbons in the exhaust, given by MHCexh = 12.01 + 1.008 × α, where α = hydrocarbon/carbon atomic ratio of the fuel; Mexh = Molecular weight of ..., calculated from the following equation: ER04OC96.019; WCO = Mass rate of CO in exhaust; MCO = Molecular weight...

  14. Enhancing sampling design in mist-net bat surveys by accounting for sample size optimization

    PubMed Central

    Trevelin, Leonardo Carreira; Novaes, Roberto Leonan Morim; Colas-Rosas, Paul François; Benathar, Thayse Cristhina Melo; Peres, Carlos A.

    2017-01-01

    The advantages of mist-netting, the main technique used in Neotropical bat community studies to date, include logistical implementation, standardization and sampling representativeness. Nonetheless, study designs still have to deal with issues of detectability related to how different species behave and use the environment. Yet there is considerable sampling heterogeneity across available studies in the literature. Here, we approach the problem of sample size optimization. We evaluated the common sense hypothesis that the first six hours comprise the period of peak night activity for several species, thereby resulting in a representative sample for the whole night. To this end, we combined re-sampling techniques, species accumulation curves, threshold analysis, and community concordance of species compositional data, and applied them to datasets of three different Neotropical biomes (Amazonia, Atlantic Forest and Cerrado). We show that the strategy of restricting sampling to only six hours of the night frequently results in incomplete sampling representation of the entire bat community investigated. From a quantitative standpoint, results corroborated the existence of a major Sample Area effect in all datasets, although for the Amazonia dataset the six-hour strategy was significantly less species-rich after extrapolation, and for the Cerrado dataset it was more efficient. From the qualitative standpoint, however, results demonstrated that, for all three datasets, the identity of species that are effectively sampled will be inherently impacted by choices of sub-sampling schedule. We also propose an alternative six-hour sampling strategy (at the beginning and the end of a sample night) which performed better when resampling Amazonian and Atlantic Forest datasets on bat assemblages. Given the observed magnitude of our results, we propose that sample representativeness has to be carefully weighed against study objectives, and recommend that the trade-off between

  15. Enhancing sampling design in mist-net bat surveys by accounting for sample size optimization.

    PubMed

    Trevelin, Leonardo Carreira; Novaes, Roberto Leonan Morim; Colas-Rosas, Paul François; Benathar, Thayse Cristhina Melo; Peres, Carlos A

    2017-01-01

    The advantages of mist-netting, the main technique used in Neotropical bat community studies to date, include logistical implementation, standardization and sampling representativeness. Nonetheless, study designs still have to deal with issues of detectability related to how different species behave and use the environment. Yet there is considerable sampling heterogeneity across available studies in the literature. Here, we approach the problem of sample size optimization. We evaluated the common sense hypothesis that the first six hours comprise the period of peak night activity for several species, thereby resulting in a representative sample for the whole night. To this end, we combined re-sampling techniques, species accumulation curves, threshold analysis, and community concordance of species compositional data, and applied them to datasets of three different Neotropical biomes (Amazonia, Atlantic Forest and Cerrado). We show that the strategy of restricting sampling to only six hours of the night frequently results in incomplete sampling representation of the entire bat community investigated. From a quantitative standpoint, results corroborated the existence of a major Sample Area effect in all datasets, although for the Amazonia dataset the six-hour strategy was significantly less species-rich after extrapolation, and for the Cerrado dataset it was more efficient. From the qualitative standpoint, however, results demonstrated that, for all three datasets, the identity of species that are effectively sampled will be inherently impacted by choices of sub-sampling schedule. We also propose an alternative six-hour sampling strategy (at the beginning and the end of a sample night) which performed better when resampling Amazonian and Atlantic Forest datasets on bat assemblages. Given the observed magnitude of our results, we propose that sample representativeness has to be carefully weighed against study objectives, and recommend that the trade-off between

  16. Optimal sample size determinations from an industry perspective based on the expected value of information.

    PubMed

    Willan, Andrew R

    2008-01-01

    Traditional sample size calculations for randomized clinical trials depend on somewhat arbitrarily chosen factors, such as type I and II errors. As an alternative, taking a societal perspective, and using the expected value of information based on Bayesian decision theory, a number of authors have recently shown how to determine the sample size that maximizes the expected net gain, i.e., the difference between the cost of the trial and the value of the information gained from the results. Other authors have proposed Bayesian methods to determine sample sizes from an industry perspective. The purpose of this article is to propose a Bayesian approach to sample size calculations from an industry perspective that attempts to determine the sample size that maximizes expected profit. A model is proposed for expected total profit that includes consideration of per-patient profit, disease incidence, time horizon, trial duration, market share, discount rate, and the relationship between the results and the probability of regulatory approval. The expected value of information provided by trial data is related to the increase in expected profit from increasing the probability of regulatory approval. The methods are applied to an example, including an examination of robustness. The model is extended to consider market share as a function of observed treatment effect. The use of methods based on the expected value of information can provide, from an industry perspective, robust sample size solutions that maximize the difference between the expected cost of the trial and the expected value of information gained from the results. The method is only as good as the model for expected total profit. Although the model probably has all the right elements, it assumes that market share, per-patient profit, and incidence are insensitive to trial results. The method relies on the central limit theorem which assumes that the sample sizes involved ensure that the relevant test statistics

  17. Toward a more standardised and accurate evaluation of glycemic response to foods: recommendations for portion size calculation.

    PubMed

    Bordenave, Nicolas; Kock, Lindsay B; Abernathy, Mengyue; Parcon, Jason C; Gulvady, Apeksha A; van Klinken, B Jan-Willem; Kasturi, Prabhakar

    2015-01-15

    This study aimed at evaluating the adequacy of calculation methods for portions to be provided to subjects in clinical trials evaluating glycemic response to foods. Portion sizes were calculated for 140 food samples, based on Nutrition Facts labels (current practice) and actual available carbohydrate content (current recommendation), and compared against the amount of monosaccharides yielded by the digestive breakdown of their actual available carbohydrate content (basis for glycemic response to food). The current practice can result in significant under- or over-feeding of carbohydrates in 10% of tested cases, as compared to the targeted reference dosage. The method currently recommended can result in significantly inadequate yields of monosaccharides in 24% of tested cases. The current and recommended calculation methods do not seem adequate for a standardised evaluation of glycemic response to foods. It is thus recommended to account for the amount of absorbable monosaccharides of foods for portion size calculation. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Sampling hazelnuts for aflatoxin: effect of sample size and accept/reject limit on reducing the risk of misclassifying lots.

    PubMed

    Ozay, Guner; Seyhan, Ferda; Yilmaz, Aysun; Whitaker, Thomas B; Slate, Andrew B; Giesbrecht, Francis G

    2007-01-01

    About 100 countries have established regulatory limits for aflatoxin in food and feeds. Because these limits vary widely among regulating countries, the Codex Committee on Food Additives and Contaminants began work in 2004 to harmonize aflatoxin limits and sampling plans for aflatoxin in almonds, pistachios, hazelnuts, and Brazil nuts. Studies were developed to measure the uncertainty and distribution among replicated sample aflatoxin test results taken from aflatoxin-contaminated treenut lots. The uncertainty and distribution information is used to develop a model that can evaluate the performance (risk of misclassifying lots) of aflatoxin sampling plan designs for treenuts. Once the performance of aflatoxin sampling plans can be predicted, they can be designed to reduce the risks of misclassifying lots traded in either the domestic or export markets. A method was developed to evaluate the performance of sampling plans designed to detect aflatoxin in hazelnut lots. Twenty hazelnut lots with varying levels of contamination were sampled according to an experimental protocol where 16 test samples were taken from each lot. The observed aflatoxin distribution among the 16 aflatoxin sample test results was compared to lognormal, compound gamma, and negative binomial distributions. The negative binomial distribution was selected to model aflatoxin distribution among sample test results because it gave acceptable fits to observed distributions among sample test results taken from a wide range of lot concentrations. Using the negative binomial distribution, computer models were developed to calculate operating characteristic curves for specific aflatoxin sampling plan designs. The effect of sample size and accept/reject limits on the chances of rejecting good lots (sellers' risk) and accepting bad lots (buyers' risk) was demonstrated for various sampling plan designs.
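
    An operating characteristic (OC) curve of this kind can be sketched by simulation: for a given true lot concentration, draw the sample test results from a negative binomial distribution and compute the probability that the lot is accepted under the plan's accept/reject rule. The dispersion parameter, accept limit, number of test samples, and the simple "mean of test results at or below the limit" acceptance rule below are hypothetical illustrations, not the fitted values or plan designs from the hazelnut study.

    ```python
    import numpy as np

    def prob_accept(lot_mean, accept_limit=10.0, n_test_samples=2, shape=0.5,
                    n_sim=50_000, seed=0):
        """Probability of accepting a lot when each aflatoxin test result is
        negative binomially distributed around the true lot mean (in ppb) and
        the lot is accepted if the mean test result is at or below the limit."""
        rng = np.random.default_rng(seed)
        p = shape / (shape + lot_mean)            # numpy's NB parameterization
        results = rng.negative_binomial(shape, p, size=(n_sim, n_test_samples))
        return np.mean(results.mean(axis=1) <= accept_limit)

    for mu in (2, 5, 10, 20, 40):                 # true lot concentration, ppb
        print(mu, prob_accept(mu))
    ```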

  19. MEPAG Recommendations for a 2018 Mars Sample Return Caching Lander - Sample Types, Number, and Sizes

    NASA Technical Reports Server (NTRS)

    Allen, Carlton C.

    2011-01-01

    The return to Earth of geological and atmospheric samples from the surface of Mars is among the highest priority objectives of planetary science. The MEPAG Mars Sample Return (MSR) End-to-End International Science Analysis Group (MEPAG E2E-iSAG) was chartered to propose scientific objectives and priorities for returned sample science, and to map out the implications of these priorities, including for the proposed joint ESA-NASA 2018 mission that would be tasked with the crucial job of collecting and caching the samples. The E2E-iSAG identified four overarching scientific aims that relate to understanding: (A) the potential for life and its pre-biotic context, (B) the geologic processes that have affected the martian surface, (C) planetary evolution of Mars and its atmosphere, (D) potential for future human exploration. The types of samples deemed most likely to achieve the science objectives are, in priority order: (1A). Subaqueous or hydrothermal sediments (1B). Hydrothermally altered rocks or low temperature fluid-altered rocks (equal priority) (2). Unaltered igneous rocks (3). Regolith, including airfall dust (4). Present-day atmosphere and samples of sedimentary-igneous rocks containing ancient trapped atmosphere Collection of geologically well-characterized sample suites would add considerable value to interpretations of all collected rocks. To achieve this, the total number of rock samples should be about 30-40. In order to evaluate the size of individual samples required to meet the science objectives, the E2E-iSAG reviewed the analytical methods that would likely be applied to the returned samples by preliminary examination teams, for planetary protection (i.e., life detection, biohazard assessment) and, after distribution, by individual investigators. It was concluded that sample size should be sufficient to perform all high-priority analyses in triplicate. In keeping with long-established curatorial practice of extraterrestrial material, at least 40% by

  20. A Variational Approach to Enhanced Sampling and Free Energy Calculations

    NASA Astrophysics Data System (ADS)

    Parrinello, Michele

    2015-03-01

    The presence of kinetic bottlenecks severely hampers the ability of widely used sampling methods like molecular dynamics or Monte Carlo to explore complex free energy landscapes. One of the most popular methods for addressing this problem is umbrella sampling which is based on the addition of an external bias which helps overcoming the kinetic barriers. The bias potential is usually taken to be a function of a restricted number of collective variables. However constructing the bias is not simple, especially when the number of collective variables increases. Here we introduce a functional of the bias which, when minimized, allows us to recover the free energy. We demonstrate the usefulness and the flexibility of this approach on a number of examples which include the determination of a six dimensional free energy surface. Besides the practical advantages, the existence of such a variational principle allows us to look at the enhanced sampling problem from a rather convenient vantage point.

  1. Variational Approach to Enhanced Sampling and Free Energy Calculations

    NASA Astrophysics Data System (ADS)

    Valsson, Omar; Parrinello, Michele

    2014-08-01

    The ability of widely used sampling methods, such as molecular dynamics or Monte Carlo simulations, to explore complex free energy landscapes is severely hampered by the presence of kinetic bottlenecks. A large number of solutions have been proposed to alleviate this problem. Many are based on the introduction of a bias potential which is a function of a small number of collective variables. However constructing such a bias is not simple. Here we introduce a functional of the bias potential and an associated variational principle. The bias that minimizes the functional relates in a simple way to the free energy surface. This variational principle can be turned into a practical, efficient, and flexible sampling method. A number of numerical examples are presented which include the determination of a three-dimensional free energy surface. We argue that, beside being numerically advantageous, our variational approach provides a convenient and novel standpoint for looking at the sampling problem.

  2. 40 CFR Appendix II to Part 600 - Sample Fuel Economy Calculations

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 29 2010-07-01 2010-07-01 false Sample Fuel Economy Calculations II... FUEL ECONOMY AND CARBON-RELATED EXHAUST EMISSIONS OF MOTOR VEHICLES Pt. 600, App. II Appendix II to Part 600—Sample Fuel Economy Calculations (a) This sample fuel economy calculation is applicable...

  3. 40 CFR Appendix II to Part 600 - Sample Fuel Economy Calculations

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 31 2013-07-01 2013-07-01 false Sample Fuel Economy Calculations II... FUEL ECONOMY AND GREENHOUSE GAS EXHAUST EMISSIONS OF MOTOR VEHICLES Pt. 600, App. II Appendix II to Part 600—Sample Fuel Economy Calculations (a) This sample fuel economy calculation is applicable...

  4. 40 CFR Appendix II to Part 600 - Sample Fuel Economy Calculations

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 30 2011-07-01 2011-07-01 false Sample Fuel Economy Calculations II... FUEL ECONOMY AND CARBON-RELATED EXHAUST EMISSIONS OF MOTOR VEHICLES Pt. 600, App. II Appendix II to Part 600—Sample Fuel Economy Calculations (a) This sample fuel economy calculation is applicable...

  5. 7 CFR 51.308 - Methods of sampling and calculation of percentages.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Methods of sampling and calculation of percentages. 51..., CERTIFICATION, AND STANDARDS) United States Standards for Grades of Apples Methods of Sampling and Calculation of Percentages § 51.308 Methods of sampling and calculation of percentages. (a) When the numerical...

  6. Space resection model calculation based on Random Sample Consensus algorithm

    NASA Astrophysics Data System (ADS)

    Liu, Xinzhu; Kang, Zhizhong

    2016-03-01

    Space resection is one of the most important problems in photogrammetry. It aims to recover the position and attitude of the camera at the moment of exposure. In some cases, however, the observations used in the calculation contain gross errors. This paper presents a robust algorithm that uses the RANSAC method with the DLT model, which avoids the difficulty of determining initial values required by the collinearity equations. The results show that this strategy can exclude gross errors and provides an accurate and efficient way to obtain the elements of exterior orientation.
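
    The RANSAC idea itself is simple: repeatedly fit a minimal model to a random subset of observations, count how many observations agree with it, and keep the model with the most inliers. The sketch below applies it to a 2-D line-fitting toy problem with injected gross errors; it is illustrative only and does not reproduce the paper's DLT space-resection model.

    ```python
    import numpy as np

    def ransac_line(x, y, n_iter=500, inlier_tol=0.5, seed=0):
        """Minimal RANSAC: robustly fit y = a*x + b despite gross errors."""
        rng = np.random.default_rng(seed)
        best_inliers = np.zeros(len(x), dtype=bool)
        for _ in range(n_iter):
            i, j = rng.choice(len(x), size=2, replace=False)
            if x[i] == x[j]:
                continue
            a = (y[j] - y[i]) / (x[j] - x[i])
            b = y[i] - a * x[i]
            inliers = np.abs(y - (a * x + b)) < inlier_tol
            if inliers.sum() > best_inliers.sum():
                best_inliers = inliers
        # refit using all inliers of the best candidate model
        a, b = np.polyfit(x[best_inliers], y[best_inliers], 1)
        return a, b, best_inliers

    rng = np.random.default_rng(1)
    x = np.linspace(0, 10, 100)
    y = 2.0 * x + 1.0 + rng.normal(0, 0.2, 100)
    y[::10] += rng.normal(0, 20, 10)              # inject gross errors
    print(ransac_line(x, y)[:2])                  # close to (2.0, 1.0)
    ```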

  7. Blinded sample size recalculation in multicentre trials with normally distributed outcome.

    PubMed

    Jensen, Katrin; Kieser, Meinhard

    2010-06-01

    The internal pilot study design enables estimation of nuisance parameters required for sample size calculation on the basis of data accumulated in an ongoing trial. In this way, misspecifications made when determining the sample size in the planning phase can be corrected using updated knowledge. According to regulatory guidelines, blindness of all personnel involved in the trial has to be preserved and the specified type I error rate has to be controlled when the internal pilot study design is applied. Especially in the late phase of drug development, most clinical studies are run in more than one centre. In these multicentre trials, one may have to deal with an unequal distribution of the patient numbers among the centres. Depending on the type of the analysis (weighted or unweighted), unequal centre sample sizes may lead to a substantial loss of power. Like the variance, the magnitude of imbalance is difficult to predict in the planning phase. We propose a blinded sample size recalculation procedure for the internal pilot study design in multicentre trials with normally distributed outcome and two balanced treatment groups that are analysed applying the weighted or the unweighted approach. The method addresses both uncertainty with respect to the variance of the endpoint and the extent of disparity of the centre sample sizes. The actual type I error rate as well as the expected power and sample size of the procedure are investigated in simulation studies. For the weighted analysis as well as for the unweighted analysis, the maximal type I error rate was either not exceeded or exceeded only minimally. Furthermore, application of the proposed procedure led to an expected power that achieves the specified value in many cases and is otherwise very close to it.

  8. Microdosimetry calculations for monoenergetic electrons using Geant4-DNA combined with a weighted track sampling algorithm

    NASA Astrophysics Data System (ADS)

    Famulari, Gabriel; Pater, Piotr; Enger, Shirin A.

    2017-07-01

    The aim of this study was to calculate microdosimetric distributions for low energy electrons simulated using the Monte Carlo track structure code Geant4-DNA. Tracks for monoenergetic electrons with kinetic energies ranging from 100 eV to 1 MeV were simulated in an infinite spherical water phantom using the Geant4-DNA extension included in Geant4 toolkit version 10.2 (patch 02). The microdosimetric distributions were obtained through random sampling of transfer points and overlaying scoring volumes within the associated volume of the tracks. Relative frequency distributions of energy deposition f(>E)/f(>0) and dose mean lineal energy (ȳ_D) values were calculated in nanometer-sized spherical and cylindrical targets. The effects of scoring volume and scoring techniques were examined. The results were compared with published data generated using MOCA8B and KURBUC. Geant4-DNA produces a lower frequency of higher energy deposits than MOCA8B. The ȳ_D values calculated with Geant4-DNA are smaller than those calculated using MOCA8B and KURBUC. The differences are mainly due to the lower ionization and excitation cross sections of Geant4-DNA for low energy electrons. To a lesser extent, discrepancies can also be attributed to the implementation in this study of a new and fast scoring technique that differs from that used in previous studies. For the same mean chord length, the ȳ_D calculated in cylindrical volumes are larger than those calculated in spherical volumes. The discrepancies due to cross sections and scoring geometries increase with decreasing scoring site dimensions. A new set of ȳ_D values has been presented for monoenergetic electrons using a fast track sampling algorithm and the most recent physics models implemented in Geant4-DNA. This dataset can be combined with primary electron spectra to predict the radiation quality of photon and electron beams.

  9. CaSPA - an algorithm for calculation of the size of percolating aggregates

    NASA Astrophysics Data System (ADS)

    Magee, James E.; Dutton, Helen; Siperstein, Flor R.

    2009-09-01

    We present an algorithm (CaSPA) which accounts for the effects of periodic boundary conditions in the calculation of size of percolating aggregated clusters. The algorithm calculates the gyration tensor, allowing for a mixture of infinite (macroscale) and finite (microscale) principle moments. Equilibration of a triblock copolymer system from a disordered initial configuration to a hexagonal phase is examined using the algorithm.

  10. Power analysis and sample size estimation for RNA-Seq differential expression

    PubMed Central

    Ching, Travers; Huang, Sijia

    2014-01-01

    It is crucial for researchers to optimize RNA-seq experimental designs for differential expression detection. Currently, the field lacks general methods to estimate power and sample size for RNA-Seq in complex experimental designs, under the assumption of the negative binomial distribution. We simulate RNA-Seq count data based on parameters estimated from six widely different public data sets (including cell line comparison, tissue comparison, and cancer data sets) and calculate the statistical power in paired and unpaired sample experiments. We comprehensively compare five differential expression analysis packages (DESeq, edgeR, DESeq2, sSeq, and EBSeq) and evaluate their performance by power, receiver operator characteristic (ROC) curves, and other metrics including areas under the curve (AUC), Matthews correlation coefficient (MCC), and F-measures. DESeq2 and edgeR tend to give the best performance in general. Increasing sample size or sequencing depth increases power; however, increasing sample size is more potent than sequencing depth to increase power, especially when the sequencing depth reaches 20 million reads. Long intergenic noncoding RNAs (lincRNA) yields lower power relative to the protein coding mRNAs, given their lower expression level in the same RNA-Seq experiment. On the other hand, paired-sample RNA-Seq significantly enhances the statistical power, confirming the importance of considering the multifactor experimental design. Finally, a local optimal power is achievable for a given budget constraint, and the dominant contributing factor is sample size rather than the sequencing depth. In conclusion, we provide a power analysis tool (http://www2.hawaii.edu/~lgarmire/RNASeqPowerCalculator.htm) that captures the dispersion in the data and can serve as a practical reference under the budget constraint of RNA-Seq experiments. PMID:25246651

  11. Implications of sampling design and sample size for national carbon accounting systems

    PubMed Central

    2011-01-01

    Background: Countries willing to adopt a REDD regime need to establish a national Measurement, Reporting and Verification (MRV) system that provides information on forest carbon stocks and carbon stock changes. Due to the extensive areas covered by forests, the information is generally obtained by sample based surveys. Most operational sampling approaches utilize a combination of earth-observation data and in-situ field assessments as data sources. Results: We compared the cost-efficiency of four different sampling design alternatives (simple random sampling, regression estimators, stratified sampling, 2-phase sampling with regression estimators) that have been proposed in the scope of REDD. Three of the design alternatives provide for a combination of in-situ and earth-observation data. Under different settings of remote sensing coverage, cost per field plot, cost of remote sensing imagery, correlation between attributes quantified in remote sensing and field data, as well as population variability, the percent standard error over total survey cost was calculated. The cost-efficiency of forest carbon stock assessments is driven by the sampling design chosen. Our results indicate that the cost of remote sensing imagery is decisive for the cost-efficiency of a sampling design. The variability of the sample population impairs cost-efficiency, but does not reverse the pattern of cost-efficiency of the individual design alternatives. Conclusions, brief summary and potential implications: Our results clearly indicate that it is important to consider cost-efficiency in the development of forest carbon stock assessments and the selection of remote sensing techniques. The development of MRV-systems for REDD needs to be based on a sound optimization process that compares different data sources and sampling designs with respect to their cost-efficiency. This helps to reduce the uncertainties related with the quantification of carbon stocks and to increase the financial benefits from adopting a REDD regime.

  12. Implications of sampling design and sample size for national carbon accounting systems.

    PubMed

    Köhl, Michael; Lister, Andrew; Scott, Charles T; Baldauf, Thomas; Plugge, Daniel

    2011-11-08

    Countries willing to adopt a REDD regime need to establish a national Measurement, Reporting and Verification (MRV) system that provides information on forest carbon stocks and carbon stock changes. Due to the extensive areas covered by forests the information is generally obtained by sample based surveys. Most operational sampling approaches utilize a combination of earth-observation data and in-situ field assessments as data sources. We compared the cost-efficiency of four different sampling design alternatives (simple random sampling, regression estimators, stratified sampling, 2-phase sampling with regression estimators) that have been proposed in the scope of REDD. Three of the design alternatives provide for a combination of in-situ and earth-observation data. Under different settings of remote sensing coverage, cost per field plot, cost of remote sensing imagery, correlation between attributes quantified in remote sensing and field data, as well as population variability and the percent standard error over total survey cost was calculated. The cost-efficiency of forest carbon stock assessments is driven by the sampling design chosen. Our results indicate that the cost of remote sensing imagery is decisive for the cost-efficiency of a sampling design. The variability of the sample population impairs cost-efficiency, but does not reverse the pattern of cost-efficiency of the individual design alternatives. Our results clearly indicate that it is important to consider cost-efficiency in the development of forest carbon stock assessments and the selection of remote sensing techniques. The development of MRV-systems for REDD need to be based on a sound optimization process that compares different data sources and sampling designs with respect to their cost-efficiency. This helps to reduce the uncertainties related with the quantification of carbon stocks and to increase the financial benefits from adopting a REDD regime.

  13. Free energy calculation from umbrella sampling using Bayesian inference

    NASA Astrophysics Data System (ADS)

    Bernstein, Noam; Stecher, Thomas; Csányi, Gábor

    2013-03-01

    Using simulations to obtain information about the free energy of a system far from its free energy minima requires biased sampling, for example using a series of harmonic umbrella confining potentials to scan over a range of collective variable values. One fundamental distinction between existing methods that use this approach is in what quantities are measured and how they are used: histograms of the system's probability distribution in WHAM, or gradients of the potential of mean force for umbrella integration (UI) and the single-sweep radial basis function (RBF) approach. Here we present a method that reconstructs the free energy from umbrella sampling data using Bayesian inference that effectively uses all available information from multiple umbrella windows. We show that for a single collective variable, our method can use histograms, gradients, or both, to match or outperform WHAM and UI in the accuracy of free energy for a given amount of total simulation time. In higher dimensions, our method can effectively use gradient information to reconstruct the multidimensional free energy surface. We test our method for the alanine polypeptide model system, and show that it is more accurate than a RBF reconstruction for sparse data, and more stable for abundant data.

  14. Communication: Finite size correction in periodic coupled cluster theory calculations of solids.

    PubMed

    Liao, Ke; Grüneis, Andreas

    2016-10-14

    We present a method to correct for finite size errors in coupled cluster theory calculations of solids. The outlined technique shares similarities with electronic structure factor interpolation methods used in quantum Monte Carlo calculations. However, our approach does not require the calculation of density matrices. Furthermore we show that the proposed finite size corrections achieve chemical accuracy in the convergence of second-order Møller-Plesset perturbation and coupled cluster singles and doubles correlation energies per atom for insulating solids with two atomic unit cells using 2 × 2 × 2 and 3 × 3 × 3 k-point meshes only.

  15. Threshold-dependent sample sizes for selenium assessment with stream fish tissue.

    PubMed

    Hitt, Nathaniel P; Smith, David R

    2015-01-01

    Natural resource managers are developing assessments of selenium (Se) contamination in freshwater ecosystems based on fish tissue concentrations. We evaluated the effects of sample size (i.e., number of fish per site) on the probability of correctly detecting mean whole-body Se values above a range of potential management thresholds. We modeled Se concentrations as gamma distributions with shape and scale parameters fitting an empirical mean-to-variance relationship in data from southwestern West Virginia, USA (63 collections, 382 individuals). We used parametric bootstrapping techniques to calculate statistical power as the probability of detecting true mean concentrations up to 3 mg Se/kg above management thresholds ranging from 4 to 8 mg Se/kg. Sample sizes required to achieve 80% power varied as a function of management thresholds and Type I error tolerance (α). Higher thresholds required more samples than lower thresholds because populations were more heterogeneous at higher mean Se levels. For instance, to assess a management threshold of 4 mg Se/kg, a sample of eight fish could detect an increase of approximately 1 mg Se/kg with 80% power (given α=0.05), but this sample size would be unable to detect such an increase from a management threshold of 8 mg Se/kg with more than a coin-flip probability. Increasing α decreased sample size requirements to detect above-threshold mean Se concentrations with 80% power. For instance, at an α-level of 0.05, an 8-fish sample could detect an increase of approximately 2 units above a threshold of 8 mg Se/kg with 80% power, but when α was relaxed to 0.2, this sample size was more sensitive to increasing mean Se concentrations, allowing detection of an increase of approximately 1.2 units with equivalent power. Combining individuals into 2- and 4-fish composite samples for laboratory analysis did not decrease power because the reduced number of laboratory samples was compensated for by increased precision of composites
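
    A parametric-bootstrap power calculation of the kind described can be sketched in a few lines: draw fish-tissue concentrations from a gamma distribution, test the sample mean against the management threshold, and repeat. The constant coefficient of variation used below stands in for the empirical mean-to-variance relationship of the study, so the printed powers are illustrative and will not reproduce the values quoted in the abstract.

    ```python
    import numpy as np
    from scipy import stats

    def se_power(n_fish, threshold, delta, alpha=0.05, cv=0.4,
                 n_sim=10_000, seed=0):
        """Parametric-bootstrap power to detect a true mean of
        (threshold + delta) mg Se/kg with a one-sided one-sample t-test
        against H0: mean <= threshold. Gamma model with constant CV (assumption)."""
        rng = np.random.default_rng(seed)
        mu = threshold + delta
        shape = 1.0 / cv**2                    # gamma shape implied by the CV
        scale = mu / shape
        crit = stats.t.ppf(1 - alpha, df=n_fish - 1)
        hits = 0
        for _ in range(n_sim):
            x = rng.gamma(shape, scale, n_fish)
            t = (x.mean() - threshold) / (x.std(ddof=1) / np.sqrt(n_fish))
            hits += t > crit
        return hits / n_sim

    for thr in (4.0, 8.0):                     # management thresholds, mg Se/kg
        print(thr, se_power(n_fish=8, threshold=thr, delta=1.0))
    ```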

  16. Threshold-dependent sample sizes for selenium assessment with stream fish tissue

    USGS Publications Warehouse

    Hitt, Nathaniel P.; Smith, David R.

    2015-01-01

    Natural resource managers are developing assessments of selenium (Se) contamination in freshwater ecosystems based on fish tissue concentrations. We evaluated the effects of sample size (i.e., number of fish per site) on the probability of correctly detecting mean whole-body Se values above a range of potential management thresholds. We modeled Se concentrations as gamma distributions with shape and scale parameters fitting an empirical mean-to-variance relationship in data from southwestern West Virginia, USA (63 collections, 382 individuals). We used parametric bootstrapping techniques to calculate statistical power as the probability of detecting true mean concentrations up to 3 mg Se/kg above management thresholds ranging from 4 to 8 mg Se/kg. Sample sizes required to achieve 80% power varied as a function of management thresholds and Type I error tolerance (α). Higher thresholds required more samples than lower thresholds because populations were more heterogeneous at higher mean Se levels. For instance, to assess a management threshold of 4 mg Se/kg, a sample of eight fish could detect an increase of approximately 1 mg Se/kg with 80% power (given α = 0.05), but this sample size would be unable to detect such an increase from a management threshold of 8 mg Se/kg with more than a coin-flip probability. Increasing α decreased sample size requirements to detect above-threshold mean Se concentrations with 80% power. For instance, at an α-level of 0.05, an 8-fish sample could detect an increase of approximately 2 units above a threshold of 8 mg Se/kg with 80% power, but when α was relaxed to 0.2, this sample size was more sensitive to increasing mean Se concentrations, allowing detection of an increase of approximately 1.2 units with equivalent power. Combining individuals into 2- and 4-fish composite samples for laboratory analysis did not decrease power because the reduced number of laboratory samples was compensated for by increased

  17. 7 CFR 51.1406 - Sample for grade or size determination.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ..., AND STANDARDS) United States Standards for Grades of Pecans in the Shell 1 Sample for Grade Or Size Determination § 51.1406 Sample for grade or size determination. Each sample shall consist of 100 pecans....

  18. 7 CFR 51.1406 - Sample for grade or size determination.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ..., AND STANDARDS) United States Standards for Grades of Pecans in the Shell 1 Sample for Grade Or Size Determination § 51.1406 Sample for grade or size determination. Each sample shall consist of 100 pecans....

  19. 7 CFR 51.1406 - Sample for grade or size determination.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ..., AND STANDARDS) United States Standards for Grades of Pecans in the Shell 1 Sample for Grade Or Size Determination § 51.1406 Sample for grade or size determination. Each sample shall consist of 100 pecans....

  20. Sample Size Determination for a Three-Arm Equivalence Trial of Poisson and Negative Binomial Responses.

    PubMed

    Chang, Yu-Wei; Tsong, Yi; Zhao, Zhigen

    2016-12-09

    Assessing equivalence or similarity has drawn much attention recently as many drug products have lost or will lose their patents in the next few years, especially certain best-selling biologics. To claim equivalence between the test treatment and the reference treatment when assay sensitivity is well-established from historical data, one has to demonstrate both superiority of the test treatment over placebo and equivalence between the test treatment and the reference treatment. Thus, there is urgency for practitioners to derive a practical way to calculate sample size for a three-arm equivalence trial. The primary endpoints of a clinical trial may not always be continuous, but may be discrete. In this paper, the authors derive power function and discuss sample size requirement for a three-arm equivalence trial with Poisson and negative binomial clinical endpoints. In addition, the authors examine the effect of the dispersion parameter on the power and the sample size by varying its coefficient from small to large. In extensive numerical studies, the authors demonstrate that required sample size heavily depends on the dispersion parameter. Therefore, misusing a Poisson model for negative binomial data may easily lose power up to 20%, depending on the value of the dispersion parameter.

  1. Bayesian sample sizes for exploratory clinical trials comparing multiple experimental treatments with a control.

    PubMed

    Whitehead, John; Cleary, Faye; Turner, Amanda

    2015-05-30

    In this paper, a Bayesian approach is developed for simultaneously comparing multiple experimental treatments with a common control treatment in an exploratory clinical trial. The sample size is set to ensure that, at the end of the study, there will be at least one treatment for which the investigators have a strong belief that it is better than control, or else they have a strong belief that none of the experimental treatments are substantially better than control. This criterion bears a direct relationship with conventional frequentist power requirements, while allowing prior opinion to feature in the analysis with a consequent reduction in sample size. If it is concluded that at least one of the experimental treatments shows promise, then it is envisaged that one or more of these promising treatments will be developed further in a definitive phase III trial. The approach is developed in the context of normally distributed responses sharing a common standard deviation regardless of treatment. To begin with, the standard deviation will be assumed known when the sample size is calculated. The final analysis will not rely upon this assumption, although the intended properties of the design may not be achieved if the anticipated standard deviation turns out to be inappropriate. Methods that formally allow for uncertainty about the standard deviation, expressed in the form of a Bayesian prior, are then explored. Illustrations of the sample sizes computed from the new method are presented, and comparisons are made with frequentist methods devised for the same situation.

  2. Sample size and power determination in joint modeling of longitudinal and survival data.

    PubMed

    Chen, Liddy M; Ibrahim, Joseph G; Chu, Haitao

    2011-08-15

    Owing to the rapid development of biomarkers in clinical trials, joint modeling of longitudinal and survival data has gained its popularity in the recent years because it reduces bias and provides improvements of efficiency in the assessment of treatment effects and other prognostic factors. Although much effort has been put into inferential methods in joint modeling, such as estimation and hypothesis testing, design aspects have not been formally considered. Statistical design, such as sample size and power calculations, is a crucial first step in clinical trials. In this paper, we derive a closed-form sample size formula for estimating the effect of the longitudinal process in joint modeling, and extend Schoenfeld's sample size formula to the joint modeling setting for estimating the overall treatment effect. The sample size formula we develop is quite general, allowing for p-degree polynomial trajectories. The robustness of our model is demonstrated in simulation studies with linear and quadratic trajectories. We discuss the impact of the within-subject variability on power and data collection strategies, such as spacing and frequency of repeated measurements, in order to maximize the power. When the within-subject variability is large, different data collection strategies can influence the power of the study in a significant way. Optimal frequency of repeated measurements also depends on the nature of the trajectory with higher polynomial trajectories and larger measurement error requiring more frequent measurements. Copyright © 2011 John Wiley & Sons, Ltd.
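    For reference, the classical Schoenfeld formula that the paper extends can be sketched as follows. This is the standard proportional-hazards version for the number of events, not the joint-model extension derived in the paper.

```python
from math import ceil, log
from scipy.stats import norm

def schoenfeld_events(hazard_ratio, alloc=0.5, alpha=0.05, power=0.8):
    """Required number of events (classical Schoenfeld formula)."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return ceil(z ** 2 / (alloc * (1 - alloc) * log(hazard_ratio) ** 2))

def total_sample_size(events, event_probability):
    """Inflate events to subjects given the expected event probability."""
    return ceil(events / event_probability)

d = schoenfeld_events(hazard_ratio=0.7)      # about 247 events
print(d, total_sample_size(d, event_probability=0.6))
```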

  3. A comparison of different estimation methods for simulation-based sample size determination in longitudinal studies

    NASA Astrophysics Data System (ADS)

    Bahçecitapar, Melike Kaya

    2017-07-01

    Determining sample size necessary for correct results is a crucial step in the design of longitudinal studies. Simulation-based statistical power calculation is a flexible approach to determine number of subjects and repeated measures of longitudinal studies especially in complex design. Several papers have provided sample size/statistical power calculations for longitudinal studies incorporating data analysis by linear mixed effects models (LMMs). In this study, different estimation methods (methods based on maximum likelihood (ML) and restricted ML) with different iterative algorithms (quasi-Newton and ridge-stabilized Newton-Raphson) in fitting LMMs to generated longitudinal data for simulation-based power calculation are compared. This study examines statistical power of F-test statistics for parameter representing difference in responses over time from two treatment groups in the LMM with a longitudinal covariate. The most common procedures in SAS, such as PROC GLIMMIX using quasi-Newton algorithm and PROC MIXED using ridge-stabilized algorithm are used for analyzing generated longitudinal data in simulation. It is seen that both procedures present similar results. Moreover, it is found that the magnitude of the parameter of interest in the model for simulations affect statistical power calculations in both procedures substantially.
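    A minimal Python sketch of the simulation-based power calculation for a linear mixed model, using statsmodels rather than SAS PROC MIXED/GLIMMIX. The effect sizes, variance components, and design below are illustrative assumptions, not values from the study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

def simulate_power(n_per_group=30, n_times=4, beta_int=0.4, sd_subj=1.0,
                   sd_err=1.0, alpha=0.05, n_sim=200):
    """Simulation-based power for the group-by-time interaction in a
    random-intercept linear mixed model (REML fit via statsmodels)."""
    hits = 0
    for _ in range(n_sim):
        subj = np.repeat(np.arange(2 * n_per_group), n_times)
        group = (subj >= n_per_group).astype(float)
        time = np.tile(np.arange(n_times), 2 * n_per_group)
        b0 = rng.normal(0, sd_subj, 2 * n_per_group)[subj]     # random intercepts
        y = b0 + 0.2 * time + beta_int * group * time + rng.normal(0, sd_err, subj.size)
        df = pd.DataFrame(dict(y=y, time=time, group=group, subj=subj))
        fit = smf.mixedlm("y ~ time * group", df, groups=df["subj"]).fit(reml=True)
        hits += fit.pvalues["time:group"] < alpha
    return hits / n_sim

print(simulate_power())
```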

  4. Treatment Trials for Neonatal Seizures: The Effect of Design on Sample Size

    PubMed Central

    Stevenson, Nathan J.; Boylan, Geraldine B.; Hellström-Westas, Lena; Vanhatalo, Sampsa

    2016-01-01

    Neonatal seizures are common in the neonatal intensive care unit. Clinicians treat these seizures with several anti-epileptic drugs (AEDs) to reduce seizures in a neonate. Current AEDs exhibit sub-optimal efficacy and several randomized control trials (RCT) of novel AEDs are planned. The aim of this study was to measure the influence of trial design on the required sample size of a RCT. We used seizure time courses from 41 term neonates with hypoxic ischaemic encephalopathy to build seizure treatment trial simulations. We used five outcome measures, three AED protocols, eight treatment delays from seizure onset (Td) and four levels of trial AED efficacy to simulate different RCTs. We performed power calculations for each RCT design and analysed the resultant sample size. We also assessed the rate of false positives, or placebo effect, in typical uncontrolled studies. We found that the false positive rate ranged from 5 to 85% of patients depending on RCT design. For controlled trials, the choice of outcome measure had the largest effect on sample size with median differences of 30.7 fold (IQR: 13.7–40.0) across a range of AED protocols, Td and trial AED efficacy (p<0.001). RCTs that compared the trial AED with positive controls required sample sizes with a median fold increase of 3.2 (IQR: 1.9–11.9; p<0.001). Delays in AED administration from seizure onset also increased the required sample size 2.1 fold (IQR: 1.7–2.9; p<0.001). Subgroup analysis showed that RCTs in neonates treated with hypothermia required a median fold increase in sample size of 2.6 (IQR: 2.4–3.0) compared to trials in normothermic neonates (p<0.001). These results show that RCT design has a profound influence on the required sample size. Trials that use a control group, appropriate outcome measure, and control for differences in Td between groups in analysis will be valid and minimise sample size. PMID:27824913

  5. Treatment Trials for Neonatal Seizures: The Effect of Design on Sample Size.

    PubMed

    Stevenson, Nathan J; Boylan, Geraldine B; Hellström-Westas, Lena; Vanhatalo, Sampsa

    2016-01-01

    Neonatal seizures are common in the neonatal intensive care unit. Clinicians treat these seizures with several anti-epileptic drugs (AEDs) to reduce seizures in a neonate. Current AEDs exhibit sub-optimal efficacy and several randomized control trials (RCT) of novel AEDs are planned. The aim of this study was to measure the influence of trial design on the required sample size of a RCT. We used seizure time courses from 41 term neonates with hypoxic ischaemic encephalopathy to build seizure treatment trial simulations. We used five outcome measures, three AED protocols, eight treatment delays from seizure onset (Td) and four levels of trial AED efficacy to simulate different RCTs. We performed power calculations for each RCT design and analysed the resultant sample size. We also assessed the rate of false positives, or placebo effect, in typical uncontrolled studies. We found that the false positive rate ranged from 5 to 85% of patients depending on RCT design. For controlled trials, the choice of outcome measure had the largest effect on sample size with median differences of 30.7 fold (IQR: 13.7-40.0) across a range of AED protocols, Td and trial AED efficacy (p<0.001). RCTs that compared the trial AED with positive controls required sample sizes with a median fold increase of 3.2 (IQR: 1.9-11.9; p<0.001). Delays in AED administration from seizure onset also increased the required sample size 2.1 fold (IQR: 1.7-2.9; p<0.001). Subgroup analysis showed that RCTs in neonates treated with hypothermia required a median fold increase in sample size of 2.6 (IQR: 2.4-3.0) compared to trials in normothermic neonates (p<0.001). These results show that RCT design has a profound influence on the required sample size. Trials that use a control group, appropriate outcome measure, and control for differences in Td between groups in analysis will be valid and minimise sample size.

  6. Sample size and allocation of effort in point count sampling of birds in bottomland hardwood forests

    USGS Publications Warehouse

    Smith, W.P.; Twedt, D.J.; Cooper, R.J.; Wiedenfeld, D.A.; Hamel, P.B.; Ford, R.P.; Ralph, C. John; Sauer, John R.; Droege, Sam

    1995-01-01

    To examine sample size requirements and optimum allocation of effort in point count sampling of bottomland hardwood forests, we computed minimum sample sizes from variation recorded during 82 point counts (May 7-May 16, 1992) from three localities containing three habitat types across three regions of the Mississippi Alluvial Valley (MAV). Also, we estimated the effect of increasing the number of points or visits by comparing results of 150 four-minute point counts obtained from each of four stands on Delta Experimental Forest (DEF) during May 8-May 21, 1991 and May 30-June 12, 1992. For each stand, we obtained bootstrap estimates of mean cumulative number of species each year from all possible combinations of six points and six visits. ANOVA was used to model cumulative species as a function of number of points visited, number of visits to each point, and interaction of points and visits. There was significant variation in numbers of birds and species between regions and localities (nested within region); neither habitat, nor the interaction between region and habitat, was significant. For α = 0.05 and α = 0.10, minimum sample size estimates (per factor level) varied by orders of magnitude depending upon the observed or specified range of desired detectable difference. For observed regional variation, 20 and 40 point counts were required to accommodate variability in total individuals (MSE = 9.28) and species (MSE = 3.79), respectively, whereas ±25 percent of the mean could be achieved with five counts per factor level. Sample size sufficient to detect actual differences of Wood Thrush (Hylocichla mustelina) was >200, whereas the Prothonotary Warbler (Protonotaria citrea) required <10 counts. Differences in mean cumulative species were detected among number of points visited and among number of visits to a point. In the lower MAV, mean cumulative species increased with each added point through five points and with each additional visit through four visits

  7. Using Ancillary Information to Reduce Sample Size in Discovery Sampling and the Effects of Measurement Error

    SciTech Connect

    Axelrod, M

    2005-08-18

    Discovery sampling is a tool used in discovery auditing. The purpose of such an audit is to provide evidence that some (usually large) inventory of items complies with a defined set of criteria by inspecting (or measuring) a representative sample drawn from the inventory. If any of the items in the sample fail compliance (defective items), then the audit has discovered an impropriety, which often triggers some action. However, finding defective items in a sample is an unusual event; auditors expect the inventory to be in compliance because they come to the audit with an "innocent until proven guilty" attitude. As part of their work product, the auditors must provide a confidence statement about the compliance level of the inventory. Clearly the more items they inspect, the greater their confidence, but more inspection means more cost. Audit costs can be purely economic, but in some cases, the cost is political because more inspection means more intrusion, which communicates an attitude of distrust. Thus, auditors have every incentive to minimize the number of items in the sample. Indeed, in some cases the sample size can be specifically limited by a prior agreement or an ongoing policy. Statements of confidence about the results of a discovery sample generally use the method of confidence intervals. After finding no defectives in the sample, the auditors provide a range of values that bracket the number of defective items that could credibly be in the inventory. They also state a level of confidence for the interval, usually 90% or 95%. For example, the auditors might say: "We believe that this inventory of 1,000 items contains no more than 10 defectives with a confidence of 95%". Frequently clients ask their auditors questions such as: How many items do you need to measure to be 95% confident that there are no more than 10 defectives in the entire inventory? Sometimes when the auditors answer with big numbers like "300", their clients balk. They balk because a
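    The zero-defect sample-size calculation underlying such answers can be sketched with the hypergeometric distribution. The population size and defect bound come from the example above; the convention used here (ruling out max_defectives + 1 or more) is one common choice, and other conventions give somewhat different numbers.

```python
from scipy.stats import hypergeom

def discovery_sample_size(population, max_defectives, confidence=0.95):
    """Smallest n such that a sample with zero defectives supports the claim
    'at most max_defectives defectives in the population' at the stated confidence."""
    for n in range(1, population + 1):
        # P(zero defectives in the sample | max_defectives + 1 present in the population)
        p0 = hypergeom(M=population, n=max_defectives + 1, N=n).pmf(0)
        if p0 <= 1 - confidence:
            return n
    return population

print(discovery_sample_size(population=1000, max_defectives=10))  # about 240 items
```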

  8. Analysis of Sample Size, Counting Time, and Plot Size from an Avian Point Count Survey on Hoosier National Forest, Indiana

    Treesearch

    Frank R. Thompson; Monica J. Schwalbach

    1995-01-01

    We report results of a point count survey of breeding birds on Hoosier National Forest in Indiana. We determined sample size requirements to detect differences in means and the effects of count duration and plot size on individual detection rates. Sample size requirements ranged from 100 to >1000 points with Type I and II error rates of <0.1 and 0.2. Sample...

  9. MetSizeR: selecting the optimal sample size for metabolomic studies using an analysis based approach

    PubMed Central

    2013-01-01

    Background Determining sample sizes for metabolomic experiments is important but due to the complexity of these experiments, there are currently no standard methods for sample size estimation in metabolomics. Since pilot studies are rarely done in metabolomics, currently existing sample size estimation approaches which rely on pilot data can not be applied. Results In this article, an analysis based approach called MetSizeR is developed to estimate sample size for metabolomic experiments even when experimental pilot data are not available. The key motivation for MetSizeR is that it considers the type of analysis the researcher intends to use for data analysis when estimating sample size. MetSizeR uses information about the data analysis technique and prior expert knowledge of the metabolomic experiment to simulate pilot data from a statistical model. Permutation based techniques are then applied to the simulated pilot data to estimate the required sample size. Conclusions The MetSizeR methodology, and a publicly available software package which implements the approach, are illustrated through real metabolomic applications. Sample size estimates, informed by the intended statistical analysis technique, and the associated uncertainty are provided. PMID:24261687

  10. 10 CFR Appendix to Part 474 - Sample Petroleum-Equivalent Fuel Economy Calculations

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 10 Energy 3 2010-01-01 2010-01-01 false Sample Petroleum-Equivalent Fuel Economy Calculations..., DEVELOPMENT, AND DEMONSTRATION PROGRAM; PETROLEUM-EQUIVALENT FUEL ECONOMY CALCULATION Pt. 474, App. Appendix to Part 474—Sample Petroleum-Equivalent Fuel Economy Calculations Example 1: An electric vehicle is...

  11. 46 CFR 280.11 - Example of calculation and sample report.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 46 Shipping 8 2010-10-01 2010-10-01 false Example of calculation and sample report. 280.11 Section... VESSELS AND OPERATORS LIMITATIONS ON THE AWARD AND PAYMENT OF OPERATING-DIFFERENTIAL SUBSIDY FOR LINER OPERATORS § 280.11 Example of calculation and sample report. (a) Example of calculation. The provisions of...

  12. 10 CFR Appendix to Part 474 - Sample Petroleum-Equivalent Fuel Economy Calculations

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 10 Energy 3 2011-01-01 2011-01-01 false Sample Petroleum-Equivalent Fuel Economy Calculations..., DEVELOPMENT, AND DEMONSTRATION PROGRAM; PETROLEUM-EQUIVALENT FUEL ECONOMY CALCULATION Pt. 474, App. Appendix to Part 474—Sample Petroleum-Equivalent Fuel Economy Calculations Example 1: An electric vehicle...

  13. 10 CFR Appendix to Part 474 - Sample Petroleum-Equivalent Fuel Economy Calculations

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 10 Energy 3 2014-01-01 2014-01-01 false Sample Petroleum-Equivalent Fuel Economy Calculations..., DEVELOPMENT, AND DEMONSTRATION PROGRAM; PETROLEUM-EQUIVALENT FUEL ECONOMY CALCULATION Pt. 474, App. Appendix to Part 474—Sample Petroleum-Equivalent Fuel Economy Calculations Example 1: An electric vehicle...

  14. 10 CFR Appendix to Part 474 - Sample Petroleum-Equivalent Fuel Economy Calculations

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 10 Energy 3 2013-01-01 2013-01-01 false Sample Petroleum-Equivalent Fuel Economy Calculations..., DEVELOPMENT, AND DEMONSTRATION PROGRAM; PETROLEUM-EQUIVALENT FUEL ECONOMY CALCULATION Pt. 474, App. Appendix to Part 474—Sample Petroleum-Equivalent Fuel Economy Calculations Example 1: An electric vehicle...

  15. 10 CFR Appendix to Part 474 - Sample Petroleum-Equivalent Fuel Economy Calculations

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 10 Energy 3 2012-01-01 2012-01-01 false Sample Petroleum-Equivalent Fuel Economy Calculations..., DEVELOPMENT, AND DEMONSTRATION PROGRAM; PETROLEUM-EQUIVALENT FUEL ECONOMY CALCULATION Pt. 474, App. Appendix to Part 474—Sample Petroleum-Equivalent Fuel Economy Calculations Example 1: An electric vehicle...

  16. Evaluation of pump pulsation in respirable size-selective sampling: part II. Changes in sampling efficiency.

    PubMed

    Lee, Eun Gyung; Lee, Taekhee; Kim, Seung Won; Lee, Larry; Flemmer, Michael M; Harper, Martin

    2014-01-01

    This second, and concluding, part of this study evaluated changes in sampling efficiency of respirable size-selective samplers due to air pulsations generated by the selected personal sampling pumps characterized in Part I (Lee E, Lee L, Möhlmann C et al. Evaluation of pump pulsation in respirable size-selective sampling: Part I. Pulsation measurements. Ann Occup Hyg 2013). Nine particle sizes of monodisperse ammonium fluorescein (from 1 to 9 μm mass median aerodynamic diameter) were generated individually by a vibrating orifice aerosol generator from dilute solutions of fluorescein in aqueous ammonia and then injected into an environmental chamber. To collect these particles, 10-mm nylon cyclones, also known as Dorr-Oliver (DO) cyclones, were used with five medium volumetric flow rate pumps. Those were the Apex IS, HFS513, GilAir5, Elite5, and Basic5 pumps, which were found in Part I to generate pulsations of 5% (the lowest), 25%, 30%, 56%, and 70% (the highest), respectively. GK2.69 cyclones were used with the Legacy [pump pulsation (PP) = 15%] and Elite12 (PP = 41%) pumps for collection at high flows. The DO cyclone was also used to evaluate changes in sampling efficiency due to pulse shape. The HFS513 pump, which generates a more complex pulse shape, was compared to a single sine wave fluctuation generated by a piston. The luminescent intensity of the fluorescein extracted from each sample was measured with a luminescence spectrometer. Sampling efficiencies were obtained by dividing the intensity of the fluorescein extracted from the filter placed in a cyclone with the intensity obtained from the filter used with a sharp-edged reference sampler. Then, sampling efficiency curves were generated using a sigmoid function with three parameters and each sampling efficiency curve was compared to that of the reference cyclone by constructing bias maps. In general, no change in sampling efficiency (bias under ±10%) was observed until pulsations exceeded 25% for the

  17. Evaluation of Pump Pulsation in Respirable Size-Selective Sampling: Part II. Changes in Sampling Efficiency

    PubMed Central

    Lee, Eun Gyung; Lee, Taekhee; Kim, Seung Won; Lee, Larry; Flemmer, Michael M.; Harper, Martin

    2015-01-01

    This second, and concluding, part of this study evaluated changes in sampling efficiency of respirable size-selective samplers due to air pulsations generated by the selected personal sampling pumps characterized in Part I (Lee E, Lee L, Möhlmann C et al. Evaluation of pump pulsation in respirable size-selective sampling: Part I. Pulsation measurements. Ann Occup Hyg 2013). Nine particle sizes of monodisperse ammonium fluorescein (from 1 to 9 μm mass median aerodynamic diameter) were generated individually by a vibrating orifice aerosol generator from dilute solutions of fluorescein in aqueous ammonia and then injected into an environmental chamber. To collect these particles, 10-mm nylon cyclones, also known as Dorr-Oliver (DO) cyclones, were used with five medium volumetric flow rate pumps. Those were the Apex IS, HFS513, GilAir5, Elite5, and Basic5 pumps, which were found in Part I to generate pulsations of 5% (the lowest), 25%, 30%, 56%, and 70% (the highest), respectively. GK2.69 cyclones were used with the Legacy [pump pulsation (PP) = 15%] and Elite12 (PP = 41%) pumps for collection at high flows. The DO cyclone was also used to evaluate changes in sampling efficiency due to pulse shape. The HFS513 pump, which generates a more complex pulse shape, was compared to a single sine wave fluctuation generated by a piston. The luminescent intensity of the fluorescein extracted from each sample was measured with a luminescence spectrometer. Sampling efficiencies were obtained by dividing the intensity of the fluorescein extracted from the filter placed in a cyclone with the intensity obtained from the filter used with a sharp-edged reference sampler. Then, sampling efficiency curves were generated using a sigmoid function with three parameters and each sampling efficiency curve was compared to that of the reference cyclone by constructing bias maps. In general, no change in sampling efficiency (bias under ±10%) was observed until pulsations exceeded 25% for the

  18. A computer module used to calculate the horizontal control surface size of a conceptual aircraft design

    NASA Technical Reports Server (NTRS)

    Sandlin, Doral R.; Swanson, Stephen Mark

    1990-01-01

    The creation of a computer module used to calculate the size of the horizontal control surfaces of a conceptual aircraft design is discussed. The control surface size is determined by first calculating the size needed to rotate the aircraft during takeoff, and, second, by determining if the calculated size is large enough to maintain stability of the aircraft throughout any specified mission. The tail size needed to rotate during takeoff is calculated from a summation of forces about the main landing gear of the aircraft. The stability of the aircraft is determined from a summation of forces about the center of gravity during different phases of the aircraft's flight. Included in the horizontal control surface analysis are: downwash effects on an aft tail, upwash effects on a forward canard, and effects due to flight in close proximity to the ground. Comparisons of production aircraft with numerical models show good accuracy for control surface sizing. A modified canard design verified the accuracy of the module for canard configurations. Added to this stability and control module is a subroutine that determines one of the three design variables, for a stable vectored thrust aircraft. These include forward thrust nozzle position, aft thrust nozzle angle, and forward thrust split.

  19. Sample Size Determination in Shared Frailty Models for Multivariate Time-to-Event Data

    PubMed Central

    Chen, Liddy M.; Ibrahim, Joseph G.; Chu, Haitao

    2014-01-01

    The frailty model is increasingly popular for analyzing multivariate time-to-event data. The most common model is the shared frailty model. Although study design consideration is as important as analysis strategies, sample size determination methodology in studies with multivariate time-to-event data is greatly lacking in the literature. In this paper, we develop a sample size determination method for the shared frailty model to investigate the treatment effect on multivariate event times. We analyzed the data using both a parametric model and a piecewise model with unknown baseline hazard, and compare the empirical power with the calculated power. Last, we discuss the formula for testing the treatment effect on recurrent events. PMID:24697252

  20. Sample size determination in shared frailty models for multivariate time-to-event data.

    PubMed

    Chen, Liddy M; Ibrahim, Joseph G; Chu, Haitao

    2014-01-01

    The frailty model is increasingly popular for analyzing multivariate time-to-event data. The most common model is the shared frailty model. Although study design consideration is as important as analysis strategies, sample size determination methodology in studies with multivariate time-to-event data is greatly lacking in the literature. In this article, we develop a sample size determination method for the shared frailty model to investigate the treatment effect on multivariate event times. We analyzed the data using both a parametric model and a piecewise model with unknown baseline hazard, and compare the empirical power with the calculated power. Last, we discuss the formula for testing the treatment effect on recurrent events.

  1. Evaluation of Sampling Recommendations From the Influenza Virologic Surveillance Right Size Roadmap for Idaho.

    PubMed

    Rosenthal, Mariana; Anderson, Katey; Tengelsen, Leslie; Carter, Kris; Hahn, Christine; Ball, Christopher

    2017-08-24

    The Right Size Roadmap was developed by the Association of Public Health Laboratories and the Centers for Disease Control and Prevention to improve influenza virologic surveillance efficiency. Guidelines were provided to state health departments regarding representativeness and statistical estimates of specimen numbers needed for seasonal influenza situational awareness, rare or novel influenza virus detection, and rare or novel influenza virus investigation. The aim of this study was to compare Roadmap sampling recommendations with Idaho's influenza virologic surveillance to determine implementation feasibility. We calculated the proportion of medically attended influenza-like illness (MA-ILI) from Idaho's influenza-like illness surveillance among outpatients during October 2008 to May 2014, applied data to Roadmap-provided sample size calculators, and compared calculations with actual numbers of specimens tested for influenza by the Idaho Bureau of Laboratories (IBL). We assessed representativeness among patients' tested specimens to census estimates by age, sex, and health district residence. Among outpatients surveilled, Idaho's mean annual proportion of MA-ILI was 2.30% (20,834/905,818) during a 5-year period. Thus, according to Roadmap recommendations, Idaho needs to collect 128 specimens from MA-ILI patients/week for situational awareness, 1496 influenza-positive specimens/week for detection of a rare or novel influenza virus at 0.2% prevalence, and after detection, 478 specimens/week to confirm true prevalence is ≤2% of influenza-positive samples. The mean number of respiratory specimens Idaho tested for influenza/week, excluding the 2009-2010 influenza season, ranged from 6 to 24. Various influenza virus types and subtypes were collected and specimen submission sources were representative in terms of geographic distribution, patient age range and sex, and disease severity. Insufficient numbers of respiratory specimens are submitted to IBL for influenza
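    The detection figure quoted above (1496 influenza-positive specimens/week at 0.2% prevalence) is approximately reproduced by the standard binomial detection formula; the actual Roadmap calculators may apply additional corrections.

```python
from math import ceil, log

def detection_sample_size(prevalence, confidence=0.95):
    """Samples needed for at least `confidence` probability of including one or
    more positives when the target virus circulates at `prevalence`."""
    return ceil(log(1 - confidence) / log(1 - prevalence))

print(detection_sample_size(0.002))   # ~1497/week, close to the 1496 quoted above
```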

  2. Evaluation of Sampling Recommendations From the Influenza Virologic Surveillance Right Size Roadmap for Idaho

    PubMed Central

    2017-01-01

    Background The Right Size Roadmap was developed by the Association of Public Health Laboratories and the Centers for Disease Control and Prevention to improve influenza virologic surveillance efficiency. Guidelines were provided to state health departments regarding representativeness and statistical estimates of specimen numbers needed for seasonal influenza situational awareness, rare or novel influenza virus detection, and rare or novel influenza virus investigation. Objective The aim of this study was to compare Roadmap sampling recommendations with Idaho’s influenza virologic surveillance to determine implementation feasibility. Methods We calculated the proportion of medically attended influenza-like illness (MA-ILI) from Idaho’s influenza-like illness surveillance among outpatients during October 2008 to May 2014, applied data to Roadmap-provided sample size calculators, and compared calculations with actual numbers of specimens tested for influenza by the Idaho Bureau of Laboratories (IBL). We assessed representativeness among patients’ tested specimens to census estimates by age, sex, and health district residence. Results Among outpatients surveilled, Idaho’s mean annual proportion of MA-ILI was 2.30% (20,834/905,818) during a 5-year period. Thus, according to Roadmap recommendations, Idaho needs to collect 128 specimens from MA-ILI patients/week for situational awareness, 1496 influenza-positive specimens/week for detection of a rare or novel influenza virus at 0.2% prevalence, and after detection, 478 specimens/week to confirm true prevalence is ≤2% of influenza-positive samples. The mean number of respiratory specimens Idaho tested for influenza/week, excluding the 2009-2010 influenza season, ranged from 6 to 24. Various influenza virus types and subtypes were collected and specimen submission sources were representative in terms of geographic distribution, patient age range and sex, and disease severity. Conclusions Insufficient numbers of

  3. Dependence of thermal and epithermal neutron self-shielding on sample size and irradiation site

    NASA Astrophysics Data System (ADS)

    Chilian, C.; St-Pierre, J.; Kennedy, G.

    2006-08-01

    Analytical expressions recently developed for calculating thermal and epithermal neutron self-shielding for cylindrical samples used in neutron activation analysis were verified using three different irradiation sites of a SLOWPOKE reactor. The amount of self-shielding varied by less than 10% from one site to another. The self-shielding parameters varied with the size of the cylinder as r(r+h), for h/r ratios from 0.02 to 6.0, even in slightly non-isotropic neutron fields. A practical expression, based on the parameters of the neutron spectrum and the well-known thermal neutron absorption cross-section and the newly defined epithermal neutron absorption cross-section, is proposed for calculating the self-shielding in cylindrical samples.

  4. Optimum sample size allocation to minimize cost or maximize power for the two-sample trimmed mean test.

    PubMed

    Guo, Jiin-Huarng; Luh, Wei-Ming

    2009-05-01

    When planning a study, sample size determination is one of the most important tasks facing the researcher. The size will depend on the purpose of the study, the cost limitations, and the nature of the data. By specifying the standard deviation ratio and/or the sample size ratio, the present study considers the problem of heterogeneous variances and non-normality for Yuen's two-group test and develops sample size formulas to minimize the total cost or maximize the power of the test. For a given power, the sample size allocation ratio can be manipulated so that the proposed formulas can minimize the total cost, the total sample size, or the sum of total sample size and total cost. On the other hand, for a given total cost, the optimum sample size allocation ratio can maximize the statistical power of the test. After the sample size is determined, the present simulation applies Yuen's test to the sample generated, and then the procedure is validated in terms of Type I errors and power. Simulation results show that the proposed formulas can control Type I errors and achieve the desired power under the various conditions specified. Finally, the implications for determining sample sizes in experimental studies and future research are discussed.
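    The classical cost-optimal allocation result underlying such formulas can be sketched as follows. This version is for ordinary means with a normal approximation; Yuen's trimmed-mean procedure replaces the variances with winsorized variances and adjusts the degrees of freedom, so the numbers here are only indicative.

```python
from math import ceil, sqrt
from scipy.stats import norm

def optimal_sizes(sd1, sd2, cost1, cost2, delta, alpha=0.05, power=0.8):
    """Cost-optimal allocation for a two-sample z-test on means.
    Classical result: n1/n2 = (sd1/sd2) * sqrt(cost2/cost1)."""
    r = (sd1 / sd2) * sqrt(cost2 / cost1)
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    n2 = z ** 2 * (sd1 ** 2 / r + sd2 ** 2) / delta ** 2
    return ceil(r * n2), ceil(n2)

# group 1 is twice as variable but half as costly to sample
print(optimal_sizes(sd1=2.0, sd2=1.0, cost1=1.0, cost2=2.0, delta=1.0))
```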

  5. 7 CFR 51.3200 - Samples for grade and size determination.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 2 2011-01-01 2011-01-01 false Samples for grade and size determination. 51.3200..., AND STANDARDS) United States Standards for Grades of Bermuda-Granex-Grano Type Onions Samples for Grade and Size Determination § 51.3200 Samples for grade and size determination. Individual samples...

  6. 7 CFR 51.3200 - Samples for grade and size determination.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Samples for grade and size determination. 51.3200..., AND STANDARDS) United States Standards for Grades of Bermuda-Granex-Grano Type Onions Samples for Grade and Size Determination § 51.3200 Samples for grade and size determination. Individual samples...

  7. Calculating hominin and nonhuman anthropoid femoral head diameter from acetabular size.

    PubMed

    Plavcan, J Michael; Hammond, Ashley S; Ward, Carol V

    2014-11-01

    Femoral head size provides important information on body size in extinct species. Although it is well-known that femoral head size is correlated with acetabular size, the precision with which femoral head size can be estimated from acetabular size has not been quantified. The availability of accurate 3D surface models of fossil acetabular remains opens the possibility of obtaining accurate estimates of femoral head size from even fragmentary fossil remains [Hammond et al.,: Am J Phys Anthropol 150 (2013) 565-578]. Here we evaluate the relationship between spheres fit to surface models of the femoral head and acetabulum of a large sample of extant anthropoid primates. Sphere diameters are tightly correlated and scale isometrically. In spite of significant taxonomic and possibly functional differences in the relationship between femoral head size and acetabulum size, percent prediction errors of estimated femoral head size remain low regardless of the taxonomic composition of the reference sample. We provide estimates of femoral head size for a series of fossil hominins and monkeys. © 2014 Wiley Periodicals, Inc.

  8. Exploring the Dependence of QM/MM Calculations of Enzyme Catalysis on the Size of the QM Region

    PubMed Central

    2016-01-01

    Although QM/MM calculations are the primary current tool for modeling enzymatic reactions, the reliability of such calculations can be limited by the size of the QM region. Thus, we examine in this work the dependence of QM/MM calculations on the size of the QM region, using the reaction of catechol-O-methyl transferase (COMT) as a test case. Our study focuses on the effect of adding residues to the QM region on the activation free energy, obtained with extensive QM/MM sampling. It is found that the sensitivity of the activation barrier to the size of the QM is rather limited, while the dependence of the reaction free energy is somewhat larger. Of course, the results depend on the inclusion of the first solvation shell in the QM regions. For example, the inclusion of the Mg2+ ion can change the activation barrier due to charge transfer effects. However, such effects can easily be included in semiempirical approaches by proper parametrization. Overall, we establish that QM/MM calculations of activation barriers of enzymatic reactions are not highly sensitive to the size of the QM region, beyond the immediate region that describes the reacting atoms. PMID:27552257

  9. Sampling of Stochastic Input Parameters for Rockfall Calculations and for Structural Response Calculations Under Vibratory Ground Motion

    SciTech Connect

    M. Gross

    2004-09-01

    The purpose of this scientific analysis is to define the sampled values of stochastic (random) input parameters for (1) rockfall calculations in the lithophysal and nonlithophysal zones under vibratory ground motions, and (2) structural response calculations for the drip shield and waste package under vibratory ground motions. This analysis supplies: (1) Sampled values of ground motion time history and synthetic fracture pattern for analysis of rockfall in emplacement drifts in nonlithophysal rock (Section 6.3 of ''Drift Degradation Analysis'', BSC 2004 [DIRS 166107]); (2) Sampled values of ground motion time history and rock mechanical properties category for analysis of rockfall in emplacement drifts in lithophysal rock (Section 6.4 of ''Drift Degradation Analysis'', BSC 2004 [DIRS 166107]); (3) Sampled values of ground motion time history and metal to metal and metal to rock friction coefficient for analysis of waste package and drip shield damage to vibratory motion in ''Structural Calculations of Waste Package Exposed to Vibratory Ground Motion'' (BSC 2004 [DIRS 167083]) and in ''Structural Calculations of Drip Shield Exposed to Vibratory Ground Motion'' (BSC 2003 [DIRS 163425]). The sampled values are indices representing the number of ground motion time histories, number of fracture patterns and rock mass properties categories. These indices are translated into actual values within the respective analysis and model reports or calculations. This report identifies the uncertain parameters and documents the sampled values for these parameters. The sampled values are determined by GoldSim V6.04.007 [DIRS 151202] calculations using appropriate distribution types and parameter ranges. No software development or model development was required for these calculations. The calculation of the sampled values allows parameter uncertainty to be incorporated into the rockfall and structural response calculations that support development of the seismic scenario for the

  10. Effects of sample size and sampling frequency on studies of brown bear home ranges and habitat use

    USGS Publications Warehouse

    Arthur, Steve M.; Schwartz, Charles C.

    1999-01-01

    We equipped 9 brown bears (Ursus arctos) on the Kenai Peninsula, Alaska, with collars containing both conventional very-high-frequency (VHF) transmitters and global positioning system (GPS) receivers programmed to determine an animal's position at 5.75-hr intervals. We calculated minimum convex polygon (MCP) and fixed and adaptive kernel home ranges for randomly-selected subsets of the GPS data to examine the effects of sample size on accuracy and precision of home range estimates. We also compared results obtained by weekly aerial radiotracking versus more frequent GPS locations to test for biases in conventional radiotracking data. Home ranges based on the MCP were 20-606 km² (x̄ = 201) for aerial radiotracking data (n = 12-16 locations/bear) and 116-1,505 km² (x̄ = 522) for the complete GPS data sets (n = 245-466 locations/bear). Fixed kernel home ranges were 34-955 km² (x̄ = 224) for radiotracking data and 16-130 km² (x̄ = 60) for the GPS data. Differences between means for radiotracking and GPS data were due primarily to the larger samples provided by the GPS data. Means did not differ between radiotracking data and equivalent-sized subsets of GPS data (P > 0.10). For the MCP, home range area increased and variability decreased asymptotically with number of locations. For the kernel models, both area and variability decreased with increasing sample size. Simulations suggested that the MCP and kernel models required >60 and >80 locations, respectively, for estimates to be both accurate (change in area <1%/additional location) and precise (CV < 50%). Although the radiotracking data appeared unbiased, except for the relationship between area and sample size, these data failed to indicate some areas that likely were important to bears. Our results suggest that the usefulness of conventional radiotracking data may be limited by potential biases and variability due to small samples. Investigators that use home range estimates in statistical tests should consider the
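    The dependence of minimum convex polygon (MCP) estimates on sample size can be reproduced qualitatively with a short sketch. The location data below are synthetic, and SciPy's convex hull stands in for a 100% MCP estimator.

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(7)

def mcp_area(points):
    """Minimum convex polygon (100% MCP) area of 2D location points."""
    return ConvexHull(points).volume   # in 2D, .volume is the enclosed area

# illustrative GPS fixes (km) for one animal; real telemetry data would replace this
locations = rng.normal(0, 5, size=(400, 2))

# mean MCP area and its coefficient of variation versus number of locations subsampled
for n in (15, 60, 120, 400):
    areas = [mcp_area(locations[rng.choice(400, n, replace=False)]) for _ in range(200)]
    print(n, round(np.mean(areas), 1), round(np.std(areas) / np.mean(areas), 2))
```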

  11. Sample size in disease management program evaluation: the challenge of demonstrating a statistically significant reduction in admissions.

    PubMed

    Linden, Ariel

    2008-04-01

    Prior to implementing a disease management (DM) strategy, a needs assessment should be conducted to determine whether sufficient opportunity exists for an intervention to be successful in the given population. A central component of this assessment is a sample size analysis to determine whether the population is of sufficient size to allow the expected program effect to achieve statistical significance. This paper discusses the parameters that comprise the generic sample size formula for independent samples and their interrelationships, followed by modifications for the DM setting. In addition, a table is provided with sample size estimates for various effect sizes. Examples are described in detail along with strategies for overcoming common barriers. Ultimately, conducting these calculations up front will help set appropriate expectations about the ability to demonstrate the success of the intervention.
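    A sketch of the generic two-proportion sample size calculation referred to above, using the standard formula for independent samples; the admission rates used are purely illustrative.

```python
from math import ceil, sqrt
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Standard two-proportion sample size (independent groups, two-sided test)."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar)) +
           z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

# e.g., detecting a drop in annual admission rate from 20% to 15%
print(n_per_group(0.20, 0.15))   # members required per group
```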

  12. On Using a Pilot Sample Variance for Sample Size Determination in the Detection of Differences between Two Means: Power Consideration

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2013-01-01

    The a priori determination of a proper sample size necessary to achieve some specified power is an important problem encountered frequently in practical studies. To establish the needed sample size for a two-sample "t" test, researchers may conduct the power analysis by specifying scientifically important values as the underlying population means…

  13. Sample size allocation for food item radiation monitoring and safety inspection.

    PubMed

    Seto, Mayumi; Uriu, Koichiro

    2015-03-01

    The objective of this study is to identify a procedure for determining sample size allocation for food radiation inspections of more than one food item to minimize the potential risk to consumers of internal radiation exposure. We consider a simplified case of food radiation monitoring and safety inspection in which a risk manager is required to monitor two food items, milk and spinach, in a contaminated area. Three protocols for food radiation monitoring with different sample size allocations were assessed by simulating random sampling and inspections of milk and spinach in a conceptual monitoring site. Distributions of (131)I and radiocesium concentrations were determined in reference to (131)I and radiocesium concentrations detected in Fukushima prefecture, Japan, for March and April 2011. The results of the simulations suggested that a protocol that allocates sample size to milk and spinach based on the estimation of (131)I and radiocesium concentrations using the apparent decay rate constants sequentially calculated from past monitoring data can most effectively minimize the potential risks of internal radiation exposure.

  14. Comparing Server Energy Use and Efficiency Using Small Sample Sizes

    SciTech Connect

    Coles, Henry C.; Qin, Yong; Price, Phillip N.

    2014-11-01

    This report documents a demonstration that compared the energy consumption and efficiency of a limited sample size of server-type IT equipment from different manufacturers by measuring power at the server power supply power cords. The results are specific to the equipment and methods used. However, it is hoped that those responsible for IT equipment selection can use the methods described to choose models that optimize energy use efficiency. The demonstration was conducted in a data center at Lawrence Berkeley National Laboratory in Berkeley, California. It was performed with five servers of similar mechanical and electronic specifications; three from Intel and one each from Dell and Supermicro. Server IT equipment is constructed using commodity components, server manufacturer-designed assemblies, and control systems. Server compute efficiency is constrained by the commodity component specifications and integration requirements. The design freedom, outside of the commodity component constraints, provides room for the manufacturer to offer a product with competitive efficiency that meets market needs at a compelling price. A goal of the demonstration was to compare and quantify the server efficiency for three different brands. The efficiency is defined as the average compute rate (computations per unit of time) divided by the average energy consumption rate. The research team used an industry standard benchmark software package to provide a repeatable software load to obtain the compute rate and provide a variety of power consumption levels. Energy use when the servers were in an idle state (not providing computing work) was also measured. At high server compute loads, all brands, using the same key components (processors and memory), had similar results; therefore, from these results, it could not be concluded that one brand is more efficient than the other brands. The test results show that the power consumption variability caused by the key components as a

  15. Efficiency of whole-body counter for various body size calculated by MCNP5 software.

    PubMed

    Krstic, D; Nikezic, D

    2012-11-01

    The efficiency of a whole-body counter for (137)Cs and (40)K was calculated using the MCNP5 code. The ORNL phantoms of a human body of different body sizes were applied in a sitting position in front of a detector. The aim was to investigate the dependence of efficiency on the body size (age) and the detector position with respect to the body and to estimate the accuracy of real measurements. The calculation work presented here is related to the NaI detector, which is available in the Serbian Whole-body Counter facility in Vinca Institute.

  16. Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes.

    PubMed

    Pang, Herbert; Jung, Sin-Ho

    2013-04-01

    A variety of prediction methods are used to relate high-dimensional genome data with a clinical outcome using a prediction model. Once a prediction model is developed from a data set, it should be validated using a resampling method or an independent data set. Although the existing prediction methods have been intensively evaluated by many investigators, there has not been a comprehensive study investigating the performance of the validation methods, especially with a survival clinical outcome. Understanding the properties of the various validation methods can allow researchers to perform more powerful validations while controlling for type I error. In addition, sample size calculation strategy based on these validation methods is lacking. We conduct extensive simulations to examine the statistical properties of these validation strategies. In both simulations and a real data example, we have found that 10-fold cross-validation with permutation gave the best power while controlling type I error close to the nominal level. Based on this, we have also developed a sample size calculation method that will be used to design a validation study with a user-chosen combination of prediction. Microarray and genome-wide association studies data are used as illustrations. The power calculation method in this presentation can be used for the design of any biomedical studies involving high-dimensional data and survival outcomes.

  17. Sample Size Considerations of Prediction-Validation Methods in High-Dimensional Data for Survival Outcomes

    PubMed Central

    Pang, Herbert; Jung, Sin-Ho

    2013-01-01

    A variety of prediction methods are used to relate high-dimensional genome data with a clinical outcome using a prediction model. Once a prediction model is developed from a data set, it should be validated using a resampling method or an independent data set. Although the existing prediction methods have been intensively evaluated by many investigators, there has not been a comprehensive study investigating the performance of the validation methods, especially with a survival clinical outcome. Understanding the properties of the various validation methods can allow researchers to perform more powerful validations while controlling for type I error. In addition, sample size calculation strategy based on these validation methods is lacking. We conduct extensive simulations to examine the statistical properties of these validation strategies. In both simulations and a real data example, we have found that 10-fold cross-validation with permutation gave the best power while controlling type I error close to the nominal level. Based on this, we have also developed a sample size calculation method that will be used to design a validation study with a user-chosen combination of prediction. Microarray and genome-wide association studies data are used as illustrations. The power calculation method in this presentation can be used for the design of any biomedical studies involving high-dimensional data and survival outcomes. PMID:23471879

  18. A Novel Size-Selective Airborne Particle Sampling Instrument (Wras) for Health Risk Evaluation

    NASA Astrophysics Data System (ADS)

    Gnewuch, H.; Muir, R.; Gorbunov, B.; Priest, N. D.; Jackson, P. R.

    Health risks associated with inhalation of airborne particles are known to be influenced by particle size. A reliable, size-resolving sampler that classifies particles in the range from 2 nm to 30 μm and is suitable for use in the field would be beneficial in investigating these risks. A review of current aerosol samplers highlighted a number of limitations. These could be overcome by combining an inertial deposition impactor with a diffusion collector in a single device. The instrument was designed for analysing mass size distributions. Calibration was carried out using a number of recognised techniques. The instrument was tested in the field by collecting size-resolved samples of lead-containing aerosols present at workplaces in factories producing crystal glass. The mass deposited on each substrate proved sufficient to be detected and measured using atomic absorption spectroscopy. Mass size distributions of lead were produced, and the proportion of lead present in the aerosol nanofraction was calculated; it varied from 10% to 70% by weight.

  19. Determination of the size of a radiation source by the method of calculation of diffraction patterns

    NASA Astrophysics Data System (ADS)

    Tilikin, I. N.; Shelkovenko, T. A.; Pikuz, S. A.; Hammer, D. A.

    2013-07-01

    In traditional X-ray radiography, used for various purposes since the discovery of X-rays, the shadow image of an object under study is formed by differences in X-ray absorption across the parts of the object. The main method that ensures high spatial resolution is point-projection X-ray radiography, i.e., radiography from a point-like, bright radiation source. In projection radiography, the small size of the source is its most important characteristic, since it largely determines the spatial resolution of the method. In this work, radiation from the hot spot of an X-pinch is used as a point source of soft X-rays for radiography with high spatial and temporal resolution. The size of the radiation source differs between setups and configurations. For four different high-current generators, we calculated the sizes of the soft X-ray sources from the X-ray patterns of corresponding objects using Fresnel-Kirchhoff integrals. Our calculations show that the source size lies in the range 0.7-2.8 μm. Determining the size of a radiation source from calculated Fresnel-Kirchhoff integrals makes it possible to measure the size with an accuracy exceeding the diffraction limit, which frequently restricts the resolution of standard methods.

  20. Sample Sizes for Confidence Intervals on the Increase in the Squared Multiple Correlation Coefficient.

    ERIC Educational Resources Information Center

    Algina, James; Moulder, Bradley C.

    2001-01-01

    Studied sample sizes for confidence intervals on the increase in the squared multiple correlation coefficient using simulation. Discusses predictors and actual coverage probability and provides sample-size guidelines for probability coverage to be near the nominal confidence interval. (SLD)

  1. Sample Sizes for Confidence Intervals on the Increase in the Squared Multiple Correlation Coefficient.

    ERIC Educational Resources Information Center

    Algina, James; Moulder, Bradley C.

    2001-01-01

    Studied sample sizes for confidence intervals on the increase in the squared multiple correlation coefficient using simulation. Discusses predictors and actual coverage probability and provides sample-size guidelines for probability coverage to be near the nominal confidence interval. (SLD)

  2. Cavern/Vault Disposal Concepts and Thermal Calculations for Direct Disposal of 37-PWR Size DPCs

    SciTech Connect

    Hardin, Ernest; Hadgu, Teklu; Clayton, Daniel James

    2015-03-01

    This report provides two sets of calculations not presented in previous reports on the technical feasibility of spent nuclear fuel (SNF) disposal directly in dual-purpose canisters (DPCs): 1) thermal calculations for reference disposal concepts using larger 37-PWR size DPC-based waste packages, and 2) analysis and thermal calculations for underground vault-type storage and eventual disposal of DPCs. The reader is referred to the earlier reports (Hardin et al. 2011, 2012, 2013; Hardin and Voegele 2013) for contextual information on DPC direct disposal alternatives.

  3. A contemporary decennial global sample of changing agricultural field sizes

    NASA Astrophysics Data System (ADS)

    White, E.; Roy, D. P.

    2011-12-01

    Over the last several hundred years, agriculture has caused significant human-induced Land Cover Land Use Change (LCLUC), with dramatic cropland expansion and a marked increase in agricultural productivity. The size of agricultural fields is a fundamental descriptor of rural landscapes and provides insight into the drivers of rural LCLUC. Increasing field sizes reduce the number of fields and therefore landscape spatial complexity, with impacts on biodiversity, habitat, soil erosion, plant-pollinator interactions, diffusion of disease pathogens and pests, and loss or degradation of buffers to nutrient, herbicide and pesticide flows. In this study, globally distributed locations with significant contemporary field size change were selected, guided by a global map of agricultural yield and a literature review, to be representative of different driving forces of field size change (associated with technological innovation, socio-economic conditions, government policy, historic patterns of land cover land use, and environmental setting). Seasonal Landsat data acquired on a decadal basis (for 1980, 1990, 2000 and 2010) were used to extract field boundaries; the temporal changes in field size were quantified and their causes discussed.

  4. Development of size reduction equations for calculating power input for grinding pine wood chips using hammer mill

    DOE PAGES

    Naimi, Ladan J.; Collard, Flavien; Bi, Xiaotao; ...

    2016-01-05

    Size reduction is an unavoidable operation for preparing biomass for biofuels and bioproduct conversion. Yet, there is considerable uncertainty in power input requirement and the uniformity of ground biomass. Considerable gains are possible if the required power input for a size reduction ratio is estimated accurately. In this research three well-known mechanistic equations attributed to Rittinger, Kick, and Bond available for predicting energy input for grinding pine wood chips were tested against experimental grinding data. Prior to testing, samples of pine wood chips were conditioned to 11.7% wb, moisture content. The wood chips were successively ground in a hammer mill using screen sizes of 25.4 mm, 10 mm, 6.4 mm, and 3.2 mm. The input power and the flow of material into the grinder were recorded continuously. The recorded power input vs. mean particle size showed that the Rittinger equation had the best fit to the experimental data. The ground particle sizes were 4 to 7 times smaller than the size of installed screen. Geometric mean size of particles were calculated using two methods (1) Tyler sieves and using particle size analysis and (2) Sauter mean diameter calculated from the ratio of volume to surface that were estimated from measured length and width. The two mean diameters agreed well, pointing to the fact that either mechanical sieving or particle imaging can be used to characterize particle size. In conclusion, specific energy input to the hammer mill increased from 1.4 kWh t–1 (5.2 J g–1) for large 25.1-mm screen to 25 kWh t–1 (90.4 J g–1) for small 3.2-mm screen.

  5. Development of size reduction equations for calculating power input for grinding pine wood chips using hammer mill

    SciTech Connect

    Naimi, Ladan J.; Collard, Flavien; Bi, Xiaotao; Lim, C. Jim; Sokhansanj, Shahab

    2016-01-05

    Size reduction is an unavoidable operation for preparing biomass for biofuels and bioproduct conversion. Yet, there is considerable uncertainty in power input requirement and the uniformity of ground biomass. Considerable gains are possible if the required power input for a size reduction ratio is estimated accurately. In this research, three well-known mechanistic equations attributed to Rittinger, Kick, and Bond, available for predicting energy input for grinding pine wood chips, were tested against experimental grinding data. Prior to testing, samples of pine wood chips were conditioned to 11.7% (wb) moisture content. The wood chips were successively ground in a hammer mill using screen sizes of 25.4 mm, 10 mm, 6.4 mm, and 3.2 mm. The input power and the flow of material into the grinder were recorded continuously. The recorded power input vs. mean particle size showed that the Rittinger equation had the best fit to the experimental data. The ground particle sizes were 4 to 7 times smaller than the size of the installed screen. Geometric mean sizes of particles were calculated using two methods: (1) Tyler sieves and particle size analysis and (2) Sauter mean diameter calculated from the ratio of volume to surface, estimated from measured length and width. The two mean diameters agreed well, pointing to the fact that either mechanical sieving or particle imaging can be used to characterize particle size. In conclusion, specific energy input to the hammer mill increased from 1.4 kWh t–1 (5.2 J g–1) for the large 25.4-mm screen to 25 kWh t–1 (90.4 J g–1) for the small 3.2-mm screen.

  6. Increasing the sample size at interim for a two-sample experiment without Type I error inflation.

    PubMed

    Dunnigan, Keith; King, Dennis W

    2010-01-01

    For the case of a one-sample experiment with known variance σ² = 1, it has been shown that at interim analysis the sample size (SS) may be increased by any arbitrary amount provided: (1) the conditional power (CP) at interim is ≥ 50% and (2) there can be no decision to decrease the SS (stop the trial early). In this paper we verify this result for the case of a two-sample experiment with proportional SS in the treatment groups and an arbitrary common variance. Numerous authors have presented the formula for the CP at interim for a two-sample test with equal SS in the treatment groups and an arbitrary common variance, for both the one- and two-sided hypothesis tests. In this paper we derive the corresponding formula for the case of unequal, but proportional SS in the treatment groups for both one-sided superiority and two-sided hypothesis tests. Finally, we present an SAS macro for doing this calculation and provide a worked-out hypothetical example. In discussion we note that this type of trial design trades the ability to stop early (for lack of efficacy) for the elimination of the Type I error penalty. The loss of early stopping requires that such a design employ a data monitoring committee, blinding of the sponsor to the interim calculations, and pre-planning of how much and under what conditions to increase the SS, and that all of this be formally written into an interim analysis plan before the start of the study.
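
    The following is a minimal sketch of the generic "current trend" conditional power for a one-sided z-test at an interim analysis, which underlies the ≥ 50% CP condition discussed above. It is not the paper's SAS macro and does not implement the unequal-allocation formula the authors derive; the information fraction and interim z-value are assumed inputs.

```python
from scipy.stats import norm

def conditional_power(z_interim, info_frac, alpha=0.025):
    """Conditional power of a one-sided z-test under the 'current trend'
    assumption: the drift observed so far (z_interim / sqrt(t)) is assumed
    to continue for the remainder of the trial."""
    t = info_frac
    z_alpha = norm.ppf(1 - alpha)
    drift = z_interim / t ** 0.5                    # estimated full-information drift
    mean_final = t ** 0.5 * z_interim + (1 - t) * drift
    return 1 - norm.cdf((z_alpha - mean_final) / (1 - t) ** 0.5)

# Example: interim z = 1.5 observed at half the planned information.
print(f"CP = {conditional_power(1.5, 0.5):.3f}")
```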

  7. 7 CFR 51.1548 - Samples for grade and size determination.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 2 2013-01-01 2013-01-01 false Samples for grade and size determination. 51.1548... PRODUCTS 1,2 (INSPECTION, CERTIFICATION, AND STANDARDS) United States Standards for Grades of Potatoes 1 Samples for Grade and Size Determination § 51.1548 Samples for grade and size determination. Individual...

  8. 7 CFR 51.3200 - Samples for grade and size determination.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 2 2014-01-01 2014-01-01 false Samples for grade and size determination. 51.3200... PRODUCTS 1 2 (INSPECTION, CERTIFICATION, AND STANDARDS) United States Standards for Grades of Bermuda-Granex-Grano Type Onions Samples for Grade and Size Determination § 51.3200 Samples for grade and size...

  9. 7 CFR 51.3200 - Samples for grade and size determination.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 2 2013-01-01 2013-01-01 false Samples for grade and size determination. 51.3200... PRODUCTS 1,2 (INSPECTION, CERTIFICATION, AND STANDARDS) United States Standards for Grades of Bermuda-Granex-Grano Type Onions Samples for Grade and Size Determination § 51.3200 Samples for grade and size...

  10. 7 CFR 51.2838 - Samples for grade and size determination.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 2 2011-01-01 2011-01-01 false Samples for grade and size determination. 51.2838..., AND STANDARDS) United States Standards for Grades of Onions (Other Than Bermuda-Granex-Grano and Creole Types) Samples for Grade and Size Determination § 51.2838 Samples for grade and size determination...

  11. 7 CFR 51.2838 - Samples for grade and size determination.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Samples for grade and size determination. 51.2838..., AND STANDARDS) United States Standards for Grades of Onions (Other Than Bermuda-Granex-Grano and Creole Types) Samples for Grade and Size Determination § 51.2838 Samples for grade and size determination...

  12. 7 CFR 51.1548 - Samples for grade and size determination.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 2 2014-01-01 2014-01-01 false Samples for grade and size determination. 51.1548... PRODUCTS 1 2 (INSPECTION, CERTIFICATION, AND STANDARDS) United States Standards for Grades of Potatoes 1 Samples for Grade and Size Determination § 51.1548 Samples for grade and size determination. Individual...

  13. Sample Size Determination for Regression Models Using Monte Carlo Methods in R

    ERIC Educational Resources Information Center

    Beaujean, A. Alexander

    2014-01-01

    A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…

  14. Sample Size Determination for Regression Models Using Monte Carlo Methods in R

    ERIC Educational Resources Information Center

    Beaujean, A. Alexander

    2014-01-01

    A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…

  15. Air and smear sample calculational tool for Fluor Hanford Radiological control

    SciTech Connect

    BAUMANN, B.L.

    2003-09-24

    A spreadsheet calculation tool was developed to automate the calculations performed for determining the concentration of airborne radioactivity and for smear counting, as outlined in HNF-13536, Section 5.2.7, "Analyzing Air and Smear Samples". This document reports on the design and testing of the calculation tool.

  16. Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses

    PubMed Central

    Lanfear, Robert; Hua, Xia; Warren, Dan L.

    2016-01-01

    Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
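
    For context, the sketch below computes the standard autocorrelation-based ESS for a scalar MCMC parameter, i.e., the quantity the rule of thumb of 200 refers to. The tree-topology ESS methods developed in the paper are different; the truncation rule and the AR(1) example chain are illustrative assumptions.

```python
import numpy as np

def effective_sample_size(x):
    """Standard ESS for a scalar MCMC trace: N / (1 + 2 * sum of positive-lag
    autocorrelations), truncating the sum when the autocorrelation first drops
    below zero. This is the scalar-parameter ESS, not the tree-topology ESS
    developed in the paper."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[n - 1:] / (np.arange(n, 0, -1) * x.var())
    rho_sum = 0.0
    for rho in acf[1:]:
        if rho < 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

# Example: an AR(1) chain with strong autocorrelation has an ESS far below N.
rng = np.random.default_rng(0)
chain = np.zeros(5000)
for i in range(1, len(chain)):
    chain[i] = 0.9 * chain[i - 1] + rng.normal()
print(f"ESS = {effective_sample_size(chain):.0f} out of {len(chain)} samples")
```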

  17. 40 CFR Appendix III to Part 600 - Sample Fuel Economy Label Calculation

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 29 2010-07-01 2010-07-01 false Sample Fuel Economy Label Calculation...) ENERGY POLICY FUEL ECONOMY AND CARBON-RELATED EXHAUST EMISSIONS OF MOTOR VEHICLES Pt. 600, App. III Appendix III to Part 600—Sample Fuel Economy Label Calculation Suppose that a manufacturer called...

  18. 7 CFR 51.308 - Methods of sampling and calculation of percentages.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... Grades of Apples Methods of Sampling and Calculation of Percentages § 51.308 Methods of sampling and... weigh ten pounds or less, or in any container where the minimum diameter of the smallest apple does not vary more than 1/2 inch from the minimum diameter of the largest apple, percentages shall be calculated...

  19. 7 CFR 51.308 - Methods of sampling and calculation of percentages.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... Grades of Apples Methods of Sampling and Calculation of Percentages § 51.308 Methods of sampling and... weigh ten pounds or less, or in any container where the minimum diameter of the smallest apple does not vary more than 1/2 inch from the minimum diameter of the largest apple, percentages shall be calculated...

  20. Two Test Items to Explore High School Students' Beliefs of Sample Size When Sampling from Large Populations

    ERIC Educational Resources Information Center

    Bill, Anthony; Henderson, Sally; Penman, John

    2010-01-01

    Two test items that examined high school students' beliefs of sample size for large populations using the context of opinion polls conducted prior to national and state elections were developed. A trial of the two items with 21 male and 33 female Year 9 students examined their naive understanding of sample size: over half of students chose a…

  1. Sample size and scene identification (cloud) - Effect on albedo

    NASA Technical Reports Server (NTRS)

    Vemury, S. K.; Stowe, L.; Jacobowitz, H.

    1984-01-01

    Scan channels on the Nimbus 7 Earth Radiation Budget instrument sample radiances from underlying earth scenes at a number of incident and scattering angles. A sampling excess toward measurements at large satellite zenith angles is noted. Also, at large satellite zenith angles, the present scheme for scene selection causes many observations to be classified as cloud, resulting in higher flux averages. Thus the combined effect of sampling bias and scene identification errors is to overestimate the computed albedo. It is shown, using a process of successive thresholding, that observations with satellite zenith angles greater than 50-60 deg lead to incorrect cloud identification. Elimination of these observations has reduced the albedo from 32.2 to 28.8 percent. This reduction closely matches, in both magnitude and direction, the discrepancy between the albedos derived from the scanner and the wide-field-of-view channels.

  2. 7 CFR 51.1548 - Samples for grade and size determination.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... opened to provide at least a 20-pound sample. The number of such individual samples drawn for grade and... 7 Agriculture 2 2010-01-01 2010-01-01 false Samples for grade and size determination. 51.1548..., AND STANDARDS) United States Standards for Grades of Potatoes 1 Samples for Grade and Size...

  3. 7 CFR 51.1548 - Samples for grade and size determination.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... opened to provide at least a 20-pound sample. The number of such individual samples drawn for grade and... 7 Agriculture 2 2011-01-01 2011-01-01 false Samples for grade and size determination. 51.1548..., AND STANDARDS) United States Standards for Grades of Potatoes 1 Samples for Grade and Size...

  4. Precision of Student Growth Percentiles with Small Sample Sizes

    ERIC Educational Resources Information Center

    Culbertson, Michael J.

    2016-01-01

    States in the Regional Educational Laboratory (REL) Central region serve a largely rural population with many states enrolling fewer than 350,000 students. A common challenge identified among REL Central educators is identifying appropriate methods for analyzing data with small samples of students. In particular, members of the REL Central…

  5. 7 CFR 201.43 - Size of sample.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... units. Coated seed for germination test only shall consist of at least 1,000 seed units. [10 FR 9950..., Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE (CONTINUED) FEDERAL SEED ACT FEDERAL SEED ACT... of samples of agricultural seed, vegetable seed and screenings to be submitted for analysis, test, or...

  6. 7 CFR 201.43 - Size of sample.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... units. Coated seed for germination test only shall consist of at least 1,000 seed units. ..., Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE (CONTINUED) FEDERAL SEED ACT FEDERAL SEED ACT... of samples of agricultural seed, vegetable seed and screenings to be submitted for analysis, test, or...

  7. 7 CFR 201.43 - Size of sample.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... units. Coated seed for germination test only shall consist of at least 1,000 seed units. ..., Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE (CONTINUED) FEDERAL SEED ACT FEDERAL SEED ACT... of samples of agricultural seed, vegetable seed and screenings to be submitted for analysis, test, or...

  8. 7 CFR 201.43 - Size of sample.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... units. Coated seed for germination test only shall consist of at least 1,000 seed units. [10 FR 9950..., Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE (CONTINUED) FEDERAL SEED ACT FEDERAL SEED ACT... of samples of agricultural seed, vegetable seed and screenings to be submitted for analysis, test, or...

  9. 7 CFR 201.43 - Size of sample.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... units. Coated seed for germination test only shall consist of at least 1,000 seed units. [10 FR 9950..., Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE (CONTINUED) FEDERAL SEED ACT FEDERAL SEED ACT... of samples of agricultural seed, vegetable seed and screenings to be submitted for analysis, test, or...

  10. Utility of Inferential Norming with Smaller Sample Sizes

    ERIC Educational Resources Information Center

    Zhu, Jianjun; Chen, Hsin-Yi

    2011-01-01

    We examined the utility of inferential norming using small samples drawn from the larger "Wechsler Intelligence Scales for Children-Fourth Edition" (WISC-IV) standardization data set. The quality of the norms was estimated with multiple indexes such as polynomial curve fit, percentage of cases receiving the same score, average absolute…

  11. Utility of Inferential Norming with Smaller Sample Sizes

    ERIC Educational Resources Information Center

    Zhu, Jianjun; Chen, Hsin-Yi

    2011-01-01

    We examined the utility of inferential norming using small samples drawn from the larger "Wechsler Intelligence Scales for Children-Fourth Edition" (WISC-IV) standardization data set. The quality of the norms was estimated with multiple indexes such as polynomial curve fit, percentage of cases receiving the same score, average absolute…

  12. Effects of grid size and aggregation on regional scale landuse scenario calculations using SVAT schemes

    NASA Astrophysics Data System (ADS)

    Bormann, H.

    2006-09-01

    This paper analyses the effect of spatial input data resolution on the simulated effects of regional-scale landuse scenarios using the TOPLATS model. A 25 m resolution data set of the central German Dill catchment (693 km2) and three different landuse scenarios are used for the investigation. Landuse scenarios in this study are field size scenarios: for each target field size (0.5 ha, 1.5 ha and 5.0 ha), landuse is determined by optimising the economic outcome of agriculturally used areas and forest. After an aggregation of the digital elevation model, soil map, current landuse and landuse scenarios to 50 m, 75 m, 100 m, 150 m, 200 m, 300 m, 500 m, 1 km and 2 km, water balances and water flow components for a 20-year time period are calculated for the entire Dill catchment as well as for 3 subcatchments without any recalibration. Additionally, water balances based on the three landuse scenarios as well as changes between current conditions and scenarios are calculated. The study reveals that both model performance measures (for current landuse) and water balances (for current landuse and landuse scenarios) remain almost constant for most of the aggregation steps for all investigated catchments. Small deviations are detected at resolutions of 50 m to 500 m, while significant differences occur at resolutions of 1 km and 2 km, which can be explained by changes in the statistics of the input data. Calculating the scenario effects based on increasing grid sizes yields similar results. However, the change effects react more sensitively to data aggregation than simple water balance calculations. Increasing deviations between simulations based on small grid sizes and simulations using grid sizes of 300 m and more are observed. Summarizing, this study indicates that an aggregation of input data for the calculation of regional water balances using TOPLATS-type models does not lead to significant errors up to a resolution of 500 m. Focusing on scenario

  13. Error and bias in under-5 mortality estimates derived from birth histories with small sample sizes.

    PubMed

    Dwyer-Lindgren, Laura; Gakidou, Emmanuela; Flaxman, Abraham; Wang, Haidong

    2013-07-26

    Estimates of under-5 mortality at the national level for countries without high-quality vital registration systems are routinely derived from birth history data in censuses and surveys. Subnational or stratified analyses of under-5 mortality could also be valuable, but the usefulness of under-5 mortality estimates derived from birth histories from relatively small samples of women is not known. We aim to assess the magnitude and direction of error that can be expected for estimates derived from birth histories with small samples of women using various analysis methods. We perform a data-based simulation study using Demographic and Health Surveys. Surveys are treated as populations with known under-5 mortality, and samples of women are drawn from each population to mimic surveys with small sample sizes. A variety of methods for analyzing complete birth histories and one method for analyzing summary birth histories are used on these samples, and the results are compared to corresponding true under-5 mortality. We quantify the expected magnitude and direction of error by calculating the mean error, mean relative error, mean absolute error, and mean absolute relative error. All methods are prone to high levels of error at the smallest sample size with no method performing better than 73% error on average when the sample contains 10 women. There is a high degree of variation in performance between the methods at each sample size, with methods that contain considerable pooling of information generally performing better overall. Additional stratified analyses suggest that performance varies for most methods according to the true level of mortality and the time prior to survey. This is particularly true of the summary birth history method as well as complete birth history methods that contain considerable pooling of information across time. Performance of all birth history analysis methods is extremely poor when used on very small samples of women, both in terms of
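
    A minimal sketch of the four error summaries used in the abstract (mean error, mean relative error, mean absolute error, and mean absolute relative error), applied to hypothetical estimate/truth pairs rather than the survey data:

```python
import numpy as np

def error_summaries(estimate, truth):
    """Four error summaries for comparing small-sample estimates against the
    'true' values from the full survey."""
    estimate, truth = np.asarray(estimate, float), np.asarray(truth, float)
    err = estimate - truth
    rel = err / truth
    return {
        "mean_error": err.mean(),                        # signed bias
        "mean_relative_error": rel.mean(),               # signed, scale-free
        "mean_absolute_error": np.abs(err).mean(),       # magnitude of error
        "mean_absolute_relative_error": np.abs(rel).mean(),
    }

# Hypothetical example: estimated vs. true under-5 mortality (deaths per 1000 births).
print(error_summaries([55, 80, 120], [60, 75, 110]))
```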

  14. Progression of MRI markers in cerebral small vessel disease: Sample size considerations for clinical trials

    PubMed Central

    Zeestraten, Eva; Lambert, Christian; Chis Ster, Irina; Williams, Owen A; Lawrence, Andrew J; Patel, Bhavini; MacKinnon, Andrew D; Barrick, Thomas R; Markus, Hugh S

    2016-01-01

    Detecting treatment efficacy using cognitive change in trials of cerebral small vessel disease (SVD) has been challenging, making the use of surrogate markers such as magnetic resonance imaging (MRI) attractive. We determined the sensitivity of MRI to change in SVD and used this information to calculate sample size estimates for a clinical trial. Data from the prospective SCANS (St George’s Cognition and Neuroimaging in Stroke) study of patients with symptomatic lacunar stroke and confluent leukoaraiosis was used (n = 121). Ninety-nine subjects returned at one or more time points. Multimodal MRI and neuropsychologic testing was performed annually over 3 years. We evaluated the change in brain volume, T2 white matter hyperintensity (WMH) volume, lacunes, and white matter damage on diffusion tensor imaging (DTI). Over 3 years, change was detectable in all MRI markers but not in cognitive measures. WMH volume and DTI parameters were most sensitive to change and therefore had the smallest sample size estimates. MRI markers, particularly WMH volume and DTI parameters, are more sensitive to SVD progression over short time periods than cognition. These markers could significantly reduce the size of trials to screen treatments for efficacy in SVD, although further validation from longitudinal and intervention studies is required. PMID:26036939

  15. Sample size needs for characterizing pollutant concentrations in highway runoff

    SciTech Connect

    Thomson, N.R.; Mostrenko, I.; McBean, E.A.; Snodgrass, W.

    1997-10-01

    The identification of environmentally acceptable and cost-effective technologies for the control of highway storm-water runoff is of significant concern throughout North America. The environmental impact of storm-water runoff, in particular at highway crossings over small surface waterbodies is of sufficient concern to require examination of the detrimental impacts of highway runoff on the flora and fauna. The number of samples necessary for characterization of highway storm-water runoff concentrations is examined. Using extensive field monitoring results available from Minnesota, the statistical modeling results demonstrate that approximately 15 to 20 samples are required to provide reasonable estimates of the mean concentrations of runoff events for total suspended solids, total dissolved solids, total organic carbon, and zinc.
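
    As a back-of-the-envelope check on the 15 to 20 sample figure, the sketch below applies the standard sample-size formula for estimating a mean to within a relative margin of error; the coefficient of variation and target precision are assumed values for illustration, not the Minnesota monitoring results.

```python
import math
from scipy.stats import norm

def n_for_mean(cv, rel_error, confidence=0.95):
    """Number of runoff events needed so the sample mean concentration is within
    +/- rel_error (as a fraction of the mean) of the true mean, assuming
    approximate normality of the sample mean."""
    z = norm.ppf(1 - (1 - confidence) / 2)
    return math.ceil((z * cv / rel_error) ** 2)

# Assumed coefficient of variation of event-mean concentrations and target
# precision (hypothetical values): CV = 0.75, precision +/- 35%.
print(n_for_mean(cv=0.75, rel_error=0.35))   # ~18 events, in the 15-20 range
```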

  16. ANOVA with random sample sizes: An application to a Brazilian database on cancer registries

    NASA Astrophysics Data System (ADS)

    Nunes, Célia; Capistrano, Gilberto; Ferreira, Dário; Ferreira, Sandra S.

    2013-10-01

    We apply our results on random-sample-size ANOVA to a Brazilian database on cancer registries. The sample sizes are treated as realizations of random variables. The interest of this approach lies in avoiding the false rejections obtained when using classical fixed-sample-size F-tests.

  17. Sample-size considerations and strategies for linkage analysis in autosomal recessive disorders.

    PubMed Central

    Wong, F L; Cantor, R M; Rotter, J I

    1986-01-01

    The opportunity raised by recombinant DNA technology to develop a linkage marker panel that spans the human genome requires cost-efficient strategies for its optimal utilization. Questions arise as to whether it is more cost-effective to convert a dimorphic restriction enzyme marker system into a highly polymorphic system or, instead, to increase the number of families studied, simply using the available marker alleles. The choice is highly dependent on the population available for study, and, therefore, an examination of the informational content of the various family structures is important to obtain the most informative data. To guide such decisions, we have developed tables of the average sample number of families required to detect linkage for autosomal recessive disorders under single backcross and under "fully informative" matings. The latter cross consists of a marker locus with highly polymorphic codominant alleles such that the parental marker genotypes can be uniquely distinguished. The sampling scheme considers families with unaffected parents of known mating types ascertained via affected offspring, for sibship sizes ranging from two to four and various numbers of affected individuals. The sample-size tables, calculated for various values of the recombination fractions and lod scores, may serve as a guide to a more efficient application of the restriction fragment length polymorphism technology to sequential linkage analysis. PMID:3019130

  18. Accelerating potential of mean force calculations for lipid membrane permeation: System size, reaction coordinate, solute-solute distance, and cutoffs

    NASA Astrophysics Data System (ADS)

    Nitschke, Naomi; Atkovska, Kalina; Hub, Jochen S.

    2016-09-01

    Molecular dynamics simulations are capable of predicting the permeability of lipid membranes for drug-like solutes, but the calculations have remained prohibitively expensive for high-throughput studies. Here, we analyze simple measures for accelerating potential of mean force (PMF) calculations of membrane permeation, namely, (i) using smaller simulation systems, (ii) simulating multiple solutes per system, and (iii) using shorter cutoffs for the Lennard-Jones interactions. We find that PMFs for membrane permeation are remarkably robust against alterations of such parameters, suggesting that accurate PMF calculations are possible at strongly reduced computational cost. In addition, we evaluated the influence of the definition of the membrane center of mass (COM), used to define the transmembrane reaction coordinate. Membrane-COM definitions based on all lipid atoms lead to artifacts due to undulations and, consequently, to PMFs dependent on membrane size. In contrast, COM definitions based on a cylinder around the solute lead to size-independent PMFs, down to systems of only 16 lipids per monolayer. In summary, compared to popular setups that simulate a single solute in a membrane of 128 lipids with a Lennard-Jones cutoff of 1.2 nm, the measures applied here yield a speedup in sampling by factor of ˜40, without reducing the accuracy of the calculated PMF.

  19. Finite-sample-size effects on convection in mushy layers

    NASA Astrophysics Data System (ADS)

    Zhong, J.-Q.; Fragoso, A. T.; Wells, A. J.; Wettlaufer, J. S.

    2012-08-01

    We report theoretical and experimental investigations of the flow instability responsible for the mushy-layer mode of convection and the formation of chimneys, drainage channels devoid of solid, during steady-state solidification of aqueous ammonium chloride. Under certain growth conditions a state of steady mushy-layer growth with no flow is unstable to the onset of convection, resulting in the formation of chimneys. We present regime diagrams to quantify the state of the flow as a function of the initial liquid concentration, the porous-medium Rayleigh number, and the sample width. For a given liquid concentration, increasing both the porous-medium Rayleigh number and the sample width caused the system to change from a stable state of no flow to a different state with the formation of chimneys. Decreasing the concentration ratio destabilized the system and promoted the formation of chimneys. As the initial liquid concentration increased, onset of convection and formation of chimneys occurred at larger values of the porous-medium Rayleigh number, but the critical cell widths for chimney formation are far less sensitive to the liquid concentration. At the highest liquid concentration, the mushy-layer mode of convection did not occur in the experiment. The formation of multiple chimneys and the morphological transitions between these states are discussed. The experimental results are interpreted in terms of a previous theoretical analysis of finite amplitude convection with chimneys, with a single value of the mushy-layer permeability consistent with the liquid concentrations considered in this study.

  20. Presentation of coefficient of variation for bioequivalence sample-size calculation

    PubMed

    Lee, Yi Lin; Mak, Wen Yao; Looi, Irene; Wong, Jia Woei; Yuen, Kah Hay

    2017-03-03

    The current study aimed to further contribute information on intrasubject coefficient of variation (CV) from 43 bioequivalence studies conducted by our center. Consistent with Yuen et al. (2001), the current work also attempted to evaluate the effect of different parameters (AUC0-t, AUC0-∞, and Cmax) used in the estimation of the study power. Furthermore, we estimated the number of subjects required for each study by looking at the values of intrasubject CV of AUC0-∞ and also took into consideration the minimum sample-size requirement set by the US FDA. A total of 37 immediate-release and 6 extended-release formulations from 28 different active pharmaceutical ingredients (APIs) were evaluated. Out of the total number of studies conducted, 10 studies did not achieve satisfactory statistical power on two or more parameters; 4 studies consistently scored poorly across all three parameters. In general, intrasubject CV values calculated from Cmax were more variable than those from either AUC0-t or AUC0-∞. Twenty of the 43 studies did not achieve more than 80% power when the value was calculated from Cmax, compared to only 11 (AUC0-∞) and 8 (AUC0-t) studies. This finding is consistent with Steinijans et al. (1995) [2] and Yuen et al. (2001) [3]. In conclusion, the CV values obtained from AUC0-t and AUC0-∞ were similar, while those derived from Cmax were consistently more variable. Hence, CV derived from AUC instead of Cmax should be used in sample-size calculation to achieve a sufficient, yet practical, test power.
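
    For illustration, the sketch below shows one common normal-approximation formula for sizing a two-period crossover bioequivalence study from an intrasubject CV, assuming a true geometric mean ratio of 1 and the usual 0.80-1.25 acceptance limits. It is a rough planning approximation, not the calculation used in the study, although it reflects the US FDA minimum of 12 subjects mentioned above.

```python
import math
from scipy.stats import norm

def n_bioequivalence(cv, power=0.8, alpha=0.05, limit=1.25):
    """Approximate total crossover sample size for average bioequivalence,
    assuming the true geometric mean ratio is 1 and using the normal
    approximation (exact TOST power calculations give somewhat larger numbers)."""
    sigma_w2 = math.log(1 + cv ** 2)          # log-scale intrasubject variance
    z_a = norm.ppf(1 - alpha)
    z_b = norm.ppf(1 - (1 - power) / 2)       # two one-sided tests, GMR = 1
    n = 2 * (z_a + z_b) ** 2 * sigma_w2 / math.log(limit) ** 2
    return max(math.ceil(n), 12)              # US FDA minimum of 12 subjects

# Example: intrasubject CV of 20% derived from AUC (hypothetical value).
print(n_bioequivalence(cv=0.20))
```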

  1. Improved patient size estimates for accurate dose calculations in abdomen computed tomography

    NASA Astrophysics Data System (ADS)

    Lee, Chang-Lae

    2017-07-01

    The radiation dose of CT (computed tomography) is generally represented by the CTDI (CT dose index). CTDI, however, does not accurately predict the actual patient dose for different human body sizes because it relies on cylinder-shaped head (diameter: 16 cm) and body (diameter: 32 cm) phantoms. The purpose of this study was to eliminate the drawbacks of the conventional CTDI and to provide more accurate radiation dose information. Projection radiographs were obtained from water cylinder phantoms of various sizes, and the sizes of the water cylinder phantoms were calculated and verified using attenuation profiles. The effective diameter was also calculated using the attenuation of the abdominal projection radiographs of 10 patients. When the attenuation-based and geometry-based methods were compared with the reconstructed-axial-CT-image-based method, the effective diameter from the attenuation-based method was similar to that from the reconstructed-axial-CT-image-based method, with a difference of less than 3.8%, whereas the geometry-based method showed a difference of less than 11.4%. This paper proposes a new method of accurately computing the radiation dose of CT based on patient size. This method computes and provides the patient-specific dose before the CT scan and can therefore be effectively used for imaging and dose control.
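
    For reference, a minimal sketch of the reconstructed-axial-CT-image-based effective diameter (the diameter of the circle whose area equals the patient cross-section) is shown below. The -300 HU threshold and the synthetic water-cylinder example are assumptions for illustration; the paper's attenuation-based and geometry-based methods are not implemented here.

```python
import numpy as np

def effective_diameter_from_slice(hu_image, pixel_spacing_mm, threshold_hu=-300):
    """Effective diameter of the patient cross-section from one axial CT slice.
    Pixels above the threshold (air excluded) are counted as patient; the
    threshold and the overall approach are illustrative assumptions."""
    patient_pixels = np.count_nonzero(hu_image > threshold_hu)
    area_mm2 = patient_pixels * pixel_spacing_mm[0] * pixel_spacing_mm[1]
    return 2.0 * np.sqrt(area_mm2 / np.pi)

# Toy example: a synthetic 300 mm diameter water cylinder in an air background.
n, spacing = 512, (0.8, 0.8)
y, x = np.mgrid[:n, :n]
r_mm = np.hypot((x - n / 2) * spacing[0], (y - n / 2) * spacing[1])
phantom = np.where(r_mm < 150.0, 0.0, -1000.0)   # water = 0 HU, air = -1000 HU
print(f"effective diameter ~ {effective_diameter_from_slice(phantom, spacing):.1f} mm")
```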

  2. Structured estimation - Sample size reduction for adaptive pattern classification

    NASA Technical Reports Server (NTRS)

    Morgera, S.; Cooper, D. B.

    1977-01-01

    The Gaussian two-category classification problem with known category mean value vectors and identical but unknown category covariance matrices is considered. The weight vector depends on the unknown common covariance matrix, so the procedure is to estimate the covariance matrix in order to obtain an estimate of the optimum weight vector. The measure of performance for the adapted classifier is the output signal-to-interference noise ratio (SIR). A simple approximation for the expected SIR is gained by using the general sample covariance matrix estimator; this performance is both signal and true covariance matrix independent. An approximation is also found for the expected SIR obtained by using a Toeplitz form covariance matrix estimator; this performance is found to be dependent on both the signal and the true covariance matrix.

  3. Sample Size in Differential Item Functioning: An Application of Hierarchical Linear Modeling

    ERIC Educational Resources Information Center

    Acar, Tulin

    2011-01-01

    The purpose of this study is to examine the number of DIF items detected by HGLM at different sample sizes. Eight data files of different sizes were composed. The population of the study is 798,307 students who took the 2006 OKS Examination. Of these, 10,727 students were chosen by random sampling as the study sample. Turkish,…

  4. A Note on Sample Size and Solution Propriety for Confirmatory Factor Analytic Models

    ERIC Educational Resources Information Center

    Jackson, Dennis L.; Voth, Jennifer; Frey, Marc P.

    2013-01-01

    Determining an appropriate sample size for use in latent variable modeling techniques has presented ongoing challenges to researchers. In particular, small sample sizes are known to present concerns over sampling error for the variances and covariances on which model estimation is based, as well as for fit indexes and convergence failures. The…

  5. A Note on Sample Size and Solution Propriety for Confirmatory Factor Analytic Models

    ERIC Educational Resources Information Center

    Jackson, Dennis L.; Voth, Jennifer; Frey, Marc P.

    2013-01-01

    Determining an appropriate sample size for use in latent variable modeling techniques has presented ongoing challenges to researchers. In particular, small sample sizes are known to present concerns over sampling error for the variances and covariances on which model estimation is based, as well as for fit indexes and convergence failures. The…

  6. Distance software: design and analysis of distance sampling surveys for estimating population size.

    PubMed

    Thomas, Len; Buckland, Stephen T; Rexstad, Eric A; Laake, Jeff L; Strindberg, Samantha; Hedley, Sharon L; Bishop, Jon Rb; Marques, Tiago A; Burnham, Kenneth P

    2010-02-01

    1. Distance sampling is a widely used technique for estimating the size or density of biological populations. Many distance sampling designs and most analyses use the software Distance. 2. We briefly review distance sampling and its assumptions, outline the history, structure and capabilities of Distance, and provide hints on its use. 3. Good survey design is a crucial prerequisite for obtaining reliable results. Distance has a survey design engine, with a built-in geographic information system, that allows properties of different proposed designs to be examined via simulation, and survey plans to be generated. 4. A first step in analysis of distance sampling data is modelling the probability of detection. Distance contains three increasingly sophisticated analysis engines for this: conventional distance sampling, which models detection probability as a function of distance from the transect and assumes all objects at zero distance are detected; multiple-covariate distance sampling, which allows covariates in addition to distance; and mark-recapture distance sampling, which relaxes the assumption of certain detection at zero distance. 5. All three engines allow estimation of density or abundance, stratified if required, with associated measures of precision calculated either analytically or via the bootstrap. 6. Advanced analysis topics covered include the use of multipliers to allow analysis of indirect surveys (such as dung or nest surveys), the density surface modelling analysis engine for spatial and habitat modelling, and information about accessing the analysis engines directly from other software. 7. Synthesis and applications. Distance sampling is a key method for producing abundance and density estimates in challenging field conditions. The theory underlying the methods continues to expand to cope with realistic estimation situations. In step with theoretical developments, state-of-the-art software that implements these methods is described that makes the methods

  7. Consistency analysis of plastic samples based on similarity calculation from limited range of the Raman spectra

    NASA Astrophysics Data System (ADS)

    Lai, B. W.; Wu, Z. X.; Dong, X. P.; Lu, D.; Tao, S. C.

    2016-07-01

    We propose a novel method to calculate the similarity between samples that differ only slightly, at unknown and specific positions, in their Raman spectra, using an interval window that moves across the whole spectrum. Two ABS plastic samples, one with and one without flame retardant, were tested in the experiment. Unlike the traditional method, in which the similarity is calculated from the whole spectrum, we calculate it on the segment cut out by the window, one segment at a time, as the window moves across the entire spectral range. Our method yields a curve of similarity versus wavenumber, and this curve changes strongly where the partial spectra of the two samples differ. The new calculation method thus better identifies samples with tiny differences in their Raman spectra.
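
    A minimal sketch of the moving-window idea, assuming cosine similarity as the per-window metric (the abstract does not specify which similarity measure is used) and synthetic spectra in place of the measured ABS samples:

```python
import numpy as np

def windowed_similarity(spec_a, spec_b, window=50, step=5):
    """Cosine similarity between two spectra computed inside a window that slides
    across the whole wavenumber range; a local difference (e.g. a band present in
    only one sample) shows up as a dip in the similarity curve. Window width and
    step are illustrative choices."""
    sims, centers = [], []
    for start in range(0, len(spec_a) - window + 1, step):
        a = spec_a[start:start + window]
        b = spec_b[start:start + window]
        sims.append(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
        centers.append(start + window // 2)
    return np.array(centers), np.array(sims)

# Toy spectra: identical baselines plus one extra band in sample B only.
x = np.linspace(200, 3200, 3001)
base = np.exp(-((x - 1000) / 30) ** 2) + np.exp(-((x - 1600) / 40) ** 2) + 0.05
extra = 0.8 * np.exp(-((x - 2200) / 25) ** 2)        # band only in sample B
centers, sims = windowed_similarity(base, base + extra)
print("minimum similarity near channel", centers[np.argmin(sims)])
```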

  8. XAFSmass: a program for calculating the optimal mass of XAFS samples

    NASA Astrophysics Data System (ADS)

    Klementiev, K.; Chernikov, R.

    2016-05-01

    We present a new implementation of the XAFSmass program that calculates the optimal mass of XAFS samples. It has several improvements as compared to the old Windows based program XAFSmass: 1) it is truly platform independent, as provided by Python language, 2) it has an improved parser of chemical formulas that enables parentheses and nested inclusion-to-matrix weight percentages. The program calculates the absorption edge height given the total optical thickness, operates with differently determined sample amounts (mass, pressure, density or sample area) depending on the aggregate state of the sample and solves the inverse problem of finding the elemental composition given the experimental absorption edge jump and the chemical formula.

  9. 40 CFR 761.243 - Standard wipe sample method and size.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ..., AND USE PROHIBITIONS Determining a PCB Concentration for Purposes of Abandonment or Disposal of Natural Gas Pipeline: Selecting Sample Sites, Collecting Surface Samples, and Analyzing Standard PCB Wipe Samples § 761.243 Standard wipe sample method and size. (a) Collect a surface sample from a natural...

  10. 40 CFR 761.243 - Standard wipe sample method and size.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ..., AND USE PROHIBITIONS Determining a PCB Concentration for Purposes of Abandonment or Disposal of Natural Gas Pipeline: Selecting Sample Sites, Collecting Surface Samples, and Analyzing Standard PCB Wipe Samples § 761.243 Standard wipe sample method and size. (a) Collect a surface sample from a natural...

  11. 40 CFR 761.243 - Standard wipe sample method and size.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ..., AND USE PROHIBITIONS Determining a PCB Concentration for Purposes of Abandonment or Disposal of Natural Gas Pipeline: Selecting Sample Sites, Collecting Surface Samples, and Analyzing Standard PCB Wipe Samples § 761.243 Standard wipe sample method and size. (a) Collect a surface sample from a natural...

  12. Evaluation of design flood estimates with respect to sample size

    NASA Astrophysics Data System (ADS)

    Kobierska, Florian; Engeland, Kolbjorn

    2016-04-01

    Estimation of design floods forms the basis for hazard management related to flood risk and is a legal obligation when building infrastructure such as dams, bridges and roads close to water bodies. Flood inundation maps used for land use planning are also produced based on design flood estimates. In Norway, the current guidelines for design flood estimates give recommendations on which data, probability distribution, and method to use depending on the length of the local record. If less than 30 years of local data is available, an index flood approach is recommended, where the local observations are used for estimating the index flood and regional data are used for estimating the growth curve. For 30-50 years of data, a 2-parameter distribution is recommended, and for more than 50 years of data, a 3-parameter distribution should be used. Many countries have national guidelines for flood frequency estimation, and recommended distributions include the log-Pearson type III, generalized logistic and generalized extreme value distributions. For estimating distribution parameters, ordinary and linear moments, maximum likelihood and Bayesian methods are used. The aim of this study is to re-evaluate the guidelines for local flood frequency estimation. In particular, we wanted to answer the following questions: (i) Which distribution gives the best fit to the data? (ii) Which estimation method provides the best fit to the data? (iii) Does the answer to (i) and (ii) depend on local data availability? To answer these questions we set up a test bench for local flood frequency analysis using data-based cross-validation methods. The criteria were based on indices describing stability and reliability of design flood estimates. Stability is used as a criterion since design flood estimates should not excessively depend on the data sample. The reliability indices describe the degree to which design flood predictions can be trusted.

  13. 40 CFR 600.211-08 - Sample calculation of fuel economy values for labeling.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 29 2010-07-01 2010-07-01 false Sample calculation of fuel economy... AGENCY (CONTINUED) ENERGY POLICY FUEL ECONOMY AND CARBON-RELATED EXHAUST EMISSIONS OF MOTOR VEHICLES Fuel Economy Regulations for 1977 and Later Model Year Automobiles-Procedures for Calculating Fuel...

  14. Air and smear sample calculational tool for Fluor Hanford Radiological control

    SciTech Connect

    BAUMANN, B.L.

    2003-07-11

    A spreadsheet calculation tool was developed to automate the calculations performed for determining the concentration of airborne radioactivity and for smear counting, as outlined in HNF-13536, Section 5.2.7, "Analyzing Air and Smear Samples". This document reports on the design and testing of the calculation tool. Radiological Control Technicians (RCTs) will save time and reduce handwritten and calculation errors by using an electronic form for documenting and calculating workplace air samples. Current expectations are that RCTs will collect an air sample filter or perform a smear for surface contamination. RCTs will then survey the filter for gross alpha and beta/gamma radioactivity and, with the gross counts, use either a hand-calculation method or a calculator to determine the activity on the filter. The electronic form will allow the RCT, with a few keystrokes, to document the individual's name, payroll, gross counts, and instrument identifiers, producing an error-free record. This productivity gain is realized by the enhanced ability to perform the mathematical calculations electronically (reducing errors) while documenting the air sample at the same time.

  15. 40 CFR 600.211-08 - Sample calculation of fuel economy values for labeling.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 30 2011-07-01 2011-07-01 false Sample calculation of fuel economy... AGENCY (CONTINUED) ENERGY POLICY FUEL ECONOMY AND CARBON-RELATED EXHAUST EMISSIONS OF MOTOR VEHICLES Procedures for Calculating Fuel Economy and Carbon-Related Exhaust Emission Values for 1977 and Later...

  16. 7 CFR 51.308 - Methods of sampling and calculation of percentages.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ..., CERTIFICATION, AND STANDARDS) United States Standards for Grades of Apples Methods of Sampling and Calculation... where the minimum diameter of the smallest apple does not vary more than 1/2 inch from the minimum diameter of the largest apple, percentages shall be calculated on the basis of count. (b) In all other...

  17. 7 CFR 51.308 - Methods of sampling and calculation of percentages.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ..., CERTIFICATION, AND STANDARDS) United States Standards for Grades of Apples Methods of Sampling and Calculation... where the minimum diameter of the smallest apple does not vary more than 1/2 inch from the minimum diameter of the largest apple, percentages shall be calculated on the basis of count. (b) In all other...

  18. Got power? A systematic review of sample size adequacy in health professions education research.

    PubMed

    Cook, David A; Hatala, Rose

    2015-03-01

    Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011, and included all studies evaluating simulation-based education for health professionals in comparison with no intervention or another simulation intervention. Reviewers working in duplicate abstracted information to calculate standardized mean differences (SMDs). We included 897 original research studies. Among the 627 no-intervention-comparison studies the median sample size was 25. Only two studies (0.3%) had ≥80% power to detect a small difference (SMD > 0.2 standard deviations) and 136 (22%) had power to detect a large difference (SMD > 0.8). Of the 110 no-intervention-comparison studies that failed to find a statistically significant difference, none excluded a small difference and only 47 (43%) excluded a large difference. Among 297 studies comparing alternate simulation approaches the median sample size was 30. Only one study (0.3%) had ≥80% power to detect a small difference and 79 (27%) had power to detect a large difference. Of the 128 studies that did not detect a statistically significant effect, 4 (3%) excluded a small difference and 91 (71%) excluded a large difference. In conclusion, most education research studies are powered only to detect effects of large magnitude. For most studies that do not reach statistical significance, the possibility of large and important differences still exists.
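
    To illustrate the power figures discussed above, the sketch below uses the normal approximation to the two-sample t-test to compute power for a given per-group sample size and standardized mean difference; the per-group size of 12 is an assumption roughly matching the reported median total sample of 25.

```python
from scipy.stats import norm

def power_two_group(n_per_group, smd, alpha=0.05):
    """Approximate power of a two-group comparison (normal approximation to the
    two-sample t-test) for a given standardized mean difference."""
    z_alpha = norm.ppf(1 - alpha / 2)
    ncp = smd * (n_per_group / 2) ** 0.5     # non-centrality for equal group sizes
    return 1 - norm.cdf(z_alpha - ncp)

# With roughly 12 per group (median total sample ~25), power to detect a small
# effect (SMD = 0.2) is far below 80%, and even a large effect is underpowered.
for d in (0.2, 0.5, 0.8):
    print(f"SMD {d}: power ~ {power_two_group(12, d):.2f}")
```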

  19. Gamma self-shielding correction factors calculation for aqueous bulk sample analysis by PGNAA technique.

    PubMed

    Nasrabadi, M N; Mohammadi, A; Jalali, M

    2009-01-01

    In this paper bulk sample prompt gamma neutron activation analysis (BSPGNAA) was applied to aqueous sample analysis using a relative method. For elemental analysis of an unknown bulk sample, gamma self-shielding coefficient was required. Gamma self-shielding coefficient of unknown samples was estimated by an experimental method and also by MCNP code calculation. The proposed methodology can be used for the determination of the elemental concentration of unknown aqueous samples by BSPGNAA where knowledge of the gamma self-shielding within the sample volume is required.

  20. Issues of sample size in sensitivity and specificity analysis with special reference to oncology.

    PubMed

    Juneja, Atul; Sharma, Shashi

    2015-01-01

    Sample size is one of the basic issues that any medical researcher, including the oncologist, faces in any research program. The current communication discusses the computation of sample size when sensitivity and specificity are being evaluated. The article presents situations the researcher can easily visualize for appropriate use of sample size techniques for sensitivity and specificity when a screening method for early detection of cancer is in question. Moreover, the researcher would then be in a position to communicate efficiently with a statistician about sample size computation and, most importantly, about the applicability of the results under the conditions of the negotiated precision.
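
    A commonly used calculation for this setting, often attributed to Buderer (1996), sizes the study so that the confidence interval for sensitivity (or specificity) has a chosen half-width, inflating for disease prevalence. The sketch below is a generic illustration with hypothetical inputs, not a method taken from this article.

```python
import math
from scipy.stats import norm

def n_for_sensitivity(sens, precision, prevalence, confidence=0.95):
    """Subjects needed so the estimated sensitivity has a confidence interval of
    half-width `precision`, given the disease prevalence in the screened group
    (the specificity version divides by 1 - prevalence instead)."""
    z = norm.ppf(1 - (1 - confidence) / 2)
    n_cases = z ** 2 * sens * (1 - sens) / precision ** 2
    return math.ceil(n_cases / prevalence)

# Hypothetical screening scenario: expected sensitivity 90%, desired +/- 5%
# precision, disease prevalence 10% in the screened population.
print(n_for_sensitivity(sens=0.90, precision=0.05, prevalence=0.10))
```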

  1. Sampling bee communities using pan traps: alternative methods increase sample size

    USDA-ARS?s Scientific Manuscript database

    Monitoring of the status of bee populations and inventories of bee faunas require systematic sampling. Efficiency and ease of implementation has encouraged the use of pan traps to sample bees. Efforts to find an optimal standardized sampling method for pan traps have focused on pan trap color. Th...

  2. Sample Size for Measuring Grammaticality in Preschool Children from Picture-Elicited Language Samples

    ERIC Educational Resources Information Center

    Eisenberg, Sarita L.; Guo, Ling-Yu

    2015-01-01

    Purpose: The purpose of this study was to investigate whether a shorter language sample elicited with fewer pictures (i.e., 7) would yield a percent grammatical utterances (PGU) score similar to that computed from a longer language sample elicited with 15 pictures for 3-year-old children. Method: Language samples were elicited by asking forty…

  3. Sample Size for Measuring Grammaticality in Preschool Children from Picture-Elicited Language Samples

    ERIC Educational Resources Information Center

    Eisenberg, Sarita L.; Guo, Ling-Yu

    2015-01-01

    Purpose: The purpose of this study was to investigate whether a shorter language sample elicited with fewer pictures (i.e., 7) would yield a percent grammatical utterances (PGU) score similar to that computed from a longer language sample elicited with 15 pictures for 3-year-old children. Method: Language samples were elicited by asking forty…

  4. Design and sample size considerations for simultaneous global drug development program.

    PubMed

    Huang, Qin; Chen, Gang; Yuan, Zhilong; Lan, K K Gordon

    2012-09-01

    Due to the potential impact of ethnic factors on clinical outcomes, the global registration of a new treatment is challenging. China and Japan often require local trials in addition to a multiregional clinical trial (MRCT) to support the efficacy and safety claim of the treatment. The impact of ethnic factors on the treatment effect has been intensively investigated and discussed from different perspectives. However, most current methods focus on assessing the consistency or similarity of the treatment effect between different ethnic groups in an exploratory manner. In this article, we propose a new method for the design and sample size consideration for a simultaneous global drug development program (SGDDP) using weighted z-tests. In the proposed method, to test the efficacy of a new treatment for the targeted ethnic (TE) group, a weighted test that combines the information collected from both the TE group and the nontargeted ethnic (NTE) group is used. The influence of ethnic factors and local medical practice on the treatment effect is accounted for by down-weighting the information collected from the NTE group in the combined test statistic. This design rigorously controls the overall false positive rate for the program at a given level. The sample sizes needed for the TE group in an SGDDP are then calculated for the three most commonly used efficacy endpoints: continuous, binary, and time-to-event.
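
    As a simple illustration of the weighted z-test idea, the sketch below combines TE and NTE test statistics with the NTE evidence down-weighted; the combination rule shown is the generic weighted z-test, and the weight value is illustrative rather than the paper's recommended choice.

```python
from scipy.stats import norm

def weighted_z(z_te, z_nte, w_nte=0.5):
    """Combine evidence from the targeted ethnic (TE) group and the non-targeted
    ethnic (NTE) group with the NTE information down-weighted. The combination
    (w1*Z1 + w2*Z2) / sqrt(w1^2 + w2^2) keeps a N(0,1) null distribution for
    independent standard-normal inputs; the weight is an illustrative choice,
    not the paper's recommendation."""
    w_te = 1.0
    z = (w_te * z_te + w_nte * z_nte) / (w_te ** 2 + w_nte ** 2) ** 0.5
    return z, 1 - norm.cdf(z)   # combined statistic and one-sided p-value

z, p = weighted_z(z_te=1.8, z_nte=2.5, w_nte=0.5)
print(f"combined z = {z:.2f}, one-sided p = {p:.4f}")
```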

  5. Assessing and improving the stability of chemometric models in small sample size situations.

    PubMed

    Beleites, Claudia; Salzer, Reiner

    2008-03-01

    Small sample sizes are very common in multivariate analysis. Sample sizes of 10-100 statistically independent objects (rejects from processes or loading dock analysis, or patients with a rare disease), each with hundreds of data points, cause unstable models with poor predictive quality. Model stability is assessed by comparing models that were built using slightly varying training data. Iterated k-fold cross-validation is used for this purpose. Aggregation stabilizes models. It is possible to assess the quality of the aggregated model without calculating further models. The validation and aggregation methods investigated in this study apply to regression as well as to classification. These techniques are useful for analyzing data with large numbers of variates, e.g., any spectral data like FT-IR, Raman, UV/VIS, fluorescence, AAS, and MS. FT-IR images of tumor tissue were used in this study. Some tissue types occur frequently, while some are very rare. They are classified using LDA. Initial models were severely unstable. Aggregation stabilizes the predictions. The hit rate increased from 67% to 82%.

  6. Distribution of the two-sample t-test statistic following blinded sample size re-estimation.

    PubMed

    Lu, Kaifeng

    2016-05-01

    We consider the blinded sample size re-estimation based on the simple one-sample variance estimator at an interim analysis. We characterize the exact distribution of the standard two-sample t-test statistic at the final analysis. We describe a simulation algorithm for the evaluation of the probability of rejecting the null hypothesis at given treatment effect. We compare the blinded sample size re-estimation method with two unblinded methods with respect to the empirical type I error, the empirical power, and the empirical distribution of the standard deviation estimator and final sample size. We characterize the type I error inflation across the range of standardized non-inferiority margin for non-inferiority trials, and derive the adjusted significance level to ensure type I error control for given sample size of the internal pilot study. We show that the adjusted significance level increases as the sample size of the internal pilot study increases. Copyright © 2016 John Wiley & Sons, Ltd.

  7. Effect Size, Statistical Power and Sample Size Requirements for the Bootstrap Likelihood Ratio Test in Latent Class Analysis

    PubMed Central

    Dziak, John J.; Lanza, Stephanie T.; Tan, Xianming

    2014-01-01

    Selecting the number of different classes which will be assumed to exist in the population is an important step in latent class analysis (LCA). The bootstrap likelihood ratio test (BLRT) provides a data-driven way to evaluate the relative adequacy of a (K −1)-class model compared to a K-class model. However, very little is known about how to predict the power or the required sample size for the BLRT in LCA. Based on extensive Monte Carlo simulations, we provide practical effect size measures and power curves which can be used to predict power for the BLRT in LCA given a proposed sample size and a set of hypothesized population parameters. Estimated power curves and tables provide guidance for researchers wishing to size a study to have sufficient power to detect hypothesized underlying latent classes. PMID:25328371

  8. Effect Size, Statistical Power and Sample Size Requirements for the Bootstrap Likelihood Ratio Test in Latent Class Analysis.

    PubMed

    Dziak, John J; Lanza, Stephanie T; Tan, Xianming

    2014-01-01

    Selecting the number of different classes which will be assumed to exist in the population is an important step in latent class analysis (LCA). The bootstrap likelihood ratio test (BLRT) provides a data-driven way to evaluate the relative adequacy of a (K -1)-class model compared to a K-class model. However, very little is known about how to predict the power or the required sample size for the BLRT in LCA. Based on extensive Monte Carlo simulations, we provide practical effect size measures and power curves which can be used to predict power for the BLRT in LCA given a proposed sample size and a set of hypothesized population parameters. Estimated power curves and tables provide guidance for researchers wishing to size a study to have sufficient power to detect hypothesized underlying latent classes.
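
    The power curves reported above come from Monte Carlo simulation: repeatedly generate data from a hypothesized K-class population, run the bootstrap likelihood ratio test, and record how often the (K-1)-class model is rejected. The skeleton below illustrates that recipe only; as a runnable stand-in for LCA it compares one- versus two-component Gaussian mixtures with scikit-learn, and all counts and effect sizes are small, illustrative choices rather than values from the article.

    ```python
    # Sketch of Monte Carlo power estimation for a bootstrap likelihood ratio
    # test.  A two-component Gaussian mixture is a runnable stand-in for the
    # (K-1)- vs K-class LCA comparison; counts are kept small for speed.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def lrt_stat(X, k_small, k_large, seed=0):
        """2 * (logL_large - logL_small) for Gaussian mixtures."""
        ll = []
        for k in (k_small, k_large):
            gm = GaussianMixture(n_components=k, random_state=seed).fit(X)
            ll.append(gm.score(X) * len(X))
        return 2 * (ll[1] - ll[0])

    def bootstrap_p_value(X, k_small=1, k_large=2, n_boot=19, seed=0):
        """Parametric bootstrap p-value for H0: k_small classes."""
        observed = lrt_stat(X, k_small, k_large, seed)
        null_fit = GaussianMixture(n_components=k_small, random_state=seed).fit(X)
        boot = []
        for b in range(n_boot):
            Xb, _ = null_fit.sample(len(X))
            boot.append(lrt_stat(Xb, k_small, k_large, seed))
        return (1 + sum(s >= observed for s in boot)) / (n_boot + 1)

    def estimate_power(n, n_sims=20, alpha=0.05, seed=0):
        """Proportion of simulated two-class data sets in which H0 is rejected."""
        rng = np.random.default_rng(seed)
        rejections = 0
        for s in range(n_sims):
            comp = rng.integers(0, 2, size=n)                    # true class labels
            X = rng.normal(loc=2.0 * comp, scale=1.0)[:, None]   # class separation
            rejections += bootstrap_p_value(X, seed=s) <= alpha
        return rejections / n_sims

    print("estimated power at n=100:", estimate_power(100))
    ```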

  9. Additive scales in degenerative disease - calculation of effect sizes and clinical judgment

    PubMed Central

    2011-01-01

    Background The therapeutic efficacy of an intervention is often assessed in clinical trials by scales that measure multiple diverse activities and are added to produce a cumulative global score. Medical communities and health care systems subsequently use these data to calculate pooled effect sizes to compare treatments. This is done because major doubt has been cast on the clinical relevance of statistically significant findings that rely on p values, which can reflect chance findings; pooling the results of clinical studies into a meta-analysis with a statistical calculus has therefore been assumed to be a more definitive way of judging efficacy. Methods We simulate therapeutic effects as measured with additive scales in patient cohorts with different disease severities and assess the limitations of effect size calculations for additive scales; these limitations are proven mathematically. Results We demonstrate that the major problem, which cannot be overcome by current numerical methods, is the complex nature and neurobiological foundation of clinical psychiatric endpoints in particular and additive scales in general. This is particularly relevant for endpoints used in dementia research. 'Cognition' is composed of functions such as memory, attention, orientation and many more. These individual functions decline in varied and non-linear ways. We demonstrate that, in progressive diseases, cumulative values from multidimensional scales are subject to distortion by the limitations of the additive scale: the non-linearity of the decline of function impedes the calculation of effect sizes based on cumulative values from these multidimensional scales. Conclusions Statistical analysis needs to be guided by the boundaries of the biological condition. Alternatively, we suggest a different approach that avoids the error imposed by over-analysis of cumulative global scores from additive scales. PMID:22176535
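
    A toy simulation can illustrate the distortion described above: when one subscale hits a floor in severely affected patients while another still responds, the effect size computed on the cumulative score changes even though the underlying treatment effect is identical. The snippet below is an illustrative sketch with made-up numbers, not the simulation used in the paper.

    ```python
    # Toy illustration: a non-linear (floor-limited) decline in one subscale
    # distorts the effect size computed on the cumulative score.
    import numpy as np

    rng = np.random.default_rng(0)

    def cohens_d(a, b):
        pooled = np.sqrt((np.var(a, ddof=1) + np.var(b, ddof=1)) / 2)
        return (a.mean() - b.mean()) / pooled

    n = 200
    for severity, decline in [("mild", 0.0), ("severe", 8.0)]:
        # Two subscales (0-15 points each); treatment preserves 1.5 points on each.
        attention_placebo = rng.normal(10.0, 2.0, n)
        attention_treated = rng.normal(11.5, 2.0, n)
        # Memory declines non-linearly: severely affected patients sit near a floor.
        memory_placebo = np.clip(rng.normal(10.0 - decline, 2.0, n), 0, 15)
        memory_treated = np.clip(rng.normal(11.5 - decline, 2.0, n), 0, 15)

        total_placebo = attention_placebo + memory_placebo
        total_treated = attention_treated + memory_treated
        print(severity, "cohort: effect size on cumulative score =",
              round(cohens_d(total_treated, total_placebo), 2))
    ```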

  10. Comparison between various finite-size supercell correction schemes for charged defect calculations

    NASA Astrophysics Data System (ADS)

    Komsa, Hannu-Pekka; Rantala, Tapio; Pasquarello, Alfredo

    2012-08-01

    We present a comparison of the most common finite-size supercell correction schemes for charged defects in density functional theory calculations. Considered schemes include those proposed by Makov and Payne (MP), Lany and Zunger (LZ), and Freysoldt, Neugebauer, and Van de Walle (FNV). The role of the potential alignment is also assessed. Supercells of various sizes are considered and the corrected formation energies are compared to the values obtained by extrapolation to large supercells. For defects with localized charge distributions, we generally find that the FNV scheme slightly improves upon the LZ one, while the MP scheme generally overcorrects except for point-charge-like defects. We also encountered more complex situations in which the extrapolated values do not coincide. Inspection of the defect electronic structure indicates that this occurs when the defect Kohn-Sham states are degenerate with band-edge states of the host.
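
    For orientation, the leading (monopole) term of the Makov-Payne correction for a cubic supercell has a simple closed form, E_corr ~ q^2 * alpha_M / (2 * eps * L) in atomic units, where alpha_M is the lattice Madelung constant, eps the static dielectric constant, and L the supercell length. The helper below is a hedged illustration of that first-order term only, with an illustrative example; the LZ and FNV schemes compared in the paper add charge-shape and potential-alignment ingredients that are not reproduced here.

    ```python
    # First-order (monopole) Makov-Payne correction for a charged defect in a
    # cubic supercell, in atomic units (Hartree, Bohr).  Illustrative only: the
    # LZ and FNV schemes add shape and potential-alignment terms not shown here.
    HARTREE_TO_EV = 27.211386

    def makov_payne_monopole(charge, length_bohr, epsilon, madelung=2.8373):
        """Leading-order image-charge correction q^2 * alpha_M / (2 * eps * L).

        madelung defaults to the simple-cubic point-charge value (~2.8373).
        """
        return charge**2 * madelung / (2.0 * epsilon * length_bohr)

    # Example: a q = +2 defect in a ~18.9 Bohr (10 Angstrom) cubic cell, eps = 10.
    correction_ha = makov_payne_monopole(charge=2, length_bohr=18.9, epsilon=10.0)
    print(f"monopole correction: {correction_ha * HARTREE_TO_EV:.2f} eV")
    ```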

  11. Converging Nuclear Magnetic Shielding Calculations with Respect to Basis and System Size in Protein Systems

    PubMed Central

    Hartman, Joshua D.; Neubauer, Thomas J.; Caulkins, Bethany G.; Mueller, Leonard J.; Beran, Gregory J. O.

    2015-01-01

    Ab initio chemical shielding calculations greatly facilitate the interpretation of nuclear magnetic resonance (NMR) chemical shifts in biological systems, but the large sizes of these systems require approximations in the chemical models used to represent them. Achieving good convergence in the predicted chemical shieldings is necessary before one can unravel how other complex structural and dynamical factors affect the NMR measurements. Here, we investigate how to balance trade-offs between using a better basis set or a larger cluster model for predicting the chemical shieldings of the substrates in two representative examples of protein-substrate systems involving different domains in tryptophan synthase: the N-(4′-trifluoromethoxybenzoyl)-2-aminoethyl phosphate (F9) ligand which binds in the α active site, and the 2-aminophenol (2AP) quinonoid intermediate formed in the β active site. We first demonstrate that a chemically intuitive three-layer, locally dense basis model that uses a large basis on the substrate, a medium triple-zeta basis to describe its hydrogen-bonding partners and/or surrounding van der Waals cavity, and a crude basis set for more distant atoms provides chemical shieldings in good agreement with much more expensive large basis calculations. Second, long-range quantum mechanical interactions are important, and one can accurately estimate them as a small-basis correction to larger-basis calculations on a smaller cluster. The combination of these approaches enables one to perform density functional theory NMR chemical shift calculations in protein systems that are well-converged with respect to both basis set and cluster size. PMID:25993979
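
    The three-layer, locally dense basis idea lends itself to a simple distance-based assignment: a large basis on the substrate, a triple-zeta basis on atoms within hydrogen-bonding or van der Waals range, and a small basis elsewhere. The routine below is only a schematic of that bookkeeping step, with assumed basis-set names and cutoff distances; it does not reproduce the authors' cluster construction or any quantum-chemistry call.

    ```python
    # Schematic three-layer, locally dense basis assignment: large basis on the
    # substrate, medium basis on nearby atoms, small basis on the remainder.
    # Basis names and the cutoff distance are assumptions for illustration.
    import numpy as np

    def assign_basis(coords, substrate_idx, medium_cutoff=4.0,
                     large="6-311+G(2d,p)", medium="6-311G**", small="6-31G"):
        """Return a basis-set label for every atom, keyed by the distance
        (Angstrom) to the nearest substrate atom."""
        coords = np.asarray(coords, dtype=float)
        substrate = coords[list(substrate_idx)]
        labels = []
        for i, xyz in enumerate(coords):
            if i in substrate_idx:
                labels.append(large)
                continue
            d = np.min(np.linalg.norm(substrate - xyz, axis=1))
            labels.append(medium if d <= medium_cutoff else small)
        return labels

    # Toy cluster: atoms 0-1 are the substrate, atom 2 a hydrogen-bond partner,
    # atom 3 a distant protein atom.
    coords = [(0, 0, 0), (1.2, 0, 0), (2.5, 1.5, 0), (9.0, 9.0, 9.0)]
    print(assign_basis(coords, substrate_idx={0, 1}))
    ```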

  12. 40 CFR 761.286 - Sample size and procedure for collecting a sample.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... (CONTINUED) TOXIC SUBSTANCES CONTROL ACT POLYCHLORINATED BIPHENYLS (PCBs) MANUFACTURING, PROCESSING, DISTRIBUTION IN COMMERCE, AND USE PROHIBITIONS Sampling To Verify Completion of Self-Implementing Cleanup...

  13. Probabilistic Requirements (Partial) Verification Methods Best Practices Improvement. Variables Acceptance Sampling Calculators: Empirical Testing. Volume 2

    NASA Technical Reports Server (NTRS)

    Johnson, Kenneth L.; White, K. Preston, Jr.

    2012-01-01

    The NASA Engineering and Safety Center was requested to improve on the Best Practices document produced for the NESC assessment, Verification of Probabilistic Requirements for the Constellation Program, by providing a recommended procedure for using acceptance sampling by variables techniques as an alternative to the potentially resource-intensive acceptance sampling by attributes method given in the document. This paper presents the results of empirical tests that assess the accuracy of acceptance sampling plan calculators implemented for six variable distributions.
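
    The kind of empirical check described above can be mimicked with a short simulation: draw many lots from a known distribution, apply a single-sided variables acceptance rule of the form "accept if (USL - xbar)/s >= k", and compare the observed acceptance rate with what a plan calculator predicts. The sample size, acceptability constant k, and specification limit below are illustrative assumptions, not values from the NESC study.

    ```python
    # Empirical check of a single-sided variables acceptance sampling plan:
    # accept a lot if (USL - xbar) / s >= k.  Plan parameters are illustrative.
    import numpy as np
    from scipy.stats import norm

    def acceptance_rate(n, k, usl, lot_mean, lot_sd, n_lots=20000, seed=0):
        """Simulate lots and return the fraction accepted under the plan."""
        rng = np.random.default_rng(seed)
        samples = rng.normal(lot_mean, lot_sd, size=(n_lots, n))
        xbar = samples.mean(axis=1)
        s = samples.std(axis=1, ddof=1)
        return np.mean((usl - xbar) / s >= k)

    n, k, usl = 20, 1.96, 10.0
    for lot_mean in (7.0, 7.8, 8.4):
        p_defective = 1 - norm.cdf((usl - lot_mean) / 1.0)  # true fraction above USL
        pa = acceptance_rate(n, k, usl, lot_mean, lot_sd=1.0)
        print(f"lot fraction defective {p_defective:.3f}: "
              f"empirical probability of acceptance {pa:.3f}")
    ```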

  14. Implications of sampling design and sample size for national carbon accounting systems

    Treesearch

    Michael Köhl; Andrew Lister; Charles T. Scott; Thomas Baldauf; Daniel. Plugge

    2011-01-01

    Countries willing to adopt a REDD regime need to establish a national Measurement, Reporting and Verification (MRV) system that provides information on forest carbon stocks and carbon stock changes. Due to the extensive areas covered by forests the information is generally obtained by sample based surveys. Most operational sampling approaches utilize a combination of...

  15. The Statistics and Mathematics of High Dimension Low Sample Size Asymptotics.

    PubMed

    Shen, Dan; Shen, Haipeng; Zhu, Hongtu; Marron, J S

    2016-10-01

    The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the ratio between the dimension and the product of the sample size with the spike size. When this ratio converges to a nonzero constant, the sample eigenvector converges to a cone, with a certain angle to its corresponding population eigenvector. In the High Dimension, Low Sample Size case, the angle between the sample eigenvector and its population counterpart converges to a limiting distribution. Several generalizations of the multi-spike covariance models are also explored, and additional theoretical results are presented.

  16. The Statistics and Mathematics of High Dimension Low Sample Size Asymptotics

    PubMed Central

    Shen, Dan; Shen, Haipeng; Zhu, Hongtu; Marron, J. S.

    2016-01-01

    The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the ratio between the dimension and the product of the sample size with the spike size. When this ratio converges to a nonzero constant, the sample eigenvector converges to a cone, with a certain angle to its corresponding population eigenvector. In the High Dimension, Low Sample Size case, the angle between the sample eigenvector and its population counterpart converges to a limiting distribution. Several generalizations of the multi-spike covariance models are also explored, and additional theoretical results are presented. PMID:28018116
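
    The key quantity above, the angle between a leading sample eigenvector and its population counterpart as the dimension grows, is easy to probe numerically. The snippet below is a small illustration for a single-spike covariance model with made-up parameters; it is not the asymptotic analysis of the paper, just a numerical check of how the angle behaves as the dimension grows with the sample size fixed.

    ```python
    # Angle between the leading sample and population eigenvectors under a
    # single-spike covariance model, for growing dimension at fixed sample size.
    # Parameters are illustrative; this only probes the behaviour numerically.
    import numpy as np

    def leading_angle(d, n, spike=50.0, seed=0):
        rng = np.random.default_rng(seed)
        # Population covariance: identity plus one spike along the first axis.
        sd = np.ones(d)
        sd[0] = np.sqrt(1.0 + spike)
        X = rng.normal(size=(n, d)) * sd
        cov = X.T @ X / n
        eigvals, eigvecs = np.linalg.eigh(cov)
        v = eigvecs[:, -1]                    # leading sample eigenvector
        cos_angle = abs(v[0])                 # population eigenvector is e1
        return np.degrees(np.arccos(min(cos_angle, 1.0)))

    for d in (50, 500, 2000):
        print(f"d={d:5d}, n=100: angle = {leading_angle(d, n=100):5.1f} degrees")
    ```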

  17. Weighting by Inverse Variance or by Sample Size in Random-Effects Meta-Analysis

    ERIC Educational Resources Information Center

    Marin-Martinez, Fulgencio; Sanchez-Meca, Julio

    2010-01-01

    Most of the statistical procedures in meta-analysis are based on the estimation of average effect sizes from a set of primary studies. The optimal weight for averaging a set of independent effect sizes is the inverse variance of each effect size, but in practice these weights have to be estimated, being affected by sampling error. When assuming a…
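
    For concreteness, the two weighting schemes compared above differ only in the weights fed into the same weighted average. The snippet below shows both on a small set of made-up study results (effect sizes and per-group sample sizes are illustrative, not from the article); the random-effects version additionally adds a between-studies variance component to each study's variance, which is omitted here for brevity.

    ```python
    # Weighted average effect size under inverse-variance vs sample-size weights.
    # Study effect sizes (standardized mean differences) and group sizes are
    # made-up illustrative values.
    import numpy as np

    d  = np.array([0.20, 0.45, 0.10, 0.60])    # observed effect sizes
    n1 = np.array([15, 40, 100, 25])           # group 1 sizes
    n2 = np.array([15, 35, 100, 25])           # group 2 sizes

    # Large-sample estimate of the sampling variance of a standardized mean
    # difference.
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))

    w_invvar = 1.0 / var_d                     # estimated inverse-variance weights
    w_n      = n1 + n2                         # total sample-size weights

    for name, w in [("inverse variance", w_invvar), ("sample size", w_n)]:
        mean_d = np.sum(w * d) / np.sum(w)
        print(f"{name:>16s} weighting: pooled effect size = {mean_d:.3f}")
    ```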

  19. Developing Criteria for Sample Sizes in Jet Engine Analytical Component Inspections and the Associated Confidence Levels

    DTIC Science & Technology

    1988-09-01

    The samples taken from each population will not be random samples. They will be nonprobability, purposive samples. More specifically, they... section will justify why statistical techniques based on the assumption of a random sample will be used. First, this is the only possible method of...

  20. Sample Size Tables for Correlation Analysis with Applications in Partial Correlation and Multiple Regression Analysis

    ERIC Educational Resources Information Center

    Algina, James; Olejnik, Stephen

    2003-01-01

    Tables for selecting sample size in correlation studies are presented. Some of the tables allow selection of sample size so that r (or r[squared], depending on the statistic the researcher plans to interpret) will be within a target interval around the population parameter with probability 0.95. The intervals are [plus or minus] 0.05, [plus or…
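
    The table entries described above answer a precision question: how large must n be so that the sample r falls within a target interval of the population correlation with probability 0.95? One way to approximate such an entry is by simulation, as sketched below; the half-width of 0.05, the population correlation, and the grid of candidate sample sizes are illustrative inputs, not values reproduced from the tables.

    ```python
    # Simulation sketch: how often the sample correlation r falls within
    # +/- 0.05 of the population correlation, for a grid of candidate n.
    # Population correlation and the grid of n are illustrative choices.
    import numpy as np

    def coverage(n, rho, half_width=0.05, n_sims=4000, seed=0):
        rng = np.random.default_rng(seed)
        cov = np.array([[1.0, rho], [rho, 1.0]])
        hits = 0
        for _ in range(n_sims):
            x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
            r = np.corrcoef(x[:, 0], x[:, 1])[0, 1]
            hits += abs(r - rho) <= half_width
        return hits / n_sims

    rho = 0.30
    for n in (200, 400, 800, 1200):
        print(f"n={n:4d}: P(|r - rho| <= 0.05) ~ {coverage(n, rho):.3f}")
    ```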