The sample size is the number of patients or other experimental units that need to be included in a study to answer the research question. Pre-study calculation of the sample size is important; if a sample size is too small, one will not be able to detect an effect, while a sample that is too large may be a waste of time and money. Methods to calculate the sample size are explained in statistical textbooks, but because there are many different formulas available, it can be difficult for investigators to decide which method to use. Moreover, these calculations are prone to errors, because small changes in the selected parameters can lead to large differences in the sample size. This paper explains the basic principles of sample size calculations and demonstrates how to perform such a calculation for a simple study design. PMID:21293154
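As an illustration of the kind of calculation this abstract refers to, here is a minimal sketch for one simple design: a two-sided comparison of two group means, using the normal approximation. The formula choice, the function name `n_per_group`, and the example inputs are assumptions for illustration and are not taken from the paper.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Normal-approximation formula: n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2
    delta: smallest difference in means worth detecting (assumed input)
    sigma: common standard deviation in each group (assumed input)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical inputs: shrinking the detectable difference from 5 to 4 units
# raises the requirement from 63 to 99 per group.
print(n_per_group(delta=5, sigma=10))
print(n_per_group(delta=4, sigma=10))
```

The two calls show the sensitivity the abstract warns about: a one-unit change in the assumed difference moves the required sample size by more than half.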
In atherosclerotic lesions, the endothelial barrier against the bloodstream can become compromised, resulting in the exposure of the extracellular matrix (ECM) and intimal cells beneath. In theory, this allows adequately sized nanocarriers in circulation to infiltrate into the intimal lesion intravascularly. We sought to evaluate this possibility using rat carotid arteries with induced neointima. Cy5-labeled polyethylene glycol-conjugated polyion complex (PIC) micelles and vesicles, with diameters of 40, 100, or 200 nm (PICs-40, PICs-100, and PICs-200, respectively) were intravenously administered to rats after injury to the carotid artery using a balloon catheter. High accumulation and long retention of PICs-40 in the induced neointima was confirmed by in vivo imaging, while the accumulation of PICs-100 and PICs-200 was limited, indicating that the size of nanocarriers is a crucial factor for efficient delivery. Furthermore, epirubicin-incorporated polymeric micelles with a diameter similar to that of PICs-40 showed significant curative effects in rats with induced neointima, in terms of lesion size and cell number. Specific and effective drug delivery to pre-existing neointimal lesions was demonstrated with adequate size control of the nanocarriers. We consider that this nanocarrier-based drug delivery system could be utilized for the treatment of atherosclerosis. PMID:27183493
Sample Size and Correlational Inference
In 4 studies, the authors examined the hypothesis that the structure of the informational environment makes small samples more informative than large ones for drawing inferences about population correlations. The specific purpose of the studies was to test predictions arising from the signal detection simulations of R. B. Anderson, M. E. Doherty,…
Sample size requirements for training high-dimensional risk predictors
A common objective of biomarker studies is to develop a predictor of patient survival outcome. Determining the number of samples required to train a predictor from survival data is important for designing such studies. Existing sample size methods for training studies use parametric models for the high-dimensional data and cannot handle a right-censored dependent variable. We present a new training sample size method that is non-parametric with respect to the high-dimensional vectors, and is developed for a right-censored response. The method can be applied to any prediction algorithm that satisfies a set of conditions. The sample size is chosen so that the expected performance of the predictor is within a user-defined tolerance of optimal. The central method is based on a pilot dataset. To quantify uncertainty, a method to construct a confidence interval for the tolerance is developed. Adequacy of the size of the pilot dataset is discussed. An alternative model-based version of our method for estimating the tolerance when no adequate pilot dataset is available is presented. The model-based method requires a covariance matrix be specified, but we show that the identity covariance matrix provides adequate sample size when the user specifies three key quantities. Application of the sample size method to two microarray datasets is discussed. PMID:23873895
How to Show that Sample Size Matters
This article suggests how to explain a problem of small sample size when considering correlation between two Normal variables. Two techniques are shown: one based on graphs and the other on simulation. (Contains 3 figures and 1 table.)
Sample sizes for confidence limits for reliability.
We recently performed an evaluation of the implications of a reduced stockpile of nuclear weapons for surveillance to support estimates of reliability. We found that one technique developed at Sandia National Laboratories (SNL) under-estimates the required sample size for systems-level testing. For a large population the discrepancy is not important, but for a small population it is important. We found that another technique used by SNL provides the correct required sample size. For systems-level testing of nuclear weapons, samples are selected without replacement, and the hypergeometric probability distribution applies. Both of the SNL techniques focus on samples without defects from sampling without replacement. We generalized the second SNL technique to cases with defects in the sample. We created a computer program in Mathematica to automate the calculation of confidence for reliability. We also evaluated sampling with replacement where the binomial probability distribution applies.
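The difference between sampling without and with replacement can be sketched as follows. This is a generic zero-defect calculation and not a reproduction of either SNL technique or of the Mathematica program. The function names and the example population (100 units, 10 assumed defective, 90% confidence) are hypothetical.

```python
from math import comb


def p_zero_defects(population, defectives, n):
    """Hypergeometric probability of drawing no defectives in n draws without replacement."""
    return comb(population - defectives, n) / comb(population, n)


def min_sample_hypergeom(population, defectives, confidence):
    """Smallest n for which a defect-free sample would be seen with probability
    at most 1 - confidence if the population really held `defectives` bad units."""
    for n in range(1, population + 1):
        if p_zero_defects(population, defectives, n) <= 1 - confidence:
            return n
    return population


def min_sample_binomial(p_defect, confidence):
    """Same criterion under sampling with replacement (binomial model)."""
    n = 1
    while (1 - p_defect) ** n > 1 - confidence:
        n += 1
    return n


# Hypothetical small population: 100 units, 10 of them defective.
print(min_sample_hypergeom(100, 10, 0.90))  # 20
print(min_sample_binomial(0.10, 0.90))      # 22
```

For a small population the two models give different answers, which is consistent with the abstract's remark that the discrepancy matters only when the population is small.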
Experimental determination of size distributions: analyzing proper sample sizes
The measurement of various particle size distributions is a crucial aspect for many applications in the process industry. Size distribution is often related to the final product quality, as in crystallization or polymerization. In other cases it is related to the correct evaluation of heat and mass transfer, as well as reaction rates, depending on the interfacial area between the different phases or to the assessment of yield stresses of polycrystalline metals/alloys samples. The experimental determination of such distributions often involves laborious sampling procedures and the statistical significance of the outcome is rarely investigated. In this work, we propose a novel rigorous tool, based on inferential statistics, to determine the number of samples needed to obtain reliable measurements of size distribution, according to specific requirements defined a priori. Such methodology can be adopted regardless of the measurement technique used.
Finite sample size effects in transformation kinetics
The effect of finite sample size on the kinetic law of phase transformations is considered. The case where the second phase develops by a nucleation and growth mechanism is treated under the assumption of isothermal conditions and constant and uniform nucleation rate. It is demonstrated that for spherical particle growth, a thin sample transformation formula given previously is an approximate version of a more general transformation law. The thin sample approximation is shown to be reliable when a certain dimensionless thickness is small. The latter quantity, rather than the actual sample thickness, determines when the usual law of transformation kinetics valid for bulk (large dimension) samples must be modified.
Sample size calculation in metabolic phenotyping studies.
The number of samples needed to identify significant effects is a key question in biomedical studies, with consequences on experimental designs, costs and potential discoveries. In metabolic phenotyping studies, sample size determination remains a complex step. This is due particularly to the multiple hypothesis-testing framework and the top-down hypothesis-free approach, with no a priori known metabolic target. Until now, there was no standard procedure available to address this purpose. In this review, we discuss sample size estimation procedures for metabolic phenotyping studies. We release an automated implementation of the Data-driven Sample size Determination (DSD) algorithm for MATLAB and GNU Octave. Original research concerning DSD was published elsewhere. DSD allows the determination of an optimized sample size in metabolic phenotyping studies. The procedure uses analytical data only from a small pilot cohort to generate an expanded data set. The statistical recoupling of variables procedure is used to identify metabolic variables, and their intensity distributions are estimated by kernel smoothing or log-normal density fitting. Statistically significant metabolic variations are evaluated using the Benjamini-Yekutieli correction and processed for data sets of various sizes. Optimal sample size determination is achieved in a context of biomarker discovery (at least one statistically significant variation) or metabolic exploration (a maximum of statistically significant variations). The DSD toolbox is coded in MATLAB R2008A (Mathworks, Natick, MA) for kernel and log-normal estimates, and in GNU Octave for log-normal estimates (kernel density estimates are not robust enough in GNU Octave). It is available at http://www.prabi.fr/redmine/projects/dsd/repository, with a tutorial at http://www.prabi.fr/redmine/projects/dsd/wiki. PMID:25600654
Improved sample size determination for attributes and variables sampling
Earlier INMM papers have addressed the attributes/variables problem and, under conservative/limiting approximations, have reported analytical solutions for the attributes and variables sample sizes. Through computer simulation of this problem, we have calculated attributes and variables sample sizes as a function of falsification, measurement uncertainties, and required detection probability without using approximations. Using realistic assumptions for uncertainty parameters of measurement, the simulation results support the conclusions: (1) previously used conservative approximations can be expensive because they lead to larger sample sizes than needed; and (2) the optimal verification strategy, as well as the falsification strategy, are highly dependent on the underlying uncertainty parameters of the measurement instruments. 1 ref., 3 figs.
Exploratory Factor Analysis with Small Sample Sizes
Exploratory factor analysis (EFA) is generally regarded as a technique for large sample sizes ("N"), with N = 50 as a reasonable absolute minimum. This study offers a comprehensive overview of the conditions in which EFA can yield good quality results for "N" below 50. Simulations were carried out to estimate the minimum required "N" for different…
A New Sample Size Formula for Regression.
Brooks, Gordon P.; Barcikowski, Robert S.
The focus of this research was to determine the efficacy of a new method of selecting sample sizes for multiple linear regression. A Monte Carlo simulation was used to study both empirical predictive power rates and empirical statistical power rates of the new method and seven other methods: those of C. N. Park and A. L. Dudycha (1974); J. Cohen…
Predicting sample size required for classification performance
2012-01-01
Background: Supervised learning methods need annotated data in order to generate efficient models. Annotated data, however, is a relatively scarce resource and can be expensive to obtain. For both passive and active learning methods, there is a need to estimate the size of the annotated sample required to reach a performance target. Methods: We designed and implemented a method that fits an inverse power law model to points of a given learning curve created using a small annotated training set. Fitting is carried out using nonlinear weighted least squares optimization. The fitted model is then used to predict the classifier's performance and confidence interval for larger sample sizes. For evaluation, the nonlinear weighted curve fitting method was applied to a set of learning curves generated using clinical text and waveform classification tasks with active and passive sampling methods, and predictions were validated using standard goodness of fit measures. As control we used an un-weighted fitting method. Results: A total of 568 models were fitted and the model predictions were compared with the observed performances. Depending on the data set and sampling method, it took between 80 and 560 annotated samples to achieve mean average and root mean squared error below 0.01. Results also show that our weighted fitting method outperformed the baseline un-weighted method (p < 0.05). Conclusions: This paper describes a simple and effective sample size prediction algorithm that conducts weighted fitting of learning curves. The algorithm outperformed an un-weighted algorithm described in previous literature. It can help researchers determine annotation sample size for supervised machine learning. PMID:22336388
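The idea of fitting a weighted inverse power law to a learning curve can be sketched with the standard library alone. This is not the authors' implementation. The parameterization acc(n) = a - b * n^(-c), the grid search over c (swapped in for a general nonlinear optimizer), the weights proportional to sample size, and the synthetic learning-curve points are all assumptions made for illustration.

```python
def fit_inverse_power_law(sizes, scores, weights, c_grid=None):
    """Weighted least-squares fit of acc(n) = a - b * n**(-c).

    For each candidate c the model is linear in (a, b), so a and b are solved
    in closed form; the c with the smallest weighted squared error wins.
    """
    if c_grid is None:
        c_grid = [k / 100 for k in range(1, 301)]  # assumed search range 0.01..3.00
    best = None
    sw = sum(weights)
    for c in c_grid:
        u = [n ** (-c) for n in sizes]
        mu = sum(w * ui for w, ui in zip(weights, u)) / sw
        my = sum(w * yi for w, yi in zip(weights, scores)) / sw
        suu = sum(w * (ui - mu) ** 2 for w, ui in zip(weights, u))
        suy = sum(w * (ui - mu) * (yi - my) for w, ui, yi in zip(weights, u, scores))
        slope = suy / suu
        a, b = my - slope * mu, -slope
        sse = sum(w * (yi - (a - b * ui)) ** 2 for w, ui, yi in zip(weights, u, scores))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best[1], best[2], best[3]


def predict(n, a, b, c):
    """Predicted performance at training-set size n."""
    return a - b * n ** (-c)


# Synthetic learning-curve points (hypothetical, noiseless for clarity).
sizes = [10, 20, 40, 80, 160]
scores = [0.9 - 0.5 * n ** -0.5 for n in sizes]
a, b, c = fit_inverse_power_law(sizes, scores, weights=sizes)
print(round(predict(1000, a, b, c), 4))
```

Weighting later points more heavily reflects the intuition that measurements at larger training sizes say more about the asymptote, but the specific weights here are only one possible choice.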
Statistical Analysis Techniques for Small Sample Sizes
NASA Technical Reports Server (NTRS)
The small sample size problem encountered in the analysis of space-flight data is examined. Because only a small amount of data is available, careful analyses are essential to extract the maximum amount of information with acceptable accuracy. Statistical analysis of small samples is described. The background material necessary for understanding statistical hypothesis testing is outlined and the various tests which can be done on small samples are explained. Emphasis is on the underlying assumptions of each test and on considerations needed to choose the most appropriate test for a given type of analysis.
Planning sample sizes when effect sizes are uncertain: The power-calibrated effect size approach.
Statistical power and thus the sample size required to achieve some desired level of power depend on the size of the effect of interest. However, effect sizes are seldom known exactly in psychological research. Instead, researchers often possess an estimate of an effect size as well as a measure of its uncertainty (e.g., a standard error or confidence interval). Previous proposals for planning sample sizes either ignore this uncertainty thereby resulting in sample sizes that are too small and thus power that is lower than the desired level or overstate the impact of this uncertainty thereby resulting in sample sizes that are too large and thus power that is higher than the desired level. We propose a power-calibrated effect size (PCES) approach to sample size planning that accounts for the uncertainty associated with an effect size estimate in a properly calibrated manner: sample sizes determined on the basis of the PCES are neither too small nor too large and thus provide the desired level of power. We derive the PCES for comparisons of independent and dependent means, comparisons of independent and dependent proportions, and tests of correlation coefficients. We also provide a tutorial on setting sample sizes for a replication study using data from prior studies and discuss an easy-to-use website and code that implement our PCES approach to sample size planning. PMID:26651984
Sample-size requirements for evaluating population size structure
A method with an accompanying computer program is described to estimate the number of individuals needed to construct a sample length-frequency with a given accuracy and precision. First, a reference length-frequency assumed to be accurate for a particular sampling gear and collection strategy was constructed. Bootstrap procedures created length-frequencies with increasing sample size that were randomly chosen from the reference data and then were compared with the reference length-frequency by calculating the mean squared difference. Outputs from two species collected with different gears and an artificial even length-frequency are used to describe the characteristics of the method. The relations between the number of individuals used to construct a length-frequency and the similarity to the reference length-frequency followed a negative exponential distribution and showed the importance of using 300-400 individuals whenever possible.
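The bootstrap comparison described above can be sketched roughly as follows. This is not the authors' program. The bin width, number of bins, number of replicates, and the synthetic reference sample are hypothetical choices.

```python
import random
from collections import Counter


def length_frequency(lengths, bin_width, n_bins):
    """Proportion of individuals in each length bin (last bin is open-ended)."""
    counts = Counter(
        max(0, min(int(x // bin_width), n_bins - 1)) for x in lengths
    )
    total = len(lengths)
    return [counts.get(i, 0) / total for i in range(n_bins)]


def mean_squared_difference(reference, n, bin_width, n_bins, reps, rng):
    """Average squared difference between bootstrap length-frequencies of size n
    and the reference length-frequency, averaged over bins and replicates."""
    ref_freq = length_frequency(reference, bin_width, n_bins)
    total = 0.0
    for _ in range(reps):
        sample = rng.choices(reference, k=n)  # resample with replacement
        freq = length_frequency(sample, bin_width, n_bins)
        total += sum((f - r) ** 2 for f, r in zip(freq, ref_freq)) / n_bins
    return total / reps


# Synthetic reference data standing in for a real catch (lengths in mm).
rng = random.Random(1)
reference = [max(0.0, rng.gauss(300, 50)) for _ in range(2000)]
for n in (50, 100, 200, 400):
    print(n, mean_squared_difference(reference, n, 20, 30, 200, rng))
```

Plotting the printed values against n would show the diminishing improvement with larger samples that motivates the recommendation in the abstract.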
Sample Size for Confidence Interval of Covariate-Adjusted Mean Difference
This article provides a way to determine adequate sample size for the confidence interval of covariate-adjusted mean difference in randomized experiments. The standard error of adjusted mean difference depends on covariate variance and balance, which are two unknown quantities at the stage of planning sample size. If covariate observations are…
Effects of sample size on KERNEL home range estimates
Seaman, D.E.; Millspaugh, J.J.; Kernohan, Brian J.; Brundige, Gary C.; Raedeke, Kenneth J.; Gitzen, Robert A.
1999-01-01
Kernel methods for estimating home range are being used increasingly in wildlife research, but the effect of sample size on their accuracy is not known. We used computer simulations of 10-200 points/home range and compared accuracy of home range estimates produced by fixed and adaptive kernels with the reference (REF) and least-squares cross-validation (LSCV) methods for determining the amount of smoothing. Simulated home ranges varied from simple to complex shapes created by mixing bivariate normal distributions. We used the size of the 95% home range area and the relative mean squared error of the surface fit to assess the accuracy of the kernel home range estimates. For both measures, the bias and variance approached an asymptote at about 50 observations/home range. The fixed kernel with smoothing selected by LSCV provided the least-biased estimates of the 95% home range area. All kernel methods produced similar surface fit for most simulations, but the fixed kernel with LSCV had the lowest frequency and magnitude of very poor estimates. We reviewed 101 papers published in The Journal of Wildlife Management (JWM) between 1980 and 1997 that estimated animal home ranges. A minority of these papers used nonparametric utilization distribution (UD) estimators, and most did not adequately report sample sizes. We recommend that home range studies using kernel estimates use LSCV to determine the amount of smoothing, obtain a minimum of 30 observations per animal (but preferably ≥50), and report sample sizes in published results.
40 CFR 80.127 - Sample size guidelines.
... attest engagement, the auditor shall sample relevant populations to which agreed-upon procedures will be... population; and (b) Sample size shall be determined using one of the following options: (1) Option 1. Determine the sample size using the following table: Sample Size, Based Upon Population Size No....
(Sample) Size Matters! An Examination of Sample Size from the SPRINT Trial
Introduction: Inadequate sample size and power in randomized trials can result in misleading findings. This study demonstrates the effect of sample size in a large, clinical trial by evaluating the results of the SPRINT (Study to Prospectively evaluate Reamed Intramedullary Nails in Patients with Tibial fractures) trial as it progressed. Methods: The SPRINT trial evaluated reamed versus unreamed nailing of the tibia in 1226 patients, as well as in open and closed fracture subgroups (N=400 and N=826, respectively). We analyzed the re-operation rates and relative risk comparing treatment groups at 50, 100 and then increments of 100 patients up to the final sample size. Results at various enrollments were compared to the final SPRINT findings. Results: In the final analysis, there was a statistically significant decreased risk of re-operation with reamed nails for closed fractures (relative risk reduction 35%). Results for the first 35 patients enrolled suggested reamed nails increased the risk of re-operation in closed fractures by 165%. Only after 543 patients with closed fractures were enrolled did the results reflect the final advantage for reamed nails in this subgroup. Similarly, the trend towards an increased risk of re-operation for open fractures (23%) was not seen until 62 patients with open fractures were enrolled. Conclusions: Our findings highlight the risk of conducting a trial with insufficient sample size and power. Such studies are not only at risk of missing true effects, but also of giving misleading results. Level of Evidence: N/A PMID:23525086
Adequate description of hydraulic variables based on a sample of field measurements is challenging in coarse-bed streams, a consequence of high spatial heterogeneity in flow properties that arises due to the complexity of channel boundary. By applying a resampling procedure based on bootstrapping to an extensive field data set, we have estimated sampling variability and its relationship with sample size in relation to two common methods of representing flow characteristics, spatially averaged velocity profiles and fitted probability distributions. The coefficient of variation in bed shear stress and roughness length estimated from spatially averaged velocity profiles and in shape and scale parameters of gamma distribution fitted to local values of bed shear stress, velocity, and depth was high, reaching 15-20% of the parameter value even at the sample size of 100 (sampling density 1 m⁻²). We illustrated implications of these findings with two examples. First, sensitivity analysis of a 2-D hydrodynamic model to changes in roughness length parameter showed that the sampling variability range observed in our resampling procedure resulted in substantially different frequency distributions and spatial patterns of modeled hydraulic variables. Second, using a bedload formula, we showed that propagation of uncertainty in the parameters of a gamma distribution used to model bed shear stress led to the coefficient of variation in predicted transport rates exceeding 50%. Overall, our findings underscore the importance of reporting the precision of estimated hydraulic parameters. When such estimates serve as input into models, uncertainty propagation should be explicitly accounted for by running ensemble simulations.
Public Opinion Polls, Chicken Soup and Sample Size
Cooking and tasting chicken soup in three different pots of very different size serves to demonstrate that it is the absolute sample size that matters the most in determining the accuracy of the findings of the poll, not the relative sample size, i.e. the size of the sample in relation to its population.
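The claim that absolute rather than relative sample size drives poll accuracy can be checked with the usual margin-of-error formula for a proportion, including the finite population correction. This is a sketch under textbook assumptions (simple random sampling, normal approximation, worst-case p = 0.5). The function name and the population figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, population, p=0.5, confidence=0.95):
    """Half-width of the confidence interval for a proportion,
    with finite population correction for sampling without replacement."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    fpc = sqrt((population - n) / (population - 1))
    return z * sqrt(p * (1 - p) / n) * fpc


# The same 1,000 respondents drawn from three "pots" of very different size.
for population in (100_000, 10_000_000, 300_000_000):
    print(population, round(margin_of_error(1000, population), 4))

# Changing the absolute sample size in a fixed population matters far more.
for n in (100, 1000, 10_000):
    print(n, round(margin_of_error(n, 10_000_000), 4))
```

With 1,000 respondents the margin stays at roughly 3 percentage points regardless of population size, while changing the sample size itself shifts it substantially.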
7 CFR 52.803 - Sample unit size.
... 7 Agriculture 2 2010-01-01 2010-01-01 false Sample unit size. 52.803 Section 52.803 Agriculture... United States Standards for Grades of Frozen Red Tart Pitted Cherries Sample Unit Size § 52.803 Sample unit size. Compliance with requirements for size and the various quality factors is based on...
7 CFR 52.3757 - Standard sample unit size.
... Ripe Olives 1 Product Description, Types, Styles, and Grades § 52.3757 Standard sample unit size... following standard sample unit size for the applicable style: (a) Whole and pitted—50 olives. (b)...
Considerations when calculating the sample size for an inequality test
2016-01-01
Calculating the sample size is a vital step during the planning of a study in order to ensure the desired power for detecting clinically meaningful differences. However, estimating the sample size is not always straightforward. A number of key components should be considered to calculate a suitable sample size. In this paper, general considerations for conducting sample size calculations for inequality tests are summarized. PMID:27482308
Objective: The gonadotropin stimulation test is the gold standard to document precocious puberty. However, the test is costly, time-consuming and uncomfortable. The aim of this study was to simplify the intravenous gonadotropin-releasing hormone (GnRH) stimulation test in the diagnosis of precocious puberty and in the assessment of pubertal suppression. Methods: Data pertaining to 584 GnRH stimulation tests (314 tests for diagnosis and 270 for assessment of pubertal suppression) were analyzed. Results: Forty-minute post-injection samples had the greatest frequency of “peaking luteinizing hormone (LH)” (p<0.001) in the diagnostic tests. When the cut-off value was taken as 5 IU/L for LH, the 40th-minute sample was found to have 98% sensitivity and 100% specificity in the diagnosis of precocious puberty, while the sensitivity and specificity of the 20th-minute sample were 100% in the assessment of pubertal suppression. Conclusion: LH level at the 40th minute post-injection in the diagnosis of central precocious puberty and at the 20th minute post-injection in the assessment of pubertal suppression is highly sensitive and specific. A single sample at these time points can be used in the diagnosis of early puberty and in the assessment of pubertal suppression. Conflict of interest: None declared. PMID:21448328
Research in fields other than education has found that studies with small sample sizes tend to have larger effect sizes than those with large samples. This article examines the relationship between sample size and effect size in education. It analyzes data from 185 studies of elementary and secondary mathematics programs that met the standards of…
Strategies for Field Sampling When Large Sample Sizes are Required
Estimates of prevalence or incidence of infection with a pathogen endemic in a fish population can be valuable information for development and evaluation of aquatic animal health management strategies. However, hundreds of unbiased samples may be required in order to accurately estimate these parame...
7 CFR 52.775 - Sample unit size.
... Cherries 1 Sample Unit Size § 52.775 Sample unit size. Compliance with requirements for the size and the..., color, pits, and character—20 ounces of drained cherries. (b) Defects (other than harmless extraneous material)—100 cherries. (c) Harmless extraneous material—The total contents of each container in the...
Optimal flexible sample size design with robust power.
It is well recognized that sample size determination is challenging because of the uncertainty on the treatment effect size. Several remedies are available in the literature. Group sequential designs start with a sample size based on a conservative (smaller) effect size and allow early stop at interim looks. Sample size re-estimation designs start with a sample size based on an optimistic (larger) effect size and allow sample size increase if the observed effect size is smaller than planned. Different opinions favoring one type over the other exist. We propose an optimal approach using an appropriate optimality criterion to select the best design among all the candidate designs. Our results show that (1) for the same type of designs, for example, group sequential designs, there is room for significant improvement through our optimization approach; (2) optimal promising zone designs appear to have no advantages over optimal group sequential designs; and (3) optimal designs with sample size re-estimation deliver the best adaptive performance. We conclude that to deal with the challenge of sample size determination due to effect size uncertainty, an optimal approach can help to select the best design that provides most robust power across the effect size range of interest. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26999385
Simple, Defensible Sample Sizes Based on Cost Efficiency
Summary: The conventional approach of choosing sample size to provide 80% or greater power ignores the cost implications of different sample size choices. Costs, however, are often impossible for investigators and funders to ignore in actual practice. Here, we propose and justify a new approach for choosing sample size based on cost efficiency, the ratio of a study’s projected scientific and/or practical value to its total cost. By showing that a study’s projected value exhibits diminishing marginal returns as a function of increasing sample size for a wide variety of definitions of study value, we are able to develop two simple choices that can be defended as more cost efficient than any larger sample size. The first is to choose the sample size that minimizes the average cost per subject. The second is to choose sample size to minimize total cost divided by the square root of sample size. This latter method is theoretically more justifiable for innovative studies, but also performs reasonably well and has some justification in other cases. For example, if projected study value is assumed to be proportional to power at a specific alternative and total cost is a linear function of sample size, then this approach is guaranteed either to produce more than 90% power or to be more cost efficient than any sample size that does. These methods are easy to implement, based on reliable inputs, and well justified, so they should be regarded as acceptable alternatives to current conventional approaches. PMID:18482055
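The second rule in this abstract, minimizing total cost divided by the square root of sample size, is simple enough to sketch directly. The cost model and its figures (a fixed cost of 20,000 and 250 per subject) are hypothetical, and the function names are not from the paper.

```python
from math import sqrt


def n_min_cost_per_root_n(total_cost, n_max):
    """Sample size in 1..n_max that minimizes total_cost(n) / sqrt(n)."""
    return min(range(1, n_max + 1), key=lambda n: total_cost(n) / sqrt(n))


def linear_cost(n, fixed=20_000.0, per_subject=250.0):
    """Hypothetical linear cost model: fixed start-up cost plus a cost per subject."""
    return fixed + per_subject * n


print(n_min_cost_per_root_n(linear_cost, n_max=1000))  # 80
```

For a linear cost model the minimum falls where the fixed cost equals the total per-subject cost (here 20,000 / 250 = 80), which can be confirmed by setting the derivative of (F + v n) / sqrt(n) to zero. The first rule, minimizing average cost per subject, only has an interior minimum when the cost function is not linear in this way, so it is not shown.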
7 CFR 51.2341 - Sample size for grade determination.
... 7 Agriculture 2 2010-01-01 2010-01-01 false Sample size for grade determination. 51.2341 Section..., AND STANDARDS) United States Standards for Grades of Kiwifruit § 51.2341 Sample size for grade determination. For fruit place-packed in tray pack containers, the sample shall consist of the contents of...
A computer program for sample size computations for banding studies
Wilson, K.R.; Nichols, J.D.; Hines, J.E.
Estimating optimal sampling unit sizes for satellite surveys
This paper reports on an approach for minimizing data loads associated with satellite-acquired data, while improving the efficiency of global crop area estimates using remotely sensed, satellite-based data. Results of a sampling unit size investigation are given that include closed-form models for both nonsampling and sampling error variances. These models provide estimates of the sampling unit sizes that effect minimal costs. Earlier findings from foundational sampling unit size studies conducted by Mahalanobis, Jessen, Cochran, and others are utilized in modeling the sampling error variance as a function of sampling unit size. A conservative nonsampling error variance model is proposed that is realistic in the remote sensing environment where one is faced with numerous unknown nonsampling errors. This approach permits the sampling unit size selection in the global crop inventorying environment to be put on a more quantitative basis while conservatively guarding against expected component error variances.
A review of software for sample size determination.
The size of a sample is an important element in determining the statistical precision with which population values can be estimated. This article identifies and describes free and commercial programs for sample size determination. Programs are categorized as follows: (a) multiple procedure for sample size determination; (b) single procedure for sample size determination; and (c) Web-based. Programs are described in terms of (a) cost; (b) ease of use, including interface, operating system and hardware requirements, and availability of documentation and technical support; (c) file management, including input and output formats; and (d) analytical and graphical capabilities. PMID:19696082
7 CFR 52.775 - Sample unit size.
... United States Standards for Grades of Canned Red Tart Pitted Cherries 1 Sample Unit Size § 52.775 Sample... drained cherries. (b) Defects (other than harmless extraneous material)—100 cherries. (c)...
40 CFR 80.127 - Sample size guidelines.
...) REGULATION OF FUELS AND FUEL ADDITIVES Attest Engagements § 80.127 Sample size guidelines. In performing the attest engagement, the auditor shall sample relevant populations to which agreed-upon procedures will...
Sample Sizes when Using Multiple Linear Regression for Prediction
When using multiple regression for prediction purposes, the issue of minimum required sample size often needs to be addressed. Using a Monte Carlo simulation, models with varying numbers of independent variables were examined and minimum sample sizes were determined for multiple scenarios at each number of independent variables. The scenarios…
7 CFR 52.775 - Sample unit size.
... 7 Agriculture 2 2010-01-01 2010-01-01 false Sample unit size. 52.775 Section 52.775 Agriculture Regulations of the Department of Agriculture AGRICULTURAL MARKETING SERVICE (Standards, Inspections, Marketing... United States Standards for Grades of Canned Red Tart Pitted Cherries 1 Sample Unit Size § 52.775...
Minimum Sample Size Recommendations for Conducting Factor Analyses
There is no shortage of recommendations regarding the appropriate sample size to use when conducting a factor analysis. Suggested minimums for sample size include from 3 to 20 times the number of variables and absolute ranges from 100 to over 1,000. For the most part, there is little empirical evidence to support these recommendations. This…
Power Analysis and Sample Size Determination in Metabolic Phenotyping.
Estimation of statistical power and sample size is a key aspect of experimental design. However, in metabolic phenotyping, there is currently no accepted approach for these tasks, in large part due to the unknown nature of the expected effect. In such hypothesis free science, neither the number or class of important analytes nor the effect size are known a priori. We introduce a new approach, based on multivariate simulation, which deals effectively with the highly correlated structure and high-dimensionality of metabolic phenotyping data. First, a large data set is simulated based on the characteristics of a pilot study investigating a given biomedical issue. An effect of a given size, corresponding either to a discrete (classification) or continuous (regression) outcome is then added. Different sample sizes are modeled by randomly selecting data sets of various sizes from the simulated data. We investigate different methods for effect detection, including univariate and multivariate techniques. Our framework allows us to investigate the complex relationship between sample size, power, and effect size for real multivariate data sets. For instance, we demonstrate for an example pilot data set that certain features achieve a power of 0.8 for a sample size of 20 samples or that a cross-validated predictivity QY(2) of 0.8 is reached with an effect size of 0.2 and 200 samples. We exemplify the approach for both nuclear magnetic resonance and liquid chromatography-mass spectrometry data from humans and the model organism C. elegans. PMID:27116637
Sample size re-estimation in a breast cancer trial
Background: During the recruitment phase of a randomized breast cancer trial investigating the time to recurrence, we found evidence that the failure probabilities used at the design stage were too high. Since most of the methodological research involving sample size re-estimation has focused on normal or binary outcomes, we developed a method which preserves blinding to re-estimate sample size in our time-to-event trial. Purpose: A mistakenly high estimate of the failure rate at the design stage may reduce the power unacceptably for a clinically important hazard ratio. We describe an ongoing trial and an application of a sample size re-estimation method that combines current trial data with prior trial data or assumes a parametric model to re-estimate failure probabilities in a blinded fashion. Methods: Using our current blinded trial data and additional information from prior studies, we re-estimate the failure probabilities to be used in sample size re-calculation. We employ bootstrap resampling to quantify uncertainty in the re-estimated sample sizes. Results: At the time of re-estimation, data from 278 patients were available, averaging 1.2 years of follow-up. Using either method, we estimated an increase of 0 for the hazard ratio proposed at the design stage. We show that our method of blinded sample size re-estimation preserves the Type I error rate. We show that when the initial guess of the failure probabilities is correct, the median increase in sample size is zero. Limitations: Either some prior knowledge of an appropriate survival distribution shape or prior data is needed for re-estimation. Conclusions: In trials where the accrual period is lengthy, blinded sample size re-estimation near the end of the planned accrual period should be considered. In our examples, when assumptions about failure probabilities and HRs are correct, the methods usually do not increase sample size, or otherwise increase it by very little. PMID:20392786
Methods for sample size determination in cluster randomized trials
Background: The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. Methods: We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. Results: We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. Conclusions: There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials. PMID:26174515
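The "simplest approach" described in this abstract (compute the sample size for individual randomization, then inflate it by a design effect) can be sketched as follows. This is a minimal illustration, not the paper's own code: it assumes equal cluster sizes and the commonly used design effect 1 + (m - 1) * ICC, and the function names and input values are hypothetical.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Simple design effect 1 + (m - 1) * ICC, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1.0) * icc


def clusters_per_arm(n_individual: int, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm after inflating an individually randomized n."""
    n_inflated = n_individual * design_effect(cluster_size, icc)
    # Round up: a fraction of a cluster cannot be recruited.
    return math.ceil(n_inflated / cluster_size)


# Hypothetical inputs: 64 participants per arm under individual randomization,
# 20 participants per cluster, intracluster correlation coefficient of 0.05.
deff = design_effect(20, 0.05)
k = clusters_per_arm(64, 20, 0.05)
print(f"design effect = {deff:.2f}, clusters per arm = {k}")
```

As the abstract notes, this simple inflation may not hold when cluster sizes vary or when attrition, non-compliance, or covariates matter; those cases need the more elaborate methods the paper summarises.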
The cost for conducting a "thorough QT/QTc study" is substantial and an unsuccessful outcome of the study can be detrimental to the safety profile of the drug, so sample size calculations play a very important role in ensuring adequate power for a thorough QT study. Current literature offers some help in designing such studies, but these methods have limitations and mostly apply only in the context of linear mixed models with compound symmetry covariance structure. It is not evident that such models can satisfactorily be employed to represent all kinds of QTc data, and the existing literature inadequately addresses whether there is a change in sample size and power for more general covariance structures for the linear mixed models. We assess the use of some of the existing methods to design a thorough QT study through data arising from a GlaxoSmithKline (GSK)-conducted thorough QT study, and explore newer models for sample size calculation. We also provide a new method to calculate the sample size required to detect assay sensitivity with adequate power. PMID:20358438
Sample Size Requirements for Comparing Two Alpha Coefficients.
Derived general formulas to determine the sample size requirements for hypothesis testing with desired power and interval estimation with desired precision. Illustrated the approach with the example of a screening test for adolescent attention deficit disorder. (SLD)
7 CFR 52.803 - Sample unit size.
... PROCESSED FRUITS AND VEGETABLES, PROCESSED PRODUCTS THEREOF, AND CERTAIN OTHER PROCESSED FOOD PRODUCTS 1 United States Standards for Grades of Frozen Red Tart Pitted Cherries Sample Unit Size § 52.803...
The Precision Efficacy Analysis for Regression Sample Size Method.
The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…
The Sample Size Needed for the Trimmed "t" Test when One Group Size Is Fixed
ERIC Educational Resources Information Center
The sample size determination is an important issue for planning research. However, limitations in size have seldom been discussed in the literature. Thus, how to allocate participants into different treatment groups to achieve the desired power is a practical issue that still needs to be addressed when one group size is fixed. The authors focused…
Two-stage chain sampling inspection plans with different sample sizes in the two stages
A further generalization of the family of 'two-stage' chain sampling inspection plans is developed - viz, the use of different sample sizes in the two stages. Evaluation of the operating characteristics is accomplished by the Markov chain approach of the earlier work, modified to account for the different sample sizes. Markov chains for a number of plans are illustrated and several algebraic solutions are developed. Since these plans involve a variable amount of sampling, an evaluation of the average sampling number (ASN) is developed. A number of OC curves and ASN curves are presented. Some comparisons with plans having only one sample size are presented and indicate that improved discrimination is achieved by the two-sample-size plans.
Sample size calculation for the proportional hazards cure model.
In clinical trials with time-to-event endpoints, it is not uncommon to see a significant proportion of patients being cured (or long-term survivors), such as trials for the non-Hodgkins lymphoma disease. The popularly used sample size formula derived under the proportional hazards (PH) model may not be proper to design a survival trial with a cure fraction, because the PH model assumption may be violated. To account for a cure fraction, the PH cure model is widely used in practice, where a PH model is used for survival times of uncured patients and a logistic distribution is used for the probability of patients being cured. In this paper, we develop a sample size formula on the basis of the PH cure model by investigating the asymptotic distributions of the standard weighted log-rank statistics under the null and local alternative hypotheses. The derived sample size formula under the PH cure model is more flexible because it can be used to test the differences in the short-term survival and/or cure fraction. Furthermore, we also investigate as numerical examples the impacts of accrual methods and durations of accrual and follow-up periods on sample size calculation. The results show that ignoring the cure rate in sample size calculation can lead to either underpowered or overpowered studies. We evaluate the performance of the proposed formula by simulation studies and provide an example to illustrate its application with the use of data from a melanoma trial. PMID:22786805
Sample sizes in dosage investigational clinical trials: a systematic evaluation.
The main purpose of investigational phase II clinical trials is to explore indications and effective doses. However, as yet, there is no clear rule and no related published literature about the precise suitable sample sizes to be used in phase II clinical trials. To explore this, we searched for clinical trials in the ClinicalTrials.gov registry using the keywords "dose-finding" or "dose-response" and "Phase II". The time span of the search was September 20, 1999, to December 31, 2013. A total of 2103 clinical trials were finally included in our review. Regarding sample sizes, 1,156 clinical trials had <40 participants in each group, accounting for 55.0% of the studies reviewed, and only 17.2% of the studies reviewed had >100 patient cases in a single group. Sample sizes used in parallel study designs tended to be larger than those of crossover designs (median sample size 151 and 37, respectively). In conclusion, in the earlier phases of drug research and development, there are a variety of designs for dosage investigational studies. The sample size of each trial should be comprehensively considered and selected according to the study design and purpose. PMID:25609916
Aircraft studies of size-dependent aerosol sampling through inlets
Representative measurement of aerosol from aircraft-aspirated systems requires special efforts in order to maintain near isokinetic sampling conditions, estimate aerosol losses in the sample system, and obtain a measurement of sufficient duration to be statistically significant for all sizes of interest. This last point is especially critical for aircraft measurements which typically require fast response times while sampling in clean remote regions. This paper presents size-resolved tests, intercomparisons, and analysis of aerosol inlet performance as determined by a custom laser optical particle counter. Measurements discussed here took place during the Global Backscatter Experiment (1988-1989) and the Central Pacific Atmospheric Chemistry Experiment (1988). System configurations are discussed including (1) nozzle design and performance, (2) system transmission efficiency, (3) nonadiabatic effects in the sample line and its effect on the sample-line relative humidity, and (4) the use and calibration of a virtual impactor.
Sample Size Determination for One- and Two-Sample Trimmed Mean Tests
Formulas to determine the necessary sample sizes for parametric tests of group comparisons are available from several sources and appropriate when population distributions are normal. However, in the context of nonnormal population distributions, researchers recommend Yuen's trimmed mean test, but formulas to determine sample sizes have not been…
Sample size considerations for livestock movement network data.
The movement of animals between farms contributes to infectious disease spread in production animal populations, and is increasingly investigated with social network analysis methods. Tangible outcomes of this work include the identification of high-risk premises for targeting surveillance or control programs. However, knowledge of the effect of sampling or incomplete network enumeration on these studies is limited. In this study, a simulation algorithm is presented that provides an estimate of required sampling proportions based on predicted network size, density and degree value distribution. The algorithm may be applied a priori to ensure network analyses based on sampled or incomplete data provide population estimates of known precision. Results demonstrate that, for network degree measures, sample size requirements vary with sampling method. The repeatability of the algorithm output under constant network and sampling criteria was found to be consistent for networks with at least 1000 nodes (in this case, farms). Where simulated networks can be constructed to closely mimic the true network in a target population, this algorithm provides a straightforward approach to determining sample size under a given sampling procedure for a network measure of interest. It can be used to tailor study designs of known precision, for investigating specific livestock movement networks and their impact on disease dissemination within populations. PMID:26276397
Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power. PMID:27073717
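To make the interplay between effect size, power, and sample size concrete, here is a small sketch of an a priori calculation for comparing two group means. It is an illustration under stated assumptions, not the framework from the abstract: it uses the standard normal-approximation formula n = 2 (z_{1-alpha/2} + z_{power})^2 / d^2 per group for a two-sided test with a standardized effect size d, and the function name and default values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Normal approximation; a t-based calculation gives slightly larger values.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Halving the standardized effect size roughly quadruples the required sample.
for d in (0.5, 0.25):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

The loop shows why small changes in the assumed effect size can change the required sample size substantially.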
Sample Size Calculations for Precise Interval Estimation of the Eta-Squared Effect Size
Analysis of variance is one of the most frequently used statistical analyses in the behavioral, educational, and social sciences, and special attention has been paid to the selection and use of an appropriate effect size measure of association in analysis of variance. This article presents the sample size procedures for precise interval estimation…
Estimation of benthic macroinvertebrate populations over large spatial scales is difficult due to the high variability in abundance and the cost of sample processing and taxonomic analysis. To determine a cost-effective, statistically powerful sample design, we conducted an exploratory study of the spatial variation of benthic macroinvertebrates in a 37 km reach of the Upper Mississippi River. We sampled benthos at 36 sites within each of two strata, contiguous backwater and channel border. Three standard ponar (525 cm(2)) grab samples were obtained at each site ('Original Design'). Analysis of variance and sampling cost of strata-wide estimates for abundance of Oligochaeta, Chironomidae, and total invertebrates showed that only one ponar sample per site ('Reduced Design') yielded essentially the same abundance estimates as the Original Design, while reducing the overall cost by 63%. A posteriori statistical power analysis (alpha = 0.05, beta = 0.20) on the Reduced Design estimated that at least 18 sites per stratum were needed to detect differences in mean abundance between contiguous backwater and channel border areas for Oligochaeta, Chironomidae, and total invertebrates. Statistical power was nearly identical for the three taxonomic groups. The abundances of several taxa of concern (e.g., Hexagenia mayflies and Musculium fingernail clams) were too spatially variable to estimate power with our method. Resampling simulations indicated that to achieve adequate sampling precision for Oligochaeta, at least 36 sample sites per stratum would be required, whereas a sampling precision of 0.2 would not be attained with any sample size for Hexagenia in channel border areas, or Chironomidae and Musculium in both strata given the variance structure of the original samples. Community-wide diversity indices (Brillouin and 1-Simpsons) increased as sample area per site increased. The backwater area had higher diversity than the channel border area. The number of sampling sites
Approximate sample sizes required to estimate length distributions
The sample sizes required to estimate fish length were determined by bootstrapping from reference length distributions. Depending on population characteristics and species-specific maximum lengths, 1-cm length-frequency histograms required 375-1,200 fish to estimate within 10% with 80% confidence, 2.5-cm histograms required 150-425 fish, proportional stock density required 75-140 fish, and mean length required 75-160 fish. In general, smaller species, smaller populations, populations with higher mortality, and simpler length statistics required fewer samples. Indices that require low sample sizes may be suitable for monitoring population status, and when large changes in length are evident, additional sampling effort may be allocated to more precisely define length status with more informative estimators. © Copyright by the American Fisheries Society 2007.
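The bootstrapping idea in this abstract can be sketched for the simplest statistic, mean length. The code below is only an illustration under assumptions: the reference distribution is simulated (hypothetical values, not the study's data), and the criterion is that at least 80% of resampled means fall within 10% of the reference mean. The function names are made up for the example.

```python
import random
import statistics


def coverage(reference, n, rng, tolerance=0.10, resamples=2000):
    """Fraction of bootstrap means of size n within `tolerance` of the reference mean."""
    target = statistics.fmean(reference)
    hits = 0
    for _ in range(resamples):
        # Resample n lengths with replacement from the reference distribution.
        sample_mean = statistics.fmean(rng.choices(reference, k=n))
        if abs(sample_mean - target) <= tolerance * target:
            hits += 1
    return hits / resamples


def smallest_adequate_n(reference, candidates, rng, confidence=0.80):
    """Smallest candidate n whose bootstrap coverage reaches `confidence`, else None."""
    for n in sorted(candidates):
        if coverage(reference, n, rng) >= confidence:
            return n
    return None


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical reference distribution: 500 fish lengths in mm.
reference_lengths = [rng.gauss(300, 60) for _ in range(500)]
n_needed = smallest_adequate_n(reference_lengths, [2, 5, 10, 20, 50], rng)
print("smallest adequate sample size among candidates:", n_needed)
```

Histogram-based statistics, as in the abstract, would need a different agreement criterion per length bin and typically far larger samples than the mean.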
Researchers are becoming increasingly concerned with airborne particulate matter, not only in the respirable size range, but also in larger size ranges. The International Standards Organization (ISO) and the American Conference of Governmental Industrial Hygienists (ACGIH) have developed standards for "inhalable" and "thoracic" particulate matter. These require sampling particles up to approximately 100 µm in diameter. The size distribution and mass concentration of airborne particulate matter have been measured in air quality studies of the working sections of more than 20 underground mines by University of Minnesota and U.S. Bureau of Mines personnel. Measurements have been made in more than 15 coal mines and five metal/nonmetal mines over the past eight years. Although mines using diesel-powered equipment were emphasized, mines using all-electric powered equipment were also included. Particle sampling was conducted at fixed locations, i.e., mine portal, ventilation intake entry, haulageways, ventilation return entry, and near raincars, bolters and load-haul-dump equipment. The primary sampling device used was the MSP Model 100 micro-orifice uniform deposit impactor (MOUDI). The MOUDI samples at a flow rate of 30 LPM and provides particle size distribution information for particles primarily in the 0.1 to 18 µm size range. Up to five MOUDI samplers were simultaneously deployed at the fixed locations. Sampling times were typically 4 to 6 hrs/shift. Results from these field studies have been summarized to determine the average size distributions and mass concentrations at various locations in the mine section sampled. From these average size distributions, predictions are made regarding the expected levels of respirable and thoracic mass concentrations as defined by various health-based size-selective aerosol-sampling criteria.
A simulation study of sample size for DNA barcoding.
For some groups of organisms, DNA barcoding can provide a useful tool in taxonomy, evolutionary biology, and biodiversity assessment. However, the efficacy of DNA barcoding depends on the degree of sampling per species, because a large enough sample size is needed to provide a reliable estimate of genetic polymorphism and for delimiting species. We used a simulation approach to examine the effects of sample size on four estimators of genetic polymorphism related to DNA barcoding: mismatch distribution, nucleotide diversity, the number of haplotypes, and maximum pairwise distance. Our results showed that mismatch distributions derived from subsamples of ≥20 individuals usually bore a close resemblance to that of the full dataset. Estimates of nucleotide diversity from subsamples of ≥20 individuals tended to be bell-shaped around that of the full dataset, whereas estimates from smaller subsamples were not. As expected, greater sampling generally led to an increase in the number of haplotypes. We also found that subsamples of ≥20 individuals allowed a good estimate of the maximum pairwise distance of the full dataset, while smaller ones were associated with a high probability of underestimation. Overall, our study confirms the expectation that larger samples are beneficial for the efficacy of DNA barcoding and suggests that a minimum sample size of 20 individuals is needed in practice for each population. PMID:26811761
Sample Size Bias in Judgments of Perceptual Averages
Previous research has shown that people exhibit a sample size bias when judging the average of a set of stimuli on a single dimension. The more stimuli there are in the set, the greater people judge the average to be. This effect has been demonstrated reliably for judgments of the average likelihood that groups of people will experience negative,…
Small Sample Sizes Yield Biased Allometric Equations in Temperate Forests.
Accurate quantification of forest carbon stocks is required for constraining the global carbon cycle and its impacts on climate. The accuracies of forest biomass maps are inherently dependent on the accuracy of the field biomass estimates used to calibrate models, which are generated with allometric equations. Here, we provide a quantitative assessment of the sensitivity of allometric parameters to sample size in temperate forests, focusing on the allometric relationship between tree height and crown radius. We use LiDAR remote sensing to isolate between 10,000 to more than 1,000,000 tree height and crown radius measurements per site in six U.S. forests. We find that fitted allometric parameters are highly sensitive to sample size, producing systematic overestimates of height. We extend our analysis to biomass through the application of empirical relationships from the literature, and show that given the small sample sizes used in common allometric equations for biomass, the average site-level biomass bias is ~+70% with a standard deviation of 71%, ranging from -4% to +193%. These findings underscore the importance of increasing the sample sizes used for allometric equation generation. PMID:26598233
Sample Size Tables, "t" Test, and a Prevalent Psychometric Distribution.
Psychology studies often have low statistical power. Sample size tables, as given by J. Cohen (1988), may be used to increase power, but they are based on Monte Carlo studies of relatively "tame" mathematical distributions, as compared to psychology data sets. In this study, Monte Carlo methods were used to investigate Type I and Type II error…
An Investigation of Sample Size Splitting on ATFIND and DIMTEST
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
The Fisher-Yates Exact Test and Unequal Sample Sizes
A computational short cut suggested by Feldman and Klinger for the one-sided Fisher-Yates exact test is clarified and is extended to the calculation of probability values for certain two-sided tests when sample sizes are unequal. (Author)
Sampling and surface reconstruction with adaptive-size meshes
This paper presents a new approach to sampling and surface reconstruction which uses physically based models. We introduce adaptive-size meshes which automatically update the size of the meshes as the distance between the nodes changes. We have applied the adaptive-size algorithm to the following three applications: (1) sampling of intensity data; (2) surface reconstruction of range data; (3) surface reconstruction of 3-D computed tomography (CT) left ventricle (LV) data. The LV data was acquired by a 3-D CT scanner. It was provided by Dr. Eric Hoffman at University of Pennsylvania Medical School and consists of 16 volumetric (128 X 128 X 118) images taken through the heart cycle.
Unless the sample encompasses a substantial portion of the population, the standard error of an estimator depends on the size of the sample, but not the size of the population. This is a crucial statistical insight that students find very counterintuitive. After trying several ways of convincing students of the validity of this principle, I have…
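A quick numerical check can make this principle less counterintuitive. The sketch below assumes simple random sampling without replacement and uses the standard finite population correction, sqrt((N - n) / (N - 1)); the function name and the numbers are hypothetical and chosen only for illustration.

```python
import math


def standard_error_mean(sd, n, population_size=None):
    """Standard error of a sample mean, optionally with finite population correction."""
    se = sd / math.sqrt(n)
    if population_size is not None:
        # The correction only matters when n is a substantial share of N.
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


# Same sample (sd = 10, n = 100) drawn from populations of very different sizes.
for big_n in (200, 10_000, 1_000_000):
    print(f"N = {big_n:>9,}: SE = {standard_error_mean(10, 100, big_n):.4f}")
print(f"infinite N : SE = {standard_error_mean(10, 100):.4f}")
```

For populations of ten thousand and one million the standard errors are nearly identical to the uncorrected value; only when the sample is half the population (N = 200) does the population size make a visible difference.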
Effective Sample Size in Diffuse Reflectance Near-IR Spectrometry.
Two independent methods for determination of the effectively sampled mass per unit area are presented and compared. The first method combines directional-hemispherical transmittance and reflectance measurements. A three-flux approximation of the equation of radiative transfer is used to separately determine the specific absorption and scattering coefficients of the powder material, which subsequently are used to determine the effective sample size. The second method uses a number of diffuse reflectance measurements on layers of controlled powder thickness in an empirical approach. The two methods are shown to agree well and thus confirm each other. From the determination of the effective sample size at each measured wavelength in the visible-NIR region for two different model powder materials, large differences were found, both between the two analyzed powders and between different wavelengths. As an example, the effective sample size ranges between 15 and 70 mg/cm(2) for microcrystalline cellulose and between 70 and 300 mg/cm(2) for film-coated pellets. However, the contribution to the spectral information obtained from a certain layer decreases rapidly with increasing distance from the powder surface. With both methods, the extent of contribution from various depths of a powder sample to the visible-NIR diffuse reflection signal is characterized. This information is valuable for validation of analytical applications of diffuse reflectance visible-NIR spectrometry. PMID:21662719
Detecting Neuroimaging Biomarkers for Psychiatric Disorders: Sample Size Matters
2016-01-01
In a recent review, it was suggested that much larger cohorts are needed to prove the diagnostic value of neuroimaging biomarkers in psychiatry. While within a sample, an increase of diagnostic accuracy of schizophrenia (SZ) with number of subjects (N) has been shown, the relationship between N and accuracy is completely different between studies. Using data from a recent meta-analysis of machine learning (ML) in imaging SZ, we found that while low-N studies can reach 90% and higher accuracy, above N/2 = 50 the maximum accuracy achieved steadily drops to below 70% for N/2 > 150. We investigate the role N plays in the wide variability in accuracy results in SZ studies (63–97%). We hypothesize that the underlying cause of the decrease in accuracy with increasing N is sample heterogeneity. While smaller studies more easily include a homogeneous group of subjects (strict inclusion criteria are easily met; subjects live close to study site), larger studies inevitably need to relax the criteria/recruit from large geographic areas. A SZ prediction model based on a heterogeneous group of patients with presumably a heterogeneous pattern of structural or functional brain changes will not be able to capture the whole variety of changes, thus being limited to patterns shared by most patients. In addition to heterogeneity (sample size), we investigate other factors influencing accuracy and introduce a ML effect size. We derive a simple model of how the different factors, such as sample heterogeneity and study setup determine this ML effect size, and explain the variation in prediction accuracies found from the literature, both in cross-validation and independent sample testing. From this, we argue that smaller-N studies may reach high prediction accuracy at the cost of lower generalizability to other samples. Higher-N studies, on the other hand, will have more generalization power, but at the cost of lower accuracy. In conclusion, when comparing results from different
(Sample) Size Matters: Defining Error in Planktic Foraminiferal Isotope Measurement
Planktic foraminifera have been used as carriers of stable isotopic signals since the pioneering work of Urey and Emiliani. In those heady days, instrumental limitations required hundreds of individual foraminiferal tests to return a usable value. This had the fortunate side-effect of smoothing any seasonal to decadal changes within the planktic foram population, which generally turns over monthly, removing that potential noise from each sample. With the advent of more sensitive mass spectrometers, smaller sample sizes have now become standard. This has been a tremendous advantage, allowing longer time series with the same investment of time and energy. Unfortunately, the use of smaller numbers of individuals to generate a data point has lessened the amount of time averaging in the isotopic analysis and decreased precision in paleoceanographic datasets. With fewer individuals per sample, the differences between individual specimens will result in larger variation, and therefore error, and less precise values for each sample. Unfortunately, most workers (the authors included) do not make a habit of reporting the error associated with their sample size. We have created an open-source model in R to quantify the effect of sample sizes under various realistic and highly modifiable parameters (calcification depth, diagenesis in a subset of the population, improper identification, vital effects, mass, etc.). For example, a sample in which only 1 in 10 specimens is diagenetically altered can be off by >0.3‰ δ18O VPDB or ~1°C. Additionally, and perhaps more importantly, we show that under unrealistically ideal conditions (perfect preservation, etc.) it takes ~5 individuals from the mixed-layer to achieve an error of less than 0.1‰. Including just the unavoidable vital effects inflates that number to ~10 individuals to achieve ~0.1‰. Combining these errors with the typical machine error inherent in mass spectrometers makes this a vital consideration moving forward.
Rock sampling. [method for controlling particle size distribution]
A method for sampling rock and other brittle materials and for controlling resultant particle sizes is described. The method involves cutting grooves in the rock surface to provide a grouping of parallel ridges and subsequently machining the ridges to provide a powder specimen. The machining step may comprise milling, drilling, lathe cutting or the like; but a planing step is advantageous. Control of the particle size distribution is effected primarily by changing the height and width of these ridges. This control exceeds that obtainable by conventional grinding.
Air sampling filtration media: Collection efficiency for respirable size-selective sampling
The collection efficiencies of commonly used membrane air sampling filters in the ultrafine particle size range were investigated. Mixed cellulose ester (MCE; 0.45, 0.8, 1.2, and 5 μm pore sizes), polycarbonate (0.4, 0.8, 2, and 5 μm pore sizes), polytetrafluoroethylene (PTFE; 0.45, 1, 2, and 5 μm pore sizes), polyvinyl chloride (PVC; 0.8 and 5 μm pore sizes), and silver membrane (0.45, 0.8, 1.2, and 5 μm pore sizes) filters were exposed to polydisperse sodium chloride (NaCl) particles in the size range of 10–400 nm. Test aerosols were nebulized and introduced into a calm air chamber through a diffusion dryer and aerosol neutralizer. The testing filters (37 mm diameter) were mounted in a conductive polypropylene filter-holder (cassette) within a metal testing tube. The experiments were conducted at flow rates between 1.7 and 11.2 l min−1. The particle size distributions of NaCl challenge aerosol were measured upstream and downstream of the test filters by a scanning mobility particle sizer (SMPS). Three different filters of each type with at least three repetitions for each pore size were tested. In general, the collection efficiency varied with airflow, pore size, and sampling duration. In addition, both collection efficiency and pressure drop increased with decreased pore size and increased sampling flow rate, but they differed among filter types and manufacturer. The present study confirmed that the MCE, PTFE, and PVC filters have a relatively high collection efficiency for challenge particles much smaller than their nominal pore size and are considerably more efficient than polycarbonate and silver membrane filters, especially at larger nominal pore sizes. PMID:26834310
Sample size determination for longitudinal designs with binary response.
In this article, we develop appropriate statistical methods for determining the required sample size while comparing the efficacy of an intervention to a control with repeated binary response outcomes. Our proposed methodology incorporates the complexity of the hierarchical nature of underlying designs and provides solutions when varying attrition rates are present over time. We explore how the between-subject variability and attrition rates jointly influence the computation of sample size formula. Our procedure also shows how efficient estimation methods play a crucial role in power analysis. A practical guideline is provided when information regarding individual variance component is unavailable. The validity of our methods is established by extensive simulation studies. Results are illustrated with the help of two randomized clinical trials in the areas of contraception and insomnia. PMID:24820424
Effect of sample size on deformation in amorphous metals
Uniaxial compression tests were performed on micron-sized columns of amorphous PdSi to investigate the effect of sample size on deformation behavior. Cylindrical columns with diameters between 8 μm and 140 nm were fabricated from sputtered amorphous Pd77Si23 films on Si substrates by focused ion beam machining and compression tests were performed with a nanoindenter outfitted with a flat diamond punch. The columns exhibited elastic behavior until they yielded by either shear band formation on a plane at 50° to the loading axis or by homogeneous deformation. Shear band formation occurred only in columns with diameters larger than 400 nm. The change in deformation mechanism from shear band formation to homogeneous deformation with decreasing column size is attributed to a required critical strained volume for shear band formation.
GLIMMPSE Lite: Calculating Power and Sample Size on Smartphone Devices
Researchers seeking to develop complex statistical applications for mobile devices face a common set of difficult implementation issues. In this work, we discuss general solutions to the design challenges. We demonstrate the utility of the solutions for a free mobile application designed to provide power and sample size calculations for univariate, one-way analysis of variance (ANOVA), GLIMMPSE Lite. Our design decisions provide a guide for other scientists seeking to produce statistical software for mobile platforms. PMID:25541688
Tooth Wear Prevalence and Sample Size Determination: A Pilot Study
Tooth wear is the non-carious loss of tooth tissue, which results from three processes, namely attrition, erosion and abrasion. These can occur in isolation or simultaneously. Very mild tooth wear is a physiological effect of aging. This study aims to estimate the prevalence of tooth wear among 16-year-old Malay school children and determine a feasible sample size for further study. Fifty-five subjects were examined clinically, followed by the completion of self-administered questionnaires. Questionnaires consisted of socio-demographic and associated variables for tooth wear obtained from the literature. The Smith and Knight tooth wear index was used to chart tooth wear. Other oral findings were recorded using the WHO criteria. A software programme was used to determine pathological tooth wear. Males and females were included in about equal numbers. It was found that 18.2% of subjects had no tooth wear, 63.6% had very mild tooth wear, 10.9% mild tooth wear, 5.5% moderate tooth wear and 1.8% severe tooth wear. In conclusion, 18.2% of subjects were deemed to have pathological tooth wear (mild, moderate & severe). Exploration with all associated variables gave a sample size ranging from 560 to 1715. The final sample size for further study greatly depends on available time and resources. PMID:22589636
Improving Microarray Sample Size Using Bootstrap Data Combination
Microarray technology has enabled us to simultaneously measure the expression of thousands of genes. Using this high-throughput technology, we can examine subtle genetic changes between biological samples and build predictive models for clinical applications. Although microarrays have dramatically increased the rate of data collection, sample size is still a major issue when selecting features. Previous methods show that combining multiple microarray datasets improves feature selection using simple methods such as fold change. We propose a wrapper-based gene selection technique that combines bootstrap estimated classification errors for individual genes across multiple datasets and reduces the contribution of datasets with high variance. We use the bootstrap because it is an unbiased estimator of classification error that is also effective for small sample data. Coupled with data combination across multiple datasets, we show that our meta-analytic approach improves the biological relevance of gene selection using prostate and renal cancer microarray data. PMID:19164001
Developmental studies have provided mixed evidence with regard to the question of whether children consider sample size and sample diversity in their inductive generalizations. Results from four experiments with 105 undergraduates, 105 school-age children (M = 7.2 years), and 105 preschoolers (M = 4.9 years) showed that preschoolers made a higher…
Decadal predictive skill assessment - ensemble and hindcast sample size impact
Hindcast (i.e., retrospective prediction) experiments have to be performed to validate decadal prediction systems. Their number is necessarily restricted by computational constraints. From weather and seasonal prediction it is known that the ensemble size is crucial. A similar dependency is likely for decadal predictions, but differences are expected due to the differing time scales of the processes involved and the longer prediction horizon. It is shown here that the ensemble and hindcast sample sizes have a large impact on the uncertainty assessment of the ensemble mean, as well as on the detection of prediction skill. For that purpose, a conceptual model is developed that enables the systematic analysis of statistical properties and their dependencies in a framework close to that of real decadal predictions. In addition, a set of extended-range hindcast experiments has been undertaken, covering the entire 20th century.
Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes
A central challenge in the analysis of genetic variation is to provide realistic genome simulation across millions of samples. Present day coalescent simulations do not scale well, or use approximations that fail to capture important long-range linkage properties. Analysing the results of simulations also presents a substantial challenge, as current methods to store genealogies consume a great deal of space, are slow to parse and do not take advantage of shared structure in correlated trees. We solve these problems by introducing sparse trees and coalescence records as the key units of genealogical analysis. Using these tools, exact simulation of the coalescent with recombination for chromosome-sized regions over hundreds of thousands of samples is possible, and substantially faster than present-day approximate methods. We can also analyse the results orders of magnitude more quickly than with existing methods. PMID:27145223
Automated sampling assessment for molecular simulations using the effective sample size
To quantify the progress in the development of algorithms and forcefields used in molecular simulations, a general method for the assessment of the sampling quality is needed. Statistical mechanics principles suggest the populations of physical states characterize equilibrium sampling in a fundamental way. We therefore develop an approach for analyzing the variances in state populations, which quantifies the degree of sampling in terms of the effective sample size (ESS). The ESS estimates the number of statistically independent configurations contained in a simulated ensemble. The method is applicable to both traditional dynamics simulations as well as more modern (e.g., multi-canonical) approaches. Our procedure is tested in a variety of systems from toy models to atomistic protein simulations. We also introduce a simple automated procedure to obtain approximate physical states from dynamic trajectories: this allows sample-size estimation in systems for which physical states are not known in advance. PMID:21221418
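One way to read the variance-based idea in this abstract: if a state has true population p and a run contains N independent configurations, the estimated population has binomial variance p(1 − p)/N, so an observed variance across repeated runs can be inverted to give an effective N. The sketch below illustrates only that inversion; the function name and the synthetic data are assumptions, and it is not the authors' procedure.

```python
import numpy as np


def ess_from_state_populations(population_estimates) -> float:
    """Effective sample size implied by the spread of one state's
    population across repeated, independent runs.

    Inverts the binomial relation var(p_hat) = p * (1 - p) / N.
    """
    p_hat = np.asarray(population_estimates, dtype=float)
    if p_hat.size < 2:
        raise ValueError("need at least two population estimates")
    mean_p = p_hat.mean()
    var_p = p_hat.var(ddof=1)  # sample variance across runs
    if var_p == 0.0:
        return float("inf")
    return mean_p * (1.0 - mean_p) / var_p


# Synthetic check (assumed values): 2000 runs, each made of 50 truly
# independent configurations, with a state population of 0.3.
rng = np.random.default_rng(1)
estimates = rng.binomial(50, 0.3, size=2000) / 50
print(ess_from_state_populations(estimates))
```

On the synthetic data the result should land near 50, the number of independent configurations per run; correlated trajectories would give a larger spread and therefore a smaller value.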
Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses
Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
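For a scalar parameter, the ESS described above is commonly estimated from the chain's autocorrelation as ESS = N / (1 + 2 Σ ρ_k). The sketch below shows that standard scalar estimator with a simple truncation rule (stop at the first non-positive autocorrelation); it is not the tree-topology method developed in the paper, and the AR(1) chain is an assumed stand-in for MCMC output.

```python
import numpy as np


def effective_sample_size(samples) -> float:
    """ESS of a scalar chain: N / (1 + 2 * sum of positive-lag autocorrelations).

    The sum is truncated at the first lag whose estimated autocorrelation
    is not positive, a simple and common heuristic.
    """
    x = np.asarray(samples, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two samples")
    centered = x - x.mean()
    variance = np.dot(centered, centered) / n
    if variance == 0.0:
        raise ValueError("chain is constant")
    rho_sum = 0.0
    for lag in range(1, n):
        rho = np.dot(centered[:-lag], centered[lag:]) / (n * variance)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ar1_chain(n: int, phi: float, seed: int = 0) -> np.ndarray:
    """Toy autocorrelated chain x[i] = phi * x[i-1] + noise (phi=0 is i.i.d.)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=n)
    chain = np.empty(n)
    chain[0] = noise[0]
    for i in range(1, n):
        chain[i] = phi * chain[i - 1] + noise[i]
    return chain


print(effective_sample_size(ar1_chain(5000, 0.0)))  # close to 5000
print(effective_sample_size(ar1_chain(5000, 0.9)))  # far below 5000
```

With strong autocorrelation (phi = 0.9) the 5000 stored samples carry the information of only a few hundred independent draws, which is why a threshold such as ESS > 200 is checked on the ESS and not on the raw sample count.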
2014-01-01
Background: Despite the widespread use of patient-reported outcomes (PRO) in clinical studies, their design remains a challenge. Justification of study size is rarely provided, especially when a Rasch model is planned for analysing the data in a 2-group comparison study. The classical sample size formula (CLASSIC) for comparing normally distributed endpoints between two groups has been shown to be inadequate in this setting (underestimated study sizes). A correction factor (RATIO) has been proposed to reach an adequate sample size from the CLASSIC when a Rasch model is intended to be used for analysis. The objective was to explore the impact of the parameters used for study design on the RATIO and to identify the most relevant to provide a simple method for sample size determination for Rasch modelling. Methods: A large combination of parameters used for study design was simulated using a Monte Carlo method: variance of the latent trait, group effect, sample size per group, number of items and item difficulty parameters. A linear regression model explaining the RATIO and including all the former parameters as covariates was fitted. Results: The most relevant parameters explaining the RATIO's variations were the number of items and the variance of the latent trait (R² = 99.4%). Conclusions: Using the classical sample size formula adjusted with the proposed RATIO can provide a straightforward and reliable formula for sample size computation for 2-group comparison of PRO data using Rasch models. PMID:24996957
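The classical formula referred to here is usually written as n per group = 2σ²(z₁₋α/₂ + z₁₋β)²/δ² for a two-sided test. The sketch below computes it and then applies a correction factor, assuming the RATIO is applied multiplicatively; the function names and the value 1.2 are hypothetical placeholders, not figures from the paper.

```python
import math
from statistics import NormalDist


def classic_n_per_group(delta: float, sigma: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Classical per-group sample size for comparing two normal means
    (two-sided test, equal variances, equal group sizes)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    z_beta = standard_normal.inv_cdf(power)
    return math.ceil(2.0 * (sigma * (z_alpha + z_beta) / delta) ** 2)


def adjusted_n_per_group(n_classic: int, ratio: float) -> int:
    """Inflate the classical size by a correction factor.

    Assumption: the correction is multiplicative; the ratio itself must
    come from the published method, not from this sketch.
    """
    return math.ceil(n_classic * ratio)


n_classic = classic_n_per_group(delta=0.5, sigma=1.0)
print(n_classic)                             # 63 per group
print(adjusted_n_per_group(n_classic, 1.2))  # 1.2 is a made-up ratio
```

Because the required size scales with (σ/δ)², halving the detectable difference quadruples the classical sample size before any correction is applied.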
Computing Power and Sample Size for Informational Odds Ratio
The informational odds ratio (IOR) measures the post-exposure odds divided by the pre-exposure odds (i.e., information gained after knowing exposure status). A desirable property of an adjusted ratio estimate is collapsibility, wherein the combined crude ratio will not change after adjusting for a variable that is not a confounder. Adjusted traditional odds ratios (TORs) are not collapsible. In contrast, Mantel-Haenszel adjusted IORs, analogous to relative risks (RRs), generally are collapsible. IORs are a useful measure of disease association in case-referent studies, especially when the disease is common in the exposed and/or unexposed groups. This paper outlines how to compute power and sample size in the simple case of unadjusted IORs. PMID:24157518
Quantum state discrimination bounds for finite sample size
In the problem of quantum state discrimination, one has to determine by measurements the state of a quantum system, based on the a priori side information that the true state is one of the two given and completely known states, {rho} or {sigma}. In general, it is not possible to decide the identity of the true state with certainty, and the optimal measurement strategy depends on whether the two possible errors (mistaking {rho} for {sigma}, or the other way around) are treated as of equal importance or not. Results on the quantum Chernoff and Hoeffding bounds and the quantum Stein's lemma show that, if several copies of the system are available then the optimal error probabilities decay exponentially in the number of copies, and the decay rate is given by a certain statistical distance between {rho} and {sigma} (the Chernoff distance, the Hoeffding distances, and the relative entropy, respectively). While these results provide a complete solution to the asymptotic problem, they are not completely satisfying from a practical point of view. Indeed, in realistic scenarios one has access only to finitely many copies of a system, and therefore it is desirable to have bounds on the error probabilities for finite sample size. In this paper we provide finite-size bounds on the so-called Stein errors, the Chernoff errors, the Hoeffding errors, and the mixed error probabilities related to the Chernoff and the Hoeffding errors.
MEPAG Recommendations for a 2018 Mars Sample Return Caching Lander - Sample Types, Number, and Sizes
The return to Earth of geological and atmospheric samples from the surface of Mars is among the highest priority objectives of planetary science. The MEPAG Mars Sample Return (MSR) End-to-End International Science Analysis Group (MEPAG E2E-iSAG) was chartered to propose scientific objectives and priorities for returned sample science, and to map out the implications of these priorities, including for the proposed joint ESA-NASA 2018 mission that would be tasked with the crucial job of collecting and caching the samples. The E2E-iSAG identified four overarching scientific aims that relate to understanding: (A) the potential for life and its pre-biotic context, (B) the geologic processes that have affected the martian surface, (C) planetary evolution of Mars and its atmosphere, (D) potential for future human exploration. The types of samples deemed most likely to achieve the science objectives are, in priority order: (1A). Subaqueous or hydrothermal sediments (1B). Hydrothermally altered rocks or low temperature fluid-altered rocks (equal priority) (2). Unaltered igneous rocks (3). Regolith, including airfall dust (4). Present-day atmosphere and samples of sedimentary-igneous rocks containing ancient trapped atmosphere Collection of geologically well-characterized sample suites would add considerable value to interpretations of all collected rocks. To achieve this, the total number of rock samples should be about 30-40. In order to evaluate the size of individual samples required to meet the science objectives, the E2E-iSAG reviewed the analytical methods that would likely be applied to the returned samples by preliminary examination teams, for planetary protection (i.e., life detection, biohazard assessment) and, after distribution, by individual investigators. It was concluded that sample size should be sufficient to perform all high-priority analyses in triplicate. In keeping with long-established curatorial practice of extraterrestrial material, at least 40% by
7 CFR 51.2838 - Samples for grade and size determination.
... or Jumbo size or larger the package shall be the sample. When individual packages contain less than... 7 Agriculture 2 2010-01-01 2010-01-01 false Samples for grade and size determination. 51.2838... Creole Types) Samples for Grade and Size Determination § 51.2838 Samples for grade and size...
Statistical identifiability and sample size calculations for serial seroepidemiology
Inference on disease dynamics is typically performed using case reporting time series of symptomatic disease. The inferred dynamics will vary depending on the reporting patterns and surveillance system for the disease in question, and the inference will miss mild or underreported epidemics. To eliminate the variation introduced by differing reporting patterns and to capture asymptomatic or subclinical infection, inferential methods can be applied to serological data sets instead of case reporting data. To reconstruct complete disease dynamics, one would need to collect a serological time series. In the statistical analysis presented here, we consider a particular kind of serological time series with repeated, periodic collections of population-representative serum. We refer to this study design as a serial seroepidemiology (SSE) design, and we base the analysis on our epidemiological knowledge of influenza. We consider a study duration of three to four years, during which a single antigenic type of influenza would be circulating, and we evaluate our ability to reconstruct disease dynamics based on serological data alone. We show that the processes of reinfection, antibody generation, and antibody waning confound each other and are not always statistically identifiable, especially when dynamics resemble a non-oscillating endemic equilibrium behavior. We introduce some constraints to partially resolve this confounding, and we show that transmission rates and basic reproduction numbers can be accurately estimated in SSE study designs. Seasonal forcing is more difficult to identify as serology-based studies only detect oscillations in antibody titers of recovered individuals, and these oscillations are typically weaker than those observed for infected individuals. To accurately estimate the magnitude and timing of seasonal forcing, serum samples should be collected every two months and 200 or more samples should be included in each collection; this sample size estimate
7 CFR 51.1406 - Sample for grade or size determination.
..., AND STANDARDS) United States Standards for Grades of Pecans in the Shell 1 Sample for Grade Or Size Determination § 51.1406 Sample for grade or size determination. Each sample shall consist of 100 pecans....
Sample size and allocation of effort in point count sampling of birds in bottomland hardwood forests
To examine sample size requirements and optimum allocation of effort in point count sampling of bottomland hardwood forests, we computed minimum sample sizes from variation recorded during 82 point counts (May 7-May 16, 1992) from three localities containing three habitat types across three regions of the Mississippi Alluvial Valley (MAV). Also, we estimated the effect of increasing the number of points or visits by comparing results of 150 four-minute point counts obtained from each of four stands on Delta Experimental Forest (DEF) during May 8-May 21, 1991 and May 30-June 12, 1992. For each stand, we obtained bootstrap estimates of mean cumulative number of species each year from all possible combinations of six points and six visits. ANOVA was used to model cumulative species as a function of number of points visited, number of visits to each point, and interaction of points and visits. There was significant variation in numbers of birds and species between regions and localities (nested within region); neither habitat, nor the interaction between region and habitat, was significant. For α = 0.05 and α = 0.10, minimum sample size estimates (per factor level) varied by orders of magnitude depending upon the observed or specified range of desired detectable difference. For observed regional variation, 20 and 40 point counts were required to accommodate variability in total individuals (MSE = 9.28) and species (MSE = 3.79), respectively, whereas ±25 percent of the mean could be achieved with five counts per factor level. Sample size sufficient to detect actual differences of Wood Thrush (Hylocichla mustelina) was >200, whereas the Prothonotary Warbler (Protonotaria citrea) required <10 counts. Differences in mean cumulative species were detected among number of points visited and among number of visits to a point. In the lower MAV, mean cumulative species increased with each added point through five points and with each additional visit through four visits
Discovery sampling is a tool used in discovery auditing. The purpose of such an audit is to provide evidence that some (usually large) inventory of items complies with a defined set of criteria by inspecting (or measuring) a representative sample drawn from the inventory. If any of the items in the sample fail compliance (defective items), then the audit has discovered an impropriety, which often triggers some action. However, finding defective items in a sample is an unusual event; auditors expect the inventory to be in compliance because they come to the audit with an "innocent until proven guilty" attitude. As part of their work product, the auditors must provide a confidence statement about the compliance level of the inventory. Clearly the more items they inspect, the greater their confidence, but more inspection means more cost. Audit costs can be purely economic, but in some cases, the cost is political because more inspection means more intrusion, which communicates an attitude of distrust. Thus, auditors have every incentive to minimize the number of items in the sample. Indeed, in some cases the sample size can be specifically limited by a prior agreement or an ongoing policy. Statements of confidence about the results of a discovery sample generally use the method of confidence intervals. After finding no defectives in the sample, the auditors provide a range of values that bracket the number of defective items that could credibly be in the inventory. They also state a level of confidence for the interval, usually 90% or 95%. For example, the auditors might say: "We believe that this inventory of 1,000 items contains no more than 10 defectives with a confidence of 95%". Frequently clients ask their auditors questions such as: How many items do you need to measure to be 95% confident that there are no more than 10 defectives in the entire inventory? Sometimes when the auditors answer with big numbers like "300", their clients balk. They balk because a
7 CFR 51.3200 - Samples for grade and size determination.
... 7 Agriculture 2 2010-01-01 2010-01-01 false Samples for grade and size determination. 51.3200... Grade and Size Determination § 51.3200 Samples for grade and size determination. Individual samples.... When individual packages contain 20 pounds or more and the onions are packed for Large or Jumbo size...
This second, and concluding, part of this study evaluated changes in sampling efficiency of respirable size-selective samplers due to air pulsations generated by the selected personal sampling pumps characterized in Part I (Lee E, Lee L, Möhlmann C et al. Evaluation of pump pulsation in respirable size-selective sampling: Part I. Pulsation measurements. Ann Occup Hyg 2013). Nine particle sizes of monodisperse ammonium fluorescein (from 1 to 9 μm mass median aerodynamic diameter) were generated individually by a vibrating orifice aerosol generator from dilute solutions of fluorescein in aqueous ammonia and then injected into an environmental chamber. To collect these particles, 10-mm nylon cyclones, also known as Dorr-Oliver (DO) cyclones, were used with five medium volumetric flow rate pumps. Those were the Apex IS, HFS513, GilAir5, Elite5, and Basic5 pumps, which were found in Part I to generate pulsations of 5% (the lowest), 25%, 30%, 56%, and 70% (the highest), respectively. GK2.69 cyclones were used with the Legacy [pump pulsation (PP) = 15%] and Elite12 (PP = 41%) pumps for collection at high flows. The DO cyclone was also used to evaluate changes in sampling efficiency due to pulse shape. The HFS513 pump, which generates a more complex pulse shape, was compared to a single sine wave fluctuation generated by a piston. The luminescent intensity of the fluorescein extracted from each sample was measured with a luminescence spectrometer. Sampling efficiencies were obtained by dividing the intensity of the fluorescein extracted from the filter placed in a cyclone with the intensity obtained from the filter used with a sharp-edged reference sampler. Then, sampling efficiency curves were generated using a sigmoid function with three parameters and each sampling efficiency curve was compared to that of the reference cyclone by constructing bias maps. In general, no change in sampling efficiency (bias under ±10%) was observed until pulsations exceeded 25% for the
7 CFR 51.1548 - Samples for grade and size determination.
... 7 Agriculture 2 2010-01-01 2010-01-01 false Samples for grade and size determination. 51.1548..., AND STANDARDS) United States Standards for Grades of Potatoes 1 Samples for Grade and Size Determination § 51.1548 Samples for grade and size determination. Individual samples shall consist of at...
7 CFR 51.629 - Sample for grade or size determination.
... 7 Agriculture 2 2010-01-01 2010-01-01 false Sample for grade or size determination. 51.629 Section..., California, and Arizona) Sample for Grade Or Size Determination § 51.629 Sample for grade or size determination. Each sample shall consist of 33 grapefruit. When individual packages contain at least...
7 CFR 51.690 - Sample for grade or size determination.
... 7 Agriculture 2 2010-01-01 2010-01-01 false Sample for grade or size determination. 51.690 Section..., California, and Arizona) Sample for Grade Or Size Determination § 51.690 Sample for grade or size determination. Each sample shall consist of 50 oranges. When individual packages contain at least 50...
The a priori determination of a proper sample size necessary to achieve some specified power is an important problem encountered frequently in practical studies. To establish the needed sample size for a two-sample "t" test, researchers may conduct the power analysis by specifying scientifically important values as the underlying population means…
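As a rough illustration of the power analysis this abstract describes, the sketch below uses the familiar normal approximation for a two-sided, two-sample comparison with equal group sizes and a common standard deviation. It is not the procedure proposed in the article; the function name and the example values (a difference of 5 with a standard deviation of 10) are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta: float, sigma: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group sample size for a two-sided two-sample test,
    using n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta) ** 2."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


# Hypothetical planning values: detect a mean difference of 5 when sigma = 10.
print(n_per_group(delta=5, sigma=10))  # 63
```

Because it uses normal rather than t quantiles, this approximation slightly understates the sample size that an exact t-based calculation would give.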
Comparing Server Energy Use and Efficiency Using Small Sample Sizes
This report documents a demonstration that compared the energy consumption and efficiency of a limited sample size of server-type IT equipment from different manufacturers by measuring power at the server power supply power cords. The results are specific to the equipment and methods used. However, it is hoped that those responsible for IT equipment selection can use the methods described to choose models that optimize energy use efficiency. The demonstration was conducted in a data center at Lawrence Berkeley National Laboratory in Berkeley, California. It was performed with five servers of similar mechanical and electronic specifications; three from Intel and one each from Dell and Supermicro. Server IT equipment is constructed using commodity components, server manufacturer-designed assemblies, and control systems. Server compute efficiency is constrained by the commodity component specifications and integration requirements. The design freedom, outside of the commodity component constraints, provides room for the manufacturer to offer a product with competitive efficiency that meets market needs at a compelling price. A goal of the demonstration was to compare and quantify the server efficiency for three different brands. The efficiency is defined as the average compute rate (computations per unit of time) divided by the average energy consumption rate. The research team used an industry standard benchmark software package to provide a repeatable software load to obtain the compute rate and provide a variety of power consumption levels. Energy use when the servers were in an idle state (not providing computing work) was also measured. At high server compute loads, all brands, using the same key components (processors and memory), had similar results; therefore, from these results, it could not be concluded that one brand is more efficient than the other brands. The test results show that the power consumption variability caused by the key components as a
Alternative sample sizes for verification dose experiments and dose audits
ISO 11137 (1995), "Sterilization of Health Care Products - Requirements for Validation and Routine Control - Radiation Sterilization", provides sampling plans for performing initial verification dose experiments and quarterly dose audits. Alternative sampling plans are presented which provide equivalent protection. These sampling plans can significantly reduce the cost of testing. These alternative sampling plans have been included in a draft ISO Technical Report (type 2). This paper examines the rationale behind the proposed alternative sampling plans. The protection provided by the current verification and audit sampling plans is first examined. Then methods for identifying equivalent plans are highlighted. Finally, methods for comparing the cost associated with the different plans are provided. This paper includes additional guidance for selecting between the original and alternative sampling plans not included in the technical report.
Sample size determination is an important issue in planning research. In the context of one-way fixed-effect analysis of variance, the conventional sample size formula cannot be applied for the heterogeneous variance cases. This study discusses the sample size requirement for the Welch test in the one-way fixed-effect analysis of variance with…
Sample Size Determination for Regression Models Using Monte Carlo Methods in R
A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…
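The Monte Carlo idea in this abstract can be sketched as follows: simulate many data sets from an assumed model, count how often the effect of interest is detected, and increase the sample size until the estimated power reaches a target. The article works in R; this sketch uses Python with the standard library only, and everything in it is an assumption made for illustration: a simple linear regression with a standard normal predictor, normal errors, a hypothetical slope of 0.3, and a normal critical value of 1.96 in place of the exact t quantile.

```python
import random
from math import sqrt


def slope_is_significant(xs, ys, z_crit=1.96):
    """Fit y = a + b*x by least squares and test b = 0 (normal approximation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    std_err = sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return abs(slope / std_err) > z_crit


def simulated_power(n, slope, sigma=1.0, reps=1000, seed=0):
    """Share of simulated data sets of size n in which the slope is detected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [slope * x + rng.gauss(0.0, sigma) for x in xs]
        hits += slope_is_significant(xs, ys)
    return hits / reps


def smallest_n(slope, target=0.80, candidates=range(20, 401, 20)):
    """First candidate sample size whose simulated power reaches the target."""
    for n in candidates:
        if simulated_power(n, slope) >= target:
            return n
    return None


print(smallest_n(slope=0.3))
```

The appeal of the simulation approach is that the data-generating step can be replaced with whatever model and violations of assumptions the researcher expects, which closed-form formulae cannot easily accommodate.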
A contemporary decennial global sample of changing agricultural field sizes
In the last several hundred years agriculture has caused significant human induced Land Cover Land Use Change (LCLUC) with dramatic cropland expansion and a marked increase in agricultural productivity. The size of agricultural fields is a fundamental description of rural landscapes and provides an insight into the drivers of rural LCLUC. Increasing field sizes cause a subsequent decrease in the number of fields and therefore decreased landscape spatial complexity with impacts on biodiversity, habitat, soil erosion, plant-pollinator interactions, diffusion of disease pathogens and pests, and loss or degradation in buffers to nutrient, herbicide and pesticide flows. In this study, globally distributed locations with significant contemporary field size change were selected guided by a global map of agricultural yield and literature review and were selected to be representative of different driving forces of field size change (associated with technological innovation, socio-economic conditions, government policy, historic patterns of land cover land use, and environmental setting). Seasonal Landsat data acquired on a decadal basis (for 1980, 1990, 2000 and 2010) were used to extract field boundaries and the temporal changes in field size quantified and their causes discussed.
Sample size justification is an important consideration when planning a clinical trial, not only for the main trial but also for any preliminary pilot trial. When the outcome is a continuous variable, the sample size calculation requires an accurate estimate of the standard deviation of the outcome measure. A pilot trial can be used to get an estimate of the standard deviation, which could then be used to anticipate what may be observed in the main trial. However, an important consideration is that pilot trials often estimate the standard deviation parameter imprecisely. This paper looks at how we can choose an external pilot trial sample size in order to minimise the sample size of the overall clinical trial programme, that is, the pilot and the main trial together. We produce a method of calculating the optimal solution to the required pilot trial sample size when the standardised effect size for the main trial is known. However, as it may not be possible to know the standardised effect size to be used prior to the pilot trial, approximate rules are also presented. For a main trial designed with 90% power and two-sided 5% significance, we recommend pilot trial sample sizes per treatment arm of 75, 25, 15 and 10 for standardised effect sizes that are extra small (≤0.1), small (0.2), medium (0.5) or large (0.8), respectively. PMID:26092476
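To see why the imprecision of a pilot estimate matters, the following sketch simulates many hypothetical pilot trials and shows how much the planned main-trial size varies with the pilot size. It is an illustration under stated assumptions, not the paper's optimisation method: outcomes are normal with a true standard deviation of 1, the target difference is 0.5 (a medium standardised effect), the main trial uses 90% power and two-sided 5% significance with the normal-approximation formula, and all function names are hypothetical.

```python
import random
from math import ceil, sqrt
from statistics import NormalDist, quantiles, variance


def main_trial_n(sd_estimate, delta, alpha=0.05, power=0.90):
    """Per-arm main-trial size from an SD estimate (normal approximation)."""
    z = NormalDist()
    k = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2
    return ceil(2 * k * (sd_estimate / delta) ** 2)


def simulate_planned_sizes(pilot_n_per_arm, true_sd=1.0, delta=0.5, reps=2000, seed=1):
    """Main-trial sizes that would be planned from repeated simulated pilots."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(reps):
        arm_a = [rng.gauss(0.0, true_sd) for _ in range(pilot_n_per_arm)]
        arm_b = [rng.gauss(0.0, true_sd) for _ in range(pilot_n_per_arm)]
        pooled_sd = sqrt((variance(arm_a) + variance(arm_b)) / 2)
        sizes.append(main_trial_n(pooled_sd, delta))
    return sizes


for pilot_n in (10, 15, 25, 75):  # per-arm pilot sizes mentioned in the abstract
    cuts = quantiles(simulate_planned_sizes(pilot_n), n=20)
    print(pilot_n, "per arm -> planned main-trial n, 5th to 95th percentile:",
          cuts[0], "to", cuts[-1])
```

With the true standard deviation the formula gives 85 per arm in this setting; small pilots scatter widely around that value, which is the trade-off the paper's recommendations are balancing.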
Grain size measurements using the point-sampled intercept technique
Recent developments in the field of stereology and measurement of three-dimensional size scales from two-dimensional sections have emanated from the medical field, particularly in the area of pathology. Here, the measurement of biological cell sizes and their distribution are critical for diagnosis and treatment of such deadly diseases as cancer. The purpose of this paper is to introduce these new methods to the materials science community and to illustrate their application using a series of typical microstructures found in polycrystalline ceramics. As far as the current authors are aware, these methods have not been applied in materials-science related applications.
Two test items that examined high school students' beliefs of sample size for large populations using the context of opinion polls conducted prior to national and state elections were developed. A trial of the two items with 21 male and 33 female Year 9 students examined their naive understanding of sample size: over half of students chose a…
Sample sizes for randomized controlled trials are typically based on power calculations. They require us to specify values for parameters such as the treatment effect, which is often difficult because we lack sufficient prior information. The objective of this paper is to provide an alternative design which circumvents the need for sample size calculation. In a simulation study, we compared a meta-experiment approach to the classical approach to assess treatment efficacy. The meta-experiment approach involves use of meta-analyzed results from 3 randomized trials of fixed sample size, 100 subjects. The classical approach involves a single randomized trial with the sample size calculated on the basis of an a priori-formulated hypothesis. For the sample size calculation in the classical approach, we used observed articles to characterize errors made on the formulated hypothesis. A prospective meta-analysis of data from trials of fixed sample size provided the same precision, power and type I error rate, on average, as the classical approach. The meta-experiment approach may provide an alternative design which does not require a sample size calculation and addresses the essential need for study replication; results may have greater external validity. PMID:27362939
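The pooling step behind a meta-experiment can be sketched as below. This is a minimal illustration rather than the authors' simulation: it assumes a continuous outcome, 50 subjects per arm in each of three trials (100 subjects per trial), a hypothetical true effect of 0.4 standard deviations, and a fixed-effect inverse-variance meta-analysis.

```python
import random
from math import sqrt
from statistics import mean, variance


def run_trial(rng, n_per_arm, effect, sd=1.0):
    """Simulate one two-arm trial; return the mean difference and its standard error."""
    control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
    treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
    diff = mean(treated) - mean(control)
    std_err = sqrt(variance(treated) / n_per_arm + variance(control) / n_per_arm)
    return diff, std_err


def pool_fixed_effect(results):
    """Inverse-variance weighted average of trial estimates and its standard error."""
    weights = [1 / se ** 2 for _, se in results]
    pooled = sum(w * d for w, (d, _) in zip(weights, results)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled_se


rng = random.Random(42)
trials = [run_trial(rng, n_per_arm=50, effect=0.4) for _ in range(3)]
pooled, pooled_se = pool_fixed_effect(trials)
for diff, se in trials:
    print(f"trial estimate {diff:.3f} (SE {se:.3f})")
print(f"pooled estimate {pooled:.3f} (SE {pooled_se:.3f})")
```

The pooled standard error is smaller than that of any single trial, which is how three fixed-size trials can match the precision of one larger trial.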
Basic concepts for sample size calculation: Critical step for any clinical trials!
The quality of clinical trials has improved steadily over the last two decades, but certain areas of trial methodology, such as sample size calculation, still require special attention. Sample size calculation is one of the basic steps in planning any clinical trial, and any negligence in it may lead to the rejection of true findings and the approval of false results. Although statisticians play a major role in sample size estimation, basic knowledge of sample size calculation is very sparse among most anesthesiologists involved in research, including trainee doctors. In this review, we discuss how important sample size calculation is for research studies and the effects of underestimating or overestimating the sample size on a project's results. We highlight the basic concepts regarding the various parameters needed to calculate the sample size, along with examples. PMID:27375390
Sample size and scene identification (cloud) - Effect on albedo
Scan channels on the Nimbus 7 Earth Radiation Budget instrument sample radiances from underlying earth scenes at a number of incident and scattering angles. A sampling excess toward measurements at large satellite zenith angles is noted. Also, at large satellite zenith angles, the present scheme for scene selection causes many observations to be classified as cloud, resulting in higher flux averages. Thus the combined effect of sampling bias and scene identification errors is to overestimate the computed albedo. It is shown, using a process of successive thresholding, that observations with satellite zenith angles greater than 50-60 deg lead to incorrect cloud identification. Elimination of these observations has reduced the albedo from 32.2 to 28.8 percent. This reduction is very nearly the same and in the right direction as the discrepancy between the albedoes derived from the scanner and the wide-field-of-view channels.
7 CFR 201.43 - Size of sample.
... units. Coated seed for germination test only shall consist of at least 1,000 seed units. ..., Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE (CONTINUED) FEDERAL SEED ACT FEDERAL SEED ACT... of samples of agricultural seed, vegetable seed and screenings to be submitted for analysis, test,...
7 CFR 201.43 - Size of sample.
..., Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE (CONTINUED) FEDERAL SEED ACT FEDERAL SEED ACT... of samples of agricultural seed, vegetable seed and screenings to be submitted for analysis, test, or examination: (a) Two ounces (57 grams) of grass seed not otherwise mentioned, white or alsike clover, or...
Utility of Inferential Norming with Smaller Sample Sizes
2011-01-01
We examined the utility of inferential norming using small samples drawn from the larger "Wechsler Intelligence Scales for Children-Fourth Edition" (WISC-IV) standardization data set. The quality of the norms was estimated with multiple indexes such as polynomial curve fit, percentage of cases receiving the same score, average absolute score…
Geoscience Education Research Methods: Thinking About Sample Size
Slater, S. J.; Slater, T. F.; CenterAstronomy; Physics Education Research
Geoscience education research is at a critical point in which conditions are sufficient to propel our field forward toward meaningful improvements in geosciences education practices. Our field has now reached a point where the outcomes of our research are deemed important to end users and funding agencies, and where we now have a large number of scientists who are either formally trained in geosciences education research, or who have dedicated themselves to excellence in this domain. At this point we now must collectively work through our epistemology, our rules of what methodologies will be considered sufficiently rigorous, and what data and analysis techniques will be acceptable for constructing evidence. In particular, we have to work out our answer to that most difficult of research questions: "How big should my 'N' be?" This paper presents a very brief answer to that question, addressing both quantitative and qualitative methodologies. Research question/methodology alignment, effect size and statistical power will be discussed, in addition to a defense of the notion that bigger is not always better.
Sample Size in Differential Item Functioning: An Application of Hierarchical Linear Modeling
Acar, Tulin
The purpose of this study is to examine the number of DIF items detected by HGLM at different sample sizes. Eight different sized data files have been composed. The population of the study is 798307 students who had taken the 2006 OKS Examination. 10727 students of 798307 are chosen by random sampling method as the sample of the study. Turkish,…
Tian, Guo-Liang; Tang, Man-Lai; Zhenqiu Liu; Ming Tan; Tang, Nian-Sheng
2011-06-01
Sample size determination is an essential component in public health survey designs on sensitive topics (e.g. drug abuse, homosexuality, induced abortions and pre or extramarital sex). Recently, non-randomised models have been shown to be an efficient and cost effective design when comparing with randomised response models. However, sample size formulae for such non-randomised designs are not yet available. In this article, we derive sample size formulae for the non-randomised triangular design based on the power analysis approach. We first consider the one-sample problem. Power functions and their corresponding sample size formulae for the one- and two-sided tests based on the large-sample normal approximation are derived. The performance of the sample size formulae is evaluated in terms of (i) the accuracy of the power values based on the estimated sample sizes and (ii) the sample size ratio of the non-randomised triangular design and the design of direct questioning (DDQ). We also numerically compare the sample sizes required for the randomised Warner design with those required for the DDQ and the non-randomised triangular design. Theoretical justification is provided. Furthermore, we extend the one-sample problem to the two-sample problem. An example based on an induced abortion study in Taiwan is presented to illustrate the proposed methods. PMID:19221169
A Note on Sample Size and Solution Propriety for Confirmatory Factor Analytic Models
Jackson, Dennis L.; Voth, Jennifer; Frey, Marc P.
Determining an appropriate sample size for use in latent variable modeling techniques has presented ongoing challenges to researchers. In particular, small sample sizes are known to present concerns over sampling error for the variances and covariances on which model estimation is based, as well as for fit indexes and convergence failures. The…
Structured estimation - Sample size reduction for adaptive pattern classification
Morgera, S.; Cooper, D. B.
The Gaussian two-category classification problem with known category mean value vectors and identical but unknown category covariance matrices is considered. The weight vector depends on the unknown common covariance matrix, so the procedure is to estimate the covariance matrix in order to obtain an estimate of the optimum weight vector. The measure of performance for the adapted classifier is the output signal-to-interference noise ratio (SIR). A simple approximation for the expected SIR is gained by using the general sample covariance matrix estimator; this performance is both signal and true covariance matrix independent. An approximation is also found for the expected SIR obtained by using a Toeplitz form covariance matrix estimator; this performance is found to be dependent on both the signal and the true covariance matrix.
Sample size estimation for the van Elteren test--a stratified Wilcoxon-Mann-Whitney test.
Zhao, Yan D
2006-08-15
The van Elteren test is a type of stratified Wilcoxon-Mann-Whitney test for comparing two treatments accounting for strata. In this paper, we study sample size estimation methods for the asymptotic version of the van Elteren test, assuming that the stratum fractions (ratios of each stratum size to the total sample size) and the treatment fractions (ratios of each treatment size to the stratum size) are known in the study design. In particular, we develop three large-sample sample size estimation methods and present a real data example to illustrate the necessary information in the study design phase in order to apply the methods. Simulation studies are conducted to compare the performance of the methods and recommendations are made for method choice. Finally, sample size estimation for the van Elteren test when the stratum fractions are unknown is also discussed. PMID:16372389
Model Choice and Sample Size in Item Response Theory Analysis of Aphasia Tests
ERIC Educational Resources Information Center
Hula, William D.; Fergadiotis, Gerasimos; Martin, Nadine
2012-01-01
Purpose: The purpose of this study was to identify the most appropriate item response theory (IRT) measurement model for aphasia tests requiring 2-choice responses and to determine whether small samples are adequate for estimating such models. Method: Pyramids and Palm Trees (Howard & Patterson, 1992) test data that had been collected from…
Sample Size Requirements for Discrete-Choice Experiments in Healthcare: a Practical Guide.
de Bekker-Grob, Esther W; Donkers, Bas; Jonker, Marcel F; Stolk, Elly A
2015-10-01
Discrete-choice experiments (DCEs) have become a commonly used instrument in health economics and patient-preference analysis, addressing a wide range of policy questions. An important question when setting up a DCE is the size of the sample needed to answer the research question of interest. Although theory exists as to the calculation of sample size requirements for stated choice data, it does not address the issue of minimum sample size requirements in terms of the statistical power of hypothesis tests on the estimated coefficients. The purpose of this paper is threefold: (1) to provide insight into whether and how researchers have dealt with sample size calculations for healthcare-related DCE studies; (2) to introduce and explain the required sample size for parameter estimates in DCEs; and (3) to provide a step-by-step guide for the calculation of the minimum sample size requirements for DCEs in health care. PMID:25726010
Evaluation of design flood estimates with respect to sample size
Kobierska, Florian; Engeland, Kolbjorn
2016-04-01
Estimation of design floods forms the basis for hazard management related to flood risk and is a legal obligation when building infrastructure such as dams, bridges and roads close to water bodies. Flood inundation maps used for land use planning are also produced based on design flood estimates. In Norway, the current guidelines for design flood estimates give recommendations on which data, probability distribution, and method to use depending on the length of the local record. If less than 30 years of local data is available, an index flood approach is recommended where the local observations are used for estimating the index flood and regional data are used for estimating the growth curve. For 30-50 years of data, a 2 parameter distribution is recommended, and for more than 50 years of data, a 3 parameter distribution should be used. Many countries have national guidelines for flood frequency estimation, and recommended distributions include the log Pearson II, generalized logistic and generalized extreme value distributions. For estimating distribution parameters, ordinary and linear moments, maximum likelihood and Bayesian methods are used. The aim of this study is to re-evaluate the guidelines for local flood frequency estimation. In particular, we wanted to answer the following questions: (i) Which distribution gives the best fit to the data? (ii) Which estimation method provides the best fit to the data? (iii) Do the answers to (i) and (ii) depend on local data availability? To answer these questions we set up a test bench for local flood frequency analysis using data-based cross-validation methods. The criteria were based on indices describing stability and reliability of design flood estimates. Stability is used as a criterion since design flood estimates should not excessively depend on the data sample. The reliability indices describe to which degree design flood predictions can be trusted.
Eisenberg, Sarita L.; Guo, Ling-Yu
Purpose: The purpose of this study was to investigate whether a shorter language sample elicited with fewer pictures (i.e., 7) would yield a percent grammatical utterances (PGU) score similar to that computed from a longer language sample elicited with 15 pictures for 3-year-old children. Method: Language samples were elicited by asking forty…
Distribution of the two-sample t-test statistic following blinded sample size re-estimation.
Lu, Kaifeng
2016-05-01
We consider the blinded sample size re-estimation based on the simple one-sample variance estimator at an interim analysis. We characterize the exact distribution of the standard two-sample t-test statistic at the final analysis. We describe a simulation algorithm for the evaluation of the probability of rejecting the null hypothesis at given treatment effect. We compare the blinded sample size re-estimation method with two unblinded methods with respect to the empirical type I error, the empirical power, and the empirical distribution of the standard deviation estimator and final sample size. We characterize the type I error inflation across the range of standardized non-inferiority margin for non-inferiority trials, and derive the adjusted significance level to ensure type I error control for given sample size of the internal pilot study. We show that the adjusted significance level increases as the sample size of the internal pilot study increases. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26865383
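The re-estimation step this abstract refers to can be sketched in a few lines. The example is a simplified illustration, not the paper's exact-distribution analysis: it computes the one-sample variance of all interim observations with treatment labels ignored and plugs it into the normal-approximation formula for a two-arm trial. The interim values, the target difference of 2, and the function name are hypothetical.

```python
from math import ceil
from statistics import NormalDist, variance


def blinded_reestimated_n(interim_values, delta, alpha=0.05, power=0.90):
    """Per-arm sample size re-estimated from blinded interim data.

    The variance is the simple one-sample estimate over all observations,
    computed without knowing which arm each observation came from.
    """
    blinded_variance = variance(interim_values)
    z = NormalDist()
    k = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2
    return ceil(2 * k * blinded_variance / delta ** 2)


# Hypothetical blinded interim outcomes from an internal pilot (both arms mixed).
interim = [10, 12, 9, 14, 11, 13, 8, 15]
print(blinded_reestimated_n(interim, delta=2))  # 32
```

When a treatment effect is present, the one-sample variance also absorbs part of the between-arm difference, so it tends to overstate the within-arm variance somewhat.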
Dziak, John J.; Lanza, Stephanie T.; Tan, Xianming
2014-01-01
Selecting the number of different classes which will be assumed to exist in the population is an important step in latent class analysis (LCA). The bootstrap likelihood ratio test (BLRT) provides a data-driven way to evaluate the relative adequacy of a (K −1)-class model compared to a K-class model. However, very little is known about how to predict the power or the required sample size for the BLRT in LCA. Based on extensive Monte Carlo simulations, we provide practical effect size measures and power curves which can be used to predict power for the BLRT in LCA given a proposed sample size and a set of hypothesized population parameters. Estimated power curves and tables provide guidance for researchers wishing to size a study to have sufficient power to detect hypothesized underlying latent classes. PMID:25328371
Manju, Md Abu; Candel, Math J J M; Berger, Martijn P F
2014-07-10
In this paper, the optimal sample sizes at the cluster and person levels for each of two treatment arms are obtained for cluster randomized trials where the cost-effectiveness of treatments on a continuous scale is studied. The optimal sample sizes maximize the efficiency or power for a given budget or minimize the budget for a given efficiency or power. Optimal sample sizes require information on the intra-cluster correlations (ICCs) for effects and costs, the correlations between costs and effects at individual and cluster levels, the ratio of the variance of effects translated into costs to the variance of the costs (the variance ratio), sampling and measuring costs, and the budget. When planning a study, information on the model parameters usually is not available. To overcome this local optimality problem, the current paper also presents maximin sample sizes. The maximin sample sizes turn out to be rather robust against misspecifying the correlation between costs and effects at the cluster and individual levels but may lose much efficiency when misspecifying the variance ratio. The robustness of the maximin sample sizes against misspecifying the ICCs depends on the variance ratio. The maximin sample sizes are robust under misspecification of the ICC for costs for realistic values of the variance ratio greater than one but not robust under misspecification of the ICC for effects. Finally, we show how to calculate optimal or maximin sample sizes that yield sufficient power for a test on the cost-effectiveness of an intervention. PMID:25019136
Brändle, E; Melzer, H; Gomez-Anson, B; Flohr, P; Kleinschmidt, K; Sieberth, H G; Hautmann, R E
1996-03-01
The gold standard for metabolic evaluation of stone-forming patients is the 24-h urine specimen. Recently, some authors have suggested that for routine metabolic evaluation spot urine samples are as valuable as the 24-h urine specimen. The purpose of our study was to determine the value of the spot urine sample in comparison with the 24-h urine specimens. Eighty-eight healthy volunteers on different diets were investigated (32 vegetarians, 12 body-builders without protein concentrates, 28 body-builders on protein concentrates, and 16 subjects on a regular European diet). Using 24-h specimens, excretion rates of oxalate, calcium, sodium and potassium were determined. The concentration ratio of these electrolytes to creatinine was calculated for spot urine samples. A highly significant correlation between the excretion rates and the results of the spot urine samples was found for all parameters. However, the correlations showed considerable variations. On the other hand, we were able to show that creatinine excretion is highly dependent on daily protein intake, body weight and glomerular filtration rate. This leads to a considerable inter- and intraindividual variation in creatinine excretion. This variation of the creatinine excretion is the major cause for the variation in the results of spot urine samples. It is concluded that spot urine samples are an inadequate substitute for the 24-h urine specimen and that the 24-h urine specimen is still the basis for metabolic evaluation in stone patients. PMID:8650847
Wolf, Erika J.; Harrington, Kelly M.; Clark, Shaunna L.; Miller, Mark W.
2013-01-01
Determining sample size requirements for structural equation modeling (SEM) is a challenge often faced by investigators, peer reviewers, and grant writers. Recent years have seen a large increase in SEMs in the behavioral science literature, but consideration of sample size requirements for applied SEMs often relies on outdated rules-of-thumb.…
Algina, James; Olejnik, Stephen
2003-01-01
Tables for selecting sample size in correlation studies are presented. Some of the tables allow selection of sample size so that r (or r[squared], depending on the statistic the researcher plans to interpret) will be within a target interval around the population parameter with probability 0.95. The intervals are [plus or minus] 0.05, [plus or…
EFFECTS OF SAMPLING NOZZLES ON THE PARTICLE COLLECTION CHARACTERISTICS OF INERTIAL SIZING DEVICES
In several particle-sizing samplers, the sample extraction nozzle is necessarily closely coupled to the first inertial sizing stage. Devices of this type include small sampling cyclones, right angle impactor precollectors for in-stack impactors, and the first impaction stage of s...
Using the Student's "t"-Test with Extremely Small Sample Sizes
ERIC Educational Resources Information Center
de Winter, J. C. F.
2013-01-01
Researchers occasionally have to work with an extremely small sample size, defined herein as "N" less than or equal to 5. Some methodologists have cautioned against using the "t"-test when the sample size is extremely small, whereas others have suggested that using the "t"-test is feasible in such a case. The present…
Sample Size and Item Parameter Estimation Precision When Utilizing the One-Parameter "Rasch" Model
ERIC Educational Resources Information Center
Custer, Michael
2015-01-01
This study examines the relationship between sample size and item parameter estimation precision when utilizing the one-parameter model. Item parameter estimates are examined relative to "true" values by evaluating the decline in root mean squared deviation (RMSD) and the number of outliers as sample size increases. This occurs across…
A Comparative Study of Power and Sample Size Calculations for Multivariate General Linear Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2003-01-01
Repeated measures and longitudinal studies arise often in social and behavioral science research. During the planning stage of such studies, the calculations of sample size are of particular interest to the investigators and should be an integral part of the research projects. In this article, we consider the power and sample size calculations for…
Weighting by Inverse Variance or by Sample Size in Random-Effects Meta-Analysis
ERIC Educational Resources Information Center
Marin-Martinez, Fulgencio; Sanchez-Meca, Julio
2010-01-01
Most of the statistical procedures in meta-analysis are based on the estimation of average effect sizes from a set of primary studies. The optimal weight for averaging a set of independent effect sizes is the inverse variance of each effect size, but in practice these weights have to be estimated, being affected by sampling error. When assuming a…
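The two weighting schemes contrasted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name, the effect sizes, the variances, and the sample sizes are all hypothetical, and the between-study variance `tau2` is simply set to zero here, whereas a random-effects analysis would estimate it from the data.

```python
def weighted_mean_effect(effects, weights):
    """Return the weighted average of a list of effect sizes."""
    total_weight = sum(weights)
    return sum(w * e for w, e in zip(weights, effects)) / total_weight


# Hypothetical inputs for three primary studies (not taken from the paper).
effects = [0.30, 0.10, 0.50]       # estimated effect sizes
variances = [0.04, 0.01, 0.09]     # estimated sampling variances
sample_sizes = [50, 200, 25]       # study sample sizes
tau2 = 0.0                         # assumed between-study variance

# Inverse-variance weights: 1 / (within-study variance + between-study variance).
inverse_variance_weights = [1.0 / (v + tau2) for v in variances]

mean_inverse_variance = weighted_mean_effect(effects, inverse_variance_weights)
mean_sample_size = weighted_mean_effect(effects, sample_sizes)

print(round(mean_inverse_variance, 4), round(mean_sample_size, 4))
```

Both averages are pulled toward the large, precise second study, but by different amounts, because the estimated variances are not exactly proportional to 1/n.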
Thermomagnetic behavior of magnetic susceptibility – heating rate and sample size effects
Jordanova, Diana; Jordanova, Neli
2015-12-01
Thermomagnetic analysis of magnetic susceptibility k(T) was carried out for a number of natural powder materials from soils, baked clay and anthropogenic dust samples using fast (11°C/min) and slow (6.5°C/min) heating rates available in the furnace of Kappabridge KLY2 (Agico). Based on the additional data for mineralogy, grain size and magnetic properties of the studied samples, behaviour of k(T) cycles and the observed differences in the curves for fast and slow heating rate are interpreted in terms of mineralogical transformations and Curie temperatures (Tc). The effect of different sample size is also explored, using large volume and small volume of powder material. It is found that soil samples show enhanced information on mineralogical transformations and appearance of new strongly magnetic phases when using fast heating rate and large sample size. This approach moves the transformation to higher temperature, but enhances the amplitude of the signal of newly created phase. Large sample size gives prevalence of the local micro-environment, created by evolving gases, released during transformations. The example from archeological brick reveals the effect of different sample sizes on the observed Curie temperatures on heating and cooling curves, when the magnetic carrier is substituted magnetite (Mn0.2Fe2.70O4). Large sample size leads to bigger differences in Tcs on heating and cooling, while small sample size results in similar Tcs for both heating rates.
Bouman, A. C.; ten Cate-Hoek, A. J.; Ramaekers, B. L. T.; Joore, M. A.
2015-01-01
Background Non-inferiority trials are performed when the main therapeutic effect of the new therapy is expected to be not unacceptably worse than that of the standard therapy, and the new therapy is expected to have advantages over the standard therapy in costs or other (health) consequences. These advantages however are not included in the classic frequentist approach of sample size calculation for non-inferiority trials. In contrast, the decision theory approach of sample size calculation does include these factors. The objective of this study is to compare the conceptual and practical aspects of the frequentist approach and decision theory approach of sample size calculation for non-inferiority trials, thereby demonstrating that the decision theory approach is more appropriate for sample size calculation of non-inferiority trials. Methods The frequentist approach and decision theory approach of sample size calculation for non-inferiority trials are compared and applied to a case of a non-inferiority trial on individually tailored duration of elastic compression stocking therapy compared to two years elastic compression stocking therapy for the prevention of post thrombotic syndrome after deep vein thrombosis. Results The two approaches differ substantially in conceptual background, analytical approach, and input requirements. The sample size calculated according to the frequentist approach yielded 788 patients, using a power of 80% and a one-sided significance level of 5%. The decision theory approach indicated that the optimal sample size was 500 patients, with a net value of €92 million. Conclusions This study demonstrates and explains the differences between the classic frequentist approach and the decision theory approach of sample size calculation for non-inferiority trials. We argue that the decision theory approach of sample size estimation is most suitable for sample size calculation of non-inferiority trials. PMID:26076354
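For orientation, the classic frequentist calculation mentioned in this abstract can be sketched with the usual normal-approximation formula for a binary outcome. The sketch assumes equal allocation and equal true event proportions in both arms; the function name, the event proportion, and the margin are hypothetical and are not the inputs of the trial described above, so the result is not expected to reproduce the 788 patients reported.

```python
import math
from statistics import NormalDist


def noninferiority_n_per_group(p, margin, alpha=0.05, power=0.80):
    """Approximate patients per arm for a non-inferiority trial.

    Assumes a binary outcome, equal true event proportion p in both arms,
    a one-sided significance level alpha, and a non-inferiority margin
    expressed as an absolute difference in proportions.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)          # quantile for the target power
    n = (z_alpha + z_beta) ** 2 * 2.0 * p * (1.0 - p) / margin ** 2
    return math.ceil(n)


# Hypothetical inputs: 20% event proportion, 10 percentage point margin.
n_example = noninferiority_n_per_group(p=0.20, margin=0.10)
print(n_example)  # 198 per arm under these assumptions
```

Halving the margin roughly quadruples the required sample size, which illustrates how sensitive the frequentist result is to the chosen inputs.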
Fujita, Masahiro; Yajima, Tomonari; Iijima, Kazuaki; Sato, Kiyoshi
2012-05-01
The uncertainty in pesticide residue levels (UPRL) associated with sampling size was estimated using individual acetamiprid and cypermethrin residue data from preharvested apple, broccoli, cabbage, grape, and sweet pepper samples. The relative standard deviation from the mean of each sampling size (n = 2^x, where x = 1-6) of randomly selected samples was defined as the UPRL for each sampling size. The estimated UPRLs, which were calculated on the basis of the regulatory sampling size recommended by the OECD Guidelines on Crop Field Trials (weights from 1 to 5 kg, and commodity unit numbers from 12 to 24), ranged from 2.1% for cypermethrin in sweet peppers to 14.6% for cypermethrin in cabbage samples. The percentages of commodity exceeding the maximum residue limits (MRLs) specified by the Japanese Food Sanitation Law may be predicted from the equation derived from this study, which was based on samples of various size ranges with mean residue levels below the MRL. The estimated UPRLs have confirmed that sufficient sampling weight and numbers are required for analysis and/or re-examination of subsamples to provide accurate values of pesticide residue levels for the enforcement of MRLs. The equation derived from the present study would aid the estimation of more accurate residue levels even from small sampling sizes. PMID:22475588
Sample Size under Inverse Negative Binomial Group Testing for Accuracy in Parameter Estimation
Montesinos-López, Osval Antonio; Montesinos-López, Abelardo; Crossa, José; Eskridge, Kent
2012-01-01
Background The group testing method has been proposed for the detection and estimation of genetically modified plants (adventitious presence of unwanted transgenic plants, AP). For binary response variables (presence or absence), group testing is efficient when the prevalence is low, so that estimation, detection, and sample size methods have been developed under the binomial model. However, when the event is rare (low prevalence <0.1), and testing occurs sequentially, inverse (negative) binomial pooled sampling may be preferred. Methodology/Principal Findings This research proposes three sample size procedures (two computational and one analytic) for estimating prevalence using group testing under inverse (negative) binomial sampling. These methods provide the required number of positive pools, given a pool size (k), for estimating the proportion of AP plants using the Dorfman model and inverse (negative) binomial sampling. We give real and simulated examples to show how to apply these methods and the proposed sample-size formula. The Monte Carlo method was used to study the coverage and level of assurance achieved by the proposed sample sizes. An R program to create other scenarios is given in Appendix S2. Conclusions The three methods ensure precision in the estimated proportion of AP because they guarantee that the width (W) of the confidence interval (CI) will be equal to, or narrower than, the desired width with a specified probability. With the Monte Carlo study we found that the computational Wald procedure (method 2) produces the more precise sample size (with coverage and assurance levels very close to nominal values) and that the sample size based on the Clopper-Pearson CI (method 1) is conservative (overestimates the sample size); the analytic Wald sample size method we developed (method 3) sometimes underestimated the optimum number of pools. PMID:22457714
Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander
2016-04-01
In the last three decades, an increasing number of studies analyzed spatial patterns in throughfall to investigate the consequences of rainfall redistribution for biogeochemical and hydrological processes in forests. In the majority of cases, variograms were used to characterize the spatial properties of the throughfall data. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and an appropriate layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation methods on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with heavy outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling), and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least
Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander
2016-09-01
In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. Given that most previous
Chaibub Neto, Elias
2015-01-01
In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson’s sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling. PMID:26125965
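The multinomial-weighting idea described in this abstract can be sketched as follows. This is a plain-Python illustration rather than the author's R implementation: each bootstrap replication draws multinomial counts, converts them to weights, and computes Pearson's correlation from weighted sample moments without building a resampled data set. The function names and the data are hypothetical; in a matrix-oriented language the rows of `weight_matrix` would be multiplied against the data in a few matrix products instead of a Python loop.

```python
import math
import random


def multinomial_weights(n, rng):
    """Draw multinomial counts for n observations and scale them to weights."""
    counts = [0] * n
    for idx in rng.choices(range(n), k=n):
        counts[idx] += 1
    return [c / n for c in counts]


def weighted_pearson(x, y, w):
    """Pearson correlation computed from weighted first and second moments."""
    mx = sum(wi * xi for wi, xi in zip(w, x))
    my = sum(wi * yi for wi, yi in zip(w, y))
    mxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    myy = sum(wi * yi * yi for wi, yi in zip(w, y))
    mxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    return (mxy - mx * my) / math.sqrt((mxx - mx * mx) * (myy - my * my))


def bootstrap_correlations(x, y, n_boot, seed=0):
    """Return n_boot bootstrap replications of the sample correlation."""
    rng = random.Random(seed)
    n = len(x)
    # One row of weights per bootstrap replication.
    weight_matrix = [multinomial_weights(n, rng) for _ in range(n_boot)]
    return [weighted_pearson(x, y, w) for w in weight_matrix]


# Hypothetical data: a noisy increasing relationship.
x = [float(i) for i in range(1, 21)]
y = [2.0 * xi + float((7 * i) % 5) for i, xi in enumerate(x)]

replications = bootstrap_correlations(x, y, n_boot=200, seed=1)
```

With uniform weights 1/n the weighted formula reduces to the ordinary sample correlation, which is a convenient check that the moment-based computation matches the statistic being bootstrapped.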
The economic efficiency of sampling size: the case of beef trim revisited.
Powell, Mark R
2013-03-01
A recent paper by Ferrier and Buzby provides a framework for selecting the sample size when testing a lot of beef trim for Escherichia coli O157:H7 that equates the averted costs of recalls and health damages from contaminated meats sold to consumers with the increased costs of testing while allowing for uncertainty about the underlying prevalence of contamination. Ferrier and Buzby conclude that the optimal sample size is larger than the current sample size. However, Ferrier and Buzby's optimization model has a number of errors, and their simulations failed to consider available evidence about the likelihood of the scenarios explored under the model. After correctly modeling microbial prevalence as dependent on portion size and selecting model inputs based on available evidence, the model suggests that the optimal sample size is zero under most plausible scenarios. It does not follow, however, that sampling beef trim for E. coli O157:H7, or food safety sampling more generally, should be abandoned. Sampling is not generally cost effective as a direct consumer safety control measure due to the extremely large sample sizes required to provide a high degree of confidence of detecting very low acceptable defect levels. Food safety verification sampling creates economic incentives for food producing firms to develop, implement, and maintain effective control measures that limit the probability and degree of noncompliance with regulatory limits or private contract specifications. PMID:23496435
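The remark about extremely large sample sizes can be made concrete with a simple detection calculation. The sketch below is a deliberately simplified assumption, not the model used in the paper: it treats samples as independent, assumes each sample tests positive with probability equal to the prevalence, and ignores the dependence of prevalence on portion size that the abstract emphasizes. Function names are hypothetical.

```python
import math


def detection_probability(prevalence, n):
    """Probability that at least one of n independent samples is positive."""
    return 1.0 - (1.0 - prevalence) ** n


def samples_for_confidence(prevalence, confidence=0.95):
    """Smallest n whose detection probability reaches the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - prevalence))


print(samples_for_confidence(0.01))   # 299 samples at 1% prevalence
print(samples_for_confidence(0.001))  # 2995 samples at 0.1% prevalence
```

Under these assumptions a tenfold drop in prevalence requires roughly ten times as many samples for the same confidence of detection.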
Effects of Sample Size on Estimates of Population Growth Rates Calculated with Matrix Models
Fiske, Ian J.; Bruna, Emilio M.; Bolker, Benjamin M.
2008-01-01
Background Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (λ) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of λ: Jensen's inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of λ due to increased sampling variance. We investigated whether sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of λ. Methodology/Principal Findings Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating λ for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of λ with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. Conclusions/Significance We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities. PMID:18769483
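The Jensen's inequality argument can be illustrated with a toy two-stage matrix model. Everything in this sketch is a hypothetical assumption rather than the authors' simulation: the matrix is [[0, f], [s, a]] with fecundity f, juvenile survival s, and adult survival a; only s is estimated, as k/n with k binomially distributed; and the expectation of λ is computed exactly by summing over the binomial distribution instead of by random sampling.

```python
import math


def growth_rate(juvenile_survival, adult_survival, fecundity):
    """Dominant eigenvalue of the 2x2 projection matrix [[0, f], [s, a]]."""
    a, s, f = adult_survival, juvenile_survival, fecundity
    return (a + math.sqrt(a * a + 4.0 * f * s)) / 2.0


def expected_lambda(n, true_survival, adult_survival, fecundity):
    """Exact mean of lambda when survival is estimated from n individuals."""
    total = 0.0
    for k in range(n + 1):
        prob = (math.comb(n, k)
                * true_survival ** k
                * (1.0 - true_survival) ** (n - k))
        total += prob * growth_rate(k / n, adult_survival, fecundity)
    return total


# Hypothetical vital rates; survival = 0.5 mirrors the low-survival case.
s, a, f = 0.5, 0.5, 1.5
true_lambda = growth_rate(s, a, f)
bias_n10 = expected_lambda(10, s, a, f) - true_lambda
bias_n100 = expected_lambda(100, s, a, f) - true_lambda
print(bias_n10, bias_n100)
```

Because λ is a concave function of s in this toy model, the survival estimate is unbiased while the expected λ falls below the true value, and the gap shrinks as n grows.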
Guo, Ling-Yu
2015-01-01
Purpose The purpose of this study was to investigate whether a shorter language sample elicited with fewer pictures (i.e., 7) would yield a percent grammatical utterances (PGU) score similar to that computed from a longer language sample elicited with 15 pictures for 3-year-old children. Method Language samples were elicited by asking forty 3-year-old children with varying language skills to talk about pictures in response to prompts. PGU scores were computed for each of two 7-picture sets and for the full set of 15 pictures. Results PGU scores for the two 7-picture sets did not differ significantly from, and were highly correlated with, PGU scores for the full set and with each other. Agreement for making pass–fail decisions between each 7-picture set and the full set and between the two 7-picture sets ranged from 80% to 100%. Conclusion The current study suggests that the PGU measure is robust enough that it can be computed on the basis of 7 pictures, at least in 3-year-old children whose language samples were elicited using similar procedures. PMID:25615691
Tai, Bee-Choo; Grundy, Richard; Machin, David
2011-03-15
Purpose: To accurately model the cumulative need for radiotherapy in trials designed to delay or avoid irradiation among children with malignant brain tumor, it is crucial to account for competing events and evaluate how each contributes to the timing of irradiation. An appropriate choice of statistical model is also important for adequate determination of sample size. Methods and Materials: We describe the statistical modeling of competing events (A, radiotherapy after progression; B, no radiotherapy after progression; and C, elective radiotherapy) using proportional cause-specific and subdistribution hazard functions. The procedures of sample size estimation based on each method are outlined. These are illustrated by use of data comparing children with ependymoma and other malignant brain tumors. The results from these two approaches are compared. Results: The cause-specific hazard analysis showed a reduction in hazards among infants with ependymoma for all event types, including Event A (adjusted cause-specific hazard ratio, 0.76; 95% confidence interval, 0.45-1.28). Conversely, the subdistribution hazard analysis suggested an increase in hazard for Event A (adjusted subdistribution hazard ratio, 1.35; 95% confidence interval, 0.80-2.30), but the reduction in hazards for Events B and C remained. Analysis based on subdistribution hazard requires a larger sample size than the cause-specific hazard approach. Conclusions: Notable differences in effect estimates and anticipated sample size were observed between methods when the main event showed a beneficial effect whereas the competing events showed an adverse effect on the cumulative incidence. The subdistribution hazard is the most appropriate for modeling treatment when its effects on both the main and competing events are of interest.
The PowerAtlas: a power and sample size atlas for microarray experimental design and research
Page, Grier P; Edwards, Jode W; Gadbury, Gary L; Yelisetti, Prashanth; Wang, Jelai; Trivedi, Prinal; Allison, David B
2006-01-01
Background Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size or number of replicate chips needed to address the multiple hypotheses with acceptable accuracy? Statistical methods exist for calculating power based upon a single hypothesis, using estimates of the variability in data from pilot studies. There is, however, a need for methods to estimate power and/or required sample sizes in situations where multiple hypotheses are being tested, such as in microarray experiments. In addition, investigators frequently do not have pilot data to estimate the sample sizes required for microarray studies. Results To address this challenge, we have developed a Microarray PowerAtlas [1]. The atlas enables estimation of statistical power by allowing investigators to appropriately plan studies by building upon previous studies that have similar experimental characteristics. Currently, there are sample sizes and power estimates based on 632 experiments from Gene Expression Omnibus (GEO). The PowerAtlas also permits investigators to upload their own pilot data and derive power and sample size estimates from these data. This resource will be updated regularly with new datasets from GEO and other databases such as The Nottingham Arabidopsis Stock Center (NASC). Conclusion This resource provides a valuable tool for investigators who are planning efficient microarray studies and estimating required sample sizes. PMID:16504070
Wolf, Erika J.; Harrington, Kelly M.; Clark, Shaunna L.; Miller, Mark W.
2015-01-01
Determining sample size requirements for structural equation modeling (SEM) is a challenge often faced by investigators, peer reviewers, and grant writers. Recent years have seen a large increase in SEMs in the behavioral science literature, but consideration of sample size requirements for applied SEMs often relies on outdated rules-of-thumb. This study used Monte Carlo data simulation techniques to evaluate sample size requirements for common applied SEMs. Across a series of simulations, we systematically varied key model properties, including number of indicators and factors, magnitude of factor loadings and path coefficients, and amount of missing data. We investigated how changes in these parameters affected sample size requirements with respect to statistical power, bias in the parameter estimates, and overall solution propriety. Results revealed a range of sample size requirements (i.e., from 30 to 460 cases), meaningful patterns of association between parameters and sample size, and highlight the limitations of commonly cited rules-of-thumb. The broad “lessons learned” for determining SEM sample size requirements are discussed. PMID:25705052
Parameter Estimation with Small Sample Size: A Higher-Order IRT Model Approach
ERIC Educational Resources Information Center
de la Torre, Jimmy; Hong, Yuan
2010-01-01
Sample size ranks as one of the most important factors that affect the item calibration task. However, due to practical concerns (e.g., item exposure) items are typically calibrated with much smaller samples than what is desired. To address the need for a more flexible framework that can be used in small sample item calibration, this article…
Teoh, Wei Lin; Khoo, Michael B C; Teh, Sin Yin
2013-01-01
Designs of the double sampling (DS) X chart are traditionally based on the average run length (ARL) criterion. However, the shape of the run length distribution changes with the process mean shifts, ranging from highly skewed when the process is in-control to almost symmetric when the mean shift is large. Therefore, we show that the ARL is a complicated performance measure and that the median run length (MRL) is a more meaningful measure to depend on. This is because the MRL provides an intuitive and a fair representation of the central tendency, especially for the rightly skewed run length distribution. Since the DS X chart can effectively reduce the sample size without reducing the statistical efficiency, this paper proposes two optimal designs of the MRL-based DS X chart, for minimizing (i) the in-control average sample size (ASS) and (ii) both the in-control and out-of-control ASSs. Comparisons with the optimal MRL-based EWMA X and Shewhart X charts demonstrate the superiority of the proposed optimal MRL-based DS X chart, as the latter requires a smaller sample size on the average while maintaining the same detection speed as the two former charts. An example involving the added potassium sorbate in a yoghurt manufacturing process is used to illustrate the effectiveness of the proposed MRL-based DS X chart in reducing the sample size needed. PMID:23935873
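The contrast between ARL and MRL can be illustrated for a basic Shewhart X chart, whose run length follows a geometric distribution. This sketch is an assumption-laden simplification and does not implement the double sampling chart proposed in the paper: it assumes normally distributed subgroup means, known in-control parameters, and k-sigma control limits, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def signal_probability(shift, n, k=3.0):
    """Probability that one subgroup mean falls outside k-sigma limits.

    shift is the mean shift in units of the process standard deviation,
    and n is the subgroup size.
    """
    phi = NormalDist().cdf
    offset = shift * math.sqrt(n)
    return 1.0 - phi(k - offset) + phi(-k - offset)


def run_length_summary(p):
    """Average and median run length of a geometric run length distribution."""
    arl = 1.0 / p
    # Smallest m with P(run length <= m) = 1 - (1 - p)**m >= 0.5.
    mrl = math.ceil(math.log(0.5) / math.log(1.0 - p))
    return arl, mrl


arl0, mrl0 = run_length_summary(signal_probability(shift=0.0, n=5))
print(round(arl0, 1), mrl0)  # about 370.4 and 257 for an in-control process
```

For the in-control case the median is well below the mean, which reflects the strong right skew of the run length distribution that motivates the MRL criterion.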
Teoh, Wei Lin; Khoo, Michael B. C.; Teh, Sin Yin
2013-01-01
Designs of the double sampling (DS) chart are traditionally based on the average run length (ARL) criterion. However, the shape of the run length distribution changes with the process mean shifts, ranging from highly skewed when the process is in-control to almost symmetric when the mean shift is large. Therefore, we show that the ARL is a complicated performance measure and that the median run length (MRL) is a more meaningful measure to depend on. This is because the MRL provides an intuitive and a fair representation of the central tendency, especially for the rightly skewed run length distribution. Since the DS chart can effectively reduce the sample size without reducing the statistical efficiency, this paper proposes two optimal designs of the MRL-based DS chart, for minimizing (i) the in-control average sample size (ASS) and (ii) both the in-control and out-of-control ASSs. Comparisons with the optimal MRL-based EWMA and Shewhart charts demonstrate the superiority of the proposed optimal MRL-based DS chart, as the latter requires a smaller sample size on the average while maintaining the same detection speed as the two former charts. An example involving the added potassium sorbate in a yoghurt manufacturing process is used to illustrate the effectiveness of the proposed MRL-based DS chart in reducing the sample size needed. PMID:23935873
Link, W.A.; Nichols, J.D.
1994-01-01
Our purpose here is to emphasize the need to properly deal with sampling variance when studying population variability and to present a means of doing so. We present an estimator for temporal variance of population size for the general case in which there are both sampling variances and covariances associated with estimates of population size. We illustrate the estimation approach with a series of population size estimates for black-capped chickadees (Parus atricapillus) wintering in a Connecticut study area and with a series of population size estimates for breeding populations of ducks in southwestern Manitoba.
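The core idea of separating sampling variance from temporal variance can be sketched for the simplest special case. The sketch assumes that the sampling errors of the yearly estimates are independent (zero sampling covariances), so it is not the general estimator described in the abstract; the function name and the numbers are hypothetical.

```python
from statistics import mean, variance


def temporal_variance(estimates, sampling_variances):
    """Estimate temporal variance of population size.

    Subtracts the average sampling variance from the sample variance of the
    yearly estimates. Assumes independent sampling errors across years.
    The result can be negative when sampling noise dominates.
    """
    return variance(estimates) - mean(sampling_variances)


# Hypothetical yearly population estimates and their sampling variances.
estimates = [100.0, 120.0, 90.0, 110.0, 130.0]
sampling_variances = [50.0, 60.0, 40.0, 55.0, 45.0]

result = temporal_variance(estimates, sampling_variances)
print(result)  # 200.0: raw variance 250 minus mean sampling variance 50
```

Ignoring the correction would attribute all 250 units of variance to real population change, overstating temporal variability by the amount due to sampling error.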
Lawson, Chris A
2014-07-01
Three experiments with 81 3-year-olds (M = 3.62 years) examined the conditions that enable young children to use the sample size principle (SSP) of induction, the inductive rule that facilitates generalizations from large rather than small samples of evidence. In Experiment 1, children exhibited the SSP when exemplars were presented sequentially but not when exemplars were presented simultaneously. Results from Experiment 3 suggest that the advantage of sequential presentation is not due to the additional time to process the available input from the two samples but instead may be linked to better memory for specific individuals in the large sample. In addition, findings from Experiments 1 and 2 suggest that adherence to the SSP is mediated by the disparity between presented samples. Overall, these results reveal that the SSP appears early in development and is guided by basic cognitive processes triggered during the acquisition of input. PMID:24439115
Bice, K.; Clement, S. C.
1981-01-01
X-ray diffraction and spectroscopy were used to investigate the mineralogical and chemical properties of the Calvert, Ball Old Mine, Ball Martin, and Jordan Sediments. The particle size distribution and index of refraction of each sample were determined. The samples are composed primarily of quartz, kaolinite, and illite. The clay minerals are most abundant in the finer particle size fractions. The chemical properties of the four samples are similar. The Calvert sample is most notably different in that it contains a relatively high amount of iron. The dominant particle size fraction in each sample is silt, with lesser amounts of clay and sand. The indices of refraction of the sediments are the same with the exception of the Calvert sample which has a slightly higher value.
Small sample sizes in the study of ontogenetic allometry; implications for palaeobiology
Vavrek, Matthew J.
2015-01-01
Quantitative morphometric analyses, particularly ontogenetic allometry, are common methods used in quantifying shape, and changes therein, in both extinct and extant organisms. Due to incompleteness and the potential for restricted sample sizes in the fossil record, palaeobiological analyses of allometry may encounter higher rates of error. Differences in sample size between fossil and extant studies and any resulting effects on allometric analyses have not been thoroughly investigated, and a logical lower threshold to sample size is not clear. Here we show that studies based on fossil datasets have smaller sample sizes than those based on extant taxa. A similar pattern between vertebrates and invertebrates indicates this is not a problem unique to either group, but common to both. We investigate the relationship between sample size, ontogenetic allometric relationship and statistical power using an empirical dataset of skull measurements of modern Alligator mississippiensis. Across a variety of subsampling techniques, used to simulate different taphonomic and/or sampling effects, smaller sample sizes gave less reliable and more variable results, often with the result that allometric relationships will go undetected due to Type II error (failure to reject the null hypothesis). This may result in a false impression of fewer instances of positive/negative allometric growth in fossils compared to living organisms. These limitations are not restricted to fossil data and are equally applicable to allometric analyses of rare extant taxa. No mathematically derived minimum sample size for ontogenetic allometric studies is found; rather results of isometry (but not necessarily allometry) should not be viewed with confidence at small sample sizes. PMID:25780770
McCain, J.D.; Dawes, S.S.; Farthing, W.E.
1986-05-01
The report is Attachment No. 2 to the Final Report of ARB Contract A3-092-32 and provides a tutorial on the use of Cascade (Series) Cyclones to obtain size-fractionated particulate samples from industrial flue gases at stationary sources. The instrumentation and procedures described are designed to protect the purity of the collected samples so that post-test chemical analysis may be performed for organic and inorganic compounds, including instrumental analysis for trace elements. The instrumentation described collects bulk quantities for each of six size fractions over the range 10 to 0.4 micrometer diameter. The report describes the operating principles, calibration, and empirical modeling of small cyclone performance. It also discusses the preliminary calculations, operation, sample retrieval, and data analysis associated with the use of cyclones to obtain size-segregated samples and to measure particle-size distributions.
Chen, Y.; Nguyen, D.; Guertin, S.; Berstein, J.; White, M.; Menke, R.; Kayali, S.
2003-01-01
This paper presents a reliability evaluation methodology to obtain the statistical reliability information of memory chips for space applications when the test sample size needs to be kept small because of the high cost of the radiation hardness memories.
Computing Confidence Bounds for Power and Sample Size of the General Linear Univariate Model
Taylor, Douglas J.; Muller, Keith E.
2013-01-01
The power of a test, the probability of rejecting the null hypothesis in favor of an alternative, may be computed using estimates of one or more distributional parameters. Statisticians frequently fix mean values and calculate power or sample size using a variance estimate from an existing study. Hence computed power becomes a random variable for a fixed sample size. Likewise, the sample size necessary to achieve a fixed power varies randomly. Standard statistical practice requires reporting uncertainty associated with such point estimates. Previous authors studied an asymptotically unbiased method of obtaining confidence intervals for noncentrality and power of the general linear univariate model in this setting. We provide exact confidence intervals for noncentrality, power, and sample size. Such confidence intervals, particularly one-sided intervals, help in planning a future study and in evaluating existing studies. PMID:24039272
Zan, Jinbo; Fang, Xiaomin; Yang, Shengli; Yan, Maodu
2015-01-01
studies demonstrate that particle size separation based on gravitational settling and detailed rock magnetic measurements of the resulting fractionated samples constitutes an effective approach to evaluating the relative contributions of pedogenic and detrital components in the loess and paleosol sequences on the Chinese Loess Plateau. So far, however, similar work has not been undertaken on the loess deposits in Central Asia. In this paper, 17 loess and paleosol samples from three representative loess sections in Central Asia were separated into four grain size fractions, and then systematic rock magnetic measurements were made on the fractions. Our results demonstrate that the content of the <4 μm fraction in the Central Asian loess deposits is relatively low and that the samples generally have a unimodal particle distribution with a peak in the medium-coarse silt range. We find no significant difference between the particle size distributions obtained by the laser diffraction and the pipette and wet sieving methods. Rock magnetic studies further demonstrate that the medium-coarse silt fraction (e.g., the 20-75 μm fraction) provides the main control on the magnetic properties of the loess and paleosol samples in Central Asia. The contribution of pedogenically produced superparamagnetic (SP) and stable single-domain (SD) magnetic particles to the bulk magnetic properties is very limited. In addition, the coarsest fraction (>75 μm) exhibits the minimum values of χ, χARM, and SIRM, demonstrating that the concentrations of ferrimagnetic grains are not positively correlated with the bulk particle size in the Central Asian loess deposits.
The Impact of Sample Size and Other Factors When Estimating Multilevel Logistic Models
ERIC Educational Resources Information Center
Schoeneberger, Jason A.
2016-01-01
The design of research studies utilizing binary multilevel models must necessarily incorporate knowledge of multiple factors, including estimation method, variance component size, or number of predictors, in addition to sample sizes. This Monte Carlo study examined the performance of random effect binary outcome multilevel models under varying…
ERIC Educational Resources Information Center
Kelley, Ken; Rausch, Joseph R.
2011-01-01
Longitudinal studies are necessary to examine individual change over time, with group status often being an important variable in explaining some individual differences in change. Although sample size planning for longitudinal studies has focused on statistical power, recent calls for effect sizes and their corresponding confidence intervals…
A margin based approach to determining sample sizes via tolerance bounds.
Newcomer, Justin T.; Freeland, Katherine Elizabeth
2013-09-01
This paper proposes a tolerance bound approach for determining sample sizes. With this new methodology we begin to think of sample size in the context of uncertainty exceeding margin. As the sample size decreases the uncertainty in the estimate of margin increases. This can be problematic when the margin is small and only a few units are available for testing. In this case there may be a true underlying positive margin to requirements but the uncertainty may be too large to conclude we have sufficient margin to those requirements with a high level of statistical confidence. Therefore, we provide a methodology for choosing a sample size large enough such that an estimated QMU uncertainty based on the tolerance bound approach will be smaller than the estimated margin (assuming there is positive margin). This ensures that the estimated tolerance bound will be within performance requirements and the tolerance ratio will be greater than one, supporting a conclusion that we have sufficient margin to the performance requirements. In addition, this paper explores the relationship between margin, uncertainty, and sample size and provides an approach and recommendations for quantifying risk when sample sizes are limited.
Exact Power and Sample Size Calculations for the Two One-Sided Tests of Equivalence.
Shieh, Gwowen
2016-01-01
Equivalence testing has been strongly recommended for demonstrating the comparability of treatment effects in a wide variety of research fields including medical studies. Although the essential properties of the favorable two one-sided tests of equivalence have been addressed in the literature, the associated power and sample size calculations were illustrated mainly for selecting the most appropriate approximate method. Moreover, conventional power analysis does not consider the allocation restrictions and cost issues of different sample size choices. To extend the practical usefulness of the two one-sided tests procedure, this article describes exact approaches to sample size determinations under various allocation and cost considerations. Because the presented features are not generally available in common software packages, both R and SAS computer codes are presented to implement the suggested power and sample size computations for planning equivalence studies. The exact power function of the TOST procedure is employed to compute optimal sample sizes under four design schemes allowing for different allocation and cost concerns. The proposed power and sample size methodology should be useful for medical sciences to plan equivalence studies. PMID:27598468
Size and modal analyses of fines and ultrafines from some Apollo 17 samples
NASA Technical Reports Server (NTRS)
Greene, G. M.; King, D. T., Jr.; Banholzer, G. S., Jr.; King, E. A.
1975-01-01
Scanning electron and optical microscopy techniques have been used to determine the grain-size frequency distributions and morphology-based modal analyses of fine and ultrafine fractions of some Apollo 17 regolith samples. There are significant and large differences between the grain-size frequency distributions of the less than 10-micron size fraction of Apollo 17 samples, but there are no clear relations to the local geologic setting from which individual samples have been collected. This may be due to effective lateral mixing of regolith particles in this size range by micrometeoroid impacts. None of the properties of the frequency distributions support the idea of selective transport of any fine grain-size fraction, as has been proposed by other workers. All of the particle types found in the coarser size fractions also occur in the less than 10-micron particles. In the size range from 105 to 10 microns there is a strong tendency for the percentage of regularly shaped glass to increase as the graphic mean grain size of the less than 1-mm size fraction decreases, both probably being controlled by exposure age.
Sample size calculations for surveys to substantiate freedom of populations from infectious agents.
Johnson, Wesley O; Su, Chun-Lung; Gardner, Ian A; Christensen, Ronald
2004-03-01
We develop a Bayesian approach to sample size computations for surveys designed to provide evidence of freedom from a disease or from an infectious agent. A population is considered "disease-free" when the prevalence or probability of disease is less than some threshold value. Prior distributions are specified for diagnostic test sensitivity and specificity and we test the null hypothesis that the prevalence is below the threshold. Sample size computations are developed using hypergeometric sampling for finite populations and binomial sampling for infinite populations. A normal approximation is also developed. Our procedures are compared with the frequentist methods of Cameron and Baldock (1998a, Preventive Veterinary Medicine 34, 1-17) using an example of foot-and-mouth disease. User-friendly programs for sample size calculation and analysis of survey data are available at http://www.epi.ucdavis.edu/diagnostictests/. PMID:15032786
A normative inference approach for optimal sample sizes in decisions from experience.
Ostwald, Dirk; Starke, Ludger; Hertwig, Ralph
2015-01-01
"Decisions from experience" (DFE) refers to a body of work that emerged in research on behavioral decision making over the last decade. One of the major experimental paradigms employed to study experience-based choice is the "sampling paradigm," which serves as a model of decision making under limited knowledge about the statistical structure of the world. In this paradigm respondents are presented with two payoff distributions, which, in contrast to standard approaches in behavioral economics, are specified not in terms of explicit outcome-probability information, but by the opportunity to sample outcomes from each distribution without economic consequences. Participants are encouraged to explore the distributions until they feel confident enough to decide from which they would prefer to draw from in a final trial involving real monetary payoffs. One commonly employed measure to characterize the behavior of participants in the sampling paradigm is the sample size, that is, the number of outcome draws which participants choose to obtain from each distribution prior to terminating sampling. A natural question that arises in this context concerns the "optimal" sample size, which could be used as a normative benchmark to evaluate human sampling behavior in DFE. In this theoretical study, we relate the DFE sampling paradigm to the classical statistical decision theoretic literature and, under a probabilistic inference assumption, evaluate optimal sample sizes for DFE. In our treatment we go beyond analytically established results by showing how the classical statistical decision theoretic framework can be used to derive optimal sample sizes under arbitrary, but numerically evaluable, constraints. Finally, we critically evaluate the value of deriving optimal sample sizes under this framework as testable predictions for the experimental study of sampling behavior in DFE. PMID:26441720
Minimum Sample Size for Cronbach's Coefficient Alpha: A Monte-Carlo Study
ERIC Educational Resources Information Center
Yurdugul, Halil
2008-01-01
The coefficient alpha is the most widely used measure of internal consistency for composite scores in the educational and psychological studies. However, due to the difficulties of data gathering in psychometric studies, the minimum sample size for the sample coefficient alpha has been frequently debated. There are various suggested minimum sample…
Generating Random Samples of a Given Size Using Social Security Numbers.
Erickson, Richard C.; Brauchle, Paul E.
The purposes of this article are (1) to present a method by which social security numbers may be used to draw cluster samples of a predetermined size and (2) to describe procedures used to validate this method of drawing random samples. (JOW)
Computer program for sample sizes required to determine disease incidence in fish populations
Ossiander, Frank J.; Wedemeyer, Gary
1973-01-01
A computer program is described for generating the sample size tables required in fish hatchery disease inspection and certification. The program was designed to aid in detection of infectious pancreatic necrosis (IPN) in salmonids, but it is applicable to any fish disease inspection when the sampling plan follows the hypergeometric distribution.
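The kind of calculation described above can be illustrated with a minimal sketch. It is not the authors' program: it assumes a perfect diagnostic test and a known lot size, and the function names and the example lot of 1,000 fish are hypothetical. It searches for the smallest sample for which the hypergeometric probability of drawing no infected fish falls to or below one minus the chosen confidence level.

```python
from math import comb


def prob_no_detection(population, infected, n):
    """Hypergeometric probability that a sample of n fish, drawn without
    replacement, contains none of the infected fish."""
    if n > population - infected:
        return 0.0  # the sample must include at least one infected fish
    return comb(population - infected, n) / comb(population, n)


def min_sample_size(population, prevalence, confidence=0.95):
    """Smallest n that detects at least one infected fish with the given
    confidence, assuming a perfect test (an assumption of this sketch)."""
    infected = max(1, round(population * prevalence))
    for n in range(1, population + 1):
        if prob_no_detection(population, infected, n) <= 1 - confidence:
            return n
    return population


if __name__ == "__main__":
    # Hypothetical lot: 1,000 fish, assumed 5% prevalence, 95% confidence.
    print(min_sample_size(1000, 0.05))
```

Because sampling is without replacement, the required sample grows more slowly than the lot size; a table like those the abstract mentions could be built by looping this function over lot sizes and assumed prevalences.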
The Effects of Sample Size, Estimation Methods, and Model Specification on SEM Indices.
ERIC Educational Resources Information Center
Fan, Xitao; And Others
A Monte Carlo simulation study was conducted to investigate the effects of sample size, estimation method, and model specification on structural equation modeling (SEM) fit indices. Based on a balanced 3x2x5 design, a total of 6,000 samples were generated from a prespecified population covariance matrix, and eight popular SEM fit indices were…
Macrobenthic data from samples taken in 1980, 1983 and 1985 along a pollution gradient in the Southern California Bight (USA) were analyzed at 5 taxonomic levels (species, genus, family, order, phylum) to determine the taxon and sample size sufficient for assessing pollution impa...
Norm Block Sample Sizes: A Review of 17 Individually Administered Intelligence Tests
ERIC Educational Resources Information Center
Norfolk, Philip A.; Farmer, Ryan L.; Floyd, Randy G.; Woods, Isaac L.; Hawkins, Haley K.; Irby, Sarah M.
2015-01-01
The representativeness, recency, and size of norm samples strongly influence the accuracy of inferences drawn from their scores. Inadequate norm samples may lead to inflated or deflated scores for individuals and poorer prediction of developmental and academic outcomes. The purpose of this study was to apply Kranzler and Floyd's method for…
ERIC Educational Resources Information Center
Finch, W. Holmes; Finch, Maria E. Hernandez
2016-01-01
Researchers and data analysts are sometimes faced with the problem of very small samples, where the number of variables approaches or exceeds the overall sample size; i.e. high dimensional data. In such cases, standard statistical models such as regression or analysis of variance cannot be used, either because the resulting parameter estimates…
Fienen, Michael N.; Selbig, William R.
2012-01-01
A new sample collection system was developed to improve the representation of sediment entrained in urban storm water by integrating water quality samples from the entire water column. The depth-integrated sampler arm (DISA) was able to mitigate sediment stratification bias in storm water, thereby improving the characterization of suspended-sediment concentration and particle size distribution at three independent study locations. Use of the DISA decreased variability, which improved statistical regression to predict particle size distribution using surrogate environmental parameters, such as precipitation depth and intensity. The performance of this statistical modeling technique was compared to results using traditional fixed-point sampling methods and was found to perform better. When environmental parameters can be used to predict particle size distributions, environmental managers have more options when characterizing concentrations, loads, and particle size distributions in urban runoff.
Study on Proper Sample Size for Multivariate Frequency Analysis for Rainfall Quantile
Joo, K.; Nam, W.; Choi, S.; Heo, J. H.
2014-12-01
A rainfall event can be characterized by several properties, such as rainfall depth (amount), duration, and intensity. Considering these factors simultaneously explains the actual rainfall phenomenon better than a univariate model does. Applications of multivariate analysis to hydrological data such as extreme rainfall, drought, and flood events have recently increased rapidly. Theoretically, a two-dimensional sample space requires a sample size of n squared if univariate frequency analysis requires a sample size of n. The main objective of this study is to estimate the appropriate sample size for bivariate frequency analysis (especially using copula models) of rainfall data. Hourly data (1961-2010) from the Seoul weather station of the Korea Meteorological Administration (KMA) are used for the frequency analysis, and three copula models (Clayton, Frank, Gumbel) are applied. Parameters are estimated by pseudo-likelihood estimation, and the mean square error (MSE) is estimated for various sample sizes using the peaks-over-threshold (POT) concept. The estimated thresholds of rainfall depth are 65.4 mm for Clayton, 74.2 mm for Frank, and 76.9 mm for Gumbel.
Constrained statistical inference: sample-size tables for ANOVA and regression
Vanbrabant, Leonard; Van De Schoot, Rens; Rosseel, Yves
2015-01-01
Researchers in the social and behavioral sciences often have clear expectations about the order/direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient β1 is larger than β2 and β3. The corresponding hypothesis is H: β1 > {β2, β3} and this is known as an (order) constrained hypothesis. A major advantage of testing such a hypothesis is that power can be gained and inherently a smaller sample size is needed. This article discusses this gain in sample size reduction when an increasing number of constraints are included in the hypothesis. The main goal is to present sample-size tables for constrained hypotheses. A sample-size table contains the necessary sample-size at a pre-specified power (say, 0.80) for an increasing number of constraints. To obtain sample-size tables, two Monte Carlo simulations were performed, one for ANOVA and one for multiple regression. Three results are salient. First, in an ANOVA the needed sample-size decreases by 30–50% when complete ordering of the parameters is taken into account. Second, small deviations from the imposed order have only a minor impact on the power. Third, at the maximum number of constraints, the linear regression results are comparable with the ANOVA results. However, in the case of fewer constraints, ordering the parameters (e.g., β1 > β2) results in a higher power than assigning a positive or a negative sign to the parameters (e.g., β1 > 0). PMID:25628587
EFFECTS OF SAMPLE SIZE ON THE STRESS-PERMEABILITY RELATIONSHIP FOR NATURAL FRACTURES
Gale, J. E.; Raven, K. G.
1980-10-01
Five granite cores (10.0, 15.0, 19.3, 24.5, and 29.4 cm in diameter) containing natural fractures oriented normal to the core axis were used to study the effect of sample size on the permeability of natural fractures. Each sample, taken from the same fractured plane, was subjected to three uniaxial compressive loading and unloading cycles with a maximum axial stress of 30 MPa. For each loading and unloading cycle, the flowrate through the fracture plane from a central borehole under constant (±2% of the pressure increment) injection pressures was measured at specified increments of effective normal stress. Both fracture deformation and flowrate exhibited highly nonlinear variation with changes in normal stress. Both fracture deformation and flowrate hysteresis between loading and unloading cycles were observed for all samples, but this hysteresis decreased with successive loading cycles. The results of this study suggest that a sample-size effect exists. Fracture deformation and flowrate data indicate that crushing of the fracture plane asperities occurs in the smaller samples because of a poorer initial distribution of contact points than in the larger samples, which deform more elastically. Steady-state flow tests also suggest a decrease in minimum fracture permeability at maximum normal stress with increasing sample size for four of the five samples. Regression analyses of the flowrate and fracture closure data suggest that deformable natural fractures deviate from the cubic relationship between fracture aperture and flowrate and that this is especially true for low flowrates and small apertures, when the fracture sides are in intimate contact under high normal stress conditions. In order to confirm the trends suggested in this study, it is necessary to quantify the scale and variation of fracture plane roughness and to determine, from additional laboratory studies, the degree of variation in the stress-permeability relationship between samples of the same
Brera, Carlo; De Santis, Barbara; Prantera, Elisabetta; Debegnach, Francesca; Pannunzi, Elena; Fasano, Floriana; Berdini, Clara; Slate, Andrew B; Miraglia, Marina; Whitaker, Thomas B
2010-08-11
Use of proper sampling methods throughout the agri-food chain is crucial when it comes to effectively detecting contaminants in foods and feeds. The objective of the study was to estimate the performance of sampling plan designs to determine aflatoxin B1 (AFB1) contamination in corn fields. A total of 840 ears were selected from a corn field suspected of being contaminated with aflatoxin. The mean and variance among the aflatoxin values for each ear were 10.6 μg/kg and 2233.3, respectively. The variability and confidence intervals associated with sample means of a given size could be predicted using an equation associated with the normal distribution. Sample sizes of 248 and 674 ears would be required to estimate the true field concentration of 10.6 μg/kg within ±50 and ±30%, respectively. Using the distribution information from the study, operating characteristic curves were developed to show the performance of various sampling plan designs. PMID:20608734
Power and sample size calculations for Mendelian randomization studies using one genetic instrument.
Freeman, Guy; Cowling, Benjamin J; Schooling, C Mary
2013-08-01
Mendelian randomization, which is instrumental variable analysis using genetic variants as instruments, is an increasingly popular method of making causal inferences from observational studies. In order to design efficient Mendelian randomization studies, it is essential to calculate the sample sizes required. We present formulas for calculating the power of a Mendelian randomization study using one genetic instrument to detect an effect of a given size, and the minimum sample size required to detect effects for given levels of significance and power, using asymptotic statistical theory. We apply the formulas to some example data and compare the results with those from simulation methods. Power and sample size calculations using these formulas should be more straightforward to carry out than simulation approaches. These formulas make explicit that the sample size needed for Mendelian randomization study is inversely proportional to the square of the correlation between the genetic instrument and the exposure and proportional to the residual variance of the outcome after removing the effect of the exposure, as well as inversely proportional to the square of the effect size. PMID:23934314
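The proportionality stated in the last sentence of this abstract can be written compactly. The symbols below are labels chosen here for illustration, not the authors' notation: n is the required sample size, \beta the causal effect of the exposure on the outcome, \rho_{GX} the correlation between the genetic instrument and the exposure, and \sigma^2_{Y \mid X} the residual variance of the outcome after removing the effect of the exposure. The constant of proportionality, which depends on the chosen significance level and power, is not given in the abstract and is left unspecified.

```latex
n \;\propto\; \frac{\sigma^{2}_{Y \mid X}}{\beta^{2}\,\rho_{GX}^{2}}
```

Read this way, halving either the instrument-exposure correlation or the effect size to be detected quadruples the required sample size, while halving the residual variance halves it.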
Son, Dae-Soon; Lee, DongHyuk; Lee, Kyusang; Jung, Sin-Ho; Ahn, Taejin; Lee, Eunjin; Sohn, Insuk; Chung, Jongsuk; Park, Woongyang; Huh, Nam; Lee, Jae Won
2015-02-01
An empirical method of sample size determination for building prediction models was proposed recently. The permutation method used in this procedure is commonly applied to address the problem of overfitting during cross-validation when evaluating the performance of prediction models constructed from microarray data. However, a major drawback of such methods, which include bootstrapping and full permutations, is the prohibitively high cost of computation required for calculating the sample size. In this paper, we propose that a single representative null distribution can be used instead of a full permutation, and we support this using both simulated and real data sets. During simulation, we used a dataset with zero effect size and confirmed that the empirical type I error approaches 0.05. Hence this method can be confidently applied to reduce the overfitting problem during cross-validation. We observed that a pilot data set generated by random sampling from real data could be successfully used for sample size determination. We present our results using an experiment that was repeated 300 times and produced results comparable to those of the full permutation method. Since we eliminate full permutation, sample size estimation time is not a function of pilot data size. In our experiment we observed that this process takes around 30 min. With the increasing number of clinical studies, developing efficient sample size determination methods for building prediction models is critical. But empirical methods using bootstrap and permutation usually involve high computing costs. In this study, we propose a method that can drastically reduce the required computing time by using a representative null distribution of permutations. We use data from pilot experiments to apply this method for designing clinical studies efficiently for high throughput data. PMID:25555898
Shirazi, Mohammadali; Lord, Dominique; Geedipally, Srinivas Reddy
2016-08-01
The Highway Safety Manual (HSM) prediction models are fitted and validated based on crash data collected from a selected number of states in the United States. Therefore, for a jurisdiction to be able to fully benefit from applying these models, it is necessary to calibrate or recalibrate them to local conditions. The first edition of the HSM recommends calibrating the models using a one-size-fits-all sample-size of 30-50 locations with a total of at least 100 crashes per year. However, the HSM recommendation is not fully supported by documented studies. The objectives of this paper are consequently: (1) to examine the required sample size based on the characteristics of the data that will be used for the calibration or recalibration process; and (2) to propose revised guidelines. The objectives were accomplished using simulation runs for different scenarios that characterized the sample mean and variance of the data. The simulation results indicate that as the ratio of the standard deviation to the mean (i.e., coefficient of variation) of the crash data increases, a larger sample-size is warranted to fulfill certain levels of accuracy. Taking this observation into account, sample-size guidelines were prepared based on the coefficient of variation of the crash data that are needed for the calibration process. The guidelines were then successfully applied to the two observed datasets. The proposed guidelines can be used for all facility types and both for segment and intersection prediction models. PMID:27183517
Demonstration of multi- and single-reader sample size program for diagnostic studies software
Hillis, Stephen L.; Schartz, Kevin M.
2015-03-01
The recently released software Multi- and Single-Reader Sample Size Program for Diagnostic Studies, written by Kevin Schartz and Stephen Hillis, performs sample size computations for diagnostic reader-performance studies. The program computes the sample size needed to detect a specified difference in a reader performance measure between two modalities, when using the analysis methods initially proposed by Dorfman, Berbaum, and Metz (DBM) and Obuchowski and Rockette (OR), and later unified and improved by Hillis and colleagues. A commonly used reader performance measure is the area under the receiver-operating-characteristic curve. The program can be used with typical common reader-performance measures which can be estimated parametrically or nonparametrically. The program has an easy-to-use step-by-step intuitive interface that walks the user through the entry of the needed information. Features of the software include the following: (1) choice of several study designs; (2) choice of inputs obtained from either OR or DBM analyses; (3) choice of three different inference situations: both readers and cases random, readers fixed and cases random, and readers random and cases fixed; (4) choice of two types of hypotheses: equivalence or noninferiority; (6) choice of two output formats: power for specified case and reader sample sizes, or a listing of case-reader combinations that provide a specified power; (7) choice of single or multi-reader analyses; and (8) functionality in Windows, Mac OS, and Linux.
Sample size estimation for the sorcerer's apprentice. Guide for the uninitiated and intimidated.
Ray, J. G.; Vermeulen, M. J.
1999-01-01
OBJECTIVE: To review the importance of and practical application of sample size determination for clinical studies in the primary care setting. QUALITY OF EVIDENCE: A MEDLINE search was performed from January 1966 to January 1998 using the MeSH headings and text words "sample size," "sample estimation," and "study design." Article references, medical statistics texts, and university colleagues were also consulted for recommended resources. Citations that offered a clear and simple approach to sample size estimation were accepted, specifically those related to statistical analyses commonly applied in primary care research. MAIN MESSAGE: The chance of committing an alpha statistical error, or finding that there is a difference between two groups when there really is none, is usually set at 5%. The probability of finding no difference between two groups, when, in actuality, there is a difference, is commonly accepted at 20%, and is called the beta error. The power of a study, usually set at 80% (i.e., 1 minus beta), defines the probability that a true difference will be observed between two groups. Using these parameters, we provide examples for estimating the required sample size for comparing two means (t test), comparing event rates between two groups, calculating an odds ratio or a correlation coefficient, or performing a meta-analysis. Estimation of sample size needed before initiation of a study enables statistical power to be maximized and bias minimized, increasing the validity of the study. CONCLUSION: Sample size estimation can be done by any novice researcher who wishes to maximize the quality of his or her study. PMID:10424273
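The alpha, beta, and power conventions described in this abstract can be made concrete with a short sketch for the comparison of two means. This is the standard normal-approximation formula for two equal-sized groups with a common standard deviation, not necessarily the exact worked example in the article; the function name and the illustrative inputs are assumptions. A t-based calculation would give a slightly larger answer.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a difference `delta`
    between two means, given a common standard deviation `sigma`,
    a two-sided alpha error, and the desired power (1 minus beta)."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)             # about 0.84 for power = 0.80
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: detect a difference of half a standard deviation.
    print(n_per_group(delta=0.5, sigma=1.0))  # 63 per group
```

Because the difference to be detected enters as a square, halving it roughly quadruples the required sample size, which is why small changes in the assumed parameters move the result so much.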
Comparative studies of grain size separates of 60009. [lunar soil samples]
NASA Technical Reports Server (NTRS)
Mckay, D. S.; Morris, R. V.; Dungan, M. A.; Fruland, R. M.; Fuhrman, R.
1976-01-01
Five samples from 60009, the lower half of a double drive tube, were analyzed via grain-size methods, with particle types classified and counted in the coarser grain sizes. Studies were undertaken of particle types and distributions by petrographic methods, of magnetic fractions, of the size splits and magnetic splits as analyzed by ferromagnetic resonance (FMR) techniques, of maturity (based on agglutinate content, FMR index Is/FeO, mean size of sub-cm material, magnetic fraction), of possible reworking or mixing in situ, and of depositional history. Maturity indices are in substantial agreement for all of the five samples. Strong positive correlation of percent agglutinates and percent bedrock-derived lithic fragments, combined with negative correlation of those components with percent single crystal plagioclase, argue against in situ reworking of the same soil.
Effects of sample size and sampling frequency on studies of brown bear home ranges and habitat use
Arthur, Steve M.; Schwartz, Charles C.
1999-01-01
We equipped 9 brown bears (Ursus arctos) on the Kenai Peninsula, Alaska, with collars containing both conventional very-high-frequency (VHF) transmitters and global positioning system (GPS) receivers programmed to determine an animal's position at 5.75-hr intervals. We calculated minimum convex polygon (MCP) and fixed and adaptive kernel home ranges for randomly-selected subsets of the GPS data to examine the effects of sample size on accuracy and precision of home range estimates. We also compared results obtained by weekly aerial radiotracking versus more frequent GPS locations to test for biases in conventional radiotracking data. Home ranges based on the MCP were 20-606 km² (x̄ = 201) for aerial radiotracking data (n = 12-16 locations/bear) and 116-1,505 km² (x̄ = 522) for the complete GPS data sets (n = 245-466 locations/bear). Fixed kernel home ranges were 34-955 km² (x̄ = 224) for radiotracking data and 16-130 km² (x̄ = 60) for the GPS data. Differences between means for radiotracking and GPS data were due primarily to the larger samples provided by the GPS data. Means did not differ between radiotracking data and equivalent-sized subsets of GPS data (P > 0.10). For the MCP, home range area increased and variability decreased asymptotically with number of locations. For the kernel models, both area and variability decreased with increasing sample size. Simulations suggested that the MCP and kernel models required >60 and >80 locations, respectively, for estimates to be both accurate (change in area <1%/additional location) and precise (CV < 50%). Although the radiotracking data appeared unbiased, except for the relationship between area and sample size, these data failed to indicate some areas that likely were important to bears. Our results suggest that the usefulness of conventional radiotracking data may be limited by potential biases and variability due to small samples. Investigators that use home range estimates in statistical tests should consider the
Threshold-dependent sample sizes for selenium assessment with stream fish tissue
Hitt, Nathaniel P.; Smith, David
2013-01-01
Natural resource managers are developing assessments of selenium (Se) contamination in freshwater ecosystems based on fish tissue concentrations. We evaluated the effects of sample size (i.e., number of fish per site) on the probability of correctly detecting mean whole-body Se values above a range of potential management thresholds. We modeled Se concentrations as gamma distributions with shape and scale parameters fitting an empirical mean-to-variance relationship in data from southwestern West Virginia, USA (63 collections, 382 individuals). We used parametric bootstrapping techniques to calculate statistical power as the probability of detecting true mean concentrations up to 3 mg Se/kg above management thresholds ranging from 4-8 mg Se/kg. Sample sizes required to achieve 80% power varied as a function of management thresholds and type-I error tolerance (α). Higher thresholds required more samples than lower thresholds because populations were more heterogeneous at higher mean Se levels. For instance, to assess a management threshold of 4 mg Se/kg, a sample of 8 fish could detect an increase of ∼ 1 mg Se/kg with 80% power (given α = 0.05), but this sample size would be unable to detect such an increase from a management threshold of 8 mg Se/kg with more than a coin-flip probability. Increasing α decreased sample size requirements to detect above-threshold mean Se concentrations with 80% power. For instance, at an α-level of 0.05, an 8-fish sample could detect an increase of ∼ 2 units above a threshold of 8 mg Se/kg with 80% power, but when α was relaxed to 0.2 this sample size was more sensitive to increasing mean Se concentrations, allowing detection of an increase of ∼ 1.2 units with equivalent power. Combining individuals into 2- and 4-fish composite samples for laboratory analysis did not decrease power because the reduced number of laboratory samples was compensated by increased precision of composites for estimating mean
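The parametric-bootstrap idea described in this abstract can be sketched as follows. This is only a minimal illustration, not the authors' procedure: it assumes a constant coefficient of variation for the gamma distribution (the paper fitted an empirical mean-to-variance relationship that is not reproduced here), uses a simulated critical value for a one-sided test, and all function names and numeric inputs are hypothetical.

```python
import random
import statistics


def gamma_power(true_mean, threshold, n_fish, cv=0.5, alpha=0.05,
                n_sim=2000, seed=1):
    """Simulated power to detect a true mean above a management threshold.

    Concentrations are drawn from a gamma distribution whose shape is fixed by
    an assumed coefficient of variation (cv); the scale is set from the mean.
    """
    rng = random.Random(seed)
    shape = 1.0 / cv ** 2

    def sample_mean(mean):
        scale = mean / shape
        return statistics.fmean(
            rng.gammavariate(shape, scale) for _ in range(n_fish)
        )

    # Null distribution: sample means when the true mean sits at the threshold.
    null_means = sorted(sample_mean(threshold) for _ in range(n_sim))
    critical = null_means[min(n_sim - 1, int((1 - alpha) * n_sim))]

    # Power: share of simulated samples from the true mean exceeding the cutoff.
    hits = sum(sample_mean(true_mean) > critical for _ in range(n_sim))
    return hits / n_sim


# Hypothetical scenario: threshold of 4 mg Se/kg, true mean of 5 mg Se/kg.
for n in (8, 40):
    print(n, gamma_power(true_mean=5.0, threshold=4.0, n_fish=n))
```

Raising `alpha` lowers the critical value, so the same number of fish yields higher simulated power, which mirrors the trade-off the abstract describes.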
A practical comparison of blinded methods for sample size reviews in survival data clinical trials.
Todd, Susan; Valdés-Márquez, Elsa; West, Jodie
2012-01-01
This paper presents practical approaches to the problem of sample size re-estimation in the case of clinical trials with survival data when proportional hazards can be assumed. When data are readily available at the time of the review, on a full range of survival experiences across the recruited patients, it is shown that, as expected, performing a blinded re-estimation procedure is straightforward and can help to maintain the trial's pre-specified error rates. Two alternative methods for dealing with the situation where limited survival experiences are available at the time of the sample size review are then presented and compared. In this instance, extrapolation is required in order to undertake the sample size re-estimation. Worked examples, together with results from a simulation study are described. It is concluded that, as in the standard case, use of either extrapolation approach successfully protects the trial error rates. PMID:22337635
Mesh-size effects on drift sample composition as determined with a triple net sampler
Slack, K.V.; Tilley, L.J.; Kennelly, S.S.
1991-01-01
Nested nets of three different mesh apertures were used to study mesh-size effects on drift collected in a small mountain stream. The innermost, middle, and outermost nets had, respectively, 425 μm, 209 μm and 106 μm openings, a design that reduced clogging while partitioning collections into three size groups. The open area of mesh in each net, from largest to smallest mesh opening, was 3.7, 5.7 and 8.0 times the area of the net mouth. Volumes of filtered water were determined with a flowmeter. The results are expressed as (1) drift retained by each net, (2) drift that would have been collected by a single net of given mesh size, and (3) the percentage of total drift (the sum of the catches from all three nets) that passed through the 425 μm and 209 μm nets. During a two day period in August 1986, Chironomidae larvae were dominant numerically in all 209 μm and 106 μm samples and midday 425 μm samples. Large drifters (Ephemerellidae) occurred only in 425 μm or 209 μm nets, but the general pattern was an increase in abundance and number of taxa with decreasing mesh size. Relatively more individuals occurred in the larger mesh nets at night than during the day. The two larger mesh sizes retained 70% of the total sediment/detritus in the drift collections, and this decreased the rate of clogging of the 106 μm net. If an objective of a sampling program is to compare drift density or drift rate between areas or sampling dates, the same mesh size should be used for all sample collection and processing. The mesh aperture used for drift collection should retain all species and life stages of significance in a study. The nested net design enables an investigator to test the adequacy of drift samples. © 1991 Kluwer Academic Publishers.
Morgera, S. D.; Cooper, D. B.
1976-01-01
The experimental observation that a surprisingly small sample size vis-a-vis dimension is needed to achieve good signal-to-interference ratio (SIR) performance with an adaptive predetection filter is explained. The adaptive filter requires estimates as obtained by a recursive stochastic algorithm of the inverse of the filter input data covariance matrix. The SIR performance with sample size is compared for the situations where the covariance matrix estimates are of unstructured (generalized) form and of structured (finite Toeplitz) form; the latter case is consistent with weak stationarity of the input data stochastic process.
Estimation of grain size in asphalt samples using digital image analysis
Källén, Hanna; Heyden, Anders; Lindh, Per
2014-09-01
Asphalt is made of a mixture of stones of different sizes and a binder called bitumen; the size distribution of the stones is determined by the recipe of the asphalt. One quality check of asphalt is to see if the real size distribution of asphalt samples is consistent with the recipe. This is usually done by first extracting the binder using methylene chloride and then sieving the stones to see how much passes each sieve size. Methylene chloride is highly toxic, and it is desirable to find the size distribution in some other way. In this paper we find the size distribution by slicing up the asphalt sample and using image analysis techniques to analyze the cross-sections. First the stones are segmented from the background (the bitumen), and then rectangles are fit to the detected stones. We then estimate the sizes of the stones by using the width of the rectangle. The result is compared with both the recipe for the asphalt and the result from the standard analysis method, and our method shows good correlation with both.
Sample size determination for testing equality in a cluster randomized trial with noncompliance.
Lui, Kung-Jong; Chang, Kuang-Chao
2011-01-01
For administrative convenience or cost efficiency, we may often employ a cluster randomized trial (CRT), in which randomized units are clusters of patients rather than individual patients. Furthermore, for ethical reasons or because of patients' decisions, it is not uncommon to encounter data in which there are patients not complying with their assigned treatments. Thus, the development of a sample size calculation procedure for a CRT with noncompliance is important and useful in practice. Under the exclusion restriction model, we have developed an asymptotic test procedure using a tanh⁻¹(x) transformation for testing equality between two treatments among compliers for a CRT with noncompliance. We have further derived a sample size formula accounting for both noncompliance and the intraclass correlation for a desired power 1 - β at a nominal α level. We have employed Monte Carlo simulation to evaluate the finite-sample performance of the proposed test procedure with respect to type I error and the accuracy of the derived sample size calculation formula with respect to power in a variety of situations. Finally, we use the data taken from a CRT studying vitamin A supplementation to reduce mortality among preschool children to illustrate the use of the sample size calculation proposed here. PMID:21191850
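The role of the intraclass correlation in a cluster randomized design can be illustrated with the textbook design effect, 1 + (m - 1)ρ, for clusters of equal size m. The sketch below is not the formula derived in this paper: it ignores noncompliance entirely, and the function name and input values are assumptions made for illustration.

```python
import math


def clusters_per_arm(n_individual, cluster_size, icc):
    """Inflate an individually randomized sample size for clustering.

    n_individual: per-arm sample size that ignores clustering (assumed input)
    cluster_size: number of patients per cluster, taken as equal across clusters
    icc: intraclass correlation coefficient
    Returns (number of clusters per arm, design effect).
    """
    design_effect = 1 + (cluster_size - 1) * icc
    n_inflated = n_individual * design_effect
    return math.ceil(n_inflated / cluster_size), design_effect


# Hypothetical inputs: 63 patients per arm, clusters of 20, ICC of 0.05.
clusters, deff = clusters_per_arm(n_individual=63, cluster_size=20, icc=0.05)
print(clusters, round(deff, 2))
```

Even a small intraclass correlation nearly doubles the required number of patients here, which is why a formula that ignores clustering would leave such a trial underpowered.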
Simulation analyses of space use: Home range estimates, variability, and sample size
Bekoff, M.; Mech, L.D.
1984-01-01
Simulations of space use by animals were run to determine the relationship among home range area estimates, variability, and sample size (number of locations). As sample size increased, home range size increased asymptotically, whereas variability decreased among mean home range area estimates generated by multiple simulations for the same sample size. Our results suggest that field workers should ascertain between 100 and 200 locations in order to estimate home range area reliably. In some cases, this suggested guideline is higher than values found in the few published studies in which the relationship between home range area and number of locations is addressed. Sampling differences for small species occupying relatively small home ranges indicate that fewer locations may be sufficient to allow for a reliable estimate of home range. Intraspecific variability in social status (group member, loner, resident, transient), age, sex, reproductive condition, and food resources also has to be considered, as do season, habitat, and differences in sampling and analytical methods. Comparative data still are needed.
Size-dependent Turbidimetric Quantification of Mobile Colloids in Field Samples
NASA Astrophysics Data System (ADS)
Yan, J.; Meng, X.; Jin, Y.
2015-12-01
Natural colloids, often defined as entities with sizes < 1.0 μm, have attracted much research attention because of their ability to facilitate the transport of contaminants in the subsurface environment. However, due to their small size and generally low concentrations in field samples, quantification of mobile colloids, especially the smaller fractions (< 0.45 µm), which are operationally defined as dissolved, is largely impeded and hence the natural colloidal pool is greatly overlooked and underestimated. The main objectives of this study are to: (1) develop an experimentally and economically efficient methodology to quantify natural colloids in different size fractions (0.1-0.45 and 0.45-1 µm); (2) quantify mobile colloids including small colloids, < 0.45 µm particularly, in different natural aquatic samples. We measured and generated correlations between mass concentration and turbidity of colloid suspensions, made by extracting and fractionating water dispersible colloids in 37 soils from different areas in the U.S. and Denmark, for colloid size fractions 0.1-0.45 and 0.45-1 µm. Results show that the correlation between turbidity and colloid mass concentration is largely affected by colloid size and iron content, indicating the need to generate different correlations for colloids with constrained size range and iron content. This method enabled quick quantification of colloid concentrations in a large number of field samples collected from freshwater, wetland and estuaries in different size fractions. As a general trend, we observed high concentrations of colloids in the < 0.45 µm fraction, which constitutes a significant percentage of the total mobile colloidal pool (< 1 µm). This observation suggests that the operationally defined cut-off size for "dissolved" phase can greatly underestimate colloid concentration therefore the role that colloids play in the transport of associated contaminants or other elements.
Sample-Size Effects on the Compression Behavior of a Ni-Based Amorphous Alloy
NASA Astrophysics Data System (ADS)
Liang, Weizhong; Zhao, Guogang; Wu, Linzhi; Yu, Hongjun; Li, Ming; Zhang, Lin
Ni42Cu5Ti20Zr21.5Al8Si3.5 bulk metallic glass rods with diameters of 1 mm and 3 mm were prepared by arc melting of the constituent elements in a Ti-gettered argon atmosphere. The compressive deformation and fracture behavior of the amorphous alloy samples of different sizes were investigated using a testing machine and a scanning electron microscope. The compressive stress-strain curves of the 1 mm and 3 mm samples exhibited 4.5% and 0% plastic strain, and the compressive fracture strengths of the 1 mm and 3 mm rods were 4691 MPa and 2631 MPa, respectively. The compressive fracture surfaces of both sample sizes consisted of a shear zone and a non-shear zone. Typical vein patterns with some melting drops can be seen in the shear region of the 1 mm rod, while fish-bone-shaped patterns can be observed on the 3 mm specimen surface. Periodic ripples with different spacings were present in the non-shear zone of the 1 mm and 3 mm rods. On the side surface of the 1 mm sample, a high density of shear bands was observed. The skip of shear bands can also be seen on the 1 mm sample surface. The mechanisms of the effect of sample size on fracture strength and plasticity of the Ni-based amorphous alloy are discussed.
Janja Tursic; Irena Grgic; Axel Berner; Jaroslav Skantar; Igor Cuhalev
2008-02-01
A special sampling system for measurements of size-segregated particles directly at the source of emission was designed and constructed. The central part of this system is a low-pressure cascade impactor with 10 collection stages for the size ranges between 15 nm and 16 μm. Its capability and suitability was proven by sampling particles at the stack (100 °C) of a coal-fired power station in Slovenia. These measurements showed very reasonable results in comparison with a commercial cascade impactor for PM10 and PM2.5 and with a plane device for total suspended particulate matter (TSP). The best agreement with the measurements made by a commercial impactor was found for concentrations of TSP above 10 mg m⁻³, i.e., the average PM2.5/PM10 ratios obtained by a commercial impactor and by our impactor were 0.78 and 0.80, respectively. Analysis of selected elements in size-segregated emission particles additionally confirmed the suitability of our system. The measurements showed that the mass size distributions were generally bimodal, with the most pronounced mass peak in the 1-2 μm size range. The first results of elemental mass size distributions showed some distinctive differences in comparison to the most common ambient anthropogenic sources (i.e., traffic emissions). For example, trace elements, like Pb, Cd, As, and V, typically related to traffic emissions, are usually more abundant in particles less than 1 μm in size, whereas in our specific case they were found at about 2 μm. Thus, these mass size distributions can be used as a signature of this source. Simultaneous measurements of size-segregated particles at the source and in the surrounding environment can therefore significantly increase the sensitivity of the contribution of a specific source to the actual ambient concentrations. 25 refs., 3 figs., 2 tabs.
[Sample size calculation in clinical post-marketing evaluation of traditional Chinese medicine].
Fu, Yingkun; Xie, Yanming
2011-10-01
In recent years, as the Chinese government and public have paid more attention to post-marketing research on Chinese medicine, some traditional Chinese medicine products have begun, or are about to begin, post-marketing evaluation studies after listing. In the design of a post-marketing evaluation, sample size calculation plays a decisive role. It not only ensures the accuracy and reliability of the post-marketing evaluation but also assures that the intended trials will have the desired power for correctly detecting a clinically meaningful difference between the medicines under study if such a difference truly exists. Up to now, there has been no systematic method of sample size calculation tailored to traditional Chinese medicine. In this paper, based on the basic methods of sample size calculation and the characteristics of clinical evaluation of traditional Chinese medicine, the sample size calculation methods for the efficacy and the safety of Chinese medicine are discussed separately. We hope the paper will be beneficial to medical researchers and pharmaceutical scientists who are engaged in Chinese medicine research. PMID:22292397
Sideridis, Georgios; Simos, Panagiotis; Papanicolaou, Andrew; Fletcher, Jack
2014-01-01
The present study assessed the impact of sample size on the power and fit of structural equation modeling applied to functional brain connectivity hypotheses. The data consisted of time-constrained minimum norm estimates of regional brain activity during performance of a reading task obtained with magnetoencephalography. Power analysis was first…
Analysis of variograms with various sample sizes from a multispectral image
Technology Transfer Automated Retrieval System (TEKTRAN)
Variograms play a crucial role in remote sensing application and geostatistics. In this study, the analysis of variograms with various sample sizes of remotely sensed data was conducted. A 100 X 100 pixel subset was chosen from an aerial multispectral image which contained three wavebands, green, ...
Analysis of variograms with various sample sizes from a multispectral image
Technology Transfer Automated Retrieval System (TEKTRAN)
Variogram plays a crucial role in remote sensing application and geostatistics. It is very important to estimate variogram reliably from sufficient data. In this study, the analysis of variograms with various sample sizes of remotely sensed data was conducted. A 100x100-pixel subset was chosen from ...
Kelley, Ken; Rausch, Joseph R.
2006-01-01
Methods for planning sample size (SS) for the standardized mean difference so that a narrow confidence interval (CI) can be obtained via the accuracy in parameter estimation (AIPE) approach are developed. One method plans SS so that the expected width of the CI is sufficiently narrow. A modification adjusts the SS so that the obtained CI is no…
Introduction to Sample Size Choice for Confidence Intervals Based on "t" Statistics
ERIC Educational Resources Information Center
2014-01-01
Sample size can be chosen to achieve a specified width in a confidence interval. The probability of obtaining a narrow width given that the confidence interval includes the population parameter is defined as the power of the confidence interval, a concept unfamiliar to many practitioners. This article shows how to utilize the Statistical Analysis…
Measurement Model Quality, Sample Size, and Solution Propriety in Confirmatory Factor Models
ERIC Educational Resources Information Center
Gagne, Phill; Hancock, Gregory R.
2006-01-01
Sample size recommendations in confirmatory factor analysis (CFA) have recently shifted away from observations per variable or per parameter toward consideration of model quality. Extending research by Marsh, Hau, Balla, and Grayson (1998), simulations were conducted to determine the extent to which CFA model convergence and parameter estimation…
Cao, Zhiguo; Xu, Fuchao; Li, Wenchao; Sun, Jianhui; Shen, Mohai; Su, Xianfa; Feng, Jinglan; Yu, Gang; Covaci, Adrian
2015-09-15
Particle size is a significant parameter which determines the environmental fate and the behavior of dust particles and, implicitly, the exposure risk of humans to particle-bound contaminants. Currently, the influence of dust particle size on the occurrence and seasonal variation of hexabromocyclododecanes (HBCDs) remains unclear. While HBCDs are now restricted by the Stockholm Convention, information regarding HBCD contamination in indoor dust in China is still limited. We analyzed composite dust samples from offices (n = 22), hotels (n = 3), kindergartens (n = 2), dormitories (n = 40), and main roads (n = 10). Each composite dust sample (one per type of microenvironment) was fractionated into 9 fractions (F1-F9: 2000-900, 900-500, 500-400, 400-300, 300-200, 200-100, 100-74, 74-50, and <50 μm). Total HBCD concentrations ranged from 5.3 (road dust, F4) to 2580 ng g⁻¹ (dormitory dust, F4) in the 45 size-segregated samples. The seasonality of HBCDs in indoor dust was investigated in 40 samples from two offices. A consistent seasonal trend of HBCD levels was evident, with dust collected in the winter being more contaminated with HBCDs than dust from the summer. The particle size-selection strategy used for dust analysis was found to influence the measured HBCD concentrations, and improper strategies would lead to overestimation or underestimation. PMID:26301772
ERIC Educational Resources Information Center
Kim, Su-Young
2012-01-01
Just as growth mixture models are useful with single-phase longitudinal data, multiphase growth mixture models can be used with multiple-phase longitudinal data. One of the practically important issues in single- and multiphase growth mixture models is the sample size requirements for accurate estimation. In a Monte Carlo simulation study, the…
The Influence of Virtual Sample Size on Confidence and Causal-Strength Judgments
ERIC Educational Resources Information Center
Liljeholm, Mimi; Cheng, Patricia W.
2009-01-01
The authors investigated whether confidence in causal judgments varies with virtual sample size--the frequency of cases in which the outcome is (a) absent before the introduction of a generative cause or (b) present before the introduction of a preventive cause. Participants were asked to evaluate the influence of various candidate causes on an…
Size Distributions and Characterization of Native and Ground Samples for Toxicology Studies
NASA Technical Reports Server (NTRS)
McKay, David S.; Cooper, Bonnie L.; Taylor, Larry A.
2010-01-01
This slide presentation shows charts and graphs that review the particle size distribution and characterization of natural and ground samples for toxicology studies. There are graphs which show the volume distribution versus the number distribution for naturally occurring dust, jet mill ground dust, and ball mill ground dust.
Sample Size Calculation for Estimating or Testing a Nonzero Squared Multiple Correlation Coefficient
ERIC Educational Resources Information Center
Krishnamoorthy, K.; Xia, Yanping
2008-01-01
The problems of hypothesis testing and interval estimation of the squared multiple correlation coefficient of a multivariate normal distribution are considered. It is shown that available one-sided tests are uniformly most powerful, and the one-sided confidence intervals are uniformly most accurate. An exact method of calculating sample size to…
One-Sided Nonparametric Comparison of Treatments with a Standard for Unequal Sample Sizes.
ERIC Educational Resources Information Center
Chakraborti, S.; Gibbons, Jean D.
1992-01-01
The one-sided problem of comparing treatments with a standard on the basis of data available in the context of a one-way analysis of variance is examined, and the methodology of S. Chakraborti and J. D. Gibbons (1991) is extended to the case of unequal sample sizes. (SLD)
Bolton tooth size ratio among Sudanese Population sample: A preliminary study
Abdalla Hashim, Ala’a Hayder; Eldin, AL-Hadi Mohi; Hashim, Hayder Abdalla
2015-01-01
Background: The study of the mesiodistal size, the morphology of teeth and dental arch may play an important role in clinical dentistry, as well as other sciences such as Forensic Dentistry and Anthropology. Aims: The aims of the present study were to establish the tooth-size ratio in a Sudanese sample with Class I normal occlusion, and to compare the tooth-size ratio between the present study and Bolton's study and between genders. Materials and Methods: The sample consisted of dental casts of 60 subjects (30 males and 30 females). The Bolton formula was used to compute the overall and anterior ratios. The correlation coefficient between the anterior ratio and overall ratio was tested, and Student's t-test was used to compare tooth-size ratios between males and females, and between the present study and Bolton's result. Results: The overall and anterior ratios were relatively similar to the mean values reported by Bolton, and there were no statistically significant differences between males and females in the mean values of the anterior ratio or the overall ratio. The correlation coefficient was r = 0.79. Conclusions: The results obtained were similar to those for the Caucasian race. However, the Sudanese population consists of different racial groups; therefore, a firm conclusion is difficult to draw. Since this sample is not representative of the Sudanese population, a further study with a large sample collected from different parts of the Sudan is required. PMID:26229948
Got Power? A Systematic Review of Sample Size Adequacy in Health Professions Education Research
ERIC Educational Resources Information Center
Cook, David A.; Hatala, Rose
2015-01-01
Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011,…
Gao, Ka; Li, Shuangming; Xu, Lei; Fu, Hengzhi
2014-05-01
Al-40% Cu hypereutectic alloy samples were successfully directionally solidified at a growth rate of 10 μm/s in different sizes (4 mm, 1.8 mm, and 0.45 mm thickness in transverse section). Using the serial sectioning technique, the three-dimensional (3D) microstructures of the primary intermetallic Al2Cu phase of the alloy can be observed with various growth patterns, L-shape, E-shape, and regular rectangular shape with respect to growth orientations of the (110) and (310) plane. The L-shape and regular rectangular shape of Al2Cu phase are bounded by {110} facets. When the sample size was reduced from 4 mm to 0.45 mm, the solidified microstructures changed from multi-layer dendrites to single-layer dendrite along the growth direction, and then the orientation texture was at the plane (310). The growth mechanism for the regular faceted intermetallic Al2Cu at different sample sizes was interpreted by the oriented attachment mechanism (OA). The experimental results showed that the directionally solidified Al-40% Cu alloy sample in a much smaller size can achieve a well-aligned morphology with a specific growth texture.
Approaches to sample size estimation in the design of clinical trials--a review.
Donner, A
1984-01-01
Over the last decade, considerable interest has focused on sample size estimation in the design of clinical trials. The resulting literature is scattered over many textbooks and journals. This paper presents these methods in a single review and comments on their application in practice. PMID:6385187
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
A Unified Approach to Power Calculation and Sample Size Determination for Random Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2007-01-01
The underlying statistical models for multiple regression analysis are typically attributed to two types of modeling: fixed and random. The procedures for calculating power and sample size under the fixed regression models are well known. However, the literature on random regression models is limited and has been confined to the case of all…
The Relation among Fit Indexes, Power, and Sample Size in Structural Equation Modeling
ERIC Educational Resources Information Center
Kim, Kevin H.
2005-01-01
The relation among fit indexes, power, and sample size in structural equation modeling is examined. The noncentrality parameter is required to compute power. The 2 existing methods of computing power have estimated the noncentrality parameter by specifying an alternative hypothesis or alternative fit. These methods cannot be implemented easily and…
Support vector regression to predict porosity and permeability: Effect of sample size
NASA Astrophysics Data System (ADS)
Al-Anazi, A. F.; Gates, I. D.
2012-02-01
Porosity and permeability are key petrophysical parameters obtained from laboratory core analysis. Cores, obtained from drilled wells, are often few in number for most oil and gas fields. Porosity and permeability correlations based on conventional techniques such as linear regression or neural networks trained with core and geophysical logs suffer poor generalization to wells with only geophysical logs. The generalization problem of correlation models often becomes pronounced when the training sample size is small. This is attributed to the underlying assumption that conventional techniques employing the empirical risk minimization (ERM) inductive principle converge asymptotically to the true risk values as the number of samples increases. In small sample size estimation problems, the available training samples must span the complexity of the parameter space so that the model is able both to match the available training samples reasonably well and to generalize to new data. This is achieved using the structural risk minimization (SRM) inductive principle by matching the capability of the model to the available training data. One method that uses SRM is the support vector regression (SVR) network. In this research, the capability of SVR to predict porosity and permeability in a heterogeneous sandstone reservoir under the effect of small sample size is evaluated. In particular, the impact of Vapnik's ɛ-insensitivity loss function and least-modulus loss function on generalization performance was empirically investigated. The results are compared to the multilayer perceptron (MLP) neural network, a widely used regression method, which operates under the ERM principle. The mean square error and correlation coefficients were used to measure the quality of predictions. The results demonstrate that SVR yields consistently better predictions of the porosity and permeability with small sample size than the MLP method. Also, the performance of SVR depends on both kernel function
Forestry inventory based on multistage sampling with probability proportional to size
NASA Technical Reports Server (NTRS)
Lee, D. C. L.; Hernandez, P., Jr.; Shimabukuro, Y. E.
1983-01-01
A multistage sampling technique, with probability proportional to size, is developed for a forest volume inventory using remote sensing data. The LANDSAT data, Panchromatic aerial photographs, and field data are collected. Based on age and homogeneity, pine and eucalyptus classes are identified. Selection of tertiary sampling units is made through aerial photographs to minimize field work. The sampling errors for eucalyptus and pine ranged from 8.34 to 21.89 percent and from 7.18 to 8.60 percent, respectively.
Sabharwal, Sanjeeve; Patel, Nirav K; Holloway, Ian; Athanasiou, Thanos
2015-03-01
The purpose of this study was to identify how often sample size calculations were reported in recent orthopaedic randomized controlled trials (RCTs) and to determine what proportion of studies that failed to find a significant treatment effect were at risk of type II error. A pre-defined computerized search was performed in MEDLINE to identify RCTs published in 2012 in the 20 highest ranked orthopaedic journals based on impact factor. Data from these studies were used to perform post hoc analysis to determine whether each study was sufficiently powered to detect a small (0.2), medium (0.5) and large (0.8) effect size as defined by Cohen. Sufficient power (1-β) was considered to be 80% and a two-tailed test was performed with an alpha value of 0.05. 120 RCTs were identified using our stated search protocol and just 73 studies (60.80%) described an appropriate sample size calculation. Examination of studies with negative primary outcome revealed that 68 (93.15%) were at risk of type II error for a small treatment effect and only 4 (5.48%) were at risk of type II error for a medium sized treatment effect. Although comparison of the results with existing data from over 10 years ago implies improved practice in sample size calculations within orthopaedic surgery, there remains an ongoing need for improvement of practice. Orthopaedic researchers, as well as journal reviewers and editors, have a responsibility to ensure that RCTs conform to standardized methodological guidelines and perform appropriate sample size calculations. PMID:26280864
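The power criteria quoted in the abstract above (two-tailed alpha of 0.05, 80% power, Cohen's effect sizes of 0.2, 0.5 and 0.8) can be illustrated with a minimal sketch. It assumes a two-arm comparison of means with equal group sizes and the normal approximation n = 2(z_(1-alpha/2) + z_(1-beta))^2 / d^2. The function name n_per_group is hypothetical, and this is not the authors' post hoc procedure.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample
    comparison of means with standardized effect size d (Cohen's d).

    Assumption: normal approximation with equal group sizes.
    """
    z = NormalDist()                       # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)     # critical value, two-sided test
    z_beta = z.inv_cdf(power)              # quantile matching the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} participants per group")
```

Under these assumptions the sketch gives roughly 393, 63 and 25 participants per group for small, medium and large effects. Exact calculations based on the t distribution give slightly larger numbers.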
Sample sizes for brain atrophy outcomes in trials for secondary progressive multiple sclerosis
Altmann, D R.; Jasperse, B; Barkhof, F; Beckmann, K; Filippi, M; Kappos, L D.; Molyneux, P; Polman, C H.; Pozzilli, C; Thompson, A J.; Wagner, K; Yousry, T A.; Miller, D H.
2009-01-01
Background: Progressive brain atrophy in multiple sclerosis (MS) may reflect neuroaxonal and myelin loss and MRI measures of brain tissue loss are used as outcome measures in MS treatment trials. This study investigated sample sizes required to demonstrate reduction of brain atrophy using three outcome measures in a parallel group, placebo-controlled trial for secondary progressive MS (SPMS). Methods: Data were taken from a cohort of 43 patients with SPMS who had been followed up with 6-monthly T1-weighted MRI for up to 3 years within the placebo arm of a therapeutic trial. Central cerebral volumes (CCVs) were measured using a semiautomated segmentation approach, and brain volume normalized for skull size (NBV) was measured using automated segmentation (SIENAX). Change in CCV and NBV was measured by subtraction of baseline from serial CCV and SIENAX images; in addition, percentage brain volume change relative to baseline was measured directly using a registration-based method (SIENA). Sample sizes for given treatment effects and power were calculated for standard analyses using parameters estimated from the sample. Results: For a 2-year trial duration, minimum sample sizes per arm required to detect a 50% treatment effect at 80% power were 32 for SIENA, 69 for CCV, and 273 for SIENAX. Two-year minimum sample sizes were smaller than 1-year by 71% for SIENAX, 55% for CCV, and 44% for SIENA. Conclusion: SIENA and central cerebral volume are feasible outcome measures for inclusion in placebo-controlled trials in secondary progressive multiple sclerosis. GLOSSARY ANCOVA = analysis of covariance; CCV = central cerebral volume; FSL = FMRIB Software Library; MNI = Montreal Neurological Institute; MS = multiple sclerosis; NBV = normalized brain volume; PBVC = percent brain volume change; RRMS = relapsing–remitting multiple sclerosis; SPMS = secondary progressive multiple sclerosis. PMID:19005170
Electrospray ionization mass spectrometry from discrete nanoliter-sized sample volumes.
Ek, Patrik; Stjernström, Mårten; Emmer, Asa; Roeraade, Johan
2010-09-15
We describe a method for nanoelectrospray ionization mass spectrometry (nESI-MS) of very small sample volumes. Nanoliter-sized sample droplets were taken up by suction into a nanoelectrospray needle from a silicon microchip prior to ESI. To avoid a rapid evaporation of the small sample volumes, all manipulation steps were performed under a cover of fluorocarbon liquid. Sample volumes down to 1.5 nL were successfully analyzed, and an absolute limit of detection of 105 attomole of insulin (chain B, oxidized) was obtained. The open access to the sample droplets on the silicon chip provides the possibility to add reagents to the sample droplets and perform chemical reactions under an extended period of time. This was demonstrated in an example where we performed a tryptic digestion of cytochrome C in a nanoliter-sized sample volume for 2.5 h, followed by monitoring the outcome of the reaction with nESI-MS. The technology was also utilized for tandem mass spectrometry (MS/MS) sequencing analysis of a 2 nL solution of angiotensin I. PMID:20740531
Gutenberg-Richter b-value maximum likelihood estimation and sample size
NASA Astrophysics Data System (ADS)
Nava, F. A.; Márquez-Ramírez, V. H.; Zúñiga, F. R.; Ávila-Barrientos, L.; Quinteros, C. B.
2016-06-01
The Aki-Utsu maximum likelihood method is widely used for estimation of the Gutenberg-Richter b-value, but not all authors are conscious of the method's limitations and implicit requirements. The Aki/Utsu method requires a representative estimate of the population mean magnitude; a requirement seldom satisfied in b-value studies, particularly in those that use data from small geographic and/or time windows, such as b-mapping and b-vs-time studies. Monte Carlo simulation methods are used to determine how large a sample is necessary to achieve representativity, particularly for rounded magnitudes. The size of a representative sample weakly depends on the actual b-value. It is shown that, for commonly used precisions, small samples give meaningless estimations of b. Our results give estimates on the probabilities of getting correct estimates of b for a given desired precision for samples of different sizes. We submit that all published studies reporting b-value estimations should include information about the size of the samples used.
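The dependence of the b-value estimate on sample size described above can be sketched with a small Monte Carlo experiment. This is a minimal illustration, not the authors' simulation: it assumes continuous (unrounded) magnitudes drawn from the Gutenberg-Richter exponential law above a minimum magnitude, and uses the standard Aki-Utsu estimator b = log10(e) / (mean(M) - (M_min - dM/2)). All function names and default values (true b of 1.0, tolerance of 0.1) are hypothetical.

```python
import math
import random


def aki_b_value(mags, m_min, bin_width=0.0):
    """Aki-Utsu maximum likelihood estimate of b.

    bin_width is the magnitude rounding interval (0 for continuous data).
    """
    mean_mag = sum(mags) / len(mags)
    return math.log10(math.e) / (mean_mag - (m_min - bin_width / 2.0))


def simulate_mags(n, b, m_min, rng):
    """Draw n continuous magnitudes from the Gutenberg-Richter law."""
    beta = b * math.log(10.0)              # rate of the exponential law
    return [m_min + rng.expovariate(beta) for _ in range(n)]


def fraction_within(n, b=1.0, m_min=2.0, tol=0.1, trials=2000, seed=0):
    """Fraction of simulated samples of size n whose estimate lies
    within +/- tol of the true b-value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        est = aki_b_value(simulate_mags(n, b, m_min, rng), m_min)
        hits += abs(est - b) <= tol
    return hits / trials


if __name__ == "__main__":
    for n in (25, 100, 400, 1600):
        print(n, fraction_within(n))
```

Running the loop shows how the probability of landing within the chosen tolerance grows with sample size. Extending the sketch to rounded magnitudes, which are the focus of the abstract, would require rounding the simulated values and passing the corresponding bin_width.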
Estimating the Correlation in Bivariate Normal Data with Known Variances and Small Sample Sizes
Fosdick, Bailey K.; Raftery, Adrian E.
2013-01-01
We consider the problem of estimating the correlation in bivariate normal data when the means and variances are assumed known, with emphasis on the small sample case. We consider eight different estimators, several of them considered here for the first time in the literature. In a simulation study, we found that Bayesian estimators using the uniform and arc-sine priors outperformed several empirical and exact or approximate maximum likelihood estimators in small samples. The arc-sine prior did better for large values of the correlation. For testing whether the correlation is zero, we found that Bayesian hypothesis tests outperformed significance tests based on the empirical and exact or approximate maximum likelihood estimators considered in small samples, but that all tests performed similarly for sample size 50. These results lead us to suggest using the posterior mean with the arc-sine prior to estimate the correlation in small samples when the variances are assumed known. PMID:23378667
Tsai, Pei-Chien; Bell, Jordana T
2015-01-01
Background: Epigenome-wide association scans (EWAS) are under way for many complex human traits, but EWAS power has not been fully assessed. We investigate power of EWAS to detect differential methylation using case-control and disease-discordant monozygotic (MZ) twin designs with genome-wide DNA methylation arrays. Methods and Results: We performed simulations to estimate power under the case-control and discordant MZ twin EWAS study designs, under a range of epigenetic risk effect sizes and conditions. For example, to detect a 10% mean methylation difference between affected and unaffected subjects at a genome-wide significance threshold of P = 1 × 10^-6, 98 MZ twin pairs were required to reach 80% EWAS power, and 112 cases and 112 controls were needed in the case-control design. We also estimated the minimum sample size required to reach 80% EWAS power under both study designs. Our analyses highlighted several factors that significantly influenced EWAS power, including sample size, epigenetic risk effect size, the variance of DNA methylation at the locus of interest and the correlation in DNA methylation patterns within the twin sample. Conclusions: We provide power estimates for array-based DNA methylation EWAS under case-control and disease-discordant MZ twin designs, and explore multiple factors that impact on EWAS power. Our results can help guide EWAS experimental design and interpretation for future epigenetic studies. PMID:25972603
Vertical grain size distribution in dust devils: Analyses of in situ samples from southern Morocco
NASA Astrophysics Data System (ADS)
Raack, J.; Reiss, D.; Ori, G. G.; Taj-Eddine, K.
2014-04-01
Dust devils are vertical convective vortices occurring on Earth and Mars [1]. Entrained particles such as dust and sand lifted by dust devils make them visible [1]. On Earth, finer particles (<~50 μm) can be entrained in the boundary layer and transported over long distances [e.g., 2]. The lifetime of entrained particles in the atmosphere depends on their size: smaller particles remain in the atmosphere longer [3]. Mineral aerosols such as desert dust are important for human health, weather, climate, and biogeochemistry [4]. The entrainment of dust particles by dust devils and the vertical grain size distribution within them are not well constrained. In situ grain size samples from active dust devils have so far been obtained by [5,6,7] on three different continents: Africa, Australia, and North America, respectively. In this study we report on in situ samples taken directly from active dust devils in the Sahara Desert (Erg Chegaga) in southern Morocco in 2012 to characterize the vertical grain size distribution within dust devils.
Sample size calculation for recurrent events data in one-arm studies.
Rebora, Paola; Galimberti, Stefania
2012-01-01
In some exceptional circumstances, as in very rare diseases, nonrandomized one-arm trials are the sole source of evidence to demonstrate efficacy and safety of a new treatment. The design of such studies needs a sound methodological approach in order to provide reliable information, and the determination of the appropriate sample size still represents a critical step of this planning process. As, to our knowledge, no method exists for sample size calculation in one-arm trials with a recurrent event endpoint, we propose here a closed sample size formula. It is derived assuming a mixed Poisson process, and it is based on the asymptotic distribution of the one-sample robust nonparametric test recently developed for the analysis of recurrent events data. The validity of this formula in managing a situation with heterogeneity of event rates, both in time and between patients, and time-varying treatment effect was demonstrated with exhaustive simulation studies. Moreover, although the method requires the specification of a process for events generation, it seems to be robust under erroneous definition of this process, provided that the number of events at the end of the study is similar to the one assumed in the planning phase. The motivating clinical context is represented by a nonrandomized one-arm study on gene therapy in a very rare immunodeficiency in children (ADA-SCID), where a major endpoint is the recurrence of severe infections. PMID:23024035
Estimating the Size of Populations at High Risk for HIV Using Respondent-Driven Sampling Data
Handcock, Mark S.; Gile, Krista J.; Mar, Corinne M.
2015-01-01
Summary The study of hard-to-reach populations presents significant challenges. Typically, a sampling frame is not available, and population members are difficult to identify or recruit from broader sampling frames. This is especially true of populations at high risk for HIV/AIDS. Respondent-driven sampling (RDS) is often used in such settings with the primary goal of estimating the prevalence of infection. In such populations, the number of people at risk for infection and the number of people infected are of fundamental importance. This article presents a case-study of the estimation of the size of the hard-to-reach population based on data collected through RDS. We study two populations of female sex workers and men-who-have-sex-with-men in El Salvador. The approach is Bayesian and we consider different forms of prior information, including using the UNAIDS population size guidelines for this region. We show that the method is able to quantify the amount of information on population size available in RDS samples. As separate validation, we compare our results to those estimated by extrapolating from a capture–recapture study of El Salvadorian cities. The results of our case-study are largely comparable to those of the capture–recapture study when they differ from the UNAIDS guidelines. Our method is widely applicable to data from RDS studies and we provide a software package to facilitate this. PMID:25585794
ERIC Educational Resources Information Center
Dong, Nianbo; Maynard, Rebecca
2013-01-01
This paper and the accompanying tool are intended to complement existing supports for conducting power analysis tools by offering a tool based on the framework of Minimum Detectable Effect Sizes (MDES) formulae that can be used in determining sample size requirements and in estimating minimum detectable effect sizes for a range of individual- and…
A Novel Size-Selective Airborne Particle Sampling Instrument (Wras) for Health Risk Evaluation
NASA Astrophysics Data System (ADS)
Gnewuch, H.; Muir, R.; Gorbunov, B.; Priest, N. D.; Jackson, P. R.
Health risks associated with inhalation of airborne particles are known to be influenced by particle size. A reliable, size-resolving sampler that classifies particles in size ranges from 2 nm to 30 μm and is suitable for use in the field would be beneficial in investigating these health risks. A review of current aerosol samplers highlighted a number of limitations. These could be overcome by combining an inertial deposition impactor with a diffusion collector in a single device. The instrument was designed for analysing mass size distributions. Calibration was carried out using a number of recognised techniques. The instrument was tested in the field by collecting size-resolved samples of lead-containing aerosols present at workplaces in factories producing crystal glass. The mass deposited on each substrate proved sufficient to be detected and measured using atomic absorption spectroscopy. Mass size distributions of lead were produced, and the proportion of lead present in the aerosol nanofraction was calculated; it varied from 10% to 70% by weight.
Chondrules in Apollo 14 samples and size analyses of Apollo 14 and 15 fines.
NASA Technical Reports Server (NTRS)
King, E. A., Jr.; Butler, J. C.; Carman, M. F.
1972-01-01
Chondrules have been observed in several breccia samples and one fines sample returned by the Apollo 14 mission. The chondrules are formed by at least three different processes that appear to be related to large impacts: (1) crystallization of shock-melted spherules and droplets; (2) rounding of rock clasts and mineral grains by abrasion in the base surge; and (3) diffusion and recrystallization around clasts in hot base surge and fall-back deposits. In the case of the Apollo 14 samples, the large impact almost certainly is the Imbrian event. Grain size analyses of undisturbed fines samples from the Apollo 14 site and from the Apollo 15 Apennine Front are almost identical, indicating that the two localities have similar meteoroid bombardment exposure ages, approximately 3.7 × 10^9 yr. This observation is consistent with the interpretation that both the Fra Mauro formation and the Apennine Front material originated as ejecta from the Imbrian event.
Multiple Approaches to Down Sizing of the Lunar Sample Return Collection
NASA Technical Reports Server (NTRS)
Lofgren, Gary E.; Horz, F.
2010-01-01
Future Lunar missions are planned for at least 7 days, significantly longer than the 3 days of the later Apollo missions. The last of those missions, A-17, returned 111 kg of samples plus another 20 kg of containers. The current Constellation program requirements for return weight for science is 100 kg with the hope of raising that limit to near 250 kg including containers and other non-geological materials. The estimated return weight for rock and soil samples will, at best, be about 175 kg. One method proposed to accomplish down-sizing of the collection is the use of a Geo-Lab in the lunar habitat to complete a preliminary examination of selected samples and facilitate prioritizing the return samples.
Sample size calculation for the Wilcoxon-Mann-Whitney test adjusting for ties.
Zhao, Yan D; Rahardja, Dewi; Qu, Yongming
2008-02-10
In this paper we study sample size calculation methods for the asymptotic Wilcoxon-Mann-Whitney test for data with or without ties. The existing methods are applicable either to data with ties or to data without ties but not to both cases. While the existing methods developed for data without ties perform well, the methods developed for data with ties have limitations in that they are either applicable to proportional odds alternatives or have computational difficulties. We propose a new method which has a closed-form formula and therefore is very easy to calculate. In addition, the new method can be applied to both data with or without ties. Simulations have demonstrated that the new sample size formula performs very well as the corresponding actual powers are close to the nominal powers. PMID:17487941
Grain size analysis and high frequency electrical properties of Apollo 15 and 16 samples
NASA Technical Reports Server (NTRS)
Gold, T.; Bilson, E.; Yerbury, M.
1973-01-01
The particle size distribution of eleven surface fines samples collected by Apollo 15 and 16 was determined by the method of measuring the sedimentation rate in a column of water. The fact that the grain size distribution in the core samples shows significant differences within a few centimeters variation of depth is important for the understanding of the surface transportation processes which are responsible for the deposition of thin layers of different physical and/or chemical origin. The variation with density of the absorption length is plotted, and results would indicate that for the case of meter wavelength radar waves, reflections from depths of more than 100 meters generally contribute significantly to the radar echoes obtained.
Sample size for estimating the mean concentration of organisms in ballast water.
Costa, Eliardo G; Lopes, Rubens M; Singer, Julio M
2016-09-15
We consider the computation of sample sizes for estimating the mean concentration of organisms in ballast water. Given the possible heterogeneity of their distribution in the tank, we adopt a negative binomial model to obtain confidence intervals for the mean concentration. We show that the results obtained by Chen and Chen (2012) in a different set-up hold for the proposed model and use them to develop algorithms to compute sample sizes both in cases where the mean concentration is known to lie in some bounded interval or where there is no information about its range. We also construct simple diagrams that may be easily employed to decide for compliance with the D-2 regulation of the International Maritime Organization (IMO). PMID:27266648
The inertial and electrical effects on aerosol sampling, charging, and size distribution
Wang, Chuenchung.
1991-01-01
An experimental study was conducted to investigate the effect of particle inertia on deposition behavior near the filter cassette sampler. Field sampling cassettes were tested in a subsonic wind tunnel at wind speeds of 0.2, 0.5 and 0.68 m/s to simulate the indoor air environment. Fluorescein aerosols of 2 and 5 μm were generated from a Berglund-Liu vibrating orifice generator as test material. Sampling tests were conducted in a subsonic wind tunnel, with particle size, wind speed, suction velocity and sampler orientation varied to evaluate the combined effects. Sampling efficiencies were also examined. Electrostatic force is usually used as an effective method for removing, classifying and separating aerosols according to the electrical mobilities of the particulates. On the other hand, the aerosol charging theories differ in the ultrafine size range and need experimental verification. The present TSI electrostatic aerosol analyzer has a particle loss problem and cannot be used as a reliable tool for achieving efficient charging. A new unipolar charger with associated electronic circuits was designed, constructed and tested. The performance of the charger was tested in terms of particle loss, uncharged particles, and the collection efficiency of the precipitator. The results were compared with other investigators' data. The log-Beta distribution function is considered to be more versatile in representing size distributions. This study discusses the method for determining the size parameters under different conditions. The mutability of the size distribution was also evaluated when particles undergo coagulation or classification processes. Comparisons of the evolution of log-Beta and lognormal distributions were made.
Muhm, J M; Olshan, A F
1989-01-01
A program for the Hewlett Packard 41 series programmable calculator that determines sample size, power, and least detectable relative risk for comparative studies with independent groups is described. The user may specify any ratio of cases to controls (or exposed to unexposed subjects) and, if calculating least detectable relative risks, may specify whether the study is a case-control or cohort study. PMID:2910062
On the validity of the Poisson assumption in sampling nanometer-sized aerosols
Damit, Brian E; Wu, Chang-Yu; Cheng, Mengdawn
2014-01-01
A Poisson process is traditionally believed to apply to the sampling of aerosols. For a constant aerosol concentration, it is assumed that a Poisson process describes the fluctuation in the measured concentration because aerosols are stochastically distributed in space. Recent studies, however, have shown that sampling of micrometer-sized aerosols has non-Poissonian behavior with positive correlations. The validity of the Poisson assumption for nanometer-sized aerosols has not been examined and thus was tested in this study. Its validity was tested for four particle sizes - 10 nm, 25 nm, 50 nm and 100 nm - by sampling from indoor air with a DMA-CPC setup to obtain a time series of particle counts. Five metrics were calculated from the data: pair-correlation function (PCF), time-averaged PCF, coefficient of variation, probability of measuring a concentration at least 25% greater than average, and posterior distributions from Bayesian inference. To identify departures from Poissonian behavior, these metrics were also calculated for 1,000 computer-generated Poisson time series with the same mean as the experimental data. For nearly all comparisons, the experimental data fell within the range of 80% of the Poisson-simulation values. Essentially, the metrics for the experimental data were indistinguishable from a simulated Poisson process. The greater influence of Brownian motion for nanometer-sized aerosols may explain the Poissonian behavior observed for smaller aerosols. Although the Poisson assumption was found to be valid in this study, it must be carefully applied as the results here do not definitively prove applicability in all sampling situations.
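The comparison against computer-generated Poisson time series can be sketched for one of the five metrics, the coefficient of variation. This is a minimal illustration under stated assumptions, not the authors' code: the "range of 80%" is read here as the central 10th to 90th percentile band of the simulated values, the mean count and series length are hypothetical, and the Poisson sampler is Knuth's multiplication method, which is only practical for small means.

```python
import math
import random
import statistics


def poisson_draw(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def coeff_var(counts):
    """Coefficient of variation: standard deviation divided by mean."""
    return statistics.pstdev(counts) / statistics.fmean(counts)


def poisson_cv_band(mean, length, n_series=1000, seed=0):
    """Central 80% band of the coefficient of variation over n_series
    simulated Poisson time series of the given mean and length."""
    rng = random.Random(seed)
    cvs = sorted(
        coeff_var([poisson_draw(mean, rng) for _ in range(length)])
        for _ in range(n_series)
    )
    return cvs[int(0.10 * n_series)], cvs[int(0.90 * n_series) - 1]


if __name__ == "__main__":
    low, high = poisson_cv_band(mean=5.0, length=200)
    print(f"central 80% band of simulated CV: {low:.3f} to {high:.3f}")
    print(f"theoretical Poisson CV: {1 / math.sqrt(5.0):.3f}")
```

A measured count series with the same mean and length would be judged consistent with a Poisson process, for this metric only, if its coefficient of variation fell inside the simulated band.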
Hirleman, E. D.; Oechsle, V.; Chigier, N. A.
1984-01-01
The response characteristics of laser diffraction particle sizing instruments were studied theoretically and experimentally. In particular, the extent of optical sample volume and the effects of receiving lens properties were investigated in detail. The experimental work was performed with a particle size analyzer using a calibration reticle containing a two-dimensional array of opaque circular disks on a glass substrate. The calibration slide simulated the forward-scattering characteristics of a Rosin-Rammler droplet size distribution. The reticle was analyzed with collection lenses of 63 mm, 100 mm, and 300 mm focal lengths using scattering inversion software that determined best-fit Rosin-Rammler size distribution parameters. The data differed from the predicted response for the reticle by about 10 percent. A set of calibration factors for the detector elements was determined that corrected for the nonideal response of the instrument. The response of the instrument was also measured as a function of reticle position, and the results confirmed a theoretical optical sample volume model presented here.
Živković, Daniel; Steinrücken, Matthias; Song, Yun S.; Stephan, Wolfgang
2015-01-01
Advances in empirical population genetics have made apparent the need for models that simultaneously account for selection and demography. To address this need, we here study the Wright–Fisher diffusion under selection and variable effective population size. In the case of genic selection and piecewise-constant effective population sizes, we obtain the transition density by extending a recently developed method for computing an accurate spectral representation for a constant population size. Utilizing this extension, we show how to compute the sample frequency spectrum in the presence of genic selection and an arbitrary number of instantaneous changes in the effective population size. We also develop an alternate, efficient algorithm for computing the sample frequency spectrum using a moment-based approach. We apply these methods to answer the following questions: If neutrality is incorrectly assumed when there is selection, what effects does it have on demographic parameter estimation? Can the impact of negative selection be observed in populations that undergo strong exponential growth? PMID:25873633
Type-II generalized family-wise error rate formulas with application to sample size determination.
Delorme, Phillipe; de Micheaux, Pierre Lafaye; Liquet, Benoit; Riou, Jérémie
2016-07-20
Multiple endpoints are increasingly used in clinical trials. The significance of some of these clinical trials is established if at least r null hypotheses are rejected among m that are simultaneously tested. The usual approach in multiple hypothesis testing is to control the family-wise error rate, which is defined as the probability that at least one type-I error is made. More recently, the q-generalized family-wise error rate has been introduced to control the probability of making at least q false rejections. For procedures controlling this global type-I error rate, we define a type-II r-generalized family-wise error rate, which is directly related to the r-power defined as the probability of rejecting at least r false null hypotheses. We obtain very general power formulas that can be used to compute the sample size for single-step and step-wise procedures. These are implemented in our R package rPowerSampleSize available on the CRAN, making them directly available to end users. Complexities of the formulas are presented to gain insight into computation time issues. Comparison with Monte Carlo strategy is also presented. We compute sample sizes for two clinical trials involving multiple endpoints: one designed to investigate the effectiveness of a drug against acute heart failure and the other for the immunogenicity of a vaccine strategy against pneumococcus. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26914402
Jiang, Shengyu; Wang, Chun; Weiss, David J
2016-01-01
Likert types of rating scales in which a respondent chooses a response from an ordered set of response options are used to measure a wide variety of psychological, educational, and medical outcome variables. The most appropriate item response theory model for analyzing and scoring these instruments when they provide scores on multiple scales is the multidimensional graded response model (MGRM). A simulation study was conducted to investigate the variables that might affect item parameter recovery for the MGRM. Data were generated based on different sample sizes, test lengths, and scale intercorrelations. Parameter estimates were obtained through the flexMIRT software. The quality of parameter recovery was assessed by the correlation between true and estimated parameters as well as bias and root-mean-square-error. Results indicated that for the vast majority of cases studied a sample size of N = 500 provided accurate parameter estimates, except for tests with 240 items when 1000 examinees were necessary to obtain accurate parameter estimates. Increasing sample size beyond N = 1000 did not increase the accuracy of MGRM parameter estimates. PMID:26903916