Blinded sample size re-estimation in three-arm trials with 'gold standard' design.
Mütze, Tobias; Friede, Tim
2017-10-15
In this article, we study blinded sample size re-estimation in the 'gold standard' design with internal pilot study for normally distributed outcomes. The 'gold standard' design is a three-arm clinical trial design that includes an active and a placebo control in addition to an experimental treatment. We focus on the absolute margin approach to hypothesis testing in three-arm trials, in which the non-inferiority of the experimental treatment and the assay sensitivity are assessed by pairwise comparisons. We compare several blinded sample size re-estimation procedures in a simulation study assessing operating characteristics including power and type I error. We find that sample size re-estimation based on the popular one-sample variance estimator results in overpowered trials. Moreover, sample size re-estimation based on unbiased variance estimators such as the Xing-Ganju variance estimator results in underpowered trials, as expected, because an overestimation of the variance and thus the sample size is in general required for the re-estimation procedure to eventually meet the target power. To overcome this problem, we propose an inflation factor for the sample size re-estimation with the Xing-Ganju variance estimator and show that this approach results in adequately powered trials. Because of favorable features of the Xing-Ganju variance estimator such as unbiasedness and a distribution independent of the group means, the inflation factor does not depend on the nuisance parameter and, therefore, can be calculated prior to a trial. Moreover, we prove that the sample size re-estimation based on the Xing-Ganju variance estimator does not bias the effect estimate. Copyright © 2017 John Wiley & Sons, Ltd.
Distribution of the two-sample t-test statistic following blinded sample size re-estimation.
Lu, Kaifeng
2016-05-01
We consider the blinded sample size re-estimation based on the simple one-sample variance estimator at an interim analysis. We characterize the exact distribution of the standard two-sample t-test statistic at the final analysis. We describe a simulation algorithm for the evaluation of the probability of rejecting the null hypothesis at a given treatment effect. We compare the blinded sample size re-estimation method with two unblinded methods with respect to the empirical type I error, the empirical power, and the empirical distribution of the standard deviation estimator and final sample size. We characterize the type I error inflation across the range of standardized non-inferiority margin for non-inferiority trials, and derive the adjusted significance level to ensure type I error control for a given sample size of the internal pilot study. We show that the adjusted significance level increases as the sample size of the internal pilot study increases. Copyright © 2016 John Wiley & Sons, Ltd.
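The blinded re-estimation step described above can be illustrated with a minimal sketch. It assumes a two-sample comparison of means, the usual normal-approximation sample size formula, and hypothetical names (`blinded_reestimated_n_per_group`, `pooled_interim`, `delta`); it is not the exact t-distribution treatment of the abstract, and it does not apply any adjusted significance level.

```python
from math import ceil
from statistics import NormalDist, variance


def blinded_reestimated_n_per_group(pooled_interim, delta, alpha=0.025, power=0.9):
    """Per-group sample size recomputed from blinded interim data.

    pooled_interim: interim outcomes from both arms, treatment labels unknown.
    delta: assumed treatment difference (a planning assumption, not estimated).
    """
    # One-sample variance estimator: treat the pooled data as a single sample.
    s2 = variance(pooled_interim)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    # Normal-approximation formula for comparing two means with equal allocation.
    return ceil(2 * s2 * (z_alpha + z_beta) ** 2 / delta ** 2)
```

In practice the recomputed value would typically be bounded below by the number of patients already recruited; that rule is left out here to keep the sketch short.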
Re-estimating sample size in cluster randomised trials with active recruitment within clusters.
van Schie, S; Moerbeek, M
2014-08-30
Often only a limited number of clusters can be obtained in cluster randomised trials, although many potential participants can be recruited within each cluster. Thus, active recruitment is feasible within the clusters. To obtain an efficient sample size in a cluster randomised trial, the cluster level and individual level variance should be known before the study starts, but this is often not the case. We suggest using an internal pilot study design to address this problem of unknown variances. A pilot can be useful to re-estimate the variances and re-calculate the sample size during the trial. Using simulated data, it is shown that an initially low or high power can be adjusted using an internal pilot with the type I error rate remaining within an acceptable range. The intracluster correlation coefficient can be re-estimated with more precision, which has a positive effect on the sample size. We conclude that an internal pilot study design may be used if active recruitment is feasible within a limited number of clusters. Copyright © 2014 John Wiley & Sons, Ltd.
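As a rough illustration of re-calculating the within-cluster sample size once the intracluster correlation has been re-estimated, the sketch below uses the standard design effect 1 + (m - 1)ρ for equal cluster sizes. The function name and arguments are hypothetical, and the formula is a textbook assumption rather than the authors' procedure.

```python
from math import ceil


def individuals_per_cluster(n_individual, n_clusters, icc):
    """Smallest cluster size m giving the information of n_individual
    independent subjects per arm, for a fixed number of clusters per arm.

    Solves n_clusters * m / (1 + (m - 1) * icc) >= n_individual for m.
    Returns None when no cluster size can reach the target.
    """
    denominator = n_clusters - n_individual * icc
    if denominator <= 0:
        # Adding subjects within clusters cannot compensate for too few clusters.
        return None
    return ceil(n_individual * (1 - icc) / denominator)
```

The `None` branch reflects the known limitation that, with few clusters and a large intracluster correlation, recruiting more participants per cluster cannot recover the target power.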
Silverman, Rachel K; Ivanova, Anastasia
2017-01-01
Sequential parallel comparison design (SPCD) was proposed to reduce placebo response in a randomized trial with placebo comparator. Subjects are randomized between placebo and drug in stage 1 of the trial, and then, placebo non-responders are re-randomized in stage 2. Efficacy analysis includes all data from stage 1 and all placebo non-responding subjects from stage 2. This article investigates the possibility to re-estimate the sample size and adjust the design parameters, allocation proportion to placebo in stage 1 of SPCD, and weight of stage 1 data in the overall efficacy test statistic during an interim analysis.
Liu, Jingxia; Colditz, Graham A
2018-05-01
There is growing interest in conducting cluster randomized trials (CRTs). For simplicity in sample size calculation, the cluster sizes are assumed to be identical across all clusters. However, equal cluster sizes are not guaranteed in practice. Therefore, the relative efficiency (RE) of unequal versus equal cluster sizes has been investigated when testing the treatment effect. One of the most important approaches to analyze a set of correlated data is the generalized estimating equation (GEE) proposed by Liang and Zeger, in which the "working correlation structure" is introduced and the association pattern depends on a vector of association parameters denoted by ρ. In this paper, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in CRTs. The variances of the estimator of the treatment effect are derived for the different types of outcome. RE is defined as the ratio of the variance of the estimator of the treatment effect for equal cluster sizes to that for unequal cluster sizes. We discuss a commonly used structure in CRTs, the exchangeable structure, and derive a simpler formula for RE with continuous, binary, and count outcomes. Finally, REs are investigated for several scenarios of cluster size distributions through simulation studies. We propose an adjusted sample size to account for the efficiency loss. Additionally, we propose an optimal sample size estimation based on the GEE models under a fixed budget for known and unknown association parameter (ρ) in the working correlation structure within the cluster. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
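The RE definition above can be sketched for the simplest case. The code assumes a continuous outcome with an exchangeable correlation ρ, where each cluster of size n contributes information proportional to n / (1 + (n - 1)ρ); this is the commonly cited form and may differ from the formulas derived in the paper, especially for binary and count outcomes. The function name is hypothetical.

```python
def relative_efficiency(cluster_sizes, rho):
    """RE = Var(equal cluster sizes) / Var(unequal cluster sizes),
    holding the number of clusters and the total sample size fixed."""
    k = len(cluster_sizes)
    mean_size = sum(cluster_sizes) / k
    # Variance is inversely proportional to the summed cluster information.
    info_unequal = sum(n / (1 + (n - 1) * rho) for n in cluster_sizes)
    info_equal = k * mean_size / (1 + (mean_size - 1) * rho)
    return info_unequal / info_equal
```

Under this assumption RE is at most 1, so dividing a planned number of clusters by RE gives one simple way to compensate for the efficiency loss.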
Optimal flexible sample size design with robust power.
Zhang, Lanju; Cui, Lu; Yang, Bo
2016-08-30
It is well recognized that sample size determination is challenging because of the uncertainty on the treatment effect size. Several remedies are available in the literature. Group sequential designs start with a sample size based on a conservative (smaller) effect size and allow early stopping at interim looks. Sample size re-estimation designs start with a sample size based on an optimistic (larger) effect size and allow a sample size increase if the observed effect size is smaller than planned. Different opinions favoring one type over the other exist. We propose an optimal approach using an appropriate optimality criterion to select the best design among all the candidate designs. Our results show that (1) for the same type of designs, for example, group sequential designs, there is room for significant improvement through our optimization approach; (2) optimal promising zone designs appear to have no advantages over optimal group sequential designs; and (3) optimal designs with sample size re-estimation deliver the best adaptive performance. We conclude that to deal with the challenge of sample size determination due to effect size uncertainty, an optimal approach can help to select the best design that provides the most robust power across the effect size range of interest. Copyright © 2016 John Wiley & Sons, Ltd.
McClure, Leslie A; Szychowski, Jeff M; Benavente, Oscar; Hart, Robert G; Coffey, Christopher S
2016-10-01
The use of adaptive designs has been increasing in randomized clinical trials. Sample size re-estimation is a type of adaptation in which nuisance parameters are estimated at an interim point in the trial and the sample size re-computed based on these estimates. The Secondary Prevention of Small Subcortical Strokes study was a randomized clinical trial assessing the impact of single- versus dual-antiplatelet therapy and control of systolic blood pressure to a higher (130-149 mmHg) versus lower (<130 mmHg) target on recurrent stroke risk in a two-by-two factorial design. A sample size re-estimation was performed during the Secondary Prevention of Small Subcortical Strokes study resulting in an increase in the planned sample size from 2500 to 3020, and we sought to determine the impact of the sample size re-estimation on the study results. We assessed the results of the primary efficacy and safety analyses with the full 3020 patients and compared them to the results that would have been observed had randomization ended with 2500 patients. The primary efficacy outcome considered was recurrent stroke, and the primary safety outcomes were major bleeds and death. We computed incidence rates for the efficacy and safety outcomes and used Cox proportional hazards models to examine the hazard ratios for each of the two treatment interventions (i.e. the antiplatelet and blood pressure interventions). In the antiplatelet intervention, the hazard ratio was not materially modified by increasing the sample size, nor did the conclusions regarding the efficacy of mono- versus dual-therapy change: there was no difference in the effect of dual- versus monotherapy on the risk of recurrent stroke (n = 3020 HR (95% confidence interval): 0.92 (0.72, 1.2), p = 0.48; n = 2500 HR (95% confidence interval): 1.0 (0.78, 1.3), p = 0.85).
With respect to the blood pressure intervention, increasing the sample size resulted in less certainty in the results, as the hazard ratio for higher versus lower systolic blood pressure target approached, but did not achieve, statistical significance with the larger sample (n = 3020 HR (95% confidence interval): 0.81 (0.63, 1.0), p = 0.089; n = 2500 HR (95% confidence interval): 0.89 (0.68, 1.17), p = 0.40). The results from the safety analyses were similar with 3020 and 2500 patients for both study interventions. Other trial-related factors, such as contracts, finances, and study management, were impacted as well. Adaptive designs can have benefits in randomized clinical trials, but do not always result in significant findings. The impact of adaptive designs should be measured in terms of both trial results and practical issues related to trial management. More post hoc analyses of study adaptations will lead to better understanding of the balance between the benefits and the costs. © The Author(s) 2016.
Pritchett, Yili; Jemiai, Yannis; Chang, Yuchiao; Bhan, Ishir; Agarwal, Rajiv; Zoccali, Carmine; Wanner, Christoph; Lloyd-Jones, Donald; Cannata-Andía, Jorge B; Thompson, Taylor; Appelbaum, Evan; Audhya, Paul; Andress, Dennis; Zhang, Wuyan; Solomon, Scott; Manning, Warren J; Thadhani, Ravi
2011-04-01
Chronic kidney disease is associated with a marked increase in risk for left ventricular hypertrophy and cardiovascular mortality compared with the general population. Therapy with vitamin D receptor activators has been linked with reduced mortality in chronic kidney disease and an improvement in left ventricular hypertrophy in animal studies. PRIMO (Paricalcitol capsules benefits in Renal failure Induced cardiac MOrbidity) is a multinational, multicenter randomized controlled trial to assess the effects of paricalcitol (a selective vitamin D receptor activator) on mild to moderate left ventricular hypertrophy in patients with chronic kidney disease. Subjects with mild-moderate chronic kidney disease are randomized to paricalcitol or placebo after confirming left ventricular hypertrophy using a cardiac echocardiogram. Cardiac magnetic resonance imaging is then used to assess left ventricular mass index at baseline, 24 and 48 weeks, which is the primary efficacy endpoint of the study. Because of limited prior data to estimate sample size, a maximum information group sequential design with sample size re-estimation is implemented to allow sample size adjustment based on the nuisance parameter estimated using the interim data. An interim efficacy analysis is planned at a pre-specified time point conditioned on the status of enrollment. The decision to increase sample size depends on the observed treatment effect. A repeated measures analysis model using available data at Week 24 and 48, with a backup ANCOVA model analyzing change from baseline to the final nonmissing observation, is pre-specified to evaluate the treatment effect. A gamma-family spending function is employed to control the family-wise Type I error rate, as stopping for success is planned in the interim efficacy analysis.
If enrollment is slower than anticipated, the smaller sample size used in the interim efficacy analysis and the greater percent of missing week 48 data might decrease the parameter estimation accuracy, either for the nuisance parameter or for the treatment effect, which might in turn affect the interim decision-making. The application of combining a group sequential design with a sample-size re-estimation in clinical trial design has the potential to improve efficiency and to increase the probability of trial success while ensuring integrity of the study.
New Estimates of Rhenium in the Crust: Implications for Mantle Re-Os Budgets
NASA Astrophysics Data System (ADS)
Bennett, V. C.; Sun, W.
2002-12-01
The 187Re-187Os isotopic system has provided a new probe of mantle chemical structure with, for example, now numerous studies balancing estimates of the Os isotopic compositions of the upper modern mantle with sizes and ages of proposed conjugate reservoirs stored within the deep mantle. This style of modeling is dependent upon estimates of the parent Re in the various reservoirs including total crust, upper mantle, MORB and ocean island basalts. New laser ICP-MS in situ and ID whole rock results from OIB, arc and back-arc basalts suggest Re concentrations in oceanic and crustal domains may have been greatly underestimated. For example Hawaiian OIBs show a clear distinction between subaerial and submarine erupted samples with the latter having Re much closer to the higher MORB estimates (1) than to previous OIB estimates. This difference has been attributed to Re volatility and loss during syn- and post-eruption degassing of subaerial samples. Recent work has produced similar results for submarine arc samples using both dredged glasses and melt inclusions in olivines from primitive basalts. Both have much higher average Re (ca. 1.5 and 3.4 ppb; 2,3) than literature values for arcs (ca. 0.30ppb) determined largely from sub-aerial samples, or for average crust estimated from loess (0.2 ppb; 4). If the undegassed arc samples are representative, then the total crust may have more than 5 times the Re previously estimated. Re lost during arc eruptions may ultimately be concentrated in anoxic seafloor sediments. Prior under-estimates may be linked to the extremely heterogeneous concentration (> 5 orders of magnitude) of the chalcophile, redox sensitive Re in crustal environments. If the residence time of high Re in the crust is long (>1 Ga) then, 1) much smaller reservoirs of stored Re in the deep mantle are required to balance Re depletions in the upper mantle, and 2) significant portions of the upper mantle are likely Re depleted. 
Alternatively, Re may be rapidly recycled in oceanic sediments (short residence time), resulting in a smaller effect on Re-Os budgets but creating areas of extreme Re heterogeneity in the upper mantle. Refs: 1. Bennett, Norman and Garcia, EPSL 2000. 2. Sun et al. (in press, Chemical Geology) 3. Sun et al. (submitted). 4. Peucker-Ehrenbrink and Jahn, G3, 2001.
Re-use of pilot data and interim analysis of pivotal data in MRMC studies: a simulation study
NASA Astrophysics Data System (ADS)
Chen, Weijie; Samuelson, Frank; Sahiner, Berkman; Petrick, Nicholas
2017-03-01
Novel medical imaging devices are often evaluated with multi-reader multi-case (MRMC) studies in which radiologists read images of patient cases for a specified clinical task (e.g., cancer detection). A pilot study is often used to measure the effect size and variance parameters that are necessary for sizing a pivotal study (including sizing readers, non-diseased and diseased cases). Due to the practical difficulty of collecting patient cases or recruiting clinical readers, some investigators attempt to include the pilot data as part of their pivotal study. In other situations, some investigators attempt to perform an interim analysis of their pivotal study data based upon which the sample sizes may be re-estimated. Re-use of the pilot data or interim analyses of the pivotal data may inflate the type I error of the pivotal study. In this work, we use the Roe and Metz model to simulate MRMC data under the null hypothesis (i.e., two devices have equal diagnostic performance) and investigate the type I error rate for several practical designs involving re-use of pilot data or interim analysis of pivotal data. Our preliminary simulation results indicate that, under the simulation conditions we investigated, the inflation of type I error is none or only marginal for some design strategies (e.g., re-use of patient data without re-using readers, and size re-estimation without using the effect-size estimated in the interim analysis). Upon further verifications, these are potentially useful design methods in that they may help make a study less burdensome and have a better chance to succeed without substantial loss of the statistical rigor.
The endothelial sample size analysis in corneal specular microscopy clinical examinations.
Abib, Fernando C; Holzchuh, Ricardo; Schaefer, Artur; Schaefer, Tania; Godois, Ronialci
2012-05-01
To evaluate endothelial cell sample size and statistical error in corneal specular microscopy (CSM) examinations. One hundred twenty examinations were conducted with 4 types of corneal specular microscopes: 30 with each BioOptics, CSO, Konan, and Topcon corneal specular microscopes. All endothelial image data were analyzed by respective instrument software and also by the Cells Analyzer software with a method developed in our lab. A reliability degree (RD) of 95% and a relative error (RE) of 0.05 were used as cut-off values to analyze images of the counted endothelial cells called samples. The sample size mean was the number of cells evaluated on the images obtained with each device. Only examinations with RE < 0.05 were considered statistically correct and suitable for comparisons with future examinations. The Cells Analyzer software was used to calculate the RE and customized sample size for all examinations. Bio-Optics: sample size, 97 ± 22 cells; RE, 6.52 ± 0.86; only 10% of the examinations had sufficient endothelial cell quantity (RE < 0.05); customized sample size, 162 ± 34 cells. CSO: sample size, 110 ± 20 cells; RE, 5.98 ± 0.98; only 16.6% of the examinations had sufficient endothelial cell quantity (RE < 0.05); customized sample size, 157 ± 45 cells. Konan: sample size, 80 ± 27 cells; RE, 10.6 ± 3.67; none of the examinations had sufficient endothelial cell quantity (RE > 0.05); customized sample size, 336 ± 131 cells. Topcon: sample size, 87 ± 17 cells; RE, 10.1 ± 2.52; none of the examinations had sufficient endothelial cell quantity (RE > 0.05); customized sample size, 382 ± 159 cells. A very high number of CSM examinations had sample errors based on Cells Analyzer software. The endothelial sample size (examinations) needs to include more cells to be reliable and reproducible. The Cells Analyzer tutorial routine will be useful for CSM examination reliability and reproducibility.
Estimating parasitic sea lamprey abundance in Lake Huron from heterogenous data sources
Young, Robert J.; Jones, Michael L.; Bence, James R.; McDonald, Rodney B.; Mullett, Katherine M.; Bergstedt, Roger A.
2003-01-01
The Great Lakes Fishery Commission uses time series of transformer, parasitic, and spawning population estimates to evaluate the effectiveness of its sea lamprey (Petromyzon marinus) control program. This study used an inverse variance weighting method to integrate Lake Huron sea lamprey population estimates derived from two estimation procedures: 1) prediction of the lake-wide spawning population from a regression model based on stream size and, 2) whole-lake mark and recapture estimates. In addition, we used a re-sampling procedure to evaluate the effect of trading off sampling effort between the regression and mark-recapture models. Population estimates derived from the regression model ranged from 132,000 to 377,000 while mark-recapture estimates of marked recently metamorphosed juveniles and parasitic sea lampreys ranged from 536,000 to 634,000 and 484,000 to 1,608,000, respectively. The precision of the estimates varied greatly among estimation procedures and years. The integrated estimate of the mark-recapture and spawner regression procedures ranged from 252,000 to 702,000 transformers. The re-sampling procedure indicated that the regression model is more sensitive to reduction in sampling effort than the mark-recapture model. Reliance on either the regression or mark-recapture model alone could produce misleading estimates of abundance of sea lampreys and the effect of the control program on sea lamprey abundance. These analyses indicate that the precision of the lakewide population estimate can be maximized by re-allocating sampling effort from marking sea lampreys to trapping additional streams.
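Inverse variance weighting, the integration method named above, can be sketched in a few lines. The function name and inputs are hypothetical, and the sketch assumes the individual estimates are independent and unbiased; it does not reproduce the study's regression or mark-recapture models.

```python
def inverse_variance_combine(estimates, variances):
    """Combine independent estimates of the same quantity.

    Each estimate is weighted by the reciprocal of its variance, so the
    more precise procedure dominates the integrated estimate.
    Returns (combined_estimate, combined_variance).
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    return combined, 1.0 / total_weight
```

The combined variance is never larger than the smallest input variance, which is why integrating the two procedures can improve precision over relying on either one alone.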
NASA Astrophysics Data System (ADS)
Lai, Xiaoming; Zhu, Qing; Zhou, Zhiwen; Liao, Kaihua
2017-12-01
In this study, seven random combination sampling strategies were applied to investigate the uncertainties in estimating the hillslope mean soil water content (SWC) and correlation coefficients between the SWC and soil/terrain properties on a tea + bamboo hillslope. One of the sampling strategies is global random sampling and the other six are stratified random sampling on the top, middle, toe, top + mid, top + toe and mid + toe slope positions. When each sampling strategy was applied, sample sizes were gradually reduced and each sample size contained 3000 replicates. Under each sample size of each sampling strategy, the relative errors (REs) and coefficients of variation (CVs) of the estimated hillslope mean SWC and correlation coefficients between the SWC and soil/terrain properties were calculated to quantify the accuracy and uncertainty. The results showed that the uncertainty of the estimations decreased as the sample size increased. However, larger sample sizes were required to reduce the uncertainty in correlation coefficient estimation than in hillslope mean SWC estimation. Under global random sampling, 12 randomly sampled sites on this hillslope were adequate to estimate the hillslope mean SWC with RE and CV ≤10%. However, at least 72 randomly sampled sites were needed to ensure the estimated correlation coefficients with REs and CVs ≤10%. Compared with the other sampling strategies, reducing sampling sites on the middle slope had the least influence on the estimation of hillslope mean SWC and correlation coefficients. Under this strategy, 60 sites (10 on the middle slope and 50 on the top and toe slopes) were enough to ensure the estimated correlation coefficients with REs and CVs ≤10%. This suggested that when designing the SWC sampling, the proportion of sites on the middle slope can be reduced to 16.7% of the total number of sites. Findings of this study will be useful for the optimal SWC sampling design.
An evaluation of population index and estimation techniques for tadpoles in desert pools
Jung, Robin E.; Dayton, Gage H.; Williamson, Stephen J.; Sauer, John R.; Droege, Sam
2002-01-01
Using visual (VI) and dip net indices (DI) and double-observer (DOE), removal (RE), and neutral red dye capture-recapture (CRE) estimates, we counted, estimated, and censused Couch's spadefoot (Scaphiopus couchii) and canyon treefrog (Hyla arenicolor) tadpole populations in Big Bend National Park, Texas. Initial dye experiments helped us determine appropriate dye concentrations and exposure times to use in mesocosm and field trials. The mesocosm study revealed higher tadpole detection rates, more accurate population estimates, and lower coefficients of variation among pools compared to those from the field study. In both mesocosm and field studies, CRE was the best method for estimating tadpole populations, followed by DOE and RE. In the field, RE, DI, and VI often underestimated populations in pools with higher tadpole numbers. DI improved with increased sampling. Larger pools supported larger tadpole populations, and tadpole detection rates in general decreased with increasing pool volume and surface area. Hence, pool size influenced bias in tadpole sampling. Across all techniques, tadpole detection rates differed among pools, indicating that sampling bias was inherent and techniques did not consistently sample the same proportion of tadpoles in each pool. Estimating bias (i.e., calculating detection rates) therefore was essential in assessing tadpole abundance. Unlike VI and DOE, DI, RE, and CRE could be used in turbid waters in which tadpoles are not visible. The tadpole population estimates we used accommodated differences in detection probabilities in simple desert pool environments but may not work in more complex habitats.
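The abstract does not state which capture-recapture estimator was used. As an illustration only, the sketch below shows the classical two-sample Lincoln-Petersen estimator and Chapman's bias-corrected variant, with hypothetical function and argument names.

```python
def lincoln_petersen(marked, caught, recaptured):
    """Two-sample estimate N = M * C / R.

    marked: animals marked (e.g. dyed) and released in the first sample (M).
    caught: animals caught in the second sample (C).
    recaptured: marked animals found in the second sample (R); must be > 0.
    """
    return marked * caught / recaptured


def chapman(marked, caught, recaptured):
    """Chapman's variant, which is less biased for small samples and
    remains defined when no marked animals are recaptured."""
    return (marked + 1) * (caught + 1) / (recaptured + 1) - 1
```

For example, marking 50 tadpoles and later catching 40, of which 10 are marked, gives a Lincoln-Petersen estimate of 200.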
Robustness of survival estimates for radio-marked animals
Bunck, C.M.; Chen, C.-L.
1992-01-01
Telemetry techniques are often used to study the survival of birds and mammals, particularly when mark-recapture approaches are unsuitable. Both parametric and nonparametric methods to estimate survival have been developed or modified from other applications. An implicit assumption in these approaches is that the probability of re-locating an animal with a functioning transmitter is one. A Monte Carlo study was conducted to determine the bias and variance of the Kaplan-Meier estimator and an estimator based also on the assumption of constant hazard and to evaluate the performance of the two-sample tests associated with each. Modifications of each estimator which allow a re-location probability of less than one are described and evaluated. Generally, the unmodified estimators were biased but had lower variance. At low sample sizes all estimators performed poorly. Under the null hypothesis, the distribution of all test statistics reasonably approximated the null distribution when survival was low but not when it was high. The power of the two-sample tests was similar.
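For reference, the unmodified Kaplan-Meier estimator mentioned above can be sketched as follows. This is the standard product-limit form, which assumes every animal with a working transmitter is re-located with probability one; the modified estimators of the abstract are not reproduced, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: time of death or censoring for each animal.
    events: True if the death was observed, False if censored
            (e.g. transmitter failure or end of study).
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all animals that die or are censored at the same time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Censored animals reduce the risk set without changing the survival estimate at their censoring time.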
Kahan, Brennan C
2016-12-13
Patient recruitment in clinical trials is often challenging, and as a result, many trials are stopped early due to insufficient recruitment. The re-randomization design allows patients to be re-enrolled and re-randomized for each new treatment episode that they experience. Because it allows multiple enrollments for each patient, this design has been proposed as a way to increase the recruitment rate in clinical trials. However, it is unknown to what extent recruitment could be increased in practice. We modelled the expected recruitment rate for parallel-group and re-randomization trials in different settings based on estimates from real trials and datasets. We considered three clinical areas: in vitro fertilization, severe asthma exacerbations, and acute sickle cell pain crises. We compared the two designs in terms of the expected time to complete recruitment, and the sample size recruited over a fixed recruitment period. Across the different scenarios we considered, we estimated that re-randomization could reduce the expected time to complete recruitment by between 4 and 22 months (relative reductions of 19% and 45%), or increase the sample size recruited over a fixed recruitment period by between 29% and 171%. Re-randomization can increase recruitment most for trials with a short follow-up period, a long trial recruitment duration, and patients with high rates of treatment episodes. Re-randomization has the potential to increase the recruitment rate in certain settings, and could lead to quicker and more efficient trials in these scenarios.
Kunz, Cornelia U; Stallard, Nigel; Parsons, Nicholas; Todd, Susan; Friede, Tim
2017-03-01
Regulatory authorities require that the sample size of a confirmatory trial is calculated prior to the start of the trial. However, the sample size quite often depends on parameters that might not be known in advance of the study. Misspecification of these parameters can lead to under- or overestimation of the sample size. Both situations are unfavourable as the first one decreases the power and the latter one leads to a waste of resources. Hence, designs have been suggested that allow a re-assessment of the sample size in an ongoing trial. These methods usually focus on estimating the variance. However, for some methods the performance depends not only on the variance but also on the correlation between measurements. We develop and compare different methods for blinded estimation of the correlation coefficient that are less likely to introduce operational bias when the blinding is maintained. Their performance with respect to bias and standard error is compared to the unblinded estimator. We simulated two different settings: one assuming that all group means are the same and one assuming that different groups have different means. Simulation results show that the naïve (one-sample) estimator is only slightly biased and has a standard error comparable to that of the unblinded estimator. However, if the group means differ, other estimators have better performance depending on the sample size per group and the number of groups. © 2016 The Authors. Biometrical Journal Published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Kalman filter approach for uncertainty quantification in time-resolved laser-induced incandescence.
Hadwin, Paul J; Sipkens, Timothy A; Thomson, Kevin A; Liu, Fengshan; Daun, Kyle J
2018-03-01
Time-resolved laser-induced incandescence (TiRe-LII) data can be used to infer spatially and temporally resolved volume fractions and primary particle size distributions of soot-laden aerosols, but these estimates are corrupted by measurement noise as well as uncertainties in the spectroscopic and heat transfer submodels used to interpret the data. Estimates of the temperature, concentration, and size distribution of soot primary particles within a sample aerosol are typically made by nonlinear regression of modeled spectral incandescence decay, or effective temperature decay, to experimental data. In this work, we employ nonstationary Bayesian estimation techniques to infer aerosol properties from simulated and experimental LII signals, specifically the extended Kalman filter and Schmidt-Kalman filter. These techniques exploit the time-varying nature of both the measurements and the models, and they reveal how uncertainty in the estimates computed from TiRe-LII data evolves over time. Both techniques perform better when compared with standard deterministic estimates; however, we demonstrate that the Schmidt-Kalman filter produces more realistic uncertainty estimates.
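To show how a Kalman-type filter tracks both an estimate and its uncertainty over time, here is a minimal scalar linear Kalman filter. It is a simplified stand-in, not the extended or Schmidt-Kalman filter used in the work: the state is assumed to follow a random walk and to be measured directly, and all names are hypothetical.

```python
def kalman_1d(measurements, x0, p0, process_var, meas_var):
    """Scalar Kalman filter for a random-walk state observed with noise.

    x0, p0: initial state estimate and its variance.
    process_var: variance added to the state at each time step.
    meas_var: variance of the measurement noise.
    Returns a list of (estimate, variance) after each measurement.
    """
    x, p = x0, p0
    history = []
    for z in measurements:
        # Predict: the state is unchanged on average, uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1 - gain) * p
        history.append((x, p))
    return history
```

The second element of each tuple is the posterior variance, which is the quantity that shows how uncertainty in the estimate evolves as more of the signal is processed.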
Information for forest process models: a review of NRS-FIA vegetation measurements
Charles D. Canham; William H. McWilliams
2012-01-01
The Forest Inventory and Analysis Program of the Northern Research Station (NRS-FIA) has re-designed Phase 3 measurements and increased the sampling intensity following a study to balance costs, utility, and sample size. The sampling scheme consists of estimating canopy-cover percent for six vegetation growth habits on 24-foot-radius subplots in four height classes and as an...
Robustness of methods for blinded sample size re-estimation with overdispersed count data.
Schneider, Simon; Schmidli, Heinz; Friede, Tim
2013-09-20
Counts of events are increasingly common as primary endpoints in randomized clinical trials. With between-patient heterogeneity leading to variances in excess of the mean (referred to as overdispersion), statistical models reflecting this heterogeneity by mixtures of Poisson distributions are frequently employed. Sample size calculation in the planning of such trials requires knowledge on the nuisance parameters, that is, the control (or overall) event rate and the overdispersion parameter. Usually, there is only little prior knowledge regarding these parameters in the design phase resulting in considerable uncertainty regarding the sample size. In this situation internal pilot studies have been found very useful and very recently several blinded procedures for sample size re-estimation have been proposed for overdispersed count data, one of which is based on an EM-algorithm. In this paper we investigate the EM-algorithm based procedure with respect to aspects of their implementation by studying the algorithm's dependence on the choice of convergence criterion and find that the procedure is sensitive to the choice of the stopping criterion in scenarios relevant to clinical practice. We also compare the EM-based procedure to other competing procedures regarding their operating characteristics such as sample size distribution and power. Furthermore, the robustness of these procedures to deviations from the model assumptions is explored. We find that some of the procedures are robust to at least moderate deviations. The results are illustrated using data from the US National Heart, Lung and Blood Institute sponsored Asymptomatic Cardiac Ischemia Pilot study. Copyright © 2013 John Wiley & Sons, Ltd.
Ait Kaci Azzou, S; Larribe, F; Froda, S
2016-10-01
In Ait Kaci Azzou et al. (2015) we introduced an Importance Sampling (IS) approach for estimating the demographic history of a sample of DNA sequences, the skywis plot. More precisely, we proposed a new nonparametric estimate of a population size that changes over time. We showed on simulated data that the skywis plot can work well in typical situations where the effective population size does not undergo very steep changes. In this paper, we introduce an iterative procedure which extends the previous method and gives good estimates under such rapid variations. In the iterative calibrated skywis plot we approximate the effective population size by a piecewise constant function, whose values are re-estimated at each step. These piecewise constant functions are used to generate the waiting times of non homogeneous Poisson processes related to a coalescent process with mutation under a variable population size model. Moreover, the present IS procedure is based on a modified version of the Stephens and Donnelly (2000) proposal distribution. Finally, we apply the iterative calibrated skywis plot method to a simulated data set from a rapidly expanding exponential model, and we show that the method based on this new IS strategy correctly reconstructs the demographic history. Copyright © 2016. Published by Elsevier Inc.
A Portuguese value set for the SF-6D.
Ferreira, Lara N; Ferreira, Pedro L; Pereira, Luis N; Brazier, John; Rowen, Donna
2010-08-01
The SF-6D is a preference-based measure of health derived from the SF-36 that can be used for cost-effectiveness analysis using cost-per-quality adjusted life-year analysis. This study seeks to estimate a system weight for the SF-6D for Portugal and to compare the results with the UK system weights. A sample of 55 health states defined by the SF-6D has been valued by a representative random sample of the Portuguese population, stratified by sex and age (n = 140), using the Standard Gamble (SG). Several models are estimated at both the individual and aggregate levels for predicting health-state valuations. Models with main effects, with interaction effects and with the constant forced to unity are presented. Random effects (RE) models are estimated using generalized least squares (GLS) regressions. Generalized estimation equations (GEE) are used to estimate RE models with the constant forced to unity. Estimations at the individual level were performed using 630 health-state valuations. Alternative functional forms are considered to account for the skewed distribution of health-state valuations. The models are analyzed in terms of their coefficients, overall fit, and the ability to predict the SG values. The RE models estimated using GLS and through GEE produce significant coefficients, which are robust across model specification. However, there are concerns regarding some inconsistent estimates, and so parsimonious consistent models were estimated. There is evidence of underprediction in some states assigned to poor health. The results are consistent with the UK results. The models estimated provide preference-based quality of life weights for the Portuguese population when health status data have been collected using the SF-36. Although the sample was randomly drawn, findings should be treated with caution given the small sample size, even though they have been estimated at the individual level.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Man, Jun; Zhang, Jiangjiang; Li, Weixuan
2016-10-01
The ensemble Kalman filter (EnKF) has been widely used in parameter estimation for hydrological models. The focus of most previous studies was to develop more efficient analysis (estimation) algorithms. On the other hand, it is intuitively understandable that a well-designed sampling (data-collection) strategy should provide more informative measurements and subsequently improve the parameter estimation. In this work, a Sequential Ensemble-based Optimal Design (SEOD) method, coupled with EnKF, information theory and sequential optimal design, is proposed to improve the performance of parameter estimation. Based on the first-order and second-order statistics, different information metrics including the Shannon entropy difference (SD), degrees of freedom for signal (DFS) and relative entropy (RE) are used to design the optimal sampling strategy, respectively. The effectiveness of the proposed method is illustrated by synthetic one-dimensional and two-dimensional unsaturated flow case studies. It is shown that the designed sampling strategies can provide more accurate parameter estimation and state prediction compared with conventional sampling strategies. Optimal sampling designs based on various information metrics perform similarly in our cases. The effect of ensemble size on the optimal design is also investigated. Overall, larger ensemble size improves the parameter estimation and convergence of optimal sampling strategy. Although the proposed method is applied to unsaturated flow problems in this study, it can be equally applied in any other hydrological problems.
Methodological issues with adaptation of clinical trial design.
Hung, H M James; Wang, Sue-Jane; O'Neill, Robert T
2006-01-01
Adaptation of clinical trial design generates many issues that have not been resolved for practical applications, though statistical methodology has advanced greatly. This paper focuses on some methodological issues. In one type of adaptation such as sample size re-estimation, only the postulated value of a parameter for planning the trial size may be altered. In another type, the originally intended hypothesis for testing may be modified using the internal data accumulated at an interim time of the trial, such as changing the primary endpoint and dropping a treatment arm. For sample size re-estimation, we make a contrast between an adaptive test weighting the two-stage test statistics with the statistical information given by the original design and the original sample mean test with a properly corrected critical value. We point out the difficulty in planning a confirmatory trial based on the crude information generated by exploratory trials. In regards to selecting a primary endpoint, we argue that the selection process that allows switching from one endpoint to the other with the internal data of the trial is not very likely to gain a power advantage over the simple process of selecting one from the two endpoints by testing them with an equal split of alpha (Bonferroni adjustment). For dropping a treatment arm, distributing the remaining sample size of the discontinued arm to other treatment arms can substantially improve the statistical power of identifying a superior treatment arm in the design. A common difficult methodological issue is that of how to select an adaptation rule in the trial planning stage. Pre-specification of the adaptation rule is important for the practicality consideration. Changing the originally intended hypothesis for testing with the internal data generates great concerns to clinical trial researchers.
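Two of the procedures contrasted in this abstract can be sketched in a few lines. The first function shows one common form of a two-stage test whose weights are fixed by the originally planned information fractions; whether this matches the exact statistic studied by the authors is an assumption. The second shows an equal Bonferroni split of alpha across endpoints. All function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def weighted_two_stage_z(z1, z2, n1, n_planned):
    """Combine stage-wise z-statistics with weights fixed by the ORIGINAL design.

    w1 = n1 / n_planned is the planned information fraction at the interim.
    The weights stay the same even if the second-stage sample size is changed
    after the interim look (assumed form, not confirmed by the abstract).
    """
    w1 = n1 / n_planned
    return sqrt(w1) * z1 + sqrt(1.0 - w1) * z2

def one_sided_p(z):
    """One-sided p-value of a standard normal test statistic."""
    return 1.0 - NormalDist().cdf(z)

def bonferroni_rejects(p_values, alpha=0.05):
    """Test each endpoint at alpha divided equally among the endpoints."""
    per_test = alpha / len(p_values)
    return [p <= per_test for p in p_values]

# Hypothetical numbers: interim after 50 of 100 planned subjects per arm.
z_combined = weighted_two_stage_z(z1=2.0, z2=2.0, n1=50, n_planned=100)
decisions = bonferroni_rejects([0.01, 0.03], alpha=0.05)
```

Because the weights depend only on the planned design, the combined statistic remains standard normal under the null hypothesis regardless of how the second-stage size was chosen, which is the usual motivation for this kind of weighting.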
Leroux, Robin A; Dutton, Peter H; Abreu-Grobois, F Alberto; Lagueux, Cynthia J; Campbell, Cathi L; Delcroix, Eric; Chevalier, Johan; Horrocks, Julia A; Hillis-Starr, Zandy; Troëng, Sebastian; Harrison, Emma; Stapleton, Seth
2012-01-01
Management of the critically endangered hawksbill turtle in the Wider Caribbean (WC) has been hampered by knowledge gaps regarding stock structure. We carried out a comprehensive stock structure re-assessment of 11 WC hawksbill rookeries using longer mtDNA sequences, larger sample sizes (N = 647), and additional rookeries compared to previous surveys. Additional variation detected by 740 bp sequences between populations allowed us to differentiate populations such as Barbados-Windward and Guadeloupe (F (st) = 0.683, P < 0.05) that appeared genetically indistinguishable based on shorter 380 bp sequences. POWSIM analysis showed that longer sequences improved power to detect population structure and that when N < 30, increasing the variation detected was as effective in increasing power as increasing sample size. Geographic patterns of genetic variation suggest a model of periodic long-distance colonization coupled with region-wide dispersal and subsequent secondary contact within the WC. Mismatch analysis results for individual clades suggest a general population expansion in the WC following a historic bottleneck about 100 000-300 000 years ago. We estimated an effective female population size (N (ef)) of 6000-9000 for the WC, similar to the current estimated numbers of breeding females, highlighting the importance of these regional rookeries to maintaining genetic diversity in hawksbills. Our results provide a basis for standardizing future work to 740 bp sequence reads and establish a more complete baseline for determining stock boundaries in this migratory marine species. Finally, our findings illustrate the value of maintaining an archive of specimens for re-analysis as new markers become available.
Burgess, George H.; Bruce, Barry D.; Cailliet, Gregor M.; Goldman, Kenneth J.; Grubbs, R. Dean; Lowe, Christopher G.; MacNeil, M. Aaron; Mollet, Henry F.; Weng, Kevin C.; O'Sullivan, John B.
2014-01-01
White sharks are highly migratory and segregate by sex, age and size. Unlike marine mammals, they neither surface to breathe nor frequent haul-out sites, hindering generation of abundance data required to estimate population size. A recent tag-recapture study used photographic identifications of white sharks at two aggregation sites to estimate abundance in “central California” at 219 mature and sub-adult individuals. They concluded this represented approximately one-half of the total abundance of mature and sub-adult sharks in the entire eastern North Pacific Ocean (ENP). This low estimate generated great concern within the conservation community, prompting petitions for governmental endangered species designations. We critically examine that study and find violations of model assumptions that, when considered in total, lead to population underestimates. We also use a Bayesian mixture model to demonstrate that the inclusion of transient sharks, characteristic of white shark aggregation sites, would substantially increase abundance estimates for the adults and sub-adults in the surveyed sub-population. Using a dataset obtained from the same sampling locations and widely accepted demographic methodology, our analysis indicates a minimum all-life stages population size of >2000 individuals in the California subpopulation is required to account for the number and size range of individual sharks observed at the two sampled sites. Even accounting for methodological and conceptual biases, an extrapolation of these data to estimate the white shark population size throughout the ENP is inappropriate. The true ENP white shark population size is likely several-fold greater as both our study and the original published estimate exclude non-aggregating sharks and those that independently aggregate at other important ENP sites. 
Accurately estimating the central California and ENP white shark population size requires methodologies that account for biases introduced by sampling a limited number of sites and that account for all life history stages across the species' range of habitats. PMID:24932483
Nearest neighbor density ratio estimation for large-scale applications in astronomy
NASA Astrophysics Data System (ADS)
Kremer, J.; Gieseke, F.; Steenstrup Pedersen, K.; Igel, C.
2015-09-01
In astronomical applications of machine learning, the distribution of objects used for building a model is often different from the distribution of the objects the model is later applied to. This is known as sample selection bias, which is a major challenge for statistical inference as one can no longer assume that the labeled training data are representative. To address this issue, one can re-weight the labeled training patterns to match the distribution of unlabeled data that are available already in the training phase. There are many examples in practice where this strategy yielded good results, but estimating the weights reliably from a finite sample is challenging. We consider an efficient nearest neighbor density ratio estimator that can exploit large samples to increase the accuracy of the weight estimates. To solve the problem of choosing the right neighborhood size, we propose to use cross-validation on a model selection criterion that is unbiased under covariate shift. The resulting algorithm is our method of choice for density ratio estimation when the feature space dimensionality is small and sample sizes are large. The approach is simple and, because of the model selection, robust. We empirically find that it is on a par with established kernel-based methods on relatively small regression benchmark datasets. However, when applied to large-scale photometric redshift estimation, our approach outperforms the state-of-the-art.
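The re-weighting idea can be sketched as follows. This is a simplified illustration of a nearest neighbor density ratio estimate, assuming the common form in which each training point's weight is proportional to the number of unlabeled (test) points falling inside the ball that reaches its K-th nearest training neighbor. Whether the point itself counts as a neighbor, and the brute-force search, are assumptions of this sketch rather than details from the abstract; the function name is hypothetical.

```python
from math import dist

def nn_density_ratio_weights(train, test, k):
    """Estimate w(x) ~ p_test(x) / p_train(x) for every training point x.

    For each x, the radius is the distance to its k-th nearest OTHER training
    point; m is the number of test points within that radius. The weight is
    (n_train / n_test) * m / k. Brute force, O(n^2), for illustration only.
    """
    n_tr, n_te = len(train), len(test)
    weights = []
    for i, x in enumerate(train):
        d_train = sorted(dist(x, y) for j, y in enumerate(train) if j != i)
        radius = d_train[k - 1]
        m = sum(1 for z in test if dist(x, z) <= radius)
        weights.append((n_tr / n_te) * m / k)
    return weights

# Hypothetical 1-D data: the test sample is concentrated near 0..1.5,
# while one training point (10.0) lies far outside the test distribution.
train = [(0.0,), (1.0,), (2.0,), (10.0,)]
test = [(0.0,), (0.5,), (1.0,), (1.5,)]
weights = nn_density_ratio_weights(train, test, k=1)
```

In this toy case the outlying training point receives weight zero, while points in the region where test data are dense receive larger weights. A practical implementation would use a spatial index for the neighbor searches and would choose `k` by the cross-validation scheme the abstract describes.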
Accounting for imperfect detection of groups and individuals when estimating abundance.
Clement, Matthew J; Converse, Sarah J; Royle, J Andrew
2017-09-01
If animals are independently detected during surveys, many methods exist for estimating animal abundance despite detection probabilities <1. Common estimators include double-observer models, distance sampling models and combined double-observer and distance sampling models (known as mark-recapture-distance-sampling models; MRDS). When animals reside in groups, however, the assumption of independent detection is violated. In this case, the standard approach is to account for imperfect detection of groups, while assuming that individuals within groups are detected perfectly. However, this assumption is often unsupported. We introduce an abundance estimator for grouped animals when detection of groups is imperfect and group size may be under-counted, but not over-counted. The estimator combines an MRDS model with an N-mixture model to account for imperfect detection of individuals. The new MRDS-Nmix model requires the same data as an MRDS model (independent detection histories, an estimate of distance to transect, and an estimate of group size), plus a second estimate of group size provided by the second observer. We extend the model to situations in which detection of individuals within groups declines with distance. We simulated 12 data sets and used Bayesian methods to compare the performance of the new MRDS-Nmix model to an MRDS model. Abundance estimates generated by the MRDS-Nmix model exhibited minimal bias and nominal coverage levels. In contrast, MRDS abundance estimates were biased low and exhibited poor coverage. Many species of conservation interest reside in groups and could benefit from an estimator that better accounts for imperfect detection. Furthermore, the ability to relax the assumption of perfect detection of individuals within detected groups may allow surveyors to re-allocate resources toward detection of new groups instead of extensive surveys of known groups. 
We believe the proposed estimator is feasible because the only additional field data required are a second estimate of group size.
Estimating individual glomerular volume in the human kidney: clinical perspectives.
Puelles, Victor G; Zimanyi, Monika A; Samuel, Terence; Hughson, Michael D; Douglas-Denton, Rebecca N; Bertram, John F; Armitage, James A
2012-05-01
Measurement of individual glomerular volumes (IGV) has allowed the identification of drivers of glomerular hypertrophy in subjects without overt renal pathology. This study aims to highlight the relevance of IGV measurements with possible clinical implications and determine how many profiles must be measured in order to achieve stable size distribution estimates. We re-analysed 2250 IGV estimates obtained using the disector/Cavalieri method in 41 African and 34 Caucasian Americans. Pooled IGV analysis of mean and variance was conducted. Monte-Carlo (Jackknife) simulations determined the effect of the number of sampled glomeruli on mean IGV. Lin's concordance coefficient (R(C)), coefficient of variation (CV) and coefficient of error (CE) measured reliability. IGV mean and variance increased with overweight and hypertensive status. Superficial glomeruli were significantly smaller than juxtamedullary glomeruli in all subjects (P < 0.01), by race (P < 0.05) and in obese individuals (P < 0.01). Subjects with multiple chronic kidney disease (CKD) comorbidities showed significant increases in IGV mean and variability. Overall, mean IGV was particularly reliable with nine or more sampled glomeruli (R(C) > 0.95, <5% difference in CV and CE). These observations were not affected by a reduced sample size and did not disrupt the inverse linear correlation between mean IGV and estimated total glomerular number. Multiple comorbidities for CKD are associated with increased IGV mean and variance within subjects, including overweight, obesity and hypertension. Zonal selection and the number of sampled glomeruli do not represent drawbacks for future longitudinal biopsy-based studies of glomerular size and distribution.
Brain size is correlated with endangerment status in mammals.
Abelson, Eric S
2016-02-24
Increases in relative encephalization (RE), brain size after controlling for body size, come at a great metabolic cost and are correlated with a host of cognitive traits, from the ability to count objects to higher rates of innovation. Despite many studies examining the implications and trade-offs accompanying increased RE, the relationship between mammalian extinction risk and RE is unknown. I examine whether mammals with higher levels of RE are more or less likely to be at risk of endangerment than less-encephalized species. I find that extant species with high levels of encephalization are at greater risk of endangerment, with this effect being strongest in species with small body sizes. These results suggest that RE could be a valuable asset in estimating extinction vulnerability. Additionally, these findings suggest that the cost-benefit trade-off of RE is different in large-bodied species when compared with small-bodied species. © 2016 The Author(s).
Sidler, Dominik; Cristòfol-Clough, Michael; Riniker, Sereina
2017-06-13
Replica-exchange enveloping distribution sampling (RE-EDS) allows the efficient estimation of free-energy differences between multiple end-states from a single molecular dynamics (MD) simulation. In EDS, a reference state is sampled, which can be tuned by two types of parameters, i.e., smoothness parameters(s) and energy offsets, such that all end-states are sufficiently sampled. However, the choice of these parameters is not trivial. Replica exchange (RE) or parallel tempering is a widely applied technique to enhance sampling. By combining EDS with the RE technique, the parameter choice problem could be simplified and the challenge shifted toward an optimal distribution of the replicas in the smoothness-parameter space. The choice of a certain replica distribution can alter the sampling efficiency significantly. In this work, global round-trip time optimization (GRTO) algorithms are tested for the use in RE-EDS simulations. In addition, a local round-trip time optimization (LRTO) algorithm is proposed for systems with slowly adapting environments, where a reliable estimate for the round-trip time is challenging to obtain. The optimization algorithms were applied to RE-EDS simulations of a system of nine small-molecule inhibitors of phenylethanolamine N-methyltransferase (PNMT). The energy offsets were determined using our recently proposed parallel energy-offset (PEOE) estimation scheme. While the multistate GRTO algorithm yielded the best replica distribution for the ligands in water, the multistate LRTO algorithm was found to be the method of choice for the ligands in complex with PNMT. With this, the 36 alchemical free-energy differences between the nine ligands were calculated successfully from a single RE-EDS simulation 10 ns in length. Thus, RE-EDS presents an efficient method for the estimation of relative binding free energies.
Moore, S A; Le Coz, J; Hurther, D; Paquier, A
2013-04-01
Multi-frequency acoustic backscatter profiles recorded with side-looking acoustic Doppler current profilers are used to monitor the concentration and size of sedimentary particles suspended in fluvial environments. Data at 300, 600, and 1200 kHz are presented from the Isère River in France where the dominant particles in suspension are silt and clay sizes. The contribution of suspended sediment to the through-water attenuation was determined for three high concentration (> 100 mg/L) events and compared to theoretical values for spherical particles having size distributions that were measured by laser diffraction in water samples. Agreement was good for the 300 kHz data, but it worsened with increasing frequency. A method for the determination of grain size using multi-frequency attenuation data is presented considering models for spherical and oblate spheroidal particles. When the resulting size estimates are used to convert sediment attenuation to concentration, the spheroidal model provides the best agreement with optical estimates of concentration, but the aspect ratio and grain size that provide the best fit differ between events. The acoustic estimates of size were one-third the values from laser grain sizing. This agreement is encouraging considering optical and acoustical instruments measure different parameters.
Elwan, Ahmed; Singh, Ranvir; Patterson, Maree; Roygard, Jon; Horne, Dave; Clothier, Brent; Jones, Geoffrey
2018-01-11
Better management of water quality in streams, rivers and lakes requires precise and accurate estimates of different contaminant loads. We assessed four sampling frequencies (2 days, weekly, fortnightly and monthly) and five load calculation methods (global mean (GM), rating curve (RC), ratio estimator (RE), flow-stratified (FS) and flow-weighted (FW)) to quantify loads of nitrate-nitrogen (NO₃⁻-N), soluble inorganic nitrogen (SIN), total nitrogen (TN), dissolved reactive phosphorus (DRP), total phosphorus (TP) and total suspended solids (TSS), in the Manawatu River, New Zealand. The estimated annual river loads were compared to the reference 'true' loads, calculated using daily measurements of flow and water quality from May 2010 to April 2011, to quantify bias (i.e. accuracy) and root mean square error 'RMSE' (i.e. accuracy and precision). The GM method resulted in relatively higher RMSE values and a consistent negative bias (i.e. underestimation) in estimates of annual river loads across all sampling frequencies. The RC method resulted in the lowest RMSE for TN, TP and TSS at monthly sampling frequency. Yet, RC strongly overestimated the loads for parameters that showed a dilution effect, such as NO₃⁻-N and SIN. The FW and RE methods gave similar results, and there was no essential improvement in using RE over FW. In general, FW and RE performed better than FS in terms of bias, but FS performed slightly better than FW and RE in terms of RMSE for most of the water quality parameters (DRP, TP, TN and TSS) using a monthly sampling frequency. We found no significant decrease in RMSE values for estimates of NO₃⁻-N, SIN, TN and DRP loads when the sampling frequency was increased from monthly to fortnightly. The bias and RMSE values in estimates of TP and TSS loads (estimated by FW, RE and FS), however, showed a significant decrease in the case of weekly or 2-day sampling. 
This suggests potential for a higher sampling frequency during flow peaks for more precise and accurate estimates of annual river loads for TP and TSS, in the study river and other similar conditions.
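The difference between a global mean and a flow-weighted load estimate, and the bias and RMSE used to score them, can be illustrated with a short sketch. The formulas below are common textbook forms and are an assumption here; the study may define its GM and FW estimators differently. Units are assumed to be mutually consistent, and all names and numbers are hypothetical.

```python
from math import sqrt

def global_mean_load(conc, mean_flow, period):
    """Mean sampled concentration times total discharge (mean_flow * period)."""
    return (sum(conc) / len(conc)) * mean_flow * period

def flow_weighted_load(conc, flow_at_sampling, mean_flow, period):
    """Flow-weighted mean concentration times total discharge."""
    weighted = sum(c * q for c, q in zip(conc, flow_at_sampling))
    fw_conc = weighted / sum(flow_at_sampling)
    return fw_conc * mean_flow * period

def relative_bias_and_rmse(estimates, reference):
    """Relative bias and relative RMSE of repeated estimates vs. a reference load."""
    errors = [(e - reference) / reference for e in estimates]
    bias = sum(errors) / len(errors)
    rmse = sqrt(sum(err * err for err in errors) / len(errors))
    return bias, rmse

# Hypothetical samples in which concentration rises with flow.
conc = [1.0, 3.0]
flow = [1.0, 3.0]
gm = global_mean_load(conc, mean_flow=2.0, period=10.0)
fw = flow_weighted_load(conc, flow, mean_flow=2.0, period=10.0)
bias, rmse = relative_bias_and_rmse([90.0, 110.0], reference=100.0)
```

When concentration increases with flow, as in this toy case, the unweighted mean gives a smaller load than the flow-weighted mean, which is consistent in direction with the negative bias the abstract reports for the GM method.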
Estimating individual glomerular volume in the human kidney: clinical perspectives
Puelles, Victor G.; Zimanyi, Monika A.; Samuel, Terence; Hughson, Michael D.; Douglas-Denton, Rebecca N.; Bertram, John F.
2012-01-01
Background. Measurement of individual glomerular volumes (IGV) has allowed the identification of drivers of glomerular hypertrophy in subjects without overt renal pathology. This study aims to highlight the relevance of IGV measurements with possible clinical implications and determine how many profiles must be measured in order to achieve stable size distribution estimates. Methods. We re-analysed 2250 IGV estimates obtained using the disector/Cavalieri method in 41 African and 34 Caucasian Americans. Pooled IGV analysis of mean and variance was conducted. Monte-Carlo (Jackknife) simulations determined the effect of the number of sampled glomeruli on mean IGV. Lin’s concordance coefficient (RC), coefficient of variation (CV) and coefficient of error (CE) measured reliability. Results. IGV mean and variance increased with overweight and hypertensive status. Superficial glomeruli were significantly smaller than juxtamedullary glomeruli in all subjects (P < 0.01), by race (P < 0.05) and in obese individuals (P < 0.01). Subjects with multiple chronic kidney disease (CKD) comorbidities showed significant increases in IGV mean and variability. Overall, mean IGV was particularly reliable with nine or more sampled glomeruli (RC > 0.95, <5% difference in CV and CE). These observations were not affected by a reduced sample size and did not disrupt the inverse linear correlation between mean IGV and estimated total glomerular number. Conclusions. Multiple comorbidities for CKD are associated with increased IGV mean and variance within subjects, including overweight, obesity and hypertension. Zonal selection and the number of sampled glomeruli do not represent drawbacks for future longitudinal biopsy-based studies of glomerular size and distribution. PMID:21984554
Near-Earth-object survey progress and population of small near-Earth asteroids
NASA Astrophysics Data System (ADS)
Harris, A.
2014-07-01
Estimating the total population vs. size of NEAs and estimating the completion of surveys are the same thing, since the total population is just the number discovered divided by the estimated completion. I review the method of completion estimation based on the ratio of re-detected objects to total detections (known plus new discoveries). The method is quite general and can be used for population estimations of all sorts, from wildlife to various classes of solar system bodies. Since 2001, I have been making estimates of population and survey progress approximately every two years. Plotted below, left, is my latest estimate, including NEA discoveries up to August 2012. I plan to present an update at the meeting. Not all asteroids of a given size are equally easy to detect, because of specific orbital geometries. Thus a model of the orbital distribution is necessary, and computer simulations using those orbits are needed to establish the relation between the raw re-detection ratio and the actual completion fraction. This can be done for any sub-group population, allowing one to estimate the population of a subgroup and its expected current completion. Once a reliable survey computer model has been developed and "calibrated" with respect to actual survey re-detections versus size, it can be extrapolated to smaller sizes to estimate completion even at very small sizes where re-detections are rare or even zero. I have recently investigated the subgroup of extremely low encounter velocity NEAs, the class of interest for the Asteroid Redirect Mission (ARM), recently proposed by NASA. I found that asteroids of diameter ~10 m with encounter velocity with the Earth lower than 2.5 km/sec are detected by current surveys nearly 1,000 times more efficiently than the general background of NEAs of that size. Thus the current completion of these slow relative velocity objects may be around 1%, compared to 10^{-6} for objects of that size in the general velocity distribution.
Current surveys are nowhere near complete, but there may be fewer such objects than have been suggested. This conclusion is reinforced by the fact that at least a couple such discovered objects are known to be not real asteroids but spent rocket bodies in heliocentric orbit, of which there are only of the order of a hundred. Brown et al. (Nature 503, 238-241, 2013, below right, green squares are a re-plot of my blue circles on left plot) recently suggested that the population of small NEAs in the size range from roughly 5 to 50 meters in diameter may have been substantially under-estimated. To be sure, the greatest uncertainty in population estimates is in that range, since there are very few bolide events to use for estimation, and the surveys are extremely incomplete in that size range, so a factor of 3 or so discrepancy is not significant. However, the population estimated from surveys carried still smaller, where the bolide frequency becomes more secure, disagrees from the bolide estimate by even less than a factor of 3 and in fact intersects at about 3 m diameter. On the other hand, the shallow-sloping size-frequency distribution derived from the sparse large bolide data diverges badly from the survey estimates, in sizes where the survey estimates become ever-increasingly reliable, even by 100-200 m diameter. It appears that the bolide data provides a good "anchor" of the population in the size range up to about 5 m diameter, but above that one might do better just connecting that population with a straight line (on a log-log plot) with the survey-determined population at larger size, 50-100 m diameter or so.
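The two relations stated in the abstract above (completion as the ratio of re-detections to total detections, and population as discoveries divided by completion) can be sketched in a few lines of Python. This is only an illustration of the raw arithmetic: the function names and the numbers in the usage note are hypothetical, and the sketch deliberately omits the simulation-based calibration between the raw re-detection ratio and the true completion fraction that the abstract says is necessary.

```python
def completion_fraction(redetected: int, new_discoveries: int) -> float:
    """Raw completion estimate: re-detected objects / total detections.

    Total detections = known (re-detected) objects + new discoveries.
    Hypothetical helper; the abstract notes this raw ratio still has to
    be calibrated against a survey simulation.
    """
    total = redetected + new_discoveries
    if total <= 0:
        raise ValueError("need at least one detection")
    return redetected / total


def population_estimate(discovered: int, completion: float) -> float:
    """Total population = number discovered / estimated completion."""
    if not 0.0 < completion <= 1.0:
        raise ValueError("completion must be in (0, 1]")
    return discovered / completion
```

For instance, with made-up counts of 50 re-detections and 50 new discoveries the raw completion is 0.5, and 400 known objects would then imply a population of about 800.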
Sample Size Estimation: The Easy Way
ERIC Educational Resources Information Center
Weller, Susan C.
2015-01-01
This article presents a simple approach to making quick sample size estimates for basic hypothesis tests. Although there are many sources available for estimating sample sizes, methods are not often integrated across statistical tests, levels of measurement of variables, or effect sizes. A few parameters are required to estimate sample sizes and…
The observed clustering of damaging extratropical cyclones in Europe
NASA Astrophysics Data System (ADS)
Cusack, Stephen
2016-04-01
The clustering of severe European windstorms on annual timescales has substantial impacts on the (re-)insurance industry. Our knowledge of the risk is limited by large uncertainties in estimates of clustering from typical historical storm data sets covering the past few decades. Eight storm data sets are gathered for analysis in this study in order to reduce these uncertainties. Six of the data sets contain more than 100 years of severe storm information to reduce sampling errors, and observational errors are reduced by the diversity of information sources and analysis methods between storm data sets. All storm severity measures used in this study reflect damage, to suit (re-)insurance applications. The shortest storm data set of 42 years provides indications of stronger clustering with severity, particularly for regions off the main storm track in central Europe and France. However, clustering estimates have very large sampling and observational errors, exemplified by large changes in estimates in central Europe upon removal of one stormy season, 1989/1990. The extended storm records place 1989/1990 into a much longer historical context to produce more robust estimates of clustering. All the extended storm data sets show increased clustering between more severe storms from return periods (RPs) of 0.5 years to the longest measured RPs of about 20 years. Further, they contain signs of stronger clustering off the main storm track, and weaker clustering for smaller-sized areas, though these signals are more uncertain as they are drawn from smaller data samples. These new ultra-long storm data sets provide new information on clustering to improve our management of this risk.
Passive acoustic measurement of bedload grain size distribution using self-generated noise
NASA Astrophysics Data System (ADS)
Petrut, Teodor; Geay, Thomas; Gervaise, Cédric; Belleudy, Philippe; Zanker, Sebastien
2018-01-01
Monitoring sediment transport processes in rivers is of particular interest to engineers and scientists to assess the stability of rivers and hydraulic structures. Various methods for sediment transport process description were proposed using conventional or surrogate measurement techniques. This paper addresses the topic of the passive acoustic monitoring of bedload transport in rivers and especially the estimation of the bedload grain size distribution from self-generated noise. It discusses the feasibility of linking the acoustic signal spectrum shape to bedload grain sizes involved in elastic impacts with the river bed treated as a massive slab. Bedload grain size distribution is estimated by a regularized algebraic inversion scheme fed with the power spectrum density of river noise estimated from one hydrophone. The inversion methodology relies upon a physical model that predicts the acoustic field generated by the collision between rigid bodies. Here we propose an analytic model of the acoustic energy spectrum generated by the impacts between a sphere and a slab. The proposed model computes the power spectral density of bedload noise using a linear system of analytic energy spectra weighted by the grain size distribution. The algebraic system of equations is then solved by least-squares optimization and solution regularization methods. The result of inversion leads directly to the estimation of the bedload grain size distribution. The inversion method was applied to real acoustic data from passive acoustics experiments conducted on the Isère River, in France. The inversion of in situ measured spectra yields good estimates of grain size distribution, fairly close to what was estimated by physical sampling instruments. These results illustrate the potential of the hydrophone technique to be used as a standalone method that could ensure high spatial and temporal resolution measurements for sediment transport in rivers.
Using known populations of pronghorn to evaluate sampling plans and estimators
Kraft, K.M.; Johnson, D.H.; Samuelson, J.M.; Allen, S.H.
1995-01-01
Although sampling plans and estimators of abundance have good theoretical properties, their performance in real situations is rarely assessed because true population sizes are unknown. We evaluated widely used sampling plans and estimators of population size on 3 known clustered distributions of pronghorn (Antilocapra americana). Our criteria were accuracy of the estimate, coverage of 95% confidence intervals, and cost. Sampling plans were combinations of sampling intensities (16, 33, and 50%), sample selection (simple random sampling without replacement, systematic sampling, and probability proportional to size sampling with replacement), and stratification. We paired sampling plans with suitable estimators (simple, ratio, and probability proportional to size). We used area of the sampling unit as the auxiliary variable for the ratio and probability proportional to size estimators. All estimators were nearly unbiased, but precision was generally low (overall mean coefficient of variation [CV] = 29). Coverage of 95% confidence intervals was only 89% because of the highly skewed distribution of the pronghorn counts and small sample sizes, especially with stratification. Stratification combined with accurate estimates of optimal stratum sample sizes increased precision, reducing the mean CV from 33 without stratification to 25 with stratification; costs increased 23%. Precise results (mean CV = 13) but poor confidence interval coverage (83%) were obtained with simple and ratio estimators when the allocation scheme included all sampling units in the stratum containing most pronghorn. Although areas of the sampling units varied, ratio estimators and probability proportional to size sampling did not increase precision, possibly because of the clumped distribution of pronghorn. Managers should be cautious in using sampling plans and estimators to estimate abundance of aggregated populations.
Fearon, Elizabeth; Chabata, Sungai T; Thompson, Jennifer A; Cowan, Frances M; Hargreaves, James R
2017-09-14
While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. We aimed to guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys so as to reduce the random error around the estimate obtained. The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so, balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey. ©Elizabeth Fearon, Sungai T Chabata, Jennifer A Thompson, Frances M Cowan, James R Hargreaves. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 14.09.2017.
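The multiplier estimate described above, N = M / P, and its sensitivity to a small P and a large design effect can be illustrated with a short Python sketch. The variance formula here is an assumption, not the authors' method: it is a textbook delta-method approximation that treats M as a fixed count and inflates the binomial variance of P by the design effect. The function name and all numbers are hypothetical.

```python
import math


def multiplier_estimate(m: float, p: float, n: int, deff: float = 1.0):
    """Population size N = M / P with an approximate standard error.

    Assumptions (not taken from the abstract): M is known without error,
    Var(P) = deff * P * (1 - P) / n, and the delta method gives
    SE(N) ~= (M / P**2) * sqrt(Var(P)).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if n <= 0 or deff <= 0:
        raise ValueError("n and deff must be positive")
    n_hat = m / p
    var_p = deff * p * (1.0 - p) / n
    se = (m / p ** 2) * math.sqrt(var_p)
    return n_hat, se
```

Under these assumptions the relative standard error SE(N)/N equals sqrt(deff * (1 - P) / (n * P)), which grows as P shrinks and as the design effect rises, in line with the abstract's qualitative findings.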
Borkhoff, Cornelia M; Johnston, Patrick R; Stephens, Derek; Atenafu, Eshetu
2015-07-01
Aligning the method used to estimate sample size with the planned analytic method ensures the sample size needed to achieve the planned power. When using generalized estimating equations (GEE) to analyze a paired binary primary outcome with no covariates, many use an exact McNemar test to calculate sample size. We reviewed the approaches to sample size estimation for paired binary data and compared the sample size estimates on the same numerical examples. We used the hypothesized sample proportions for the 2 × 2 table to calculate the correlation between the marginal proportions to estimate sample size based on GEE. We solved the inside proportions based on the correlation and the marginal proportions to estimate sample size based on exact McNemar, asymptotic unconditional McNemar, and asymptotic conditional McNemar. The asymptotic unconditional McNemar test is a good approximation of the GEE method by Pan. The exact McNemar test is too conservative and yields larger sample size estimates than all other methods, unnecessarily so. In the special case of a 2 × 2 table, even when a GEE approach to binary logistic regression is the planned analytic method, the asymptotic unconditional McNemar test can be used to estimate sample size. We do not recommend using an exact McNemar test. Copyright © 2015 Elsevier Inc. All rights reserved.
RnaSeqSampleSize: real data based sample size estimation for RNA sequencing.
Zhao, Shilin; Li, Chung-I; Guo, Yan; Sheng, Quanhu; Shyr, Yu
2018-05-30
One of the most important and often neglected components of a successful RNA sequencing (RNA-Seq) experiment is sample size estimation. A few negative binomial model-based methods have been developed to estimate sample size based on the parameters of a single gene. However, thousands of genes are quantified and tested for differential expression simultaneously in RNA-Seq experiments. Thus, additional issues should be carefully addressed, including the false discovery rate for multiple statistical tests and the widely distributed read counts and dispersions of different genes. To solve these issues, we developed a sample size and power estimation method named RnaSeqSampleSize, based on the distributions of gene average read counts and dispersions estimated from real RNA-Seq data. Datasets from previous, similar experiments such as the Cancer Genome Atlas (TCGA) can be used as a point of reference. Read counts and their dispersions were estimated from the reference's distribution; using that information, we estimated and summarized the power and sample size. RnaSeqSampleSize is implemented in the R language and can be installed from the Bioconductor website. A user-friendly web graphical interface is provided at http://cqs.mc.vanderbilt.edu/shiny/RnaSeqSampleSize/ . RnaSeqSampleSize provides a convenient and powerful way to estimate power and sample size for an RNA-Seq experiment. It is also equipped with several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.
Nan, Jun; Wang, Zhenbei; Yao, Meng; Yang, Yueming; Zhang, Xiaofei
2016-12-01
The impact of mixing speed in three stages (before breakage, during breakage, and after breakage) on re-grown floc properties was investigated using a non-intrusive optical sampling and digital image analysis technique. Then, on the basis of the differing extent to which mixing speed during each stage influenced the size and structure of re-grown flocs, coagulation performance with varying mixing speed was analyzed. The results indicated that the broken flocs could not re-grow to their size before breakage in any case. Furthermore, increasing mixing intensity contributed to the re-formation of smaller flocs with a higher degree of compactness. For slow mixing before breakage, an increase in mixing speed had less influence on re-grown floc properties because the breakage strength during breakage was the same, resulting in inconspicuous variation of coagulation efficiency. For rapid mixing during breakage, higher mixing speed markedly decreased the coagulation efficiency. This could be attributed to the greater influence of mixing speed during breakage on re-grown floc size. However, as slow mixing after breakage was elevated, the coagulation efficiency rose significantly, indicating that slow mixing after breakage had more influence on re-grown floc structure through re-structuring and re-arrangement mechanisms.
Porto, Paolo; Walling, Des E; Alewell, Christine; Callegari, Giovanni; Mabit, Lionel; Mallimo, Nicola; Meusburger, Katrin; Zehringer, Markus
2014-12-01
Soil erosion and both its on-site and off-site impacts are increasingly seen as a serious environmental problem across the world. The need for an improved evidence base on soil loss and soil redistribution rates has directed attention to the use of fallout radionuclides, and particularly (137)Cs, for documenting soil redistribution rates. This approach possesses important advantages over more traditional means of documenting soil erosion and soil redistribution. However, one key limitation of the approach is the time-averaged or lumped nature of the estimated erosion rates. In nearly all cases, these will relate to the period extending from the main period of bomb fallout to the time of sampling. Increasing concern for the impact of global change, particularly that related to changing land use and climate change, has frequently directed attention to the need to document changes in soil redistribution rates within this period. Re-sampling techniques, which should be distinguished from repeat-sampling techniques, have the potential to meet this requirement. As an example, the use of a re-sampling technique to derive estimates of the mean annual net soil loss from a small (1.38 ha) forested catchment in southern Italy is reported. The catchment was originally sampled in 1998 and samples were collected from points very close to the original sampling points again in 2013. This made it possible to compare the estimate of mean annual erosion for the period 1954-1998 with that for the period 1999-2013. The availability of measurements of sediment yield from the catchment for parts of the overall period made it possible to compare the results provided by the (137)Cs re-sampling study with the estimates of sediment yield for the same periods. In order to compare the estimates of soil loss and sediment yield for the two different periods, it was necessary to establish the uncertainty associated with the individual estimates. 
In the absence of a generally accepted procedure for such calculations, key factors influencing the uncertainty of the estimates were identified and a procedure developed. The results of the study demonstrated that there had been no significant change in mean annual soil loss in recent years and this was consistent with the information provided by the estimates of sediment yield from the catchment for the same periods. The study demonstrates the potential for using a re-sampling technique to document recent changes in soil redistribution rates. Copyright © 2014. Published by Elsevier Ltd.
Probabilistic treatment of the uncertainty from the finite size of weighted Monte Carlo data
NASA Astrophysics Data System (ADS)
Glüsenkamp, Thorsten
2018-06-01
Parameter estimation in HEP experiments often involves Monte Carlo simulation to model the experimental response function. Typical applications are forward-folding likelihood analyses with re-weighting, or time-consuming minimization schemes with a new simulation set for each parameter value. Problematically, the finite size of such Monte Carlo samples carries intrinsic uncertainty that can lead to a substantial bias in parameter estimation if it is neglected and the sample size is small. We introduce a probabilistic treatment of this problem by replacing the usual likelihood functions with novel generalized probability distributions that incorporate the finite statistics via suitable marginalization. These new PDFs are analytic, and can be used to replace the Poisson, multinomial, and sample-based unbinned likelihoods, which covers many use cases in high-energy physics. In the limit of infinite statistics, they reduce to the respective standard probability distributions. In the general case of arbitrary Monte Carlo weights, the expressions involve the fourth Lauricella function F_D, for which we find a new finite-sum representation in a certain parameter setting. The result also represents an exact form for Carlson's Dirichlet average R_n with n > 0, and thereby an efficient way to calculate the probability generating function of the Dirichlet-multinomial distribution, the extended divided difference of a monomial, or arbitrary moments of univariate B-splines. We demonstrate the bias reduction of our approach with a typical toy Monte Carlo problem, estimating the normalization of a peak in a falling energy spectrum, and compare the results with previously published methods from the literature.
Image re-sampling detection through a novel interpolation kernel.
Hilal, Alaa
2018-06-01
Image re-sampling, as involved in re-size and rotation transformations, is an essential building block of a typical digital image alteration. Fortunately, traces left by such processes are detectable, proving that the image has undergone a re-sampling transformation. Within this context, we present in this paper two original contributions. First, we propose a new re-sampling interpolation kernel. It depends on five independent parameters that control its amplitude, angular frequency, standard deviation, and duration. We then demonstrate its capacity to imitate the behavior of the interpolation kernels most frequently used in digital image re-sampling applications. Secondly, the proposed model is used to characterize and detect the correlation coefficients involved in re-sampling transformations. The process includes minimization of an error function using the gradient method. The proposed method is assessed over a large database of 11,000 re-sampled images. Additionally, it is implemented within an algorithm in order to assess images that have undergone complex transformations. The results demonstrate better performance and reduced processing time when compared to a reference method, validating the suitability of the proposed approaches. Copyright © 2018 Elsevier B.V. All rights reserved.
77 FR 2697 - Proposed Information Collection; Comment Request; Annual Services Report
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-19
... and from a sample of small- and medium-sized businesses selected using a stratified sampling procedure... be canvassed when the sample is re-drawn, while nearly all of the small- and medium-sized firms from...); Educational Services (NAICS 61); Health Care and Social Assistance (NAICS 62); Arts, Entertainment, and...
Estimating population size with correlated sampling unit estimates
David C. Bowden; Gary C. White; Alan B. Franklin; Joseph L. Ganey
2003-01-01
Finite population sampling theory is useful in estimating total population size (abundance) from abundance estimates of each sampled unit (quadrat). We develop estimators that allow correlated quadrat abundance estimates, even for quadrats in different sampling strata. Correlated quadrat abundance estimates based on mark-recapture or distance sampling methods occur...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lobo Lapidus, R.; Gates, B
2009-01-01
Supported metals prepared from H₃Re₃(CO)₁₂ on γ-Al₂O₃ were treated under conditions that led to various rhenium structures on the support and were tested as catalysts for n-butane conversion in the presence of H₂ in a flow reactor at 533 K and 1 atm. After use, two samples were characterized by X-ray absorption edge positions of approximately 5.6 eV (relative to rhenium metal), indicating that the rhenium was cationic and essentially in the same average oxidation state in each. But the Re-Re coordination numbers found by extended X-ray absorption fine structure spectroscopy (2.2 and 5.1) show that the clusters in the two samples were significantly different in average nuclearity despite their indistinguishable rhenium oxidation states. Spectra of a third sample after catalysis indicate approximately Re₃ clusters, on average, and an edge position of 4.5 eV. Thus, two samples contained clusters approximated as Re₃ (on the basis of the Re-Re coordination number), on average, with different average rhenium oxidation states. The data allow resolution of the effects of rhenium oxidation state and cluster size, both of which affect the catalytic activity; larger clusters and a greater degree of reduction lead to increased activity.
NASA Astrophysics Data System (ADS)
Madsen, P. T.; Kerr, I.; Payne, R.
2004-10-01
Pods of the little known pygmy killer whale (Feresa attenuata) in the northern Indian Ocean were recorded with a vertical hydrophone array connected to a digital recorder sampling at 320 kHz. Recorded clicks were directional, short (25 μs) transients with estimated source levels between 197 and 223 dB re. 1 μPa (pp). Spectra of clicks recorded close to or on the acoustic axis were bimodal with peak frequencies between 45 and 117 kHz, and with centroid frequencies between 70 and 85 kHz. The clicks share characteristics of echolocation clicks from similar sized, whistling delphinids, and have properties suited for the detection and classification of prey targeted by this odontocete.
Estimation of sample size and testing power (Part 4).
Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo
2012-01-01
Sample size estimation is necessary for any experimental or survey research. An appropriate estimation of sample size based on known information and statistical knowledge is of great significance. This article introduces methods of sample size estimation for difference tests on data from a design of one factor with two levels, covering both quantitative and qualitative data; it presents the sample size estimation formulas and their implementation, both directly from the formulas and with the POWER procedure of SAS software. In addition, this article presents worked examples, which can guide researchers in implementing the repetition principle during the research design phase.
The effectiveness of robust RMCD control chart as outliers’ detector
NASA Astrophysics Data System (ADS)
Darmanto; Astutik, Suci
2017-12-01
A well-known control chart for monitoring a multivariate process is Hotelling's T², whose parameters are estimated classically; it is very sensitive to outliers and marred by their masking and swamping effects. To overcome this, robust estimators are strongly recommended. One such robust estimator is the re-weighted minimum covariance determinant (RMCD), which has the same robustness characteristics as the MCD. In this paper, effectiveness means the accuracy of the RMCD control chart in detecting true outliers; in other words, how effectively the chart can identify outliers and remove their masking and swamping effects. We assessed the effectiveness of the robust control chart by simulation under different scenarios: sample size n, proportion of outliers, and number of quality characteristics p. We found that in some scenarios the RMCD robust control chart works effectively.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xue, Renzhong; Department of Technology and Physics, Zhengzhou University of Light Industry, Zhengzhou 450002; Chen, Zhenping, E-mail: xrzbotao@163.com
2015-06-15
Graphical abstract: The dielectric constant decreases monotonically with reduced RE doping ion radius and is more frequency independent compared with that of the pure CCTO sample. Highlights: • The mean grain sizes decrease monotonically with reduced RE doping ionic radius. • Doping gives rise to the monotonic decrease of ε_r with reduced RE ionic radius. • The nonlinear coefficient and breakdown field increase with RE ionic doping. • α of all the samples is associated with the potential barrier width rather than Φ_b. Abstract: Ca₁₋ₓRₓCu₃Ti₄O₁₂ (R = La, Nd, Eu, Gd, Er; x = 0 and 0.005) ceramics were prepared by the conventional solid-state method. The influences of rare earth (RE) ion doping on the microstructure, dielectric and electrical properties of CaCu₃Ti₄O₁₂ (CCTO) ceramics were investigated systematically. Single-phase formation is confirmed by XRD analyses. The mean grain size decreases monotonically with reduced RE ion radius. The EDS results reveal that RE ionic doping reduces Cu-rich phase segregation at the grain boundaries (GBs). Doping gives rise to the monotonic decrease of dielectric constant with reduced RE ionic radius but significantly improves stability with frequency. The lower dielectric loss of doped samples is obtained due to the increase of GB resistance. In addition, the nonlinear coefficient and breakdown field increase with RE ionic doping. Both the fine grains and the enhancement of potential barrier at GBs are responsible for the improvement of the nonlinear current-voltage properties in doped CCTO samples.
Accounting for twin births in sample size calculations for randomised trials.
Yelland, Lisa N; Sullivan, Thomas R; Collins, Carmel T; Price, David J; McPhee, Andrew J; Lee, Katherine J
2018-05-04
Including twins in randomised trials leads to non-independence or clustering in the data. Clustering has important implications for sample size calculations, yet few trials take this into account. Estimates of the intracluster correlation coefficient (ICC), or the correlation between outcomes of twins, are needed to assist with sample size planning. Our aims were to provide ICC estimates for infant outcomes, describe the information that must be specified in order to account for clustering due to twins in sample size calculations, and develop a simple tool for performing sample size calculations for trials including twins. ICCs were estimated for infant outcomes collected in four randomised trials that included twins. The information required to account for clustering due to twins in sample size calculations is described. A tool that calculates the sample size based on this information was developed in Microsoft Excel and in R as a Shiny web app. ICC estimates ranged between -0.12, indicating a weak negative relationship, and 0.98, indicating a strong positive relationship between outcomes of twins. Example calculations illustrate how the ICC estimates and sample size calculator can be used to determine the target sample size for trials including twins. Clustering among outcomes measured on twins should be taken into account in sample size calculations to obtain the desired power. Our ICC estimates and sample size calculator will be useful for designing future trials that include twins. Publication of additional ICCs is needed to further assist with sample size planning for future trials. © 2018 John Wiley & Sons Ltd.
Determining Sample Size for Accurate Estimation of the Squared Multiple Correlation Coefficient.
ERIC Educational Resources Information Center
Algina, James; Olejnik, Stephen
2000-01-01
Discusses determining sample size for estimation of the squared multiple correlation coefficient and presents regression equations that permit determination of the sample size for estimating this parameter for up to 20 predictor variables. (SLD)
Evaluating re-identification risks with respect to the HIPAA privacy rule
Benitez, Kathleen
2010-01-01
Objective Many healthcare organizations follow data protection policies that specify which patient identifiers must be suppressed to share “de-identified” records. Such policies, however, are often applied without knowledge of the risk of “re-identification”. The goals of this work are: (1) to estimate re-identification risk for data sharing policies of the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule; and (2) to evaluate the risk of a specific re-identification attack using voter registration lists. Measurements We define several risk metrics: (1) expected number of re-identifications; (2) estimated proportion of a population in a group of size g or less, and (3) monetary cost per re-identification. For each US state, we estimate the risk posed to hypothetical datasets, protected by the HIPAA Safe Harbor and Limited Dataset policies by an attacker with full knowledge of patient identifiers and with limited knowledge in the form of voter registries. Results The percentage of a state's population estimated to be vulnerable to unique re-identification (ie, g=1) when protected via Safe Harbor and Limited Datasets ranges from 0.01% to 0.25% and 10% to 60%, respectively. In the voter attack, this number drops for many states, and for some states is 0%, due to the variable availability of voter registries in the real world. We also find that re-identification cost ranges from $0 to $17 000, further confirming risk variability. Conclusions This work illustrates that blanket protection policies, such as Safe Harbor, leave different organizations vulnerable to re-identification at different rates. It provides justification for locally performed re-identification risk estimates prior to sharing data. PMID:20190059
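The second risk metric (the proportion of a population in a group of size g or less) can be sketched directly when full records are available. The example below is a minimal illustration, not the authors' estimation procedure, which works from population statistics rather than record-level data. The function name and the sample tuples are hypothetical.

```python
from collections import Counter


def share_in_small_groups(records, g=1):
    """Fraction of records whose quasi-identifier combination is shared by at most g records."""
    group_sizes = Counter(records)
    at_risk = sum(size for size in group_sizes.values() if size <= g)
    return at_risk / len(records)


# Hypothetical (year of birth, sex, 3-digit ZIP) tuples.
population = [
    (1980, "F", "372"),
    (1980, "F", "372"),
    (1975, "M", "372"),
    (1990, "F", "901"),
]
unique_share = share_in_small_groups(population, g=1)
```

With g = 1 the function returns the share of records that are unique on the chosen identifiers, which corresponds to the "unique re-identification" case quoted in the results.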
Johnston, Lisa G; McLaughlin, Katherine R; Rhilani, Houssine El; Latifi, Amina; Toufik, Abdalla; Bennani, Aziza; Alami, Kamal; Elomari, Boutaina; Handcock, Mark S
2015-01-01
Background Respondent-driven sampling is used worldwide to estimate the population prevalence of characteristics such as HIV/AIDS and associated risk factors in hard-to-reach populations. Estimating the total size of these populations is of great interest to national and international organizations; however, reliable measures of population size often do not exist. Methods Successive Sampling-Population Size Estimation (SS-PSE) along with network size imputation allows population size estimates to be made without relying on separate studies or additional data (as in network scale-up, multiplier and capture-recapture methods), which may be biased. Results Ten population size estimates were calculated for people who inject drugs, female sex workers, men who have sex with other men, and migrants from sub-Saharan Africa in six different cities in Morocco. SS-PSE estimates fell within or very close to the likely values provided by experts and the estimates from previous studies using other methods. Conclusions SS-PSE is an effective method for estimating the size of hard-to-reach populations that leverages important information within respondent-driven sampling studies. The addition of a network size imputation method helps to smooth network sizes, allowing for more accurate results. However, caution should be used, particularly when there is reason to believe that clustered subgroups may exist within the population of interest or when the sample size is small in relation to the population. PMID:26258908
NASA Astrophysics Data System (ADS)
Somu, Vijaya Bhaskar
Apparent ionospheric reflection heights estimated using the zero-to-zero and peak-to-peak methods to measure skywave delay relative to the groundwave were compared for 108 first and 124 subsequent strokes observed at LOG in 2009. For either metric there was a considerable decrease in average reflection height for subsequent strokes relative to first strokes. Median uncertainties in daytime reflection heights did not exceed 0.7 km. The standard errors in mean reflection heights were less than 3% of the mean value. Apparent changes in reflection height (estimated using the peak-to-peak method) within individual flashes for 54 daytime and 11 nighttime events at distances ranging from 50 km to 330 km were compared. For daytime conditions, the majority of the flashes showed a monotonic decrease in reflection height. For nighttime flashes, the monotonic decrease was found to be considerably less frequent. The apparent ionospheric reflection height tends to increase with return-stroke peak current. In order to increase the sample size for nighttime conditions, additional data for 43 nighttime flashes observed at LOG in 2014 were analyzed. The "fast-break-point" method of measuring skywave delay (McDonald et al., 1979) was additionally used. The 2014 results for return strokes are generally consistent with the 2009 results. The 2014 data were also used for estimating ionospheric reflection heights for elevated sources (6 CIDs and 3 PB pulses) using the double-skywave feature. The results were compared with reflection heights estimated for corresponding return strokes (if any), and fairly good agreement was generally found. It has been shown, using two different FDTD simulation codes, that the observed differences in reflection height cannot be explained by the difference in the frequency content of first and subsequent return-stroke currents.
FDTD simulations showed that within 200 km the reflection heights estimated using the peak-to-peak method are close to the hOE parameter of the ionospheric profile for both daytime and nighttime conditions and for both first and second skywaves. The TL model was used to estimate the radial extent of elves produced by the interaction of LEMP with the ionosphere as a function of return-stroke peak current. For a peak current of 100 kA and the speed equal to one-half of the speed of light, the expected radius of elves is 157 km. Skywaves associated with 24 return strokes in 6 lightning flashes triggered at CB in 2015 and recorded at LOG (at a distance of 45 km from CB) were not found for any of the strokes recorded. In contrast, natural-lightning strokes do produce skywaves at comparable distances. One possible reason is the difference in the higher-frequency content (field waveforms for triggered lightning are narrower than for natural lightning).
Influence of limestone characteristics on mercury re-emission in WFGD systems.
Ochoa-González, Raquel; Díaz-Somoano, Mercedes; Martínez-Tarazona, M Rosa
2013-03-19
This work evaluates the influence of limestone properties on limestone reactivity and on the re-emission of mercury under typical wet scrubber conditions. The influence of the composition, particle size, and porosity of limestones on their reactivity and the effect of sorbent concentration, pH, redox potential, and the sulphite and iron content of the slurry on Hg(0) re-emission were assessed. A small particle size, a high porosity and a low magnesium content increased the reactivity of the limestones. Moreover, it was found that the higher the reactivity of the sample, the greater the amount of mercury captured in the scrubber. Although sulphite ions did not cause the re-emission of mercury from the suspensions of the gypsums, the limestones enriched in iron increased Hg(0) re-emission under low oxygen conditions. It was observed that the low pH values of the gypsum suspensions favored the cocapture of mercury because Fe(2+) formation was avoided. The partitioning of the mercury in the byproducts of the scrubber depended on the impurities of the limestones rather than on their particle size. No leaching of mercury from the gypsum samples occurred, suggesting that mercury was either tightly bound to the impurities of the limestone or was transformed into insoluble mercury species.
Biased phylodynamic inferences from analysing clusters of viral sequences
Xiang, Fei; Frost, Simon D. W.
2017-01-01
Abstract Phylogenetic methods are being increasingly used to help understand the transmission dynamics of measurably evolving viruses, including HIV. Clusters of highly similar sequences are often observed, which appear to follow a ‘power law’ behaviour, with a small number of very large clusters. These clusters may help to identify subpopulations in an epidemic, and inform where intervention strategies should be implemented. However, clustering of samples does not necessarily imply the presence of a subpopulation with high transmission rates, as groups of closely related viruses can also occur due to non-epidemiological effects such as over-sampling. It is important to ensure that observed phylogenetic clustering reflects true heterogeneity in the transmitting population, and is not being driven by non-epidemiological effects. We qualify the effect of using a falsely identified ‘transmission cluster’ of sequences to estimate phylodynamic parameters including the effective population size and exponential growth rate under several demographic scenarios. Our simulation studies show that taking the maximum size cluster to re-estimate parameters from trees simulated under a randomly mixing, constant population size coalescent process systematically underestimates the overall effective population size. In addition, the transmission cluster wrongly resembles an exponential or logistic growth model 99% of the time. We also illustrate the consequences of false clusters in exponentially growing coalescent and birth-death trees, where again, the growth rate is skewed upwards. This has clear implications for identifying clusters in large viral databases, where a false cluster could result in wasted intervention resources. PMID:28852573
Effects of sample size on estimates of population growth rates calculated with matrix models.
Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M
2008-08-28
Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda: Jensen's inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated whether sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more realistic inverse J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
Measuring size evolution of distant, faint galaxies in the radio regime
NASA Astrophysics Data System (ADS)
Lindroos, L.; Knudsen, K. K.; Stanley, F.; Muxlow, T. W. B.; Beswick, R. J.; Conway, J.; Radcliffe, J. F.; Wrigley, N.
2018-05-01
We measure the evolution of sizes for star-forming galaxies as seen in 1.4 GHz continuum radio for z = 0-3. The measurements are based on combined VLA+MERLIN data of the Hubble Deep Field, using a uv-stacking algorithm combined with model fitting to estimate the average sizes of galaxies. A sample of ~1000 star-forming galaxies is selected from optical and near-infrared catalogues, with stellar masses M* ≈ 10^10-10^11 M⊙ and photometric redshifts 0-3. The median sizes are parametrized for stellar mass M* = 5 × 10^10 M⊙ as R_e = A × (H(z)/H(1.5))^{α_z}. We find that the median radio sizes evolve towards larger sizes at later times with α_z = -1.1 ± 0.6, and A (the median size at z ≈ 1.5) is found to be 0.26″ ± 0.07″ or 2.3 ± 0.6 kpc. The measured radio sizes are typically a factor of 2 smaller than those measured in the optical, and are also smaller than the typical Hα sizes in the literature. This indicates that star formation, as traced by the radio continuum, is typically concentrated towards the centre of galaxies, for the sampled redshift range. Furthermore, the discrepancy of measured sizes from different tracers of star formation indicates the need for models of size evolution to adopt a multiwavelength approach in the measurement of the sizes of star-forming regions.
Effects of tree-to-tree variations on sap flux-based transpiration estimates in a forested watershed
NASA Astrophysics Data System (ADS)
Kume, Tomonori; Tsuruta, Kenji; Komatsu, Hikaru; Kumagai, Tomo'omi; Higashi, Naoko; Shinohara, Yoshinori; Otsuki, Kyoichi
2010-05-01
To estimate forest stand-scale water use, we assessed how sample sizes affect confidence of stand-scale transpiration (E) estimates calculated from sap flux (Fd) and sapwood area (AS_tree) measurements of individual trees. In a Japanese cypress plantation, we measured Fd and AS_tree in all trees (n = 58) within a 20 × 20 m study plot, which was divided into four 10 × 10 m subplots. We calculated E from stand AS_tree (AS_stand) and mean stand Fd (JS) values. Using Monte Carlo analyses, we examined potential errors associated with sample sizes in E, AS_stand, and JS by using the original AS_tree and Fd data sets. Consequently, we defined optimal sample sizes of 10 and 15 for AS_stand and JS estimates, respectively, in the 20 × 20 m plot. Sample sizes greater than the optimal sample sizes did not decrease potential errors. The optimal sample sizes for JS changed according to plot size (e.g., 10 × 10 m and 10 × 20 m), while the optimal sample sizes for AS_stand did not. In addition, the optimal sample sizes for JS did not change in different vapor pressure deficit conditions. In terms of E estimates, these results suggest that the tree-to-tree variations in Fd vary among different plots, and that plot size to capture tree-to-tree variations in Fd is an important factor. This study also discusses planning balanced sampling designs to extrapolate stand-scale estimates to catchment-scale estimates.
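The Monte Carlo idea described above can be sketched as repeated subsampling of the fully measured plot. The example below is only one plausible reading: it defines "potential error" as the largest relative deviation of a subsample mean from the full-plot mean, which is an assumption and not necessarily the study's definition. The data are synthetic and the function name is hypothetical.

```python
import random
from statistics import fmean


def potential_error(values, sample_size, n_draws=2000, seed=0):
    """Largest relative deviation of a subsample mean from the full-plot mean.

    Draws `n_draws` random subsets of `sample_size` trees without replacement.
    """
    rng = random.Random(seed)
    full_mean = fmean(values)
    worst = 0.0
    for _ in range(n_draws):
        subset_mean = fmean(rng.sample(values, sample_size))
        worst = max(worst, abs(subset_mean - full_mean) / abs(full_mean))
    return worst


# Synthetic sap flux values for 58 trees (arbitrary units), not the study's data.
data_rng = random.Random(1)
sap_flux = [data_rng.uniform(5.0, 25.0) for _ in range(58)]
errors = {n: potential_error(sap_flux, n) for n in (5, 10, 15, 30, 58)}
```

Plotting `errors` against sample size would show where additional trees stop reducing the error, which is the sense in which an "optimal" sample size can be read off such a curve.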
Green, Michael V; Seidel, Jurgen; Choyke, Peter L; Jagoda, Elaine M
2017-10-01
We describe a simple fixture that can be added to the imaging bed of a small-animal PET scanner that allows for automated counting of multiple organ or tissue samples from mouse-sized animals and counting of injection syringes prior to administration of the radiotracer. The combination of imaging and counting capabilities in the same machine offers advantages in certain experimental settings. A polyethylene block of plastic, sculpted to mate with the animal imaging bed of a small-animal PET scanner, is machined to receive twelve 5-ml containers, each capable of holding an entire organ from a mouse-sized animal. In addition, a triangular cross-section slot is machined down the centerline of the block to secure injection syringes from 1-ml to 3-ml in size. The sample holder is scanned in PET whole-body mode to image all samples or in one bed position to image a filled injection syringe. Total radioactivity in each sample or syringe is determined from the reconstructed images of these objects using volume re-projection of the coronal images and a single region-of-interest for each. We tested the accuracy of this method by comparing PET estimates of sample and syringe activity with well counter and dose calibrator estimates of these same activities. PET and well counting of the same samples gave near identical results (in MBq, R² = 0.99, slope = 0.99, intercept = 0.00 MBq). PET syringe and dose calibrator measurements of syringe activity in MBq were also similar (R² = 0.99, slope = 0.99, intercept = -0.22 MBq). A small-animal PET scanner can be easily converted into a multi-sample and syringe counting device by the addition of a sample block constructed for that purpose. This capability, combined with live animal imaging, can improve efficiency and flexibility in certain experimental settings. Copyright © 2017 Elsevier Inc. All rights reserved.
Species richness in soil bacterial communities: a proposed approach to overcome sample size bias.
Youssef, Noha H; Elshahed, Mostafa S
2008-09-01
Estimates of species richness based on 16S rRNA gene clone libraries are increasingly utilized to gauge the level of bacterial diversity within various ecosystems. However, previous studies have indicated that regardless of the utilized approach, species richness estimates obtained are dependent on the size of the analyzed clone libraries. We here propose an approach to overcome sample size bias in species richness estimates in complex microbial communities. Parametric (Maximum likelihood-based and rarefaction curve-based) and non-parametric approaches were used to estimate species richness in a library of 13,001 near full-length 16S rRNA clones derived from soil, as well as in multiple subsets of the original library. Species richness estimates obtained increased with the increase in library size. To obtain a sample size-unbiased estimate of species richness, we calculated the theoretical clone library sizes required to encounter the estimated species richness at various clone library sizes, used curve fitting to determine the theoretical clone library size required to encounter the "true" species richness, and subsequently determined the corresponding sample size-unbiased species richness value. Using this approach, sample size-unbiased estimates of 17,230, 15,571, and 33,912 were obtained for the ML-based, rarefaction curve-based, and ACE-1 estimators, respectively, compared to bias-uncorrected values of 15,009, 11,913, and 20,909.
Multicollinearity in hierarchical linear models.
Yu, Han; Jiang, Shanhe; Land, Kenneth C
2015-09-01
This study investigates an ill-posed problem (multicollinearity) in Hierarchical Linear Models from both the data and the model perspectives. We propose an intuitive, effective approach to diagnosing the presence of multicollinearity and its remedies in this class of models. A simulation study demonstrates the impacts of multicollinearity on coefficient estimates, associated standard errors, and variance components at various levels of multicollinearity for finite sample sizes typical in social science studies. We further investigate the role multicollinearity plays at each level for estimation of coefficient parameters in terms of shrinkage. Based on these analyses, we recommend a top-down method for assessing multicollinearity in HLMs that first examines the contextual predictors (Level-2 in a two-level model) and then the individual predictors (Level-1) and uses the results for data collection, research problem redefinition, model re-specification, variable selection and estimation of a final model. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Nelson, Erica June; van Dokkum, Pieter G.; Brammer, Gabriel; Förster Schreiber, Natascha; Franx, Marijn; Fumagalli, Mattia; Patel, Shannon; Rix, Hans-Walter; Skelton, Rosalind E.; Bezanson, Rachel; Da Cunha, Elisabete; Kriek, Mariska; Labbe, Ivo; Lundgren, Britt; Quadri, Ryan; Schmidt, Kasper B.
2012-03-01
We investigate the buildup of galaxies at z ~ 1 using maps of Hα and stellar continuum emission for a sample of 57 galaxies with rest-frame Hα equivalent widths >100 Å in the 3D-HST grism survey. We find that the Hα emission broadly follows the rest-frame R-band light but that it is typically somewhat more extended and clumpy. We quantify the spatial distribution with the half-light radius. The median Hα effective radius re (Hα) is 4.2 ± 0.1 kpc but the sizes span a large range, from compact objects with re (Hα) ~ 1.0 kpc to extended disks with re (Hα) ~ 15 kpc. Comparing Hα sizes to continuum sizes, we find
The observed clustering of damaging extra-tropical cyclones in Europe
NASA Astrophysics Data System (ADS)
Cusack, S.
2015-12-01
The clustering of severe European windstorms on annual timescales has substantial impacts on the re/insurance industry. Management of the risk is impaired by large uncertainties in estimates of clustering from historical storm datasets typically covering the past few decades. The uncertainties are unusually large because clustering depends on the variance of storm counts. Eight storm datasets are gathered for analysis in this study in order to reduce these uncertainties. Six of the datasets contain more than 100 years of severe storm information to reduce sampling errors, and the diversity of information sources and analysis methods between datasets samples observational errors. All storm severity measures used in this study reflect damage, to suit re/insurance applications. It is found that the shortest storm dataset of 42 years in length provides estimates of clustering with very large sampling and observational errors. The dataset does provide some useful information: indications of stronger clustering for more severe storms, particularly for southern countries off the main storm track. However, substantially different results are produced by removal of one stormy season, 1989/1990, which illustrates the large uncertainties from a 42-year dataset. The extended storm records place 1989/1990 into a much longer historical context to produce more robust estimates of clustering. All the extended storm datasets show a greater degree of clustering with increasing storm severity and suggest clustering of severe storms is much more material than weaker storms. Further, they contain signs of stronger clustering in areas off the main storm track, and weaker clustering for smaller-sized areas, though these signals are smaller than uncertainties in actual values. Both the improvement of existing storm records and development of new historical storm datasets would help to improve management of this risk.
Improving the accuracy of livestock distribution estimates through spatial interpolation.
Bryssinckx, Ward; Ducheyne, Els; Muhwezi, Bernard; Godfrey, Sunday; Mintiens, Koen; Leirs, Herwig; Hendrickx, Guy
2012-11-01
Animal distribution maps serve many purposes such as estimating transmission risk of zoonotic pathogens to both animals and humans. The reliability and usability of such maps is highly dependent on the quality of the input data. However, decisions on how to perform livestock surveys are often based on previous work without considering possible consequences. A better understanding of the impact of using different sample designs and processing steps on the accuracy of livestock distribution estimates was acquired through iterative experiments using detailed survey data. The importance of sample size, sample design and aggregation is demonstrated and spatial interpolation is presented as a potential way to improve cattle number estimates. As expected, results show that an increasing sample size increased the precision of cattle number estimates but these improvements were mainly seen when the initial sample size was relatively low (e.g. a median relative error decrease of 0.04% per sampled parish for sample sizes below 500 parishes). For higher sample sizes, the added value of further increasing the number of samples declined rapidly (e.g. a median relative error decrease of 0.01% per sampled parish for sample sizes above 500 parishes). When a two-stage stratified sample design was applied to yield more evenly distributed samples, accuracy levels were higher for low sample densities and stabilised at lower sample sizes compared to one-stage stratified sampling. Aggregating the resulting cattle number estimates yielded significantly more accurate results because of averaging under- and over-estimates (e.g. when aggregating cattle number estimates from subcounty to district level, P < 0.009 based on a sample of 2,077 parishes using one-stage stratified samples). During aggregation, area-weighted mean values were assigned to higher administrative unit levels.
However, when this step is preceded by a spatial interpolation to fill in missing values in non-sampled areas, accuracy is improved remarkably. This holds especially for low sample sizes and spatially evenly distributed samples (e.g. P < 0.001 for a sample of 170 parishes using one-stage stratified sampling and aggregation on district level). Whether the same observations apply on a lower spatial scale should be further investigated.
NREL Screens Universities for Solar and Battery Storage Potential
DOE Office of Scientific and Technical Information (OSTI.GOV)
In support of the U.S. Department of Energy's SunShot initiative, NREL provided solar photovoltaic (PV) screenings in 2016 for eight universities seeking to go solar. NREL conducted an initial technoeconomic assessment of PV and storage feasibility at the selected universities using the REopt model, an energy planning platform that can be used to evaluate RE options, estimate costs, and suggest a mix of RE technologies to meet defined assumptions and constraints. NREL provided each university with customized results, including the cost-effectiveness of PV and storage, recommended system size, estimated capital cost to implement the technology, and estimated life cycle cost savings.
NASA Astrophysics Data System (ADS)
Hong, Gang; Minnis, Patrick; Doelling, David; Ayers, J. Kirk; Sun-Mack, Szedung
2012-03-01
A method for estimating effective ice particle radius Re at the tops of tropical deep convective clouds (DCC) is developed on the basis of precomputed look-up tables (LUTs) of brightness temperature differences (BTDs) between the 3.7 and 11.0 μm bands. A combination of discrete ordinates radiative transfer and correlated k distribution programs, which account for the multiple scattering and monochromatic molecular absorption in the atmosphere, is utilized to compute the LUTs as functions of solar zenith angle, satellite zenith angle, relative azimuth angle, Re, cloud top temperature (CTT), and cloud visible optical thickness τ. The LUT-estimated DCC Re agrees well with the cloud retrievals of the Moderate Resolution Imaging Spectroradiometer (MODIS) for the NASA Clouds and Earth's Radiant Energy System with a correlation coefficient of 0.988 and differences of less than 10%. The LUTs are applied to 1 year of measurements taken from MODIS aboard Aqua in 2007 to estimate DCC Re and are compared to a similar quantity from CloudSat over the region bounded by 140°E, 180°E, 0°N, and 20°N in the Western Pacific Warm Pool. The estimated DCC Re values are mainly concentrated in the range of 25-45 μm and decrease with CTT. Matching the LUT-estimated Re with ice cloud Re retrieved by CloudSat, it is found that the ice cloud τ values from DCC top to the vertical location where LUT-estimated Re is located at the CloudSat-retrieved Re profile are mostly less than 2.5 with a mean value of about 1.3. Changes in the DCC τ can result in differences of less than 10% for Re estimated from LUTs. The LUTs of 0.65 μm bidirectional reflectance distribution function (BRDF) are built as functions of viewing geometry and column amount of ozone above upper troposphere. The 0.65 μm BRDF can eliminate some noncore portions of the DCCs detected using only 11 μm brightness temperature thresholds, which result in a mean difference of only 0.6 μm for DCC Re estimated from BTD LUTs.
A re-evaluation of a case-control model with contaminated controls for resource selection studies
Christopher T. Rota; Joshua J. Millspaugh; Dylan C. Kesler; Chad P. Lehman; Mark A. Rumble; Catherine M. B. Jachowski
2013-01-01
A common sampling design in resource selection studies involves measuring resource attributes at sample units used by an animal and at sample units considered available for use. Few models can estimate the absolute probability of using a sample unit from such data, but such approaches are generally preferred over statistical methods that estimate a relative probability...
Lanza, Amy; Ravaud, Philippe; Riveros, Carolina; Dechartres, Agnes
2016-01-01
Observational studies are increasingly being used for assessing therapeutic interventions. Case-control studies are generally considered to have greater risk of bias than cohort studies, but we lack evidence of differences in effect estimates between the 2 study types. We aimed to compare estimates between cohort and case-control studies in meta-analyses of observational studies of therapeutic interventions by using a meta-epidemiological study. We used a random sample of meta-analyses of therapeutic interventions published in 2013 that included both cohort and case-control studies assessing a binary outcome. For each meta-analysis, the ratio of estimates (RE) was calculated by comparing the estimate in case-control studies to that in cohort studies. Then, we used random-effects meta-analysis to estimate a combined RE across meta-analyses. An RE < 1 indicated that case-control studies yielded larger estimates than cohort studies. The final analysis included 23 meta-analyses: 138 cohort and 133 case-control studies. Treatment effect estimates did not significantly differ between case-control and cohort studies (combined RE 0.97 [95% CI 0.86-1.09]). Heterogeneity was low, with between-meta-analysis variance τ2 = 0.0049. Estimates did not differ between case-control and prospective or retrospective cohort studies (RE = 1.05 [95% CI 0.96-1.15] and RE = 0.99 [95% CI, 0.83-1.19], respectively). Sensitivity analysis of studies reporting adjusted estimates also revealed no significant difference (RE = 1.03 [95% CI 0.91-1.16]). Heterogeneity was also low for these analyses. We found no significant difference in treatment effect estimates between case-control and cohort studies assessing therapeutic interventions.
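The pooling step for the ratio of estimates (RE) can be sketched as a random-effects meta-analysis on the log scale. The abstract does not say which between-study variance estimator was used; the DerSimonian-Laird estimator below is an assumption, and the input values are hypothetical rather than taken from the study.

```python
import math


def pool_ratios(ratios, standard_errors):
    """DerSimonian-Laird random-effects pooling of ratios on the log scale.

    Returns (pooled ratio, lower 95% limit, upper 95% limit, tau squared).
    `standard_errors` are standard errors of the log ratios.
    """
    y = [math.log(r) for r in ratios]
    w = [1 / se**2 for se in standard_errors]
    # Fixed-effect estimate and Cochran's Q, needed for the tau^2 estimate.
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1 / (se**2 + tau2) for se in standard_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    return (
        math.exp(pooled),
        math.exp(pooled - 1.96 * se_pooled),
        math.exp(pooled + 1.96 * se_pooled),
        tau2,
    )


# Hypothetical per-meta-analysis ratios of estimates and log-scale standard errors.
pooled, lower, upper, tau2 = pool_ratios([0.90, 1.05, 1.00], [0.10, 0.12, 0.08])
```

A confidence interval that spans 1, as reported for the combined RE of 0.97 (0.86-1.09), is what "no significant difference" between the two study types means in this framework.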
ERIC Educational Resources Information Center
Alika, Ijeoma Henrietta; Egbochuku, Elizabeth Omotunde
2012-01-01
The study investigated the relationship between vocational interest, socio-economic status and re-entry of girls into school in Edo State. The research design adopted was correlational because it sought to establish the relationship between the independent variable and the dependent variable. A sample size of 306 girls who re-enrolled in institutes…
Estimation of sample size and testing power (part 5).
Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo
2012-02-01
Estimation of sample size and testing power is an important component of research design. This article introduces methods for estimating sample size and testing power for difference tests on quantitative and qualitative data under the single-group design, the paired design, and the crossover design. Specifically, it presents the estimation formulas for these three designs, shows how to implement them both directly from the formulas and with the POWER procedure of SAS software, and illustrates them with examples, which should help researchers implement the repetition principle.
NASA Astrophysics Data System (ADS)
Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander
2016-09-01
In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. 
Given that most previous throughfall studies relied on method-of-moments variogram estimation and sample sizes ≪200, currently available data are prone to large uncertainties.
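For readers unfamiliar with method-of-moments variogram estimation, a minimal sketch of the classical (non-robust, Matheron-type) estimator is given below. The binning scheme and the function name are illustrative assumptions; the robust estimator and the residual maximum likelihood fit used in the study are not shown.

```python
import math

def empirical_variogram(coords, values, bin_width, n_bins):
    """Classical method-of-moments semivariance per distance bin.

    For each bin, gamma = sum((z_i - z_j)^2) / (2 * number_of_pairs).
    Bins without any pair are returned as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])   # separation distance
            b = int(h // bin_width)               # index of the distance bin
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    gamma = [s / (2 * c) if c else None for s, c in zip(sums, counts)]
    return gamma, counts

# Tiny hypothetical transect: three collectors 1 m apart.
gamma, counts = empirical_variogram([(0, 0), (1, 0), (2, 0)], [1.0, 2.0, 4.0], 1.0, 3)
```

The pair counts per bin are returned alongside the semivariances because bins with few pairs give unstable estimates, which is one reason sample size and the sampling of small distances matter.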
A computer program for sample size computations for banding studies
Wilson, K.R.; Nichols, J.D.; Hines, J.E.
1989-01-01
Sample sizes necessary for estimating survival rates of banded birds, adults and young, are derived based on specified levels of precision. The banding study can be new or ongoing. The desired coefficient of variation (CV) for annual survival estimates, the CV for mean annual survival estimates, and the length of the study must be specified to compute sample sizes. A computer program is available for computation of the sample sizes, and a description of the input and output is provided.
NASA Astrophysics Data System (ADS)
Soucemarianadin, Laure; Barré, Pierre; Baudin, François; Chenu, Claire; Houot, Sabine; Kätterer, Thomas; Macdonald, Andy; van Oort, Folkert; Plante, Alain F.; Cécillon, Lauric
2017-04-01
The organic carbon reservoir of soils is a key component of climate change, calling for an accurate knowledge of the residence time of soil organic carbon (SOC). Existing proxies of the size of SOC labile pool such as SOC fractionation or respiration tests are time consuming and unable to consistently predict SOC mineralization over years to decades. Similarly, models of SOC dynamics often yield unrealistic values of the size of SOC kinetic pools. Thermal analysis of bulk soil samples has recently been shown to provide useful and cost-effective information regarding the long-term in-situ decomposition of SOC. Barré et al. (2016) analyzed soil samples from long-term bare fallow sites in northwestern Europe using Rock-Eval 6 pyrolysis (RE6), and demonstrated that persistent SOC is thermally more stable and has less hydrogen-rich compounds (low RE6 HI parameter) than labile SOC. The objective of this study was to predict SOC loss over a 20-year period (i.e. the size of the SOC pool with a residence time lower than 20 years) using RE6 indicators. Thirty-six archive soil samples coming from 4 long-term bare fallow chronosequences (Grignon, France; Rothamsted, Great Britain; Ultuna, Sweden; Versailles, France) were used in this study. For each sample, the value of bi-decadal SOC mineralization was obtained from the observed SOC dynamics of its long-term bare fallow plot (approximated by a spline function). Those values ranged from 0.8 to 14.3 gC·kg-1 (concentration data), representing 8.6 to 50.6% of total SOC (proportion data). All samples were analyzed using RE6 and simple linear regression models were used to predict bi-decadal SOC loss (concentration and proportion data) from 4 RE6 parameters: HI, OI, PC/SOC and T50 CO2 oxidation. HI (the amount of hydrogen-rich effluents formed during the pyrolysis phase of RE6; mgCH.g-1SOC) and OI (the CO2 yield during the pyrolysis phase of RE6; mgCO2.g-1SOC) parameters describe SOC bulk chemistry. 
PC/SOC (the amount of organic C evolved during the pyrolysis phase of RE6; % of total SOC) and T50 CO2 oxidation (the temperature at which 50% of the residual organic C was oxidized to CO2 during the RE6 oxidation phase; °C) parameters represent SOC thermal stability. The RE6 HI parameter yielded the best predictions of bi-decadal SOC mineralization, for both concentration (R2 = 0.75) and proportion (R2 = 0.66) data. PC/SOC and T50 CO2 oxidation parameters also yielded significant regression models with R2 = 0.68 and 0.42 for concentration data and R2 = 0.59 and 0.26 for proportion data, respectively. The OI parameter was not a good predictor of bi-decadal SOC loss, with non-significant regression models. The RE6 thermal analysis method can predict in-situ SOC biogeochemical stability. SOC chemical composition, and to a lesser extent SOC thermal stability, are related to its bi-decadal dynamics. RE6 appears to be a more accurate and convenient proxy of the size of the bi-decadal labile SOC pool than other existing methodologies. Future developments include the validation of these RE6 models of bi-decadal SOC loss on soils from contrasting pedoclimatic conditions. Reference: Barré et al., 2016. Biogeochemistry 130, 1-12
NASA Astrophysics Data System (ADS)
Yamada, K.; Suzuki, H.; Kitahata, H.; Matsushita, Y.; Nozawa, K.; Komori, F.; Yu, R. S.; Kobayashi, Y.; Ohdaira, T.; Oshima, N.; Suzuki, R.; Takagiwa, Y.; Kimura, K.; Kanazawa, I.
2018-01-01
The size of structural vacancies and the structural vacancy density of 1/1-Al-Re-Si approximant crystals with different Re compositions were evaluated by positron annihilation lifetime and Doppler broadening measurements. Incident positrons were found to be trapped at the monovacancy-size open space surrounded by Al atoms. From a previous analysis using the maximum entropy method and the Rietveld method, such an open space is shown to correspond to the centre of the Al icosahedral clusters, which are located at the vertex and body centre. The structural vacancy density of non-metallic Al73Re17Si10 was larger than that of metallic Al73Re15Si12. The observed difference in the structural vacancy density reflects the difference in bonding nature and may explain the difference in the physical properties of the two samples.
NASA Astrophysics Data System (ADS)
Lusiana, Evellin Dewi
2017-12-01
The parameters of the binary probit regression model are commonly estimated by maximum likelihood estimation (MLE). However, MLE has a limitation when the binary data contain separation, the condition in which one or several independent variables exactly group the categories of the binary response. Separation causes the MLE estimators to fail to converge, so they cannot be used in modeling. One way to resolve separation is to use Firth's approach instead. This research has two aims: first, to compare the chance of separation occurring in the binary probit regression model under MLE and under Firth's approach; second, to compare the performance of the binary probit regression estimators obtained by MLE and by Firth's approach using the RMSE criterion. Both are carried out by simulation under different sample sizes. The results showed that, for small sample sizes, the chance of separation is higher with MLE than with Firth's approach. For larger sample sizes, the probability decreased and was nearly identical for the two methods. Meanwhile, Firth's estimators have smaller RMSE than the MLEs, especially for smaller sample sizes, whereas for larger sample sizes the RMSEs do not differ much. This indicates that Firth's estimators outperformed the MLE estimators.
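To make the notion of separation concrete, the sketch below classifies a data set with a single continuous predictor and a 0/1 response. It is a simplified illustration under that single-predictor assumption, not the detection procedure of the study; with several predictors, detecting separation generally requires checking whether any linear combination separates the classes. The function name is hypothetical.

```python
def separation_status(x, y):
    """Classify separation for one predictor x and a binary response y (0/1).

    Returns "complete" when a threshold splits the two classes with no
    overlap, "quasi-complete" when the classes only touch at the threshold,
    and "none" otherwise.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both response categories must be present")
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"
    if max(x0) <= min(x1) or max(x1) <= min(x0):
        return "quasi-complete"
    return "none"

status = separation_status([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
```

In a simulation like the one described, a check of this kind could be run on each generated data set to count how often separation occurs at each sample size.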
Sample Size and Item Parameter Estimation Precision When Utilizing the One-Parameter "Rasch" Model
ERIC Educational Resources Information Center
Custer, Michael
2015-01-01
This study examines the relationship between sample size and item parameter estimation precision when utilizing the one-parameter model. Item parameter estimates are examined relative to "true" values by evaluating the decline in root mean squared deviation (RMSD) and the number of outliers as sample size increases. This occurs across…
Comparison of Sample Size by Bootstrap and by Formulas Based on Normal Distribution Assumption.
Wang, Zuozhen
2018-01-01
The bootstrap technique is distribution-independent and provides an indirect way to estimate the sample size for a clinical trial from a relatively small sample. In this paper, sample size estimation by a bootstrap procedure for comparing two parallel-design arms with continuous data is presented for various test types (inequality, non-inferiority, superiority, and equivalence). Sample size calculation by mathematical formulas (under the normal distribution assumption) is also carried out for the identical data. The power difference between the two calculation methods is acceptably small for all the test types, which shows that the bootstrap procedure is a credible technique for sample size estimation. We then compared the powers determined using the two methods based on data that violate the normal distribution assumption. To accommodate this feature of the data, the nonparametric Wilcoxon test was applied to compare the two groups during bootstrap power estimation. As a result, the power estimated by the normal distribution-based formula is far larger than that estimated by bootstrap for each specific sample size per group. Hence, for this type of data, it is preferable to apply the bootstrap method for sample size calculation at the outset and to apply to each bootstrap sample the same statistical method that will be used in the subsequent analysis, provided that historical real data are available that are well representative of the population to which the proposed trial is planning to extrapolate.
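The bootstrap procedure for an inequality test can be sketched along the following lines: resample each pilot arm with replacement at a candidate per-group size, apply the planned test to every bootstrap sample, and take the rejection fraction as the estimated power. This is a minimal sketch, not the paper's implementation. The default test is a large-sample z-test chosen only to keep the example dependency-free; the `test` argument is where the test planned for the final analysis (for example a Wilcoxon test) would be passed in. All names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance

def z_test_rejects(a, b, alpha=0.05):
    """Two-sided large-sample z-test for a difference in means (assumed default)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return mean(a) != mean(b)
    z = (mean(a) - mean(b)) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def bootstrap_power(pilot_a, pilot_b, n_per_group, test=z_test_rejects,
                    n_boot=2000, seed=0):
    """Fraction of bootstrap trials of size n_per_group in which `test` rejects."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        a = rng.choices(pilot_a, k=n_per_group)   # resample with replacement
        b = rng.choices(pilot_b, k=n_per_group)
        hits += bool(test(a, b))
    return hits / n_boot

def smallest_sample_size(pilot_a, pilot_b, target=0.80,
                         candidates=range(10, 201, 10), **kwargs):
    """First candidate per-group size whose bootstrap power reaches the target."""
    for n in candidates:
        if bootstrap_power(pilot_a, pilot_b, n, **kwargs) >= target:
            return n
    return None
```

Passing the analysis test as a function keeps the power estimate tied to the method that will actually be used, which is the point the abstract stresses for non-normal data.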
NASA Astrophysics Data System (ADS)
Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander
2016-04-01
In the last three decades, an increasing number of studies analyzed spatial patterns in throughfall to investigate the consequences of rainfall redistribution for biogeochemical and hydrological processes in forests. In the majority of cases, variograms were used to characterize the spatial properties of the throughfall data. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and an appropriate layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation methods on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with heavy outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling), and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. 
Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the numbers recommended by studies dealing with Gaussian data by up to 100 %. Given that most previous throughfall studies relied on method-of-moments variogram estimation and sample sizes << 200, our current knowledge about throughfall spatial variability stands on shaky ground.
NASA Astrophysics Data System (ADS)
Soucemarianadin, Laure; Cécillon, Lauric; Chenu, Claire; Baudin, François; Nicolas, Manuel; Savignac, Florence; Barré, Pierre
2017-04-01
Soil organic matter (SOM) is the biggest terrestrial carbon reservoir, storing 3 to 4 times more carbon than the atmosphere. However, despite its major importance for climate regulation, SOM dynamics remain insufficiently understood. For instance, there is still no widely accepted method to assess SOM lability. Soil respiration tests and particulate organic matter (POM) obtained by different fractionation schemes have been used for decades and are now considered as classical estimates of very labile and labile soil organic carbon (SOC), respectively. But the pertinence of these methods to characterize SOM turnover can be questioned. Moreover, they are very time-consuming and their reproducibility might be an issue. Alternative ways of determining the labile SOC component are thus much needed. Thermal analyses have been used to characterize SOM, among which Rock-Eval 6 (RE6) analysis of soil has shown promising results in the determination of SOM biogeochemical stability (Gregorich et al., 2015; Barré et al., 2016). Using a large set of samples of French forest soils representing contrasting pedoclimatic conditions, including deep samples (up to 1 m depth), we compared different techniques used for SOM lability assessment. We explored whether results from the soil respiration test (10-week laboratory incubations), SOM size-density fractionation and RE6 thermal analysis were comparable and how they were correlated. Sets of 222 (respiration test and RE6), 103 (SOM fractionation and RE6) and 93 (respiration test, SOM fractionation and RE6) forest soil samples, respectively, were analyzed and compared. The comparison of the three methods (n = 93) using a principal component analysis separated samples from the surface (0-10 cm) and deep (40-80 cm) layers, highlighting a clear effect of depth on the short-term persistence of SOC. 
A correlation analysis demonstrated that, for these samples, the two classical methods of labile SOC determination (respiration and SOM fractionation) were only weakly positively correlated (Spearman's ρ = 0.26, n = 93). Similarly, soil respiration had only a weak negative correlation (Spearman's ρ = -0.24, n = 93; ρ = -0.33, n = 222) with the RE6 parameter T50 CH pyrolysis. This parameter, previously used as an indicator of labile SOC (Gregorich et al., 2015), represents the temperature at which 50% of the OM was pyrolyzed to effluents (mainly hydrocarbons) during the pyrolysis phase of RE6. Conversely, POC content (% of total SOC) showed a higher negative correlation with T50 CH pyrolysis (ρ = -0.66, n = 93; ρ = -0.65, n = 103) and was positively and negatively correlated to the hydrogen index, HI (mg HC/g TOC; ρ = 0.56/0.53) and the oxygen index, OI (mg CO2/g TOC; ρ = -0.63/-0.62) respectively. Our results showed that RE6 results are consistent with respiration and fractionation results: SOC with higher respiration rate and higher POC content burns at a lower temperature. RE6 thermal analysis could therefore be viewed as a useful fast and cost effective alternative to more time-consuming methods used in SOM fractions determination. Barré, P. et al. Biogeochemistry 2016, 1-12, 130. Gregorich, E.G. et al. Soil Biol. Biochem. 2015, 182-191, 91.
ERIC Educational Resources Information Center
Sahin, Alper; Weiss, David J.
2015-01-01
This study aimed to investigate the effects of calibration sample size and item bank size on examinee ability estimation in computerized adaptive testing (CAT). For this purpose, a 500-item bank pre-calibrated using the three-parameter logistic model with 10,000 examinees was simulated. Calibration samples of varying sizes (150, 250, 350, 500,…
Simulation analyses of space use: Home range estimates, variability, and sample size
Bekoff, Marc; Mech, L. David
1984-01-01
Simulations of space use by animals were run to determine the relationship among home range area estimates, variability, and sample size (number of locations). As sample size increased, home range size increased asymptotically, whereas variability decreased among mean home range area estimates generated by multiple simulations for the same sample size. Our results suggest that field workers should ascertain between 100 and 200 locations in order to estimate reliably home range area. In some cases, this suggested guideline is higher than values found in the few published studies in which the relationship between home range area and number of locations is addressed. Sampling differences for small species occupying relatively small home ranges indicate that fewer locations may be sufficient to allow for a reliable estimate of home range. Intraspecific variability in social status (group member, loner, resident, transient), age, sex, reproductive condition, and food resources also have to be considered, as do season, habitat, and differences in sampling and analytical methods. Comparative data still are needed.
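A small simulation in the spirit of the one described above can show how an area estimate grows and its variability shrinks as the number of locations increases. The abstract does not say which home range estimator or movement model was used, so the minimum convex polygon and the bivariate normal locations below are assumptions made purely for illustration, as are the function names.

```python
import random
from statistics import mean, stdev

def convex_hull(points):
    """Convex hull by Andrew's monotone chain, vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula for the area of a simple polygon."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def simulate_home_range(n_locations, n_runs=200, seed=0):
    """Mean hull area and its coefficient of variation over repeated simulations."""
    rng = random.Random(seed)
    areas = []
    for _ in range(n_runs):
        pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_locations)]
        areas.append(polygon_area(convex_hull(pts)))
    return mean(areas), stdev(areas) / mean(areas)

results = {n: simulate_home_range(n) for n in (20, 50, 100, 200)}
```

Comparing the entries of `results` across sample sizes gives the kind of area-versus-locations curve from which a guideline such as 100 to 200 locations could be read off.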
Got Power? A Systematic Review of Sample Size Adequacy in Health Professions Education Research
ERIC Educational Resources Information Center
Cook, David A.; Hatala, Rose
2015-01-01
Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011,…
Improving the analysis of composite endpoints in rare disease trials.
McMenamin, Martina; Berglind, Anna; Wason, James M S
2018-05-22
Composite endpoints are recommended in rare diseases to increase power and/or to sufficiently capture complexity. Often, they are in the form of responder indices which contain a mixture of continuous and binary components. Analyses of these outcomes typically treat them as binary, thus only using the dichotomisations of continuous components. The augmented binary method offers a more efficient alternative and is therefore especially useful for rare diseases. Previous work has indicated the method may have poorer statistical properties when the sample size is small. Here we investigate small sample properties and implement small sample corrections. We re-sample from a previous trial with sample sizes varying from 30 to 80. We apply the standard binary and augmented binary methods and determine the power, type I error rate, coverage and average confidence interval width for each of the estimators. We implement Firth's adjustment for the binary component models and a small sample variance correction for the generalized estimating equations, applying the small sample adjusted methods to each sub-sample as before for comparison. For the log-odds treatment effect the power of the augmented binary method is 20-55% compared to 12-20% for the standard binary method. Both methods have approximately nominal type I error rates. The difference in response probabilities exhibits similar power, but both unadjusted methods demonstrate type I error rates of 6-8%. The small sample corrected methods have approximately nominal type I error rates. On both scales, the reduction in average confidence interval width when using the adjusted augmented binary method is 17-18%. This is equivalent to requiring a 32% smaller sample size to achieve the same statistical power. The augmented binary method with small sample corrections provides a substantial improvement for rare disease trials using composite endpoints. 
We recommend the use of the method for the primary analysis in relevant rare disease trials. We emphasise that the method should be used alongside other efforts in improving the quality of evidence generated from rare disease trials rather than replace them.
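The link between the 17-18% reduction in confidence interval width and the 32% smaller sample size can be checked with a short calculation. It assumes, as is standard for asymptotically normal estimators, that interval width scales with the inverse square root of the sample size; the symbols w and n are introduced here only for illustration.

```latex
\[
w \propto \frac{1}{\sqrt{n}}
\quad\Longrightarrow\quad
\frac{n_{\text{augmented}}}{n_{\text{binary}}}
  = \left(\frac{w_{\text{augmented}}}{w_{\text{binary}}}\right)^{2}
\]
\[
(1 - 0.17)^{2} \approx 0.69,
\qquad
(1 - 0.18)^{2} \approx 0.67
\]
```

Under that assumption a 17-18% narrower interval corresponds to needing roughly 31-33% fewer patients for the same precision, which is consistent with the 32% figure quoted in the abstract.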
Estimation After a Group Sequential Trial.
Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert
2015-10-01
Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even unbiased, linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in a finite set {n1, n2, …, nL}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. 
We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.
Effective pore size and radius of capture for K(+) ions in K-channels.
Moldenhauer, Hans; Díaz-Franulic, Ignacio; González-Nilo, Fernando; Naranjo, David
2016-02-02
Reconciling protein functional data with crystal structure is arduous because rare conformations or crystallization artifacts occur. Here we present a tool to validate the dimensions of open pore structures of potassium-selective ion channels. We used freely available algorithms to calculate the molecular contour of the pore to determine the effective internal pore radius (r(E)) in several K-channel crystal structures. r(E) was operationally defined as the radius of the biggest sphere able to enter the pore from the cytosolic side. We obtained consistent r(E) estimates for MthK and Kv1.2/2.1 structures, with r(E) = 5.3-5.9 Å and r(E) = 4.5-5.2 Å, respectively. We compared these structural estimates with functional assessments of the internal mouth radii of capture (r(C)) for two electrophysiological counterparts, the large conductance calcium activated K-channel (r(C) = 2.2 Å) and the Shaker Kv-channel (r(C) = 0.8 Å), for MthK and Kv1.2/2.1 structures, respectively. Calculating the difference between r(E) and r(C), produced consistent size radii of 3.1-3.7 Å and 3.6-4.4 Å for hydrated K(+) ions. These hydrated K(+) estimates harmonize with others obtained with diverse experimental and theoretical methods. Thus, these findings validate MthK and the Kv1.2/2.1 structures as templates for open BK and Kv-channels, respectively.
Using known map category marginal frequencies to improve estimates of thematic map accuracy
NASA Technical Reports Server (NTRS)
Card, D. H.
1982-01-01
By means of two simple sampling plans suggested in the accuracy-assessment literature, it is shown how one can use knowledge of map-category relative sizes to improve estimates of various probabilities. The fact that maximum likelihood estimates of cell probabilities for the simple random sampling and map category-stratified sampling were identical has permitted a unified treatment of the contingency-table analysis. A rigorous analysis of the effect of sampling independently within map categories is made possible by results for the stratified case. It is noted that such matters as optimal sample size selection for the achievement of a desired level of precision in various estimators are irrelevant, since the estimators derived are valid irrespective of how sample sizes are chosen.
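One common way to use known map-category proportions in accuracy assessment can be sketched as follows: each cell probability is estimated as the known proportion of the map category multiplied by the sampled fraction of that category falling in the cell. The abstract does not give the formulas, so this estimator, the row and column convention, and the function names are assumptions for illustration only.

```python
def cell_probabilities(counts, map_proportions):
    """Estimate cell probabilities from a confusion matrix of sample counts.

    counts[i][j]: sampled units mapped as category i whose true category is j.
    map_proportions[i]: known proportion of the map area in category i.
    """
    probs = []
    for row, pi in zip(counts, map_proportions):
        n_row = sum(row)
        if n_row == 0:
            raise ValueError("every map category needs at least one sample")
        probs.append([pi * c / n_row for c in row])
    return probs

def overall_accuracy(probs):
    """Estimated proportion of the map that is correctly classified."""
    return sum(probs[i][i] for i in range(len(probs)))

# Hypothetical two-category example with equal samples per map category.
counts = [[9, 1], [2, 8]]
probs = cell_probabilities(counts, [0.8, 0.2])
weighted = overall_accuracy(probs)
naive = (9 + 8) / 20
```

In this made-up example the weighted estimate differs from the naive proportion of correct samples because the two map categories cover very different shares of the map.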
A cautionary note on Bayesian estimation of population size by removal sampling with diffuse priors.
Bord, Séverine; Bioche, Christèle; Druilhet, Pierre
2018-05-01
We consider the problem of estimating a population size by removal sampling when the sampling rate is unknown. Bayesian methods are now widespread and allow prior knowledge to be included in the analysis. However, we show that Bayes estimates based on default improper priors lead to improper posteriors or infinite estimates. Similarly, weakly informative priors give unstable estimators that are sensitive to the choice of hyperparameters. By examining the likelihood, we show that population size estimates can be stabilized by penalizing small values of the sampling rate or large values of the population size. Based on theoretical results and simulation studies, we propose some recommendations on the choice of the prior. Then, we applied our results to real datasets. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Angly, Florent E; Willner, Dana; Prieto-Davó, Alejandra; Edwards, Robert A; Schmieder, Robert; Vega-Thurber, Rebecca; Antonopoulos, Dionysios A; Barott, Katie; Cottrell, Matthew T; Desnues, Christelle; Dinsdale, Elizabeth A; Furlan, Mike; Haynes, Matthew; Henn, Matthew R; Hu, Yongfei; Kirchman, David L; McDole, Tracey; McPherson, John D; Meyer, Folker; Miller, R Michael; Mundt, Egbert; Naviaux, Robert K; Rodriguez-Mueller, Beltran; Stevens, Rick; Wegley, Linda; Zhang, Lixin; Zhu, Baoli; Rohwer, Forest
2009-12-01
Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.
NASA Technical Reports Server (NTRS)
Parada, N. D. J. (Principal Investigator); Moreira, M. A.; Chen, S. C.; Batista, G. T.
1984-01-01
A procedure to estimate wheat (Triticum aestivum L) area using a sampling technique based on aerial photographs and digital LANDSAT MSS data is developed. Aerial photographs covering 720 square km are visually analyzed. To estimate wheat area, a regression approach is applied using different sample sizes and various sampling units. As the size of the sampling unit decreased, the percentage of sampled area required to obtain similar estimation performance also decreased. The lowest percentage of the area sampled for wheat estimation with relatively high precision and accuracy through regression estimation is 13.90%, using 10 square km as the sampling unit. Wheat area estimation using only aerial photographs is less precise and accurate than that obtained by regression estimation.
Zhu, Hong; Xu, Xiaohan; Ahn, Chul
2017-01-01
Paired experimental design is widely used in clinical and health behavioral studies, where each study unit contributes a pair of observations. Investigators often encounter incomplete observations of paired outcomes in the data collected. Some study units contribute complete pairs of observations, while the others contribute either pre- or post-intervention observations. Statistical inference for paired experimental design with incomplete observations of continuous outcomes has been extensively studied in the literature. However, sample size methods for such study designs are sparse. We derive a closed-form sample size formula based on the generalized estimating equation approach by treating the incomplete observations as missing data in a linear model. The proposed method properly accounts for the impact of the mixed structure of the observed data: a combination of paired and unpaired outcomes. The sample size formula is flexible enough to accommodate different missing patterns, magnitudes of missingness, and correlation parameter values. We demonstrate that under complete observations, the proposed generalized estimating equation sample size estimate is the same as that based on the paired t-test. In the presence of missing data, the proposed method leads to a more accurate sample size estimate compared with the crude adjustment. Simulation studies are conducted to evaluate the finite-sample performance of the generalized estimating equation sample size formula. A real application example is presented for illustration.
NASA Technical Reports Server (NTRS)
Hixson, M. M.; Bauer, M. E.; Davis, B. J.
1979-01-01
The effect of sampling on the accuracy (precision and bias) of crop area estimates made from classifications of LANDSAT MSS data was investigated. Full-frame classifications of wheat and non-wheat for eighty counties in Kansas were repetitively sampled to simulate alternative sampling plans. Four sampling schemes involving different numbers of samples and different size sampling units were evaluated. The precision of the wheat area estimates increased as the segment size decreased and the number of segments was increased. Although the average bias associated with the various sampling schemes was not significantly different, the maximum absolute bias was directly related to sampling unit size.
HIV prevention trial design in an era of effective pre-exposure prophylaxis.
Cutrell, Amy; Donnell, Deborah; Dunn, David T; Glidden, David V; Grobler, Anneke; Hanscom, Brett; Stancil, Britt S; Meyer, R Daniel; Wang, Ronnie; Cuffe, Robert L
2017-01-01
Pre-exposure prophylaxis (PrEP) has demonstrated remarkable effectiveness protecting at-risk individuals from HIV-1 infection. Despite this record of effectiveness, concerns persist about the diminished protective effect observed in women compared with men and the influence of adherence and risk behaviors on effectiveness in targeted subpopulations. Furthermore, the high prophylactic efficacy of the first PrEP agent, tenofovir disoproxil fumarate/emtricitabine (TDF/FTC), presents challenges for demonstrating the efficacy of new candidates. Trials of new agents would typically require use of non-inferiority (NI) designs in which acceptable efficacy for an experimental agent is determined using pre-defined margins based on the efficacy of the proven active comparator (i.e. TDF/FTC) in placebo-controlled trials. Setting NI margins is a critical step in designing registrational studies. Under- or over-estimation of the margin can call into question the utility of the study in the registration package. The dependence on previous placebo-controlled trials introduces the same issues as external/historical controls. These issues will need to be addressed using trial design features such as re-estimated NI margins, enrichment strategies, run-in periods, crossover between study arms, and adaptive re-estimation of sample sizes. These measures and other innovations can help to ensure that new PrEP agents are made available to the public using stringent standards of evidence.
Trattner, Sigal; Cheng, Bin; Pieniazek, Radoslaw L.; Hoffmann, Udo; Douglas, Pamela S.; Einstein, Andrew J.
2014-01-01
Purpose: Effective dose (ED) is a widely used metric for comparing ionizing radiation burden between different imaging modalities, scanners, and scan protocols. In computed tomography (CT), ED can be estimated by performing scans on an anthropomorphic phantom in which metal-oxide-semiconductor field-effect transistor (MOSFET) solid-state dosimeters have been placed to enable organ dose measurements. Here a statistical framework is established to determine the sample size (number of scans) needed for estimating ED to a desired precision and confidence, for a particular scanner and scan protocol, subject to practical limitations. Methods: The statistical scheme involves solving equations which minimize the sample size required for estimating ED to desired precision and confidence. It is subject to a constrained variation of the estimated ED and solved using the Lagrange multiplier method. The scheme incorporates measurement variation introduced both by MOSFET calibration, and by variation in MOSFET readings between repeated CT scans. Sample size requirements are illustrated on cardiac, chest, and abdomen–pelvis CT scans performed on a 320-row scanner and chest CT performed on a 16-row scanner. Results: Sample sizes for estimating ED vary considerably between scanners and protocols. Sample size increases as the required precision or confidence is higher and also as the anticipated ED is lower. For example, for a helical chest protocol, for 95% confidence and 5% precision for the ED, 30 measurements are required on the 320-row scanner and 11 on the 16-row scanner when the anticipated ED is 4 mSv; these sample sizes are 5 and 2, respectively, when the anticipated ED is 10 mSv. Conclusions: Applying the suggested scheme, it was found that even at modest sample sizes, it is feasible to estimate ED with high precision and a high degree of confidence. 
As CT technology develops enabling ED to be lowered, more MOSFET measurements are needed to estimate ED with the same precision and confidence. PMID:24694150
Bogart, Justin A.; Cole, Bren E.; Boreen, Michael A.; Lippincott, Connor A.; Manor, Brian C.; Carroll, Patrick J.; Schelter, Eric J.
2016-01-01
Rare earth (RE) metals are critical components of electronic materials and permanent magnets. Recycling of consumer materials is a promising new source of rare REs. To incentivize recycling, there is a clear need for the development of simple methods for targeted separations of mixtures of RE metal salts. Metal complexes of a tripodal hydroxylaminato ligand, TriNOx3–, featured a size-sensitive aperture formed of its three η2-(N,O) ligand arms. Exposure of cations in the aperture induced a self-associative equilibrium comprising RE(TriNOx)THF and [RE(TriNOx)]2 species. Differences in the equilibrium constants Kdimer for early and late metals enabled simple separations through leaching. Separations were performed on RE1/RE2 mixtures, where RE1 = La–Sm and RE2 = Gd–Lu, with emphasis on Eu/Y separations for potential applications in the recycling of phosphor waste from compact fluorescent light bulbs. Using the leaching method, separations factors approaching 2,000 were obtained for early–late RE combinations. Following solvent optimization, >95% pure samples of Eu were obtained with a 67% recovery for the technologically relevant Eu/Y separation. PMID:27956636
Bogart, Justin A; Cole, Bren E; Boreen, Michael A; Lippincott, Connor A; Manor, Brian C; Carroll, Patrick J; Schelter, Eric J
2016-12-27
Rare earth (RE) metals are critical components of electronic materials and permanent magnets. Recycling of consumer materials is a promising new source of rare REs. To incentivize recycling, there is a clear need for the development of simple methods for targeted separations of mixtures of RE metal salts. Metal complexes of a tripodal hydroxylaminato ligand, TriNOx3–, featured a size-sensitive aperture formed of its three η2-(N,O) ligand arms. Exposure of cations in the aperture induced a self-associative equilibrium comprising RE(TriNOx)THF and [RE(TriNOx)]2 species. Differences in the equilibrium constants Kdimer for early and late metals enabled simple separations through leaching. Separations were performed on RE1/RE2 mixtures, where RE1 = La-Sm and RE2 = Gd-Lu, with emphasis on Eu/Y separations for potential applications in the recycling of phosphor waste from compact fluorescent light bulbs. Using the leaching method, separations factors approaching 2,000 were obtained for early-late RE combinations. Following solvent optimization, >95% pure samples of Eu were obtained with a 67% recovery for the technologically relevant Eu/Y separation.
Effect of retrogression duration on the grain boundary microstructure and microchemistry of AA7010
NASA Astrophysics Data System (ADS)
Nandana, M. S.; Bhat, K. Udaya; Manjunatha, C. M.
2018-04-01
The paper presents the microstructural characterization of the aluminium alloy 7010 in the retrogression and re-ageing (RRA) condition using a transmission electron microscope (TEM). The grain boundary microstructure is analyzed with a focus on the variation of grain boundary precipitate (GBP) size and precipitate free zone (PFZ) size during retrogression performed at 200 °C for durations of 10-60 min. The microchemistry of the GBPs is analyzed using TEM-EDS (energy dispersive X-ray spectroscopy). The results reveal coarsening of the discrete GBPs along with Cu enrichment in them. The average size of the GBPs in the RRA-treated samples varies from 30 nm after 10 min of retrogression to 59 nm after 60 min of retrogression. The PFZ size varied from 35 nm to 51 nm for 10 min and 60 min of retrogression time, respectively. The Cu content of the GBPs increased from 3.54 wt% for 10 min of retrogression to 5.27 wt% for 60 min of retrogression in the re-aged sample.
Estimating numbers of females with cubs-of-the-year in the Yellowstone grizzly bear population
Keating, K.A.; Schwartz, C.C.; Haroldson, M.A.; Moody, D.
2001-01-01
For grizzly bears (Ursus arctos horribilis) in the Greater Yellowstone Ecosystem (GYE), minimum population size and allowable numbers of human-caused mortalities have been calculated as a function of the number of unique females with cubs-of-the-year (FCUB) seen during a 3-year period. This approach underestimates the total number of FCUB, thereby biasing estimates of population size and sustainable mortality. Also, it does not permit calculation of valid confidence bounds. Many statistical methods can resolve or mitigate these problems, but there is no universal best method. Instead, relative performances of different methods can vary with population size, sample size, and degree of heterogeneity among sighting probabilities for individual animals. We compared 7 nonparametric estimators, using Monte Carlo techniques to assess performances over the range of sampling conditions deemed plausible for the Yellowstone population. Our goal was to estimate the number of FCUB present in the population each year. Our evaluation differed from previous comparisons of such estimators by including sample coverage methods and by treating individual sightings, rather than sample periods, as the sample unit. Consequently, our conclusions also differ from earlier studies. Recommendations regarding estimators and necessary sample sizes are presented, together with estimates of annual numbers of FCUB in the Yellowstone population with bootstrap confidence bounds.
Olives, Casey; Valadez, Joseph J; Pagano, Marcello
2014-03-01
To assess the bias incurred when curtailment of Lot Quality Assurance Sampling (LQAS) is ignored, to present unbiased estimators, to consider the impact of cluster sampling by simulation and to apply our method to published polio immunization data from Nigeria. We present estimators of coverage when using two kinds of curtailed LQAS strategies: semicurtailed and curtailed. We study the proposed estimators with independent and clustered data using three field-tested LQAS designs for assessing polio vaccination coverage, with samples of size 60 and decision rules of 9, 21 and 33, and compare them to biased maximum likelihood estimators. Lastly, we present estimates of polio vaccination coverage from previously published data in 20 local government authorities (LGAs) from five Nigerian states. Simulations illustrate substantial bias if one ignores the curtailed sampling design. Proposed estimators show no bias. Clustering does not affect the bias of these estimators. Across simulations, standard errors show signs of inflation as clustering increases. Neither sampling strategy nor LQAS design influences estimates of polio vaccination coverage in 20 Nigerian LGAs. When coverage is low, semicurtailed LQAS strategies considerably reduce the sample size required to make a decision. Curtailed LQAS designs further reduce the sample size when coverage is high. Results presented dispel the misconception that curtailed LQAS data are unsuitable for estimation. These findings augment the utility of LQAS as a tool for monitoring vaccination efforts by demonstrating that unbiased estimation using curtailed designs is not only possible but these designs also reduce the sample size. © 2014 John Wiley & Sons Ltd.
Candel, Math J J M; Van Breukelen, Gerard J P
2010-06-30
Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.
Braaten, P.J.; Fuller, D.B.; Lott, R.D.; Jordan, G.R.
2009-01-01
Juvenile pallid sturgeon Scaphirhynchus albus raised in hatcheries and stocked in the wild are used to augment critically imperiled populations of this federally endangered species in the United States. For pallid sturgeon in recovery priority management area 2 (RPMA 2) of the Missouri River and lower Yellowstone River where natural recruitment has not occurred for decades, restoration programs aim to stock an annual minimum of 9000 juvenile pallid sturgeon for 20 years to re-establish a minimum population of 1700 adults. However, establishment of this target was based on general guidelines for maintaining the genetic integrity of populations rather than pallid sturgeon-specific demographic information because data on the historical population size was lacking. In this study, information from a recent population estimate (158 wild adults in 2004, 95% confidence interval 129-193 adults) and an empirically derived adult mortality rate (5%) was used in a cohort population model to back-estimate the historic abundance of adult pallid sturgeon in RPMA 2. Three back-estimation age models were developed, and assumed that adults alive during 2004 were 30-, 40-, or 50-years old. Based on these age assumptions, population sizes [±95% confidence intervals; (CI)] were back-estimated to 1989, 1979, and 1969 to approximate size of the population when individuals would have been sexually mature (15 years old) and capable of spawning. Back-estimations yielded predictions of 344 adults in 1989 (95% CI 281-420), 577 adults in 1979 (95% CI 471-704), and 968 adults in 1969 (95% CI 790-1182) for the 30-, 40-, and 50-year age models, respectively. Although several assumptions are inherent in the back-estimation models, results suggest the juvenile stocking program for pallid sturgeon will likely re-establish an adult population that equals in the short-term and exceeds in the long-term the predicted population numbers that occurred during past decades in RPMA 2. 
However, re-establishment of a large population in RPMA 2 that exceeds populations present 40+ years ago should be considered conservatively, as this strategy will increase the number of reproductive adults and thereby increase the likelihood for natural recruitment in this recruitment-limited system. © 2009 Blackwell Verlag GmbH.
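The idea of back-estimating abundance from a current count and an annual mortality rate can be illustrated with plain constant-survival compounding. The abstract does not give the details of the authors' cohort model, so this simplified sketch is an assumption and its output need not match the reported figures of 344, 577 and 968 adults. The function name is hypothetical.

```python
def back_estimate(n_now, annual_mortality, years_back):
    """Abundance `years_back` years ago under constant annual survival.

    Hypothetical simplification: no recruitment, and the same survival
    rate applies in every year.
    """
    survival = 1.0 - annual_mortality
    return n_now / survival**years_back


# 158 adults in 2004 and 5% annual mortality, as quoted in the abstract.
for years in (15, 25, 35):
    print(2004 - years, round(back_estimate(158, 0.05, years)))
```

Under these assumptions the back-calculated numbers land in the same range as the published estimates, which suggests the published model differs only in detail.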
Influence of sampling window size and orientation on parafoveal cone packing density
Lombardo, Marco; Serrao, Sebastiano; Ducoli, Pietro; Lombardo, Giuseppe
2013-01-01
We assessed the agreement between sampling windows of different size and orientation on packing density estimates in images of the parafoveal cone mosaic acquired using a flood-illumination adaptive optics retinal camera. Horizontal and vertical oriented sampling windows of different size (320x160 µm, 160x80 µm and 80x40 µm) were selected in two retinal locations along the horizontal meridian in one eye of ten subjects. At each location, cone density tended to decline with decreasing sampling area. Although the differences in cone density estimates were not statistically significant, Bland-Altman plots showed that the agreement between cone density estimated within the different sampling window conditions was moderate. The percentage of the preferred packing arrangements of cones by Voronoi tiles was slightly affected by window size and orientation. The results illustrated the high importance of specifying the size and orientation of the sampling window used to derive cone metric estimates to facilitate comparison of different studies. PMID:24009995
Gore, Mauvis A.; Frey, Peter H.; Ormond, Rupert F.; Allan, Holly; Gilkes, Gabriella
2016-01-01
Following centuries of exploitation, basking sharks (Cetorhinus maximus) are considered by IUCN as Endangered in the Northeast Atlantic, where they have now been substantially protected for over two decades. However, the present size of this population remains unknown. We investigated the use of photo-identification of individuals’ dorsal fins, combined with mark-recapture methodology, to investigate the size of populations of basking shark within the west coast of Scotland. From a total of 921 encounters photographed between 2004 and 2011, 710 sharks were found to be individually identifiable based on dorsal fin damage and natural features. Of these, only 41 individuals were re-sighted, most commonly both within days of, and close to the site of, the initial encounter. A smaller number were re-sighted after longer periods of up to two years. A comparison of the distinguishing features of individuals on first recording and subsequent re-sighting showed that in almost all cases these features remained little changed, suggesting the low re-sighting rate was not due to a loss of distinguishing features. Because of the low number of re-sightings we were not able to produce reliable estimates for the long-term regional population. However, for one 50 km diameter study area between the islands of Mull, Coll and Tiree, we were able to generate closed-population estimates for 6–9 day periods in 2010 of 985 (95% CI = 494–1683), and in 2011 of 201 (95% CI = 143–340). For the same 2011 period an open-population model generated a similar estimate of 213 (95% CI = 111–317). Otherwise the low rate and temporal patterning of re-sightings support the view that such local basking shark populations are temporary, dynamic groupings of individuals drawn from a much larger regional population than previously supposed. The study demonstrated the feasibility and limitations of photo-identification as a non-invasive technique for identifying individual basking sharks. PMID:26930611
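For readers unfamiliar with closed-population mark-recapture, the simplest two-occasion case can be sketched with Chapman's bias-corrected form of the Lincoln-Petersen estimator. The abstract does not say which closed-population model the authors fitted, so this is only an illustration of the general idea, and the counts below are invented.

```python
def chapman_estimate(n_first, n_second, n_resighted):
    """Chapman's estimator of closed-population size from two occasions.

    n_first:     individuals identified on the first occasion
    n_second:    individuals identified on the second occasion
    n_resighted: individuals seen on both occasions
    """
    return (n_first + 1) * (n_second + 1) / (n_resighted + 1) - 1


# Hypothetical counts, not data from the study.
print(chapman_estimate(n_first=50, n_second=40, n_resighted=9))
```

The estimate grows quickly as the number of re-sightings falls, which is why a low re-sighting rate gives wide confidence intervals.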
An assessment of re-randomization methods in bark beetle (Scolytidae) trapping bioassays
Christopher J. Fettig; Christopher P. Dabney; Stephen R. McKelvey; Robert R. Borys
2006-01-01
Numerous studies have explored the role of semiochemicals in the behavior of bark beetles (Scolytidae). Multiple funnel traps are often used to elucidate these behavioral responses. Sufficient sample sizes are obtained by using large numbers of traps to which treatments are randomly assigned once, or by frequent collection of trap catches and subsequent re-...
Ramezani, Habib; Holm, Sören; Allard, Anna; Ståhl, Göran
2010-05-01
Environmental monitoring of landscapes is of increasing interest. To quantify landscape patterns, a number of metrics are used, of which Shannon's diversity, edge length, and density are studied here. As an alternative to complete mapping, point sampling was applied to estimate the metrics for already mapped landscapes selected from the National Inventory of Landscapes in Sweden (NILS). Monte-Carlo simulation was applied to study the performance of different designs. Random and systematic samplings were applied for four sample sizes and five buffer widths. The latter feature was relevant for edge length, since length was estimated through the number of points falling in buffer areas around edges. In addition, two landscape complexities were tested by applying two classification schemes with seven or 20 land cover classes to the NILS data. As expected, the root mean square error (RMSE) of the estimators decreased with increasing sample size. The estimators of both metrics were slightly biased, but the bias of Shannon's diversity estimator was shown to decrease when sample size increased. In the edge length case, an increasing buffer width resulted in larger bias due to the increased impact of boundary conditions; this effect was shown to be independent of sample size. However, we also developed adjusted estimators that eliminate the bias of the edge length estimator. The rates of decrease of RMSE with increasing sample size and buffer width were quantified by a regression model. Finally, indicative cost-accuracy relationships were derived showing that point sampling could be a competitive alternative to complete wall-to-wall mapping.
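Estimating Shannon's diversity from a point sample amounts to replacing the mapped class proportions with the proportions of sampled points falling in each class. The sketch below shows that plug-in calculation only. The sampling designs and bias adjustments studied in the paper are not reproduced, and the class labels are invented.

```python
from collections import Counter
from math import log


def shannon_diversity(point_classes):
    """Plug-in Shannon diversity H = -sum(p_i * ln p_i) from sampled points.

    point_classes: iterable of land cover labels, one per sample point.
    """
    counts = Counter(point_classes)
    n = sum(counts.values())
    return -sum((c / n) * log(c / n) for c in counts.values())


# Hypothetical sample of 10 points over three land cover classes.
sample = ["forest"] * 5 + ["arable"] * 3 + ["water"] * 2
print(round(shannon_diversity(sample), 3))
```

Because the proportions are themselves estimates, this plug-in value is biased for small samples, in line with the slight bias the abstract reports.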
DOE Office of Scientific and Technical Information (OSTI.GOV)
Piepel, Gregory F.; Amidan, Brett G.; Krauter, Paula
2011-05-01
Two concerns were raised by the Government Accountability Office following the 2001 building contaminations via letters containing Bacillus anthracis (BA). These included the: 1) lack of validated sampling methods, and 2) need to use statistical sampling to quantify the confidence of no contamination when all samples have negative results. Critical to addressing these concerns is quantifying the false negative rate (FNR). The FNR may depend on the 1) method of contaminant deposition, 2) surface concentration of the contaminant, 3) surface material being sampled, 4) sample collection method, 5) sample storage/transportation conditions, 6) sample processing method, and 7) sample analytical method. A review of the literature found 17 laboratory studies that focused on swab, wipe, or vacuum samples collected from a variety of surface materials contaminated by BA or a surrogate, and used culture methods to determine the surface contaminant concentration. These studies quantified performance of the sampling and analysis methods in terms of recovery efficiency (RE) and not FNR (which left a major gap in available information). Quantifying the FNR under a variety of conditions is a key aspect of validating sample and analysis methods, and also for calculating the confidence in characterization or clearance decisions based on a statistical sampling plan. A laboratory study was planned to partially fill the gap in FNR results. This report documents the experimental design developed by Pacific Northwest National Laboratory and Sandia National Laboratories (SNL) for a sponge-wipe method. The testing was performed by SNL and is now completed. The study investigated the effects on key response variables from six surface materials contaminated with eight surface concentrations of a BA surrogate (Bacillus atrophaeus). 
The key response variables include measures of the contamination on test coupons of surface materials tested, contamination recovered from coupons by sponge-wipe samples, RE, and FNR. The experimental design involves 16 test runs, performed in two blocks of eight runs. Three surface materials (stainless steel, vinyl tile, and ceramic tile) were tested in the first block, while three other surface materials (plastic, painted wood paneling, and faux leather) were tested in the second block. The eight surface concentrations of the surrogate were randomly assigned to test runs within each block. Some of the concentrations were very low and presented challenges for deposition, sampling, and analysis. However, such tests are needed to investigate RE and FNR over the full range of concentrations of interest. In each run, there were 10 test coupons of each of the three surface materials. A positive control sample was generated at the same time as each test sample. The positive control results will be used to 1) calculate RE values for the wipe sampling and analysis method, and 2) fit RE- and FNR-concentration equations, for each of the six surface materials. Data analyses will support 1) estimating the FNR for each combination of contaminant concentration and surface material, 2) estimating the surface concentrations and their uncertainties of the contaminant for each combination of concentration and surface material, 3) estimating RE (%) and their uncertainties for each combination of contaminant concentration and surface material, 4) fitting FNR-concentration and RE-concentration equations for each of the six surface materials, 5) assessing goodness-of-fit of the equations, and 6) quantifying the uncertainty in FNR and RE predictions made with the fitted equations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Piepel, Gregory F.; Amidan, Brett G.; Krauter, Paula
2010-12-16
Two concerns were raised by the Government Accountability Office following the 2001 building contaminations via letters containing Bacillus anthracis (BA). These included the: 1) lack of validated sampling methods, and 2) need to use statistical sampling to quantify the confidence of no contamination when all samples have negative results. Critical to addressing these concerns is quantifying the probability of correct detection (PCD) (or equivalently the false negative rate FNR = 1 - PCD). The PCD/FNR may depend on the 1) method of contaminant deposition, 2) surface concentration of the contaminant, 3) surface material being sampled, 4) sample collection method, 5) sample storage/transportation conditions, 6) sample processing method, and 7) sample analytical method. A review of the literature found 17 laboratory studies that focused on swab, wipe, or vacuum samples collected from a variety of surface materials contaminated by BA or a surrogate, and used culture methods to determine the surface contaminant concentration. These studies quantified performance of the sampling and analysis methods in terms of recovery efficiency (RE) and not PCD/FNR (which left a major gap in available information). Quantifying the PCD/FNR under a variety of conditions is a key aspect of validating sample and analysis methods, and also for calculating the confidence in characterization or clearance decisions based on a statistical sampling plan. A laboratory study was planned to partially fill the gap in PCD/FNR results. This report documents the experimental design developed by Pacific Northwest National Laboratory and Sandia National Laboratories (SNL) for a sponge-wipe method. The study will investigate the effects on key response variables from six surface materials contaminated with eight surface concentrations of a BA surrogate (Bacillus atrophaeus). 
The key response variables include measures of the contamination on test coupons of surface materials tested, contamination recovered from coupons by sponge-wipe samples, RE, and PCD/FNR. The experimental design involves 16 test runs, to be performed in two blocks of eight runs. Three surface materials (stainless steel, vinyl tile, and ceramic tile) were tested in the first block, while three other surface materials (plastic, painted wood paneling, and faux leather) will be tested in the second block. The eight surface concentrations of the surrogate were randomly assigned to test runs within each block. Some of the concentrations will be very low and may present challenges for deposition, sampling, and analysis. However, such tests are needed to investigate RE and PCD/FNR over the full range of concentrations of interest. In each run, there will be 10 test coupons of each of the three surface materials. A positive control sample will be generated prior to each test sample. The positive control results will be used to 1) calculate RE values for the wipe sampling and analysis method, and 2) fit RE- and PCD-concentration equations, for each of the six surface materials. Data analyses will support 1) estimating the PCD for each combination of contaminant concentration and surface material, 2) estimating the surface concentrations and their uncertainties of the contaminant for each combination of concentration and surface material, 3) estimating RE (%) and their uncertainties for each combination of contaminant concentration and surface material, 4) fitting PCD-concentration and RE-concentration equations for each of the six surface materials, 5) assessing goodness-of-fit of the equations, and 6) quantifying the uncertainty in PCD and RE predictions made with the fitted equations.
Experiments with central-limit properties of spatial samples from locally covariant random fields
Barringer, T.H.; Smith, T.E.
1992-01-01
When spatial samples are statistically dependent, the classical estimator of sample-mean standard deviation is well known to be inconsistent. For locally dependent samples, however, consistent estimators of sample-mean standard deviation can be constructed. The present paper investigates the sampling properties of one such estimator, designated as the tau estimator of sample-mean standard deviation. In particular, the asymptotic normality properties of standardized sample means based on tau estimators are studied in terms of computer experiments with simulated sample-mean distributions. The effects of both sample size and dependency levels among samples are examined for various values of tau (denoting the size of the spatial kernel for the estimator). The results suggest that even for small degrees of spatial dependency, the tau estimator exhibits significantly stronger normality properties than does the classical estimator of standardized sample means. © 1992.
Considerations in Forest Growth Estimation Between Two Measurements of Mapped Forest Inventory Plots
Michael T. Thompson
2006-01-01
Several aspects of the enhanced Forest Inventory and Analysis (FIA) program's national plot design complicate change estimation. The design incorporates up to three separate plot sizes (microplot, subplot, and macroplot) to sample trees of different sizes. Because multiple plot sizes are involved, change estimators designed for polyareal plot sampling, such as those...
Sepúlveda, Nuno; Paulino, Carlos Daniel; Drakeley, Chris
2015-12-30
Several studies have highlighted the use of serological data in detecting a reduction in malaria transmission intensity. These studies have typically used serology as an adjunct measure and no formal examination of sample size calculations for this approach has been conducted. A sample size calculator is proposed for cross-sectional surveys using data simulation from a reverse catalytic model assuming a reduction in seroconversion rate (SCR) at a given change point before sampling. This calculator is based on logistic approximations for the underlying power curves to detect a reduction in SCR in relation to the hypothesis of a stable SCR for the same data. Sample sizes are illustrated for a hypothetical cross-sectional survey from an African population assuming a known or unknown change point. Overall, data simulation demonstrates that power is strongly affected by assuming a known or unknown change point. Small sample sizes are sufficient to detect strong reductions in SCR, but invariantly lead to poor precision of estimates for current SCR. In this situation, sample size is better determined by controlling the precision of SCR estimates. Conversely larger sample sizes are required for detecting more subtle reductions in malaria transmission but those invariantly increase precision whilst reducing putative estimation bias. The proposed sample size calculator, although based on data simulation, shows promise of being easily applicable to a range of populations and survey types. Since the change point is a major source of uncertainty, obtaining or assuming prior information about this parameter might reduce both the sample size and the chance of generating biased SCR estimates.
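The reverse catalytic model behind this calculator can be sketched as an age-seroprevalence curve. Assuming the standard formulation, in which seronegative individuals convert at rate SCR and seropositive individuals revert at a seroreversion rate, the expected proportion seropositive at a given age has a closed form, including when the SCR drops at a change point some years before sampling. The function and parameter names are hypothetical, and the paper's simulation and power calculations are not reproduced.

```python
from math import exp


def seroprev_stable(age, scr, srr):
    """Expected seroprevalence at `age` with constant SCR and seroreversion rate."""
    total = scr + srr
    return scr / total * (1.0 - exp(-total * age))


def seroprev_change_point(age, scr_before, scr_now, srr, change_point):
    """Expected seroprevalence when SCR changed `change_point` years before sampling.

    People younger than the change point have only experienced scr_now.
    Older people spent (age - change_point) years under scr_before first.
    """
    if age <= change_point:
        return seroprev_stable(age, scr_now, srr)
    p_at_change = seroprev_stable(age - change_point, scr_before, srr)
    total = scr_now + srr
    plateau = scr_now / total  # long-run seroprevalence under scr_now
    return plateau + (p_at_change - plateau) * exp(-total * change_point)


# Illustrative rates per year; they are not taken from the paper.
for age in (5, 20, 40):
    print(age, round(seroprev_change_point(age, 0.10, 0.02, 0.01, 10), 3))
```

Only people older than the change point carry information about the earlier SCR, which fits the abstract's point that power depends strongly on what is assumed about the change point.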
Effective pore size and radius of capture for K+ ions in K-channels
Moldenhauer, Hans; Díaz-Franulic, Ignacio; González-Nilo, Fernando; Naranjo, David
2016-01-01
Reconciling protein functional data with crystal structure is arduous because rare conformations or crystallization artifacts occur. Here we present a tool to validate the dimensions of open pore structures of potassium-selective ion channels. We used freely available algorithms to calculate the molecular contour of the pore to determine the effective internal pore radius (rE) in several K-channel crystal structures. rE was operationally defined as the radius of the biggest sphere able to enter the pore from the cytosolic side. We obtained consistent rE estimates for MthK and Kv1.2/2.1 structures, with rE = 5.3–5.9 Å and rE = 4.5–5.2 Å, respectively. We compared these structural estimates with functional assessments of the internal mouth radii of capture (rC) for two electrophysiological counterparts, the large conductance calcium activated K-channel (rC = 2.2 Å) and the Shaker Kv-channel (rC = 0.8 Å), for MthK and Kv1.2/2.1 structures, respectively. Calculating the difference between rE and rC produced consistent radii of 3.1–3.7 Å and 3.6–4.4 Å for hydrated K+ ions. These hydrated K+ estimates harmonize with others obtained with diverse experimental and theoretical methods. Thus, these findings validate MthK and the Kv1.2/2.1 structures as templates for open BK and Kv-channels, respectively. PMID:26831782
Lewis, G.J.; Panizzon, M.S.; Eyler, L.; Fennema-Notestine, C.; Chen, C.-H.; Neale, M.C.; Jernigan, T.L.; Lyons, M.J.; Dale, A.M.; Kremen, W.S.; Franz, C.E.
2015-01-01
While many studies have reported that individual differences in personality traits are genetically influenced, the neurobiological bases mediating these influences have not yet been well characterized. To advance understanding concerning the pathway from genetic variation to personality, here we examined whether measures of heritable variation in neuroanatomical size in candidate regions (amygdala and medial orbitofrontal cortex) were associated with heritable effects on personality. A sample of 486 middle-aged (mean = 55 years) male twins (complete MZ pairs = 120; complete DZ pairs = 84) underwent structural brain scans and also completed measures of two core domains of personality: positive and negative emotionality. After adjusting for estimated intracranial volume, significant phenotypic (rp) and genetic (rg) correlations were observed between left amygdala volume and positive emotionality (rp = .16, p < .01; rg = .23, p < .05, respectively). In addition, after adjusting for mean cortical thickness, genetic and nonshared-environmental correlations (re) between left medial orbitofrontal cortex thickness and negative emotionality were also observed (rg = .34, p < .01; re = −.19, p < .05, respectively). These findings support a model positing that heritable bases of personality are, at least in part, mediated through individual differences in the size of brain structures, although further work is still required to confirm this causal interpretation. PMID:25263286
Future Software Sizing Metrics and Estimation Challenges
2011-07-01
systems 4. Ultrahigh software system assurance 5. Legacy maintenance and Brownfield development 6. Agile and Lean/ Kanban development. This paper...refined as the design of the maintenance modifications or Brownfield re-engineering is determined. VII. 6. AGILE AND LEAN/ KANBAN DEVELOPMENT The...difficulties of software maintenance estimation can often be mitigated by using lean workflow management techniques such as Kanban [25]. In Kanban
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trattner, Sigal; Cheng, Bin; Pieniazek, Radoslaw L.
2014-04-15
Purpose: Effective dose (ED) is a widely used metric for comparing ionizing radiation burden between different imaging modalities, scanners, and scan protocols. In computed tomography (CT), ED can be estimated by performing scans on an anthropomorphic phantom in which metal-oxide-semiconductor field-effect transistor (MOSFET) solid-state dosimeters have been placed to enable organ dose measurements. Here a statistical framework is established to determine the sample size (number of scans) needed for estimating ED to a desired precision and confidence, for a particular scanner and scan protocol, subject to practical limitations. Methods: The statistical scheme involves solving equations which minimize the sample size required for estimating ED to desired precision and confidence. It is subject to a constrained variation of the estimated ED and solved using the Lagrange multiplier method. The scheme incorporates measurement variation introduced both by MOSFET calibration, and by variation in MOSFET readings between repeated CT scans. Sample size requirements are illustrated on cardiac, chest, and abdomen–pelvis CT scans performed on a 320-row scanner and chest CT performed on a 16-row scanner. Results: Sample sizes for estimating ED vary considerably between scanners and protocols. Sample size increases as the required precision or confidence is higher and also as the anticipated ED is lower. For example, for a helical chest protocol, for 95% confidence and 5% precision for the ED, 30 measurements are required on the 320-row scanner and 11 on the 16-row scanner when the anticipated ED is 4 mSv; these sample sizes are 5 and 2, respectively, when the anticipated ED is 10 mSv. Conclusions: Applying the suggested scheme, it was found that even at modest sample sizes, it is feasible to estimate ED with high precision and a high degree of confidence.
As CT technology develops enabling ED to be lowered, more MOSFET measurements are needed to estimate ED with the same precision and confidence.
Post-stratified estimation: with-in strata and total sample size recommendations
James A. Westfall; Paul L. Patterson; John W. Coulston
2011-01-01
Post-stratification is used to reduce the variance of estimates of the mean. Because the stratification is not fixed in advance, within-strata sample sizes can be quite small. The survey statistics literature provides some guidance on minimum within-strata sample sizes; however, the recommendations and justifications are inconsistent and apply broadly for many...
Schillaci, Michael A; Schillaci, Mario E
2009-02-01
The use of small sample sizes in human and primate evolutionary research is commonplace. Estimating how well small samples represent the underlying population, however, is not commonplace. Because the accuracy of determinations of taxonomy, phylogeny, and evolutionary process is dependent upon how well the study sample represents the population of interest, characterizing the uncertainty, or potential error, associated with analyses of small sample sizes is essential. We present a method for estimating the probability that the sample mean is within a desired fraction of the standard deviation of the true mean using small (n < 10) or very small (n ≤ 5) sample sizes. This method can be used by researchers to determine post hoc the probability that their sample is a meaningful approximation of the population parameter. We tested the method using a large craniometric data set commonly used by researchers in the field. Given our results, we suggest that sample estimates of the population mean can be reasonable and meaningful even when based on small, and perhaps even very small, sample sizes.
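The quantity described in this abstract can be illustrated with a simplified sketch. It assumes normally distributed observations and a known population standard deviation, so the sample mean is exactly normal; the abstract does not state which distribution the authors use, and a t-based version may differ for very small n. The function name `prob_mean_within` is hypothetical.

```python
import math


def prob_mean_within(k, n):
    """P(|sample mean - true mean| <= k * sigma) for n independent normal observations.

    The sample mean has standard deviation sigma / sqrt(n), so the event is
    |Z| <= k * sqrt(n) for a standard normal Z, and P(|Z| <= z) = erf(z / sqrt(2)).
    """
    return math.erf(k * math.sqrt(n) / math.sqrt(2.0))


# Probability of landing within half a standard deviation of the true mean
# for a few small sample sizes.
for n in (3, 5, 10):
    print(n, round(prob_mean_within(0.5, n), 3))
```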
Investigation of the Specht density estimator
NASA Technical Reports Server (NTRS)
Speed, F. M.; Rydl, L. M.
1971-01-01
The feasibility of using the Specht density estimator function on the IBM 360/44 computer is investigated. Factors such as storage, speed, amount of calculations, size of the smoothing parameter and sample size have an effect on the results. The reliability of the Specht estimator for normal and uniform distributions and the effects of the smoothing parameter and sample size are investigated.
Sample size for estimating mean and coefficient of variation in species of crotalarias.
Toebe, Marcos; Machado, Letícia N; Tartaglia, Francieli L; Carvalho, Juliana O DE; Bandeira, Cirineu T; Cargnelutti Filho, Alberto
2018-04-16
The objective of this study was to determine the sample size necessary to estimate the mean and coefficient of variation in four species of crotalarias (C. juncea, C. spectabilis, C. breviflora and C. ochroleuca). An experiment was carried out for each species during the 2014/15 season. At harvest, 1,000 pods of each species were randomly collected. In each pod, the following were measured: mass of pod with and without seeds, length, width and height of pods, number and mass of seeds per pod, and mass of hundred seeds. Measures of central tendency, variability and distribution were calculated, and normality was verified. The sample size necessary to estimate the mean and coefficient of variation with amplitudes of the 95% confidence interval (ACI95%) of 2%, 4%, ..., 20% was determined by resampling with replacement. The sample size varies among species and characters, and a larger sample size is needed to estimate the mean than to estimate the coefficient of variation.
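The resampling procedure described here can be sketched roughly as below. The sketch covers the mean only and assumes that the amplitude of the 95% confidence interval is expressed as a percentage of the observed mean and obtained from bootstrap percentiles; the abstract does not give these details. All names (`ci_amplitude_percent`, `smallest_sample_size`) are hypothetical.

```python
import random
import statistics


def ci_amplitude_percent(data, n, n_boot=1000, rng=None):
    """Width of the bootstrap 95% interval of the mean for samples of size n,
    expressed as a percentage of the mean of the full data set."""
    rng = rng or random.Random(0)
    # Resample n values with replacement, n_boot times, and sort the means.
    means = sorted(statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot))
    lower = means[int(0.025 * n_boot)]
    upper = means[int(0.975 * n_boot) - 1]
    return 100.0 * (upper - lower) / statistics.fmean(data)


def smallest_sample_size(data, max_amplitude_percent, sizes, n_boot=1000, seed=0):
    """First candidate size whose interval amplitude does not exceed the target."""
    rng = random.Random(seed)
    for n in sizes:
        if ci_amplitude_percent(data, n, n_boot, rng) <= max_amplitude_percent:
            return n
    return None  # no candidate size met the target
```

Calling `smallest_sample_size(pod_lengths, 4.0, range(10, 1001, 10))` on a list of measurements would return the first size in the grid that meets a 4% amplitude; `pod_lengths` is a placeholder for real data.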
Pitcher, L.; Helz, R.T.; Walker, R.J.; Piccoli, P.
2009-01-01
Kilauea Iki lava lake formed during the 1959 summit eruption of Kilauea Volcano, then crystallized and differentiated over a period of 35 years. It offers an opportunity to evaluate the fractionation behavior of trace elements in a uniquely well-documented basaltic system. A suite of 14 core samples recovered from 1967 to 1981 has been analyzed for 5 platinum-group elements (PGE: Ir, Os, Ru, Pt, Pd), plus Re. These samples have MgO ranging from 2.4 to 26.9 wt.%, with temperatures prior to quench ranging from 1140 °C to ambient (110 °C). Five eruption samples were also analyzed. Osmium and Ru concentrations vary by nearly four orders of magnitude (0.0006-1.40 ppb for Os and 0.0006-2.01 ppb for Ru) and are positively correlated with MgO content. These elements behaved compatibly during crystallization, most likely being concentrated in trace phases (alloy or sulfide) present in olivine phenocrysts or included chromite. Iridium also correlates positively with MgO, although less strongly than Os and Ru. The somewhat poorer correlation for Ir, compared with Os and Ru, may reflect variable loss of Ir as volatile IrF6 in some of the most magnesian samples. Rhenium is negatively correlated with MgO, behaving as an incompatible trace element. Its behavior in the lava lake is complicated by apparent volatile loss of Re, as suggested by a decrease in Re concentration with time of quenching for lake samples vs. eruption samples. Platinum and Pd concentrations are negatively, albeit weakly, correlated with MgO, so these elements were modestly incompatible during crystallization of the major silicate phases. Palladium contents peaked before precipitation of immiscible sulfide liquid, however, and decline sharply in the most differentiated samples. In contrast, Pt appears to have been unaffected by sulfide precipitation. Microprobe data confirm that Pd entered the sulfide liquid before Re, and that Pt is not strongly chalcophile in this system.
Occasional high Pt values in both eruption and lava lake samples suggest the presence of unevenly distributed, unidentified Pt-rich trace phases in some Kilauea Iki materials. Estimated mineral (olivine + chromite)/melt D values for Os, Ir, Ru and Pt for equilibrium crystallization for samples from ~7 to 27 wt.% MgO are 26, 8.2, 19 and 0.55, respectively. These Os, Ir and Ru estimates are somewhat higher than previous estimates for similar systems. If fractional crystallization is instead assumed, D values are much more similar. Results confirm many prior observations in other mafic systems that olivine (together with included phases) has a major effect on absolute and relative abundances of Re and the PGE. The relatively linear correlations between these elements and MgO potentially permit accurate estimation of the concentrations of these elements in the primary melts of comparable systems, especially in instances where the MgO content of the primary melt is well constrained. © 2008 Elsevier B.V.
Sampling strategies for estimating brook trout effective population size
Andrew R. Whiteley; Jason A. Coombs; Mark Hudy; Zachary Robinson; Keith H. Nislow; Benjamin H. Letcher
2012-01-01
The influence of sampling strategy on estimates of effective population size (Ne) from single-sample genetic methods has not been rigorously examined, though these methods are increasingly used. For headwater salmonids, spatially close kin association among age-0 individuals suggests that sampling strategy (number of individuals and location from...
Approximate sample sizes required to estimate length distributions
Miranda, L.E.
2007-01-01
The sample sizes required to estimate fish length were determined by bootstrapping from reference length distributions. Depending on population characteristics and species-specific maximum lengths, 1-cm length-frequency histograms required 375-1,200 fish to estimate within 10% with 80% confidence, 2.5-cm histograms required 150-425 fish, proportional stock density required 75-140 fish, and mean length required 75-160 fish. In general, smaller species, smaller populations, populations with higher mortality, and simpler length statistics required fewer samples. Indices that require low sample sizes may be suitable for monitoring population status, and when large changes in length are evident, additional sampling effort may be allocated to more precisely define length status with more informative estimators. © Copyright by the American Fisheries Society 2007.
Rhenium Mechanical Properties and Joining Technology
NASA Technical Reports Server (NTRS)
Reed, Brian D.; Biaglow, James A.
1996-01-01
Iridium-coated rhenium (Ir/Re) provides thermal margin for high performance and long life radiation cooled rockets. Two issues that have arisen in the development of flight Ir/Re engines are the sparsity of rhenium (Re) mechanical property data (particularly at high temperatures) required for engineering design, and the inability to directly electron beam weld Re chambers to C103 nozzle skirts. To address these issues, a Re mechanical property database is being established and techniques for creating Re/C103 transition joints are being investigated. This paper discusses the tensile testing results of powder metallurgy Re samples at temperatures from 1370 to 2090 °C. Also discussed is the evaluation of Re/C103 transition pieces joined by both explosive and diffusion bonding. Finally, the evaluation of full size Re transition pieces, joined by inertia welding, as well as explosive and diffusion bonding, is detailed.
Rosenberger, Amanda E.; Dunham, Jason B.
2005-01-01
Estimation of fish abundance in streams using the removal model or the Lincoln-Peterson mark-recapture model is a common practice in fisheries. These models produce misleading results if their assumptions are violated. We evaluated the assumptions of these two models via electrofishing of rainbow trout Oncorhynchus mykiss in central Idaho streams. For one-, two-, three-, and four-pass sampling effort in closed sites, we evaluated the influences of fish size and habitat characteristics on sampling efficiency and the accuracy of removal abundance estimates. We also examined the use of models to generate unbiased estimates of fish abundance through adjustment of total catch or biased removal estimates. Our results suggested that the assumptions of the mark-recapture model were satisfied and that abundance estimates based on this approach were unbiased. In contrast, the removal model assumptions were not met. Decreasing sampling efficiencies over removal passes resulted in underestimated population sizes and overestimates of sampling efficiency. This bias decreased, but was not eliminated, with increased sampling effort. Biased removal estimates based on different levels of effort were highly correlated with each other but were less correlated with unbiased mark-recapture estimates. Stream size decreased sampling efficiency, and stream size and instream wood increased the negative bias of removal estimates. We found that reliable estimates of population abundance could be obtained from models of sampling efficiency for different levels of effort. Validation of abundance estimates requires extra attention to routine sampling considerations but can help fisheries biologists avoid pitfalls associated with biased data and facilitate standardized comparisons among studies that employ different sampling methods.
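The two estimator families contrasted in this abstract can be sketched in their textbook forms. The abstract does not say which variants the authors fitted, so the simple mark-recapture ratio, Chapman's small-sample correction, and the two-pass removal estimator below are assumptions chosen for illustration, and all function names are hypothetical.

```python
def lincoln_petersen(marked, caught, recaptured):
    """Mark-recapture abundance: fish marked on pass 1, caught on pass 2,
    and the number of marked fish among those caught."""
    if recaptured <= 0:
        raise ValueError("at least one recapture is required")
    return marked * caught / recaptured


def chapman(marked, caught, recaptured):
    """Chapman's bias-corrected form of the same estimator."""
    return (marked + 1) * (caught + 1) / (recaptured + 1) - 1


def two_pass_removal(catch1, catch2):
    """Two-pass removal estimate, assuming equal capture probability on both passes.

    Returns (estimated abundance, estimated capture probability).
    """
    if catch2 >= catch1:
        raise ValueError("second-pass catch must be smaller than first-pass catch")
    capture_prob = (catch1 - catch2) / catch1
    abundance = catch1 ** 2 / (catch1 - catch2)
    return abundance, capture_prob
```

The direction of the bias reported in the abstract can be reproduced with made-up numbers: with 100 fish, a capture probability of 0.6 on the first pass and 0.4 on the second, the expected catches are 60 and 16, and `two_pass_removal(60, 16)` gives an abundance of about 82 and a capture probability of about 0.73, that is, abundance underestimated and efficiency overestimated.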
Estimation of the bottleneck size in Florida panthers
Culver, M.; Hedrick, P.W.; Murphy, K.; O'Brien, S.; Hornocker, M.G.
2008-01-01
We have estimated the extent of genetic variation in museum (1890s) and contemporary (1980s) samples of Florida panthers Puma concolor coryi for both nuclear loci and mtDNA. The microsatellite heterozygosity in the contemporary sample was only 0.325 that in the museum samples although our sample size and number of loci are limited. Support for this estimate is provided by a sample of 84 microsatellite loci in contemporary Florida panthers and Idaho pumas Puma concolor hippolestes in which the contemporary Florida panther sample had only 0.442 the heterozygosity of Idaho pumas. The estimated diversities in mtDNA in the museum and contemporary samples were 0.600 and 0.000, respectively. Using a population genetics approach, we have estimated that to reduce either the microsatellite heterozygosity or the mtDNA diversity this much (in a period of c. 80 years during the 20th century when the numbers were thought to be low), a very small bottleneck size of c. 2 for several generations and a small effective population size in other generations is necessary. Using demographic data from Yellowstone pumas, we estimated the ratio of effective to census population size to be 0.315. Using this ratio, the census population size in the Florida panthers necessary to explain the loss of microsatellite variation was c. 41 for the non-bottleneck generations and 6.2 for the two bottleneck generations. These low bottleneck population sizes and the concomitant reduced effectiveness of selection are probably responsible for the high frequency of several detrimental traits in Florida panthers, namely undescended testicles and poor sperm quality. The recent intensive monitoring both before and after the introduction of Texas pumas in 1995 will make the recovery and genetic restoration of Florida panthers a classic study of an endangered species.
Our estimates of the bottleneck size responsible for the loss of genetic variation in the Florida panther complete an unknown aspect of this account. © 2008 The Authors. Journal compilation © 2008 The Zoological Society of London.
Luo, Shezhou; Chen, Jing M; Wang, Cheng; Xi, Xiaohuan; Zeng, Hongcheng; Peng, Dailiang; Li, Dong
2016-05-30
Vegetation leaf area index (LAI), height, and aboveground biomass are key biophysical parameters. Corn is an important and globally distributed crop, and reliable estimations of these parameters are essential for corn yield forecasting, health monitoring and ecosystem modeling. Light Detection and Ranging (LiDAR) is considered an effective technology for estimating vegetation biophysical parameters. However, the estimation accuracies of these parameters are affected by multiple factors. In this study, we first estimated corn LAI, height and biomass (R2 = 0.80, 0.874 and 0.838, respectively) using the original LiDAR data (7.32 points/m2), and the results showed that LiDAR data could accurately estimate these biophysical parameters. Second, comprehensive research was conducted on the effects of LiDAR point density, sampling size and height threshold on the estimation accuracy of LAI, height and biomass. Our findings indicated that LiDAR point density had an important effect on the estimation accuracy for vegetation biophysical parameters, however, high point density did not always produce highly accurate estimates, and reduced point density could deliver reasonable estimation results. Furthermore, the results showed that sampling size and height threshold were additional key factors that affect the estimation accuracy of biophysical parameters. Therefore, the optimal sampling size and the height threshold should be determined to improve the estimation accuracy of biophysical parameters. Our results also implied that a higher LiDAR point density, larger sampling size and height threshold were required to obtain accurate corn LAI estimation when compared with height and biomass estimations. In general, our results provide valuable guidance for LiDAR data acquisition and estimation of vegetation biophysical parameters using LiDAR data.
NASA Astrophysics Data System (ADS)
Yamashita, F.; Mizoguchi, K.; Fukuyama, E.; Omura, K.
2008-12-01
To infer the activity and physical state of intraplate faults in Japan, we re-examined the crustal stress with the hydraulic fracturing test by measuring the tensile strength of rocks. The tensile strength was measured by fracturing hollow cylindrical rock samples (inner and outer radii of 25.0-25.2 mm and 55.1-101.5 mm, respectively, and length of 137.0-140.1 mm), which were obtained close to the in situ stress measurement locations, by pressurizing the inner hole of the sample. Confining pressure was not applied to the samples in this test. To check the reliability and accuracy of this test, we conducted similar experiments with a standard rock sample (Inada granite) whose physical properties are well known. Then, we measured the tensile strength of all available core samples, including those from the Atera fault (at Ueno, Fukuoka, and Hatajiri), the Atotsugawa fault, and the Nojima fault (at Hirabayashi, Iwaya and Kabutoyama), in central Japan, which had been obtained by the National Research Institute for Earth Science and Disaster Prevention (NIED) during stress measurement with the hydraulic fracturing method. The measured tensile strength data reveal that the in situ re-opening pressure, which is one of the parameters needed for the determination of the maximum in situ horizontal stress, was obviously biased. We re-estimated the re-opening pressure using the measured tensile strength and the in situ breakdown pressure, and re-calculated the in situ stress around the Atera fault. Although the past dislocation of the Atera fault has been considered to be left lateral from the geographical features around the fault, the re-estimated stress suggests that the present dislocation of the Atera fault is right lateral. The shear stress also decreases away from the fault. The right lateral dislocation is also supported by the present-day horizontal crustal deformation observed by the triangular and GPS surveys by Geographical Survey Institute in Japan.
Therefore, the dislocation direction of the Atera fault seems to have changed from left lateral to right lateral some time ago. The amount of accumulated right lateral dislocation estimated from the stress data with the dislocation model by Okada (1992) is 2.2-2.6 m. Because the current slip rate from the GPS survey is 2.1-2.3 mm/yr, the accumulation period of the dislocation becomes 960-1240 years if the slip rate is stable. This estimation suggests that during the last 1586 Tensho earthquake the Atera fault dislocated right laterally.
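The accumulation period quoted above follows from dividing the accumulated dislocation by the slip rate. The short check below reproduces the 960-1240 year range from the figures in the abstract, pairing the smallest dislocation with the fastest rate and the largest dislocation with the slowest rate; the function name is hypothetical.

```python
def accumulation_period_years(dislocation_m, slip_rate_mm_per_yr):
    """Years needed to accumulate a dislocation at a constant slip rate."""
    return dislocation_m * 1000.0 / slip_rate_mm_per_yr


shortest = accumulation_period_years(2.2, 2.3)  # smallest dislocation, fastest rate
longest = accumulation_period_years(2.6, 2.1)   # largest dislocation, slowest rate
print(round(shortest, -1), round(longest, -1))
```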
Sampling for area estimation: A comparison of full-frame sampling with the sample segment approach
NASA Technical Reports Server (NTRS)
Hixson, M.; Bauer, M. E.; Davis, B. J. (Principal Investigator)
1979-01-01
The author has identified the following significant results. Full-frame classifications of wheat and non-wheat for eighty counties in Kansas were repetitively sampled to simulate alternative sampling plans. Evaluation of four sampling schemes involving different numbers of samples and different size sampling units shows that the precision of the wheat estimates increased as the segment size decreased and the number of segments was increased. Although the average bias associated with the various sampling schemes was not significantly different, the maximum absolute bias was directly related to sampling size unit.
Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling
2006-01-01
Hidden populations, such as injection drug users and sex workers, are central to a number of public health problems. However, because of the nature of these groups, it is difficult to collect accurate information about them, and this difficulty complicates disease prevention efforts. A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence of certain traits in these populations. Yet, not enough is known about the sample-to-sample variability of these prevalence estimates. In this paper, we present a bootstrap method for constructing confidence intervals around respondent-driven sampling estimates and demonstrate in simulations that it outperforms the naive method currently in use. We also use simulations and real data to estimate the design effects for respondent-driven sampling in a number of situations. We conclude with practical advice about the power calculations that are needed to determine the appropriate sample size for a study using respondent-driven sampling. In general, we recommend a sample size twice as large as would be needed under simple random sampling. PMID:16937083
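The closing recommendation amounts to multiplying a simple-random-sampling sample size by a design effect of about 2. A minimal sketch, assuming the usual normal-approximation formula for estimating a prevalence to within a given margin of error (the abstract does not specify the underlying formula, and the function names are hypothetical):

```python
import math


def srs_sample_size(prevalence, margin, z=1.96):
    """Sample size for estimating a proportion under simple random sampling."""
    return math.ceil(z ** 2 * prevalence * (1 - prevalence) / margin ** 2)


def rds_sample_size(prevalence, margin, design_effect=2.0, z=1.96):
    """Inflate the simple-random-sampling size by a design effect (2 per the abstract)."""
    return math.ceil(design_effect * z ** 2 * prevalence * (1 - prevalence) / margin ** 2)


print(srs_sample_size(0.5, 0.05), rds_sample_size(0.5, 0.05))
```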
Estimating accuracy of land-cover composition from two-stage cluster sampling
Stehman, S.V.; Wickham, J.D.; Fattorini, L.; Wade, T.D.; Baffetta, F.; Smith, J.H.
2009-01-01
Land-cover maps are often used to compute land-cover composition (i.e., the proportion or percent of area covered by each class), for each unit in a spatial partition of the region mapped. We derive design-based estimators of mean deviation (MD), mean absolute deviation (MAD), root mean square error (RMSE), and correlation (CORR) to quantify accuracy of land-cover composition for a general two-stage cluster sampling design, and for the special case of simple random sampling without replacement (SRSWOR) at each stage. The bias of the estimators for the two-stage SRSWOR design is evaluated via a simulation study. The estimators of RMSE and CORR have small bias except when sample size is small and the land-cover class is rare. The estimator of MAD is biased for both rare and common land-cover classes except when sample size is large. A general recommendation is that rare land-cover classes require large sample sizes to ensure that the accuracy estimators have small bias. ?? 2009 Elsevier Inc.
NASA Astrophysics Data System (ADS)
Xu, P.; Chen, Y.; Li, S.; Wang, K.
2017-12-01
In geological history, the uplift of the Tibet plateau has accelerated silicate weathering and organic carbon burial at the same time, which has greatly influenced the global carbon cycle by increasing the carbon sink. Because of the vital connection between tectonic uplift and the carbon cycle, increasing attention has been paid to rivers originating from orogens. The Yangtze River, an important large river, is one of them. However, although silicate weathering has been studied thoroughly, research on the organic carbon cycle is much scarcer, and oxidation of fossil organic carbon remains poorly constrained. In this study, we use rhenium (Re) as a proxy to estimate the oxidation rate of fossil organic carbon and thus advance our understanding of the carbon cycle and silicate weathering. Re has a close relationship with organic carbon in sediments; when oxidized, it is released into the hydrological network of mountain river catchments as soluble ReO4-, so the Re concentration in river water can be used to estimate the oxidation rate of organic carbon. We collected water samples from the Yangtze River fortnightly at Banqiao Ferry, with sampling dates covering the non-flood period. From these non-flood data, we obtain a rough estimate of the amount of carbon dioxide released to the atmosphere by the oxidation of organic carbon. We found that the Re concentration in the Yangtze River ranges approximately from 45 to 85 pmol/L. The rate of organic carbon weathering is estimated using the expression ΦCO2,fossil = [Re] × runoff × [OC/Re]rock, and, according to research on the black shale of the Yangtze River, the value 2.86×10^6 is chosen as the ratio of OC (organic carbon) to Re in the black shale.
The result is a really high flux, up to 152×10^9 mol/y, only slightly less than the CO2 consumption rate from silicate weathering, which is 191×10^9 mol/y and about 166×10^9 mol/y in the non-flood period. Our result indicates that in the Yangtze Basin, oxidation of fossil organic carbon can very likely offset the carbon dioxide removed by silicate weathering.
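The flux expression in this abstract is a simple product and can be written as a one-line function. The sketch assumes the OC/Re ratio is molar and that runoff is supplied in litres per year; the abstract gives neither the units of the ratio nor the runoff value used, so the runoff in the example call is a placeholder, not the authors' figure.

```python
def fossil_oc_oxidation_flux(re_mol_per_l, runoff_l_per_yr, oc_to_re_ratio=2.86e6):
    """CO2 flux (mol/yr) from fossil organic carbon oxidation:
    dissolved Re concentration x runoff x OC/Re ratio of the source rock."""
    return re_mol_per_l * runoff_l_per_yr * oc_to_re_ratio


# Placeholder runoff of 1e15 L/yr (hypothetical), with the lower Re bound of 45 pmol/L.
flux = fossil_oc_oxidation_flux(45e-12, 1e15)
print(f"{flux:.3e} mol/yr")
```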
NASA Astrophysics Data System (ADS)
Kydd, Jocelyn; Rajakaruna, Harshana; Briski, Elizabeta; Bailey, Sarah
2018-03-01
Many commercial ships will soon begin to use treatment systems to manage their ballast water and reduce the global transfer of harmful aquatic organisms and pathogens in accordance with upcoming International Maritime Organization regulations. As a result, rapid and accurate automated methods will be needed to monitor compliance of ships' ballast water. We examined two automated particle counters for monitoring organisms ≥ 50 μm in minimum dimension: a High Resolution Laser Optical Plankton Counter (HR-LOPC), and a Flow Cytometer with digital imaging Microscope (FlowCAM), in comparison to traditional (manual) microscopy considering plankton concentration, size frequency distributions and particle size measurements. The automated tools tended to underestimate particle concentration compared to standard microscopy, but gave similar results in terms of relative abundance of individual taxa. For most taxa, particle size measurements generated by FlowCAM ABD (Area Based Diameter) were more similar to microscope measurements than were those by FlowCAM ESD (Equivalent Spherical Diameter), though there was a mismatch in size estimates for some organisms between the FlowCAM ABD and microscope due to orientation and complex morphology. When a single problematic taxon is very abundant, the resulting size frequency distribution curves can become skewed, as was observed with Asterionella in this study. In particular, special consideration is needed when utilizing automated tools to analyse samples containing colonial species. Re-analysis of the size frequency distributions with the removal of Asterionella from FlowCAM and microscope data resulted in more similar curves across methods with FlowCAM ABD having the best fit compared to the microscope, although microscope concentration estimates were still significantly higher than estimates from the other methods.
The results of our study indicate that both automated tools can generate frequency distributions of particles that might be particularly useful if correction factors can be developed for known differences in well-studied aquatic ecosystems.
Kent, Peter; Boyle, Eleanor; Keating, Jennifer L; Albert, Hanne B; Hartvigsen, Jan
2017-02-01
To quantify variability in the results of statistical analyses based on contingency tables and discuss the implications for the choice of sample size for studies that derive clinical prediction rules. An analysis of three pre-existing sets of large cohort data (n = 4,062-8,674) was performed. In each data set, repeated random sampling of various sample sizes, from n = 100 up to n = 2,000, was performed 100 times at each sample size and the variability in estimates of sensitivity, specificity, positive and negative likelihood ratios, posttest probabilities, odds ratios, and risk/prevalence ratios for each sample size was calculated. There were very wide, and statistically significant, differences in estimates derived from contingency tables from the same data set when calculated in sample sizes below 400 people, and typically, this variability stabilized in samples of 400-600 people. Although estimates of prevalence also varied significantly in samples below 600 people, that relationship only explains a small component of the variability in these statistical parameters. To reduce sample-specific variability, contingency tables should consist of 400 participants or more when used to derive clinical prediction rules or test their performance. Copyright © 2016 Elsevier Inc. All rights reserved.
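The statistics listed in this abstract all derive from the four cells of a 2x2 contingency table. A minimal sketch using the standard definitions follows; the function name and the example counts are hypothetical, and zero cells are not handled.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table of test result by true status."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "odds_ratio": (tp * tn) / (fp * fn),
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }


metrics = diagnostic_metrics(tp=80, fp=30, fn=20, tn=170)
```

Repeating this calculation on many random subsamples of a fixed size, and looking at the spread of each value, is the kind of variability assessment the abstract describes.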
Evaluating common de-identification heuristics for personal health information.
El Emam, Khaled; Jabbouri, Sam; Sams, Scott; Drouet, Youenn; Power, Michael
2006-11-21
With the growing adoption of electronic medical records, there are increasing demands for the use of this electronic clinical data in observational research. A frequent ethics board requirement for such secondary use of personal health information in observational research is that the data be de-identified. De-identification heuristics are provided in the Health Insurance Portability and Accountability Act Privacy Rule, funding agency and professional association privacy guidelines, and common practice. The aim of the study was to evaluate whether the re-identification risks due to record linkage are sufficiently low when following common de-identification heuristics and whether the risk is stable across sample sizes and data sets. Two methods were followed to construct identification data sets. Re-identification attacks were simulated on these. For each data set we varied the sample size down to 30 individuals, and for each sample size evaluated the risk of re-identification for all combinations of quasi-identifiers. The combinations of quasi-identifiers that were low risk more than 50% of the time were considered stable. The identification data sets we were able to construct were the list of all physicians and the list of all lawyers registered in Ontario, using 1% sampling fractions. The quasi-identifiers of region, gender, and year of birth were found to be low risk more than 50% of the time across both data sets. The combination of gender and region was also found to be low risk more than 50% of the time. We were not able to create an identification data set for the whole population. Existing Canadian federal and provincial privacy laws help explain why it is difficult to create an identification data set for the whole population. That such examples of high re-identification risk exist for mainstream professions makes a strong case for not disclosing the high-risk variables and their combinations identified here. 
For professional subpopulations with published membership lists, many variables often needed by researchers would have to be excluded or generalized to ensure consistently low re-identification risk. Data custodians and researchers need to consider other statistical disclosure techniques for protecting privacy.
Evaluating Common De-Identification Heuristics for Personal Health Information
Jabbouri, Sam; Sams, Scott; Drouet, Youenn; Power, Michael
2006-01-01
Background: With the growing adoption of electronic medical records, there are increasing demands for the use of this electronic clinical data in observational research. A frequent ethics board requirement for such secondary use of personal health information in observational research is that the data be de-identified. De-identification heuristics are provided in the Health Insurance Portability and Accountability Act Privacy Rule, funding agency and professional association privacy guidelines, and common practice. Objective: The aim of the study was to evaluate whether the re-identification risks due to record linkage are sufficiently low when following common de-identification heuristics and whether the risk is stable across sample sizes and data sets. Methods: Two methods were followed to construct identification data sets. Re-identification attacks were simulated on these. For each data set we varied the sample size down to 30 individuals, and for each sample size evaluated the risk of re-identification for all combinations of quasi-identifiers. The combinations of quasi-identifiers that were low risk more than 50% of the time were considered stable. Results: The identification data sets we were able to construct were the list of all physicians and the list of all lawyers registered in Ontario, using 1% sampling fractions. The quasi-identifiers of region, gender, and year of birth were found to be low risk more than 50% of the time across both data sets. The combination of gender and region was also found to be low risk more than 50% of the time. We were not able to create an identification data set for the whole population. Conclusions: Existing Canadian federal and provincial privacy laws help explain why it is difficult to create an identification data set for the whole population.
That such examples of high re-identification risk exist for mainstream professions makes a strong case for not disclosing the high-risk variables and their combinations identified here. For professional subpopulations with published membership lists, many variables often needed by researchers would have to be excluded or generalized to ensure consistently low re-identification risk. Data custodians and researchers need to consider other statistical disclosure techniques for protecting privacy. PMID:17213047
Stallard, Nigel; Parsons, Nicholas; Todd, Susan; Friede, Tim
2016-01-01
Regulatory authorities require that the sample size of a confirmatory trial is calculated prior to the start of the trial. However, the sample size quite often depends on parameters that might not be known in advance of the study. Misspecification of these parameters can lead to under‐ or overestimation of the sample size. Both situations are unfavourable as the first one decreases the power and the latter one leads to a waste of resources. Hence, designs have been suggested that allow a re‐assessment of the sample size in an ongoing trial. These methods usually focus on estimating the variance. However, for some methods the performance depends not only on the variance but also on the correlation between measurements. We develop and compare different methods for blinded estimation of the correlation coefficient that are less likely to introduce operational bias when the blinding is maintained. Their performance with respect to bias and standard error is compared to the unblinded estimator. We simulated two different settings: one assuming that all group means are the same and one assuming that different groups have different means. Simulation results show that the naïve (one‐sample) estimator is only slightly biased and has a standard error comparable to that of the unblinded estimator. However, if the group means differ, other estimators have better performance depending on the sample size per group and the number of groups. PMID:27886393
Estimating the Size of a Large Network and its Communities from a Random Sample
Chen, Lin; Karbasi, Amin; Crawford, Forrest W.
2017-01-01
Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. PMID:28867924
Estimating the Size of a Large Network and its Communities from a Random Sample.
Chen, Lin; Karbasi, Amin; Crawford, Forrest W
2016-01-01
Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios.
[Practical aspects regarding sample size in clinical research].
Vega Ramos, B; Peraza Yanes, O; Herrera Correa, G; Saldívar Toraya, S
1996-01-01
Knowing the appropriate sample size lets us judge whether the results published in medical papers rest on a suitable design and whether their conclusions are supported by the statistical analysis. To estimate the sample size we must consider the type I error, the type II error, the variance, the effect size, and the significance level and power of the test. To decide which formula to use, we must first define the type of study, that is, whether it is a prevalence study, a study of mean values, or a comparative study. In this paper we explain some basic statistical concepts and describe four simple examples of sample size estimation.
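As an illustration of how these ingredients combine, the following minimal sketch uses the standard normal-approximation formula for comparing two means, n per group = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². This is a textbook formula and not necessarily one of the four examples in the paper; the function name and the default values are assumptions.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two means.

    sigma: common standard deviation; delta: difference to detect.
    Uses the normal approximation, so small-sample t corrections are ignored.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # type I error (two-sided)
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

print(n_per_group(sigma=10, delta=5))  # 63
```

Halving the detectable difference roughly quadruples the required sample size, which is why the effect size is usually the most influential input.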
NASA Astrophysics Data System (ADS)
Kong, Shaofei; Lu, Bing; Ji, Yaqin; Bai, Zhipeng; Xu, Yonghai; Liu, Yong; Jiang, Hua
2012-08-01
Thirty dust samples were collected from building surfaces in an oilfield city, re-suspended and sampled through PM2.5, PM10 and PM100 inlets and analyzed for 18 PAHs by GC-MS. PAH concentrations, toxicity and profile characteristics for different districts and size fractions were studied. PAH sources were identified by diagnostic ratios and principal component analysis. Results showed that the total amounts of analyzed PAHs in re-suspended dust in Dongying were 45.29, 23.79 and 11.41 μg g⁻¹ for PM2.5, PM10 and PM100, respectively. PAHs tended to concentrate in finer particles, with mass ratios of PM2.5/PM10 and PM10/PM100 of 1.96 ± 0.86 and 2.53 ± 1.57. The old district, with more human activity and a long history of oil exploitation, exhibited higher concentrations of PAHs from both combustion and non-combustion sources. The BaP-based toxic equivalent factor and BaP-based equivalent carcinogenic power decreased in the order PM2.5 > PM10 > PM100, suggesting that the finer the particles, the more toxic the dust. NaP, Phe, Flu, Pyr, BbF and BghiP were the abundant species. Coefficient of divergence analysis implied that PAHs in different districts and size fractions had common sources. Coal combustion, industrial sources, vehicle emission and petroleum were probably the main contributors according to the principal component analysis result.
Khan, Bilal; Lee, Hsuan-Wei; Fellows, Ian; Dombrowski, Kirk
2018-01-01
Size estimation is particularly important for populations whose members experience disproportionate health issues or pose elevated health risks to the ambient social structures in which they are embedded. Efforts to derive size estimates are often frustrated when the population is hidden or hard-to-reach in ways that preclude conventional survey strategies, as is the case when social stigma is associated with group membership or when group members are involved in illegal activities. This paper extends prior research on the problem of network population size estimation, building on established survey/sampling methodologies commonly used with hard-to-reach groups. Three novel one-step, network-based population size estimators are presented, for use in the context of uniform random sampling, respondent-driven sampling, and when networks exhibit significant clustering effects. We give provably sufficient conditions for the consistency of these estimators in large configuration networks. Simulation experiments across a wide range of synthetic network topologies validate the performance of the estimators, which also perform well on a real-world location-based social networking data set with significant clustering. Finally, the proposed schemes are extended to allow them to be used in settings where participant anonymity is required. Systematic experiments show favorable tradeoffs between anonymity guarantees and estimator performance. Taken together, we demonstrate that reasonable population size estimates are derived from anonymous respondent driven samples of 250-750 individuals, within ambient populations of 5,000-40,000. The method thus represents a novel and cost-effective means for health planners and those agencies concerned with health and disease surveillance to estimate the size of hidden populations. We discuss limitations and future work in the concluding section.
McClure, Foster D; Lee, Jung K
2005-01-01
Sample size formulas are developed to estimate the repeatability and reproducibility standard deviations (Sr and S(R)) such that the actual error in (Sr and S(R)) relative to their respective true values, σr and σR, is at predefined levels. The statistical consequences associated with AOAC INTERNATIONAL required sample size to validate an analytical method are discussed. In addition, formulas to estimate the uncertainties of (Sr and S(R)) were derived and are provided as supporting documentation.
Formula for the Number of Replicates Required for a Specified Margin of Relative Error in the Estimate of the Repeatability Standard Deviation.
Soil specific re-calibration of water content sensors for a field-scale sensor network
NASA Astrophysics Data System (ADS)
Gasch, Caley K.; Brown, David J.; Anderson, Todd; Brooks, Erin S.; Yourek, Matt A.
2015-04-01
Obtaining accurate soil moisture data from a sensor network requires sensor calibration. Soil moisture sensors are factory calibrated, but multiple site specific factors may contribute to sensor inaccuracies. Thus, sensors should be calibrated for the specific soil type and conditions in which they will be installed. Lab calibration of a large number of sensors prior to installation in a heterogeneous setting may not be feasible, and it may not reflect the actual performance of the installed sensor. We investigated a multi-step approach to retroactively re-calibrate sensor water content data from the dielectric permittivity readings obtained by sensors in the field. We used water content data collected since 2009 from a sensor network installed at 42 locations and 5 depths (210 sensors total) within the 37-ha Cook Agronomy Farm with highly variable soils located in the Palouse region of the Northwest United States. First, volumetric water content was calculated from sensor dielectric readings using three equations: (1) a factory calibration using the Topp equation; (2) a custom calibration obtained empirically from an instrumented soil in the field; and (3) a hybrid equation that combines the Topp and custom equations. Second, we used soil physical properties (particle size and bulk density) and pedotransfer functions to estimate water content at saturation, field capacity, and wilting point for each installation location and depth. We also extracted the same reference points from the sensor readings, when available. Using these reference points, we re-scaled the sensor readings, such that water content was restricted to the range of values that we would expect given the physical properties of the soil. The re-calibration accuracy was assessed with volumetric water content measurements obtained from field-sampled cores taken on multiple dates. 
In general, the re-calibration was most accurate when all three reference points (saturation, field capacity, and wilting point) were represented in the sensor readings. We anticipate that obtaining water retention curves for field soils will improve the re-calibration accuracy by providing more precise estimates of saturation, field capacity, and wilting point. This approach may serve as an alternative method for sensor calibration in lieu of or to complement pre-installation calibration.
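The first two steps of this approach can be sketched as follows. The polynomial is the widely used Topp et al. (1980) equation for volumetric water content as a function of apparent dielectric permittivity. The `rescale` function is a hypothetical linear min-max mapping between two reference points with clipping; the abstract does not specify the exact re-scaling used, so this is an assumption for illustration only.

```python
def topp_vwc(permittivity):
    """Volumetric water content (m^3/m^3) from dielectric permittivity,
    using the Topp et al. (1980) polynomial."""
    e = permittivity
    return -5.3e-2 + 2.92e-2 * e - 5.5e-4 * e**2 + 4.3e-6 * e**3

def rescale(theta, sensor_low, sensor_high, soil_low, soil_high):
    """Hypothetical linear re-scaling of a sensor reading.

    Maps the sensor's observed range [sensor_low, sensor_high] (e.g. its
    readings at wilting point and saturation) onto the range expected from
    soil physical properties [soil_low, soil_high], clipping outside values.
    """
    frac = (theta - sensor_low) / (sensor_high - sensor_low)
    frac = min(1.0, max(0.0, frac))
    return soil_low + frac * (soil_high - soil_low)

theta = topp_vwc(20.0)  # about 0.345 m^3/m^3
print(round(theta, 4), round(rescale(theta, 0.10, 0.50, 0.12, 0.48), 4))
```

A piecewise mapping through all three reference points (wilting point, field capacity, saturation) would be a natural extension when all three are available in the sensor record.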
Dombrowski, Kirk; Khan, Bilal; Wendel, Travis; McLean, Katherine; Misshula, Evan; Curtis, Ric
2012-12-01
As part of a recent study of the dynamics of the retail market for methamphetamine use in New York City, we used network sampling methods to estimate the size of the total networked population. This process involved sampling from respondents' list of co-use contacts, which in turn became the basis for capture-recapture estimation. Recapture sampling was based on links to other respondents derived from demographic and "telefunken" matching procedures, the latter being an anonymized version of telephone number matching. This paper describes the matching process used to discover the links between the solicited contacts and project respondents, the capture-recapture calculation, the estimation of "false matches", and the development of confidence intervals for the final population estimates. A final population of 12,229 was estimated, with a range of 8,235 to 23,750. The techniques described here have the special virtue of deriving an estimate for a hidden population while retaining respondent anonymity and the anonymity of network alters, but likely require a larger sample size than the 132 persons interviewed to attain acceptable confidence levels for the estimate.
Re-electrospraying splash-landed proteins and nanoparticles.
Benner, W Henry; Lewis, Gregory S; Hering, Susanne V; Selgelke, Brent; Corzett, Michelle; Evans, James E; Lightstone, Felice C
2012-03-06
FITC-albumin, Lsr-F, or fluorescent polystyrene latex particles were electrosprayed from aqueous buffer and subjected to dispersion by differential electrical mobility at atmospheric pressure. A resulting narrow size cut of singly charged molecular ions or particles was passed through a condensation growth tube collector to create a flow stream of small water droplets, each carrying a single ion or particle. The droplets were splash landed (impacted) onto a solid or liquid temperature controlled surface. Small pools of droplets containing size-selected particles, FITC-albumin, or Lsr-F were recovered, re-electrosprayed, and, when analyzed a second time by differential electrical mobility, showed increased homogeneity. Transmission electron microscopy (TEM) analysis of the size-selected Lsr-F sample corroborated the mobility observation.
The Impact of Sample Size and Other Factors When Estimating Multilevel Logistic Models
ERIC Educational Resources Information Center
Schoeneberger, Jason A.
2016-01-01
The design of research studies utilizing binary multilevel models must necessarily incorporate knowledge of multiple factors, including estimation method, variance component size, or number of predictors, in addition to sample sizes. This Monte Carlo study examined the performance of random effect binary outcome multilevel models under varying…
Quantifying and Mitigating the Effect of Preferential Sampling on Phylodynamic Inference
Karcher, Michael D.; Palacios, Julia A.; Bedford, Trevor; Suchard, Marc A.; Minin, Vladimir N.
2016-01-01
Phylodynamics seeks to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. One way to accomplish this task formulates an observed sequence data likelihood exploiting a coalescent model for the sampled individuals’ genealogy and then integrating over all possible genealogies via Monte Carlo or, less efficiently, by conditioning on one genealogy estimated from the sequence data. However, when analyzing sequences sampled serially through time, current methods implicitly assume either that sampling times are fixed deterministically by the data collection protocol or that their distribution does not depend on the size of the population. Through simulation, we first show that, when sampling times do probabilistically depend on effective population size, estimation methods may be systematically biased. To correct for this deficiency, we propose a new model that explicitly accounts for preferential sampling by modeling the sampling times as an inhomogeneous Poisson process dependent on effective population size. We demonstrate that in the presence of preferential sampling our new model not only reduces bias, but also improves estimation precision. Finally, we compare the performance of the currently used phylodynamic methods with our proposed model through clinically-relevant, seasonal human influenza examples. PMID:26938243
Weighting by Inverse Variance or by Sample Size in Random-Effects Meta-Analysis
ERIC Educational Resources Information Center
Marin-Martinez, Fulgencio; Sanchez-Meca, Julio
2010-01-01
Most of the statistical procedures in meta-analysis are based on the estimation of average effect sizes from a set of primary studies. The optimal weight for averaging a set of independent effect sizes is the inverse variance of each effect size, but in practice these weights have to be estimated, being affected by sampling error. When assuming a…
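The two weighting schemes named in the title can be contrasted with a minimal sketch. The inverse-variance weights follow the usual random-effects form w_i = 1/(v_i + τ²), with τ² = 0 giving the fixed-effect case; the numbers and the function name are illustrative assumptions, not values from the study.

```python
def weighted_mean(effects, weights):
    """Weighted average of effect sizes."""
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)

effects = [0.2, 0.4]        # illustrative effect sizes
variances = [0.01, 0.04]    # illustrative within-study sampling variances
sample_sizes = [400, 100]   # illustrative study sizes
tau2 = 0.0                  # between-study variance (assumed zero here)

inv_var_weights = [1.0 / (v + tau2) for v in variances]
mean_inverse_variance = weighted_mean(effects, inv_var_weights)
mean_sample_size = weighted_mean(effects, sample_sizes)
print(mean_inverse_variance, mean_sample_size)
```

In this toy case the variances are exactly proportional to 1/n, so both schemes give the same average; they diverge once τ² is positive or the estimated variances carry sampling error.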
Resonance region measurements of dysprosium and rhenium
NASA Astrophysics Data System (ADS)
Leinweber, Gregory; Block, Robert C.; Epping, Brian E.; Barry, Devin P.; Rapp, Michael J.; Danon, Yaron; Donovan, Timothy J.; Landsberger, Sheldon; Burke, John A.; Bishop, Mary C.; Youmans, Amanda; Kim, Guinyun N.; Kang, Yeong-rok; Lee, Man Woo; Drindak, Noel J.
2017-09-01
Neutron capture and transmission measurements have been performed, and resonance parameter analysis has been completed for dysprosium, Dy, and rhenium, Re. The 60 MeV electron accelerator at RPI Gaerttner LINAC Center produced neutrons in the thermal and epithermal energy regions for these measurements. Transmission measurements were made using 6Li glass scintillation detectors. The neutron capture measurements were made with a 16-segment NaI multiplicity detector. The detectors for all experiments were located at ≈25 m except for thermal transmission, which was done at ≈15 m. The dysprosium samples included one highly enriched 164Dy metal, six liquid solutions of enriched 164Dy, and two natural Dy metals. The Re samples were natural metals. Their capture yield normalizations were corrected for their high gamma attenuation. The multi-level R-matrix Bayesian computer code SAMMY was used to extract the resonance parameters from the data. 164Dy resonance data were analyzed up to 550 eV, other Dy isotopes up to 17 eV, and Re resonance data up to 1 keV. Uncertainties due to resolution function, flight path, burst width, sample thickness, normalization, background, and zero time were estimated and propagated using SAMMY. An additional check of sample-to-sample consistency is presented as an estimate of uncertainty. The thermal total cross sections and neutron capture resonance integrals of 164Dy and Re were determined from the resonance parameters. The NJOY and INTER codes were used to process and integrate the cross sections. Plots of the data, fits, and calculations using ENDF/B-VII.1 resonance parameters are presented.
The effects of sample size on population genomic analyses--implications for the tests of neutrality.
Subramanian, Sankar
2016-02-20
One of the fundamental measures of molecular genetic variation is Watterson's estimator (θ), which is based on the number of segregating sites. The estimation of θ is unbiased only under neutrality and constant population growth. It is well known that the estimation of θ is biased when these assumptions are violated. However, the effect of sample size in modulating the bias was not well appreciated. We examined this issue in detail based on large-scale exome data and robust simulations. Our investigation revealed that sample size appreciably influences θ estimation and this effect was much higher for constrained genomic regions than for neutral regions. For instance, θ estimated for synonymous sites using 512 human exomes was 1.9 times higher than that obtained using 16 exomes. However, this difference was 2.5 times for the nonsynonymous sites of the same data. We observed a positive correlation between the rate of increase in θ estimates (with respect to the sample size) and the magnitude of selection pressure. For example, θ estimated for the nonsynonymous sites of highly constrained genes (dN/dS < 0.1) using 512 exomes was 3.6 times higher than that estimated using 16 exomes. In contrast, this difference was only 2 times for the less constrained genes (dN/dS > 0.9). The results of this study reveal the extent of underestimation owing to small sample sizes and thus emphasize the importance of sample size in estimating a number of population genomic parameters. Our results have serious implications for neutrality tests such as Tajima D, Fu-Li D and those based on the McDonald and Kreitman test: Neutrality Index and the fraction of adaptive substitutions. For instance, use of 16 exomes produced a 2.4 times higher proportion of adaptive substitutions compared to that obtained using 512 exomes (24% vs 10%).
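For reference, Watterson's estimator divides the number of segregating sites S by the harmonic number a_n = Σ_{i=1}^{n-1} 1/i, where n is the number of sampled sequences. The sketch below is a minimal implementation of that standard definition; the function names and the example counts are illustrative and are not taken from the exome data.

```python
def harmonic_a(n):
    """a_n = sum_{i=1}^{n-1} 1/i for a sample of n sequences."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return sum(1.0 / i for i in range(1, n))

def watterson_theta(segregating_sites, n):
    """Watterson's theta from the number of segregating sites S and sample size n."""
    return segregating_sites / harmonic_a(n)

print(watterson_theta(11, 4))                 # S = 11, n = 4
print(harmonic_a(16), harmonic_a(512))        # normalizers for 16 and 512 samples
```

Because a_n grows only logarithmically with n, the normalizer changes slowly between 16 and 512 samples, while the count of rare segregating variants under constraint can grow faster, which is consistent with the sample size dependence reported above.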
The Petersen-Lincoln estimator and its extension to estimate the size of a shared population.
Chao, Anne; Pan, H-Y; Chiang, Shu-Chuan
2008-12-01
The Petersen-Lincoln estimator has been used to estimate the size of a population in a single mark release experiment. However, the estimator is not valid when the capture sample and recapture sample are not independent. We provide an intuitive interpretation for "independence" between samples based on 2 x 2 categorical data formed by capture/non-capture in each of the two samples. From the interpretation, we review a general measure of "dependence" and quantify the correlation bias of the Petersen-Lincoln estimator when two types of dependences (local list dependence and heterogeneity of capture probability) exist. An important implication in the census undercount problem is that instead of using a post enumeration sample to assess the undercount of a census, one should conduct a prior enumeration sample to avoid correlation bias. We extend the Petersen-Lincoln method to the case of two populations. This new estimator of the size of the shared population is proposed and its variance is derived. We discuss a special case where the correlation bias of the proposed estimator due to dependence between samples vanishes. The proposed method is applied to a study of the relapse rate of illicit drug use in Taiwan. ((c) 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim).
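The classical estimator discussed here can be written in a few lines. With n1 individuals in the capture sample, n2 in the recapture sample and m found in both, the Petersen-Lincoln estimate is N = n1·n2/m. The sketch also includes Chapman's well-known small-sample variant, which is not mentioned in the abstract and is added only for comparison; the counts are illustrative.

```python
def petersen_lincoln(n1, n2, m):
    """Population size estimate from two independent samples with m recaptures."""
    if m <= 0:
        raise ValueError("at least one recapture is required")
    return n1 * n2 / m

def chapman(n1, n2, m):
    """Chapman's variant, defined even when m = 0 (not from the abstract)."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

print(petersen_lincoln(200, 150, 30))  # 1000.0
print(chapman(200, 150, 30))
```

Both formulas assume the two samples are independent; positive dependence between samples inflates m and biases the estimate downward, which is the correlation bias the abstract quantifies.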
An audit of the statistics and the comparison with the parameter in the population
NASA Astrophysics Data System (ADS)
Bujang, Mohamad Adam; Sa'at, Nadiah; Joys, A. Reena; Ali, Mariana Mohamad
2015-10-01
Determining a sample size sufficient for sample statistics to closely estimate particular population parameters has long been an issue. Although a sample size may have been calculated with reference to the study objective, it is difficult to confirm whether the statistics are close to the parameters of a particular population. Meanwhile, the guideline of a p-value less than 0.05 is widely used as inferential evidence. Therefore, this study audited results from various subsamples and statistical analyses and compared them with the parameters in three different populations. Eight types of statistical analysis and eight subsamples for each statistical analysis were analyzed. The statistics were consistent and close to the parameters when the sample covered at least 15% to 35% of the population. A larger sample size is needed to estimate parameters involving categorical variables than numerical variables. Sample sizes of 300 to 500 are sufficient to estimate the parameters for a medium-sized population.
Evaluating information content of SNPs for sample-tagging in re-sequencing projects.
Hu, Hao; Liu, Xiang; Jin, Wenfei; Hilger Ropers, H; Wienker, Thomas F
2015-05-15
Sample-tagging is designed for identification of accidental sample mix-up, which is a major issue in re-sequencing studies. In this work, we develop a model to measure the information content of SNPs, so that we can optimize a panel of SNPs that approaches the maximal information for discrimination. The analysis shows that as few as 60 optimized SNPs can differentiate the individuals in a population as large as the present world population, and only 30 optimized SNPs are in practice sufficient for labeling up to 100 thousand individuals. In the simulated populations of 100 thousand individuals, the average Hamming distances generated by the optimized set of 30 SNPs are larger than 18, and the duality frequency is lower than 1 in 10 thousand. This strategy of sample discrimination proved robust across large sample sizes and different datasets. The optimized sets of SNPs are designed for Whole Exome Sequencing, and a program is provided for SNP selection, allowing for customized SNP numbers and genes of interest. The sample-tagging plan based on this framework will improve re-sequencing projects in terms of reliability and cost-effectiveness.
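The Hamming distance used as the discrimination measure can be sketched as below. The encoding of each SNP genotype as 0, 1 or 2 (copies of the alternate allele) is an assumption made for illustration, as are the tiny example profiles; the authors' SNP selection program is not reproduced here.

```python
from itertools import combinations

def hamming(a, b):
    """Number of SNP positions at which two genotype profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same SNP panel")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(profiles):
    """Smallest Hamming distance over all pairs; 0 means two samples collide."""
    return min(hamming(a, b) for a, b in combinations(profiles, 2))

# Hypothetical genotype profiles over a 6-SNP panel (0/1/2 encoding assumed).
profiles = [
    [0, 1, 2, 0, 1, 1],
    [0, 2, 2, 1, 1, 0],
    [2, 1, 0, 0, 1, 1],
]
print(min_pairwise_distance(profiles))  # 2
```

A panel is useful for tagging when the minimum pairwise distance stays well above the number of genotyping errors expected per sample.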
van Heumen, Moniek; Tol, Johannes L; de Vos, Robert-Jan; Moen, Maarten H; Weir, Adam; Orchard, John; Reurink, Gustaaf
2017-09-01
A challenge for sports physicians is to estimate the risk of a hamstring re-injury, but the current evidence for MRI variables as a risk factor is unknown. To systematically review the literature on the prognostic value of MRI findings at index injury and/or return to play for acute hamstring re-injuries. Databases of PubMed, Embase, MEDLINE, Scopus, CINAHL, Google Scholar, Web of Science, LILACS, SciELO, ScienceDirect, ProQuest, SPORTDiscus and Cochrane Library were searched until 20 June 2016. Studies evaluating MRI as a prognostic tool for determining the risk of re-injury for athletes with acute hamstring injuries were eligible for inclusion. Two authors independently screened the search results and assessed risk of bias using standardised criteria from a consensus statement. A best-evidence synthesis was used to identify the level of evidence. Post hoc analysis included correction for insufficient sample size. Of the 11 studies included, 7 had a low and 4 had a high risk of bias. No strong evidence for any MRI finding as a risk factor for hamstring re-injury was found. There was moderate evidence that intratendinous injuries were associated with increased re-injury risk. Post hoc analysis showed moderate evidence that injury to the biceps femoris was a moderate to strong risk factor for re-injury. There is currently no strong evidence for any MRI finding in predicting hamstring re-injury risk. Intratendinous injuries and biceps femoris injuries showed moderate evidence for association with a higher re-injury risk. Registration in the PROSPERO International prospective register of systematic reviews was performed prior to study initiation (registration number CRD42015024620). © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Investigations on Size Effects of Zerodur®
NASA Astrophysics Data System (ADS)
Behar-Lafenetre, S.; Cornillon, L.; Ait-Zaid, S.; Rancurel, M.
2014-06-01
Zerodur® is a well-known glass-ceramic used for optical components because of its unequalled stability under thermal environment (due to its extremely low Coefficient of Thermal Expansion). In particular, it has been used for decades in Thales Alenia Space's optical payloads for space telescopes, especially for primary mirrors. The drawback of Zerodur®, however, is its quite low strength: 10 MPa is historically used as a rule of thumb. However, as the performance of space telescopes increases, the design must be optimized, which requires raising the strength limit used in the calculations. Thales Alenia Space is therefore currently investigating the so-called "size effect" on Zerodur® (see Weibull theory), under CNES funding, with the aim of re-estimating the lower bound of Zerodur® strength. For this, a complete test campaign has been defined with a high number of samples in order to reduce uncertainties. This article presents the first results obtained.
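The "size effect" referred to above is usually described with Weibull weakest-link scaling: at equal failure probability, strength scales with the stressed volume (or area) as sigma_2 = sigma_1 * (V_1 / V_2)^(1/m), where m is the Weibull modulus. The sketch below only illustrates that textbook relation; the function name and the numbers are hypothetical and are not taken from the test campaign.

```python
def scaled_strength(sigma_ref, v_ref, v_target, weibull_modulus):
    """Weibull weakest-link scaling at equal failure probability.

    sigma_ref       -- strength measured on the reference specimen
    v_ref, v_target -- effective stressed volumes (or areas), same units
    weibull_modulus -- Weibull modulus m (assumed known; hypothetical here)
    """
    if min(sigma_ref, v_ref, v_target, weibull_modulus) <= 0:
        raise ValueError("all inputs must be positive")
    return sigma_ref * (v_ref / v_target) ** (1.0 / weibull_modulus)


if __name__ == "__main__":
    # Hypothetical values: a part with 32x the stressed volume and m = 5.
    print(scaled_strength(100.0, 1.0, 32.0, 5.0))
```

A larger stressed volume gives a lower predicted strength, which is why small coupons alone can overstate the strength of a large mirror.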
NASA Astrophysics Data System (ADS)
Soucemarianadin, Laure; Cécillon, Lauric; Baudin, François; Cecchini, Sébastien; Chenu, Claire; Mériguet, Jacques; Nicolas, Manuel; Savignac, Florence; Barré, Pierre
2017-04-01
Soil organic matter (SOM) is the largest terrestrial carbon pool and SOM degradation has multiple consequences on key ecosystem properties like nutrient cycling, soil emissions of greenhouse gases or carbon sequestration potential. With the strong feedbacks between SOM and climate change, it becomes particularly urgent to develop reliable routine methodologies capable of indicating the turnover time of soil organic carbon (SOC) stocks. Thermal analyses have been used to characterize SOM and among them, Rock-Eval 6 (RE6) analysis of soil has shown promising results in the determination of in-situ SOC biogeochemical stability. This technique combines a phase of pyrolysis followed by a phase of oxidation to provide information on both the SOC bulk chemistry and thermal stability. We analyzed with RE6 a set of 495 soil samples from 102 permanent forest sites of the French national network for the long-term monitoring of forest ecosystems ("RENECOFOR" network). Along with covering pedoclimatic variability at a national level, these samples include a range of 5 depths up to 1 meter (0-10 cm, 10-20 cm, 20-40 cm, 40-80 cm and 80-100 cm). Using RE6 parameters that were previously shown to be correlated to short-term (hydrogen index, HI; T50 CH pyrolysis) or long-term (T50 CO2 oxidation and HI) SOC persistence, and that characterize SOM bulk chemical composition (oxygen index, OI and HI), we tested the influence of depth (n = 5), soil class (n = 6) and vegetation type (n = 3; deciduous, coniferous-fir, coniferous-pine) on SOM thermal stability and bulk chemistry. Results showed that depth was the dominant discriminating factor, affecting significantly all RE6 parameters. With depth, we observed a decrease of the thermally labile SOC pool and an increase of the thermally stable SOC pool, along with an oxidation and a depletion of hydrogen-rich moieties of the SOC.
Soil class and vegetation type had contrasted effects on the RE6 parameters but both affected significantly T50 CO2 oxidation with, for instance, entic Podzols and dystric Cambisols containing relatively more thermally stable SOC in the deepest layer than hypereutric/calcaric Cambisols. Moreover, soils in deciduous plots contained a higher proportion of thermally stable SOC than soils in coniferous plots. This study shows that RE6 analysis constitutes a fast and cost effective way to qualitatively estimate SOM turnover and to discuss its ecosystem drivers. It offers promising prospects towards a quantitative estimation of SOC turnover and the development of RE6-based indicators related to the size of the different SOC kinetic pools.
NASA Technical Reports Server (NTRS)
Morgera, S. D.; Cooper, D. B.
1976-01-01
The experimental observation that a surprisingly small sample size vis-a-vis dimension is needed to achieve good signal-to-interference ratio (SIR) performance with an adaptive predetection filter is explained. The adaptive filter requires estimates as obtained by a recursive stochastic algorithm of the inverse of the filter input data covariance matrix. The SIR performance with sample size is compared for the situations where the covariance matrix estimates are of unstructured (generalized) form and of structured (finite Toeplitz) form; the latter case is consistent with weak stationarity of the input data stochastic process.
NASA Astrophysics Data System (ADS)
Willie, Jacob; Petre, Charles-Albert; Tagg, Nikki; Lens, Luc
2012-11-01
Data from forest herbaceous plants in a site of known species richness in Cameroon were used to test the performance of rarefaction and eight species richness estimators (ACE, ICE, Chao1, Chao2, Jack1, Jack2, Bootstrap and MM). Bias, accuracy, precision and sensitivity to patchiness and sample grain size were the evaluation criteria. An evaluation of the effects of sampling effort and patchiness on diversity estimation is also provided. Stems were identified and counted in linear series of 1-m2 contiguous square plots distributed in six habitat types. Initially, 500 plots were sampled in each habitat type. The sampling process was monitored using rarefaction and a set of richness estimator curves. Curves from the first dataset suggested adequate sampling in riparian forest only. Additional plots ranging from 523 to 2143 were subsequently added in the undersampled habitats until most of the curves stabilized. Jack1 and ICE, the non-parametric richness estimators, performed better, being more accurate and less sensitive to patchiness and sample grain size, and significantly reducing biases that could not be detected by rarefaction and other estimators. This study confirms the usefulness of non-parametric incidence-based estimators, and recommends Jack1 or ICE alongside rarefaction while describing taxon richness and comparing results across areas sampled using similar or different grain sizes. As patchiness varied across habitat types, accurate estimations of diversity did not require the same number of plots. The number of samples needed to fully capture diversity is not necessarily the same across habitats, and can only be known when taxon sampling curves have indicated adequate sampling. Differences in observed species richness between habitats were generally due to differences in patchiness, except between two habitats where they resulted from differences in abundance. 
We suggest that communities should first be sampled thoroughly using appropriate taxon sampling curves before explaining differences in diversity.
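Two of the estimators named in this abstract have simple closed forms, sketched below under stated assumptions: Chao1 from abundance counts (S_obs + F1^2 / (2 F2), with the usual bias-corrected fallback when there are no doubletons) and the first-order jackknife Jack1 from plot incidence data (S_obs + Q1 (m - 1) / m). The function names and toy data are hypothetical; this is not the authors' analysis code.

```python
from collections import Counter


def chao1(abundances):
    """Chao1 richness estimate from per-species abundance counts."""
    counts = [a for a in abundances if a > 0]
    s_obs = len(counts)
    f1 = sum(1 for a in counts if a == 1)  # singletons
    f2 = sum(1 for a in counts if a == 2)  # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected form evaluated at f2 = 0.
    return s_obs + f1 * (f1 - 1) / 2.0


def jack1(plots):
    """First-order jackknife from a list of plots (each an iterable of species)."""
    m = len(plots)
    if m == 0:
        raise ValueError("at least one plot is required")
    incidence = Counter(sp for plot in plots for sp in set(plot))
    s_obs = len(incidence)
    q1 = sum(1 for k in incidence.values() if k == 1)  # species found in one plot only
    return s_obs + q1 * (m - 1) / m


if __name__ == "__main__":
    print(chao1([5, 3, 1, 1, 2, 0]))
    print(jack1([{"a", "b"}, {"a", "c"}, {"a"}]))
```

Both estimators add a correction driven by rare species, so they react to patchiness differently from a raw species count.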
van de Schoot, Rens; Broere, Joris J.; Perryck, Koen H.; Zondervan-Zwijnenburg, Mariëlle; van Loey, Nancy E.
2015-01-01
Background The analysis of small data sets in longitudinal studies can lead to power issues and often suffers from biased parameter values. These issues can be solved by using Bayesian estimation in conjunction with informative prior distributions. By means of a simulation study and an empirical example concerning posttraumatic stress symptoms (PTSS) following mechanical ventilation in burn survivors, we demonstrate the advantages and potential pitfalls of using Bayesian estimation. Methods First, we show how to specify prior distributions and by means of a sensitivity analysis we demonstrate how to check the exact influence of the prior (mis-) specification. Thereafter, we show by means of a simulation the situations in which the Bayesian approach outperforms the default, maximum likelihood approach. Finally, we re-analyze empirical data on burn survivors which provided preliminary evidence of an aversive influence of a period of mechanical ventilation on the course of PTSS following burns. Results Not surprisingly, maximum likelihood estimation showed insufficient coverage as well as power with very small samples. Only when Bayesian analysis, in conjunction with informative priors, was used did power increase to acceptable levels. As expected, we showed that the smaller the sample size the more the results rely on the prior specification. Conclusion We show that two issues often encountered during analysis of small samples, power and biased parameters, can be solved by including prior information into Bayesian analysis. We argue that the use of informative priors should always be reported together with a sensitivity analysis. PMID:25765534
Effects of sample size on kernel home range estimates
Seaman, D.E.; Millspaugh, J.J.; Kernohan, Brian J.; Brundige, Gary C.; Raedeke, Kenneth J.; Gitzen, Robert A.
1999-01-01
Kernel methods for estimating home range are being used increasingly in wildlife research, but the effect of sample size on their accuracy is not known. We used computer simulations of 10-200 points/home range and compared accuracy of home range estimates produced by fixed and adaptive kernels with the reference (REF) and least-squares cross-validation (LSCV) methods for determining the amount of smoothing. Simulated home ranges varied from simple to complex shapes created by mixing bivariate normal distributions. We used the size of the 95% home range area and the relative mean squared error of the surface fit to assess the accuracy of the kernel home range estimates. For both measures, the bias and variance approached an asymptote at about 50 observations/home range. The fixed kernel with smoothing selected by LSCV provided the least-biased estimates of the 95% home range area. All kernel methods produced similar surface fit for most simulations, but the fixed kernel with LSCV had the lowest frequency and magnitude of very poor estimates. We reviewed 101 papers published in The Journal of Wildlife Management (JWM) between 1980 and 1997 that estimated animal home ranges. A minority of these papers used nonparametric utilization distribution (UD) estimators, and most did not adequately report sample sizes. We recommend that home range studies using kernel estimates use LSCV to determine the amount of smoothing, obtain a minimum of 30 observations per animal (but preferably ≥50), and report sample sizes in published results.
Intra-class correlation estimates for assessment of vitamin A intake in children.
Agarwal, Girdhar G; Awasthi, Shally; Walter, Stephen D
2005-03-01
In many community-based surveys, multi-level sampling is inherent in the design. In the design of these studies, especially to calculate the appropriate sample size, investigators need good estimates of the intra-class correlation coefficient (ICC), along with the cluster size, to adjust for variance inflation due to clustering at each level. The present study used data on the assessment of clinical vitamin A deficiency and intake of vitamin A-rich food in children in a district in India. For the survey, 16 households were sampled from 200 villages nested within eight randomly-selected blocks of the district. ICCs and components of variances were estimated from a three-level hierarchical random effects analysis of variance model. Estimates of ICCs and variance components were obtained at village and block levels. Between-cluster variation was evident at each level of clustering. In these estimates, ICCs were inversely related to cluster size, but the design effect could be substantial for large clusters. At the block level, most ICC estimates were below 0.07. At the village level, many ICC estimates ranged from 0.014 to 0.45. These estimates may provide useful information for the design of epidemiological studies in which the sampled (or allocated) units range in size from households to large administrative zones.
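The adjustment described here is commonly summarised by the design effect, DEFF = 1 + (m - 1) * ICC for clusters of equal size m. The sketch below is a minimal illustration of that standard approximation for a single level of clustering; the function names are hypothetical, and the abstract does not state that the authors used exactly this form.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return 1.0 + (cluster_size - 1) * icc


def inflated_sample_size(n_simple_random, cluster_size, icc):
    """Sample size needed under clustering, given the size needed under simple random sampling."""
    return n_simple_random * design_effect(cluster_size, icc)


if __name__ == "__main__":
    # 16 households per village with an ICC of 0.07 (illustrative pairing of values).
    print(design_effect(16, 0.07))
    print(inflated_sample_size(400, 16, 0.07))
```

Even a small ICC matters once clusters are large: with m = 200 and ICC = 0.014 the design effect is already close to 3.8.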
Liu, An; Wijesiri, Buddhi; Hong, Nian; Zhu, Panfeng; Egodawatta, Prasanna; Goonetilleke, Ashantha
2018-05-08
Road deposited pollutants (build-up) are continuously re-distributed by external factors such as traffic and wind turbulence, influencing stormwater runoff quality. However, current stormwater quality modelling approaches do not account for the re-distribution of pollutants. This undermines the accuracy of stormwater quality predictions, constraining the design of effective stormwater treatment measures. This study, using over 1000 data points, developed a Bayesian Network modelling approach to investigate the re-distribution of pollutant build-up on urban road surfaces. BTEX, a group of highly toxic pollutants, were the case study pollutants. Build-up sampling was undertaken in Shenzhen, China, using a dry and wet vacuuming method. The research outcomes confirmed that the vehicle type and particle size significantly influence the re-distribution of particle-bound BTEX. Compared to heavy-duty traffic in commercial areas, light-duty traffic dominates the re-distribution of particles of all size ranges. In industrial areas, heavy-duty traffic re-distributes particles >75 μm, and light-duty traffic re-distributes particles <75 μm. In residential areas, light-duty traffic re-distributes particles >300 μm and <75 μm and heavy-duty traffic re-distributes particles in the 300-150 μm range. The study results provide important insights to improve stormwater quality modelling and the interpretation of modelling outcomes, contributing to safeguarding the urban water environment. Copyright © 2018 Elsevier B.V. All rights reserved.
Hierarchical modeling of cluster size in wildlife surveys
Royle, J. Andrew
2008-01-01
Clusters or groups of individuals are the fundamental unit of observation in many wildlife sampling problems, including aerial surveys of waterfowl, marine mammals, and ungulates. Explicit accounting of cluster size in models for estimating abundance is necessary because detection of individuals within clusters is not independent and detectability of clusters is likely to increase with cluster size. This induces a cluster size bias in which the average cluster size in the sample is larger than in the population at large. Thus, failure to account for the relationship between detectability and cluster size will tend to yield a positive bias in estimates of abundance or density. I describe a hierarchical modeling framework for accounting for cluster-size bias in animal sampling. The hierarchical model consists of models for the observation process conditional on the cluster size distribution and the cluster size distribution conditional on the total number of clusters. Optionally, a spatial model can be specified that describes variation in the total number of clusters per sample unit. Parameter estimation, model selection, and criticism may be carried out using conventional likelihood-based methods. An extension of the model is described for the situation where measurable covariates at the level of the sample unit are available. Several candidate models within the proposed class are evaluated for aerial survey data on mallard ducks (Anas platyrhynchos).
Generalizing the Network Scale-Up Method: A New Estimator for the Size of Hidden Populations
Feehan, Dennis M.; Salganik, Matthew J.
2018-01-01
The network scale-up method enables researchers to estimate the size of hidden populations, such as drug injectors and sex workers, using sampled social network data. The basic scale-up estimator offers advantages over other size estimation techniques, but it depends on problematic modeling assumptions. We propose a new generalized scale-up estimator that can be used in settings with non-random social mixing and imperfect awareness about membership in the hidden population. Further, the new estimator can be used when data are collected via complex sample designs and from incomplete sampling frames. However, the generalized scale-up estimator also requires data from two samples: one from the frame population and one from the hidden population. In some situations these data from the hidden population can be collected by adding a small number of questions to already planned studies. For other situations, we develop interpretable adjustment factors that can be applied to the basic scale-up estimator. We conclude with practical recommendations for the design and analysis of future studies. PMID:29375167
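For orientation, the basic scale-up estimator that this abstract builds on can be written as N_H = (sum of y_i / sum of d_i) * N, where y_i is the number of hidden-population members respondent i reports knowing, d_i is that respondent's estimated personal network size, and N is the size of the general population. The sketch below shows only this basic form with hypothetical names and toy numbers; it does not implement the generalized estimator or the adjustment factors the authors propose.

```python
def basic_scale_up(reported_ties, network_sizes, population_size):
    """Basic network scale-up estimate of hidden population size.

    reported_ties   -- y_i: hidden-population members each respondent knows
    network_sizes   -- d_i: estimated personal network size of each respondent
    population_size -- N: size of the general (frame) population
    """
    if len(reported_ties) != len(network_sizes):
        raise ValueError("one network size is required per respondent")
    total_degree = sum(network_sizes)
    if total_degree <= 0:
        raise ValueError("total network size must be positive")
    return sum(reported_ties) / total_degree * population_size


if __name__ == "__main__":
    # Three hypothetical respondents in a population of one million.
    print(basic_scale_up([1, 0, 2], [100, 200, 300], 1_000_000))
```

The estimate is a ratio of totals, so respondents with large networks carry more weight than they would in an average of per-respondent ratios.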
Accuracy or precision: Implications of sample design and methodology on abundance estimation
Kowalewski, Lucas K.; Chizinski, Christopher J.; Powell, Larkin A.; Pope, Kevin L.; Pegg, Mark A.
2015-01-01
Sampling by spatially replicated counts (point-count) is an increasingly popular method of estimating population size of organisms. Challenges exist when sampling by the point-count method: it is often impractical to sample the entire area of interest and impossible to detect every individual present. Ecologists encounter logistical limitations that force them to sample either few large sample units or many small sample units, introducing biases to sample counts. We generated a computer environment and simulated sampling scenarios to test the role of number of samples, sample unit area, number of organisms, and distribution of organisms in the estimation of population sizes using N-mixture models. Many sample units of small area provided estimates that were consistently closer to true abundance than sample scenarios with few sample units of large area. However, sample scenarios with few sample units of large area provided more precise abundance estimates than abundance estimates derived from sample scenarios with many sample units of small area. It is important to consider the accuracy and precision of abundance estimates during the sample design process, with study goals and objectives fully recognized; in practice, however, accuracy and precision are often an afterthought considered only during data analysis.
The Use of a Binary Composite Endpoint and Sample Size Requirement: Influence of Endpoints Overlap.
Marsal, Josep-Ramon; Ferreira-González, Ignacio; Bertran, Sandra; Ribera, Aida; Permanyer-Miralda, Gaietà; García-Dorado, David; Gómez, Guadalupe
2017-05-01
Although composite endpoints (CE) are common in clinical trials, the impact of the relationship between the components of a binary CE on the sample size requirement (SSR) has not been addressed. We performed a computational study considering 2 treatments and a CE with 2 components: the relevant endpoint (RE) and the additional endpoint (AE). We assessed the strength of the components' interrelation by the degree of relative overlap between them, which was stratified into 5 groups. Within each stratum, SSR was computed for multiple scenarios by varying the events proportion and the effect of the therapy. A lower SSR using CE was defined as the best scenario for using the CE. In 25 of 66 scenarios the degree of relative overlap determined the benefit of using CE instead of the RE. Adding an AE with greater effect than the RE leads to lower SSR using the CE regardless of the AE proportion and the relative overlap. The influence of overlapping decreases when the effect on RE increases. Adding an AE with lower effect than the RE constitutes the most uncertain situation. In summary, the interrelationship between CE components, assessed by the relative overlap, can help to define the SSR in specific situations and it should be considered for SSR computation. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Sampling studies to estimate the HIV prevalence rate in female commercial sex workers.
Pascom, Ana Roberta Pati; Szwarcwald, Célia Landmann; Barbosa Júnior, Aristides
2010-01-01
We investigated sampling methods being used to estimate the HIV prevalence rate among female commercial sex workers. The studies were classified according to the adequacy or not of the sample size to estimate HIV prevalence rate and according to the sampling method (probabilistic or convenience). We identified 75 studies that estimated the HIV prevalence rate among female sex workers. Most of the studies employed convenience samples. The sample size was not adequate to estimate HIV prevalence rate in 35 studies. The use of convenience sample limits statistical inference for the whole group. It was observed that there was an increase in the number of published studies since 2005, as well as in the number of studies that used probabilistic samples. This represents a large advance in the monitoring of risk behavior practices and HIV prevalence rate in this group.
Frison, Severine; Kerac, Marko; Checchi, Francesco; Nicholas, Jennifer
2017-01-01
The assessment of the prevalence of acute malnutrition in children under five is widely used for the detection of emergencies, planning interventions, advocacy, and monitoring and evaluation. This study examined PROBIT Methods which convert parameters (mean and standard deviation (SD)) of a normally distributed variable to a cumulative probability below any cut-off to estimate acute malnutrition in children under five using Middle-Upper Arm Circumference (MUAC). We assessed the performance of: PROBIT Method I, with mean MUAC from the survey sample and MUAC SD from a database of previous surveys; and PROBIT Method II, with mean and SD of MUAC observed in the survey sample. Specifically, we generated sub-samples from 852 survey datasets, simulating 100 surveys for eight sample sizes. Overall the methods were tested on 681 600 simulated surveys. PROBIT methods relying on sample sizes as small as 50 had better performance than the classic method for estimating and classifying the prevalence of acute malnutrition. They had better precision in the estimation of acute malnutrition for all sample sizes and better coverage for smaller sample sizes, while having relatively little bias. They classified situations accurately for a threshold of 5% acute malnutrition. Both PROBIT methods had similar outcomes. PROBIT Methods have a clear advantage in the assessment of acute malnutrition prevalence based on MUAC, compared to the classic method. Their use would require much lower sample sizes, thus enable great time and resource savings and permit timely and/or locally relevant prevalence estimates of acute malnutrition for a swift and well-targeted response.
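The PROBIT conversion described above amounts to evaluating the normal cumulative distribution at the cut-off: prevalence = Phi((cutoff - mean) / SD). The sketch below illustrates that calculation with the standard library; the 125 mm cut-off and the mean and SD values are illustrative assumptions, not figures from the study. For Method I the SD would come from a database of earlier surveys, and for Method II from the survey sample itself.

```python
from statistics import NormalDist


def probit_prevalence(mean, sd, cutoff):
    """Proportion of a normally distributed variable falling below `cutoff`."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return NormalDist(mu=mean, sigma=sd).cdf(cutoff)


if __name__ == "__main__":
    # Hypothetical survey: mean MUAC 140 mm, SD 12 mm, cut-off 125 mm.
    print(round(probit_prevalence(140.0, 12.0, 125.0), 4))
```

Because only a mean and an SD are estimated, the result can be more stable in small samples than counting the children who fall below the cut-off.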
Xie, Yanming; Wang, Yanping; Tian, Feng; Wang, Yongyan
2011-10-01
Because the safety and effectiveness information gained from the research conducted for marketing approval of Chinese medicine is not comprehensive, post-marketing clinical re-evaluation of Chinese medicine is necessary. Effectiveness, safety and economic evaluation are the three main aspects of post-marketing clinical re-evaluation. This paper discusses the differences and relations between post-marketing clinical re-evaluation and phase IV clinical trials, proposes basic requirements and suggestions according to relevant domestic and foreign regulations and experts' suggestions, and discusses the requirements of phase IV clinical trials on indications, design methods, inclusion and exclusion criteria, sample size, etc.
Advanced measurement techniques to characterize thermo-mechanical aspects of solid oxide fuel cells
NASA Astrophysics Data System (ADS)
Malzbender, J.; Steinbrech, R. W.
Advanced characterization methods have been used to analyze the thermo-mechanical behaviour of solid oxide fuel cells in a model stack. The primarily experimental work included contacting studies, sealing of a model stack, and thermal and re-oxidation cycling. An attempt was also made to correlate cell fracture in the stack with pore sizes determined from computer tomography. The contacting studies were carried out using pressure sensitive foils. The load to achieve full contact on the anode and cathode sides of the cell was assessed and applied in the subsequent model stack test. The stack experiment permitted a detailed analysis of stack compaction during sealing. During steady-state operation, thermal cycling and re-oxidation cycling, the changes in open cell voltage and acoustic emissions were monitored. Significant softening of the sealant material was observed at low temperatures. Heating in the thermal cycling loop of the stack appeared to be less critical than the cooling. Re-oxidation cycling led to significant damage if a critical re-oxidation time was exceeded. Microstructural studies permitted further insight into the re-oxidation mechanism. Finally, the maximum defect size in the cell was determined by computer tomography. A limit of maximum anode stress was estimated and the result was correlated with the failure strength observed during the model stack testing.
Tukiendorf, Andrzej; Mansournia, Mohammad Ali; Wydmański, Jerzy; Wolny-Rokicka, Edyta
2017-04-01
Background: Clinical datasets for epithelial ovarian cancer brain metastatic patients are usually small in size. When adequate case numbers are lacking, resulting estimates of regression coefficients may demonstrate bias. One of the direct approaches to reduce such sparse-data bias is based on penalized estimation. Methods: A re-analysis of formerly reported hazard ratios in diagnosed patients was performed using penalized Cox regression with a popular SAS package providing additional software codes for a statistical computational procedure. Results: It was found that the penalized approach can readily diminish sparse data artefacts and radically reduce the magnitude of estimated regression coefficients. Conclusions: It was confirmed that classical statistical approaches may exaggerate regression estimates or distort study interpretations and conclusions. The results support the thesis that penalization via weak informative priors and data augmentation are the safest approaches to shrink sparse data artefacts frequently occurring in epidemiological research.
Alegana, Victor A; Wright, Jim; Bosco, Claudio; Okiro, Emelda A; Atkinson, Peter M; Snow, Robert W; Tatem, Andrew J; Noor, Abdisalan M
2017-11-21
One pillar of monitoring progress towards the Sustainable Development Goals is the investment in high quality data to strengthen the scientific basis for decision-making. At present, nationally-representative surveys are the main source of data for establishing a scientific evidence base, monitoring, and evaluation of health metrics. However, little is known about the optimal precision of various population-level health and development indicators, which remains unquantified in nationally-representative household surveys. Here, a retrospective analysis of the precision of prevalence from these surveys was conducted. Using malaria indicators, data were assembled in nine sub-Saharan African countries with at least two nationally-representative surveys. A Bayesian statistical model was used to estimate between- and within-cluster variability for fever and malaria prevalence, and insecticide-treated bed nets (ITNs) use in children under the age of 5 years. The intra-class correlation coefficient was estimated along with the optimal sample size for each indicator with associated uncertainty. Results suggest that the estimated sample sizes for the current nationally-representative surveys increase with declining malaria prevalence. Comparison between the actual sample size and the modelled estimate showed a requirement to increase the sample size for parasite prevalence by up to 77.7% (95% Bayesian credible intervals 74.7-79.4) for the 2015 Kenya MIS (estimated sample size of children 0-4 years 7218 [7099-7288]), and 54.1% [50.1-56.5] for the 2014-2015 Rwanda DHS (12,220 [11,950-12,410]). This study highlights the importance of defining indicator-relevant sample sizes to achieve the required precision in the current national surveys. While expanding the current surveys would need additional investment, the study highlights the need for improved approaches to cost effective sampling.
Richman, Julie D.; Livi, Kenneth J.T.; Geyh, Alison S.
2011-01-01
Increasing evidence suggests that the physicochemical properties of inhaled nanoparticles influence the resulting toxicokinetics and toxicodynamics. This report presents a method using scanning transmission electron microscopy (STEM) to measure the Mn content throughout the primary particle size distribution of welding fume particle samples collected on filters for application in exposure and health research. Dark field images were collected to assess the primary particle size distribution and energy-dispersive X-ray and electron energy loss spectroscopy were performed for measurement of Mn composition as a function of primary particle size. A manual method incorporating imaging software was used to measure the primary particle diameter and to select an integration region for compositional analysis within primary particles throughout the size range. To explore the variation in the developed metric, the method was applied to 10 gas metal arc welding (GMAW) fume particle samples of mild steel that were collected under a variety of conditions. The range of Mn composition by particle size was −0.10 to 0.19 %/nm, where a positive estimate indicates greater relative abundance of Mn increasing with primary particle size and a negative estimate conversely indicates decreasing Mn content with size. However, the estimate was only statistically significant (p<0.05) in half of the samples (n=5), which all had a positive estimate. In the remaining samples, no significant trend was measured. Our findings indicate that the method is reproducible and that differences in the abundance of Mn by primary particle size among welding fume samples can be detected. PMID:21625364
The impact of sample size on the reproducibility of voxel-based lesion-deficit mappings.
Lorca-Puls, Diego L; Gajardo-Vidal, Andrea; White, Jitrachote; Seghier, Mohamed L; Leff, Alexander P; Green, David W; Crinion, Jenny T; Ludersdorfer, Philipp; Hope, Thomas M H; Bowman, Howard; Price, Cathy J
2018-07-01
This study investigated how sample size affects the reproducibility of findings from univariate voxel-based lesion-deficit analyses (e.g., voxel-based lesion-symptom mapping and voxel-based morphometry). Our effect of interest was the strength of the mapping between brain damage and speech articulation difficulties, as measured in terms of the proportion of variance explained. First, we identified a region of interest by searching on a voxel-by-voxel basis for brain areas where greater lesion load was associated with poorer speech articulation using a large sample of 360 right-handed English-speaking stroke survivors. We then randomly drew thousands of bootstrap samples from this data set that included either 30, 60, 90, 120, 180, or 360 patients. For each resample, we recorded effect size estimates and p values after conducting exactly the same lesion-deficit analysis within the previously identified region of interest and holding all procedures constant. The results show (1) how often small effect sizes in a heterogeneous population fail to be detected; (2) how effect size and its statistical significance vary with sample size; (3) how low-powered studies (due to small sample sizes) can greatly over-estimate as well as under-estimate effect sizes; and (4) how large sample sizes (N ≥ 90) can yield highly significant p values even when effect sizes are so small that they become trivial in practical terms. The implications of these findings for interpreting the results from univariate voxel-based lesion-deficit analyses are discussed. Copyright © 2018 The Author(s). Published by Elsevier Ltd. All rights reserved.
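The resampling design described above can be illustrated with a small sketch. It is not the authors' lesion-deficit pipeline: the data are synthetic, the names `lesion_load` and `deficit` are hypothetical, and a plain squared Pearson correlation stands in for the voxel-based analysis. It only shows how the spread of effect-size estimates might be tabulated for each resample size.

```python
import random
import statistics

def r_squared(xs, ys):
    """Proportion of variance in ys explained by a straight-line fit on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate resample: no variance to explain
    return sxy * sxy / (sxx * syy)

def bootstrap_r2_spread(xs, ys, n, n_boot, rng):
    """Return (min, median, max) of R^2 over n_boot resamples of size n."""
    indices = list(range(len(xs)))
    values = []
    for _ in range(n_boot):
        pick = rng.choices(indices, k=n)  # sampling with replacement
        values.append(r_squared([xs[i] for i in pick], [ys[i] for i in pick]))
    return min(values), statistics.median(values), max(values)

rng = random.Random(0)
# Hypothetical "population" of 360 cases with a weak true relationship.
lesion_load = [rng.random() for _ in range(360)]
deficit = [0.5 * x + rng.gauss(0, 1) for x in lesion_load]

for n in (30, 60, 90, 120, 180, 360):
    low, med, high = bootstrap_r2_spread(lesion_load, deficit, n, 500, rng)
    print(f"n={n:3d}  R^2 min={low:.3f}  median={med:.3f}  max={high:.3f}")
```

Under these assumptions, the range between the minimum and maximum estimate would be expected to narrow as the resample size grows, which is the pattern the abstract describes for low-powered versus large samples.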
Estimating population sizes for elusive animals: the forest elephants of Kakum National Park, Ghana.
Eggert, L S; Eggert, J A; Woodruff, D S
2003-06-01
African forest elephants are difficult to observe in the dense vegetation, and previous studies have relied upon indirect methods to estimate population sizes. Using multilocus genotyping of noninvasively collected samples, we performed a genetic survey of the forest elephant population at Kakum National Park, Ghana. We estimated population size, sex ratio and genetic variability from our data, then combined this information with field observations to divide the population into age groups. Our population size estimate was very close to that obtained using dung counts, the most commonly used indirect method of estimating the population sizes of forest elephant populations. As their habitat is fragmented by expanding human populations, management will be increasingly important to the persistence of forest elephant populations. The data that can be obtained from noninvasively collected samples will help managers plan for the conservation of this keystone species.
Beverage Cans Used for Sediment Collection.
ERIC Educational Resources Information Center
Studlick, Joseph R. J.; Trautman, Timothy A.
1979-01-01
Beverage cans are well suited for sediment collection and storage containers. Advantages include being free, readily available, and the correct size for many samples. Instruction for selection, preparation, and use of cans in sediment collection and storage is provided. (RE)
Sample Size Determination for Regression Models Using Monte Carlo Methods in R
ERIC Educational Resources Information Center
Beaujean, A. Alexander
2014-01-01
A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…
Statistical power calculations for mixed pharmacokinetic study designs using a population approach.
Kloprogge, Frank; Simpson, Julie A; Day, Nicholas P J; White, Nicholas J; Tarning, Joel
2014-09-01
Simultaneous modelling of dense and sparse pharmacokinetic data is possible with a population approach. To determine the number of individuals required to detect the effect of a covariate, simulation-based power calculation methodologies can be employed. The Monte Carlo Mapped Power method (a simulation-based power calculation methodology using the likelihood ratio test) was extended in the current study to perform sample size calculations for mixed pharmacokinetic studies (i.e. both sparse and dense data collection). A workflow guiding an easy and straightforward pharmacokinetic study design, considering also the cost-effectiveness of alternative study designs, was used in this analysis. Initially, data were simulated for a hypothetical drug and then for the anti-malarial drug, dihydroartemisinin. Two datasets (sampling design A: dense; sampling design B: sparse) were simulated using a pharmacokinetic model that included a binary covariate effect and subsequently re-estimated using (1) the same model and (2) a model not including the covariate effect in NONMEM 7.2. Power calculations were performed for varying numbers of patients with sampling designs A and B. Study designs with statistical power >80% were selected and further evaluated for cost-effectiveness. The simulation studies of the hypothetical drug and the anti-malarial drug dihydroartemisinin demonstrated that the simulation-based power calculation methodology, based on the Monte Carlo Mapped Power method, can be utilised to evaluate and determine the sample size of mixed (part sparsely and part densely sampled) study designs. The developed method can contribute to the design of robust and efficient pharmacokinetic studies.
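The general idea of simulation-based power calculation can be sketched as follows. This is a deliberately simplified stand-in, not the Monte Carlo Mapped Power method and not a NONMEM workflow: it assumes a normally distributed outcome, a binary covariate that shifts the mean by a hypothetical `effect`, and a large-sample z-type test on the difference in group means instead of a likelihood ratio test. The function name and all parameter values are assumptions made for illustration.

```python
import random
import statistics
from statistics import NormalDist

def simulated_power(n_per_group, effect, sd, n_sims, alpha, rng):
    """Fraction of simulated studies in which the covariate effect is detected."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    hits = 0
    for _ in range(n_sims):
        # Simulate outcomes without and with the covariate.
        ref = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        cov = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        se = (statistics.variance(ref) / n_per_group
              + statistics.variance(cov) / n_per_group) ** 0.5
        z = (statistics.fmean(cov) - statistics.fmean(ref)) / se
        if abs(z) > z_crit:
            hits += 1
    return hits / n_sims

rng = random.Random(1)
# Scan candidate sample sizes and report the estimated power for each.
for n in (10, 20, 40, 80):
    power = simulated_power(n, effect=0.5, sd=1.0, n_sims=1000, alpha=0.05, rng=rng)
    print(f"n per group = {n:3d}  estimated power = {power:.2f}")
```

In a design exercise of the kind the abstract describes, the smallest candidate size whose estimated power exceeds the target (80% in the abstract) would then be carried forward to the cost comparison.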
The Effects of Model Misspecification and Sample Size on LISREL Maximum Likelihood Estimates.
ERIC Educational Resources Information Center
Baldwin, Beatrice
The robustness of LISREL computer program maximum likelihood estimates under specific conditions of model misspecification and sample size was examined. The population model used in this study contains one exogenous variable; three endogenous variables; and eight indicator variables, two for each latent variable. Conditions of model…
Bennett, Ryan C; Brough, Chris; Miller, Dave A; O'Donnell, Kevin P; Keen, Justin M; Hughey, Justin R; Williams, Robert O; McGinity, James W
2015-03-01
Acetyl-11-keto-β-boswellic acid (AKBA), a gum resin extract, possesses poor water-solubility that limits bioavailability and a high melting point making it difficult to successfully process into solid dispersions by fusion methods. The purpose of this study was to investigate solvent and thermal processing techniques for the preparation of amorphous solid dispersions (ASDs) exhibiting enhanced solubility, dissolution rates and bioavailability. Solid dispersions were successfully produced by rotary evaporation (RE) and KinetiSol® Dispersing (KSD). Solid state and chemical characterization revealed that ASD with good potency and purity were produced by both RE and KSD. Results of the RE studies demonstrated that AQOAT®-LF, AQOAT®-MF, Eudragit® L100-55 and Soluplus with the incorporation of dioctyl sulfosuccinate sodium provided substantial solubility enhancement. Non-sink dissolution analysis showed enhanced dissolution properties for KSD-processed solid dispersions in comparison to RE-processed solid dispersions. Variances in release performance were identified when different particle size fractions of KSD samples were analyzed. Selected RE samples varying in particle surface morphologies were placed under storage and exhibited crystalline growth following solid-state stability analysis at 12 months in comparison to stored KSD samples confirming amorphous instability for RE products. In vivo analysis of KSD-processed solid dispersions revealed significantly enhanced AKBA absorption in comparison to the neat, active substance.
Ongoing Capabilities and Developments of Re-Entry Plasma Ground Tests at EADS-ASTRIUM
NASA Technical Reports Server (NTRS)
Jullien, Pierre
2008-01-01
During re-entry, spacecraft are subjected to extreme thermal loads. On Mars, they may go through dust storms. These external heat loads drive the design of re-entry vehicles, or affect it for spacecraft facing a solid propellant jet stream. Sizing the Thermal Protection System requires a good knowledge of such loads and the means to model and reproduce them on Earth. Through its work on European projects, ASTRIUM has developed the full range of competences to deal with such issues. For instance, we have designed and tested the heat-shield of the Huygens probe which landed on Titan. In particular, our plasma generators aim to reproduce a wide variety of re-entry conditions. Heat loads are generated by the huge speed of the probes. Such conditions cannot be fully reproduced. Ground tests focus on reproducing local aerothermal loads by using slower but hotter flows. Our inductive plasma torch enables testing of small samples at low TRL. Among the arc-jets, one was designed to test the architecture design of the ISS crew return system and others suit more severe re-entries such as sample returns or Venus re-entry. The latest developments aimed at testing samples in seeded flows. The first step was to design and test the seeding device. Special diagnostics characterizing the resulting flow enabled us to fit it to the requirements.
Statistical homogeneity tests applied to large data sets from high energy physics experiments
NASA Astrophysics Data System (ADS)
Trusina, J.; Franc, J.; Kůs, V.
2017-12-01
Homogeneity tests are used in high energy physics for the verification of simulated Monte Carlo samples, that is, to check whether they have the same distribution as the measured data from a particle detector. Kolmogorov-Smirnov, χ², and Anderson-Darling tests are the most used techniques to assess the samples’ homogeneity. Since MC generators produce plenty of entries from different models, each entry has to be re-weighted to obtain the same sample size as the measured data. One way of testing homogeneity is through binning. If we do not want to lose any information, we can apply generalized tests based on weighted empirical distribution functions. In this paper, we propose such generalized weighted homogeneity tests and introduce some of their asymptotic properties. We present the results based on numerical analysis which focuses on estimations of the type-I error and power of the test. Finally, we present an application of our homogeneity tests to data from the DØ experiment at Fermilab.
Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.
Hooper, Richard; Teerenstra, Steven; de Hoop, Esther; Eldridge, Sandra
2016-11-20
The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least two levels of clustering: the clusters themselves and times within clusters. We derive formulae for sample size for repeated cross-section and closed cohort cluster randomised trials with normally distributed outcome measures, under a multilevel model allowing for variation between clusters and between times within clusters. Our formulae agree with those previously described for special cases such as crossover and analysis of covariance designs, although simulation suggests that the formulae could underestimate required sample size when the number of clusters is small. Whether using a formula or simulation, a sample size calculation requires estimates of nuisance parameters, which in our model include the intracluster correlation, cluster autocorrelation, and individual autocorrelation. A cluster autocorrelation less than 1 reflects a situation where individuals sampled from the same cluster at different times have less correlated outcomes than individuals sampled from the same cluster at the same time. Nuisance parameters could be estimated from time series obtained in similarly clustered settings with the same outcome measure, using analysis of variance to estimate variance components. Copyright © 2016 John Wiley & Sons, Ltd.
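The inflation mentioned in the first sentence can be illustrated with the standard textbook design effect for a simple parallel cluster randomised trial, 1 + (m − 1)ρ, where m is the cluster size and ρ the intracluster correlation. This sketch assumes equal cluster sizes and a single cross-section; it is not one of the longitudinal or stepped wedge formulae derived in the paper, and the function names and example values are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Textbook design effect for equal-sized clusters, single cross-section."""
    return 1.0 + (cluster_size - 1) * icc

def clustered_sample_size(n_individual, cluster_size, icc):
    """Inflate an individually randomised sample size; return (total, clusters)."""
    inflated = n_individual * design_effect(cluster_size, icc)
    # round() guards against floating-point noise before taking the ceiling.
    n_total = math.ceil(round(inflated, 6))
    n_clusters = math.ceil(n_total / cluster_size)
    return n_total, n_clusters

# Hypothetical example: 200 participants needed under individual randomisation,
# clusters of 20 participants, intracluster correlation of 0.05.
total, clusters = clustered_sample_size(200, 20, 0.05)
print(f"design effect = {design_effect(20, 0.05):.2f}")
print(f"participants needed = {total}, clusters needed = {clusters}")
```

With these illustrative inputs the design effect is 1.95, so roughly twice as many participants are needed as under individual randomisation.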
NASA Technical Reports Server (NTRS)
Parada, N. D. J. (Principal Investigator); Moreira, M. A.
1983-01-01
Using digitally processed MSS/LANDSAT data as auxiliary variable, a methodology to estimate wheat (Triticum aestivum L.) area by means of sampling techniques was developed. To perform this research, aerial photographs covering 720 sq km in the Cruz Alta test site in the NW of Rio Grande do Sul State were visually analyzed. LANDSAT digital data were analyzed using non-supervised and supervised classification algorithms; as post-processing the classification was submitted to spatial filtering. To estimate wheat area, the regression estimation method was applied and different sample sizes and various sampling units (10, 20, 30, 40 and 60 sq km) were tested. Based on the four decision criteria established for this research, it was concluded that: (1) as the size of sampling units decreased the percentage of sampled area required to obtain similar estimation performance also decreased; (2) the lowest percentage of the area sampled for wheat estimation with relatively high precision and accuracy through regression estimation was 90% using 10 sq km as the sampling unit; and (3) wheat area estimation by direct expansion (using only aerial photographs) was less precise and accurate when compared to those obtained by means of regression estimation.
Trap configuration and spacing influences parameter estimates in spatial capture-recapture models
Sun, Catherine C.; Fuller, Angela K.; Royle, J. Andrew
2014-01-01
An increasing number of studies employ spatial capture-recapture models to estimate population size, but there has been limited research on how different spatial sampling designs and trap configurations influence parameter estimators. Spatial capture-recapture models provide an advantage over non-spatial models by explicitly accounting for heterogeneous detection probabilities among individuals that arise due to the spatial organization of individuals relative to sampling devices. We simulated black bear (Ursus americanus) populations and spatial capture-recapture data to evaluate the influence of trap configuration and trap spacing on estimates of population size and a spatial scale parameter, sigma, that relates to home range size. We varied detection probability and home range size, and considered three trap configurations common to large-mammal mark-recapture studies: regular spacing, clustered, and a temporal sequence of different cluster configurations (i.e., trap relocation). We explored trap spacing and number of traps per cluster by varying the number of traps. The clustered arrangement performed well when detection rates were low, and provides for easier field implementation than the sequential trap arrangement. However, performance differences between trap configurations diminished as home range size increased. Our simulations suggest it is important to consider trap spacing relative to home range sizes, with traps ideally spaced no more than twice the spatial scale parameter. While spatial capture-recapture models can accommodate different sampling designs and still estimate parameters with accuracy and precision, our simulations demonstrate that aspects of sampling design, namely trap configuration and spacing, must consider study area size, ranges of individual movement, and home range sizes in the study population.
False-Negative Rate and Recovery Efficiency Performance of a Validated Sponge Wipe Sampling Method
Piepel, Greg F.; Boucher, Raymond; Tezak, Matt; Amidan, Brett G.; Einfeld, Wayne
2012-01-01
Recovery of spores from environmental surfaces varies due to sampling and analysis methods, spore size and characteristics, surface materials, and environmental conditions. Tests were performed to evaluate a new, validated sponge wipe method using Bacillus atrophaeus spores. Testing evaluated the effects of spore concentration and surface material on recovery efficiency (RE), false-negative rate (FNR), limit of detection (LOD), and their uncertainties. Ceramic tile and stainless steel had the highest mean RE values (48.9 and 48.1%, respectively). Faux leather, vinyl tile, and painted wood had mean RE values of 30.3, 25.6, and 25.5%, respectively, while plastic had the lowest mean RE (9.8%). Results show roughly linear dependences of RE and FNR on surface roughness, with smoother surfaces resulting in higher mean REs and lower FNRs. REs were not influenced by the low spore concentrations tested (3.10 × 10⁻³ to 1.86 CFU/cm²). Stainless steel had the lowest mean FNR (0.123), and plastic had the highest mean FNR (0.479). The LOD90 (≥1 CFU detected 90% of the time) varied with surface material, from 0.015 CFU/cm² on stainless steel up to 0.039 on plastic. It may be possible to improve sampling results by considering surface roughness in selecting sampling locations and interpreting spore recovery data. Further, FNR values (calculated as a function of concentration and surface material) can be used presampling to calculate the numbers of samples for statistical sampling plans with desired performance and postsampling to calculate the confidence in characterization and clearance decisions. PMID:22138998
Dispersion and sampling of adult Dermacentor andersoni in rangeland in Western North America.
Rochon, K; Scoles, G A; Lysyk, T J
2012-03-01
A fixed precision sampling plan was developed for off-host populations of adult Rocky Mountain wood tick, Dermacentor andersoni (Stiles) based on data collected by dragging at 13 locations in Alberta, Canada; Washington; and Oregon. In total, 222 site-date combinations were sampled. Each site-date combination was considered a sample, and each sample ranged in size from 86 to 250 10 m² quadrats. Analysis of simulated quadrats ranging in size from 10 to 50 m² indicated that the most precise sample unit was the 10 m² quadrat. Samples taken when abundance < 0.04 ticks per 10 m² were more likely to not depart significantly from statistical randomness than samples taken when abundance was greater. Data were grouped into ten abundance classes and assessed for fit to the Poisson and negative binomial distributions. The Poisson distribution fit only data in abundance classes < 0.02 ticks per 10 m², while the negative binomial distribution fit data from all abundance classes. A negative binomial distribution with common k = 0.3742 fit data in eight of the 10 abundance classes. Both the Taylor and Iwao mean-variance relationships were fit and used to predict sample sizes for a fixed level of precision. Sample sizes predicted using the Taylor model tended to underestimate actual sample sizes, while sample sizes estimated using the Iwao model tended to overestimate actual sample sizes. Using a negative binomial with common k provided estimates of required sample sizes closest to empirically calculated sample sizes.
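A fixed-precision sample size under a negative binomial with common k can be sketched from the usual variance relationship, variance = m + m²/k, where m is the mean count per quadrat. Defining precision D as the standard error divided by the mean gives n = (1/m + 1/k) / D². The sketch below assumes this textbook definition of precision, which the abstract does not state explicitly; the common k = 0.3742 is taken from the abstract, while the mean and precision values are hypothetical.

```python
import math

def n_negative_binomial(mean, k, precision):
    """Quadrats needed so that SE/mean equals `precision` under a negative
    binomial with dispersion parameter k (variance = mean + mean**2 / k)."""
    if mean <= 0 or k <= 0 or precision <= 0:
        raise ValueError("mean, k and precision must all be positive")
    return math.ceil((1.0 / mean + 1.0 / k) / precision ** 2)

COMMON_K = 0.3742  # common k reported in the abstract

# Hypothetical abundances (ticks per 10 m^2 quadrat) and a precision of 0.25.
for mean in (0.04, 0.1, 0.4):
    n = n_negative_binomial(mean, COMMON_K, precision=0.25)
    print(f"mean = {mean:.2f} ticks/quadrat  ->  {n} quadrats")
```

The formula makes the practical difficulty visible: at low abundance the 1/m term dominates, so the number of quadrats required rises steeply as ticks become scarcer.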
Fujishima, Motonobu; Kawaguchi, Atsushi; Maikusa, Norihide; Kuwano, Ryozo; Iwatsubo, Takeshi; Matsuda, Hiroshi
2017-01-01
Little is known about the sample sizes required for clinical trials of Alzheimer's disease (AD)-modifying treatments using atrophy measures from serial brain magnetic resonance imaging (MRI) in the Japanese population. The primary objective of the present study was to estimate how large a sample size would be needed for future clinical trials for AD-modifying treatments in Japan using atrophy measures of the brain as a surrogate biomarker. Sample sizes were estimated from the rates of change of the whole brain and hippocampus by the k-means normalized boundary shift integral (KN-BSI) and cognitive measures using the data of 537 Japanese Alzheimer's Neuroimaging Initiative (J-ADNI) participants with a linear mixed-effects model. We also examined the potential use of ApoE status as a trial enrichment strategy. The hippocampal atrophy rate required smaller sample sizes than cognitive measures of AD and mild cognitive impairment (MCI). Inclusion of ApoE status reduced sample sizes for AD and MCI patients in the atrophy measures. These results show the potential use of longitudinal hippocampal atrophy measurement using automated image analysis as a progression biomarker and ApoE status as a trial enrichment strategy in a clinical trial of AD-modifying treatment in Japanese people.
Estimation of Effect Size from a Series of Experiments Involving Paired Comparisons.
ERIC Educational Resources Information Center
Gibbons, Robert D.; And Others
1993-01-01
A distribution theory is derived for a G. V. Glass-type (1976) estimator of effect size from studies involving paired comparisons. The possibility of combining effect sizes from studies involving a mixture of related and unrelated samples is also explored. Resulting estimates are illustrated using data from previous psychiatric research. (SLD)
Christopher W. Woodall; Vicente J. Monleon
2009-01-01
The Forest Inventory and Analysis program of the Forest Service, U.S. Department of Agriculture conducts a national inventory of fine woody debris (FWD); however, the sampling protocols involve tallying only the number of FWD pieces by size class that intersect a sampling transect with no measure of actual size. The line intersect estimator used with those samples...
Usami, Satoshi
2017-03-01
Behavioral and psychological researchers have shown strong interest in investigating contextual effects (i.e., the influences of combinations of individual- and group-level predictors on individual-level outcomes). The present research provides generalized formulas for determining the sample size needed in investigating contextual effects according to the desired level of statistical power as well as width of confidence interval. These formulas are derived within a three-level random intercept model that includes one predictor/contextual variable at each level to simultaneously cover the various kinds of contextual effects in which researchers may be interested. The relative influences of indices included in the formulas on the standard errors of contextual effects estimates are investigated with the aim of further simplifying sample size determination procedures. In addition, simulation studies are performed to investigate finite sample behavior of calculated statistical power, showing that estimated sample sizes based on derived formulas can be both positively and negatively biased due to complex effects of unreliability of contextual variables, multicollinearity, and violation of assumption regarding the known variances. Thus, it is advisable to compare estimated sample sizes under various specifications of indices and to evaluate their potential bias, as illustrated in the example.
Lord, Dominique
2006-07-01
There has been considerable research conducted on the development of statistical models for predicting crashes on highway facilities. Despite numerous advancements made for improving the estimation tools of statistical models, the most common probabilistic structure used for modeling motor vehicle crashes remains the traditional Poisson and Poisson-gamma (or Negative Binomial) distribution; when crash data exhibit over-dispersion, the Poisson-gamma model is usually the model most favored by transportation safety modelers. Crash data collected for safety studies often have the unusual attributes of being characterized by low sample mean values. Studies have shown that the goodness-of-fit of statistical models produced from such datasets can be significantly affected. This issue has been defined as the "low mean problem" (LMP). Despite recent developments on methods to circumvent the LMP and test the goodness-of-fit of models developed using such datasets, no work has so far examined how the LMP affects the fixed dispersion parameter of Poisson-gamma models used for modeling motor vehicle crashes. The dispersion parameter plays an important role in many types of safety studies and should, therefore, be reliably estimated. The primary objective of this research project was to verify whether the LMP affects the estimation of the dispersion parameter and, if so, to determine the magnitude of the problem. The secondary objective consisted of determining the effects of an unreliably estimated dispersion parameter on common analyses performed in highway safety studies. To accomplish the objectives of the study, a series of Poisson-gamma distributions were simulated using different values describing the mean, the dispersion parameter, and the sample size.
Three estimators commonly used by transportation safety modelers for estimating the dispersion parameter of Poisson-gamma models were evaluated: the method of moments, the weighted regression, and the maximum likelihood method. In an attempt to complement the outcome of the simulation study, Poisson-gamma models were fitted to crash data collected in Toronto, Ont. characterized by a low sample mean and small sample size. The study shows that a low sample mean combined with a small sample size can seriously affect the estimation of the dispersion parameter, no matter which estimator is used within the estimation process. The probability the dispersion parameter becomes unreliably estimated increases significantly as the sample mean and sample size decrease. Consequently, the results show that an unreliably estimated dispersion parameter can significantly undermine empirical Bayes (EB) estimates as well as the estimation of confidence intervals for the gamma mean and predicted response. The paper ends with recommendations about minimizing the likelihood of producing Poisson-gamma models with an unreliable dispersion parameter for modeling motor vehicle crashes.
Ellison, Laura E.; Lukacs, Paul M.
2014-01-01
Concern for migratory tree-roosting bats in North America has grown because of possible population declines from wind energy development. This concern has driven interest in estimating population-level changes. Mark-recapture methodology is one possible analytical framework for assessing bat population changes, but sample size requirements to produce reliable estimates have not been estimated. To illustrate the sample sizes necessary for a mark-recapture-based monitoring program we conducted power analyses using a statistical model that allows reencounters of live and dead marked individuals. We ran 1,000 simulations for each of five broad sample size categories in a Burnham joint model, and then compared the proportion of simulations in which 95% confidence intervals overlapped between and among years for a 4-year study. Additionally, we conducted sensitivity analyses of sample size to various capture probabilities and recovery probabilities. More than 50,000 individuals per year would need to be captured and released to accurately determine 10% and 15% declines in annual survival. To detect more dramatic declines of 33% or 50% in survival over four years, sample sizes of 25,000 or 10,000 per year, respectively, would be sufficient. Sensitivity analyses reveal that increasing recovery of dead marked individuals may be more valuable than increasing capture probability of marked individuals. Because of the extraordinary effort that would be required, we advise caution should such a mark-recapture effort be initiated because of the difficulty in attaining reliable estimates. We make recommendations for what techniques show the most promise for mark-recapture studies of bats because some techniques violate the assumptions of mark-recapture methodology when used to mark bats.
Olive, F; Rey, S; Zmirou, D
1998-09-01
Epidemiological studies conducted in tourist resorts often face the difficulty of assessing the size of the referent population. Recently, some population size indicators have been tested. Among them, the amount of municipal waste seems to be easy to use and readily accessible. The purpose of the study is to describe how this indicator can be used in mountain tourist resorts. Four tourist resorts were chosen in the Isère département (France): Alpe d'Huez, Deux Alpes, Chamrousse, plateau du Vercors. The evolution of municipal waste over several years was used to compute an individual output level for residents and for tourists. This waste indicator was compared with data on tourist reservations in hotels in the resorts. We found a good fit during tourist seasons in three resorts (Spearman test). For the last one (Chamrousse), the correlation was low. We think that the type of tourism is different in this resort, with many non-residents. This indicator is reliable but needs further validation by sample surveys across several sites and several types of lodging. We propose to estimate the size of the referent population based on an individual output of 1 kg per person per day for residents and 0.5 kg per person per day for tourists.
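The proposed per-person waste outputs lend themselves to a short arithmetic sketch. The abstract gives only the two rates (1 kg per person per day for residents, 0.5 kg for tourists); how they are combined below, by subtracting the residents' expected waste and attributing the remainder to tourists, is an assumption made for illustration, as are the function names and the example figures.

```python
RESIDENT_KG_PER_DAY = 1.0  # individual output for residents, from the abstract
TOURIST_KG_PER_DAY = 0.5   # individual output for tourists, from the abstract

def estimate_tourists(daily_waste_kg, n_residents):
    """Tourists implied by the waste left after the residents' share (assumed model)."""
    tourist_waste = max(0.0, daily_waste_kg - n_residents * RESIDENT_KG_PER_DAY)
    return tourist_waste / TOURIST_KG_PER_DAY

def estimate_population(daily_waste_kg, n_residents):
    """Referent population: known residents plus estimated tourists."""
    return n_residents + estimate_tourists(daily_waste_kg, n_residents)

# Hypothetical resort: 1,500 permanent residents, 3,000 kg of waste collected per day.
print(estimate_tourists(3000, 1500))    # tourists implied by the excess waste
print(estimate_population(3000, 1500))  # total referent population
```

With these made-up inputs, 1,500 kg of waste is attributed to residents and the remaining 1,500 kg corresponds to 3,000 tourists, for a referent population of 4,500.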
A general approach to double-moment normalization of drop size distributions
NASA Astrophysics Data System (ADS)
Lee, G. W.; Sempere-Torres, D.; Uijlenhoet, R.; Zawadzki, I.
2003-04-01
Normalization of drop size distributions (DSDs) is re-examined here. First, we present an extension of scaling normalization using one moment of the DSD as a parameter (as introduced by Sempere-Torres et al., 1994) to a scaling normalization using two moments as parameters of the normalization. It is shown that the normalization of Testud et al. (2001) is a particular case of the two-moment scaling normalization. Thus, a unified vision of the question of DSD normalization and a good model representation of DSDs is given. Data analysis shows that, from the point of view of moment estimation, least-squares regression is slightly more effective than moment estimation from the normalized average DSD.
Zhang, Song; Cao, Jing; Ahn, Chul
2017-02-20
We investigate the estimation of intervention effect and sample size determination for experiments where subjects are supposed to contribute paired binary outcomes with some incomplete observations. We propose a hybrid estimator to appropriately account for the mixed nature of observed data: paired outcomes from those who contribute complete pairs of observations and unpaired outcomes from those who contribute either pre-intervention or post-intervention outcomes. We theoretically prove that if incomplete data are evenly distributed between the pre-intervention and post-intervention periods, the proposed estimator will always be more efficient than the traditional estimator. A numerical study shows that when the distribution of incomplete data is unbalanced, the proposed estimator will be superior when there is moderate-to-strong positive within-subject correlation. We further derive a closed-form sample size formula to help researchers determine how many subjects need to be enrolled in such studies. Simulation results suggest that the calculated sample size maintains the empirical power and type I error under various design configurations. We demonstrate the proposed method using a real application example. Copyright © 2016 John Wiley & Sons, Ltd.
Haag, Wendell R
2013-08-01
Selection is expected to optimize reproductive investment resulting in characteristic trade-offs among traits such as brood size, offspring size, somatic maintenance, and lifespan; relative patterns of energy allocation to these functions are important in defining life-history strategies. Freshwater mussels are a diverse and imperiled component of aquatic ecosystems, but little is known about their life-history strategies, particularly patterns of fecundity and reproductive effort. Because mussels have an unusual life cycle in which larvae (glochidia) are obligate parasites on fishes, differences in host relationships are expected to influence patterns of reproductive output among species. I investigated fecundity and reproductive effort (RE) and their relationships to other life-history traits for a taxonomically broad cross section of North American mussel diversity. Annual fecundity of North American mussel species spans nearly four orders of magnitude, ranging from < 2000 to 10 million, but most species have considerably lower fecundity than previous generalizations, which portrayed the group as having uniformly high fecundity (e.g. > 200000). Estimates of RE also were highly variable, ranging among species from 0.06 to 25.4%. Median fecundity and RE differed among phylogenetic groups, but patterns for these two traits differed in several ways. For example, the tribe Anodontini had relatively low median fecundity but had the highest RE of any group. Within and among species, body size was a strong predictor of fecundity and explained a high percentage of variation in fecundity among species. Fecundity showed little relationship to other life-history traits including glochidial size, lifespan, brooding strategies, or host strategies. 
The only apparent trade-off evident among these traits was the extraordinarily high fecundity of Leptodea, Margaritifera, and Truncilla, which may come at a cost of greatly reduced glochidial size; there was no relationship between fecundity and glochidial size for the remaining 61 species in the dataset. In contrast to fecundity, RE showed evidence of a strong trade-off with lifespan, which was negatively related to RE. The raw number of glochidia produced may be determined primarily by physical and energetic constraints rather than selection for optimal output based on differences in host strategies or other traits. By integrating traits such as body size, glochidial size, and fecundity, RE appears more useful in defining mussel life-history strategies. Combined with trade-offs between other traits such as growth, lifespan, and age at maturity, differences in RE among species depict a broad continuum of divergent strategies ranging from strongly r-selected species (e.g. tribe Anodontini and some Lampsilini) to K-selected species (e.g. tribes Pleurobemini and Quadrulini; family Margaritiferidae). Future studies of reproductive effort in an environmental and life-history context will be useful for understanding the explosive radiation of this group of animals in North America and will aid in the development of effective conservation strategies. Published 2013. This article is a U.S. Government work and is in the public domain in the USA.
Sepúlveda, Nuno; Drakeley, Chris
2015-04-03
In the last decade, several epidemiological studies have demonstrated the potential of using seroprevalence (SP) and seroconversion rate (SCR) as informative indicators of malaria burden in low transmission settings or in populations on the cusp of elimination. However, most studies are designed to control ensuing statistical inference over parasite rates and not on these alternative malaria burden measures. SP is in essence a proportion and, thus, many methods exist for the respective sample size determination. In contrast, designing a study where SCR is the primary endpoint is not an easy task because precision and statistical power are affected by the age distribution of a given population. Two sample size calculators for SCR estimation are proposed. The first one consists of transforming the confidence interval for SP into the corresponding one for SCR given a known seroreversion rate (SRR). The second calculator extends the previous one to the most common situation where SRR is unknown. In this situation, data simulation was used together with linear regression in order to study the expected relationship between sample size and precision. The performance of the first sample size calculator was studied in terms of the coverage of the confidence intervals for SCR. The results pointed to possible problems of under- or over-coverage for sample sizes ≤250 in very low and high malaria transmission settings (SCR ≤ 0.0036 and SCR ≥ 0.29, respectively). The correct coverage was obtained for the remaining transmission intensities with sample sizes ≥ 50. Sample size determination was then carried out for cross-sectional surveys using realistic SCRs from past sero-epidemiological studies and typical age distributions from African and non-African populations. For SCR < 0.058, African studies require a larger sample size than their non-African counterparts in order to obtain the same precision. The opposite happens for the remaining transmission intensities. 
With respect to the second sample size calculator, simulation revealed the likelihood of not having enough information to estimate SRR in low transmission settings (SCR ≤ 0.0108). In that case, the respective estimates tend to underestimate the true SCR. This problem is minimized by sample sizes of no less than 500 individuals. The sample sizes determined by this second method highlighted the prior expectation that, when SRR is not known, sample sizes are increased in relation to the situation of a known SRR. In contrast to the first sample size calculation, African studies would now require fewer individuals than their counterparts conducted elsewhere, irrespective of the transmission intensity. Although the proposed sample size calculators can be instrumental to design future cross-sectional surveys, the choice of a particular sample size must be seen as a much broader exercise that involves weighing statistical precision against ethical issues, available human and economic resources, and possible time constraints. Moreover, if the sample size determination is carried out on varying transmission intensities, as done here, the respective sample sizes can also be used in studies comparing sites with different malaria transmission intensities. In conclusion, the proposed sample size calculators are a step towards the design of better sero-epidemiological studies. Their basic ideas show promise to be applied to the planning of alternative sampling schemes that may target or oversample specific age groups.
State Estimates of Disability in America. Disability Statistics Report 3.
ERIC Educational Resources Information Center
LaPlante, Mitchell P.
This study presents and discusses existing data on disability by state, from the 1980 and 1990 censuses, the Current Population Survey (CPS), and the National Health Interview Survey (NHIS). The study used direct methods for states with large sample sizes and synthetic estimates for states with low sample sizes. The study's highlighted findings…
Jeffrey H. Gove
2003-01-01
Many of the most popular sampling schemes used in forestry are probability proportional to size methods. These methods are also referred to as size biased because sampling is actually from a weighted form of the underlying population distribution. Length- and area-biased sampling are special cases of size-biased sampling where the probability weighting comes from a...
Accounting for Incomplete Species Detection in Fish Community Monitoring
DOE Office of Scientific and Technical Information (OSTI.GOV)
McManamay, Ryan A; Orth, Dr. Donald J; Jager, Yetta
2013-01-01
Riverine fish assemblages are heterogeneous and very difficult to characterize with a one-size-fits-all approach to sampling. Furthermore, detecting changes in fish assemblages over time requires accounting for variation in sampling designs. We present a modeling approach that permits heterogeneous sampling by accounting for site and sampling covariates (including method) in a model-based framework for estimation (versus a sampling-based framework). We snorkeled during three surveys and electrofished during a single survey in a suite of delineated habitats stratified by reach types. We developed single-species occupancy models to determine covariates influencing patch occupancy and species detection probabilities whereas community occupancy models estimated species richness in light of incomplete detections. For most species, information-theoretic criteria showed higher support for models that included patch size and reach as covariates of occupancy. In addition, models including patch size and sampling method as covariates of detection probabilities also had higher support. Detection probability estimates for snorkeling surveys were higher for larger non-benthic species whereas electrofishing was more effective at detecting smaller benthic species. The number of sites and sampling occasions required to accurately estimate occupancy varied among fish species. For rare benthic species, our results suggested that a higher number of occasions, and especially the addition of electrofishing, may be required to improve detection probabilities and obtain accurate occupancy estimates. Community models suggested that richness was 41% higher than the number of species actually observed and the addition of an electrofishing survey increased estimated richness by 13%. These results can be useful to future fish assemblage monitoring efforts by informing sampling designs, such as site selection (e.g. stratifying based on patch size) and determining effort required (e.g. number of sites versus occasions).
How Large Should a Statistical Sample Be?
ERIC Educational Resources Information Center
Menil, Violeta C.; Ye, Ruili
2012-01-01
This study serves as a teaching aid for teachers of introductory statistics. The aim of this study was limited to determining various sample sizes when estimating population proportion. Tables on sample sizes were generated using a C++ program, which depends on population size, degree of precision or error level, and confidence…
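The dependence on population size, precision, and confidence level described above can be sketched in a few lines. This is a minimal illustration, not the C++ program from the study: it assumes the common normal-approximation formula with a finite population correction, and the function name and parameters (`sample_size_for_proportion`, `margin`, `confidence`, `p`, `population`) are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p=0.5, population=None):
    """Sample size needed to estimate a proportion within +/- margin.

    Assumption: normal approximation, n0 = z^2 * p * (1 - p) / margin^2,
    optionally shrunk by the finite population correction.
    p = 0.5 is the conservative (largest-variance) planning value.
    """
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction for sampling without replacement.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


if __name__ == "__main__":
    # Rows: population size; columns: margin of error (95% confidence).
    for size in (500, 1000, 10000, None):
        row = [sample_size_for_proportion(m, population=size) for m in (0.10, 0.05, 0.03)]
        print(size, row)
```

With a 5% margin and 95% confidence the sketch gives 385 for an effectively infinite population and 278 for a population of 1,000, which shows how much the correction matters for small populations.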
Revisiting sample size: are big trials the answer?
Lurati Buse, Giovanna A L; Botto, Fernando; Devereaux, P J
2012-07-18
The superiority of the evidence generated in randomized controlled trials over observational data is not only conditional to randomization. Randomized controlled trials require proper design and implementation to provide a reliable effect estimate. Adequate random sequence generation, allocation implementation, analyses based on the intention-to-treat principle, and sufficient power are crucial to the quality of a randomized controlled trial. Power, or the probability of the trial to detect a difference when a real difference between treatments exists, strongly depends on sample size. The quality of orthopaedic randomized controlled trials is frequently threatened by a limited sample size. This paper reviews basic concepts and pitfalls in sample-size estimation and focuses on the importance of large trials in the generation of valid evidence.
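How strongly power depends on sample size can be illustrated with a short calculation. The sketch below assumes a two-sided two-sample z-test with equal group sizes and a known common standard deviation; this is a simplification chosen for illustration, not a method taken from the review, and the name `power_two_sample` and its parameters are hypothetical.

```python
from statistics import NormalDist


def power_two_sample(n_per_group, delta, sigma, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    Assumptions: equal group sizes, known common standard deviation sigma,
    true mean difference delta.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Standardized true difference (noncentrality) for this sample size.
    ncp = delta / (sigma * (2 / n_per_group) ** 0.5)
    # Probability of rejecting in either tail.
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


if __name__ == "__main__":
    for n in (10, 20, 40, 64, 100, 200):
        print(n, round(power_two_sample(n, delta=0.5, sigma=1.0), 3))
```

For a standardized difference of 0.5, about 64 subjects per group are needed for roughly 80% power under these assumptions; much smaller trials leave a real difference more likely to be missed than detected.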
genepop'007: a complete re-implementation of the genepop software for Windows and Linux.
Rousset, François
2008-01-01
This note summarizes developments of the genepop software since its first description in 1995, and in particular those new to version 4.0: an extended input format, several estimators of neighbourhood size under isolation by distance, new estimators and confidence intervals for null allele frequency, and less important extensions to previous options. genepop now runs under Linux as well as under Windows, and can be entirely controlled by batch calls. © 2007 The Author.
Sub-sampling genetic data to estimate black bear population size: A case study
Tredick, C.A.; Vaughan, M.R.; Stauffer, D.F.; Simek, S.L.; Eason, T.
2007-01-01
Costs for genetic analysis of hair samples collected for individual identification of bears average approximately US$50 [2004] per sample. This can easily exceed budgetary allowances for large-scale studies or studies of high-density bear populations. We used 2 genetic datasets from 2 areas in the southeastern United States to explore how reducing costs of analysis by sub-sampling affected precision and accuracy of resulting population estimates. We used several sub-sampling scenarios to create subsets of the full datasets and compared summary statistics, population estimates, and precision of estimates generated from these subsets to estimates generated from the complete datasets. Our results suggested that bias and precision of estimates improved as the proportion of total samples used increased, and heterogeneity models (e.g., Mh[CHAO]) were more robust to reduced sample sizes than other models (e.g., behavior models). We recommend that only high-quality samples (>5 hair follicles) be used when budgets are constrained, and efforts should be made to maximize capture and recapture rates in the field.
A Note on Sample Size and Solution Propriety for Confirmatory Factor Analytic Models
ERIC Educational Resources Information Center
Jackson, Dennis L.; Voth, Jennifer; Frey, Marc P.
2013-01-01
Determining an appropriate sample size for use in latent variable modeling techniques has presented ongoing challenges to researchers. In particular, small sample sizes are known to present concerns over sampling error for the variances and covariances on which model estimation is based, as well as for fit indexes and convergence failures. The…
Letcher, B.H.; Horton, G.E.
2008-01-01
We estimated the magnitude and shape of size-dependent survival (SDS) across multiple sampling intervals for two cohorts of stream-dwelling Atlantic salmon (Salmo salar) juveniles using multistate capture-mark-recapture (CMR) models. Simulations designed to test the effectiveness of multistate models for detecting SDS in our system indicated that error in SDS estimates was low and that both time-invariant and time-varying SDS could be detected with sample sizes of >250, average survival of >0.6, and average probability of capture of >0.6, except for cases of very strong SDS. In the field (N ≈ 750, survival 0.6-0.8 among sampling intervals, probability of capture 0.6-0.8 among sampling occasions), about one-third of the sampling intervals showed evidence of SDS, with poorer survival of larger fish during the age-2+ autumn and quadratic survival (opposite direction between cohorts) during age-1+ spring. The varying magnitude and shape of SDS among sampling intervals suggest a potential mechanism for the maintenance of the very wide observed size distributions. Estimating SDS using multistate CMR models appears complementary to established approaches, can provide estimates with low error, and can be used to detect intermittent SDS. © 2008 NRC Canada.
Statistical power analysis in wildlife research
Steidl, R.J.; Hayes, J.P.
1997-01-01
Statistical power analysis can be used to increase the efficiency of research efforts and to clarify research results. Power analysis is most valuable in the design or planning phases of research efforts. Such prospective (a priori) power analyses can be used to guide research design and to estimate the number of samples necessary to achieve a high probability of detecting biologically significant effects. Retrospective (a posteriori) power analysis has been advocated as a method to increase information about hypothesis tests that were not rejected. However, estimating power for tests of null hypotheses that were not rejected with the effect size observed in the study is incorrect; these power estimates will always be ≤0.50 when bias adjusted and have no relation to true power. Therefore, retrospective power estimates based on the observed effect size for hypothesis tests that were not rejected are misleading; retrospective power estimates are only meaningful when based on effect sizes other than the observed effect size, such as those effect sizes hypothesized to be biologically significant. Retrospective power analysis can be used effectively to estimate the number of samples or effect size that would have been necessary for a completed study to have rejected a specific null hypothesis. Simply presenting confidence intervals can provide additional information about null hypotheses that were not rejected, including information about the size of the true effect and whether or not there is adequate evidence to 'accept' a null hypothesis as true. 
We suggest that (1) statistical power analyses be routinely incorporated into research planning efforts to increase their efficiency, (2) confidence intervals be used in lieu of retrospective power analyses for null hypotheses that were not rejected to assess the likely size of the true effect, (3) minimum biologically significant effect sizes be used for all power analyses, and (4) if retrospective power estimates are to be reported, then the α-level, effect sizes, and sample sizes used in calculations must also be reported.
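The reason retrospective power based on the observed effect size carries no extra information can be seen in a small calculation. The sketch below assumes a one-sided z-test, which is a deliberate simplification and not the bias-adjusted procedure the authors refer to; `observed_power` is a hypothetical helper name. Under this assumption "observed power" is a fixed function of the p-value alone and cannot exceed 0.5 for any test that failed to reject.

```python
from statistics import NormalDist


def observed_power(p_value, alpha=0.05):
    """'Observed power' of a one-sided z-test, computed from its p-value.

    Assumption: the observed effect is plugged in as if it were the true
    effect, so the result depends on nothing but p_value and alpha.
    """
    nd = NormalDist()
    z_obs = nd.inv_cdf(1 - p_value)   # observed test statistic
    z_crit = nd.inv_cdf(1 - alpha)    # rejection threshold
    return nd.cdf(z_obs - z_crit)


if __name__ == "__main__":
    for p in (0.05, 0.10, 0.20, 0.50, 0.90):
        print(p, round(observed_power(p), 3))
```

Because the mapping from p-value to observed power is one-to-one, reporting it adds nothing beyond the p-value itself, which is consistent with the recommendation to report confidence intervals instead.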
A novel measure of effect size for mediation analysis.
Lachowicz, Mark J; Preacher, Kristopher J; Kelley, Ken
2018-06-01
Mediation analysis has become one of the most popular statistical methods in the social sciences. However, many currently available effect size measures for mediation have limitations that restrict their use to specific mediation models. In this article, we develop a measure of effect size that addresses these limitations. We show how modification of a currently existing effect size measure results in a novel effect size measure with many desirable properties. We also derive an expression for the bias of the sample estimator for the proposed effect size measure and propose an adjusted version of the estimator. We present a Monte Carlo simulation study conducted to examine the finite sampling properties of the adjusted and unadjusted estimators, which shows that the adjusted estimator is effective at recovering the true value it estimates. Finally, we demonstrate the use of the effect size measure with an empirical example. We provide freely available software so that researchers can immediately implement the methods we discuss. Our developments here extend the existing literature on effect sizes and mediation by developing a potentially useful method of communicating the magnitude of mediation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Gupta, Manan; Joshi, Amitabh; Vidya, T N C
2017-01-01
Mark-recapture estimators are commonly used for population size estimation, and typically yield unbiased estimates for most solitary species with low to moderate home range sizes. However, these methods assume independence of captures among individuals, an assumption that is clearly violated in social species that show fission-fusion dynamics, such as the Asian elephant. In the specific case of Asian elephants, doubts have been raised about the accuracy of population size estimates. More importantly, the potential problem for the use of mark-recapture methods posed by social organization in general has not been systematically addressed. We developed an individual-based simulation framework to systematically examine the potential effects of type of social organization, as well as other factors such as trap density and arrangement, spatial scale of sampling, and population density, on bias in population sizes estimated by POPAN, Robust Design, and Robust Design with detection heterogeneity. In the present study, we ran simulations with biological, demographic and ecological parameters relevant to Asian elephant populations, but the simulation framework is easily extended to address questions relevant to other social species. We collected capture history data from the simulations, and used those data to test for bias in population size estimation. Social organization significantly affected bias in most analyses, but the effect sizes were variable, depending on other factors. Social organization tended to introduce large bias when trap arrangement was uniform and sampling effort was low. POPAN clearly outperformed the two Robust Design models we tested, yielding close to zero bias if traps were arranged at random in the study area, and when population density and trap density were not too low. Social organization did not have a major effect on bias for these parameter combinations at which POPAN gave more or less unbiased population size estimates. 
Therefore, the effect of social organization on bias in population estimation could be removed by using POPAN with specific parameter combinations, to obtain population size estimates in a social species. PMID:28306735
Santin-Janin, Hugues; Hugueny, Bernard; Aubry, Philippe; Fouchet, David; Gimenez, Olivier; Pontier, Dominique
2014-01-01
Background: Data collected to inform time variations in natural population size are tainted by sampling error. Ignoring sampling error in population dynamics models induces bias in parameter estimators, e.g., density-dependence. In particular, when sampling errors are independent among populations, the classical estimator of the synchrony strength (zero-lag correlation) is biased downward. However, this bias is rarely taken into account in synchrony studies although it may lead to overemphasizing the role of intrinsic factors (e.g., dispersal) with respect to extrinsic factors (the Moran effect) in generating population synchrony as well as to underestimating the extinction risk of a metapopulation. Methodology/Principal findings: The aim of this paper was first to illustrate the extent of the bias that can be encountered in empirical studies when sampling error is neglected. Second, we presented a state-space modelling approach that explicitly accounts for sampling error when quantifying population synchrony. Third, we exemplify our approach with datasets for which sampling variance (i) has been previously estimated, and (ii) has to be jointly estimated with population synchrony. Finally, we compared our results to those of a standard approach neglecting sampling variance. We showed that ignoring sampling variance can mask a synchrony pattern whatever its true value and that the common practice of averaging few replicates of population size estimates poorly performed at decreasing the bias of the classical estimator of the synchrony strength. Conclusion/Significance: The state-space model used in this study provides a flexible way of accurately quantifying the strength of synchrony patterns from most population size data encountered in field studies, including over-dispersed count data. 
We provided a user-friendly R-program and a tutorial example to encourage further studies aiming at quantifying the strength of population synchrony to account for uncertainty in population size estimates. PMID:24489839
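The downward bias of the zero-lag correlation under independent sampling error can be reproduced with a small simulation. This is only a sketch of the attenuation effect, not the authors' state-space model or their R program: it assumes two Gaussian series with unit variance, a chosen true correlation, and additive independent observation noise, and all names (`pearson`, `simulate_synchrony`, `error_sd`, `n_steps`) are hypothetical.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_synchrony(rho, error_sd, n_steps=20000, seed=1):
    """Return (correlation of true series, correlation of noisy observations)."""
    rng = random.Random(seed)
    true_a, true_b, obs_a, obs_b = [], [], [], []
    for _ in range(n_steps):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        a = z1
        b = rho * z1 + (1 - rho ** 2) ** 0.5 * z2  # correlated with a at level rho
        true_a.append(a)
        true_b.append(b)
        # Independent sampling error is added to each population separately.
        obs_a.append(a + rng.gauss(0, error_sd))
        obs_b.append(b + rng.gauss(0, error_sd))
    return pearson(true_a, true_b), pearson(obs_a, obs_b)


if __name__ == "__main__":
    true_r, obs_r = simulate_synchrony(rho=0.8, error_sd=1.0)
    print(round(true_r, 3), round(obs_r, 3))
```

Under these assumptions the expected observed correlation is rho / (1 + error_sd**2), so a true synchrony of 0.8 appears as roughly 0.4 when the sampling variance equals the process variance.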
Novikov, I; Fund, N; Freedman, L S
2010-01-15
Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
A new estimator of the discovery probability.
Favaro, Stefano; Lijoi, Antonio; Prünster, Igor
2012-12-01
Species sampling problems have a long history in ecological and biological studies and a number of issues, including the evaluation of species richness, the design of sampling experiments, and the estimation of rare species variety, are to be addressed. Such inferential problems have recently emerged also in genomic applications, however, exhibiting some peculiar features that make them more challenging: specifically, one has to deal with very large populations (genomic libraries) containing a huge number of distinct species (genes) and only a small portion of the library has been sampled (sequenced). These aspects motivate the Bayesian nonparametric approach we undertake, since it allows one to achieve the degree of flexibility typically needed in this framework. Based on an observed sample of size n, focus will be on prediction of a key aspect of the outcome from an additional sample of size m, namely, the so-called discovery probability. In particular, conditionally on an observed basic sample of size n, we derive a novel estimator of the probability of detecting, at the (n+m+1)th observation, species that have been observed with any given frequency in the enlarged sample of size n+m. Such an estimator admits a closed-form expression that can be exactly evaluated. The result we obtain allows us to quantify both the rate at which rare species are detected and the achieved sample coverage of abundant species, as m increases. Natural applications are represented by the estimation of the probability of discovering rare genes within genomic libraries and the results are illustrated by means of two expressed sequence tags datasets. © 2012, The International Biometric Society.
Improved Rhenium Thrust Chambers
NASA Technical Reports Server (NTRS)
O'Dell, John Scott
2015-01-01
Radiation-cooled bipropellant thrust chambers are being considered for ascent/descent engines and reaction control systems on various NASA missions and spacecraft, such as the Mars Sample Return and Orion Multi-Purpose Crew Vehicle (MPCV). Currently, iridium (Ir)-lined rhenium (Re) combustion chambers are the state of the art for in-space engines. NASA's Advanced Materials Bipropellant Rocket (AMBR) engine, a 150-lbf Ir-Re chamber produced by Plasma Processes and Aerojet Rocketdyne, recently set a hydrazine specific impulse record of 333.5 seconds. To withstand the high loads during terrestrial launch, Re chambers with improved mechanical properties are needed. Recent electrochemical forming (EL-Form™) results have shown considerable promise for improving Re's mechanical properties by producing a multilayered deposit composed of a tailored microstructure (i.e., Engineered Re). The Engineered Re processing techniques were optimized, and detailed characterization and mechanical properties tests were performed. The most promising techniques were selected and used to produce an Engineered Re AMBR-sized combustion chamber for testing at Aerojet Rocketdyne.
Jun, Jae Kwan; Kim, Mi Jin; Choi, Kui Son; Suh, Mina; Jung, Kyu-Won
2012-01-01
Mammographic breast density is a known risk factor for breast cancer. To conduct a survey to estimate the distribution of mammographic breast density in Korean women, appropriate sampling strategies for representative and efficient sampling design were evaluated through simulation. Using the target population from the National Cancer Screening Programme (NCSP) for breast cancer in 2009, we verified the distribution estimate by repeating the simulation 1,000 times using stratified random sampling to investigate the distribution of breast density of 1,340,362 women. According to the simulation results, using a sampling design stratifying the nation into three groups (metropolitan, urban, and rural), with a total sample size of 4,000, we estimated the distribution of breast density in Korean women at a level of 0.01% tolerance. Based on the results of our study, a nationwide survey for estimating the distribution of mammographic breast density among Korean women can be conducted efficiently.
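One ingredient of a stratified design like the one described above is deciding how many units to draw from each stratum. The sketch below assumes proportional allocation with largest-remainder rounding, which the abstract does not state was the rule used; the function name `proportional_allocation` is hypothetical, and the stratum sizes are invented placeholders chosen only so that they add up to the 1,340,362 women mentioned in the abstract.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size.

    Assumption: proportional allocation; fractional units are assigned to
    the strata with the largest remainders so the parts sum exactly.
    """
    population = sum(stratum_sizes.values())
    exact = {k: total_sample * v / population for k, v in stratum_sizes.items()}
    alloc = {k: int(x) for k, x in exact.items()}
    leftover = total_sample - sum(alloc.values())
    # Hand out the remaining units by descending fractional part.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical stratum sizes, not taken from the study.
    strata = {"metropolitan": 600000, "urban": 500000, "rural": 240362}
    print(proportional_allocation(strata, 4000))
```

Repeating a draw with such an allocation many times, as the study did 1,000 times, would then show how closely the sampled distribution tracks the population distribution.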
Gilbert, Andrew T.; O'Connell, Allan F.; Annand, Elizabeth M.; Talancy, Neil W.; Sauer, John R.; Nichols, James D.
2008-01-01
An inventory of mammals was conducted during 2004 at nine national park sites in the Northeast Temperate Network (NETN): Acadia National Park (NP), Marsh-Billings-Rockefeller National Historical Park (NHP), Minute Man NHP, Morristown NHP, Roosevelt-Vanderbilt National Historic Site (NHS), Saint-Gaudens NHS, Saugus Iron Works NHS, Saratoga NHP, and Weir Farm NHS. Sagamore Hill NHS, part of the Northeast Coastal and Barrier Network (NCBN), was also surveyed. Each park except Acadia NP was sampled twice, once in the winter/spring and again in the summer/fall. During the winter/spring visit, indirect measure (IM) sampling arrays were employed at 2 to 16 stations and included sampling by remote cameras, cubby boxes (covered trackplates), and hair traps. IM stations were established and re-used during the summer/fall sampling period. Trapping was conducted at 2 to 12 stations at all parks except Acadia NP during the summer/fall period and consisted of arrays of small-mammal traps, squirrel-sized live traps, and some fox-sized live traps. We used estimation-based procedures and probabilistic sampling techniques to design this inventory. A total of 38 species was detected by IM sampling, trapping, and field observations. Species diversity (number of species) varied among parks, ranging from 8 to 24, with Minute Man NHP having the most species detected. Raccoon (Procyon lotor), Virginia Opossum (Didelphis virginiana), Fisher (Martes pennanti), and Domestic Cat (Felis silvestris) were the most common medium-sized mammals detected in this study and White-footed Mouse (Peromyscus leucopus), Northern Short-tailed Shrew (Blarina brevicauda), Deer Mouse (P. maniculatus), and Meadow Vole (Microtus pennsylvanicus) the most common small mammals detected. All species detected are considered fairly common throughout their range including the Fisher, which has been reintroduced in several New England states. We did not detect any state or federal endangered or threatened species.
Influence of item distribution pattern and abundance on efficiency of benthic core sampling
Behney, Adam C.; O'Shaughnessy, Ryan; Eichholz, Michael W.; Stafford, Joshua D.
2014-01-01
Core sampling is a commonly used method to estimate benthic item density, but little information exists about factors influencing the accuracy and time-efficiency of this method. We simulated core sampling in a Geographic Information System framework by generating points (benthic items) and polygons (core samplers) to assess how sample size (number of core samples), core sampler size (cm²), distribution of benthic items, and item density affected the bias and precision of estimates of density, the detection probability of items, and the time-costs. When items were distributed randomly versus clumped, bias decreased and precision increased with increasing sample size and increased slightly with increasing core sampler size. Bias and precision were only affected by benthic item density at very low values (500–1,000 items/m²). Detection probability (the probability of capturing ≥ 1 item in a core sample if it is available for sampling) was substantially greater when items were distributed randomly as opposed to clumped. Taking more small diameter core samples was always more time-efficient than taking fewer large diameter samples. We are unable to present a single, optimal sample size, but provide information for researchers and managers to derive optimal sample sizes dependent on their research goals and environmental conditions.
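The kind of simulation this abstract describes can be illustrated with a minimal sketch. It is not the authors' GIS workflow: the 1 m × 1 m plot, the circular cores, the parent-point clustering rule, and all function names are assumptions made only for illustration.

```python
import math
import random

def random_items(n, rng):
    """Scatter n items uniformly over a 1 m x 1 m plot (assumed plot size)."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def clumped_items(n, n_clumps, spread, rng):
    """Place n items around n_clumps random parent points (assumed clustering rule)."""
    parents = [(rng.random(), rng.random()) for _ in range(n_clumps)]
    items = []
    for _ in range(n):
        px, py = rng.choice(parents)
        # Wrap coordinates so every item stays inside the plot.
        items.append(((px + rng.gauss(0, spread)) % 1.0,
                      (py + rng.gauss(0, spread)) % 1.0))
    return items

def core_counts(items, n_cores, core_area_cm2, rng):
    """Count the items falling inside each randomly placed circular core."""
    radius = math.sqrt(core_area_cm2 / 1e4 / math.pi)  # cm^2 -> m^2 -> radius in m
    counts = []
    for _ in range(n_cores):
        # Keep each core fully inside the plot.
        cx = rng.uniform(radius, 1 - radius)
        cy = rng.uniform(radius, 1 - radius)
        counts.append(sum((x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
                          for x, y in items))
    return counts

def density_estimate(counts, core_area_cm2):
    """Mean count per core converted to items per m^2."""
    return (sum(counts) / len(counts)) / (core_area_cm2 / 1e4)

def detection_rate(counts):
    """Share of cores that captured at least one item."""
    return sum(c > 0 for c in counts) / len(counts)

# Compare one random and one clumped layout at the same true density.
rng = random.Random(1)
true_density = 2000  # items per m^2 (illustrative value)
layouts = {
    "random": random_items(true_density, rng),
    "clumped": clumped_items(true_density, 10, 0.03, rng),
}
for label, items in layouts.items():
    counts = core_counts(items, n_cores=50, core_area_cm2=20.0, rng=rng)
    print(label, density_estimate(counts, 20.0), detection_rate(counts))
```

Repeating the loop over many seeds, core counts, and core areas would give the bias and precision comparisons the abstract refers to; a single run only shows how the pieces fit together.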
NASA Astrophysics Data System (ADS)
Engelbrecht, Johann P.; Moosmüller, Hans; Pincock, Samuel; Jayanty, R. K. M.; Lersch, Traci; Casuccio, Gary
2016-08-01
This paper promotes an understanding of the mineralogical, chemical, and physical interrelationships of re-suspended mineral dusts collected as grab samples from global dust sources. Surface soils were collected from arid regions, including the southwestern USA, Mali, Chad, Morocco, Canary Islands, Cabo Verde, Djibouti, Afghanistan, Iraq, Kuwait, Qatar, UAE, Serbia, China, Namibia, Botswana, Australia, and Chile. The < 38 µm sieved fraction of each sample was re-suspended in a chamber, from which the airborne mineral dust could be extracted, sampled, and analyzed. Instruments integrated into the entrainment facility included two PM10 and two PM2.5 filter samplers, a beta attenuation gauge for the continuous measurement of PM10 and PM2.5 particulate mass fractions, an aerodynamic particle size analyzer, and a three-wavelength (405, 532, 781 nm) photoacoustic instrument with integrating reciprocal nephelometer for monitoring absorption and scattering coefficients during the dust re-suspension process. Filter sampling media included Teflon® membrane and quartz fiber filters for chemical analysis and Nuclepore® filters for individual particle analysis by scanning electron microscopy (SEM). The < 38 µm sieved fractions were also analyzed by X-ray diffraction for their mineral content while the > 75, < 125 µm soil fractions were mineralogically assessed by optical microscopy. Presented here are results of the optical measurements, showing the interdependency of single-scattering albedos (SSA) at three different wavelengths and mineralogical content of the entrained dust samples. To explain the elevated concentrations of iron (Fe) and Fe / Al ratios in the soil re-suspensions, we propose that dust particles are to a large extent composed of nano-sized particles of micas, clays, metal oxides, and ions of potassium (K+), calcium (Ca2+), and sodium (Na+) evenly dispersed as a colloid or adsorbed in amorphous clay-like material. 
Also shown are differences in SSA of the kaolinite/hematite/goethite samples from Mali and those from colloidal soils elsewhere. Results from this study can be integrated into a database of mineral dust properties, for applications in climate modeling, remote sensing, visibility, health (medical geology), ocean fertilization, and impact on equipment.
Sample Size and Allocation of Effort in Point Count Sampling of Birds in Bottomland Hardwood Forests
Winston P. Smith; Daniel J. Twedt; Robert J. Cooper; David A. Wiedenfeld; Paul B. Hamel; Robert P. Ford
1995-01-01
To examine sample size requirements and optimum allocation of effort in point count sampling of bottomland hardwood forests, we computed minimum sample sizes from variation recorded during 82 point counts (May 7-May 16, 1992) from three localities containing three habitat types across three regions of the Mississippi Alluvial Valley (MAV). Also, we estimated the effect...
Monitoring Species of Concern Using Noninvasive Genetic Sampling and Capture-Recapture Methods
2016-11-01
ABBREVIATIONS AICc Akaike’s Information Criterion with small sample size correction AZGFD Arizona Game and Fish Department BMGR Barry M. Goldwater...MNKA Minimum Number Known Alive N Abundance Ne Effective Population Size NGS Noninvasive Genetic Sampling NGS-CR Noninvasive Genetic...parameter estimates from capture-recapture models require sufficient sample sizes, capture probabilities and low capture biases. For NGS-CR, sample
Meta-analysis of multiple outcomes: a multilevel approach.
Van den Noortgate, Wim; López-López, José Antonio; Marín-Martínez, Fulgencio; Sánchez-Meca, Julio
2015-12-01
In meta-analysis, dependent effect sizes are very common. An example is where in one or more studies the effect of an intervention is evaluated on multiple outcome variables for the same sample of participants. In this paper, we evaluate a three-level meta-analytic model to account for this kind of dependence, extending the simulation results of Van den Noortgate, López-López, Marín-Martínez, and Sánchez-Meca (Behavior Research Methods, 45, 576-594, 2013) by allowing for a variation in the number of effect sizes per study, in the between-study variance, in the correlations between pairs of outcomes, and in the sample size of the studies. At the same time, we explore the performance of the approach if the outcomes used in a study can be regarded as a random sample from a population of outcomes. We conclude that although this approach is relatively simple and does not require prior estimates of the sampling covariances between effect sizes, it gives appropriate mean effect size estimates, standard error estimates, and confidence interval coverage proportions in a variety of realistic situations.
How Big Is Big Enough? Sample Size Requirements for CAST Item Parameter Estimation
ERIC Educational Resources Information Center
Chuah, Siang Chee; Drasgow, Fritz; Luecht, Richard
2006-01-01
Adaptive tests offer the advantages of reduced test length and increased accuracy in ability estimation. However, adaptive tests require large pools of precalibrated items. This study looks at the development of an item pool for 1 type of adaptive administration: the computer-adaptive sequential test. An important issue is the sample size required…
Sample Size Calculation for Estimating or Testing a Nonzero Squared Multiple Correlation Coefficient
ERIC Educational Resources Information Center
Krishnamoorthy, K.; Xia, Yanping
2008-01-01
The problems of hypothesis testing and interval estimation of the squared multiple correlation coefficient of a multivariate normal distribution are considered. It is shown that available one-sided tests are uniformly most powerful, and the one-sided confidence intervals are uniformly most accurate. An exact method of calculating sample size to…
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.
Kosinski, Andrzej S
2013-03-15
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
A weighted generalized score statistic for comparison of predictive values of diagnostic tests
Kosinski, Andrzej S.
2013-01-01
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations which are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we present, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic which incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, it always reduces to the score statistic in the independent samples situation, and it preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the weighted generalized score test statistic in a general GEE setting. PMID:22912343
Global Burden of Disease estimates of depression – how reliable is the epidemiological evidence?
Brhlikova, Petra; Pollock, Allyson M; Manners, Rachel
2011-01-01
Summary Objectives To re-assess the quality of the epidemiological studies used to estimate the global burden of depression 2000, as published in the GBDep study. Design Primary and secondary data sources used in the global burden of depression estimate were identified and assigned to country of origin. Each source was assessed with respect to completeness and representativeness for national/regional estimates and against the inclusion criteria used by the scientific team estimating GBDep. Setting Not applicable. Participants Not applicable. Main outcome measures Not applicable. Results First, National estimates: The 28 scientific sources cited in the GBDep study related to 40 of the 191 WHO member countries. The EURO region had studies relating to 15 of 52 countries whereas AFRO region had studies for only three of 46 countries. Only six of the 40 countries had data drawn from a nationally representative population: the three AFRO country studies were based on a single village or town and, likewise, SEARO region had no nationally representative data; second, GBDep criteria: GBDep inclusion criteria required study sample size of more than 1000 people; 19 (45%) of the 42 studies did not meet this criterion. Sixteen (44%) of 36 studies did not meet the requirement that studies show a clear sample frame and method. GBD estimates rely on estimates of incidence; only two of the 42 country studies provided incidence data (Canada and Norway), the remaining 34 studies were prevalence studies. Duration of depression is based on three studies conducted in the USA and Holland. Conclusions Most studies exhibit significant shortcomings and limitations with respect to study design and analysis and compliance with GBDep inclusion criteria. Poor quality data limit the interpretation and validity of global burden of depression estimates. 
The uncritical application of these estimates to international healthcare policy-making could divert scarce resources from other public healthcare priorities. PMID:21205775
Sample size and power considerations in network meta-analysis
2012-01-01
Background Network meta-analysis is becoming increasingly popular for establishing comparative effectiveness among multiple interventions for the same disease. Network meta-analysis inherits all methodological challenges of standard pairwise meta-analysis, but with increased complexity due to the multitude of intervention comparisons. One issue that is now widely recognized in pairwise meta-analysis is the issue of sample size and statistical power. This issue, however, has so far only received little attention in network meta-analysis. To date, no approaches have been proposed for evaluating the adequacy of the sample size, and thus power, in a treatment network. Findings In this article, we develop easy-to-use flexible methods for estimating the ‘effective sample size’ in indirect comparison meta-analysis and network meta-analysis. The effective sample size for a particular treatment comparison can be interpreted as the number of patients in a pairwise meta-analysis that would provide the same degree and strength of evidence as that which is provided in the indirect comparison or network meta-analysis. We further develop methods for retrospectively estimating the statistical power for each comparison in a network meta-analysis. We illustrate the performance of the proposed methods for estimating effective sample size and statistical power using data from a network meta-analysis on interventions for smoking cessation including over 100 trials. Conclusion The proposed methods are easy to use and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates. PMID:22992327
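The idea of an effective sample size for an indirect comparison can be made concrete with a small sketch. The abstract does not state a formula, so the one below is an assumption: it follows from supposing equal outcome variance in all arms and equal allocation within each trial, so that the variance of a pairwise difference is proportional to 1/n and the variances of the two direct estimates add. The function name is hypothetical.

```python
def effective_sample_size_indirect(n_ac: float, n_bc: float) -> float:
    """Effective sample size of an indirect A-vs-B comparison through comparator C.

    Assumption (not taken from the abstract): equal outcome variance across
    arms and equal allocation, so Var(A vs C) is proportional to 1/n_ac and
    Var(B vs C) to 1/n_bc. The indirect estimate has the sum of both
    variances, which gives 1/n_eff = 1/n_ac + 1/n_bc.
    """
    if n_ac <= 0 or n_bc <= 0:
        raise ValueError("sample sizes must be positive")
    return (n_ac * n_bc) / (n_ac + n_bc)

# Two comparisons of 1,000 patients each carry the evidence of a
# 500-patient head-to-head comparison under these assumptions.
print(effective_sample_size_indirect(1000, 1000))  # 500.0
```

Under these assumptions the effective sample size is always smaller than the smaller of the two contributing comparisons, which is one way to see why indirect evidence is weaker than direct evidence of the same nominal size.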
Estimating the breeding population of long-billed curlew in the United States
Stanley, T.R.; Skagen, S.K.
2007-01-01
Determining population size and long-term trends in population size for species of high concern is a priority of international, national, and regional conservation plans. Long-billed curlews (Numenius americanus) are a species of special concern in North America due to apparent declines in their population. Because long-billed curlews are not adequately monitored by existing programs, we undertook a 2-year study with the goals of 1) determining present long-billed curlew distribution and breeding population size in the United States and 2) providing recommendations for a long-term long-billed curlew monitoring protocol. We selected a stratified random sample of survey routes in 16 western states for sampling in 2004 and 2005, and we analyzed count data from these routes to estimate detection probabilities and abundance. In addition, we evaluated habitat along roadsides to determine how well roadsides represented habitat throughout the sampling units. We estimated there were 164,515 (SE = 42,047) breeding long-billed curlews in 2004, and 109,533 (SE = 31,060) breeding individuals in 2005. These estimates far exceed currently accepted estimates based on expert opinion. We found that habitat along roadsides was representative of long-billed curlew habitat in general. We make recommendations for improving sampling methodology, and we present power curves to provide guidance on minimum sample sizes required to detect trends in abundance.
Re-evaluating the link between brain size and behavioural ecology in primates.
Powell, Lauren E; Isler, Karin; Barton, Robert A
2017-10-25
Comparative studies have identified a wide range of behavioural and ecological correlates of relative brain size, with results differing between taxonomic groups, and even within them. In primates for example, recent studies contradict one another over whether social or ecological factors are critical. A basic assumption of such studies is that with sufficiently large samples and appropriate analysis, robust correlations indicative of selection pressures on cognition will emerge. We carried out a comprehensive re-examination of correlates of primate brain size using two large comparative datasets and phylogenetic comparative methods. We found evidence in both datasets for associations between brain size and ecological variables (home range size, diet and activity period), but little evidence for an effect of social group size, a correlation which has previously formed the empirical basis of the Social Brain Hypothesis. However, reflecting divergent results in the literature, our results exhibited instability across datasets, even when they were matched for species composition and predictor variables. We identify several potential empirical and theoretical difficulties underlying this instability and suggest that these issues raise doubts about inferring cognitive selection pressures from behavioural correlates of brain size. © 2017 The Author(s).
Contrasting natural regeneration and tree planting in fourteen North American cities
David J. Nowak
2012-01-01
Field data from randomly located plots in 12 cities in the United States and Canada were used to estimate the proportion of the existing tree population that was planted or occurred via natural regeneration. In addition, two cities (Baltimore and Syracuse) were recently re-sampled to estimate the proportion of newly established trees that were planted. Results for the...
Xie, Yanming; Wei, Xu
2011-10-01
Post-marketing re-evaluation based on pharmacoepidemiology studies and collects data on clinical drug safety in large populations under real-world use over a long period. Because of the particular characteristics of traditional Chinese medicine (TCM), it is also necessary to re-evaluate clinical effectiveness. Before carrying out clinical trials for the post-marketing re-evaluation of TCM, the objective of the study should be determined, and the study should proceed under an assessment mode that combines disease and syndrome. Special populations, including children and seniors who were excluded from pre-marketing clinical trials, are brought into drug monitoring. The sample size needs to comply with statistical requirements. Commonly used designs are cohort studies, case-control studies, nested case-control studies, and pragmatic randomized controlled trials.
Su, Chun-Lung; Gardner, Ian A; Johnson, Wesley O
2004-07-30
The two-test two-population model, originally formulated by Hui and Walter, for estimation of test accuracy and prevalence estimation assumes conditionally independent tests, constant accuracy across populations and binomial sampling. The binomial assumption is incorrect if all individuals in a population e.g. child-care centre, village in Africa, or a cattle herd are sampled or if the sample size is large relative to population size. In this paper, we develop statistical methods for evaluating diagnostic test accuracy and prevalence estimation based on finite sample data in the absence of a gold standard. Moreover, two tests are often applied simultaneously for the purpose of obtaining a 'joint' testing strategy that has either higher overall sensitivity or specificity than either of the two tests considered singly. Sequential versions of such strategies are often applied in order to reduce the cost of testing. We thus discuss joint (simultaneous and sequential) testing strategies and inference for them. Using the developed methods, we analyse two real and one simulated data sets, and we compare 'hypergeometric' and 'binomial-based' inferences. Our findings indicate that the posterior standard deviations for prevalence (but not sensitivity and specificity) based on finite population sampling tend to be smaller than their counterparts for infinite population sampling. Finally, we make recommendations about how small the sample size should be relative to the population size to warrant use of the binomial model for prevalence estimation. Copyright 2004 John Wiley & Sons, Ltd.
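Why the binomial assumption breaks down when the sample is large relative to the population can be seen from the sampling variability alone. The sketch below is only an illustration and not the Bayesian analysis developed in the paper: it compares the standard deviation of the number of positives under sampling with replacement (binomial) and without replacement (hypergeometric). The herd size and prevalence in the usage lines are made-up values.

```python
import math

def binomial_sd(n, p):
    """SD of the number of positives among n draws from an infinite population."""
    return math.sqrt(n * p * (1 - p))

def hypergeometric_sd(n, population_size, positives):
    """SD of the number of positives among n draws without replacement."""
    p = positives / population_size
    # Finite population correction: shrinks to 0 when the whole population is sampled.
    fpc = (population_size - n) / (population_size - 1)
    return math.sqrt(n * p * (1 - p) * fpc)

# Hypothetical herd of 200 animals with 40 truly positive (prevalence 0.2).
herd, positive = 200, 40
for n in (10, 50, 100, 200):
    ratio = hypergeometric_sd(n, herd, positive) / binomial_sd(n, positive / herd)
    print(f"n={n:3d}  sampling fraction={n / herd:.2f}  SD ratio={ratio:.3f}")
```

The ratio stays close to 1 while the sampling fraction is small and falls to 0 when every individual is sampled, which matches the abstract's point that finite-population inference for prevalence is tighter than its binomial counterpart.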
Lyons, James E.; Kendall, William L.; Royle, J. Andrew; Converse, Sarah J.; Andres, Brad A.; Buchanan, Joseph B.
2016-01-01
We present a novel formulation of a mark–recapture–resight model that allows estimation of population size, stopover duration, and arrival and departure schedules at migration areas. Estimation is based on encounter histories of uniquely marked individuals and relative counts of marked and unmarked animals. We use a Bayesian analysis of a state–space formulation of the Jolly–Seber mark–recapture model, integrated with a binomial model for counts of unmarked animals, to derive estimates of population size and arrival and departure probabilities. We also provide a novel estimator for stopover duration that is derived from the latent state variable representing the interim between arrival and departure in the state–space model. We conduct a simulation study of field sampling protocols to understand the impact of superpopulation size, proportion marked, and number of animals sampled on bias and precision of estimates. Simulation results indicate that relative bias of estimates of the proportion of the population with marks was low for all sampling scenarios and never exceeded 2%. Our approach does not require enumeration of all unmarked animals detected or direct knowledge of the number of marked animals in the population at the time of the study. This provides flexibility and potential application in a variety of sampling situations (e.g., migratory birds, breeding seabirds, sea turtles, fish, pinnipeds, etc.). Application of the methods is demonstrated with data from a study of migratory sandpipers.
Reproducibility of preclinical animal research improves with heterogeneity of study samples
Vogt, Lucile; Sena, Emily S.; Würbel, Hanno
2018-01-01
Single-laboratory studies conducted under highly standardized conditions are the gold standard in preclinical animal research. Using simulations based on 440 preclinical studies across 13 different interventions in animal models of stroke, myocardial infarction, and breast cancer, we compared the accuracy of effect size estimates between single-laboratory and multi-laboratory study designs. Single-laboratory studies generally failed to predict effect size accurately, and larger sample sizes rendered effect size estimates even less accurate. By contrast, multi-laboratory designs including as few as 2 to 4 laboratories increased coverage probability by up to 42 percentage points without a need for larger sample sizes. These findings demonstrate that within-study standardization is a major cause of poor reproducibility. More representative study samples are required to improve the external validity and reproducibility of preclinical animal research and to prevent wasting animals and resources for inconclusive research. PMID:29470495
DiMichele, Daniel L; Spradley, M Katherine
2012-09-10
Reliable methods for sex estimation during the development of a biological profile are important to the forensic community in instances when the common skeletal elements used to assess sex are absent or damaged. Sex estimation from the calcaneus has potentially significant importance for the forensic community. Specifically, measurements of the calcaneus provide an additional reliable method for sex estimation via discriminant function analysis based on a North American forensic population. Research on a modern American sample was chosen in order to develop up-to-date population specific discriminant functions for sex estimation. The current study addresses this matter, building upon previous research and introduces a new measurement, posterior circumference that promises to advance the accuracy of use of this single, highly resistant bone in future instances of sex determination from partial skeletal remains. Data were collected from The William Bass Skeletal Collection, housed at The University of Tennessee. Sample size includes 320 adult individuals born between the years 1900 and 1985. The sample was comprised of 136 females and 184 males. Skeletons used for measurements were confined to those with fused diaphyses showing no signs of pathology or damage that may have altered measurements, and that also had accompanying records that included information on ancestry, age, and sex. Measurements collected and analyzed include maximum length, load-arm length, load-arm width, and posterior circumference. The sample was used to compute a discriminant function, based on all four variables, and was performed in SAS 9.1.3. The discriminant function obtained an overall cross-validated classification rate of 86.69%. Females were classified correctly in 88.64% of the cases and males were correctly classified in 84.75% of the cases. 
Due to the increasing heterogeneity of current populations, further discussion on this topic will include the importance that the re-evaluation of past studies has on modern forensic populations. Due to secular and microevolutionary changes among populations, the near future must include additional methods being updated, and new methods being examined, both of which should cover a wide population spectrum. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Design and analysis of three-arm trials with negative binomially distributed endpoints.
Mütze, Tobias; Munk, Axel; Friede, Tim
2016-02-20
A three-arm clinical trial design with an experimental treatment, an active control, and a placebo control, commonly referred to as the gold standard design, enables testing of non-inferiority or superiority of the experimental treatment compared with the active control. In this paper, we propose methods for designing and analyzing three-arm trials with negative binomially distributed endpoints. In particular, we develop a Wald-type test with a restricted maximum-likelihood variance estimator for testing non-inferiority or superiority. For this test, sample size and power formulas as well as optimal sample size allocations will be derived. The performance of the proposed test will be assessed in an extensive simulation study with regard to type I error rate, power, sample size, and sample size allocation. For the purpose of comparison, Wald-type statistics with a sample variance estimator and an unrestricted maximum-likelihood estimator are included in the simulation study. We found that the proposed Wald-type test with a restricted variance estimator performed well across the considered scenarios and is therefore recommended for application in clinical trials. The methods proposed are motivated and illustrated by a recent clinical trial in multiple sclerosis. The R package ThreeArmedTrials, which implements the methods discussed in this paper, is available on CRAN. Copyright © 2015 John Wiley & Sons, Ltd.
Re-assessing the surface cycling of molybdenum and rhenium
NASA Astrophysics Data System (ADS)
Miller, Christian A.; Peucker-Ehrenbrink, Bernhard; Walker, Brett D.; Marcantonio, Franco
2011-11-01
We re-evaluate the cycling of molybdenum (Mo) and rhenium (Re) in the near-surface environment. World river average Mo and Re concentrations, initially based on a handful of rivers, are calculated using 38 rivers representing five continents, and 11 of 19 large-scale drainage regions. Our new river concentration estimates are 8.0 nmol kg⁻¹ (Mo), and 16.5 pmol kg⁻¹ (Re, natural + anthropogenic). The linear relationship of dissolved Re and SO₄²⁻ in global rivers (R² = 0.76) indicates labile continental Re is predominantly hosted within sulfide minerals and reduced sediments; it also provides a means of correcting for the anthropogenic contribution of Re to world rivers using independent estimates of anthropogenic sulfate. Approximately 30% of Re in global rivers is anthropogenic, yielding a pre-anthropogenic world river average of 11.2 pmol Re kg⁻¹. The potential for anthropogenic contribution is also seen in the non-negligible Re concentrations in precipitation (0.03-5.9 pmol kg⁻¹), and the nmol kg⁻¹ level Re concentrations of mine waters. The linear Mo-SO₄²⁻ relationship (R² = 0.69) indicates that the predominant source of Mo to rivers is the weathering of pyrite. An anthropogenic Mo correction was not done as anthropogenically-influenced samples do not display the unambiguous metal enrichment observed for Re. Metal concentrations in high temperature hydrothermal fluids from the Manus Basin indicate that calculated end-member fluids (i.e. Mg-free) yield negative Mo and Re concentrations, showing that Mo and Re can be removed more quickly than Mg during recharge. High temperature hydrothermal fluids are unimportant sinks relative to their river sources: 0.4% (Mo), and 0.1% (pre-anthropogenic Re). We calculate new seawater response times of 4.4 × 10⁵ yr (τMo) and 1.3 × 10⁵ yr (τRe, pre-anthropogenic).
The Influence of Mark-Recapture Sampling Effort on Estimates of Rock Lobster Survival
Kordjazi, Ziya; Frusher, Stewart; Buxton, Colin; Gardner, Caleb; Bird, Tomas
2016-01-01
Five annual capture-mark-recapture surveys on Jasus edwardsii were used to evaluate the effect of sample size and fishing effort on the precision of estimated survival probability. Datasets of different numbers of individual lobsters (ranging from 200 to 1,000 lobsters) were created by random subsampling from each annual survey. This process of random subsampling was also used to create 12 datasets of different levels of effort based on three levels of the number of traps (15, 30 and 50 traps per day) and four levels of the number of sampling-days (2, 4, 6 and 7 days). The most parsimonious Cormack-Jolly-Seber (CJS) model for estimating survival probability shifted from a constant model towards sex-dependent models with increasing sample size and effort. A sample of 500 lobsters or 50 traps used on four consecutive sampling-days was required for obtaining precise survival estimations for males and females, separately. Reduced sampling effort of 30 traps over four sampling days was sufficient if a survival estimate for both sexes combined was sufficient for management of the fishery. PMID:26990561
USDA-ARS's Scientific Manuscript database
Particle size distributions (PSD) have long been used to more accurately estimate the PM10 fraction of total particulate matter (PM) stack samples taken from agricultural sources. These PSD analyses were typically conducted using a Coulter Counter with 50 micrometer aperture tube. With recent increa...
A Bayesian nonparametric method for prediction in EST analysis
Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor
2007-01-01
Background Expressed sequence tags (ESTs) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed sequencing the library and, in case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. Results In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; b) the number of new unique genes to be observed in a future sample; c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries, previously studied with frequentist methods, are analyzed in detail. Conclusion The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample. PMID:17868445
Wang, Shulian; Campbell, Jeff; Stenmark, Matthew H; Stanton, Paul; Zhao, Jing; Matuszak, Martha M; Ten Haken, Randall K; Kong, Feng-Ming
2018-03-01
To study whether cytokine markers may improve predictive accuracy of radiation esophagitis (RE) in non-small cell lung cancer (NSCLC) patients. A total of 129 patients with stage I-III NSCLC treated with radiotherapy (RT) from prospective studies were included. Thirty inflammatory cytokines were measured in platelet-poor plasma samples. Logistic regression was performed to evaluate the risk factors of RE. Stepwise Akaike information criterion (AIC) and likelihood ratio test were used to assess model predictions. Forty-nine of 129 patients (38.0%) developed grade ≥2 RE. Univariate analysis showed that age, stage, concurrent chemotherapy, and eight dosimetric parameters were significantly associated with grade ≥2 RE (p < 0.05). IL-4, IL-5, IL-8, IL-13, IL-15, IL-1α, TGFα and eotaxin were also associated with grade ≥2 RE (p < 0.1). Age, esophagus generalized equivalent uniform dose (EUD), and baseline IL-8 were independently associated with grade ≥2 RE. The combination of these three factors had significantly higher predictive power than any single factor alone. Addition of IL-8 to the toxicity model significantly improves RE predictive accuracy (p = 0.019). Combining baseline level of IL-8, age and esophagus EUD may predict RE more accurately. Refinement of this model with larger sample sizes and validation from multicenter databases are warranted. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
A standardized sampling protocol for channel catfish in prairie streams
Vokoun, Jason C.; Rabeni, Charles F.
2001-01-01
Three alternative gears—an AC electrofishing raft, bankpoles, and a 15-hoop-net set—were used in a standardized manner to sample channel catfish Ictalurus punctatus in three prairie streams of varying size in three seasons. We compared these gears as to time required per sample, size selectivity, mean catch per unit effort (CPUE) among months, mean CPUE within months, effect of fluctuating stream stage, and sensitivity to population size. According to these comparisons, the 15-hoop-net set used during stable water levels in October had the most desirable characteristics. Using our catch data, we estimated the precision of CPUE and size structure by varying sample sizes for the 15-hoop-net set. We recommend that 11–15 repetitions of the 15-hoop-net set be used for most management activities. This standardized basic unit of effort will increase the precision of estimates and allow better comparisons among samples as well as increased confidence in management decisions.
Optimizing Sampling Efficiency for Biomass Estimation Across NEON Domains
NASA Astrophysics Data System (ADS)
Abercrombie, H. H.; Meier, C. L.; Spencer, J. J.
2013-12-01
Over the course of 30 years, the National Ecological Observatory Network (NEON) will measure plant biomass and productivity across the U.S. to enable an understanding of terrestrial carbon cycle responses to ecosystem change drivers. Over the next several years, prior to operational sampling at a site, NEON will complete construction and characterization phases during which a limited amount of sampling will be done at each site to inform sampling designs, and guide standardization of data collection across all sites. Sampling biomass in 60+ sites distributed among 20 different eco-climatic domains poses major logistical and budgetary challenges. Traditional biomass sampling methods such as clip harvesting and direct measurements of Leaf Area Index (LAI) involve collecting and processing plant samples, and are time and labor intensive. Possible alternatives include using indirect sampling methods for estimating LAI such as digital hemispherical photography (DHP) or using a LI-COR 2200 Plant Canopy Analyzer. These LAI estimations can then be used as a proxy for biomass. The biomass estimates calculated can then inform the clip harvest sampling design during NEON operations, optimizing both sample size and number so that standardized uncertainty limits can be achieved with a minimum amount of sampling effort. In 2011, LAI and clip harvest data were collected from co-located sampling points at the Central Plains Experimental Range located in northern Colorado, a short grass steppe ecosystem that is the NEON Domain 10 core site. LAI was measured with a LI-COR 2200 Plant Canopy Analyzer. The layout of the sampling design included four 300 m transects, with clip harvest plots spaced every 50 m, and LAI sub-transects spaced every 10 m. LAI was measured at four points along 6 m sub-transects running perpendicular to the 300 m transect. Clip harvest plots were co-located 4 m from corresponding LAI transects, and had dimensions of 0.1 m by 2 m.
We conducted regression analyses with LAI and clip harvest data to determine whether LAI can be used as a suitable proxy for aboveground standing biomass. We also compared optimal sample sizes derived from LAI data, and clip-harvest data from two different size clip harvest areas (0.1m by 1m vs. 0.1m by 2m). Sample sizes were calculated in order to estimate the mean to within a standardized level of uncertainty that will be used to guide sampling effort across all vegetation types (i.e. estimated within ± 10% with 95% confidence). Finally, we employed a Semivariogram approach to determine optimal sample size and spacing.
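The "within ± 10% with 95% confidence" target can be turned into a rough sample size with the usual normal-approximation formula n = (z · CV / e)², where CV is the coefficient of variation of pilot data and e the allowed relative error. The sketch below is only an illustration of that textbook formula; the abstract does not state which formula was used, and the function name and the pilot values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def n_for_relative_precision(values, rel_error=0.10, confidence=0.95):
    """Samples needed so the CI half-width is about rel_error * mean
    (normal approximation, simple random sampling assumed)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    cv = stdev(values) / mean(values)               # pilot coefficient of variation
    return math.ceil((z * cv / rel_error) ** 2)

# Hypothetical pilot biomass values (g per plot), not data from the study.
pilot = [12.0, 15.0, 9.0, 14.0, 10.0, 13.0, 11.0, 16.0]
print(n_for_relative_precision(pilot))
```

Because the pilot CV is itself an estimate, a sample size obtained this way is best read as a planning value rather than a guarantee.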
Luo, Dehui; Wan, Xiang; Liu, Jiming; Tong, Tiejun
2018-06-01
The era of big data is coming, and evidence-based medicine is attracting increasing attention to improve decision making in medical practice via integrating evidence from well designed and conducted clinical research. Meta-analysis is a statistical technique widely used in evidence-based medicine for analytically combining the findings from independent clinical trials to provide an overall estimation of a treatment's effectiveness. The sample mean and standard deviation are two commonly used statistics in meta-analysis, but some trials use the median, the minimum and maximum values, or sometimes the first and third quartiles to report the results. Thus, to pool results in a consistent format, researchers need to transform that information back to the sample mean and standard deviation. In this article, we investigate the optimal estimation of the sample mean for meta-analysis from both theoretical and empirical perspectives. A major drawback in the literature is that the sample size, despite its obvious importance, is either ignored or used in a stepwise but somewhat arbitrary manner, e.g. the famous method proposed by Hozo et al. We solve this issue by incorporating the sample size in a smoothly changing weight in the estimators to reach the optimal estimation. Our proposed estimators not only improve the existing ones significantly but also share the same virtue of simplicity. The real data application indicates that our proposed estimators are capable of serving as "rules of thumb" and will be widely applied in evidence-based medicine.
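The contrast between a stepwise and a smoothly changing use of the sample size can be sketched as a weighted blend of the mid-range (a + b)/2 and the median m. The abstract gives no formulas, so both weight functions below are assumptions made for illustration and should be checked against the original papers before any real use; all names are hypothetical.

```python
def mean_from_range(a, m, b, n, weight):
    """Blend of the mid-range (a + b) / 2 and the median m.
    weight(n) is the share given to the mid-range."""
    w = weight(n)
    return w * (a + b) / 2 + (1 - w) * m

def stepwise_weight(n):
    # Assumed stepwise rule: equal blend for small samples, median only otherwise.
    return 0.5 if n <= 25 else 0.0

def smooth_weight(n):
    # Assumed smooth form: the mid-range share shrinks gradually as n grows.
    return 4 / (4 + n ** 0.75)

# Hypothetical summary statistics: minimum 0, median 10, maximum 40.
for n in (16, 26, 100):
    print(n, mean_from_range(0, 10, 40, n, stepwise_weight),
          mean_from_range(0, 10, 40, n, smooth_weight))
```

The stepwise rule jumps at the threshold, while the smooth weight changes gradually with n, which is the behaviour the abstract argues for.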
Porto, Paolo; Walling, Desmond E; Cogliandro, Vanessa; Callegari, Giovanni
2016-11-01
In recent years, the fallout radionuclides caesium-137 (137Cs) and unsupported lead-210 (210Pbex) have been successfully used to document rates of soil erosion in many areas of the world, as an alternative to conventional measurements. By virtue of their different half-lives, these two radionuclides are capable of providing information related to different time windows. 137Cs measurements are commonly used to generate information on mean annual erosion rates over the past ca. 50-60 years, whereas 210Pbex measurements are able to provide information relating to a longer period of up to ca. 100 years. However, the time-integrated nature of the estimates of soil redistribution provided by 137Cs and 210Pbex measurements can be seen as a limitation, particularly when viewed in the context of global change and interest in the response of soil redistribution rates to contemporary climate change and land use change. Re-sampling techniques used with these two fallout radionuclides potentially provide a basis for providing information on recent changes in soil redistribution rates. By virtue of the effectively continuous fallout input of 210Pb, the response of the 210Pbex inventory of a soil profile to changing soil redistribution rates and thus its potential for use with the re-sampling approach differs from that of 137Cs. Its greater sensitivity to recent changes in soil redistribution rates suggests that 210Pbex may have advantages over 137Cs for use in the re-sampling approach. The potential for using 210Pbex measurements in re-sampling studies is explored further in this contribution. Attention focuses on a small (1.38 ha) forested catchment in southern Italy. The catchment was originally sampled for 210Pbex measurements in 2001 and equivalent samples were collected from points very close to the original sampling points again in 2013. This made it possible to compare the estimates of mean annual erosion related to two different time windows.
This comparison suggests that mean annual rates of net soil loss had increased during the period between the two sampling campaigns and that this increase was associated with a shift to an increased sediment delivery ratio. This change was consistent with independent information on likely changes in the sediment response of the study catchment provided by the available records of annual sediment yield and changes in the annual rainfall documented for the local area. Copyright © 2016 Elsevier Ltd. All rights reserved.
Bellier, Edwige; Grøtan, Vidar; Engen, Steinar; Schartau, Ann Kristin; Diserud, Ola H; Finstad, Anders G
2012-10-01
Obtaining accurate estimates of diversity indices is difficult because the number of species encountered in a sample increases with sampling intensity. We introduce a novel method that requires the presence of species in a sample to be assessed, while counts of the number of individuals per species are required for only a small part of the sample. To account for species included as incidence data in the species abundance distribution, we modify the likelihood function of the classical Poisson log-normal distribution. Using simulated community assemblages, we contrast diversity estimates based on a community sample, a subsample randomly extracted from the community sample, and a mixture sample where incidence data are added to a subsample. We show that the mixture sampling approach provides more accurate estimates than the subsample and at little extra cost. Diversity indices estimated from a freshwater zooplankton community sampled using the mixture approach show the same pattern of results as the simulation study. Our method efficiently increases the accuracy of diversity estimates and comprehension of the left tail of the species abundance distribution. We show how to choose the scale of sample size needed for a compromise between information gained, accuracy of the estimates and cost expended when assessing biological diversity. The sample size estimates are obtained from key community characteristics, such as the expected number of species in the community, the expected number of individuals in a sample and the evenness of the community.
Qualitative Meta-Analysis on the Hospital Task: Implications for Research
ERIC Educational Resources Information Center
Noll, Jennifer; Sharma, Sashi
2014-01-01
The "law of large numbers" indicates that as sample size increases, sample statistics become less variable and more closely estimate their corresponding population parameters. Different research studies investigating how people consider sample size when evaluating the reliability of a sample statistic have found a wide range of…
Patinha, C; Durães, N; Sousa, P; Dias, A C; Reis, A P; Noack, Y; Ferreira da Silva, E
2015-08-01
Urban dust is a heterogeneous mix, where traffic-related particles can combine with soil mineral compounds, forming a unique and site-specific material. These traffic-related particles are usually enriched in potentially harmful elements, enhancing the health risk for the population by inhalation or ingestion. Urban dust samples from Estarreja city and traffic-related particles (brake dust and white traffic paint) were studied to understand the relative contribution of the traffic particles in the geochemical behaviour of urban dust and to evaluate the long-term impacts of the metals on an urban environment, as well as the risk to the populations. It was possible to distinguish two groups of urban dust samples according to Cu behaviour: (1) one group with low amounts of fine particles (<38 µm), low contents of organic material, high percentage of Cu in soluble phases, and low Cu bioaccessible fraction (Bf) values. This group showed similar chemical behaviour to the brake dust samples of low- to mid-range car brands (more than 10 years old), composed of coarser wear particles; and (2) another group with greater amounts of fine particles (<38 µm), with low percentage of Cu associated with soluble phases, and with greater Cu Bf values. This group behaved similarly to brake dust of mid- to high-range car brands (less than 10 years old). The results obtained showed that there is no direct correlation between the geoavailability of metals estimated by sequential selective chemical extraction (SSCE) and the in vitro oral bioaccessibility (UBM) test. Thus, oral bioaccessibility of urban dust is site specific. Geoavailability was greatly dependent on particle size, where the bioaccessibility tended to increase with a reduction in particle diameter.
As anthropogenic particles showed high metal concentration and a smaller size than mineral particles, urban dusts are of major concern to the populations' health, since fine particles are easily re-suspended, easily ingested, and show high metal bioaccessibility. In addition, Estarreja is a coastal city often influenced by winds, which favours the re-suspension of small-sized contaminated particles. Even if the risk to the population does not represent an acute case, it should not be overlooked, and this study can serve as baseline study for cities under high traffic influence.
Evolution of Query Optimization Methods
NASA Astrophysics Data System (ADS)
Hameurlain, Abdelkader; Morvan, Franck
Query optimization is the most critical phase in query processing. In this paper, we try to describe synthetically the evolution of query optimization methods from uniprocessor relational database systems to data Grid systems through parallel, distributed and data integration systems. We point out a set of parameters to characterize and compare query optimization methods, mainly: (i) size of the search space, (ii) type of method (static or dynamic), (iii) modification types of execution plans (re-optimization or re-scheduling), (iv) level of modification (intra-operator and/or inter-operator), (v) type of event (estimation errors, delay, user preferences), and (vi) nature of decision-making (centralized or decentralized control).
What is the extent of prokaryotic diversity?
Curtis, Thomas P; Head, Ian M; Lunn, Mary; Woodcock, Stephen; Schloss, Patrick D; Sloan, William T
2006-01-01
The extent of microbial diversity is an intrinsically fascinating subject of profound practical importance. The term ‘diversity’ may allude to the number of taxa or species richness as well as their relative abundance. There is uncertainty about both, primarily because sample sizes are too small. Non-parametric diversity estimators make gross underestimates if used with small sample sizes on unevenly distributed communities. One can make richness estimates over many scales using small samples by assuming a species/taxa-abundance distribution. However, no one knows what the underlying taxa-abundance distributions are for bacterial communities. Latterly, diversity has been estimated by fitting data from gene clone libraries and extrapolating from this to taxa-abundance curves to estimate richness. However, since sample sizes are small, we cannot be sure that such samples are representative of the community from which they were drawn. It is however possible to formulate, and calibrate, models that predict the diversity of local communities and of samples drawn from that local community. The calibration of such models suggests that migration rates are small and decrease as the community gets larger. The preliminary predictions of the model are qualitatively consistent with the patterns seen in clone libraries in ‘real life’. The validation of this model is also confounded by small sample sizes. However, if such models were properly validated, they could form invaluable tools for the prediction of microbial diversity and a basis for the systematic exploration of microbial diversity on the planet. PMID:17028084
Shoukri, Mohamed M; Elkum, Nasser; Walter, Stephen D
2006-01-01
Background In this paper we propose the use of the within-subject coefficient of variation as an index of a measurement's reliability. For continuous variables and based on its maximum likelihood estimation we derive a variance-stabilizing transformation and discuss confidence interval construction within the framework of a one-way random effects model. We investigate sample size requirements for the within-subject coefficient of variation for continuous and binary variables. Methods We investigate the validity of the approximate normal confidence interval by Monte Carlo simulations. In designing a reliability study, a crucial issue is the balance between the number of subjects to be recruited and the number of repeated measurements per subject. We discuss efficiency of estimation and cost considerations for the optimal allocation of the sample resources. The approach is illustrated by an example on Magnetic Resonance Imaging (MRI). We also discuss the issue of sample size estimation for dichotomous responses with two examples. Results For the continuous variable we found that the variance-stabilizing transformation improves the asymptotic coverage probabilities on the within-subject coefficient of variation. The maximum likelihood estimation and sample size estimation based on a pre-specified width of the confidence interval are novel contributions to the literature for the binary variable. Conclusion Using the sample size formulas, we hope to help clinical epidemiologists and practicing statisticians to efficiently design reliability studies using the within-subject coefficient of variation, whether the variable of interest is continuous or binary. PMID:16686943
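As a point of orientation, the within-subject coefficient of variation in a one-way layout can be estimated by dividing the square root of the within-subject mean square by the grand mean. The sketch below uses this simple ANOVA (moment) estimator, which is an assumption made for illustration and not the maximum likelihood estimator or the variance-stabilizing transformation derived in the paper; the function name and data are hypothetical.

```python
import math

def within_subject_cv(data):
    """data: one list of repeated measurements per subject.
    Returns sqrt(within-subject mean square) / grand mean."""
    all_values = [x for subject in data for x in subject]
    grand_mean = sum(all_values) / len(all_values)
    ss_within = 0.0
    df_within = 0
    for subject in data:
        subject_mean = sum(subject) / len(subject)
        ss_within += sum((x - subject_mean) ** 2 for x in subject)
        df_within += len(subject) - 1  # each subject contributes (replicates - 1) df
    return math.sqrt(ss_within / df_within) / grand_mean

# Hypothetical data: three subjects, two repeated measurements each.
readings = [[9.0, 11.0], [19.0, 21.0], [29.0, 31.0]]
print(within_subject_cv(readings))
```

A small value indicates that repeated measurements on the same subject vary little relative to the overall measurement level.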
NASA Astrophysics Data System (ADS)
Terada, T.; Sato, M.; Mochizuki, N.; Yamamoto, Y.; Tsunakawa, H.
2013-12-01
Magnetic properties of ferromagnetic minerals generally depend on their chemical composition, crystal structure, size, and shape. In the usual paleomagnetic study, we use a bulk sample which is an assemblage of magnetic minerals showing broad distributions of various magnetic properties. Microscopic and Curie-point observations of the bulk sample enable us to identify the constituent magnetic minerals, while other measurements, for example, stepwise thermal and/or alternating field demagnetizations (ThD, AFD) make it possible to estimate size, shape and domain state of the constituent magnetic grains. However, estimation based on stepwise demagnetizations has a limitation that magnetic grains with the same coercivity Hc (or blocking temperature Tb) can be identified as the single population even though they could have different size and shape. Dunlop and West (1969) carried out mapping of grain size and coercivity (Hc) using pTRM. However, it is considered that their mapping method is basically applicable to natural rocks containing only SD grains, since the grain sizes are estimated on the basis of the single domain theory (Neel, 1949). In addition, it is impossible to check thermal alteration due to laboratory heating in their experiment. In the present study we propose a new experimental method which makes it possible to estimate distribution of size and shape of magnetic minerals in a bulk sample. The present method is composed of simple procedures: (1) imparting ARM to a bulk sample, (2) ThD at a certain temperature, (3) stepwise AFD on the remaining ARM, (4) repeating the steps (1)-(3) with ThD at elevating temperatures up to the Curie temperature of the sample. After completion of the whole procedure, ARM spectra are calculated and mapped on the Hc-Tb plane (hereafter called Hc-Tb diagram).
We analyze the Hc-Tb diagrams as follows: (1) For uniaxial SD populations, a theoretical curve for a certain grain size (or shape anisotropy) is drawn on the Hc-Tb diagram. The curves are calculated using the single domain theory, since coercivity and blocking temperature of uniaxial SD grains can be expressed as a function of size and shape. (2) The boundary between SD and MD grains is calculated and drawn on the Hc-Tb diagram according to the theory by Butler and Banerjee (1975). (3) Theoretical predictions by (1) and (2) are compared with the obtained ARM spectra to estimate the quantitative distribution of size, shape and domain state of magnetic grains in the sample. This mapping method has been applied to three samples: Hawaiian basaltic lava extruded in 1995, Ueno basaltic lava formed during Matsuyama chron, and Oshima basaltic lava extruded in 1986. We will discuss physical states of magnetic grains (size, shape, domain state, etc.) and their possible origins.
Sample allocation balancing overall representativeness and stratum precision.
Diaz-Quijano, Fredi Alexander
2018-05-07
In large-scale surveys, it is often necessary to distribute a preset sample size among a number of strata. Researchers must make a decision between prioritizing overall representativeness or precision of stratum estimates. Hence, I evaluated different sample allocation strategies based on stratum size. The strategies evaluated herein included allocation proportional to stratum population; equal sample for all strata; and proportional to the natural logarithm, cubic root, and square root of the stratum population. This study considered the fact that, from a preset sample size, the dispersion index of stratum sampling fractions is correlated with the population estimator error and the dispersion index of stratum-specific sampling errors would measure the inequality in precision distribution. Identification of a balanced and efficient strategy was based on comparing both of those dispersion indices. Balance and efficiency of the strategies changed depending on overall sample size. As the sample to be distributed increased, the most efficient allocation strategies were equal sample for each stratum; proportional to the logarithm, to the cubic root, to the square root; and that proportional to the stratum population, respectively. Depending on sample size, each of the strategies evaluated could be considered in optimizing the sample to keep both overall representativeness and stratum-specific precision. Copyright © 2018 Elsevier Inc. All rights reserved.
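The five allocation strategies named in the abstract differ only in the transform applied to the stratum population before the preset sample is shared out. A minimal sketch, assuming fractional allocations are acceptable (rounding rules and the dispersion indices used in the paper are not reproduced); the function name, the dictionary and the stratum populations are hypothetical.

```python
import math

def allocate(total_n, stratum_sizes, transform):
    """Split total_n across strata in proportion to transform(stratum population)."""
    weights = [transform(size) for size in stratum_sizes]
    total_weight = sum(weights)
    return [total_n * w / total_weight for w in weights]

# One transform per strategy listed in the abstract.
STRATEGIES = {
    "proportional": lambda size: size,
    "square root": math.sqrt,
    "cubic root": lambda size: size ** (1 / 3),
    "logarithm": math.log,
    "equal": lambda size: 1,
}

strata = [1_000, 10_000, 100_000]  # hypothetical stratum populations
for name, transform in STRATEGIES.items():
    shares = allocate(600, strata, transform)
    print(name, [round(s) for s in shares])
```

Moving from proportional towards equal allocation flattens the shares, which trades overall representativeness for more even precision across strata.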
Temporal dynamics of linkage disequilibrium in two populations of bighorn sheep
Miller, Joshua M; Poissant, Jocelyn; Malenfant, René M; Hogg, John T; Coltman, David W
2015-01-01
Linkage disequilibrium (LD) is the nonrandom association of alleles at two markers. Patterns of LD have biological implications as well as practical ones when designing association studies or conservation programs aimed at identifying the genetic basis of fitness differences within and among populations. However, the temporal dynamics of LD in wild populations have received little empirical attention. In this study, we examined the overall extent of LD, the effect of sample size on the accuracy and precision of LD estimates, and the temporal dynamics of LD in two populations of bighorn sheep (Ovis canadensis) with different demographic histories. Using over 200 microsatellite loci, we assessed two metrics of multi-allelic LD, D′ and χ′². We found that both populations exhibited high levels of LD, although the extent was much shorter in a native population than in one that was founded via translocation and experienced a prolonged bottleneck post founding, followed by recent admixture. In addition, we observed significant variation in LD in relation to the sample size used, with small sample sizes leading to depressed estimates of the extent of LD but inflated estimates of background levels of LD. In contrast, there was not much variation in LD among yearly cross-sections within either population once sample size was accounted for. Lack of pronounced interannual variability suggests that researchers may not have to worry about interannual variation when estimating LD in a population and can instead focus on obtaining the largest sample size possible. PMID:26380673
Creel, Scott; Spong, Goran; Sands, Jennifer L; Rotella, Jay; Zeigle, Janet; Joe, Lawrence; Murphy, Kerry M; Smith, Douglas
2003-07-01
Determining population sizes can be difficult, but is essential for conservation. By counting distinct microsatellite genotypes, DNA from noninvasive samples (hair, faeces) allows estimation of population size. Problems arise because genotypes from noninvasive samples are error-prone, but genotyping errors can be reduced by multiple polymerase chain reaction (PCR). For faecal genotypes from wolves in Yellowstone National Park, error rates varied substantially among samples, often above the 'worst-case threshold' suggested by simulation. Consequently, a substantial proportion of multilocus genotypes held one or more errors, despite multiple PCR. These genotyping errors created several genotypes per individual and caused overestimation (up to 5.5-fold) of population size. We propose a 'matching approach' to eliminate this overestimation bias.
The impact of multiple endpoint dependency on Q and I(2) in meta-analysis.
Thompson, Christopher Glen; Becker, Betsy Jane
2014-09-01
A common assumption in meta-analysis is that effect sizes are independent. When correlated effect sizes are analyzed using traditional univariate techniques, this assumption is violated. This research assesses the impact of dependence arising from treatment-control studies with multiple endpoints on homogeneity measures Q and I(2) in scenarios using the unbiased standardized-mean-difference effect size. Univariate and multivariate meta-analysis methods are examined. Conditions included different overall outcome effects, study sample sizes, numbers of studies, between-outcomes correlations, dependency structures, and ways of computing the correlation. The univariate approach used typical fixed-effects analyses whereas the multivariate approach used generalized least-squares (GLS) estimates of a fixed-effects model, weighted by the inverse variance-covariance matrix. Increased dependence among effect sizes led to increased Type I error rates from univariate models. When effect sizes were strongly dependent, error rates were drastically higher than nominal levels regardless of study sample size and number of studies. In contrast, using GLS estimation to account for multiple-endpoint dependency maintained error rates within nominal levels. Conversely, mean I(2) values were not greatly affected by increased amounts of dependency. Last, we point out that the between-outcomes correlation should be estimated as a pooled within-groups correlation rather than using a full-sample estimator that does not consider treatment/control group membership. Copyright © 2014 John Wiley & Sons, Ltd.
Peel, D; Waples, R S; Macbeth, G M; Do, C; Ovenden, J R
2013-03-01
Theoretical models are often applied to population genetic data sets without fully considering the effect of missing data. Researchers can deal with missing data by removing individuals that have failed to yield genotypes and/or by removing loci that have failed to yield allelic determinations, but despite their best efforts, most data sets still contain some missing data. As a consequence, realized sample size differs among loci, and this poses a problem for unbiased methods that must explicitly account for random sampling error. One commonly used solution for the calculation of contemporary effective population size (N(e)) is to calculate the effective sample size as an unweighted mean or harmonic mean across loci. This is not ideal because it fails to account for the fact that loci with different numbers of alleles have different information content. Here we consider this problem for genetic estimators of contemporary effective population size (N(e)). To evaluate bias and precision of several statistical approaches for dealing with missing data, we simulated populations with known N(e) and various degrees of missing data. Across all scenarios, one method of correcting for missing data (fixed-inverse variance-weighted harmonic mean) consistently performed the best for both single-sample and two-sample (temporal) methods of estimating N(e) and outperformed some methods currently in widespread use. The approach adopted here may be a starting point to adjust other population genetics methods that include per-locus sample size components. © 2012 Blackwell Publishing Ltd.
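The two "commonly used" summaries of per-locus sample size mentioned in the abstract, the unweighted mean and the harmonic mean, can be compared directly. The sketch below covers only these two baselines; the fixed-inverse variance-weighted harmonic mean favoured in the paper is not specified in the abstract and is therefore not attempted. The function name and the per-locus counts are hypothetical.

```python
from statistics import harmonic_mean, mean

def effective_sample_sizes(per_locus_n):
    """Unweighted and harmonic mean of the number of genotyped individuals per locus."""
    return {"mean": mean(per_locus_n), "harmonic": harmonic_mean(per_locus_n)}

# Hypothetical numbers of successfully genotyped individuals at five loci.
print(effective_sample_sizes([50, 48, 50, 30, 45]))
```

The harmonic mean is pulled towards the loci with the most missing data, so it never exceeds the unweighted mean.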
Boitard, Simon; Rodríguez, Willy; Jay, Flora; Mona, Stefano; Austerlitz, Frédéric
2016-01-01
Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles. PMID:26943927
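One of the two summary statistics named in the abstract, the folded allele frequency spectrum, needs neither phasing nor polarization: for each SNP only the count of the rarer allele matters. A minimal sketch of that folding step, assuming per-SNP allele counts are already available; it is not the PopSizeABC implementation, and the function name and input format are hypothetical.

```python
from collections import Counter

def folded_afs(allele_counts, n_chromosomes):
    """allele_counts: per-SNP count of one (arbitrarily chosen) allele
    among n_chromosomes sampled copies.
    Returns the number of SNPs with minor allele count 1, 2, ..., n // 2."""
    spectrum = Counter()
    for count in allele_counts:
        minor = min(count, n_chromosomes - count)  # folding removes the need to polarize
        if minor > 0:                              # monomorphic sites are skipped
            spectrum[minor] += 1
    return [spectrum[i] for i in range(1, n_chromosomes // 2 + 1)]

# Hypothetical allele counts at seven sites in a sample of 10 chromosomes.
print(folded_afs([1, 9, 5, 0, 10, 2, 8], 10))
```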
Anti-Depressants, Suicide, and Drug Regulation
ERIC Educational Resources Information Center
Ludwig, Jens; Marcotte, Dave E.
2005-01-01
Policymakers are increasingly concerned that a relatively new class of anti-depressant drugs, selective serotonin re-uptake inhibitors (SSRI), may increase the risk of suicide for at least some patients, particularly children. Prior randomized trials are not informative on this question because of small sample sizes and other limitations. Using…
As part of the Desert Southwest Coarse Particulate Matter Study which characterized the composition of fine and coarse particulate matter in Pinal County, AZ, several source samples were collected from several different soil types to assist in source apportionment analysis of the...
A log-linear model approach to estimation of population size using the line-transect sampling method
Anderson, D.R.; Burnham, K.P.; Crain, B.R.
1978-01-01
The technique of estimating wildlife population size and density using the belt or line-transect sampling method has been used in many past projects, such as the estimation of density of waterfowl nestling sites in marshes, and is being used currently in such areas as the assessment of Pacific porpoise stocks in regions of tuna fishing activity. A mathematical framework for line-transect methodology has only emerged in the last 5 yr. In the present article, we extend this mathematical framework to a line-transect estimator based upon a log-linear model approach.
Combining the boundary shift integral and tensor-based morphometry for brain atrophy estimation
NASA Astrophysics Data System (ADS)
Michalkiewicz, Mateusz; Pai, Akshay; Leung, Kelvin K.; Sommer, Stefan; Darkner, Sune; Sørensen, Lauge; Sporring, Jon; Nielsen, Mads
2016-03-01
Brain atrophy from structural magnetic resonance images (MRIs) is widely used as an imaging surrogate marker for Alzheimer's disease. Its utility has been limited due to the large degree of variance and subsequently high sample size estimates. The only consistent and reasonably powerful atrophy estimation method has been the boundary shift integral (BSI). In this paper, we first propose a tensor-based morphometry (TBM) method to measure voxel-wise atrophy that we combine with BSI. The combined model decreases the sample size estimates significantly when compared to BSI and TBM alone.
Sample Size for Tablet Compression and Capsule Filling Events During Process Validation.
Charoo, Naseem Ahmad; Durivage, Mark; Rahman, Ziyaur; Ayad, Mohamad Haitham
2017-12-01
During solid dosage form manufacturing, the uniformity of dosage units (UDU) is ensured by testing samples at 2 stages, that is, blend stage and tablet compression or capsule/powder filling stage. The aim of this work is to propose a sample size selection approach based on quality risk management principles for process performance qualification (PPQ) and continued process verification (CPV) stages by linking UDU to potential formulation and process risk factors. Bayes success run theorem appeared to be the most appropriate approach among various methods considered in this work for computing sample size for PPQ. The sample sizes for high-risk (reliability level of 99%), medium-risk (reliability level of 95%), and low-risk factors (reliability level of 90%) were estimated to be 299, 59, and 29, respectively. Risk-based assignment of reliability levels was supported by the fact that at low defect rate, the confidence to detect out-of-specification units would decrease which must be supplemented with an increase in sample size to enhance the confidence in estimation. Based on level of knowledge acquired during PPQ and the level of knowledge further required to comprehend process, sample size for CPV was calculated using Bayesian statistics to accomplish reduced sampling design for CPV. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
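The sample sizes quoted in the abstract above can be reproduced with the standard success-run relation n = ln(1 - C) / ln(R), rounded up. The sketch below is only illustrative: the abstract does not state the confidence level C, so the default of 95% is an assumption (it happens to give 299, 59, and 29), and the function name is hypothetical.

```python
import math


def success_run_sample_size(reliability: float, confidence: float = 0.95) -> int:
    """Smallest n with zero allowed failures such that reliability**n <= 1 - confidence.

    The default confidence of 0.95 is an assumption, not taken from the abstract.
    """
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


if __name__ == "__main__":
    # High-, medium-, and low-risk reliability levels named in the abstract.
    for r in (0.99, 0.95, 0.90):
        print(r, success_run_sample_size(r))
```

Raising the assumed confidence level increases every sample size, so the agreement with the published figures supports, but does not prove, that 95% was the level used.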
Stellar mass functions and implications for a variable IMF
NASA Astrophysics Data System (ADS)
Bernardi, M.; Sheth, R. K.; Fischer, J.-L.; Meert, A.; Chae, K.-H.; Dominguez-Sanchez, H.; Huertas-Company, M.; Shankar, F.; Vikram, V.
2018-03-01
Spatially resolved kinematics of nearby galaxies has shown that the ratio of dynamical to stellar population-based estimates of the mass of a galaxy (M_{*}^{JAM}/M_{*}) correlates with σ_e, the light-weighted velocity dispersion within its half-light radius, if M_{*} is estimated using the same initial mass function (IMF) for all galaxies and the stellar mass-to-light ratio within each galaxy is constant. This correlation may indicate that, in fact, the IMF is more bottom-heavy or dwarf-rich for galaxies with large σ. We use this correlation to estimate a dynamical or IMF-corrected stellar mass, M_{*}^{α_{JAM}}, from M_{*} and σ_e for a sample of 6 × 10^5 Sloan Digital Sky Survey (SDSS) galaxies for which spatially resolved kinematics is not available. We also compute the 'virial' mass estimate k(n,R) R_e σ_R^2/G, where n is the Sérsic index, in the SDSS and ATLAS^3D samples. We show that an n-dependent correction must be applied to the k(n,R) values provided by Prugniel & Simien. Our analysis also shows that the shape of the velocity dispersion profile in the ATLAS^3D sample varies weakly with n: (σ_R/σ_e) = (R/R_e)^{-γ(n)}. The resulting stellar mass functions, based on M_{*}^{α_{JAM}} and the recalibrated virial mass, are in good agreement. Using a Fundamental Plane-based observational proxy for σ_e produces comparable results. The use of direct measurements for estimating the IMF-dependent stellar mass is prohibitively expensive for a large sample of galaxies. By demonstrating that cheaper proxies are sufficiently accurate, our analysis should enable a more reliable census of the mass in stars, especially at high redshift, at a fraction of the cost. Our results are provided in tabular form.
Overview of the Mars Sample Return Earth Entry Vehicle
NASA Technical Reports Server (NTRS)
Dillman, Robert; Corliss, James
2008-01-01
NASA's Mars Sample Return (MSR) project will bring Mars surface and atmosphere samples back to Earth for detailed examination. Langley Research Center's MSR Earth Entry Vehicle (EEV) is a core part of the mission, protecting the sample container during atmospheric entry, descent, and landing. Planetary protection requirements demand a higher reliability from the EEV than for any previous planetary entry vehicle. An overview of the EEV design and preliminary analysis is presented, with a follow-on discussion of recommended future design trade studies to be performed over the next several years in support of an MSR launch in 2018 or 2020. Planned topics include vehicle size for impact protection of a range of sample container sizes, outer mold line changes to achieve surface sterilization during re-entry, micrometeoroid protection, aerodynamic stability, thermal protection, and structural materials selection.
Tests of Independence in Contingency Tables with Small Samples: A Comparison of Statistical Power.
ERIC Educational Resources Information Center
Parshall, Cynthia G.; Kromrey, Jeffrey D.
1996-01-01
Power and Type I error rates were estimated for contingency tables with small sample sizes for the following four types of tests: (1) Pearson's chi-square; (2) chi-square with Yates's continuity correction; (3) the likelihood ratio test; and (4) Fisher's Exact Test. Various marginal distributions, sample sizes, and effect sizes were examined. (SLD)
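For readers who want to see the four tests named above side by side, the following is a minimal sketch for a single 2x2 table using only the Python standard library. The function names and the example table are hypothetical, and the two-sided Fisher p-value uses the common "sum of tables no more probable than the observed one" convention, which is an assumption about the definition rather than something the abstract specifies.

```python
import math


def two_by_two_statistics(a: int, b: int, c: int, d: int):
    """Return (Pearson chi-square, Yates-corrected chi-square, likelihood ratio G)."""
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    expected = [r * k / n for r, k in zip(row_totals, col_totals)]
    pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Yates's correction shrinks each |O - E| by 0.5 (never below zero).
    yates = sum(max(abs(o - e) - 0.5, 0.0) ** 2 / e for o, e in zip(observed, expected))
    # Cells with a zero count contribute nothing to G.
    g = 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    return pearson, yates, g


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value from the hypergeometric distribution."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x: int) -> float:
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    print(two_by_two_statistics(10, 5, 3, 12))
    print(fisher_exact_two_sided(10, 5, 3, 12))
```

The three chi-square-type statistics would each be referred to a chi-square distribution with one degree of freedom. The continuity correction always yields the smallest of the two Pearson variants, which is one reason it tends to be conservative in small samples.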
Visscher, Peter M; Goddard, Michael E
2015-01-01
Heritability is a population parameter of importance in evolution, plant and animal breeding, and human medical genetics. It can be estimated using pedigree designs and, more recently, using relationships estimated from markers. We derive the sampling variance of the estimate of heritability for a wide range of experimental designs, assuming that estimation is by maximum likelihood and that the resemblance between relatives is solely due to additive genetic variation. We show that well-known results for balanced designs are special cases of a more general unified framework. For pedigree designs, the sampling variance is inversely proportional to the variance of relationship in the pedigree and it is proportional to 1/N, whereas for population samples it is approximately proportional to 1/N^2, where N is the sample size. Variation in relatedness is a key parameter in the quantification of the sampling variance of heritability. Consequently, the sampling variance is high for populations with large recent effective population size (e.g., humans) because this causes low variation in relationship. However, even using human population samples, low sampling variance is possible with high N. Copyright © 2015 by the Genetics Society of America.
Sample Size Calculations for Precise Interval Estimation of the Eta-Squared Effect Size
ERIC Educational Resources Information Center
Shieh, Gwowen
2015-01-01
Analysis of variance is one of the most frequently used statistical analyses in the behavioral, educational, and social sciences, and special attention has been paid to the selection and use of an appropriate effect size measure of association in analysis of variance. This article presents the sample size procedures for precise interval estimation…
Finite mixture model: A maximum likelihood estimation approach on time series data
NASA Astrophysics Data System (ADS)
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its asymptotic properties. In addition, the estimator is consistent as the sample size increases to infinity, which suggests that maximum likelihood estimation is asymptotically unbiased. Moreover, the parameter estimates obtained by maximum likelihood estimation have the smallest variance compared with other statistical methods as the sample size increases. Thus, maximum likelihood estimation is adopted in this paper to fit the two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, Philippines and Indonesia. The results show a negative relationship between rubber price and exchange rate for all selected countries.
Simulating realistic predator signatures in quantitative fatty acid signature analysis
Bromaghin, Jeffrey F.
2015-01-01
Diet estimation is an important field within quantitative ecology, providing critical insights into many aspects of ecology and community dynamics. Quantitative fatty acid signature analysis (QFASA) is a prominent method of diet estimation, particularly for marine mammal and bird species. Investigators using QFASA commonly use computer simulation to evaluate statistical characteristics of diet estimators for the populations they study. Similar computer simulations have been used to explore and compare the performance of different variations of the original QFASA diet estimator. In both cases, computer simulations involve bootstrap sampling prey signature data to construct pseudo-predator signatures with known properties. However, bootstrap sample sizes have been selected arbitrarily and pseudo-predator signatures therefore may not have realistic properties. I develop an algorithm to objectively establish bootstrap sample sizes that generates pseudo-predator signatures with realistic properties, thereby enhancing the utility of computer simulation for assessing QFASA estimator performance. The algorithm also appears to be computationally efficient, resulting in bootstrap sample sizes that are smaller than those commonly used. I illustrate the algorithm with an example using data from Chukchi Sea polar bears (Ursus maritimus) and their marine mammal prey. The concepts underlying the approach may have value in other areas of quantitative ecology in which bootstrap samples are post-processed prior to their use.
Allen, John C; Thumboo, Julian; Lye, Weng Kit; Conaghan, Philip G; Chew, Li-Ching; Tan, York Kiat
2018-03-01
To determine whether novel methods of selecting joints through (i) ultrasonography (individualized-ultrasound [IUS] method), or (ii) ultrasonography and clinical examination (individualized-composite-ultrasound [ICUS] method) translate into smaller rheumatoid arthritis (RA) clinical trial sample sizes when compared to existing methods utilizing predetermined joint sites for ultrasonography. Cohen's effect size (ES) was estimated (ES^) and a 95% CI (ES^L, ES^U) calculated on a mean change in 3-month total inflammatory score for each method. Corresponding 95% CIs [nL(ES^U), nU(ES^L)] were obtained on a post hoc sample size reflecting the uncertainty in ES^. Sample size calculations were based on a one-sample t-test as the patient numbers needed to provide 80% power at α = 0.05 to reject a null hypothesis H0: ES = 0 versus alternative hypotheses H1: ES = ES^, ES = ES^L and ES = ES^U. We aimed to provide point and interval estimates on projected sample sizes for future studies reflecting the uncertainty in our study ES^ values. Twenty-four treated RA patients were followed up for 3 months. Utilizing the 12-joint approach and existing methods, the post hoc sample size (95% CI) was 22 (10-245). Corresponding sample sizes using ICUS and IUS were 11 (7-40) and 11 (6-38), respectively. Utilizing a seven-joint approach, the corresponding sample sizes using ICUS and IUS methods were nine (6-24) and 11 (6-35), respectively. Our pilot study suggests that sample size for RA clinical trials with ultrasound endpoints may be reduced using the novel methods, providing justification for larger studies to confirm these observations. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
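The mapping from an effect size and its confidence limits to a projected sample size and its interval can be sketched as follows. This is a rough illustration, not the authors' calculation: it uses the normal approximation n ≈ ((z_{1-α/2} + z_{1-β}) / ES)^2 rather than the exact one-sample t-test, so it gives slightly smaller n than a t-based computation, and the effect sizes in the demo are made up because the abstract does not report them.

```python
import math
from statistics import NormalDist


def one_sample_n(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n for a two-sided one-sample test (normal approximation)."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / effect_size) ** 2)


def sample_size_interval(es_hat: float, es_lower: float, es_upper: float):
    """Point estimate and interval for n; the upper ES limit gives the lower n limit."""
    return one_sample_n(es_hat), (one_sample_n(es_upper), one_sample_n(es_lower))


if __name__ == "__main__":
    # Hypothetical effect size estimate with hypothetical 95% confidence limits.
    print(sample_size_interval(0.9, 0.4, 1.4))
```

Because n scales with 1/ES^2, a lower confidence limit close to zero inflates the upper sample size limit sharply, which is consistent with the wide intervals reported in the abstract.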
Statistical theory and methodology for remote sensing data analysis
NASA Technical Reports Server (NTRS)
Odell, P. L.
1974-01-01
A model is developed for the evaluation of acreages (proportions) of different crop types over a geographical area using a classification approach, and methods for estimating the crop acreages are given. In estimating the acreages of a specific crop type such as wheat, it is suggested to treat the problem as a two-crop problem: wheat vs. nonwheat, since this simplifies the estimation problem considerably. The error analysis and the sample size problem are investigated for the two-crop approach. Certain numerical results for sample sizes are given for a JSC-ERTS-1 data example on wheat identification performance in Hill County, Montana and Burke County, North Dakota. Lastly, for a large-area crop acreage inventory, a sampling scheme is suggested for acquiring sample data, and the problem of crop acreage estimation and the error analysis are discussed.
Tunable microwave absorbing nano-material for X-band applications
NASA Astrophysics Data System (ADS)
Sadiq, Imran; Naseem, Shahzad; Ashiq, Muhammad Naeem; Khan, M. A.; Niaz, Shanawer; Rana, M. U.
2016-03-01
The effect of rare earth element substitution in Sr1.96RE0.04Co2Fe27.80Mn0.2O46 (RE=Ce, Gd, Nd, La and Sm) X-type hexagonal ferrites prepared by the sol-gel autocombustion method was studied. The XRD and FTIR analyses show the single phase of the prepared material. The lattice constants a (Å) and c (Å) vary with the additives. The particle size measured by the Scherrer formula for all the samples varies in the range of 54-100 nm and was confirmed by TEM analysis. The average grain size measured by SEM analysis lies in the range of 0.672-1.01 μm for all the samples. The Gd-substituted ferrite has the highest coercivity (526.06 G) among all the samples, which could make it a good material for longitudinal recording media. The results also indicate that the Gd-substituted sample has a maximum reflection loss of -25.2 dB at 11.878 GHz and exhibits the best microwave absorption properties among all the substituted samples. Furthermore, the minimum value of reflection loss shifts towards lower and higher frequencies with the substitution of rare earth elements, which confirms that the microwave absorption properties can be tuned with the substitution of rare earth elements in pure ferrites. The peak value of the attenuation constant at higher frequency agrees well with the reflection loss data.
Improving size estimates of open animal populations by incorporating information on age
Manly, Bryan F.J.; McDonald, Trent L.; Amstrup, Steven C.; Regehr, Eric V.
2003-01-01
Around the world, a great deal of effort is expended each year to estimate the sizes of wild animal populations. Unfortunately, population size has proven to be one of the most intractable parameters to estimate. The capture-recapture estimation models most commonly used (of the Jolly-Seber type) are complicated and require numerous, sometimes questionable, assumptions. The derived estimates usually have large variances and lack consistency over time. In capture–recapture studies of long-lived animals, the ages of captured animals can often be determined with great accuracy and relative ease. We show how to incorporate age information into size estimates for open populations, where the size changes through births, deaths, immigration, and emigration. The proposed method allows more precise estimates of population size than the usual models, and it can provide these estimates from two sample occasions rather than the three usually required. Moreover, this method does not require specialized programs for capture-recapture data; researchers can derive their estimates using the logistic regression module in any standard statistical package.
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate are considered. These equations suggest certain successive-approximations iterative procedures for obtaining maximum-likelihood estimates. These are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N_0 approaches infinity (regardless of the relative sizes of N_0 and N_i, i=1,...,m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
Tree mortality rates and tree population projections in Baltimore, Maryland, USA
David J. Nowak; Miki Kuroda; Daniel E. Crane
2004-01-01
Based on re-measurements (1999 and 2001) of randomly-distributed permanent plots within the city boundaries of Baltimore, Maryland, trees are estimated to have an annual mortality rate of 6.6% with an overall annual net change in the number of live trees of -4.2%. Tree mortality rates were significantly different based on tree size, condition, species, and land use....
Sandra Ryan; Kathleen Dwire
2012-01-01
In this study of a burned watershed in northwestern Wyoming, USA, sedimentation impacts following a moderately-sized fire (Boulder Creek burn, 2000) were evaluated against sediment loads estimated for the period prior to burning. Early observations of suspended sediment yield showed substantially elevated loads (5x) the first year post-fire (2001), followed by less...
ESTIMATING SAMPLE REQUIREMENTS FOR FIELD EVALUATIONS OF PESTICIDE LEACHING
A method is presented for estimating the number of samples needed to evaluate pesticide leaching threats to ground water at a desired level of precision. Sample size projections are based on desired precision (exhibited as relative tolerable error), level of confidence (90 or 95%...
ERIC Educational Resources Information Center
Myers, Nicholas D.; Ahn, Soyeon; Jin, Ying
2011-01-01
Monte Carlo methods can be used in data analytic situations (e.g., validity studies) to make decisions about sample size and to estimate power. The purpose of using Monte Carlo methods in a validity study is to improve the methodological approach within a study where the primary focus is on construct validity issues and not on advancing…
Sample size and power for cost-effectiveness analysis (part 1).
Glick, Henry A
2011-03-01
Basic sample size and power formulae for cost-effectiveness analysis have been established in the literature. These formulae are reviewed and the similarities and differences between sample size and power for cost-effectiveness analysis and for the analysis of other continuous variables such as changes in blood pressure or weight are described. The types of sample size and power tables that are commonly calculated for cost-effectiveness analysis are also described and the impact of varying the assumed parameter values on the resulting sample size and power estimates is discussed. Finally, the way in which the data for these calculations may be derived is discussed.
NASA Astrophysics Data System (ADS)
Chen, Kang; Walker, Richard J.; Rudnick, Roberta L.; Gao, Shan; Gaschnig, Richard M.; Puchtel, Igor S.; Tang, Ming; Hu, Zhao-Chu
2016-10-01
The fine-grained matrix of glacial diamictites, deposited periodically by continental ice sheets over much of Earth history, provides insights into the average composition and chemical evolution of the upper continental crust (UCC) (Gaschnig et al., 2016, and references therein). The concentrations of platinum-group elements (PGEs, including Os, Ir, Ru, Pt and Pd) and the geochemically related Re, as well as 187Re/188Os and 187Os/188Os ratios, are reported here for globally-distributed glacial diamictites that were deposited during the Mesoarchean, Paleoproterozoic, Neoproterozoic and Paleozoic eras. The medians and averages of PGE concentrations of these diamictite composites decrease from the Mesoarchean to the Neoproterozoic, mimicking decreases in the concentrations of first-row transition elements (Sc, V, Cr, Co and Ni). By contrast, Re concentrations are highly variable with no discernable trend, owing to its high solubility. Assuming these diamictites are representative of average UCC through time, the new data are fully consistent with the previous inference that the Archean UCC contained a greater proportion of mafic-ultramafic rocks relative to younger UCC. Linear regressions of PGEs versus Cr and Ni concentrations in all the diamictite composites from the four time periods are used to estimate the following concentrations of the PGEs in the present-day UCC: 0.059 ± 0.016 ng/g Os, 0.036 ± 0.008 ng/g Ir, 0.079 ± 0.026 ng/g Ru, 0.80 ± 0.22 ng/g Pt and 0.80 ± 0.26 ng/g Pd (2σ of 10,000 bootstrapping regression results). These PGE estimates are slightly higher than the estimates obtained from loess samples. We suggest this probably results from loess preferentially sampling younger UCC rocks that have lower PGE concentrations, or PGEs being fractionated during loess formation. A Re concentration of 0.25 ± 0.12 ng/g (2σ) is obtained from a regression of Re versus Mo. 
From this, time-integrated 187Re/188Os and 187Os/188Os ratios for the UCC are calculated, assuming an average UCC residence duration of ∼2.0 Ga, yielding ratios of 20 ± 12 and 0.80 ± 0.38 (2σ), respectively.
Bergh, Daniel
2015-01-01
Chi-square statistics are commonly used for tests of fit of measurement models. Chi-square is also sensitive to sample size, which is why several approaches to handle large samples in test of fit analysis have been developed. One strategy to handle the sample size problem may be to adjust the sample size in the analysis of fit. An alternative is to adopt a random sample approach. The purpose of this study was to analyze and to compare these two strategies using simulated data. Given an original sample size of 21,000, for reductions of sample sizes down to the order of 5,000 the adjusted sample size function works as well as the random sample approach. In contrast, when applying adjustments to sample sizes of lower order the adjustment function is less effective at approximating the chi-square value for an actual random sample of the relevant size. Hence, the fit is exaggerated and misfit under-estimated using the adjusted sample size function. Although there are big differences in chi-square values between the two approaches at lower sample sizes, the inferences based on the p-values may be the same.
ERIC Educational Resources Information Center
Guo, Jiin-Huarng; Luh, Wei-Ming
2008-01-01
This study proposes an approach for determining appropriate sample size for Welch's F test when unequal variances are expected. Given a certain maximum deviation in population means and using the quantile of F and t distributions, there is no need to specify a noncentrality parameter and it is easy to estimate the approximate sample size needed…
Non-invasive genetic censusing and monitoring of primate populations.
Arandjelovic, Mimi; Vigilant, Linda
2018-03-01
Knowing the density or abundance of primate populations is essential for their conservation management and contextualizing socio-demographic and behavioral observations. When direct counts of animals are not possible, genetic analysis of non-invasive samples collected from wildlife populations allows estimates of population size with higher accuracy and precision than is possible using indirect signs. Furthermore, in contrast to traditional indirect survey methods, prolonged or periodic genetic sampling across months or years enables inference of group membership, movement, dynamics, and some kin relationships. Data may also be used to estimate sex ratios, sex differences in dispersal distances, and detect gene flow among locations. Recent advances in capture-recapture models have further improved the precision of population estimates derived from non-invasive samples. Simulations using these methods have shown that the confidence interval of point estimates includes the true population size when assumptions of the models are met, and therefore this range of population size minima and maxima should be emphasized in population monitoring studies. Innovations such as the use of sniffer dogs or anti-poaching patrols for sample collection are important to ensure adequate sampling, and the expected development of efficient and cost-effective genotyping by sequencing methods for DNAs derived from non-invasive samples will automate and speed analyses. © 2018 Wiley Periodicals, Inc.
Conservative Sample Size Determination for Repeated Measures Analysis of Covariance.
Morgan, Timothy M; Case, L Douglas
2013-07-05
In the design of a randomized clinical trial with one pre and multiple post randomized assessments of the outcome variable, one needs to account for the repeated measures in determining the appropriate sample size. Unfortunately, one seldom has a good estimate of the variance of the outcome measure, let alone the correlations among the measurements over time. We show how sample sizes can be calculated by making conservative assumptions regarding the correlations for a variety of covariance structures. The most conservative choice for the correlation depends on the covariance structure and the number of repeated measures. In the absence of good estimates of the correlations, the sample size is often based on a two-sample t-test, making the 'ultra' conservative and unrealistic assumption that there are zero correlations between the baseline and follow-up measures while at the same time assuming there are perfect correlations between the follow-up measures. Compared to the case of taking a single measurement, substantial savings in sample size can be realized by accounting for the repeated measures, even with very conservative assumptions regarding the parameters of the assumed correlation matrix. Assuming compound symmetry, the sample size from the two-sample t-test calculation can be reduced at least 44%, 56%, and 61% for repeated measures analysis of covariance by taking 2, 3, and 4 follow-up measures, respectively. The results offer a rational basis for determining a fairly conservative, yet efficient, sample size for clinical trials with repeated measures and a baseline value.
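The 44%, 56%, and 61% figures can be reproduced with a short calculation, sketched below under stated assumptions. It assumes compound symmetry with a common correlation rho, an analysis of the mean of k follow-up measures adjusted for one baseline, and the textbook residual variance factor (1 + (k - 1) rho) / k - rho^2 relative to the two-sample t-test (factor 1). This is a reconstruction that matches the published numbers, not necessarily the authors' derivation, and the function names are hypothetical.

```python
def ancova_variance_factor(k: int, rho: float) -> float:
    """Variance of the baseline-adjusted mean of k follow-ups, relative to sigma^2."""
    return (1.0 + (k - 1) * rho) / k - rho ** 2


def most_conservative_rho(k: int) -> float:
    """Correlation that maximizes the variance factor (set the derivative to zero)."""
    return (k - 1) / (2.0 * k)


def minimum_reduction_percent(k: int) -> float:
    """Guaranteed sample size saving versus the two-sample t-test calculation."""
    return 100.0 * (1.0 - ancova_variance_factor(k, most_conservative_rho(k)))


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, most_conservative_rho(k), round(minimum_reduction_percent(k), 1))
```

Note that the worst-case correlation depends on k (0.25, 0.33, and 0.375 for k = 2, 3, 4), which illustrates the abstract's remark that the most conservative choice depends on the number of repeated measures.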
National River and Stream Assessment Monitoring Design
The USEPA designed the National River and Stream Assessment (NRSA) in 2007 and field sampling was completed in 2008-9. The objective of the assessment is to estimate the ecological condition of river and streams nationally. This paper describes the national survey design and re...
A Compact Microwave Microfluidic Sensor Using a Re-Entrant Cavity.
Hamzah, Hayder; Abduljabar, Ali; Lees, Jonathan; Porch, Adrian
2018-03-19
A miniaturized 2.4 GHz re-entrant cavity has been designed, manufactured and tested as a sensor for microfluidic compositional analysis. It has been fully evaluated experimentally with water and common solvents, namely methanol, ethanol, and chloroform, with excellent agreement with the expected behaviour predicted by the Debye model. The sensor's performance has also been assessed for analysis of segmented flow using water and oil. The samples' interaction with the electric field in the gap region has been maximized by aligning the sample tube parallel to the electric field in this region, and the small width of the gap (typically 1 mm) results in a highly localised complex permittivity measurement. The re-entrant cavity has simple mechanical geometry, small size, high quality factor, and due to the high concentration of electric field in the gap region, a very small mode volume. These factors combine to result in a highly sensitive, compact sensor for both pure liquids and liquid mixtures in capillary or microfluidic environments.
Automated sampling assessment for molecular simulations using the effective sample size
Zhang, Xin; Bhatt, Divesh; Zuckerman, Daniel M.
2010-01-01
To quantify the progress in the development of algorithms and forcefields used in molecular simulations, a general method for the assessment of the sampling quality is needed. Statistical mechanics principles suggest the populations of physical states characterize equilibrium sampling in a fundamental way. We therefore develop an approach for analyzing the variances in state populations, which quantifies the degree of sampling in terms of the effective sample size (ESS). The ESS estimates the number of statistically independent configurations contained in a simulated ensemble. The method is applicable to both traditional dynamics simulations as well as more modern (e.g., multi-canonical) approaches. Our procedure is tested in a variety of systems from toy models to atomistic protein simulations. We also introduce a simple automated procedure to obtain approximate physical states from dynamic trajectories: this allows sample-size estimation in systems for which physical states are not known in advance. PMID:21221418
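One way to read "analyzing the variances in state populations" is through the binomial relation: if N independent configurations are drawn and a state has true population p, the observed fraction has variance p(1 - p)/N, so N can be estimated as p(1 - p) divided by the observed variance. The sketch below applies that idea to repeated population estimates of a single state. It is a simplified illustration under that assumption, with a hypothetical function name and made-up numbers, and it omits the state-identification step the abstract mentions.

```python
from statistics import mean, variance


def effective_sample_size(population_estimates):
    """Estimate ESS per estimate from the spread of one state's population.

    population_estimates: fractions of configurations found in the state,
    one value per independent run or trajectory block (hypothetical input).
    """
    if len(population_estimates) < 2:
        raise ValueError("need at least two estimates to compute a variance")
    p = mean(population_estimates)
    var = variance(population_estimates)  # sample variance (n - 1 denominator)
    if var == 0:
        raise ValueError("zero variance: ESS is unbounded for these estimates")
    return p * (1.0 - p) / var


if __name__ == "__main__":
    # Three hypothetical block estimates of a state's population.
    print(effective_sample_size([0.4, 0.5, 0.6]))
```

A small spread between runs implies many independent configurations per run, while a spread comparable to p(1 - p) implies that each run contributes roughly one independent sample.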
Goodall-Copestake, W P; Tarling, G A; Murphy, E J
2012-07-01
Estimates of genetic diversity represent a valuable resource for biodiversity assessments and are increasingly used to guide conservation and management programs. The most commonly reported estimates of DNA sequence diversity in animal populations are haplotype diversity (h) and nucleotide diversity (π) for the mitochondrial gene cytochrome c oxidase subunit I (cox1). However, several issues relevant to the comparison of h and π within and between studies remain to be assessed. We used population-level cox1 data from peer-reviewed publications to quantify the extent to which data sets can be re-assembled, to provide a standardized summary of h and π estimates, to explore the relationship between these metrics and to assess their sensitivity to under-sampling. Only 19 out of 42 selected publications had archived data that could be unambiguously re-assembled; this comprised 127 population-level data sets (n ≥ 15) from 23 animal species. Estimates of h and π were calculated using a 456-base region of cox1 that was common to all the data sets (median h = 0.70130, median π = 0.00356). Non-linear regression methods and Bayesian information criterion analysis revealed that the most parsimonious model describing the relationship between the estimates of h and π was π = 0.0081h^2. Deviations from this model can be used to detect outliers due to biological processes or methodological issues. Subsampling analyses indicated that samples of n > 5 were sufficient to discriminate extremes of high from low population-level cox1 diversity, but samples of n ≥ 25 are recommended for greater accuracy.
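The two diversity metrics compared above can be computed from aligned sequences as sketched below. The sketch assumes the usual definitions (Nei's haplotype diversity with the n/(n - 1) correction, and nucleotide diversity as the mean number of pairwise differences per site); the abstract does not spell out its estimators, so treat these as assumptions. The toy sequences and function names are hypothetical, and the last function simply evaluates the fitted relationship reported in the abstract.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """pi = mean pairwise differences per site; sequences must be aligned."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must have equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


def predicted_pi(h):
    """Relationship reported in the abstract: pi = 0.0081 * h^2."""
    return 0.0081 * h ** 2


if __name__ == "__main__":
    toy = ["ACGT", "ACGT", "ACGA", "TCGA"]  # hypothetical 4-base alignment
    print(haplotype_diversity(toy), nucleotide_diversity(toy))
    print(predicted_pi(0.70130))
```

Comparing an observed π with predicted_pi(h) for the same population is one simple way to flag the kind of outliers the abstract mentions.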
Increased accuracy of batch fecundity estimates using oocyte stage ratios in Plectropomus leopardus.
Carter, A B; Williams, A J; Russ, G R
2009-08-01
Using the ratio of the number of migratory nuclei to hydrated oocytes to estimate batch fecundity of common coral trout Plectropomus leopardus increases the time over which samples can be collected and, therefore, increases the sample size available and reduces biases in batch fecundity estimates.
ERIC Educational Resources Information Center
Dong, Nianbo; Maynard, Rebecca
2013-01-01
This paper and the accompanying tool are intended to complement existing supports for conducting power analysis tools by offering a tool based on the framework of Minimum Detectable Effect Sizes (MDES) formulae that can be used in determining sample size requirements and in estimating minimum detectable effect sizes for a range of individual- and…
Wickenberg-Bolin, Ulrika; Göransson, Hanna; Fryknäs, Mårten; Gustafsson, Mats G; Isaksson, Anders
2006-03-13
Supervised learning for classification of cancer employs a set of design examples to learn how to discriminate between tumors. In practice it is crucial to confirm that the classifier is robust with good generalization performance to new examples, or at least that it performs better than random guessing. A suggested alternative is to obtain a confidence interval of the error rate using repeated design and test sets selected from available examples. However, it is known that even in the ideal situation of repeated designs and tests with completely novel samples in each cycle, a small test set size leads to a large bias in the estimate of the true variance between design sets. Therefore different methods for small sample performance estimation, such as a recently proposed procedure called Repeated Random Sampling (RSS), are also expected to result in heavily biased estimates, which in turn translates into biased confidence intervals. Here we explore such biases and develop a refined algorithm called Repeated Independent Design and Test (RIDT). Our simulations reveal that repeated designs and tests based on resampling in a fixed bag of samples yield a biased variance estimate. We also demonstrate that it is possible to obtain an improved variance estimate by means of a procedure that explicitly models how this bias depends on the number of samples used for testing. For the special case of repeated designs and tests using new samples for each design and test, we present an exact analytical expression for how the expected value of the bias decreases with the size of the test set. We show that via modeling and subsequent reduction of the small sample bias, it is possible to obtain an improved estimate of the variance of classifier performance between design sets. However, the uncertainty of the variance estimate is large in the simulations performed, indicating that the method in its present form cannot be directly applied to small data sets.
Comparison of Two Methods Used to Model Shape Parameters of Pareto Distributions
Liu, C.; Charpentier, R.R.; Su, J.
2011-01-01
Two methods are compared for estimating the shape parameters of Pareto field-size (or pool-size) distributions for petroleum resource assessment. Both methods assume mature exploration in which most of the larger fields have been discovered. Both methods use the sizes of larger discovered fields to estimate the numbers and sizes of smaller fields: (1) the tail-truncated method uses a plot of field size versus size rank, and (2) the log-geometric method uses data binned in field-size classes and the ratios of adjacent bin counts. Simulation experiments were conducted using discovered oil and gas pool-size distributions from four petroleum systems in Alberta, Canada and using Pareto distributions generated by Monte Carlo simulation. The estimates of the shape parameters of the Pareto distributions, calculated by both the tail-truncated and log-geometric methods, generally stabilize where discovered pool numbers are greater than 100. However, with fewer than 100 discoveries, these estimates can vary greatly with each new discovery. The estimated shape parameters of the tail-truncated method are more stable and larger than those of the log-geometric method where the number of discovered pools is more than 100. Both methods, however, tend to underestimate the shape parameter. Monte Carlo simulation was also used to create sequences of discovered pool sizes by sampling from a Pareto distribution with a discovery process model using a defined exploration efficiency (in order to show how biased the sampling was in favor of larger fields being discovered first). A higher (more biased) exploration efficiency gives better estimates of the Pareto shape parameters. © 2011 International Association for Mathematical Geosciences.
Mollet, Pierre; Kery, Marc; Gardner, Beth; Pasinelli, Gilberto; Royle, Andy
2015-01-01
We conducted a survey of an endangered and cryptic forest grouse, the capercaillie Tetrao urogallus, based on droppings collected on two sampling occasions in eight forest fragments in central Switzerland in early spring 2009. We used genetic analyses to sex and individually identify birds. We estimated sex-dependent detection probabilities and population size using a modern spatial capture-recapture (SCR) model for the data from pooled surveys. A total of 127 capercaillie genotypes were identified (77 males, 46 females, and 4 of unknown sex). The SCR model yielded a total population size estimate (posterior mean) of 137.3 capercaillies (posterior sd 4.2, 95% CRI 130–147). The observed sex ratio was skewed towards males (0.63). The posterior mean of the sex ratio under the SCR model was 0.58 (posterior sd 0.02, 95% CRI 0.54–0.61), suggesting a male-biased sex ratio in our study area. A subsampling simulation study indicated that a reduced sampling effort representing 75% of the actual detections would still yield practically acceptable estimates of total size and sex ratio in our population. Hence, field work and financial effort could be reduced without compromising accuracy when the SCR model is used to estimate key population parameters of cryptic species.
Mi, Michael Y; Betensky, Rebecca A
2013-04-01
Currently, a growing placebo response rate has been observed in clinical trials for antidepressant drugs, a phenomenon that has made it increasingly difficult to demonstrate efficacy. The sequential parallel comparison design (SPCD) is a clinical trial design that was proposed to address this issue. The SPCD theoretically has the potential to reduce the sample-size requirement for a clinical trial and to simultaneously enrich the study population to be less responsive to the placebo. Because the basic SPCD already reduces the placebo response by removing placebo responders between the first and second phases of a trial, the purpose of this study was to examine whether we can further improve the efficiency of the basic SPCD and whether we can do so when the projected underlying drug and placebo response rates differ considerably from the actual ones. Three adaptive designs that used interim analyses to readjust the length of study duration for individual patients were tested to reduce the sample-size requirement or increase the statistical power of the SPCD. Various simulations of clinical trials using the SPCD with interim analyses were conducted to test these designs through calculations of empirical power. From the simulations, we found that the adaptive designs can recover unnecessary resources spent in the traditional SPCD trial format with overestimated initial sample sizes and provide moderate gains in power. Under the first design, results showed up to a 25% reduction in person-days, with most power losses below 5%. In the second design, results showed up to an 8% reduction in person-days with negligible loss of power. In the third design using sample-size re-estimation, up to 25% power was recovered from underestimated sample-size scenarios.
Given the numerous possible test parameters that could have been chosen for the simulations, the study's results are limited to situations described by the parameters that were used and may not generalize to all possible scenarios. Furthermore, dropout of patients is not considered in this study. It is possible to make an already complex design such as the SPCD adaptive, and thus more efficient, potentially overcoming the problem of placebo response at lower cost. Ultimately, such a design may expedite the approval of future effective treatments.
Mi, Michael Y.; Betensky, Rebecca A.
2013-01-01
Background: Currently, a growing placebo response rate has been observed in clinical trials for antidepressant drugs, a phenomenon that has made it increasingly difficult to demonstrate efficacy. The sequential parallel comparison design (SPCD) is a clinical trial design that was proposed to address this issue. The SPCD theoretically has the potential to reduce the sample size requirement for a clinical trial and to simultaneously enrich the study population to be less responsive to the placebo. Purpose: Because the basic SPCD design already reduces the placebo response by removing placebo responders between the first and second phases of a trial, the purpose of this study was to examine whether we can further improve the efficiency of the basic SPCD and if we can do so when the projected underlying drug and placebo response rates differ considerably from the actual ones. Methods: Three adaptive designs that used interim analyses to readjust the length of study duration for individual patients were tested to reduce the sample size requirement or increase the statistical power of the SPCD. Various simulations of clinical trials using the SPCD with interim analyses were conducted to test these designs through calculations of empirical power. Results: From the simulations, we found that the adaptive designs can recover unnecessary resources spent in the traditional SPCD trial format with overestimated initial sample sizes and provide moderate gains in power. Under the first design, results showed up to a 25% reduction in person-days, with most power losses below 5%. In the second design, results showed up to an 8% reduction in person-days with negligible loss of power. In the third design using sample size re-estimation, up to 25% power was recovered from underestimated sample size scenarios.
Limitations: Given the numerous possible test parameters that could have been chosen for the simulations, the study’s results are limited to situations described by the parameters that were used, and may not generalize to all possible scenarios. Furthermore, drop-out of patients is not considered in this study. Conclusions: It is possible to make an already complex design such as the SPCD adaptive, and thus more efficient, potentially overcoming the problem of placebo response at lower cost. Ultimately, such a design may expedite the approval of future effective treatments. PMID:23283576
Effective population sizes of a major vector of human diseases, Aedes aegypti.
Saarman, Norah P; Gloria-Soria, Andrea; Anderson, Eric C; Evans, Benjamin R; Pless, Evlyn; Cosme, Luciano V; Gonzalez-Acosta, Cassandra; Kamgang, Basile; Wesson, Dawn M; Powell, Jeffrey R
2017-12-01
The effective population size (Ne) is a fundamental parameter in population genetics that determines the relative strength of selection and random genetic drift, the effect of migration, levels of inbreeding, and linkage disequilibrium. In many cases where it has been estimated in animals, Ne is on the order of 10%-20% of the census size. In this study, we use 12 microsatellite markers and 14,888 single nucleotide polymorphisms (SNPs) to empirically estimate Ne in Aedes aegypti, the major vector of yellow fever, dengue, chikungunya, and Zika viruses. We used the method of temporal sampling to estimate Ne on a global dataset made up of 46 samples of Ae. aegypti that included multiple time points from 17 widely distributed geographic localities. Our Ne estimates for Ae. aegypti fell within a broad range (~25-3,000) and averaged between 400 and 600 across all localities and time points sampled. Adult census size (Nc) estimates for this species range between one and five thousand, so the Ne/Nc ratio is about the same as for most animals. These Ne values are lower than estimates available for other insects and have important implications for the design of genetic control strategies to reduce the impact of this species of mosquito on human health.
Improved population estimates through the use of auxiliary information
Johnson, D.H.; Ralph, C.J.; Scott, J.M.
1981-01-01
When estimating the size of a population of birds, the investigator may have, in addition to an estimator based on a statistical sample, information on one of several auxiliary variables, such as: (1) estimates of the population made on previous occasions, (2) measures of habitat variables associated with the size of the population, and (3) estimates of the population sizes of other species that correlate with the species of interest. Although many studies have described the relationships between each of these kinds of data and the population size to be estimated, very little work has been done to improve the estimator by incorporating such auxiliary information. A statistical methodology termed 'empirical Bayes' seems to be appropriate to these situations. The potential that empirical Bayes methodology has for improved estimation of the population size of the Mallard (Anas platyrhynchos) is explored. In the example considered, three empirical Bayes estimators were found to reduce the error by one-fourth to one-half of that of the usual estimator.
SAS procedures for designing and analyzing sample surveys
Stafford, Joshua D.; Reinecke, Kenneth J.; Kaminski, Richard M.
2003-01-01
Complex surveys often are necessary to estimate occurrence (or distribution), density, and abundance of plants and animals for purposes of research and conservation. Most scientists are familiar with simple random sampling, where sample units are selected from a population of interest (sampling frame) with equal probability. However, the goal of ecological surveys often is to make inferences about populations over large or complex spatial areas where organisms are not homogeneously distributed or sampling frames are inconvenient or impossible to construct. Candidate sampling strategies for such complex surveys include stratified, multistage, and adaptive sampling (Thompson 1992, Buckland 1994).
Icing Characteristics of Low Altitude, Supercooled Layer Clouds. Revision
1980-05-01
Droplet Size Distribution 5. Icing Rate Meters C. Accuracy and Sources of Error in the Measurements from the Period 1944-1950 11 1. Rotating...whether currently available LWC meters and icing rate detectors will give reliable results when flown on helicopters. Concerning the forecasting...Max Dia. Size Distrib. Meter Samples 4 1944 MSP DP -- Al .... 4 6 1946 OR 2,4RC 2,4RHC Al 4RMC -- 3 7 1946-47 NEMO, 4RMC 4RMC AI 4RMC - 31 TN,OH, IN
Effects of sample size and sampling frequency on studies of brown bear home ranges and habitat use
Arthur, Steve M.; Schwartz, Charles C.
1999-01-01
We equipped 9 brown bears (Ursus arctos) on the Kenai Peninsula, Alaska, with collars containing both conventional very-high-frequency (VHF) transmitters and global positioning system (GPS) receivers programmed to determine an animal's position at 5.75-hr intervals. We calculated minimum convex polygon (MCP) and fixed and adaptive kernel home ranges for randomly-selected subsets of the GPS data to examine the effects of sample size on accuracy and precision of home range estimates. We also compared results obtained by weekly aerial radiotracking versus more frequent GPS locations to test for biases in conventional radiotracking data. Home ranges based on the MCP were 20-606 km² (x̄ = 201) for aerial radiotracking data (n = 12-16 locations/bear) and 116-1,505 km² (x̄ = 522) for the complete GPS data sets (n = 245-466 locations/bear). Fixed kernel home ranges were 34-955 km² (x̄ = 224) for radiotracking data and 16-130 km² (x̄ = 60) for the GPS data. Differences between means for radiotracking and GPS data were due primarily to the larger samples provided by the GPS data. Means did not differ between radiotracking data and equivalent-sized subsets of GPS data (P > 0.10). For the MCP, home range area increased and variability decreased asymptotically with number of locations. For the kernel models, both area and variability decreased with increasing sample size. Simulations suggested that the MCP and kernel models required >60 and >80 locations, respectively, for estimates to be both accurate (change in area <1%/additional location) and precise (CV < 50%). Although the radiotracking data appeared unbiased, except for the relationship between area and sample size, these data failed to indicate some areas that likely were important to bears. Our results suggest that the usefulness of conventional radiotracking data may be limited by potential biases and variability due to small samples.
Investigators that use home range estimates in statistical tests should consider the effects of variability of those estimates. Use of GPS-equipped collars can facilitate obtaining larger samples of unbiased data and improve accuracy and precision of home range estimates.
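The dependence of the MCP home range on the number of locations, described in the abstract above, can be sketched in Python. A minimum convex polygon is the convex hull of the recorded locations, so its area can only stay the same or grow as locations are added. The hull routine below is the standard monotone-chain algorithm and the area uses the shoelace formula; the function names and coordinates are hypothetical and are not taken from the study.

```python
def convex_hull(points):
    """Convex hull by Andrew's monotone chain; returns vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; returns 0.0 for fewer than three vertices."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def mcp_area(locations):
    """Minimum convex polygon area for a list of (x, y) locations."""
    return polygon_area(convex_hull(locations))


# Hypothetical locations in km on a projected grid.
all_fixes = [(0, 0), (10, 0), (10, 10), (0, 10), (5, 5), (2, 7)]
few_fixes = all_fixes[:3]
print(mcp_area(few_fixes), mcp_area(all_fixes))
```

Because every subset of locations has a hull contained in the hull of the full set, small samples tend to underestimate MCP area, which is consistent with the asymptotic increase the abstract reports.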
Estimation of sample size and testing power (part 6).
Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo
2012-03-01
The design of one factor with k levels (k ≥ 3) refers to the research that only involves one experimental factor with k levels (k ≥ 3), and there is no arrangement for other important non-experimental factors. This paper introduces the estimation of sample size and testing power for quantitative data and qualitative data having a binary response variable with the design of one factor with k levels (k ≥ 3).
Estimation of sample size and testing power (Part 3).
Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo
2011-12-01
This article introduces the definition and sample size estimation of three special tests (namely, non-inferiority test, equivalence test and superiority test) for qualitative data with the design of one factor with two levels having a binary response variable. Non-inferiority test refers to the research design of which the objective is to verify that the efficacy of the experimental drug is not clinically inferior to that of the positive control drug. Equivalence test refers to the research design of which the objective is to verify that the experimental drug and the control drug have clinically equivalent efficacy. Superiority test refers to the research design of which the objective is to verify that the efficacy of the experimental drug is clinically superior to that of the control drug. By specific examples, this article introduces formulas of sample size estimation for the three special tests, and their SAS realization in detail.
Jorgenson, Andrew K; Clark, Brett
2013-01-01
This study examines the regional and temporal differences in the statistical relationship between national-level carbon dioxide emissions and national-level population size. The authors analyze panel data from 1960 to 2005 for a diverse sample of nations, and employ descriptive statistics and rigorous panel regression modeling techniques. Initial descriptive analyses indicate that all regions experienced overall increases in carbon emissions and population size during the 45-year period of investigation, but with notable differences. For carbon emissions, the sample of countries in Asia experienced the largest percent increase, followed by countries in Latin America, Africa, and lastly the sample of relatively affluent countries in Europe, North America, and Oceania combined. For population size, the sample of countries in Africa experienced the largest percent increase, followed by countries in Latin America, Asia, and the combined sample of countries in Europe, North America, and Oceania. Findings for two-way fixed effects panel regression elasticity models of national-level carbon emissions indicate that the estimated elasticity coefficient for population size is much smaller for nations in Africa than for nations in other regions of the world. Regarding potential temporal changes, from 1960 to 2005 the estimated elasticity coefficient for population size decreased by 25% for the sample of Africa countries, 14% for the sample of Asia countries, 6.5% for the sample of Latin America countries, but remained the same in size for the sample of countries in Europe, North America, and Oceania. Overall, while population size continues to be the primary driver of total national-level anthropogenic carbon dioxide emissions, the findings for this study highlight the need for future research and policies to recognize that the actual impacts of population size on national-level carbon emissions differ across both time and region.
Terry, Leann; Kelley, Ken
2012-11-01
Composite measures play an important role in psychology and related disciplines. Composite measures almost always have error. Correspondingly, it is important to understand the reliability of the scores from any particular composite measure. However, the point estimates of the reliability of composite measures are fallible and thus all such point estimates should be accompanied by a confidence interval. When confidence intervals are wide, there is much uncertainty in the population value of the reliability coefficient. Given the importance of reporting confidence intervals for estimates of reliability, coupled with the undesirability of wide confidence intervals, we develop methods that allow researchers to plan sample size in order to obtain narrow confidence intervals for population reliability coefficients. We first discuss composite reliability coefficients and then provide a discussion on confidence interval formation for the corresponding population value. Using the accuracy in parameter estimation approach, we develop two methods to obtain accurate estimates of reliability by planning sample size. The first method provides a way to plan sample size so that the expected confidence interval width for the population reliability coefficient is sufficiently narrow. The second method ensures that the confidence interval width will be sufficiently narrow with some desired degree of assurance (e.g., 99% assurance that the 95% confidence interval for the population reliability coefficient will be less than W units wide). The effectiveness of our methods was verified with Monte Carlo simulation studies. We demonstrate how to easily implement the methods with easy-to-use and freely available software. ©2011 The British Psychological Society.
Sidler, Dominik; Schwaninger, Arthur; Riniker, Sereina
2016-10-21
In molecular dynamics (MD) simulations, free-energy differences are often calculated using free energy perturbation or thermodynamic integration (TI) methods. However, both techniques are only suited to calculate free-energy differences between two end states. Enveloping distribution sampling (EDS) presents an attractive alternative that allows the calculation of multiple free-energy differences in a single simulation. In EDS, a reference state is simulated which "envelopes" the end states. The challenge of this methodology is the determination of optimal reference-state parameters to ensure equal sampling of all end states. Currently, the automatic determination of the reference-state parameters for multiple end states is an unsolved issue that limits the application of the methodology. To resolve this, we have generalised the replica-exchange EDS (RE-EDS) approach, introduced by Lee et al. [J. Chem. Theory Comput. 10, 2738 (2014)] for constant-pH MD simulations. By exchanging configurations between replicas with different reference-state parameters, the complexity of the parameter-choice problem can be substantially reduced. A new robust scheme to estimate the reference-state parameters from a short initial RE-EDS simulation with default parameters was developed, which allowed the calculation of 36 free-energy differences between nine small-molecule inhibitors of phenylethanolamine N-methyltransferase from a single simulation. The resulting free-energy differences were in excellent agreement with values obtained previously by TI and two-state EDS simulations.
Babamoradi, Hamid; van den Berg, Frans; Rinnan, Åsmund
2016-02-18
In Multivariate Statistical Process Control, when a fault is expected or detected in the process, contribution plots are essential for operators and optimization engineers in identifying those process variables that were affected by or might be the cause of the fault. The traditional way of interpreting a contribution plot is to examine the largest contributing process variables as the most probable faulty ones. This might result in false readings purely due to the differences in natural variation, measurement uncertainties, etc. It is more reasonable to compare variable contributions for new process runs with historical results achieved under Normal Operating Conditions, where confidence limits for contribution plots estimated from training data are used to judge new production runs. Asymptotic methods cannot provide confidence limits for contribution plots, leaving re-sampling methods as the only option. We suggest bootstrap re-sampling to build confidence limits for all contribution plots in online PCA-based MSPC. The new strategy to estimate CLs is compared to the previously reported CLs for contribution plots. An industrial batch process dataset was used to illustrate the concepts. Copyright © 2016 Elsevier B.V. All rights reserved.
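As a rough illustration of the bootstrap re-sampling idea mentioned in the abstract above, the following Python sketch computes percentile bootstrap limits for a statistic of one variable's contributions. It is a generic percentile bootstrap, not the authors' procedure, which the abstract does not describe in detail; the function name, default parameters, and data values are all hypothetical.

```python
import random
import statistics


def bootstrap_limits(values, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap limits for `stat` computed on `values`.

    Assumption: a plain percentile bootstrap; the paper may use another variant.
    """
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    n = len(values)
    replicates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])  # resample with replacement
        for _ in range(n_boot)
    )
    lower = replicates[int((alpha / 2) * (n_boot - 1))]
    upper = replicates[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Hypothetical contributions of one process variable under Normal Operating Conditions.
noc_contributions = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 1.05, 0.95, 1.15]
low, high = bootstrap_limits(noc_contributions)
print(f"95% limits for the mean contribution: [{low:.3f}, {high:.3f}]")
```

A contribution from a new production run falling outside such limits would be flagged for inspection; which statistic to bootstrap and at what confidence level are design choices left open here.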
Galhoum, Ahmed A.; Mafhouz, Mohammad G.; Abdel-Rehem, Sayed T.; Gomaa, Nabawia A.; Atia, Asem A.; Vincent, Thierry; Guibal, Eric
2015-01-01
Cysteine-functionalized chitosan magnetic nano-based particles were synthesized for the sorption of light and heavy rare earth (RE) metal ions (La(III), Nd(III) and Yb(III)). The structural, surface, and magnetic properties of nano-sized sorbent were investigated by elemental analysis, FTIR, XRD, TEM and VSM (vibrating sample magnetometry). Experimental data show that the pseudo second-order rate equation fits the kinetic profiles well, while sorption isotherms are described by the Langmuir model. Thermodynamic constants (ΔG°, ΔH°) demonstrate the spontaneous and endothermic nature of sorption. Yb(III) (heavy RE) was selectively sorbed while light RE metal ions La(III) and Nd(III) were concentrated/enriched in the solution. Cationic species RE(III) in aqueous solution can be adsorbed by the combination of chelating and anion-exchange mechanisms. The sorbent can be efficiently regenerated using acidified thiourea. PMID:28347004
An internal pilot design for prospective cancer screening trials with unknown disease prevalence.
Brinton, John T; Ringham, Brandy M; Glueck, Deborah H
2015-10-13
For studies that compare the diagnostic accuracy of two screening tests, the sample size depends on the prevalence of disease in the study population, and on the variance of the outcome. Both parameters may be unknown during the design stage, which makes finding an accurate sample size difficult. To solve this problem, we propose adapting an internal pilot design. In this adapted design, researchers will accrue some percentage of the planned sample size, then estimate both the disease prevalence and the variances of the screening tests. The updated estimates of the disease prevalence and variance are used to conduct a more accurate power and sample size calculation. We demonstrate that in large samples, the adapted internal pilot design produces no Type I inflation. For small samples (N less than 50), we introduce a novel adjustment of the critical value to control the Type I error rate. We apply the method to two proposed prospective cancer screening studies: 1) a small oral cancer screening study in individuals with Fanconi anemia and 2) a large oral cancer screening trial. Conducting an internal pilot study without adjusting the critical value can cause Type I error rate inflation in small samples, but not in large samples. An internal pilot approach usually achieves goal power and, for most studies with sample size greater than 50, requires no Type I error correction. Further, we have provided a flexible and accurate approach to bound Type I error below a goal level for studies with small sample size.
Sampling effort and estimates of species richness based on prepositioned area electrofisher samples
Bowen, Z.H.; Freeman, Mary C.
1998-01-01
Estimates of species richness based on electrofishing data are commonly used to describe the structure of fish communities. One electrofishing method for sampling riverine fishes that has become popular in the last decade is the prepositioned area electrofisher (PAE). We investigated the relationship between sampling effort and fish species richness at seven sites in the Tallapoosa River system, USA based on 1,400 PAE samples collected during 1994 and 1995. First, we estimated species richness at each site using the first-order jackknife and compared observed values for species richness and jackknife estimates of species richness to estimates based on historical collection data. Second, we used a permutation procedure and nonlinear regression to examine rates of species accumulation. Third, we used regression to predict the number of PAE samples required to collect the jackknife estimate of species richness at each site during 1994 and 1995. We found that jackknife estimates of species richness generally were less than or equal to estimates based on historical collection data. The relationship between PAE electrofishing effort and species richness in the Tallapoosa River was described by a positive asymptotic curve as found in other studies using different electrofishing gears in wadable streams. Results from nonlinear regression analyses indicated that rates of species accumulation were variable among sites and between years. Across sites and years, predictions of sampling effort required to collect jackknife estimates of species richness suggested that doubling sampling effort (to 200 PAEs) would typically increase observed species richness by not more than six species. However, sampling effort beyond about 60 PAE samples typically increased observed species richness by < 10%. We recommend using historical collection data in conjunction with a preliminary sample size of at least 70 PAE samples to evaluate estimates of species richness in medium-sized rivers.
Seventy PAE samples should provide enough information to describe the relationship between sampling effort and species richness and thus facilitate evaluation of a sampling effort.
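The first-order jackknife used in the abstract above can be sketched in Python. The incidence-based form is S_jack1 = S_obs + f1 * (n - 1) / n, where S_obs is the number of species observed, f1 is the number of species found in exactly one sample, and n is the number of samples. Whether the authors used exactly this form is an assumption, and the function name and species codes below are hypothetical.

```python
from collections import Counter


def jackknife1_richness(samples):
    """First-order jackknife estimate of species richness.

    `samples` is a list of iterables, each holding the species seen in one sample.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    occurrences = Counter()
    for sample in samples:
        occurrences.update(set(sample))   # count each species once per sample
    s_obs = len(occurrences)
    uniques = sum(1 for count in occurrences.values() if count == 1)
    return s_obs + uniques * (n - 1) / n


# Hypothetical PAE samples; letters stand in for fish species.
pae_samples = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"a"}]
print(jackknife1_richness(pae_samples))
```

When no species is unique to a single sample, the estimate equals the observed richness, which matches the intuition that additional sampling is unlikely to reveal many new species.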
Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA.
Kelly, Brendan J; Gross, Robert; Bittinger, Kyle; Sherrill-Mix, Scott; Lewis, James D; Collman, Ronald G; Bushman, Frederic D; Li, Hongzhe
2015-08-01
The variation in community composition between microbiome samples, termed beta diversity, can be measured by pairwise distance based on either presence-absence or quantitative species abundance data. PERMANOVA, a permutation-based extension of multivariate analysis of variance to a matrix of pairwise distances, partitions within-group and between-group distances to permit assessment of the effect of an exposure or intervention (grouping factor) upon the sampled microbiome. Within-group distance and exposure/intervention effect size must be accurately modeled to estimate statistical power for a microbiome study that will be analyzed with pairwise distances and PERMANOVA. We present a framework for PERMANOVA power estimation tailored to marker-gene microbiome studies that will be analyzed by pairwise distances, which includes: (i) a novel method for distance matrix simulation that permits modeling of within-group pairwise distances according to pre-specified population parameters; (ii) a method to incorporate effects of different sizes within the simulated distance matrix; (iii) a simulation-based method for estimating PERMANOVA power from simulated distance matrices; and (iv) an R statistical software package that implements the above. Matrices of pairwise distances can be efficiently simulated to satisfy the triangle inequality and incorporate group-level effects, which are quantified by the adjusted coefficient of determination, omega-squared (ω²). From simulated distance matrices, available PERMANOVA power or necessary sample size can be estimated for a planned microbiome study. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Body mass estimates of hominin fossils and the evolution of human body size.
Grabowski, Mark; Hatala, Kevin G; Jungers, William L; Richmond, Brian G
2015-08-01
Body size directly influences an animal's place in the natural world, including its energy requirements, home range size, relative brain size, locomotion, diet, life history, and behavior. Thus, an understanding of the biology of extinct organisms, including species in our own lineage, requires accurate estimates of body size. Since the last major review of hominin body size based on postcranial morphology over 20 years ago, new fossils have been discovered, species attributions have been clarified, and methods improved. Here, we present the most comprehensive and thoroughly vetted set of individual fossil hominin body mass predictions to date, and estimation equations based on a large (n = 220) sample of modern humans of known body masses. We also present species averages based exclusively on fossils with reliable taxonomic attributions, estimates of species averages by sex, and a metric for levels of sexual dimorphism. Finally, we identify individual traits that appear to be the most reliable for mass estimation for each fossil species, for use when only one measurement is available for a fossil. Our results show that many early hominins were generally smaller-bodied than previously thought, an outcome likely due to larger estimates in previous studies resulting from the use of large-bodied modern human reference samples. Current evidence indicates that modern human-like large size first appeared by at least 3-3.5 Ma in some Australopithecus afarensis individuals. Our results challenge an evolutionary model arguing that body size increased from Australopithecus to early Homo. Instead, we show that there is no reliable evidence that the body size of non-erectus early Homo differed from that of australopiths, and confirm that Homo erectus evolved larger average body size than earlier hominins. Copyright © 2015 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mohammadzadeh, Roghayeh, E-mail: r_mohammadzadeh@sut.ac.ir; Akbari, Alireza, E-mail: akbari@sut.ac.ir
2014-07-01
Prolonged exposure at high temperatures during solution nitriding induces grain coarsening which deteriorates the mechanical properties of high nitrogen austenitic stainless steels. In this study, grain refinement of nickel and manganese free Fe–22.75Cr–2.42Mo–1.17N high nitrogen austenitic stainless steel plates was investigated via a two-stage heat treatment procedure. Initially, the coarse-grained austenitic stainless steel samples were subjected to an isothermal heating at 700 °C to be decomposed into the ferrite + Cr₂N eutectoid structure and then re-austenitized at 1200 °C followed by water quenching. Microstructure and hardness of samples were characterized using X-ray diffraction, optical and scanning electron microscopy, and micro-hardness testing. The results showed that the as-solution-nitrided steel decomposes non-uniformly to the colonies of ferrite and Cr₂N nitrides with strip like morphology after isothermal heat treatment at 700 °C. Additionally, the complete dissolution of the Cr₂N precipitates located in the sample edges during re-austenitizing requires longer times than 1 h. In order to avoid this problem an intermediate nitrogen homogenizing heat treatment cycle at 1200 °C for 10 h was applied before grain refinement process. As a result, the initial austenite was uniformly decomposed during the first stage, and a fine grained austenitic structure with average grain size of about 20 μm was successfully obtained by re-austenitizing for 10 min. Highlights: • Successful grain refinement of Fe–22.75Cr–2.42Mo–1.17N steel by heat treatment • Using the γ → α + Cr₂N reaction for grain refinement of a Ni and Mn free HNASS • Obtaining a single phase austenitic structure with average grain size of ∼ 20 μm • Incomplete dissolution of Cr₂N during re-austenitizing at 1200 °C for long times • Reducing re-austenitizing time by homogenizing treatment before grain refinement.
Re-Ranking Sequencing Variants in the Post-GWAS Era for Accurate Causal Variant Identification
Faye, Laura L.; Machiela, Mitchell J.; Kraft, Peter; Bull, Shelley B.; Sun, Lei
2013-01-01
Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an established region of association remains a challenge. Counter-intuitively, certain factors that increase power to detect an associated region can decrease power to localize the causal variant. First, combining GWAS with imputation or low coverage sequencing to achieve the large sample sizes required for high power can have the unintended effect of producing differential genotyping error among SNPs. This tends to bias the relative evidence for association toward better genotyped SNPs. Second, re-use of GWAS data for fine-mapping exploits previous findings to ensure genome-wide significance in GWAS-associated regions. However, using GWAS findings to inform fine-mapping analysis can bias evidence away from the causal SNP toward the tag SNP and SNPs in high LD with the tag. Together these factors can reduce power to localize the causal SNP by more than half. Other strategies commonly employed to increase power to detect association, namely increasing sample size and using higher density genotyping arrays, can, in certain common scenarios, actually exacerbate these effects and further decrease power to localize causal variants. We develop a re-ranking procedure that accounts for these adverse effects and substantially improves the accuracy of causal SNP identification, often doubling the probability that the causal SNP is top-ranked. Application to the NCI BPC3 aggressive prostate cancer GWAS with imputation meta-analysis identified a new top SNP at 2 of 3 associated loci and several additional possible causal SNPs at these loci that may have otherwise been overlooked. This method is simple to implement using R scripts provided on the author's website. PMID:23950724
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1978-01-01
This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1976-01-01
The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
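The two Peters and Walker abstracts above describe an iteration that reduces to the familiar successive-approximations procedure when the step-size is 1. A minimal sketch of that idea, assuming a univariate mixture and the standard EM update as the underlying operator, is given below. The names `em_step`, `relaxed_step`, and `fit_mixture` are hypothetical, and the variance floor is an added safeguard, not part of the published procedure.

```python
import numpy as np

def em_step(x, w, mu, var):
    """One successive-approximations (EM) update for a 1-D normal mixture."""
    # Posterior probability of each component for each observation
    dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    resp = w * dens
    resp /= resp.sum(axis=1, keepdims=True)
    nk = resp.sum(axis=0)
    w_new = nk / len(x)
    mu_new = (resp * x[:, None]).sum(axis=0) / nk
    var_new = (resp * (x[:, None] - mu_new) ** 2).sum(axis=0) / nk
    return w_new, mu_new, var_new

def relaxed_step(x, w, mu, var, eps):
    """Move a fraction eps of the way along the EM direction; eps = 1 is plain EM."""
    w_em, mu_em, var_em = em_step(x, w, mu, var)
    w_new = w + eps * (w_em - w)
    mu_new = mu + eps * (mu_em - mu)
    # Floor the variances so an over-relaxed step (eps > 1) cannot make them non-positive
    var_new = np.maximum(var + eps * (var_em - var), 1e-8)
    return w_new, mu_new, var_new

def fit_mixture(x, w, mu, var, eps=1.0, n_iter=200):
    """Iterate the relaxed update a fixed number of times from the given start."""
    for _ in range(n_iter):
        w, mu, var = relaxed_step(x, w, mu, var, eps)
    return w, mu, var
```

Step-sizes between 0 and 2 correspond to the range for which the abstracts report local convergence; for step-sizes above 1 the mixing weights can in principle leave the unit interval far from the solution, so a reasonable starting point is assumed.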
Overlap between treatment and control distributions as an effect size measure in experiments.
Hedges, Larry V; Olkin, Ingram
2016-03-01
The proportion π of treatment group observations that exceed the control group mean has been proposed as an effect size measure for experiments that randomly assign independent units into 2 groups. We give the exact distribution of a simple estimator of π based on the standardized mean difference and use it to study the small sample bias of this estimator. We also give the minimum variance unbiased estimator of π under 2 models, one in which the variance of the mean difference is known and one in which the variance is unknown. We show how to use the relation between the standardized mean difference and the overlap measure to compute confidence intervals for π and show that these results can be used to obtain unbiased estimators, large sample variances, and confidence intervals for 3 related effect size measures based on the overlap. Finally, we show how the effect size π can be used in a meta-analysis. (c) 2016 APA, all rights reserved.
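The relation between the standardized mean difference and π can be illustrated with a short sketch. Assuming normally distributed outcomes with a common standard deviation, the proportion of treatment observations above the control mean is Φ(d), where d is the standardized mean difference. The function name `overlap_pi` is hypothetical, and this is the plug-in form only, not the unbiased estimator derived in the article.

```python
from statistics import NormalDist

def overlap_pi(mean_treatment, mean_control, sd):
    """Plug-in estimate of pi = P(treatment observation > control mean) under normality."""
    # Standardized mean difference
    d = (mean_treatment - mean_control) / sd
    # Standard normal CDF evaluated at d
    return NormalDist().cdf(d)
```

With no treatment effect the function returns 0.5, and it approaches 1 as the standardized difference grows.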
Ronald E. McRoberts; Geoffrey R. Holden; Mark D. Nelson; Greg C. Liknes; Dale D. Gormanson
2006-01-01
Forest inventory programs report estimates of forest variables for areas of interest ranging in size from municipalities, to counties, to states or provinces. Because of numerous factors, sample sizes are often insufficient to estimate attributes as precisely as is desired, unless the estimation process is enhanced using ancillary data. Classified satellite imagery has...
Ozay, Guner; Seyhan, Ferda; Yilmaz, Aysun; Whitaker, Thomas B; Slate, Andrew B; Giesbrecht, Francis
2006-01-01
The variability associated with the aflatoxin test procedure used to estimate aflatoxin levels in bulk shipments of hazelnuts was investigated. Sixteen 10 kg samples of shelled hazelnuts were taken from each of 20 lots that were suspected of aflatoxin contamination. The total variance associated with testing shelled hazelnuts was estimated and partitioned into sampling, sample preparation, and analytical variance components. Each variance component increased as aflatoxin concentration (either B1 or total) increased. With the use of regression analysis, mathematical expressions were developed to model the relationship between aflatoxin concentration and the total, sampling, sample preparation, and analytical variances. The expressions for these relationships were used to estimate the variance for any sample size, subsample size, and number of analyses for a specific aflatoxin concentration. The sampling, sample preparation, and analytical variances associated with estimating aflatoxin in a hazelnut lot at a total aflatoxin level of 10 ng/g and using a 10 kg sample, a 50 g subsample, dry comminution with a Robot Coupe mill, and a high-performance liquid chromatographic analytical method are 174.40, 0.74, and 0.27, respectively. The sampling, sample preparation, and analytical steps of the aflatoxin test procedure accounted for 99.4, 0.4, and 0.2% of the total variability, respectively.
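The percentage breakdown reported at the end of this abstract follows directly from the three variance components. The sketch below redoes that arithmetic; the function name `variance_shares` is hypothetical, and the input values are the ones quoted in the abstract.

```python
def variance_shares(components):
    """Return each variance component as a percentage of the total variance."""
    total = sum(components.values())
    return {name: 100.0 * value / total for name, value in components.items()}

# Variance components quoted in the abstract (10 ng/g total aflatoxin, 10 kg sample,
# 50 g subsample, one HPLC analysis)
reported = {"sampling": 174.40, "sample preparation": 0.74, "analytical": 0.27}
shares = variance_shares(reported)
```

Rounded to one decimal place, the shares match the 99.4, 0.4, and 0.2% given in the abstract, which shows that sampling dominates the total variability.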
NASA Astrophysics Data System (ADS)
Giorli, Giacomo; Drazen, Jeffrey C.; Neuheimer, Anna B.; Copeland, Adrienne; Au, Whitlow W. L.
2018-01-01
Pelagic animals that form deep sea scattering layers (DSLs) represent an important link in the food web between zooplankton and top predators. While estimating the composition, density and location of the DSL is important to understand mesopelagic ecosystem dynamics and to predict top predators' distribution, DSL composition and density are often estimated from trawls, which may be subject to extrusion, avoidance, and gear-associated biases. Instead, location and biomass of DSLs can be estimated from active acoustic techniques, though estimates are often in aggregate without regard to size or taxon specific information. For the first time in the open ocean, we used a DIDSON sonar to characterize the fauna in DSLs. Estimates of the numerical density and length of animals at different depths and locations along the Kona coast of the Island of Hawaii were determined. Data were collected below and inside the DSLs with the sonar mounted on a profiler. A total of 7068 animals were counted and sized. We estimated numerical densities ranging from 1 to 7 animals/m³ and individuals as long as 3 m were detected. These numerical densities were orders of magnitude higher than those estimated from trawls and average sizes of animals were much larger as well. A mixed model was used to characterize numerical density and length of animals as a function of deep sea layer sampled, location, time of day, and day of the year. Numerical density and length of animals varied by month, with numerical density also a function of depth. The DIDSON proved to be a good tool for open-ocean/deep-sea estimation of the numerical density and size of marine animals, especially larger ones. Further work is needed to understand how this methodology relates to estimates of volume backscatters obtained with standard echosounding techniques, density measures obtained with other sampling methodologies, and to precisely evaluate sampling biases.
Cui, Zaixu; Gong, Gaolang
2018-06-02
Individualized behavioral/cognitive prediction using machine learning (ML) regression approaches is increasingly applied. The specific ML regression algorithm and sample size are two key factors that non-trivially influence prediction accuracies. However, the effects of the ML regression algorithm and sample size on individualized behavioral/cognitive prediction performance have not been comprehensively assessed. To address this issue, the present study included six commonly used ML regression algorithms: ordinary least squares (OLS) regression, least absolute shrinkage and selection operator (LASSO) regression, ridge regression, elastic-net regression, linear support vector regression (LSVR), and relevance vector regression (RVR), to perform specific behavioral/cognitive predictions based on different sample sizes. Specifically, the publicly available resting-state functional MRI (rs-fMRI) dataset from the Human Connectome Project (HCP) was used, and whole-brain resting-state functional connectivity (rsFC) or rsFC strength (rsFCS) were extracted as prediction features. Twenty-five sample sizes (ranging from 20 to 700) were studied by sub-sampling from the entire HCP cohort. The analyses showed that rsFC-based LASSO regression performed remarkably worse than the other algorithms, and rsFCS-based OLS regression performed markedly worse than the other algorithms. Regardless of the algorithm and feature type, both the prediction accuracy and its stability exponentially increased with increasing sample size. The specific patterns of the observed algorithm and sample size effects were well replicated in the prediction using re-testing fMRI data, data processed by different imaging preprocessing schemes, and different behavioral/cognitive scores, thus indicating excellent robustness/generalization of the effects.
The current findings provide critical insight into how the selected ML regression algorithm and sample size influence individualized predictions of behavior/cognition and offer important guidance for choosing the ML regression algorithm or sample size in relevant investigations. Copyright © 2018 Elsevier Inc. All rights reserved.
Trask, Amanda E; Bignal, Eric M; McCracken, Davy I; Piertney, Stuart B; Reid, Jane M
2017-09-01
A population's effective size (N_e) is a key parameter that shapes rates of inbreeding and loss of genetic diversity, thereby influencing evolutionary processes and population viability. However, estimating N_e, and identifying key demographic mechanisms that underlie the N_e to census population size (N) ratio, remains challenging, especially for small populations with overlapping generations and substantial environmental and demographic stochasticity and hence dynamic age-structure. A sophisticated demographic method of estimating N_e/N, which uses Fisher's reproductive value to account for dynamic age-structure, has been formulated. However, this method requires detailed individual- and population-level data on sex- and age-specific reproduction and survival, and has rarely been implemented. Here, we use the reproductive value method and detailed demographic data to estimate N_e/N for a small and apparently isolated red-billed chough (Pyrrhocorax pyrrhocorax) population of high conservation concern. We additionally calculated two single-sample molecular genetic estimates of N_e to corroborate the demographic estimate and examine evidence for unobserved immigration and gene flow. The demographic estimate of N_e/N was 0.21, reflecting a high total demographic variance (σ²_dg) of 0.71. Females and males made similar overall contributions to σ²_dg. However, contributions varied among sex-age classes, with greater contributions from 3 year-old females than males, but greater contributions from ≥5 year-old males than females. The demographic estimate of N_e was ~30, suggesting that rates of increase of inbreeding and loss of genetic variation per generation will be relatively high. Molecular genetic estimates of N_e computed from linkage disequilibrium and approximate Bayesian computation were approximately 50 and 30, respectively, providing no evidence of substantial unobserved immigration which could bias demographic estimates of N_e.
Our analyses identify key sex-age classes contributing to demographic variance and thus decreasing N_e/N in a small age-structured population inhabiting a variable environment. They thereby demonstrate how assessments of N_e can incorporate stochastic sex- and age-specific demography and elucidate key demographic processes affecting a population's evolutionary trajectory and viability. Furthermore, our analyses show that N_e for the focal chough population is critically small, implying that management to re-establish genetic connectivity may be required to ensure population viability. © 2017 The Authors. Journal of Animal Ecology © 2017 British Ecological Society.
NASA Astrophysics Data System (ADS)
Graham, Mark T.; Cappellari, Michele; Li, Hongyu; Mao, Shude; Bershady, Matthew A.; Bizyaev, Dmitry; Brinkmann, Jonathan; Brownstein, Joel R.; Bundy, Kevin; Drory, Niv; Law, David R.; Pan, Kaike; Thomas, Daniel; Wake, David A.; Weijmans, Anne-Marie; Westfall, Kyle B.; Yan, Renbin
2018-07-01
We measure λ_{R_e}, a proxy for galaxy specific stellar angular momentum within one effective radius, and the ellipticity, ɛ, for about 2300 galaxies of all morphological types observed with integral field spectroscopy as part of the Mapping Nearby Galaxies at Apache Point Observatory survey, the largest such sample to date. We use the (λ_{R_e}, ɛ) diagram to separate early-type galaxies into fast and slow rotators. We also visually classify each galaxy according to its optical morphology and two-dimensional stellar velocity field. Comparing these classifications to quantitative λ_{R_e} measurements reveals tight relationships between angular momentum and galaxy structure. In order to account for atmospheric seeing, we use realistic models of galaxy kinematics to derive a general approximate analytic correction for λ_{R_e}. Thanks to the size of the sample and the large number of massive galaxies, we unambiguously detect a clear bimodality in the (λ_{R_e}, ɛ) diagram which may result from fundamental differences in galaxy assembly history. There is a sharp secondary density peak inside the region of the diagram with low λ_{R_e} and ɛ < 0.4, previously suggested as the definition for slow rotators. Most of these galaxies are visually classified as non-regular rotators and have high velocity dispersion. The intrinsic bimodality must be stronger, as it tends to be smoothed by noise and inclination. The large sample of slow rotators allows us for the first time to unveil a secondary peak at ±90° in their distribution of the misalignments between the photometric and kinematic position angles. We confirm that genuine slow rotators start appearing above M ≥ 2 × 10^11 M⊙ where a significant number of high-mass fast rotators also exist.
NASA Astrophysics Data System (ADS)
Graham, Mark T.; Cappellari, Michele; Li, Hongyu; Mao, Shude; Bershady, Matthew; Bizyaev, Dmitry; Brinkmann, Jonathan; Brownstein, Joel R.; Bundy, Kevin; Drory, Niv; Law, David R.; Pan, Kaike; Thomas, Daniel; Wake, David A.; Weijmans, Anne-Marie; Westfall, Kyle B.; Yan, Renbin
2018-03-01
We measure λ_{R_e}, a proxy for galaxy specific stellar angular momentum within one effective radius, and the ellipticity, ɛ, for about 2300 galaxies of all morphological types observed with integral field spectroscopy as part of the MaNGA survey, the largest such sample to date. We use the (λ_{R_e}, ɛ) diagram to separate early-type galaxies into fast and slow rotators. We also visually classify each galaxy according to its optical morphology and two-dimensional stellar velocity field. Comparing these classifications to quantitative λ_{R_e} measurements reveals tight relationships between angular momentum and galaxy structure. In order to account for atmospheric seeing, we use realistic models of galaxy kinematics to derive a general approximate analytic correction for λ_{R_e}. Thanks to the size of the sample and the large number of massive galaxies, we unambiguously detect a clear bimodality in the (λ_{R_e}, ɛ) diagram which may result from fundamental differences in galaxy assembly history. There is a sharp secondary density peak inside the region of the diagram with low λ_{R_e} and ɛ < 0.4, previously suggested as the definition for slow rotators. Most of these galaxies are visually classified as non-regular rotators and have high velocity dispersion. The intrinsic bimodality must be stronger, as it tends to be smoothed by noise and inclination. The large sample of slow rotators allows us for the first time to unveil a secondary peak at ±90° in their distribution of the misalignments between the photometric and kinematic position angles. We confirm that genuine slow rotators start appearing above M ≥ 2 × 10^11 M⊙ where a significant number of high-mass fast rotators also exist.
Sample Size Methods for Estimating HIV Incidence from Cross-Sectional Surveys
Brookmeyer, Ron
2015-01-01
Understanding HIV incidence, the rate at which new infections occur in populations, is critical for tracking and surveillance of the epidemic. In this paper we derive methods for determining sample sizes for cross-sectional surveys to estimate incidence with sufficient precision. We further show how to specify sample sizes for two successive cross-sectional surveys to detect changes in incidence with adequate power. In these surveys biomarkers such as CD4 cell count, viral load, and recently developed serological assays are used to determine which individuals are in an early disease stage of infection. The total number of individuals in this stage, divided by the number of people who are uninfected, is used to approximate the incidence rate. Our methods account for uncertainty in the durations of time spent in the biomarker defined early disease stage. We find that failure to account for this uncertainty when designing surveys can lead to imprecise estimates of incidence and underpowered studies. We evaluated our sample size methods in simulations and found that they performed well in a variety of underlying epidemics. Code for implementing our methods in R is available with this paper at the Biometrics website on Wiley Online Library. PMID:26302040
Sample size methods for estimating HIV incidence from cross-sectional surveys.
Konikoff, Jacob; Brookmeyer, Ron
2015-12-01
Understanding HIV incidence, the rate at which new infections occur in populations, is critical for tracking and surveillance of the epidemic. In this article, we derive methods for determining sample sizes for cross-sectional surveys to estimate incidence with sufficient precision. We further show how to specify sample sizes for two successive cross-sectional surveys to detect changes in incidence with adequate power. In these surveys biomarkers such as CD4 cell count, viral load, and recently developed serological assays are used to determine which individuals are in an early disease stage of infection. The total number of individuals in this stage, divided by the number of people who are uninfected, is used to approximate the incidence rate. Our methods account for uncertainty in the durations of time spent in the biomarker defined early disease stage. We find that failure to account for this uncertainty when designing surveys can lead to imprecise estimates of incidence and underpowered studies. We evaluated our sample size methods in simulations and found that they performed well in a variety of underlying epidemics. Code for implementing our methods in R is available with this article at the Biometrics website on Wiley Online Library. © 2015, The International Biometric Society.
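The cross-sectional estimator sketched in these two abstracts can be written out in a few lines. The version below is an assumption-laden illustration, not the authors' R code: it uses the common snapshot form in which the count of individuals in the early stage is divided by the number uninfected times the mean duration of that stage (in years), and a rough delta-method coefficient of variation that ignores sampling variability in the uninfected count. The function names and arguments are hypothetical.

```python
import math

def incidence_estimate(n_early, n_uninfected, mean_duration_years):
    """Snapshot incidence estimate: infections per person-year at risk."""
    return n_early / (n_uninfected * mean_duration_years)

def approx_cv(n_early, mean_duration_years, se_duration_years):
    """Rough relative standard error combining count noise and duration uncertainty."""
    # Poisson-type term for the early-stage count
    count_term = 1.0 / n_early
    # Squared relative uncertainty in the mean duration of the early stage
    duration_term = (se_duration_years / mean_duration_years) ** 2
    return math.sqrt(count_term + duration_term)
```

Under these assumptions the duration term does not shrink as the survey grows, which is one way to see why ignoring duration uncertainty at the design stage can leave a study less precise than planned.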
Bootstrap Estimation of Sample Statistic Bias in Structural Equation Modeling.
ERIC Educational Resources Information Center
Thompson, Bruce; Fan, Xitao
This study empirically investigated bootstrap bias estimation in the area of structural equation modeling (SEM). Three correctly specified SEM models were used under four different sample size conditions. Monte Carlo experiments were carried out to generate the criteria against which bootstrap bias estimation should be judged. For SEM fit indices,…
Fuertes, Gustavo; Banterle, Niccolò; Ruff, Kiersten M.; Chowdhury, Aritra; Mercadante, Davide; Koehler, Christine; Kachala, Michael; Estrada Girona, Gemma; Milles, Sigrid; Mishra, Ankur; Onck, Patrick R.; Gräter, Frauke; Esteban-Martín, Santiago; Pappu, Rohit V.; Svergun, Dmitri I.; Lemke, Edward A.
2017-01-01
Unfolded states of proteins and native states of intrinsically disordered proteins (IDPs) populate heterogeneous conformational ensembles in solution. The average sizes of these heterogeneous systems, quantified by the radius of gyration (RG), can be measured by small-angle X-ray scattering (SAXS). Another parameter, the mean dye-to-dye distance (RE) for proteins with fluorescently labeled termini, can be estimated using single-molecule Förster resonance energy transfer (smFRET). A number of studies have reported inconsistencies in inferences drawn from the two sets of measurements for the dimensions of unfolded proteins and IDPs in the absence of chemical denaturants. These differences are typically attributed to the influence of fluorescent labels used in smFRET and to the impact of high concentrations and averaging features of SAXS. By measuring the dimensions of a collection of labeled and unlabeled polypeptides using smFRET and SAXS, we directly assessed the contributions of dyes to the experimental values RG and RE. For chemically denatured proteins we obtain mutual consistency in our inferences based on RG and RE, whereas for IDPs under native conditions, we find substantial deviations. Using computations, we show that discrepant inferences are neither due to methodological shortcomings of specific measurements nor due to artifacts of dyes. Instead, our analysis suggests that chemical heterogeneity in heteropolymeric systems leads to a decoupling between RE and RG that is amplified in the absence of denaturants. Therefore, joint assessments of RG and RE combined with measurements of polymer shapes should provide a consistent and complete picture of the underlying ensembles. PMID:28716919
Parameter Estimation with Small Sample Size: A Higher-Order IRT Model Approach
ERIC Educational Resources Information Center
de la Torre, Jimmy; Hong, Yuan
2010-01-01
Sample size ranks as one of the most important factors that affect the item calibration task. However, due to practical concerns (e.g., item exposure) items are typically calibrated with much smaller samples than what is desired. To address the need for a more flexible framework that can be used in small sample item calibration, this article…
Ibrahim, Mohamed; Wickenhauser, Patrick; Rautek, Peter; Reina, Guido; Hadwiger, Markus
2018-01-01
Molecular dynamics (MD) simulations are crucial to investigating important processes in physics and thermodynamics. The simulated atoms are usually visualized as hard spheres with Phong shading, where individual particles and their local density can be perceived well in close-up views. However, for large-scale simulations with 10 million particles or more, the visualization of large fields-of-view usually suffers from strong aliasing artifacts, because the mismatch between data size and output resolution leads to severe under-sampling of the geometry. Excessive super-sampling can alleviate this problem, but is prohibitively expensive. This paper presents a novel visualization method for large-scale particle data that addresses aliasing while enabling interactive high-quality rendering. We introduce the novel concept of screen-space normal distribution functions (S-NDFs) for particle data. S-NDFs represent the distribution of surface normals that map to a given pixel in screen space, which enables high-quality re-lighting without re-rendering particles. In order to facilitate interactive zooming, we cache S-NDFs in a screen-space mipmap (S-MIP). Together, these two concepts enable interactive, scale-consistent re-lighting and shading changes, as well as zooming, without having to re-sample the particle data. We show how our method facilitates the interactive exploration of real-world large-scale MD simulation data in different scenarios.
Hurwitz, Lisa B.; Schmitt, Kelly L.; Olsen, Megan K.
2017-01-01
Recruiting children and families for research studies can be challenging, and re-recruiting former participants for longitudinal research can be even more difficult, especially when a study was not prospectively designed to encompass continuous data collection. In this article, we explain how researchers can set up initial studies to potentially facilitate later waves of data collection; locate former study participants using newer, often digital, tools; schedule families using recruitment phone/email/mail scripts that highlight the many benefits to continued study participation; and confirm appointments with other digital tools. We draw from prior methodological and longitudinal pieces to provide suggestions to others wishing to re-recruit families for longitudinal studies. In addition, we draw upon our own experience conducting a non-prospective longitudinal study 6 years after an educational intervention, in which we successfully re-located 122 (90%) and interviewed 101 of 136 (83% of the located sample and 74% of the full original sample) parents and their early adolescent children. Although the majority of participants were recruited via original contact information (especially phone numbers), using a range of strategies to recruit (e.g., search engines focused on contact information, social media) and motivate participation (e.g., multifaceted phone/email/mail scheduling scripts, flexibility in location and means of participation) yielded a more desirable sample size at relatively low costs. PMID:28955265
Arnason, T; Albertsdóttir, E; Fikse, W F; Eriksson, S; Sigurdsson, A
2012-02-01
The consequences of assuming a zero environmental covariance between a binary trait 'test-status' and a continuous trait on the estimates of genetic parameters by restricted maximum likelihood and Gibbs sampling and on response from genetic selection when the true environmental covariance deviates from zero were studied. Data were simulated for two traits (one that culling was based on and a continuous trait) using the following true parameters, on the underlying scale: h² = 0.4; r(A) = 0.5; r(E) = 0.5, 0.0 or -0.5. The selection on the continuous trait was applied to five subsequent generations where 25 sires and 500 dams produced 1500 offspring per generation. Mass selection was applied in the analysis of the effect on estimation of genetic parameters. Estimated breeding values were used in the study of the effect of genetic selection on response and accuracy. The culling frequency was either 0.5 or 0.8 within each generation. Each of 10 replicates included 7500 records on 'test-status' and 9600 animals in the pedigree file. Results from bivariate analysis showed unbiased estimates of variance components and genetic parameters when true r(E) = 0.0. For r(E) = 0.5, variance components (13-19% bias) and especially (50-80%) were underestimated for the continuous trait, while heritability estimates were unbiased. For r(E) = -0.5, heritability estimates of test-status were unbiased, while genetic variance and heritability of the continuous trait together with were overestimated (25-50%). The bias was larger for the higher culling frequency. Culling always reduced genetic progress from selection, but the genetic progress was found to be robust to the use of wrong parameter values of the true environmental correlation between test-status and the continuous trait. Use of a bivariate linear-linear model reduced bias in genetic evaluations, when data were subject to culling. © 2011 Blackwell Verlag GmbH.
Estimation and applications of size-biased distributions in forestry
Jeffrey H. Gove
2003-01-01
Size-biased distributions arise naturally in several contexts in forestry and ecology. Simple power relationships (e.g. basal area and diameter at breast height) between variables are one such area of interest arising from a modelling perspective. Another, probability proportional to size (PPS) sampling, is found in the most widely used methods for sampling standing or...
Modelling size-fractionated primary production in the Atlantic Ocean from remote sensing
NASA Astrophysics Data System (ADS)
Brewin, Robert J. W.; Tilstone, Gavin H.; Jackson, Thomas; Cain, Terry; Miller, Peter I.; Lange, Priscila K.; Misra, Ankita; Airs, Ruth L.
2017-11-01
Marine primary production influences the transfer of carbon dioxide between the ocean and atmosphere, and the availability of energy for the pelagic food web. Both the rate and the fate of organic carbon from primary production are dependent on phytoplankton size. A key aim of the Atlantic Meridional Transect (AMT) programme has been to quantify biological carbon cycling in the Atlantic Ocean and measurements of total primary production have been routinely made on AMT cruises, as well as additional measurements of size-fractionated primary production on some cruises. Measurements of total primary production collected on the AMT have been used to evaluate remote-sensing techniques capable of producing basin-scale estimates of primary production. Though models exist to estimate size-fractionated primary production from satellite data, these have not been well validated in the Atlantic Ocean, and have been parameterised using measurements of phytoplankton pigments rather than direct measurements of phytoplankton size structure. Here, we re-tune a remote-sensing primary production model to estimate production in three size fractions of phytoplankton (<2 μm, 2-10 μm and >10 μm) in the Atlantic Ocean, using measurements of size-fractionated chlorophyll and size-fractionated photosynthesis-irradiance experiments conducted on AMT 22 and 23 using sequential filtration-based methods. The performance of the remote-sensing technique was evaluated using: (i) independent estimates of size-fractionated primary production collected on a number of AMT cruises using 14C on-deck incubation experiments and (ii) Monte Carlo simulations. Considering uncertainty in the satellite inputs and model parameters, we estimate an average model error of between 0.27 and 0.63 for log10-transformed size-fractionated production, with lower errors for the small size class (<2 μm), higher errors for the larger size classes (2-10 μm and >10 μm), and errors generally higher in oligotrophic waters. 
Application to satellite data in 2007 suggests the contribution of cells <2 μm and >2 μm to total primary production is approximately equal in the Atlantic Ocean.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 40 Protection of Environment 5 2011-07-01 2011-07-01 false Estimated Mass Concentration... 53—Estimated Mass Concentration Measurement of PM2.5 for Idealized “Typical” Coarse Aerosol Size Distribution Particle Aerodynamic Diameter (µm) Test Sampler Fractional Sampling Effectiveness Interval Mass...
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 5 2010-07-01 2010-07-01 false Estimated Mass Concentration... 53—Estimated Mass Concentration Measurement of PM2.5 for Idealized “Typical” Coarse Aerosol Size Distribution Particle Aerodynamic Diameter (µm) Test Sampler Fractional Sampling Effectiveness Interval Mass...
Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu
2013-01-01
The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability. PMID:23861920
Unfolding sphere size distributions with a density estimator based on Tikhonov regularization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weese, J.; Korat, E.; Maier, D.
1997-12-01
This report proposes a method for unfolding sphere size distributions given a sample of radii that combines the advantages of a density estimator with those of Tikhonov regularization methods. The following topics are discussed in this report: the relation between the profile and the sphere size distribution; the method for unfolding sphere size distributions; the results based on simulations; and the experimental data comparison.
Protocol for monitoring forest-nesting birds in National Park Service parks
Dawson, Deanna K.; Efford, Murray G.
2013-01-01
These documents detail the protocol for monitoring forest-nesting birds in National Park Service parks in the National Capital Region Network (NCRN). In the first year of sampling, counts of birds should be made at 384 points on the NCRN spatially randomized grid, developed to sample terrestrial resources. Sampling should begin on or about May 20 and continue into early July; on each day the sampling period begins at sunrise and ends five hours later. Each point should be counted twice, once in the first half of the field season and once in the second half, with visits made by different observers, balancing the within-season coverage of points and their spatial coverage by observers, and allowing observer differences to be tested. Three observers, skilled in identifying birds of the region by sight and sound and with previous experience in conducting timed counts of birds, will be needed for this effort. Observers should be randomly assigned to ‘routes’ consisting of eight points, in close proximity and, ideally, in similar habitat, that can be covered in one morning. Counts are 10 minutes in length, subdivided into four 2.5-min intervals. Within each time interval, new birds (i.e., those not already detected) are recorded as within or beyond 50 m of the point, based on where first detected. Binomial distance methods are used to calculate annual estimates of density for species. The data are also amenable to estimation of abundance and detection probability via the removal method. Generalized linear models can be used to assess between-year changes in density estimates or unadjusted count data. This level of sampling is expected to be sufficient to detect a 50% decline in 10 years for approximately 50 bird species, including 14 of 19 species that are priorities for conservation efforts, if analyses are based on unadjusted count data, and for 30 species (6 priority species) if analyses are based on density estimates. 
The estimates of required sample sizes are based on the mean number of individuals detected per 10 minutes in available data from surveys in three NCRN parks. Once network-wide data from the first year of sampling are available, this and other aspects of the protocol should be re-assessed, and changes made as desired or necessary before the start of the second field season. Thereafter, changes should not be made to the field methods, and sampling should be conducted annually for at least ten years. NCRN staff should keep apprised of new analytical methods developed for analysis of point-count data.
Graf, Alexandra C; Bauer, Peter
2011-06-30
We calculate the maximum type 1 error rate of the pre-planned conventional fixed sample size test for comparing the means of independent normal distributions (with common known variance) which can be yielded when sample size and allocation rate to the treatment arms can be modified in an interim analysis. Thereby it is assumed that the experimenter fully exploits knowledge of the unblinded interim estimates of the treatment effects in order to maximize the conditional type 1 error rate. The 'worst-case' strategies require knowledge of the unknown common treatment effect under the null hypothesis. Although this is a rather hypothetical scenario it may be approached in practice when using a standard control treatment for which precise estimates are available from historical data. The maximum inflation of the type 1 error rate is substantially larger than derived by Proschan and Hunsberger (Biometrics 1995; 51:1315-1324) for design modifications applying balanced samples before and after the interim analysis. Corresponding upper limits for the maximum type 1 error rate are calculated for a number of situations arising from practical considerations (e.g. restricting the maximum sample size, not allowing sample size to decrease, allowing only increase in the sample size in the experimental treatment). The application is discussed for a motivating example. Copyright © 2011 John Wiley & Sons, Ltd.
Problems with sampling desert tortoises: A simulation analysis based on field data
Freilich, J.E.; Camp, R.J.; Duda, J.J.; Karl, A.E.
2005-01-01
The desert tortoise (Gopherus agassizii) was listed as a U.S. threatened species in 1990 based largely on population declines inferred from mark-recapture surveys of 2.59-km2 (1-mi2) plots. Since then, several census methods have been proposed and tested, but all methods still pose logistical or statistical difficulties. We conducted computer simulations using actual tortoise location data from two 1-mi2 plot surveys in southern California, USA, to identify strengths and weaknesses of current sampling strategies. We considered tortoise population estimates based on these plots as "truth" and then tested various sampling methods based on sampling smaller plots or transect lines passing through the mile squares. Data were analyzed using Schnabel's mark-recapture estimate and program CAPTURE. Experimental subsampling with replacement of the 1-mi2 data using 1-km2 and 0.25-km2 plot boundaries produced data sets of smaller plot sizes, which we compared to estimates from the 1-mi2 plots. We also tested distance sampling by saturating a 1-mi2 site with computer simulated transect lines, once again evaluating bias in density estimates. Subsampling estimates from 1-km2 plots did not differ significantly from the estimates derived at 1-mi2. The 0.25-km2 subsamples significantly overestimated population sizes, chiefly because too few recaptures were made. Distance sampling simulations were biased 80% of the time and had high coefficient of variation to density ratios. Furthermore, a prospective power analysis suggested limited ability to detect population declines as high as 50%. We concluded that poor performance and bias of both sampling procedures was driven by insufficient sample size, suggesting that all efforts must be directed to increasing numbers found in order to produce reliable results. Our results suggest that present methods may not be capable of accurately estimating desert tortoise populations.
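The sensitivity to recapture counts noted above can be illustrated with a minimal sketch of the basic Schnabel multiple-census estimator, N = sum(C_t * M_t) / sum(R_t). The abstract does not say which variant of the estimator the authors used, so the function name, argument names, and the numbers below are hypothetical and for illustration only.

```python
def schnabel_estimate(caught, marked_before, recaptured):
    """Basic Schnabel estimate of population size.

    caught[t]        -- animals caught on occasion t (C_t)
    marked_before[t] -- marked animals at large just before occasion t (M_t)
    recaptured[t]    -- caught animals on occasion t that were already marked (R_t)
    All three sequences are hypothetical inputs for this sketch.
    """
    if not (len(caught) == len(marked_before) == len(recaptured)):
        raise ValueError("all sequences must have the same length")
    total_recaptures = sum(recaptured)
    if total_recaptures == 0:
        raise ValueError("estimate is undefined without any recaptures")
    numerator = sum(c * m for c, m in zip(caught, marked_before))
    return numerator / total_recaptures


# Illustrative numbers only: same catches, but fewer recaptures in the second case.
many_recaptures = schnabel_estimate([20, 25, 22], [0, 20, 40], [0, 5, 8])
few_recaptures = schnabel_estimate([20, 25, 22], [0, 20, 40], [0, 1, 2])
```

Because the recapture total sits in the denominator, a survey with very few recaptures yields a much larger (and less stable) estimate from the same catches, which is consistent with the overestimation reported for the smallest subplots.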
Pecha, Petr; Šmídl, Václav
2016-11-01
A stepwise sequential assimilation algorithm is proposed based on an optimisation approach for recursive parameter estimation and tracking of radioactive plume propagation in the early stage of a radiation accident. Predictions of the radiological situation in each time step of the plume propagation are driven by an existing short-term meteorological forecast and the assimilation procedure manipulates the model parameters to match the observations incoming concurrently from the terrain. Mathematically, the task is a typical ill-posed inverse problem of estimating the parameters of the release. The proposed method is designated as a stepwise re-estimation of the source term release dynamics and an improvement of several input model parameters. It results in a more precise determination of the adversely affected areas in the terrain. The nonlinear least-squares regression methodology is applied for estimation of the unknowns. The fast and adequately accurate segmented Gaussian plume model (SGPM) is used in the first stage of direct (forward) modelling. The subsequent inverse procedure infers (re-estimates) the values of important model parameters from the actual observations. Accuracy and sensitivity of the proposed method for real-time forecasting of the accident propagation is studied. First, a twin experiment generating noiseless simulated "artificial" observations is studied to verify the minimisation algorithm. Second, the impact of the measurement noise on the re-estimated source release rate is examined. In addition, the presented method can be used as a proposal for more advanced statistical techniques using, e.g., importance sampling. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Bozorgzadeh, Nezam; Yanagimura, Yoko; Harrison, John P.
2017-12-01
The Hoek-Brown empirical strength criterion for intact rock is widely used as the basis for estimating the strength of rock masses. Estimations of the intact rock H-B parameters, namely the empirical constant m and the uniaxial compressive strength σc, are commonly obtained by fitting the criterion to triaxial strength data sets of small sample size. This paper investigates how such small sample sizes affect the uncertainty associated with the H-B parameter estimations. We use Monte Carlo (MC) simulation to generate data sets of different sizes and different combinations of H-B parameters, and then investigate the uncertainty in H-B parameters estimated from these limited data sets. We show that the uncertainties depend not only on the level of variability but also on the particular combination of parameters being investigated. As particular combinations of H-B parameters can informally be considered to represent specific rock types, we discuss that as the minimum number of required samples depends on rock type it should correspond to some acceptable level of uncertainty in the estimations. Also, a comparison of the results from our analysis with actual rock strength data shows that the probability of obtaining reliable strength parameter estimations using small samples may be very low. We further discuss the impact of this on ongoing implementation of reliability-based design protocols and conclude with suggestions for improvements in this respect.
Omulo, Sylvia; Lofgren, Eric T; Mugoh, Maina; Alando, Moshe; Obiya, Joshua; Kipyegon, Korir; Kikwai, Gilbert; Gumbi, Wilson; Kariuki, Samuel; Call, Douglas R
2017-05-01
Investigators often rely on studies of Escherichia coli to characterize the burden of antibiotic resistance in a clinical or community setting. To determine if prevalence estimates for antibiotic resistance are sensitive to sample handling and interpretive criteria, we collected presumptive E. coli isolates (24 or 95 per stool sample) from a community in an urban informal settlement in Kenya. Isolates were tested for susceptibility to nine antibiotics using agar breakpoint assays and results were analyzed using generalized linear mixed models. We observed a <3-fold difference between prevalence estimates based on freshly isolated bacteria when compared to isolates collected from unprocessed fecal samples or fecal slurries that had been stored at 4°C for up to 7 days. No time-dependence was evident (P>0.1). Prevalence estimates did not differ for five distinct E. coli colony morphologies on MacConkey agar plates (P>0.2). Successive re-plating of samples for up to five consecutive days had little to no impact on prevalence estimates. Finally, culturing E. coli under different conditions (with 5% CO2 or micro-aerobic) did not affect estimates of prevalence. For the conditions tested in these experiments, minor modifications in sample processing protocols are unlikely to bias estimates of the prevalence of antibiotic-resistance for fecal E. coli. Copyright © 2017 Elsevier B.V. All rights reserved.
Study samples are too small to produce sufficiently precise reliability coefficients.
Charter, Richard A
2003-04-01
In a survey of journal articles, test manuals, and test critique books, the author found that a mean sample size (N) of 260 participants had been used for reliability studies on 742 tests. The distribution was skewed because the median sample size for the total sample was only 90. The median sample sizes for the internal consistency, retest, and interjudge reliabilities were 182, 64, and 36, respectively. The author presented sample size statistics for the various internal consistency methods and types of tests. In general, the author found that the sample sizes that were used in the internal consistency studies were too small to produce sufficiently precise reliability coefficients, which in turn could cause imprecise estimates of examinee true-score confidence intervals. The results also suggest that larger sample sizes have been used in the last decade compared with those that were used in earlier decades.
Humphry, R W; Evans, J; Webster, C; Tongue, S C; Innocent, G T; Gunn, G J
2018-02-01
Antimicrobial resistance is primarily a problem in human medicine but there are unquantified links of transmission in both directions between animal and human populations. Quantitative assessment of the costs and benefits of reduced antimicrobial usage in livestock requires robust quantification of transmission of resistance between animals, the environment and the human population. This in turn requires appropriate measurement of resistance. To tackle this we selected two different methods for determining whether a sample is resistant - one based on screening a sample, the other on testing individual isolates. Our overall objective was to explore the differences arising from choice of measurement. A literature search demonstrated the widespread use of testing of individual isolates. The first aim of this study was to compare, quantitatively, sample level and isolate level screening. Cattle or sheep faecal samples (n=41) submitted for routine parasitology were tested for antimicrobial resistance in two ways: (1) "streak" direct culture onto plates containing the antimicrobial of interest; (2) determination of minimum inhibitory concentration (MIC) of 8-10 isolates per sample compared to published MIC thresholds. Two antibiotics (ampicillin and nalidixic acid) were tested. With ampicillin, direct culture resulted in more than double the number of resistant samples than the MIC method based on eight individual isolates. The second aim of this study was to demonstrate the utility of the observed relationship between these two measures of antimicrobial resistance to re-estimate the prevalence of antimicrobial resistance from a previous study, in which we had used "streak" cultures. Boot-strap methods were used to estimate the proportion of samples that would have tested resistant in the historic study, had we used the isolate-based MIC method instead. 
Our boot-strap results indicate that our estimates of prevalence of antimicrobial resistance would have been considerably lower in the historic study had the MIC method been used. Finally we conclude that there is no single way of defining a sample as resistant to an antimicrobial agent. The method used greatly affects the estimated prevalence of antimicrobial resistance in a sampled population of animals, thus potentially resulting in misleading results. Comparing methods on the same samples allows us to re-estimate the prevalence from other studies, had other methods for determining resistance been used. The results of this study highlight the importance of establishing what the most appropriate measure of antimicrobial resistance is, for the proposed purpose of the results. Copyright © 2017 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
James, David E.; Schraw, Gregory; Kuch, Fred
2015-01-01
We present an equation, derived from standard statistical theory, that can be used to estimate sampling margin of error for student evaluations of teaching (SETs). We use the equation to examine the effect of sample size, response rates and sample variability on the estimated sampling margin of error, and present results in four tables that allow…
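The abstract does not reproduce the authors' equation. As a hedged illustration of how sample size, response rate, and variability interact, the sketch below uses the textbook margin of error for a mean with a finite population correction; this is an assumption about the general form, not the authors' formula, and all names and numbers are hypothetical.

```python
import math


def margin_of_error(sd, n_responses, class_size, z=1.96):
    """Textbook margin of error for a mean rating, with finite population correction.

    sd          -- standard deviation of the ratings (hypothetical input)
    n_responses -- number of students who responded
    class_size  -- number of students enrolled (the finite population)
    z           -- normal critical value (1.96 for roughly 95% confidence)
    """
    if class_size < 2 or not 1 <= n_responses <= class_size:
        raise ValueError("need class_size >= 2 and 1 <= n_responses <= class_size")
    standard_error = sd / math.sqrt(n_responses)
    # The correction shrinks the error as the response rate approaches 100%.
    fpc = math.sqrt((class_size - n_responses) / (class_size - 1))
    return z * standard_error * fpc


# Illustrative comparison: a class of 100 with 25% versus 50% response rates.
moe_25 = margin_of_error(sd=1.0, n_responses=25, class_size=100)
moe_50 = margin_of_error(sd=1.0, n_responses=50, class_size=100)
moe_all = margin_of_error(sd=1.0, n_responses=100, class_size=100)
```

Under these assumptions the margin of error falls as the response rate rises and reaches zero when every enrolled student responds, since no sampling uncertainty remains.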
Stratum variance estimation for sample allocation in crop surveys. [Great Plains Corridor
NASA Technical Reports Server (NTRS)
Perry, C. R., Jr.; Chhikara, R. S. (Principal Investigator)
1980-01-01
The problem of determining stratum variances needed in achieving an optimum sample allocation for crop surveys by remote sensing is investigated by considering an approach based on the concept of stratum variance as a function of the sampling unit size. A methodology using the existing and easily available information of historical crop statistics is developed for obtaining initial estimates of stratum variances. The procedure is applied to estimate stratum variances for wheat in the U.S. Great Plains and is evaluated based on the numerical results thus obtained. It is shown that the proposed technique is viable and performs satisfactorily, with the use of a conservative value for the field size and the crop statistics from the small political subdivision level, when the estimated stratum variances were compared to those obtained using the LANDSAT data.
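To show why stratum variances are needed before sampling, here is a minimal sketch of Neyman allocation, the classical optimum allocation for a fixed total sample size with equal per-unit costs. The abstract does not name the allocation rule used in the report, so this is an assumed stand-in, and the function name and inputs are hypothetical.

```python
import math


def neyman_allocation(total_n, stratum_sizes, stratum_variances):
    """Allocate total_n sampling units across strata in proportion to N_h * S_h.

    stratum_sizes     -- number of sampling units in each stratum (N_h)
    stratum_variances -- initial variance estimate for each stratum (S_h squared)
    Returns unrounded allocations; rounding is left to the caller.
    """
    if len(stratum_sizes) != len(stratum_variances):
        raise ValueError("one variance estimate is needed per stratum")
    weights = [n_h * math.sqrt(var_h)
               for n_h, var_h in zip(stratum_sizes, stratum_variances)]
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one stratum must have positive size and variance")
    return [total_n * w / total_weight for w in weights]


# Two equally sized strata; the second has nine times the variance of the first.
allocation = neyman_allocation(100, [1000, 1000], [1.0, 9.0])
```

With equal stratum sizes, the stratum whose standard deviation is three times larger receives three times as many sampling units, so errors in the initial variance estimates translate directly into a suboptimal allocation.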
Dunham, Kylee; Grand, James B.
2016-01-01
We examined the effects of complexity and priors on the accuracy of models used to estimate ecological and observational processes, and to make predictions regarding population size and structure. State-space models are useful for estimating complex, unobservable population processes and making predictions about future populations based on limited data. To better understand the utility of state space models in evaluating population dynamics, we used them in a Bayesian framework and compared the accuracy of models with differing complexity, with and without informative priors using sequential importance sampling/resampling (SISR). Count data were simulated for 25 years using known parameters and observation process for each model. We used kernel smoothing to reduce the effect of particle depletion, which is common when estimating both states and parameters with SISR. Models using informative priors estimated parameter values and population size with greater accuracy than their non-informative counterparts. While the estimates of population size and trend did not suffer greatly in models using non-informative priors, the algorithm was unable to accurately estimate demographic parameters. This model framework provides reasonable estimates of population size when little to no information is available; however, when information on some vital rates is available, SISR can be used to obtain more precise estimates of population size and process. Incorporating model complexity such as that required by structured populations with stage-specific vital rates affects precision and accuracy when estimating latent population variables and predicting population dynamics. These results are important to consider when designing monitoring programs and conservation efforts requiring management of specific population segments.
Zeng, Yaohui; Singh, Sachinkumar; Wang, Kai
2017-01-01
Pharmacodynamic studies that use methacholine challenge to assess bioequivalence of generic and innovator albuterol formulations are generally designed per published Food and Drug Administration guidance, with 3 reference doses and 1 test dose (3‐by‐1 design). These studies are challenging and expensive to conduct, typically requiring large sample sizes. We proposed 14 modified study designs as alternatives to the Food and Drug Administration–recommended 3‐by‐1 design, hypothesizing that adding reference and/or test doses would reduce sample size and cost. We used Monte Carlo simulation to estimate sample size. Simulation inputs were selected based on published studies and our own experience with this type of trial. We also estimated effects of these modified study designs on study cost. Most of these altered designs reduced sample size and cost relative to the 3‐by‐1 design, some decreasing cost by more than 40%. The most effective single study dose to add was 180 μg of test formulation, which resulted in an estimated 30% relative cost reduction. Adding a single test dose of 90 μg was less effective, producing only a 13% cost reduction. Adding a lone reference dose of either 180, 270, or 360 μg yielded little benefit (less than 10% cost reduction), whereas adding 720 μg resulted in a 19% cost reduction. Of the 14 study design modifications we evaluated, the most effective was addition of both a 90‐μg test dose and a 720‐μg reference dose (42% cost reduction). Combining a 180‐μg test dose and a 720‐μg reference dose produced an estimated 36% cost reduction. PMID:29281130
Joint inversion of NMR and SIP data to estimate pore size distribution of geomaterials
NASA Astrophysics Data System (ADS)
Niu, Qifei; Zhang, Chi
2018-03-01
There is growing interest in using geophysical tools to characterize the microstructure of geomaterials because of their non-invasive nature and their applicability in the field. In these applications, multiple types of geophysical data sets are usually processed separately, which may be inadequate to constrain the key features of target variables. Therefore, simultaneous processing of multiple data sets could potentially improve the resolution. In this study, we propose a method to estimate pore size distribution by joint inversion of nuclear magnetic resonance (NMR) T2 relaxation and spectral induced polarization (SIP) spectra. The petrophysical relation between NMR T2 relaxation time and SIP relaxation time is incorporated in a nonlinear least squares problem formulation, which is solved using the Gauss-Newton method. The joint inversion scheme is applied to a synthetic sample and a Berea sandstone sample. The jointly estimated pore size distributions are very close to the true model and to results from other experimental methods. Even when the knowledge of the petrophysical models of the sample is incomplete, the joint inversion can still capture the main features of the pore size distribution of the samples, including the general shape and relative peak positions of the distribution curves. It is also found from the numerical example that the surface relaxivity of the sample could be extracted with the joint inversion of NMR and SIP data if the diffusion coefficient of the ions in the electrical double layer is known. Compared to individual inversions, the joint inversion could improve the resolution of the estimated pore size distribution because of the addition of extra data sets. The proposed approach might constitute a first step towards a comprehensive joint inversion that can extract the full pore geometry information of a geomaterial from NMR and SIP data.
Sample size in studies on diagnostic accuracy in ophthalmology: a literature survey.
Bochmann, Frank; Johnson, Zoe; Azuara-Blanco, Augusto
2007-07-01
To assess the sample sizes used in studies on diagnostic accuracy in ophthalmology. Design and sources: A survey of literature published in 2005. The frequency of reporting calculations of sample sizes and the samples' sizes were extracted from the published literature. A manual search of five leading clinical journals in ophthalmology with the highest impact (Investigative Ophthalmology and Visual Science, Ophthalmology, Archives of Ophthalmology, American Journal of Ophthalmology and British Journal of Ophthalmology) was conducted by two independent investigators. A total of 1698 articles were identified, of which 40 studies were on diagnostic accuracy. One study reported that sample size was calculated before initiating the study. Another study reported consideration of sample size without calculation. The mean (SD) sample size of all diagnostic studies was 172.6 (218.9). The median prevalence of the target condition was 50.5%. Only a few studies consider sample size in their methods. Inadequate sample sizes in diagnostic accuracy studies may result in misleading estimates of test accuracy. An improvement over the current standards on the design and reporting of diagnostic studies is warranted.
López-Corbeto, Evelin; González, Victoria; Casabona, Jordi
Chlamydia trachomatis infection is the most common bacterial sexually transmitted disease. Re-infections are a major problem in its control as they increase the probability of developing sequelae. To estimate the prevalence of C. trachomatis and re-infection rate after 6 months of treatment by determining the possible causes. Cross-sectional study in which a urine sample was analysed by PCR in a convenience sample of 506 sexually active youths aged 16-25 years. An epidemiological survey and re-test was performed at 3 months. The prevalence of C. trachomatis was 8.5%. The age (OR=2.34; 95% CI: 1.21-4.55) and concurrency (OR=3.64; 95% CI: 3.58-26.39) were determining factors for acquiring C. trachomatis. The re-infection rate was 10.34%. The high prevalence of C. trachomatis, as well as the rate of reinfection, suggest the need to assess the effectiveness of the opportunistic screening program and ensure high levels of reporting of sexual partners. Ensuring these approaches facilitate the control of C. trachomatis among young people. Copyright © 2015 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
An estimate of field size distributions for selected sites in the major grain producing countries
NASA Technical Reports Server (NTRS)
Podwysocki, M. H.
1977-01-01
The field size distributions for the major grain producing countries of the World were estimated. LANDSAT-1 and 2 images were evaluated for two areas each in the United States, People's Republic of China, and the USSR. One scene each was evaluated for France, Canada, and India. Grid sampling was done for representative sub-samples of each image, measuring the long and short axes of each field; area was then calculated. Each of the resulting data sets was computer analyzed for their frequency distributions. Nearly all frequency distributions were highly peaked and skewed (shifted) towards small values, approaching that of either a Poisson or log-normal distribution. The data were normalized by a log transformation, creating a Gaussian distribution which has moments readily interpretable and useful for estimating the total population of fields. Resultant predictors of the field size estimates are discussed.
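The log-transformation step described above can be sketched as follows, assuming the field areas are approximately log-normal (one of the two shapes the abstract mentions). The function name and the returned keys are hypothetical, and the back-transformed mean uses the standard log-normal identity exp(mu + sigma^2 / 2) rather than anything stated in the report.

```python
import math
import statistics


def lognormal_summary(field_sizes):
    """Summarise positive field sizes through the moments of their logarithms."""
    if any(x <= 0 for x in field_sizes):
        raise ValueError("field sizes must be positive to take logarithms")
    logs = [math.log(x) for x in field_sizes]
    mu = statistics.fmean(logs)          # mean of the log-transformed sizes
    sigma2 = statistics.pvariance(logs)  # population variance of the logs
    return {
        "log_mean": mu,
        "log_variance": sigma2,
        "median": math.exp(mu),               # back-transformed median
        "mean": math.exp(mu + sigma2 / 2.0),  # back-transformed mean
    }


# Toy input chosen so that the log-scale mean and variance are both close to 1.
summary = lognormal_summary([1.0, math.exp(2.0)])
```

The back-transformed mean exceeds the median whenever the log-scale variance is positive, which reflects the skew towards small field sizes that the abstract reports.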
ERIC Educational Resources Information Center
Granville, Arthur; And Others
This interim report re-examines data on instrument suitability, comparability of groups, and adequacy of sample size in Year III of the process evaluation of Project Developmental Continuity (PDC) and offers preliminary recommendations concerning the feasibility of continuing the impact study. PDC is a Head Start demonstration program aimed at…
DISTANCES TO DARK CLOUDS: COMPARING EXTINCTION DISTANCES TO MASER PARALLAX DISTANCES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Foster, Jonathan B.; Jackson, James M.; Stead, Joseph J.
We test two different methods of using near-infrared extinction to estimate distances to dark clouds in the first quadrant of the Galaxy using large near-infrared (Two Micron All Sky Survey and UKIRT Infrared Deep Sky Survey) surveys. Very long baseline interferometry parallax measurements of masers around massive young stars provide the most direct and bias-free measurement of the distance to these dark clouds. We compare the extinction distance estimates to these maser parallax distances. We also compare these distances to kinematic distances, including recent re-calibrations of the Galactic rotation curve. The extinction distance methods agree with the maser parallax distances (within the errors) between 66% and 100% of the time (depending on method and input survey) and between 85% and 100% of the time outside of the crowded Galactic center. Although the sample size is small, extinction distance methods reproduce maser parallax distances better than kinematic distances; furthermore, extinction distance methods do not suffer from the kinematic distance ambiguity. This validation gives us confidence that these extinction methods may be extended to additional dark clouds where maser parallaxes are not available.
On the relationship between tumour growth rate and survival in non-small cell lung cancer.
Mistry, Hitesh B
2017-01-01
A recurrent question within oncology drug development is predicting phase III outcome for a new treatment using early clinical data. One approach to tackle this problem has been to derive metrics from mathematical models that describe tumour size dynamics termed re-growth rate and time to tumour re-growth. They have shown to be strong predictors of overall survival in numerous studies but there is debate about how these metrics are derived and if they are more predictive than empirical end-points. This work explores the issues raised in using model-derived metric as predictors for survival analyses. Re-growth rate and time to tumour re-growth were calculated for three large clinical studies by forward and reverse alignment. The latter involves re-aligning patients to their time of progression. Hence, it accounts for the time taken to estimate re-growth rate and time to tumour re-growth but also assesses if these predictors correlate to survival from the time of progression. I found that neither re-growth rate nor time to tumour re-growth correlated to survival using reverse alignment. This suggests that the dynamics of tumours up until disease progression has no relationship to survival post progression. For prediction of a phase III trial I found the metrics performed no better than empirical end-points. These results highlight that care must be taken when relating dynamics of tumour imaging to survival and that bench-marking new approaches to existing ones is essential.
Estill, Cheryl Fairfield; Baron, Paul A.; Beard, Jeremy K.; Hein, Misty J.; Larsen, Lloyd D.; Rose, Laura; Schaefer, Frank W.; Noble-Wang, Judith; Hodges, Lisa; Lindquist, H. D. Alan; Deye, Gregory J.; Arduino, Matthew J.
2009-01-01
After the 2001 anthrax incidents, surface sampling techniques for biological agents were found to be inadequately validated, especially at low surface loadings. We aerosolized Bacillus anthracis Sterne spores within a chamber to achieve very low surface loading (ca. 3, 30, and 200 CFU per 100 cm2). Steel and carpet coupons seeded in the chamber were sampled with swab (103 cm2) or wipe or vacuum (929 cm2) surface sampling methods and analyzed at three laboratories. Agar settle plates (60 cm2) were the reference for determining recovery efficiency (RE). The minimum estimated surface concentrations to achieve a 95% response rate based on probit regression were 190, 15, and 44 CFU/100 cm2 for sampling steel surfaces and 40, 9.2, and 28 CFU/100 cm2 for sampling carpet surfaces with swab, wipe, and vacuum methods, respectively; however, these results should be cautiously interpreted because of high observed variability. Mean REs at the highest surface loading were 5.0%, 18%, and 3.7% on steel and 12%, 23%, and 4.7% on carpet for the swab, wipe, and vacuum methods, respectively. Precision (coefficient of variation) was poor at the lower surface concentrations but improved with increasing surface concentration. The best precision was obtained with wipe samples on carpet, achieving 38% at the highest surface concentration. The wipe sampling method detected B. anthracis at lower estimated surface concentrations and had higher RE and better precision than the other methods. These results may guide investigators to more meaningfully conduct environmental sampling, quantify contamination levels, and conduct risk assessment for humans. PMID:19429546
Estill, Cheryl Fairfield; Baron, Paul A; Beard, Jeremy K; Hein, Misty J; Larsen, Lloyd D; Rose, Laura; Schaefer, Frank W; Noble-Wang, Judith; Hodges, Lisa; Lindquist, H D Alan; Deye, Gregory J; Arduino, Matthew J
2009-07-01
After the 2001 anthrax incidents, surface sampling techniques for biological agents were found to be inadequately validated, especially at low surface loadings. We aerosolized Bacillus anthracis Sterne spores within a chamber to achieve very low surface loading (ca. 3, 30, and 200 CFU per 100 cm(2)). Steel and carpet coupons seeded in the chamber were sampled with swab (103 cm(2)) or wipe or vacuum (929 cm(2)) surface sampling methods and analyzed at three laboratories. Agar settle plates (60 cm(2)) were the reference for determining recovery efficiency (RE). The minimum estimated surface concentrations to achieve a 95% response rate based on probit regression were 190, 15, and 44 CFU/100 cm(2) for sampling steel surfaces and 40, 9.2, and 28 CFU/100 cm(2) for sampling carpet surfaces with swab, wipe, and vacuum methods, respectively; however, these results should be cautiously interpreted because of high observed variability. Mean REs at the highest surface loading were 5.0%, 18%, and 3.7% on steel and 12%, 23%, and 4.7% on carpet for the swab, wipe, and vacuum methods, respectively. Precision (coefficient of variation) was poor at the lower surface concentrations but improved with increasing surface concentration. The best precision was obtained with wipe samples on carpet, achieving 38% at the highest surface concentration. The wipe sampling method detected B. anthracis at lower estimated surface concentrations and had higher RE and better precision than the other methods. These results may guide investigators to more meaningfully conduct environmental sampling, quantify contamination levels, and conduct risk assessment for humans.
Long-term effective population size dynamics of an intensively monitored vertebrate population
Mueller, A-K; Chakarov, N; Krüger, O; Hoffman, J I
2016-01-01
Long-term genetic data from intensively monitored natural populations are important for understanding how effective population sizes (Ne) can vary over time. We therefore genotyped 1622 common buzzard (Buteo buteo) chicks sampled over 12 consecutive years (2002–2013 inclusive) at 15 microsatellite loci. This data set allowed us to both compare single-sample with temporal approaches and explore temporal patterns in the effective number of parents that produced each cohort in relation to the observed population dynamics. We found reasonable consistency between linkage disequilibrium-based single-sample and temporal estimators, particularly during the latter half of the study, but no clear relationship between annual Ne estimates and census sizes. We also documented a 14-fold increase in the annual Ne estimate between 2008 and 2011, a period during which the census size doubled, probably reflecting a combination of higher adult survival and immigration from further afield. Our study thus reveals appreciable temporal heterogeneity in the effective population size of a natural vertebrate population, confirms the need for long-term studies and cautions against drawing conclusions from a single sample. PMID:27553455
Amazon river dolphins (Inia geoffrensis) use a high-frequency short-range biosonar.
Ladegaard, Michael; Jensen, Frants Havmand; de Freitas, Mafalda; Ferreira da Silva, Vera Maria; Madsen, Peter Teglberg
2015-10-01
Toothed whales produce echolocation clicks with source parameters related to body size; however, it may be equally important to consider the influence of habitat, as suggested by studies on echolocating bats. A few toothed whale species have fully adapted to river systems, where sonar operation is likely to result in higher clutter and reverberation levels than those experienced by most toothed whales at sea because of the shallow water and dense vegetation. To test the hypothesis that habitat shapes the evolution of toothed whale biosonar parameters by promoting simpler auditory scenes to interpret in acoustically complex habitats, echolocation clicks of wild Amazon river dolphins were recorded using a vertical seven-hydrophone array. We identified 404 on-axis biosonar clicks having a mean SLpp of 190.3 ± 6.1 dB re. 1 µPa, mean SLEFD of 132.1 ± 6.0 dB re. 1 µPa(2)s, mean Fc of 101.2 ± 10.5 kHz, mean BWRMS of 29.3 ± 4.3 kHz and mean ICI of 35.1 ± 17.9 ms. Piston fit modelling resulted in an estimated half-power beamwidth of 10.2 deg (95% CI: 9.6-10.5 deg) and directivity index of 25.2 dB (95% CI: 24.9-25.7 dB). These results support the hypothesis that river-dwelling toothed whales operate their biosonars at lower amplitude and higher sampling rates than similar-sized marine species without sacrificing high directivity, in order to provide high update rates in acoustically complex habitats and simplify auditory scenes through reduced clutter and reverberation levels. We conclude that habitat, along with body size, is an important evolutionary driver of source parameters in toothed whale biosonars. © 2015. Published by The Company of Biologists Ltd.
The re-identification risk of Canadians from longitudinal demographics
2011-01-01
Background The public is less willing to allow their personal health information to be disclosed for research purposes if they do not trust researchers and how researchers manage their data. However, the public is more comfortable with their data being used for research if the risk of re-identification is low. There are few studies on the risk of re-identification of Canadians from their basic demographics, and no studies on their risk from their longitudinal data. Our objective was to estimate the risk of re-identification from the basic cross-sectional and longitudinal demographics of Canadians. Methods Uniqueness is a common measure of re-identification risk. Demographic data on a 25% random sample of the population of Montreal were analyzed to estimate population uniqueness on postal code, date of birth, and gender as well as their generalizations, for periods ranging from 1 year to 11 years. Results Almost 98% of the population was unique on full postal code, date of birth and gender: these three variables are effectively a unique identifier for Montrealers. Uniqueness increased for longitudinal data. Considerable generalization was required to reach acceptably low uniqueness levels, especially for longitudinal data. Detailed guidelines and disclosure policies on how to ensure that the re-identification risk is low are provided. Conclusions A large percentage of Montreal residents are unique on basic demographics. For non-longitudinal data sets, the three character postal code, gender, and month/year of birth represent sufficiently low re-identification risk. Data custodians need to generalize their demographic information further for longitudinal data sets. PMID:21696636
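The uniqueness measure and the effect of generalization can be sketched as follows. This is an illustrative example on hypothetical records, not the study's estimation procedure: it computes uniqueness within the records given, whereas the study estimated population uniqueness from a 25% sample. The field names and the `generalize` helper are assumptions.

```python
from collections import Counter


def uniqueness(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination occurs once."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)


def generalize(record):
    """Coarsen a record: three-character postal code, month/year of birth."""
    out = dict(record)
    out["postal_code"] = record["postal_code"][:3]
    out["date_of_birth"] = record["date_of_birth"][:7]  # keep "YYYY-MM"
    return out


QI = ["postal_code", "date_of_birth", "gender"]

# Hypothetical records, for illustration only.
records = [
    {"postal_code": "H2X 1Y4", "date_of_birth": "1980-03-14", "gender": "F"},
    {"postal_code": "H2X 3K2", "date_of_birth": "1980-03-02", "gender": "F"},
    {"postal_code": "H3A 0G4", "date_of_birth": "1975-11-30", "gender": "M"},
    {"postal_code": "H3A 0G4", "date_of_birth": "1975-11-30", "gender": "M"},
]

print(uniqueness(records, QI))
print(uniqueness([generalize(r) for r in records], QI))
```

In this toy data set the first two records are unique on the full quasi-identifiers but share a combination once the postal code and birth date are generalized.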
Kroll, Lars Eric; Schumann, Maria; Müters, Stephan; Lampert, Thomas
2017-12-01
Nationwide health surveys can be used to estimate regional differences in health. Using traditional estimation techniques, the spatial depth for these estimates is limited due to the constrained sample size. So far - without special refreshment samples - results have only been available for larger populated federal states of Germany. An alternative is regression-based small-area estimation techniques. These models can generate smaller-scale data, but are also subject to greater statistical uncertainties because of the model assumptions. In the present article, exemplary regionalized results based on the studies "Gesundheit in Deutschland aktuell" (GEDA studies) 2009, 2010 and 2012, are compared to the self-rated health status of the respondents. The aim of the article is to analyze the range of regional estimates in order to assess the usefulness of the techniques for health reporting more adequately. The results show that the estimated prevalence is relatively stable when using different samples. Important determinants of the variation of the estimates are the achieved sample size on the district level and the type of the district (cities vs. rural regions). Overall, the present study shows that small-area modeling of prevalence is associated with additional uncertainties compared to conventional estimates, which should be taken into account when interpreting the corresponding findings.
Jiang, Wenyu; Simon, Richard
2007-12-20
This paper first provides a critical review on some existing methods for estimating the prediction error in classifying microarray data where the number of genes greatly exceeds the number of specimens. Special attention is given to the bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ln. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate for the prediction error. Even with small samples, it does not suffer from large upward bias as the leave-one-out bootstrap and the 0.632+ bootstrap, and it does not suffer from large variability as the leave-one-out cross-validation in microarray applications. Copyright (c) 2007 John Wiley & Sons, Ltd.
Modeling misidentification errors that result from use of genetic tags in capture-recapture studies
Yoshizaki, J.; Brownie, C.; Pollock, K.H.; Link, W.A.
2011-01-01
Misidentification of animals is potentially important when naturally existing features (natural tags) such as DNA fingerprints (genetic tags) are used to identify individual animals. For example, when misidentification leads to multiple identities being assigned to an animal, traditional estimators tend to overestimate population size. Accounting for misidentification in capture-recapture models requires detailed understanding of the mechanism. Using genetic tags as an example, we outline a framework for modeling the effect of misidentification in closed population studies when individual identification is based on natural tags that are consistent over time (non-evolving natural tags). We first assume a single sample is obtained per animal for each capture event, and then generalize to the case where multiple samples (such as hair or scat samples) are collected per animal per capture occasion. We introduce methods for estimating population size and, using a simulation study, we show that our new estimators perform well for cases with moderately high capture probabilities or high misidentification rates. In contrast, conventional estimators can seriously overestimate population size when errors due to misidentification are ignored. © 2009 Springer Science+Business Media, LLC.
Estimating the Availability of Potential Homes for Unwanted Horses in the United States
Weiss, Emily; Dolan, Emily D.; Mohan-Gibbons, Heather; Gramann, Shannon; Slater, Margaret R.
2017-01-01
Simple Summary There are approximately 200,000 unwanted horses annually in the United States. Many are shipped to slaughter, enter rescue facilities, or are held on federal lands. This study aimed to estimate a potential number of available homes for unwanted horses in order to examine broadly the viability of pursuing re-homing policies as an option for the thousands of unwanted horses in the U.S. The results of this survey suggest there could be an estimated 1.2 million homes who have both the perceived resources and desire to house an unwanted horse. This number exceeds the approximately 200,000 unwanted horses living each year in the United States. These data suggest that efforts to reduce unwanted horses could involve matching such horses with adoptive homes and enhancing opportunities to keep horses in the homes they already have. Abstract There are approximately 200,000 unwanted horses annually in the United States. This study aimed to better understand the potential homes for horses that need to be re-homed. Using an independent survey company through an Omnibus telephone (land and cell) survey, we interviewed a nationally projectable sample of 3036 adults (using both landline and cellular phone numbers) to learn of their interest and capacity to adopt a horse. Potential adopters with interest in horses with medical and/or behavioral problems and self-assessed perceived capacity to adopt, constituted 0.92% of the total sample. Extrapolating the results of this survey using U.S. Census data, suggests there could be an estimated 1.25 million households who have both the self-reported and perceived resources and desire to house an unwanted horse. This number exceeds the estimated number of unwanted horses living each year in the United States. 
This study points to opportunities and need to increase communication and support between individuals and organizations that have unwanted horses to facilitate re-homing with people in their community willing to adopt them. PMID:28726730
How accurate is the Pearson r-from-Z approximation? A Monte Carlo simulation study.
Hittner, James B; May, Kim
2012-01-01
The Pearson r-from-Z approximation estimates the sample correlation (as an effect size measure) from the ratio of two quantities: the standard normal deviate equivalent (Z-score) corresponding to a one-tailed p-value divided by the square root of the total (pooled) sample size. The formula has utility in meta-analytic work when reports of research contain minimal statistical information. Although simple to implement, the accuracy of the Pearson r-from-Z approximation has not been empirically evaluated. To address this omission, we performed a series of Monte Carlo simulations. Results indicated that in some cases the formula did accurately estimate the sample correlation. However, when sample size was very small (N = 10) and effect sizes were small to small-moderate (ds of 0.1 and 0.3), the Pearson r-from-Z approximation was very inaccurate. Detailed figures that provide guidance as to when the Pearson r-from-Z formula will likely yield valid inferences are presented.
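The formula described above, r ≈ Z / √N, can be written as a short sketch. The function name is hypothetical, and the conversion from a one-tailed p-value to Z uses the standard normal quantile function from the Python standard library.

```python
import math
from statistics import NormalDist


def r_from_z(p_one_tailed, n_total):
    """Approximate the sample correlation from a one-tailed p-value.

    Z is the standard normal deviate for the one-tailed p-value, and
    n_total is the total (pooled) sample size.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    z = NormalDist().inv_cdf(1.0 - p_one_tailed)
    return z / math.sqrt(n_total)


# A one-tailed p of 0.025 (Z of about 1.96) with N = 100 gives r near 0.196.
print(round(r_from_z(0.025, 100), 3))
```

As the abstract notes, this approximation can be very inaccurate when N is very small (for example N = 10) and the effect is small, so the output should be read as a rough estimate.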
Inference from single occasion capture experiments using genetic markers.
Hettiarachchige, Chathurika K H; Huggins, Richard M
2018-05-01
Accurate estimation of the size of animal populations is an important task in ecological science. Recent advances in the field of molecular genetics researches allow the use of genetic data to estimate the size of a population from a single capture occasion rather than repeated occasions as in the usual capture-recapture experiments. Estimating the population size using genetic data also has sometimes led to estimates that differ markedly from each other and also from classical capture-recapture estimates. Here, we develop a closed form estimator that uses genetic information to estimate the size of a population consisting of mothers and daughters, focusing on estimating the number of mothers, using data from a single sample. We demonstrate the estimator is consistent and propose a parametric bootstrap to estimate the standard errors. The estimator is evaluated in a simulation study and applied to real data. We also consider maximum likelihood in this setting and discover problems that preclude its general use. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Small-Sample DIF Estimation Using SIBTEST, Cochran's Z, and Log-Linear Smoothing
ERIC Educational Resources Information Center
Lei, Pui-Wa; Li, Hongli
2013-01-01
Minimum sample sizes of about 200 to 250 per group are often recommended for differential item functioning (DIF) analyses. However, there are times when sample sizes for one or both groups of interest are smaller than 200 due to practical constraints. This study attempts to examine the performance of Simultaneous Item Bias Test (SIBTEST),…
Conditional Optimal Design in Three- and Four-Level Experiments
ERIC Educational Resources Information Center
Hedges, Larry V.; Borenstein, Michael
2014-01-01
The precision of estimates of treatment effects in multilevel experiments depends on the sample sizes chosen at each level. It is often desirable to choose sample sizes at each level to obtain the smallest variance for a fixed total cost, that is, to obtain optimal sample allocation. This article extends previous results on optimal allocation to…
Paek, Insu
2015-01-01
The effect of guessing on the point estimate of coefficient alpha has been studied in the literature, but the impact of guessing and its interactions with other test characteristics on the interval estimators for coefficient alpha has not been fully investigated. This study examined the impact of guessing and its interactions with other test characteristics on four confidence interval (CI) procedures for coefficient alpha in terms of coverage rate (CR), length, and the degree of asymmetry of CI estimates. In addition, interval estimates of coefficient alpha when data follow the essentially tau-equivalent condition were investigated as a supplement to the case of dichotomous data with examinee guessing. For dichotomous data with guessing, the results did not reveal salient negative effects of guessing and its interactions with other test characteristics (sample size, test length, coefficient alpha levels) on CR and the degree of asymmetry, but the effect of guessing was salient as a main effect and an interaction effect with sample size on the length of the CI estimates, making longer CI estimates as guessing increases, especially when combined with a small sample size. Other important effects (e.g., CI procedures on CR) are also discussed. PMID:29795863
Human body mass estimation: a comparison of "morphometric" and "mechanical" methods.
Auerbach, Benjamin M; Ruff, Christopher B
2004-12-01
In the past, body mass was reconstructed from hominin skeletal remains using both "mechanical" methods which rely on the support of body mass by weight-bearing skeletal elements, and "morphometric" methods which reconstruct body mass through direct assessment of body size and shape. A previous comparison of two such techniques, using femoral head breadth (mechanical) and stature and bi-iliac breadth (morphometric), indicated a good general correspondence between them (Ruff et al. [1997] Nature 387:173-176). However, the two techniques were never systematically compared across a large group of modern humans of diverse body form. This study incorporates skeletal measures taken from 1,173 Holocene adult individuals, representing diverse geographic origins, body sizes, and body shapes. Femoral head breadth, bi-iliac breadth (after pelvic rearticulation), and long bone lengths were measured on each individual. Statures were estimated from long bone lengths using appropriate reference samples. Body masses were calculated using three available femoral head breadth (FH) formulae and the stature/bi-iliac breadth (STBIB) formula, and compared. All methods yielded similar results. Correlations between FH estimates and STBIB estimates are 0.74-0.81. Slight differences in results between the three FH estimates can be attributed to sampling differences in the original reference samples, and in particular, the body-size ranges included in those samples. There is no evidence for systematic differences in results due to differences in body proportions. Since the STBIB method was validated on other samples, and the FH methods produced similar estimates, this argues that either may be applied to skeletal remains with some confidence. 2004 Wiley-Liss, Inc.
Fleming, A; Schenkel, F S; Koeck, A; Malchiodi, F; Ali, R A; Corredig, M; Mallard, B; Sargolzaei, M; Miglior, F
2017-05-01
The objective of this study was to estimate the heritability of milk fat globule (MFG) size and mid-infrared (MIR) predicted MFG size in Holstein cattle. The genetic correlations between measured and predicted MFG size with milk fat and protein percentage were also investigated. Average MFG size was measured in 1,583 milk samples taken from 254 Holstein cows from 29 herds across Canada. Size was expressed as volume moment mean (D[4,3]) and surface moment mean (D[3,2]). Analyzed milk samples also had average MFG size predicted from their MIR spectral records. Fat and protein percentages were obtained for all test-day milk samples in the cow's lactation. Univariate and bivariate repeatability animal models were used to estimate heritability and genetic correlations. Moderate heritabilities of 0.364 and 0.466 were found for D[4,3] and D[3,2], respectively, and a strong genetic correlation was found between the 2 traits (0.98). The heritabilities for the MIR-predicted MFG size were lower than those estimated for the measured MFG size at 0.300 for predicted D[4,3] and 0.239 for predicted D[3,2]. The genetic correlation between measured and predicted D[4,3] was 0.685; the correlation was slightly higher between measured and predicted D[3,2] at 0.764, likely due to the better prediction accuracy of D[3,2]. Milk fat percentage had moderate genetic correlations with both D[4,3] and D[3,2] (0.538 and 0.681, respectively). The genetic correlation between predicted MFG size and fat percentage was much stronger (greater than 0.97 for both predicted D[4,3] and D[3,2]). The stronger correlation suggests a limitation for the use of the predicted values of MFG size as indicator traits for true average MFG size in milk in selection programs. Larger sample sizes are required to provide better evidence of the estimated genetic parameters. A genetic component appears to exist for the average MFG size in bovine milk, and the variation could be exploited in selection programs.
Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
A test and re-estimation of Taylor's empirical capacity-reserve relationship
Long, K.R.
2009-01-01
In 1977, Taylor proposed a constant elasticity model relating capacity choice in mines to reserves. A test of this model using a very large (n = 1,195) dataset confirms its validity but obtains significantly different estimated values for the model coefficients. Capacity is somewhat inelastic with respect to reserves, with an elasticity of 0.65 estimated for open-pit plus block-cave underground mines and 0.56 for all other underground mines. These new estimates should be useful for capacity determinations as scoping studies and as a starting point for feasibility studies. The results are robust over a wide range of deposit types, deposit sizes, and time, consistent with physical constraints on mine capacity that are largely independent of technology. © 2009 International Association for Mathematical Geology.
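A constant elasticity model has the form capacity = a × reserves^b, where b is the elasticity. The sketch below shows what the reported elasticities imply when reserves double, and how b could be recovered by least squares on log-transformed data. The scale coefficient a is not given in the abstract, so the value used to generate the example data is purely hypothetical, as are the function names.

```python
import math


def capacity_ratio(reserve_ratio, elasticity):
    """Factor by which capacity changes when reserves change by reserve_ratio."""
    return reserve_ratio ** elasticity


def fit_power_law(reserves, capacities):
    """Fit capacity = a * reserves**b by ordinary least squares on logs."""
    xs = [math.log(r) for r in reserves]
    ys = [math.log(c) for c in capacities]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b


# Doubling reserves with the elasticities reported in the abstract.
print(round(capacity_ratio(2.0, 0.65), 3))  # open-pit plus block-cave
print(round(capacity_ratio(2.0, 0.56), 3))  # other underground mines

# Synthetic data generated with a hypothetical a = 2.0 and b = 0.65.
reserves = [1e6, 1e7, 1e8]
capacities = [2.0 * r ** 0.65 for r in reserves]
print(fit_power_law(reserves, capacities))
```

Because both elasticities are below 1, capacity grows less than proportionally with reserves: doubling reserves raises capacity by roughly 57% and 47% respectively.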
Baldissera, Sandro; Ferrante, Gianluigi; Quarchioni, Elisa; Minardi, Valentina; Possenti, Valentina; Carrozzi, Giuliano; Masocco, Maria; Salmaso, Stefania
2014-04-01
Field substitution of nonrespondents can be used to maintain the planned sample size and structure in surveys but may introduce additional bias. Sample weighting is suggested as the preferable alternative; however, limited empirical evidence exists comparing the two methods. We wanted to assess the impact of substitution on surveillance results using data from Progressi delle Aziende Sanitarie per la Salute in Italia-Progress by Local Health Units towards a Healthier Italy (PASSI). PASSI is conducted by Local Health Units (LHUs) through telephone interviews of stratified random samples of residents. Nonrespondents are replaced with substitutes randomly preselected in the same LHU stratum. We compared the weighted estimates obtained in the original PASSI sample (used as a reference) and in the substitutes' sample. The differences were evaluated using a Wald test. In 2011, 50,697 units were selected: 37,252 were from the original sample and 13,445 were substitutes; 37,162 persons were interviewed. The initially planned size and demographic composition were restored. No significant differences in the estimates between the original and the substitutes' sample were found. In our experience, field substitution is an acceptable method for dealing with nonresponse, maintaining the characteristics of the original sample without affecting the results. This evidence can support appropriate decisions about planning and implementing a surveillance system. Copyright © 2014 Elsevier Inc. All rights reserved.
An empirical Bayes approach to analyzing recurring animal surveys
Johnson, D.H.
1989-01-01
Recurring estimates of the size of animal populations are often required by biologists or wildlife managers. Because of cost or other constraints, estimates frequently lack the accuracy desired but cannot readily be improved by additional sampling. This report proposes a statistical method employing empirical Bayes (EB) estimators as alternatives to those customarily used to estimate population size, and evaluates them by a subsampling experiment on waterfowl surveys. EB estimates, especially a simple limited-translation version, were more accurate and provided shorter confidence intervals with greater coverage probabilities than customary estimates.
Shannon, Casey P; Chen, Virginia; Takhar, Mandeep; Hollander, Zsuzsanna; Balshaw, Robert; McManus, Bruce M; Tebbutt, Scott J; Sin, Don D; Ng, Raymond T
2016-11-14
Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how the outputs of these computational methods are sensitive to the input sample set, or stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms. The stability of modules increased as sample size increased and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permutated gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases. The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.
Synthesis and photoluminescence studies of Tm3+/Yb3+ codoped Y2O3 phosphors
NASA Astrophysics Data System (ADS)
Maurya, S. K.; Tiwari, S. P.; Kumar, A.; Kumar, K.
2018-05-01
Tm3+/Yb3+ codoped Y2O3 phosphor nanoparticles are synthesized by the solution combustion method using urea as a fuel reagent. The nitrates of all rare earths, RE(NO3)3·6H2O (RE = Y, Tm and Yb), are used in stoichiometric ratios to get the optimized emission intensities. The sample is further annealed at 900 °C for characterization. The phase confirmation of synthesized samples is carried out by using XRD analysis. FESEM images are analyzed to confirm the shape and size of particles. The EDX image shows all elements are present in the sample. Agglomerated particles are observed in the annealed samples. The upconversion and downconversion behavior of the annealed powder samples is compared; the emission intensity is dominantly assigned at 464, 477 and 655 nm, corresponding to 1D2→3F4, 1G4→3H6 and 1G4→3F4, respectively. The CIE coordinates of the recorded samples are calculated at different excitation wavelengths and found to be invariant, which indicates the applicability of the sample for display devices.
Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA
Kelly, Brendan J.; Gross, Robert; Bittinger, Kyle; Sherrill-Mix, Scott; Lewis, James D.; Collman, Ronald G.; Bushman, Frederic D.; Li, Hongzhe
2015-01-01
Motivation: The variation in community composition between microbiome samples, termed beta diversity, can be measured by pairwise distance based on either presence–absence or quantitative species abundance data. PERMANOVA, a permutation-based extension of multivariate analysis of variance to a matrix of pairwise distances, partitions within-group and between-group distances to permit assessment of the effect of an exposure or intervention (grouping factor) upon the sampled microbiome. Within-group distance and exposure/intervention effect size must be accurately modeled to estimate statistical power for a microbiome study that will be analyzed with pairwise distances and PERMANOVA. Results: We present a framework for PERMANOVA power estimation tailored to marker-gene microbiome studies that will be analyzed by pairwise distances, which includes: (i) a novel method for distance matrix simulation that permits modeling of within-group pairwise distances according to pre-specified population parameters; (ii) a method to incorporate effects of different sizes within the simulated distance matrix; (iii) a simulation-based method for estimating PERMANOVA power from simulated distance matrices; and (iv) an R statistical software package that implements the above. Matrices of pairwise distances can be efficiently simulated to satisfy the triangle inequality and incorporate group-level effects, which are quantified by the adjusted coefficient of determination, omega-squared (ω2). From simulated distance matrices, available PERMANOVA power or necessary sample size can be estimated for a planned microbiome study. Availability and implementation: http://github.com/brendankelly/micropower. Contact: brendank@mail.med.upenn.edu or hongzhe@upenn.edu PMID:25819674
Li, Peng; Redden, David T.
2014-01-01
SUMMARY The sandwich estimator in generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of the GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC)-correction can keep the test size to nominal levels even when the number of clusters is as low as 10, and is robust to the moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG)-correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC-correction for the CRTs with binary outcomes. The power levels as predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that with appropriate control of type I error rates under small sample sizes, we recommend the use of GEE approach in CRTs with binary outcomes due to fewer assumptions and robustness to the misspecification of the covariance structure. PMID:25345738
Thoracic and respirable particle definitions for human health risk assessment.
Brown, James S; Gordon, Terry; Price, Owen; Asgharian, Bahman
2013-04-10
Particle size-selective sampling refers to the collection of particles of varying sizes that potentially reach and adversely affect specific regions of the respiratory tract. Thoracic and respirable fractions are defined as the fraction of inhaled particles capable of passing beyond the larynx and ciliated airways, respectively, during inhalation. In an attempt to afford greater protection to exposed individuals, current size-selective sampling criteria overestimate the population means of particle penetration into regions of the lower respiratory tract. The purpose of our analyses was to provide estimates of the thoracic and respirable fractions for adults and children during typical activities with both nasal and oral inhalation, that may be used in the design of experimental studies and interpretation of health effects evidence. We estimated the fraction of inhaled particles (0.5-20 μm aerodynamic diameter) penetrating beyond the larynx (based on experimental data) and ciliated airways (based on a mathematical model) for an adult male, adult female, and a 10 yr old child during typical daily activities and breathing patterns. Our estimates show less penetration of coarse particulate matter into the thoracic and gas exchange regions of the respiratory tract than current size-selective criteria. Of the parameters we evaluated, particle penetration into the lower respiratory tract was most dependent on route of breathing. For typical activity levels and breathing habits, we estimated a 50% cut-size for the thoracic fraction at an aerodynamic diameter of around 3 μm in adults and 5 μm in children, whereas current ambient and occupational criteria suggest a 50% cut-size of 10 μm. By design, current size-selective sample criteria overestimate the mass of particles generally expected to penetrate into the lower respiratory tract to provide protection for individuals who may breathe orally. 
We provide estimates of thoracic and respirable fractions for a variety of breathing habits and activities that may benefit the design of experimental studies and interpretation of particle size-specific health effects.
Thoracic and respirable particle definitions for human health risk assessment
2013-01-01
Background Particle size-selective sampling refers to the collection of particles of varying sizes that potentially reach and adversely affect specific regions of the respiratory tract. Thoracic and respirable fractions are defined as the fraction of inhaled particles capable of passing beyond the larynx and ciliated airways, respectively, during inhalation. In an attempt to afford greater protection to exposed individuals, current size-selective sampling criteria overestimate the population means of particle penetration into regions of the lower respiratory tract. The purpose of our analyses was to provide estimates of the thoracic and respirable fractions for adults and children during typical activities with both nasal and oral inhalation, that may be used in the design of experimental studies and interpretation of health effects evidence. Methods We estimated the fraction of inhaled particles (0.5-20 μm aerodynamic diameter) penetrating beyond the larynx (based on experimental data) and ciliated airways (based on a mathematical model) for an adult male, adult female, and a 10 yr old child during typical daily activities and breathing patterns. Results Our estimates show less penetration of coarse particulate matter into the thoracic and gas exchange regions of the respiratory tract than current size-selective criteria. Of the parameters we evaluated, particle penetration into the lower respiratory tract was most dependent on route of breathing. For typical activity levels and breathing habits, we estimated a 50% cut-size for the thoracic fraction at an aerodynamic diameter of around 3 μm in adults and 5 μm in children, whereas current ambient and occupational criteria suggest a 50% cut-size of 10 μm. Conclusions By design, current size-selective sample criteria overestimate the mass of particles generally expected to penetrate into the lower respiratory tract to provide protection for individuals who may breathe orally. 
We provide estimates of thoracic and respirable fractions for a variety of breathing habits and activities that may benefit the design of experimental studies and interpretation of particle size-specific health effects. PMID:23575443
NASA Astrophysics Data System (ADS)
Namburi, Devendra K.; Shi, Yunhua; Palmer, Kysen G.; Dennis, Anthony R.; Durrell, John H.; Cardwell, David A.
2016-09-01
A fundamental requirement of the fabrication of high-performing (RE)-Ba-Cu-O bulk superconductors is achieving a single grain microstructure that exhibits good flux pinning properties. The top seeded melt growth (TSMG) process is a well-established technique for the fabrication of single grain (RE)BCO bulk samples and is now applied routinely by a number of research groups around the world. The introduction of a buffer layer to the TSMG process has been demonstrated recently to improve significantly the general reliability of the process. However, a number of growth-related defects, such as porosity and the formation of micro-cracks, remain inherent to the TSMG process, and are proving difficult to eliminate by varying the melt process parameters. The seeded infiltration and growth (SIG) process has been shown to yield single grain samples that exhibit significantly improved microstructures compared to the TSMG technique. Unfortunately, however, SIG leads to other processing challenges, such as the reliability of fabrication, optimisation of RE2BaCuO5 (RE-211) inclusions (size and content) in the sample microstructure, practical oxygenation of as processed samples and, hence, optimisation of the superconducting properties of the bulk single grain. In the present paper, we report the development of a near-net shaping technique based on a novel two-step, buffer-aided top seeded infiltration and growth (BA-TSIG) process, which has been demonstrated to improve greatly the reliability of the single grain growth process and has been used to fabricate successfully bulk, single grain (RE)BCO superconductors with improved microstructures and superconducting properties. A trapped field of ~0.84 T and a zero field current density of 60 kA cm^-2 have been measured at 77 K in a bulk, YBCO single grain sample of diameter 25 mm processed by this two-step BA-TSIG technique.
To the best of our knowledge, this value of trapped field is the highest value ever reported for a sample fabricated by an infiltration and growth process. In this study we report the successful fabrication of 14 YBCO samples, with diameters of up to 32 mm, by this novel technique with a success rate of greater than 92%.
NASA Astrophysics Data System (ADS)
Torkamani, Hadi; Raygan, Shahram; Garcia Mateo, Carlos; Rassizadehghani, Jafar; Palizdar, Yahya; San-Martin, David
2018-07-01
In this research Rare Earth elements (RE), La and Ce (200 ppm), were added to a low carbon cast microalloyed steel to disclose their influence on the microstructure and impact toughness. It is suggested that RE are able to change the interaction between the inclusions and matrix during the solidification process (comprising peritectic transformation), which could affect the microstructural features and consequently the impact property; compared to the base steel, a clear evolution was observed in the nature and morphology of the inclusions present in the RE-added steel, i.e. (1) they changed from MnS-based to (RE,Al)(S,O) and RE(S)-based; (2) they obtained an aspect ratio closer to 1 with a lower area fraction as well as a smaller average size. Besides, the microstructural examination of the matrix phases showed that a bimodal type of ferrite grain size distribution exists in both base and RE-added steels, while the mean ferrite grain size was reduced from 12 to 7 μm and the bimodality was redressed in the RE-added steel. It was found that pearlite nodule size decreases from 9 to 6 μm in the RE-added steel; however, microalloying with RE caused only a slight decrease in pearlite volume fraction. After detailed fractography analyses, it was found that, compared to the base steel, the significant enhancement of the impact toughness in RE-added steel (from 63 to 100 J) can be mainly attributed to the differences observed in the nature of the inclusions, the ferrite grain size distribution, and the pearlite nodule size. The presence of carbides (cementite) at ferrite grain boundaries and probable change in distribution of Nb-nanoprecipitation (promoted by RE addition) can be considered as other reasons affecting the impact toughness of steels under investigation.
NASA Astrophysics Data System (ADS)
Torkamani, Hadi; Raygan, Shahram; Garcia Mateo, Carlos; Rassizadehghani, Jafar; Palizdar, Yahya; San-Martin, David
2018-03-01
In this research Rare Earth elements (RE), La and Ce (200 ppm), were added to a low carbon cast microalloyed steel to disclose their influence on the microstructure and impact toughness. It is suggested that RE are able to change the interaction between the inclusions and matrix during the solidification process (comprising peritectic transformation), which could affect the microstructural features and consequently the impact property; compared to the base steel, a clear evolution was observed in the nature and morphology of the inclusions present in the RE-added steel, i.e. (1) they changed from MnS-based to (RE,Al)(S,O) and RE(S)-based; (2) they obtained an aspect ratio closer to 1 with a lower area fraction as well as a smaller average size. Besides, the microstructural examination of the matrix phases showed that a bimodal type of ferrite grain size distribution exists in both base and RE-added steels, while the mean ferrite grain size was reduced from 12 to 7 μm and the bimodality was redressed in the RE-added steel. It was found that pearlite nodule size decreases from 9 to 6 μm in the RE-added steel; however, microalloying with RE caused only a slight decrease in pearlite volume fraction. After detailed fractography analyses, it was found that, compared to the base steel, the significant enhancement of the impact toughness in RE-added steel (from 63 to 100 J) can be mainly attributed to the differences observed in the nature of the inclusions, the ferrite grain size distribution, and the pearlite nodule size. The presence of carbides (cementite) at ferrite grain boundaries and probable change in distribution of Nb-nanoprecipitation (promoted by RE addition) can be considered as other reasons affecting the impact toughness of steels under investigation.
Shah, R; Worner, S P; Chapman, R B
2012-10-01
Pesticide resistance monitoring includes resistance detection and subsequent documentation/measurement. Resistance detection requires at least one (≥1) resistant individual to be present in a sample to initiate management strategies. Resistance documentation, on the other hand, attempts to capture nearly all (≥90%) of the resistant individuals in the population. A computer simulation model was used to compare the efficiency of simple random and systematic sampling plans to detect resistant individuals and to document their frequencies when the resistant individuals were randomly or patchily distributed. A patchy dispersion pattern of resistant individuals influenced the sampling efficiency of systematic sampling plans while the efficiency of random sampling was independent of such patchiness. When resistant individuals were randomly distributed, sample sizes required to detect at least one resistant individual (resistance detection) with a probability of 0.95 were 300 (1%) and 50 (10% and 20%); whereas, when resistant individuals were patchily distributed, using systematic sampling, sample sizes required for such detection were 6000 (1%), 600 (10%) and 300 (20%). Sample sizes of 900 and 400 would be required to detect ≥90% of resistant individuals (resistance documentation) with a probability of 0.95 when resistant individuals were randomly dispersed and present at a frequency of 10% and 20%, respectively; whereas, when resistant individuals were patchily distributed, using systematic sampling, a sample size of 3000 and 1500, respectively, was necessary. Small sample sizes either underestimated or overestimated the resistance frequency. A simple random sampling plan is, therefore, recommended for insecticide resistance detection and subsequent documentation.
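For the randomly distributed case, the detection sample sizes can be sanity-checked analytically. The following is a minimal sketch, assuming independent draws with a fixed resistance frequency p (an assumption of this illustration, not the authors' simulation model): the probability of at least one resistant individual in n draws is 1 - (1 - p)^n, so the smallest n reaching a target probability is ceil(ln(1 - target) / ln(1 - p)). The function name is hypothetical.

```python
import math


def min_detection_sample_size(frequency, probability=0.95):
    """Smallest n with P(at least one resistant individual in n draws) >= probability.

    Assumes independent draws with a constant resistance frequency
    (idealized simple random sampling from a large population).
    """
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    # Solve 1 - (1 - frequency)**n >= probability for the smallest integer n.
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - frequency))


if __name__ == "__main__":
    for freq in (0.01, 0.10, 0.20):
        print(freq, min_detection_sample_size(freq))
```

Under these assumptions the bound at a 1% frequency is close to the reported 300, while the bounds at 10% and 20% fall below the reported 50, so the simulated figures read as conservative relative to this idealized calculation.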
Overall, John E; Tonidandel, Scott; Starbuck, Robert R
2006-01-01
Recent contributions to the statistical literature have provided elegant model-based solutions to the problem of estimating sample sizes for testing the significance of differences in mean rates of change across repeated measures in controlled longitudinal studies with differentially correlated error and missing data due to dropouts. However, the mathematical complexity and model specificity of these solutions make them generally inaccessible to most applied researchers who actually design and undertake treatment evaluation research in psychiatry. In contrast, this article relies on a simple two-stage analysis in which dropout-weighted slope coefficients fitted to the available repeated measurements for each subject separately serve as the dependent variable for a familiar ANCOVA test of significance for differences in mean rates of change. This article is about how a sample size that is estimated or calculated to provide desired power for testing that hypothesis without considering dropouts can be adjusted appropriately to take dropouts into account. Empirical results support the conclusion that, whatever reasonable level of power would be provided by a given sample size in the absence of dropouts, essentially the same power can be realized in the presence of dropouts simply by adding to the original dropout-free sample size the number of subjects who would be expected to drop from a sample of that original size under conditions of the proposed study.
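The adjustment rule in the last sentence is simple arithmetic and can be sketched as follows. This is an illustration only: the function name is hypothetical, and rounding the expected number of dropouts up to a whole subject is an assumption of this sketch rather than something stated in the abstract.

```python
import math


def dropout_adjusted_sample_size(n_without_dropouts, expected_dropout_rate):
    """Add the expected number of dropouts from the original sample to that sample.

    n_without_dropouts: sample size giving the desired power with no dropouts.
    expected_dropout_rate: anticipated fraction of subjects who drop out (0 <= rate < 1).
    """
    if n_without_dropouts < 1:
        raise ValueError("n_without_dropouts must be a positive integer")
    if not 0.0 <= expected_dropout_rate < 1.0:
        raise ValueError("expected_dropout_rate must be in [0, 1)")
    # Expected dropouts are computed on the ORIGINAL size, as the rule states;
    # rounding up to a whole subject is an assumption of this sketch.
    expected_dropouts = math.ceil(n_without_dropouts * expected_dropout_rate)
    return n_without_dropouts + expected_dropouts


if __name__ == "__main__":
    print(dropout_adjusted_sample_size(120, 0.25))
```

For example, a dropout-free requirement of 120 subjects with a 25% expected dropout rate gives 120 + 30 subjects under this rule.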
Tests for informative cluster size using a novel balanced bootstrap scheme.
Nevalainen, Jaakko; Oja, Hannu; Datta, Somnath
2017-07-20
Clustered data are often encountered in biomedical studies, and to date, a number of approaches have been proposed to analyze such data. However, the phenomenon of informative cluster size (ICS) is a challenging problem, and its presence has an impact on the choice of a correct analysis methodology. For example, Dutta and Datta (2015, Biometrics) presented a number of marginal distributions that could be tested. Depending on the nature and degree of informativeness of the cluster size, these marginal distributions may differ, as do the choices of the appropriate test. In particular, they applied their new test to a periodontal data set where the plausibility of the informativeness was mentioned, but no formal test for the same was conducted. We propose bootstrap tests for testing the presence of ICS. A balanced bootstrap method is developed to successfully estimate the null distribution by merging the re-sampled observations with closely matching counterparts. Relying on the assumption of exchangeability within clusters, the proposed procedure performs well in simulations even with a small number of clusters, at different distributions and against different alternative hypotheses, thus making it an omnibus test. We also explain how to extend the ICS test to a regression setting and thereby enhancing its practical utility. The methodologies are illustrated using the periodontal data set mentioned earlier. Copyright © 2017 John Wiley & Sons, Ltd.
Modification of Co/Cu nanoferrites properties via Gd3+/Er3+doping
NASA Astrophysics Data System (ADS)
Ateia, Ebtesam E.; Soliman, Fatma S.
2017-05-01
Pure nanoparticles of the rare earth-substituted cobalt and copper ferrites with general formula Me Gd0.025 Er0.05 Fe1.925 O4 (Me = Co, Cu) were prepared by the chemical citrate method. X-ray diffraction, field emission scanning electron microscopy, and BET analysis were used to study the effect of rare earth substitution and its impact on the physical properties of the investigated samples. Rare earth-doped cobalt shows a type IV isotherm with a hysteresis loop, suggesting a mesoporous structure. The estimated crystallite sizes are 21.49 and 36.11 nm for the doped Co and Cu samples, respectively. The rare earth-substituted cobalt and copper ferrites showed a definite magnetic hysteresis loop at room temperature. An increase in coercivity and a decrease in saturation magnetization were detected. This can be explained by the weaker nature of the RE3+-Fe3+ interaction compared to the Fe3+-Fe3+ interaction. A greater than 1.13-fold increase in coercivity (Hc = 2184 Oe) was observed in doped cobalt nanoferrite samples compared to copper (Hc = 1936 Oe). It was found that a decrease in temperature leads to great improvement in the magnetic properties of the investigated samples. As the magnetic recording performance of magnetic samples is improved for well-crystallized, nanostructured samples, the effect of rare earth substitution seems to be particularly valuable in this regard.
Adaptive cluster sampling: An efficient method for assessing inconspicuous species
Andrea M. Silletti; Joan Walker
2003-01-01
Restorationists typically evaluate the success of a project by estimating the population sizes of species that have been planted or seeded. Because total census is rarely feasible, they must rely on sampling methods for population estimates. However, traditional random sampling designs may be inefficient for species that, for one reason or another, are challenging to...
Sequential sampling: a novel method in farm animal welfare assessment.
Heath, C A E; Main, D C J; Mullan, S; Haskell, M J; Browne, W J
2016-02-01
Lameness in dairy cows is an important welfare issue. As part of a welfare assessment, herd level lameness prevalence can be estimated from scoring a sample of animals, where higher levels of accuracy are associated with larger sample sizes. As the financial cost is related to the number of cows sampled, smaller samples are preferred. Sequential sampling schemes have been used for informing decision making in clinical trials. Sequential sampling involves taking samples in stages, where sampling can stop early depending on the estimated lameness prevalence. When welfare assessment is used for a pass/fail decision, a similar approach could be applied to reduce the overall sample size. The sampling schemes proposed here apply the principles of sequential sampling within a diagnostic testing framework. This study develops three sequential sampling schemes of increasing complexity to classify 80 fully assessed UK dairy farms, each with known lameness prevalence. Using the Welfare Quality herd-size-based sampling scheme, the first 'basic' scheme involves two sampling events. At the first sampling event half the Welfare Quality sample size is drawn, and then depending on the outcome, sampling either stops or is continued and the same number of animals is sampled again. In the second 'cautious' scheme, an adaptation is made to ensure that correctly classifying a farm as 'bad' is done with greater certainty. The third scheme is the only scheme to go beyond lameness as a binary measure and investigates the potential for increasing accuracy by incorporating the number of severely lame cows into the decision. The three schemes are evaluated with respect to accuracy and average sample size by running 100 000 simulations for each scheme, and a comparison is made with the fixed size Welfare Quality herd-size-based sampling scheme. All three schemes performed almost as well as the fixed size scheme but with much smaller average sample sizes. 
For the third scheme, an overall association between lameness prevalence and the proportion of lame cows that were severely lame on a farm was found. However, as this association was found to not be consistent across all farms, the sampling scheme did not prove to be as useful as expected. The preferred scheme was therefore the 'cautious' scheme for which a sampling protocol has also been developed.
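The 'basic' two-event scheme described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' protocol: the pass/fail threshold, the early-stopping margin, and all function names are hypothetical, and the stopping rule (stop when the first-stage prevalence estimate is further than `margin` from the threshold) is one plausible reading of "depending on the outcome".

```python
import random


def two_stage_classify(herd, full_sample_size, threshold, margin, rng):
    """Classify a herd as 'pass' or 'fail' using at most two sampling events.

    herd: list of booleans, True meaning the cow was scored lame.
    Returns (classification, number_of_animals_scored).
    """
    half = full_sample_size // 2
    if half < 1 or len(herd) < 2 * half:
        raise ValueError("herd is smaller than the two-stage sample")
    # Draw both stages without replacement; stage 2 is only used if needed.
    drawn = rng.sample(herd, 2 * half)
    first = drawn[:half]
    p_first = sum(first) / half
    # Stop early when the stage-1 estimate is clearly on one side of the threshold.
    if abs(p_first - threshold) > margin:
        return ("fail" if p_first > threshold else "pass"), half
    # Otherwise score the same number of animals again and pool both stages.
    p_pooled = sum(drawn) / (2 * half)
    return ("fail" if p_pooled > threshold else "pass"), 2 * half


def average_sample_size(herd, full_sample_size, threshold, margin, runs=1000, seed=0):
    """Mean number of animals scored over repeated simulated assessments."""
    rng = random.Random(seed)
    total = sum(
        two_stage_classify(herd, full_sample_size, threshold, margin, rng)[1]
        for _ in range(runs)
    )
    return total / runs


if __name__ == "__main__":
    herd = [True] * 30 + [False] * 170  # hypothetical herd with 15% lameness
    print(average_sample_size(herd, full_sample_size=60, threshold=0.25, margin=0.05))
```

The average number of animals scored always lies between half and all of the fixed sample size, which is the source of the savings the abstract reports; how much is saved depends on how far the true prevalence sits from the chosen threshold.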
A robust measure of HIV-1 population turnover within chronically infected individuals.
Achaz, G; Palmer, S; Kearney, M; Maldarelli, F; Mellors, J W; Coffin, J M; Wakeley, J
2004-10-01
A simple nonparametric test for population structure was applied to temporally spaced samples of HIV-1 sequences from the gag-pol region within two chronically infected individuals. The results show that temporal structure can be detected for samples separated by about 22 months or more. The performance of the method, which was originally proposed to detect geographic structure, was tested for temporally spaced samples using neutral coalescent simulations. Simulations showed that the method is robust to variation in sample sizes and mutation rates, to the presence/absence of recombination, and that the power to detect temporal structure is high. By comparing levels of temporal structure in simulations to the levels observed in real data, we estimate the effective intra-individual population size of HIV-1 to be between 10^3 and 10^4 viruses, which is in agreement with some previous estimates. Using this estimate and a simple measure of sequence diversity, we estimate an effective neutral mutation rate of about 5 × 10^-6 per site per generation in the gag-pol region. The definition and interpretation of estimates of such "effective" population parameters are discussed.
Russell F. Thurow; Daniel J. Schill
1996-01-01
Biologists lack sufficient information to develop protocols for sampling the abundance and size structure of bull trout Salvelinus confluentus. We compared summer estimates of the abundance and size structure of bull trout in a second-order central Idaho stream, derived by day snorkeling, night snorkeling, and electrofishing. We also examined the influence of water...
Wan, Xiang; Wang, Wenqian; Liu, Jiming; Tong, Tiejun
2014-12-19
In systematic reviews and meta-analysis, researchers often pool the results of the sample mean and standard deviation from a set of similar clinical trials. A number of the trials, however, reported the study using the median, the minimum and maximum values, and/or the first and third quartiles. Hence, in order to combine results, one may have to estimate the sample mean and standard deviation for such trials. In this paper, we propose to improve the existing literature in several directions. First, we show that the sample standard deviation estimation in Hozo et al.'s method (BMC Med Res Methodol 5:13, 2005) has some serious limitations and is always less satisfactory in practice. Inspired by this, we propose a new estimation method by incorporating the sample size. Second, we systematically study the sample mean and standard deviation estimation problem under several other interesting settings where the interquartile range is also available for the trials. We demonstrate the performance of the proposed methods through simulation studies for the three frequently encountered scenarios, respectively. For the first two scenarios, our method greatly improves existing methods and provides a nearly unbiased estimate of the true sample standard deviation for normal data and a slightly biased estimate for skewed data. For the third scenario, our method still performs very well for both normal data and skewed data. Furthermore, we compare the estimators of the sample mean and standard deviation under all three scenarios and present some suggestions on which scenario is preferred in real-world applications. In this paper, we discuss different approximation methods in the estimation of the sample mean and standard deviation and propose some new estimation methods to improve the existing literature. 
We conclude our work with a summary table (an Excel spread sheet including all formulas) that serves as a comprehensive guidance for performing meta-analysis in different situations.
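To make the first scenario concrete (only the median, minimum, maximum and sample size are reported), the sketch below shows one way such an estimate can incorporate the sample size. It is an illustration under stated assumptions and not necessarily the exact estimators proposed in the paper: the mean uses the common (min + 2·median + max)/4 approximation, and the standard deviation divides the range by twice the expected largest standard normal order statistic, approximated with Blom's formula Φ⁻¹((n − 0.375)/(n + 0.25)). Normally distributed data are assumed, and the function name is hypothetical.

```python
from statistics import NormalDist


def estimate_mean_sd_from_range(minimum, median, maximum, n):
    """Approximate the sample mean and SD from min, median, max and sample size n.

    Assumes roughly normal data. The range is scaled by the expected distance
    between the smallest and largest of n standard normal draws, so the SD
    estimate shrinks as n grows for a fixed observed range.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not minimum <= median <= maximum:
        raise ValueError("expected minimum <= median <= maximum")
    mean = (minimum + 2.0 * median + maximum) / 4.0
    # Blom's approximation to the expected maximum of n standard normal values.
    expected_max_z = NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    sd = (maximum - minimum) / (2.0 * expected_max_z)
    return mean, sd


if __name__ == "__main__":
    print(estimate_mean_sd_from_range(minimum=0.0, median=5.0, maximum=10.0, n=50))
```

The design point is that a fixed rule such as range/4 ignores how the expected range widens with larger samples; scaling by an n-dependent normal quantile accounts for that.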
Robust functional statistics applied to Probability Density Function shape screening of sEMG data.
Boudaoud, S; Rix, H; Al Harrach, M; Marin, F
2014-01-01
Recent studies pointed out possible shape modifications of the Probability Density Function (PDF) of surface electromyographical (sEMG) data according to several contexts like fatigue and muscle force increase. Following this idea, criteria have been proposed to monitor these shape modifications mainly using High Order Statistics (HOS) parameters like skewness and kurtosis. In experimental conditions, these parameters are confronted with small sample size in the estimation process. This small sample size induces errors in the estimated HOS parameters restraining real-time and precise sEMG PDF shape monitoring. Recently, a functional formalism, the Core Shape Model (CSM), has been used to analyse shape modifications of PDF curves. In this work, taking inspiration from CSM method, robust functional statistics are proposed to emulate both skewness and kurtosis behaviors. These functional statistics combine both kernel density estimation and PDF shape distances to evaluate shape modifications even in presence of small sample size. Then, the proposed statistics are tested, using Monte Carlo simulations, on both normal and Log-normal PDFs that mimic observed sEMG PDF shape behavior during muscle contraction. According to the obtained results, the functional statistics seem to be more robust than HOS parameters to small sample size effect and more accurate in sEMG PDF shape screening applications.
Spatially-explicit estimation of Wright's neighborhood size in continuous populations
Andrew J. Shirk; Samuel A. Cushman
2014-01-01
Effective population size (Ne) is an important parameter in conservation genetics because it quantifies a population's capacity to resist loss of genetic diversity due to inbreeding and drift. The classical approach to estimate Ne from genetic data involves grouping sampled individuals into discretely defined subpopulations assumed to be panmictic. Importantly,...
NASA Astrophysics Data System (ADS)
Stiefenhofer, Johann; Thurston, Malcolm L.; Bush, David E.
2018-04-01
Microdiamonds offer several advantages as a resource estimation tool, such as access to deeper parts of a deposit which may be beyond the reach of large diameter drilling (LDD) techniques, the recovery of the total diamond content in the kimberlite, and a cost benefit due to the cheaper treatment cost compared to large diameter samples. In this paper we take the first step towards local estimation by showing that microdiamond samples can be treated as a regionalised variable suitable for use in geostatistical applications and we show examples of such output. Examples of microdiamond variograms are presented, the variance-support relationship for microdiamonds is demonstrated and consistency of the diamond size frequency distribution (SFD) is shown with the aid of real datasets. The focus therefore is on why local microdiamond estimation should be possible, not how to generate such estimates. Data from our case studies and examples demonstrate a positive correlation between micro- and macrodiamond sample grades as well as block estimates. This relationship can be demonstrated repeatedly across multiple mining operations. The smaller sample support size for microdiamond samples is a key difference between micro- and macrodiamond estimates and this aspect must be taken into account during the estimation process. We discuss three methods which can be used to validate or reconcile the estimates against macrodiamond data, either as estimates or in the form of production grades: (i) reconciliation using production data, (ii) comparison of LDD-based grade estimates against microdiamond-based estimates and (iii) simulation techniques.
Rasmussen, Blake B
2016-01-01
The goal of this critical review is to comprehensively assess the evidence for the molecular, physiologic, and phenotypic skeletal muscle responses to resistance exercise (RE) combined with the nutritional intervention of protein and/or amino acid (AA) ingestion in young adults. We gathered the literature regarding the translational response in human skeletal muscle to acute exposure to RE and protein/AA supplements and the literature describing the phenotypic skeletal muscle adaptation to RE and nutritional interventions. Supplementation of protein/AAs with RE exhibited clear protein dose–dependent effects on translational regulation (protein synthesis) through mammalian target of rapamycin complex 1 (mTORC1) signaling, which was most apparent through increases in p70 ribosomal protein S6 kinase 1 (S6K1) phosphorylation, compared with postexercise recovery in the fasted or carbohydrate-fed state. These acute findings were critically tested via long-term exposure to RE training (RET) and protein/AA supplementation, and it was determined that a diminishing protein/AA supplement effect occurs over a prolonged exposure stimulus after exercise training. Furthermore, we found that protein/AA supplements, combined with RET, produced a positive, albeit minor, effect on the promotion of lean mass growth (when assessed in >20 participants/treatment); a negligible effect on muscle mass; and a negligible to no additional effect on strength. A potential concern we discovered was that the majority of the exercise training studies were underpowered in their ability to discern effects of protein/AA supplementation. Regardless, even when using optimal methodology and large sample sizes, it is clear that the effect size for protein/AA supplementation is low and likely limited to a subset of individuals because the individual variability is high. 
With regard to nutritional intakes, total protein intake per day, rather than protein timing or quality, appears to be more of a factor on this effect during long-term exercise interventions. There were no differences in strength or mass/muscle mass on RET outcomes between protein types when a leucine threshold (>2 g/dose) was reached. Future research with larger sample sizes and more homogeneity in design is necessary to understand the underlying adaptations and to better evaluate the individual variability in the muscle-adaptive response to protein/AA supplementation during RET. PMID:26764320
Authoritarian Parenting and Asian Adolescent School Performance: Insights from the US and Taiwan
Pong, Suet-ling; Johnston, Jamie; Chen, Vivien
2014-01-01
Our study re-examines the relationship between parenting and school performance among Asian students. We use two sources of data: wave I of the Adolescent Health Longitudinal Survey (Add Health), and waves I and II of the Taiwan Educational Panel Survey (TEPS). Analysis using Add Health reveals that the Asian-American/European-American difference in the parenting–school performance relationship is due largely to differential sample sizes. When we select a random sample of European-American students comparable to the sample size of Asian-American students, authoritarian parenting also shows no effect for European-American students. Furthermore, analysis of TEPS shows that authoritarian parenting is negatively associated with children's school achievement, while authoritative parenting is positively associated. This result for Taiwanese Chinese students is similar to previous results for European-American students in the US. PMID:24850978
Authoritarian Parenting and Asian Adolescent School Performance: Insights from the US and Taiwan.
Pong, Suet-Ling; Johnston, Jamie; Chen, Vivien
2010-01-01
Our study re-examines the relationship between parenting and school performance among Asian students. We use two sources of data: wave I of the Adolescent Health Longitudinal Survey (Add Health), and waves I and II of the Taiwan Educational Panel Survey (TEPS). Analysis using Add Health reveals that the Asian-American/European-American difference in the parenting-school performance relationship is due largely to differential sample sizes. When we select a random sample of European-American students comparable to the sample size of Asian-American students, authoritarian parenting also shows no effect for European-American students. Furthermore, analysis of TEPS shows that authoritarian parenting is negatively associated with children's school achievement, while authoritative parenting is positively associated. This result for Taiwanese Chinese students is similar to previous results for European-American students in the US.
Paquet, Victor; Joseph, Caroline; D'Souza, Clive
2012-01-01
Anthropometric studies typically require a large number of individuals that are selected in a manner so that demographic characteristics that impact body size and function are proportionally representative of a user population. This sampling approach does not allow for an efficient characterization of the distribution of body sizes and functions of sub-groups within a population and the demographic characteristics of user populations can often change with time, limiting the application of the anthropometric data in design. The objective of this study is to demonstrate how demographically representative user populations can be developed from samples that are not proportionally representative in order to improve the application of anthropometric data in design. An engineering anthropometry problem of door width and clear floor space width is used to illustrate the value of the approach.
Sample sizes to control error estimates in determining soil bulk density in California forest soils
Youzhi Han; Jianwei Zhang; Kim G. Mattson; Weidong Zhang; Thomas A. Weber
2016-01-01
Characterizing forest soil properties with high variability is challenging, sometimes requiring large numbers of soil samples. Soil bulk density is a standard variable needed along with element concentrations to calculate nutrient pools. This study aimed to determine the optimal sample size, the number of observations (n), for predicting the soil bulk density with a...
Inventory implications of using sampling variances in estimation of growth model coefficients
Albert R. Stage; William R. Wykoff
2000-01-01
Variables based on stand densities or stocking have sampling errors that depend on the relation of tree size to plot size and on the spatial structure of the population. Ignoring the sampling errors of such variables, which include most measures of competition used in both distance-dependent and distance-independent growth models, can bias the predictions obtained from...
Blinded and unblinded internal pilot study designs for clinical trials with count data.
Schneider, Simon; Schmidli, Heinz; Friede, Tim
2013-07-01
Internal pilot studies are a popular design feature to address uncertainties in the sample size calculations caused by vague information on nuisance parameters. Despite their popularity, only very recently blinded sample size reestimation procedures for trials with count data were proposed and their properties systematically investigated. Although blinded procedures are favored by regulatory authorities, practical application is somewhat limited by fears that blinded procedures are prone to bias if the treatment effect was misspecified in the planning. Here, we compare unblinded and blinded procedures with respect to bias, error rates, and sample size distribution. We find that both procedures maintain the desired power and that the unblinded procedure is slightly liberal whereas the actual significance level of the blinded procedure is close to the nominal level. Furthermore, we show that in situations where uncertainty about the assumed treatment effect exists, the blinded estimator of the control event rate is biased in contrast to the unblinded estimator, which results in differences in mean sample sizes in favor of the unblinded procedure. However, these differences are rather small compared to the deviations of the mean sample sizes from the sample size required to detect the true, but unknown effect. We demonstrate that the variation of the sample size resulting from the blinded procedure is in many practically relevant situations considerably smaller than the one of the unblinded procedures. The methods are extended to overdispersed counts using a quasi-likelihood approach and are illustrated by trials in relapsing multiple sclerosis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sampling through time and phylodynamic inference with coalescent and birth–death models
Volz, Erik M.; Frost, Simon D. W.
2014-01-01
Many population genetic models have been developed for the purpose of inferring population size and growth rates from random samples of genetic data. We examine two popular approaches to this problem, the coalescent and the birth–death-sampling model (BDM), in the context of estimating population size and birth rates in a population growing exponentially according to the birth–death branching process. For sequences sampled at a single time, we found the coalescent and the BDM gave virtually indistinguishable results in terms of the growth rates and fraction of the population sampled, even when sampling from a small population. For sequences sampled at multiple time points, we find that the birth–death model estimators are subject to large bias if the sampling process is misspecified. Since BDMs incorporate a model of the sampling process, we show how much of the statistical power of BDMs arises from the sequence of sample times and not from the genealogical tree. This motivates the development of a new coalescent estimator, which is augmented with a model of the known sampling process and is potentially more precise than the coalescent that does not use sample time information. PMID:25401173
Mark J. Ducey; Jeffrey H. Gove; Harry T. Valentine
2008-01-01
Perpendicular distance sampling (PDS) is a fast probability-proportional-to-size method for inventory of downed wood. However, previous development of PDS had limited the method to estimating only one variable (such as volume per hectare, or surface area per hectare) at a time. Here, we develop a general design-unbiased estimator for PDS. We then show how that...
Analysis of variograms with various sample sizes from a multispectral image
USDA-ARS's Scientific Manuscript database
The variogram plays a crucial role in remote sensing applications and geostatistics. It is very important to estimate the variogram reliably from sufficient data. In this study, the analysis of variograms with various sample sizes of remotely sensed data was conducted. A 100x100-pixel subset was chosen from ...
Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses
Lanfear, Robert; Hua, Xia; Warren, Dan L.
2016-01-01
Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
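As a rough illustration of the ESS idea for an ordinary scalar parameter (not the tree-topology estimators developed in the study above), the sketch below divides the chain length by an autocorrelation-based inflation factor, truncating the sum at the first non-positive autocorrelation. The function names, the truncation rule, and the AR(1) test chain are assumptions chosen for illustration only.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        return 0.0
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(x, max_lag=None):
    """ESS = n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum stops at the first non-positive autocorrelation
    (a simple truncation rule assumed here, not taken from the source).
    """
    n = len(x)
    if max_lag is None:
        max_lag = n // 2
    rho_sum = 0.0
    for lag in range(1, max_lag + 1):
        rho = autocorr(x, lag)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def ar1_chain(n, phi, seed=0):
    """Hypothetical autocorrelated chain: x_t = phi * x_{t-1} + noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out


if __name__ == "__main__":
    chain = ar1_chain(2000, 0.9, seed=1)
    print("samples:", len(chain), "ESS:", round(effective_sample_size(chain)))
```

A strongly autocorrelated chain such as this one yields an ESS far below the number of stored samples, which is the situation the 200-sample rule of thumb is meant to flag.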
Srivastava, Alka; Balaji, Petety V
2015-12-01
This study probes the early events during lag phase of aggregation of GNNQQNY using all atom MD simulations in explicit solvent. Simulations were performed by varying system size, temperature and starting configuration. Peptides dispersed randomly in the simulation box come together early on in the simulation and form aggregates. These aggregates are dynamic implying the absence of stabilizing interactions. This facilitates the exploration of alternate arrangements. The constituent peptides sample a variety of conformations, frequently re-orient and re-arrange with respect to each other and dissociate from/re-associate with the aggregate. The size and lifetime of aggregates vary depending upon the number of inter-peptide backbone H-bonds. Most of the aggregates formed are amorphous but crystalline aggregates of smaller size (mainly 2-mers) do appear and sustain for varying durations of time. The peptides in crystalline 2-mers are mostly anti-parallel. The largest crystalline aggregate that appears is a 4-mer in a single sheet and a 4-, 5-, or 6-mer in double layered arrangement. Crystalline aggregates grow either by the sequential addition of peptides, or by the head-on or lateral collision-adhesion of 2-mers. The formation of various smaller aggregates suggests the polymorphic nature of oligomers and heterogeneity in the lag phase. Copyright © 2015 Elsevier Inc. All rights reserved.
The efficacy of respondent-driven sampling for the health assessment of minority populations.
Badowski, Grazyna; Somera, Lilnabeth P; Simsiman, Brayan; Lee, Hye-Ryeon; Cassel, Kevin; Yamanaka, Alisha; Ren, JunHao
2017-10-01
Respondent driven sampling (RDS) is a relatively new network sampling technique typically employed for hard-to-reach populations. Like snowball sampling, initial respondents or "seeds" recruit additional respondents from their network of friends. Under certain assumptions, the method promises to produce a sample independent from the biases that may have been introduced by the non-random choice of "seeds." We conducted a survey on health communication in Guam's general population using the RDS method, the first survey that has utilized this methodology in Guam. It was conducted in hopes of identifying a cost-efficient non-probability sampling strategy that could generate reasonable population estimates for both minority and general populations. RDS data was collected in Guam in 2013 (n=511) and population estimates were compared with 2012 BRFSS data (n=2031) and the 2010 census data. The estimates were calculated using the unweighted RDS sample and the weighted sample using RDS inference methods and compared with known population characteristics. The sample size was reached in 23 days, providing evidence that the RDS method is a viable, cost-effective data collection method, which can provide reasonable population estimates. However, the results also suggest that the RDS inference methods used to reduce bias, based on self-reported estimates of network sizes, may not always work. Caution is needed when interpreting RDS study findings. For a more diverse sample, data collection should not be conducted in just one location. Fewer questions about network estimates should be asked, and more careful consideration should be given to the kind of incentives offered to participants. Copyright © 2017. Published by Elsevier Ltd.
Jorgenson, Andrew K.; Clark, Brett
2013-01-01
This study examines the regional and temporal differences in the statistical relationship between national-level carbon dioxide emissions and national-level population size. The authors analyze panel data from 1960 to 2005 for a diverse sample of nations, and employ descriptive statistics and rigorous panel regression modeling techniques. Initial descriptive analyses indicate that all regions experienced overall increases in carbon emissions and population size during the 45-year period of investigation, but with notable differences. For carbon emissions, the sample of countries in Asia experienced the largest percent increase, followed by countries in Latin America, Africa, and lastly the sample of relatively affluent countries in Europe, North America, and Oceania combined. For population size, the sample of countries in Africa experienced the largest percent increase, followed by countries in Latin America, Asia, and the combined sample of countries in Europe, North America, and Oceania. Findings for two-way fixed effects panel regression elasticity models of national-level carbon emissions indicate that the estimated elasticity coefficient for population size is much smaller for nations in Africa than for nations in other regions of the world. Regarding potential temporal changes, from 1960 to 2005 the estimated elasticity coefficient for population size decreased by 25% for the sample of Africa countries, 14% for the sample of Asia countries, 6.5% for the sample of Latin America countries, but remained the same in size for the sample of countries in Europe, North America, and Oceania. Overall, while population size continues to be the primary driver of total national-level anthropogenic carbon dioxide emissions, the findings for this study highlight the need for future research and policies to recognize that the actual impacts of population size on national-level carbon emissions differ across both time and region. PMID:23437323
NASA Astrophysics Data System (ADS)
Norris, R.; Miller, N.; Wassenaar, L.; Hobson, K.
2010-12-01
Each spring, millions of monarch butterflies (Danaus plexippus) migrate up to 3000 km from central Mexico to re-colonize eastern North America. However, despite centuries of research, the patterns of re-colonization are not well understood. We combined stable-hydrogen (δD) and -carbon (δ13C) isotope measurements with demographic models to test (1) whether individuals sampled in the northern part of the breeding range in the Great Lakes originate directly from Mexico or are second generation individuals born in the southern US and (2) to estimate whether populations on the eastern seaboard migrate longitudinally over the Appalachians or originate directly from the Gulf Coast. In the Great Lakes, we found that the majority of individuals were second-generation monarchs born in the Gulf Coast and Central regions of the US. However, 25% of individuals originated directly from Mexico and we estimated that these individuals produced the majority of offspring born in the Great Lakes region during June. On the eastern seaboard, we found the majority of monarchs (88%) originated in the mid-west and Great Lakes regions, providing the first direct evidence that second generation monarchs born in June complete a (trans-) longitudinal migration across the Appalachian mountains. The remaining individuals (12%) originated from parents that migrated directly from the Gulf coast during early spring. Our results demonstrate how stable isotopes, when combined with ecological data, can provide insights into patterns of connectivity in migratory insects that have been impossible to test using conventional techniques. The migration patterns presented here have important implications for predicting future changes in population size and for developing effective conservation plans for this species.
Nixon, Richard M; Wonderling, David; Grieve, Richard D
2010-03-01
Cost-effectiveness analyses (CEA) alongside randomised controlled trials commonly estimate incremental net benefits (INB), with 95% confidence intervals, and compute cost-effectiveness acceptability curves and confidence ellipses. Two alternative non-parametric methods for estimating INB are to apply the central limit theorem (CLT) or to use the non-parametric bootstrap method, although it is unclear which method is preferable. This paper describes the statistical rationale underlying each of these methods and illustrates their application with a trial-based CEA. It compares the sampling uncertainty from using either technique in a Monte Carlo simulation. The experiments are repeated varying the sample size and the skewness of costs in the population. The results showed that, even when data were highly skewed, both methods accurately estimated the true standard errors (SEs) when sample sizes were moderate to large (n>50), and also gave good estimates for small data sets with low skewness. However, when sample sizes were relatively small and the data highly skewed, using the CLT rather than the bootstrap led to slightly more accurate SEs. We conclude that while in general using either method is appropriate, the CLT is easier to implement, and provides SEs that are at least as accurate as the bootstrap. (c) 2009 John Wiley & Sons, Ltd.
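The two standard-error calculations compared in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: per-patient net benefit is taken as willingness-to-pay times effect minus cost, and the function names, the `wtp` parameter, and the number of bootstrap replicates are hypothetical.

```python
import math
import random
import statistics


def net_benefit(effects, costs, wtp):
    """Per-patient net benefit: wtp * effect - cost (wtp is an assumed threshold)."""
    return [wtp * e - c for e, c in zip(effects, costs)]


def inb(nb_treat, nb_control):
    """Incremental net benefit: difference in mean net benefit between arms."""
    return statistics.mean(nb_treat) - statistics.mean(nb_control)


def clt_se(nb_treat, nb_control):
    """Standard error of the INB from the usual two-sample CLT formula."""
    return math.sqrt(
        statistics.variance(nb_treat) / len(nb_treat)
        + statistics.variance(nb_control) / len(nb_control)
    )


def bootstrap_se(nb_treat, nb_control, n_boot=2000, seed=0):
    """Standard error of the INB from a non-parametric bootstrap.

    Each arm is resampled with replacement, independently of the other arm.
    """
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        bt = rng.choices(nb_treat, k=len(nb_treat))
        bc = rng.choices(nb_control, k=len(nb_control))
        reps.append(inb(bt, bc))
    return statistics.stdev(reps)
```

With a moderately large sample the two estimates should be close to each other, which is consistent with the abstract's conclusion that either method is generally appropriate.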
Brummitt, Neil; Bachman, Steven P.; Aletrari, Elina; Chadburn, Helen; Griffiths-Lee, Janine; Lutz, Maiko; Moat, Justin; Rivers, Malin C.; Syfert, Mindy M.; Nic Lughadha, Eimear M.
2015-01-01
The IUCN Sampled Red List Index (SRLI) is a policy response by biodiversity scientists to the need to estimate trends in extinction risk of the world's diminishing biological diversity. Assessments of plant species for the SRLI project rely predominantly on herbarium specimen data from natural history collections, in the overwhelming absence of accurate population data or detailed distribution maps for the vast majority of plant species. This creates difficulties in re-assessing these species so as to measure genuine changes in conservation status, which must be observed under the same Red List criteria in order to be distinguished from an increase in the knowledge available for that species, and thus re-calculate the SRLI. However, the same specimen data identify precise localities where threatened species have previously been collected and can be used to model species ranges and to target fieldwork in order to test specimen-based range estimates and collect population data for SRLI plant species. Here, we outline a strategy for prioritizing fieldwork efforts in order to apply a wider range of IUCN Red List criteria to assessments of plant species, or any taxa with detailed locality or natural history specimen data, to produce a more robust estimation of the SRLI. PMID:25561676
Carleton, R. Drew; Heard, Stephen B.; Silk, Peter J.
2013-01-01
Estimation of pest density is a basic requirement for integrated pest management in agriculture and forestry, and efficiency in density estimation is a common goal. Sequential sampling techniques promise efficient sampling, but their application can involve cumbersome mathematics and/or intensive warm-up sampling when pests have complex within- or between-site distributions. We provide tools for assessing the efficiency of sequential sampling and of alternative, simpler sampling plans, using computer simulation with “pre-sampling” data. We illustrate our approach using data for balsam gall midge (Paradiplosis tumifex) attack in Christmas tree farms. Paradiplosis tumifex proved recalcitrant to sequential sampling techniques. Midge distributions could not be fit by a common negative binomial distribution across sites. Local parameterization, using warm-up samples to estimate the clumping parameter k for each site, performed poorly: k estimates were unreliable even for samples of n∼100 trees. These methods were further confounded by significant within-site spatial autocorrelation. Much simpler sampling schemes, involving random or belt-transect sampling to preset sample sizes, were effective and efficient for P. tumifex. Sampling via belt transects (through the longest dimension of a stand) was the most efficient, with sample means converging on true mean density for sample sizes of n∼25–40 trees. Pre-sampling and simulation techniques provide a simple method for assessing sampling strategies for estimating insect infestation. We suspect that many pests will resemble P. tumifex in challenging the assumptions of sequential sampling methods. Our software will allow practitioners to optimize sampling strategies before they are brought to real-world applications, while potentially avoiding the need for the cumbersome calculations required for sequential sampling methods. PMID:24376556
NASA Astrophysics Data System (ADS)
Al-Mudhafar, W. J.
2013-12-01
Precise prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships used to estimate the properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution in order to build an accurate reservoir model for optimal future reservoir performance. In this paper, the facies estimation has been done through Multinomial logistic regression (MLR) with respect to the well logs and core data in a well in the upper sandstone formation of the South Rumaila oil field. The independent variables are gamma ray, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. First, the Robust Sequential Imputation Algorithm was used to impute the missing data. This algorithm starts from a complete subset of the dataset and estimates sequentially the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then, the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to estimate the maximum likelihood and minimize the standard error for the nonlinear relationships between facies and the core and log data. The MLR is used to predict the probabilities of the different possible facies given each independent variable by constructing a linear predictor function having a set of weights that are linearly combined with the independent variables by using a dot product. A beta distribution of facies has been considered as prior knowledge, and the resulting predicted probability (posterior) has been estimated from MLR based on Bayes' theorem, which relates the predicted probability (posterior) to the conditional probability and the prior knowledge. 
To assess the statistical accuracy of the model, the bootstrap should be carried out to estimate extra-sample prediction error by randomly drawing datasets with replacement from the training data. Each sample has the same size as the original training set, and the draw can be repeated N times to produce N bootstrap datasets on which the model is re-fitted, decreasing the squared difference between the estimated and observed categorical variables (facies) and thereby reducing the degree of uncertainty.
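The prediction step of multinomial logistic regression described above (a dot product of weights and predictors per facies class, turned into class probabilities) can be sketched as below. The weights, biases, and inputs are placeholders for illustration; nothing here reproduces the fitted model or data from the study.

```python
import math


def linear_scores(weights, biases, x):
    """One linear predictor per class: bias + dot(weights_k, x)."""
    return [
        b + sum(w_i * x_i for w_i, x_i in zip(w, x))
        for w, b in zip(weights, biases)
    ]


def softmax(scores):
    """Convert class scores to probabilities (max subtracted for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class_probs(weights, biases, x):
    """Class probabilities for one observation x under a multinomial logit model."""
    return softmax(linear_scores(weights, biases, x))


if __name__ == "__main__":
    # Hypothetical 3-class model with two predictors (e.g. two scaled log readings).
    weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    biases = [0.0, 0.0, 0.0]
    print(predict_class_probs(weights, biases, [2.0, 0.5]))
```

In practice the weights would be obtained by maximum likelihood on the training data, and the bootstrap re-fitting described in the abstract would repeat that fit on each resampled dataset.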
Discovery of Taeniid Eggs from A 17th Century Tomb in Korea
Lee, Hye-Jung; Shin, Dong-Hoon
2011-01-01
Even though Taenia spp. eggs are occasionally discovered from archeological remains around the world, these eggs have never been discovered in ancient samples from Korea. When we attempted to re-examine the archeological samples maintained in our collection, the eggs of Taenia spp., 5 in total number, were recovered from a tomb of Gongju-si. The eggs had a radially striated embryophore and were 37.5-40.0 µm × 37.5 µm in size. This is the first report on taeniid eggs from ancient samples of Korea, and it is suggested that intensive examination of voluminous archeological samples should be needed for identification of Taenia spp. PMID:22072839
Discovery of taeniid eggs from a 17th century tomb in Korea.
Lee, Hye-Jung; Shin, Dong-Hoon; Seo, Min
2011-09-01
Even though Taenia spp. eggs are occasionally discovered from archeological remains around the world, these eggs have never been discovered in ancient samples from Korea. When we attempted to re-examine the archeological samples maintained in our collection, the eggs of Taenia spp., 5 in total number, were recovered from a tomb of Gongju-si. The eggs had a radially striated embryophore and were 37.5-40.0 µm × 37.5 µm in size. This is the first report on taeniid eggs from ancient samples of Korea, and it is suggested that intensive examination of voluminous archeological samples should be needed for identification of Taenia spp.
Karanth, K.Ullas; Chundawat, Raghunandan S.; Nichols, James D.; Kumar, N. Samba
2004-01-01
Tropical dry-deciduous forests comprise more than 45% of the tiger (Panthera tigris) habitat in India. However, in the absence of rigorously derived estimates of ecological densities of tigers in dry forests, critical baseline data for managing tiger populations are lacking. In this study tiger densities were estimated using photographic capture–recapture sampling in the dry forests of Panna Tiger Reserve in Central India. Over a 45-day survey period, 60 camera trap sites were sampled in a well-protected part of the 542-km2 reserve during 2002. A total sampling effort of 914 camera-trap-days yielded photo-captures of 11 individual tigers over 15 sampling occasions that effectively covered a 418-km2 area. The closed capture–recapture model Mh, which incorporates individual heterogeneity in capture probabilities, fitted these photographic capture history data well. The estimated capture probability per sample, p̂ = 0.04, resulted in an estimated tiger population size and standard error, N̂ (SÊ[N̂]), of 29 (9.65), and a density, D̂ (SÊ[D̂]), of 6.94 (3.23) tigers/100 km2. The estimated tiger density matched predictions based on prey abundance. Our results suggest that, if managed appropriately, the available dry forest habitat in India has the potential to support a population size of about 9000 wild tigers.
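The reported density point estimate can be checked directly from the figures in the abstract, assuming density is simply the estimated population size divided by the effectively sampled area (the symbol A for that area is introduced here for illustration):

```latex
\[
\hat{D} \;=\; \frac{\hat{N}}{A} \times 100
       \;=\; \frac{29}{418\ \mathrm{km}^2} \times 100
       \;\approx\; 6.94\ \text{tigers per } 100\ \mathrm{km}^2
\]
```

Rescaling the standard error of the population size in the same way gives about 2.31 (9.65 / 418 × 100), which is smaller than the reported 3.23; one plausible reading, not stated in the abstract, is that the density standard error also reflects uncertainty in the effectively sampled area.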
Population growth of Yellowstone grizzly bears: Uncertainty and future monitoring
Harris, R.B.; White, Gary C.; Schwartz, C.C.; Haroldson, M.A.
2007-01-01
Grizzly bears (Ursus arctos) in the Greater Yellowstone Ecosystem of the US Rocky Mountains have recently increased in numbers, but remain vulnerable due to isolation from other populations and predicted reductions in favored food resources. Harris et al. (2006) projected how this population might fare in the future under alternative survival rates, and in doing so estimated the rate of population growth, 1983–2002. We address issues that remain from that earlier work: (1) the degree of uncertainty surrounding our estimates of the rate of population change (λ); (2) the effect of correlation among demographic parameters on these estimates; and (3) how a future monitoring system using counts of females accompanied by cubs might usefully differentiate between short-term, expected, and inconsequential fluctuations versus a true change in system state. We used Monte Carlo re-sampling of beta distributions derived from the demographic parameters used by Harris et al. (2006) to derive distributions of λ during 1983–2002 given our sampling uncertainty. Approximate 95% confidence intervals were 0.972–1.096 (assuming females with unresolved fates died) and 1.008–1.115 (with unresolved females censored at last contact). We used well-supported models of Haroldson et al. (2006) and Schwartz et al. (2006a,b,c) to assess the strength of correlations among demographic processes and the effect of omitting them in projection models. Incorporating correlations among demographic parameters yielded point estimates of λ that were nearly identical to those from the earlier model that omitted correlations, but yielded wider confidence intervals surrounding λ. Finally, we suggest that fitting linear and quadratic curves to the trend suggested by the estimated number of females with cubs in the ecosystem, and using AICc model weights to infer population sizes and λ provides an objective means to monitoring approximate population trajectories in addition to demographic analysis.
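The general idea of Monte Carlo re-sampling from beta distributions can be sketched as follows. Everything specific here is an assumption for illustration: the vital-rate means and standard deviations are invented, the draws are independent (that is, correlations among parameters are omitted), and the one-line growth formula is a toy stand-in rather than the demographic model used in the study.

```python
import random


def beta_params(mean, sd):
    """Method-of-moments beta parameters (a, b) for a rate with given mean and sd."""
    common = mean * (1 - mean) / sd ** 2 - 1
    if common <= 0:
        raise ValueError("sd is too large for a beta distribution with this mean")
    return mean * common, (1 - mean) * common


def percentile_interval(draws, level=0.95):
    """Simple percentile interval from a list of simulated values."""
    s = sorted(draws)
    lo_idx = int((1 - level) / 2 * (len(s) - 1))
    hi_idx = int((1 + level) / 2 * (len(s) - 1))
    return s[lo_idx], s[hi_idx]


def simulate_lambda(n_draws=10000, seed=0):
    """Draw vital rates from beta distributions and compute a toy growth rate."""
    rng = random.Random(seed)
    a_s, b_s = beta_params(0.95, 0.02)  # hypothetical adult female survival
    a_j, b_j = beta_params(0.60, 0.08)  # hypothetical juvenile survival
    a_f, b_f = beta_params(0.30, 0.05)  # hypothetical female young per female per year
    draws = []
    for _ in range(n_draws):
        s_adult = rng.betavariate(a_s, b_s)
        s_juv = rng.betavariate(a_j, b_j)
        fec = rng.betavariate(a_f, b_f)
        # Toy growth rate: surviving adults plus surviving recruits (not the study's model).
        draws.append(s_adult + fec * s_juv)
    return draws


if __name__ == "__main__":
    lam = simulate_lambda()
    print("approximate 95% interval for lambda:", percentile_interval(lam))
```

Introducing correlation among the drawn rates would, as the abstract reports for the real analysis, be expected to change the width of the interval more than its center.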
Health and re-employment in a two year follow up of long term unemployed.
Claussen, B; Bjørndal, A; Hjort, P F
1993-01-01
STUDY OBJECTIVE--The aim was to examine re-employment and changes in health during a two year follow up of a representative sample of long term unemployed. DESIGN--This was a cross sectional study and a two year follow up. Health was measured by psychometric testing, Hopkins symptom checklist, General health questionnaire, and medical examination. Health related selection to continuous unemployment and recovery by re-employment was estimated by logistic regression with covariates deduced from the labour market theories of human capital and segmented labour market. SETTING--Four municipalities in Grenland, southern Norway. SUBJECTS--Participants were a random sample of 17 to 63 year old people registered as unemployed for more than 12 weeks. MAIN RESULTS--In the cross sectional study, the prevalence of depression, anxiety, and somatic illness was from four to 10 times higher than in a control group of employed people. In the follow up study, there was considerable health related selection to re-employment. A psychiatric diagnosis was associated with a 70% reduction in chances of obtaining a job. Normal performance on psychometric testing showed a two to three times increased chance of re-employment. Recovery of health following re-employment was less than expected from previous studies. CONCLUSIONS--Health related selection to long term unemployment seems to explain a substantial part of the excess mental morbidity among unemployed people. An increased proportion of the long term unemployed will be vocationally handicapped as years pass, putting a heavy burden on social services. PMID: 8436885
2010-03-01
sufficient replications often lead to models that lack precision in error estimation and thus imprecision in corresponding conclusions. This work develops ... Preface: This work is dedicated to all who gave and continue to give in order for me to achieve some semblance of success. Benjamin M. Lee ... develop, examine and test methodologies for analyzing test results from split-plot designs. In particular, this work determines the applicability
Rosenblum, Michael A; Laan, Mark J van der
2009-01-07
The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study "Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey," by Burnham et al. (2006).
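To make the tail-bound idea concrete, the sketch below inverts Bernstein's inequality to get a confidence interval for a mean of bounded observations. It assumes the data are independent and lie in a known range, and it uses the worst-case variance for that range; these are illustrative assumptions, and the function names are hypothetical. The paper's own constructions, in particular its narrower intervals, may differ.

```python
import math

def bernstein_half_width(n, variance_bound, max_dev, alpha=0.05):
    """Half-width t at which Bernstein's two-sided bound equals alpha.

    Bernstein's inequality for independent observations with
    |X_i - mu| <= max_dev and Var(X_i) <= variance_bound:
        P(|mean - mu| >= t) <= 2 * exp(-n t^2 / (2 var + 2 M t / 3)).
    Setting the right-hand side to alpha gives a quadratic in t.
    """
    log_term = math.log(2.0 / alpha)
    b = max_dev * log_term / (3.0 * n)
    return b + math.sqrt(b * b + 2.0 * variance_bound * log_term / n)

def bernstein_interval(sample, lower, upper, alpha=0.05):
    """Conservative interval for the mean of data known to lie in [lower, upper]."""
    n = len(sample)
    mean = sum(sample) / n
    width = upper - lower
    # Worst case for a variable on [lower, upper]: variance <= width^2 / 4,
    # and any deviation from the mean is at most width.
    t = bernstein_half_width(n, width ** 2 / 4.0, width, alpha)
    return max(lower, mean - t), min(upper, mean + t)

# Hypothetical binary survey responses.
responses = [0, 1, 1, 0, 1, 1, 1, 0, 1, 1]
print(bernstein_interval(responses, 0.0, 1.0))
```

With only ten observations the interval is very wide, which matches the drawback the abstract describes for intervals built directly from tail bounds.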
DS — Software for analyzing data collected using double sampling
Bart, Jonathan; Hartley, Dana
2011-01-01
DS analyzes count data to estimate density or relative density and population size when appropriate. The software is available at http://iwcbm.dev4.fsr.com/IWCBM/default.asp?PageID=126. The software was designed to analyze data collected using double sampling, but it also can be used to analyze index data. DS is not currently configured to apply distance methods or methods based on capture-recapture theory. Double sampling for the purpose of this report means surveying a sample of locations with a rapid method of unknown accuracy and surveying a subset of these locations using a more intensive method assumed to yield unbiased estimates. "Detection ratios" are calculated as the ratio of results from rapid surveys on intensive plots to the number actually present as determined from the intensive surveys. The detection ratios are used to adjust results from the rapid surveys. The formula for density is (results from rapid survey)/(estimated detection ratio from intensive surveys). Population sizes are estimated as (density)(area). Double sampling is well-established in the survey sampling literature—see Cochran (1977) for the basic theory, Smith (1995) for applications of double sampling in waterfowl surveys, Bart and Earnst (2002, 2005) for discussions of its use in wildlife studies, and Bart and others (in press) for a detailed account of how the method was used to survey shorebirds across the arctic region of North America. Indices are surveys that do not involve complete counts of well-defined plots or recording information to estimate detection rates (Thompson and others, 1998). In most cases, such data should not be used to estimate density or population size but, under some circumstances, may be used to compare two densities or estimate how density changes through time or across space (Williams and others, 2005). The Breeding Bird Survey (Sauer and others, 2008) provides a good example of an index survey. 
Surveyors record all birds detected but do not record any information, such as distance or whether each bird is recorded in subperiods, that could be used to estimate detection rates. Nonetheless, the data are widely used to estimate temporal trends and spatial patterns in abundance (Sauer and others, 2008). DS produces estimates of density (or relative density for indices) by species and stratum. Strata are usually defined using region and habitat but other variables may be used, and the entire study area may be classified as a single stratum. Population size in each stratum and for the entire study area also is estimated for each species. For indices, the estimated totals generally are only useful if (a) plots are surveyed so that densities can be calculated and extrapolated to the entire study area and (b) if the detection rates are close to 1.0. All estimates are accompanied by standard errors (SE) and coefficients of variation (CV, that is, SE/estimate).
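The density and population-size formulas quoted in this abstract can be written out as a short sketch. The plot counts, the area, and the function names are hypothetical, and pooling counts across intensive plots before taking the ratio is an assumption; this is not the DS software's code, and it omits the standard errors that DS reports.

```python
def detection_ratio(rapid_counts_on_intensive_plots, intensive_counts):
    """Rapid-survey results on intensive plots divided by the number actually present."""
    return sum(rapid_counts_on_intensive_plots) / sum(intensive_counts)

def adjusted_density(rapid_density, ratio):
    """(results from rapid survey) / (estimated detection ratio)."""
    return rapid_density / ratio

def population_size(density, area):
    """(density)(area)."""
    return density * area

# Hypothetical data: three intensive plots surveyed with both methods.
ratio = detection_ratio([8, 5, 7], [10, 7, 8])   # 20 / 25
density = adjusted_density(4.0, ratio)           # rapid surveys gave 4.0 birds per km^2
total = population_size(density, 120.0)          # stratum of 120 km^2
print(ratio, density, total)
```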
Sampling plantations to determine white-pine weevil injury
Robert L. Talerico; Robert W. Wilson, Jr.
1973-01-01
Use of 1/10-acre square plots to obtain estimates of the proportion of never-weeviled trees necessary for evaluating and scheduling white-pine weevil control is described. The optimum number of trees to observe per plot is estimated from data obtained from sample plantations in the Northeast, and a table is given of sample size required to achieve a standard error of...
Spatially explicit dynamic N-mixture models
Zhao, Qing; Royle, Andy; Boomer, G. Scott
2017-01-01
Knowledge of demographic parameters such as survival, reproduction, emigration, and immigration is essential to understand metapopulation dynamics. Traditionally the estimation of these demographic parameters requires intensive data from marked animals. The development of dynamic N-mixture models makes it possible to estimate demographic parameters from count data of unmarked animals, but the original dynamic N-mixture model does not distinguish emigration and immigration from survival and reproduction, limiting its ability to explain important metapopulation processes such as movement among local populations. In this study we developed a spatially explicit dynamic N-mixture model that estimates survival, reproduction, emigration, local population size, and detection probability from count data under the assumption that movement only occurs among adjacent habitat patches. Simulation studies showed that the inference of our model depends on detection probability, local population size, and the implementation of robust sampling design. Our model provides reliable estimates of survival, reproduction, and emigration when detection probability is high, regardless of local population size or the type of sampling design. When detection probability is low, however, our model only provides reliable estimates of survival, reproduction, and emigration when local population size is moderate to high and robust sampling design is used. A sensitivity analysis showed that our model is robust against the violation of the assumption that movement only occurs among adjacent habitat patches, suggesting wide applications of this model. Our model can be used to improve our understanding of metapopulation dynamics based on count data that are relatively easy to collect in many systems.
Cherry, S.; White, G.C.; Keating, K.A.; Haroldson, Mark A.; Schwartz, Charles C.
2007-01-01
Current management of the grizzly bear (Ursus arctos) population in Yellowstone National Park and surrounding areas requires annual estimation of the number of adult female bears with cubs-of-the-year. We examined the performance of nine estimators of population size via simulation. Data were simulated using two methods for different combinations of population size, sample size, and coefficient of variation of individual sighting probabilities. We show that the coefficient of variation does not, by itself, adequately describe the effects of capture heterogeneity, because two different distributions of capture probabilities can have the same coefficient of variation. All estimators produced biased estimates of population size with bias decreasing as effort increased. Based on the simulation results we recommend the Chao estimator for model Mh be used to estimate the number of female bears with cubs-of-the-year; however, the estimator of Chao and Shen may also be useful depending on the goals of the research.
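For readers unfamiliar with the recommended estimator, here is a minimal sketch of the commonly cited Chao lower-bound form for model Mh, which uses the numbers of individuals sighted exactly once (f1) and exactly twice (f2). The sighting data and the function name are hypothetical, and the fallback used when f2 is zero is the usual bias-corrected variant; the abstract does not state which variant the authors simulated.

```python
from collections import Counter

def chao_mh_estimate(sightings_per_individual):
    """Chao-type estimate of population size from sighting frequencies.

    sightings_per_individual: number of sightings for each distinct
    individual observed at least once.
    """
    counts = [c for c in sightings_per_individual if c > 0]
    observed = len(counts)            # distinct individuals seen
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]         # seen exactly once / exactly twice
    if f2 == 0:
        # Bias-corrected form f1 (f1 - 1) / (2 (f2 + 1)) evaluated at f2 = 0,
        # which avoids dividing by zero.
        return observed + f1 * (f1 - 1) / 2.0
    return observed + f1 * f1 / (2.0 * f2)

# Hypothetical season: eight distinct females, four seen once, two seen twice.
print(chao_mh_estimate([1, 1, 1, 1, 2, 2, 3, 5]))  # 8 + 16 / 4 = 12.0
```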
Change-in-ratio methods for estimating population size
Udevitz, Mark S.; Pollock, Kenneth H.; McCullough, Dale R.; Barrett, Reginald H.
2002-01-01
Change-in-ratio (CIR) methods can provide an effective, low cost approach for estimating the size of wildlife populations. They rely on being able to observe changes in proportions of population subclasses that result from the removal of a known number of individuals from the population. These methods were first introduced in the 1940’s to estimate the size of populations with 2 subclasses under the assumption of equal subclass encounter probabilities. Over the next 40 years, closed population CIR models were developed to consider additional subclasses and use additional sampling periods. Models with assumptions about how encounter probabilities vary over time, rather than between subclasses, also received some attention. Recently, all of these CIR models have been shown to be special cases of a more general model. Under the general model, information from additional samples can be used to test assumptions about the encounter probabilities and to provide estimates of subclass sizes under relaxations of these assumptions. These developments have greatly extended the applicability of the methods. CIR methods are attractive because they do not require the marking of individuals, and subclass proportions often can be estimated with relatively simple sampling procedures. However, CIR methods require a carefully monitored removal of individuals from the population, and the estimates will be of poor quality unless the removals induce substantial changes in subclass proportions. In this paper, we review the state of the art for closed population estimation with CIR methods. Our emphasis is on the assumptions of CIR methods and on identifying situations where these methods are likely to be effective. We also identify some important areas for future CIR research.
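The simplest case mentioned above, two subclasses with equal encounter probabilities, can be sketched as follows. The formula is the classical two-sample CIR estimator, which follows from equating the post-removal proportion with (X1 - Rx) / (N1 - R); it is not quoted from this abstract, and the numbers and function name are hypothetical.

```python
def cir_population_estimate(p1, p2, removed_x, removed_total):
    """Pre-removal population size from a change in subclass proportion.

    p1, p2        : proportion of subclass x before and after the removal
    removed_x     : number of subclass-x individuals removed
    removed_total : total number of individuals removed
    """
    if p1 == p2:
        raise ValueError("removal must change the subclass proportion")
    return (removed_x - removed_total * p2) / (p1 - p2)

# Hypothetical check: 1000 animals, 400 of subclass x (p1 = 0.4).
# Removing 200 animals, 150 of them subclass x, leaves 250 / 800 = 0.3125.
print(cir_population_estimate(0.4, 0.3125, 150, 200))
```

The denominator shows why the abstract stresses substantial changes in subclass proportions: when p1 and p2 are close, small sampling errors in either proportion produce large errors in the estimate.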
NASA Technical Reports Server (NTRS)
Sielken, R. L., Jr. (Principal Investigator)
1981-01-01
Several methods of estimating individual crop acreages using a mixture of completely identified and partially identified (generic) segments from a single growing year are derived and discussed. A small Monte Carlo study of eight estimators is presented. The relative empirical behavior of these estimators is discussed, as are the effects of segment sample size and amount of partial identification. The principal recommendations are (1) to not exclude, but rather incorporate, partially identified sample segments into the estimation procedure, (2) to try to avoid having a large percentage (say 80%) of only partially identified segments in the sample, and (3) to use the maximum likelihood estimator, although the weighted least squares estimator and least squares ratio estimator both perform almost as well. Sets of spring small grains (North Dakota) data were used.
Minetti, Andrea; Riera-Montes, Margarita; Nackers, Fabienne; Roederer, Thomas; Koudika, Marie Hortense; Sekkenes, Johanne; Taconet, Aurore; Fermon, Florence; Touré, Albouhary; Grais, Rebecca F; Checchi, Francesco
2012-10-12
Estimation of vaccination coverage at the local level is essential to identify communities that may require additional support. Cluster surveys can be used in resource-poor settings, when population figures are inaccurate. To be feasible, cluster samples need to be small, without losing robustness of results. The clustered LQAS (CLQAS) approach has been proposed as an alternative, as smaller sample sizes are required. We explored (i) the efficiency of cluster surveys of decreasing sample size through bootstrapping analysis and (ii) the performance of CLQAS under three alternative sampling plans to classify local VC, using data from a survey carried out in Mali after mass vaccination against meningococcal meningitis group A. VC estimates provided by a 10 × 15 cluster survey design were reasonably robust. We used them to classify health areas in three categories and guide mop-up activities: i) health areas not requiring supplemental activities; ii) health areas requiring additional vaccination; iii) health areas requiring further evaluation. As sample size decreased (from 10 × 15 to 10 × 3), standard error of VC and ICC estimates were increasingly unstable. Results of CLQAS simulations were not accurate for most health areas, with an overall risk of misclassification greater than 0.25 in one health area out of three. It was greater than 0.50 in one health area out of two under two of the three sampling plans. Small sample cluster surveys (10 × 15) are acceptably robust for classification of VC at local level. We do not recommend the CLQAS method as currently formulated for evaluating vaccination programmes.
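A cluster-level bootstrap of the kind referred to in (i) might look like the sketch below: whole clusters are resampled with replacement and coverage is recomputed each time. The simulated 10 × 15 survey, the seeds, and the function name are assumptions for illustration; the abstract does not give the authors' resampling details.

```python
import numpy as np

def cluster_bootstrap_se(clusters, n_boot=2000, seed=0):
    """Bootstrap standard error of a pooled proportion, resampling whole clusters."""
    rng = np.random.default_rng(seed)
    clusters = [np.asarray(c, dtype=float) for c in clusters]
    k = len(clusters)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        picked = rng.integers(0, k, size=k)      # draw k clusters with replacement
        pooled = np.concatenate([clusters[i] for i in picked])
        estimates[b] = pooled.mean()
    return float(estimates.std(ddof=1))

# Hypothetical 10 x 15 survey: 1 = vaccinated, 0 = not vaccinated.
rng = np.random.default_rng(1)
cluster_rates = rng.uniform(0.6, 0.95, size=10)
survey = [rng.binomial(1, p, size=15) for p in cluster_rates]

coverage = float(np.mean(np.concatenate(survey)))
se = cluster_bootstrap_se(survey)
print(coverage, se)
```

Repeating the calculation after truncating each cluster to fewer respondents would show how the standard error behaves as the per-cluster sample shrinks.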
Assessing relative abundance and reproductive success of shrubsteppe raptors
Lehman, Robert N.; Carpenter, L.B.; Steenhof, Karen; Kochert, Michael N.
1998-01-01
From 1991-1994, we quantified relative abundance and reproductive success of the Ferruginous Hawk (Buteo regalis), Northern Harrier (Circus cyaneus), Burrowing Owl (Speotyto cunicularia), and Short-eared Owl (Asio flammeus) on the shrubsteppe plateaus (benchlands) in and near the Snake River Birds of Prey National Conservation Area in southwestern Idaho. To assess relative abundance, we searched randomly selected plots using four sampling methods: point counts, line transects, and quadrats of two sizes. On a per-sampling-effort basis, transects were slightly more effective than point counts and quadrats for locating raptor nests (3.4 pairs detected/100 h of effort vs. 2.2-3.1 pairs). Random sampling using quadrats failed to detect a Short-eared Owl population increase from 1993 to 1994. To evaluate nesting success, we tried to determine reproductive outcome for all nesting attempts located during random, historical, and incidental nest searches. We compared nesting success estimates based on all nesting attempts, on attempts found during incubation, and the Mayfield model. Most pairs used to evaluate success were pairs found incidentally. Visits to historical nesting areas yielded the highest number of pairs per sampling effort (14.6/100 h), but reoccupancy rates for most species decreased through time. Estimates based on all attempts had the highest sample sizes but probably overestimated success for all species except the Ferruginous Hawk. Estimates of success based on nesting attempts found during incubation had the lowest sample sizes. All three methods yielded biased nesting success estimates for the Northern Harrier and Short-eared Owl. The estimate based on pairs found during incubation probably provided the least biased estimate for the Burrowing Owl. Assessments of nesting success were hindered by difficulties in confirming egg laying and nesting success for all species except the Ferruginous Hawk.
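The Mayfield model named above estimates nesting success from a daily survival rate rather than from the raw fraction of successful nests. A minimal sketch of the basic calculation follows; the failure count, exposure days, and nesting-period length are made-up values, and the function names are hypothetical.

```python
def mayfield_daily_survival(failed_nests, exposure_days):
    """Daily survival rate: 1 - failures per nest-day of observation."""
    return 1.0 - failed_nests / exposure_days

def mayfield_nest_success(failed_nests, exposure_days, nesting_period_days):
    """Probability that a nest survives the whole nesting period."""
    dsr = mayfield_daily_survival(failed_nests, exposure_days)
    return dsr ** nesting_period_days

# Hypothetical data: 5 failures over 500 nest-days, 30-day nesting period.
dsr = mayfield_daily_survival(5, 500)
success = mayfield_nest_success(5, 500, 30)
print(dsr, round(success, 3))
```

Because exposure is counted only for the days a nest was under observation, nests found late in the cycle do not inflate the estimate in the way they can when success is computed over all attempts.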
Neeser, Rudolph; Ackermann, Rebecca Rogers; Gain, James
2009-09-01
Various methodological approaches have been used for reconstructing fossil hominin remains in order to increase sample sizes and to better understand morphological variation. Among these, morphometric quantitative techniques for reconstruction are increasingly common. Here we compare the accuracy of three approaches--mean substitution, thin plate splines, and multiple linear regression--for estimating missing landmarks of damaged fossil specimens. Comparisons are made varying the number of missing landmarks, sample sizes, and the reference species of the population used to perform the estimation. The testing is performed on landmark data from individuals of Homo sapiens, Pan troglodytes and Gorilla gorilla, and nine hominin fossil specimens. Results suggest that when a small, same-species fossil reference sample is available to guide reconstructions, thin plate spline approaches perform best. However, if no such sample is available (or if the species of the damaged individual is uncertain), estimates of missing morphology based on a single individual (or even a small sample) of close taxonomic affinity are less accurate than those based on a large sample of individuals drawn from more distantly related extant populations using a technique (such as a regression method) able to leverage the information (e.g., variation/covariation patterning) contained in this large sample. Thin plate splines also show an unexpectedly large amount of error in estimating landmarks, especially over large areas. Recommendations are made for estimating missing landmarks under various scenarios. Copyright 2009 Wiley-Liss, Inc.
Rutterford, Clare; Taljaard, Monica; Dixon, Stephanie; Copas, Andrew; Eldridge, Sandra
2015-06-01
To assess the quality of reporting and accuracy of a priori estimates used in sample size calculations for cluster randomized trials (CRTs). We reviewed 300 CRTs published between 2000 and 2008. The prevalence of reporting sample size elements from the 2004 CONSORT recommendations was evaluated and a priori estimates compared with those observed in the trial. Of the 300 trials, 166 (55%) reported a sample size calculation. Only 36 of 166 (22%) reported all recommended descriptive elements. Elements specific to CRTs were the worst reported: a measure of within-cluster correlation was specified in only 58 of 166 (35%). Only 18 of 166 articles (11%) reported both a priori and observed within-cluster correlation values. Except in two cases, observed within-cluster correlation values were either close to or less than a priori values. Even with the CONSORT extension for cluster randomization, the reporting of sample size elements specific to these trials remains below that necessary for transparent reporting. Journal editors and peer reviewers should implement stricter requirements for authors to follow CONSORT recommendations. Authors should report observed and a priori within-cluster correlation values to enable comparisons between these over a wider range of trials. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Wu, Zhichao; Medeiros, Felipe A
2018-03-20
Visual field testing is an important endpoint in glaucoma clinical trials, and the testing paradigm used can have a significant impact on the sample size requirements. To investigate this, this study included 353 eyes of 247 glaucoma patients seen over a 3-year period to extract real-world visual field rates of change and variability estimates to provide sample size estimates from computer simulations. The clinical trial scenario assumed that a new treatment was added to one of two groups that were both under routine clinical care, with various treatment effects examined. Three different visual field testing paradigms were evaluated: a) evenly spaced testing, b) United Kingdom Glaucoma Treatment Study (UKGTS) follow-up scheme, which adds clustered tests at the beginning and end of follow-up in addition to evenly spaced testing, and c) clustered testing paradigm, with clusters of tests at the beginning and end of the trial period and two intermediary visits. The sample size requirements were reduced by 17-19% and 39-40% using the UKGTS and clustered testing paradigms, respectively, when compared to the evenly spaced approach. These findings highlight how the clustered testing paradigm can substantially reduce sample size requirements and improve the feasibility of future glaucoma clinical trials.
Junno, Juho-Antti; Niskanen, Markku; Maijanen, Heli; Holt, Brigitte; Sladek, Vladimir; Niinimäki, Sirpa; Berner, Margit
2018-02-01
The stature/bi-iliac breadth method provides reasonably precise, skeletal frame size (SFS) based body mass (BM) estimations across adults as a whole. In this study, we examine the potential effects of age changes in anthropometric dimensions on the estimation accuracy of SFS-based body mass estimation. We use anthropometric data from the literature and our own skeletal data from two osteological collections to study effects of age on stature, bi-iliac breadth, body mass, and body composition, as they are major components behind body size and body size estimations. We focus on males, as relevant longitudinal data are based on male study samples. As a general rule, lean body mass (LBM) increases through adolescence and early adulthood until people are aged in their 30s or 40s, and starts to decline in the late 40s or early 50s. Fat mass (FM) tends to increase until the mid-50s and declines thereafter, but in more mobile traditional societies it may decline throughout adult life. Because BM is the sum of LBM and FM, it exhibits a curvilinear age-related pattern in all societies. Skeletal frame size is based on stature and bi-iliac breadth, and both of those dimensions are affected by age. Skeletal frame size based body mass estimation tends to increase throughout adult life in both skeletal and anthropometric samples because an age-related increase in bi-iliac breadth more than compensates for an age-related stature decline commencing in the 30s or 40s. Combined with the above-mentioned curvilinear BM change, this results in curvilinear estimation bias. However, for simulations involving low to moderate percent body fat, the stature/bi-iliac method works well in predicting body mass in younger and middle-aged adults. Such conditions are likely to have applied to most human paleontological and archaeological samples. Copyright © 2017 Elsevier Ltd. All rights reserved.
Sillett, Scott T.; Chandler, Richard B.; Royle, J. Andrew; Kéry, Marc; Morrison, Scott A.
2012-01-01
Population size and habitat-specific abundance estimates are essential for conservation management. A major impediment to obtaining such estimates is that few statistical models are able to simultaneously account for both spatial variation in abundance and heterogeneity in detection probability, and still be amenable to large-scale applications. The hierarchical distance-sampling model of J. A. Royle, D. K. Dawson, and S. Bates provides a practical solution. Here, we extend this model to estimate habitat-specific abundance and rangewide population size of a bird species of management concern, the Island Scrub-Jay (Aphelocoma insularis), which occurs solely on Santa Cruz Island, California, USA. We surveyed 307 randomly selected, 300 m diameter, point locations throughout the 250-km² island during October 2008 and April 2009. Population size was estimated to be 2267 (95% CI 1613-3007) and 1705 (1212-2369) during the fall and spring respectively, considerably lower than a previously published but statistically problematic estimate of 12 500. This large discrepancy emphasizes the importance of proper survey design and analysis for obtaining reliable information for management decisions. Jays were most abundant in low-elevation chaparral habitat; the detection function depended primarily on the percent cover of chaparral and forest within count circles. Vegetation change on the island has been dramatic in recent decades, due to release from herbivory following the eradication of feral sheep (Ovis aries) from the majority of the island in the mid-1980s. We applied best-fit fall and spring models of habitat-specific jay abundance to a vegetation map from 1985, and estimated the population size of A. insularis was 1400-1500 at that time. The 20-30% increase in the jay population suggests that the species has benefited from the recovery of native vegetation since sheep removal.
Nevertheless, this jay's tiny range and small population size make it vulnerable to natural disasters and to habitat alteration related to climate change. Our results demonstrate that hierarchical distance-sampling models hold promise for estimating population size and spatial density variation at large scales. Our statistical methods have been incorporated into the R package unmarked to facilitate their use by animal ecologists, and we provide annotated code in the Supplement.
Using e-mail recruitment and an online questionnaire to establish effect size: A worked example.
Kirkby, Helen M; Wilson, Sue; Calvert, Melanie; Draper, Heather
2011-06-09
Sample size calculations require effect size estimations. Sometimes, effect size estimations and standard deviation may not be readily available, particularly if efficacy is unknown because the intervention is new or developing, or the trial targets a new population. In such cases, one way to estimate the effect size is to gather expert opinion. This paper reports the use of a simple strategy to gather expert opinion to estimate a suitable effect size to use in a sample size calculation. Researchers involved in the design and analysis of clinical trials were identified at the University of Birmingham and via the MRC Hubs for Trials Methodology Research. An email invited them to participate. An online questionnaire was developed using the free online tool 'Survey Monkey©'. The questionnaire described an intervention, an electronic participant information sheet (e-PIS), which may increase recruitment rates to a trial. Respondents were asked how much they would need to see recruitment rates increased by, based on 90%, 70%, 50% and 30% baseline rates (in a hypothetical study), before they would consider using an e-PIS in their research. Analyses comprised simple descriptive statistics. The invitation to participate was sent to 122 people; 7 responded to say they were not involved in trial design and could not complete the questionnaire, 64 attempted it, 26 failed to complete it. Thirty-eight people completed the questionnaire and were included in the analysis (response rate 33%; 38/115). Of those who completed the questionnaire, 44.7% (17/38) were at the academic grade of research fellow, 26.3% (10/38) senior research fellow, and 28.9% (11/38) professor. Dependent upon the baseline recruitment rates presented in the questionnaire, participants wanted recruitment rate to increase from 6.9% to 28.9% before they would consider using the intervention.
This paper has shown that in situations where effect size estimations cannot be collected from previous research, opinions from researchers and trialists can be quickly and easily collected by conducting a simple study using email recruitment and an online questionnaire. The results collected from the survey were successfully used in sample size calculations for a PhD research study protocol.
Lee, K V; Moon, R D; Burkness, E C; Hutchison, W D; Spivak, M
2010-08-01
The parasitic mite Varroa destructor Anderson & Trueman (Acari: Varroidae) is arguably the most detrimental pest of the European-derived honey bee, Apis mellifera L. Unfortunately, beekeepers lack a standardized sampling plan to make informed treatment decisions. Based on data from 31 commercial apiaries, we developed sampling plans for use by beekeepers and researchers to estimate the density of mites in individual colonies or whole apiaries. Beekeepers can estimate a colony's mite density with a chosen level of precision by dislodging mites from approximately 300 adult bees taken from one brood box frame in the colony, and they can extrapolate to mite density on a colony's adults and pupae combined by doubling the number of mites on adults. For sampling whole apiaries, beekeepers can repeat the process in each of n = 8 colonies, regardless of apiary size. Researchers desiring greater precision can estimate mite density in an individual colony by examining three 300-bee sample units. Extrapolation to density on adults and pupae may require independent estimates of numbers of adults, of pupae, and of their respective mite densities. Researchers can estimate apiary-level mite density by taking one 300-bee sample unit per colony, but should do so from a variable number of colonies, depending on apiary size. These practical sampling plans will allow beekeepers and researchers to quantify mite infestation levels and enhance understanding and management of V. destructor.
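The beekeeper-level plan described above reduces to simple arithmetic, sketched below. Expressing density as mites per 100 bees, averaging the eight colonies with equal weight, and the mite counts themselves are assumptions made for illustration; the function names are hypothetical.

```python
def mites_per_100_bees(mites_dislodged, bees_sampled=300):
    """Mite density on adult bees, scaled to mites per 100 bees."""
    return 100.0 * mites_dislodged / bees_sampled

def adults_and_pupae_density(adult_density):
    """Beekeeper shortcut from the abstract: double the density found on adults."""
    return 2.0 * adult_density

def apiary_mean_density(colony_densities):
    """Unweighted mean across the sampled colonies."""
    return sum(colony_densities) / len(colony_densities)

# Hypothetical mite counts from one 300-bee sample in each of 8 colonies.
mite_counts = [9, 3, 0, 12, 6, 4, 7, 7]
per_colony = [mites_per_100_bees(m) for m in mite_counts]
apiary_adults = apiary_mean_density(per_colony)
apiary_total = adults_and_pupae_density(apiary_adults)
print(apiary_adults, apiary_total)
```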
Technical note: Alternatives to reduce adipose tissue sampling bias.
Cruz, G D; Wang, Y; Fadel, J G
2014-10-01
Understanding the mechanisms by which nutritional and pharmaceutical factors can manipulate adipose tissue growth and development in production animals has direct and indirect effects in the profitability of an enterprise. Adipocyte cellularity (number and size) is a key biological response that is commonly measured in animal science research. The variability and sampling of adipocyte cellularity within a muscle has been addressed in previous studies, but no attempt to critically investigate these issues has been proposed in the literature. The present study evaluated 2 sampling techniques (random and systematic) in an attempt to minimize sampling bias and to determine the minimum number of samples from 1 to 15 needed to represent the overall adipose tissue in the muscle. Both sampling procedures were applied on adipose tissue samples dissected from 30 longissimus muscles from cattle finished either on grass or grain. Briefly, adipose tissue samples were fixed with osmium tetroxide, and size and number of adipocytes were determined by a Coulter Counter. These results were then fit in a finite mixture model to obtain distribution parameters of each sample. To evaluate the benefits of increasing number of samples and the advantage of the new sampling technique, the concept of acceptance ratio was used; simply stated, the higher the acceptance ratio, the better the representation of the overall population. As expected, a great improvement on the estimation of the overall adipocyte cellularity parameters was observed using both sampling techniques when sample size number increased from 1 to 15 samples, considering both techniques' acceptance ratio increased from approximately 3 to 25%. When comparing sampling techniques, the systematic procedure slightly improved parameters estimation. The results suggest that more detailed research using other sampling techniques may provide better estimates for minimum sampling.
On the Post Hoc Power in Testing Mean Differences
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Maxwell, Scott
2005-01-01
Retrospective or post hoc power analysis is recommended by reviewers and editors of many journals. Little literature has been found that gave a serious study of the post hoc power. When the sample size is large, the observed effect size is a good estimator of the true power. This article studies whether such a power estimator provides valuable…
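The idea of post hoc power described in this record can be sketched numerically. The snippet below is only an illustration under stated assumptions: it uses a normal approximation to a two-sided, two-sample test with equal group sizes, and the function name `post_hoc_power` and its parameters are hypothetical, not taken from the article.

```python
from statistics import NormalDist

def post_hoc_power(d_observed, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test, evaluated at the
    observed standardized effect size (assumption: normal approximation,
    equal group sizes)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Shift of the test statistic's mean under the observed effect.
    shift = d_observed * (n_per_group / 2) ** 0.5
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail
```

Because the observed effect size is itself a noisy estimate, the value returned here inherits that noise, which is the concern the abstract raises.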
Measuring Compartment Size and Gas Solubility in Marine Mammals
2014-09-30
analyzed by gas chromatography. Injection of the sample into the gas chromatograph is done using a sample loop to minimize volume injection error. We... study is to develop methods to estimate marine mammal tissue compartment sizes, and tissue gas solubility. We aim to improve the data available for
Minimax Estimation of Functionals of Discrete Distributions
Jiao, Jiantao; Venkat, Kartik; Han, Yanjun; Weissman, Tsachy
2017-01-01
We propose a general methodology for the construction and analysis of essentially minimax estimators for a wide class of functionals of finite dimensional parameters, and elaborate on the case of discrete distributions, where the support size S is unknown and may be comparable with or even much larger than the number of observations n. We treat the respective regions where the functional is nonsmooth and smooth separately. In the nonsmooth regime, we apply an unbiased estimator for the best polynomial approximation of the functional whereas, in the smooth regime, we apply a bias-corrected version of the maximum likelihood estimator (MLE). We illustrate the merit of this approach by thoroughly analyzing the performance of the resulting schemes for estimating two important information measures: 1) the entropy $H(P) = \sum_{i=1}^{S} -p_i \ln p_i$ and 2) $F_\alpha(P) = \sum_{i=1}^{S} p_i^{\alpha}$, $\alpha > 0$. We obtain the minimax $L_2$ rates for estimating these functionals. In particular, we demonstrate that our estimator achieves the optimal sample complexity $n \asymp S/\ln S$ for entropy estimation. We also demonstrate that the sample complexity for estimating $F_\alpha(P)$, $0 < \alpha < 1$, is $n \asymp S^{1/\alpha}/\ln S$, which can be achieved by our estimator but not the MLE. For $1 < \alpha < 3/2$, we show the minimax $L_2$ rate for estimating $F_\alpha(P)$ is $(n \ln n)^{-2(\alpha-1)}$ for infinite support size, while the maximum $L_2$ rate for the MLE is $n^{-2(\alpha-1)}$. For all the above cases, the behavior of the minimax rate-optimal estimators with n samples is essentially that of the MLE (plug-in rule) with n ln n samples, which we term “effective sample size enlargement.” We highlight the practical advantages of our schemes for the estimation of entropy and mutual information. We compare our performance with various existing approaches, and demonstrate that our approach reduces running time and boosts the accuracy.
Moreover, we show that the minimax rate-optimal mutual information estimator yielded by our framework leads to significant performance boosts over the Chow–Liu algorithm in learning graphical models. The wide use of information measure estimation suggests that the insights and estimators obtained in this paper could be broadly applicable. PMID:29375152
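For orientation, the plug-in (MLE) entropy estimate that this record uses as its baseline can be sketched in a few lines. The comparison estimator below is the classical Miller-Madow correction, swapped in purely for illustration; it is not the authors' minimax estimator, which relies on best polynomial approximation. Function names are hypothetical.

```python
import math
from collections import Counter

def plugin_entropy(samples):
    """Plug-in (MLE) estimate of H(P) = -sum_i p_i ln p_i, in nats."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def miller_madow_entropy(samples):
    """Plug-in estimate plus the Miller-Madow term (K - 1) / (2n), where K is
    the number of distinct symbols observed (an assumption made for this
    sketch, not the paper's method)."""
    n = len(samples)
    k = len(set(samples))
    return plugin_entropy(samples) + (k - 1) / (2 * n)
```

The plug-in estimate tends to underestimate entropy when the support size is large relative to n, which is the regime the abstract targets.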
Code of Federal Regulations, 2013 CFR
2013-01-01
... between 1 and k. The value of k is dependent upon the estimated size of the universe and the sample size... included in the active universe defined in paragraph (e)(1) of this section) during the annual review... review (i.e., households which are part of the negative universe defined in paragraph (e)(2) of this...
Code of Federal Regulations, 2012 CFR
2012-01-01
... between 1 and k. The value of k is dependent upon the estimated size of the universe and the sample size... included in the active universe defined in paragraph (e)(1) of this section) during the annual review... review (i.e., households which are part of the negative universe defined in paragraph (e)(2) of this...
Code of Federal Regulations, 2014 CFR
2014-01-01
... between 1 and k. The value of k is dependent upon the estimated size of the universe and the sample size... included in the active universe defined in paragraph (e)(1) of this section) during the annual review... review (i.e., households which are part of the negative universe defined in paragraph (e)(2) of this...
Exploratory Factor Analysis with Small Sample Sizes
ERIC Educational Resources Information Center
de Winter, J. C. F.; Dodou, D.; Wieringa, P. A.
2009-01-01
Exploratory factor analysis (EFA) is generally regarded as a technique for large sample sizes ("N"), with N = 50 as a reasonable absolute minimum. This study offers a comprehensive overview of the conditions in which EFA can yield good quality results for "N" below 50. Simulations were carried out to estimate the minimum required "N" for different…
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
An Investigation of Sample Size Splitting on ATFIND and DIMTEST
ERIC Educational Resources Information Center
Socha, Alan; DeMars, Christine E.
2013-01-01
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Power and Precision in Confirmatory Factor Analytic Tests of Measurement Invariance
ERIC Educational Resources Information Center
Meade, Adam W.; Bauer, Daniel J.
2007-01-01
This study investigates the effects of sample size, factor overdetermination, and communality on the precision of factor loading estimates and the power of the likelihood ratio test of factorial invariance in multigroup confirmatory factor analysis. Although sample sizes are typically thought to be the primary determinant of precision and power,…
Capturing heterogeneity: The role of a study area's extent for estimating mean throughfall
NASA Astrophysics Data System (ADS)
Zimmermann, Alexander; Voss, Sebastian; Metzger, Johanna Clara; Hildebrandt, Anke; Zimmermann, Beate
2016-11-01
The selection of an appropriate spatial extent of a sampling plot is one among several important decisions involved in planning a throughfall sampling scheme. In fact, the choice of the extent may determine whether or not a study can adequately characterize the hydrological fluxes of the studied ecosystem. Previous attempts to optimize throughfall sampling schemes focused on the selection of an appropriate sample size, support, and sampling design, while comparatively little attention has been given to the role of the extent. In this contribution, we investigated the influence of the extent on the representativeness of mean throughfall estimates for three forest ecosystems of varying stand structure. Our study is based on virtual sampling of simulated throughfall fields. We derived these fields from throughfall data sampled in a simply structured forest (young tropical forest) and two heterogeneous forests (old tropical forest, unmanaged mixed European beech forest). We then sampled the simulated throughfall fields with three common extents and various sample sizes for a range of events and for accumulated data. Our findings suggest that the size of the study area should be carefully adapted to the complexity of the system under study and to the required temporal resolution of the throughfall data (i.e. event-based versus accumulated). Generally, event-based sampling in complex structured forests (conditions that favor comparatively long autocorrelations in throughfall) requires the largest extents. For event-based sampling, the choice of an appropriate extent can be as important as using an adequate sample size.
Image analysis of representative food structures: application of the bootstrap method.
Ramírez, Cristian; Germain, Juan C; Aguilera, José M
2009-08-01
Images (for example, photomicrographs) are routinely used as qualitative evidence of the microstructure of foods. In quantitative image analysis it is important to estimate the area (or volume) to be sampled, the field of view, and the resolution. The bootstrap method is proposed to estimate the size of the sampling area as a function of the coefficient of variation (CV(Bn)) and standard error (SE(Bn)) of the bootstrap taking sub-areas of different sizes. The bootstrap method was applied to simulated and real structures (apple tissue). For simulated structures, 10 computer-generated images were constructed containing 225 black circles (elements) and different coefficient of variation (CV(image)). For apple tissue, 8 images of apple tissue containing cellular cavities with different CV(image) were analyzed. Results confirmed that for simulated and real structures, increasing the size of the sampling area decreased the CV(Bn) and SE(Bn). Furthermore, there was a linear relationship between the CV(image) and CV(Bn). For example, to obtain a CV(Bn) = 0.10 in an image with CV(image) = 0.60, a sampling area of 400 x 400 pixels (11% of whole image) was required, whereas if CV(image) = 1.46, a sampling area of 1000 x 1000 pixels (69% of whole image) became necessary. This suggests that a large-size dispersion of element sizes in an image requires increasingly larger sampling areas or a larger number of images.
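One plausible reading of the sub-area bootstrap in this record is sketched below. The exact definitions of CV(Bn) and SE(Bn) are not given in the abstract, so the choices here (randomly placed square windows on a 0/1 image, the black-pixel fraction as the measured quantity, and the names `black_fraction` and `bootstrap_cv_se`) are assumptions for illustration only.

```python
import random
import statistics

def black_fraction(image, top, left, side):
    """Fraction of black (1) pixels in a side x side window of a 0/1 image."""
    total = sum(image[r][c]
                for r in range(top, top + side)
                for c in range(left, left + side))
    return total / (side * side)

def bootstrap_cv_se(image, side, n_boot=200, seed=0):
    """Draw n_boot randomly placed windows and return (CV, SE) of the
    black-pixel fraction across them."""
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    values = [
        black_fraction(image,
                       rng.randrange(rows - side + 1),
                       rng.randrange(cols - side + 1),
                       side)
        for _ in range(n_boot)
    ]
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    cv = sd / mean if mean > 0 else float("inf")
    return cv, sd / n_boot ** 0.5
```

Calling `bootstrap_cv_se` for a range of `side` values and picking the smallest window whose CV falls below a target mirrors the kind of sampling-area decision the abstract describes.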
Effect Size in Efficacy Trials of Women With Decreased Sexual Desire.
Pyke, Robert E; Clayton, Anita H
2018-03-22
Regarding hypoactive sexual desire disorder (HSDD) in women, some reviewers judge the effect size small for medications vs placebo, but substantial for cognitive behavior therapy (CBT) or mindfulness meditation training (MMT) vs wait list. However, we lack comparisons of the effect sizes for the active intervention itself, for the control treatment, and for the differential between the two. For efficacy trials of HSDD in women, compare effect sizes for medications (testosterone/testosterone transdermal system, flibanserin, and bremelanotide) and placebo vs effect sizes for psychotherapy and wait-list control. We conducted a literature search for mean changes and SD on main measures of sexual desire and associated distress in trials of medications, CBT, or MMT. Effect size was used as it measures the magnitude of the intervention without confounding by sample size. Cohen d was used to determine effect sizes. For medications, mean (SD) effect size was 1.0 (0.34); for CBT and MMT, 1.0 (0.36); for placebo, 0.55 (0.16); and for wait list, 0.05 (0.26). Recommendations of psychotherapy over medication for treatment of HSDD are premature and not supported by data on effect sizes. Active participation in treatment conveys considerable non-specific benefits. Caregivers should attend to biological and psychosocial elements, and patient preference, to optimize response. Few clinical trials of psychotherapies were substantial in size or utilized adequate control paradigms. Medications and psychotherapies had similar, large effect sizes. Effect size of placebo was moderate. Effect size of wait-list control was very small, about one quarter that of placebo. Thus, a substantial non-specific therapeutic effect is associated with receiving placebo plus active care and evaluation. The difference in effect size between placebo and wait-list controls distorts the value of the subtraction of effect of the control paradigms to estimate intervention effectiveness. Pyke RE, Clayton AH. 
Effect Size in Efficacy Trials of Women With Decreased Sexual Desire. Sex Med Rev 2018;XX:XXX-XXX. Copyright © 2018 International Society for Sexual Medicine. Published by Elsevier Inc. All rights reserved.
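Cohen's d, the effect size measure named in this record, can be computed from summary statistics alone. The sketch below shows the standard pooled-SD form for two groups and a simple within-group form (mean change divided by the SD of change); which variant the review applied to each trial is not stated in the abstract, so both are shown as assumptions, with hypothetical function names.

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Between-group Cohen's d using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

def within_group_d(mean_change, sd_change):
    """Within-group effect size: mean change divided by the SD of change."""
    return mean_change / sd_change
```

Neither form depends on the sample size except through the pooling weights, which is why the abstract describes effect size as free of confounding by sample size.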
Daly, Caitlin H; Higgins, Victoria; Adeli, Khosrow; Grey, Vijay L; Hamid, Jemila S
2017-12-01
To statistically compare and evaluate commonly used methods of estimating reference intervals and to determine which method is best based on characteristics of the distribution of various data sets. Three approaches for estimating reference intervals, i.e. parametric, non-parametric, and robust, were compared with simulated Gaussian and non-Gaussian data. The hierarchy of the performances of each method was examined based on bias and measures of precision. The findings of the simulation study were illustrated through real data sets. In all Gaussian scenarios, the parametric approach provided the least biased and most precise estimates. In non-Gaussian scenarios, no single method provided the least biased and most precise estimates for both limits of a reference interval across all sample sizes, although the non-parametric approach performed the best for most scenarios. The hierarchy of the performances of the three methods was only impacted by sample size and skewness. Differences between reference interval estimates established by the three methods were inflated by variability. Whenever possible, laboratories should attempt to transform data to a Gaussian distribution and use the parametric approach to obtain the most optimal reference intervals. When this is not possible, laboratories should consider sample size and skewness as factors in their choice of reference interval estimation method. The consequences of false positives or false negatives may also serve as factors in this decision. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
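Two of the three approaches compared in this record can be sketched with the Python standard library. This is a minimal illustration assuming a central 95% reference interval; the robust method is omitted, and the function names and the "inclusive" percentile convention are assumptions rather than details from the study.

```python
import statistics

def parametric_interval(data, coverage=0.95):
    """Gaussian reference interval: mean +/- z * SD."""
    z = statistics.NormalDist().inv_cdf(0.5 + coverage / 2)
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return mean - z * sd, mean + z * sd

def nonparametric_interval(data):
    """Empirical 2.5th and 97.5th percentiles (first and last of 39 cut
    points that split the data into 40 equal-probability groups)."""
    cuts = statistics.quantiles(data, n=40, method="inclusive")
    return cuts[0], cuts[-1]
```

For skewed data the two intervals can differ noticeably, which is consistent with the abstract's advice to consider sample size and skewness when choosing a method.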
Diefenbach, Duane R.; Hansen, Leslie A.; Bohling, Justin H.; Miller-Butterworth, Cassandra
2015-01-01
In 1988–1989, 32 bobcats Lynx rufus were reintroduced to Cumberland Island (CUIS), Georgia, USA, from which they had previously been extirpated. They were monitored intensively for 3 years immediately post-reintroduction, but no estimation of the size or genetic diversity of the population had been conducted in over 20 years since reintroduction. We returned to CUIS in 2012 to estimate abundance and effective population size of the present-day population, as well as to quantify genetic diversity and inbreeding. We amplified 12 nuclear microsatellite loci from DNA isolated from scats to establish genetic profiles to identify individuals. We used spatially explicit capture–recapture population estimation to estimate abundance. From nine unique genetic profiles, we estimate a population size of 14.4 (SE = 3.052) bobcats, with an effective population size (Ne) of 5–8 breeding individuals. This is consistent with predictions of a population viability analysis conducted at the time of reintroduction, which estimated the population would average 12–13 bobcats after 10 years. We identified several pairs of related bobcats (parent-offspring and full siblings), but ~75% of the pairwise comparisons were typical of unrelated individuals, and only one individual appeared inbred. Despite the small population size and other indications that it has likely experienced a genetic bottleneck, levels of genetic diversity in the CUIS bobcat population remain high compared to other mammalian carnivores. The reintroduction of bobcats to CUIS provides an opportunity to study changes in genetic diversity in an insular population without risk to this common species. Opportunities for natural immigration to the island are limited; therefore, continued monitoring and supplemental bobcat reintroductions could be used to evaluate the effect of different management strategies to maintain genetic diversity and population viability. 
The successful reintroduction and maintenance of a bobcat population on CUIS illustrates the suitability of translocation as a management tool for re-establishing felid populations.
Fan, Chunpeng; Zhang, Donghui
2012-01-01
Although the Kruskal-Wallis test has been widely used to analyze ordered categorical data, power and sample size methods for this test have been investigated to a much lesser extent when the underlying multinomial distributions are unknown. This article generalizes the power and sample size procedures proposed by Fan et al. (2011) for continuous data to ordered categorical data, when estimates from a pilot study are used in the place of knowledge of the true underlying distribution. Simulations show that the proposed power and sample size formulas perform well. A myelin oligodendrocyte glycoprotein (MOG) induced experimental autoimmune encephalomyelitis (EAE) mouse study is used to demonstrate the application of the methods.
Baranowski, Tom; Baranowski, Janice C; Watson, Kathleen B; Martin, Shelby; Beltran, Alicia; Islam, Noemi; Dadabhoy, Hafza; Adame, Su-heyla; Cullen, Karen; Thompson, Debbe; Buday, Richard; Subar, Amy
2011-03-01
To test the effect of image size and presence of size cues on the accuracy of portion size estimation by children. Children were randomly assigned to seeing images with or without food size cues (utensils and checked tablecloth) and were presented with sixteen food models (foods commonly eaten by children) in varying portion sizes, one at a time. They estimated each food model's portion size by selecting a digital food image. The same food images were presented in two ways: (i) as small, graduated portion size images all on one screen or (ii) by scrolling across large, graduated portion size images, one per sequential screen. Laboratory-based with computer and food models. Volunteer multi-ethnic sample of 120 children, equally distributed by gender and ages (8 to 13 years) in 2008-2009. Average percentage of correctly classified foods was 60·3 %. There were no differences in accuracy by any design factor or demographic characteristic. Multiple small pictures on the screen at once took half the time to estimate portion size compared with scrolling through large pictures. Larger pictures had more overestimation of size. Multiple images of successively larger portion sizes of a food on one computer screen facilitated quicker portion size responses with no decrease in accuracy. This is the method of choice for portion size estimation on a computer.
A Model Based Approach to Sample Size Estimation in Recent Onset Type 1 Diabetes
Bundy, Brian; Krischer, Jeffrey P.
2016-01-01
The area under the curve C-peptide following a 2-hour mixed meal tolerance test from 481 individuals enrolled on 5 prior TrialNet studies of recent onset type 1 diabetes from baseline to 12 months after enrollment were modelled to produce estimates of its rate of loss and variance. Age at diagnosis and baseline C-peptide were found to be significant predictors and adjusting for these in an ANCOVA resulted in estimates with lower variance. Using these results as planning parameters for new studies results in a nearly 50% reduction in the target sample size. The modelling also produces an expected C-peptide that can be used in Observed vs. Expected calculations to estimate the presumption of benefit in ongoing trials. PMID:26991448
Thomson, R; Kawrakow, I
2012-06-01
Widely-used classical trajectory Monte Carlo simulations of low energy electron transport neglect the quantum nature of electrons; however, at sub-1 keV energies quantum effects have the potential to become significant. This work compares quantum and classical simulations within a simplified model of electron transport in water. Electron transport is modeled in water droplets using quantum mechanical (QM) and classical trajectory Monte Carlo (MC) methods. Water droplets are modeled as collections of point scatterers representing water molecules from which electrons may be isotropically scattered. The role of inelastic scattering is investigated by introducing absorption. QM calculations involve numerically solving a system of coupled equations for the electron wavefield incident on each scatterer. A minimum distance between scatterers is introduced to approximate structured water. The average QM water droplet incoherent cross section is compared with the MC cross section; a relative error (RE) on the MC results is computed. RE varies with electron energy, average and minimum distances between scatterers, and scattering amplitude. The mean free path is generally the relevant length scale for estimating RE. The introduction of a minimum distance between scatterers increases RE substantially (factors of 5 to 10), suggesting that the structure of water must be modeled for accurate simulations. Inelastic scattering does not improve agreement between QM and MC simulations: for the same magnitude of elastic scattering, the introduction of inelastic scattering increases RE. Droplet cross sections are sensitive to droplet size and shape; considerable variations in RE are observed with changing droplet size and shape. At sub-1 keV energies, quantum effects may become non-negligible for electron transport in condensed media. Electron transport is strongly affected by the structure of the medium. 
Inelastic scatter does not improve agreement between QM and MC simulations of low energy electron transport in condensed media. © 2012 American Association of Physicists in Medicine.
Fung, Tak; Keenan, Kevin
2014-01-01
The estimation of population allele frequencies using sample data forms a central component of studies in population genetics. These estimates can be used to test hypotheses on the evolutionary processes governing changes in genetic variation among populations. However, existing studies frequently do not account for sampling uncertainty in these estimates, thus compromising their utility. Incorporation of this uncertainty has been hindered by the lack of a method for constructing confidence intervals containing the population allele frequencies, for the general case of sampling from a finite diploid population of any size. In this study, we address this important knowledge gap by presenting a rigorous mathematical method to construct such confidence intervals. For a range of scenarios, the method is used to demonstrate that for a particular allele, in order to obtain accurate estimates within 0.05 of the population allele frequency with high probability (≥ 95%), a sample size of > 30 is often required. This analysis is augmented by an application of the method to empirical sample allele frequency data for two populations of the checkerspot butterfly (Melitaea cinxia L.), occupying meadows in Finland. For each population, the method is used to derive ≥ 98.3% confidence intervals for the population frequencies of three alleles. These intervals are then used to construct two joint ≥ 95% confidence regions, one for the set of three frequencies for each population. These regions are then used to derive a ≥ 95% confidence interval for Jost's D, a measure of genetic differentiation between the two populations. Overall, the results demonstrate the practical utility of the method with respect to informing sampling design and accounting for sampling uncertainty in studies of population genetics, important for scientific hypothesis-testing and also for risk-based natural resource management.
Dawson, Ree; Lavori, Philip W
2012-01-01
Clinical demand for individualized "adaptive" treatment policies in diverse fields has spawned development of clinical trial methodology for their experimental evaluation via multistage designs, building upon methods intended for the analysis of naturalistically observed strategies. Because often there is no need to parametrically smooth multistage trial data (in contrast to observational data for adaptive strategies), it is possible to establish direct connections among different methodological approaches. We show by algebraic proof that the maximum likelihood (ML) and optimal semiparametric (SP) estimators of the population mean of the outcome of a treatment policy and its standard error are equal under certain experimental conditions. This result is used to develop a unified and efficient approach to design and inference for multistage trials of policies that adapt treatment according to discrete responses. We derive a sample size formula expressed in terms of a parametric version of the optimal SP population variance. Nonparametric (sample-based) ML estimation performed well in simulation studies, in terms of achieved power, for scenarios most likely to occur in real studies, even though sample sizes were based on the parametric formula. ML outperformed the SP estimator; differences in achieved power predominately reflected differences in their estimates of the population mean (rather than estimated standard errors). Neither methodology could mitigate the potential for overestimated sample sizes when strong nonlinearity was purposely simulated for certain discrete outcomes; however, such departures from linearity may not be an issue for many clinical contexts that make evaluation of competitive treatment policies meaningful.
Accounting for randomness in measurement and sampling in studying cancer cell population dynamics.
Ghavami, Siavash; Wolkenhauer, Olaf; Lahouti, Farshad; Ullah, Mukhtar; Linnebacher, Michael
2014-10-01
Knowing the expected temporal evolution of the proportion of different cell types in sample tissues gives an indication about the progression of the disease and its possible response to drugs. Such systems have been modelled using Markov processes. We here consider an experimentally realistic scenario in which transition probabilities are estimated from noisy cell population size measurements. Using aggregated data of FACS measurements, we develop MMSE and ML estimators and formulate two problems to find the minimum number of required samples and measurements to guarantee the accuracy of predicted population sizes. Our numerical results show that the convergence mechanism of transition probabilities and steady states differ widely from the real values if one uses the standard deterministic approach for noisy measurements. This provides support for our argument that for the analysis of FACS data one should consider the observed state as a random variable. The second problem we address is about the consequences of estimating the probability of a cell being in a particular state from measurements of small population of cells. We show how the uncertainty arising from small sample sizes can be captured by a distribution for the state probability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren
2011-01-01
Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
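The recommendation to equalize read counts before comparing richness can be illustrated with a small sketch. It assumes per-taxon read counts as input, uses the bias-corrected form of Chao1, and rarefies by random subsampling without replacement; the function names are hypothetical and this is not one of the tools the record refers to.

```python
import random
from collections import Counter

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1
    and F2 are the numbers of taxa seen exactly once and exactly twice."""
    observed = [c for c in counts if c > 0]
    f1 = sum(1 for c in observed if c == 1)
    f2 = sum(1 for c in observed if c == 2)
    return len(observed) + f1 * (f1 - 1) / (2 * (f2 + 1))

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement and return the new
    per-taxon counts, in the same taxon order as the input."""
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the library")
    tally = Counter(random.Random(seed).sample(reads, depth))
    return [tally.get(taxon, 0) for taxon in range(len(counts))]
```

Rarefying every library to the depth of the smallest one and then applying `chao1` to each gives estimates computed on an equal-effort basis.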
Zeng, Yaohui; Singh, Sachinkumar; Wang, Kai; Ahrens, Richard C
2018-04-01
Pharmacodynamic studies that use methacholine challenge to assess bioequivalence of generic and innovator albuterol formulations are generally designed per published Food and Drug Administration guidance, with 3 reference doses and 1 test dose (3-by-1 design). These studies are challenging and expensive to conduct, typically requiring large sample sizes. We proposed 14 modified study designs as alternatives to the Food and Drug Administration-recommended 3-by-1 design, hypothesizing that adding reference and/or test doses would reduce sample size and cost. We used Monte Carlo simulation to estimate sample size. Simulation inputs were selected based on published studies and our own experience with this type of trial. We also estimated effects of these modified study designs on study cost. Most of these altered designs reduced sample size and cost relative to the 3-by-1 design, some decreasing cost by more than 40%. The most effective single study dose to add was 180 μg of test formulation, which resulted in an estimated 30% relative cost reduction. Adding a single test dose of 90 μg was less effective, producing only a 13% cost reduction. Adding a lone reference dose of either 180, 270, or 360 μg yielded little benefit (less than 10% cost reduction), whereas adding 720 μg resulted in a 19% cost reduction. Of the 14 study design modifications we evaluated, the most effective was addition of both a 90-μg test dose and a 720-μg reference dose (42% cost reduction). Combining a 180-μg test dose and a 720-μg reference dose produced an estimated 36% cost reduction. © 2017, The Authors. The Journal of Clinical Pharmacology published by Wiley Periodicals, Inc. on behalf of American College of Clinical Pharmacology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reiser, I; Lu, Z
2014-06-01
Purpose: Recently, task-based assessment of diagnostic CT systems has attracted much attention. Detection task performance can be estimated using human observers, or mathematical observer models. While most models are well established, considerable bias can be introduced when performance is estimated from a limited number of image samples. Thus, the purpose of this work was to assess the effect of sample size on bias and uncertainty of two channelized Hotelling observers and a template-matching observer. Methods: The image data used for this study consisted of 100 signal-present and 100 signal-absent regions-of-interest, which were extracted from CT slices. The experimental conditions included two signal sizes and five different x-ray beam current settings (mAs). Human observer performance for these images was determined in 2-alternative forced choice experiments. These data were provided by the Mayo Clinic in Rochester, MN. Detection performance was estimated from three observer models, including channelized Hotelling observers (CHO) with Gabor or Laguerre-Gauss (LG) channels, and a template-matching observer (TM). Different sample sizes were generated by randomly selecting a subset of image pairs (N=20, 40, 60, 80). Observer performance was quantified as proportion of correct responses (PC). Bias was quantified as the relative difference of PC for 20 and 80 image pairs. Results: For N=100, all observer models predicted human performance across mAs and signal sizes. Bias was 23% for CHO (Gabor), 7% for CHO (LG), and 3% for TM. The relative standard deviation, σ(PC)/PC at N=20 was highest for the TM observer (11%) and lowest for the CHO (Gabor) observer (5%). Conclusion: In order to make image quality assessment feasible in the clinical practice, a statistically efficient observer model, that can predict performance from few samples, is needed. Our results identified two observer models that may be suited for this task.
Fazey, Francesca M C; Ryan, Peter G
2016-03-01
Recent estimates suggest that roughly 100 times more plastic litter enters the sea than is found floating at the sea surface, despite the buoyancy and durability of many plastic polymers. Biofouling by marine biota is one possible mechanism responsible for this discrepancy. Microplastics (<5 mm in diameter) are more scarce than larger size classes, which makes sense because fouling is a function of surface area whereas buoyancy is a function of volume; the smaller an object, the greater its relative surface area. We tested whether plastic items with high surface area to volume ratios sank more rapidly by submerging 15 different sizes of polyethylene samples in False Bay, South Africa, for 12 weeks to determine the time required for samples to sink. All samples became sufficiently fouled to sink within the study period, but small samples lost buoyancy much faster than larger ones. There was a direct relationship between sample volume (buoyancy) and the time to attain a 50% probability of sinking, which ranged from 17 to 66 days of exposure. Our results provide the first estimates of the longevity of different sizes of plastic debris at the ocean surface. Further research is required to determine how fouling rates differ on free floating debris in different regions and in different types of marine environments. Such estimates could be used to improve model predictions of the distribution and abundance of floating plastic debris globally. Copyright © 2016 Elsevier Ltd. All rights reserved.
A two-stage Monte Carlo approach to the expression of uncertainty with finite sample sizes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crowder, Stephen Vernon; Moyer, Robert D.
2005-05-01
Proposed supplement I to the GUM outlines a 'propagation of distributions' approach to deriving the distribution of a measurand for any non-linear function and for any set of random inputs. The supplement's proposed Monte Carlo approach assumes that the distributions of the random inputs are known exactly. This implies that the sample sizes are effectively infinite. In this case, the mean of the measurand can be determined precisely using a large number of Monte Carlo simulations. In practice, however, the distributions of the inputs will rarely be known exactly, but must be estimated using possibly small samples. If these approximated distributions are treated as exact, the uncertainty in estimating the mean is not properly taken into account. In this paper, we propose a two-stage Monte Carlo procedure that explicitly takes into account the finite sample sizes used to estimate parameters of the input distributions. We will illustrate the approach with a case study involving the efficiency of a thermistor mount power sensor. The performance of the proposed approach will be compared to the standard GUM approach for finite samples using simple non-linear measurement equations. We will investigate performance in terms of coverage probabilities of derived confidence intervals.
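To make the contrast between the two approaches concrete, the following sketch shows one possible reading of a two-stage procedure for a single normally distributed input. It is not the authors' algorithm, which the abstract does not spell out. The assumption made here is that stage 1 draws plausible values of the input mean and variance given a sample of size n (using the standard normal-theory sampling distributions), and stage 2 propagates one input draw per parameter draw through the measurement function f. The function names and the measurement equation f(x) = x^2 are hypothetical.

```python
import numpy as np

def naive_mc(xbar, s, f, draws, rng):
    """Single-stage propagation: treat N(xbar, s^2) as the exactly known input distribution."""
    return f(rng.normal(xbar, s, size=draws))

def two_stage_mc(xbar, s, n, f, draws, rng):
    """Two-stage propagation for an input estimated from n observations (illustrative assumption)."""
    # Stage 1: draw a plausible variance and mean given the finite sample.
    sigma2 = (n - 1) * s**2 / rng.chisquare(n - 1, size=draws)
    mu = rng.normal(xbar, np.sqrt(sigma2 / n))
    # Stage 2: draw the input from each sampled distribution and propagate it.
    x = rng.normal(mu, np.sqrt(sigma2))
    return f(x)

# Hypothetical example: non-linear measurement equation, input estimated from 10 observations.
rng = np.random.default_rng(0)
f = lambda x: x**2
y_naive = naive_mc(10.0, 1.0, f, 200_000, rng)
y_two = two_stage_mc(10.0, 1.0, 10, f, 200_000, rng)
print(y_naive.std(), y_two.std())
```

Under these assumptions the two-stage output is more dispersed than the naive one, which is the effect the abstract describes: treating estimated input distributions as exact understates the uncertainty.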
Number of pins in two-stage stratified sampling for estimating herbage yield
William G. O'Regan; C. Eugene Conrad
1975-01-01
In a two-stage stratified procedure for sampling herbage yield, plots are stratified by a pin frame in stage one, and clipped. In stage two, clippings from selected plots are sorted, dried, and weighed. Sample size and distribution of plots between the two stages are determined by equations. A way to compute the effect of number of pins on the variance of estimated...
NASA Astrophysics Data System (ADS)
Xu, Heqiucen; Shiokawa, Kazuo; Frühauff, Dennis
2017-10-01
We statistically analyzed severe magnetic fluctuations in the nightside near-Earth plasma sheet at 6-12 RE (Earth radii; 1 RE = 6371 km), because they are important for non-magnetohydrodynamics (non-MHD) effects in the magnetotail and are considered to be necessary for current disruption in the inside-out substorm model. We used magnetic field data from 2013 and 2014 obtained by the Time History of Events and Macroscale Interactions during Substorms E (THEMIS-E) satellite (sampling rate: 4 Hz). A total of 1283 severe magnetic fluctuation events were identified that satisfied the criteria σB/B > 0.5, where σB and B are the standard deviation and the average value of magnetic field intensity during the time interval of the local proton gyroperiod, respectively. We found that the occurrence rates of severe fluctuation events are 0.00118, 0.00899, and 0.0238 % at 6-8, 8-10, and 10-12 RE, respectively, and most events last for no more than 15 s. From these occurrence rates, we estimated the possible scale sizes of current disruption by severe magnetic fluctuations as 3.83 RE^3 by assuming that four substorms with 5 min intervals of current disruption occur every day. The fluctuation events occurred most frequently at the ZGSM (Z distance in the geocentric solar magnetospheric coordinate system) close to the model neutral sheet within 0.2 RE. Most events occur in association with sudden decreases in the auroral electrojet lower (AL) index and magnetic field dipolarization, indicating that they are related to substorms. Sixty-two percent of magnetic fluctuation events were accompanied by ion flow with velocity V > 100 km s^-1, indicating that the violation of ion gyromotion tends to occur during high-speed flow in the near-Earth plasma sheet. The superposed epoch analysis also indicated that the flow speed increases before the severe magnetic fluctuations. 
We discuss how both the inside-out and outside-in substorm models can explain this increase in flow speeds before magnetic fluctuation events.
USDA-ARS's Scientific Manuscript database
Noroviruses (NoV) annually cause millions of cases of gastrointestinal disease in the United States. Although NoV outbreaks are generally associated with raw shellfish, particularly oysters, outbreaks have also been known to occur from other common-source food-borne vehicles such as lettuce, frozen...
Shanmuga Doss, Sreeja; Bhatt, Nirav Pravinbhai; Jayaraman, Guhan
2017-08-15
There is an unreasonably high variation in the literature reports on molecular weight of hyaluronic acid (HA) estimated using conventional size exclusion chromatography (SEC). This variation is most likely due to errors in estimation. Working with commercially available HA molecular weight standards, this work examines the extent of error in molecular weight estimation due to two factors: use of non-HA based calibration and concentration of sample injected into the SEC column. We develop a multivariate regression correlation to correct for concentration effect. Our analysis showed that SEC calibration based on non-HA standards like polyethylene oxide and pullulan led to approximately 2 and 10 times overestimation, respectively, when compared to HA-based calibration. Further, we found that injected sample concentration has an effect on molecular weight estimation. Even at 1 g/l injected sample concentration, HA molecular weight standards of 0.7 and 1.64 MDa showed appreciable underestimation of 11-24%. The multivariate correlation developed was found to reduce error in estimations at 1 g/l to <4%. The correlation was also successfully applied to accurately estimate the molecular weight of HA produced by a recombinant Lactococcus lactis fermentation. Copyright © 2017 Elsevier B.V. All rights reserved.
Osmium isotope and highly siderophile element systematics of the lunar crust
NASA Astrophysics Data System (ADS)
Day, James M. D.; Walker, Richard J.; James, Odette B.; Puchtel, Igor S.
2010-01-01
Coupled 187Os/188Os and highly siderophile element (HSE: Os, Ir, Ru, Pt, Pd, and Re) abundance data are reported for pristine lunar crustal rocks 60025, 62255, 65315 (ferroan anorthosites, FAN) and 76535, 78235, 77215 and a norite clast in 15455 (magnesian-suite rocks, MGS). Osmium isotopes permit more refined discrimination than previously possible of samples that have been contaminated by meteoritic additions and the new results show that some rocks, previously identified as pristine, contain meteorite-derived HSE. Low HSE abundances in FAN and MGS rocks are consistent with derivation from a strongly HSE-depleted lunar mantle. At the time of formation, the lunar floatation crust, represented by FAN, had 1.4 ± 0.3 pg g^-1 Os, 1.5 ± 0.6 pg g^-1 Ir, 6.8 ± 2.7 pg g^-1 Ru, 16 ± 15 pg g^-1 Pt, 33 ± 30 pg g^-1 Pd and 0.29 ± 0.10 pg g^-1 Re (~0.00002 × CI) and Re/Os ratios that were modestly elevated (187Re/188Os = 0.6 to 1.7) relative to CI chondrites. MGS samples are, on average, characterised by more elevated HSE abundances (~0.00007 × CI) compared with FAN. This either reflects contrasting mantle-source HSE characteristics of FAN and MGS rocks, or different mantle-crust HSE fractionation behaviour during production of these lithologies. Previous studies of lunar impact-melt rocks have identified possible elevated Ru and Pd in lunar crustal target rocks. The new results provide no supporting evidence for such enrichments. If maximum estimates for HSE in the lunar mantle are compared with FAN and MGS averages, crust-mantle concentration ratios (D-values) must be ≤ 0.3. Such D-values are broadly similar to those estimated for partitioning between the terrestrial crust and upper mantle, with the notable exception of Re. Given the presumably completely different mode of origin for the primary lunar floatation crust and tertiary terrestrial continental crust, the potential similarities in crust-mantle HSE partitioning for the Earth and Moon are somewhat surprising. Low HSE abundances in the lunar crust, coupled with estimates of HSE concentrations in the lunar mantle, imply there may be a 'missing component' of late-accreted materials (as much as 95%) to the Moon if the Earth/Moon mass-flux estimates are correct and terrestrial mantle HSE abundances were established by late accretion.
Osmium isotope and highly siderophile element systematics of the lunar crust
Day, J.M.D.; Walker, R.J.; James, O.B.; Puchtel, I.S.
2010-01-01
Coupled 187Os/188Os and highly siderophile element (HSE: Os, Ir, Ru, Pt, Pd, and Re) abundance data are reported for pristine lunar crustal rocks 60025, 62255, 65315 (ferroan anorthosites, FAN) and 76535, 78235, 77215 and a norite clast in 15455 (magnesian-suite rocks, MGS). Osmium isotopes permit more refined discrimination than previously possible of samples that have been contaminated by meteoritic additions and the new results show that some rocks, previously identified as pristine, contain meteorite-derived HSE. Low HSE abundances in FAN and MGS rocks are consistent with derivation from a strongly HSE-depleted lunar mantle. At the time of formation, the lunar floatation crust, represented by FAN, had 1.4 ± 0.3 pg g^-1 Os, 1.5 ± 0.6 pg g^-1 Ir, 6.8 ± 2.7 pg g^-1 Ru, 16 ± 15 pg g^-1 Pt, 33 ± 30 pg g^-1 Pd and 0.29 ± 0.10 pg g^-1 Re (~0.00002 × CI) and Re/Os ratios that were modestly elevated (187Re/188Os = 0.6 to 1.7) relative to CI chondrites. MGS samples are, on average, characterised by more elevated HSE abundances (~0.00007 × CI) compared with FAN. This either reflects contrasting mantle-source HSE characteristics of FAN and MGS rocks, or different mantle-crust HSE fractionation behaviour during production of these lithologies. Previous studies of lunar impact-melt rocks have identified possible elevated Ru and Pd in lunar crustal target rocks. The new results provide no supporting evidence for such enrichments. If maximum estimates for HSE in the lunar mantle are compared with FAN and MGS averages, crust-mantle concentration ratios (D-values) must be ≤ 0.3. Such D-values are broadly similar to those estimated for partitioning between the terrestrial crust and upper mantle, with the notable exception of Re. Given the presumably completely different mode of origin for the primary lunar floatation crust and tertiary terrestrial continental crust, the potential similarities in crust-mantle HSE partitioning for the Earth and Moon are somewhat surprising. Low HSE abundances in the lunar crust, coupled with estimates of HSE concentrations in the lunar mantle, imply there may be a 'missing component' of late-accreted materials (as much as 95%) to the Moon if the Earth/Moon mass-flux estimates are correct and terrestrial mantle HSE abundances were established by late accretion. © 2009 Elsevier B.V. All rights reserved.
Shih, Weichung Joe; Li, Gang; Wang, Yining
2016-03-01
Sample size plays a crucial role in clinical trials. Flexible sample-size designs, as part of the more general category of adaptive designs that utilize interim data, have been a popular topic in recent years. In this paper, we give a comparative review of four related methods for such a design. The likelihood method uses the likelihood ratio test with an adjusted critical value. The weighted method adjusts the test statistic with given weights rather than the critical value. The dual test method requires both the likelihood ratio statistic and the weighted statistic to be greater than the unadjusted critical value. The promising zone approach uses the likelihood ratio statistic with the unadjusted value and other constraints. All four methods preserve the type-I error rate. In this paper we explore their properties and compare their relationships and merits. We show that the sample size rules for the dual test are in conflict with the rules of the promising zone approach. We delineate what is necessary to specify in the study protocol to ensure the validity of the statistical procedure and what can be kept implicit in the protocol so that more flexibility can be attained for confirmatory phase III trials in meeting regulatory requirements. We also prove that under mild conditions, the likelihood ratio test still preserves the type-I error rate when the actual sample size is larger than the re-calculated one. Copyright © 2015 Elsevier Inc. All rights reserved.
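As a rough illustration of the weighted method summarized above, the sketch below combines the stage-wise z-statistics with weights fixed in advance. It assumes, as one common choice that the abstract does not confirm, that the weights are the originally planned stage proportions, so they do not change when the second-stage sample size is re-calculated. The function names and the one-sided critical value 1.96 are illustrative assumptions.

```python
import math

def weighted_z(z1, z2, n1_planned, n2_planned):
    """Combine stage-wise z-statistics with weights fixed by the planned stage sizes."""
    w1 = n1_planned / (n1_planned + n2_planned)
    return math.sqrt(w1) * z1 + math.sqrt(1.0 - w1) * z2

def reject_null(z1, z2, n1_planned, n2_planned, z_crit=1.96):
    """Compare the weighted statistic with the unadjusted critical value (assumed one-sided 0.025)."""
    return weighted_z(z1, z2, n1_planned, n2_planned) > z_crit

# Hypothetical example: planned 50 + 50 per stage; stage 2 was enlarged after the interim look,
# but z2 is computed from the stage-2 data only and keeps its pre-specified weight.
print(weighted_z(1.2, 2.1, 50, 50), reject_null(1.2, 2.1, 50, 50))
```

Because the weights are pre-specified and sum to one on the squared scale, the combined statistic stays standard normal under the null hypothesis whatever the re-calculated second-stage size, which is why the critical value itself needs no adjustment.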
Precise, unbiased estimates of population size are an essential tool for fisheries management. For a wide variety of salmonid fishes, redd counts from a sample of reaches are commonly used to monitor annual trends in abundance. Using a 9-year time series of georeferenced censuses...
Tomyn, Ronald L; Sleeth, Darrah K; Thiese, Matthew S; Larson, Rodney R
2016-01-01
In addition to chemical composition, the site of deposition of inhaled particles is important for determining the potential health effects from an exposure. As a result, the International Organization for Standardization adopted a particle deposition sampling convention. This includes extrathoracic particle deposition sampling conventions for the anterior nasal passages (ET1) and the posterior nasal and oral passages (ET2). This study assessed how well a polyurethane foam insert placed in an Institute of Occupational Medicine (IOM) sampler can match an extrathoracic deposition sampling convention, while accounting for possible static buildup in the test particles. In this way, the study aimed to assess whether neutralized particles affected the performance of this sampler for estimating extrathoracic particle deposition. A total of three different particle sizes (4.9, 9.5, and 12.8 µm) were used. For each trial, one particle size was introduced into a low-speed wind tunnel with a wind speed set at 0.2 m/s (∼40 ft/min). This wind speed was chosen to closely match the conditions of most indoor working environments. Each particle size was tested twice, either neutralized using a high voltage neutralizer or left in its normal (non-neutralized) state as standard particles. IOM samplers were fitted with a polyurethane foam insert and placed on a rotating mannequin inside the wind tunnel. Foam sampling efficiencies were calculated for all trials to compare against the normalized ET1 sampling deposition convention. The foam sampling efficiencies matched well to the ET1 deposition convention for the larger particle sizes, but had a general trend of underestimating for all three particle sizes. The results of a Wilcoxon Rank Sum Test also showed that only at 4.9 µm was there a statistically significant difference (p-value = 0.03) between the foam sampling efficiency using the standard particles and the neutralized particles. 
This is interpreted to mean that static buildup may be occurring and neutralizing the particles that are 4.9 µm diameter in size did affect the performance of the foam sampler when estimating extrathoracic particle deposition.
Trends in study design and the statistical methods employed in a leading general medicine journal.
Gosho, M; Sato, Y; Nagashima, K; Takahashi, S
2018-02-01
Study design and statistical methods have become core components of medical research, and the methodology has become more multifaceted and complicated over time. The study of the comprehensive details and current trends of study design and statistical methods is required to support the future implementation of well-planned clinical studies providing information about evidence-based medicine. Our purpose was to illustrate study design and statistical methods employed in recent medical literature. This was an extension study of Sato et al. (N Engl J Med 2017; 376: 1086-1087), which reviewed 238 articles published in 2015 in the New England Journal of Medicine (NEJM) and briefly summarized the statistical methods employed in NEJM. Using the same database, we performed a new investigation of the detailed trends in study design and individual statistical methods that were not reported in the Sato study. Due to the CONSORT statement, prespecification and justification of sample size are obligatory in planning intervention studies. Although standard survival methods (eg Kaplan-Meier estimator and Cox regression model) were most frequently applied, the Gray test and Fine-Gray proportional hazard model for considering competing risks were sometimes used for a more valid statistical inference. With respect to handling missing data, model-based methods, which are valid for missing-at-random data, were more frequently used than single imputation methods. These methods are not recommended as a primary analysis, but they have been applied in many clinical trials. Group sequential design with interim analyses was one of the standard designs, and novel design, such as adaptive dose selection and sample size re-estimation, was sometimes employed in NEJM. Model-based approaches for handling missing data should replace single imputation methods for primary analysis in the light of the information found in some publications. 
Use of adaptive design with interim analyses is increasing after the presentation of the FDA guidance for adaptive design. © 2017 John Wiley & Sons Ltd.
Scott, Richard B; Eccles, Fiona; Lloyd, Andrew; Carpenter, Katherine
2008-01-01
Background The neuropsychological arm of the International Subarachnoid Aneurysm Trial (N-ISAT) evaluated the cognitive outcome of 573 patients at 12 months following subarachnoid haemorrhage (SAH). The assessment included 29 psychometric measures, yielding a substantial and complex body of data. We have explored alternative and optimal methodologies for analysing and summarising these data to enable the estimation of a cognitive complication rate (CCR). Any differences in cognitive outcome between the two arms of the trial are not however reported here. Methods All individual test scores were transformed into z-scores and a 5th percentile cut-off for impairment was established. A principal components analysis (PCA) was applied to these data to mathematically transform correlated test scores into a smaller number of uncorrelated principal components, or cognitive 'domains'. These domains formed the basis for grouping and weighting individual patients' impaired scores on individual measures. In order to increase the sample size, a series of methods for handling missing data were applied. Results We estimated a 34.1% CCR in all those patients seen face-to-face, rising to 37.4% CCR with the inclusion of patients who were unable to attend assessment for reason related to the index SAH. This group demonstrated significantly more self and carer/relative rated disability on a Health Related Quality of Life questionnaire, than patients classified as having no functionally significant cognitive deficits. Conclusion Evaluating neuropsychological outcome in a large RCT involves unique methodological and organizational challenges. We have demonstrated how these problems may be addressed by re-classifying interval data from 29 measures into a dichotomous CCR. We have presented a 'sliding scale' of undifferentiated individual cognitive impairments, and then on the basis of PCA-derived cognitive 'domains', included consideration of the distribution of impairments in these terms. 
In order to maximize sample size we have suggested ways for patients who did not complete the entire protocol to be included in the overall CCR. ISAT trial registration ISRCTN49866681 PMID:18341689
Quantitative endoscopy: initial accuracy measurements.
Truitt, T O; Adelman, R A; Kelly, D H; Willging, J P
2000-02-01
The geometric optics of an endoscope can be used to determine the absolute size of an object in an endoscopic field without knowing the actual distance from the object. This study explores the accuracy of a technique that estimates absolute object size from endoscopic images. Quantitative endoscopy involves calibrating a rigid endoscope to produce size estimates from 2 images taken with a known traveled distance between the images. The heights of 12 samples, ranging in size from 0.78 to 11.80 mm, were estimated with this calibrated endoscope. Backup distances of 5 mm and 10 mm were used for comparison. The mean percent error for all estimated measurements when compared with the actual object sizes was 1.12%. The mean errors for 5-mm and 10-mm backup distances were 0.76% and 1.65%, respectively. The mean errors for objects <2 mm and ≥2 mm were 0.94% and 1.18%, respectively. Quantitative endoscopy estimates endoscopic image size to within 5% of the actual object size. This method remains promising for quantitatively evaluating object size from endoscopic images. It does not require knowledge of the absolute distance of the endoscope from the object, rather, only the distance traveled by the endoscope between images.
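The two-image idea can be illustrated with an idealized pinhole model. This is an assumption made for the sketch, not the calibration procedure used in the study. If an object of height H is imaged at distance d and again after backing up by a known distance t, the image heights are h_near = f*H/d and h_far = f*H/(d + t), so d = t*h_far/(h_near - h_far) and H = h_near*d/f. The function name, the focal-length parameter, and the numbers below are hypothetical.

```python
def object_size(h_near, h_far, travel, focal_length):
    """Estimate object height from two image heights and the distance traveled between images.

    Assumes an ideal pinhole camera; all lengths must share one unit.
    """
    if h_near <= h_far:
        raise ValueError("the near image must be larger than the far image")
    # Unknown distance at the near position, recovered from the two image heights.
    distance_near = travel * h_far / (h_near - h_far)
    # Invert the pinhole relation h = f * H / d at the near position.
    return h_near * distance_near / focal_length

# Hypothetical check: a 5 mm object, focal length 2 mm, imaged at 10 mm and after a 5 mm backup.
h1 = 2.0 * 5.0 / 10.0
h2 = 2.0 * 5.0 / 15.0
print(object_size(h1, h2, travel=5.0, focal_length=2.0))
```

A real endoscope has strong lens distortion, which is presumably why the study relies on calibration rather than a single focal-length constant.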
Daniel J. Isaak; Jay M. Ver Hoef; Erin E. Peterson; Dona L. Horan; David E. Nagel
2017-01-01
Population size estimates for stream fishes are important for conservation and management, but sampling costs limit the extent of most estimates to small portions of river networks that encompass 100s–10 000s of linear kilometres. However, the advent of large fish density data sets, spatial-stream-network (SSN) models that benefit from nonindependence among samples,...
ERIC Educational Resources Information Center
Carvajal, Jorge; Skorupski, William P.
2010-01-01
This study is an evaluation of the behavior of the Liu-Agresti estimator of the cumulative common odds ratio when identifying differential item functioning (DIF) with polytomously scored test items using small samples. The Liu-Agresti estimator has been proposed by Penfield and Algina as a promising approach for the study of polytomous DIF but no…
Keiter, David A.; Cunningham, Fred L.; Rhodes, Olin E.; Irwin, Brian J.; Beasley, James
2016-01-01
Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. 
Knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keiter, David A.; Cunningham, Fred L.; Rhodes, Jr., Olin E.
Collection of scat samples is common in wildlife research, particularly for genetic capture-mark-recapture applications. Due to high degradation rates of genetic material in scat, large numbers of samples must be collected to generate robust estimates. Optimization of sampling approaches to account for taxa-specific patterns of scat deposition is, therefore, necessary to ensure sufficient sample collection. While scat collection methods have been widely studied in carnivores, research to maximize scat collection and noninvasive sampling efficiency for social ungulates is lacking. Further, environmental factors or scat morphology may influence detection of scat by observers. We contrasted performance of novel radial search protocols with existing adaptive cluster sampling protocols to quantify differences in observed amounts of wild pig (Sus scrofa) scat. We also evaluated the effects of environmental (percentage of vegetative ground cover and occurrence of rain immediately prior to sampling) and scat characteristics (fecal pellet size and number) on the detectability of scat by observers. We found that 15- and 20-m radial search protocols resulted in greater numbers of scats encountered than the previously used adaptive cluster sampling approach across habitat types, and that fecal pellet size, number of fecal pellets, percent vegetative ground cover, and recent rain events were significant predictors of scat detection. Our results suggest that use of a fixed-width radial search protocol may increase the number of scats detected for wild pigs, or other social ungulates, allowing more robust estimation of population metrics using noninvasive genetic sampling methods. Further, as fecal pellet size affected scat detection, juvenile or smaller-sized animals may be less detectable than adult or large animals, which could introduce bias into abundance estimates. In conclusion, knowledge of relationships between environmental variables and scat detection may allow researchers to optimize sampling protocols to maximize utility of noninvasive sampling for wild pigs and other social ungulates.
Keiter, David A.; Cunningham, Fred L.; Rhodes, Jr., Olin E.; ...
2016-05-25
Sample size determination for GEE analyses of stepped wedge cluster randomized trials.
Li, Fan; Turner, Elizabeth L; Preisser, John S
2018-06-19
In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator. © 2018, The International Biometric Society.
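The block exchangeable structure described in this abstract can be illustrated with a minimal sketch. It assumes one cluster whose observations are ordered period by period. The function name and the parameter names `alpha0` (within-period), `alpha1` (inter-period) and `alpha2` (within-individual) are labels chosen here for illustration and are not taken from the abstract. The numeric values are arbitrary.

```python
def block_exchangeable_corr(n_individuals, n_periods, alpha0, alpha1, alpha2):
    """Within-cluster correlation matrix, observations ordered period by period.

    alpha0: different individuals, same period (within-period)
    alpha1: different individuals, different periods (inter-period)
    alpha2: same individual, different periods (within-individual)
    """
    size = n_individuals * n_periods
    matrix = [[0.0] * size for _ in range(size)]
    for row in range(size):
        period_r, person_r = divmod(row, n_individuals)
        for col in range(size):
            period_c, person_c = divmod(col, n_individuals)
            same_person = person_r == person_c
            same_period = period_r == period_c
            if same_person and same_period:
                matrix[row][col] = 1.0      # an observation with itself
            elif same_period:
                matrix[row][col] = alpha0
            elif same_person:
                matrix[row][col] = alpha2
            else:
                matrix[row][col] = alpha1
    return matrix


# Closed cohort: 2 individuals followed over 3 periods (illustrative values).
cohort = block_exchangeable_corr(2, 3, alpha0=0.05, alpha1=0.02, alpha2=0.4)

# Cross-sectional design: inter-period and within-individual correlations
# are set equal, as the abstract describes.
cross_sectional = block_exchangeable_corr(2, 3, alpha0=0.05, alpha1=0.02, alpha2=0.02)
```

In the cross-sectional case the matrix no longer distinguishes whether two observations from different periods come from the same individual, which is why equating the two parameters covers that design.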
Evans, T A
2001-12-01
Although mark-recapture protocols produce inaccurate population estimates of termite colonies, they might be employed to estimate a relative change in colony size. This possibility was tested using two Australian, mound-building, wood-eating, subterranean Coptotermes species. Three different toxicants delivered in baits were used to decrease (but not eliminate) colony size, and a single mark-recapture protocol was used to estimate pre- and postbaiting population sizes. For both species, the numbers of termites retrieved from bait stations varied widely, resulting in no significant differences in the numbers of termites sampled between treatments in either the pre- or postbaiting protocols. There were significantly fewer termites sampled in all treatments, controls included, in the postbaiting protocol compared with the prebaiting protocol, suggesting a seasonal change in forager numbers. The comparison of population estimates shows a large decrease in toxicant treated colonies compared with little change in control colonies, which suggests that estimating the relative decline in population size using mark-recapture protocols might be possible. However, the change in population estimate was due entirely to the significantly lower recapture rate in the control colonies relative to the toxicant treated colonies, as numbers of unmarked termites did not change between treatments. The population estimates should be treated with caution because low recapture rates produce dubious population estimates and, in some cases, postbaiting mark-recapture population estimates could be much greater than those at prebaiting, despite consumption of bait in sufficient quantities to cause population decline. A possible interaction between fat-stain markers and toxicants should be investigated if mark-recapture population estimates are used. Alternative methods of estimating population change are advised, along with other indirect measures.
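Why low recapture rates make such estimates fragile can be seen in a minimal sketch. It assumes the simple Lincoln-Petersen estimator, which the abstract does not name and which may differ from the protocol actually used. All counts are hypothetical.

```python
def lincoln_petersen(marked_released, second_sample, recaptured):
    """Simple Lincoln-Petersen estimate: N = M * C / R."""
    if recaptured <= 0:
        raise ValueError("at least one recapture is needed for an estimate")
    return marked_released * second_sample / recaptured


# Hypothetical counts: 1000 marked termites released, 2000 collected later.
marked, collected = 1000, 2000
for recaptures in (40, 20, 10, 5):
    estimate = lincoln_petersen(marked, collected, recaptures)
    print(f"recaptures={recaptures:>3}  estimated colony size={estimate:,.0f}")
```

Because the recapture count sits in the denominator, halving it doubles the estimate even when the number of termites sampled is unchanged. A shift in recapture rate alone can therefore look like a change in colony size.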
3D data processing with advanced computer graphics tools
NASA Astrophysics Data System (ADS)
Zhang, Song; Ekstrand, Laura; Grieve, Taylor; Eisenmann, David J.; Chumbley, L. Scott
2012-09-01
Often, the raw 3-D data from an optical profilometer contain spiky noise and lie on an irregular grid, which makes them difficult to analyze and, because of their very large size, difficult to store. This paper addresses these two issues by substantially reducing the spiky noise in the raw 3-D data and by rapidly re-sampling the data onto regular grids at any pixel size and any orientation with advanced computer graphics tools. Experimental results will be presented to demonstrate the effectiveness of the proposed approach.
Vainik, Uku; Konstabel, Kenn; Lätt, Evelin; Mäestu, Jarek; Purge, Priit; Jürimäe, Jaak
2016-10-01
Subjective energy intake (sEI) is often misreported, providing unreliable estimates of energy consumed. Therefore, relating sEI data to health outcomes is difficult. Recently, Börnhorst et al. compared various methods to correct sEI-based energy intake estimates. They criticised approaches that categorise participants as under-reporters, plausible reporters and over-reporters based on the sEI:total energy expenditure (TEE) ratio, and thereafter use these categories as statistical covariates or exclusion criteria. Instead, they recommended using external predictors of sEI misreporting as statistical covariates. We sought to confirm and extend these findings. Using a sample of 190 adolescent boys (mean age=14), we demonstrated that dual-energy X-ray absorptiometry-measured fat-free mass is strongly associated with objective energy intake data (onsite weighted breakfast), but the association with sEI (previous 3-d dietary interview) is weak. Comparing sEI with TEE revealed that sEI was mostly under-reported (74 %). Interestingly, statistically controlling for dietary reporting groups or restricting samples to plausible reporters created a stronger-than-expected association between fat-free mass and sEI. However, the association was an artifact caused by selection bias - that is, data re-sampling and simulations showed that these methods overestimated the effect size because fat-free mass was related to sEI both directly and indirectly via TEE. A more realistic association between sEI and fat-free mass was obtained when the model included common predictors of misreporting (e.g. BMI, restraint). To conclude, restricting sEI data only to plausible reporters can cause selection bias and inflated associations in later analyses. Therefore, we further support statistically correcting sEI data in nutritional analyses. The script for running simulations is provided.
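The selection-bias mechanism described in this abstract can be reproduced with a small simulation. This is a sketch under stated assumptions and is not the authors' script. All variables are on a standardized scale. The effect sizes (0.2 for the direct link, 0.3 for the TEE noise) are made up. "Plausible reporters" are defined here by a small difference between sEI and TEE, which stands in for the sEI:TEE ratio criterion.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=1):
    rng = random.Random(seed)
    ffm, tee, sei = [], [], []
    for _ in range(n):
        f = rng.gauss(0, 1)              # fat-free mass (standardized)
        ffm.append(f)
        tee.append(f + rng.gauss(0, 0.3))        # TEE tracks FFM closely
        sei.append(0.2 * f + rng.gauss(0, 1))    # weak direct link to FFM
    full = pearson(ffm, sei)
    # Keep only "plausible reporters": reported intake close to expenditure.
    keep = [i for i in range(n) if abs(sei[i] - tee[i]) < 0.5]
    restricted = pearson([ffm[i] for i in keep], [sei[i] for i in keep])
    return full, restricted, len(keep)


full_r, restricted_r, n_kept = simulate()
print(f"all participants:         r = {full_r:.2f}")
print(f"plausible reporters only: r = {restricted_r:.2f} (n = {n_kept})")
```

Under these assumptions the correlation in the restricted sample comes out well above the full-sample value. Requiring sEI to match TEE forces sEI to inherit the strong dependence of TEE on fat-free mass.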
Natural variations in the rhenium isotopic composition of meteorites
NASA Astrophysics Data System (ADS)
Liu, R.; Hu, L.; Humayun, M.
2017-03-01
Rhenium is an important element with which to test hypotheses of isotope variation. Historically, it has been difficult to precisely correct the instrumental mass bias in thermal ionization mass spectrometry. We used W as an internal standard to correct mass bias on the MC-ICP-MS, and obtained the first precise δ187Re values (±0.02‰, 2SE) for iron meteorites and chondritic metal. Relative to metal from H chondrites, IVB irons are systematically higher in δ187Re by 0.14‰. δ187Re for other irons are similar to H chondritic metal, although some individual samples show significant isotope fractionation. Since 185Re has a high neutron capture cross section, the effect of galactic cosmic-ray (GCR) irradiation on δ187Re was examined using correlations with Pt isotopes. The pre-GCR irradiation δ187Re for IVB irons is lower, but the difference in δ187Re between IVB irons and other meteoritic metal remains. Nuclear volume-dependent fractionation for Re is about the right magnitude near the melting point of iron, but because of the refractory and compatible character of Re, a compelling explanation in terms of mass-dependent fractionation is elusive. The magnitude of a nucleosynthetic s-process deficit for Re estimated from Mo and Ru isotopes is essentially unresolvable. Since thermal processing reduced nucleosynthetic effects in Pd, it is conceivable that Re isotopic variations larger than those in Mo and Ru may be present in IVBs since Re is more refractory than Mo and Ru. Thus, the Re isotopic difference between IVBs and other irons or chondritic metal remains unexplained.
ERIC Educational Resources Information Center
Sideridis, Georgios; Simos, Panagiotis; Papanicolaou, Andrew; Fletcher, Jack
2014-01-01
The present study assessed the impact of sample size on the power and fit of structural equation modeling applied to functional brain connectivity hypotheses. The data consisted of time-constrained minimum norm estimates of regional brain activity during performance of a reading task obtained with magnetoencephalography. Power analysis was first…